GCC 9.1 Compiler Tuning Threadripper AMD znver1

AMD Ryzen Threadripper 2990WX compiler benchmarks on GCC 9.1 with Ubuntu Linux by Michael Larabel.

-O2 -march=athlon64

Environment Notes: CXXFLAGS="-O2 -march=athlon64" CFLAGS="-O2 -march=athlon64"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15rc1 + Python 3.6.7
Security Notes: __user pointer sanitization + Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + SSB disabled via prctl and seccomp

-O3 -march=athlon64

Environment Notes: CXXFLAGS="-O3 -march=athlon64" CFLAGS="-O3 -march=athlon64"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15rc1 + Python 3.6.7
Security Notes: __user pointer sanitization + Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + SSB disabled via prctl and seccomp

-O3 -march=athlon64-sse3

Environment Notes: CXXFLAGS="-O3 -march=athlon64-sse3" CFLAGS="-O3 -march=athlon64-sse3"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15rc1 + Python 3.6.7
Security Notes: __user pointer sanitization + Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + SSB disabled via prctl and seccomp

-O2 -march=native

Environment Notes: CXXFLAGS="-O2 -march=native" CFLAGS="-O2 -march=native"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15rc1 + Python 3.6.7
Security Notes: __user pointer sanitization + Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + SSB disabled via prctl and seccomp

-O3 -march=native

Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15rc1 + Python 3.6.7
Security Notes: __user pointer sanitization + Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + SSB disabled via prctl and seccomp

-O3 -march=native -flto

Environment Notes: CXXFLAGS="-O3 -march=native -flto" CFLAGS="-O3 -march=native -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15rc1 + Python 3.6.7
Security Notes: __user pointer sanitization + Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + SSB disabled via prctl and seccomp
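
Each flag-based run above sets CFLAGS and CXXFLAGS to the same string before the tests are built. The following is a minimal sketch of that setup, not the harness actually used for these results; the names CONFIGS and build_env are hypothetical, and the flag strings are copied from the run labels.

```python
import os

# Flag sets copied from the run labels above (PGO is not included,
# since the page does not list the flags used for that run).
CONFIGS = [
    "-O2 -march=athlon64",
    "-O3 -march=athlon64",
    "-O3 -march=athlon64-sse3",
    "-O2 -march=native",
    "-O3 -march=native",
    "-O3 -march=native -flto",
]


def build_env(flags, base=None):
    """Return a copy of the environment with CFLAGS and CXXFLAGS set to flags.

    Hypothetical helper: `base` defaults to the current process environment.
    """
    env = dict(os.environ if base is None else base)
    env["CFLAGS"] = flags
    env["CXXFLAGS"] = flags
    return env


if __name__ == "__main__":
    for flags in CONFIGS:
        env = build_env(flags, base={})
        print(f'CFLAGS="{env["CFLAGS"]}" CXXFLAGS="{env["CXXFLAGS"]}"')
```

The returned dictionary could be passed as the env argument of subprocess.run when invoking a build, so that each configuration is compiled in isolation.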

PGO

AMD Ryzen Threadripper 2990WX 32-Core

Processor: AMD Ryzen Threadripper 2990WX 32-Core @ 3.00GHz (32 Cores / 64 Threads), Motherboard: ASUS ROG ZENITH EXTREME (1701 BIOS), Chipset: AMD 17h, Memory: 32768MB, Disk: Samsung SSD 970 EVO 500GB, Graphics: AMD Radeon RX 64 8GB (1590/800MHz), Audio: Realtek ALC1220, Monitor: ASUS VP28U, Network: Intel I211 + Qualcomm Atheros QCA6174 802.11ac + Wilocity Wil6200 802.11ad

OS: Ubuntu 18.04, Kernel: 4.18.0-18-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.20.1, Display Driver: amdgpu 18.1.0, OpenGL: 4.5 Mesa 18.2.8 (LLVM 7.0.0), Compiler: GCC 9.1.0, File-System: ext4, Screen Resolution: 3840x2160

AOM AV1

This is a simple test of the AOMedia AV1 encoder run on the CPU with a sample video file. Learn more via the OpenBenchmarking.org test page.

SVT-AV1

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-AV1 CPU-based multi-threaded video encoder for the AV1 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.

SVT-HEVC

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC CPU-based multi-threaded video encoder for the HEVC / H.265 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.

SVT-VP9

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-VP9 CPU-based multi-threaded video encoder for the VP9 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.

VP9 libvpx Encoding

This is a standard video encoding performance test of Google's libvpx library and the vpxenc command for the VP9/WebM format using a sample 1080p video. Learn more via the OpenBenchmarking.org test page.

x264

This is a simple test of the x264 encoder run on the CPU (OpenCL support disabled) with a sample video file. Learn more via the OpenBenchmarking.org test page.

x265

This is a simple test of the x265 encoder run on the CPU with a sample 1080p video file. Learn more via the OpenBenchmarking.org test page.

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient, a new scientific benchmark from Sandia National Labs aimed at supercomputer testing with modern real-world workloads, in contrast to HPCC. Learn more via the OpenBenchmarking.org test page.

GraphicsMagick

This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests to stress the system's CPU. Learn more via the OpenBenchmarking.org test page.

FFTW

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
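
For reference, the discrete Fourier transform that FFTW computes can be written directly from its definition. The sketch below is the naive O(N^2) form in one dimension, given only to show what is being computed; it is not FFTW's algorithm, and the function name dft is an assumption for illustration.

```python
import cmath


def dft(x):
    """Naive 1D DFT: X[k] = sum over j of x[j] * exp(-2*pi*i*j*k/N)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    # A unit impulse transforms to a flat spectrum of ones.
    print(dft([1, 0, 0, 0]))
```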

LuaJIT

This test profile is a collection of Lua scripts/benchmarks run against a locally-built copy of LuaJIT upstream. Learn more via the OpenBenchmarking.org test page.

SciMark

This test runs the ANSI C version of SciMark 2.0, which is a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. This test is made up of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.
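
As an illustration of the kind of kernel in this suite, a Monte Carlo integration can estimate pi by sampling points in the unit square and counting those inside the quarter circle. This is a minimal sketch in Python under that assumption, not the SciMark source; the name estimate_pi and the fixed seed are hypothetical choices.

```python
import random


def estimate_pi(samples, seed=0):
    """Estimate pi from the fraction of random points inside the quarter circle."""
    rng = random.Random(seed)  # fixed seed so repeated runs are comparable
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples


if __name__ == "__main__":
    print(estimate_pi(200_000))
```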

Himeno Benchmark

The Himeno benchmark is a linear solver of pressure Poisson using a point-Jacobi method. Learn more via the OpenBenchmarking.org test page.
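
The point-Jacobi idea can be sketched on a much smaller problem. The example below solves the 1D Poisson equation -u'' = f on (0, 1) with zero boundary values; Himeno itself works on a 3D pressure field, so this is a simplified assumption-laden illustration, and the name jacobi_poisson_1d is hypothetical.

```python
def jacobi_poisson_1d(f, n, iterations):
    """Point-Jacobi sweeps for -u'' = f on (0, 1) with u(0) = u(1) = 0.

    n is the number of interior grid points; the returned list also
    contains the two boundary values.
    """
    h = 1.0 / (n + 1)
    u = [0.0] * (n + 2)
    for _ in range(iterations):
        new = u[:]  # Jacobi: every update reads only the previous sweep
        for i in range(1, n + 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f(i * h))
        u = new
    return u


if __name__ == "__main__":
    # With f = 2 the exact solution is u(x) = x * (1 - x).
    u = jacobi_poisson_1d(lambda x: 2.0, n=9, iterations=500)
    print(u[5])  # value at x = 0.5
```

Because each point is updated only from the previous sweep, the inner loop has no dependencies between iterations, which is the property that makes this class of solver sensitive to compiler vectorization.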

MBW

This is a basic/simple memory (RAM) bandwidth benchmark for memory copy operations. Learn more via the OpenBenchmarking.org test page.
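
A memory copy bandwidth figure is the number of bytes copied divided by the elapsed time. The following sketch measures that for a single buffer copy in Python; it is not MBW's implementation, the name copy_bandwidth_mib_s is hypothetical, and interpreter overhead means the numbers are not comparable to the results on this page.

```python
import time


def copy_bandwidth_mib_s(size_mib=64, repeats=3):
    """Return the best observed copy rate in MiB/s over `repeats` copies."""
    src = bytearray(size_mib * 1024 * 1024)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # one full copy of the buffer
        elapsed = time.perf_counter() - start
        assert len(dst) == len(src)
        best = min(best, elapsed)
    return size_mib / best


if __name__ == "__main__":
    print(f"{copy_bandwidth_mib_s():.0f} MiB/s")
```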

TSCP

This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.

Stockfish

This is a test of Stockfish, an advanced C++11 chess benchmark that can scale up to 128 CPU cores. Learn more via the OpenBenchmarking.org test page.

Memcached mcperf

This is a test of twemperf/mcperf with memcached. Learn more via the OpenBenchmarking.org test page.

Redis

Redis is an open-source data structure server. Learn more via the OpenBenchmarking.org test page.

NGINX Benchmark

This is a test of ab, which is the Apache Benchmark program running against nginx. This test profile measures how many requests per second a given system can sustain when carrying out 2,000,000 requests with 500 requests being carried out concurrently. Learn more via the OpenBenchmarking.org test page.

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test measures the RSA 4096-bit performance of OpenSSL. Learn more via the OpenBenchmarking.org test page.

PostgreSQL pgbench

This is a simple benchmark of PostgreSQL using pgbench. Learn more via the OpenBenchmarking.org test page.

ctx_clock

Ctx_clock is a simple test program to measure the context switch time in clock cycles. Learn more via the OpenBenchmarking.org test page.

MKL-DNN

This is a test of Intel MKL-DNN, the Intel Math Kernel Library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Learn more via the OpenBenchmarking.org test page.

t-test1

This is a test of t-test1 for basic memory allocator benchmarks. Note this test profile is currently very basic and the overall time does include the warmup time of the custom t-test1 compilation. Improvements welcome. Learn more via the OpenBenchmarking.org test page.

Timed MAFFT Alignment

This test performs an alignment of 100 pyruvate decarboxylase sequences. Learn more via the OpenBenchmarking.org test page.

Timed ImageMagick Compilation

This test times how long it takes to build ImageMagick. Learn more via the OpenBenchmarking.org test page.

Timed LLVM Compilation

This test times how long it takes to build the LLVM compiler. Learn more via the OpenBenchmarking.org test page.

Timed PHP Compilation

This test times how long it takes to build PHP 5 with the Zend engine. Learn more via the OpenBenchmarking.org test page.

C-Ray

This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.

Smallpt

Smallpt is a C++ global illumination renderer written in less than 100 lines of code. Global illumination is done via unbiased Monte Carlo path tracing and there is multi-threading support via the OpenMP library. Learn more via the OpenBenchmarking.org test page.

AOBench

AOBench is a lightweight ambient occlusion renderer, written in C. The test profile is using a size of 2048 x 2048. Learn more via the OpenBenchmarking.org test page.

Bullet Physics Engine

This is a benchmark of the Bullet Physics Engine. Learn more via the OpenBenchmarking.org test page.

XZ Compression

This test measures the time needed to compress a sample file (an Ubuntu file-system image) using XZ compression. Learn more via the OpenBenchmarking.org test page.

Zstd Compression

This test measures the time needed to compress a sample file (an Ubuntu file-system image) using Zstd compression. Learn more via the OpenBenchmarking.org test page.

FLAC Audio Encoding

This test times how long it takes to encode a sample WAV file to FLAC format five times. Learn more via the OpenBenchmarking.org test page.

LAME MP3 Encoding

LAME is an MP3 encoder licensed under the LGPL. This test measures the time required to encode a WAV file to MP3 format. Learn more via the OpenBenchmarking.org test page.

CppPerformanceBenchmarks

CppPerformanceBenchmarks is a set of C++ compiler performance benchmarks. Learn more via the OpenBenchmarking.org test page.

79 Results Shown

AOM AV1
SVT-AV1
SVT-HEVC
SVT-VP9
VP9 libvpx Encoding
x264
x265
High Performance Conjugate Gradient
GraphicsMagick:
  Swirl
  Rotate
  Sharpen
  Enhanced
  Resizing
  Noise-Gaussian
  HWB Color Space
FFTW:
  Stock - 2D FFT Size 4096
  Float + SSE - 2D FFT Size 4096
LuaJIT:
  Composite
  Monte Carlo
  Fast Fourier Transform
  Sparse Matrix Multiply
  Dense LU Matrix Factorization
  Jacobi Successive Over-Relaxation
SciMark:
  Composite
  Monte Carlo
  Fast Fourier Transform
  Sparse Matrix Multiply
  Dense LU Matrix Factorization
  Jacobi Successive Over-Relaxation
Himeno Benchmark
MBW:
  Memory Copy - 8192 MiB
  Memory Copy, Fixed Block Size - 8192 MiB
TSCP
Stockfish
Memcached mcperf:
  Add
  Get
  Set
  Append
  Delete
  Prepend
  Replace
Redis:
  LPOP
  SADD
  LPUSH
  GET
  SET
NGINX Benchmark
OpenSSL
PostgreSQL pgbench:
  Buffer Test - Normal Load - Read Only
  Buffer Test - Normal Load - Read Write
ctx_clock
MKL-DNN
t-test1:
  1
  2
Timed MAFFT Alignment
Timed ImageMagick Compilation
Timed LLVM Compilation
Timed PHP Compilation
C-Ray
Smallpt
AOBench
Bullet Physics Engine:
  Raytests
  3000 Fall
  1000 Stack
  1000 Convex
  136 Ragdolls
  Prim Trimesh
  Convex Trimesh
XZ Compression
Zstd Compression
FLAC Audio Encoding
LAME MP3 Encoding
CppPerformanceBenchmarks:
  Atol
  Ctype
  Math Library
  Rand Numbers
  Stepanov Vector
  Function Objects
  Stepanov Abstraction

-O2 -march=athlon64

Testing initiated at 11 May 2019 17:06 by user phoronix.

-O3 -march=athlon64

Testing initiated at 9 May 2019 13:16 by user phoronix.

-O3 -march=athlon64-sse3

Testing initiated at 9 May 2019 08:33 by user phoronix.

-O2 -march=native

Testing initiated at 10 May 2019 06:28 by user phoronix.

-O3 -march=native

Testing initiated at 8 May 2019 19:49 by user phoronix.

-O3 -march=native -flto

Testing initiated at 9 May 2019 21:12 by user phoronix.

PGO

Testing initiated at 12 May 2019 09:23 by user phoronix.

AMD Ryzen Threadripper 2990WX 32-Core

Testing initiated at 12 May 2019 14:52 by user phoronix.
