AMD EPYC Compiler Tuning Ref
AMD Ryzen 7 1800X Eight-Core testing with a MSI X370 XPOWER GAMING TITANIUM (MS-7A31) v1.0 (1.G0 BIOS) and AMD Radeon RX Vega 8GB on LinuxMint 19.1 via the Phoronix Test Suite.
x86-64
Environment Notes: CXXFLAGS=-O3-march=x86-64 CFLAGS=-O3-march=x86-64
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
Python Notes: Python 2.7.15rc1 + Python 3.6.7
Security Notes: __user pointer sanitization + Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + SSB disabled via prctl and seccomp
znver1
Processor: 2 x AMD EPYC 7601 32-Core (64 Cores / 128 Threads), Motherboard: Dell 02MJ3T (1.2.5 BIOS), Chipset: AMD Family 17h, Memory: 16 x 32 GB DDR4-2400MT/s 36ASF4G72PZ-2G6D2, Disk: 20 x 500GB Samsung SSD 860 + 120GB SSDSCKJB120G7R, Graphics: Matrox G200eW3, Monitor: VE228, Network: 2 x Broadcom BCM57416 NetXtreme-E 10GBase-T RDMA + 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 18.04, Kernel: 5.0.0-050000rc6-generic (x86_64) 20190210, Desktop: GNOME Shell 3.28.3, Display Server: X Server, Compiler: GCC 7.3.0, File-System: ext4, Screen Resolution: 1600x1200
Environment Notes: CXXFLAGS=-O3-march=znver1 CFLAGS=-O3-march=znver1
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
Python Notes: Python 2.7.15rc1 + Python 3.6.7
Security Notes: __user pointer sanitization + Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + SSB disabled via prctl and seccomp
AMD Ryzen 7 1800X
Processor: AMD Ryzen 7 1800X Eight-Core @ 3.60GHz (8 Cores / 16 Threads), Motherboard: MSI X370 XPOWER GAMING TITANIUM (MS-7A31) v1.0 (1.G0 BIOS), Chipset: AMD Family 17h, Memory: 16384MB, Disk: 1024GB Samsung SSD 960 PRO 1TB + 256GB Samsung SSD 850 + 5001GB Seagate ST5000LM000-2AN1 + 3001GB Ext HDD 1021, Graphics: AMD Radeon RX Vega 8GB (1750/945MHz), Audio: AMD Device aaf8, Monitor: PHL BDM4065, Network: Intel I211
OS: LinuxMint 19.1, Kernel: 4.20.10-042010-generic (x86_64), Desktop: Cinnamon 4.0.9, Display Server: X Server 1.19.6, Display Driver: amdgpu 18.0.1, OpenGL: 4.5 Mesa 18.2.2 (LLVM 7.0.0), Compiler: GCC 7.3.0, File-System: ext4, Screen Resolution: 3840x2160
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15rc1 + Python 3.6.7
Security Notes: __user pointer sanitization + Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + SSB disabled via prctl and seccomp
t-test1
This is a test of t-test1 for basic memory allocator benchmarks. Note this test profile is currently very basic and the overall time does include the warmup time of the custom t-test1 compilation. Improvements welcome. Learn more via the OpenBenchmarking.org test page.
Sockperf
This is a network socket API performance benchmark. Learn more via the OpenBenchmarking.org test page.
GNU MPC
GNU MPC is a C library for the arithmetic of complex numbers. Learn more via the OpenBenchmarking.org test page.
CloverLeaf
CloverLeaf is a Lagrangian-Eulerian hydrodynamics benchmark. This test profile currently makes use of CloverLeaf's OpenMP version and benchmarked with the clover_bm8192.in input file. Learn more via the OpenBenchmarking.org test page.
NAMD
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.
lzbench
lzbench is an in-memory benchmark of various compressors. The file used for compression is a Linux kernel source tree tarball. Learn more via the OpenBenchmarking.org test page.
FFTE
FFTE is a package by Daisuke Takahashi to compute Discrete Fourier Transforms of 1-, 2- and 3- dimensional sequences of length (2^p)*(3^q)*(5^r). Learn more via the OpenBenchmarking.org test page.
Timed HMMer Search
This test searches through the Pfam database of profile hidden markov models. The search finds the domain structure of Drosophila Sevenless protein. Learn more via the OpenBenchmarking.org test page.
Timed MAFFT Alignment
This test performs an alignment of 100 pyruvate decarboxylase sequences. Learn more via the OpenBenchmarking.org test page.
BLAKE2
This is a benchmark of BLAKE2 using the blake2s binary. BLAKE2 is a high-performance crypto alternative to MD5 and SHA-2/3. Learn more via the OpenBenchmarking.org test page.
Fhourstones
This integer benchmark solves positions in the game of Connect-4, as played on a vertical 7x6 board. By default, it uses a 64Mb transposition table with the twobig replacement strategy. Positions are represented as 64-bit bitboards, and the hash function is computed using a single 64-bit modulo operation, giving 64-bit machines a slight edge. The alpha-beta searcher sorts moves dynamically based on the history heuristic. Learn more via the OpenBenchmarking.org test page.
CacheBench
This is a performance test of CacheBench, which is part of LLCbench. CacheBench is designed to test the memory and cache bandwidth performance Learn more via the OpenBenchmarking.org test page.
LuaJIT
This test profile is a collection of Lua scripts/benchmarks run against a locally-built copy of LuaJIT upstream. Learn more via the OpenBenchmarking.org test page.
SciMark
This test runs the ANSI C version of SciMark 2.0, which is a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. This test is made up of Fast Foruier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.
Botan
Botan is a cross-platform open-source C++ crypto library that supports most all publicly known cryptographic algorithms. Learn more via the OpenBenchmarking.org test page.
Crafty
This is a performance test of Crafty, an advanced open-source chess engine. Learn more via the OpenBenchmarking.org test page.
TSCP
This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.
John The Ripper
This is a benchmark of John The Ripper, which is a password cracker. Learn more via the OpenBenchmarking.org test page.
TTSIOD 3D Renderer
A portable GPL 3D software renderer that supports OpenMP and Intel Threading Building Blocks with many different rendering modes. This version does not use OpenGL but is entirely CPU/software based. Learn more via the OpenBenchmarking.org test page.
SVT-AV1
This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-AV1 CPU-based multi-threaded video encoder for the AV1 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
SVT-HEVC
This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC CPU-based multi-threaded video encoder for the HEVC / H.265 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
VP9 libvpx Encoding
This is a standard video encoding performance test of Google's libvpx library and the vpxenc command for the VP9/WebM format using a sample 1080p video. Learn more via the OpenBenchmarking.org test page.
x264
This is a simple test of the x264 encoder run on the CPU (OpenCL support disabled) with a sample video file. Learn more via the OpenBenchmarking.org test page.
x265
This is a simple test of the x265 encoder run on the CPU with a sample 1080p video file. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick
This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests to stress the system's CPU. Learn more via the OpenBenchmarking.org test page.
Himeno Benchmark
The Himeno benchmark is a linear solver of pressure Poisson using a point-Jacobi method. Learn more via the OpenBenchmarking.org test page.
7-Zip Compression
This is a test of 7-Zip using p7zip with its integrated benchmark feature or upstream 7-Zip for the Windows x64 build. Learn more via the OpenBenchmarking.org test page.
Timed Apache Compilation
This test times how long it takes to build the Apache HTTP Server. Learn more via the OpenBenchmarking.org test page.
Timed GCC Compilation
This test times how long it takes to build the GNU Compiler Collection (GCC). Learn more via the OpenBenchmarking.org test page.
Timed ImageMagick Compilation
This test times how long it takes to build ImageMagick. Learn more via the OpenBenchmarking.org test page.
Timed LLVM Compilation
This test times how long it takes to build the LLVM compiler. Learn more via the OpenBenchmarking.org test page.
Timed PHP Compilation
This test times how long it takes to build PHP 5 with the Zend engine. Learn more via the OpenBenchmarking.org test page.
C-Ray
This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.
Primesieve
Primesieve generates prime numbers using a highly optimized sieve of Eratosthenes implementation. Primesieve benchmarks the CPU's L1/L2 cache performance. Learn more via the OpenBenchmarking.org test page.
Smallpt
Smallpt is a C++ global illumination renderer written in less than 100 lines of code. Global illumination is done via unbiased Monte Carlo path tracing and there is multi-threading support via the OpenMP library. Learn more via the OpenBenchmarking.org test page.
AOBench
AOBench is a lightweight ambient occlusion renderer, written in C. The test profile is using a size of 2048 x 2048. Learn more via the OpenBenchmarking.org test page.
Bullet Physics Engine
This is a benchmark of the Bullet Physics Engine. Learn more via the OpenBenchmarking.org test page.
LZMA Compression
This test measures the time needed to compress a file using LZMA compression. Learn more via the OpenBenchmarking.org test page.
XZ Compression
This test measures the time needed to compress a sample file (an Ubuntu file-system image) using XZ compression. Learn more via the OpenBenchmarking.org test page.
Zstd Compression
This test measures the time needed to compress a sample file (an Ubuntu file-system image) using Zstd compression. Learn more via the OpenBenchmarking.org test page.
FLAC Audio Encoding
This test times how long it takes to encode a sample WAV file to FLAC format five times. Learn more via the OpenBenchmarking.org test page.
LAME MP3 Encoding
LAME is an MP3 encoder licensed under the LGPL. This test measures the time required to encode a WAV file to MP3 format. Learn more via the OpenBenchmarking.org test page.
Ogg Encoding
This test times how long it takes to encode a sample WAV file to Ogg format using vorbis-tools, libvorbis, and libogg. Learn more via the OpenBenchmarking.org test page.
m-queens
A solver for the N-queens problem with multi-threading support via the OpenMP library. Learn more via the OpenBenchmarking.org test page.
Mencoder
This test uses mplayer's mencoder utility and the libavcodec family for testing the system's audio/video encoding performance. Learn more via the OpenBenchmarking.org test page.
Tachyon
This is a test of the threaded Tachyon, a parallel ray-tracing system. Learn more via the OpenBenchmarking.org test page.
OpenSSL
OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test measures the RSA 4096-bit performance of OpenSSL. Learn more via the OpenBenchmarking.org test page.
Aircrack-ng
Aircrack-ng is a tool for assessing WiFi/WLAN network security. Learn more via the OpenBenchmarking.org test page.
libjpeg-turbo tjbench
tjbench is a JPEG decompression/compression benchmark part of libjpeg-turbo. Learn more via the OpenBenchmarking.org test page.
PostgreSQL pgbench
This is a simple benchmark of PostgreSQL using pgbench. Learn more via the OpenBenchmarking.org test page.
CppPerformanceBenchmarks
CppPerformanceBenchmarks is a set of C++ compiler performance benchmarks. Learn more via the OpenBenchmarking.org test page.
Redis
Redis is an open-source data structure server. Learn more via the OpenBenchmarking.org test page.
Sysbench
This is a benchmark of Sysbench with CPU and memory sub-tests. Learn more via the OpenBenchmarking.org test page.
Xsbench
XSBench is a mini-app representing a key computational kernel of the Monte Carlo neutronics application OpenMC. Learn more via the OpenBenchmarking.org test page.
NGINX Benchmark
This is a test of ab, which is the Apache Benchmark program running against nginx. This test profile measures how many requests per second a given system can sustain when carrying out 2,000,000 requests with 500 requests being carried out concurrently. Learn more via the OpenBenchmarking.org test page.
Apache Benchmark
This is a test of ab, which is the Apache benchmark program. This test profile measures how many requests per second a given system can sustain when carrying out 1,000,000 requests with 100 requests being carried out concurrently. Learn more via the OpenBenchmarking.org test page.
Apache Siege
This is a test of the Apache web server performance being facilitated by the Siege web serverb enchmark program. Learn more via the OpenBenchmarking.org test page.
x86-64
Environment Notes: CXXFLAGS=-O3-march=x86-64 CFLAGS=-O3-march=x86-64
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
Python Notes: Python 2.7.15rc1 + Python 3.6.7
Security Notes: __user pointer sanitization + Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + SSB disabled via prctl and seccomp
Testing initiated at 12 February 2019 06:27 by user root.
znver1
Processor: 2 x AMD EPYC 7601 32-Core (64 Cores / 128 Threads), Motherboard: Dell 02MJ3T (1.2.5 BIOS), Chipset: AMD Family 17h, Memory: 16 x 32 GB DDR4-2400MT/s 36ASF4G72PZ-2G6D2, Disk: 20 x 500GB Samsung SSD 860 + 120GB SSDSCKJB120G7R, Graphics: Matrox G200eW3, Monitor: VE228, Network: 2 x Broadcom BCM57416 NetXtreme-E 10GBase-T RDMA + 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 18.04, Kernel: 5.0.0-050000rc6-generic (x86_64) 20190210, Desktop: GNOME Shell 3.28.3, Display Server: X Server, Compiler: GCC 7.3.0, File-System: ext4, Screen Resolution: 1600x1200
Environment Notes: CXXFLAGS=-O3-march=znver1 CFLAGS=-O3-march=znver1
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
Python Notes: Python 2.7.15rc1 + Python 3.6.7
Security Notes: __user pointer sanitization + Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + SSB disabled via prctl and seccomp
Testing initiated at 12 February 2019 16:34 by user root.
AMD Ryzen 7 1800X
Processor: AMD Ryzen 7 1800X Eight-Core @ 3.60GHz (8 Cores / 16 Threads), Motherboard: MSI X370 XPOWER GAMING TITANIUM (MS-7A31) v1.0 (1.G0 BIOS), Chipset: AMD Family 17h, Memory: 16384MB, Disk: 1024GB Samsung SSD 960 PRO 1TB + 256GB Samsung SSD 850 + 5001GB Seagate ST5000LM000-2AN1 + 3001GB Ext HDD 1021, Graphics: AMD Radeon RX Vega 8GB (1750/945MHz), Audio: AMD Device aaf8, Monitor: PHL BDM4065, Network: Intel I211
OS: LinuxMint 19.1, Kernel: 4.20.10-042010-generic (x86_64), Desktop: Cinnamon 4.0.9, Display Server: X Server 1.19.6, Display Driver: amdgpu 18.0.1, OpenGL: 4.5 Mesa 18.2.2 (LLVM 7.0.0), Compiler: GCC 7.3.0, File-System: ext4, Screen Resolution: 3840x2160
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15rc1 + Python 3.6.7
Security Notes: __user pointer sanitization + Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + SSB disabled via prctl and seccomp
Testing initiated at 16 February 2019 11:46 by user mjb1.