AMD Ryzen 9 3900X 12-Core testing of GCC 9 and GCC 10 development builds with znver2 tuning, following the recent cost table updates and related changes. Benchmarks by Michael Larabel for a future article.
GCC 9.1.0
Environment Notes: CXXFLAGS=-O3 CFLAGS=-O3
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15+ + Python 3.6.8
Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
GCC 9.1.0 znver2
Processor: AMD Ryzen 9 3900X 12-Core @ 3.80GHz (12 Cores / 24 Threads), Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (0702 BIOS), Chipset: AMD Device 1480, Memory: 16384MB, Disk: 2000GB Force MP600, Graphics: Sapphire AMD Radeon RX 550 640SP / 560/560X 4GB (1300/1750MHz), Audio: AMD Device aae0, Monitor: ASUS VP28U, Network: Realtek Device 8125 + Intel I211 + Intel Device 2723
OS: Ubuntu 18.04, Kernel: 5.3.0-999-generic (x86_64) 20190725, Desktop: GNOME Shell 3.28.4, Display Server: X Server 1.20.4, Display Driver: modesetting 1.20.4, OpenGL: 4.5 Mesa 19.0.2 (LLVM 8.0.0), Compiler: GCC 9.1.0, File-System: ext4, Screen Resolution: 3840x2160
Environment Notes: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15+ + Python 3.6.8
Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
GCC 10.0.0
Environment Notes: CXXFLAGS=-O3 CFLAGS=-O3
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15+ + Python 3.6.8
Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
GCC 10.0.0 znver2
OS: Ubuntu 18.04, Kernel: 5.3.0-999-generic (x86_64) 20190725, Desktop: GNOME Shell 3.28.4, Display Server: X Server 1.20.4, Display Driver: modesetting 1.20.4, OpenGL: 4.5 Mesa 19.0.2 (LLVM 8.0.0), Compiler: GCC 10.0.0 20190727, File-System: ext4, Screen Resolution: 3840x2160
Environment Notes: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15+ + Python 3.6.8
Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
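The environment notes above differ only in whether -march=znver2 is appended to -O3. As a minimal sketch, assuming the flags are passed to a build through the CFLAGS and CXXFLAGS environment variables, the two variants could be prepared as follows. The helper name build_env is hypothetical and is not part of the Phoronix Test Suite.

```python
import os
from typing import Dict, Optional


def build_env(march: Optional[str] = None,
              base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with CFLAGS/CXXFLAGS set.

    march=None gives the generic "-O3" build; march="znver2" gives
    "-O3 -march=znver2", matching the environment notes above.
    build_env is a hypothetical helper used only for illustration.
    """
    env = dict(os.environ if base is None else base)
    flags = "-O3" if march is None else f"-O3 -march={march}"
    env["CFLAGS"] = flags
    env["CXXFLAGS"] = flags
    return env


generic = build_env(base={})
tuned = build_env("znver2", base={})
print("generic:", generic["CFLAGS"])
print("tuned:  ", tuned["CXXFLAGS"])
```

The returned dictionary can be handed to subprocess.run(..., env=env) so that the flags reach a build without changing the calling shell.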
SciMark This test runs the ANSI C version of SciMark 2.0, which is a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. This test is made up of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.
SciMark 2.0, Computational Test: Dense LU Matrix Factorization (OpenBenchmarking.org; Mflops, More Is Better)
GCC 9.1.0: 6891.37 (SE +/- 60.72, N = 3)
GCC 10.0.0: 8526.66 (SE +/- 19.91, N = 3)
GCC 10.0.0 znver2: 10777.88 (SE +/- 12.24, N = 3; -march=znver2)
GCC 9.1.0 znver2: 11370.27 (SE +/- 15.98, N = 3; -march=znver2)
1. (CC) gcc options: -O3 -lm
SciMark 2.0, Computational Test: Composite (OpenBenchmarking.org; Mflops, More Is Better)
GCC 9.1.0: 2768.16 (SE +/- 25.64, N = 3)
GCC 10.0.0: 3127.49 (SE +/- 5.96, N = 3)
GCC 10.0.0 znver2: 3553.67 (SE +/- 13.97, N = 3; -march=znver2)
GCC 9.1.0 znver2: 3686.60 (SE +/- 5.91, N = 3; -march=znver2)
1. (CC) gcc options: -O3 -lm
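To put the composite scores above into relative terms, the following sketch computes the percentage change that -march=znver2 brings for each compiler on this more-is-better metric. The scores are copied from the composite result above; the function name percent_gain is illustrative only.

```python
def percent_gain(baseline: float, tuned: float) -> float:
    """Relative change of `tuned` over `baseline` for a more-is-better metric."""
    return (tuned / baseline - 1.0) * 100.0


# SciMark 2.0 composite scores (Mflops) taken from the result above.
composite = {
    "GCC 9.1.0": 2768.16,
    "GCC 9.1.0 znver2": 3686.60,
    "GCC 10.0.0": 3127.49,
    "GCC 10.0.0 znver2": 3553.67,
}

for compiler in ("GCC 9.1.0", "GCC 10.0.0"):
    gain = percent_gain(composite[compiler], composite[compiler + " znver2"])
    print(f"{compiler}: -march=znver2 changes the composite by {gain:+.1f}%")
```

On these figures the tuned GCC 9.1.0 build gains roughly a third over its generic build, while the GCC 10.0.0 snapshot gains less because its generic build already starts higher.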
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 32 (OpenBenchmarking.org; Mflops, More Is Better)
GCC 9.1.0: 11909
GCC 10.0.0: 12902
GCC 10.0.0 znver2: 14119 (-march=znver2)
GCC 9.1.0 znver2: 14314 (-march=znver2)
SE values shown (not matched to individual runs in this extract): SE +/- 155.95, N = 3; SE +/- 2.19, N = 3; SE +/- 141.66, N = 3
1. (CC) gcc options: -pthread -O3 -lm
FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 512 (OpenBenchmarking.org; Mflops, More Is Better)
GCC 9.1.0: 9028.10 (SE +/- 30.08, N = 3)
GCC 10.0.0: 9583.73 (SE +/- 19.17, N = 3)
GCC 10.0.0 znver2: 10531.00 (SE +/- 10.67, N = 3; -march=znver2)
GCC 9.1.0 znver2: 10814.00 (SE +/- 148.34, N = 4; -march=znver2)
1. (CC) gcc options: -pthread -O3 -lm
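Each result is accompanied by "SE +/- x, N = n", the standard error over n runs. As a minimal sketch, assuming the usual definition (sample standard deviation divided by the square root of the run count), the figure can be reproduced as below. How the Phoronix Test Suite computes it internally is not stated on this page, and the run values are hypothetical, not taken from the results.

```python
import math
import statistics
from typing import Sequence


def standard_error(samples: Sequence[float]) -> float:
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical Mflops readings from three runs (not from the results above).
runs = [9000.0, 9030.0, 9060.0]
mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} Mflops, SE +/- {se:.2f}, N = {len(runs)}")
```

A difference between two configurations that is smaller than their standard errors, as in several of the results on this page, is hard to distinguish from run-to-run noise.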
FFTW 3.3.6, Build: Stock - Size: 1D FFT Size 32 (OpenBenchmarking.org; Mflops, More Is Better)
GCC 9.1.0 znver2: 11828 (SE +/- 15.90, N = 3; -march=znver2)
GCC 10.0.0: 12748 (SE +/- 110.06, N = 3)
GCC 9.1.0: 12958 (SE +/- 1.76, N = 3)
GCC 10.0.0 znver2: 14113 (SE +/- 5.51, N = 3; -march=znver2)
1. (CC) gcc options: -pthread -O3 -lm
AOM AV1 This is a simple test of the AOMedia AV1 encoder run on the CPU with a sample video file. Learn more via the OpenBenchmarking.org test page.
AOM AV1 2019-02-11, AV1 Video Encoding (OpenBenchmarking.org; Frames Per Second, More Is Better)
GCC 9.1.0: 0.27 (SE +/- 0.00, N = 3)
GCC 10.0.0: 0.27 (SE +/- 0.00, N = 3)
GCC 9.1.0 znver2: 0.31 (SE +/- 0.00, N = 3; -march=znver2)
GCC 10.0.0 znver2: 0.32 (SE +/- 0.00, N = 3; -march=znver2)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
SciMark This test runs the ANSI C version of SciMark 2.0, which is a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. This test is made up of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.
SciMark 2.0, Computational Test: Fast Fourier Transform (OpenBenchmarking.org; Mflops, More Is Better)
GCC 10.0.0 znver2: 261.10 (SE +/- 0.24, N = 3; -march=znver2)
GCC 9.1.0 znver2: 273.49 (SE +/- 0.21, N = 3; -march=znver2)
GCC 9.1.0: 295.18 (SE +/- 2.85, N = 3)
GCC 10.0.0: 301.17 (SE +/- 0.54, N = 3)
1. (CC) gcc options: -O3 -lm
SciMark 2.0, Computational Test: Jacobi Successive Over-Relaxation (OpenBenchmarking.org; Mflops, More Is Better)
GCC 9.1.0: 2125.26 (SE +/- 20.16, N = 3)
GCC 10.0.0: 2175.85 (SE +/- 0.16, N = 3)
GCC 10.0.0 znver2: 2293.46 (SE +/- 0.89, N = 3; -march=znver2)
GCC 9.1.0 znver2: 2408.26 (SE +/- 0.53, N = 3; -march=znver2)
1. (CC) gcc options: -O3 -lm
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 4096 (OpenBenchmarking.org; Mflops, More Is Better)
GCC 9.1.0: 7063.03 (SE +/- 73.85, N = 3)
GCC 10.0.0: 7071.30 (SE +/- 40.30, N = 3)
GCC 10.0.0 znver2: 7823.27 (SE +/- 95.02, N = 3; -march=znver2)
GCC 9.1.0 znver2: 7920.17 (SE +/- 67.62, N = 3; -march=znver2)
1. (CC) gcc options: -pthread -O3 -lm
C-Ray This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.
C-Ray 1.1, Total Time - 4K, 16 Rays Per Pixel (OpenBenchmarking.org; Seconds, Fewer Is Better)
GCC 9.1.0: 43.09 (SE +/- 0.02, N = 3)
GCC 10.0.0: 42.63 (SE +/- 0.06, N = 3)
GCC 9.1.0 znver2: 39.42 (SE +/- 0.04, N = 3; -march=znver2)
GCC 10.0.0 znver2: 39.36 (SE +/- 0.05, N = 3; -march=znver2)
1. (CC) gcc options: -lm -lpthread -O3
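For fewer-is-better results such as the C-Ray times above, the comparison runs the other way: a lower number is an improvement. The sketch below expresses the effect of -march=znver2 both as time saved and as a speedup factor, using the C-Ray figures from the result above. The function names are illustrative only.

```python
def percent_time_saved(baseline_s: float, tuned_s: float) -> float:
    """Share of the baseline run time removed by the tuned build, in percent."""
    return (1.0 - tuned_s / baseline_s) * 100.0


def speedup(baseline_s: float, tuned_s: float) -> float:
    """How many times faster the tuned build finishes than the baseline."""
    return baseline_s / tuned_s


# C-Ray 1.1 total times in seconds, from the result above:
# (generic -O3 build, -O3 -march=znver2 build)
c_ray = {
    "GCC 9.1.0": (43.09, 39.42),
    "GCC 10.0.0": (42.63, 39.36),
}

for compiler, (generic_s, znver2_s) in c_ray.items():
    saved = percent_time_saved(generic_s, znver2_s)
    factor = speedup(generic_s, znver2_s)
    print(f"{compiler}: {saved:.1f}% less time, {factor:.3f}x speedup")
```

Keeping the two directions in separate helpers avoids accidentally reading a longer run time as a gain.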
AOBench AOBench is a lightweight ambient occlusion renderer, written in C. The test profile is using a size of 2048 x 2048. Learn more via the OpenBenchmarking.org test page.
AOBench, Size: 2048 x 2048 - Total Time (OpenBenchmarking.org; Seconds, Fewer Is Better)
GCC 10.0.0: 35.98 (SE +/- 0.06, N = 3)
GCC 9.1.0: 34.60 (SE +/- 0.05, N = 3)
GCC 9.1.0 znver2: 33.20 (SE +/- 0.10, N = 3; -march=znver2)
GCC 10.0.0 znver2: 33.05 (SE +/- 0.02, N = 3; -march=znver2)
1. (CC) gcc options: -lm -O3
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests to stress the system's CPU. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.30, Operation: Sharpen (OpenBenchmarking.org; Iterations Per Minute, More Is Better)
GCC 9.1.0: 181
GCC 10.0.0: 181
GCC 9.1.0 znver2: 195 (-march=znver2)
GCC 10.0.0 znver2: 196 (-march=znver2)
SE values shown (not matched to individual runs in this extract): SE +/- 0.33, N = 3
1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
HPC Challenge HPC Challenge (HPCC) is a cluster-focused benchmark consisting of the HPL Linpack TPP benchmark, DGEMM, STREAM, PTRANS, RandomAccess, FFT, and communication bandwidth and latency. This HPC Challenge test profile attempts to ship with standard yet versatile configuration/input files though they can be modified. Learn more via the OpenBenchmarking.org test page.
HPC Challenge 1.5.0, Test / Class: G-Ptrans (OpenBenchmarking.org; GB/s, More Is Better)
GCC 10.0.0: 2.72974 (SE +/- 0.00151, N = 3)
GCC 9.1.0: 2.73255 (SE +/- 0.00047, N = 3)
GCC 10.0.0 znver2: 2.94730 (SE +/- 0.00082, N = 3; -march=znver2)
GCC 9.1.0 znver2: 2.95225 (SE +/- 0.00095, N = 3; -march=znver2)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
2. OpenBLAS + Open MPI 2.1.1
lzbench lzbench is an in-memory benchmark of various compressors. The file used for compression is a Linux kernel source tree tarball. Learn more via the OpenBenchmarking.org test page.
lzbench 2017-08-08, Test: XZ 0 - Process: Compression (OpenBenchmarking.org; MB/s, More Is Better)
GCC 10.0.0 znver2: 37
GCC 9.1.0: 39
GCC 9.1.0 znver2: 40
GCC 10.0.0: 40
SE values shown (not matched to individual runs in this extract): SE +/- 0.33, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
TSCP This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.
TSCP 1.81, AI Chess Performance (OpenBenchmarking.org; Nodes Per Second, More Is Better)
GCC 9.1.0: 1305781 (SE +/- 620.00, N = 5)
GCC 9.1.0 znver2: 1337188 (SE +/- 10688.23, N = 5; -march=znver2)
GCC 10.0.0: 1366017 (SE +/- 676.60, N = 5)
GCC 10.0.0 znver2: 1408752 (SE +/- 6261.48, N = 5; -march=znver2)
1. (CC) gcc options: -O3 -march=native
SciMark This test runs the ANSI C version of SciMark 2.0, which is a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. This test is made up of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.
SciMark 2.0, Computational Test: Sparse Matrix Multiply (OpenBenchmarking.org; Mflops, More Is Better)
GCC 9.1.0 znver2: 3580.73 (SE +/- 13.73, N = 3; -march=znver2)
GCC 10.0.0 znver2: 3675.94 (SE +/- 58.43, N = 3; -march=znver2)
GCC 9.1.0: 3767.63 (SE +/- 37.65, N = 3)
GCC 10.0.0: 3856.63 (SE +/- 15.78, N = 3)
1. (CC) gcc options: -O3 -lm
lzbench lzbench is an in-memory benchmark of various compressors. The file used for compression is a Linux kernel source tree tarball. Learn more via the OpenBenchmarking.org test page.
lzbench 2017-08-08, Test: Libdeflate 1 - Process: Compression (OpenBenchmarking.org; MB/s, More Is Better)
GCC 9.1.0: 239
GCC 10.0.0 znver2: 248
GCC 10.0.0: 250
GCC 9.1.0 znver2: 257
SE values shown (not matched to individual runs in this extract): SE +/- 1.86, N = 3; SE +/- 0.67, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
SVT-VP9 This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-VP9 CPU-based multi-threaded video encoder for the VP9 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
SVT-VP9 2019-02-17, 1080p 8-bit YUV To VP9 Video Encode (OpenBenchmarking.org; Frames Per Second, More Is Better)
GCC 10.0.0: 89.84 (SE +/- 0.15, N = 3)
GCC 9.1.0: 89.99 (SE +/- 0.28, N = 3)
GCC 10.0.0 znver2: 92.35 (SE +/- 0.19, N = 3; -march=znver2)
GCC 9.1.0 znver2: 96.54 (SE +/- 0.08, N = 3; -march=znver2)
1. (CC) gcc options: -O3 -fPIE -fPIC -O2 -flto -fvisibility=hidden -mavx -pie -rdynamic -lpthread -lrt -lm
lzbench lzbench is an in-memory benchmark of various compressors. The file used for compression is a Linux kernel source tree tarball. Learn more via the OpenBenchmarking.org test page.
lzbench 2017-08-08, Test: XZ 0 - Process: Decompression (OpenBenchmarking.org; MB/s, More Is Better)
GCC 10.0.0 znver2: 108
GCC 10.0.0: 113
GCC 9.1.0: 116
GCC 9.1.0 znver2: 116
SE values shown (not matched to individual runs in this extract): SE +/- 0.33, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
LAME MP3 Encoding LAME is an MP3 encoder licensed under the LGPL. This test measures the time required to encode a WAV file to MP3 format. Learn more via the OpenBenchmarking.org test page.
LAME MP3 Encoding 3.100, WAV To MP3 (OpenBenchmarking.org; Seconds, Fewer Is Better)
GCC 10.0.0 znver2: 7.45 (SE +/- 0.01, N = 3; -march=znver2)
GCC 10.0.0: 7.28 (SE +/- 0.02, N = 3)
GCC 9.1.0: 7.25 (SE +/- 0.04, N = 3)
GCC 9.1.0 znver2: 6.94 (SE +/- 0.01, N = 3; -march=znver2)
1. (CC) gcc options: -O3 -lncurses -lm
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests to stress the system's CPU. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.30, Operation: Enhanced (OpenBenchmarking.org; Iterations Per Minute, More Is Better)
GCC 10.0.0: 208
GCC 9.1.0: 209
GCC 9.1.0 znver2: 221 (-march=znver2)
GCC 10.0.0 znver2: 223 (-march=znver2)
SE values shown (not matched to individual runs in this extract): SE +/- 1.20, N = 3
1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
Ogg Encoding This test times how long it takes to encode a sample WAV file to Ogg format using vorbis-tools, libvorbis, and libogg. Learn more via the OpenBenchmarking.org test page.
Ogg Encoding 1.3.3, WAV To Ogg (OpenBenchmarking.org; Seconds, Fewer Is Better)
GCC 10.0.0 znver2: 5.36 (SE +/- 0.00, N = 4; -march=znver2)
GCC 9.1.0: 5.13 (SE +/- 0.00, N = 3)
GCC 10.0.0: 5.05 (SE +/- 0.00, N = 3)
GCC 9.1.0 znver2: 5.05 (SE +/- 0.01, N = 3; -march=znver2)
1. (CC) gcc options: -O2 -ffast-math -fsigned-char -O3 -logg
Redis Redis is an open-source data structure server. Learn more via the OpenBenchmarking.org test page.
Redis 4.0.8, Test: SET (OpenBenchmarking.org; Requests Per Second, More Is Better)
GCC 10.0.0: 2051361.33 (SE +/- 28123.08, N = 3)
GCC 10.0.0 znver2: 2084989.88 (SE +/- 14796.01, N = 3)
GCC 9.1.0: 2122162.94 (SE +/- 30290.32, N = 4)
GCC 9.1.0 znver2: 2169531.00 (SE +/- 19021.82, N = 3)
1. (CC) gcc options: -ggdb -rdynamic -lm -ldl -pthread
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests to stress the system's CPU. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.30, Operation: Rotate (OpenBenchmarking.org; Iterations Per Minute, More Is Better)
GCC 9.1.0: 262
GCC 10.0.0: 262
GCC 9.1.0 znver2: 263 (-march=znver2)
GCC 10.0.0 znver2: 277 (-march=znver2)
SE values shown (not matched to individual runs in this extract): SE +/- 4.33, N = 3; SE +/- 1.20, N = 3; SE +/- 1.86, N = 3
1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
lzbench lzbench is an in-memory benchmark of various compressors. The file used for compression is a Linux kernel source tree tarball. Learn more via the OpenBenchmarking.org test page.
lzbench 2017-08-08, Test: Libdeflate 1 - Process: Decompression (OpenBenchmarking.org; MB/s, More Is Better)
GCC 9.1.0: 1119
GCC 10.0.0: 1147
GCC 10.0.0 znver2: 1159
GCC 9.1.0 znver2: 1183
SE values shown (not matched to individual runs in this extract): SE +/- 0.33, N = 3; SE +/- 10.00, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
lzbench 2017-08-08, Test: Brotli 0 - Process: Compression (OpenBenchmarking.org; MB/s, More Is Better)
GCC 9.1.0: 494 (SE +/- 0.67, N = 3)
GCC 10.0.0 znver2: 499 (SE +/- 4.47, N = 11)
GCC 10.0.0: 507 (SE +/- 0.88, N = 3)
GCC 9.1.0 znver2: 515 (SE +/- 4.10, N = 3)
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
SciMark This test runs the ANSI C version of SciMark 2.0, which is a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. This test is made up of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.
SciMark 2.0, Computational Test: Monte Carlo (OpenBenchmarking.org; Mflops, More Is Better)
GCC 10.0.0 znver2: 759.97 (SE +/- 0.29, N = 3; -march=znver2)
GCC 9.1.0: 761.38 (SE +/- 7.16, N = 3)
GCC 10.0.0: 777.17 (SE +/- 0.24, N = 3)
GCC 9.1.0 znver2: 800.23 (SE +/- 0.74, N = 3; -march=znver2)
1. (CC) gcc options: -O3 -lm
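The five SciMark kernel results on this page can be checked against the composite score. The sketch below assumes the composite is the plain arithmetic mean of the five kernels, which is consistent with the figures reported here. All numbers are copied from the SciMark results on this page; the dictionary layout is only for illustration.

```python
# Per-kernel SciMark 2.0 scores (Mflops) in the order:
# FFT, Jacobi SOR, Monte Carlo, Sparse Matrix Multiply, Dense LU.
kernels = {
    "GCC 9.1.0": [295.18, 2125.26, 761.38, 3767.63, 6891.37],
    "GCC 10.0.0": [301.17, 2175.85, 777.17, 3856.63, 8526.66],
    "GCC 10.0.0 znver2": [261.10, 2293.46, 759.97, 3675.94, 10777.88],
    "GCC 9.1.0 znver2": [273.49, 2408.26, 800.23, 3580.73, 11370.27],
}

# Composite scores as reported on this page.
reported = {
    "GCC 9.1.0": 2768.16,
    "GCC 10.0.0": 3127.49,
    "GCC 10.0.0 znver2": 3553.67,
    "GCC 9.1.0 znver2": 3686.60,
}

for name, scores in kernels.items():
    mean = sum(scores) / len(scores)
    delta = mean - reported[name]
    print(f"{name}: mean of kernels {mean:.2f}, reported {reported[name]:.2f}, "
          f"difference {delta:+.3f}")
```

Because the composite is an unweighted mean, the Dense LU kernel, with by far the largest absolute scores, dominates it; the znver2 builds lead the composite even though they trail on the FFT and Sparse Matrix Multiply kernels.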
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests to stress the system's CPU. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.30, Operation: HWB Color Space (OpenBenchmarking.org; Iterations Per Minute, More Is Better)
GCC 9.1.0: 287 (SE +/- 2.60, N = 3)
GCC 10.0.0: 288 (SE +/- 2.19, N = 3)
GCC 9.1.0 znver2: 293 (SE +/- 2.19, N = 3; -march=znver2)
GCC 10.0.0 znver2: 302 (SE +/- 0.33, N = 3; -march=znver2)
1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
GraphicsMagick 1.3.30, Operation: Swirl (OpenBenchmarking.org; Iterations Per Minute, More Is Better)
GCC 9.1.0: 251
GCC 10.0.0: 254
GCC 9.1.0 znver2: 259 (-march=znver2)
GCC 10.0.0 znver2: 264 (-march=znver2)
SE values shown (not matched to individual runs in this extract): SE +/- 1.86, N = 3; SE +/- 0.88, N = 3; SE +/- 1.20, N = 3
1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
MKL-DNN This is a test of the Intel MKL-DNN as the Intel Math Kernel Library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Learn more via the OpenBenchmarking.org test page.
MKL-DNN 2019-04-16, Harness: IP Batch All - Data Type: f32 (OpenBenchmarking.org; ms, Fewer Is Better)
GCC 9.1.0 znver2: 1599.68 (SE +/- 17.21, N = 3; MIN: 1393.73; -march=znver2)
GCC 10.0.0: 1582.78 (SE +/- 5.99, N = 3; MIN: 1385.56)
GCC 10.0.0 znver2: 1556.91 (SE +/- 25.50, N = 3; MIN: 1368.2; -march=znver2)
GCC 9.1.0: 1523.52 (SE +/- 7.48, N = 3; MIN: 1357.02)
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl
Himeno Benchmark The Himeno benchmark is a linear solver of pressure Poisson using a point-Jacobi method. Learn more via the OpenBenchmarking.org test page.
Himeno Benchmark 3.0, Poisson Pressure Solver (OpenBenchmarking.org; MFLOPS, More Is Better)
GCC 9.1.0: 1322.90 (SE +/- 11.21, N = 3)
GCC 9.1.0 znver2: 1378.46 (SE +/- 6.19, N = 3; -march=znver2)
GCC 10.0.0: 1385.23 (SE +/- 2.93, N = 3)
GCC 10.0.0 znver2: 1385.88 (SE +/- 0.48, N = 3; -march=znver2)
1. (CC) gcc options: -O3 -mavx2
MKL-DNN This is a test of the Intel MKL-DNN as the Intel Math Kernel Library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Learn more via the OpenBenchmarking.org test page.
MKL-DNN 2019-04-16, Harness: Deconvolution Batch deconv_all - Data Type: f32 (OpenBenchmarking.org; ms, Fewer Is Better)
GCC 9.1.0 znver2: 52238.80 (SE +/- 668.40, N = 3; MIN: 49224.9; -march=znver2)
GCC 9.1.0: 51813.05 (SE +/- 589.20, N = 6; MIN: 48543.1)
GCC 10.0.0 znver2: 50679.53 (SE +/- 390.75, N = 3; MIN: 48056.6; -march=znver2)
GCC 10.0.0: 50039.13 (SE +/- 224.88, N = 3; MIN: 46883.1)
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests to stress the system's CPU. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.30, Operation: Resizing (OpenBenchmarking.org; Iterations Per Minute, More Is Better)
GCC 10.0.0: 274 (SE +/- 2.65, N = 3)
GCC 9.1.0: 275 (SE +/- 2.19, N = 3)
GCC 9.1.0 znver2: 280 (SE +/- 1.15, N = 3; -march=znver2)
GCC 10.0.0 znver2: 286 (SE +/- 1.53, N = 3; -march=znver2)
1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
Smallpt Smallpt is a C++ global illumination renderer written in less than 100 lines of code. Global illumination is done via unbiased Monte Carlo path tracing and there is multi-threading support via the OpenMP library. Learn more via the OpenBenchmarking.org test page.
Smallpt 1.0, Global Illumination Renderer; 128 Samples (OpenBenchmarking.org; Seconds, Fewer Is Better)
GCC 10.0.0: 7.84 (SE +/- 0.01, N = 3)
GCC 9.1.0: 7.78 (SE +/- 0.00, N = 3)
GCC 9.1.0 znver2: 7.67 (SE +/- 0.01, N = 3; -march=znver2)
GCC 10.0.0 znver2: 7.53 (SE +/- 0.01, N = 3; -march=znver2)
1. (CXX) g++ options: -fopenmp -O3
Sockperf This is a network socket API performance benchmark. Learn more via the OpenBenchmarking.org test page.
Sockperf 3.4, Test: Latency Ping Pong (OpenBenchmarking.org; usec, Fewer Is Better)
GCC 9.1.0 znver2: 3.15 (SE +/- 0.04, N = 6; -march=znver2)
GCC 9.1.0: 3.12 (SE +/- 0.04, N = 5)
GCC 10.0.0 znver2: 3.04 (SE +/- 0.02, N = 25; -march=znver2)
GCC 10.0.0: 3.03 (SE +/- 0.02, N = 25)
1. (CXX) g++ options: --param -O3 -rdynamic -ldl -lpthread
MKL-DNN This is a test of the Intel MKL-DNN as the Intel Math Kernel Library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Learn more via the OpenBenchmarking.org test page.
MKL-DNN 2019-04-16, Harness: Deconvolution Batch deconv_1d - Data Type: f32 (OpenBenchmarking.org; ms, Fewer Is Better)
GCC 9.1.0: 221.23 (SE +/- 2.00, N = 15; MIN: 202.07)
GCC 10.0.0 znver2: 218.66 (SE +/- 1.85, N = 15; MIN: 203.42; -march=znver2)
GCC 9.1.0 znver2: 217.02 (SE +/- 1.79, N = 13; MIN: 203.65; -march=znver2)
GCC 10.0.0: 212.83 (SE +/- 0.29, N = 3; MIN: 201.7)
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl
MKL-DNN 2019-04-16, Harness: Deconvolution Batch deconv_3d - Data Type: f32 (OpenBenchmarking.org; ms, Fewer Is Better)
GCC 9.1.0: 59.00 (SE +/- 0.66, N = 7; MIN: 50.8)
GCC 10.0.0: 58.16 (SE +/- 0.69, N = 15; MIN: 50.91)
GCC 9.1.0 znver2: 57.97 (SE +/- 0.49, N = 15; MIN: 51.57; -march=znver2)
GCC 10.0.0 znver2: 56.87 (SE +/- 0.58, N = 8; MIN: 50.96; -march=znver2)
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl
lzbench lzbench is an in-memory benchmark of various compressors. The file used for compression is a Linux kernel source tree tarball. Learn more via the OpenBenchmarking.org test page.
lzbench 2017-08-08, Test: Zstd 1 - Process: Compression (OpenBenchmarking.org; MB/s, More Is Better)
GCC 10.0.0 znver2: 453
GCC 10.0.0: 467
GCC 9.1.0: 468
GCC 9.1.0 znver2: 468
SE values shown (not matched to individual runs in this extract): SE +/- 0.33, N = 3; SE +/- 3.18, N = 3; SE +/- 4.91, N = 8
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
HPC Challenge HPC Challenge (HPCC) is a cluster-focused benchmark consisting of the HPL Linpack TPP benchmark, DGEMM, STREAM, PTRANS, RandomAccess, FFT, and communication bandwidth and latency. This HPC Challenge test profile attempts to ship with standard yet versatile configuration/input files though they can be modified. Learn more via the OpenBenchmarking.org test page.
HPC Challenge 1.5.0, Test / Class: Random Ring Bandwidth (OpenBenchmarking.org; GB/s, More Is Better)
GCC 9.1.0: 4.89161 (SE +/- 0.02698, N = 3)
GCC 10.0.0: 4.94947 (SE +/- 0.05697, N = 3)
GCC 9.1.0 znver2: 4.98832 (SE +/- 0.07571, N = 3; -march=znver2)
GCC 10.0.0 znver2: 5.04603 (SE +/- 0.04322, N = 3; -march=znver2)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
2. OpenBLAS + Open MPI 2.1.1
FFTW FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
FFTW 3.3.6, Build: Float + SSE - Size: 2D FFT Size 32 (OpenBenchmarking.org; Mflops, More Is Better)
GCC 9.1.0 znver2: 44951 (SE +/- 105.51, N = 3; -march=znver2)
GCC 9.1.0: 45253 (SE +/- 663.38, N = 4)
GCC 10.0.0: 45361 (SE +/- 54.85, N = 3)
GCC 10.0.0 znver2: 46305 (SE +/- 28.47, N = 3; -march=znver2)
1. (CC) gcc options: -pthread -O3 -lm
lzbench lzbench is an in-memory benchmark of various compressors. The file used for compression is a Linux kernel source tree tarball. Learn more via the OpenBenchmarking.org test page.
lzbench 2017-08-08, Test: Zstd 1 - Process: Decompression (OpenBenchmarking.org; MB/s, More Is Better)
GCC 10.0.0 znver2: 1250 (SE +/- 0.33, N = 3)
GCC 9.1.0 znver2: 1268 (SE +/- 12.79, N = 8)
GCC 9.1.0: 1269 (SE +/- 9.50, N = 3)
GCC 10.0.0: 1287 (SE +/- 0.58, N = 3)
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3
Sockperf This is a network socket API performance benchmark. Learn more via the OpenBenchmarking.org test page.
Sockperf 3.4, Test: Throughput (OpenBenchmarking.org; Messages Per Second, More Is Better)
GCC 9.1.0: 514551 (SE +/- 4767.11, N = 5)
GCC 10.0.0: 514748 (SE +/- 5409.10, N = 5)
GCC 9.1.0 znver2: 517095 (SE +/- 4175.03, N = 5; -march=znver2)
GCC 10.0.0 znver2: 529657 (SE +/- 3715.76, N = 18; -march=znver2)
1. (CXX) g++ options: --param -O3 -rdynamic -ldl -lpthread
GNU MPC GNU MPC is a C library for the arithmetic of complex numbers. Learn more via the OpenBenchmarking.org test page.
GNU MPC 1.1.0, Multi-Precision Benchmark (OpenBenchmarking.org; Global Score, More Is Better)
GCC 10.0.0 znver2: 9357 (SE +/- 102.03, N = 3; -march=znver2)
GCC 9.1.0 znver2: 9577 (SE +/- 50.44, N = 3; -march=znver2)
GCC 10.0.0: 9580 (SE +/- 26.46, N = 3)
GCC 9.1.0: 9597 (SE +/- 31.80, N = 3)
1. (CC) gcc options: -lm -O3 -MT -MD -MP -MF
HPC Challenge HPC Challenge (HPCC) is a cluster-focused benchmark consisting of the HPL Linpack TPP benchmark, DGEMM, STREAM, PTRANS, RandomAccess, FFT, and communication bandwidth and latency. This HPC Challenge test profile attempts to ship with standard yet versatile configuration/input files though they can be modified. Learn more via the OpenBenchmarking.org test page.
HPC Challenge 1.5.0, Test / Class: G-Ffte (OpenBenchmarking.org; GFLOP/s, More Is Better)
GCC 9.1.0: 8.59803 (SE +/- 0.02013, N = 3)
GCC 9.1.0 znver2: 8.60514 (SE +/- 0.06300, N = 3; -march=znver2)
GCC 10.0.0 znver2: 8.63794 (SE +/- 0.02559, N = 3; -march=znver2)
GCC 10.0.0: 8.81748 (SE +/- 0.18198, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
2. OpenBLAS + Open MPI 2.1.1
Coremark This is a test of EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.
Coremark 1.0, CoreMark Size 666 - Iterations Per Second (OpenBenchmarking.org; Iterations/Sec, More Is Better)
GCC 9.1.0 znver2: 555154.60 (SE +/- 2761.64, N = 3; -march=znver2)
GCC 10.0.0 znver2: 567096.65 (SE +/- 1036.74, N = 3; -march=znver2)
GCC 9.1.0: 567987.34 (SE +/- 1430.19, N = 3)
GCC 10.0.0: 568329.00 (SE +/- 1210.22, N = 3)
1. (CC) gcc options: -O2 -O3 -lrt" -lrt
GROMACS The Gromacs molecular dynamics package testing on the CPU with the water_GMX50 data. Learn more via the OpenBenchmarking.org test page.
GROMACS 2018.3, Water Benchmark (OpenBenchmarking.org; Ns Per Day, More Is Better)
GCC 10.0.0: 0.97 (SE +/- 0.00, N = 3)
GCC 9.1.0: 0.98 (SE +/- 0.00, N = 3)
GCC 10.0.0 znver2: 0.98 (SE +/- 0.00, N = 3; -march=znver2)
GCC 9.1.0 znver2: 0.99 (SE +/- 0.00, N = 3; -march=znver2)
1. (CXX) g++ options: -march=core-avx2 -O3 -std=c++11 -funroll-all-loops -fopenmp -lrt -lpthread -lm
HPC Challenge HPC Challenge (HPCC) is a cluster-focused benchmark consisting of the HPL Linpack TPP benchmark, DGEMM, STREAM, PTRANS, RandomAccess, FFT, and communication bandwidth and latency. This HPC Challenge test profile attempts to ship with standard yet versatile configuration/input files though they can be modified. Learn more via the OpenBenchmarking.org test page.
HPC Challenge 1.5.0, Test / Class: Random Ring Latency (OpenBenchmarking.org; usecs, Fewer Is Better)
GCC 10.0.0: 0.33186 (SE +/- 0.00125, N = 3)
GCC 9.1.0 znver2: 0.32698 (SE +/- 0.00042, N = 3; -march=znver2)
GCC 9.1.0: 0.32596 (SE +/- 0.00047, N = 3)
GCC 10.0.0 znver2: 0.32521 (SE +/- 0.00071, N = 3; -march=znver2)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
2. OpenBLAS + Open MPI 2.1.1
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests to stress the system's CPU. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.30, Operation: Noise-Gaussian (OpenBenchmarking.org; Iterations Per Minute, More Is Better)
GCC 9.1.0: 170
GCC 10.0.0: 170
GCC 9.1.0 znver2: 171 (-march=znver2)
GCC 10.0.0 znver2: 173 (-march=znver2)
SE values shown (not matched to individual runs in this extract): SE +/- 0.67, N = 3; SE +/- 0.33, N = 3; SE +/- 0.58, N = 3
1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
HPC Challenge HPC Challenge (HPCC) is a cluster-focused benchmark consisting of the HPL Linpack TPP benchmark, DGEMM, STREAM, PTRANS, RandomAccess, FFT, and communication bandwidth and latency. This HPC Challenge test profile attempts to ship with standard yet versatile configuration/input files though they can be modified. Learn more via the OpenBenchmarking.org test page.
HPC Challenge 1.5.0, Test / Class: Max Ping Pong Bandwidth (OpenBenchmarking.org; MB/s, More Is Better)
GCC 9.1.0 znver2: 23832.61 (SE +/- 119.42, N = 3; -march=znver2)
GCC 10.0.0: 23885.44 (SE +/- 62.37, N = 3)
GCC 10.0.0 znver2: 23993.04 (SE +/- 195.64, N = 3; -march=znver2)
GCC 9.1.0: 24227.25 (SE +/- 159.70, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
2. OpenBLAS + Open MPI 2.1.1
MKL-DNN This is a test of the Intel MKL-DNN as the Intel Math Kernel Library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Learn more via the OpenBenchmarking.org test page.
MKL-DNN 2019-04-16, Harness: Convolution Batch conv_3d - Data Type: f32 (OpenBenchmarking.org; ms, Fewer Is Better)
GCC 9.1.0 znver2: 118.47 (SE +/- 0.45, N = 3; MIN: 103.47; -march=znver2)
GCC 10.0.0 znver2: 118.02 (SE +/- 1.48, N = 4; MIN: 102.11; -march=znver2)
GCC 9.1.0: 117.60 (SE +/- 0.16, N = 3; MIN: 103.13)
GCC 10.0.0: 116.62 (SE +/- 0.79, N = 3; MIN: 102.39)
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl
FFmpeg This test uses FFmpeg for testing the system's audio/video encoding performance. Learn more via the OpenBenchmarking.org test page.
FFmpeg 4.0.2, H.264 HD To NTSC DV (OpenBenchmarking.org; Seconds, Fewer Is Better)
  GCC 10.0.0: 6.88 (SE +/- 0.01, N = 3)
  GCC 9.1.0: 6.86 (SE +/- 0.01, N = 3)
  GCC 9.1.0 znver2: 6.83 (SE +/- 0.03, N = 3; -march=znver2)
  GCC 10.0.0 znver2: 6.78 (SE +/- 0.02, N = 3; -march=znver2)
  1. (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lXv -lX11 -lXext -lm -lxcb -lxcb-shape -lxcb-xfixes -lasound -lSDL2 -lsndio -pthread -lbz2 -llzma -O3 -std=c11 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -MMD -MF -MT
MKL-DNN 2019-04-16, Harness: Convolution Batch conv_alexnet - Data Type: f32 (OpenBenchmarking.org; ms, Fewer Is Better)
  GCC 10.0.0 znver2: 2543.93 (SE +/- 17.20, N = 3; MIN: 2467.76; -march=znver2)
  GCC 9.1.0: 2527.50 (SE +/- 9.57, N = 3; MIN: 2462.11)
  GCC 9.1.0 znver2: 2520.01 (SE +/- 9.61, N = 3; MIN: 2467.07; -march=znver2)
  GCC 10.0.0: 2507.16 (SE +/- 6.13, N = 3; MIN: 2461.57)
  1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl
HPC Challenge 1.5.0, Test / Class: EP-STREAM Triad (OpenBenchmarking.org; GB/s, More Is Better)
  GCC 9.1.0: 1.70820 (SE +/- 0.00015, N = 3)
  GCC 9.1.0 znver2: 1.71668 (SE +/- 0.00081, N = 3; -march=znver2)
  GCC 10.0.0: 1.72205 (SE +/- 0.00098, N = 3)
  GCC 10.0.0 znver2: 1.73055 (SE +/- 0.00091, N = 3; -march=znver2)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
  2. OpenBLAS + Open MPI 2.1.1
Apache Benchmark This is a test of ab, which is the Apache benchmark program. This test profile measures how many requests per second a given system can sustain when carrying out 1,000,000 requests with 100 requests being carried out concurrently. Learn more via the OpenBenchmarking.org test page.
Apache Benchmark 2.4.29, Static Web Page Serving (OpenBenchmarking.org; Requests Per Second, More Is Better)
  GCC 10.0.0 znver2: 38009.25 (SE +/- 79.10, N = 3; -march=znver2)
  GCC 9.1.0 znver2: 38022.79 (SE +/- 139.15, N = 3; -march=znver2)
  GCC 9.1.0: 38392.29 (SE +/- 57.64, N = 3)
  GCC 10.0.0: 38490.98 (SE +/- 65.39, N = 3)
  1. (CC) gcc options: -shared -fPIC -pthread -O3
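The Apache Benchmark description gives the load shape (1,000,000 requests, 100 concurrent), which corresponds to ab's `-n` and `-c` options. The following is only a sketch of how such a command line could be assembled; the helper name `ab_command` and the target URL are hypothetical, and the test profile's actual invocation is not shown in these results.

```python
import shlex


def ab_command(url, requests, concurrency):
    """Build an argv list for ApacheBench (ab).

    -n is the total number of requests, -c the number issued concurrently.
    """
    return ["ab", "-n", str(requests), "-c", str(concurrency), url]


# Hypothetical target URL: the URL used by the test profile is not given here.
apache_argv = ab_command("http://localhost:8080/", 1_000_000, 100)

# Print the command instead of running it, so the sketch has no side effects.
print(shlex.join(apache_argv))
```

The same helper would cover the NGINX Benchmark description by passing 2,000,000 requests and a concurrency of 500.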
SVT-HEVC This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC CPU-based multi-threaded video encoder for the HEVC / H.265 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
SVT-HEVC 2019-02-03, 1080p 8-bit YUV To HEVC Video Encode (OpenBenchmarking.org; Frames Per Second, More Is Better)
  GCC 9.1.0: 246.01 (SE +/- 3.72, N = 3)
  GCC 9.1.0 znver2: 247.33 (SE +/- 0.72, N = 3; -march=znver2)
  GCC 10.0.0 znver2: 247.99 (SE +/- 1.78, N = 3; -march=znver2)
  GCC 10.0.0: 248.85 (SE +/- 1.73, N = 3)
  1. (CC) gcc options: -O3 -fPIE -fPIC -O2 -flto -fvisibility=hidden -march=native -pie -rdynamic -lpthread -lrt
x265 This is a simple test of the x265 encoder run on the CPU with a sample 1080p video file. Learn more via the OpenBenchmarking.org test page.
x265 3.0, H.265 1080p Video Encoding (OpenBenchmarking.org; Frames Per Second, More Is Better)
  GCC 10.0.0 znver2: 52.40 (SE +/- 0.19, N = 3; -march=znver2)
  GCC 9.1.0 znver2: 52.53 (SE +/- 0.06, N = 3; -march=znver2)
  GCC 9.1.0: 52.94 (SE +/- 0.28, N = 3)
  GCC 10.0.0: 53.00 (SE +/- 0.20, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
HPC Challenge 1.5.0, Test / Class: G-HPL (OpenBenchmarking.org; GFLOPS, More Is Better)
  GCC 9.1.0: 70.97 (SE +/- 0.23, N = 3)
  GCC 10.0.0 znver2: 71.05 (SE +/- 0.22, N = 3; -march=znver2)
  GCC 10.0.0: 71.07 (SE +/- 0.08, N = 3)
  GCC 9.1.0 znver2: 71.78 (SE +/- 0.37, N = 3; -march=znver2)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
  2. OpenBLAS + Open MPI 2.1.1
x264 This is a simple test of the x264 encoder run on the CPU (OpenCL support disabled) with a sample video file. Learn more via the OpenBenchmarking.org test page.
x264 2018-09-25, H.264 Video Encoding (OpenBenchmarking.org; Frames Per Second, More Is Better)
  GCC 9.1.0 znver2: 138.41 (SE +/- 2.03, N = 3; -march=znver2)
  GCC 10.0.0: 138.74 (SE +/- 2.27, N = 3)
  GCC 9.1.0: 139.59 (SE +/- 1.55, N = 7)
  GCC 10.0.0 znver2: 139.82 (SE +/- 2.09, N = 4; -march=znver2)
  1. (CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test measures the RSA 4096-bit performance of OpenSSL. Learn more via the OpenBenchmarking.org test page.
OpenSSL 1.1.1, RSA 4096-bit Performance (OpenBenchmarking.org; Signs Per Second, More Is Better)
  GCC 9.1.0 znver2: 3481.50 (SE +/- 1.42, N = 3; -march=znver2)
  GCC 10.0.0 znver2: 3487.10 (SE +/- 0.70, N = 3; -march=znver2)
  GCC 10.0.0: 3492.53 (SE +/- 1.89, N = 3)
  GCC 9.1.0: 3516.27 (SE +/- 7.07, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
NGINX Benchmark This is a test of ab, which is the Apache Benchmark program running against nginx. This test profile measures how many requests per second a given system can sustain when carrying out 2,000,000 requests with 500 requests being carried out concurrently. Learn more via the OpenBenchmarking.org test page.
NGINX Benchmark 1.9.9, Static Web Page Serving (OpenBenchmarking.org; Requests Per Second, More Is Better)
  GCC 10.0.0 znver2: 39346.91 (SE +/- 23.74, N = 3; -march=znver2)
  GCC 10.0.0: 39525.70 (SE +/- 158.42, N = 3)
  GCC 9.1.0 znver2: 39602.49 (SE +/- 112.05, N = 3; -march=znver2)
  GCC 9.1.0: 39734.85 (SE +/- 102.83, N = 3)
  1. (CC) gcc options: -lpthread -lcrypt -lcrypto -lz -O3 -march=native
MKL-DNN 2019-04-16, Harness: Convolution Batch conv_all - Data Type: f32 (OpenBenchmarking.org; ms, Fewer Is Better)
  GCC 10.0.0: 19803.57 (SE +/- 87.03, N = 3; MIN: 19014.9)
  GCC 9.1.0 znver2: 19696.57 (SE +/- 22.35, N = 3; MIN: 19033.5; -march=znver2)
  GCC 10.0.0 znver2: 19694.33 (SE +/- 41.11, N = 3; MIN: 18995.6; -march=znver2)
  GCC 9.1.0: 19613.70 (SE +/- 42.61, N = 3; MIN: 18961.5)
  1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl
PostgreSQL pgbench This is a simple benchmark of PostgreSQL using pgbench. Learn more via the OpenBenchmarking.org test page.
PostgreSQL pgbench 10.3, Scaling: Buffer Test - Test: Normal Load - Mode: Read Only (OpenBenchmarking.org; TPS, More Is Better)
  GCC 9.1.0 znver2: 297539.89 (SE +/- 235.79, N = 3; -march=znver2)
  GCC 10.0.0 znver2: 298969.75 (SE +/- 237.85, N = 3; -march=znver2)
  GCC 10.0.0: 300244.81 (SE +/- 102.78, N = 3)
  GCC 9.1.0: 300353.09 (SE +/- 513.53, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
Stockfish This is a test of Stockfish, an advanced C++11 chess benchmark that can scale up to 128 CPU cores. Learn more via the OpenBenchmarking.org test page.
Stockfish 9, Total Time (OpenBenchmarking.org; Nodes Per Second, More Is Better)
  GCC 9.1.0: 39278964 (SE +/- 210046.69, N = 3)
  GCC 10.0.0 znver2: 39540328 (SE +/- 131167.27, N = 3; -march=znver2)
  GCC 9.1.0 znver2: 39561655 (SE +/- 164232.11, N = 3; -march=znver2)
  GCC 10.0.0: 39631993 (SE +/- 237875.03, N = 3)
  1. (CXX) g++ options: -m64 -lpthread -O3 -fno-exceptions -std=c++11 -pedantic -msse -msse3 -mpopcnt -flto
HPC Challenge 1.5.0, Test / Class: EP-DGEMM (OpenBenchmarking.org; GFLOPS, More Is Better)
  GCC 9.1.0 znver2: 32.60 (SE +/- 0.42, N = 3; -march=znver2)
  GCC 9.1.0: 32.83 (SE +/- 0.19, N = 3)
  GCC 10.0.0 znver2: 32.84 (SE +/- 0.22, N = 3; -march=znver2)
  GCC 10.0.0: 32.86 (SE +/- 0.11, N = 3)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
  2. OpenBLAS + Open MPI 2.1.1
PostgreSQL pgbench 10.3, Scaling: Buffer Test - Test: Normal Load - Mode: Read Write (OpenBenchmarking.org; TPS, More Is Better)
  GCC 10.0.0 znver2: 29148.60 (SE +/- 124.84, N = 3; -march=znver2)
  GCC 9.1.0 znver2: 29149.20 (SE +/- 40.41, N = 3; -march=znver2)
  GCC 9.1.0: 29178.23 (SE +/- 55.36, N = 3)
  GCC 10.0.0: 29372.39 (SE +/- 31.16, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
MKL-DNN 2019-04-16, Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32 (OpenBenchmarking.org; ms, Fewer Is Better)
  GCC 9.1.0 znver2: 1153.46 (SE +/- 6.23, N = 3; MIN: 1057.54; -march=znver2)
  GCC 9.1.0: 1147.62 (SE +/- 6.51, N = 3; MIN: 1052.13)
  GCC 10.0.0 znver2: 1145.95 (SE +/- 5.61, N = 3; MIN: 1052.71; -march=znver2)
  GCC 10.0.0: 1145.01 (SE +/- 6.39, N = 3; MIN: 1050.58)
  1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl
XZ Compression This test measures the time needed to compress a sample file (an Ubuntu file-system image) using XZ compression. Learn more via the OpenBenchmarking.org test page.
XZ Compression 5.2.4, Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9 (OpenBenchmarking.org; Seconds, Fewer Is Better)
  GCC 10.0.0 znver2: 25.39 (SE +/- 0.11, N = 3; -march=znver2)
  GCC 10.0.0: 25.26 (SE +/- 0.12, N = 3)
  GCC 9.1.0 znver2: 25.25 (SE +/- 0.13, N = 3; -march=znver2)
  GCC 9.1.0: 25.23 (SE +/- 0.10, N = 3)
  1. (CC) gcc options: -pthread -fvisibility=hidden -O3
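For readers who want a feel for what "time to compress at level 9" measures, here is a minimal sketch using Python's standard `lzma` module with `preset=9`. It is not the test profile: the profile runs the xz program built by each compiler on an Ubuntu image, while this sketch compresses a small synthetic buffer (an assumption made purely for illustration), so its timings are not comparable with the results above.

```python
import lzma
import time


def time_xz_level9(data):
    """Compress data in the .xz container at preset 9 and time the call."""
    start = time.perf_counter()
    compressed = lzma.compress(data, format=lzma.FORMAT_XZ, preset=9)
    elapsed = time.perf_counter() - start
    return compressed, elapsed


# Synthetic 1 MiB input; the real test compresses an Ubuntu file-system image.
sample = bytes(range(256)) * 4096
compressed, seconds = time_xz_level9(sample)
print(f"{len(sample)} -> {len(compressed)} bytes in {seconds:.3f} s")
```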
SVT-AV1 This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-AV1 CPU-based multi-threaded video encoder for the AV1 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.
SVT-AV1 0.5, 1080p 8-bit YUV To AV1 Video Encode (OpenBenchmarking.org; Frames Per Second, More Is Better)
  GCC 10.0.0: 46.22 (SE +/- 0.27, N = 3)
  GCC 9.1.0 znver2: 46.39 (SE +/- 0.19, N = 3; -march=znver2)
  GCC 9.1.0: 46.45 (SE +/- 0.15, N = 3)
  GCC 10.0.0 znver2: 46.49 (SE +/- 0.13, N = 3; -march=znver2)
  1. (CXX) g++ options: -O3 -pie -lpthread -lm
HPC Challenge 1.5.0, Test / Class: G-Random Access (OpenBenchmarking.org; GUP/s, More Is Better)
  GCC 9.1.0: 0.09757 (SE +/- 0.00036, N = 3)
  GCC 10.0.0 znver2: 0.09771 (SE +/- 0.00044, N = 3; -march=znver2)
  GCC 10.0.0: 0.09778 (SE +/- 0.00041, N = 3)
  GCC 9.1.0 znver2: 0.09798 (SE +/- 0.00042, N = 3; -march=znver2)
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
  2. OpenBLAS + Open MPI 2.1.1
m-queens A solver for the N-queens problem with multi-threading support via the OpenMP library. Learn more via the OpenBenchmarking.org test page.
m-queens 1.2, Time To Solve (OpenBenchmarking.org; Seconds, Fewer Is Better)
  GCC 9.1.0 znver2: 47.27 (SE +/- 0.08, N = 3; -march=znver2)
  GCC 10.0.0 znver2: 47.21 (SE +/- 0.10, N = 3; -march=znver2)
  GCC 10.0.0: 47.14 (SE +/- 0.11, N = 3)
  GCC 9.1.0: 47.12 (SE +/- 0.12, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -O2 -march=native
Apache Siege This is a test of Apache web server performance, driven by the Siege web server benchmark program. Learn more via the OpenBenchmarking.org test page.
Apache Siege 2.4.29, Concurrent Users: 250 (OpenBenchmarking.org; Transactions Per Second, More Is Better)
  GCC 10.0.0: 62725.24 (SE +/- 122.71, N = 3)
  GCC 9.1.0 znver2: 96842.13 (SE +/- 4063.46, N = 12; -march=znver2)
  GCC 9.1.0: 98050.91 (SE +/- 3755.13, N = 15)
  GCC 10.0.0 znver2: 102423.07 (SE +/- 1636.75, N = 12; -march=znver2)
  1. (CC) gcc options: -O3 -lpthread -ldl -lssl -lcrypto
Apache Siege 2.4.29, Concurrent Users: 200 (OpenBenchmarking.org; Transactions Per Second, More Is Better)
  GCC 9.1.0: 60835.79 (SE +/- 798.56, N = 3)
  GCC 10.0.0: 82293.14 (SE +/- 3302.37, N = 12)
  GCC 10.0.0 znver2: 83275.06 (SE +/- 1288.23, N = 15; -march=znver2)
  GCC 9.1.0 znver2: 99824.49 (SE +/- 3575.15, N = 15; -march=znver2)
  1. (CC) gcc options: -O3 -lpthread -ldl -lssl -lcrypto
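Most charts on this page differ by about 1%, while the Apache Siege results are spread much more widely. One rough way to judge whether a spread stands out from run-to-run noise is to compare it with the largest standard error reported for that chart. The sketch below does that for the 200-user Apache Siege and x264 figures listed above; the helper name `spread_vs_noise` is hypothetical, and the ratio is an informal heuristic rather than a significance test.

```python
def spread_vs_noise(results, standard_errors):
    """Return (percent_spread, spread_over_max_se) for one chart.

    percent_spread: (highest - lowest) as a percentage of the lowest result.
    spread_over_max_se: the same gap divided by the largest reported SE.
    """
    low, high = min(results), max(results)
    spread = high - low
    return 100.0 * spread / low, spread / max(standard_errors)


# Apache Siege, 200 concurrent users (Transactions Per Second).
siege_200 = spread_vs_noise(
    [60835.79, 82293.14, 83275.06, 99824.49],
    [798.56, 3302.37, 1288.23, 3575.15],
)

# x264 H.264 video encoding (Frames Per Second).
x264 = spread_vs_noise(
    [138.41, 138.74, 139.59, 139.82],
    [2.03, 2.27, 1.55, 2.09],
)

for name, (pct, ratio) in (("Apache Siege 200", siege_200), ("x264", x264)):
    print(f"{name}: spread {pct:.1f}% of lowest result, {ratio:.1f}x the largest SE")
```

A ratio below 1, as for x264, means the gap between the best and worst configuration is smaller than the noisiest configuration's standard error.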
Redis Redis is an open-source data structure server. Learn more via the OpenBenchmarking.org test page.
Redis 4.0.8, Test: GET (OpenBenchmarking.org; Requests Per Second, More Is Better)
  GCC 10.0.0 znver2: 3031706.22 (SE +/- 51486.64, N = 15)
  GCC 10.0.0: 3042507.47 (SE +/- 47460.73, N = 15)
  GCC 9.1.0 znver2: 3066070.28 (SE +/- 61029.58, N = 15)
  GCC 9.1.0: 3297713.33 (SE +/- 40781.06, N = 3)
  1. (CC) gcc options: -ggdb -rdynamic -lm -ldl -pthread
MKL-DNN 2019-04-16, Harness: IP Batch 1D - Data Type: f32 (OpenBenchmarking.org; ms, Fewer Is Better)
  GCC 10.0.0 znver2: 157.72 (SE +/- 3.21, N = 14; MIN: 127; -march=znver2)
  GCC 9.1.0 znver2: 155.34 (SE +/- 2.27, N = 15; MIN: 127.99; -march=znver2)
  GCC 10.0.0: 154.09 (SE +/- 3.18, N = 12; MIN: 129)
  GCC 9.1.0: 152.51 (SE +/- 3.35, N = 15; MIN: 111.42)
  1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl
GCC 9.1.0
Environment Notes: CXXFLAGS=-O3 CFLAGS=-O3
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15+ + Python 3.6.8
Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
Testing initiated at 27 July 2019 08:54 by user phoronix.
GCC 9.1.0 znver2 Processor: AMD Ryzen 9 3900X 12-Core @ 3.80GHz (12 Cores / 24 Threads), Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (0702 BIOS), Chipset: AMD Device 1480, Memory: 16384MB, Disk: 2000GB Force MP600, Graphics: Sapphire AMD Radeon RX 550 640SP / 560/560X 4GB (1300/1750MHz), Audio: AMD Device aae0, Monitor: ASUS VP28U, Network: Realtek Device 8125 + Intel I211 + Intel Device 2723
OS: Ubuntu 18.04, Kernel: 5.3.0-999-generic (x86_64) 20190725, Desktop: GNOME Shell 3.28.4, Display Server: X Server 1.20.4, Display Driver: modesetting 1.20.4, OpenGL: 4.5 Mesa 19.0.2 (LLVM 8.0.0), Compiler: GCC 9.1.0, File-System: ext4, Screen Resolution: 3840x2160
Environment Notes: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15+ + Python 3.6.8
Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
Testing initiated at 27 July 2019 19:17 by user phoronix.
GCC 10.0.0
Environment Notes: CXXFLAGS=-O3 CFLAGS=-O3
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15+ + Python 3.6.8
Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
Testing initiated at 28 July 2019 18:18 by user phoronix.
GCC 10.0.0 znver2 Processor: AMD Ryzen 9 3900X 12-Core @ 3.80GHz (12 Cores / 24 Threads), Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (0702 BIOS), Chipset: AMD Device 1480, Memory: 16384MB, Disk: 2000GB Force MP600, Graphics: Sapphire AMD Radeon RX 550 640SP / 560/560X 4GB (1300/1750MHz), Audio: AMD Device aae0, Monitor: ASUS VP28U, Network: Realtek Device 8125 + Intel I211 + Intel Device 2723
OS: Ubuntu 18.04, Kernel: 5.3.0-999-generic (x86_64) 20190725, Desktop: GNOME Shell 3.28.4, Display Server: X Server 1.20.4, Display Driver: modesetting 1.20.4, OpenGL: 4.5 Mesa 19.0.2 (LLVM 8.0.0), Compiler: GCC 10.0.0 20190727, File-System: ext4, Screen Resolution: 3840x2160
Environment Notes: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq ondemand
Python Notes: Python 2.7.15+ + Python 3.6.8
Security Notes: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
Testing initiated at 28 July 2019 06:10 by user phoronix.