Ryzen 9 3900X Znver2 Compiler Tuning
AMD Ryzen 9 3900X 12-Core testing of GCC 9 and GCC 10 development with Znver2 tuning following recent cost table updates, etc. Benchmarks by Michael Larabel for a future article.
HTML result view exported from: https://openbenchmarking.org/result/1907290-HV-RYZEN939034&rdt&grw
Processor: AMD Ryzen 9 3900X 12-Core @ 3.80GHz (12 Cores / 24 Threads)
Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (0702 BIOS)
Chipset: AMD Device 1480
Memory: 16384MB
Disk: 2000GB Force MP600
Graphics: Sapphire AMD Radeon RX 550 640SP / 560/560X 4GB (1300/1750MHz)
Audio: AMD Device aae0
Monitor: ASUS VP28U
Network: Realtek Device 8125 + Intel I211 + Intel Device 2723
OS: Ubuntu 18.04
Kernel: 5.3.0-999-generic (x86_64) 20190725
Desktop: GNOME Shell 3.28.4
Display Server: X Server 1.20.4
Display Driver: modesetting 1.20.4
OpenGL: 4.5 Mesa 19.0.2 (LLVM 8.0.0)
Compiler: GCC 9.1.0; GCC 10.0.0 20190727 for the GCC 10.0.0 runs
File-System: ext4
Screen Resolution: 3840x2160

Environment Details
- GCC 9.1.0: CXXFLAGS=-O3 CFLAGS=-O3
- GCC 9.1.0 znver2: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
- GCC 10.0.0 znver2: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
- GCC 10.0.0: CXXFLAGS=-O3 CFLAGS=-O3
Compiler Details
- --disable-multilib --enable-checking=release
Processor Details
- Scaling Governor: acpi-cpufreq ondemand
Python Details
- Python 2.7.15+ + Python 3.6.8
Security Details
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: always-on RSB filling
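The four run configurations differ only in compiler version and in whether -march=znver2 is appended to -O3. A minimal Python sketch of those flag sets as environment overrides follows. The `CONFIG_FLAGS` mapping and the `build_env` helper are hypothetical names used for illustration, not part of the Phoronix Test Suite, and selecting the GCC 9 or GCC 10 binary is assumed to happen elsewhere.

```python
import os

# Flag sets copied from the Environment Details.
# The mapping layout and helper name are illustrative assumptions.
CONFIG_FLAGS = {
    "GCC 9.1.0": "-O3",
    "GCC 9.1.0 znver2": "-O3 -march=znver2",
    "GCC 10.0.0 znver2": "-O3 -march=znver2",
    "GCC 10.0.0": "-O3",
}


def build_env(config_name, base_env=None):
    """Return a copy of the environment with CFLAGS/CXXFLAGS set for a run."""
    env = dict(os.environ if base_env is None else base_env)
    flags = CONFIG_FLAGS[config_name]
    env["CFLAGS"] = flags
    env["CXXFLAGS"] = flags
    return env


if __name__ == "__main__":
    for name in CONFIG_FLAGS:
        env = build_env(name, base_env={})
        print(f"{name}: CFLAGS={env['CFLAGS']!r} CXXFLAGS={env['CXXFLAGS']!r}")
```

The returned dictionary could be passed as the `env` argument of `subprocess.run` when launching a build.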
Test | GCC 9.1.0 | GCC 9.1.0 znver2 | GCC 10.0.0 znver2 | GCC 10.0.0
bullet: Raytests | 1.98 | 2.04 | 2.06 | 2.11
mpcbench: Multi-Precision Benchmark | 9597 | 9577 | 9357 | 9580
bullet: 3000 Fall | 3.20 | 3.22 | 3.27 | 3.44
bullet: 1000 Stack | 3.88 | 3.77 | 3.85 | 4.11
bullet: 1000 Convex | 3.51 | 3.57 | 3.60 | 3.73
bullet: 136 Ragdolls | 2.05 | 2.04 | 2.05 | 2.20
bullet: Prim Trimesh | 0.75 | 0.77 | 0.77 | 0.81
bullet: Convex Trimesh | 0.89 | 0.90 | 0.91 | 0.95
tscp: AI Chess Performance | 1305781 | 1337188 | 1408752 | 1366017
scimark2: Composite | 2768.16 | 3686.60 | 3553.67 | 3127.49
scimark2: Monte Carlo | 761.38 | 800.23 | 759.97 | 777.17
scimark2: Fast Fourier Transform | 295.18 | 273.49 | 261.10 | 301.17
scimark2: Sparse Matrix Multiply | 3767.63 | 3580.73 | 3675.94 | 3856.63
scimark2: Dense LU Matrix Factorization | 6891.37 | 11370.27 | 10777.88 | 8526.66
scimark2: Jacobi Successive Over-Relaxation | 2125.26 | 2408.26 | 2293.46 | 2175.85
compress-xz: Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9 | 25.23 | 25.25 | 25.39 | 25.26
cpp-perf-bench: Atol | 59.31 | 60.38 | 63.34 | 59.97
cpp-perf-bench: Ctype | 31.52 | 31.43 | 31.30 | 31.51
cpp-perf-bench: Math Library | 309.36 | 302.82 | 306.02 | 307.23
cpp-perf-bench: Rand Numbers | 751.15 | 750.66 | 787.77 | 799.88
cpp-perf-bench: Stepanov Vector | 76.45 | 74.08 | 77.22 | 74.26
cpp-perf-bench: Function Objects | 14.40 | 14.15 | 14.90 | 15.10
cpp-perf-bench: Stepanov Abstraction | 27.60 | 28.93 | 28.30 | 28.19
mkl-dnn: Deconvolution Batch deconv_3d - f32 | 59.00 | 57.97 | 56.87 | 58.16
sockperf: Throughput | 514551 | 517095 | 529657 | 514748
sockperf: Latency Ping Pong | 3.12 | 3.15 | 3.04 | 3.03
lzbench: XZ 0 - Compression | 39 | 40 | 37 | 40
mkl-dnn: IP Batch All - f32 | 1523.52 | 1599.68 | 1556.91 | 1582.78
lzbench: XZ 0 - Decompression | 116 | 116 | 108 | 113
mkl-dnn: Convolution Batch conv_alexnet - f32 | 2527.50 | 2520.01 | 2543.93 | 2507.16
mkl-dnn: Deconvolution Batch deconv_all - f32 | 51813.05 | 52238.80 | 50679.53 | 50039.13
mkl-dnn: Deconvolution Batch deconv_1d - f32 | 221.23 | 217.02 | 218.66 | 212.83
lzbench: Zstd 1 - Compression | 468 | 468 | 453 | 467
mkl-dnn: Convolution Batch conv_3d - f32 | 117.60 | 118.47 | 118.02 | 116.62
mkl-dnn: IP Batch 1D - f32 | 152.51 | 155.34 | 157.72 | 154.09
lzbench: Zstd 1 - Decompression | 1269 | 1268 | 1250 | 1287
lzbench: Brotli 0 - Compression | 494 | 515 | 505 | 522
lzbench: Libdeflate 1 - Compression | 239 | 257 | 248 | 250
lzbench: Libdeflate 1 - Decompression | 1119 | 1183 | 1159 | 1147
encode-flac: WAV To FLAC | 7.70 | 7.99 | 8.11 | 7.72
mkl-dnn: Convolution Batch conv_googlenet_v3 - f32 | 1147.62 | 1153.46 | 1145.95 | 1145.01
mkl-dnn: Convolution Batch conv_all - f32 | 19613.70 | 19696.57 | 19694.33 | 19803.57
encode-mp3: WAV To MP3 | 7.25 | 6.94 | 7.45 | 7.28
tjbench: Decompression Throughput | 218.09 | 225.64 | 225.44 | 220.33
encode-ogg: WAV To Ogg | 5.13 | 5.05 | 5.36 | 5.05
fftw: Stock - 1D FFT Size 32 | 12958 | 11828 | 14113 | 12748
fftw: Stock - 2D FFT Size 32 | 11909 | 14314 | 14119 | 12902
fftw: Stock - 2D FFT Size 512 | 9028.10 | 10814 | 10531 | 9583.73
fftw: Stock - 2D FFT Size 4096 | 7063.03 | 7920.17 | 7823.27 | 7071.30
fftw: Float + SSE - 2D FFT Size 32 | 45253 | 44951 | 46305 | 45361
himeno: Poisson Pressure Solver | 1322.90 | 1378.46 | 1385.88 | 1385.23
hpcc: G-HPL | 70.96900 | 71.77663 | 71.04930 | 71.07010
hpcc: G-Ffte | 8.59803 | 8.60514 | 8.63794 | 8.81748
hpcc: G-Ffte | 8.59803 | 8.60514 | 8.63794 | 8.81748
hpcc: EP-DGEMM | 32.83083 | 32.60263 | 32.84363 | 32.86343
hpcc: G-Ptrans | 2.73255 | 2.95225 | 2.94730 | 2.72974
hpcc: EP-STREAM Triad | 1.70820 | 1.71668 | 1.73055 | 1.72205
hpcc: G-Rand Access | 0.09757 | 0.09798 | 0.09771 | 0.09778
hpcc: Rand Ring Latency | 0.32596 | 0.32698 | 0.32521 | 0.33186
hpcc: Rand Ring Bandwidth | 4.89161 | 4.98832 | 5.04603 | 4.94947
hpcc: Max Ping Pong Bandwidth | 24227.253 | 23832.608 | 23993.043 | 23885.438
gromacs: Water Benchmark | 0.98 | 0.99 | 0.98 | 0.97
hpcg: | 1.09 | 1.08 | 1.08 | 1.09
coremark: CoreMark Size 666 - Iterations Per Second | 567987.34 | 555154.60 | 567096.65 | 568329.00
stockfish: Total Time | 39278964 | 39561655 | 39540328 | 39631993
john-the-ripper: Blowfish | 20335 | 20253 | 20426 | 20426
build-llvm: Time To Compile | 280.27 | 284.10 | 300.31 | 292.53
build-php: Time To Compile | 52.71 | 53.91 | 53.76 | 54.43
m-queens: Time To Solve | 47.12 | 47.27 | 47.21 | 47.14
cpuminer-opt: m7m | 593.89 | 590.66 | 590.80 | 591.32
cpuminer-opt: deep | 11190 | 10230.34 | 11123 | 11137
cpuminer-opt: lbry | 34583 | 34420 | 34630 | 35288
cpuminer-opt: skein | 39397 | 39797 | 39843 | 39720
cpuminer-opt: myr-gr | 14127 | 14023 | 14137 | 14130
cpuminer-opt: sha256t | 87951 | 87238 | 86440 | 86417
aom-av1: AV1 Video Encoding | 0.27 | 0.31 | 0.32 | 0.27
aobench: 2048 x 2048 - Total Time | 34.60 | 33.20 | 33.05 | 35.98
graphics-magick: Swirl | 251 | 259 | 264 | 254
graphics-magick: Rotate | 262 | 263 | 277 | 262
graphics-magick: Sharpen | 181 | 195 | 196 | 181
graphics-magick: Enhanced | 209 | 221 | 223 | 208
graphics-magick: Resizing | 275 | 280 | 286 | 274
graphics-magick: Noise-Gaussian | 170 | 171 | 173 | 170
graphics-magick: HWB Color Space | 287 | 293 | 302 | 288
svt-vp9: 1080p 8-bit YUV To VP9 Video Encode | 89.99 | 96.54 | 92.35 | 89.84
x264: H.264 Video Encoding | 139.59 | 138.41 | 139.82 | 138.74
svt-av1: 1080p 8-bit YUV To AV1 Video Encode | 46.45 | 46.39 | 46.49 | 46.22
x265: H.265 1080p Video Encoding | 52.94 | 52.53 | 52.40 | 53.00
c-ray: Total Time - 4K, 16 Rays Per Pixel | 43.09 | 39.42 | 39.36 | 42.63
svt-hevc: 1080p 8-bit YUV To HEVC Video Encode | 246.01 | 247.33 | 247.99 | 248.85
ffmpeg: H.264 HD To NTSC DV | 6.86 | 6.83 | 6.78 | 6.88
smallpt: Global Illumination Renderer; 128 Samples | 7.78 | 7.67 | 7.53 | 7.84
mcperf: Get | 92376.40 | 93850.59 | 97228.27 | 95710.60
mcperf: Set | 52914.10 | 59232.07 | 52910.87 | 57193.25
nginx: Static Web Page Serving | 39734.85 | 39602.49 | 39346.91 | 39525.70
apache: Static Web Page Serving | 38392.29 | 38022.79 | 38009.25 | 38490.98
openssl: RSA 4096-bit Performance | 3516.27 | 3481.50 | 3487.10 | 3492.53
apache-siege: 200 | 60835.79 | 99824.49 | 83275.06 | 82293.14
apache-siege: 250 | 98050.91 | 96842.13 | 102423.07 | 62725.24
redis: GET | 3297713.33 | 3066070.28 | 3031706.22 | 3042507.47
redis: SET | 2122162.94 | 2169531.00 | 2084989.88 | 2051361.33
pgbench: Buffer Test - Normal Load - Read Only | 300353.09 | 297539.89 | 298969.75 | 300244.81
pgbench: Buffer Test - Normal Load - Read Write | 29178.23 | 29149.20 | 29148.60 | 29372.39
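Some of these rows are time-based (fewer is better) and others are throughput-based (more is better), so comparing configurations needs a direction-aware ratio. The sketch below applies one to two rows of the overview. `relative_gain` is a hypothetical helper, and the direction of each test is taken from its individual result entry.

```python
def relative_gain(baseline, candidate, higher_is_better):
    """Return the candidate's improvement over the baseline in percent.

    A positive result means the candidate is better, whichever way the
    underlying metric points.
    """
    if higher_is_better:
        ratio = candidate / baseline
    else:
        ratio = baseline / candidate
    return (ratio - 1.0) * 100.0


# GCC 9.1.0 versus GCC 9.1.0 znver2, values copied from the results.
dense_lu = relative_gain(6891.37, 11370.27, higher_is_better=True)  # Mflops
raytests = relative_gain(1.98, 2.04, higher_is_better=False)        # seconds

if __name__ == "__main__":
    print(f"scimark2 Dense LU: {dense_lu:+.1f}%")
    print(f"bullet Raytests:   {raytests:+.1f}%")
```

With these inputs the Dense LU row shows a gain of roughly 65% from -march=znver2 on GCC 9.1.0, while the Raytests row shows a slowdown of about 3%.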
Bullet Physics Engine 2.81, Test: Raytests (Seconds, Fewer Is Better)
- GCC 9.1.0: 1.98 (SE +/- 0.01, N = 3)
- GCC 9.1.0 znver2: 2.04 (SE +/- 0.00, N = 3)
- GCC 10.0.0 znver2: 2.06 (SE +/- 0.02, N = 3)
- GCC 10.0.0: 2.11 (SE +/- 0.00, N = 6)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU (znver2 runs: -march=znver2)

GNU MPC 1.1.0, Multi-Precision Benchmark (Global Score, More Is Better)
- GCC 9.1.0: 9597 (SE +/- 31.80, N = 3)
- GCC 9.1.0 znver2: 9577 (SE +/- 50.44, N = 3)
- GCC 10.0.0 znver2: 9357 (SE +/- 102.03, N = 3)
- GCC 10.0.0: 9580 (SE +/- 26.46, N = 3)
1. (CC) gcc options: -lm -O3 -MT -MD -MP -MF (znver2 runs: -march=znver2)

Bullet Physics Engine 2.81, Test: 3000 Fall (Seconds, Fewer Is Better)
- GCC 9.1.0: 3.20 (SE +/- 0.02, N = 3)
- GCC 9.1.0 znver2: 3.22 (SE +/- 0.01, N = 3)
- GCC 10.0.0 znver2: 3.27 (SE +/- 0.03, N = 3)
- GCC 10.0.0: 3.44 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU (znver2 runs: -march=znver2)

Bullet Physics Engine 2.81, Test: 1000 Stack (Seconds, Fewer Is Better)
- GCC 9.1.0: 3.88 (SE +/- 0.02, N = 3)
- GCC 9.1.0 znver2: 3.77 (SE +/- 0.00, N = 3)
- GCC 10.0.0 znver2: 3.85 (SE +/- 0.04, N = 3)
- GCC 10.0.0: 4.11 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU (znver2 runs: -march=znver2)

Bullet Physics Engine 2.81, Test: 1000 Convex (Seconds, Fewer Is Better)
- GCC 9.1.0: 3.51 (SE +/- 0.02, N = 3)
- GCC 9.1.0 znver2: 3.57 (SE +/- 0.00, N = 3)
- GCC 10.0.0 znver2: 3.60 (SE +/- 0.04, N = 3)
- GCC 10.0.0: 3.73 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU (znver2 runs: -march=znver2)

Bullet Physics Engine 2.81, Test: 136 Ragdolls (Seconds, Fewer Is Better)
- GCC 9.1.0: 2.05 (SE +/- 0.01, N = 3)
- GCC 9.1.0 znver2: 2.04 (SE +/- 0.00, N = 3)
- GCC 10.0.0 znver2: 2.05 (SE +/- 0.02, N = 3)
- GCC 10.0.0: 2.20 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU (znver2 runs: -march=znver2)

Bullet Physics Engine 2.81, Test: Prim Trimesh (Seconds, Fewer Is Better)
- GCC 9.1.0: 0.75 (SE +/- 0.00, N = 3)
- GCC 9.1.0 znver2: 0.77 (SE +/- 0.00, N = 3)
- GCC 10.0.0 znver2: 0.77 (SE +/- 0.01, N = 3)
- GCC 10.0.0: 0.81 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU (znver2 runs: -march=znver2)

Bullet Physics Engine 2.81, Test: Convex Trimesh (Seconds, Fewer Is Better)
- GCC 9.1.0: 0.89 (SE +/- 0.00, N = 3)
- GCC 9.1.0 znver2: 0.90 (SE +/- 0.00, N = 3)
- GCC 10.0.0 znver2: 0.91 (SE +/- 0.01, N = 3)
- GCC 10.0.0: 0.95 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU (znver2 runs: -march=znver2)

TSCP 1.81, AI Chess Performance (Nodes Per Second, More Is Better)
- GCC 9.1.0: 1305781 (SE +/- 620.00, N = 5)
- GCC 9.1.0 znver2: 1337188 (SE +/- 10688.23, N = 5)
- GCC 10.0.0 znver2: 1408752 (SE +/- 6261.48, N = 5)
- GCC 10.0.0: 1366017 (SE +/- 676.60, N = 5)
1. (CC) gcc options: -O3 -march=native (znver2 runs: -march=znver2)

SciMark 2.0, Computational Test: Composite (Mflops, More Is Better)
- GCC 9.1.0: 2768.16 (SE +/- 25.64, N = 3)
- GCC 9.1.0 znver2: 3686.60 (SE +/- 5.91, N = 3)
- GCC 10.0.0 znver2: 3553.67 (SE +/- 13.97, N = 3)
- GCC 10.0.0: 3127.49 (SE +/- 5.96, N = 3)
1. (CC) gcc options: -O3 -lm (znver2 runs: -march=znver2)

SciMark 2.0, Computational Test: Monte Carlo (Mflops, More Is Better)
- GCC 9.1.0: 761.38 (SE +/- 7.16, N = 3)
- GCC 9.1.0 znver2: 800.23 (SE +/- 0.74, N = 3)
- GCC 10.0.0 znver2: 759.97 (SE +/- 0.29, N = 3)
- GCC 10.0.0: 777.17 (SE +/- 0.24, N = 3)
1. (CC) gcc options: -O3 -lm (znver2 runs: -march=znver2)

SciMark 2.0, Computational Test: Fast Fourier Transform (Mflops, More Is Better)
- GCC 9.1.0: 295.18 (SE +/- 2.85, N = 3)
- GCC 9.1.0 znver2: 273.49 (SE +/- 0.21, N = 3)
- GCC 10.0.0 znver2: 261.10 (SE +/- 0.24, N = 3)
- GCC 10.0.0: 301.17 (SE +/- 0.54, N = 3)
1. (CC) gcc options: -O3 -lm (znver2 runs: -march=znver2)

SciMark 2.0, Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)
- GCC 9.1.0: 3767.63 (SE +/- 37.65, N = 3)
- GCC 9.1.0 znver2: 3580.73 (SE +/- 13.73, N = 3)
- GCC 10.0.0 znver2: 3675.94 (SE +/- 58.43, N = 3)
- GCC 10.0.0: 3856.63 (SE +/- 15.78, N = 3)
1. (CC) gcc options: -O3 -lm (znver2 runs: -march=znver2)

SciMark 2.0, Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
- GCC 9.1.0: 6891.37 (SE +/- 60.72, N = 3)
- GCC 9.1.0 znver2: 11370.27 (SE +/- 15.98, N = 3)
- GCC 10.0.0 znver2: 10777.88 (SE +/- 12.24, N = 3)
- GCC 10.0.0: 8526.66 (SE +/- 19.91, N = 3)
1. (CC) gcc options: -O3 -lm (znver2 runs: -march=znver2)

SciMark 2.0, Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
- GCC 9.1.0: 2125.26 (SE +/- 20.16, N = 3)
- GCC 9.1.0 znver2: 2408.26 (SE +/- 0.53, N = 3)
- GCC 10.0.0 znver2: 2293.46 (SE +/- 0.89, N = 3)
- GCC 10.0.0: 2175.85 (SE +/- 0.16, N = 3)
1. (CC) gcc options: -O3 -lm (znver2 runs: -march=znver2)

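The SciMark Composite figures reported earlier are consistent with a plain arithmetic mean of the five kernel scores (Monte Carlo, Fast Fourier Transform, Sparse Matrix Multiply, Dense LU, Jacobi SOR). The short check below recomputes that mean from the published kernel values. The dictionary names are illustrative, and small differences are expected because the published values are rounded to two decimals.

```python
# Kernel scores in Mflops, copied from the SciMark results, in this order:
# Monte Carlo, FFT, Sparse Matrix Multiply, Dense LU, Jacobi SOR.
KERNELS = {
    "GCC 9.1.0": [761.38, 295.18, 3767.63, 6891.37, 2125.26],
    "GCC 9.1.0 znver2": [800.23, 273.49, 3580.73, 11370.27, 2408.26],
    "GCC 10.0.0 znver2": [759.97, 261.10, 3675.94, 10777.88, 2293.46],
    "GCC 10.0.0": [777.17, 301.17, 3856.63, 8526.66, 2175.85],
}

# Composite scores as published in the Composite result.
REPORTED_COMPOSITE = {
    "GCC 9.1.0": 2768.16,
    "GCC 9.1.0 znver2": 3686.60,
    "GCC 10.0.0 znver2": 3553.67,
    "GCC 10.0.0": 3127.49,
}


def composite(scores):
    """Arithmetic mean of the kernel scores."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    for name, scores in KERNELS.items():
        print(f"{name}: mean {composite(scores):.2f}, "
              f"reported {REPORTED_COMPOSITE[name]:.2f}")
```

Because the composite is an unweighted mean, the Dense LU kernel, which has the largest absolute scores, accounts for most of the composite difference between configurations.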
XZ Compression 5.2.4, Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9 (Seconds, Fewer Is Better)
- GCC 9.1.0: 25.23 (SE +/- 0.10, N = 3)
- GCC 9.1.0 znver2: 25.25 (SE +/- 0.13, N = 3)
- GCC 10.0.0 znver2: 25.39 (SE +/- 0.11, N = 3)
- GCC 10.0.0: 25.26 (SE +/- 0.12, N = 3)
1. (CC) gcc options: -pthread -fvisibility=hidden -O3 (znver2 runs: -march=znver2)

CppPerformanceBenchmarks 9, Test: Atol (Seconds, Fewer Is Better)
- GCC 9.1.0: 59.31 (SE +/- 0.30, N = 3)
- GCC 9.1.0 znver2: 60.38 (SE +/- 0.53, N = 11)
- GCC 10.0.0 znver2: 63.34 (SE +/- 0.06, N = 3)
- GCC 10.0.0: 59.97 (SE +/- 0.17, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 (znver2 runs: -march=znver2)

CppPerformanceBenchmarks 9, Test: Ctype (Seconds, Fewer Is Better)
- GCC 9.1.0: 31.52 (SE +/- 0.28, N = 3)
- GCC 9.1.0 znver2: 31.43 (SE +/- 0.14, N = 3)
- GCC 10.0.0 znver2: 31.30 (SE +/- 0.03, N = 3)
- GCC 10.0.0: 31.51 (SE +/- 0.38, N = 5)
1. (CXX) g++ options: -O3 -std=c++11 (znver2 runs: -march=znver2)

CppPerformanceBenchmarks 9, Test: Math Library (Seconds, Fewer Is Better)
- GCC 9.1.0: 309.36 (SE +/- 0.26, N = 3)
- GCC 9.1.0 znver2: 302.82 (SE +/- 3.91, N = 3)
- GCC 10.0.0 znver2: 306.02 (SE +/- 2.37, N = 3)
- GCC 10.0.0: 307.23 (SE +/- 4.29, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 (znver2 runs: -march=znver2)

CppPerformanceBenchmarks 9, Test: Random Numbers (Seconds, Fewer Is Better)
- GCC 9.1.0: 751.15 (SE +/- 2.69, N = 3)
- GCC 9.1.0 znver2: 750.66 (SE +/- 0.27, N = 3)
- GCC 10.0.0 znver2: 787.77 (SE +/- 10.35, N = 5)
- GCC 10.0.0: 799.88 (SE +/- 4.15, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 (znver2 runs: -march=znver2)

CppPerformanceBenchmarks 9, Test: Stepanov Vector (Seconds, Fewer Is Better)
- GCC 9.1.0: 76.45 (SE +/- 0.35, N = 3)
- GCC 9.1.0 znver2: 74.08 (SE +/- 0.12, N = 3)
- GCC 10.0.0 znver2: 77.22 (SE +/- 0.04, N = 3)
- GCC 10.0.0: 74.26 (SE +/- 0.88, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 (znver2 runs: -march=znver2)

CppPerformanceBenchmarks 9, Test: Function Objects (Seconds, Fewer Is Better)
- GCC 9.1.0: 14.40 (SE +/- 0.08, N = 3)
- GCC 9.1.0 znver2: 14.15 (SE +/- 0.03, N = 3)
- GCC 10.0.0 znver2: 14.90 (SE +/- 0.20, N = 4)
- GCC 10.0.0: 15.10 (SE +/- 0.17, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 (znver2 runs: -march=znver2)

CppPerformanceBenchmarks 9, Test: Stepanov Abstraction (Seconds, Fewer Is Better)
- GCC 9.1.0: 27.60 (SE +/- 0.03, N = 3)
- GCC 9.1.0 znver2: 28.93 (SE +/- 0.00, N = 3)
- GCC 10.0.0 znver2: 28.30 (SE +/- 0.45, N = 3)
- GCC 10.0.0: 28.19 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 (znver2 runs: -march=znver2)

MKL-DNN 2019-04-16, Harness: Deconvolution Batch deconv_3d - Data Type: f32 (ms, Fewer Is Better)
- GCC 9.1.0: 59.00 (SE +/- 0.66, N = 7; MIN: 50.8)
- GCC 9.1.0 znver2: 57.97 (SE +/- 0.49, N = 15; MIN: 51.57)
- GCC 10.0.0 znver2: 56.87 (SE +/- 0.58, N = 8; MIN: 50.96)
- GCC 10.0.0: 58.16 (SE +/- 0.69, N = 15; MIN: 50.91)
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl (znver2 runs: -march=znver2)

Sockperf 3.4, Test: Throughput (Messages Per Second, More Is Better)
- GCC 9.1.0: 514551 (SE +/- 4767.11, N = 5)
- GCC 9.1.0 znver2: 517095 (SE +/- 4175.03, N = 5)
- GCC 10.0.0 znver2: 529657 (SE +/- 3715.76, N = 18)
- GCC 10.0.0: 514748 (SE +/- 5409.10, N = 5)
1. (CXX) g++ options: --param -O3 -rdynamic -ldl -lpthread (znver2 runs: -march=znver2)

Sockperf 3.4, Test: Latency Ping Pong (usec, Fewer Is Better)
- GCC 9.1.0: 3.12 (SE +/- 0.04, N = 5)
- GCC 9.1.0 znver2: 3.15 (SE +/- 0.04, N = 6)
- GCC 10.0.0 znver2: 3.04 (SE +/- 0.02, N = 25)
- GCC 10.0.0: 3.03 (SE +/- 0.02, N = 25)
1. (CXX) g++ options: --param -O3 -rdynamic -ldl -lpthread (znver2 runs: -march=znver2)

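Whether a gap between two runs exceeds run-to-run noise can be gauged roughly from the reported standard errors. The sketch below divides the difference of two means by their combined standard error. It assumes the SE values are listed in the same order as the runs and that the runs are independent. `separation` is a hypothetical helper and a rough heuristic, not a formal significance test, especially at the small N values used here.

```python
import math


def separation(mean_a, se_a, mean_b, se_b):
    """Difference of two means in units of their combined standard error."""
    return abs(mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Sockperf Throughput: GCC 9.1.0 versus GCC 10.0.0 znver2 (Messages Per Second).
throughput = separation(514551, 4767.11, 529657, 3715.76)

# Sockperf Latency Ping Pong: GCC 9.1.0 versus GCC 9.1.0 znver2 (usec).
latency = separation(3.12, 0.04, 3.15, 0.04)

if __name__ == "__main__":
    print(f"throughput separation: {throughput:.2f} combined SE")
    print(f"latency separation:    {latency:.2f} combined SE")
```

Under these assumptions the throughput gap is about 2.5 combined standard errors, while the latency gap is about 0.5, which is well within the noise.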
lzbench 2017-08-08, Test: XZ 0 - Process: Compression (MB/s, More Is Better)
- GCC 9.1.0: 39
- GCC 9.1.0 znver2: 40
- GCC 10.0.0 znver2: 37
- GCC 10.0.0: 40
SE values as listed (not reported for every run): +/- 0.33, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

MKL-DNN 2019-04-16, Harness: IP Batch All - Data Type: f32 (ms, Fewer Is Better)
- GCC 9.1.0: 1523.52 (SE +/- 7.48, N = 3; MIN: 1357.02)
- GCC 9.1.0 znver2: 1599.68 (SE +/- 17.21, N = 3; MIN: 1393.73)
- GCC 10.0.0 znver2: 1556.91 (SE +/- 25.50, N = 3; MIN: 1368.2)
- GCC 10.0.0: 1582.78 (SE +/- 5.99, N = 3; MIN: 1385.56)
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl (znver2 runs: -march=znver2)

lzbench 2017-08-08, Test: XZ 0 - Process: Decompression (MB/s, More Is Better)
- GCC 9.1.0: 116
- GCC 9.1.0 znver2: 116
- GCC 10.0.0 znver2: 108
- GCC 10.0.0: 113
SE values as listed (not reported for every run): +/- 0.33, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

MKL-DNN 2019-04-16, Harness: Convolution Batch conv_alexnet - Data Type: f32 (ms, Fewer Is Better)
- GCC 9.1.0: 2527.50 (SE +/- 9.57, N = 3; MIN: 2462.11)
- GCC 9.1.0 znver2: 2520.01 (SE +/- 9.61, N = 3; MIN: 2467.07)
- GCC 10.0.0 znver2: 2543.93 (SE +/- 17.20, N = 3; MIN: 2467.76)
- GCC 10.0.0: 2507.16 (SE +/- 6.13, N = 3; MIN: 2461.57)
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl (znver2 runs: -march=znver2)

MKL-DNN 2019-04-16, Harness: Deconvolution Batch deconv_all - Data Type: f32 (ms, Fewer Is Better)
- GCC 9.1.0: 51813.05 (SE +/- 589.20, N = 6; MIN: 48543.1)
- GCC 9.1.0 znver2: 52238.80 (SE +/- 668.40, N = 3; MIN: 49224.9)
- GCC 10.0.0 znver2: 50679.53 (SE +/- 390.75, N = 3; MIN: 48056.6)
- GCC 10.0.0: 50039.13 (SE +/- 224.88, N = 3; MIN: 46883.1)
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl (znver2 runs: -march=znver2)

MKL-DNN 2019-04-16, Harness: Deconvolution Batch deconv_1d - Data Type: f32 (ms, Fewer Is Better)
- GCC 9.1.0: 221.23 (SE +/- 2.00, N = 15; MIN: 202.07)
- GCC 9.1.0 znver2: 217.02 (SE +/- 1.79, N = 13; MIN: 203.65)
- GCC 10.0.0 znver2: 218.66 (SE +/- 1.85, N = 15; MIN: 203.42)
- GCC 10.0.0: 212.83 (SE +/- 0.29, N = 3; MIN: 201.7)
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl (znver2 runs: -march=znver2)

lzbench 2017-08-08, Test: Zstd 1 - Process: Compression (MB/s, More Is Better)
- GCC 9.1.0: 468
- GCC 9.1.0 znver2: 468
- GCC 10.0.0 znver2: 453
- GCC 10.0.0: 467
SE values as listed (not reported for every run): +/- 3.18, N = 3; +/- 4.91, N = 8; +/- 0.33, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

MKL-DNN 2019-04-16, Harness: Convolution Batch conv_3d - Data Type: f32 (ms, Fewer Is Better)
- GCC 9.1.0: 117.60 (SE +/- 0.16, N = 3; MIN: 103.13)
- GCC 9.1.0 znver2: 118.47 (SE +/- 0.45, N = 3; MIN: 103.47)
- GCC 10.0.0 znver2: 118.02 (SE +/- 1.48, N = 4; MIN: 102.11)
- GCC 10.0.0: 116.62 (SE +/- 0.79, N = 3; MIN: 102.39)
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl (znver2 runs: -march=znver2)

MKL-DNN 2019-04-16, Harness: IP Batch 1D - Data Type: f32 (ms, Fewer Is Better)
- GCC 9.1.0: 152.51 (SE +/- 3.35, N = 15; MIN: 111.42)
- GCC 9.1.0 znver2: 155.34 (SE +/- 2.27, N = 15; MIN: 127.99)
- GCC 10.0.0 znver2: 157.72 (SE +/- 3.21, N = 14; MIN: 127)
- GCC 10.0.0: 154.09 (SE +/- 3.18, N = 12; MIN: 129)
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl (znver2 runs: -march=znver2)

lzbench 2017-08-08, Test: Zstd 1 - Process: Decompression (MB/s, More Is Better)
- GCC 9.1.0: 1269 (SE +/- 9.50, N = 3)
- GCC 9.1.0 znver2: 1268 (SE +/- 12.79, N = 8)
- GCC 10.0.0 znver2: 1250 (SE +/- 0.33, N = 3)
- GCC 10.0.0: 1287 (SE +/- 0.58, N = 3)
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

lzbench 2017-08-08, Test: Brotli 0 - Process: Compression (MB/s, More Is Better)
- GCC 9.1.0: 494
- GCC 9.1.0 znver2: 515
- GCC 10.0.0 znver2: 505
- GCC 10.0.0: 522
SE values as listed: +/- 0.88, N = 3; +/- 4.47, N = 11; +/- 0.67, N = 3; +/- 4.10, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

lzbench 2017-08-08, Test: Libdeflate 1 - Process: Compression (MB/s, More Is Better)
- GCC 9.1.0: 239
- GCC 9.1.0 znver2: 257
- GCC 10.0.0 znver2: 248
- GCC 10.0.0: 250
SE values as listed (not reported for every run): +/- 1.86, N = 3; +/- 0.67, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

lzbench 2017-08-08, Test: Libdeflate 1 - Process: Decompression (MB/s, More Is Better)
- GCC 9.1.0: 1119
- GCC 9.1.0 znver2: 1183
- GCC 10.0.0 znver2: 1159
- GCC 10.0.0: 1147
SE values as listed (not reported for every run): +/- 10.00, N = 3; +/- 0.33, N = 3
1. (CXX) g++ options: -lrt -static -lpthread -fomit-frame-pointer -fstrict-aliasing -ffast-math -O3

FLAC Audio Encoding 1.3.2, WAV To FLAC (Seconds, Fewer Is Better)
- GCC 9.1.0: 7.70 (SE +/- 0.02, N = 5)
- GCC 9.1.0 znver2: 7.99 (SE +/- 0.01, N = 5)
- GCC 10.0.0 znver2: 8.11 (SE +/- 0.04, N = 5)
- GCC 10.0.0: 7.72 (SE +/- 0.03, N = 5)
1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm (znver2 runs: -march=znver2)

MKL-DNN 2019-04-16, Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32 (ms, Fewer Is Better)
- GCC 9.1.0: 1147.62 (SE +/- 6.51, N = 3; MIN: 1052.13)
- GCC 9.1.0 znver2: 1153.46 (SE +/- 6.23, N = 3; MIN: 1057.54)
- GCC 10.0.0 znver2: 1145.95 (SE +/- 5.61, N = 3; MIN: 1052.71)
- GCC 10.0.0: 1145.01 (SE +/- 6.39, N = 3; MIN: 1050.58)
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl (znver2 runs: -march=znver2)

MKL-DNN 2019-04-16, Harness: Convolution Batch conv_all - Data Type: f32 (ms, Fewer Is Better)
- GCC 9.1.0: 19613.70 (SE +/- 42.61, N = 3; MIN: 18961.5)
- GCC 9.1.0 znver2: 19696.57 (SE +/- 22.35, N = 3; MIN: 19033.5)
- GCC 10.0.0 znver2: 19694.33 (SE +/- 41.11, N = 3; MIN: 18995.6)
- GCC 10.0.0: 19803.57 (SE +/- 87.03, N = 3; MIN: 19014.9)
1. (CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl (znver2 runs: -march=znver2)

LAME MP3 Encoding 3.100, WAV To MP3 (Seconds, Fewer Is Better)
- GCC 9.1.0: 7.25 (SE +/- 0.04, N = 3)
- GCC 9.1.0 znver2: 6.94 (SE +/- 0.01, N = 3)
- GCC 10.0.0 znver2: 7.45 (SE +/- 0.01, N = 3)
- GCC 10.0.0: 7.28 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -O3 -lncurses -lm (znver2 runs: -march=znver2)

libjpeg-turbo tjbench 2.0.2, Test: Decompression Throughput (Megapixels/sec, More Is Better)
- GCC 9.1.0: 218.09 (SE +/- 0.44, N = 3)
- GCC 9.1.0 znver2: 225.64 (SE +/- 0.31, N = 3)
- GCC 10.0.0 znver2: 225.44 (SE +/- 0.30, N = 3)
- GCC 10.0.0: 220.33 (SE +/- 2.32, N = 3)
1. (CC) gcc options: -O3 -rdynamic (znver2 runs: -march=znver2)

Ogg Encoding 1.3.3, WAV To Ogg (Seconds, Fewer Is Better)
- GCC 9.1.0: 5.13 (SE +/- 0.00, N = 3)
- GCC 9.1.0 znver2: 5.05 (SE +/- 0.01, N = 3)
- GCC 10.0.0 znver2: 5.36 (SE +/- 0.00, N = 4)
- GCC 10.0.0: 5.05 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -O2 -ffast-math -fsigned-char -O3 -logg (znver2 runs: -march=znver2)

FFTW 3.3.6, Build: Stock - Size: 1D FFT Size 32 (Mflops, More Is Better)
- GCC 9.1.0: 12958 (SE +/- 1.76, N = 3)
- GCC 9.1.0 znver2: 11828 (SE +/- 15.90, N = 3)
- GCC 10.0.0 znver2: 14113 (SE +/- 5.51, N = 3)
- GCC 10.0.0: 12748 (SE +/- 110.06, N = 3)
1. (CC) gcc options: -pthread -O3 -lm (znver2 runs: -march=znver2)

FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 32 (Mflops, More Is Better)
- GCC 9.1.0: 11909
- GCC 9.1.0 znver2: 14314
- GCC 10.0.0 znver2: 14119
- GCC 10.0.0: 12902
SE values as listed (not reported for every run): +/- 155.95, N = 3; +/- 141.66, N = 3; +/- 2.19, N = 3
1. (CC) gcc options: -pthread -O3 -lm (znver2 runs: -march=znver2)

FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 512 (Mflops, More Is Better)
- GCC 9.1.0: 9028.10 (SE +/- 30.08, N = 3)
- GCC 9.1.0 znver2: 10814.00 (SE +/- 148.34, N = 4)
- GCC 10.0.0 znver2: 10531.00 (SE +/- 10.67, N = 3)
- GCC 10.0.0: 9583.73 (SE +/- 19.17, N = 3)
1. (CC) gcc options: -pthread -O3 -lm (znver2 runs: -march=znver2)

FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 4096 (Mflops, More Is Better)
- GCC 9.1.0: 7063.03 (SE +/- 73.85, N = 3)
- GCC 9.1.0 znver2: 7920.17 (SE +/- 67.62, N = 3)
- GCC 10.0.0 znver2: 7823.27 (SE +/- 95.02, N = 3)
- GCC 10.0.0: 7071.30 (SE +/- 40.30, N = 3)
1. (CC) gcc options: -pthread -O3 -lm (znver2 runs: -march=znver2)

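The four Stock FFTW results move in different directions under -march=znver2, so a single summary figure is easier to read as a geometric mean of per-test ratios than as an average of raw Mflops. The sketch below does this for GCC 9.1.0 znver2 relative to GCC 9.1.0. `geometric_mean` is a hypothetical helper, and restricting the summary to these four Stock sizes is an illustrative choice rather than anything the result file prescribes.

```python
import math


def geometric_mean(values):
    """Geometric mean computed via logarithms to avoid overflow."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# (GCC 9.1.0, GCC 9.1.0 znver2) Mflops for the Stock FFTW builds.
FFTW_STOCK = {
    "1D FFT Size 32": (12958, 11828),
    "2D FFT Size 32": (11909, 14314),
    "2D FFT Size 512": (9028.10, 10814.00),
    "2D FFT Size 4096": (7063.03, 7920.17),
}

# More Is Better, so a ratio above 1.0 means the znver2 build is faster.
ratios = [znver2 / base for base, znver2 in FFTW_STOCK.values()]
overall = geometric_mean(ratios)

if __name__ == "__main__":
    for name, ratio in zip(FFTW_STOCK, ratios):
        print(f"{name}: {ratio:.3f}x")
    print(f"geometric mean: {overall:.3f}x")
```

For these four tests the geometric mean comes out at roughly 1.10x, even though the 1D size 32 case regresses by about 9%.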
FFTW 3.3.6, Build: Float + SSE - Size: 2D FFT Size 32 (Mflops, More Is Better)
- GCC 9.1.0: 45253 (SE +/- 663.38, N = 4)
- GCC 9.1.0 znver2: 44951 (SE +/- 105.51, N = 3)
- GCC 10.0.0 znver2: 46305 (SE +/- 28.47, N = 3)
- GCC 10.0.0: 45361 (SE +/- 54.85, N = 3)
1. (CC) gcc options: -pthread -O3 -lm (znver2 runs: -march=znver2)

Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, More Is Better)
- GCC 9.1.0: 1322.90 (SE +/- 11.21, N = 3)
- GCC 9.1.0 znver2: 1378.46 (SE +/- 6.19, N = 3)
- GCC 10.0.0 znver2: 1385.88 (SE +/- 0.48, N = 3)
- GCC 10.0.0: 1385.23 (SE +/- 2.93, N = 3)
1. (CC) gcc options: -O3 -mavx2 (znver2 runs: -march=znver2)

HPC Challenge 1.5.0, Test / Class: G-HPL (GFLOPS, More Is Better)
- GCC 9.1.0: 70.97 (SE +/- 0.23, N = 3)
- GCC 9.1.0 znver2: 71.78 (SE +/- 0.37, N = 3)
- GCC 10.0.0 znver2: 71.05 (SE +/- 0.22, N = 3)
- GCC 10.0.0: 71.07 (SE +/- 0.08, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops (znver2 runs: -march=znver2)
2. OpenBLAS + Open MPI 2.1.1

HPC Challenge 1.5.0, Test / Class: G-Ffte (GFLOPS, More Is Better)
- GCC 9.1.0: 8.59803 (SE +/- 0.02013, N = 3)
- GCC 9.1.0 znver2: 8.60514 (SE +/- 0.06300, N = 3)
- GCC 10.0.0 znver2: 8.63794 (SE +/- 0.02559, N = 3)
- GCC 10.0.0: 8.81748 (SE +/- 0.18198, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops (znver2 runs: -march=znver2)
2. OpenBLAS + Open MPI 2.1.1

HPC Challenge 1.5.0, Test / Class: G-Ffte (GFLOP/s, More Is Better)
- GCC 9.1.0: 8.59803 (SE +/- 0.02013, N = 3)
- GCC 9.1.0 znver2: 8.60514 (SE +/- 0.06300, N = 3)
- GCC 10.0.0 znver2: 8.63794 (SE +/- 0.02559, N = 3)
- GCC 10.0.0: 8.81748 (SE +/- 0.18198, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops (znver2 runs: -march=znver2)
2. OpenBLAS + Open MPI 2.1.1

HPC Challenge 1.5.0, Test / Class: EP-DGEMM (GFLOPS, More Is Better)
- GCC 9.1.0: 32.83 (SE +/- 0.19, N = 3)
- GCC 9.1.0 znver2: 32.60 (SE +/- 0.42, N = 3)
- GCC 10.0.0 znver2: 32.84 (SE +/- 0.22, N = 3)
- GCC 10.0.0: 32.86 (SE +/- 0.11, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops (znver2 runs: -march=znver2)
2. OpenBLAS + Open MPI 2.1.1

HPC Challenge 1.5.0, Test / Class: G-Ptrans (GB/s, More Is Better)
- GCC 9.1.0: 2.73255 (SE +/- 0.00047, N = 3)
- GCC 9.1.0 znver2: 2.95225 (SE +/- 0.00095, N = 3)
- GCC 10.0.0 znver2: 2.94730 (SE +/- 0.00082, N = 3)
- GCC 10.0.0: 2.72974 (SE +/- 0.00151, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops (znver2 runs: -march=znver2)
2. OpenBLAS + Open MPI 2.1.1

HPC Challenge 1.5.0, Test / Class: EP-STREAM Triad (GB/s, More Is Better)
- GCC 9.1.0: 1.70820 (SE +/- 0.00015, N = 3)
- GCC 9.1.0 znver2: 1.71668 (SE +/- 0.00081, N = 3)
- GCC 10.0.0 znver2: 1.73055 (SE +/- 0.00091, N = 3)
- GCC 10.0.0: 1.72205 (SE +/- 0.00098, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops (znver2 runs: -march=znver2)
2. OpenBLAS + Open MPI 2.1.1

HPC Challenge 1.5.0, Test / Class: G-Random Access (GUP/s, More Is Better)
- GCC 9.1.0: 0.09757 (SE +/- 0.00036, N = 3)
- GCC 9.1.0 znver2: 0.09798 (SE +/- 0.00042, N = 3)
- GCC 10.0.0 znver2: 0.09771 (SE +/- 0.00044, N = 3)
- GCC 10.0.0: 0.09778 (SE +/- 0.00041, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops (znver2 runs: -march=znver2)
2. OpenBLAS + Open MPI 2.1.1

HPC Challenge 1.5.0, Test / Class: Random Ring Latency (usecs, Fewer Is Better)
- GCC 9.1.0: 0.32596 (SE +/- 0.00047, N = 3)
- GCC 9.1.0 znver2: 0.32698 (SE +/- 0.00042, N = 3)
- GCC 10.0.0 znver2: 0.32521 (SE +/- 0.00071, N = 3)
- GCC 10.0.0: 0.33186 (SE +/- 0.00125, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops (znver2 runs: -march=znver2)
2. OpenBLAS + Open MPI 2.1.1

HPC Challenge 1.5.0 - Test / Class: Random Ring Bandwidth (GB/s, More Is Better)
  GCC 9.1.0: 4.89161 (SE +/- 0.02698, N = 3)
  GCC 9.1.0 znver2: 4.98832 (SE +/- 0.07571, N = 3)
  GCC 10.0.0 znver2: 5.04603 (SE +/- 0.04322, N = 3)
  GCC 10.0.0: 4.94947 (SE +/- 0.05697, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
  2. OpenBLAS + Open MPI 2.1.1
HPC Challenge 1.5.0 - Test / Class: Max Ping Pong Bandwidth (MB/s, More Is Better)
  GCC 9.1.0: 24227.25 (SE +/- 159.70, N = 3)
  GCC 9.1.0 znver2: 23832.61 (SE +/- 119.42, N = 3)
  GCC 10.0.0 znver2: 23993.04 (SE +/- 195.64, N = 3)
  GCC 10.0.0: 23885.44 (SE +/- 62.37, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -funroll-loops
  2. OpenBLAS + Open MPI 2.1.1
GROMACS 2018.3 - Water Benchmark (Ns Per Day, More Is Better)
  GCC 9.1.0: 0.98 (SE +/- 0.00, N = 3)
  GCC 9.1.0 znver2: 0.99 (SE +/- 0.00, N = 3)
  GCC 10.0.0 znver2: 0.98 (SE +/- 0.00, N = 3)
  GCC 10.0.0: 0.97 (SE +/- 0.00, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CXX) g++ options: -march=core-avx2 -O3 -std=c++11 -funroll-all-loops -fopenmp -lrt -lpthread -lm
High Performance Conjugate Gradient 3.0 (GFLOP/s, More Is Better)
  GCC 9.1.0: 1.09 (SE +/- 0.01, N = 3)
  GCC 9.1.0 znver2: 1.08 (SE +/- 0.01, N = 3)
  GCC 10.0.0 znver2: 1.08 (SE +/- 0.00, N = 3)
  GCC 10.0.0: 1.09 (SE +/- 0.01, N = 4)
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
  GCC 9.1.0: 567987.34 (SE +/- 1430.19, N = 3)
  GCC 9.1.0 znver2: 555154.60 (SE +/- 2761.64, N = 3)
  GCC 10.0.0 znver2: 567096.65 (SE +/- 1036.74, N = 3)
  GCC 10.0.0: 568329.00 (SE +/- 1210.22, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -O2 -O3 -lrt" -lrt
Stockfish 9 - Total Time (Nodes Per Second, More Is Better)
  GCC 9.1.0: 39278964 (SE +/- 210046.69, N = 3)
  GCC 9.1.0 znver2: 39561655 (SE +/- 164232.11, N = 3)
  GCC 10.0.0 znver2: 39540328 (SE +/- 131167.27, N = 3)
  GCC 10.0.0: 39631993 (SE +/- 237875.03, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CXX) g++ options: -m64 -lpthread -O3 -fno-exceptions -std=c++11 -pedantic -msse -msse3 -mpopcnt -flto
John The Ripper 1.9.0-jumbo-1 - Test: Blowfish (Real C/S, More Is Better)
  GCC 9.1.0: 20335 (SE +/- 64.93, N = 3)
  GCC 9.1.0 znver2: 20253 (SE +/- 63.01, N = 3)
  GCC 10.0.0 znver2: 20426 (SE +/- 64.22, N = 3)
  GCC 10.0.0: 20426 (SE +/- 63.74, N = 3)
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt -lbz2
Timed LLVM Compilation 6.0.1 - Time To Compile (Seconds, Fewer Is Better)
  GCC 9.1.0: 280.27
  GCC 9.1.0 znver2: 284.10
  GCC 10.0.0 znver2: 300.31
  GCC 10.0.0: 292.53
Timed PHP Compilation 7.1.9 - Time To Compile (Seconds, Fewer Is Better)
  GCC 9.1.0: 52.71 (SE +/- 0.15, N = 3)
  GCC 9.1.0 znver2: 53.91 (SE +/- 0.36, N = 3)
  GCC 10.0.0 znver2: 53.76 (SE +/- 0.51, N = 3)
  GCC 10.0.0: 54.43 (SE +/- 0.09, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -O3 -pedantic -ldl -lz -lm
m-queens 1.2 - Time To Solve (Seconds, Fewer Is Better)
  GCC 9.1.0: 47.12 (SE +/- 0.12, N = 3)
  GCC 9.1.0 znver2: 47.27 (SE +/- 0.08, N = 3)
  GCC 10.0.0 znver2: 47.21 (SE +/- 0.10, N = 3)
  GCC 10.0.0: 47.14 (SE +/- 0.11, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CXX) g++ options: -fopenmp -O3 -O2 -march=native
Cpuminer-Opt 3.8.8.1 - Algorithm: m7m (kH/s - Hash Speed, More Is Better)
  GCC 9.1.0: 593.89 (SE +/- 0.29, N = 3)
  GCC 9.1.0 znver2: 590.66 (SE +/- 0.15, N = 3)
  GCC 10.0.0 znver2: 590.80 (SE +/- 0.35, N = 3)
  GCC 10.0.0: 591.32 (SE +/- 0.27, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CXX) g++ options: -O3 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.8.8.1 - Algorithm: deep (kH/s - Hash Speed, More Is Better)
  GCC 9.1.0: 11190.00
  GCC 9.1.0 znver2: 10230.34
  GCC 10.0.0 znver2: 11123.00
  GCC 10.0.0: 11137.00
  SE as reported (fewer entries than runs, so not matched to runs): SE +/- 926.03, N = 12; SE +/- 8.82, N = 3; SE +/- 3.33, N = 3
  Extra flags (znver2 runs): -march=znver2
  1. (CXX) g++ options: -O3 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.8.8.1 - Algorithm: lbry (kH/s - Hash Speed, More Is Better)
  GCC 9.1.0: 34583 (SE +/- 550.28, N = 3)
  GCC 9.1.0 znver2: 34420 (SE +/- 5.77, N = 3)
  GCC 10.0.0 znver2: 34630 (SE +/- 20.82, N = 3)
  GCC 10.0.0: 35288 (SE +/- 460.86, N = 5)
  Extra flags (znver2 runs): -march=znver2
  1. (CXX) g++ options: -O3 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.8.8.1 - Algorithm: skein (kH/s - Hash Speed, More Is Better)
  GCC 9.1.0: 39397 (SE +/- 602.50, N = 3)
  GCC 9.1.0 znver2: 39797 (SE +/- 21.86, N = 3)
  GCC 10.0.0 znver2: 39843 (SE +/- 133.46, N = 3)
  GCC 10.0.0: 39720 (SE +/- 5.77, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CXX) g++ options: -O3 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.8.8.1 - Algorithm: myr-gr (kH/s - Hash Speed, More Is Better)
  GCC 9.1.0: 14127 (SE +/- 49.78, N = 3)
  GCC 9.1.0 znver2: 14023 (SE +/- 26.03, N = 3)
  GCC 10.0.0 znver2: 14137 (SE +/- 6.67, N = 3)
  GCC 10.0.0: 14130 (SE +/- 40.00, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CXX) g++ options: -O3 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
Cpuminer-Opt 3.8.8.1 - Algorithm: sha256t (kH/s - Hash Speed, More Is Better)
  GCC 9.1.0: 87951 (SE +/- 990.16, N = 7)
  GCC 9.1.0 znver2: 87238 (SE +/- 1027.26, N = 6)
  GCC 10.0.0 znver2: 86440 (SE +/- 180.83, N = 3)
  GCC 10.0.0: 86417 (SE +/- 116.81, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CXX) g++ options: -O3 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
AOM AV1 2019-02-11 - AV1 Video Encoding (Frames Per Second, More Is Better)
  GCC 9.1.0: 0.27 (SE +/- 0.00, N = 3)
  GCC 9.1.0 znver2: 0.31 (SE +/- 0.00, N = 3)
  GCC 10.0.0 znver2: 0.32 (SE +/- 0.00, N = 3)
  GCC 10.0.0: 0.27 (SE +/- 0.00, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
AOBench - Size: 2048 x 2048 - Total Time (Seconds, Fewer Is Better)
  GCC 9.1.0: 34.60 (SE +/- 0.05, N = 3)
  GCC 9.1.0 znver2: 33.20 (SE +/- 0.10, N = 3)
  GCC 10.0.0 znver2: 33.05 (SE +/- 0.02, N = 3)
  GCC 10.0.0: 35.98 (SE +/- 0.06, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -lm -O3
GraphicsMagick 1.3.30 - Operation: Swirl (Iterations Per Minute, More Is Better)
  GCC 9.1.0: 251
  GCC 9.1.0 znver2: 259
  GCC 10.0.0 znver2: 264
  GCC 10.0.0: 254
  SE as reported (fewer entries than runs, so not matched to runs): SE +/- 1.86, N = 3; SE +/- 1.20, N = 3; SE +/- 0.88, N = 3
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
GraphicsMagick 1.3.30 - Operation: Rotate (Iterations Per Minute, More Is Better)
  GCC 9.1.0: 262
  GCC 9.1.0 znver2: 263
  GCC 10.0.0 znver2: 277
  GCC 10.0.0: 262
  SE as reported (fewer entries than runs, so not matched to runs): SE +/- 1.20, N = 3; SE +/- 1.86, N = 3; SE +/- 4.33, N = 3
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
GraphicsMagick 1.3.30 - Operation: Sharpen (Iterations Per Minute, More Is Better)
  GCC 9.1.0: 181
  GCC 9.1.0 znver2: 195
  GCC 10.0.0 znver2: 196
  GCC 10.0.0: 181
  SE as reported (fewer entries than runs, so not matched to runs): SE +/- 0.33, N = 3
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
GraphicsMagick 1.3.30 - Operation: Enhanced (Iterations Per Minute, More Is Better)
  GCC 9.1.0: 209
  GCC 9.1.0 znver2: 221
  GCC 10.0.0 znver2: 223
  GCC 10.0.0: 208
  SE as reported (fewer entries than runs, so not matched to runs): SE +/- 1.20, N = 3
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
GraphicsMagick 1.3.30 - Operation: Resizing (Iterations Per Minute, More Is Better)
  GCC 9.1.0: 275 (SE +/- 2.19, N = 3)
  GCC 9.1.0 znver2: 280 (SE +/- 1.15, N = 3)
  GCC 10.0.0 znver2: 286 (SE +/- 1.53, N = 3)
  GCC 10.0.0: 274 (SE +/- 2.65, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
GraphicsMagick 1.3.30 - Operation: Noise-Gaussian (Iterations Per Minute, More Is Better)
  GCC 9.1.0: 170
  GCC 9.1.0 znver2: 171
  GCC 10.0.0 znver2: 173
  GCC 10.0.0: 170
  SE as reported (fewer entries than runs, so not matched to runs): SE +/- 0.67, N = 3; SE +/- 0.58, N = 3; SE +/- 0.33, N = 3
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
GraphicsMagick 1.3.30 - Operation: HWB Color Space (Iterations Per Minute, More Is Better)
  GCC 9.1.0: 287 (SE +/- 2.60, N = 3)
  GCC 9.1.0 znver2: 293 (SE +/- 2.19, N = 3)
  GCC 10.0.0 znver2: 302 (SE +/- 0.33, N = 3)
  GCC 10.0.0: 288 (SE +/- 2.19, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -fopenmp -O3 -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -ldl -lpthread
SVT-VP9 2019-02-17 - 1080p 8-bit YUV To VP9 Video Encode (Frames Per Second, More Is Better)
  GCC 9.1.0: 89.99 (SE +/- 0.28, N = 3)
  GCC 9.1.0 znver2: 96.54 (SE +/- 0.08, N = 3)
  GCC 10.0.0 znver2: 92.35 (SE +/- 0.19, N = 3)
  GCC 10.0.0: 89.84 (SE +/- 0.15, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -O3 -fPIE -fPIC -O2 -flto -fvisibility=hidden -mavx -pie -rdynamic -lpthread -lrt -lm
x264 2018-09-25 - H.264 Video Encoding (Frames Per Second, More Is Better)
  GCC 9.1.0: 139.59 (SE +/- 1.55, N = 7)
  GCC 9.1.0 znver2: 138.41 (SE +/- 2.03, N = 3)
  GCC 10.0.0 znver2: 139.82 (SE +/- 2.09, N = 4)
  GCC 10.0.0: 138.74 (SE +/- 2.27, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize
SVT-AV1 0.5 - 1080p 8-bit YUV To AV1 Video Encode (Frames Per Second, More Is Better)
  GCC 9.1.0: 46.45 (SE +/- 0.15, N = 3)
  GCC 9.1.0 znver2: 46.39 (SE +/- 0.19, N = 3)
  GCC 10.0.0 znver2: 46.49 (SE +/- 0.13, N = 3)
  GCC 10.0.0: 46.22 (SE +/- 0.27, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CXX) g++ options: -O3 -pie -lpthread -lm
x265 3.0 - H.265 1080p Video Encoding (Frames Per Second, More Is Better)
  GCC 9.1.0: 52.94 (SE +/- 0.28, N = 3)
  GCC 9.1.0 znver2: 52.53 (SE +/- 0.06, N = 3)
  GCC 10.0.0 znver2: 52.40 (SE +/- 0.19, N = 3)
  GCC 10.0.0: 53.00 (SE +/- 0.20, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
C-Ray 1.1 - Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better)
  GCC 9.1.0: 43.09 (SE +/- 0.02, N = 3)
  GCC 9.1.0 znver2: 39.42 (SE +/- 0.04, N = 3)
  GCC 10.0.0 znver2: 39.36 (SE +/- 0.05, N = 3)
  GCC 10.0.0: 42.63 (SE +/- 0.06, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -lm -lpthread -O3
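For a "Fewer Is Better" result such as C-Ray, the effect of -march=znver2 can be expressed as the share of run time saved relative to the same compiler built without the flag. The following is only an illustrative sketch: the helper name percent_time_saved is hypothetical, and the inputs are the four C-Ray times listed above, read in legend order (GCC 9.1.0, GCC 9.1.0 znver2, GCC 10.0.0 znver2, GCC 10.0.0).

```python
def percent_time_saved(baseline_seconds, tuned_seconds):
    """Share of run time saved by the tuned build, in percent (fewer seconds is better)."""
    return (baseline_seconds - tuned_seconds) / baseline_seconds * 100.0


# C-Ray 1.1 times in seconds: (plain -O3 build, -O3 -march=znver2 build).
# Pairing values with runs assumes the legend order described in the lead-in.
c_ray_seconds = {
    "GCC 9.1.0": (43.09, 39.42),
    "GCC 10.0.0": (42.63, 39.36),
}

savings = {
    compiler: percent_time_saved(baseline, tuned)
    for compiler, (baseline, tuned) in c_ray_seconds.items()
}

for compiler, saved in savings.items():
    print(f"{compiler}: znver2 build is {saved:.1f}% faster on C-Ray")
```

For "More Is Better" results the same idea applies with the ratio inverted, that is, (tuned - baseline) / baseline.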
SVT-HEVC 2019-02-03 - 1080p 8-bit YUV To HEVC Video Encode (Frames Per Second, More Is Better)
  GCC 9.1.0: 246.01 (SE +/- 3.72, N = 3)
  GCC 9.1.0 znver2: 247.33 (SE +/- 0.72, N = 3)
  GCC 10.0.0 znver2: 247.99 (SE +/- 1.78, N = 3)
  GCC 10.0.0: 248.85 (SE +/- 1.73, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -O3 -fPIE -fPIC -O2 -flto -fvisibility=hidden -march=native -pie -rdynamic -lpthread -lrt
FFmpeg 4.0.2 - H.264 HD To NTSC DV (Seconds, Fewer Is Better)
  GCC 9.1.0: 6.86 (SE +/- 0.01, N = 3)
  GCC 9.1.0 znver2: 6.83 (SE +/- 0.03, N = 3)
  GCC 10.0.0 znver2: 6.78 (SE +/- 0.02, N = 3)
  GCC 10.0.0: 6.88 (SE +/- 0.01, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -lXv -lX11 -lXext -lm -lxcb -lxcb-shape -lxcb-xfixes -lasound -lSDL2 -lsndio -pthread -lbz2 -llzma -O3 -std=c11 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -MMD -MF -MT
Smallpt 1.0 - Global Illumination Renderer; 128 Samples (Seconds, Fewer Is Better)
  GCC 9.1.0: 7.78 (SE +/- 0.00, N = 3)
  GCC 9.1.0 znver2: 7.67 (SE +/- 0.01, N = 3)
  GCC 10.0.0 znver2: 7.53 (SE +/- 0.01, N = 3)
  GCC 10.0.0: 7.84 (SE +/- 0.01, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CXX) g++ options: -fopenmp -O3
Memcached mcperf 1.5.10 - Method: Get (Operations Per Second, More Is Better)
  GCC 9.1.0: 92376.40 (SE +/- 1551.16, N = 3)
  GCC 9.1.0 znver2: 93850.59 (SE +/- 937.65, N = 15)
  GCC 10.0.0 znver2: 97228.27 (SE +/- 1267.59, N = 3)
  GCC 10.0.0: 95710.60 (SE +/- 1025.20, N = 15)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -O3 -lm -rdynamic
Memcached mcperf 1.5.10 - Method: Set (Operations Per Second, More Is Better)
  GCC 9.1.0: 52914.10 (SE +/- 293.82, N = 3)
  GCC 9.1.0 znver2: 59232.07 (SE +/- 3850.96, N = 15)
  GCC 10.0.0 znver2: 52910.87 (SE +/- 393.33, N = 3)
  GCC 10.0.0: 57193.25 (SE +/- 2058.38, N = 15)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -O3 -lm -rdynamic
NGINX Benchmark 1.9.9 - Static Web Page Serving (Requests Per Second, More Is Better)
  GCC 9.1.0: 39734.85 (SE +/- 102.83, N = 3)
  GCC 9.1.0 znver2: 39602.49 (SE +/- 112.05, N = 3)
  GCC 10.0.0 znver2: 39346.91 (SE +/- 23.74, N = 3)
  GCC 10.0.0: 39525.70 (SE +/- 158.42, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -lpthread -lcrypt -lcrypto -lz -O3 -march=native
Apache Benchmark 2.4.29 - Static Web Page Serving (Requests Per Second, More Is Better)
  GCC 9.1.0: 38392.29 (SE +/- 57.64, N = 3)
  GCC 9.1.0 znver2: 38022.79 (SE +/- 139.15, N = 3)
  GCC 10.0.0 znver2: 38009.25 (SE +/- 79.10, N = 3)
  GCC 10.0.0: 38490.98 (SE +/- 65.39, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -shared -fPIC -pthread -O3
OpenSSL 1.1.1 - RSA 4096-bit Performance (Signs Per Second, More Is Better)
  GCC 9.1.0: 3516.27 (SE +/- 7.07, N = 3)
  GCC 9.1.0 znver2: 3481.50 (SE +/- 1.42, N = 3)
  GCC 10.0.0 znver2: 3487.10 (SE +/- 0.70, N = 3)
  GCC 10.0.0: 3492.53 (SE +/- 1.89, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
Apache Siege 2.4.29 - Concurrent Users: 200 (Transactions Per Second, More Is Better)
  GCC 9.1.0: 60835.79 (SE +/- 798.56, N = 3)
  GCC 9.1.0 znver2: 99824.49 (SE +/- 3575.15, N = 15)
  GCC 10.0.0 znver2: 83275.06 (SE +/- 1288.23, N = 15)
  GCC 10.0.0: 82293.14 (SE +/- 3302.37, N = 12)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -O3 -lpthread -ldl -lssl -lcrypto
Apache Siege 2.4.29 - Concurrent Users: 250 (Transactions Per Second, More Is Better)
  GCC 9.1.0: 98050.91 (SE +/- 3755.13, N = 15)
  GCC 9.1.0 znver2: 96842.13 (SE +/- 4063.46, N = 12)
  GCC 10.0.0 znver2: 102423.07 (SE +/- 1636.75, N = 12)
  GCC 10.0.0: 62725.24 (SE +/- 122.71, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -O3 -lpthread -ldl -lssl -lcrypto
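The Apache Siege results carry large standard errors and uneven run counts, so a gap between two bars is not necessarily meaningful. One rough way to read them is to express the gap between two means in units of their combined standard error. The sketch below is a heuristic under stated assumptions, not a formal significance test: the helper name difference_in_se_units is hypothetical, and it assumes the SE entries are listed in the same order as the legend (GCC 9.1.0, GCC 9.1.0 znver2, GCC 10.0.0 znver2, GCC 10.0.0).

```python
import math


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Absolute gap between two means divided by their combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# Apache Siege, 250 concurrent users (transactions per second).
# Assumption: each SE belongs to the run in the same legend position.
gcc9_gap = difference_in_se_units(98050.91, 3755.13, 96842.13, 4063.46)
gcc10_gap = difference_in_se_units(62725.24, 122.71, 102423.07, 1636.75)

print(f"GCC 9.1.0 vs GCC 9.1.0 znver2: {gcc9_gap:.2f} combined SEs apart")
print(f"GCC 10.0.0 vs GCC 10.0.0 znver2: {gcc10_gap:.2f} combined SEs apart")
```

A gap well below about two combined standard errors is hard to distinguish from run-to-run noise, while a much larger gap points to a real difference or to an outlier run that deserves a re-test.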
Redis 4.0.8 - Test: GET (Requests Per Second, More Is Better)
  GCC 9.1.0: 3297713.33 (SE +/- 40781.06, N = 3)
  GCC 9.1.0 znver2: 3066070.28 (SE +/- 61029.58, N = 15)
  GCC 10.0.0 znver2: 3031706.22 (SE +/- 51486.64, N = 15)
  GCC 10.0.0: 3042507.47 (SE +/- 47460.73, N = 15)
  1. (CC) gcc options: -ggdb -rdynamic -lm -ldl -pthread
Redis 4.0.8 - Test: SET (Requests Per Second, More Is Better)
  GCC 9.1.0: 2122162.94 (SE +/- 30290.32, N = 4)
  GCC 9.1.0 znver2: 2169531.00 (SE +/- 19021.82, N = 3)
  GCC 10.0.0 znver2: 2084989.88 (SE +/- 14796.01, N = 3)
  GCC 10.0.0: 2051361.33 (SE +/- 28123.08, N = 3)
  1. (CC) gcc options: -ggdb -rdynamic -lm -ldl -pthread
PostgreSQL pgbench 10.3 - Scaling: Buffer Test - Test: Normal Load - Mode: Read Only (TPS, More Is Better)
  GCC 9.1.0: 300353.09 (SE +/- 513.53, N = 3)
  GCC 9.1.0 znver2: 297539.89 (SE +/- 235.79, N = 3)
  GCC 10.0.0 znver2: 298969.75 (SE +/- 237.85, N = 3)
  GCC 10.0.0: 300244.81 (SE +/- 102.78, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
PostgreSQL pgbench 10.3 - Scaling: Buffer Test - Test: Normal Load - Mode: Read Write (TPS, More Is Better)
  GCC 9.1.0: 29178.23 (SE +/- 55.36, N = 3)
  GCC 9.1.0 znver2: 29149.20 (SE +/- 40.41, N = 3)
  GCC 10.0.0 znver2: 29148.60 (SE +/- 124.84, N = 3)
  GCC 10.0.0: 29372.39 (SE +/- 31.16, N = 3)
  Extra flags (znver2 runs): -march=znver2
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
Phoronix Test Suite v10.8.5