Znver2 GCC9 Compiler Tests

AMD Zen 2 GCC compiler benchmarks on Ubuntu Linux. Tests by Michael Larabel for a future article.

-O3 -march=znver2, -O3 -march=znver1, -O3 -march=x86-64 (identical system for all three configurations):

    Processor: AMD Ryzen 9 3900X 12-Core @ 3.80GHz (12 Cores / 24 Threads), Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (0066 BIOS), Chipset: AMD Device 1480, Memory: 16384MB, Disk: 2000GB Force MP600, Graphics: Sapphire AMD Baffin [Polaris11] 4GB (1300/1750MHz), Audio: AMD Device aae0, Monitor: ASUS VP28U, Network: Realtek Device 8125 + Intel I211 + Intel Device 2723

    OS: Ubuntu 18.04, Kernel: 5.2.0-999-generic (x86_64) 20190703, Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.20.1, Display Driver: modesetting 1.20.1, OpenGL: 4.5 Mesa 18.2.2 (LLVM 7.0.0), Compiler: GCC 9.1.0, File-System: ext4, Screen Resolution: 3840x2160

FFTW 3.3.6
Build: Stock - Size: 1D FFT Size 4096
Mflops > Higher Is Better
-O3 -march=znver2 . 11231.00 |================================================
-O3 -march=znver1 . 11448.00 |=================================================
-O3 -march=x86-64 . 9534.70 |=========================================

FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 4096
Mflops > Higher Is Better
-O3 -march=znver2 . 8001.67 |==================================================
-O3 -march=znver1 . 7660.90 |================================================
-O3 -march=x86-64 . 7039.43 |============================================

FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 4096
Mflops > Higher Is Better
-O3 -march=znver2 . 56652 |====================================================
-O3 -march=znver1 . 51757 |================================================

FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 4096
Mflops > Higher Is Better
-O3 -march=znver2 . 19960 |====================================================
-O3 -march=znver1 . 19405 |===================================================

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
-O3 -march=znver2 . 3700.64 |==================================================
-O3 -march=znver1 . 3128.65 |==========================================
-O3 -march=x86-64 . 2786.33 |======================================

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
-O3 -march=znver2 . 799.07 |===================================================
-O3 -march=znver1 . 757.56 |================================================
-O3 -march=x86-64 . 766.81 |=================================================

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
-O3 -march=znver2 . 274.11 |===============================================
-O3 -march=znver1 . 260.12 |=============================================
-O3 -march=x86-64 . 297.13 |===================================================

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
-O3 -march=znver2 . 3575.96 |================================================
-O3 -march=znver1 . 3702.03 |=================================================
-O3 -march=x86-64 . 3762.43 |==================================================

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
-O3 -march=znver2 . 11431.93 |=================================================
-O3 -march=znver1 . 8631.93 |=====================================
-O3 -march=x86-64 . 6959.69 |==============================

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
-O3 -march=znver2 . 2422.10 |==================================================
-O3 -march=znver1 . 2291.62 |===============================================
-O3 -march=x86-64 . 2145.57 |============================================

TSCP 1.81
AI Chess Performance
Nodes Per Second > Higher Is Better
-O3 -march=znver2 . 1321681 |================================================
-O3 -march=znver1 . 1372148 |==================================================
-O3 -march=x86-64 . 1333926 |=================================================

John The Ripper 1.9.0-jumbo-1
Test: Blowfish
Real C/S > Higher Is Better
-O3 -march=znver2 . 20232 |=====================================
-O3 -march=znver1 . 28221 |====================================================
-O3 -march=x86-64 . 28401 |====================================================

MKL-DNN 2019-04-16
Harness: IP Batch 1D - Data Type: f32
ms < Lower Is Better
-O3 -march=znver2 . 158.46 |===================================================
-O3 -march=znver1 . 159.95 |===================================================
-O3 -march=x86-64 . 152.36 |=================================================

MKL-DNN 2019-04-16
Harness: Deconvolution Batch deconv_1d - Data Type: f32
ms < Lower Is Better
-O3 -march=znver2 . 216.70 |==================================================
-O3 -march=znver1 . 219.29 |===================================================
-O3 -march=x86-64 . 221.00 |===================================================

MKL-DNN 2019-04-16
Harness: Convolution Batch conv_alexnet - Data Type: f32
ms < Lower Is Better
-O3 -march=znver2 . 2524.93 |=================================================
-O3 -march=znver1 . 2562.89 |==================================================
-O3 -march=x86-64 . 2512.04 |=================================================

VP9 libvpx Encoding 1.8.0
vpxenc VP9 1080p Video Encode
Frames Per Second > Higher Is Better
-O3 -march=znver2 . 175.34 |===================================================
-O3 -march=znver1 . 174.40 |===================================================
-O3 -march=x86-64 . 175.99 |===================================================

x264 2018-09-25
H.264 Video Encoding
Frames Per Second > Higher Is Better
-O3 -march=znver2 . 140.96 |==================================================
-O3 -march=znver1 . 141.82 |==================================================
-O3 -march=x86-64 . 143.27 |===================================================

x265 3.0
H.265 1080p Video Encoding
Frames Per Second > Higher Is Better
-O3 -march=znver2 . 52.91 |====================================================
-O3 -march=znver1 . 53.15 |====================================================
-O3 -march=x86-64 . 53.33 |====================================================

GraphicsMagick 1.3.30
Operation: Rotate
Iterations Per Minute > Higher Is Better
-O3 -march=znver2 . 276 |======================================================
-O3 -march=znver1 . 260 |===================================================
-O3 -march=x86-64 . 261 |===================================================

GraphicsMagick 1.3.30
Operation: Sharpen
Iterations Per Minute > Higher Is Better
-O3 -march=znver2 . 194 |======================================================
-O3 -march=znver1 . 193 |======================================================
-O3 -march=x86-64 . 179 |==================================================

GraphicsMagick 1.3.30
Operation: Resizing
Iterations Per Minute > Higher Is Better
-O3 -march=znver2 . 285 |======================================================
-O3 -march=znver1 . 279 |=====================================================
-O3 -march=x86-64 . 270 |===================================================

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
-O3 -march=znver2 . 1347.96 |==================================================
-O3 -march=znver1 . 1345.95 |==================================================
-O3 -march=x86-64 . 1336.77 |==================================================

7-Zip Compression 16.02
Compress Speed Test
MIPS > Higher Is Better
-O3 -march=znver2 . 78562 |====================================================
-O3 -march=znver1 . 78412 |====================================================
-O3 -march=x86-64 . 78655 |====================================================

Stockfish 9
Total Time
Nodes Per Second > Higher Is Better
-O3 -march=znver2 . 39471726 |================================================
-O3 -march=znver1 . 39908751 |=================================================
-O3 -march=x86-64 . 39537930 |=================================================

Timed LLVM Compilation 6.0.1
Time To Compile
Seconds < Lower Is Better
-O3 -march=znver2 . 286.56 |===================================================
-O3 -march=znver1 . 284.24 |===================================================
-O3 -march=x86-64 . 281.90 |==================================================

Timed PHP Compilation 7.1.9
Time To Compile
Seconds < Lower Is Better
-O3 -march=znver2 . 53.41 |====================================================
-O3 -march=znver1 . 53.44 |====================================================
-O3 -march=x86-64 . 52.89 |===================================================

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
-O3 -march=znver2 . 39.46 |===============================================
-O3 -march=znver1 . 39.42 |===============================================
-O3 -march=x86-64 . 43.20 |====================================================

AOBench
Size: 2048 x 2048 - Total Time
Seconds < Lower Is Better
-O3 -march=znver2 . 34.64 |=================================================
-O3 -march=znver1 . 35.14 |==================================================
-O3 -march=x86-64 . 36.49 |====================================================

Bullet Physics Engine 2.81
Test: Raytests
Seconds < Lower Is Better
-O3 -march=znver2 . 2.06 |===================================================
-O3 -march=znver1 . 2.13 |=====================================================
-O3 -march=x86-64 . 2.09 |====================================================

Bullet Physics Engine 2.81
Test: 3000 Fall
Seconds < Lower Is Better
-O3 -march=znver2 . 3.23 |===================================================
-O3 -march=znver1 . 3.36 |=====================================================
-O3 -march=x86-64 . 3.37 |=====================================================

Bullet Physics Engine 2.81
Test: 1000 Stack
Seconds < Lower Is Better
-O3 -march=znver2 . 3.79 |=================================================
-O3 -march=znver1 . 3.98 |====================================================
-O3 -march=x86-64 . 4.08 |=====================================================

Bullet Physics Engine 2.81
Test: 1000 Convex
Seconds < Lower Is Better
-O3 -march=znver2 . 3.59 |===================================================
-O3 -march=znver1 . 3.73 |=====================================================
-O3 -march=x86-64 . 3.70 |=====================================================

Bullet Physics Engine 2.81
Test: 136 Ragdolls
Seconds < Lower Is Better
-O3 -march=znver2 . 2.05 |==================================================
-O3 -march=znver1 . 2.13 |====================================================
-O3 -march=x86-64 . 2.17 |=====================================================

XZ Compression 5.2.4
Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9
Seconds < Lower Is Better
-O3 -march=znver2 . 25.21 |====================================================
-O3 -march=znver1 . 25.09 |====================================================
-O3 -march=x86-64 . 25.22 |====================================================

FLAC Audio Encoding 1.3.2
WAV To FLAC
Seconds < Lower Is Better
-O3 -march=znver2 . 8.12 |=====================================================
-O3 -march=znver1 . 8.15 |=====================================================
-O3 -march=x86-64 . 7.75 |==================================================

LAME MP3 Encoding 3.100
WAV To MP3
Seconds < Lower Is Better
-O3 -march=znver2 . 7.04 |====================================================
-O3 -march=znver1 . 6.98 |====================================================
-O3 -march=x86-64 . 7.16 |=====================================================

PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Only
TPS > Higher Is Better
-O3 -march=znver2 . 382751.07 |================================================
-O3 -march=znver1 . 383329.30 |================================================
-O3 -march=x86-64 . 385510.95 |================================================

PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Write
TPS > Higher Is Better
-O3 -march=znver2 . 30044.32 |=================================================
-O3 -march=znver1 . 29380.50 |================================================
-O3 -march=x86-64 . 30143.80 |=================================================

CppPerformanceBenchmarks 9
Test: Math Library
Seconds < Lower Is Better
-O3 -march=znver2 . 309.02 |===================================================
-O3 -march=znver1 . 311.77 |===================================================
-O3 -march=x86-64 . 312.01 |===================================================

CppPerformanceBenchmarks 9
Test: Function Objects
Seconds < Lower Is Better
-O3 -march=znver2 . 14.56 |===================================================
-O3 -march=znver1 . 14.99 |====================================================
-O3 -march=x86-64 . 14.90 |====================================================

Redis 4.0.8
Test: GET
Requests Per Second > Higher Is Better
-O3 -march=znver2 . 3090850.69 |==============================================
-O3 -march=znver1 . 3126726.23 |===============================================
-O3 -march=x86-64 . 3026826.32 |=============================================

Redis 4.0.8
Test: SET
Requests Per Second > Higher Is Better
-O3 -march=znver2 . 2089609.47 |===============================================
-O3 -march=znver1 . 2080013.12 |===============================================
-O3 -march=x86-64 . 2074827.24 |===============================================

Memcached mcperf 1.5.10
Method: Get
Operations Per Second > Higher Is Better
-O3 -march=znver2 . 110755.89 |===============================================
-O3 -march=znver1 . 112447.77 |================================================
-O3 -march=x86-64 . 107814.85 |==============================================

Memcached mcperf 1.5.10
Method: Set
Operations Per Second > Higher Is Better
-O3 -march=znver2 . 69121.64 |=================================================
-O3 -march=znver1 . 60810.38 |===========================================
-O3 -march=x86-64 . 60770.80 |===========================================
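
The results mix "Higher Is Better" and "Lower Is Better" metrics, so raw numbers are not directly comparable across tests. As a rough sketch of how the -march=znver2 figures could be put on one scale against the -march=x86-64 baseline, the Python snippet below converts a pair of results into a signed percentage. The function name `relative_gain` and the choice of three sample tests are illustrative assumptions, not part of the original result file; the numeric values are copied from the results above.

```python
def relative_gain(value, baseline, higher_is_better=True):
    """Percent improvement of `value` over `baseline`; positive means better.

    For "Lower Is Better" metrics (e.g. seconds) the ratio is inverted so
    that a shorter time still yields a positive number.
    """
    if higher_is_better:
        return (value / baseline - 1.0) * 100.0
    return (baseline / value - 1.0) * 100.0


# (test name, znver2 result, x86-64 result, higher_is_better)
# Values taken from the result listing; the selection is arbitrary.
samples = [
    ("SciMark 2.0 Composite (Mflops)", 3700.64, 2786.33, True),
    ("John The Ripper Blowfish (Real C/S)", 20232, 28401, True),
    ("C-Ray 1.1 (Seconds)", 39.46, 43.20, False),
]

for name, znver2, generic, higher_is_better in samples:
    gain = relative_gain(znver2, generic, higher_is_better)
    print(f"{name}: {gain:+.1f}% vs -march=x86-64")
```

A negative value, as for the Blowfish test, means the znver2 build was slower than the generic x86-64 build in that run.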