GCC AMD Ryzen Zen znver1 Compiler Optimizations

AMD Ryzen 7 1800X Eight-Core testing for a future article.

Configurations tested: -O3, -O3 -march=bdver1, -O3 -march=bdver4, -O3 -march=znver1, -O3 -march=k8-sse3

System (identical for all five configurations):

  Processor: AMD Ryzen 7 1800X Eight-Core @ 3.60GHz (16 Cores), Motherboard: MSI X370 XPOWER GAMING TITANIUM (MS-7A31) v1.0, Chipset: AMD Device 1450, Memory: 16384MB, Disk: 256GB INTEL SSDPEKKW256G7, Graphics: Sapphire AMD Radeon R9 FURY / NANO 4096MB, Audio: AMD Fiji HDMI/DP, Monitor: DELL P2415Q, Network: Intel I211 Gigabit Connection

  OS: Ubuntu 17.04, Kernel: 4.10.0-9-generic (x86_64), Desktop: Unity 7.5.0, Display Server: X Server 1.18.4, Display Driver: modesetting 1.18.4, OpenGL: 4.5 Mesa 17.0.0- padoka PPA Gallium 0.4 (LLVM 4.0.0), Vulkan: 1.0.39, Compiler: GCC 6.3.0 20161229, File-System: ext4, Screen Resolution: 3840x2160

C-Ray 1.1
Total Time
Seconds < Lower Is Better
-O3 ................  8.17
-O3 -march=znver1 ..  7.64
-O3 -march=k8-sse3 . 12.36

GraphicsMagick 1.3.19
Operation: Sharpen
Iterations Per Minute > Higher Is Better
-O3 ................ 187
-O3 -march=znver1 .. 204
-O3 -march=k8-sse3 . 157

LAME MP3 Encoding 3.99.3
WAV To MP3
Seconds < Lower Is Better
-O3 ................ 9.00
-O3 -march=znver1 .. 8.73
-O3 -march=k8-sse3 . 9.99

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
-O3 ................ 157.64
-O3 -march=bdver1 .. 150.42
-O3 -march=bdver4 .. 149.47
-O3 -march=znver1 .. 148.46
-O3 -march=k8-sse3 . 168.82

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
-O3 ................ 3436.15
-O3 -march=bdver1 .. 3460.15
-O3 -march=bdver4 .. 3446.64
-O3 -march=znver1 .. 3310.83
-O3 -march=k8-sse3 . 3071.16

GraphicsMagick 1.3.19
Operation: Blur
Iterations Per Minute > Higher Is Better
-O3 ................ 186
-O3 -march=znver1 .. 192
-O3 -march=k8-sse3 . 173

FLAC Audio Encoding 1.3.1
WAV To FLAC
Seconds < Lower Is Better
-O3 ................ 5.22
-O3 -march=znver1 .. 5.70
-O3 -march=k8-sse3 . 5.71

Timed Apache Compilation 2.4.7
Time To Compile
Seconds < Lower Is Better
-O3 ................ 24.33
-O3 -march=bdver1 .. 26.48
-O3 -march=bdver4 .. 26.38
-O3 -march=znver1 .. 26.32
-O3 -march=k8-sse3 . 24.51

FFTW 3.3.4
Build: Float + SSE - Size: 2D FFT Size 1024
Mflops > Higher Is Better
-O3 ................ 21035
-O3 -march=znver1 .. 22209
-O3 -march=k8-sse3 . 20628

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
-O3 ................ 2421.08
-O3 -march=bdver1 .. 2518.05
-O3 -march=bdver4 .. 2519.53
-O3 -march=znver1 .. 2568.32
-O3 -march=k8-sse3 . 2524.93

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
-O3 ................ 1195.38
-O3 -march=bdver1 .. 1194.42
-O3 -march=bdver4 .. 1186.02
-O3 -march=znver1 .. 1135.46
-O3 -march=k8-sse3 . 1190.23

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
-O3 ................ 1585.02
-O3 -march=bdver1 .. 1605.68
-O3 -march=bdver4 .. 1602.85
-O3 -march=znver1 .. 1588.29
-O3 -march=k8-sse3 . 1533.07

GraphicsMagick 1.3.19
Operation: Resizing
Iterations Per Minute > Higher Is Better
-O3 ................ 241
-O3 -march=znver1 .. 252
-O3 -march=k8-sse3 . 241

TTSIOD 3D Renderer 2.3a
Phong Rendering With Soft-Shadow Mapping
FPS > Higher Is Better
-O3 ................ 342.43
-O3 -march=bdver1 .. 355.79
-O3 -march=znver1 .. 355.26
-O3 -march=k8-sse3 . 343.39

GraphicsMagick 1.3.19
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
-O3 ................ 254
-O3 -march=znver1 .. 261
-O3 -march=k8-sse3 . 254

GraphicsMagick 1.3.19
Operation: Local Adaptive Thresholding
Iterations Per Minute > Higher Is Better
-O3 ................ 140
-O3 -march=znver1 .. 143
-O3 -march=k8-sse3 . 140

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
-O3 ................ 738.39
-O3 -march=bdver1 .. 728.45
-O3 -march=bdver4 .. 727.88
-O3 -march=znver1 .. 738.77
-O3 -march=k8-sse3 . 727.86

libjpeg-turbo tjbench 1.5.1
Test: Decompression Throughput
Megapixels/sec > Higher Is Better
-O3 ................ 178.19
-O3 -march=znver1 .. 180.57
-O3 -march=k8-sse3 . 178.13

Timed HMMer Search 2.3.2
Pfam Database Search
Seconds < Lower Is Better
-O3 ................ 7.11
-O3 -march=znver1 .. 7.12
-O3 -march=k8-sse3 . 7.20

John The Ripper 1.8.0
Test: Blowfish
Real C/S > Higher Is Better
-O3 ................ 12798
-O3 -march=bdver1 .. 12878
-O3 -march=bdver4 .. 12887
-O3 -march=znver1 .. 12881
-O3 -march=k8-sse3 . 12829

Timed ImageMagick Compilation 6.9.0
Time To Compile
Seconds < Lower Is Better
-O3 ................ 177.31
-O3 -march=bdver1 .. 177.46
-O3 -march=bdver4 .. 178.17
-O3 -march=znver1 .. 177.73
-O3 -march=k8-sse3 . 177.93

Stockfish 2014-11-26
Total Time
ms < Lower Is Better
-O3 ................ 3615
-O3 -march=bdver1 .. 3623
-O3 -march=znver1 .. 3611
-O3 -march=k8-sse3 . 3608

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
-O3 ................ 1171.83
-O3 -march=bdver1 .. 1171.31
-O3 -march=bdver4 .. 1170.73
-O3 -march=znver1 .. 1175.08
-O3 -march=k8-sse3 . 1172.56

Hierarchical INTegration 1.0
Test: FLOAT
QUIPs > Higher Is Better
-O3 ................ 333762114.34
-O3 -march=znver1 .. 333251163.50
-O3 -march=k8-sse3 . 332568933.81
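
The results above mix "lower is better" and "higher is better" metrics, so raw differences are not directly comparable across tests. As a rough sketch, the Python snippet below expresses -O3 -march=znver1 as a percentage gain over plain -O3 for four of the tests listed above. The function name `relative_gain` and the sign convention (positive means znver1 is faster) are assumptions made for illustration; they are not part of the benchmark output.

```python
def relative_gain(baseline, candidate, higher_is_better):
    """Percent gain of candidate over baseline; positive means candidate is better.

    For time-like metrics (lower is better) the ratio is inverted, so the
    result reads as a speedup rather than a change in elapsed time.
    """
    if higher_is_better:
        return (candidate / baseline - 1.0) * 100.0
    return (baseline / candidate - 1.0) * 100.0


# (test name, -O3 result, -O3 -march=znver1 result, higher_is_better)
# Values are copied from the result listing; the selection is illustrative.
results = [
    ("C-Ray 1.1 (seconds)", 8.17, 7.64, False),
    ("GraphicsMagick Sharpen (iterations/min)", 187, 204, True),
    ("FLAC Audio Encoding (seconds)", 5.22, 5.70, False),
    ("Himeno Benchmark (MFLOPS)", 1195.38, 1135.46, True),
]

for name, base, znver1, higher_is_better in results:
    gain = relative_gain(base, znver1, higher_is_better)
    print(f"{name}: {gain:+.1f}% vs. -O3")
```

Under this convention C-Ray and GraphicsMagick Sharpen come out positive for znver1, while FLAC encoding and Himeno come out negative, which matches the direction of the raw numbers.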