AOCC 1.0 Compiler Tuning

AMD Ryzen 7 1700 Eight-Core testing with a MSI B350 TOMAHAWK (MS-7A34) v1.0 and HIS AMD Radeon HD 7750/8740 / R7 250E 1024MB on Ubuntu 17.04 via the Phoronix Test Suite.

Configurations tested:

  -O0
  -O2
  -O3
  -O3 -march=znver1
  -O3 -march=znver1 -mllvm -enable-strided-vectorization
  -Ofast -march=znver1

System (identical for all six configurations):

  Processor: AMD Ryzen 7 1700 Eight-Core @ 3.00GHz (16 Cores), Motherboard: MSI B350 TOMAHAWK (MS-7A34) v1.0, Chipset: AMD Device 1450, Memory: 16384MB, Disk: 120GB Samsung SSD 840, Graphics: HIS AMD Radeon HD 7750/8740 / R7 250E 1024MB, Audio: AMD Cape Verde/Pitcairn, Monitor: DELL S2409W, Network: Realtek RTL8111/8168/8411

  OS: Ubuntu 17.04, Kernel: 4.12.0-999-generic (x86_64) 20170518, Desktop: Unity 7.5.0, Display Driver: modesetting 1.19.3, Compiler: Clang 4.0.0, File-System: ext4, Screen Resolution: 1920x1080

FFTW 3.3.4
Build: Float + SSE - Size: 2D FFT Size 1024
Mflops > Higher Is Better
-O0 .................................................... 2434.70 |=
-O2 .................................................... 20190.00 |============
-O3 .................................................... 20297.00 |============
-O3 -march=znver1 ...................................... 20171.00 |============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 20433.00 |============
-Ofast -march=znver1 ................................... 20676.00 |============

Timed MAFFT Alignment 6.864
Multiple Sequence Alignment
Seconds < Lower Is Better
-O0 .................................................... 3.87 |================
-O2 .................................................... 3.63 |===============
-O3 .................................................... 3.77 |================
-O3 -march=znver1 ...................................... 3.66 |===============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 3.78 |================
-Ofast -march=znver1 ................................... 3.82 |================

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
-O0 .................................................... 2147.42 |=============
-O2 .................................................... 2130.55 |=============
-O3 .................................................... 2196.34 |=============
-O3 -march=znver1 ...................................... 2156.12 |=============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 2139.67 |=============
-Ofast -march=znver1 ................................... 2147.94 |=============

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
-O0 .................................................... 643.45 |==============
-O2 .................................................... 642.37 |==============
-O3 .................................................... 643.43 |==============
-O3 -march=znver1 ...................................... 660.30 |==============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 659.88 |==============
-Ofast -march=znver1 ................................... 659.67 |==============

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
-O0 .................................................... 134.67 |==============
-O2 .................................................... 134.06 |==============
-O3 .................................................... 134.23 |==============
-O3 -march=znver1 ...................................... 131.50 |==============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 134.46 |==============
-Ofast -march=znver1 ................................... 135.39 |==============

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
-O0 .................................................... 2648.16 |=============
-O2 .................................................... 2646.12 |=============
-O3 .................................................... 2667.90 |=============
-O3 -march=znver1 ...................................... 2636.32 |=============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 2616.74 |=============
-Ofast -march=znver1 ................................... 2619.99 |=============

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
-O0 .................................................... 5637.05 |=============
-O2 .................................................... 5554.97 |============
-O3 .................................................... 5859.93 |=============
-O3 -march=znver1 ...................................... 5670.42 |=============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 5605.43 |============
-Ofast -march=znver1 ................................... 5644.15 |=============

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
-O0 .................................................... 1673.77 |=============
-O2 .................................................... 1675.22 |=============
-O3 .................................................... 1676.24 |=============
-O3 -march=znver1 ...................................... 1682.07 |=============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 1681.87 |=============
-Ofast -march=znver1 ................................... 1680.49 |=============

TSCP 1.81
AI Chess Performance
Nodes Per Second > Higher Is Better
-O0 .................................................... 475710 |======
-O2 .................................................... 1016572 |=============
-O3 .................................................... 1053919 |=============
-O3 -march=znver1 ...................................... 1021094 |=============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 1021094 |=============
-Ofast -march=znver1 ................................... 1029891 |=============

GraphicsMagick 1.3.19
Operation: Blur
Iterations Per Minute > Higher Is Better
-O0 .................................................... 46 |=======
-O2 .................................................... 107 |=================
-O3 .................................................... 106 |=================
-O3 -march=znver1 ...................................... 102 |================
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 106 |=================
-Ofast -march=znver1 ................................... 100 |================

GraphicsMagick 1.3.19
Operation: Sharpen
Iterations Per Minute > Higher Is Better
-O0 .................................................... 21 |======
-O2 .................................................... 57 |================
-O3 .................................................... 58 |================
-O3 -march=znver1 ...................................... 59 |=================
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 60 |=================
-Ofast -march=znver1 ................................... 64 |==================

GraphicsMagick 1.3.19
Operation: Resizing
Iterations Per Minute > Higher Is Better
-O0 .................................................... 49 |======
-O2 .................................................... 132 |================
-O3 .................................................... 128 |===============
-O3 -march=znver1 ...................................... 137 |================
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 143 |=================
-Ofast -march=znver1 ................................... 138 |================

GraphicsMagick 1.3.19
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
-O0 .................................................... 72 |=======
-O2 .................................................... 163 |=================
-O3 .................................................... 166 |=================
-O3 -march=znver1 ...................................... 149 |===============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 157 |================
-Ofast -march=znver1 ................................... 161 |================

GraphicsMagick 1.3.19
Operation: Local Adaptive Thresholding
Iterations Per Minute > Higher Is Better
-O0 .................................................... 28 |===
-O2 .................................................... 133 |================
-O3 .................................................... 135 |================
-O3 -march=znver1 ...................................... 135 |================
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 141 |=================
-Ofast -march=znver1 ................................... 135 |================

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
-O0 .................................................... 325.64 |====
-O2 .................................................... 1157.38 |=============
-O3 .................................................... 1150.22 |=============
-O3 -march=znver1 ...................................... 1133.27 |=============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 1133.24 |=============
-Ofast -march=znver1 ................................... 1036.50 |============

C-Ray 1.1
Total Time
Seconds < Lower Is Better
-O0 .................................................... 32.30 |===============
-O2 .................................................... 14.01 |=======
-O3 .................................................... 14.00 |=======
-O3 -march=znver1 ...................................... 13.49 |======
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 13.49 |======
-Ofast -march=znver1 ................................... 13.41 |======

Stockfish 2014-11-26
Total Time
ms < Lower Is Better
-O0 .................................................... 3711 |================
-O2 .................................................... 3712 |================
-O3 .................................................... 3703 |================
-O3 -march=znver1 ...................................... 3643 |================
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 3644 |================
-Ofast -march=znver1 ................................... 3618 |================

FLAC Audio Encoding 1.3.1
WAV To FLAC
Seconds < Lower Is Better
-O0 .................. 62.12 |=================================================
-O2 .................. 6.80 |=====
-O3 .................. 6.78 |=====
-O3 -march=znver1 .... 5.64 |====
-Ofast -march=znver1 . 6.26 |=====

LAME MP3 Encoding 3.99.3
WAV To MP3
Seconds < Lower Is Better
-O0 .................. 36.48 |=================================================
-O2 .................. 9.40 |=============
-O3 .................. 9.42 |=============
-O3 -march=znver1 .... 10.61 |==============
-Ofast -march=znver1 . 10.61 |==============

WavPack Audio Encoding 5.1
WAV To WavPack
Seconds < Lower Is Better
-O0 .................. 7.76 |==================================================
-O2 .................. 6.52 |==========================================
-O3 .................. 6.51 |==========================================
-O3 -march=znver1 .... 6.43 |=========================================
-Ofast -march=znver1 . 6.46 |==========================================

OpenSSL 1.0.1g
RSA 4096-bit Performance
Signs Per Second > Higher Is Better
-O0 .................................................... 986.70 |==============
-O2 .................................................... 987.67 |==============
-O3 .................................................... 986.93 |==============
-O3 -march=znver1 ...................................... 987.43 |==============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 986.73 |==============
-Ofast -march=znver1 ................................... 986.43 |==============

libjpeg-turbo tjbench 1.5.1
Test: Decompression Throughput
Megapixels/sec > Higher Is Better
-O0 .................................................... 71.84 |======
-O2 .................................................... 161.68 |=============
-O3 .................................................... 162.59 |=============
-O3 -march=znver1 ...................................... 168.45 |==============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 168.66 |==============
-Ofast -march=znver1 ................................... 165.24 |==============

PostgreSQL pgbench 9.6.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Write
TPS > Higher Is Better
-O0 .................................................... 1865.82 |============
-O2 .................................................... 1906.40 |=============
-O3 .................................................... 1942.42 |=============
-O3 -march=znver1 ...................................... 1930.21 |=============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 1886.70 |=============

PostgreSQL pgbench 9.6.3
Scaling: Buffer Test - Test: Single Thread - Mode: Read Write
TPS > Higher Is Better
-O0 .................................................... 201.67 |============
-O2 .................................................... 225.06 |==============
-O3 .................................................... 226.94 |==============
-O3 -march=znver1 ...................................... 225.61 |==============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 226.38 |==============

PostgreSQL pgbench 9.6.3
Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Write
TPS > Higher Is Better
-O0 .................................................... 1985.80 |=============
-O2 .................................................... 1903.00 |============
-O3 .................................................... 2037.20 |=============
-O3 -march=znver1 ...................................... 1952.70 |============
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 1932.86 |============

Redis 3.0.1
Test: GET
Requests Per Second > Higher Is Better
-O0 .................................................... 1204370.92 |======
-O2 .................................................... 1983766.97 |==========
-O3 .................................................... 1971325.87 |==========
-O3 -march=znver1 ...................................... 1945705.79 |==========
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 2008848.23 |==========
-Ofast -march=znver1 ................................... 1953486.67 |==========

Redis 3.0.1
Test: SET
Requests Per Second > Higher Is Better
-O0 .................................................... 890829.00 |======
-O2 .................................................... 1377516.08 |==========
-O3 .................................................... 1399320.00 |==========
-O3 -march=znver1 ...................................... 1379585.37 |==========
-O3 -march=znver1 -mllvm -enable-strided-vectorization . 1406611.75 |==========
-Ofast -march=znver1 ................................... 1386323.96 |==========
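
Because the results above mix "Higher Is Better" and "Lower Is Better" metrics, comparing configurations is easier once each result is expressed as a speedup over the -O0 baseline. The following is a minimal sketch of that arithmetic, not part of the Phoronix Test Suite output; the `speedup` helper and the `RESULTS` layout are hypothetical names introduced here, and only two tests are copied in from the tables above.

```python
# Sketch: normalize a few of the results above to a speedup over -O0.
# `speedup` and `RESULTS` are illustrative names, not Phoronix Test Suite APIs.

def speedup(baseline: float, value: float, higher_is_better: bool) -> float:
    """Return how many times faster `value` is than `baseline`."""
    # For throughput metrics (Mflops, TPS) a larger value is faster;
    # for time metrics (seconds, ms) a smaller value is faster.
    return value / baseline if higher_is_better else baseline / value

# Values copied from the result tables above.
RESULTS = {
    "FFTW 3.3.4 (Mflops)": {
        "higher_is_better": True,
        "runs": {
            "-O0": 2434.70,
            "-O2": 20190.00,
            "-O3": 20297.00,
            "-Ofast -march=znver1": 20676.00,
        },
    },
    "FLAC Audio Encoding 1.3.1 (Seconds)": {
        "higher_is_better": False,
        "runs": {
            "-O0": 62.12,
            "-O2": 6.80,
            "-O3": 6.78,
            "-O3 -march=znver1": 5.64,
            "-Ofast -march=znver1": 6.26,
        },
    },
}

def report(results: dict) -> list[str]:
    """Build one text line per configuration with its speedup over -O0."""
    lines = []
    for test, data in results.items():
        base = data["runs"]["-O0"]
        for flags, value in data["runs"].items():
            ratio = speedup(base, value, data["higher_is_better"])
            lines.append(f"{test} | {flags}: {ratio:.2f}x vs -O0")
    return lines

if __name__ == "__main__":
    print("\n".join(report(RESULTS)))
```

The same normalization could be applied to the remaining tests by adding their rows to `RESULTS`; tests where -O0 is within noise of the optimized builds (for example OpenSSL RSA 4096-bit) would simply report a ratio close to 1.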