AOCC 1.0 Compiler Tuning
AMD Ryzen 7 1700 Eight-Core testing with an MSI B350 TOMAHAWK (MS-7A34) v1.0 and HIS AMD Radeon HD 7750/8740 / R7 250E 1024MB on Ubuntu 17.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/1705218-TR-AOCC10COM39 .
Processor: AMD Ryzen 7 1700 Eight-Core @ 3.00GHz (16 Cores)
Motherboard: MSI B350 TOMAHAWK (MS-7A34) v1.0
Chipset: AMD Device 1450
Memory: 16384MB
Disk: 120GB Samsung SSD 840
Graphics: HIS AMD Radeon HD 7750/8740 / R7 250E 1024MB
Audio: AMD Cape Verde/Pitcairn
Monitor: DELL S2409W
Network: Realtek RTL8111/8168/8411
OS: Ubuntu 17.04
Kernel: 4.12.0-999-generic (x86_64) 20170518
Desktop: Unity 7.5.0
Display Driver: modesetting 1.19.3
Compiler: Clang 4.0.0
File-System: ext4
Screen Resolution: 1920x1080

Tested configurations (identical system for all):
  -O0
  -O2
  -O3
  -O3 -march=znver1
  -O3 -march=znver1 -mllvm -enable-strided-vectorization
  -Ofast -march=znver1

Compiler Details - Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver1
Processor Details - Scaling Governor: acpi-cpufreq ondemand

Result overview. Columns, left to right:
  (1) -O0
  (2) -O2
  (3) -O3
  (4) -O3 -march=znver1
  (5) -O3 -march=znver1 -mllvm -enable-strided-vectorization
  (6) -Ofast -march=znver1
A dash marks a test with no result for that configuration.

Test                                                         (1)         (2)         (3)         (4)         (5)         (6)
fftw: Float + SSE - 2D FFT Size 1024                     2434.70       20190       20297       20171       20433       20676
mafft: Multiple Sequence Alignment                          3.87        3.63        3.77        3.66        3.78        3.82
scimark2: Composite                                      2147.42     2130.55     2196.34     2156.12     2139.67     2147.94
scimark2: Monte Carlo                                     643.45      642.37      643.43      660.30      659.88      659.67
scimark2: Fast Fourier Transform                          134.67      134.06      134.23      131.50      134.46      135.39
scimark2: Sparse Matrix Multiply                         2648.16     2646.12     2667.90     2636.32     2616.74     2619.99
scimark2: Dense LU Matrix Factorization                  5637.05     5554.97     5859.93     5670.42     5605.43     5644.15
scimark2: Jacobi Successive Over-Relaxation              1673.77     1675.22     1676.24     1682.07     1681.87     1680.49
tscp: AI Chess Performance                                475710     1016572     1053919     1021094     1021094     1029891
graphics-magick: Blur                                         46         107         106         102         106         100
graphics-magick: Sharpen                                      21          57          58          59          60          64
graphics-magick: Resizing                                     49         132         128         137         143         138
graphics-magick: HWB Color Space                              72         163         166         149         157         161
graphics-magick: Local Adaptive Thresholding                  28         133         135         135         141         135
himeno: Poisson Pressure Solver                           325.64     1157.38     1150.22     1133.27     1133.24     1036.50
c-ray: Total Time                                          32.30       14.01       14.00       13.49       13.49       13.41
stockfish: Total Time                                       3711        3712        3703        3643        3644        3618
encode-flac: WAV To FLAC                                   62.12        6.80        6.78        5.64           -        6.26
encode-mp3: WAV To MP3                                     36.48        9.40        9.42       10.61           -       10.61
encode-wavpack: WAV To WavPack                              7.76        6.52        6.51        6.43           -        6.46
openssl: RSA 4096-bit Performance                         986.70      987.67      986.93      987.43      986.73      986.43
tjbench: Decompression Throughput                          71.84      161.68      162.59      168.45      168.66      165.24
pgbench: Buffer Test - Normal Load - Read Write          1865.82     1906.40     1942.42     1930.21     1886.70           -
pgbench: Buffer Test - Single Thread - Read Write         201.67      225.06      226.94      225.61      226.38           -
pgbench: Buffer Test - Heavy Contention - Read Write     1985.80     1903.00     2037.20     1952.70     1932.86           -
redis: GET                                            1204370.92  1983766.97  1971325.87  1945705.79  2008848.23  1953486.67
redis: SET                                             890829.00  1377516.08  1399320.00  1379585.37  1406611.75  1386323.96

FFTW 3.3.4
Build: Float + SSE - Size: 2D FFT Size 1024
Mflops, More Is Better
  -O0: 2434.70 (SE +/- 2.81, N = 5)
  -O2: 20190.00 (SE +/- 110.89, N = 5)
  -O3: 20297.00 (SE +/- 89.96, N = 5)
  -O3 -march=znver1: 20171.00 (SE +/- 65.74, N = 5)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 20433.00 (SE +/- 81.73, N = 5)
  -Ofast -march=znver1: 20676.00 (SE +/- 51.64, N = 5)
1. (CC) gcc options: -lm

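Each result carries an "SE +/-" figure and a trial count N. The export does not state how the Phoronix Test Suite derives it, so the following is only a minimal sketch of the conventional standard error of the mean (sample standard deviation divided by the square root of N). The trial values are hypothetical and are not taken from this result file.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two trials")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-trial results (assumption: not from the export).
trials = [2430.0, 2440.0, 2435.0, 2428.0, 2442.0]

mean = statistics.mean(trials)
se = standard_error(trials)
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(trials)})")
```

A small SE relative to the mean suggests the trials agreed closely; differences between configurations that are smaller than the SE are hard to distinguish from run-to-run noise.
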
Timed MAFFT Alignment 6.864
Multiple Sequence Alignment
Seconds, Fewer Is Better
  -O0: 3.87 (SE +/- 0.06, N = 4)
  -O2: 3.63 (SE +/- 0.11, N = 6)
  -O3: 3.77 (SE +/- 0.02, N = 3)
  -O3 -march=znver1: 3.66 (SE +/- 0.08, N = 6)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 3.78 (SE +/- 0.10, N = 6)
  -Ofast -march=znver1: 3.82 (SE +/- 0.09, N = 6)
1. (CC) gcc options: -O3 -lm -lpthread

SciMark 2.0
Computational Test: Composite
Mflops, More Is Better
  -O0: 2147.42 (SE +/- 2.89, N = 4)
  -O2: 2130.55 (SE +/- 8.24, N = 4)
  -O3: 2196.34 (SE +/- 28.83, N = 4)
  -O3 -march=znver1: 2156.12 (SE +/- 4.05, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2139.67 (SE +/- 5.52, N = 4)
  -Ofast -march=znver1: 2147.94 (SE +/- 5.09, N = 4)
1. (CC) gcc options: -lm

SciMark 2.0
Computational Test: Monte Carlo
Mflops, More Is Better
  -O0: 643.45 (SE +/- 0.13, N = 4)
  -O2: 642.37 (SE +/- 1.18, N = 4)
  -O3: 643.43 (SE +/- 0.30, N = 4)
  -O3 -march=znver1: 660.30 (SE +/- 0.05, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 659.88 (SE +/- 0.16, N = 4)
  -Ofast -march=znver1: 659.67 (SE +/- 0.11, N = 4)
1. (CC) gcc options: -lm

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops, More Is Better
  -O0: 134.67 (SE +/- 0.39, N = 4)
  -O2: 134.06 (SE +/- 0.36, N = 4)
  -O3: 134.23 (SE +/- 0.25, N = 4)
  -O3 -march=znver1: 131.50 (SE +/- 3.77, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 134.46 (SE +/- 0.16, N = 4)
  -Ofast -march=znver1: 135.39 (SE +/- 0.42, N = 4)
1. (CC) gcc options: -lm

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops, More Is Better
  -O0: 2648.16 (SE +/- 7.78, N = 4)
  -O2: 2646.12 (SE +/- 4.64, N = 4)
  -O3: 2667.90 (SE +/- 39.95, N = 4)
  -O3 -march=znver1: 2636.32 (SE +/- 10.92, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2616.74 (SE +/- 6.22, N = 4)
  -Ofast -march=znver1: 2619.99 (SE +/- 5.95, N = 4)
1. (CC) gcc options: -lm

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
  -O0: 5637.05 (SE +/- 21.28, N = 4)
  -O2: 5554.97 (SE +/- 40.80, N = 4)
  -O3: 5859.93 (SE +/- 103.73, N = 4)
  -O3 -march=znver1: 5670.42 (SE +/- 16.49, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 5605.43 (SE +/- 27.73, N = 4)
  -Ofast -march=znver1: 5644.15 (SE +/- 25.89, N = 4)
1. (CC) gcc options: -lm

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better
  -O0: 1673.77 (SE +/- 2.17, N = 4)
  -O2: 1675.22 (SE +/- 0.50, N = 4)
  -O3: 1676.24 (SE +/- 0.43, N = 4)
  -O3 -march=znver1: 1682.07 (SE +/- 0.29, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1681.87 (SE +/- 0.52, N = 4)
  -Ofast -march=znver1: 1680.49 (SE +/- 1.12, N = 4)
1. (CC) gcc options: -lm

TSCP 1.81
AI Chess Performance
Nodes Per Second, More Is Better
  -O0: 475710 (SE +/- 82.20, N = 5)
  -O2: 1016572 (SE +/- 701.56, N = 5)
  -O3: 1053919 (SE +/- 494.31, N = 5)
  -O3 -march=znver1: 1021094 (SE +/- 463.44, N = 5)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1021094 (SE +/- 463.44, N = 5)
  -Ofast -march=znver1: 1029891 (SE +/- 1971.42, N = 5)
1. (CC) gcc options: (none listed)

GraphicsMagick 1.3.19
Operation: Blur
Iterations Per Minute, More Is Better
  -O0: 46
  -O2: 107
  -O3: 106
  -O3 -march=znver1: 102
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 106
  -Ofast -march=znver1: 100
Per-configuration options, as exported: -O0 -O2 -O3 -O3 -march=znver1 -march=znver1 -O3 -mllvm -lpng16 -Ofast -march=znver1
1. (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick 1.3.19
Operation: Sharpen
Iterations Per Minute, More Is Better
  -O0: 21
  -O2: 57
  -O3: 58
  -O3 -march=znver1: 59
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 60
  -Ofast -march=znver1: 64
SE +/- 0.33, N = 3 (reported for one configuration only; the export does not identify which)
Per-configuration options, as exported: -O0 -O2 -O3 -O3 -march=znver1 -march=znver1 -O3 -mllvm -lpng16 -Ofast -march=znver1
1. (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick 1.3.19
Operation: Resizing
Iterations Per Minute, More Is Better
  -O0: 49
  -O2: 132
  -O3: 128
  -O3 -march=znver1: 137
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 143
  -Ofast -march=znver1: 138
SE +/- 6.17, N = 6 and SE +/- 0.67, N = 3 (reported for two configurations only; the export does not identify which)
Per-configuration options, as exported: -O0 -O2 -O3 -O3 -march=znver1 -march=znver1 -O3 -mllvm -lpng16 -Ofast -march=znver1
1. (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick 1.3.19
Operation: HWB Color Space
Iterations Per Minute, More Is Better
  -O0: 72
  -O2: 163
  -O3: 166
  -O3 -march=znver1: 149
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 157
  -Ofast -march=znver1: 161
Per-configuration options, as exported: -O0 -O2 -O3 -O3 -march=znver1 -march=znver1 -O3 -mllvm -lpng16 -Ofast -march=znver1
1. (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick 1.3.19
Operation: Local Adaptive Thresholding
Iterations Per Minute, More Is Better
  -O0: 28
  -O2: 133
  -O3: 135
  -O3 -march=znver1: 135
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 141
  -Ofast -march=znver1: 135
Per-configuration options, as exported: -O0 -O2 -O3 -O3 -march=znver1 -march=znver1 -O3 -mllvm -lpng16 -Ofast -march=znver1
1. (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS, More Is Better
  -O0: 325.64 (SE +/- 0.62, N = 3)
  -O2: 1157.38 (SE +/- 1.11, N = 3)
  -O3: 1150.22 (SE +/- 0.19, N = 3)
  -O3 -march=znver1: 1133.27 (SE +/- 0.82, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1133.24 (SE +/- 0.76, N = 3)
  -Ofast -march=znver1: 1036.50 (SE +/- 0.70, N = 3)
1. (CC) gcc options: -O3 -mavx2

C-Ray 1.1
Total Time
Seconds, Fewer Is Better
  -O0: 32.30 (SE +/- 0.01, N = 3)
  -O2: 14.01 (SE +/- 0.00, N = 3)
  -O3: 14.00 (SE +/- 0.01, N = 3)
  -O3 -march=znver1: 13.49 (SE +/- 0.02, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 13.49 (SE +/- 0.00, N = 3)
  -Ofast -march=znver1: 13.41 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -lm -lpthread -O3

Stockfish 2014-11-26
Total Time
ms, Fewer Is Better
  -O0: 3711 (SE +/- 7.88, N = 3)
  -O2: 3712 (SE +/- 7.75, N = 3)
  -O3: 3703 (SE +/- 9.53, N = 3)
  -O3 -march=znver1: 3643 (SE +/- 5.00, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 3644 (SE +/- 8.08, N = 3)
  -Ofast -march=znver1: 3618 (SE +/- 4.98, N = 3)
1. (CXX) g++ options: -lpthread -fno-exceptions -fno-rtti -ansi -pedantic -O3 -msse -msse3 -mpopcnt

FLAC Audio Encoding 1.3.1
WAV To FLAC
Seconds, Fewer Is Better
  -O0: 62.12 (SE +/- 0.01, N = 5)
  -O2: 6.80 (SE +/- 0.01, N = 5)
  -O3: 6.78 (SE +/- 0.00, N = 5)
  -O3 -march=znver1: 5.64 (SE +/- 0.00, N = 5)
  -Ofast -march=znver1: 6.26 (SE +/- 0.00, N = 5)
1. (CXX) g++ options: -logg -lm

LAME MP3 Encoding 3.99.3
WAV To MP3
Seconds, Fewer Is Better
  -O0: 36.48 (SE +/- 0.01, N = 5)
  -O2: 9.40 (SE +/- 0.01, N = 5)
  -O3: 9.42 (SE +/- 0.00, N = 5)
  -O3 -march=znver1: 10.61 (SE +/- 0.00, N = 5)
  -Ofast -march=znver1: 10.61 (SE +/- 0.00, N = 5)
1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -lm

WavPack Audio Encoding 5.1
WAV To WavPack
Seconds, Fewer Is Better
  -O0: 7.76 (SE +/- 0.00, N = 5)
  -O2: 6.52 (SE +/- 0.00, N = 5)
  -O3: 6.51 (SE +/- 0.00, N = 5)
  -O3 -march=znver1: 6.43 (SE +/- 0.00, N = 5)
  -Ofast -march=znver1: 6.46 (SE +/- 0.01, N = 5)
1. (CC) gcc options: -lm

OpenSSL 1.0.1g
RSA 4096-bit Performance
Signs Per Second, More Is Better
  -O0: 986.70 (SE +/- 0.85, N = 3)
  -O2: 987.67 (SE +/- 0.49, N = 3)
  -O3: 986.93 (SE +/- 0.58, N = 3)
  -O3 -march=znver1: 987.43 (SE +/- 0.57, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 986.73 (SE +/- 0.85, N = 3)
  -Ofast -march=znver1: 986.43 (SE +/- 0.77, N = 3)
1. (CC) gcc options: -m64 -O3 -lssl -lcrypto -ldl

libjpeg-turbo tjbench 1.5.1
Test: Decompression Throughput
Megapixels/sec, More Is Better
  -O0: 71.84 (SE +/- 0.02, N = 3)
  -O2: 161.68 (SE +/- 0.15, N = 3)
  -O3: 162.59 (SE +/- 0.07, N = 3)
  -O3 -march=znver1: 168.45 (SE +/- 0.04, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 168.66 (SE +/- 0.01, N = 3)
  -Ofast -march=znver1: 165.24 (SE +/- 1.75, N = 3)
Per-configuration options, as exported: -O0 -lm -O2 -O3 -lm -O3 -march=znver1 -lm -march=znver1 -O3 -lm -Ofast -march=znver1 -lm
1. (CC) gcc options: (none listed)

PostgreSQL pgbench 9.6.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Write
TPS, More Is Better
  -O0: 1865.82 (SE +/- 42.77, N = 6)
  -O2: 1906.40 (SE +/- 32.61, N = 6)
  -O3: 1942.42 (SE +/- 27.77, N = 6)
  -O3 -march=znver1: 1930.21 (SE +/- 31.64, N = 6)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1886.70 (SE +/- 36.05, N = 6)
Per-configuration options, as exported: -O0 -shared -O2 -shared -O3 -lpgcommon -lpgport -lrt -lcrypt -ldl -lm -O3 -march=znver1 -shared -march=znver1 -O3 -mllvm -shared
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -fpic

PostgreSQL pgbench 9.6.3
Scaling: Buffer Test - Test: Single Thread - Mode: Read Write
TPS, More Is Better
  -O0: 201.67 (SE +/- 2.36, N = 3)
  -O2: 225.06 (SE +/- 0.26, N = 3)
  -O3: 226.94 (SE +/- 0.29, N = 3)
  -O3 -march=znver1: 225.61 (SE +/- 0.74, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 226.38 (SE +/- 0.93, N = 3)
Per-configuration options, as exported: -O0 -shared -O2 -shared -O3 -lpgcommon -lpgport -lrt -lcrypt -ldl -lm -O3 -march=znver1 -shared -march=znver1 -O3 -mllvm -shared
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -fpic

PostgreSQL pgbench 9.6.3
Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Write
TPS, More Is Better
  -O0: 1985.80 (SE +/- 31.25, N = 3)
  -O2: 1903.00 (SE +/- 24.85, N = 3)
  -O3: 2037.20 (SE +/- 8.58, N = 3)
  -O3 -march=znver1: 1952.70 (SE +/- 32.80, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1932.86 (SE +/- 27.83, N = 5)
Per-configuration options, as exported: -O0 -shared -O2 -shared -O3 -lpgcommon -lpgport -lrt -lcrypt -ldl -lm -O3 -march=znver1 -shared -march=znver1 -O3 -mllvm -shared
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -fpic

Redis 3.0.1
Test: GET
Requests Per Second, More Is Better
  -O0: 1204370.92 (SE +/- 4609.53, N = 3)
  -O2: 1983766.97 (SE +/- 32907.71, N = 4)
  -O3: 1971325.87 (SE +/- 15178.56, N = 3)
  -O3 -march=znver1: 1945705.79 (SE +/- 13206.56, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2008848.23 (SE +/- 34111.46, N = 6)
  -Ofast -march=znver1: 1953486.67 (SE +/- 18769.12, N = 3)
1. (CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl

Redis 3.0.1
Test: SET
Requests Per Second, More Is Better
  -O0: 890829.00 (SE +/- 12621.48, N = 3)
  -O2: 1377516.08 (SE +/- 8504.02, N = 3)
  -O3: 1399320.00 (SE +/- 6804.89, N = 3)
  -O3 -march=znver1: 1379585.37 (SE +/- 13825.40, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1406611.75 (SE +/- 10024.40, N = 3)
  -Ofast -march=znver1: 1386323.96 (SE +/- 1282.46, N = 3)
1. (CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl

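The results mix "More Is Better" metrics (Mflops, TPS, requests per second) with "Fewer Is Better" ones (seconds, ms), so raw values cannot be compared across tests directly. One common way to put them on the same footing is a speedup ratio against a baseline configuration. The sketch below is illustrative only: the helper name `speedup` is hypothetical and not part of the Phoronix Test Suite, and the input values are copied from the FFTW and C-Ray results in this file, with -O0 as the baseline and -O2 as the candidate.

```python
def speedup(baseline, candidate, higher_is_better):
    """Return how many times faster `candidate` is than `baseline`.

    For "More Is Better" metrics the ratio is candidate / baseline;
    for "Fewer Is Better" metrics it is inverted, so a value above 1.0
    always means the candidate configuration performed better.
    """
    if baseline <= 0 or candidate <= 0:
        raise ValueError("results must be positive")
    return candidate / baseline if higher_is_better else baseline / candidate


# Values copied from this result file (-O0 baseline, -O2 candidate).
fftw = speedup(2434.70, 20190.00, higher_is_better=True)   # Mflops
c_ray = speedup(32.30, 14.01, higher_is_better=False)      # seconds

print(f"FFTW  -O2 vs -O0: {fftw:.2f}x")
print(f"C-Ray -O2 vs -O0: {c_ray:.2f}x")
```

A ratio close to 1.0 should be read alongside the SE figures, since a small difference may fall within run-to-run variation.
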
Phoronix Test Suite v10.8.5