AOCC 1.0 Compiler Tuning
AMD Ryzen 7 1700 Eight-Core testing with an MSI B350 TOMAHAWK (MS-7A34) v1.0 and HIS AMD Radeon HD 7750/8740 / R7 250E 1024MB on Ubuntu 17.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/1705218-TR-AOCC10COM39&sor&grr
System details (identical for all tested configurations):
    Processor: AMD Ryzen 7 1700 Eight-Core @ 3.00GHz (16 Cores)
    Motherboard: MSI B350 TOMAHAWK (MS-7A34) v1.0
    Chipset: AMD Device 1450
    Memory: 16384MB
    Disk: 120GB Samsung SSD 840
    Graphics: HIS AMD Radeon HD 7750/8740 / R7 250E 1024MB
    Audio: AMD Cape Verde/Pitcairn
    Monitor: DELL S2409W
    Network: Realtek RTL8111/8168/8411
    OS: Ubuntu 17.04
    Kernel: 4.12.0-999-generic (x86_64) 20170518
    Desktop: Unity 7.5.0
    Display Driver: modesetting 1.19.3
    Compiler: Clang 4.0.0
    File-System: ext4
    Screen Resolution: 1920x1080
Tested configurations:
    -O0
    -O2
    -O3
    -O3 -march=znver1
    -O3 -march=znver1 -mllvm -enable-strided-vectorization
    -Ofast -march=znver1
Compiler Details - Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver1
Processor Details - Scaling Governor: acpi-cpufreq ondemand
Results overview
Columns:
    [A] -O0
    [B] -O2
    [C] -O3
    [D] -O3 -march=znver1
    [E] -O3 -march=znver1 -mllvm -enable-strided-vectorization
    [F] -Ofast -march=znver1
A "-" marks a test with no result for that configuration.

         [A]         [B]         [C]         [D]         [E]         [F]  Test
   890829.00  1377516.08  1399320.00  1379585.37  1406611.75  1386323.96  redis: SET
  1204370.92  1983766.97  1971325.87  1945705.79  2008848.23  1953486.67  redis: GET
     1985.80     1903.00     2037.20     1952.70     1932.86           -  pgbench: Buffer Test - Heavy Contention - Read Write
      201.67      225.06      226.94      225.61      226.38           -  pgbench: Buffer Test - Single Thread - Read Write
     1865.82     1906.40     1942.42     1930.21     1886.70           -  pgbench: Buffer Test - Normal Load - Read Write
       71.84      161.68      162.59      168.45      168.66      165.24  tjbench: Decompression Throughput
      986.70      987.67      986.93      987.43      986.73      986.43  openssl: RSA 4096-bit Performance
        7.76        6.52        6.51        6.43           -        6.46  encode-wavpack: WAV To WavPack
       36.48        9.40        9.42       10.61           -       10.61  encode-mp3: WAV To MP3
       62.12        6.80        6.78        5.64           -        6.26  encode-flac: WAV To FLAC
        3711        3712        3703        3643        3644        3618  stockfish: Total Time
       32.30       14.01       14.00       13.49       13.49       13.41  c-ray: Total Time
      325.64     1157.38     1150.22     1133.27     1133.24     1036.50  himeno: Poisson Pressure Solver
          28         133         135         135         141         135  graphics-magick: Local Adaptive Thresholding
          72         163         166         149         157         161  graphics-magick: HWB Color Space
          49         132         128         137         143         138  graphics-magick: Resizing
          21          57          58          59          60          64  graphics-magick: Sharpen
          46         107         106         102         106         100  graphics-magick: Blur
      475710     1016572     1053919     1021094     1021094     1029891  tscp: AI Chess Performance
     1673.77     1675.22     1676.24     1682.07     1681.87     1680.49  scimark2: Jacobi Successive Over-Relaxation
     5637.05     5554.97     5859.93     5670.42     5605.43     5644.15  scimark2: Dense LU Matrix Factorization
     2648.16     2646.12     2667.90     2636.32     2616.74     2619.99  scimark2: Sparse Matrix Multiply
      134.67      134.06      134.23      131.50      134.46      135.39  scimark2: Fast Fourier Transform
      643.45      642.37      643.43      660.30      659.88      659.67  scimark2: Monte Carlo
     2147.42     2130.55     2196.34     2156.12     2139.67     2147.94  scimark2: Composite
        3.87        3.63        3.77        3.66        3.78        3.82  mafft: Multiple Sequence Alignment
     2434.70       20190       20297       20171       20433       20676  fftw: Float + SSE - 2D FFT Size 1024
Redis 3.0.1 - Test: SET (Requests Per Second, More Is Better)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1406611.75 (SE +/- 10024.40, N = 3)
    -O3: 1399320.00 (SE +/- 6804.89, N = 3)
    -Ofast -march=znver1: 1386323.96 (SE +/- 1282.46, N = 3)
    -O3 -march=znver1: 1379585.37 (SE +/- 13825.40, N = 3)
    -O2: 1377516.08 (SE +/- 8504.02, N = 3)
    -O0: 890829.00 (SE +/- 12621.48, N = 3)
    1. (CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl
Redis 3.0.1 - Test: GET (Requests Per Second, More Is Better)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2008848.23 (SE +/- 34111.46, N = 6)
    -O2: 1983766.97 (SE +/- 32907.71, N = 4)
    -O3: 1971325.87 (SE +/- 15178.56, N = 3)
    -Ofast -march=znver1: 1953486.67 (SE +/- 18769.12, N = 3)
    -O3 -march=znver1: 1945705.79 (SE +/- 13206.56, N = 3)
    -O0: 1204370.92 (SE +/- 4609.53, N = 3)
    1. (CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl
PostgreSQL pgbench 9.6.3 - Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Write (TPS, More Is Better)
    -O3: 2037.20 (SE +/- 8.58, N = 3)
    -O0: 1985.80 (SE +/- 31.25, N = 3)
    -O3 -march=znver1: 1952.70 (SE +/- 32.80, N = 4)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1932.86 (SE +/- 27.83, N = 5)
    -O2: 1903.00 (SE +/- 24.85, N = 3)
    Per-result flags as exported: -O3 -lpgcommon -lpgport -lrt -lcrypt -ldl -lm -O0 -shared -O3 -march=znver1 -shared -march=znver1 -O3 -mllvm -shared -O2 -shared
    1. (CC) gcc options: -fno-strict-aliasing -fwrapv -fpic
PostgreSQL pgbench 9.6.3 - Scaling: Buffer Test - Test: Single Thread - Mode: Read Write (TPS, More Is Better)
    -O3: 226.94 (SE +/- 0.29, N = 3)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 226.38 (SE +/- 0.93, N = 3)
    -O3 -march=znver1: 225.61 (SE +/- 0.74, N = 3)
    -O2: 225.06 (SE +/- 0.26, N = 3)
    -O0: 201.67 (SE +/- 2.36, N = 3)
    Per-result flags as exported: -O3 -lpgcommon -lpgport -lrt -lcrypt -ldl -lm -march=znver1 -O3 -mllvm -shared -O3 -march=znver1 -shared -O2 -shared -O0 -shared
    1. (CC) gcc options: -fno-strict-aliasing -fwrapv -fpic
PostgreSQL pgbench 9.6.3 - Scaling: Buffer Test - Test: Normal Load - Mode: Read Write (TPS, More Is Better)
    -O3: 1942.42 (SE +/- 27.77, N = 6)
    -O3 -march=znver1: 1930.21 (SE +/- 31.64, N = 6)
    -O2: 1906.40 (SE +/- 32.61, N = 6)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1886.70 (SE +/- 36.05, N = 6)
    -O0: 1865.82 (SE +/- 42.77, N = 6)
    Per-result flags as exported: -O3 -lpgcommon -lpgport -lrt -lcrypt -ldl -lm -O3 -march=znver1 -shared -O2 -shared -march=znver1 -O3 -mllvm -shared -O0 -shared
    1. (CC) gcc options: -fno-strict-aliasing -fwrapv -fpic
libjpeg-turbo tjbench 1.5.1 - Test: Decompression Throughput (Megapixels/sec, More Is Better)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 168.66 (SE +/- 0.01, N = 3)
    -O3 -march=znver1: 168.45 (SE +/- 0.04, N = 3)
    -Ofast -march=znver1: 165.24 (SE +/- 1.75, N = 3)
    -O3: 162.59 (SE +/- 0.07, N = 3)
    -O2: 161.68 (SE +/- 0.15, N = 3)
    -O0: 71.84 (SE +/- 0.02, N = 3)
    Per-result flags as exported: -march=znver1 -O3 -lm -O3 -march=znver1 -lm -Ofast -march=znver1 -lm -O3 -lm -O2 -O0 -lm
    1. (CC) gcc options:
OpenSSL 1.0.1g - RSA 4096-bit Performance (Signs Per Second, More Is Better)
    -O2: 987.67 (SE +/- 0.49, N = 3)
    -O3 -march=znver1: 987.43 (SE +/- 0.57, N = 3)
    -O3: 986.93 (SE +/- 0.58, N = 3)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 986.73 (SE +/- 0.85, N = 3)
    -O0: 986.70 (SE +/- 0.85, N = 3)
    -Ofast -march=znver1: 986.43 (SE +/- 0.77, N = 3)
    1. (CC) gcc options: -m64 -O3 -lssl -lcrypto -ldl
WavPack Audio Encoding 5.1 - WAV To WavPack (Seconds, Fewer Is Better)
    -O3 -march=znver1: 6.43 (SE +/- 0.00, N = 5)
    -Ofast -march=znver1: 6.46 (SE +/- 0.01, N = 5)
    -O3: 6.51 (SE +/- 0.00, N = 5)
    -O2: 6.52 (SE +/- 0.00, N = 5)
    -O0: 7.76 (SE +/- 0.00, N = 5)
    Per-result flags as exported: -O3 -march=znver1 -Ofast -march=znver1 -O3 -O2 -O0
    1. (CC) gcc options: -lm
LAME MP3 Encoding 3.99.3 - WAV To MP3 (Seconds, Fewer Is Better)
    -O2: 9.40 (SE +/- 0.01, N = 5)
    -O3: 9.42 (SE +/- 0.00, N = 5)
    -O3 -march=znver1: 10.61 (SE +/- 0.00, N = 5)
    -Ofast -march=znver1: 10.61 (SE +/- 0.00, N = 5)
    -O0: 36.48 (SE +/- 0.01, N = 5)
    Per-result flags as exported: -O2 -march=znver1 -Ofast -march=znver1 -O0
    1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -lm
FLAC Audio Encoding 1.3.1 - WAV To FLAC (Seconds, Fewer Is Better)
    -O3 -march=znver1: 5.64 (SE +/- 0.00, N = 5)
    -Ofast -march=znver1: 6.26 (SE +/- 0.00, N = 5)
    -O3: 6.78 (SE +/- 0.00, N = 5)
    -O2: 6.80 (SE +/- 0.01, N = 5)
    -O0: 62.12 (SE +/- 0.01, N = 5)
    Per-result flags as exported: -Ofast -march=znver1 -O3 -O2 -O0
    1. (CXX) g++ options: -logg -lm
Stockfish 2014-11-26 - Total Time (ms, Fewer Is Better)
    -Ofast -march=znver1: 3618 (SE +/- 4.98, N = 3)
    -O3 -march=znver1: 3643 (SE +/- 5.00, N = 3)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 3644 (SE +/- 8.08, N = 3)
    -O3: 3703 (SE +/- 9.53, N = 3)
    -O0: 3711 (SE +/- 7.88, N = 3)
    -O2: 3712 (SE +/- 7.75, N = 3)
    Per-result flags as exported: -Ofast -march=znver1 -march=znver1 -march=znver1 -mllvm -O0 -O2
    1. (CXX) g++ options: -lpthread -fno-exceptions -fno-rtti -ansi -pedantic -O3 -msse -msse3 -mpopcnt
C-Ray 1.1 - Total Time (Seconds, Fewer Is Better)
    -Ofast -march=znver1: 13.41 (SE +/- 0.00, N = 3)
    -O3 -march=znver1: 13.49 (SE +/- 0.02, N = 3)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 13.49 (SE +/- 0.00, N = 3)
    -O3: 14.00 (SE +/- 0.01, N = 3)
    -O2: 14.01 (SE +/- 0.00, N = 3)
    -O0: 32.30 (SE +/- 0.01, N = 3)
    Per-result flags as exported: -Ofast -march=znver1 -march=znver1 -march=znver1 -mllvm -O2 -O0
    1. (CC) gcc options: -lm -lpthread -O3
Himeno Benchmark 3.0 - Poisson Pressure Solver (MFLOPS, More Is Better)
    -O2: 1157.38 (SE +/- 1.11, N = 3)
    -O3: 1150.22 (SE +/- 0.19, N = 3)
    -O3 -march=znver1: 1133.27 (SE +/- 0.82, N = 3)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1133.24 (SE +/- 0.76, N = 3)
    -Ofast -march=znver1: 1036.50 (SE +/- 0.70, N = 3)
    -O0: 325.64 (SE +/- 0.62, N = 3)
    Per-result flags as exported: -O2 -march=znver1 -march=znver1 -mllvm -Ofast -march=znver1 -O0
    1. (CC) gcc options: -O3 -mavx2
GraphicsMagick 1.3.19 - Operation: Local Adaptive Thresholding (Iterations Per Minute, More Is Better)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 141
    -Ofast -march=znver1: 135
    -O3 -march=znver1: 135
    -O3: 135
    -O2: 133
    -O0: 28
    Per-result flags as exported: -march=znver1 -O3 -mllvm -lpng16 -Ofast -march=znver1 -O3 -march=znver1 -O3 -O2 -O0
    1. (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread
GraphicsMagick 1.3.19 - Operation: HWB Color Space (Iterations Per Minute, More Is Better)
    -O3: 166
    -O2: 163
    -Ofast -march=znver1: 161
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 157
    -O3 -march=znver1: 149
    -O0: 72
    Per-result flags as exported: -O3 -O2 -Ofast -march=znver1 -march=znver1 -O3 -mllvm -lpng16 -O3 -march=znver1 -O0
    1. (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread
GraphicsMagick 1.3.19 - Operation: Resizing (Iterations Per Minute, More Is Better)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 143
    -Ofast -march=znver1: 138
    -O3 -march=znver1: 137
    -O2: 132
    -O3: 128
    -O0: 49
    Standard errors as exported (configurations not identified): SE +/- 0.67, N = 3; SE +/- 6.17, N = 6
    Per-result flags as exported: -march=znver1 -O3 -mllvm -lpng16 -Ofast -march=znver1 -O3 -march=znver1 -O2 -O3 -O0
    1. (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread
GraphicsMagick 1.3.19 - Operation: Sharpen (Iterations Per Minute, More Is Better)
    -Ofast -march=znver1: 64
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 60
    -O3 -march=znver1: 59
    -O3: 58
    -O2: 57
    -O0: 21
    Standard error as exported (configuration not identified): SE +/- 0.33, N = 3
    Per-result flags as exported: -Ofast -march=znver1 -march=znver1 -O3 -mllvm -lpng16 -O3 -march=znver1 -O3 -O2 -O0
    1. (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread
GraphicsMagick 1.3.19 - Operation: Blur (Iterations Per Minute, More Is Better)
    -O2: 107
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 106
    -O3: 106
    -O3 -march=znver1: 102
    -Ofast -march=znver1: 100
    -O0: 46
    Per-result flags as exported: -O2 -march=znver1 -O3 -mllvm -lpng16 -O3 -O3 -march=znver1 -Ofast -march=znver1 -O0
    1. (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread
TSCP 1.81 - AI Chess Performance (Nodes Per Second, More Is Better)
    -O3: 1053919 (SE +/- 494.31, N = 5)
    -Ofast -march=znver1: 1029891 (SE +/- 1971.42, N = 5)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1021094 (SE +/- 463.44, N = 5)
    -O3 -march=znver1: 1021094 (SE +/- 463.44, N = 5)
    -O2: 1016572 (SE +/- 701.56, N = 5)
    -O0: 475710 (SE +/- 82.20, N = 5)
    Per-result flags as exported: -O3 -Ofast -march=znver1 -march=znver1 -O3 -mllvm -O3 -march=znver1 -O2 -O0
    1. (CC) gcc options:
SciMark 2.0 - Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
    -O3 -march=znver1: 1682.07 (SE +/- 0.29, N = 4)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1681.87 (SE +/- 0.52, N = 4)
    -Ofast -march=znver1: 1680.49 (SE +/- 1.12, N = 4)
    -O3: 1676.24 (SE +/- 0.43, N = 4)
    -O2: 1675.22 (SE +/- 0.50, N = 4)
    -O0: 1673.77 (SE +/- 2.17, N = 4)
    Per-result flags as exported: -O3 -march=znver1 -march=znver1 -O3 -mllvm -Ofast -march=znver1 -O3 -O2 -O0
    1. (CC) gcc options: -lm
SciMark 2.0 - Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
    -O3: 5859.93 (SE +/- 103.73, N = 4)
    -O3 -march=znver1: 5670.42 (SE +/- 16.49, N = 4)
    -Ofast -march=znver1: 5644.15 (SE +/- 25.89, N = 4)
    -O0: 5637.05 (SE +/- 21.28, N = 4)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 5605.43 (SE +/- 27.73, N = 4)
    -O2: 5554.97 (SE +/- 40.80, N = 4)
    Per-result flags as exported: -O3 -O3 -march=znver1 -Ofast -march=znver1 -O0 -march=znver1 -O3 -mllvm -O2
    1. (CC) gcc options: -lm
SciMark 2.0 - Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)
    -O3: 2667.90 (SE +/- 39.95, N = 4)
    -O0: 2648.16 (SE +/- 7.78, N = 4)
    -O2: 2646.12 (SE +/- 4.64, N = 4)
    -O3 -march=znver1: 2636.32 (SE +/- 10.92, N = 4)
    -Ofast -march=znver1: 2619.99 (SE +/- 5.95, N = 4)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2616.74 (SE +/- 6.22, N = 4)
    Per-result flags as exported: -O3 -O0 -O2 -O3 -march=znver1 -Ofast -march=znver1 -march=znver1 -O3 -mllvm
    1. (CC) gcc options: -lm
SciMark 2.0 - Computational Test: Fast Fourier Transform (Mflops, More Is Better)
    -Ofast -march=znver1: 135.39 (SE +/- 0.42, N = 4)
    -O0: 134.67 (SE +/- 0.39, N = 4)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 134.46 (SE +/- 0.16, N = 4)
    -O3: 134.23 (SE +/- 0.25, N = 4)
    -O2: 134.06 (SE +/- 0.36, N = 4)
    -O3 -march=znver1: 131.50 (SE +/- 3.77, N = 4)
    Per-result flags as exported: -Ofast -march=znver1 -O0 -march=znver1 -O3 -mllvm -O3 -O2 -O3 -march=znver1
    1. (CC) gcc options: -lm
SciMark 2.0 - Computational Test: Monte Carlo (Mflops, More Is Better)
    -O3 -march=znver1: 660.30 (SE +/- 0.05, N = 4)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 659.88 (SE +/- 0.16, N = 4)
    -Ofast -march=znver1: 659.67 (SE +/- 0.11, N = 4)
    -O0: 643.45 (SE +/- 0.13, N = 4)
    -O3: 643.43 (SE +/- 0.30, N = 4)
    -O2: 642.37 (SE +/- 1.18, N = 4)
    Per-result flags as exported: -O3 -march=znver1 -march=znver1 -O3 -mllvm -Ofast -march=znver1 -O0 -O3 -O2
    1. (CC) gcc options: -lm
SciMark 2.0 - Computational Test: Composite (Mflops, More Is Better)
    -O3: 2196.34 (SE +/- 28.83, N = 4)
    -O3 -march=znver1: 2156.12 (SE +/- 4.05, N = 4)
    -Ofast -march=znver1: 2147.94 (SE +/- 5.09, N = 4)
    -O0: 2147.42 (SE +/- 2.89, N = 4)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2139.67 (SE +/- 5.52, N = 4)
    -O2: 2130.55 (SE +/- 8.24, N = 4)
    Per-result flags as exported: -O3 -O3 -march=znver1 -Ofast -march=znver1 -O0 -march=znver1 -O3 -mllvm -O2
    1. (CC) gcc options: -lm
Timed MAFFT Alignment 6.864 - Multiple Sequence Alignment (Seconds, Fewer Is Better)
    -O2: 3.63 (SE +/- 0.11, N = 6)
    -O3 -march=znver1: 3.66 (SE +/- 0.08, N = 6)
    -O3: 3.77 (SE +/- 0.02, N = 3)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 3.78 (SE +/- 0.10, N = 6)
    -Ofast -march=znver1: 3.82 (SE +/- 0.09, N = 6)
    -O0: 3.87 (SE +/- 0.06, N = 4)
    1. (CC) gcc options: -O3 -lm -lpthread
FFTW 3.3.4 - Build: Float + SSE - Size: 2D FFT Size 1024 (Mflops, More Is Better)
    -Ofast -march=znver1: 20676.00 (SE +/- 51.64, N = 5)
    -O3 -march=znver1 -mllvm -enable-strided-vectorization: 20433.00 (SE +/- 81.73, N = 5)
    -O3: 20297.00 (SE +/- 89.96, N = 5)
    -O2: 20190.00 (SE +/- 110.89, N = 5)
    -O3 -march=znver1: 20171.00 (SE +/- 65.74, N = 5)
    -O0: 2434.70 (SE +/- 2.81, N = 5)
    Per-result flags as exported: -Ofast -march=znver1 -march=znver1 -O3 -mllvm -O3 -O2 -O3 -march=znver1
    1. (CC) gcc options: -lm
Phoronix Test Suite v10.8.5