AOCC 1.0 Compiler Tuning
AMD Ryzen 7 1700 Eight-Core testing with an MSI B350 TOMAHAWK (MS-7A34) v1.0 and HIS AMD Radeon HD 7750/8740 / R7 250E 1024MB on Ubuntu 17.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/1705218-TR-AOCC10COM39&sro&grw .

System details (shared by all tested configurations)
Tested configurations:
  -O0
  -O2
  -O3
  -O3 -march=znver1
  -O3 -march=znver1 -mllvm -enable-strided-vectorization
  -Ofast -march=znver1
Processor: AMD Ryzen 7 1700 Eight-Core @ 3.00GHz (16 Cores)
Motherboard: MSI B350 TOMAHAWK (MS-7A34) v1.0
Chipset: AMD Device 1450
Memory: 16384MB
Disk: 120GB Samsung SSD 840
Graphics: HIS AMD Radeon HD 7750/8740 / R7 250E 1024MB
Audio: AMD Cape Verde/Pitcairn
Monitor: DELL S2409W
Network: Realtek RTL8111/8168/8411
OS: Ubuntu 17.04
Kernel: 4.12.0-999-generic (x86_64) 20170518
Desktop: Unity 7.5.0
Display Driver: modesetting 1.19.3
Compiler: Clang 4.0.0
File-System: ext4
Screen Resolution: 1920x1080
Compiler Details - Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver1
Processor Details - Scaling Governor: acpi-cpufreq ondemand

Results overview

| Test | -O0 | -O2 | -O3 | -O3 -march=znver1 | -O3 -march=znver1 -mllvm -enable-strided-vectorization | -Ofast -march=znver1 |
|---|---|---|---|---|---|---|
| tscp: AI Chess Performance | 475710 | 1016572 | 1053919 | 1021094 | 1021094 | 1029891 |
| scimark2: Composite | 2147.42 | 2130.55 | 2196.34 | 2156.12 | 2139.67 | 2147.94 |
| scimark2: Monte Carlo | 643.45 | 642.37 | 643.43 | 660.30 | 659.88 | 659.67 |
| scimark2: Fast Fourier Transform | 134.67 | 134.06 | 134.23 | 131.50 | 134.46 | 135.39 |
| scimark2: Sparse Matrix Multiply | 2648.16 | 2646.12 | 2667.90 | 2636.32 | 2616.74 | 2619.99 |
| scimark2: Dense LU Matrix Factorization | 5637.05 | 5554.97 | 5859.93 | 5670.42 | 5605.43 | 5644.15 |
| scimark2: Jacobi Successive Over-Relaxation | 1673.77 | 1675.22 | 1676.24 | 1682.07 | 1681.87 | 1680.49 |
| encode-flac: WAV To FLAC | 62.12 | 6.80 | 6.78 | 5.64 | | 6.26 |
| encode-mp3: WAV To MP3 | 36.48 | 9.40 | 9.42 | 10.61 | | 10.61 |
| tjbench: Decompression Throughput | 71.84 | 161.68 | 162.59 | 168.45 | 168.66 | 165.24 |
| encode-wavpack: WAV To WavPack | 7.76 | 6.52 | 6.51 | 6.43 | | 6.46 |
| fftw: Float + SSE - 2D FFT Size 1024 | 2434.70 | 20190 | 20297 | 20171 | 20433 | 20676 |
| mafft: Multiple Sequence Alignment | 3.87 | 3.63 | 3.77 | 3.66 | 3.78 | 3.82 |
| himeno: Poisson Pressure Solver | 325.64 | 1157.38 | 1150.22 | 1133.27 | 1133.24 | 1036.50 |
| stockfish: Total Time | 3711 | 3712 | 3703 | 3643 | 3644 | 3618 |
| graphics-magick: Blur | 46 | 107 | 106 | 102 | 106 | 100 |
| graphics-magick: Sharpen | 21 | 57 | 58 | 59 | 60 | 64 |
| graphics-magick: Resizing | 49 | 132 | 128 | 137 | 143 | 138 |
| graphics-magick: HWB Color Space | 72 | 163 | 166 | 149 | 157 | 161 |
| graphics-magick: Local Adaptive Thresholding | 28 | 133 | 135 | 135 | 141 | 135 |
| c-ray: Total Time | 32.30 | 14.01 | 14.00 | 13.49 | 13.49 | 13.41 |
| openssl: RSA 4096-bit Performance | 986.70 | 987.67 | 986.93 | 987.43 | 986.73 | 986.43 |
| redis: GET | 1204370.92 | 1983766.97 | 1971325.87 | 1945705.79 | 2008848.23 | 1953486.67 |
| redis: SET | 890829.00 | 1377516.08 | 1399320.00 | 1379585.37 | 1406611.75 | 1386323.96 |
| pgbench: Buffer Test - Normal Load - Read Write | 1865.82 | 1906.40 | 1942.42 | 1930.21 | 1886.70 | |
| pgbench: Buffer Test - Single Thread - Read Write | 201.67 | 225.06 | 226.94 | 225.61 | 226.38 | |
| pgbench: Buffer Test - Heavy Contention - Read Write | 1985.80 | 1903.00 | 2037.20 | 1952.70 | 1932.86 | |

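
The overview mixes "more is better" and "fewer is better" metrics, so raw figures are not directly comparable across tests. The following is a minimal sketch of how a relative speedup against the -O0 build could be computed for either kind of metric. The helper name `speedup` and the choice of -O0 as the baseline are illustrative assumptions, not part of the exported result; the input figures are copied from the overview.

```python
def speedup(baseline: float, value: float, higher_is_better: bool) -> float:
    """Return how many times better `value` is than `baseline`.

    Hypothetical helper for illustration only. For throughput-style metrics
    (more is better) the ratio is value / baseline; for time-style metrics
    (fewer is better) it is baseline / value.
    """
    if baseline <= 0 or value <= 0:
        raise ValueError("results must be positive")
    return value / baseline if higher_is_better else baseline / value


# Figures copied from the overview, using the -O0 column as the baseline.
results = {
    # tscp, -O3, nodes per second (more is better)
    "tscp (-O3)": speedup(475710, 1053919, higher_is_better=True),
    # c-ray, -Ofast -march=znver1, seconds (fewer is better)
    "c-ray (-Ofast -march=znver1)": speedup(32.30, 13.41, higher_is_better=False),
    # encode-flac, -O3 -march=znver1, seconds (fewer is better)
    "encode-flac (-O3 -march=znver1)": speedup(62.12, 5.64, higher_is_better=False),
}

for name, factor in results.items():
    print(f"{name}: {factor:.2f}x relative to -O0")
```

A ratio above 1.0 means the optimized build outperformed the -O0 build on that test, regardless of the metric's direction.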
TSCP 1.81 - AI Chess Performance
Nodes Per Second, More Is Better
  -O0: 475710 (SE +/- 82.20, N = 5)
  -O2: 1016572 (SE +/- 701.56, N = 5)
  -O3: 1053919 (SE +/- 494.31, N = 5)
  -O3 -march=znver1: 1021094 (SE +/- 463.44, N = 5)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1021094 (SE +/- 463.44, N = 5)
  -Ofast -march=znver1: 1029891 (SE +/- 1971.42, N = 5)

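
Each result carries an "SE +/- x, N = n" annotation, the standard error over n runs. As a rough sketch of what such a figure represents, the snippet below uses the conventional definition (sample standard deviation divided by the square root of the run count). Whether the Phoronix Test Suite uses exactly this estimator is an assumption, and the per-run values are hypothetical, since the export contains only the aggregated numbers.

```python
import math
import statistics


def standard_error(runs: list[float]) -> float:
    """Standard error of the mean: sample stddev / sqrt(N).

    Assumes the conventional definition; the tool that produced the
    export may compute it differently.
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs")
    return statistics.stdev(runs) / math.sqrt(len(runs))


# Hypothetical per-run scores (not taken from the export).
hypothetical_runs = [475600.0, 475700.0, 475800.0, 475650.0, 475800.0]

mean = statistics.mean(hypothetical_runs)
se = standard_error(hypothetical_runs)
print(f"{mean:.0f} (SE +/- {se:.2f}, N = {len(hypothetical_runs)})")
```

A small standard error relative to the mean suggests the runs were consistent; differences between configurations that are smaller than the reported SE are best treated as noise.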
SciMark 2.0 - Computational Test: Composite
Mflops, More Is Better
  -O0: 2147.42 (SE +/- 2.89, N = 4)
  -O2: 2130.55 (SE +/- 8.24, N = 4)
  -O3: 2196.34 (SE +/- 28.83, N = 4)
  -O3 -march=znver1: 2156.12 (SE +/- 4.05, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2139.67 (SE +/- 5.52, N = 4)
  -Ofast -march=znver1: 2147.94 (SE +/- 5.09, N = 4)
  (CC) gcc options: -lm

SciMark 2.0 - Computational Test: Monte Carlo
Mflops, More Is Better
  -O0: 643.45 (SE +/- 0.13, N = 4)
  -O2: 642.37 (SE +/- 1.18, N = 4)
  -O3: 643.43 (SE +/- 0.30, N = 4)
  -O3 -march=znver1: 660.30 (SE +/- 0.05, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 659.88 (SE +/- 0.16, N = 4)
  -Ofast -march=znver1: 659.67 (SE +/- 0.11, N = 4)
  (CC) gcc options: -lm

SciMark 2.0 - Computational Test: Fast Fourier Transform
Mflops, More Is Better
  -O0: 134.67 (SE +/- 0.39, N = 4)
  -O2: 134.06 (SE +/- 0.36, N = 4)
  -O3: 134.23 (SE +/- 0.25, N = 4)
  -O3 -march=znver1: 131.50 (SE +/- 3.77, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 134.46 (SE +/- 0.16, N = 4)
  -Ofast -march=znver1: 135.39 (SE +/- 0.42, N = 4)
  (CC) gcc options: -lm

SciMark 2.0 - Computational Test: Sparse Matrix Multiply
Mflops, More Is Better
  -O0: 2648.16 (SE +/- 7.78, N = 4)
  -O2: 2646.12 (SE +/- 4.64, N = 4)
  -O3: 2667.90 (SE +/- 39.95, N = 4)
  -O3 -march=znver1: 2636.32 (SE +/- 10.92, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2616.74 (SE +/- 6.22, N = 4)
  -Ofast -march=znver1: 2619.99 (SE +/- 5.95, N = 4)
  (CC) gcc options: -lm

SciMark 2.0 - Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
  -O0: 5637.05 (SE +/- 21.28, N = 4)
  -O2: 5554.97 (SE +/- 40.80, N = 4)
  -O3: 5859.93 (SE +/- 103.73, N = 4)
  -O3 -march=znver1: 5670.42 (SE +/- 16.49, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 5605.43 (SE +/- 27.73, N = 4)
  -Ofast -march=znver1: 5644.15 (SE +/- 25.89, N = 4)
  (CC) gcc options: -lm

SciMark 2.0 - Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better
  -O0: 1673.77 (SE +/- 2.17, N = 4)
  -O2: 1675.22 (SE +/- 0.50, N = 4)
  -O3: 1676.24 (SE +/- 0.43, N = 4)
  -O3 -march=znver1: 1682.07 (SE +/- 0.29, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1681.87 (SE +/- 0.52, N = 4)
  -Ofast -march=znver1: 1680.49 (SE +/- 1.12, N = 4)
  (CC) gcc options: -lm

FLAC Audio Encoding 1.3.1 - WAV To FLAC
Seconds, Fewer Is Better
  -O0: 62.12 (SE +/- 0.01, N = 5)
  -O2: 6.80 (SE +/- 0.01, N = 5)
  -O3: 6.78 (SE +/- 0.00, N = 5)
  -O3 -march=znver1: 5.64 (SE +/- 0.00, N = 5)
  -Ofast -march=znver1: 6.26 (SE +/- 0.00, N = 5)
  (CXX) g++ options: -logg -lm

LAME MP3 Encoding 3.99.3 - WAV To MP3
Seconds, Fewer Is Better
  -O0: 36.48 (SE +/- 0.01, N = 5)
  -O2: 9.40 (SE +/- 0.01, N = 5)
  -O3: 9.42 (SE +/- 0.00, N = 5)
  -O3 -march=znver1: 10.61 (SE +/- 0.00, N = 5)
  -Ofast -march=znver1: 10.61 (SE +/- 0.00, N = 5)
  (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -lm

libjpeg-turbo tjbench 1.5.1 - Test: Decompression Throughput
Megapixels/sec, More Is Better
  -O0: 71.84 (SE +/- 0.02, N = 3)
  -O2: 161.68 (SE +/- 0.15, N = 3)
  -O3: 162.59 (SE +/- 0.07, N = 3)
  -O3 -march=znver1: 168.45 (SE +/- 0.04, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 168.66 (SE +/- 0.01, N = 3)
  -Ofast -march=znver1: 165.24 (SE +/- 1.75, N = 3)

WavPack Audio Encoding 5.1 - WAV To WavPack
Seconds, Fewer Is Better
  -O0: 7.76 (SE +/- 0.00, N = 5)
  -O2: 6.52 (SE +/- 0.00, N = 5)
  -O3: 6.51 (SE +/- 0.00, N = 5)
  -O3 -march=znver1: 6.43 (SE +/- 0.00, N = 5)
  -Ofast -march=znver1: 6.46 (SE +/- 0.01, N = 5)
  (CC) gcc options: -lm

FFTW 3.3.4 - Build: Float + SSE - Size: 2D FFT Size 1024
Mflops, More Is Better
  -O0: 2434.70 (SE +/- 2.81, N = 5)
  -O2: 20190.00 (SE +/- 110.89, N = 5)
  -O3: 20297.00 (SE +/- 89.96, N = 5)
  -O3 -march=znver1: 20171.00 (SE +/- 65.74, N = 5)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 20433.00 (SE +/- 81.73, N = 5)
  -Ofast -march=znver1: 20676.00 (SE +/- 51.64, N = 5)
  (CC) gcc options: -lm

Timed MAFFT Alignment 6.864 - Multiple Sequence Alignment
Seconds, Fewer Is Better
  -O0: 3.87 (SE +/- 0.06, N = 4)
  -O2: 3.63 (SE +/- 0.11, N = 6)
  -O3: 3.77 (SE +/- 0.02, N = 3)
  -O3 -march=znver1: 3.66 (SE +/- 0.08, N = 6)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 3.78 (SE +/- 0.10, N = 6)
  -Ofast -march=znver1: 3.82 (SE +/- 0.09, N = 6)
  (CC) gcc options: -O3 -lm -lpthread

Himeno Benchmark 3.0 - Poisson Pressure Solver
MFLOPS, More Is Better
  -O0: 325.64 (SE +/- 0.62, N = 3)
  -O2: 1157.38 (SE +/- 1.11, N = 3)
  -O3: 1150.22 (SE +/- 0.19, N = 3)
  -O3 -march=znver1: 1133.27 (SE +/- 0.82, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1133.24 (SE +/- 0.76, N = 3)
  -Ofast -march=znver1: 1036.50 (SE +/- 0.70, N = 3)
  (CC) gcc options: -O3 -mavx2

Stockfish 2014-11-26 - Total Time
ms, Fewer Is Better
  -O0: 3711 (SE +/- 7.88, N = 3)
  -O2: 3712 (SE +/- 7.75, N = 3)
  -O3: 3703 (SE +/- 9.53, N = 3)
  -O3 -march=znver1: 3643 (SE +/- 5.00, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 3644 (SE +/- 8.08, N = 3)
  -Ofast -march=znver1: 3618 (SE +/- 4.98, N = 3)
  (CXX) g++ options: -lpthread -fno-exceptions -fno-rtti -ansi -pedantic -O3 -msse -msse3 -mpopcnt

GraphicsMagick 1.3.19 - Operation: Blur
Iterations Per Minute, More Is Better
  -O0: 46
  -O2: 107
  -O3: 106
  -O3 -march=znver1: 102
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 106
  -Ofast -march=znver1: 100
  (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick 1.3.19 - Operation: Sharpen
Iterations Per Minute, More Is Better
  -O0: 21
  -O2: 57
  -O3: 58
  -O3 -march=znver1: 59
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 60
  -Ofast -march=znver1: 64
  Standard error reported for one configuration only (not identifiable from this export): SE +/- 0.33, N = 3
  (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick 1.3.19 - Operation: Resizing
Iterations Per Minute, More Is Better
  -O0: 49
  -O2: 132
  -O3: 128
  -O3 -march=znver1: 137
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 143
  -Ofast -march=znver1: 138
  Standard errors reported for two configurations only (not identifiable from this export): SE +/- 6.17, N = 6; SE +/- 0.67, N = 3
  (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick 1.3.19 - Operation: HWB Color Space
Iterations Per Minute, More Is Better
  -O0: 72
  -O2: 163
  -O3: 166
  -O3 -march=znver1: 149
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 157
  -Ofast -march=znver1: 161
  (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick 1.3.19 - Operation: Local Adaptive Thresholding
Iterations Per Minute, More Is Better
  -O0: 28
  -O2: 133
  -O3: 135
  -O3 -march=znver1: 135
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 141
  -Ofast -march=znver1: 135
  (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

C-Ray 1.1 - Total Time
Seconds, Fewer Is Better
  -O0: 32.30 (SE +/- 0.01, N = 3)
  -O2: 14.01 (SE +/- 0.00, N = 3)
  -O3: 14.00 (SE +/- 0.01, N = 3)
  -O3 -march=znver1: 13.49 (SE +/- 0.02, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 13.49 (SE +/- 0.00, N = 3)
  -Ofast -march=znver1: 13.41 (SE +/- 0.00, N = 3)
  (CC) gcc options: -lm -lpthread -O3

OpenSSL 1.0.1g - RSA 4096-bit Performance
Signs Per Second, More Is Better
  -O0: 986.70 (SE +/- 0.85, N = 3)
  -O2: 987.67 (SE +/- 0.49, N = 3)
  -O3: 986.93 (SE +/- 0.58, N = 3)
  -O3 -march=znver1: 987.43 (SE +/- 0.57, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 986.73 (SE +/- 0.85, N = 3)
  -Ofast -march=znver1: 986.43 (SE +/- 0.77, N = 3)
  (CC) gcc options: -m64 -O3 -lssl -lcrypto -ldl

Redis 3.0.1 - Test: GET
Requests Per Second, More Is Better
  -O0: 1204370.92 (SE +/- 4609.53, N = 3)
  -O2: 1983766.97 (SE +/- 32907.71, N = 4)
  -O3: 1971325.87 (SE +/- 15178.56, N = 3)
  -O3 -march=znver1: 1945705.79 (SE +/- 13206.56, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2008848.23 (SE +/- 34111.46, N = 6)
  -Ofast -march=znver1: 1953486.67 (SE +/- 18769.12, N = 3)
  (CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl

Redis 3.0.1 - Test: SET
Requests Per Second, More Is Better
  -O0: 890829.00 (SE +/- 12621.48, N = 3)
  -O2: 1377516.08 (SE +/- 8504.02, N = 3)
  -O3: 1399320.00 (SE +/- 6804.89, N = 3)
  -O3 -march=znver1: 1379585.37 (SE +/- 13825.40, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1406611.75 (SE +/- 10024.40, N = 3)
  -Ofast -march=znver1: 1386323.96 (SE +/- 1282.46, N = 3)
  (CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl

PostgreSQL pgbench 9.6.3 - Scaling: Buffer Test - Test: Normal Load - Mode: Read Write
TPS, More Is Better
  -O0: 1865.82 (SE +/- 42.77, N = 6)
  -O2: 1906.40 (SE +/- 32.61, N = 6)
  -O3: 1942.42 (SE +/- 27.77, N = 6)
  -O3 -march=znver1: 1930.21 (SE +/- 31.64, N = 6)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1886.70 (SE +/- 36.05, N = 6)
  (CC) gcc options: -fno-strict-aliasing -fwrapv -fpic

PostgreSQL pgbench 9.6.3 - Scaling: Buffer Test - Test: Single Thread - Mode: Read Write
TPS, More Is Better
  -O0: 201.67 (SE +/- 2.36, N = 3)
  -O2: 225.06 (SE +/- 0.26, N = 3)
  -O3: 226.94 (SE +/- 0.29, N = 3)
  -O3 -march=znver1: 225.61 (SE +/- 0.74, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 226.38 (SE +/- 0.93, N = 3)
  (CC) gcc options: -fno-strict-aliasing -fwrapv -fpic

PostgreSQL pgbench 9.6.3 - Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Write
TPS, More Is Better
  -O0: 1985.80 (SE +/- 31.25, N = 3)
  -O2: 1903.00 (SE +/- 24.85, N = 3)
  -O3: 2037.20 (SE +/- 8.58, N = 3)
  -O3 -march=znver1: 1952.70 (SE +/- 32.80, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1932.86 (SE +/- 27.83, N = 5)
  (CC) gcc options: -fno-strict-aliasing -fwrapv -fpic

Phoronix Test Suite v10.8.5