AOCC 1.0 Compiler Tuning
AMD Ryzen 7 1700 Eight-Core testing with an MSI B350 TOMAHAWK (MS-7A34) v1.0 and HIS AMD Radeon HD 7750/8740 / R7 250E 1024MB on Ubuntu 17.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/1705218-TR-AOCC10COM39&grs&rdt .
System details (identical for all configurations):
  Processor: AMD Ryzen 7 1700 Eight-Core @ 3.00GHz (16 Cores)
  Motherboard: MSI B350 TOMAHAWK (MS-7A34) v1.0
  Chipset: AMD Device 1450
  Memory: 16384MB
  Disk: 120GB Samsung SSD 840
  Graphics: HIS AMD Radeon HD 7750/8740 / R7 250E 1024MB
  Audio: AMD Cape Verde/Pitcairn
  Monitor: DELL S2409W
  Network: Realtek RTL8111/8168/8411
  OS: Ubuntu 17.04
  Kernel: 4.12.0-999-generic (x86_64) 20170518
  Desktop: Unity 7.5.0
  Display Driver: modesetting 1.19.3
  Compiler: Clang 4.0.0
  File-System: ext4
  Screen Resolution: 1920x1080
Configurations tested:
  -O3
  -O3 -march=znver1
  -O3 -march=znver1 -mllvm -enable-strided-vectorization
  -Ofast -march=znver1
  -O2
  -O0
Compiler Details - Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver1
Processor Details - Scaling Governor: acpi-cpufreq ondemand
Result overview (n/a = no result reported for that configuration):
Test | -O3 | -O3 -march=znver1 | -O3 -march=znver1 -mllvm -enable-strided-vectorization | -Ofast -march=znver1 | -O2 | -O0
fftw: Float + SSE - 2D FFT Size 1024 | 20297 | 20171 | 20433 | 20676 | 20190 | 2434.70
graphics-magick: Local Adaptive Thresholding | 135 | 135 | 141 | 135 | 133 | 28
encode-mp3: WAV To MP3 | 9.42 | 10.61 | n/a | 10.61 | 9.40 | 36.48
himeno: Poisson Pressure Solver | 1150.22 | 1133.27 | 1133.24 | 1036.50 | 1157.38 | 325.64
graphics-magick: Sharpen | 58 | 59 | 60 | 64 | 57 | 21
graphics-magick: Resizing | 128 | 137 | 143 | 138 | 132 | 49
c-ray: Total Time | 14.00 | 13.49 | 13.49 | 13.41 | 14.01 | 32.30
tjbench: Decompression Throughput | 162.59 | 168.45 | 168.66 | 165.24 | 161.68 | 71.84
graphics-magick: Blur | 106 | 102 | 106 | 100 | 107 | 46
graphics-magick: HWB Color Space | 166 | 149 | 157 | 161 | 163 | 72
tscp: AI Chess Performance | 1053919 | 1021094 | 1021094 | 1029891 | 1016572 | 475710
encode-flac: WAV To FLAC | 6.78 | 5.64 | n/a | 6.26 | 6.80 | 62.12
redis: GET | 1971325.87 | 1945705.79 | 2008848.23 | 1953486.67 | 1983766.97 | 1204370.92
redis: SET | 1399320.00 | 1379585.37 | 1406611.75 | 1386323.96 | 1377516.08 | 890829.00
encode-wavpack: WAV To WavPack | 6.51 | 6.43 | n/a | 6.46 | 6.52 | 7.76
pgbench: Buffer Test - Single Thread - Read Write | 226.94 | 225.61 | 226.38 | n/a | 225.06 | 201.67
pgbench: Buffer Test - Heavy Contention - Read Write | 2037.20 | 1952.70 | 1932.86 | n/a | 1903.00 | 1985.80
mafft: Multiple Sequence Alignment | 3.77 | 3.66 | 3.78 | 3.82 | 3.63 | 3.87
scimark2: Dense LU Matrix Factorization | 5859.93 | 5670.42 | 5605.43 | 5644.15 | 5554.97 | 5637.05
pgbench: Buffer Test - Normal Load - Read Write | 1942.42 | 1930.21 | 1886.70 | n/a | 1906.40 | 1865.82
scimark2: Composite | 2196.34 | 2156.12 | 2139.67 | 2147.94 | 2130.55 | 2147.42
scimark2: Fast Fourier Transform | 134.23 | 131.50 | 134.46 | 135.39 | 134.06 | 134.67
scimark2: Monte Carlo | 643.43 | 660.30 | 659.88 | 659.67 | 642.37 | 643.45
stockfish: Total Time | 3703 | 3643 | 3644 | 3618 | 3712 | 3711
scimark2: Sparse Matrix Multiply | 2667.90 | 2636.32 | 2616.74 | 2619.99 | 2646.12 | 2648.16
scimark2: Jacobi Successive Over-Relaxation | 1676.24 | 1682.07 | 1681.87 | 1680.49 | 1675.22 | 1673.77
openssl: RSA 4096-bit Performance | 986.93 | 987.43 | 986.73 | 986.43 | 987.67 | 986.70
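Because some tests report "more is better" units and others "fewer is better", the raw overview figures are easier to compare once normalized against one configuration. The following is a minimal sketch of that normalization, not part of the exported result: it assumes -O3 as the baseline, uses only a handful of values copied from the overview, and the names RESULTS, relative_performance and compare_to_baseline are hypothetical helpers introduced for illustration.

```python
# Values copied from the result overview; the choice of tests is arbitrary.
# Each entry maps a test to (direction, {configuration: reported result}).
RESULTS = {
    "fftw 2D FFT 1024 (Mflops)": (
        "more",
        {"-O3": 20297.0, "-Ofast -march=znver1": 20676.0, "-O0": 2434.70},
    ),
    "c-ray total time (s)": (
        "fewer",
        {"-O3": 14.00, "-Ofast -march=znver1": 13.41, "-O0": 32.30},
    ),
    "encode-flac (s)": (
        "fewer",
        {"-O3": 6.78, "-Ofast -march=znver1": 6.26, "-O0": 62.12},
    ),
}


def relative_performance(direction, baseline, value):
    """Return performance relative to the baseline; above 1.0 means faster."""
    if direction == "more":
        return value / baseline
    if direction == "fewer":
        # For timings, a smaller value is better, so invert the ratio.
        return baseline / value
    raise ValueError(f"unknown direction: {direction!r}")


def compare_to_baseline(results, baseline_config="-O3"):
    """Normalize every configuration of every test against baseline_config."""
    table = {}
    for test, (direction, by_config) in results.items():
        base = by_config[baseline_config]
        table[test] = {
            config: relative_performance(direction, base, value)
            for config, value in by_config.items()
        }
    return table


if __name__ == "__main__":
    for test, row in compare_to_baseline(RESULTS).items():
        for config, ratio in row.items():
            print(f"{test:28s} {config:22s} {ratio:6.3f}x")
```

The ratios ignore the reported standard errors, so small differences between configurations should not be read as significant on the basis of this calculation alone.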
FFTW 3.3.4 - Build: Float + SSE - Size: 2D FFT Size 1024 (Mflops, More Is Better)
  -O3: 20297.00 (SE +/- 89.96, N = 5)
  -O3 -march=znver1: 20171.00 (SE +/- 65.74, N = 5)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 20433.00 (SE +/- 81.73, N = 5)
  -Ofast -march=znver1: 20676.00 (SE +/- 51.64, N = 5)
  -O2: 20190.00 (SE +/- 110.89, N = 5)
  -O0: 2434.70 (SE +/- 2.81, N = 5)
  (CC) gcc options: -lm
GraphicsMagick 1.3.19 - Operation: Local Adaptive Thresholding (Iterations Per Minute, More Is Better)
  -O3: 135
  -O3 -march=znver1: 135
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 141
  -Ofast -march=znver1: 135
  -O2: 133
  -O0: 28
  (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread
LAME MP3 Encoding 3.99.3 - WAV To MP3 (Seconds, Fewer Is Better)
  -O3: 9.42 (SE +/- 0.00, N = 5)
  -O3 -march=znver1: 10.61 (SE +/- 0.00, N = 5)
  -Ofast -march=znver1: 10.61 (SE +/- 0.00, N = 5)
  -O2: 9.40 (SE +/- 0.01, N = 5)
  -O0: 36.48 (SE +/- 0.01, N = 5)
  (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -lm
Himeno Benchmark 3.0 - Poisson Pressure Solver (MFLOPS, More Is Better)
  -O3: 1150.22 (SE +/- 0.19, N = 3)
  -O3 -march=znver1: 1133.27 (SE +/- 0.82, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1133.24 (SE +/- 0.76, N = 3)
  -Ofast -march=znver1: 1036.50 (SE +/- 0.70, N = 3)
  -O2: 1157.38 (SE +/- 1.11, N = 3)
  -O0: 325.64 (SE +/- 0.62, N = 3)
  (CC) gcc options: -O3 -mavx2
GraphicsMagick 1.3.19 - Operation: Sharpen (Iterations Per Minute, More Is Better)
  -O3: 58
  -O3 -march=znver1: 59
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 60
  -Ofast -march=znver1: 64
  -O2: 57
  -O0: 21
  SE +/- 0.33, N = 3 (reported for one configuration; not identified in this export)
  (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread
GraphicsMagick 1.3.19 - Operation: Resizing (Iterations Per Minute, More Is Better)
  -O3: 128
  -O3 -march=znver1: 137
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 143
  -Ofast -march=znver1: 138
  -O2: 132
  -O0: 49
  SE +/- 6.17, N = 6 and SE +/- 0.67, N = 3 (reported for two configurations; not identified in this export)
  (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread
C-Ray 1.1 - Total Time (Seconds, Fewer Is Better)
  -O3: 14.00 (SE +/- 0.01, N = 3)
  -O3 -march=znver1: 13.49 (SE +/- 0.02, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 13.49 (SE +/- 0.00, N = 3)
  -Ofast -march=znver1: 13.41 (SE +/- 0.00, N = 3)
  -O2: 14.01 (SE +/- 0.00, N = 3)
  -O0: 32.30 (SE +/- 0.01, N = 3)
  (CC) gcc options: -lm -lpthread -O3
libjpeg-turbo tjbench 1.5.1 - Test: Decompression Throughput (Megapixels/sec, More Is Better)
  -O3: 162.59 (SE +/- 0.07, N = 3)
  -O3 -march=znver1: 168.45 (SE +/- 0.04, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 168.66 (SE +/- 0.01, N = 3)
  -Ofast -march=znver1: 165.24 (SE +/- 1.75, N = 3)
  -O2: 161.68 (SE +/- 0.15, N = 3)
  -O0: 71.84 (SE +/- 0.02, N = 3)
GraphicsMagick 1.3.19 - Operation: Blur (Iterations Per Minute, More Is Better)
  -O3: 106
  -O3 -march=znver1: 102
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 106
  -Ofast -march=znver1: 100
  -O2: 107
  -O0: 46
  (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread
GraphicsMagick 1.3.19 - Operation: HWB Color Space (Iterations Per Minute, More Is Better)
  -O3: 166
  -O3 -march=znver1: 149
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 157
  -Ofast -march=znver1: 161
  -O2: 163
  -O0: 72
  (CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread
TSCP 1.81 - AI Chess Performance (Nodes Per Second, More Is Better)
  -O3: 1053919 (SE +/- 494.31, N = 5)
  -O3 -march=znver1: 1021094 (SE +/- 463.44, N = 5)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1021094 (SE +/- 463.44, N = 5)
  -Ofast -march=znver1: 1029891 (SE +/- 1971.42, N = 5)
  -O2: 1016572 (SE +/- 701.56, N = 5)
  -O0: 475710 (SE +/- 82.20, N = 5)
FLAC Audio Encoding 1.3.1 - WAV To FLAC (Seconds, Fewer Is Better)
  -O3: 6.78 (SE +/- 0.00, N = 5)
  -O3 -march=znver1: 5.64 (SE +/- 0.00, N = 5)
  -Ofast -march=znver1: 6.26 (SE +/- 0.00, N = 5)
  -O2: 6.80 (SE +/- 0.01, N = 5)
  -O0: 62.12 (SE +/- 0.01, N = 5)
  (CXX) g++ options: -logg -lm
Redis 3.0.1 - Test: GET (Requests Per Second, More Is Better)
  -O3: 1971325.87 (SE +/- 15178.56, N = 3)
  -O3 -march=znver1: 1945705.79 (SE +/- 13206.56, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2008848.23 (SE +/- 34111.46, N = 6)
  -Ofast -march=znver1: 1953486.67 (SE +/- 18769.12, N = 3)
  -O2: 1983766.97 (SE +/- 32907.71, N = 4)
  -O0: 1204370.92 (SE +/- 4609.53, N = 3)
  (CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl
Redis 3.0.1 - Test: SET (Requests Per Second, More Is Better)
  -O3: 1399320.00 (SE +/- 6804.89, N = 3)
  -O3 -march=znver1: 1379585.37 (SE +/- 13825.40, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1406611.75 (SE +/- 10024.40, N = 3)
  -Ofast -march=znver1: 1386323.96 (SE +/- 1282.46, N = 3)
  -O2: 1377516.08 (SE +/- 8504.02, N = 3)
  -O0: 890829.00 (SE +/- 12621.48, N = 3)
  (CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl
WavPack Audio Encoding 5.1 - WAV To WavPack (Seconds, Fewer Is Better)
  -O3: 6.51 (SE +/- 0.00, N = 5)
  -O3 -march=znver1: 6.43 (SE +/- 0.00, N = 5)
  -Ofast -march=znver1: 6.46 (SE +/- 0.01, N = 5)
  -O2: 6.52 (SE +/- 0.00, N = 5)
  -O0: 7.76 (SE +/- 0.00, N = 5)
  (CC) gcc options: -lm
PostgreSQL pgbench 9.6.3 - Scaling: Buffer Test - Test: Single Thread - Mode: Read Write (TPS, More Is Better)
  -O3: 226.94 (SE +/- 0.29, N = 3)
  -O3 -march=znver1: 225.61 (SE +/- 0.74, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 226.38 (SE +/- 0.93, N = 3)
  -O2: 225.06 (SE +/- 0.26, N = 3)
  -O0: 201.67 (SE +/- 2.36, N = 3)
  (CC) gcc options: -fno-strict-aliasing -fwrapv -fpic
PostgreSQL pgbench 9.6.3 - Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Write (TPS, More Is Better)
  -O3: 2037.20 (SE +/- 8.58, N = 3)
  -O3 -march=znver1: 1952.70 (SE +/- 32.80, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1932.86 (SE +/- 27.83, N = 5)
  -O2: 1903.00 (SE +/- 24.85, N = 3)
  -O0: 1985.80 (SE +/- 31.25, N = 3)
  (CC) gcc options: -fno-strict-aliasing -fwrapv -fpic
Timed MAFFT Alignment 6.864 - Multiple Sequence Alignment (Seconds, Fewer Is Better)
  -O3: 3.77 (SE +/- 0.02, N = 3)
  -O3 -march=znver1: 3.66 (SE +/- 0.08, N = 6)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 3.78 (SE +/- 0.10, N = 6)
  -Ofast -march=znver1: 3.82 (SE +/- 0.09, N = 6)
  -O2: 3.63 (SE +/- 0.11, N = 6)
  -O0: 3.87 (SE +/- 0.06, N = 4)
  (CC) gcc options: -O3 -lm -lpthread
SciMark 2.0 - Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
  -O3: 5859.93 (SE +/- 103.73, N = 4)
  -O3 -march=znver1: 5670.42 (SE +/- 16.49, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 5605.43 (SE +/- 27.73, N = 4)
  -Ofast -march=znver1: 5644.15 (SE +/- 25.89, N = 4)
  -O2: 5554.97 (SE +/- 40.80, N = 4)
  -O0: 5637.05 (SE +/- 21.28, N = 4)
  (CC) gcc options: -lm
PostgreSQL pgbench 9.6.3 - Scaling: Buffer Test - Test: Normal Load - Mode: Read Write (TPS, More Is Better)
  -O3: 1942.42 (SE +/- 27.77, N = 6)
  -O3 -march=znver1: 1930.21 (SE +/- 31.64, N = 6)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1886.70 (SE +/- 36.05, N = 6)
  -O2: 1906.40 (SE +/- 32.61, N = 6)
  -O0: 1865.82 (SE +/- 42.77, N = 6)
  (CC) gcc options: -fno-strict-aliasing -fwrapv -fpic
SciMark 2.0 - Computational Test: Composite (Mflops, More Is Better)
  -O3: 2196.34 (SE +/- 28.83, N = 4)
  -O3 -march=znver1: 2156.12 (SE +/- 4.05, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2139.67 (SE +/- 5.52, N = 4)
  -Ofast -march=znver1: 2147.94 (SE +/- 5.09, N = 4)
  -O2: 2130.55 (SE +/- 8.24, N = 4)
  -O0: 2147.42 (SE +/- 2.89, N = 4)
  (CC) gcc options: -lm
SciMark 2.0 - Computational Test: Fast Fourier Transform (Mflops, More Is Better)
  -O3: 134.23 (SE +/- 0.25, N = 4)
  -O3 -march=znver1: 131.50 (SE +/- 3.77, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 134.46 (SE +/- 0.16, N = 4)
  -Ofast -march=znver1: 135.39 (SE +/- 0.42, N = 4)
  -O2: 134.06 (SE +/- 0.36, N = 4)
  -O0: 134.67 (SE +/- 0.39, N = 4)
  (CC) gcc options: -lm
SciMark 2.0 - Computational Test: Monte Carlo (Mflops, More Is Better)
  -O3: 643.43 (SE +/- 0.30, N = 4)
  -O3 -march=znver1: 660.30 (SE +/- 0.05, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 659.88 (SE +/- 0.16, N = 4)
  -Ofast -march=znver1: 659.67 (SE +/- 0.11, N = 4)
  -O2: 642.37 (SE +/- 1.18, N = 4)
  -O0: 643.45 (SE +/- 0.13, N = 4)
  (CC) gcc options: -lm
Stockfish 2014-11-26 - Total Time (ms, Fewer Is Better)
  -O3: 3703 (SE +/- 9.53, N = 3)
  -O3 -march=znver1: 3643 (SE +/- 5.00, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 3644 (SE +/- 8.08, N = 3)
  -Ofast -march=znver1: 3618 (SE +/- 4.98, N = 3)
  -O2: 3712 (SE +/- 7.75, N = 3)
  -O0: 3711 (SE +/- 7.88, N = 3)
  (CXX) g++ options: -lpthread -O3 -fno-exceptions -fno-rtti -ansi -pedantic -msse -msse3 -mpopcnt
SciMark 2.0 - Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)
  -O3: 2667.90 (SE +/- 39.95, N = 4)
  -O3 -march=znver1: 2636.32 (SE +/- 10.92, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2616.74 (SE +/- 6.22, N = 4)
  -Ofast -march=znver1: 2619.99 (SE +/- 5.95, N = 4)
  -O2: 2646.12 (SE +/- 4.64, N = 4)
  -O0: 2648.16 (SE +/- 7.78, N = 4)
  (CC) gcc options: -lm
SciMark 2.0 - Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
  -O3: 1676.24 (SE +/- 0.43, N = 4)
  -O3 -march=znver1: 1682.07 (SE +/- 0.29, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1681.87 (SE +/- 0.52, N = 4)
  -Ofast -march=znver1: 1680.49 (SE +/- 1.12, N = 4)
  -O2: 1675.22 (SE +/- 0.50, N = 4)
  -O0: 1673.77 (SE +/- 2.17, N = 4)
  (CC) gcc options: -lm
OpenSSL 1.0.1g - RSA 4096-bit Performance (Signs Per Second, More Is Better)
  -O3: 986.93 (SE +/- 0.58, N = 3)
  -O3 -march=znver1: 987.43 (SE +/- 0.57, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 986.73 (SE +/- 0.85, N = 3)
  -Ofast -march=znver1: 986.43 (SE +/- 0.77, N = 3)
  -O2: 987.67 (SE +/- 0.49, N = 3)
  -O0: 986.70 (SE +/- 0.85, N = 3)
  (CC) gcc options: -m64 -O3 -lssl -lcrypto -ldl
Phoronix Test Suite v10.8.5