AOCC 1.0 Compiler Tuning
AMD Ryzen 7 1700 Eight-Core testing with an MSI B350 TOMAHAWK (MS-7A34) v1.0 and HIS AMD Radeon HD 7750/8740 / R7 250E 1024MB on Ubuntu 17.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/1705218-TR-AOCC10COM39&grt&sor .
Configurations tested:
  -O0
  -O2
  -O3
  -O3 -march=znver1
  -O3 -march=znver1 -mllvm -enable-strided-vectorization
  -Ofast -march=znver1

System details (identical for all configurations):
  Processor: AMD Ryzen 7 1700 Eight-Core @ 3.00GHz (16 Cores)
  Motherboard: MSI B350 TOMAHAWK (MS-7A34) v1.0
  Chipset: AMD Device 1450
  Memory: 16384MB
  Disk: 120GB Samsung SSD 840
  Graphics: HIS AMD Radeon HD 7750/8740 / R7 250E 1024MB
  Audio: AMD Cape Verde/Pitcairn
  Monitor: DELL S2409W
  Network: Realtek RTL8111/8168/8411
  OS: Ubuntu 17.04
  Kernel: 4.12.0-999-generic (x86_64) 20170518
  Desktop: Unity 7.5.0
  Display Driver: modesetting 1.19.3
  Compiler: Clang 4.0.0
  File-System: ext4
  Screen Resolution: 1920x1080

Compiler Details - Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver1
Processor Details - Scaling Governor: acpi-cpufreq ondemand
Results overview. Columns, in order: -O0 | -O2 | -O3 | -O3 -march=znver1 | -O3 -march=znver1 -mllvm -enable-strided-vectorization | -Ofast -march=znver1

c-ray: Total Time | 32.30 | 14.01 | 14.00 | 13.49 | 13.49 | 13.41
fftw: Float + SSE - 2D FFT Size 1024 | 2434.70 | 20190 | 20297 | 20171 | 20433 | 20676
encode-flac: WAV To FLAC | 62.12 | 6.80 | 6.78 | 5.64 | no result | 6.26
graphics-magick: Blur | 46 | 107 | 106 | 102 | 106 | 100
graphics-magick: Sharpen | 21 | 57 | 58 | 59 | 60 | 64
graphics-magick: Resizing | 49 | 132 | 128 | 137 | 143 | 138
graphics-magick: HWB Color Space | 72 | 163 | 166 | 149 | 157 | 161
graphics-magick: Local Adaptive Thresholding | 28 | 133 | 135 | 135 | 141 | 135
himeno: Poisson Pressure Solver | 325.64 | 1157.38 | 1150.22 | 1133.27 | 1133.24 | 1036.50
encode-mp3: WAV To MP3 | 36.48 | 9.40 | 9.42 | 10.61 | no result | 10.61
tjbench: Decompression Throughput | 71.84 | 161.68 | 162.59 | 168.45 | 168.66 | 165.24
openssl: RSA 4096-bit Performance | 986.70 | 987.67 | 986.93 | 987.43 | 986.73 | 986.43
pgbench: Buffer Test - Normal Load - Read Write | 1865.82 | 1906.40 | 1942.42 | 1930.21 | 1886.70 | no result
pgbench: Buffer Test - Single Thread - Read Write | 201.67 | 225.06 | 226.94 | 225.61 | 226.38 | no result
pgbench: Buffer Test - Heavy Contention - Read Write | 1985.80 | 1903.00 | 2037.20 | 1952.70 | 1932.86 | no result
redis: GET | 1204370.92 | 1983766.97 | 1971325.87 | 1945705.79 | 2008848.23 | 1953486.67
redis: SET | 890829.00 | 1377516.08 | 1399320.00 | 1379585.37 | 1406611.75 | 1386323.96
scimark2: Composite | 2147.42 | 2130.55 | 2196.34 | 2156.12 | 2139.67 | 2147.94
scimark2: Monte Carlo | 643.45 | 642.37 | 643.43 | 660.30 | 659.88 | 659.67
scimark2: Fast Fourier Transform | 134.67 | 134.06 | 134.23 | 131.50 | 134.46 | 135.39
scimark2: Sparse Matrix Multiply | 2648.16 | 2646.12 | 2667.90 | 2636.32 | 2616.74 | 2619.99
scimark2: Dense LU Matrix Factorization | 5637.05 | 5554.97 | 5859.93 | 5670.42 | 5605.43 | 5644.15
scimark2: Jacobi Successive Over-Relaxation | 1673.77 | 1675.22 | 1676.24 | 1682.07 | 1681.87 | 1680.49
stockfish: Total Time | 3711 | 3712 | 3703 | 3643 | 3644 | 3618
mafft: Multiple Sequence Alignment | 3.87 | 3.63 | 3.77 | 3.66 | 3.78 | 3.82
tscp: AI Chess Performance | 475710 | 1016572 | 1053919 | 1021094 | 1021094 | 1029891
encode-wavpack: WAV To WavPack | 7.76 | 6.52 | 6.51 | 6.43 | no result | 6.46

C-Ray 1.1: Total Time
Seconds, Fewer Is Better
  -Ofast -march=znver1: 13.41 (SE +/- 0.00, N = 3)
  -O3 -march=znver1: 13.49 (SE +/- 0.02, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 13.49 (SE +/- 0.00, N = 3)
  -O3: 14.00 (SE +/- 0.01, N = 3)
  -O2: 14.01 (SE +/- 0.00, N = 3)
  -O0: 32.30 (SE +/- 0.01, N = 3)
(CC) gcc options: -lm -lpthread -O3

FFTW 3.3.4: Build: Float + SSE - Size: 2D FFT Size 1024
Mflops, More Is Better
  -Ofast -march=znver1: 20676.00 (SE +/- 51.64, N = 5)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 20433.00 (SE +/- 81.73, N = 5)
  -O3: 20297.00 (SE +/- 89.96, N = 5)
  -O2: 20190.00 (SE +/- 110.89, N = 5)
  -O3 -march=znver1: 20171.00 (SE +/- 65.74, N = 5)
  -O0: 2434.70 (SE +/- 2.81, N = 5)
(CC) gcc options: -lm

FLAC Audio Encoding 1.3.1: WAV To FLAC
Seconds, Fewer Is Better
  -O3 -march=znver1: 5.64 (SE +/- 0.00, N = 5)
  -Ofast -march=znver1: 6.26 (SE +/- 0.00, N = 5)
  -O3: 6.78 (SE +/- 0.00, N = 5)
  -O2: 6.80 (SE +/- 0.01, N = 5)
  -O0: 62.12 (SE +/- 0.01, N = 5)
(CXX) g++ options: -logg -lm

GraphicsMagick 1.3.19: Operation: Blur
Iterations Per Minute, More Is Better
  -O2: 107
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 106
  -O3: 106
  -O3 -march=znver1: 102
  -Ofast -march=znver1: 100
  -O0: 46
(CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick 1.3.19: Operation: Sharpen
Iterations Per Minute, More Is Better
  -Ofast -march=znver1: 64
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 60
  -O3 -march=znver1: 59
  -O3: 58
  -O2: 57
  -O0: 21
Standard errors listed in the export (configuration not identified): SE +/- 0.33, N = 3
(CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick 1.3.19: Operation: Resizing
Iterations Per Minute, More Is Better
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 143
  -Ofast -march=znver1: 138
  -O3 -march=znver1: 137
  -O2: 132
  -O3: 128
  -O0: 49
Standard errors listed in the export (configurations not identified): SE +/- 0.67, N = 3; SE +/- 6.17, N = 6
(CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick 1.3.19: Operation: HWB Color Space
Iterations Per Minute, More Is Better
  -O3: 166
  -O2: 163
  -Ofast -march=znver1: 161
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 157
  -O3 -march=znver1: 149
  -O0: 72
(CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

GraphicsMagick 1.3.19: Operation: Local Adaptive Thresholding
Iterations Per Minute, More Is Better
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 141
  -Ofast -march=znver1: 135
  -O3 -march=znver1: 135
  -O3: 135
  -O2: 133
  -O0: 28
(CC) gcc options: -pthread -ljbig -lwebp -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lz -lm -lpthread

Himeno Benchmark 3.0: Poisson Pressure Solver
MFLOPS, More Is Better
  -O2: 1157.38 (SE +/- 1.11, N = 3)
  -O3: 1150.22 (SE +/- 0.19, N = 3)
  -O3 -march=znver1: 1133.27 (SE +/- 0.82, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1133.24 (SE +/- 0.76, N = 3)
  -Ofast -march=znver1: 1036.50 (SE +/- 0.70, N = 3)
  -O0: 325.64 (SE +/- 0.62, N = 3)
(CC) gcc options: -O3 -mavx2

LAME MP3 Encoding 3.99.3: WAV To MP3
Seconds, Fewer Is Better
  -O2: 9.40 (SE +/- 0.01, N = 5)
  -O3: 9.42 (SE +/- 0.00, N = 5)
  -O3 -march=znver1: 10.61 (SE +/- 0.00, N = 5)
  -Ofast -march=znver1: 10.61 (SE +/- 0.00, N = 5)
  -O0: 36.48 (SE +/- 0.01, N = 5)
(CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -lm

libjpeg-turbo tjbench 1.5.1: Test: Decompression Throughput
Megapixels/sec, More Is Better
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 168.66 (SE +/- 0.01, N = 3)
  -O3 -march=znver1: 168.45 (SE +/- 0.04, N = 3)
  -Ofast -march=znver1: 165.24 (SE +/- 1.75, N = 3)
  -O3: 162.59 (SE +/- 0.07, N = 3)
  -O2: 161.68 (SE +/- 0.15, N = 3)
  -O0: 71.84 (SE +/- 0.02, N = 3)

OpenSSL 1.0.1g: RSA 4096-bit Performance
Signs Per Second, More Is Better
  -O2: 987.67 (SE +/- 0.49, N = 3)
  -O3 -march=znver1: 987.43 (SE +/- 0.57, N = 3)
  -O3: 986.93 (SE +/- 0.58, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 986.73 (SE +/- 0.85, N = 3)
  -O0: 986.70 (SE +/- 0.85, N = 3)
  -Ofast -march=znver1: 986.43 (SE +/- 0.77, N = 3)
(CC) gcc options: -m64 -O3 -lssl -lcrypto -ldl

PostgreSQL pgbench 9.6.3: Scaling: Buffer Test - Test: Normal Load - Mode: Read Write
TPS, More Is Better
  -O3: 1942.42 (SE +/- 27.77, N = 6)
  -O3 -march=znver1: 1930.21 (SE +/- 31.64, N = 6)
  -O2: 1906.40 (SE +/- 32.61, N = 6)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1886.70 (SE +/- 36.05, N = 6)
  -O0: 1865.82 (SE +/- 42.77, N = 6)
(CC) gcc options: -fno-strict-aliasing -fwrapv -fpic

PostgreSQL pgbench 9.6.3: Scaling: Buffer Test - Test: Single Thread - Mode: Read Write
TPS, More Is Better
  -O3: 226.94 (SE +/- 0.29, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 226.38 (SE +/- 0.93, N = 3)
  -O3 -march=znver1: 225.61 (SE +/- 0.74, N = 3)
  -O2: 225.06 (SE +/- 0.26, N = 3)
  -O0: 201.67 (SE +/- 2.36, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -fpic

PostgreSQL pgbench 9.6.3: Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Write
TPS, More Is Better
  -O3: 2037.20 (SE +/- 8.58, N = 3)
  -O0: 1985.80 (SE +/- 31.25, N = 3)
  -O3 -march=znver1: 1952.70 (SE +/- 32.80, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1932.86 (SE +/- 27.83, N = 5)
  -O2: 1903.00 (SE +/- 24.85, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -fpic

Redis 3.0.1: Test: GET
Requests Per Second, More Is Better
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2008848.23 (SE +/- 34111.46, N = 6)
  -O2: 1983766.97 (SE +/- 32907.71, N = 4)
  -O3: 1971325.87 (SE +/- 15178.56, N = 3)
  -Ofast -march=znver1: 1953486.67 (SE +/- 18769.12, N = 3)
  -O3 -march=znver1: 1945705.79 (SE +/- 13206.56, N = 3)
  -O0: 1204370.92 (SE +/- 4609.53, N = 3)
(CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl

Redis 3.0.1: Test: SET
Requests Per Second, More Is Better
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1406611.75 (SE +/- 10024.40, N = 3)
  -O3: 1399320.00 (SE +/- 6804.89, N = 3)
  -Ofast -march=znver1: 1386323.96 (SE +/- 1282.46, N = 3)
  -O3 -march=znver1: 1379585.37 (SE +/- 13825.40, N = 3)
  -O2: 1377516.08 (SE +/- 8504.02, N = 3)
  -O0: 890829.00 (SE +/- 12621.48, N = 3)
(CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl

SciMark 2.0: Computational Test: Composite
Mflops, More Is Better
  -O3: 2196.34 (SE +/- 28.83, N = 4)
  -O3 -march=znver1: 2156.12 (SE +/- 4.05, N = 4)
  -Ofast -march=znver1: 2147.94 (SE +/- 5.09, N = 4)
  -O0: 2147.42 (SE +/- 2.89, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2139.67 (SE +/- 5.52, N = 4)
  -O2: 2130.55 (SE +/- 8.24, N = 4)
(CC) gcc options: -lm

SciMark 2.0: Computational Test: Monte Carlo
Mflops, More Is Better
  -O3 -march=znver1: 660.30 (SE +/- 0.05, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 659.88 (SE +/- 0.16, N = 4)
  -Ofast -march=znver1: 659.67 (SE +/- 0.11, N = 4)
  -O0: 643.45 (SE +/- 0.13, N = 4)
  -O3: 643.43 (SE +/- 0.30, N = 4)
  -O2: 642.37 (SE +/- 1.18, N = 4)
(CC) gcc options: -lm

SciMark 2.0: Computational Test: Fast Fourier Transform
Mflops, More Is Better
  -Ofast -march=znver1: 135.39 (SE +/- 0.42, N = 4)
  -O0: 134.67 (SE +/- 0.39, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 134.46 (SE +/- 0.16, N = 4)
  -O3: 134.23 (SE +/- 0.25, N = 4)
  -O2: 134.06 (SE +/- 0.36, N = 4)
  -O3 -march=znver1: 131.50 (SE +/- 3.77, N = 4)
(CC) gcc options: -lm

SciMark 2.0: Computational Test: Sparse Matrix Multiply
Mflops, More Is Better
  -O3: 2667.90 (SE +/- 39.95, N = 4)
  -O0: 2648.16 (SE +/- 7.78, N = 4)
  -O2: 2646.12 (SE +/- 4.64, N = 4)
  -O3 -march=znver1: 2636.32 (SE +/- 10.92, N = 4)
  -Ofast -march=znver1: 2619.99 (SE +/- 5.95, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 2616.74 (SE +/- 6.22, N = 4)
(CC) gcc options: -lm

SciMark 2.0: Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
  -O3: 5859.93 (SE +/- 103.73, N = 4)
  -O3 -march=znver1: 5670.42 (SE +/- 16.49, N = 4)
  -Ofast -march=znver1: 5644.15 (SE +/- 25.89, N = 4)
  -O0: 5637.05 (SE +/- 21.28, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 5605.43 (SE +/- 27.73, N = 4)
  -O2: 5554.97 (SE +/- 40.80, N = 4)
(CC) gcc options: -lm

SciMark 2.0: Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better
  -O3 -march=znver1: 1682.07 (SE +/- 0.29, N = 4)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1681.87 (SE +/- 0.52, N = 4)
  -Ofast -march=znver1: 1680.49 (SE +/- 1.12, N = 4)
  -O3: 1676.24 (SE +/- 0.43, N = 4)
  -O2: 1675.22 (SE +/- 0.50, N = 4)
  -O0: 1673.77 (SE +/- 2.17, N = 4)
(CC) gcc options: -lm

Stockfish 2014-11-26: Total Time
ms, Fewer Is Better
  -Ofast -march=znver1: 3618 (SE +/- 4.98, N = 3)
  -O3 -march=znver1: 3643 (SE +/- 5.00, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 3644 (SE +/- 8.08, N = 3)
  -O3: 3703 (SE +/- 9.53, N = 3)
  -O0: 3711 (SE +/- 7.88, N = 3)
  -O2: 3712 (SE +/- 7.75, N = 3)
(CXX) g++ options: -lpthread -fno-exceptions -fno-rtti -ansi -pedantic -O3 -msse -msse3 -mpopcnt

Timed MAFFT Alignment 6.864: Multiple Sequence Alignment
Seconds, Fewer Is Better
  -O2: 3.63 (SE +/- 0.11, N = 6)
  -O3 -march=znver1: 3.66 (SE +/- 0.08, N = 6)
  -O3: 3.77 (SE +/- 0.02, N = 3)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 3.78 (SE +/- 0.10, N = 6)
  -Ofast -march=znver1: 3.82 (SE +/- 0.09, N = 6)
  -O0: 3.87 (SE +/- 0.06, N = 4)
(CC) gcc options: -O3 -lm -lpthread

TSCP 1.81: AI Chess Performance
Nodes Per Second, More Is Better
  -O3: 1053919 (SE +/- 494.31, N = 5)
  -Ofast -march=znver1: 1029891 (SE +/- 1971.42, N = 5)
  -O3 -march=znver1 -mllvm -enable-strided-vectorization: 1021094 (SE +/- 463.44, N = 5)
  -O3 -march=znver1: 1021094 (SE +/- 463.44, N = 5)
  -O2: 1016572 (SE +/- 701.56, N = 5)
  -O0: 475710 (SE +/- 82.20, N = 5)

WavPack Audio Encoding 5.1: WAV To WavPack
Seconds, Fewer Is Better
  -O3 -march=znver1: 6.43 (SE +/- 0.00, N = 5)
  -Ofast -march=znver1: 6.46 (SE +/- 0.01, N = 5)
  -O3: 6.51 (SE +/- 0.00, N = 5)
  -O2: 6.52 (SE +/- 0.00, N = 5)
  -O0: 7.76 (SE +/- 0.00, N = 5)
(CC) gcc options: -lm

Phoronix Test Suite v10.8.5