GCC 9 compiler tuning benchmarks by Michael Larabel for a future article on Phoronix.com.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:

    phoronix-test-suite benchmark 1902194-SP-AMDEPYCCO19

HTML result view exported from: https://openbenchmarking.org/result/1902194-SP-AMDEPYCCO19&grs&sro

AMD EPYC Compiler Tuning

Configurations: -O0; -Og; -O1; -O2; -O2 -ftree-vectorize -ftree-slp-vectorize; -O2 -march=znver1; -O2 -flto; -O3; -O3 -march=znver1; -O3 -march=znver1 -flto; -Ofast -march=znver1

System details:

  Processor: 2 x AMD EPYC 7601 32-Core (64 Cores / 128 Threads)
  Motherboard: Dell 02MJ3T (1.2.5 BIOS)
  Chipset: AMD Family 17h
  Memory: 16 x 32 GB DDR4-2400MT/s 36ASF4G72PZ-2G6D2
  Disk: 120GB SSDSCKJB120G7R + 20 x 500GB Samsung SSD 860
  Graphics: Matrox G200eW3
  Monitor: VE228
  Network: 2 x Broadcom BCM57416 NetXtreme-E 10GBase-T RDMA + 2 x Broadcom NetXtreme BCM5720 PCIe
  OS: Ubuntu 18.04
  Kernel: 5.0.0-050000rc6-generic (x86_64) 20190210
  Desktop: GNOME Shell 3.28.3
  Display Server: X Server
  Compiler: GCC 9.0.1 20190210
  File-System: ext4
  Screen Resolution: 1600x1200

Environment Details:

  -O0: CXXFLAGS=-O0 CFLAGS=-O0
  -Og: CXXFLAGS=-Og CFLAGS=-Og
  -O1: CXXFLAGS=-O1 CFLAGS=-O1
  -O2: CXXFLAGS=-O2 CFLAGS=-O2
  -O2 -ftree-vectorize -ftree-slp-vectorize: CXXFLAGS="-O2 -ftree-vectorize -ftree-slp-vectorize" CFLAGS="-O2 -ftree-vectorize -ftree-slp-vectorize"
  -O2 -march=znver1: CXXFLAGS="-O2 -march=znver1" CFLAGS="-O2 -march=znver1"
  -O2 -flto: CXXFLAGS="-O2 -flto" CFLAGS="-O2 -flto"
  -O3: CXXFLAGS=-O3 CFLAGS=-O3
  -O3 -march=znver1: CXXFLAGS="-O3 -march=znver1" CFLAGS="-O3 -march=znver1"
  -O3 -march=znver1 -flto: CXXFLAGS="-O3 -march=znver1 -flto" CFLAGS="-O3 -march=znver1 -flto"
  -Ofast -march=znver1: CXXFLAGS="-Ofast -march=znver1" CFLAGS="-Ofast -march=znver1"

Compiler Details: --disable-multilib --enable-checking=release

Security Details: __user pointer sanitization + Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + SSB disabled via prctl and seccomp
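
The environment details above pair each configuration with one flag string used for both CFLAGS and CXXFLAGS. As a minimal sketch of how such a matrix could be prepared, the following builds one environment per configuration. The names `CONFIGURATIONS` and `build_environment` are hypothetical, and the sketch only assumes that the flags are passed as environment variables; it does not show how the original runs were actually launched.

```python
import os

# Flag sets taken from the "Environment Details" list; each string is applied
# to both CFLAGS and CXXFLAGS.
CONFIGURATIONS = [
    "-O0",
    "-Og",
    "-O1",
    "-O2",
    "-O2 -ftree-vectorize -ftree-slp-vectorize",
    "-O2 -march=znver1",
    "-O2 -flto",
    "-O3",
    "-O3 -march=znver1",
    "-O3 -march=znver1 -flto",
    "-Ofast -march=znver1",
]


def build_environment(flags, base=None):
    """Return a copy of the environment with CFLAGS and CXXFLAGS set to flags.

    Hypothetical helper: `base` defaults to the current process environment.
    """
    env = dict(os.environ if base is None else base)
    env["CFLAGS"] = flags
    env["CXXFLAGS"] = flags
    return env


if __name__ == "__main__":
    # Print one shell-style assignment line per configuration.
    for flags in CONFIGURATIONS:
        env = build_environment(flags, base={})
        print(f'CFLAGS="{env["CFLAGS"]}" CXXFLAGS="{env["CXXFLAGS"]}"')
```

The returned dictionary can be handed to `subprocess.run(..., env=env)` for whatever build or benchmark command is being compared.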

AMD EPYC Compiler Tuning: result overview

Tests, in column order:

  encode-flac: WAV To FLAC
  fftw: Float + SSE - 2D FFT Size 4096
  build-php: Time To Compile
  scimark2: Sparse Matrix Multiply
  scimark2: Composite
  c-ray: Total Time - 4K, 16 Rays Per Pixel
  encode-mp3: WAV To MP3
  fftw: Stock - 2D FFT Size 4096
  build-imagemagick: Time To Compile
  himeno: Poisson Pressure Solver
  build-apache: Time To Compile
  graphics-magick: Sharpen
  graphics-magick: Enhanced
  graphics-magick: HWB Color Space
  graphics-magick: Swirl
  graphics-magick: Noise-Gaussian
  scimark2: Jacobi Successive Over-Relaxation
  scimark2: Monte Carlo
  graphics-magick: Rotate
  aobench: 2048 x 2048 - Total Time
  graphics-magick: Resizing
  pgbench: Buffer Test - Single Thread - Read Only
  hmmer: Pfam Database Search
  x264: H.264 Video Encoding
  pgbench: Buffer Test - Normal Load - Read Write
  tjbench: Decompression Throughput
  pgbench: Buffer Test - Single Thread - Read Write
  scimark2: Fast Fourier Transform
  pgbench: Buffer Test - Normal Load - Read Only
  john-the-ripper: Traditional DES
  bullet: 1000 Stack
  hint: DOUBLE
  bullet: 136 Ragdolls
  svt-vp9: 1080p 8-bit YUV To VP9 Video Encode
  bullet: 1000 Convex
  vpxenc: vpxenc VP9 1080p Video Encode
  svt-av1: 1080p 8-bit YUV To AV1 Video Encode
  vpxenc: vpxenc VP9 1080p Video Encode
  x265: H.265 1080p Video Encoding
  bullet: 3000 Fall
  bullet: Raytests
  stockfish: Total Time
  bullet: Convex Trimesh
  bullet: Prim Trimesh
  svt-av1: 1080p 8-bit YUV To AV1 Video Encode
  hint: FLOAT
  tscp: AI Chess Performance
  ctx-clock: Context Switch Time
  compress-zstd: Compressing ubuntu-16.04.3-server-i386.img, Compression Level 19
  john-the-ripper: Blowfish
  scimark2: Dense LU Matrix Factorization

Results per configuration, in the test order above. A configuration with no result for a test has no entry for it, so positions do not always line up with the test list.

  -O0: 96.77 2193 15.19 516 434 44.92 41.79 1708 5.23 383 11.43 82 90 102 96 92 832 108 98 92.50 74 9063 9.02 102 4585 111 886 201 419700 218232000 6.00 598545342 3.14 5.36 1.69 12.50 35.00 5.14 3.11 105868175 1.35 1.11 5.88 267404445 865459 23.12 15179 512

  -Og: 15.58 12642 21.42 2188 1205 28.64 16.78 4366 7.89 772 14.59 156 173 195 181 168 919 210 181 77.41 120 13333 7.39 142 3767 141 1080 257 507203 239289333 6.02 597234266 3.14 92.68 5.37 20.39 1.73 12.52 34.76 5.15 3.11 105709690 1.35 1.11 5.86 267368671 865187 132 14.39 56453 2539

  -O1: 15.01 13468 29.05 2411 1519 28.74 14.32 4632 18.42 785 17.51 180 187 210 194 179 919 576 191 56.61 126 13303 6.93 145 4301 139 1065 226 515102 257067200 6.00 585060029 3.15 5.37 1.70 12.53 35.62 5.16 3.12 105698092 1.35 1.11 5.87 268455578 864102 14.11 65995 3466

  -O2: 13.65 13391 52.17 2527 1369 25.84 14.07 4625 23.63 1017 23.82 181 189 211 195 180 919 560 191 55.54 131 14931 6.62 144 4167 140 1037 230 515340 257407667 6.01 599481605 3.14 5.37 1.69 12.54 34.55 5.14 3.12 104480422 1.35 1.11 5.81 267311970 864373 14.48 62718 2609

  -O2 -ftree-vectorize -ftree-slp-vectorize: 13.70 13285 52.58 2515 1724 25.77 10.96 4805 23.91 1007 24.03 180 188 212 196 178 919 560 190 55.53 128 15353 6.82 144 4239 139 1125 231 529699 257058000 5.98 602535297 3.15 94.82 5.37 20.34 1.67 12.56 35.41 5.14 3.11 104197865 1.35 1.11 5.89 267172145 864916 132 13.67 63586 4396

  -O2 -march=znver1: 13.89 13346 51.96 2584 1501 21.58 14.00 5074 23.78 1001 23.82 183 191 211 196 180 1016 557 191 54.35 127 15111 6.54 144 4272 142 1060 229 510425 255957000 5.80 617516626 3.05 95.91 5.19 20.05 1.70 12.42 34.80 5.08 3.10 106084276 1.32 1.11 5.84 267268023 864915 14.71 61309 3231

  -O2 -flto: 13.64 13214 2299 1307 25.96 14.14 5091 98.67 1022 26.50 183 190 214 196 180 918 568 191 55.52 128 14851 6.56 4095 140 1127 232 520570 260736667 6.32 626640400 3.24 95.79 5.40 1.69 35.07 5.21 3.05 104536605 1.33 1.09 5.90 268173400 864101 132 14.08 65117 2515

  -O3: 13.61 13555 78.19 2475 1800 12.60 10.84 4751 25.06 1008 26.08 174 181 203 189 172 1427 560 183 53.53 118 15099 6.57 147 4262 141 1079 232 490551 253868583 6.05 595428047 3.15 5.38 1.73 12.31 35.21 5.16 3.11 104121840 1.35 1.11 5.90 267315647 864915 13.66 65806 4307

  -O3 -march=znver1: 13.85 12752 78.13 2482 1961 11.35 10.57 5006 24.88 1011 25.94 183 191 210 195 180 1689 557 190 51.49 127 15188 6.29 144 5068 144 1145 227 505031 260019667 5.80 589289926 3.06 5.18 1.68 12.41 35.57 5.07 3.09 106497994 1.32 1.11 5.89 268506472 865732 14.37 66823 4851

  -O3 -march=znver1 -flto: 14.21 13110 2052 1747 11.31 10.38 5571 118.48 1000 28.62 183 186 209 194 178 1675 1480 188 52.08 125 16012 6.16 4319 144 1074 230 454256 254777333 618644101 97.26 20.86 1.71 12.75 5.84 267239405 863018 132 13.16 58764 3300

  -Ofast -march=znver1: 13.95 13166 2579 1825 10.40 9.80 4885 25.21 1022 26.11 182 193 209 196 187 1676 561 189 51.73 124 15352 6.00 144 4102 144 1125 221 508384 258770667 5.80 605331833 3.06 97.80 5.18 20.13 1.70 12.37 34.91 5.09 3.09 106507244 1.32 1.11 5.91 267055407 864373 13.77 62841 4089

FLAC Audio Encoding 1.3.2
WAV To FLAC
Seconds, Fewer Is Better

  -O0: 96.77 (SE +/- 0.12, N = 5)
  -O1: 15.01 (SE +/- 0.09, N = 5)
  -O2: 13.65 (SE +/- 0.08, N = 5)
  -O2 -flto: 13.64 (SE +/- 0.11, N = 5)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 13.70 (SE +/- 0.14, N = 5)
  -O2 -march=znver1: 13.89 (SE +/- 0.09, N = 5)
  -O3: 13.61 (SE +/- 0.12, N = 5)
  -O3 -march=znver1: 13.85 (SE +/- 0.10, N = 5)
  -O3 -march=znver1 -flto: 14.21 (SE +/- 0.10, N = 5)
  -Ofast -march=znver1: 13.95 (SE +/- 0.08, N = 5)
  -Og: 15.58 (SE +/- 0.11, N = 5)

1. (CXX) g++ options: -fvisibility=hidden -logg -lm
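
To put results like these on a common scale, each configuration can be expressed as a speedup over the -O0 build. The sketch below does that arithmetic for the FLAC encode times listed above; the function name `speedup_over_baseline` and its `lower_is_better` switch are illustrative choices, not part of the Phoronix Test Suite.

```python
# FLAC "WAV To FLAC" encode times in seconds, copied from the result above.
flac_seconds = {
    "-O0": 96.77,
    "-O1": 15.01,
    "-O2": 13.65,
    "-O2 -flto": 13.64,
    "-O2 -ftree-vectorize -ftree-slp-vectorize": 13.70,
    "-O2 -march=znver1": 13.89,
    "-O3": 13.61,
    "-O3 -march=znver1": 13.85,
    "-O3 -march=znver1 -flto": 14.21,
    "-Ofast -march=znver1": 13.95,
    "-Og": 15.58,
}


def speedup_over_baseline(results, baseline="-O0", lower_is_better=True):
    """Return each result as a ratio against the baseline (above 1.0 is faster).

    For "Fewer Is Better" units the ratio is baseline / value; for
    "More Is Better" units it is value / baseline.
    """
    base = results[baseline]
    if lower_is_better:
        return {name: base / value for name, value in results.items()}
    return {name: value / base for name, value in results.items()}


if __name__ == "__main__":
    speedups = speedup_over_baseline(flac_seconds)
    # Print configurations from largest to smallest speedup.
    for name, ratio in sorted(speedups.items(), key=lambda item: -item[1]):
        print(f"{name}: {ratio:.2f}x")
```

For throughput-style tests such as FFTW (Mflops, More Is Better), the same helper applies with `lower_is_better=False`.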

FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 4096
Mflops, More Is Better

  -O0: 2193 (SE +/- 1.00, N = 3)
  -O1: 13468 (SE +/- 70.29, N = 3)
  -O2: 13391 (SE +/- 160.95, N = 3)
  -O2 -flto: 13214 (SE +/- 134.39, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 13285 (SE +/- 72.78, N = 3)
  -O2 -march=znver1: 13346 (SE +/- 15.71, N = 3)
  -O3: 13555 (SE +/- 115.66, N = 3)
  -O3 -march=znver1: 12752 (SE +/- 78.62, N = 3)
  -O3 -march=znver1 -flto: 13110 (SE +/- 49.65, N = 3)
  -Ofast -march=znver1: 13166 (SE +/- 160.71, N = 8)
  -Og: 12642 (SE +/- 165.58, N = 3)

1. (CC) gcc options: -pthread -lm

Timed PHP Compilation 7.1.9
Time To Compile
Seconds, Fewer Is Better

  -O0: 15.19 (SE +/- 0.02, N = 3)
  -O1: 29.05 (SE +/- 0.06, N = 3)
  -O2: 52.17 (SE +/- 0.18, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 52.58 (SE +/- 0.26, N = 3)
  -O2 -march=znver1: 51.96 (SE +/- 0.25, N = 3)
  -O3: 78.19 (SE +/- 0.22, N = 3)
  -O3 -march=znver1: 78.13 (SE +/- 0.33, N = 3)
  -Og: 21.42 (SE +/- 0.10, N = 3)

1. (CC) gcc options: -pedantic -ldl -lz -lm

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops, More Is Better

  -O0: 516 (SE +/- 5.25, N = 3)
  -O1: 2411 (SE +/- 59.22, N = 3)
  -O2: 2527 (SE +/- 11.61, N = 3)
  -O2 -flto: 2299 (SE +/- 5.41, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 2515 (SE +/- 12.19, N = 3)
  -O2 -march=znver1: 2584 (SE +/- 14.53, N = 3)
  -O3: 2475 (SE +/- 3.37, N = 3)
  -O3 -march=znver1: 2482 (SE +/- 10.26, N = 3)
  -O3 -march=znver1 -flto: 2052 (SE +/- 2.13, N = 3)
  -Ofast -march=znver1: 2579 (SE +/- 12.59, N = 3)
  -Og: 2188 (SE +/- 3.66, N = 3)

1. (CC) gcc options: -lm

SciMark 2.0
Computational Test: Composite
Mflops, More Is Better

  -O0: 434 (SE +/- 5.30, N = 3)
  -O1: 1519 (SE +/- 6.18, N = 3)
  -O2: 1369 (SE +/- 12.17, N = 3)
  -O2 -flto: 1307 (SE +/- 24.31, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 1724 (SE +/- 8.94, N = 3)
  -O2 -march=znver1: 1501 (SE +/- 23.45, N = 5)
  -O3: 1800 (SE +/- 7.96, N = 3)
  -O3 -march=znver1: 1961 (SE +/- 11.89, N = 3)
  -O3 -march=znver1 -flto: 1747 (SE +/- 35.09, N = 3)
  -Ofast -march=znver1: 1825 (SE +/- 20.59, N = 3)
  -Og: 1205 (SE +/- 18.65, N = 5)

1. (CC) gcc options: -lm

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds, Fewer Is Better

  -O0: 44.92 (SE +/- 0.09, N = 3)
  -O1: 28.74 (SE +/- 0.02, N = 3)
  -O2: 25.84 (SE +/- 0.00, N = 3)
  -O2 -flto: 25.96 (SE +/- 0.02, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 25.77 (SE +/- 0.06, N = 3)
  -O2 -march=znver1: 21.58 (SE +/- 0.04, N = 3)
  -O3: 12.60 (SE +/- 0.04, N = 3)
  -O3 -march=znver1: 11.35 (SE +/- 0.04, N = 3)
  -O3 -march=znver1 -flto: 11.31 (SE +/- 0.03, N = 3)
  -Ofast -march=znver1: 10.40 (SE +/- 0.01, N = 3)
  -Og: 28.64 (SE +/- 0.04, N = 3)

1. (CC) gcc options: -lm -lpthread -O3

LAME MP3 Encoding 3.100
WAV To MP3
Seconds, Fewer Is Better

  -O0: 41.79 (SE +/- 0.00, N = 3)
  -O1: 14.32 (SE +/- 0.00, N = 3)
  -O2: 14.07 (SE +/- 0.01, N = 3)
  -O2 -flto: 14.14 (SE +/- 0.01, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 10.96 (SE +/- 0.01, N = 3)
  -O2 -march=znver1: 14.00 (SE +/- 0.01, N = 3)
  -O3: 10.84 (SE +/- 0.00, N = 3)
  -O3 -march=znver1: 10.57 (SE +/- 0.00, N = 3)
  -O3 -march=znver1 -flto: 10.38 (SE +/- 0.00, N = 3)
  -Ofast -march=znver1: 9.80 (SE +/- 0.00, N = 3)
  -Og: 16.78 (SE +/- 0.01, N = 3)

1. (CC) gcc options: -lm

FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 4096
Mflops, More Is Better

  -O0: 1708 (SE +/- 5.17, N = 3)
  -O1: 4632 (SE +/- 6.37, N = 3)
  -O2: 4625 (SE +/- 2.28, N = 3)
  -O2 -flto: 5091 (SE +/- 5.87, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 4805 (SE +/- 26.82, N = 3)
  -O2 -march=znver1: 5074 (SE +/- 13.33, N = 3)
  -O3: 4751 (SE +/- 41.07, N = 3)
  -O3 -march=znver1: 5006 (SE +/- 47.88, N = 3)
  -O3 -march=znver1 -flto: 5571 (SE +/- 26.34, N = 3)
  -Ofast -march=znver1: 4885 (SE +/- 10.24, N = 3)
  -Og: 4366 (SE +/- 11.16, N = 3)

1. (CC) gcc options: -pthread -lm

Timed ImageMagick Compilation 6.9.0
Time To Compile
Seconds, Fewer Is Better

  -O0: 5.23 (SE +/- 0.10, N = 3)
  -O1: 18.42 (SE +/- 0.21, N = 8)
  -O2: 23.63 (SE +/- 0.10, N = 3)
  -O2 -flto: 98.67 (SE +/- 0.98, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 23.91 (SE +/- 0.20, N = 3)
  -O2 -march=znver1: 23.78 (SE +/- 0.45, N = 3)
  -O3: 25.06 (SE +/- 0.44, N = 3)
  -O3 -march=znver1: 24.88 (SE +/- 0.30, N = 3)
  -O3 -march=znver1 -flto: 118.48 (SE +/- 0.45, N = 3)
  -Ofast -march=znver1: 25.21 (SE +/- 0.34, N = 3)
  -Og: 7.89 (SE +/- 0.04, N = 3)

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS, More Is Better

  -O0: 383 (SE +/- 0.11, N = 3)
  -O1: 785 (SE +/- 5.78, N = 3)
  -O2: 1017 (SE +/- 5.25, N = 3)
  -O2 -flto: 1022 (SE +/- 2.58, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 1007 (SE +/- 7.21, N = 3)
  -O2 -march=znver1: 1001 (SE +/- 2.90, N = 3)
  -O3: 1008 (SE +/- 6.27, N = 3)
  -O3 -march=znver1: 1011 (SE +/- 0.08, N = 3)
  -O3 -march=znver1 -flto: 1000 (SE +/- 2.81, N = 3)
  -Ofast -march=znver1: 1022 (SE +/- 8.54, N = 3)
  -Og: 772 (SE +/- 4.02, N = 3)

1. (CC) gcc options: -O3 -mavx2

Timed Apache Compilation 2.4.7
Time To Compile
Seconds, Fewer Is Better

  -O0: 11.43 (SE +/- 0.02, N = 3)
  -O1: 17.51 (SE +/- 0.06, N = 3)
  -O2: 23.82 (SE +/- 0.09, N = 3)
  -O2 -flto: 26.50 (SE +/- 0.03, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 24.03 (SE +/- 0.01, N = 3)
  -O2 -march=znver1: 23.82 (SE +/- 0.04, N = 3)
  -O3: 26.08 (SE +/- 0.04, N = 3)
  -O3 -march=znver1: 25.94 (SE +/- 0.08, N = 3)
  -O3 -march=znver1 -flto: 28.62 (SE +/- 0.03, N = 3)
  -Ofast -march=znver1: 26.11 (SE +/- 0.08, N = 3)
  -Og: 14.59 (SE +/- 0.03, N = 3)

GraphicsMagick 1.3.30
Operation: Sharpen
Iterations Per Minute, More Is Better

  -O0: 82
  -O1: 180
  -O2: 181
  -O2 -flto: 183
  -O2 -ftree-vectorize -ftree-slp-vectorize: 180
  -O2 -march=znver1: 183
  -O3: 174
  -O3 -march=znver1: 183
  -O3 -march=znver1 -flto: 183
  -Ofast -march=znver1: 182
  -Og: 156

Standard errors as reported (fewer than one per configuration, so not matched to rows): SE +/- 0.58, N = 3; SE +/- 0.33, N = 3; SE +/- 0.33, N = 3; SE +/- 0.58, N = 3

1. (CC) gcc options: -fopenmp -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -ldl -lpthread

GraphicsMagick 1.3.30
Operation: Enhanced
Iterations Per Minute, More Is Better

  -O0: 90
  -O1: 187
  -O2: 189
  -O2 -flto: 190
  -O2 -ftree-vectorize -ftree-slp-vectorize: 188
  -O2 -march=znver1: 191
  -O3: 181
  -O3 -march=znver1: 191
  -O3 -march=znver1 -flto: 186
  -Ofast -march=znver1: 193
  -Og: 173

Standard errors as reported (fewer than one per configuration, so not matched to rows): SE +/- 0.33, N = 3; SE +/- 0.33, N = 3; SE +/- 0.33, N = 3; SE +/- 0.33, N = 3; SE +/- 3.18, N = 3

1. (CC) gcc options: -fopenmp -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -ldl -lpthread

GraphicsMagick 1.3.30
Operation: HWB Color Space
Iterations Per Minute, More Is Better

  -O0: 102
  -O1: 210
  -O2: 211
  -O2 -flto: 214
  -O2 -ftree-vectorize -ftree-slp-vectorize: 212
  -O2 -march=znver1: 211
  -O3: 203
  -O3 -march=znver1: 210
  -O3 -march=znver1 -flto: 209
  -Ofast -march=znver1: 209
  -Og: 195

Standard errors as reported (fewer than one per configuration, so not matched to rows): SE +/- 0.33, N = 3; SE +/- 0.33, N = 3; SE +/- 0.58, N = 3; SE +/- 0.58, N = 3; SE +/- 0.33, N = 3; SE +/- 0.33, N = 3; SE +/- 0.88, N = 3; SE +/- 0.33, N = 3

1. (CC) gcc options: -fopenmp -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -ldl -lpthread

GraphicsMagick 1.3.30
Operation: Swirl
Iterations Per Minute, More Is Better

  -O0: 96
  -O1: 194
  -O2: 195
  -O2 -flto: 196
  -O2 -ftree-vectorize -ftree-slp-vectorize: 196
  -O2 -march=znver1: 196
  -O3: 189
  -O3 -march=znver1: 195
  -O3 -march=znver1 -flto: 194
  -Ofast -march=znver1: 196
  -Og: 181

Standard errors as reported (fewer than one per configuration, so not matched to rows): SE +/- 0.33, N = 3; SE +/- 0.33, N = 3; SE +/- 0.33, N = 3; SE +/- 0.33, N = 3; SE +/- 0.67, N = 3

1. (CC) gcc options: -fopenmp -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -ldl -lpthread

GraphicsMagick 1.3.30
Operation: Noise-Gaussian
Iterations Per Minute, More Is Better

  -O0: 92
  -O1: 179
  -O2: 180
  -O2 -flto: 180
  -O2 -ftree-vectorize -ftree-slp-vectorize: 178
  -O2 -march=znver1: 180
  -O3: 172
  -O3 -march=znver1: 180
  -O3 -march=znver1 -flto: 178
  -Ofast -march=znver1: 187
  -Og: 168

Standard errors as reported (fewer than one per configuration, so not matched to rows): SE +/- 1.20, N = 3; SE +/- 0.88, N = 3; SE +/- 0.88, N = 3; SE +/- 0.58, N = 3; SE +/- 0.58, N = 3; SE +/- 0.58, N = 3

1. (CC) gcc options: -fopenmp -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -ldl -lpthread

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better

  -O0: 832 (SE +/- 0.35, N = 3)
  -O1: 919 (SE +/- 0.03, N = 3)
  -O2: 919 (SE +/- 0.06, N = 3)
  -O2 -flto: 918 (SE +/- 0.08, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 919 (SE +/- 0.06, N = 3)
  -O2 -march=znver1: 1016 (SE +/- 0.06, N = 3)
  -O3: 1427 (SE +/- 0.15, N = 3)
  -O3 -march=znver1: 1689 (SE +/- 0.21, N = 3)
  -O3 -march=znver1 -flto: 1675 (SE +/- 0.10, N = 3)
  -Ofast -march=znver1: 1676 (SE +/- 0.12, N = 3)
  -Og: 919 (SE +/- 0.31, N = 3)

1. (CC) gcc options: -lm

SciMark 2.0
Computational Test: Monte Carlo
Mflops, More Is Better

  -O0: 108 (SE +/- 0.03, N = 3)
  -O1: 576 (SE +/- 0.27, N = 3)
  -O2: 560 (SE +/- 0.03, N = 3)
  -O2 -flto: 568 (SE +/- 0.09, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 560 (SE +/- 0.22, N = 3)
  -O2 -march=znver1: 557 (SE +/- 0.02, N = 3)
  -O3: 560 (SE +/- 0.11, N = 3)
  -O3 -march=znver1: 557 (SE +/- 0.03, N = 3)
  -O3 -march=znver1 -flto: 1480 (SE +/- 0.33, N = 3)
  -Ofast -march=znver1: 561 (SE +/- 0.06, N = 3)
  -Og: 210 (SE +/- 0.28, N = 3)

1. (CC) gcc options: -lm

GraphicsMagick 1.3.30
Operation: Rotate
Iterations Per Minute, More Is Better

  -O0: 98
  -O1: 191
  -O2: 191
  -O2 -flto: 191
  -O2 -ftree-vectorize -ftree-slp-vectorize: 190
  -O2 -march=znver1: 191
  -O3: 183
  -O3 -march=znver1: 190
  -O3 -march=znver1 -flto: 188
  -Ofast -march=znver1: 189
  -Og: 181

Standard errors as reported (fewer than one per configuration, so not matched to rows): SE +/- 0.33, N = 3

1. (CC) gcc options: -fopenmp -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -ldl -lpthread

AOBench
Size: 2048 x 2048 - Total Time
Seconds, Fewer Is Better

  -O0: 92.50 (SE +/- 0.06, N = 3)
  -O1: 56.61 (SE +/- 0.02, N = 3)
  -O2: 55.54 (SE +/- 0.03, N = 3)
  -O2 -flto: 55.52 (SE +/- 0.02, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 55.53 (SE +/- 0.01, N = 3)
  -O2 -march=znver1: 54.35 (SE +/- 0.01, N = 3)
  -O3: 53.53 (SE +/- 0.01, N = 3)
  -O3 -march=znver1: 51.49 (SE +/- 0.02, N = 3)
  -O3 -march=znver1 -flto: 52.08 (SE +/- 0.02, N = 3)
  -Ofast -march=znver1: 51.73 (SE +/- 0.01, N = 3)
  -Og: 77.41 (SE +/- 0.01, N = 3)

1. (CC) gcc options: -lm -O3

GraphicsMagick 1.3.30
Operation: Resizing
Iterations Per Minute, More Is Better

  -O0: 74
  -O1: 126
  -O2: 131
  -O2 -flto: 128
  -O2 -ftree-vectorize -ftree-slp-vectorize: 128
  -O2 -march=znver1: 127
  -O3: 118
  -O3 -march=znver1: 127
  -O3 -march=znver1 -flto: 125
  -Ofast -march=znver1: 124
  -Og: 120

Standard errors as reported (fewer than one per configuration, so not matched to rows): SE +/- 1.43, N = 12; SE +/- 2.52, N = 3; SE +/- 1.50, N = 12; SE +/- 1.53, N = 3; SE +/- 1.20, N = 3; SE +/- 1.94, N = 5; SE +/- 1.50, N = 8; SE +/- 1.32, N = 10; SE +/- 1.40, N = 8

1. (CC) gcc options: -fopenmp -pthread -ljbig -lwebp -lwebpmux -ltiff -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -ldl -lpthread

PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Single Thread - Mode: Read Only
TPS, More Is Better

  -O0: 9063 (SE +/- 69.77, N = 3)
  -O1: 13303 (SE +/- 119.98, N = 3)
  -O2: 14931 (SE +/- 48.94, N = 3)
  -O2 -flto: 14851 (SE +/- 172.87, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 15353 (SE +/- 295.94, N = 3)
  -O2 -march=znver1: 15111 (SE +/- 122.77, N = 3)
  -O3: 15099 (SE +/- 32.93, N = 3)
  -O3 -march=znver1: 15188 (SE +/- 101.16, N = 3)
  -O3 -march=znver1 -flto: 16012 (SE +/- 224.15, N = 6)
  -Ofast -march=znver1: 15352 (SE +/- 125.79, N = 3)
  -Og: 13333 (SE +/- 149.12, N = 3)

1. (CC) gcc options: -fno-strict-aliasing -fwrapv -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm

Timed HMMer Search 2.3.2
Pfam Database Search
Seconds, Fewer Is Better

  -O0: 9.02 (SE +/- 0.14, N = 3)
  -O1: 6.93 (SE +/- 0.06, N = 3)
  -O2: 6.62 (SE +/- 0.01, N = 3)
  -O2 -flto: 6.56 (SE +/- 0.05, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 6.82 (SE +/- 0.04, N = 3)
  -O2 -march=znver1: 6.54 (SE +/- 0.05, N = 3)
  -O3: 6.57 (SE +/- 0.02, N = 3)
  -O3 -march=znver1: 6.29 (SE +/- 0.07, N = 3)
  -O3 -march=znver1 -flto: 6.16 (SE +/- 0.04, N = 3)
  -Ofast -march=znver1: 6.00 (SE +/- 0.04, N = 3)
  -Og: 7.39 (SE +/- 0.09, N = 3)

1. (CC) gcc options: -pthread -lhmmer -lsquid -lm

x264 2018-09-25
H.264 Video Encoding
Frames Per Second, More Is Better

  -O0: 102 (SE +/- 0.12, N = 3)
  -O1: 145 (SE +/- 1.49, N = 3)
  -O2: 144 (SE +/- 0.97, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 144 (SE +/- 0.47, N = 3)
  -O2 -march=znver1: 144 (SE +/- 1.40, N = 3)
  -O3: 147 (SE +/- 0.81, N = 3)
  -O3 -march=znver1: 144 (SE +/- 1.78, N = 3)
  -Ofast -march=znver1: 144 (SE +/- 0.52, N = 3)
  -Og: 142 (SE +/- 1.09, N = 3)

1. (CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize

PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Write
TPS, More Is Better

  -O0: 4585 (SE +/- 29.67, N = 3)
  -O1: 4301 (SE +/- 71.59, N = 9)
  -O2: 4167 (SE +/- 49.50, N = 9)
  -O2 -flto: 4095 (SE +/- 26.99, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 4239 (SE +/- 55.61, N = 6)
  -O2 -march=znver1: 4272 (SE +/- 12.79, N = 3)
  -O3: 4262 (SE +/- 29.58, N = 3)
  -O3 -march=znver1: 5068 (SE +/- 50.49, N = 3)
  -O3 -march=znver1 -flto: 4319 (SE +/- 66.20, N = 5)
  -Ofast -march=znver1: 4102 (SE +/- 66.04, N = 9)
  -Og: 3767 (SE +/- 47.02, N = 3)

1. (CC) gcc options: -fno-strict-aliasing -fwrapv -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm

libjpeg-turbo tjbench 1.5.3
Test: Decompression Throughput
Megapixels/sec, More Is Better

  -O0: 111 (SE +/- 0.76, N = 3)
  -O1: 139 (SE +/- 0.03, N = 3)
  -O2: 140 (SE +/- 0.67, N = 3)
  -O2 -flto: 140 (SE +/- 0.03, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 139 (SE +/- 0.06, N = 3)
  -O2 -march=znver1: 142 (SE +/- 0.91, N = 3)
  -O3: 141 (SE +/- 0.03, N = 3)
  -O3 -march=znver1: 144 (SE +/- 0.06, N = 3)
  -O3 -march=znver1 -flto: 144 (SE +/- 0.10, N = 3)
  -Ofast -march=znver1: 144 (SE +/- 0.07, N = 3)
  -Og: 141 (SE +/- 0.04, N = 3)

1. (CC) gcc options: -lm

PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Single Thread - Mode: Read Write
TPS, More Is Better

  -O0: 886 (SE +/- 6.16, N = 3)
  -O1: 1065 (SE +/- 8.54, N = 3)
  -O2: 1037 (SE +/- 17.89, N = 3)
  -O2 -flto: 1127 (SE +/- 3.41, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 1125 (SE +/- 1.22, N = 3)
  -O2 -march=znver1: 1060 (SE +/- 12.77, N = 9)
  -O3: 1079 (SE +/- 12.77, N = 8)
  -O3 -march=znver1: 1145 (SE +/- 19.20, N = 3)
  -O3 -march=znver1 -flto: 1074 (SE +/- 9.80, N = 3)
  -Ofast -march=znver1: 1125 (SE +/- 19.14, N = 3)
  -Og: 1080 (SE +/- 4.15, N = 3)

1. (CC) gcc options: -fno-strict-aliasing -fwrapv -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops, More Is Better

  -O0: 201 (SE +/- 1.00, N = 3)
  -O1: 226 (SE +/- 0.44, N = 3)
  -O2: 230 (SE +/- 0.61, N = 3)
  -O2 -flto: 232 (SE +/- 2.57, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 231 (SE +/- 0.71, N = 3)
  -O2 -march=znver1: 229 (SE +/- 1.88, N = 3)
  -O3: 232 (SE +/- 0.51, N = 3)
  -O3 -march=znver1: 227 (SE +/- 0.03, N = 3)
  -O3 -march=znver1 -flto: 230 (SE +/- 1.03, N = 3)
  -Ofast -march=znver1: 221 (SE +/- 1.04, N = 3)
  -Og: 257 (SE +/- 2.03, N = 3)

1. (CC) gcc options: -lm

PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Only
TPS, More Is Better

  -O0: 419700 (SE +/- 4794.85, N = 3)
  -O1: 515102 (SE +/- 6061.06, N = 3)
  -O2: 515340 (SE +/- 2768.55, N = 3)
  -O2 -flto: 520570 (SE +/- 4765.38, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 529699 (SE +/- 3875.50, N = 3)
  -O2 -march=znver1: 510425 (SE +/- 5952.34, N = 3)
  -O3: 490551 (SE +/- 8068.95, N = 9)
  -O3 -march=znver1: 505031 (SE +/- 3819.62, N = 3)
  -O3 -march=znver1 -flto: 454256 (SE +/- 7546.41, N = 4)
  -Ofast -march=znver1: 508384 (SE +/- 1629.04, N = 3)
  -Og: 507203 (SE +/- 3395.99, N = 3)

1. (CC) gcc options: -fno-strict-aliasing -fwrapv -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm

John The Ripper 1.8.0-jumbo-1
Test: Traditional DES
Real C/S, More Is Better

  -O0: 218232000 (SE +/- 2756947.41, N = 3)
  -O1: 257067200 (SE +/- 2774527.11, N = 10)
  -O2: 257407667 (SE +/- 2178423.16, N = 3)
  -O2 -flto: 260736667 (SE +/- 642920.77, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 257058000 (SE +/- 2839112.24, N = 3)
  -O2 -march=znver1: 255957000 (SE +/- 2041895.52, N = 3)
  -O3: 253868583 (SE +/- 3859011.69, N = 12)
  -O3 -march=znver1: 260019667 (SE +/- 2346357.84, N = 3)
  -O3 -march=znver1 -flto: 254777333 (SE +/- 1374420.40, N = 3)
  -Ofast -march=znver1: 258770667 (SE +/- 1656338.97, N = 3)
  -Og: 239289333 (SE +/- 2445677.03, N = 3)

1. (CC) gcc options: -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt

Bullet Physics Engine 2.81
Test: 1000 Stack
Seconds, Fewer Is Better

  -O0: 6.00 (SE +/- 0.01, N = 3)
  -O1: 6.00 (SE +/- 0.01, N = 3)
  -O2: 6.01 (SE +/- 0.00, N = 3)
  -O2 -flto: 6.32 (SE +/- 0.02, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 5.98 (SE +/- 0.01, N = 3)
  -O2 -march=znver1: 5.80 (SE +/- 0.01, N = 3)
  -O3: 6.05 (SE +/- 0.04, N = 3)
  -O3 -march=znver1: 5.80 (SE +/- 0.01, N = 3)
  -Ofast -march=znver1: 5.80 (SE +/- 0.01, N = 3)
  -Og: 6.02 (SE +/- 0.01, N = 3)

1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU
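
Several of the Bullet results sit within a few hundredths of a second of each other, so the reported standard errors matter when reading them. The sketch below applies a rough rule of thumb, treating two means as different only when their gap exceeds twice the combined standard error. The threshold of 2 and the helper name `differs_beyond_noise` are assumptions for illustration, and the standard errors are paired with configurations in the order they are listed, which is also an assumption.

```python
import math

# Bullet "1000 Stack" results in seconds as (mean, standard error).
# Pairing each SE with a configuration by position is an assumption.
stack_seconds = {
    "-O0": (6.00, 0.01),
    "-O2": (6.01, 0.00),
    "-O2 -flto": (6.32, 0.02),
    "-O2 -march=znver1": (5.80, 0.01),
}


def differs_beyond_noise(a, b, k=2.0):
    """True when the gap between two means exceeds k combined standard errors.

    The combined standard error of a difference of independent means is
    sqrt(se_a**2 + se_b**2); k = 2 is a rule-of-thumb threshold, not a
    formal significance test.
    """
    (mean_a, se_a), (mean_b, se_b) = a, b
    combined_se = math.hypot(se_a, se_b)
    return abs(mean_a - mean_b) > k * combined_se


if __name__ == "__main__":
    baseline = stack_seconds["-O0"]
    for name, result in stack_seconds.items():
        if name != "-O0":
            print(name, differs_beyond_noise(baseline, result))
```

With only three runs per configuration, a check like this is best read as a sanity filter on which gaps deserve attention rather than as proof of a real difference.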

Hierarchical INTegration 1.0
Test: DOUBLE
QUIPs, More Is Better

  -O0: 598545342 (SE +/- 8177784.85, N = 3)
  -O1: 585060029 (SE +/- 1617493.63, N = 3)
  -O2: 599481605 (SE +/- 6546514.72, N = 3)
  -O2 -flto: 626640400 (SE +/- 10585229.89, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 602535297 (SE +/- 2338894.48, N = 3)
  -O2 -march=znver1: 617516626 (SE +/- 9814749.06, N = 4)
  -O3: 595428047 (SE +/- 1832535.45, N = 3)
  -O3 -march=znver1: 589289926 (SE +/- 7042504.19, N = 3)
  -O3 -march=znver1 -flto: 618644101 (SE +/- 7234705.24, N = 9)
  -Ofast -march=znver1: 605331833 (SE +/- 7419115.18, N = 3)
  -Og: 597234266 (SE +/- 9099115.17, N = 9)

1. (CC) gcc options: -O3 -march=native -lm

Bullet Physics Engine 2.81
Test: 136 Ragdolls
Seconds, Fewer Is Better

  -O0: 3.14 (SE +/- 0.00, N = 3)
  -O1: 3.15 (SE +/- 0.01, N = 3)
  -O2: 3.14 (SE +/- 0.00, N = 3)
  -O2 -flto: 3.24 (SE +/- 0.00, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 3.15 (SE +/- 0.00, N = 3)
  -O2 -march=znver1: 3.05 (SE +/- 0.00, N = 3)
  -O3: 3.15 (SE +/- 0.01, N = 3)
  -O3 -march=znver1: 3.06 (SE +/- 0.00, N = 3)
  -Ofast -march=znver1: 3.06 (SE +/- 0.00, N = 3)
  -Og: 3.14 (SE +/- 0.00, N = 3)

1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

SVT-VP9 2019-02-17
1080p 8-bit YUV To VP9 Video Encode
Frames Per Second, More Is Better

  -O2 -flto: 95.79 (SE +/- 0.75, N = 3)
  -O2 -ftree-vectorize -ftree-slp-vectorize: 94.82 (SE +/- 1.09, N = 3)
  -O2 -march=znver1: 95.91 (SE +/- 0.18, N = 3)
  -O3 -march=znver1 -flto: 97.26 (SE +/- 0.50, N = 3)
  -Ofast -march=znver1: 97.80 (SE +/- 0.30, N = 3)
  -Og: 92.68 (SE +/- 1.17, N = 3)

1. (CC) gcc options: -O2 -flto -fPIE -fPIC -fvisibility=hidden -mavx -pie -rdynamic -lpthread -lrt -lm
Bullet Physics Engine 2.81 Test: 1000 Convex
Seconds, Fewer Is Better
-O0: 5.36 (SE +/- 0.00, N = 3)
-O1: 5.37 (SE +/- 0.00, N = 3)
-O2: 5.37 (SE +/- 0.00, N = 3)
-O2 -flto: 5.40 (SE +/- 0.01, N = 3)
-O2 -ftree-vectorize -ftree-slp-vectorize: 5.37 (SE +/- 0.00, N = 3)
-O2 -march=znver1: 5.19 (SE +/- 0.01, N = 3)
-O3: 5.38 (SE +/- 0.01, N = 3)
-O3 -march=znver1: 5.18 (SE +/- 0.00, N = 3)
-Ofast -march=znver1: 5.18 (SE +/- 0.00, N = 3)
-Og: 5.37 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

VP9 libvpx Encoding 1.8.0 vpxenc VP9 1080p Video Encode
Frames Per Second, More Is Better
-O2 -ftree-vectorize -ftree-slp-vectorize: 20.34 (SE +/- 0.05, N = 3) [reported options: -O2 -ftree-vectorize -ftree-slp-vectorize -std=c++11]
-O2 -march=znver1: 20.05 (SE +/- 0.07, N = 3) [reported options: -O2 -march=native -std=c++11]
-O3 -march=znver1 -flto: 20.86 (SE +/- 0.02, N = 3) [reported options: -march=znver1 -flto]
-Ofast -march=znver1: 20.13 (SE +/- 0.04, N = 3) [reported options: -Ofast -march=znver1 -std=c++11]
-Og: 20.39 (SE +/- 0.01, N = 3) [reported options: -Og -std=c++11]
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE

SVT-AV1 2019-02-03 1080p 8-bit YUV To AV1 Video Encode
Frames Per Second, More Is Better
-O0: 1.69 (SE +/- 0.00, N = 3)
-O1: 1.70 (SE +/- 0.02, N = 9)
-O2: 1.69 (SE +/- 0.02, N = 8)
-O2 -flto: 1.69 (SE +/- 0.03, N = 3)
-O2 -ftree-vectorize -ftree-slp-vectorize: 1.67 (SE +/- 0.01, N = 3)
-O2 -march=znver1: 1.70 (SE +/- 0.02, N = 6)
-O3: 1.73 (SE +/- 0.02, N = 3)
-O3 -march=znver1: 1.68 (SE +/- 0.02, N = 3)
-O3 -march=znver1 -flto: 1.71 (SE +/- 0.01, N = 3)
-Ofast -march=znver1: 1.70 (SE +/- 0.03, N = 3)
-Og: 1.73 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -mavx2 -fPIE -fPIC -O2 -pie -lpthread -lm

VP9 libvpx Encoding 1.8.0 vpxenc VP9 1080p Video Encode
Frames Per Second, More Is Better
-O0: 12.50 (SE +/- 0.01, N = 3) [reported options: -O0 -std=c++11]
-O1: 12.53 (SE +/- 0.01, N = 3) [reported options: -O1 -std=c++11]
-O2: 12.54 (SE +/- 0.01, N = 3) [reported options: -O2 -std=c++11]
-O2 -ftree-vectorize -ftree-slp-vectorize: 12.56 (SE +/- 0.02, N = 3) [reported options: -O2 -ftree-vectorize -ftree-slp-vectorize -std=c++11]
-O2 -march=znver1: 12.42 (SE +/- 0.02, N = 3) [reported options: -O2 -march=znver1 -std=c++11]
-O3: 12.31 (SE +/- 0.18, N = 5) [reported options: -std=c++11]
-O3 -march=znver1: 12.41 (SE +/- 0.00, N = 3) [reported options: -march=znver1 -std=c++11]
-O3 -march=znver1 -flto: 12.75 (SE +/- 0.00, N = 3) [reported options: -march=znver1 -flto]
-Ofast -march=znver1: 12.37 (SE +/- 0.01, N = 3) [reported options: -Ofast -march=znver1 -std=c++11]
-Og: 12.52 (SE +/- 0.01, N = 3) [reported options: -Og -std=c++11]
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE

x265 3.0 H.265 1080p Video Encoding
Frames Per Second, More Is Better
-O0: 35.00 (SE +/- 0.67, N = 3)
-O1: 35.62 (SE +/- 0.37, N = 11)
-O2: 34.55 (SE +/- 0.24, N = 3)
-O2 -flto: 35.07 (SE +/- 0.38, N = 3)
-O2 -ftree-vectorize -ftree-slp-vectorize: 35.41 (SE +/- 0.46, N = 3)
-O2 -march=znver1: 34.80 (SE +/- 0.09, N = 3)
-O3: 35.21 (SE +/- 0.41, N = 3)
-O3 -march=znver1: 35.57 (SE +/- 0.18, N = 3)
-Ofast -march=znver1: 34.91 (SE +/- 0.01, N = 3)
-Og: 34.76 (SE +/- 0.58, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

Bullet Physics Engine 2.81 Test: 3000 Fall
Seconds, Fewer Is Better
-O0: 5.14 (SE +/- 0.00, N = 3)
-O1: 5.16 (SE +/- 0.01, N = 3)
-O2: 5.14 (SE +/- 0.00, N = 3)
-O2 -flto: 5.21 (SE +/- 0.01, N = 3)
-O2 -ftree-vectorize -ftree-slp-vectorize: 5.14 (SE +/- 0.00, N = 3)
-O2 -march=znver1: 5.08 (SE +/- 0.00, N = 3)
-O3: 5.16 (SE +/- 0.02, N = 3)
-O3 -march=znver1: 5.07 (SE +/- 0.01, N = 3)
-Ofast -march=znver1: 5.09 (SE +/- 0.00, N = 3)
-Og: 5.15 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine 2.81 Test: Raytests
Seconds, Fewer Is Better
-O0: 3.11 (SE +/- 0.00, N = 3)
-O1: 3.12 (SE +/- 0.00, N = 3)
-O2: 3.12 (SE +/- 0.00, N = 3)
-O2 -flto: 3.05 (SE +/- 0.00, N = 3)
-O2 -ftree-vectorize -ftree-slp-vectorize: 3.11 (SE +/- 0.00, N = 3)
-O2 -march=znver1: 3.10 (SE +/- 0.00, N = 3)
-O3: 3.11 (SE +/- 0.00, N = 3)
-O3 -march=znver1: 3.09 (SE +/- 0.00, N = 3)
-Ofast -march=znver1: 3.09 (SE +/- 0.00, N = 3)
-Og: 3.11 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

Stockfish 9 Total Time
Nodes Per Second, More Is Better
-O0: 105868175 (SE +/- 1595135.10, N = 3)
-O1: 105698092 (SE +/- 549524.55, N = 3)
-O2: 104480422 (SE +/- 1016511.53, N = 3)
-O2 -flto: 104536605 (SE +/- 579693.09, N = 3)
-O2 -ftree-vectorize -ftree-slp-vectorize: 104197865 (SE +/- 468403.16, N = 3)
-O2 -march=znver1: 106084276 (SE +/- 324013.38, N = 3)
-O3: 104121840 (SE +/- 673773.08, N = 3)
-O3 -march=znver1: 106497994 (SE +/- 402849.09, N = 3)
-Ofast -march=znver1: 106507244 (SE +/- 460638.54, N = 3)
-Og: 105709690 (SE +/- 823190.91, N = 3)
1. (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++11 -pedantic -O3 -msse -msse3 -mpopcnt -flto

Bullet Physics Engine 2.81 Test: Convex Trimesh
Seconds, Fewer Is Better
-O0: 1.35 (SE +/- 0.00, N = 3)
-O1: 1.35 (SE +/- 0.00, N = 3)
-O2: 1.35 (SE +/- 0.00, N = 3)
-O2 -flto: 1.33 (SE +/- 0.00, N = 3)
-O2 -ftree-vectorize -ftree-slp-vectorize: 1.35 (SE +/- 0.00, N = 3)
-O2 -march=znver1: 1.32 (SE +/- 0.00, N = 3)
-O3: 1.35 (SE +/- 0.00, N = 3)
-O3 -march=znver1: 1.32 (SE +/- 0.00, N = 3)
-Ofast -march=znver1: 1.32 (SE +/- 0.00, N = 3)
-Og: 1.35 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine 2.81 Test: Prim Trimesh
Seconds, Fewer Is Better
-O0: 1.11 (SE +/- 0.00, N = 3)
-O1: 1.11 (SE +/- 0.00, N = 3)
-O2: 1.11 (SE +/- 0.00, N = 3)
-O2 -flto: 1.09 (SE +/- 0.00, N = 3)
-O2 -ftree-vectorize -ftree-slp-vectorize: 1.11 (SE +/- 0.00, N = 3)
-O2 -march=znver1: 1.11 (SE +/- 0.00, N = 3)
-O3: 1.11 (SE +/- 0.00, N = 3)
-O3 -march=znver1: 1.11 (SE +/- 0.00, N = 3)
-Ofast -march=znver1: 1.11 (SE +/- 0.00, N = 3)
-Og: 1.11 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lglut -lGL -lGLU

SVT-AV1 2019-02-15 1080p 8-bit YUV To AV1 Video Encode
Frames Per Second, More Is Better
-O0: 5.88 (SE +/- 0.02, N = 3)
-O1: 5.87 (SE +/- 0.04, N = 3)
-O2: 5.81 (SE +/- 0.03, N = 3)
-O2 -flto: 5.90 (SE +/- 0.05, N = 3)
-O2 -ftree-vectorize -ftree-slp-vectorize: 5.89 (SE +/- 0.02, N = 3)
-O2 -march=znver1: 5.84 (SE +/- 0.05, N = 3)
-O3: 5.90 (SE +/- 0.04, N = 3)
-O3 -march=znver1: 5.89 (SE +/- 0.04, N = 3)
-O3 -march=znver1 -flto: 5.84 (SE +/- 0.02, N = 3)
-Ofast -march=znver1: 5.91 (SE +/- 0.05, N = 3)
-Og: 5.86 (SE +/- 0.04, N = 3)
1. (CC) gcc options: -mavx -fPIE -fPIC -O2 -pie -lpthread -lm

Hierarchical INTegration 1.0 Test: FLOAT
QUIPs, More Is Better
-O0: 267404445 (SE +/- 321625.93, N = 3)
-O1: 268455578 (SE +/- 1109543.66, N = 3)
-O2: 267311970 (SE +/- 232284.24, N = 3)
-O2 -flto: 268173400 (SE +/- 1057235.60, N = 3)
-O2 -ftree-vectorize -ftree-slp-vectorize: 267172145 (SE +/- 54052.05, N = 3)
-O2 -march=znver1: 267268023 (SE +/- 211192.91, N = 3)
-O3: 267315647 (SE +/- 219028.41, N = 3)
-O3 -march=znver1: 268506472 (SE +/- 1208988.13, N = 3)
-O3 -march=znver1 -flto: 267239405 (SE +/- 67545.97, N = 3)
-Ofast -march=znver1: 267055407 (SE +/- 193963.83, N = 3)
-Og: 267368671 (SE +/- 144731.32, N = 3)
1. (CC) gcc options: -O3 -march=native -lm

TSCP 1.81 AI Chess Performance
Nodes Per Second, More Is Better
-O0: 865459 (SE +/- 333.13, N = 5)
-O1: 864102 (SE +/- 542.88, N = 5)
-O2: 864373 (SE +/- 507.80, N = 5)
-O2 -flto: 864101 (SE +/- 331.91, N = 5)
-O2 -ftree-vectorize -ftree-slp-vectorize: 864916 (SE +/- 508.06, N = 5)
-O2 -march=znver1: 864915 (SE +/- 272.00, N = 5)
-O3: 864915 (SE +/- 272.00, N = 5)
-O3 -march=znver1: 865732 (SE +/- 667.00, N = 5)
-O3 -march=znver1 -flto: 863018 (SE +/- 270.20, N = 5)
-Ofast -march=znver1: 864373 (SE +/- 507.80, N = 5)
-Og: 865187 (SE +/- 333.13, N = 5)
1. (CC) gcc options: -O3 -march=native

ctx_clock Context Switch Time
Clocks, Fewer Is Better
-O2 -flto: 132
-O2 -ftree-vectorize -ftree-slp-vectorize: 132
-O3 -march=znver1 -flto: 132
-Og: 132

Zstd Compression 1.3.4 Compressing ubuntu-16.04.3-server-i386.img, Compression Level 19
Seconds, Fewer Is Better
-O0: 23.12 (SE +/- 0.38, N = 4)
-O1: 14.11 (SE +/- 0.25, N = 12)
-O2: 14.48 (SE +/- 0.49, N = 9)
-O2 -flto: 14.08 (SE +/- 0.31, N = 12)
-O2 -ftree-vectorize -ftree-slp-vectorize: 13.67 (SE +/- 0.33, N = 12)
-O2 -march=znver1: 14.71 (SE +/- 0.35, N = 12)
-O3: 13.66 (SE +/- 0.38, N = 12)
-O3 -march=znver1: 14.37 (SE +/- 0.44, N = 12)
-O3 -march=znver1 -flto: 13.16 (SE +/- 0.24, N = 11)
-Ofast -march=znver1: 13.77 (SE +/- 0.21, N = 12)
-Og: 14.39 (SE +/- 0.29, N = 12)
1. (CC) gcc options: -pthread -lz -llzma

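Many of the Zstd results sit within a few tenths of a second of each other, so the reported SE figures matter when comparing two configurations. The Python sketch below is a rough illustration only and not part of the Phoronix Test Suite: it divides the gap between two means by their combined standard error. The helper name gap_in_se_units is hypothetical, the pairing of values with configurations assumes the listed order, and the result is a rule-of-thumb reading, not a formal significance test.

```python
import math

# Zstd level-19 compression times in seconds (fewer is better) and the
# standard error reported for each configuration in the Zstd result.
# Assumption: values and SE figures pair with configurations in listed order.
zstd = {
    "-O0": (23.12, 0.38),
    "-O1": (14.11, 0.25),
    "-O2": (14.48, 0.49),
    "-O2 -flto": (14.08, 0.31),
    "-O3 -march=znver1 -flto": (13.16, 0.24),
}

def gap_in_se_units(a, b, table=zstd):
    """Absolute difference of two means divided by their combined standard error."""
    mean_a, se_a = table[a]
    mean_b, se_b = table[b]
    # Combined SE of a difference of two independent means: sqrt(se_a^2 + se_b^2).
    return abs(mean_a - mean_b) / math.hypot(se_a, se_b)

pairs = [
    ("-O0", "-O1"),
    ("-O1", "-O2 -flto"),
    ("-O2", "-O3 -march=znver1 -flto"),
]
for a, b in pairs:
    print(f"{a} vs {b}: {gap_in_se_units(a, b):.2f} combined SEs apart")
```

A gap well under one combined SE suggests the two configurations are indistinguishable in this run, while a gap of many SEs (as with -O0 against any optimized build) is unlikely to be noise.
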
John The Ripper 1.8.0-jumbo-1 Test: Blowfish
Real C/S, More Is Better
-O0: 15179 (SE +/- 215.16, N = 12)
-O1: 65995 (SE +/- 1098.57, N = 4)
-O2: 62718 (SE +/- 1387.83, N = 12)
-O2 -flto: 65117 (SE +/- 1395.50, N = 12)
-O2 -ftree-vectorize -ftree-slp-vectorize: 63586 (SE +/- 1953.49, N = 12)
-O2 -march=znver1: 61309 (SE +/- 1967.27, N = 12)
-O3: 65806 (SE +/- 1049.43, N = 3)
-O3 -march=znver1: 66823 (SE +/- 1082.96, N = 12)
-O3 -march=znver1 -flto: 58764 (SE +/- 1598.30, N = 12)
-Ofast -march=znver1: 62841 (SE +/- 1454.31, N = 11)
-Og: 56453 (SE +/- 1339.16, N = 9)
1. (CC) gcc options: -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt

SciMark 2.0 Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
-O0: 512 (SE +/- 30.37, N = 3)
-O1: 3466 (SE +/- 32.13, N = 3)
-O2: 2609 (SE +/- 54.41, N = 3)
-O2 -flto: 2515 (SE +/- 129.41, N = 3)
-O2 -ftree-vectorize -ftree-slp-vectorize: 4396 (SE +/- 57.07, N = 3)
-O2 -march=znver1: 3231 (SE +/- 173.68, N = 3)
-O3: 4307 (SE +/- 42.05, N = 3)
-O3 -march=znver1: 4851 (SE +/- 65.35, N = 3)
-O3 -march=znver1 -flto: 3300 (SE +/- 178.79, N = 3)
-Ofast -march=znver1: 4089 (SE +/- 107.53, N = 3)
-Og: 2539 (SE +/- 139.26, N = 3)
1. (CC) gcc options: -lm

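The Dense LU result is one of the few here where the flags under test reach the compiler unmodified (the only common option is -lm), so the spread between configurations is large. As a minimal sketch, assuming the Mflops values pair with the configurations in the order listed, the following Python snippet expresses each result as a multiple of the -O0 baseline. The function name speedup_over is a hypothetical helper, not something provided by the Phoronix Test Suite.

```python
# SciMark 2.0 Dense LU results in Mflops (more is better).
# Assumption: values pair with configurations in the order they are listed.
dense_lu = {
    "-O0": 512,
    "-O1": 3466,
    "-O2": 2609,
    "-O2 -flto": 2515,
    "-O2 -ftree-vectorize -ftree-slp-vectorize": 4396,
    "-O2 -march=znver1": 3231,
    "-O3": 4307,
    "-O3 -march=znver1": 4851,
    "-O3 -march=znver1 -flto": 3300,
    "-Ofast -march=znver1": 4089,
    "-Og": 2539,
}

def speedup_over(baseline, results):
    """Return each configuration's result divided by the baseline's result."""
    base = results[baseline]
    return {config: value / base for config, value in results.items()}

ratios = speedup_over("-O0", dense_lu)
best = max(ratios, key=ratios.get)

# Print configurations from fastest to slowest relative to -O0.
for config, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{config}: {ratio:.2f}x")
```

Swapping in another result's values, and inverting the ratio for "Fewer Is Better" tests, gives the same view for the other benchmarks.
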
Phoronix Test Suite v10.8.4