GCC 9 Compiler Benchmarking vs. LLVM Clang

Intel Core i9-7980XE benchmarks of GCC 8 / GCC 9 versus LLVM Clang 7 and LLVM Clang 8 compilers on Ubuntu Linux. Benchmarks by Michael Larabel for a future article on Phoronix.com.
HTML result view exported from: https://openbenchmarking.org/result/1811133-SK-GCC9COMPI38&rdt&grr .

Tested configurations: GCC 9.0.0 20181112, GCC 8.2.0, Clang 7.0.0, Clang 8.0.0 20181111

Processor: Intel Core i9-7980XE @ 4.20GHz (18 Cores / 36 Threads)
Motherboard: ASUS PRIME X299-A (1503 BIOS)
Chipset: Intel Sky Lake-E DMI3 Registers
Memory: 16384MB
Disk: 240GB Force MP510
Graphics: NV120 12GB
Audio: Realtek ALC1220
Monitor: ASUS PB278
Network: Intel Connection
OS: Ubuntu 18.10
Kernel: 4.18.0-10-generic (x86_64)
Desktop: GNOME Shell 3.30.1
Display Server: X Server 1.20.1
Display Driver: modesetting 1.20.1
OpenGL: 4.3 Mesa 18.2.2
File-System: ext4
Screen Resolution: 2560x1440
Compiler, per configuration:
  GCC 9.0.0 20181112: GCC 9.0.0 20181112
  GCC 8.2.0: GCC 8.2.0
  Clang 7.0.0: Clang 7.0.0-3
  Clang 8.0.0 20181111: Clang 8.0.0-svn346617-1~exp1+0~20181111195013.162~1.gbp8d271f

Environment Details - CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Details - GCC 9.0.0 20181112, GCC 8.2.0: --disable-multilib --enable-checking=release
Processor Details - Scaling Governor: intel_pstate powersave
Python Details - Python 2.7.15+ + Python 3.6.7
Security Details - KPTI + __user pointer sanitization + Full generic retpoline IBPB IBRS_FW + SSB disabled via prctl and seccomp + PTE Inversion; VMX: conditional cache flushes SMT vulnerable

Result overview (a dash marks tests that were not run with Clang):

Test | GCC 9.0.0 20181112 | GCC 8.2.0 | Clang 7.0.0 | Clang 8.0.0 20181111
pgbench: Buffer Test - Normal Load - Read Write | 21776 | 23830 | 17186 | 10012
parboil: OpenMP MRI Gridding | 147 | 146 | - | -
build-linux-kernel: Time To Compile | 44.18 | 44.26 | - | -
parboil: OpenMP LBM | 81.52 | 79.36 | - | -
pgbench: Buffer Test - Normal Load - Read Only | 500154 | 496730 | 504725 | 505631
stockfish: Total Time | 46908339 | 47827811 | 46179707 | 45725081
build-llvm: Time To Compile | 220 | 223 | 216 | 189
john-the-ripper: Traditional DES | 74356333 | 72707667 | 98537500 | 100083000
fftw: Float + SSE - 2D FFT Size 2048 | 20468 | 20014 | 20258 | 19871
hpcg | 1.22 | 1.25 | 1.33 | 1.34
c-ray: Total Time - 4K, 16 Rays Per Pixel | 33.75 | 33.70 | 66.76 | 67.06
compress-7zip: Compress Speed Test | 93437 | 93576 | - | -
m-queens: Time To Solve | 48.72 | 48.73 | 50.19 | 50.41
himeno: Poisson Pressure Solver | 3093 | 3089 | 2463 | 2455
npb: BT.A | 4557 | 4834 | - | -
ebizzy | 620245 | 607319 | 594997 | 588266
aobench: 2048 x 2048 - Total Time | 31.32 | 31.35 | 32.30 | 32.19
scimark2: Composite | 2613 | 2615 | 2584 | 2577
npb: SP.A | 3865 | 3542 | - | -
crafty: Elapsed Time | 8423397 | 8560641 | - | -
build-apache: Time To Compile | 22.26 | 22.05 | 21.13 | 19.70
john-the-ripper: MD5 | 675993 | 670435 | 799209 | 818976
john-the-ripper: Blowfish | 22173 | 22254 | 27611 | 25614
openssl: RSA 4096-bit Performance | 4673 | 4658 | 4547 | 4544
mcperf: Set | 72449 | 71776 | 73064 | 73009
xsbench | 4664809 | 4662395 | 5156109 | 5022107
mcperf: Get | 110495 | 108730 | 115868 | 115247
npb: FT.B | 7174 | 7220 | - | -
encode-mp3: WAV To MP3 | 9.74 | 10.13 | 12.16 | 11.80
compress-zstd: Compressing ubuntu-16.04.3-server-i386.img, Compression Level 19 | 10.63 | 10.61 | 10.27 | 10.34
hmmer: Pfam Database Search | 10.42 | 10.26 | 8.59 | 8.52
parboil: OpenMP Stencil | 7.19 | 7.13 | - | -
npb: EP.C | 1250 | 925 | - | -
fftw: Float + SSE - 1D FFT Size 2048 | 59299 | 57019 | 56274 | 54433
x264: H.264 Video Encoding | 128 | 128 | 125 | 125
tjbench: Decompression Throughput | 184 | 183 | 204 | 198
cloverleaf: Lagrangian-Eulerian Hydrodynamics | 3.20 | 2.98 | - | -
parboil: OpenMP CUTCP | 2.44 | 2.46 | - | -
npb: FT.A | 6718 | 6756 | - | -
blake2 | 3.51 | 3.66 | 3.02 | 3.10
scimark2: Jacobi Successive Over-Relaxation | 2069 | 2074 | 1662 | 1661
scimark2: Dense LU Matrix Factorization | 6063 | 6071 | 6476 | 6449
scimark2: Sparse Matrix Multiply | 3295 | 3295 | 3300 | 3288
scimark2: Fast Fourier Transform | 732 | 727 | 763 | 768
scimark2: Monte Carlo | 906 | 906 | 717 | 717
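
The overview mixes higher-is-better metrics (TPS, Mflops, Nodes Per Second) with lower-is-better ones (Seconds, Cycles Per Byte), so raw values are not directly comparable across tests. The following is a minimal Python sketch, not part of the original result file, that turns a few of the GCC 9 and Clang 8 values into a relative speed figure. The helper name `speedup` and the choice of tests are illustrative assumptions; the direction flags follow the "More Is Better" / "Fewer Is Better" labels of the individual results.

```python
# Values copied from the result overview:
# (test, GCC 9.0.0 result, Clang 8.0.0 result, higher_is_better)
ROWS = [
    ("john-the-ripper: Traditional DES", 74356333, 100083000, True),
    ("c-ray: Total Time - 4K, 16 Rays Per Pixel", 33.75, 67.06, False),
    ("build-llvm: Time To Compile", 220, 189, False),
    ("himeno: Poisson Pressure Solver", 3093, 2455, True),
]


def speedup(a, b, higher_is_better):
    """Return how many times faster result `a` is than result `b`.

    Hypothetical helper: for lower-is-better metrics the ratio is inverted
    so that a value above 1.0 always means `a` is ahead.
    """
    return a / b if higher_is_better else b / a


for name, gcc9, clang8, higher_is_better in ROWS:
    ratio = speedup(gcc9, clang8, higher_is_better)
    print(f"{name}: GCC 9 runs at {ratio:.2f}x the speed of Clang 8")
```

A ratio above 1.0 means GCC 9 is ahead on that test and a ratio below 1.0 means Clang 8 is ahead, regardless of the unit the test reports.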

PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Write
TPS, More Is Better
  GCC 9.0.0 20181112: 21776 (SE +/- 1475.07, N = 9)
  GCC 8.2.0: 23830 (SE +/- 402.82, N = 4)
  Clang 7.0.0: 17186 (SE +/- 1724.11, N = 9)
  Clang 8.0.0 20181111: 10012 (SE +/- 421.62, N = 12)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
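
The read/write pgbench runs report comparatively large standard errors, so it is worth checking whether the gaps between compilers exceed the run-to-run noise. Below is a rough Python sketch of one way to do that. It is an illustration rather than anything the Phoronix Test Suite itself reports. It divides the difference between two means by their combined standard error, under the assumption that the two sets of runs are independent. `z_score` is a hypothetical helper name.

```python
import math


def z_score(mean_a, se_a, mean_b, se_b):
    """Difference of two means in units of their combined standard error.

    Assumes the two sets of runs are independent (an assumption of this sketch).
    """
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# (mean TPS, standard error) copied from the pgbench Read Write result above.
gcc9 = (21776, 1475.07)
gcc8 = (23830, 402.82)
clang8 = (10012, 421.62)

print(f"GCC 8 vs GCC 9:   {z_score(*gcc8, *gcc9):.2f}")
print(f"GCC 8 vs Clang 8: {z_score(*gcc8, *clang8):.2f}")
```

With these inputs the GCC 8 versus GCC 9 gap comes out at a little over one combined standard error, while the gap between GCC 8 and Clang 8 is more than twenty, so only the latter looks clearly outside the noise.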

Parboil 2.5
Test: OpenMP MRI Gridding
Seconds, Fewer Is Better
  GCC 9.0.0 20181112: 147 (SE +/- 0.40, N = 3)
  GCC 8.2.0: 146 (SE +/- 0.83, N = 3)
(CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Timed Linux Kernel Compilation 4.18
Time To Compile
Seconds, Fewer Is Better
  GCC 9.0.0 20181112: 44.18 (SE +/- 0.61, N = 6)
  GCC 8.2.0: 44.26 (SE +/- 0.50, N = 9)

Parboil 2.5
Test: OpenMP LBM
Seconds, Fewer Is Better
  GCC 9.0.0 20181112: 81.52 (SE +/- 0.27, N = 3)
  GCC 8.2.0: 79.36 (SE +/- 0.33, N = 3)
(CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

PostgreSQL pgbench 10.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Only
TPS, More Is Better
  GCC 9.0.0 20181112: 500154 (SE +/- 274.70, N = 3)
  GCC 8.2.0: 496730 (SE +/- 210.08, N = 3)
  Clang 7.0.0: 504725 (SE +/- 3717.80, N = 3)
  Clang 8.0.0 20181111: 505631 (SE +/- 3943.25, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm

Stockfish 9
Total Time
Nodes Per Second, More Is Better
  GCC 9.0.0 20181112: 46908339 (SE +/- 469942.62, N = 3)
  GCC 8.2.0: 47827811 (SE +/- 290155.29, N = 3)
  Clang 7.0.0: 46179707 (SE +/- 358695.82, N = 3)
  Clang 8.0.0 20181111: 45725081 (SE +/- 77465.59, N = 3)
(CXX) g++ options: -m64 -lpthread -O3 -march=native -fno-exceptions -std=c++11 -pedantic -msse -msse3 -mpopcnt -flto

Timed LLVM Compilation 6.0.1
Time To Compile
Seconds, Fewer Is Better
  GCC 9.0.0 20181112: 220
  GCC 8.2.0: 223
  Clang 7.0.0: 216
  Clang 8.0.0 20181111: 189

John The Ripper 1.8.0-jumbo-1
Test: Traditional DES
Real C/S, More Is Better
  GCC 9.0.0 20181112: 74356333 (SE +/- 132290.51, N = 3)
  GCC 8.2.0: 72707667 (SE +/- 1115841.74, N = 3)
  Clang 7.0.0: 98537500 (SE +/- 1095165.68, N = 12)
  Clang 8.0.0 20181111: 100083000 (SE +/- 185734.58, N = 3)
(CC) gcc options: -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt

FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 2048
Mflops, More Is Better
  GCC 9.0.0 20181112: 20468 (SE +/- 88.29, N = 3)
  GCC 8.2.0: 20014 (SE +/- 25.39, N = 3)
  Clang 7.0.0: 20258 (SE +/- 260.52, N = 3)
  Clang 8.0.0 20181111: 19871 (SE +/- 95.21, N = 3)
(CC) gcc options: -pthread -O3 -march=native -lm

High Performance Conjugate Gradient 3.0
GFLOP/s, More Is Better
  GCC 9.0.0 20181112: 1.22 (SE +/- 0.01, N = 3)
  GCC 8.2.0: 1.25 (SE +/- 0.01, N = 3)
  Clang 7.0.0: 1.33 (SE +/- 0.01, N = 3)
  Clang 8.0.0 20181111: 1.34 (SE +/- 0.01, N = 3)

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds, Fewer Is Better
  GCC 9.0.0 20181112: 33.75 (SE +/- 0.01, N = 3)
  GCC 8.2.0: 33.70 (SE +/- 0.01, N = 3)
  Clang 7.0.0: 66.76 (SE +/- 0.01, N = 3)
  Clang 8.0.0 20181111: 67.06 (SE +/- 0.01, N = 3)
(CC) gcc options: -lm -lpthread -O3 -march=native

7-Zip Compression 16.02
Compress Speed Test
MIPS, More Is Better
  GCC 9.0.0 20181112: 93437 (SE +/- 384.58, N = 3)
  GCC 8.2.0: 93576 (SE +/- 184.01, N = 3)
(CXX) g++ options: -pipe -lpthread

m-queens 1.2
Time To Solve
Seconds, Fewer Is Better
  GCC 9.0.0 20181112: 48.72 (SE +/- 0.01, N = 3)
  GCC 8.2.0: 48.73 (SE +/- 0.01, N = 3)
  Clang 7.0.0: 50.19 (SE +/- 0.01, N = 3)
  Clang 8.0.0 20181111: 50.41 (SE +/- 0.02, N = 3)
(CXX) g++ options: -fopenmp -O3 -march=native -O2

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS, More Is Better
  GCC 9.0.0 20181112: 3093 (SE +/- 3.44, N = 3)
  GCC 8.2.0: 3089 (SE +/- 4.75, N = 3)
  Clang 7.0.0: 2463 (SE +/- 2.40, N = 3)
  Clang 8.0.0 20181111: 2455 (SE +/- 4.58, N = 3)
(CC) gcc options: -O3 -march=native -mavx2

NAS Parallel Benchmarks 3.3
Test / Class: BT.A
Total Mop/s, More Is Better
  GCC 9.0.0 20181112: 4557 (SE +/- 35.44, N = 3)
  GCC 8.2.0: 4834 (SE +/- 31.51, N = 3)
(F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

ebizzy 0.3
Records/s, More Is Better
  GCC 9.0.0 20181112: 620245 (SE +/- 7775.77, N = 3)
  GCC 8.2.0: 607319 (SE +/- 5181.95, N = 3)
  Clang 7.0.0: 594997 (SE +/- 8494.47, N = 12)
  Clang 8.0.0 20181111: 588266 (SE +/- 10943.27, N = 3)
(CC) gcc options: -pthread -lpthread -O3 -march=native

AOBench
Size: 2048 x 2048 - Total Time
Seconds, Fewer Is Better
  GCC 9.0.0 20181112: 31.32 (SE +/- 0.03, N = 3)
  GCC 8.2.0: 31.35 (SE +/- 0.04, N = 3)
  Clang 7.0.0: 32.30 (SE +/- 0.07, N = 3)
  Clang 8.0.0 20181111: 32.19 (SE +/- 0.01, N = 3)
(CC) gcc options: -lm -O3 -march=native

SciMark 2.0
Computational Test: Composite
Mflops, More Is Better
  GCC 9.0.0 20181112: 2613 (SE +/- 2.04, N = 3)
  GCC 8.2.0: 2615 (SE +/- 1.63, N = 3)
  Clang 7.0.0: 2584 (SE +/- 1.31, N = 3)
  Clang 8.0.0 20181111: 2577 (SE +/- 5.03, N = 3)
(CC) gcc options: -O3 -march=native -lm

NAS Parallel Benchmarks 3.3
Test / Class: SP.A
Total Mop/s, More Is Better
  GCC 9.0.0 20181112: 3865 (SE +/- 28.57, N = 3)
  GCC 8.2.0: 3542 (SE +/- 45.01, N = 3)
(F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

Crafty 25.2
Elapsed Time
Nodes Per Second, More Is Better
  GCC 9.0.0 20181112: 8423397 (SE +/- 15686.24, N = 3)
  GCC 8.2.0: 8560641 (SE +/- 8409.72, N = 3)
(CC) gcc options: -pthread -lstdc++ -fprofile-use -lm

Timed Apache Compilation 2.4.7
Time To Compile
Seconds, Fewer Is Better
  GCC 9.0.0 20181112: 22.26 (SE +/- 0.05, N = 3)
  GCC 8.2.0: 22.05 (SE +/- 0.07, N = 3)
  Clang 7.0.0: 21.13 (SE +/- 0.06, N = 3)
  Clang 8.0.0 20181111: 19.70 (SE +/- 0.06, N = 3)

John The Ripper 1.8.0-jumbo-1
Test: MD5
Real C/S, More Is Better
  GCC 9.0.0 20181112: 675993 (SE +/- 1299.84, N = 3)
  GCC 8.2.0: 670435 (SE +/- 925.69, N = 3)
  Clang 7.0.0: 799209 (SE +/- 11085.39, N = 3)
  Clang 8.0.0 20181111: 818976 (SE +/- 768.00, N = 3)
(CC) gcc options: -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt

John The Ripper 1.8.0-jumbo-1
Test: Blowfish
Real C/S, More Is Better
  GCC 9.0.0 20181112: 22173 (SE +/- 65.83, N = 3)
  GCC 8.2.0: 22254 (SE +/- 70.42, N = 3)
  Clang 7.0.0: 27611 (SE +/- 11.67, N = 3)
  Clang 8.0.0 20181111: 25614 (SE +/- 9.29, N = 3)
(CC) gcc options: -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt

OpenSSL 1.1.1
RSA 4096-bit Performance
Signs Per Second, More Is Better
  GCC 9.0.0 20181112: 4673 (SE +/- 0.31, N = 3)
  GCC 8.2.0: 4658 (SE +/- 5.28, N = 3)
  Clang 7.0.0: 4547 (SE +/- 2.68, N = 3)
  Clang 8.0.0 20181111: 4544 (SE +/- 8.23, N = 3)
Additional option shown for two of the results: -Qunused-arguments
(CC) gcc options: -pthread -m64 -O3 -march=native -lssl -lcrypto -ldl

Memcached mcperf 1.5.10
Method: Set
Operations Per Second, More Is Better
  GCC 9.0.0 20181112: 72449 (SE +/- 305.67, N = 3)
  GCC 8.2.0: 71776 (SE +/- 259.45, N = 3)
  Clang 7.0.0: 73064 (SE +/- 499.66, N = 3)
  Clang 8.0.0 20181111: 73009 (SE +/- 408.85, N = 3)
(CC) gcc options: -O3 -march=native -lm -rdynamic

Xsbench 2017-07-06
Lookups/s, More Is Better
  GCC 9.0.0 20181112: 4664809 (SE +/- 1108.05, N = 3)
  GCC 8.2.0: 4662395 (SE +/- 806.44, N = 3)
  Clang 7.0.0: 5156109 (SE +/- 843.35, N = 3)
  Clang 8.0.0 20181111: 5022107 (SE +/- 1402.44, N = 3)
(CC) gcc options: -std=gnu99 -fopenmp -O3 -lm

Memcached mcperf 1.5.10
Method: Get
Operations Per Second, More Is Better
  GCC 9.0.0 20181112: 110495 (SE +/- 666.93, N = 3)
  GCC 8.2.0: 108730 (SE +/- 767.29, N = 3)
  Clang 7.0.0: 115868 (SE +/- 605.02, N = 3)
  Clang 8.0.0 20181111: 115247 (SE +/- 1208.65, N = 3)
(CC) gcc options: -O3 -march=native -lm -rdynamic

NAS Parallel Benchmarks 3.3
Test / Class: FT.B
Total Mop/s, More Is Better
  GCC 9.0.0 20181112: 7174 (SE +/- 2.08, N = 3)
  GCC 8.2.0: 7220 (SE +/- 8.95, N = 3)
(F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

LAME MP3 Encoding 3.100
WAV To MP3
Seconds, Fewer Is Better
  GCC 9.0.0 20181112: 9.74 (SE +/- 0.01, N = 3)
  GCC 8.2.0: 10.13 (SE +/- 0.04, N = 3)
  Clang 7.0.0: 12.16 (SE +/- 0.01, N = 3)
  Clang 8.0.0 20181111: 11.80 (SE +/- 0.01, N = 3)
Additional options shown for two of the results: -pipe -lncurses
(CC) gcc options: -O3 -march=native -lm

Zstd Compression 1.3.4
Compressing ubuntu-16.04.3-server-i386.img, Compression Level 19
Seconds, Fewer Is Better
  GCC 9.0.0 20181112: 10.63 (SE +/- 0.03, N = 3)
  GCC 8.2.0: 10.61 (SE +/- 0.01, N = 3)
  Clang 7.0.0: 10.27 (SE +/- 0.04, N = 3)
  Clang 8.0.0 20181111: 10.34 (SE +/- 0.07, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Timed HMMer Search 2.3.2
Pfam Database Search
Seconds, Fewer Is Better
  GCC 9.0.0 20181112: 10.42 (SE +/- 0.10, N = 3)
  GCC 8.2.0: 10.26 (SE +/- 0.06, N = 3)
  Clang 7.0.0: 8.59 (SE +/- 0.12, N = 3)
  Clang 8.0.0 20181111: 8.52 (SE +/- 0.11, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lhmmer -lsquid -lm

Parboil 2.5
Test: OpenMP Stencil
Seconds, Fewer Is Better
  GCC 9.0.0 20181112: 7.19 (SE +/- 0.06, N = 3)
  GCC 8.2.0: 7.13 (SE +/- 0.05, N = 3)
(CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

NAS Parallel Benchmarks 3.3
Test / Class: EP.C
Total Mop/s, More Is Better
  GCC 9.0.0 20181112: 1250 (SE +/- 12.17, N = 3)
  GCC 8.2.0: 925 (SE +/- 4.53, N = 3)
(F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 2048
Mflops, More Is Better
  GCC 9.0.0 20181112: 59299 (SE +/- 196.81, N = 3)
  GCC 8.2.0: 57019 (SE +/- 1025.80, N = 3)
  Clang 7.0.0: 56274 (SE +/- 463.90, N = 3)
  Clang 8.0.0 20181111: 54433 (SE +/- 809.18, N = 5)
(CC) gcc options: -pthread -O3 -march=native -lm

x264 2018-09-25
H.264 Video Encoding
Frames Per Second, More Is Better
  GCC 9.0.0 20181112: 128 (SE +/- 1.58, N = 7)
  GCC 8.2.0: 128 (SE +/- 1.97, N = 3)
  Clang 7.0.0: 125 (SE +/- 2.08, N = 4)
  Clang 8.0.0 20181111: 125 (SE +/- 1.11, N = 3)
Additional option shown for two of the results: -mstack-alignment=64
(CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -march=native -std=gnu99 -fPIC -fomit-frame-pointer -fno-tree-vectorize

libjpeg-turbo tjbench 1.5.3
Test: Decompression Throughput
Megapixels/sec, More Is Better
  GCC 9.0.0 20181112: 184 (SE +/- 0.54, N = 3)
  GCC 8.2.0: 183 (SE +/- 0.06, N = 3)
  Clang 7.0.0: 204 (SE +/- 0.36, N = 3)
  Clang 8.0.0 20181111: 198 (SE +/- 1.17, N = 3)
(CC) gcc options: -O3 -march=native -lm

CloverLeaf
Lagrangian-Eulerian Hydrodynamics
Seconds, Fewer Is Better
  GCC 9.0.0 20181112: 3.20 (SE +/- 0.01, N = 3)
  GCC 8.2.0: 2.98 (SE +/- 0.03, N = 3)
(F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

Parboil 2.5
Test: OpenMP CUTCP
Seconds, Fewer Is Better
  GCC 9.0.0 20181112: 2.44 (SE +/- 0.02, N = 3)
  GCC 8.2.0: 2.46 (SE +/- 0.04, N = 3)
(CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

NAS Parallel Benchmarks 3.3
Test / Class: FT.A
Total Mop/s, More Is Better
  GCC 9.0.0 20181112: 6718 (SE +/- 17.12, N = 3)
  GCC 8.2.0: 6756 (SE +/- 12.06, N = 3)
(F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

BLAKE2 20170307
Cycles Per Byte, Fewer Is Better
  GCC 9.0.0 20181112: 3.51 (SE +/- 0.11, N = 12)
  GCC 8.2.0: 3.66 (SE +/- 0.09, N = 12)
  Clang 7.0.0: 3.02 (SE +/- 0.06, N = 3)
  Clang 8.0.0 20181111: 3.10 (SE +/- 0.01, N = 3)
(CC) gcc options: -O3 -march=native -lcrypto -lz

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better
  GCC 9.0.0 20181112: 2069 (SE +/- 3.14, N = 3)
  GCC 8.2.0: 2074 (SE +/- 2.18, N = 3)
  Clang 7.0.0: 1662 (SE +/- 1.35, N = 3)
  Clang 8.0.0 20181111: 1661 (SE +/- 1.35, N = 3)
(CC) gcc options: -O3 -march=native -lm

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
  GCC 9.0.0 20181112: 6063 (SE +/- 7.72, N = 3)
  GCC 8.2.0: 6071 (SE +/- 10.59, N = 3)
  Clang 7.0.0: 6476 (SE +/- 0.69, N = 3)
  Clang 8.0.0 20181111: 6449 (SE +/- 18.31, N = 3)
(CC) gcc options: -O3 -march=native -lm

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops, More Is Better
  GCC 9.0.0 20181112: 3295 (SE +/- 3.31, N = 3)
  GCC 8.2.0: 3295 (SE +/- 1.55, N = 3)
  Clang 7.0.0: 3300 (SE +/- 3.75, N = 3)
  Clang 8.0.0 20181111: 3288 (SE +/- 5.51, N = 3)
(CC) gcc options: -O3 -march=native -lm

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops, More Is Better
  GCC 9.0.0 20181112: 732 (SE +/- 1.85, N = 3)
  GCC 8.2.0: 727 (SE +/- 0.86, N = 3)
  Clang 7.0.0: 763 (SE +/- 3.38, N = 3)
  Clang 8.0.0 20181111: 768 (SE +/- 1.98, N = 3)
(CC) gcc options: -O3 -march=native -lm

SciMark 2.0
Computational Test: Monte Carlo
Mflops, More Is Better
  GCC 9.0.0 20181112: 906 (SE +/- 0.99, N = 3)
  GCC 8.2.0: 906 (SE +/- 1.15, N = 3)
  Clang 7.0.0: 717 (SE +/- 0.32, N = 3)
  Clang 8.0.0 20181111: 717 (SE +/- 0.29, N = 3)
(CC) gcc options: -O3 -march=native -lm
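
SciMark 2.0 defines its composite score as the arithmetic mean of its five kernel scores, so the per-kernel results can be cross-checked against the Composite result reported earlier. The following short Python sketch performs that check with the values from this page. Because the page shows whole-Mflops figures, agreement is only expected to within rounding. The dictionary layout and the tolerance of 1 Mflops are assumptions made for illustration.

```python
# Kernel scores in Mflops, in the order Jacobi SOR, Dense LU, Sparse Matrix
# Multiply, FFT, Monte Carlo, followed by the reported Composite score.
# All numbers are copied from the SciMark results on this page.
SCIMARK = {
    "GCC 9.0.0 20181112": ([2069, 6063, 3295, 732, 906], 2613),
    "GCC 8.2.0": ([2074, 6071, 3295, 727, 906], 2615),
    "Clang 7.0.0": ([1662, 6476, 3300, 763, 717], 2584),
    "Clang 8.0.0 20181111": ([1661, 6449, 3288, 768, 717], 2577),
}

for compiler, (kernels, composite) in SCIMARK.items():
    mean = sum(kernels) / len(kernels)
    # Inputs are rounded to whole Mflops, so allow a difference of up to 1.
    status = "consistent" if abs(mean - composite) <= 1 else "mismatch"
    print(f"{compiler}: mean of kernels = {mean:.1f}, composite = {composite} ({status})")
```

This also shows why the composite scores are close despite large per-kernel swings: Clang's advantage in Dense LU and FFT is offset by GCC's lead in Jacobi SOR and Monte Carlo.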
Phoronix Test Suite v10.8.5