GCC vs. LLVM Clang 3.8 3.9 Compiler Benchmarking
Benchmarks by Michael Larabel for a future article on Phoronix looking at early GCC 7 compiler performance compared to GCC 6 and GCC 5, and then LLVM Clang 3.8 and Clang 3.9.
HTML result view exported from: https://openbenchmarking.org/result/1609139-LO-GCCCLANG151&grw&sro .
System configuration (shared by all five results; only the compiler differs):
  Processor: Intel Xeon E5-2609 v4 @ 1.70GHz (8 Cores)
  Motherboard: MSI X99A WORKSTATION (MS-7A54) v1.0
  Chipset: Intel Xeon E7 v4/Xeon
  Memory: 16384MB
  Disk: 3 x 120GB TOSHIBA-TR150
  Graphics: LLVMpipe
  Audio: Realtek ALC1150
  Network: Intel Connection
  OS: Ubuntu 16.04
  Kernel: 4.8.0-999-generic (x86_64) 20160908
  Desktop: Unity 7.4.0
  Display Server: X Server 1.18.3
  Display Driver: modesetting 1.18.3
  OpenGL: 3.3 Mesa 11.2.0 Gallium 0.4
  File-System: ext4
  Screen Resolution: 1024x768
Compiler per result:
  GCC 5.4.0: GCC 5.4.0
  GCC 6.2.0: GCC 6.2.0
  GCC 7.0.0 20160904: GCC 7.0.0 20160904
  Clang 3.8.0: Clang 3.8.0-2ubuntu4
  Clang 3.9.0: Clang 3.9.0-svn279689-1~exp1
Environment Details: LIBGL_ALWAYS_SOFTWARE=1
Compiler Details (GCC 5.4.0, GCC 6.2.0, GCC 7.0.0 20160904): --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
Disk Details: CFQ / data=ordered,errors=remount-ro,relatime,rw
Processor Details: Scaling Governor: intel_pstate powersave
Results overview ("-" marks a test with no result for that compiler):
Test | GCC 5.4.0 | GCC 6.2.0 | GCC 7.0.0 20160904 | Clang 3.8.0 | Clang 3.9.0
bullet: Raytests | 5.94 | 5.98 | 5.96 | 6.20 | 6.21
bullet: 3000 Fall | 9.13 | 9.11 | 9.01 | 9.57 | 9.83
bullet: 136 Ragdolls | 6.45 | 6.46 | 6.40 | 6.89 | 6.92
scimark2: Composite | 695.70 | 738.21 | 800.64 | 1111.78 | 1115.00
scimark2: Monte Carlo | 304.10 | 304.07 | 304.03 | 126.46 | 126.43
scimark2: Fast Fourier Transform | 228.18 | 232.19 | 239.27 | 223.71 | 226.01
scimark2: Sparse Matrix Multiply | 1047.40 | 1280.02 | 1290.94 | 1414.48 | 1448.22
scimark2: Dense LU Matrix Factorization | 1327.15 | 1302.74 | 1596.90 | 2936.85 | 2917.96
scimark2: Jacobi Successive Over-Relaxation | 571.65 | 572.05 | 572.06 | 857.42 | 856.35
hint: FLOAT | 177738053.32 | 166219905.04 | 165733740.33 | 141990405.69 | 143252876.20
encode-flac: WAV To FLAC | 13.45 | 13.44 | 13.45 | 13.26 | 13.15
encode-mp3: WAV To MP3 | 22.84 | 22.42 | 22.03 | 26.94 | 27.46
fftw: Float + SSE - 1D FFT Size 4096 | 10966 | 10936 | 10951 | 10357 | 10331
fftw: Float + SSE - 2D FFT Size 4096 | 7410.50 | 7515.98 | 7517.94 | 6841.34 | 6923.64
hmmer: Pfam Database Search | 12.18 | 12.18 | 12.22 | 12.37 | 13.76
mafft: Multiple Sequence Alignment | 5.99 | 6.41 | 6.45 | 7.74 | 7.27
himeno: Poisson Pressure Solver | 893.82 | 1091.32 | 1095.90 | 748.09 | 841.55
lammps: Rhodopsin Protein | 72.00 | - | - | 63.22 | 62.30
n-queens: Elapsed Time | 57.53 | 56.21 | 52.54 | - | -
build-imagemagick: Time To Compile | 67.21 | 94.51 | 81.27 | 74.34 | 78.45
build-php: Time To Compile | 35.31 | 36.25 | 36.52 | 28.45 | 35.25
c-ray: Total Time | 21.30 | 21.25 | 23.36 | 39.34 | 44.42
smallpt: Global Illumination Renderer; 100 Samples | 47 | 47 | 30 | - | -
apache: Static Web Page Serving | 27564.72 | 28383.22 | 29512.13 | 29812.18 | 27344.42
openssl: RSA 4096-bit Performance | 569.37 | 569.27 | 570.20 | 566.80 | 511.07
redis: GET | 1434042.92 | 1408567.41 | 1393442.96 | 1342892.96 | 1348061.04
redis: SET | 969029.10 | 972159.31 | 965269.60 | 925939.11 | 933433.87
pgbench: Buffer Test - Normal Load - Read Write | 4468.82 | 4411.75 | 4437.25 | 4411.72 | 4356.38
pgbench: Buffer Test - Single Thread - Read Write | 552.55 | 546.21 | 549.49 | 538.44 | 531.79
ebizzy: Phoronix Test Suite v6.6.0 | 155360 | 155273 | 155692 | 152747 | 138671
fhourstones: Complex Connect-4 Solving | 6954.67 | 6884.93 | 6905.30 | 7164.37 | 6563.47
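The overview mixes units where fewer is better (seconds) with units where more is better (Mflops, requests per second), so raw values cannot be compared across rows. The following is a minimal Python sketch of one way to normalize them, assuming GCC 5.4.0 is taken as the baseline. The helper name `relative_to_baseline` and the choice of baseline are illustrative assumptions, not part of the result file, and only two rows are included.

```python
def relative_to_baseline(value, baseline, higher_is_better):
    """Return a ratio where > 1.0 means `value` is better than `baseline`."""
    if higher_is_better:
        return value / baseline
    # For time-like metrics a smaller value is better, so invert the ratio.
    return baseline / value

# (GCC 5.4.0 result, Clang 3.9.0 result, higher_is_better), copied from this result file.
rows = {
    "c-ray: Total Time": (21.30, 44.42, False),      # Seconds, Fewer Is Better
    "scimark2: Composite": (695.70, 1115.00, True),  # Mflops, More Is Better
}

ratios = {
    name: relative_to_baseline(clang, gcc, higher_is_better)
    for name, (gcc, clang, higher_is_better) in rows.items()
}

for name, ratio in ratios.items():
    print(f"{name}: Clang 3.9.0 at {ratio:.2f}x relative to GCC 5.4.0")
```

With this convention a ratio below 1.0 means Clang 3.9.0 trails GCC 5.4.0 on that test, whichever direction the unit runs.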
Bullet Physics Engine 2.81 - Test: Raytests (Seconds, Fewer Is Better)
  Clang 3.8.0: 6.20 (SE +/- 0.01, N = 3)
  Clang 3.9.0: 6.21 (SE +/- 0.05, N = 3)
  GCC 5.4.0: 5.94 (SE +/- 0.00, N = 3)
  GCC 6.2.0: 5.98 (SE +/- 0.01, N = 3)
  GCC 7.0.0 20160904: 5.96 (SE +/- 0.02, N = 3)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU
Bullet Physics Engine 2.81 - Test: 3000 Fall (Seconds, Fewer Is Better)
  Clang 3.8.0: 9.57 (SE +/- 0.02, N = 3)
  Clang 3.9.0: 9.83 (SE +/- 0.02, N = 3)
  GCC 5.4.0: 9.13 (SE +/- 0.01, N = 3)
  GCC 6.2.0: 9.11 (SE +/- 0.02, N = 3)
  GCC 7.0.0 20160904: 9.01 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU
Bullet Physics Engine 2.81 - Test: 136 Ragdolls (Seconds, Fewer Is Better)
  Clang 3.8.0: 6.89 (SE +/- 0.01, N = 3)
  Clang 3.9.0: 6.92 (SE +/- 0.01, N = 3)
  GCC 5.4.0: 6.45 (SE +/- 0.00, N = 3)
  GCC 6.2.0: 6.46 (SE +/- 0.00, N = 3)
  GCC 7.0.0 20160904: 6.40 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU
SciMark 2.0 - Computational Test: Composite (Mflops, More Is Better)
  Clang 3.8.0: 1111.78 (SE +/- 0.09, N = 4)
  Clang 3.9.0: 1115.00 (SE +/- 0.69, N = 4)
  GCC 5.4.0: 695.70 (SE +/- 2.13, N = 4)
  GCC 6.2.0: 738.21 (SE +/- 0.14, N = 4)
  GCC 7.0.0 20160904: 800.64 (SE +/- 0.09, N = 4)
  1. (CXX) g++ options: -O3 -march=native
SciMark 2.0 - Computational Test: Monte Carlo (Mflops, More Is Better)
  Clang 3.8.0: 126.46 (SE +/- 0.02, N = 4)
  Clang 3.9.0: 126.43 (SE +/- 0.15, N = 4)
  GCC 5.4.0: 304.10 (SE +/- 0.00, N = 4)
  GCC 6.2.0: 304.07 (SE +/- 0.00, N = 4)
  GCC 7.0.0 20160904: 304.03 (SE +/- 0.00, N = 4)
  1. (CXX) g++ options: -O3 -march=native
SciMark 2.0 - Computational Test: Fast Fourier Transform (Mflops, More Is Better)
  Clang 3.8.0: 223.71 (SE +/- 0.36, N = 4)
  Clang 3.9.0: 226.01 (SE +/- 1.03, N = 4)
  GCC 5.4.0: 228.18 (SE +/- 0.13, N = 4)
  GCC 6.2.0: 232.19 (SE +/- 0.38, N = 4)
  GCC 7.0.0 20160904: 239.27 (SE +/- 0.16, N = 4)
  1. (CXX) g++ options: -O3 -march=native
SciMark 2.0 - Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)
  Clang 3.8.0: 1414.48 (SE +/- 0.32, N = 4)
  Clang 3.9.0: 1448.22 (SE +/- 2.31, N = 4)
  GCC 5.4.0: 1047.40 (SE +/- 2.22, N = 4)
  GCC 6.2.0: 1280.02 (SE +/- 0.17, N = 4)
  GCC 7.0.0 20160904: 1290.94 (SE +/- 0.35, N = 4)
  1. (CXX) g++ options: -O3 -march=native
SciMark 2.0 - Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
  Clang 3.8.0: 2936.85 (SE +/- 0.55, N = 4)
  Clang 3.9.0: 2917.96 (SE +/- 1.74, N = 4)
  GCC 5.4.0: 1327.15 (SE +/- 9.92, N = 4)
  GCC 6.2.0: 1302.74 (SE +/- 0.23, N = 4)
  GCC 7.0.0 20160904: 1596.90 (SE +/- 0.14, N = 4)
  1. (CXX) g++ options: -O3 -march=native
SciMark 2.0 - Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
  Clang 3.8.0: 857.42 (SE +/- 0.01, N = 4)
  Clang 3.9.0: 856.35 (SE +/- 0.04, N = 4)
  GCC 5.4.0: 571.65 (SE +/- 0.41, N = 4)
  GCC 6.2.0: 572.05 (SE +/- 0.00, N = 4)
  GCC 7.0.0 20160904: 572.06 (SE +/- 0.01, N = 4)
  1. (CXX) g++ options: -O3 -march=native
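The SciMark Composite figures in this result file match the arithmetic mean of the five kernel scores (Monte Carlo, Fast Fourier Transform, Sparse Matrix Multiply, Dense LU Matrix Factorization, Jacobi Successive Over-Relaxation), which is how SciMark 2.0 defines its composite. The following small Python check uses the reported values; the dictionary layout and variable names are illustrative only.

```python
# Kernel scores in Mflops, ordered: Monte Carlo, FFT, Sparse, Dense LU, Jacobi SOR.
kernels = {
    "Clang 3.8.0": [126.46, 223.71, 1414.48, 2936.85, 857.42],
    "Clang 3.9.0": [126.43, 226.01, 1448.22, 2917.96, 856.35],
    "GCC 5.4.0": [304.10, 228.18, 1047.40, 1327.15, 571.65],
    "GCC 6.2.0": [304.07, 232.19, 1280.02, 1302.74, 572.05],
    "GCC 7.0.0 20160904": [304.03, 239.27, 1290.94, 1596.90, 572.06],
}

# Composite scores as reported in this result file.
reported = {
    "Clang 3.8.0": 1111.78,
    "Clang 3.9.0": 1115.00,
    "GCC 5.4.0": 695.70,
    "GCC 6.2.0": 738.21,
    "GCC 7.0.0 20160904": 800.64,
}

# Arithmetic mean of the five kernels for each compiler.
composite = {name: sum(scores) / len(scores) for name, scores in kernels.items()}

for name in kernels:
    print(f"{name}: mean of kernels {composite[name]:.2f}, reported {reported[name]:.2f}")
```

Because the composite is a plain average, one large kernel such as Dense LU can dominate it. Clang's composite lead sits alongside a Monte Carlo score well below GCC's.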
Hierarchical INTegration 1.0 - Test: FLOAT (QUIPs, More Is Better)
  Clang 3.8.0: 141990405.69 (SE +/- 15968.48, N = 3)
  Clang 3.9.0: 143252876.20 (SE +/- 131098.46, N = 3)
  GCC 5.4.0: 177738053.32 (SE +/- 355658.03, N = 3)
  GCC 6.2.0: 166219905.04 (SE +/- 165686.10, N = 3)
  GCC 7.0.0 20160904: 165733740.33 (SE +/- 291268.73, N = 3)
  1. (CC) gcc options: -O3 -march=native -lm
FLAC Audio Encoding 1.3.1 - WAV To FLAC (Seconds, Fewer Is Better)
  Clang 3.8.0: 13.26 (SE +/- 0.01, N = 5)
  Clang 3.9.0: 13.15 (SE +/- 0.01, N = 5)
  GCC 5.4.0: 13.45 (SE +/- 0.01, N = 5)
  GCC 6.2.0: 13.44 (SE +/- 0.01, N = 5)
  GCC 7.0.0 20160904: 13.45 (SE +/- 0.01, N = 5)
  Additional flags listed: -fvisibility=hidden -fvisibility=hidden -fvisibility=hidden
  1. (CXX) g++ options: -O3 -march=native -lm
LAME MP3 Encoding 3.99.3 - WAV To MP3 (Seconds, Fewer Is Better)
  Clang 3.8.0: 26.94 (SE +/- 0.04, N = 5)
  Clang 3.9.0: 27.46 (SE +/- 0.04, N = 5)
  GCC 5.4.0: 22.84 (SE +/- 0.01, N = 5)
  GCC 6.2.0: 22.42 (SE +/- 0.03, N = 5)
  GCC 7.0.0 20160904: 22.03 (SE +/- 0.01, N = 5)
  1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -march=native -lm
FFTW 3.3.4 - Build: Float + SSE - Size: 1D FFT Size 4096 (Mflops, More Is Better)
  Clang 3.8.0: 10357 (SE +/- 40.65, N = 5)
  Clang 3.9.0: 10331 (SE +/- 39.29, N = 5)
  GCC 5.4.0: 10966 (SE +/- 68.86, N = 5)
  GCC 6.2.0: 10936 (SE +/- 44.65, N = 5)
  GCC 7.0.0 20160904: 10951 (SE +/- 50.75, N = 5)
  1. (CC) gcc options: -O3 -march=native -lm
FFTW 3.3.4 - Build: Float + SSE - Size: 2D FFT Size 4096 (Mflops, More Is Better)
  Clang 3.8.0: 6841.34 (SE +/- 36.82, N = 5)
  Clang 3.9.0: 6923.64 (SE +/- 16.42, N = 5)
  GCC 5.4.0: 7410.50 (SE +/- 43.96, N = 5)
  GCC 6.2.0: 7515.98 (SE +/- 22.70, N = 5)
  GCC 7.0.0 20160904: 7517.94 (SE +/- 15.43, N = 5)
  1. (CC) gcc options: -O3 -march=native -lm
Timed HMMer Search 2.3.2 - Pfam Database Search (Seconds, Fewer Is Better)
  Clang 3.8.0: 12.37 (SE +/- 0.02, N = 3)
  Clang 3.9.0: 13.76 (SE +/- 0.04, N = 3)
  GCC 5.4.0: 12.18 (SE +/- 0.03, N = 3)
  GCC 6.2.0: 12.18 (SE +/- 0.01, N = 3)
  GCC 7.0.0 20160904: 12.22 (SE +/- 0.02, N = 3)
  1. (CC) gcc options: -O3 -march=native -pthread -lhmmer -lsquid -lm
Timed MAFFT Alignment 6.864 - Multiple Sequence Alignment (Seconds, Fewer Is Better)
  Clang 3.8.0: 7.74 (SE +/- 0.14, N = 6)
  Clang 3.9.0: 7.27 (SE +/- 0.17, N = 6)
  GCC 5.4.0: 5.99 (SE +/- 0.03, N = 3)
  GCC 6.2.0: 6.41 (SE +/- 0.19, N = 6)
  GCC 7.0.0 20160904: 6.45 (SE +/- 0.10, N = 6)
  1. (CC) gcc options: -O3 -lm -lpthread
Himeno Benchmark 3.0 - Poisson Pressure Solver (MFLOPS, More Is Better)
  Clang 3.8.0: 748.09 (SE +/- 0.07, N = 3)
  Clang 3.9.0: 841.55 (SE +/- 1.07, N = 3)
  GCC 5.4.0: 893.82 (SE +/- 0.49, N = 3)
  GCC 6.2.0: 1091.32 (SE +/- 1.34, N = 3)
  GCC 7.0.0 20160904: 1095.90 (SE +/- 0.65, N = 3)
  1. (CC) gcc options: -O3 -march=native -mavx2
LAMMPS Molecular Dynamics Simulator 1.0 - Test: Rhodopsin Protein (Loop Time, Fewer Is Better)
  Clang 3.8.0: 63.22 (SE +/- 0.14, N = 3)
  Clang 3.9.0: 62.30 (SE +/- 0.09, N = 3)
  GCC 5.4.0: 72.00 (SE +/- 0.12, N = 3)
  1. (CXX) g++ options: -lfftw -lmpich
N-Queens 1.0 - Elapsed Time (Seconds, Fewer Is Better)
  GCC 5.4.0: 57.53 (SE +/- 0.01, N = 3)
  GCC 6.2.0: 56.21 (SE +/- 0.01, N = 3)
  GCC 7.0.0 20160904: 52.54 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -static -fopenmp -O3 -march=native
Timed ImageMagick Compilation 6.9.0 - Time To Compile (Seconds, Fewer Is Better)
  Clang 3.8.0: 74.34 (SE +/- 0.05, N = 3)
  Clang 3.9.0: 78.45 (SE +/- 0.22, N = 3)
  GCC 5.4.0: 67.21 (SE +/- 0.01, N = 3)
  GCC 6.2.0: 94.51 (SE +/- 0.17, N = 3)
  GCC 7.0.0 20160904: 81.27 (SE +/- 0.14, N = 3)
Timed PHP Compilation 5.2.9 - Time To Compile (Seconds, Fewer Is Better)
  Clang 3.8.0: 28.45 (SE +/- 0.17, N = 3)
  Clang 3.9.0: 35.25 (SE +/- 0.40, N = 3)
  GCC 5.4.0: 35.31 (SE +/- 0.06, N = 3)
  GCC 6.2.0: 36.25 (SE +/- 0.09, N = 3)
  GCC 7.0.0 20160904: 36.52 (SE +/- 0.05, N = 3)
  1. (CC) gcc options: -O3 -march=native -pedantic -ldl -lz -lm
C-Ray 1.1 - Total Time (Seconds, Fewer Is Better)
  Clang 3.8.0: 39.34 (SE +/- 0.08, N = 3)
  Clang 3.9.0: 44.42 (SE +/- 0.01, N = 3)
  GCC 5.4.0: 21.30 (SE +/- 0.01, N = 3)
  GCC 6.2.0: 21.25 (SE +/- 0.01, N = 3)
  GCC 7.0.0 20160904: 23.36 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -lm -lpthread -O3 -march=native
Smallpt 1.0 - Global Illumination Renderer; 100 Samples (Seconds, Fewer Is Better)
  GCC 5.4.0: 47 (SE +/- 0.33, N = 3)
  GCC 6.2.0: 47 (SE +/- 0.88, N = 3)
  GCC 7.0.0 20160904: 30 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -march=native
Apache Benchmark 2.4.7 - Static Web Page Serving (Requests Per Second, More Is Better)
  Clang 3.8.0: 29812.18 (SE +/- 52.16, N = 3)
  Clang 3.9.0: 27344.42 (SE +/- 518.63, N = 3)
  GCC 5.4.0: 27564.72 (SE +/- 80.36, N = 3)
  GCC 6.2.0: 28383.22 (SE +/- 236.47, N = 3)
  GCC 7.0.0 20160904: 29512.13 (SE +/- 81.85, N = 3)
  1. (CC) gcc options: -shared -fPIC -pthread -O3 -march=native
OpenSSL 1.0.1g - RSA 4096-bit Performance (Signs Per Second, More Is Better)
  Clang 3.8.0: 566.80 (SE +/- 1.57, N = 3)
  Clang 3.9.0: 511.07 (SE +/- 2.68, N = 3)
  GCC 5.4.0: 569.37 (SE +/- 0.58, N = 3)
  GCC 6.2.0: 569.27 (SE +/- 1.31, N = 3)
  GCC 7.0.0 20160904: 570.20 (SE +/- 0.25, N = 3)
  1. (CC) gcc options: -m64 -O3 -lssl -lcrypto -ldl
Redis 3.0.1 - Test: GET (Requests Per Second, More Is Better)
  Clang 3.8.0: 1342892.96 (SE +/- 2621.93, N = 3)
  Clang 3.9.0: 1348061.04 (SE +/- 15364.63, N = 3)
  GCC 5.4.0: 1434042.92 (SE +/- 2474.75, N = 3)
  GCC 6.2.0: 1408567.41 (SE +/- 9042.38, N = 3)
  GCC 7.0.0 20160904: 1393442.96 (SE +/- 5158.54, N = 3)
  Additional flags listed: -std=gnu99 -pipe -g3 -O3 -funroll-loops -march=native
  1. (CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl
Redis 3.0.1 - Test: SET (Requests Per Second, More Is Better)
  Clang 3.8.0: 925939.11 (SE +/- 2474.98, N = 3)
  Clang 3.9.0: 933433.87 (SE +/- 2866.48, N = 3)
  GCC 5.4.0: 969029.10 (SE +/- 4216.02, N = 3)
  GCC 6.2.0: 972159.31 (SE +/- 3624.82, N = 3)
  GCC 7.0.0 20160904: 965269.60 (SE +/- 2999.32, N = 3)
  Additional flags listed: -std=gnu99 -pipe -g3 -O3 -funroll-loops -march=native
  1. (CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl
PostgreSQL pgbench 9.4.3 - Scaling: Buffer Test - Test: Normal Load - Mode: Read Write (TPS, More Is Better)
  Clang 3.8.0: 4411.72 (SE +/- 78.63, N = 3)
  Clang 3.9.0: 4356.38 (SE +/- 68.11, N = 3)
  GCC 5.4.0: 4468.82 (SE +/- 33.84, N = 3)
  GCC 6.2.0: 4411.75 (SE +/- 67.37, N = 5)
  GCC 7.0.0 20160904: 4437.25 (SE +/- 85.01, N = 3)
  Additional flags listed: -pthreads -mthreads -pthreads -mthreads
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -pthread -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
PostgreSQL pgbench 9.4.3 - Scaling: Buffer Test - Test: Single Thread - Mode: Read Write (TPS, More Is Better)
  Clang 3.8.0: 538.44 (SE +/- 14.62, N = 6)
  Clang 3.9.0: 531.79 (SE +/- 1.42, N = 3)
  GCC 5.4.0: 552.55 (SE +/- 5.89, N = 3)
  GCC 6.2.0: 546.21 (SE +/- 1.53, N = 3)
  GCC 7.0.0 20160904: 549.49 (SE +/- 0.66, N = 3)
  Additional flags listed: -pthreads -mthreads -pthreads -mthreads
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -pthread -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
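The SE figures matter most where the gaps between compilers are small, as in the pgbench results. The following is a rough Python sketch for judging whether a gap stands out from run-to-run noise. It assumes the two means are independent, so their standard errors combine in quadrature, and that the SE values in this export are listed in the same order as the results. It is a quick heuristic rather than a formal significance test, and `gap_in_standard_errors` is a hypothetical helper name.

```python
import math

def gap_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means, in units of their combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se

# pgbench Normal Load, Read Write (TPS): GCC 5.4.0 vs. Clang 3.9.0.
normal_load = gap_in_standard_errors(4468.82, 33.84, 4356.38, 68.11)

# SciMark Dense LU (Mflops): GCC 5.4.0 vs. Clang 3.8.0, shown for contrast.
dense_lu = gap_in_standard_errors(1327.15, 9.92, 2936.85, 0.55)

print(f"pgbench Normal Load gap: {normal_load:.1f} combined standard errors")
print(f"SciMark Dense LU gap:    {dense_lu:.1f} combined standard errors")
```

A gap of only one or two combined standard errors, as in the pgbench case, is hard to separate from noise at N = 3. The Dense LU gap is far outside that range.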
ebizzy 0.3 - Phoronix Test Suite v6.6.0 (Records/s, More Is Better)
  Clang 3.8.0: 152747 (SE +/- 2308.58, N = 4)
  Clang 3.9.0: 138671 (SE +/- 392.88, N = 3)
  GCC 5.4.0: 155360 (SE +/- 87.83, N = 3)
  GCC 6.2.0: 155273 (SE +/- 68.34, N = 3)
  GCC 7.0.0 20160904: 155692 (SE +/- 360.05, N = 3)
  1. (CC) gcc options: -pthread -lpthread -O3 -march=native
Fhourstones 3.1 - Complex Connect-4 Solving (Kpos / sec, More Is Better)
  Clang 3.8.0: 7164.37 (SE +/- 0.87, N = 3)
  Clang 3.9.0: 6563.47 (SE +/- 2.59, N = 3)
  GCC 5.4.0: 6954.67 (SE +/- 5.88, N = 3)
  GCC 6.2.0: 6884.93 (SE +/- 11.54, N = 3)
  GCC 7.0.0 20160904: 6905.30 (SE +/- 10.00, N = 3)
  1. (CC) gcc options: -O3
Phoronix Test Suite v10.8.5