GCC vs. LLVM Clang 3.8 3.9 Compiler Benchmarking

Benchmarks by Michael Larabel for a future Phoronix article comparing early GCC 7 compiler performance against GCC 6 and GCC 5, as well as LLVM Clang 3.8 and Clang 3.9.

HTML result view exported from: https://openbenchmarking.org/result/1609139-LO-GCCCLANG151&sro&gru.

Configurations: GCC 5.4.0, GCC 6.2.0, GCC 7.0.0 20160904, Clang 3.8.0, Clang 3.9.0

Processor:          Intel Xeon E5-2609 v4 @ 1.70GHz (8 Cores)
Motherboard:        MSI X99A WORKSTATION (MS-7A54) v1.0
Chipset:            Intel Xeon E7 v4/Xeon
Memory:             16384MB
Disk:               3 x 120GB TOSHIBA-TR150
Graphics:           LLVMpipe
Audio:              Realtek ALC1150
Network:            Intel Connection
OS:                 Ubuntu 16.04
Kernel:             4.8.0-999-generic (x86_64) 20160908
Desktop:            Unity 7.4.0
Display Server:     X Server 1.18.3
Display Driver:     modesetting 1.18.3
OpenGL:             3.3 Mesa 11.2.0 Gallium 0.4
File-System:        ext4
Screen Resolution:  1024x768
Compiler (one per configuration):
  GCC 5.4.0
  GCC 6.2.0
  GCC 7.0.0 20160904
  Clang 3.8.0-2ubuntu4
  Clang 3.9.0-svn279689-1~exp1

Environment Details
- LIBGL_ALWAYS_SOFTWARE=1

Compiler Details
- GCC 5.4.0, GCC 6.2.0, GCC 7.0.0 20160904: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran

Disk Details
- CFQ / data=ordered,errors=remount-ro,relatime,rw

Processor Details
- Scaling Governor: intel_pstate powersave

Test                                                GCC 5.4.0     GCC 6.2.0     GCC 7.0.0 20160904  Clang 3.8.0   Clang 3.9.0
fhourstones: Complex Connect-4 Solving              6954.67       6884.93       6905.30             7164.37       6563.47
fftw: Float + SSE - 1D FFT Size 4096                10966         10936         10951               10357         10331
fftw: Float + SSE - 2D FFT Size 4096                7410.50       7515.98       7517.94             6841.34       6923.64
scimark2: Composite                                 695.70        738.21        800.64              1111.78       1115.00
scimark2: Monte Carlo                               304.10        304.07        304.03              126.46        126.43
scimark2: Fast Fourier Transform                    228.18        232.19        239.27              223.71        226.01
scimark2: Sparse Matrix Multiply                    1047.40       1280.02       1290.94             1414.48       1448.22
scimark2: Dense LU Matrix Factorization             1327.15       1302.74       1596.90             2936.85       2917.96
scimark2: Jacobi Successive Over-Relaxation         571.65        572.05        572.06              857.42        856.35
himeno: Poisson Pressure Solver                     893.82        1091.32       1095.90             748.09        841.55
hint: FLOAT                                         177738053.32  166219905.04  165733740.33        141990405.69  143252876.20
ebizzy: Phoronix Test Suite v6.6.0                  155360        155273        155692              152747        138671
redis: GET                                          1434042.92    1408567.41    1393442.96          1342892.96    1348061.04
redis: SET                                          969029.10     972159.31     965269.60           925939.11     933433.87
apache: Static Web Page Serving                     27564.72      28383.22      29512.13            29812.18      27344.42
openssl: RSA 4096-bit Performance                   569.37        569.27        570.20              566.80        511.07
pgbench: Buffer Test - Normal Load - Read Write     4468.82       4411.75       4437.25             4411.72       4356.38
pgbench: Buffer Test - Single Thread - Read Write   552.55        546.21        549.49              538.44        531.79
lammps: Rhodopsin Protein                           72.00         -             -                   63.22         62.30
hmmer: Pfam Database Search                         12.18         12.18         12.22               12.37         13.76
mafft: Multiple Sequence Alignment                  5.99          6.41          6.45                7.74          7.27
build-imagemagick: Time To Compile                  67.21         94.51         81.27               74.34         78.45
build-php: Time To Compile                          35.31         36.25         36.52               28.45         35.25
c-ray: Total Time                                   21.30         21.25         23.36               39.34         44.42
smallpt: Global Illumination Renderer; 100 Samples  47            47            30                  -             -
bullet: Raytests                                    5.94          5.98          5.96                6.20          6.21
bullet: 3000 Fall                                   9.13          9.11          9.01                9.57          9.83
bullet: 136 Ragdolls                                6.45          6.46          6.40                6.89          6.92
encode-flac: WAV To FLAC                            13.45         13.44         13.45               13.26         13.15
encode-mp3: WAV To MP3                              22.84         22.42         22.03               26.94         27.46
n-queens: Elapsed Time                              57.53         56.21         52.54               -             -

A dash marks a configuration with no result for that test.
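Because the overview mixes "More Is Better" and "Fewer Is Better" tests, raw numbers cannot be compared across rows directly. The following is a minimal Python sketch, not part of the original export, that normalizes a handful of rows against the GCC 5.4.0 result. The four values per compiler are copied from the overview; the function name `speedup` and the choice of GCC 5.4.0 as the baseline are illustrative assumptions.

```python
# A few rows copied from the overview:
# (GCC 5.4.0 result, GCC 7.0.0 20160904 result, higher_is_better)
RESULTS = {
    "fhourstones (Kpos/sec)": (6954.67, 6905.30, True),
    "scimark2 Composite (Mflops)": (695.70, 800.64, True),
    "c-ray (seconds)": (21.30, 23.36, False),
    "encode-mp3 (seconds)": (22.84, 22.03, False),
}


def speedup(baseline, candidate, higher_is_better):
    """Return candidate performance relative to baseline (> 1.0 means faster)."""
    if higher_is_better:
        return candidate / baseline
    # For time-based tests a smaller value is better, so invert the ratio.
    return baseline / candidate


if __name__ == "__main__":
    for name, (base, cand, higher) in RESULTS.items():
        print(f"{name}: GCC 7 is {speedup(base, cand, higher):.3f}x GCC 5.4.0")
```

A ratio above 1.0 means the GCC 7 snapshot was faster than GCC 5.4.0 on that test; a ratio below 1.0 means it was slower.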

Fhourstones

Complex Connect-4 Solving

Fhourstones 3.1 (Kpos / sec, More Is Better)
Clang 3.8.0:         7164.37  (SE +/- 0.87, N = 3)
Clang 3.9.0:         6563.47  (SE +/- 2.59, N = 3)
GCC 5.4.0:           6954.67  (SE +/- 5.88, N = 3)
GCC 6.2.0:           6884.93  (SE +/- 11.54, N = 3)
GCC 7.0.0 20160904:  6905.30  (SE +/- 10.00, N = 3)
(CC) gcc options: -O3
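Each result carries an "SE +/- x, N = n" annotation: a standard error over n runs. The export does not include the per-run values or show how the Phoronix Test Suite computes the figure, so the Python sketch below assumes the conventional definition (sample standard deviation divided by the square root of N). The run values are made up purely for illustration.

```python
import math
import statistics


def standard_error(runs):
    """Sample standard deviation divided by sqrt(N) (conventional definition)."""
    n = len(runs)
    if n < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(runs) / math.sqrt(n)


if __name__ == "__main__":
    # Hypothetical run values; they are NOT taken from the result file.
    runs = [100.0, 102.0, 104.0]
    print(
        f"mean = {statistics.mean(runs):.2f}, "
        f"SE +/- {standard_error(runs):.2f}, N = {len(runs)}"
    )
```

A small SE relative to the mean suggests the runs were consistent; differences between compilers that are no larger than the reported SE values should be read with caution.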

FFTW

Build: Float + SSE - Size: 1D FFT Size 4096

FFTW 3.3.4 (Mflops, More Is Better)
Clang 3.8.0:         10357  (SE +/- 40.65, N = 5)
Clang 3.9.0:         10331  (SE +/- 39.29, N = 5)
GCC 5.4.0:           10966  (SE +/- 68.86, N = 5)
GCC 6.2.0:           10936  (SE +/- 44.65, N = 5)
GCC 7.0.0 20160904:  10951  (SE +/- 50.75, N = 5)
(CC) gcc options: -O3 -march=native -lm

FFTW

Build: Float + SSE - Size: 2D FFT Size 4096

FFTW 3.3.4 (Mflops, More Is Better)
Clang 3.8.0:         6841.34  (SE +/- 36.82, N = 5)
Clang 3.9.0:         6923.64  (SE +/- 16.42, N = 5)
GCC 5.4.0:           7410.50  (SE +/- 43.96, N = 5)
GCC 6.2.0:           7515.98  (SE +/- 22.70, N = 5)
GCC 7.0.0 20160904:  7517.94  (SE +/- 15.43, N = 5)
(CC) gcc options: -O3 -march=native -lm

SciMark

Computational Test: Composite

SciMark 2.0 (Mflops, More Is Better)
Clang 3.8.0:         1111.78  (SE +/- 0.09, N = 4)
Clang 3.9.0:         1115.00  (SE +/- 0.69, N = 4)
GCC 5.4.0:           695.70  (SE +/- 2.13, N = 4)
GCC 6.2.0:           738.21  (SE +/- 0.14, N = 4)
GCC 7.0.0 20160904:  800.64  (SE +/- 0.09, N = 4)
(CXX) g++ options: -O3 -march=native

SciMark

Computational Test: Monte Carlo

SciMark 2.0 (Mflops, More Is Better)
Clang 3.8.0:         126.46  (SE +/- 0.02, N = 4)
Clang 3.9.0:         126.43  (SE +/- 0.15, N = 4)
GCC 5.4.0:           304.10  (SE +/- 0.00, N = 4)
GCC 6.2.0:           304.07  (SE +/- 0.00, N = 4)
GCC 7.0.0 20160904:  304.03  (SE +/- 0.00, N = 4)
(CXX) g++ options: -O3 -march=native

SciMark

Computational Test: Fast Fourier Transform

SciMark 2.0 (Mflops, More Is Better)
Clang 3.8.0:         223.71  (SE +/- 0.36, N = 4)
Clang 3.9.0:         226.01  (SE +/- 1.03, N = 4)
GCC 5.4.0:           228.18  (SE +/- 0.13, N = 4)
GCC 6.2.0:           232.19  (SE +/- 0.38, N = 4)
GCC 7.0.0 20160904:  239.27  (SE +/- 0.16, N = 4)
(CXX) g++ options: -O3 -march=native

SciMark

Computational Test: Sparse Matrix Multiply

SciMark 2.0 (Mflops, More Is Better)
Clang 3.8.0:         1414.48  (SE +/- 0.32, N = 4)
Clang 3.9.0:         1448.22  (SE +/- 2.31, N = 4)
GCC 5.4.0:           1047.40  (SE +/- 2.22, N = 4)
GCC 6.2.0:           1280.02  (SE +/- 0.17, N = 4)
GCC 7.0.0 20160904:  1290.94  (SE +/- 0.35, N = 4)
(CXX) g++ options: -O3 -march=native

SciMark

Computational Test: Dense LU Matrix Factorization

SciMark 2.0 (Mflops, More Is Better)
Clang 3.8.0:         2936.85  (SE +/- 0.55, N = 4)
Clang 3.9.0:         2917.96  (SE +/- 1.74, N = 4)
GCC 5.4.0:           1327.15  (SE +/- 9.92, N = 4)
GCC 6.2.0:           1302.74  (SE +/- 0.23, N = 4)
GCC 7.0.0 20160904:  1596.90  (SE +/- 0.14, N = 4)
(CXX) g++ options: -O3 -march=native

SciMark

Computational Test: Jacobi Successive Over-Relaxation

SciMark 2.0 (Mflops, More Is Better)
Clang 3.8.0:         857.42  (SE +/- 0.01, N = 4)
Clang 3.9.0:         856.35  (SE +/- 0.04, N = 4)
GCC 5.4.0:           571.65  (SE +/- 0.41, N = 4)
GCC 6.2.0:           572.05  (SE +/- 0.00, N = 4)
GCC 7.0.0 20160904:  572.06  (SE +/- 0.01, N = 4)
(CXX) g++ options: -O3 -march=native

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0 (MFLOPS, More Is Better)
Clang 3.8.0:         748.09  (SE +/- 0.07, N = 3)
Clang 3.9.0:         841.55  (SE +/- 1.07, N = 3)
GCC 5.4.0:           893.82  (SE +/- 0.49, N = 3)
GCC 6.2.0:           1091.32  (SE +/- 1.34, N = 3)
GCC 7.0.0 20160904:  1095.90  (SE +/- 0.65, N = 3)
(CC) gcc options: -O3 -march=native -mavx2

Hierarchical INTegration

Test: FLOAT

Hierarchical INTegration 1.0 (QUIPs, More Is Better)
Clang 3.8.0:         141990405.69  (SE +/- 15968.48, N = 3)
Clang 3.9.0:         143252876.20  (SE +/- 131098.46, N = 3)
GCC 5.4.0:           177738053.32  (SE +/- 355658.03, N = 3)
GCC 6.2.0:           166219905.04  (SE +/- 165686.10, N = 3)
GCC 7.0.0 20160904:  165733740.33  (SE +/- 291268.73, N = 3)
(CC) gcc options: -O3 -march=native -lm

ebizzy

Phoronix Test Suite v6.6.0

ebizzy 0.3 (Records/s, More Is Better)
Clang 3.8.0:         152747  (SE +/- 2308.58, N = 4)
Clang 3.9.0:         138671  (SE +/- 392.88, N = 3)
GCC 5.4.0:           155360  (SE +/- 87.83, N = 3)
GCC 6.2.0:           155273  (SE +/- 68.34, N = 3)
GCC 7.0.0 20160904:  155692  (SE +/- 360.05, N = 3)
(CC) gcc options: -pthread -lpthread -O3 -march=native

Redis

Test: GET

Redis 3.0.1 (Requests Per Second, More Is Better)
Clang 3.8.0:         1342892.96  (SE +/- 2621.93, N = 3)
Clang 3.9.0:         1348061.04  (SE +/- 15364.63, N = 3)
GCC 5.4.0:           1434042.92  (SE +/- 2474.75, N = 3)
GCC 6.2.0:           1408567.41  (SE +/- 9042.38, N = 3)
GCC 7.0.0 20160904:  1393442.96  (SE +/- 5158.54, N = 3)
(CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl
Additional per-result options: -std=gnu99 -pipe -g3 -O3 -funroll-loops -march=native

Redis

Test: SET

Redis 3.0.1 (Requests Per Second, More Is Better)
Clang 3.8.0:         925939.11  (SE +/- 2474.98, N = 3)
Clang 3.9.0:         933433.87  (SE +/- 2866.48, N = 3)
GCC 5.4.0:           969029.10  (SE +/- 4216.02, N = 3)
GCC 6.2.0:           972159.31  (SE +/- 3624.82, N = 3)
GCC 7.0.0 20160904:  965269.60  (SE +/- 2999.32, N = 3)
(CC) gcc options: -ggdb -rdynamic -lm -pthread -ldl
Additional per-result options: -std=gnu99 -pipe -g3 -O3 -funroll-loops -march=native

Apache Benchmark

Static Web Page Serving

Apache Benchmark 2.4.7 (Requests Per Second, More Is Better)
Clang 3.8.0:         29812.18  (SE +/- 52.16, N = 3)
Clang 3.9.0:         27344.42  (SE +/- 518.63, N = 3)
GCC 5.4.0:           27564.72  (SE +/- 80.36, N = 3)
GCC 6.2.0:           28383.22  (SE +/- 236.47, N = 3)
GCC 7.0.0 20160904:  29512.13  (SE +/- 81.85, N = 3)
(CC) gcc options: -shared -fPIC -pthread -O3 -march=native

OpenSSL

RSA 4096-bit Performance

OpenSSL 1.0.1g (Signs Per Second, More Is Better)
Clang 3.8.0:         566.80  (SE +/- 1.57, N = 3)
Clang 3.9.0:         511.07  (SE +/- 2.68, N = 3)
GCC 5.4.0:           569.37  (SE +/- 0.58, N = 3)
GCC 6.2.0:           569.27  (SE +/- 1.31, N = 3)
GCC 7.0.0 20160904:  570.20  (SE +/- 0.25, N = 3)
(CC) gcc options: -m64 -O3 -lssl -lcrypto -ldl

PostgreSQL pgbench

Scaling: Buffer Test - Test: Normal Load - Mode: Read Write

PostgreSQL pgbench 9.4.3 (TPS, More Is Better)
Clang 3.8.0:         4411.72  (SE +/- 78.63, N = 3)
Clang 3.9.0:         4356.38  (SE +/- 68.11, N = 3)
GCC 5.4.0:           4468.82  (SE +/- 33.84, N = 3)
GCC 6.2.0:           4411.75  (SE +/- 67.37, N = 5)
GCC 7.0.0 20160904:  4437.25  (SE +/- 85.01, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -pthread -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
Additional per-result options (shown for two results): -pthreads -mthreads

PostgreSQL pgbench

Scaling: Buffer Test - Test: Single Thread - Mode: Read Write

PostgreSQL pgbench 9.4.3 (TPS, More Is Better)
Clang 3.8.0:         538.44  (SE +/- 14.62, N = 6)
Clang 3.9.0:         531.79  (SE +/- 1.42, N = 3)
GCC 5.4.0:           552.55  (SE +/- 5.89, N = 3)
GCC 6.2.0:           546.21  (SE +/- 1.53, N = 3)
GCC 7.0.0 20160904:  549.49  (SE +/- 0.66, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -pthread -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
Additional per-result options (shown for two results): -pthreads -mthreads

LAMMPS Molecular Dynamics Simulator

Test: Rhodopsin Protein

LAMMPS Molecular Dynamics Simulator 1.0 (Loop Time, Fewer Is Better)
Clang 3.8.0:         63.22  (SE +/- 0.14, N = 3)
Clang 3.9.0:         62.30  (SE +/- 0.09, N = 3)
GCC 5.4.0:           72.00  (SE +/- 0.12, N = 3)
(CXX) g++ options: -lfftw -lmpich

Timed HMMer Search

Pfam Database Search

Timed HMMer Search 2.3.2 (Seconds, Fewer Is Better)
Clang 3.8.0:         12.37  (SE +/- 0.02, N = 3)
Clang 3.9.0:         13.76  (SE +/- 0.04, N = 3)
GCC 5.4.0:           12.18  (SE +/- 0.03, N = 3)
GCC 6.2.0:           12.18  (SE +/- 0.01, N = 3)
GCC 7.0.0 20160904:  12.22  (SE +/- 0.02, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lhmmer -lsquid -lm

Timed MAFFT Alignment

Multiple Sequence Alignment

Timed MAFFT Alignment 6.864 (Seconds, Fewer Is Better)
Clang 3.8.0:         7.74  (SE +/- 0.14, N = 6)
Clang 3.9.0:         7.27  (SE +/- 0.17, N = 6)
GCC 5.4.0:           5.99  (SE +/- 0.03, N = 3)
GCC 6.2.0:           6.41  (SE +/- 0.19, N = 6)
GCC 7.0.0 20160904:  6.45  (SE +/- 0.10, N = 6)
(CC) gcc options: -O3 -lm -lpthread

Timed ImageMagick Compilation

Time To Compile

Timed ImageMagick Compilation 6.9.0 (Seconds, Fewer Is Better)
Clang 3.8.0:         74.34  (SE +/- 0.05, N = 3)
Clang 3.9.0:         78.45  (SE +/- 0.22, N = 3)
GCC 5.4.0:           67.21  (SE +/- 0.01, N = 3)
GCC 6.2.0:           94.51  (SE +/- 0.17, N = 3)
GCC 7.0.0 20160904:  81.27  (SE +/- 0.14, N = 3)

Timed PHP Compilation

Time To Compile

Timed PHP Compilation 5.2.9 (Seconds, Fewer Is Better)
Clang 3.8.0:         28.45  (SE +/- 0.17, N = 3)
Clang 3.9.0:         35.25  (SE +/- 0.40, N = 3)
GCC 5.4.0:           35.31  (SE +/- 0.06, N = 3)
GCC 6.2.0:           36.25  (SE +/- 0.09, N = 3)
GCC 7.0.0 20160904:  36.52  (SE +/- 0.05, N = 3)
(CC) gcc options: -O3 -march=native -pedantic -ldl -lz -lm

C-Ray

Total Time

C-Ray 1.1 (Seconds, Fewer Is Better)
Clang 3.8.0:         39.34  (SE +/- 0.08, N = 3)
Clang 3.9.0:         44.42  (SE +/- 0.01, N = 3)
GCC 5.4.0:           21.30  (SE +/- 0.01, N = 3)
GCC 6.2.0:           21.25  (SE +/- 0.01, N = 3)
GCC 7.0.0 20160904:  23.36  (SE +/- 0.01, N = 3)
(CC) gcc options: -lm -lpthread -O3 -march=native

Smallpt

Global Illumination Renderer; 100 Samples

Smallpt 1.0 (Seconds, Fewer Is Better)
GCC 5.4.0:           47  (SE +/- 0.33, N = 3)
GCC 6.2.0:           47  (SE +/- 0.88, N = 3)
GCC 7.0.0 20160904:  30  (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -march=native

Bullet Physics Engine

Test: Raytests

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)
Clang 3.8.0:         6.20  (SE +/- 0.01, N = 3)
Clang 3.9.0:         6.21  (SE +/- 0.05, N = 3)
GCC 5.4.0:           5.94  (SE +/- 0.00, N = 3)
GCC 6.2.0:           5.98  (SE +/- 0.01, N = 3)
GCC 7.0.0 20160904:  5.96  (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: 3000 Fall

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)
Clang 3.8.0:         9.57  (SE +/- 0.02, N = 3)
Clang 3.9.0:         9.83  (SE +/- 0.02, N = 3)
GCC 5.4.0:           9.13  (SE +/- 0.01, N = 3)
GCC 6.2.0:           9.11  (SE +/- 0.02, N = 3)
GCC 7.0.0 20160904:  9.01  (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

Bullet Physics Engine

Test: 136 Ragdolls

Bullet Physics Engine 2.81 (Seconds, Fewer Is Better)
Clang 3.8.0:         6.89  (SE +/- 0.01, N = 3)
Clang 3.9.0:         6.92  (SE +/- 0.01, N = 3)
GCC 5.4.0:           6.45  (SE +/- 0.00, N = 3)
GCC 6.2.0:           6.46  (SE +/- 0.00, N = 3)
GCC 7.0.0 20160904:  6.40  (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic -lglut -lGL -lGLU

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.1 (Seconds, Fewer Is Better)
Clang 3.8.0:         13.26  (SE +/- 0.01, N = 5)
Clang 3.9.0:         13.15  (SE +/- 0.01, N = 5)
GCC 5.4.0:           13.45  (SE +/- 0.01, N = 5)
GCC 6.2.0:           13.44  (SE +/- 0.01, N = 5)
GCC 7.0.0 20160904:  13.45  (SE +/- 0.01, N = 5)
(CXX) g++ options: -O3 -march=native -lm
Additional per-result options (shown for three results): -fvisibility=hidden

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.99.3 (Seconds, Fewer Is Better)
Clang 3.8.0:         26.94  (SE +/- 0.04, N = 5)
Clang 3.9.0:         27.46  (SE +/- 0.04, N = 5)
GCC 5.4.0:           22.84  (SE +/- 0.01, N = 5)
GCC 6.2.0:           22.42  (SE +/- 0.03, N = 5)
GCC 7.0.0 20160904:  22.03  (SE +/- 0.01, N = 5)
(CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -march=native -lm

N-Queens

Elapsed Time

N-Queens 1.0 (Seconds, Fewer Is Better)
GCC 5.4.0:           57.53  (SE +/- 0.01, N = 3)
GCC 6.2.0:           56.21  (SE +/- 0.01, N = 3)
GCC 7.0.0 20160904:  52.54  (SE +/- 0.01, N = 3)
(CC) gcc options: -static -fopenmp -O3 -march=native


Phoronix Test Suite v10.8.5