LLVM Clang 3.3 vs. GCC 4.8 - Intel Core-AVX2 Haswell

Intel Core i7-4770K testing with an Intel DH87RL motherboard, looking at GCC 4.7, GCC 4.8, LLVM Clang 3.2, and LLVM Clang 3.3 compiler performance with core-avx2 Haswell optimizations. These Intel Core i7 Haswell core-avx2 compiler benchmarks are for a future article on Phoronix by Michael Larabel.

HTML result view exported from: https://openbenchmarking.org/result/1306206-SO-LLVMCLANG08&grt&rdt.

Processor: Intel Core i7-4770K @ 3.50GHz (8 Cores)
Motherboard: Intel DH87RL
Chipset: Intel 4th Gen Core DRAM
Memory: 15360MB
Disk: 240GB OCZ VERTEX3
Graphics: Intel Haswell Desktop
Audio: Intel Haswell HDMI
Monitor: VA2431
Network: Intel Connection I217-V
OS: Ubuntu 13.10
Kernel: 3.10.0-999-generic (x86_64)
Desktop: Unity 7.0.0
Display Server: X Server 1.13.3
Display Driver: intel 2.21.9
OpenGL: 3.0 Mesa 9.1.3
File-System: ext4
Screen Resolution: 1920x1080

Compiler
- LLVM Clang 3.3: Clang 3.3 + LLVM 3.3
- GCC 4.8.1: GCC 4.8.1
- LLVM Clang 3.2: Clang 3.2 + LLVM 3.2svn
- GCC 4.7.3: GCC 4.7.3

Compiler Details
- LLVM Clang 3.3: Optimized build; Built Jun 20 2013 (12:21:18); Default target: x86_64-unknown-linux-gnu; Host CPU: x86-64
- GCC 4.8.1: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
- LLVM Clang 3.2: Optimized build; Built Jun 20 2013 (14:54:23); Default target: x86_64-unknown-linux-gnu; Host CPU: x86-64
- GCC 4.7.3: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran

Processor Details
- Scaling Governor: acpi-cpufreq ondemand

Result Overview

Test                                                LLVM Clang 3.3  GCC 4.8.1  LLVM Clang 3.2  GCC 4.7.3
apache: Static Web Page Serving                     25295.82        25786.15   25888.95        25743.99
blake2: Phoronix Test Suite v4.8.0m0                7.45            5.30       7.54            5.32
c-ray: Total Time                                   27.03           17.02      27.46           21.45
ffmpeg: H.264 HD To NTSC DV                         13.18           12.81      12.77           12.86
himeno: Poisson Pressure Solver                     1419.90         1593.09    1532.51         1663.97
encode-mp3: WAV To MP3                              14.45           -          12.74           -
primesieve: 1e12 Prime Number Generation            326.85          79.24      -               79.36
scimark2: Monte Carlo                               619.77          615.32     615.04          450.21
scimark2: Fast Fourier Transform                    237.86          245.00     246.79          251.67
scimark2: Sparse Matrix Multiply                    1263.29         1123.80    1234.19         1177.86
scimark2: Dense LU Matrix Factorization             1827.34         1773.35    1774.03         1861.55
scimark2: Jacobi Successive Over-Relaxation         1666.24         1164.63    1670.77         1163.52
smallpt: Global Illumination Renderer; 100 Samples  142             25         140             25
tachyon: Total Time                                 10.98           -          10.44           -
hmmer: Pfam Database Search                         10.94           10.30      11.92           10.41
build-imagemagick: Time To Compile                  34.35           78.61      31.94           93.67
mafft: Multiple Sequence Alignment                  6.19            5.17       5.96            5.27
build-php: Time To Compile                          21.03           33.30      19.59           32.92
tscp: AI Chess Performance                          624749          599455     624323          631626
x264: H.264 Video Encoding                          153.15          156.34     155.35          155.33

A dash (-) marks a test with no result for that compiler.
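
Because the tests use different units and directions (some are "More Is Better", others "Fewer Is Better"), one way to compare two compilers is to turn each pair of results into a ratio that always reads the same way. The following is a minimal Python sketch using a handful of values copied from the overview above. The names `relative_speedup`, `summarize`, and `RESULTS`, and the choice of tests, are illustrative assumptions and not part of the OpenBenchmarking.org export.

```python
# Hypothetical helper names; values are copied from the result overview
# (GCC 4.8.1 compared against LLVM Clang 3.3).

def relative_speedup(candidate, baseline, higher_is_better):
    """Return how the candidate compares with the baseline as a ratio.

    Above 1.0 means the candidate is ahead, below 1.0 means it is behind,
    regardless of whether the metric is "more is better" or "fewer is better".
    """
    if higher_is_better:
        return candidate / baseline
    return baseline / candidate

# (test name, GCC 4.8.1 result, LLVM Clang 3.3 result, higher_is_better)
RESULTS = [
    ("c-ray: Total Time", 17.02, 27.03, False),
    ("himeno: Poisson Pressure Solver", 1593.09, 1419.90, True),
    ("build-php: Time To Compile", 33.30, 21.03, False),
    ("x264: H.264 Video Encoding", 156.34, 153.15, True),
]

def summarize(results):
    """Map each test name to GCC 4.8.1's ratio over Clang 3.3 (2 decimals)."""
    return {
        name: round(relative_speedup(gcc, clang, higher), 2)
        for name, gcc, clang, higher in results
    }

if __name__ == "__main__":
    for name, ratio in summarize(RESULTS).items():
        print(f"{name}: GCC 4.8.1 at {ratio:.2f}x relative to LLVM Clang 3.3")
```

Ratios like these only describe this one system and set of compiler flags; they do not account for the run-to-run variation reported with each individual result.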

Apache Benchmark

Static Web Page Serving

Apache Benchmark 2.4.3, Static Web Page Serving: Requests Per Second, More Is Better
- LLVM Clang 3.3: 25295.82 (SE +/- 280.78, N = 3)
- GCC 4.8.1: 25786.15 (SE +/- 36.32, N = 3)
- LLVM Clang 3.2: 25888.95 (SE +/- 32.41, N = 3)
- GCC 4.7.3: 25743.99 (SE +/- 51.73, N = 3)
1. (CC) gcc options: -shared -fPIC -pthread -march=core-avx2 -O3
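
Each result is reported with a standard error (SE) and a run count (N), which helps judge whether a gap between two compilers is larger than the run-to-run noise. As a rough sketch only, the Python below expresses the gap between two means in units of their combined standard error, assuming the runs are independent. The function name `separation_in_se` is hypothetical, and this is a heuristic rather than a formal significance test, particularly with N as small as 3.

```python
import math

def separation_in_se(mean_a, se_a, mean_b, se_b):
    """Return |mean_a - mean_b| divided by the combined standard error.

    For two independent means, the standard error of their difference
    is sqrt(se_a**2 + se_b**2).
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    gap = abs(mean_a - mean_b)
    if combined_se == 0:
        # No measured variation at all: any nonzero gap is unbounded.
        return math.inf if gap > 0 else 0.0
    return gap / combined_se

# Apache requests per second from this result: LLVM Clang 3.3 vs. GCC 4.8.1.
apache_gap = separation_in_se(25295.82, 280.78, 25786.15, 36.32)

if __name__ == "__main__":
    print(f"Apache, Clang 3.3 vs. GCC 4.8.1: {apache_gap:.2f} combined SEs apart")
```

A small value here suggests that the Apache difference between these two compilers is not much larger than the measurement noise, mostly because of the comparatively large SE on the LLVM Clang 3.3 runs.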

BLAKE2

Phoronix Test Suite v4.8.0m0

BLAKE2 20121223, Phoronix Test Suite v4.8.0m0: Cycles Per Byte, Fewer Is Better
- LLVM Clang 3.3: 7.45 (SE +/- 0.13, N = 4)
- GCC 4.8.1: 5.30 (SE +/- 0.01, N = 3)
- LLVM Clang 3.2: 7.54 (SE +/- 0.11, N = 3)
- GCC 4.7.3: 5.32 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -std=gnu99 -O3 -march=native -lcrypto -lz

C-Ray

Total Time

C-Ray 1.1, Total Time: Seconds, Fewer Is Better
- LLVM Clang 3.3: 27.03 (SE +/- 0.00, N = 3)
- GCC 4.8.1: 17.02 (SE +/- 0.01, N = 3)
- LLVM Clang 3.2: 27.46 (SE +/- 0.02, N = 3)
- GCC 4.7.3: 21.45 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -lm -lpthread -O3 -march=core-avx2

FFmpeg

H.264 HD To NTSC DV

FFmpeg 1.1, H.264 HD To NTSC DV: Seconds, Fewer Is Better
- LLVM Clang 3.3: 13.18 (SE +/- 0.12, N = 3); extra options: -Qunused-arguments
- GCC 4.8.1: 12.81 (SE +/- 0.06, N = 3); extra options: -fno-tree-vectorize -MF -MT
- LLVM Clang 3.2: 12.77 (SE +/- 0.04, N = 3); extra options: -Qunused-arguments
- GCC 4.7.3: 12.86 (SE +/- 0.08, N = 3); extra options: -fno-tree-vectorize -MF -MT
1. (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -ldl -lasound -lSDL -lm -pthread -lbz2 -march=core-avx2 -O3 -std=c99 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -MMD

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0, Poisson Pressure Solver: MFLOPS, More Is Better
- LLVM Clang 3.3: 1419.90 (SE +/- 13.80, N = 3)
- GCC 4.8.1: 1593.09 (SE +/- 20.45, N = 3)
- LLVM Clang 3.2: 1532.51 (SE +/- 2.02, N = 3)
- GCC 4.7.3: 1663.97 (SE +/- 0.19, N = 3)
1. (CC) gcc options: -O3 -march=core-avx2

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.99.3, WAV To MP3: Seconds, Fewer Is Better
- LLVM Clang 3.3: 14.45 (SE +/- 0.01, N = 5)
- LLVM Clang 3.2: 12.74 (SE +/- 0.01, N = 5)
1. (CC) gcc options: -pipe -march=core-avx2 -O3 -lm

Primesieve

1e12 Prime Number Generation

Primesieve 4.2, 1e12 Prime Number Generation: Seconds, Fewer Is Better
- LLVM Clang 3.3: 326.85 (SE +/- 2.08, N = 3)
- GCC 4.8.1: 79.24 (SE +/- 0.08, N = 3)
- GCC 4.7.3: 79.36 (SE +/- 0.06, N = 3)
Extra per-run options: -fopenmp (listed for two of the three runs)
1. (CXX) g++ options: -O2

SciMark

Computational Test: Monte Carlo

SciMark 2.0, Computational Test: Monte Carlo: Mflops, More Is Better
- LLVM Clang 3.3: 619.77 (SE +/- 0.89, N = 4)
- GCC 4.8.1: 615.32 (SE +/- 0.00, N = 4)
- LLVM Clang 3.2: 615.04 (SE +/- 5.67, N = 4)
- GCC 4.7.3: 450.21 (SE +/- 0.55, N = 4)

SciMark

Computational Test: Fast Fourier Transform

SciMark 2.0, Computational Test: Fast Fourier Transform: Mflops, More Is Better
- LLVM Clang 3.3: 237.86 (SE +/- 0.98, N = 4)
- GCC 4.8.1: 245.00 (SE +/- 0.34, N = 4)
- LLVM Clang 3.2: 246.79 (SE +/- 1.60, N = 4)
- GCC 4.7.3: 251.67 (SE +/- 0.68, N = 4)

SciMark

Computational Test: Sparse Matrix Multiply

SciMark 2.0, Computational Test: Sparse Matrix Multiply: Mflops, More Is Better
- LLVM Clang 3.3: 1263.29 (SE +/- 5.09, N = 4)
- GCC 4.8.1: 1123.80 (SE +/- 5.13, N = 4)
- LLVM Clang 3.2: 1234.19 (SE +/- 30.06, N = 4)
- GCC 4.7.3: 1177.86 (SE +/- 0.85, N = 4)

SciMark

Computational Test: Dense LU Matrix Factorization

SciMark 2.0, Computational Test: Dense LU Matrix Factorization: Mflops, More Is Better
- LLVM Clang 3.3: 1827.34 (SE +/- 22.59, N = 4)
- GCC 4.8.1: 1773.35 (SE +/- 1.48, N = 4)
- LLVM Clang 3.2: 1774.03 (SE +/- 36.01, N = 4)
- GCC 4.7.3: 1861.55 (SE +/- 1.88, N = 4)

SciMark

Computational Test: Jacobi Successive Over-Relaxation

SciMark 2.0, Computational Test: Jacobi Successive Over-Relaxation: Mflops, More Is Better
- LLVM Clang 3.3: 1666.24 (SE +/- 1.85, N = 4)
- GCC 4.8.1: 1164.63 (SE +/- 1.11, N = 4)
- LLVM Clang 3.2: 1670.77 (SE +/- 0.00, N = 4)
- GCC 4.7.3: 1163.52 (SE +/- 1.28, N = 4)
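
The five SciMark 2.0 kernels above are reported separately, and this export lists no combined figure. As a sketch, the Python below averages the five kernel scores per compiler, assuming a composite defined as their plain arithmetic mean. The names `SCIMARK`, `composite`, and `COMPOSITES` are hypothetical, and the resulting numbers are derived here rather than taken from the result file.

```python
# Kernel scores in Mflops, copied from the five SciMark 2.0 results above.
# Order: Monte Carlo, Fast Fourier Transform, Sparse Matrix Multiply,
#        Dense LU Matrix Factorization, Jacobi Successive Over-Relaxation.
SCIMARK = {
    "LLVM Clang 3.3": [619.77, 237.86, 1263.29, 1827.34, 1666.24],
    "GCC 4.8.1": [615.32, 245.00, 1123.80, 1773.35, 1164.63],
    "LLVM Clang 3.2": [615.04, 246.79, 1234.19, 1774.03, 1670.77],
    "GCC 4.7.3": [450.21, 251.67, 1177.86, 1861.55, 1163.52],
}

def composite(scores):
    """Arithmetic mean of the kernel scores (assumed composite definition)."""
    return sum(scores) / len(scores)

# Derived values, rounded to two decimals; not present in the original export.
COMPOSITES = {name: round(composite(scores), 2) for name, scores in SCIMARK.items()}

if __name__ == "__main__":
    for name, value in COMPOSITES.items():
        print(f"{name}: {value:.2f} Mflops (mean of five kernels)")
```

An arithmetic mean is dominated by the kernels with the largest absolute scores (Dense LU and Jacobi SOR here), so a geometric mean may be preferable when every kernel should carry equal weight.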

Smallpt

Global Illumination Renderer; 100 Samples

Smallpt 1.0, Global Illumination Renderer; 100 Samples: Seconds, Fewer Is Better
- LLVM Clang 3.3: 142 (SE +/- 1.33, N = 3)
- GCC 4.8.1: 25 (SE +/- 0.33, N = 3)
- LLVM Clang 3.2: 140 (SE +/- 0.33, N = 3)
- GCC 4.7.3: 25 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -march=core-avx2 -O3

Tachyon

Total Time

Tachyon 0.98.9, Total Time: Seconds, Fewer Is Better
- LLVM Clang 3.3: 10.98 (SE +/- 0.24, N = 6)
- LLVM Clang 3.2: 10.44 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -m32 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread

Timed HMMer Search

Pfam Database Search

Timed HMMer Search 2.3.2, Pfam Database Search: Seconds, Fewer Is Better
- LLVM Clang 3.3: 10.94 (SE +/- 0.18, N = 4)
- GCC 4.8.1: 10.30 (SE +/- 0.00, N = 3)
- LLVM Clang 3.2: 11.92 (SE +/- 0.00, N = 3)
- GCC 4.7.3: 10.41 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -march=core-avx2 -O3 -pthread -lhmmer -lsquid -lm

Timed ImageMagick Compilation

Time To Compile

Timed ImageMagick Compilation 6.8.1-10, Time To Compile: Seconds, Fewer Is Better
- LLVM Clang 3.3: 34.35 (SE +/- 0.37, N = 3)
- GCC 4.8.1: 78.61 (SE +/- 0.07, N = 3)
- LLVM Clang 3.2: 31.94 (SE +/- 0.12, N = 3)
- GCC 4.7.3: 93.67 (SE +/- 0.60, N = 3)

Timed MAFFT Alignment

Multiple Sequence Alignment

Timed MAFFT Alignment 6.864, Multiple Sequence Alignment: Seconds, Fewer Is Better
- LLVM Clang 3.3: 6.19 (SE +/- 0.12, N = 3)
- GCC 4.8.1: 5.17 (SE +/- 0.03, N = 3)
- LLVM Clang 3.2: 5.96 (SE +/- 0.10, N = 3)
- GCC 4.7.3: 5.27 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -O3 -lm -lpthread

Timed PHP Compilation

Time To Compile

Timed PHP Compilation 5.2.9, Time To Compile: Seconds, Fewer Is Better
- LLVM Clang 3.3: 21.03 (SE +/- 0.20, N = 3)
- GCC 4.8.1: 33.30 (SE +/- 0.46, N = 3)
- LLVM Clang 3.2: 19.59 (SE +/- 0.03, N = 3)
- GCC 4.7.3: 32.92 (SE +/- 0.14, N = 3)
Extra per-run options: -lpthread (listed for one of the four runs)
1. (CC) gcc options: -march=core-avx2 -O3 -pedantic -ldl -lz -lm

TSCP

AI Chess Performance

TSCP 1.81, AI Chess Performance: Nodes Per Second, More Is Better
- LLVM Clang 3.3: 624749 (SE +/- 480.04, N = 5)
- GCC 4.8.1: 599455 (SE +/- 520.60, N = 5)
- LLVM Clang 3.2: 624323 (SE +/- 141.40, N = 5)
- GCC 4.7.3: 631626 (SE +/- 0.00, N = 5)

x264

H.264 Video Encoding

x264 2013-06-08, H.264 Video Encoding: Frames Per Second, More Is Better
- LLVM Clang 3.3: 153.15 (SE +/- 0.58, N = 5)
- GCC 4.8.1: 156.34 (SE +/- 0.40, N = 5)
- LLVM Clang 3.2: 155.35 (SE +/- 0.60, N = 5)
- GCC 4.7.3: 155.33 (SE +/- 0.33, N = 5)
1. (CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -march=core-avx2 -std=gnu99 -fomit-frame-pointer -fno-tree-vectorize


Phoronix Test Suite v10.8.5