LLVM Clang 3.3 vs. GCC 4.8 - Intel Core-AVX2 Haswell

Intel Core i7-4770K testing with an Intel DH87RL motherboard, comparing GCC 4.7, GCC 4.8, LLVM Clang 3.2, and LLVM Clang 3.3 compiler performance with core-avx2 Haswell optimizations. These Intel Core i7 Haswell core-avx2 compiler benchmarks are for a future article on Phoronix by Michael Larabel.
HTML result view exported from: https://openbenchmarking.org/result/1306206-SO-LLVMCLANG08&grr .

System Details (all four configurations share the same system; only the compiler differs)

Processor: Intel Core i7-4770K @ 3.50GHz (8 Cores)
Motherboard: Intel DH87RL
Chipset: Intel 4th Gen Core DRAM
Memory: 15360MB
Disk: 240GB OCZ VERTEX3
Graphics: Intel Haswell Desktop
Audio: Intel Haswell HDMI
Monitor: VA2431
Network: Intel Connection I217-V
OS: Ubuntu 13.10
Kernel: 3.10.0-999-generic (x86_64)
Desktop: Unity 7.0.0
Display Server: X Server 1.13.3
Display Driver: intel 2.21.9
OpenGL: 3.0 Mesa 9.1.3
File-System: ext4
Screen Resolution: 1920x1080
Compiler, by configuration:
- GCC 4.7.3: GCC 4.7.3
- GCC 4.8.1: GCC 4.8.1
- LLVM Clang 3.2: Clang 3.2 + LLVM 3.2svn
- LLVM Clang 3.3: Clang 3.3 + LLVM 3.3

Compiler Details
- GCC 4.7.3: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
- GCC 4.8.1: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
- LLVM Clang 3.2: Optimized build; Built Jun 20 2013 (14:54:23); Default target: x86_64-unknown-linux-gnu; Host CPU: x86-64
- LLVM Clang 3.3: Optimized build; Built Jun 20 2013 (12:21:18); Default target: x86_64-unknown-linux-gnu; Host CPU: x86-64

Processor Details
- Scaling Governor: acpi-cpufreq ondemand

Results Overview (a "-" marks a test with no result for that compiler)

Test                                                GCC 4.7.3  GCC 4.8.1  LLVM Clang 3.2  LLVM Clang 3.3
apache: Static Web Page Serving                     25743.99   25786.15   25888.95        25295.82
tachyon: Total Time                                 -          -          10.44           10.98
ffmpeg: H.264 HD To NTSC DV                         12.86      12.81      12.77           13.18
encode-mp3: WAV To MP3                              -          -          12.74           14.45
smallpt: Global Illumination Renderer; 100 Samples  25         25         140             142
primesieve: 1e12 Prime Number Generation            79.36      79.24      -               326.85
c-ray: Total Time                                   21.45      17.02      27.46           27.03
build-php: Time To Compile                          32.92      33.30      19.59           21.03
build-imagemagick: Time To Compile                  93.67      78.61      31.94           34.35
himeno: Poisson Pressure Solver                     1663.97    1593.09    1532.51         1419.90
x264: H.264 Video Encoding                          155.33     156.34     155.35          153.15
tscp: AI Chess Performance                          631626     599455     624323          624749
scimark2: Jacobi Successive Over-Relaxation         1163.52    1164.63    1670.77         1666.24
scimark2: Dense LU Matrix Factorization             1861.55    1773.35    1774.03         1827.34
scimark2: Sparse Matrix Multiply                    1177.86    1123.80    1234.19         1263.29
scimark2: Fast Fourier Transform                    251.67     245.00     246.79          237.86
scimark2: Monte Carlo                               450.21     615.32     615.04          619.77
blake2: Phoronix Test Suite v4.8.0m0                5.32       5.30       7.54            7.45
mafft: Multiple Sequence Alignment                  5.27       5.17       5.96            6.19
hmmer: Pfam Database Search                         10.41      10.30      11.92           10.94
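
Because some tests report "more is better" and others "fewer is better", the raw numbers are awkward to compare across tests. The following is a minimal sketch, not part of the exported result, of one way to normalise a GCC 4.8.1 vs. LLVM Clang 3.3 comparison to a signed percentage. The names `RESULTS` and `clang_advantage_pct` are hypothetical, and only a handful of values from the results above are included.

```python
# Values copied from the results above: (GCC 4.8.1, LLVM Clang 3.3, higher_is_better).
# The selection of tests is arbitrary and only meant as an illustration.
RESULTS = {
    "c-ray: Total Time (s)": (17.02, 27.03, False),
    "build-php: Time To Compile (s)": (33.30, 21.03, False),
    "himeno: Poisson Pressure Solver (MFLOPS)": (1593.09, 1419.90, True),
    "scimark2: Jacobi SOR (Mflops)": (1164.63, 1666.24, True),
}


def clang_advantage_pct(gcc: float, clang: float, higher_is_better: bool) -> float:
    """Return Clang's advantage over GCC in percent (negative means GCC did better)."""
    # Normalise so that a ratio above 1.0 always means Clang did better,
    # regardless of whether the test counts seconds or throughput.
    ratio = clang / gcc if higher_is_better else gcc / clang
    return (ratio - 1.0) * 100.0


if __name__ == "__main__":
    for name, (gcc, clang, higher_is_better) in RESULTS.items():
        pct = clang_advantage_pct(gcc, clang, higher_is_better)
        print(f"{name}: {pct:+.1f}%")
```

A positive figure means the Clang 3.3 binary (or, for the build tests, the Clang 3.3 compile) came out ahead; a negative figure means GCC 4.8.1 did.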

Apache Benchmark 2.4.3 - Static Web Page Serving
Requests Per Second, More Is Better
- GCC 4.7.3: 25743.99 (SE +/- 51.73, N = 3)
- GCC 4.8.1: 25786.15 (SE +/- 36.32, N = 3)
- LLVM Clang 3.2: 25888.95 (SE +/- 32.41, N = 3)
- LLVM Clang 3.3: 25295.82 (SE +/- 280.78, N = 3)
1. (CC) gcc options: -shared -fPIC -pthread -march=core-avx2 -O3
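
Each result is reported with a standard error (SE) over N runs, which helps judge whether a gap between two compilers is larger than run-to-run noise. The sketch below is an illustration only: it assumes the two results are independent, combines their standard errors in quadrature, and uses the Apache figures for GCC 4.8.1 and LLVM Clang 3.3. The function name `difference_in_standard_errors` is hypothetical, and with N = 3 the outcome is a rough indication rather than a formal significance test.

```python
import math


def difference_in_standard_errors(mean_a: float, se_a: float,
                                  mean_b: float, se_b: float) -> float:
    """Return |mean_a - mean_b| expressed in units of the combined standard error."""
    # Assumes independent measurements, so the errors add in quadrature.
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


if __name__ == "__main__":
    # Apache, Requests Per Second: (mean, SE) as reported in the export.
    gcc_481 = (25786.15, 36.32)
    clang_33 = (25295.82, 280.78)
    gap = difference_in_standard_errors(*gcc_481, *clang_33)
    print(f"gap = {gap:.2f} combined standard errors")
```

A gap well above roughly two combined standard errors would suggest a real difference; a smaller gap, as here, is consistent with the two compilers being close to tied on this test.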

Tachyon 0.98.9 - Total Time
Seconds, Fewer Is Better
- LLVM Clang 3.2: 10.44 (SE +/- 0.06, N = 3)
- LLVM Clang 3.3: 10.98 (SE +/- 0.24, N = 6)
1. (CC) gcc options: -m32 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread

FFmpeg 1.1 - H.264 HD To NTSC DV
Seconds, Fewer Is Better
- GCC 4.7.3: 12.86 (SE +/- 0.08, N = 3); extra options: -fno-tree-vectorize -MF -MT
- GCC 4.8.1: 12.81 (SE +/- 0.06, N = 3); extra options: -fno-tree-vectorize -MF -MT
- LLVM Clang 3.2: 12.77 (SE +/- 0.04, N = 3); extra options: -Qunused-arguments
- LLVM Clang 3.3: 13.18 (SE +/- 0.12, N = 3); extra options: -Qunused-arguments
1. (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -ldl -lasound -lSDL -lm -pthread -lbz2 -march=core-avx2 -O3 -std=c99 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -MMD

LAME MP3 Encoding 3.99.3 - WAV To MP3
Seconds, Fewer Is Better
- LLVM Clang 3.2: 12.74 (SE +/- 0.01, N = 5)
- LLVM Clang 3.3: 14.45 (SE +/- 0.01, N = 5)
1. (CC) gcc options: -pipe -march=core-avx2 -O3 -lm

Smallpt 1.0 - Global Illumination Renderer; 100 Samples
Seconds, Fewer Is Better
- GCC 4.7.3: 25 (SE +/- 0.00, N = 3)
- GCC 4.8.1: 25 (SE +/- 0.33, N = 3)
- LLVM Clang 3.2: 140 (SE +/- 0.33, N = 3)
- LLVM Clang 3.3: 142 (SE +/- 1.33, N = 3)
1. (CXX) g++ options: -fopenmp -march=core-avx2 -O3

Primesieve 4.2 - 1e12 Prime Number Generation
Seconds, Fewer Is Better
- GCC 4.7.3: 79.36 (SE +/- 0.06, N = 3); extra options: -fopenmp
- GCC 4.8.1: 79.24 (SE +/- 0.08, N = 3); extra options: -fopenmp
- LLVM Clang 3.3: 326.85 (SE +/- 2.08, N = 3)
1. (CXX) g++ options: -O2

C-Ray 1.1 - Total Time
Seconds, Fewer Is Better
- GCC 4.7.3: 21.45 (SE +/- 0.01, N = 3)
- GCC 4.8.1: 17.02 (SE +/- 0.01, N = 3)
- LLVM Clang 3.2: 27.46 (SE +/- 0.02, N = 3)
- LLVM Clang 3.3: 27.03 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -lm -lpthread -O3 -march=core-avx2

Timed PHP Compilation 5.2.9 - Time To Compile
Seconds, Fewer Is Better
- GCC 4.7.3: 32.92 (SE +/- 0.14, N = 3)
- GCC 4.8.1: 33.30 (SE +/- 0.46, N = 3)
- LLVM Clang 3.2: 19.59 (SE +/- 0.03, N = 3)
- LLVM Clang 3.3: 21.03 (SE +/- 0.20, N = 3)
Extra option listed for one result: -lpthread
1. (CC) gcc options: -march=core-avx2 -O3 -pedantic -ldl -lz -lm

Timed ImageMagick Compilation 6.8.1-10 - Time To Compile
Seconds, Fewer Is Better
- GCC 4.7.3: 93.67 (SE +/- 0.60, N = 3)
- GCC 4.8.1: 78.61 (SE +/- 0.07, N = 3)
- LLVM Clang 3.2: 31.94 (SE +/- 0.12, N = 3)
- LLVM Clang 3.3: 34.35 (SE +/- 0.37, N = 3)

Himeno Benchmark 3.0 - Poisson Pressure Solver
MFLOPS, More Is Better
- GCC 4.7.3: 1663.97 (SE +/- 0.19, N = 3)
- GCC 4.8.1: 1593.09 (SE +/- 20.45, N = 3)
- LLVM Clang 3.2: 1532.51 (SE +/- 2.02, N = 3)
- LLVM Clang 3.3: 1419.90 (SE +/- 13.80, N = 3)
1. (CC) gcc options: -O3 -march=core-avx2

x264 2013-06-08 - H.264 Video Encoding
Frames Per Second, More Is Better
- GCC 4.7.3: 155.33 (SE +/- 0.33, N = 5)
- GCC 4.8.1: 156.34 (SE +/- 0.40, N = 5)
- LLVM Clang 3.2: 155.35 (SE +/- 0.60, N = 5)
- LLVM Clang 3.3: 153.15 (SE +/- 0.58, N = 5)
1. (CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -march=core-avx2 -std=gnu99 -fomit-frame-pointer -fno-tree-vectorize

TSCP 1.81 - AI Chess Performance
Nodes Per Second, More Is Better
- GCC 4.7.3: 631626 (SE +/- 0.00, N = 5)
- GCC 4.8.1: 599455 (SE +/- 520.60, N = 5)
- LLVM Clang 3.2: 624323 (SE +/- 141.40, N = 5)
- LLVM Clang 3.3: 624749 (SE +/- 480.04, N = 5)

SciMark 2.0 - Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better
- GCC 4.7.3: 1163.52 (SE +/- 1.28, N = 4)
- GCC 4.8.1: 1164.63 (SE +/- 1.11, N = 4)
- LLVM Clang 3.2: 1670.77 (SE +/- 0.00, N = 4)
- LLVM Clang 3.3: 1666.24 (SE +/- 1.85, N = 4)

SciMark 2.0 - Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
- GCC 4.7.3: 1861.55 (SE +/- 1.88, N = 4)
- GCC 4.8.1: 1773.35 (SE +/- 1.48, N = 4)
- LLVM Clang 3.2: 1774.03 (SE +/- 36.01, N = 4)
- LLVM Clang 3.3: 1827.34 (SE +/- 22.59, N = 4)

SciMark 2.0 - Computational Test: Sparse Matrix Multiply
Mflops, More Is Better
- GCC 4.7.3: 1177.86 (SE +/- 0.85, N = 4)
- GCC 4.8.1: 1123.80 (SE +/- 5.13, N = 4)
- LLVM Clang 3.2: 1234.19 (SE +/- 30.06, N = 4)
- LLVM Clang 3.3: 1263.29 (SE +/- 5.09, N = 4)

SciMark 2.0 - Computational Test: Fast Fourier Transform
Mflops, More Is Better
- GCC 4.7.3: 251.67 (SE +/- 0.68, N = 4)
- GCC 4.8.1: 245.00 (SE +/- 0.34, N = 4)
- LLVM Clang 3.2: 246.79 (SE +/- 1.60, N = 4)
- LLVM Clang 3.3: 237.86 (SE +/- 0.98, N = 4)

SciMark 2.0 - Computational Test: Monte Carlo
Mflops, More Is Better
- GCC 4.7.3: 450.21 (SE +/- 0.55, N = 4)
- GCC 4.8.1: 615.32 (SE +/- 0.00, N = 4)
- LLVM Clang 3.2: 615.04 (SE +/- 5.67, N = 4)
- LLVM Clang 3.3: 619.77 (SE +/- 0.89, N = 4)

BLAKE2 20121223 - Phoronix Test Suite v4.8.0m0
Cycles Per Byte, Fewer Is Better
- GCC 4.7.3: 5.32 (SE +/- 0.00, N = 3)
- GCC 4.8.1: 5.30 (SE +/- 0.01, N = 3)
- LLVM Clang 3.2: 7.54 (SE +/- 0.11, N = 3)
- LLVM Clang 3.3: 7.45 (SE +/- 0.13, N = 4)
1. (CC) gcc options: -std=gnu99 -O3 -march=native -lcrypto -lz

Timed MAFFT Alignment 6.864 - Multiple Sequence Alignment
Seconds, Fewer Is Better
- GCC 4.7.3: 5.27 (SE +/- 0.02, N = 3)
- GCC 4.8.1: 5.17 (SE +/- 0.03, N = 3)
- LLVM Clang 3.2: 5.96 (SE +/- 0.10, N = 3)
- LLVM Clang 3.3: 6.19 (SE +/- 0.12, N = 3)
1. (CC) gcc options: -O3 -lm -lpthread

Timed HMMer Search 2.3.2 - Pfam Database Search
Seconds, Fewer Is Better
- GCC 4.7.3: 10.41 (SE +/- 0.02, N = 3)
- GCC 4.8.1: 10.30 (SE +/- 0.00, N = 3)
- LLVM Clang 3.2: 11.92 (SE +/- 0.00, N = 3)
- LLVM Clang 3.3: 10.94 (SE +/- 0.18, N = 4)
1. (CC) gcc options: -march=core-avx2 -O3 -pthread -lhmmer -lsquid -lm

Phoronix Test Suite v10.8.5