LLVM Clang 3.3 vs. GCC 4.8 - Intel Core-AVX2 Haswell

Intel Core i7-4770K testing with an Intel DH87RL motherboard, looking at GCC 4.7, GCC 4.8, LLVM Clang 3.2, and LLVM Clang 3.3 compiler performance with core-avx2 Haswell optimizations. Intel Core i7 Haswell core-avx2 compiler benchmarks for a future article on Phoronix by Michael Larabel.
HTML result view exported from: https://openbenchmarking.org/result/1306206-SO-LLVMCLANG08&grw .
Tested configurations: GCC 4.7.3, GCC 4.8.1, LLVM Clang 3.2, LLVM Clang 3.3

Processor: Intel Core i7-4770K @ 3.50GHz (8 Cores)
Motherboard: Intel DH87RL
Chipset: Intel 4th Gen Core DRAM
Memory: 15360MB
Disk: 240GB OCZ VERTEX3
Graphics: Intel Haswell Desktop
Audio: Intel Haswell HDMI
Monitor: VA2431
Network: Intel Connection I217-V
OS: Ubuntu 13.10
Kernel: 3.10.0-999-generic (x86_64)
Desktop: Unity 7.0.0
Display Server: X Server 1.13.3
Display Driver: intel 2.21.9
OpenGL: 3.0 Mesa 9.1.3
File-System: ext4
Screen Resolution: 1920x1080
Compiler:
- GCC 4.7.3: GCC 4.7.3
- GCC 4.8.1: GCC 4.8.1
- LLVM Clang 3.2: Clang 3.2 + LLVM 3.2svn
- LLVM Clang 3.3: Clang 3.3 + LLVM 3.3

Compiler Details
- GCC 4.7.3: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
- GCC 4.8.1: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
- LLVM Clang 3.2: Optimized build; Built Jun 20 2013 (14:54:23); Default target: x86_64-unknown-linux-gnu; Host CPU: x86-64
- LLVM Clang 3.3: Optimized build; Built Jun 20 2013 (12:21:18); Default target: x86_64-unknown-linux-gnu; Host CPU: x86-64

Processor Details
- Scaling Governor: acpi-cpufreq ondemand
Test                                                GCC 4.7.3   GCC 4.8.1   LLVM Clang 3.2  LLVM Clang 3.3
tscp: AI Chess Performance                          631626      599455      624323          624749
scimark2: Monte Carlo                               450.21      615.32      615.04          619.77
scimark2: Fast Fourier Transform                    251.67      245.00      246.79          237.86
scimark2: Sparse Matrix Multiply                    1177.86     1123.80     1234.19         1263.29
scimark2: Dense LU Matrix Factorization             1861.55     1773.35     1774.03         1827.34
scimark2: Jacobi Successive Over-Relaxation         1163.52     1164.63     1670.77         1666.24
blake2: Phoronix Test Suite v4.8.0m0                5.32        5.30        7.54            7.45
encode-mp3: WAV To MP3                              -           -           12.74           14.45
hmmer: Pfam Database Search                         10.41       10.30       11.92           10.94
mafft: Multiple Sequence Alignment                  5.27        5.17        5.96            6.19
himeno: Poisson Pressure Solver                     1663.97     1593.09     1532.51         1419.90
build-imagemagick: Time To Compile                  93.67       78.61       31.94           34.35
primesieve: 1e12 Prime Number Generation            79.36       79.24       -               326.85
build-php: Time To Compile                          32.92       33.30       19.59           21.03
tachyon: Total Time                                 -           -           10.44           10.98
x264: H.264 Video Encoding                          155.33      156.34      155.35          153.15
c-ray: Total Time                                   21.45       17.02       27.46           27.03
ffmpeg: H.264 HD To NTSC DV                         12.86       12.81       12.77           13.18
smallpt: Global Illumination Renderer; 100 Samples  25          25          140             142
apache: Static Web Page Serving                     25743.99    25786.15    25888.95        25295.82

TSCP 1.81 - AI Chess Performance
Nodes Per Second, More Is Better
- GCC 4.7.3: 631626 (SE +/- 0.00, N = 5)
- GCC 4.8.1: 599455 (SE +/- 520.60, N = 5)
- LLVM Clang 3.2: 624323 (SE +/- 141.40, N = 5)
- LLVM Clang 3.3: 624749 (SE +/- 480.04, N = 5)

SciMark 2.0 - Computational Test: Monte Carlo
Mflops, More Is Better
- GCC 4.7.3: 450.21 (SE +/- 0.55, N = 4)
- GCC 4.8.1: 615.32 (SE +/- 0.00, N = 4)
- LLVM Clang 3.2: 615.04 (SE +/- 5.67, N = 4)
- LLVM Clang 3.3: 619.77 (SE +/- 0.89, N = 4)

SciMark 2.0 - Computational Test: Fast Fourier Transform
Mflops, More Is Better
- GCC 4.7.3: 251.67 (SE +/- 0.68, N = 4)
- GCC 4.8.1: 245.00 (SE +/- 0.34, N = 4)
- LLVM Clang 3.2: 246.79 (SE +/- 1.60, N = 4)
- LLVM Clang 3.3: 237.86 (SE +/- 0.98, N = 4)

SciMark 2.0 - Computational Test: Sparse Matrix Multiply
Mflops, More Is Better
- GCC 4.7.3: 1177.86 (SE +/- 0.85, N = 4)
- GCC 4.8.1: 1123.80 (SE +/- 5.13, N = 4)
- LLVM Clang 3.2: 1234.19 (SE +/- 30.06, N = 4)
- LLVM Clang 3.3: 1263.29 (SE +/- 5.09, N = 4)

SciMark 2.0 - Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
- GCC 4.7.3: 1861.55 (SE +/- 1.88, N = 4)
- GCC 4.8.1: 1773.35 (SE +/- 1.48, N = 4)
- LLVM Clang 3.2: 1774.03 (SE +/- 36.01, N = 4)
- LLVM Clang 3.3: 1827.34 (SE +/- 22.59, N = 4)

SciMark 2.0 - Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better
- GCC 4.7.3: 1163.52 (SE +/- 1.28, N = 4)
- GCC 4.8.1: 1164.63 (SE +/- 1.11, N = 4)
- LLVM Clang 3.2: 1670.77 (SE +/- 0.00, N = 4)
- LLVM Clang 3.3: 1666.24 (SE +/- 1.85, N = 4)

BLAKE2 20121223 - Phoronix Test Suite v4.8.0m0
Cycles Per Byte, Fewer Is Better
- GCC 4.7.3: 5.32 (SE +/- 0.00, N = 3)
- GCC 4.8.1: 5.30 (SE +/- 0.01, N = 3)
- LLVM Clang 3.2: 7.54 (SE +/- 0.11, N = 3)
- LLVM Clang 3.3: 7.45 (SE +/- 0.13, N = 4)
(CC) gcc options: -std=gnu99 -O3 -march=native -lcrypto -lz

LAME MP3 Encoding 3.99.3 - WAV To MP3
Seconds, Fewer Is Better
- LLVM Clang 3.2: 12.74 (SE +/- 0.01, N = 5)
- LLVM Clang 3.3: 14.45 (SE +/- 0.01, N = 5)
(CC) gcc options: -pipe -march=core-avx2 -O3 -lm

Timed HMMer Search 2.3.2 - Pfam Database Search
Seconds, Fewer Is Better
- GCC 4.7.3: 10.41 (SE +/- 0.02, N = 3)
- GCC 4.8.1: 10.30 (SE +/- 0.00, N = 3)
- LLVM Clang 3.2: 11.92 (SE +/- 0.00, N = 3)
- LLVM Clang 3.3: 10.94 (SE +/- 0.18, N = 4)
(CC) gcc options: -march=core-avx2 -O3 -pthread -lhmmer -lsquid -lm

Timed MAFFT Alignment 6.864 - Multiple Sequence Alignment
Seconds, Fewer Is Better
- GCC 4.7.3: 5.27 (SE +/- 0.02, N = 3)
- GCC 4.8.1: 5.17 (SE +/- 0.03, N = 3)
- LLVM Clang 3.2: 5.96 (SE +/- 0.10, N = 3)
- LLVM Clang 3.3: 6.19 (SE +/- 0.12, N = 3)
(CC) gcc options: -O3 -lm -lpthread

Himeno Benchmark 3.0 - Poisson Pressure Solver
MFLOPS, More Is Better
- GCC 4.7.3: 1663.97 (SE +/- 0.19, N = 3)
- GCC 4.8.1: 1593.09 (SE +/- 20.45, N = 3)
- LLVM Clang 3.2: 1532.51 (SE +/- 2.02, N = 3)
- LLVM Clang 3.3: 1419.90 (SE +/- 13.80, N = 3)
(CC) gcc options: -O3 -march=core-avx2

Timed ImageMagick Compilation 6.8.1-10 - Time To Compile
Seconds, Fewer Is Better
- GCC 4.7.3: 93.67 (SE +/- 0.60, N = 3)
- GCC 4.8.1: 78.61 (SE +/- 0.07, N = 3)
- LLVM Clang 3.2: 31.94 (SE +/- 0.12, N = 3)
- LLVM Clang 3.3: 34.35 (SE +/- 0.37, N = 3)
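
For time-based results such as this one (seconds, fewer is better), a ratio against a chosen baseline is often easier to read than the raw times. The sketch below is illustrative only: the helper name `times_faster` and the choice of GCC 4.8.1 as the baseline are assumptions, not part of the exported result; the timings are the ones reported above.

```python
def times_faster(baseline_seconds, candidate_seconds):
    """Ratio of baseline time to candidate time (values above 1.0 mean the candidate is faster)."""
    if candidate_seconds <= 0:
        raise ValueError("candidate time must be positive")
    return baseline_seconds / candidate_seconds


# Timed ImageMagick Compilation, seconds (fewer is better), as reported above.
results = {
    "GCC 4.7.3": 93.67,
    "GCC 4.8.1": 78.61,
    "LLVM Clang 3.2": 31.94,
    "LLVM Clang 3.3": 34.35,
}

# Assumption: GCC 4.8.1 is used as the reference point for illustration.
baseline = results["GCC 4.8.1"]
speedups = {name: times_faster(baseline, secs) for name, secs in results.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x relative to GCC 4.8.1")
```

For "more is better" results the ratio would need to be inverted (candidate divided by baseline) to keep the same reading.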

Primesieve 4.2 - 1e12 Prime Number Generation
Seconds, Fewer Is Better
- GCC 4.7.3: 79.36 (SE +/- 0.06, N = 3)
- GCC 4.8.1: 79.24 (SE +/- 0.08, N = 3)
- LLVM Clang 3.3: 326.85 (SE +/- 2.08, N = 3)
(CXX) g++ options: -O2
Additional option listed for two of the configurations: -fopenmp

Timed PHP Compilation 5.2.9 - Time To Compile
Seconds, Fewer Is Better
- GCC 4.7.3: 32.92 (SE +/- 0.14, N = 3)
- GCC 4.8.1: 33.30 (SE +/- 0.46, N = 3)
- LLVM Clang 3.2: 19.59 (SE +/- 0.03, N = 3)
- LLVM Clang 3.3: 21.03 (SE +/- 0.20, N = 3)
(CC) gcc options: -march=core-avx2 -O3 -pedantic -ldl -lz -lm
Additional option listed for one of the configurations: -lpthread

Tachyon 0.98.9 - Total Time
Seconds, Fewer Is Better
- LLVM Clang 3.2: 10.44 (SE +/- 0.06, N = 3)
- LLVM Clang 3.3: 10.98 (SE +/- 0.24, N = 6)
(CC) gcc options: -m32 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread

x264 2013-06-08 - H.264 Video Encoding
Frames Per Second, More Is Better
- GCC 4.7.3: 155.33 (SE +/- 0.33, N = 5)
- GCC 4.8.1: 156.34 (SE +/- 0.40, N = 5)
- LLVM Clang 3.2: 155.35 (SE +/- 0.60, N = 5)
- LLVM Clang 3.3: 153.15 (SE +/- 0.58, N = 5)
(CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -march=core-avx2 -std=gnu99 -fomit-frame-pointer -fno-tree-vectorize
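
When results are as close as these, the reported standard errors help judge whether a gap is larger than run-to-run noise. The following is a rough sketch, not part of the exported result: it assumes the SE values are listed in the same order as the compilers, treats the runs as independent, and uses a hypothetical helper `difference_in_se` that expresses the gap in units of the combined standard error.

```python
import math


def difference_in_se(a, se_a, b, se_b):
    """Return |a - b| divided by the combined standard error of the two means."""
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined == 0:
        # No measured spread: any non-zero gap is infinitely many SEs wide.
        return math.inf if a != b else 0.0
    return abs(a - b) / combined


# x264 frames per second and standard error, as reported above.
gcc_473 = (155.33, 0.33)
gcc_481 = (156.34, 0.40)
clang_32 = (155.35, 0.60)
clang_33 = (153.15, 0.58)

close_pair = difference_in_se(*gcc_473, *clang_32)
wide_pair = difference_in_se(*gcc_481, *clang_33)

print(f"GCC 4.7.3 vs LLVM Clang 3.2: {close_pair:.2f} combined SEs apart")
print(f"GCC 4.8.1 vs LLVM Clang 3.3: {wide_pair:.2f} combined SEs apart")
```

A gap of well under one combined SE is indistinguishable from noise at this sample size, while a gap of several combined SEs is more likely to reflect a real difference. With N = 5 runs per compiler this is only a coarse guide, not a formal significance test.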

C-Ray 1.1 - Total Time
Seconds, Fewer Is Better
- GCC 4.7.3: 21.45 (SE +/- 0.01, N = 3)
- GCC 4.8.1: 17.02 (SE +/- 0.01, N = 3)
- LLVM Clang 3.2: 27.46 (SE +/- 0.02, N = 3)
- LLVM Clang 3.3: 27.03 (SE +/- 0.00, N = 3)
(CC) gcc options: -lm -lpthread -O3 -march=core-avx2

FFmpeg 1.1 - H.264 HD To NTSC DV
Seconds, Fewer Is Better
- GCC 4.7.3: 12.86 (SE +/- 0.08, N = 3)
- GCC 4.8.1: 12.81 (SE +/- 0.06, N = 3)
- LLVM Clang 3.2: 12.77 (SE +/- 0.04, N = 3)
- LLVM Clang 3.3: 13.18 (SE +/- 0.12, N = 3)
(CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -ldl -lasound -lSDL -lm -pthread -lbz2 -march=core-avx2 -O3 -std=c99 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -MMD
Additional options listed for two of the configurations: -fno-tree-vectorize -MF -MT
Additional option listed for two of the configurations: -Qunused-arguments

Smallpt 1.0 - Global Illumination Renderer; 100 Samples
Seconds, Fewer Is Better
- GCC 4.7.3: 25 (SE +/- 0.00, N = 3)
- GCC 4.8.1: 25 (SE +/- 0.33, N = 3)
- LLVM Clang 3.2: 140 (SE +/- 0.33, N = 3)
- LLVM Clang 3.3: 142 (SE +/- 1.33, N = 3)
(CXX) g++ options: -fopenmp -march=core-avx2 -O3

Apache Benchmark 2.4.3 - Static Web Page Serving
Requests Per Second, More Is Better
- GCC 4.7.3: 25743.99 (SE +/- 51.73, N = 3)
- GCC 4.8.1: 25786.15 (SE +/- 36.32, N = 3)
- LLVM Clang 3.2: 25888.95 (SE +/- 32.41, N = 3)
- LLVM Clang 3.3: 25295.82 (SE +/- 280.78, N = 3)
(CC) gcc options: -shared -fPIC -pthread -march=core-avx2 -O3
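
The results above mix units and directions (some are "more is better", others "fewer is better"), so they cannot be averaged directly. One common way to summarize such a set is to normalize each result against a baseline and take the geometric mean. The sketch below is an illustration under stated assumptions: the choice of six tests, the GCC 4.7.3 baseline, and the helper names are all hypothetical and not part of the exported result; only the numbers come from the results above.

```python
import math

COMPILERS = ("GCC 4.7.3", "GCC 4.8.1", "LLVM Clang 3.2", "LLVM Clang 3.3")

# test name -> (higher_is_better, results in COMPILERS order).
# Assumption: an arbitrary subset of tests that have results for all four compilers.
RESULTS = {
    "TSCP (nodes/s)": (True, (631626, 599455, 624323, 624749)),
    "SciMark Monte Carlo (Mflops)": (True, (450.21, 615.32, 615.04, 619.77)),
    "Himeno (MFLOPS)": (True, (1663.97, 1593.09, 1532.51, 1419.90)),
    "HMMer (s)": (False, (10.41, 10.30, 11.92, 10.94)),
    "C-Ray (s)": (False, (21.45, 17.02, 27.46, 27.03)),
    "x264 (fps)": (True, (155.33, 156.34, 155.35, 153.15)),
}


def relative_score(value, baseline, higher_is_better):
    """Normalize so that scores above 1.0 always mean 'better than baseline'."""
    return value / baseline if higher_is_better else baseline / value


def geometric_mean(values):
    """Geometric mean computed via logarithms; all values must be positive."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def summarize(results, compilers, baseline_name):
    """Geometric mean of baseline-relative scores for each compiler."""
    base_idx = compilers.index(baseline_name)
    summary = {}
    for idx, compiler in enumerate(compilers):
        scores = [
            relative_score(values[idx], values[base_idx], higher_is_better)
            for higher_is_better, values in results.values()
        ]
        summary[compiler] = geometric_mean(scores)
    return summary


summary = summarize(RESULTS, COMPILERS, "GCC 4.7.3")
for compiler, score in summary.items():
    print(f"{compiler}: {score:.3f}")
```

The geometric mean is used rather than the arithmetic mean so that a single large ratio does not dominate the summary. The outcome depends heavily on which tests are included, so it should be read as a rough indicator only.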
Phoronix Test Suite v10.8.5