LLVM Clang 3.3 vs. GCC 4.8 - Intel Core-AVX2 Haswell

Intel Core i7-4770K testing with an Intel DH87RL motherboard, comparing GCC 4.7, GCC 4.8, LLVM Clang 3.2, and LLVM Clang 3.3 compiler performance with core-avx2 Haswell optimizations. Intel Core i7 Haswell core-avx2 compiler benchmarks for a future article on Phoronix by Michael Larabel.
HTML result view exported from: https://openbenchmarking.org/result/1306206-SO-LLVMCLANG08&sro&grs

Tested configurations: GCC 4.7.3, GCC 4.8.1, LLVM Clang 3.2, LLVM Clang 3.3

Processor:          Intel Core i7-4770K @ 3.50GHz (8 Cores)
Motherboard:        Intel DH87RL
Chipset:            Intel 4th Gen Core DRAM
Memory:             15360MB
Disk:               240GB OCZ VERTEX3
Graphics:           Intel Haswell Desktop
Audio:              Intel Haswell HDMI
Monitor:            VA2431
Network:            Intel Connection I217-V
OS:                 Ubuntu 13.10
Kernel:             3.10.0-999-generic (x86_64)
Desktop:            Unity 7.0.0
Display Server:     X Server 1.13.3
Display Driver:     intel 2.21.9
OpenGL:             3.0 Mesa 9.1.3
File-System:        ext4
Screen Resolution:  1920x1080
Compiler:           GCC 4.7.3 / GCC 4.8.1 / Clang 3.2 + LLVM 3.2svn / Clang 3.3 + LLVM 3.3 (one per configuration; the other fields are listed once for all four)

Compiler Details
- GCC 4.7.3: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
- GCC 4.8.1: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
- LLVM Clang 3.2: Optimized build; Built Jun 20 2013 (14:54:23); Default target: x86_64-unknown-linux-gnu; Host CPU: x86-64
- LLVM Clang 3.3: Optimized build; Built Jun 20 2013 (12:21:18); Default target: x86_64-unknown-linux-gnu; Host CPU: x86-64

Processor Details
- Scaling Governor: acpi-cpufreq ondemand

Test                                                GCC 4.7.3  GCC 4.8.1  LLVM Clang 3.2  LLVM Clang 3.3
smallpt: Global Illumination Renderer; 100 Samples  25         25         140             142
primesieve: 1e12 Prime Number Generation            79.36      79.24      -               326.85
build-imagemagick: Time To Compile                  93.67      78.61      31.94           34.35
build-php: Time To Compile                          32.92      33.30      19.59           21.03
c-ray: Total Time                                   21.45      17.02      27.46           27.03
scimark2: Jacobi Successive Over-Relaxation         1163.52    1164.63    1670.77         1666.24
blake2: Phoronix Test Suite v4.8.0m0                5.32       5.30       7.54            7.45
scimark2: Monte Carlo                               450.21     615.32     615.04          619.77
mafft: Multiple Sequence Alignment                  5.27       5.17       5.96            6.19
himeno: Poisson Pressure Solver                     1663.97    1593.09    1532.51         1419.90
hmmer: Pfam Database Search                         10.41      10.30      11.92           10.94
encode-mp3: WAV To MP3                              -          -          12.74           14.45
scimark2: Sparse Matrix Multiply                    1177.86    1123.80    1234.19         1263.29
scimark2: Fast Fourier Transform                    251.67     245.00     246.79          237.86
tscp: AI Chess Performance                          631626     599455     624323          624749
tachyon: Total Time                                 -          -          10.44           10.98
scimark2: Dense LU Matrix Factorization             1861.55    1773.35    1774.03         1827.34
ffmpeg: H.264 HD To NTSC DV                         12.86      12.81      12.77           13.18
apache: Static Web Page Serving                     25743.99   25786.15   25888.95        25295.82
x264: H.264 Video Encoding                          155.33     156.34     155.35          153.15

A dash marks a test with no result for that configuration.
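
Because the overview mixes "fewer is better" and "more is better" metrics, raw numbers are hard to compare across rows. The Python sketch below shows one possible way to normalize a few rows against GCC 4.8.1. The values are copied from the overview above and the direction of each metric from the per-test entries further down; the helper name relative_to_baseline, the selection of rows, and the choice of GCC 4.8.1 as the baseline are illustrative assumptions, not part of the export.

```python
# Selected results copied from the overview:
# (GCC 4.8.1 value, LLVM Clang 3.3 value, higher_is_better).
RESULTS = {
    "c-ray: Total Time (s)": (17.02, 27.03, False),
    "scimark2: Jacobi SOR (Mflops)": (1164.63, 1666.24, True),
    "build-imagemagick: Time To Compile (s)": (78.61, 34.35, False),
    "himeno: Poisson Pressure Solver (MFLOPS)": (1593.09, 1419.90, True),
}


def relative_to_baseline(baseline, candidate, higher_is_better):
    """Return the candidate's performance as a multiple of the baseline.

    A value above 1.0 means the candidate did better than the baseline,
    regardless of whether the metric is a time or a throughput.
    (Hypothetical helper, introduced only for this illustration.)
    """
    if higher_is_better:
        return candidate / baseline
    return baseline / candidate


if __name__ == "__main__":
    for name, (gcc48, clang33, higher) in RESULTS.items():
        ratio = relative_to_baseline(gcc48, clang33, higher)
        print(f"{name}: Clang 3.3 at {ratio:.2f}x of GCC 4.8.1")
```

Inverting the ratio for time-based metrics keeps "above 1.0 is better" true for every row, so compile-time and throughput results can be read on the same scale.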

Smallpt 1.0 - Global Illumination Renderer; 100 Samples (Seconds, Fewer Is Better)
  GCC 4.7.3       25        SE +/- 0.00, N = 3
  GCC 4.8.1       25        SE +/- 0.33, N = 3
  LLVM Clang 3.2  140       SE +/- 0.33, N = 3
  LLVM Clang 3.3  142       SE +/- 1.33, N = 3
  (CXX) g++ options: -fopenmp -march=core-avx2 -O3

Primesieve 4.2 - 1e12 Prime Number Generation (Seconds, Fewer Is Better)
  GCC 4.7.3       79.36     SE +/- 0.06, N = 3
  GCC 4.8.1       79.24     SE +/- 0.08, N = 3
  LLVM Clang 3.3  326.85    SE +/- 2.08, N = 3
  (CXX) g++ options: -O2
  Additional option on two of the three configurations: -fopenmp

Timed ImageMagick Compilation 6.8.1-10 - Time To Compile (Seconds, Fewer Is Better)
  GCC 4.7.3       93.67     SE +/- 0.60, N = 3
  GCC 4.8.1       78.61     SE +/- 0.07, N = 3
  LLVM Clang 3.2  31.94     SE +/- 0.12, N = 3
  LLVM Clang 3.3  34.35     SE +/- 0.37, N = 3

Timed PHP Compilation 5.2.9 - Time To Compile (Seconds, Fewer Is Better)
  GCC 4.7.3       32.92     SE +/- 0.14, N = 3
  GCC 4.8.1       33.30     SE +/- 0.46, N = 3
  LLVM Clang 3.2  19.59     SE +/- 0.03, N = 3
  LLVM Clang 3.3  21.03     SE +/- 0.20, N = 3
  (CC) gcc options: -march=core-avx2 -O3 -pedantic -ldl -lz -lm
  Additional option on one configuration: -lpthread

C-Ray 1.1 - Total Time (Seconds, Fewer Is Better)
  GCC 4.7.3       21.45     SE +/- 0.01, N = 3
  GCC 4.8.1       17.02     SE +/- 0.01, N = 3
  LLVM Clang 3.2  27.46     SE +/- 0.02, N = 3
  LLVM Clang 3.3  27.03     SE +/- 0.00, N = 3
  (CC) gcc options: -lm -lpthread -O3 -march=core-avx2

SciMark 2.0 - Computational Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
  GCC 4.7.3       1163.52   SE +/- 1.28, N = 4
  GCC 4.8.1       1164.63   SE +/- 1.11, N = 4
  LLVM Clang 3.2  1670.77   SE +/- 0.00, N = 4
  LLVM Clang 3.3  1666.24   SE +/- 1.85, N = 4

BLAKE2 20121223 - Phoronix Test Suite v4.8.0m0 (Cycles Per Byte, Fewer Is Better)
  GCC 4.7.3       5.32      SE +/- 0.00, N = 3
  GCC 4.8.1       5.30      SE +/- 0.01, N = 3
  LLVM Clang 3.2  7.54      SE +/- 0.11, N = 3
  LLVM Clang 3.3  7.45      SE +/- 0.13, N = 4
  (CC) gcc options: -std=gnu99 -O3 -march=native -lcrypto -lz

SciMark 2.0 - Computational Test: Monte Carlo (Mflops, More Is Better)
  GCC 4.7.3       450.21    SE +/- 0.55, N = 4
  GCC 4.8.1       615.32    SE +/- 0.00, N = 4
  LLVM Clang 3.2  615.04    SE +/- 5.67, N = 4
  LLVM Clang 3.3  619.77    SE +/- 0.89, N = 4

Timed MAFFT Alignment 6.864 - Multiple Sequence Alignment (Seconds, Fewer Is Better)
  GCC 4.7.3       5.27      SE +/- 0.02, N = 3
  GCC 4.8.1       5.17      SE +/- 0.03, N = 3
  LLVM Clang 3.2  5.96      SE +/- 0.10, N = 3
  LLVM Clang 3.3  6.19      SE +/- 0.12, N = 3
  (CC) gcc options: -O3 -lm -lpthread

Himeno Benchmark 3.0 - Poisson Pressure Solver (MFLOPS, More Is Better)
  GCC 4.7.3       1663.97   SE +/- 0.19, N = 3
  GCC 4.8.1       1593.09   SE +/- 20.45, N = 3
  LLVM Clang 3.2  1532.51   SE +/- 2.02, N = 3
  LLVM Clang 3.3  1419.90   SE +/- 13.80, N = 3
  (CC) gcc options: -O3 -march=core-avx2

Timed HMMer Search 2.3.2 - Pfam Database Search (Seconds, Fewer Is Better)
  GCC 4.7.3       10.41     SE +/- 0.02, N = 3
  GCC 4.8.1       10.30     SE +/- 0.00, N = 3
  LLVM Clang 3.2  11.92     SE +/- 0.00, N = 3
  LLVM Clang 3.3  10.94     SE +/- 0.18, N = 4
  (CC) gcc options: -march=core-avx2 -O3 -pthread -lhmmer -lsquid -lm

LAME MP3 Encoding 3.99.3 - WAV To MP3 (Seconds, Fewer Is Better)
  LLVM Clang 3.2  12.74     SE +/- 0.01, N = 5
  LLVM Clang 3.3  14.45     SE +/- 0.01, N = 5
  (CC) gcc options: -pipe -march=core-avx2 -O3 -lm

SciMark 2.0 - Computational Test: Sparse Matrix Multiply (Mflops, More Is Better)
  GCC 4.7.3       1177.86   SE +/- 0.85, N = 4
  GCC 4.8.1       1123.80   SE +/- 5.13, N = 4
  LLVM Clang 3.2  1234.19   SE +/- 30.06, N = 4
  LLVM Clang 3.3  1263.29   SE +/- 5.09, N = 4

SciMark 2.0 - Computational Test: Fast Fourier Transform (Mflops, More Is Better)
  GCC 4.7.3       251.67    SE +/- 0.68, N = 4
  GCC 4.8.1       245.00    SE +/- 0.34, N = 4
  LLVM Clang 3.2  246.79    SE +/- 1.60, N = 4
  LLVM Clang 3.3  237.86    SE +/- 0.98, N = 4

TSCP 1.81 - AI Chess Performance (Nodes Per Second, More Is Better)
  GCC 4.7.3       631626    SE +/- 0.00, N = 5
  GCC 4.8.1       599455    SE +/- 520.60, N = 5
  LLVM Clang 3.2  624323    SE +/- 141.40, N = 5
  LLVM Clang 3.3  624749    SE +/- 480.04, N = 5

Tachyon 0.98.9 - Total Time (Seconds, Fewer Is Better)
  LLVM Clang 3.2  10.44     SE +/- 0.06, N = 3
  LLVM Clang 3.3  10.98     SE +/- 0.24, N = 6
  (CC) gcc options: -m32 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread

SciMark 2.0 - Computational Test: Dense LU Matrix Factorization (Mflops, More Is Better)
  GCC 4.7.3       1861.55   SE +/- 1.88, N = 4
  GCC 4.8.1       1773.35   SE +/- 1.48, N = 4
  LLVM Clang 3.2  1774.03   SE +/- 36.01, N = 4
  LLVM Clang 3.3  1827.34   SE +/- 22.59, N = 4

FFmpeg 1.1 - H.264 HD To NTSC DV (Seconds, Fewer Is Better)
  GCC 4.7.3       12.86     SE +/- 0.08, N = 3    additional options: -fno-tree-vectorize -MF -MT
  GCC 4.8.1       12.81     SE +/- 0.06, N = 3    additional options: -fno-tree-vectorize -MF -MT
  LLVM Clang 3.2  12.77     SE +/- 0.04, N = 3    additional options: -Qunused-arguments
  LLVM Clang 3.3  13.18     SE +/- 0.12, N = 3    additional options: -Qunused-arguments
  (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -ldl -lasound -lSDL -lm -pthread -lbz2 -march=core-avx2 -O3 -std=c99 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -MMD

Apache Benchmark 2.4.3 - Static Web Page Serving (Requests Per Second, More Is Better)
  GCC 4.7.3       25743.99  SE +/- 51.73, N = 3
  GCC 4.8.1       25786.15  SE +/- 36.32, N = 3
  LLVM Clang 3.2  25888.95  SE +/- 32.41, N = 3
  LLVM Clang 3.3  25295.82  SE +/- 280.78, N = 3
  (CC) gcc options: -shared -fPIC -pthread -march=core-avx2 -O3

x264 2013-06-08 - H.264 Video Encoding (Frames Per Second, More Is Better)
  GCC 4.7.3       155.33    SE +/- 0.33, N = 5
  GCC 4.8.1       156.34    SE +/- 0.40, N = 5
  LLVM Clang 3.2  155.35    SE +/- 0.60, N = 5
  LLVM Clang 3.3  153.15    SE +/- 0.58, N = 5
  (CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -march=core-avx2 -std=gnu99 -fomit-frame-pointer -fno-tree-vectorize

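
Every result above carries a standard error ("SE +/-") over N runs. The export does not state how that figure is derived; the Python sketch below assumes the conventional definition, the sample standard deviation divided by the square root of N. The three run times are hypothetical (the export contains only aggregates) and were picked so that the output resembles an entry such as "SE +/- 0.33, N = 3".

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N).

    Assumed definition; the export itself does not document the formula.
    """
    if len(samples) < 2:
        raise ValueError("need at least two runs")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run times in seconds; not taken from the export.
runs = [25.0, 25.0, 26.0]

print(
    f"mean = {statistics.mean(runs):.2f}, "
    f"SE +/- {standard_error(runs):.2f}, N = {len(runs)}"
)
```

Under this definition, a small SE relative to the gap between two compilers suggests the gap is not run-to-run noise, while results such as Apache (SE +/- 280.78 for LLVM Clang 3.3) leave less room for firm conclusions.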
Phoronix Test Suite v10.8.5