LLVM Clang 3.4 Intel Core i7 Haswell Core-AVX2

Intel Core i7-4770K Haswell testing of GCC 4.8.1 and an early GCC 4.9.0 compiler snapshot, along with LLVM Clang 3.3 and an LLVM Clang 3.4 development snapshot. Testing was done with CFLAGS/CXXFLAGS of -O3 -march=core-avx2 for the Intel Core i7 Haswell CPU. Benchmarking by Michael Larabel for a future article on phoronix.com.

HTML result view exported from: https://openbenchmarking.org/result/1306273-SO-CLANG34LL14&grt&sro .

System configuration (identical for all four results except the compiler):

  Processor:         Intel Core i7-4770K @ 3.50GHz (8 Cores)
  Motherboard:       Intel DH87RL
  Chipset:           Intel 4th Gen Core DRAM
  Memory:            15360MB
  Disk:              240GB OCZ VERTEX3
  Graphics:          Intel Haswell Desktop
  Audio:             Intel Haswell HDMI
  Monitor:           VA2431
  Network:           Intel Connection I217-V
  OS:                Ubuntu 13.10
  Kernel:            3.10.0-999-generic (x86_64)
  Desktop:           KDE 4.10.4
  Display Server:    X Server 1.13.3
  Display Driver:    intel 2.21.9
  OpenGL:            3.0 Mesa 9.2.0-devel (git-bbd2d57)
  File-System:       ext4
  Screen Resolution: 1920x1080

Compiler, per result:

  GCC 4.8.1:             GCC 4.8.1 + Clang 3.3 + LLVM 3.3
  GCC 4.9.0 20130623:    GCC 4.9.0 20130623 + Clang 3.3 + LLVM 3.3
  LLVM Clang 3.3:        Clang 3.3 + LLVM 3.3
  LLVM 3.4 SVN 20130626: Clang 3.4 (SVN 185044) + LLVM 3.4svn

Compiler Details
  - GCC 4.8.1: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
  - GCC 4.9.0 20130623: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
  - LLVM Clang 3.3: Optimized build; Built Jun 20 2013 (17:06:21); Default target: x86_64-unknown-linux-gnu; Host CPU: x86-64
  - LLVM 3.4 SVN 20130626: Optimized build; Built Jun 26 2013 (19:58:17); Default target: x86_64-unknown-linux-gnu; Host CPU: x86-64

Processor Details
  - Scaling Governor: acpi-cpufreq ondemand

Results overview:

Test                                                GCC 4.8.1  GCC 4.9.0 20130623  LLVM Clang 3.3  LLVM 3.4 SVN 20130626
blake2: Phoronix Test Suite v4.8.0m1                5.71       5.29                7.40            7.81
c-ray: Total Time                                   17.06      17.03               27.06           26.22
encode-flac: WAV To FLAC                            5.37       5.26                5.64            4.55
graphics-magick: Resizing                           175        181                 91              91
himeno: Poisson Pressure Solver                     1048.77    1588.37             1586.39         1395.01
n-queens: Elapsed Time                              36.69      36.72               184.28          186.32
primesieve: 1e12 Prime Number Generation            79.17      79.20               323.37          320.23
scimark2: Composite                                 1009.99    1005.56             1102.35         1204.77
scimark2: Fast Fourier Transform                    242.66     248.01              249.83          239.90
scimark2: Sparse Matrix Multiply                    1204.00    1148.95             1228.19         1182.30
scimark2: Dense LU Matrix Factorization             1825.43    1851.86             1755.85         2397.89
scimark2: Jacobi Successive Over-Relaxation         1169.09    1170.18             1673.05         1613.32
smallpt: Global Illumination Renderer; 100 Samples  25         25                  140             140
hmmer: Pfam Database Search                         10.47      10.40               10.81           10.70
build-imagemagick: Time To Compile                  79.18      74.59               34.39           34.37
mafft: Multiple Sequence Alignment                  5.47       5.52                6.09            5.98
x264: H.264 Video Encoding                          156.26     155.67              155.01          152.82

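Because the overview mixes "fewer is better" and "more is better" tests, the raw numbers are hard to compare across rows. The sketch below is one way to normalize a few of them against GCC 4.8.1 so that a factor above 1.0 always means "faster than the baseline". It is only an illustration and not part of the Phoronix Test Suite: the function and variable names are hypothetical, the choice of GCC 4.8.1 as baseline is an assumption, and only four rows are copied from the overview, with the direction of each test taken from its individual result.

```python
# Values copied from the results overview, in the order
# (GCC 4.8.1, GCC 4.9.0 20130623, LLVM Clang 3.3, LLVM 3.4 SVN 20130626).
# The boolean says whether a higher number is better for that test.
COMPILERS = ("GCC 4.8.1", "GCC 4.9.0 20130623", "LLVM Clang 3.3", "LLVM 3.4 SVN 20130626")

RESULTS = {
    "c-ray (seconds)": ((17.06, 17.03, 27.06, 26.22), False),
    "himeno (MFLOPS)": ((1048.77, 1588.37, 1586.39, 1395.01), True),
    "scimark2 composite (Mflops)": ((1009.99, 1005.56, 1102.35, 1204.77), True),
    "build-imagemagick (seconds)": ((79.18, 74.59, 34.39, 34.37), False),
}


def relative_to_baseline(values, higher_is_better):
    """Return one factor per value, relative to the first value.

    A factor above 1.0 means the result is better than the baseline,
    regardless of whether the test reports a time or a throughput.
    """
    base = values[0]
    if higher_is_better:
        return [v / base for v in values]
    return [base / v for v in values]


def speedups(results=RESULTS):
    """Map each test name to its list of baseline-relative factors."""
    return {name: relative_to_baseline(vals, hib) for name, (vals, hib) in results.items()}


if __name__ == "__main__":
    for name, factors in speedups().items():
        row = ", ".join(f"{c}: {f:.2f}x" for c, f in zip(COMPILERS, factors))
        print(f"{name}: {row}")
```

Inverting the ratio for time-based tests keeps every row on the same scale, which is why the direction flag is stored next to the numbers rather than inferred from the unit.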
BLAKE2 20121223
Phoronix Test Suite v4.8.0m1
Cycles Per Byte, Fewer Is Better
  GCC 4.8.1:             5.71 (SE +/- 0.08, N = 3)
  GCC 4.9.0 20130623:    5.29 (SE +/- 0.00, N = 3)
  LLVM 3.4 SVN 20130626: 7.81 (SE +/- 0.02, N = 3)
  LLVM Clang 3.3:        7.40 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -std=gnu99 -O3 -march=native -lcrypto -lz

C-Ray 1.1
Total Time
Seconds, Fewer Is Better
  GCC 4.8.1:             17.06 (SE +/- 0.00, N = 3)
  GCC 4.9.0 20130623:    17.03 (SE +/- 0.01, N = 3)
  LLVM 3.4 SVN 20130626: 26.22 (SE +/- 0.00, N = 3)
  LLVM Clang 3.3:        27.06 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -lm -lpthread -O3 -march=core-avx2

FLAC Audio Encoding 1.3.0
WAV To FLAC
Seconds, Fewer Is Better
  GCC 4.8.1:             5.37 (SE +/- 0.00, N = 5)
  GCC 4.9.0 20130623:    5.26 (SE +/- 0.00, N = 5)
  LLVM 3.4 SVN 20130626: 4.55 (SE +/- 0.00, N = 5)
  LLVM Clang 3.3:        5.64 (SE +/- 0.00, N = 5)
1. (CXX) g++ options: -O3 -march=core-avx2 -fvisibility=hidden -logg -lm

GraphicsMagick 1.3.16
Operation: Resizing
Iterations Per Minute, More Is Better
  GCC 4.8.1:             175 (SE +/- 2.73, N = 3)
  GCC 4.9.0 20130623:    181 (SE +/- 0.00, N = 3)
  LLVM 3.4 SVN 20130626: 91 (SE +/- 0.00, N = 3)
  LLVM Clang 3.3:        91 (SE +/- 0.00, N = 3)
Additional options reported for two of the four results: -std=gnu99 -fopenmp
1. (CC) gcc options: -O3 -march=core-avx2 -pthread -lXext -lSM -lICE -lX11 -lbz2 -lz -lm -lpthread

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS, More Is Better
  GCC 4.8.1:             1048.77 (SE +/- 5.10, N = 3)
  GCC 4.9.0 20130623:    1588.37 (SE +/- 2.58, N = 3)
  LLVM 3.4 SVN 20130626: 1395.01 (SE +/- 82.18, N = 6)
  LLVM Clang 3.3:        1586.39 (SE +/- 0.45, N = 3)
1. (CC) gcc options: -O3 -march=core-avx2

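The Himeno numbers include one result with a much larger standard error than the rest (SE +/- 82.18 over N = 6 runs). A quick way to judge how much weight such a result deserves is to express each standard error as a percentage of its mean. The sketch below does that for the four Himeno results. It assumes the SE entries are listed in the same order as the result values; the function names and the 3% cut-off are hypothetical choices made for illustration, not settings taken from the Phoronix Test Suite.

```python
# (mean MFLOPS, standard error, number of runs) from the Himeno result.
# Assumption: the SE entries follow the same order as the result values.
HIMENO = {
    "GCC 4.8.1": (1048.77, 5.10, 3),
    "GCC 4.9.0 20130623": (1588.37, 2.58, 3),
    "LLVM 3.4 SVN 20130626": (1395.01, 82.18, 6),
    "LLVM Clang 3.3": (1586.39, 0.45, 3),
}


def relative_standard_error(mean, se):
    """Standard error expressed as a percentage of the mean."""
    return 100.0 * se / mean


def noisy_results(results, threshold_pct=3.0):
    """Names of results whose relative standard error exceeds threshold_pct.

    The default threshold is an arbitrary illustrative value.
    """
    return [
        name
        for name, (mean, se, _runs) in results.items()
        if relative_standard_error(mean, se) > threshold_pct
    ]


if __name__ == "__main__":
    for name, (mean, se, runs) in HIMENO.items():
        print(f"{name}: {relative_standard_error(mean, se):.2f}% of mean (N = {runs})")
    print("Above threshold:", noisy_results(HIMENO))
```

A result flagged this way is better read as a range than as a single number when comparing it with the other compilers.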
N-Queens 1.0
Elapsed Time
Seconds, Fewer Is Better
  GCC 4.8.1:             36.69 (SE +/- 0.04, N = 3)
  GCC 4.9.0 20130623:    36.72 (SE +/- 0.03, N = 3)
  LLVM 3.4 SVN 20130626: 186.32 (SE +/- 0.05, N = 3)
  LLVM Clang 3.3:        184.28 (SE +/- 0.11, N = 3)
1. (CC) gcc options: -static -fopenmp -O3 -march=core-avx2

Primesieve 4.2
1e12 Prime Number Generation
Seconds, Fewer Is Better
  GCC 4.8.1:             79.17 (SE +/- 0.14, N = 3)
  GCC 4.9.0 20130623:    79.20 (SE +/- 0.04, N = 3)
  LLVM 3.4 SVN 20130626: 320.23 (SE +/- 0.41, N = 3)
  LLVM Clang 3.3:        323.37 (SE +/- 0.19, N = 3)
Additional option reported for two of the four results: -fopenmp
1. (CXX) g++ options: -O2

SciMark 2.0
Computational Test: Composite
Mflops, More Is Better
  GCC 4.8.1:             1009.99 (SE +/- 3.60, N = 4)
  GCC 4.9.0 20130623:    1005.56 (SE +/- 0.86, N = 4)
  LLVM 3.4 SVN 20130626: 1204.77 (SE +/- 1.71, N = 4)
  LLVM Clang 3.3:        1102.35 (SE +/- 1.35, N = 4)
1. (CXX) g++ options: -O3 -march=core-avx2

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops, More Is Better
  GCC 4.8.1:             242.66 (SE +/- 2.71, N = 4)
  GCC 4.9.0 20130623:    248.01 (SE +/- 0.51, N = 4)
  LLVM 3.4 SVN 20130626: 239.90 (SE +/- 0.81, N = 3)
  LLVM Clang 3.3:        249.83 (SE +/- 0.21, N = 4)
1. (CXX) g++ options: -O3 -march=core-avx2

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops, More Is Better
  GCC 4.8.1:             1204.00 (SE +/- 8.39, N = 4)
  GCC 4.9.0 20130623:    1148.95 (SE +/- 1.61, N = 4)
  LLVM 3.4 SVN 20130626: 1182.30 (SE +/- 8.53, N = 4)
  LLVM Clang 3.3:        1228.19 (SE +/- 1.06, N = 4)
1. (CXX) g++ options: -O3 -march=core-avx2

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
  GCC 4.8.1:             1825.43 (SE +/- 17.54, N = 4)
  GCC 4.9.0 20130623:    1851.86 (SE +/- 2.63, N = 4)
  LLVM 3.4 SVN 20130626: 2397.89 (SE +/- 26.35, N = 4)
  LLVM Clang 3.3:        1755.85 (SE +/- 2.77, N = 4)
1. (CXX) g++ options: -O3 -march=core-avx2

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better
  GCC 4.8.1:             1169.09 (SE +/- 3.34, N = 4)
  GCC 4.9.0 20130623:    1170.18 (SE +/- 0.00, N = 4)
  LLVM 3.4 SVN 20130626: 1613.32 (SE +/- 20.67, N = 4)
  LLVM Clang 3.3:        1673.05 (SE +/- 1.32, N = 4)
1. (CXX) g++ options: -O3 -march=core-avx2

Smallpt 1.0
Global Illumination Renderer; 100 Samples
Seconds, Fewer Is Better
  GCC 4.8.1:             25 (SE +/- 0.00, N = 3)
  GCC 4.9.0 20130623:    25 (SE +/- 0.00, N = 3)
  LLVM 3.4 SVN 20130626: 140 (SE +/- 0.33, N = 3)
  LLVM Clang 3.3:        140 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -march=core-avx2

Timed HMMer Search 2.3.2
Pfam Database Search
Seconds, Fewer Is Better
  GCC 4.8.1:             10.47 (SE +/- 0.01, N = 3)
  GCC 4.9.0 20130623:    10.40 (SE +/- 0.03, N = 3)
  LLVM 3.4 SVN 20130626: 10.70 (SE +/- 0.02, N = 3)
  LLVM Clang 3.3:        10.81 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -O3 -march=core-avx2 -pthread -lhmmer -lsquid -lm

Timed ImageMagick Compilation 6.8.1-10
Time To Compile
Seconds, Fewer Is Better
  GCC 4.8.1:             79.18 (SE +/- 0.28, N = 3)
  GCC 4.9.0 20130623:    74.59 (SE +/- 0.11, N = 3)
  LLVM 3.4 SVN 20130626: 34.37 (SE +/- 0.12, N = 3)
  LLVM Clang 3.3:        34.39 (SE +/- 0.10, N = 3)

Timed MAFFT Alignment 6.864
Multiple Sequence Alignment
Seconds, Fewer Is Better
  GCC 4.8.1:             5.47 (SE +/- 0.13, N = 6)
  GCC 4.9.0 20130623:    5.52 (SE +/- 0.15, N = 6)
  LLVM 3.4 SVN 20130626: 5.98 (SE +/- 0.04, N = 3)
  LLVM Clang 3.3:        6.09 (SE +/- 0.10, N = 6)
1. (CC) gcc options: -O3 -lm -lpthread

x264 2013-06-08
H.264 Video Encoding
Frames Per Second, More Is Better
  GCC 4.8.1:             156.26 (SE +/- 0.79, N = 5)
  GCC 4.9.0 20130623:    155.67 (SE +/- 0.90, N = 5)
  LLVM 3.4 SVN 20130626: 152.82 (SE +/- 0.51, N = 5)
  LLVM Clang 3.3:        155.01 (SE +/- 0.15, N = 5)
1. (CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -march=core-avx2 -std=gnu99 -fomit-frame-pointer -fno-tree-vectorize

Phoronix Test Suite v10.8.5