LLVM Clang 3.4 Intel Core i7 Haswell Core-AVX2

Intel Core i7-4770K Haswell testing of GCC 4.8.1 and an early GCC 4.9.0 compiler snapshot, along with LLVM Clang 3.3 and an LLVM Clang 3.4 development snapshot. The CFLAGS/CXXFLAGS were set to -O3 -march=core-avx2 for the Intel Core i7 Haswell CPU. Benchmarking by Michael Larabel for a future article on phoronix.com.
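As a rough illustration of the flag setup described above, the following sketch builds a per-compiler environment carrying the stated CFLAGS/CXXFLAGS. The helper name build_env and the executable names (gcc, g++, clang, clang++) are assumptions made for this example; the result file does not say how the variables were exported or which binaries were invoked.

```python
import os

# Flags stated in the result description.
FLAGS = "-O3 -march=core-avx2"


def build_env(cc, cxx, base_env=None):
    """Return a copy of the environment with compiler and flag variables set.

    cc and cxx are hypothetical executable names; they are not taken from
    the result file.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update({"CC": cc, "CXX": cxx, "CFLAGS": FLAGS, "CXXFLAGS": FLAGS})
    return env


if __name__ == "__main__":
    # Assumed compiler pairs, for illustration only.
    for cc, cxx in (("gcc", "g++"), ("clang", "clang++")):
        env = build_env(cc, cxx, base_env={})
        print(cc, cxx, env["CFLAGS"], env["CXXFLAGS"])
```

A dictionary built this way could be passed as the env argument of subprocess.run when launching a build. Note that some tests in this result report their own options regardless (for example -march=native for BLAKE2 and -O2 for Primesieve).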

HTML result view exported from: https://openbenchmarking.org/result/1306273-SO-CLANG34LL14&sor&gru.

Processor:          Intel Core i7-4770K @ 3.50GHz (8 Cores)
Motherboard:        Intel DH87RL
Chipset:            Intel 4th Gen Core DRAM
Memory:             15360MB
Disk:               240GB OCZ VERTEX3
Graphics:           Intel Haswell Desktop
Audio:              Intel Haswell HDMI
Monitor:            VA2431
Network:            Intel Connection I217-V
OS:                 Ubuntu 13.10
Kernel:             3.10.0-999-generic (x86_64)
Desktop:            KDE 4.10.4
Display Server:     X Server 1.13.3
Display Driver:     intel 2.21.9
OpenGL:             3.0 Mesa 9.2.0-devel (git-bbd2d57)
File-System:        ext4
Screen Resolution:  1920x1080

Compiler (per configuration)
- GCC 4.8.1: GCC 4.8.1 + Clang 3.3 + LLVM 3.3
- GCC 4.9.0 20130623: GCC 4.9.0 20130623 + Clang 3.3 + LLVM 3.3
- LLVM Clang 3.3: Clang 3.3 + LLVM 3.3
- LLVM 3.4 SVN 20130626: Clang 3.4 (SVN 185044) + LLVM 3.4svn

Compiler Details
- GCC 4.8.1: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
- GCC 4.9.0 20130623: --disable-multilib --enable-checking=release --enable-languages=c,c++,fortran
- LLVM Clang 3.3: Optimized build; Built Jun 20 2013 (17:06:21); Default target: x86_64-unknown-linux-gnu; Host CPU: x86-64
- LLVM 3.4 SVN 20130626: Optimized build; Built Jun 26 2013 (19:58:17); Default target: x86_64-unknown-linux-gnu; Host CPU: x86-64

Processor Details
- Scaling Governor: acpi-cpufreq ondemand

Test                                                GCC 4.8.1  GCC 4.9.0 20130623  LLVM Clang 3.3  LLVM 3.4 SVN 20130626
x264: H.264 Video Encoding                          156.26     155.67              155.01          152.82
graphics-magick: Resizing                           175        181                 91              91
scimark2: Composite                                 1009.99    1005.56             1102.35         1204.77
scimark2: Fast Fourier Transform                    242.66     248.01              249.83          239.90
scimark2: Sparse Matrix Multiply                    1204.00    1148.95             1228.19         1182.30
scimark2: Dense LU Matrix Factorization             1825.43    1851.86             1755.85         2397.89
scimark2: Jacobi Successive Over-Relaxation         1169.09    1170.18             1673.05         1613.32
himeno: Poisson Pressure Solver                     1048.77    1588.37             1586.39         1395.01
blake2: Phoronix Test Suite v4.8.0m1                5.71       5.29                7.40            7.81
hmmer: Pfam Database Search                         10.47      10.40               10.81           10.70
mafft: Multiple Sequence Alignment                  5.47       5.52                6.09            5.98
build-imagemagick: Time To Compile                  79.18      74.59               34.39           34.37
c-ray: Total Time                                   17.06      17.03               27.06           26.22
primesieve: 1e12 Prime Number Generation            79.17      79.20               323.37          320.23
smallpt: Global Illumination Renderer; 100 Samples  25         25                  140             140
encode-flac: WAV To FLAC                            5.37       5.26                5.64            4.55
n-queens: Elapsed Time                              36.69      36.72               184.28          186.32
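The overview mixes metrics where more is better (Mflops, frames per second) with metrics where fewer is better (seconds, cycles per byte), so a comparison between two configurations has to account for direction. The sketch below shows one way to express the change from LLVM Clang 3.3 to the LLVM 3.4 snapshot as a percentage, using three rows copied from the overview. The function names and the higher_is_better field are hypothetical helpers for this example, not part of the Phoronix Test Suite; the direction of each metric is taken from the per-test results further down.

```python
CLANG33 = "LLVM Clang 3.3"
LLVM34 = "LLVM 3.4 SVN 20130626"

# Values copied from the results overview; direction taken from the
# "More Is Better" / "Fewer Is Better" labels of the per-test results.
RESULTS = {
    "scimark2: Dense LU Matrix Factorization": {
        "higher_is_better": True, CLANG33: 1755.85, LLVM34: 2397.89},
    "encode-flac: WAV To FLAC": {
        "higher_is_better": False, CLANG33: 5.64, LLVM34: 4.55},
    "blake2: Phoronix Test Suite v4.8.0m1": {
        "higher_is_better": False, CLANG33: 7.40, LLVM34: 7.81},
}


def percent_gain(baseline, candidate, higher_is_better):
    """Percentage by which candidate beats baseline (negative if it is worse)."""
    ratio = candidate / baseline if higher_is_better else baseline / candidate
    return (ratio - 1.0) * 100.0


def gains(results, baseline, candidate):
    """Map each test name to the gain of candidate over baseline."""
    return {
        test: percent_gain(row[baseline], row[candidate], row["higher_is_better"])
        for test, row in results.items()
    }


if __name__ == "__main__":
    for test, gain in gains(RESULTS, CLANG33, LLVM34).items():
        print(f"{test}: {gain:+.1f}%")
```

Inverting the ratio for fewer-is-better metrics keeps the sign convention uniform: a positive number always means the candidate configuration did better.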

x264

H.264 Video Encoding

x264 2013-06-08: Frames Per Second, More Is Better

GCC 4.8.1:              156.26  (SE +/- 0.79, N = 5)
GCC 4.9.0 20130623:     155.67  (SE +/- 0.90, N = 5)
LLVM Clang 3.3:         155.01  (SE +/- 0.15, N = 5)
LLVM 3.4 SVN 20130626:  152.82  (SE +/- 0.51, N = 5)

(CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -march=core-avx2 -std=gnu99 -fomit-frame-pointer -fno-tree-vectorize
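Each result carries a standard error (SE) and a run count (N), which helps judge whether a gap between two compilers is larger than the run-to-run noise. The sketch below expresses the gap between two means in units of their combined standard error, using the x264 figures above. It assumes the two measurements are independent and that the SE values pair with the compilers in the order the export lists them; the function name separation is hypothetical, and this is a rough screening heuristic rather than a formal significance test.

```python
import math


def separation(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by their combined standard error.

    Assumes independent measurements. Returns infinity when both standard
    errors are zero but the means differ, and 0.0 when the means are equal.
    """
    diff = abs(mean_a - mean_b)
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined_se == 0.0:
        return 0.0 if diff == 0.0 else math.inf
    return diff / combined_se


if __name__ == "__main__":
    # x264 frames per second: (mean, SE) per configuration.
    gcc481 = (156.26, 0.79)
    gcc490 = (155.67, 0.90)
    llvm34 = (152.82, 0.51)
    print("GCC 4.8.1 vs GCC 4.9.0:", round(separation(*gcc481, *gcc490), 2))
    print("GCC 4.8.1 vs LLVM 3.4:", round(separation(*gcc481, *llvm34), 2))
```

A value well below 1 suggests the two results are hard to tell apart given the measured spread, while a value of several units suggests a difference that is unlikely to be noise alone.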

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.16: Iterations Per Minute, More Is Better

GCC 4.9.0 20130623:     181  (SE +/- 0.00, N = 3)
GCC 4.8.1:              175  (SE +/- 2.73, N = 3)
LLVM 3.4 SVN 20130626:  91   (SE +/- 0.00, N = 3)
LLVM Clang 3.3:         91   (SE +/- 0.00, N = 3)

(CC) gcc options: -O3 -march=core-avx2 -pthread -lXext -lSM -lICE -lX11 -lbz2 -lz -lm -lpthread
Additional options reported for two of the four configurations: -std=gnu99 -fopenmp

SciMark

Computational Test: Composite

SciMark 2.0: Mflops, More Is Better

LLVM 3.4 SVN 20130626:  1204.77  (SE +/- 1.71, N = 4)
LLVM Clang 3.3:         1102.35  (SE +/- 1.35, N = 4)
GCC 4.8.1:              1009.99  (SE +/- 3.60, N = 4)
GCC 4.9.0 20130623:     1005.56  (SE +/- 0.86, N = 4)

(CXX) g++ options: -O3 -march=core-avx2

SciMark

Computational Test: Fast Fourier Transform

SciMark 2.0: Mflops, More Is Better

LLVM Clang 3.3:         249.83  (SE +/- 0.21, N = 4)
GCC 4.9.0 20130623:     248.01  (SE +/- 0.51, N = 4)
GCC 4.8.1:              242.66  (SE +/- 2.71, N = 4)
LLVM 3.4 SVN 20130626:  239.90  (SE +/- 0.81, N = 3)

(CXX) g++ options: -O3 -march=core-avx2

SciMark

Computational Test: Sparse Matrix Multiply

SciMark 2.0: Mflops, More Is Better

LLVM Clang 3.3:         1228.19  (SE +/- 1.06, N = 4)
GCC 4.8.1:              1204.00  (SE +/- 8.39, N = 4)
LLVM 3.4 SVN 20130626:  1182.30  (SE +/- 8.53, N = 4)
GCC 4.9.0 20130623:     1148.95  (SE +/- 1.61, N = 4)

(CXX) g++ options: -O3 -march=core-avx2

SciMark

Computational Test: Dense LU Matrix Factorization

SciMark 2.0: Mflops, More Is Better

LLVM 3.4 SVN 20130626:  2397.89  (SE +/- 26.35, N = 4)
GCC 4.9.0 20130623:     1851.86  (SE +/- 2.63, N = 4)
GCC 4.8.1:              1825.43  (SE +/- 17.54, N = 4)
LLVM Clang 3.3:         1755.85  (SE +/- 2.77, N = 4)

(CXX) g++ options: -O3 -march=core-avx2

SciMark

Computational Test: Jacobi Successive Over-Relaxation

SciMark 2.0: Mflops, More Is Better

LLVM Clang 3.3:         1673.05  (SE +/- 1.32, N = 4)
LLVM 3.4 SVN 20130626:  1613.32  (SE +/- 20.67, N = 4)
GCC 4.9.0 20130623:     1170.18  (SE +/- 0.00, N = 4)
GCC 4.8.1:              1169.09  (SE +/- 3.34, N = 4)

(CXX) g++ options: -O3 -march=core-avx2

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0: MFLOPS, More Is Better

GCC 4.9.0 20130623:     1588.37  (SE +/- 2.58, N = 3)
LLVM Clang 3.3:         1586.39  (SE +/- 0.45, N = 3)
LLVM 3.4 SVN 20130626:  1395.01  (SE +/- 82.18, N = 6)
GCC 4.8.1:              1048.77  (SE +/- 5.10, N = 3)

(CC) gcc options: -O3 -march=core-avx2

BLAKE2

Phoronix Test Suite v4.8.0m1

BLAKE2 20121223: Cycles Per Byte, Fewer Is Better

GCC 4.9.0 20130623:     5.29  (SE +/- 0.00, N = 3)
GCC 4.8.1:              5.71  (SE +/- 0.08, N = 3)
LLVM Clang 3.3:         7.40  (SE +/- 0.02, N = 3)
LLVM 3.4 SVN 20130626:  7.81  (SE +/- 0.02, N = 3)

(CC) gcc options: -std=gnu99 -O3 -march=native -lcrypto -lz

Timed HMMer Search

Pfam Database Search

Timed HMMer Search 2.3.2: Seconds, Fewer Is Better

GCC 4.9.0 20130623:     10.40  (SE +/- 0.03, N = 3)
GCC 4.8.1:              10.47  (SE +/- 0.01, N = 3)
LLVM 3.4 SVN 20130626:  10.70  (SE +/- 0.02, N = 3)
LLVM Clang 3.3:         10.81  (SE +/- 0.02, N = 3)

(CC) gcc options: -O3 -march=core-avx2 -pthread -lhmmer -lsquid -lm

Timed MAFFT Alignment

Multiple Sequence Alignment

Timed MAFFT Alignment 6.864: Seconds, Fewer Is Better

GCC 4.8.1:              5.47  (SE +/- 0.13, N = 6)
GCC 4.9.0 20130623:     5.52  (SE +/- 0.15, N = 6)
LLVM 3.4 SVN 20130626:  5.98  (SE +/- 0.04, N = 3)
LLVM Clang 3.3:         6.09  (SE +/- 0.10, N = 6)

(CC) gcc options: -O3 -lm -lpthread

Timed ImageMagick Compilation

Time To Compile

Timed ImageMagick Compilation 6.8.1-10: Seconds, Fewer Is Better

LLVM 3.4 SVN 20130626:  34.37  (SE +/- 0.12, N = 3)
LLVM Clang 3.3:         34.39  (SE +/- 0.10, N = 3)
GCC 4.9.0 20130623:     74.59  (SE +/- 0.11, N = 3)
GCC 4.8.1:              79.18  (SE +/- 0.28, N = 3)

C-Ray

Total Time

C-Ray 1.1: Seconds, Fewer Is Better

GCC 4.9.0 20130623:     17.03  (SE +/- 0.01, N = 3)
GCC 4.8.1:              17.06  (SE +/- 0.00, N = 3)
LLVM 3.4 SVN 20130626:  26.22  (SE +/- 0.00, N = 3)
LLVM Clang 3.3:         27.06  (SE +/- 0.00, N = 3)

(CC) gcc options: -lm -lpthread -O3 -march=core-avx2

Primesieve

1e12 Prime Number Generation

Primesieve 4.2: Seconds, Fewer Is Better

GCC 4.8.1:              79.17   (SE +/- 0.14, N = 3)
GCC 4.9.0 20130623:     79.20   (SE +/- 0.04, N = 3)
LLVM 3.4 SVN 20130626:  320.23  (SE +/- 0.41, N = 3)
LLVM Clang 3.3:         323.37  (SE +/- 0.19, N = 3)

(CXX) g++ options: -O2
Additional option reported for two of the four configurations: -fopenmp

Smallpt

Global Illumination Renderer; 100 Samples

Smallpt 1.0: Seconds, Fewer Is Better

GCC 4.8.1:              25   (SE +/- 0.00, N = 3)
GCC 4.9.0 20130623:     25   (SE +/- 0.00, N = 3)
LLVM Clang 3.3:         140  (SE +/- 0.33, N = 3)
LLVM 3.4 SVN 20130626:  140  (SE +/- 0.33, N = 3)

(CXX) g++ options: -fopenmp -O3 -march=core-avx2

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.0: Seconds, Fewer Is Better

LLVM 3.4 SVN 20130626:  4.55  (SE +/- 0.00, N = 5)
GCC 4.9.0 20130623:     5.26  (SE +/- 0.00, N = 5)
GCC 4.8.1:              5.37  (SE +/- 0.00, N = 5)
LLVM Clang 3.3:         5.64  (SE +/- 0.00, N = 5)

(CXX) g++ options: -O3 -march=core-avx2 -fvisibility=hidden -logg -lm

N-Queens

Elapsed Time

N-Queens 1.0: Seconds, Fewer Is Better

GCC 4.8.1:              36.69   (SE +/- 0.04, N = 3)
GCC 4.9.0 20130623:     36.72   (SE +/- 0.03, N = 3)
LLVM Clang 3.3:         184.28  (SE +/- 0.11, N = 3)
LLVM 3.4 SVN 20130626:  186.32  (SE +/- 0.05, N = 3)

(CC) gcc options: -static -fopenmp -O3 -march=core-avx2


Phoronix Test Suite v10.8.5