LLVM Clang 3.2 Loop Vectorizer

Intel Core i7-3960X testing of the automatic loop vectorizer in LLVM 3.2 with the Clang compiler. Benchmarking by Michael Larabel for a future article on phoronix.com.

HTML result view exported from: https://openbenchmarking.org/result/1210264-RA-LLVMLOOPV66&grr&rdt.

System Configuration

Result identifiers: Loop Vectorization, Default

Processor: Intel Core i7-3960X @ 3.30GHz (12 Cores)
Motherboard: Intel DX79SI
Chipset: Intel Xeon E5/Core
Memory: 8192MB
Disk: 64GB OCZ VERTEX
Graphics: AMD Radeon HD 4650 512MB
Audio: Realtek ALC892
Monitor: DELL S2409W
Network: Intel 82579LM Gigabit Connection
OS: Ubuntu 12.10
Kernel: 3.5.0-17-generic (x86_64)
Desktop: Unity 6.8.0
Display Server: X Server 1.13.0
Display Driver: radeon 6.99.99
OpenGL: 2.1 Mesa 9.0 Gallium 0.4
Compiler: Clang 3.2 (SVN 166775) + LLVM 3.2svn
File-System: ext4
Screen Resolution: 1920x1080

Compiler Details: Optimized build; Built Oct 26 2012 (10:29:50); Default target: x86_64-unknown-linux-gnu; Host CPU: corei7-avx
Processor Details: Scaling Governor: ondemand
System Details: Compiz was running on this system.

Result Overview

Test                                            Loop Vectorization   Default
pgbench: TPC-B Transactions Per Second          336.65               324.77
smallpt: Global Illumination Renderer; 100 Samples   157             153
c-ray: Total Time                               23.15                20.92
himeno: Poisson Pressure Solver                 1554.66              1595.35
graphics-magick: Local Adaptive Thresholding    20                   44
graphics-magick: HWB Color Space                122                  123
graphics-magick: Resizing                       84                   89
graphics-magick: Sharpen                        31                   31
graphics-magick: Blur                           73                   82
hmmer: Pfam Database Search                     15.37                15.35
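As a rough aid to reading the overview figures, the Python sketch below expresses each Loop Vectorization result as a percentage change relative to Default. The function name relative_change and the sign convention (positive means the Loop Vectorization build was faster, for both "More Is Better" and "Fewer Is Better" tests) are assumptions made for illustration and are not part of the exported result. The values are copied from the overview.

```python
# Each entry: (test, loop_vectorization, default, more_is_better).
# Values are copied from the result overview; the layout is illustrative.
RESULTS = [
    ("pgbench: TPC-B Transactions Per Second (TPS)", 336.65, 324.77, True),
    ("smallpt: 100 Samples (seconds)", 157, 153, False),
    ("c-ray: Total Time (seconds)", 23.15, 20.92, False),
    ("himeno: Poisson Pressure Solver (MFLOPS)", 1554.66, 1595.35, True),
    ("graphics-magick: Local Adaptive Thresholding", 20, 44, True),
    ("graphics-magick: HWB Color Space", 122, 123, True),
    ("graphics-magick: Resizing", 84, 89, True),
    ("graphics-magick: Sharpen", 31, 31, True),
    ("graphics-magick: Blur", 73, 82, True),
    ("hmmer: Pfam Database Search (seconds)", 15.37, 15.35, False),
]


def relative_change(vectorized, default, more_is_better):
    """Percent change of the vectorized run relative to the default run.

    Assumed convention: the result is a throughput-style ratio, so a
    positive value always means the vectorized build was faster.
    """
    if more_is_better:
        ratio = vectorized / default
    else:
        # For timings, a smaller value is faster, so invert the ratio.
        ratio = default / vectorized
    return (ratio - 1.0) * 100.0


if __name__ == "__main__":
    for name, vec, base, more in RESULTS:
        print(f"{name:<50} {relative_change(vec, base, more):+7.2f}%")
```

With only three runs per test, small differences should be read against the standard errors reported for each test rather than taken at face value.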

PostgreSQL pgbench

TPC-B Transactions Per Second

PostgreSQL pgbench 8.4.11 (TPS, More Is Better)
Loop Vectorization: 336.65 (SE +/- 3.97, N = 3)
Default: 324.77 (SE +/- 0.52, N = 3)
Per-run flags shown on the chart: -mllvm -vectorize
1. (CC) gcc options: -O3 -march=native -fno-strict-aliasing -fwrapv -lpgport -lpq -lcrypt -ldl -lm

Smallpt

Global Illumination Renderer; 100 Samples

Smallpt 1.0 (Seconds, Fewer Is Better)
Loop Vectorization: 157 (SE +/- 0.33, N = 3)
Default: 153 (SE +/- 0.33, N = 3)
Per-run flags shown on the chart: -mllvm -vectorize
1. (CXX) g++ options: -fopenmp -O3 -march=native

C-Ray

Total Time

C-Ray 1.1 (Seconds, Fewer Is Better)
Loop Vectorization: 23.15 (SE +/- 0.04, N = 3)
Default: 20.92 (SE +/- 0.03, N = 3)
Per-run flags shown on the chart: -mllvm -vectorize
1. (CC) gcc options: -lm -lpthread -O3 -march=native

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0 (MFLOPS, More Is Better)
Loop Vectorization: 1554.66 (SE +/- 6.98, N = 3)
Default: 1595.35 (SE +/- 3.05, N = 3)
Per-run flags shown on the chart: -mllvm -vectorize
1. (CC) gcc options: -O3 -march=native

GraphicsMagick

Operation: Local Adaptive Thresholding

GraphicsMagick 1.3.16 (Iterations Per Minute, More Is Better)
Loop Vectorization: 20 (SE +/- 0.00, N = 3)
Default: 44 (SE +/- 0.00, N = 3)
Per-run flags shown on the chart: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.16 (Iterations Per Minute, More Is Better)
Loop Vectorization: 122 (SE +/- 0.33, N = 3)
Default: 123 (SE +/- 0.00, N = 3)
Per-run flags shown on the chart: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.16 (Iterations Per Minute, More Is Better)
Loop Vectorization: 84 (SE +/- 0.00, N = 3)
Default: 89 (SE +/- 0.00, N = 3)
Per-run flags shown on the chart: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.16 (Iterations Per Minute, More Is Better)
Loop Vectorization: 31 (SE +/- 0.00, N = 3)
Default: 31 (SE +/- 0.00, N = 3)
Per-run flags shown on the chart: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Blur

GraphicsMagick 1.3.16 (Iterations Per Minute, More Is Better)
Loop Vectorization: 73 (SE +/- 0.00, N = 3)
Default: 82 (SE +/- 0.00, N = 3)
Per-run flags shown on the chart: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

Timed HMMer Search

Pfam Database Search

Timed HMMer Search 2.3.2 (Seconds, Fewer Is Better)
Loop Vectorization: 15.37 (SE +/- 0.02, N = 3)
Default: 15.35 (SE +/- 0.05, N = 3)
Per-run flags shown on the chart: -mllvm -vectorize
1. (CC) gcc options: -O3 -march=native -pthread -lhmmer -lsquid -lm
Phoronix Test Suite v10.8.3