LLVM Clang 3.2 Loop Vectorizer

Testing of the automatic loop vectorizer in LLVM 3.2 with the Clang compiler on an Intel Core i7-3960X. Benchmarking by Michael Larabel for a future article on phoronix.com.

HTML result view exported from: https://openbenchmarking.org/result/1210264-RA-LLVMLOOPV66&sro&grr.

System Details

Configurations: Default, Loop Vectorization (one set of system details is listed for both)

Processor: Intel Core i7-3960X @ 3.30GHz (12 Cores)
Motherboard: Intel DX79SI
Chipset: Intel Xeon E5/Core
Memory: 8192MB
Disk: 64GB OCZ VERTEX
Graphics: AMD Radeon HD 4650 512MB
Audio: Realtek ALC892
Monitor: DELL S2409W
Network: Intel 82579LM Gigabit Connection
OS: Ubuntu 12.10
Kernel: 3.5.0-17-generic (x86_64)
Desktop: Unity 6.8.0
Display Server: X Server 1.13.0
Display Driver: radeon 6.99.99
OpenGL: 2.1 Mesa 9.0 Gallium 0.4
Compiler: Clang 3.2 (SVN 166775) + LLVM 3.2svn
File-System: ext4
Screen Resolution: 1920x1080

Compiler Details
- Optimized build; Built Oct 26 2012 (10:29:50); Default target: x86_64-unknown-linux-gnu; Host CPU: corei7-avx

Processor Details
- Scaling Governor: ondemand

System Details
- Compiz was running on this system.

Results Overview

Test                                             Default    Loop Vectorization
pgbench: TPC-B Transactions Per Second           324.77     336.65
smallpt: Global Illumination Renderer; 100 Samples  153     157
c-ray: Total Time                                20.92      23.15
himeno: Poisson Pressure Solver                  1595.35    1554.66
graphics-magick: Local Adaptive Thresholding     44         20
graphics-magick: HWB Color Space                 123        122
graphics-magick: Resizing                        89         84
graphics-magick: Sharpen                         31         31
graphics-magick: Blur                            82         73
hmmer: Pfam Database Search                      15.35      15.37
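To read the overview as relative changes, the figures above can be converted to percentages. The following is a minimal sketch and not part of the exported result: the `results` dictionary and the `percent_change` and `verdict` helpers are hypothetical names introduced here, the numbers are copied from the overview, and the "higher is better" flag for each test is taken from the label on that test's graph.

```python
# Values copied from the results overview:
# name -> (default, loop_vectorization, higher_is_better)
results = {
    "pgbench TPC-B (TPS)": (324.77, 336.65, True),
    "smallpt (seconds)": (153, 157, False),
    "c-ray (seconds)": (20.92, 23.15, False),
    "himeno (MFLOPS)": (1595.35, 1554.66, True),
    "graphics-magick local adaptive thresholding (iter/min)": (44, 20, True),
    "graphics-magick HWB color space (iter/min)": (123, 122, True),
    "graphics-magick resizing (iter/min)": (89, 84, True),
    "graphics-magick sharpen (iter/min)": (31, 31, True),
    "graphics-magick blur (iter/min)": (82, 73, True),
    "hmmer (seconds)": (15.35, 15.37, False),
}


def percent_change(default, vectorized):
    """Change of the vectorized result relative to the default, in percent."""
    return (vectorized - default) / default * 100.0


def verdict(default, vectorized, higher_is_better):
    """Classify the vectorized result as better, worse, or unchanged."""
    if vectorized == default:
        return "unchanged"
    improved = (vectorized > default) == higher_is_better
    return "better" if improved else "worse"


for name, (default, vectorized, higher_is_better) in results.items():
    change = percent_change(default, vectorized)
    label = verdict(default, vectorized, higher_is_better)
    print(f"{name}: {change:+.2f}% ({label})")
```

The sign of the percentage is the raw change in the reported number, so a positive value is only an improvement for the "more is better" tests; the verdict column accounts for that.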

PostgreSQL pgbench

TPC-B Transactions Per Second

PostgreSQL pgbench 8.4.11, TPC-B Transactions Per Second (TPS, more is better)
Default: 324.77 (SE +/- 0.52, N = 3)
Loop Vectorization: 336.65 (SE +/- 3.97, N = 3)
Additional build options noted on the graph: -mllvm -vectorize
(CC) gcc options: -O3 -march=native -fno-strict-aliasing -fwrapv -lpgport -lpq -lcrypt -ldl -lm
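Each result is reported with a standard error (SE) over N = 3 runs, which gives a rough sense of whether a gap between the two configurations exceeds run-to-run noise. The sketch below is an illustration only and not part of the exported result: it assumes the reported SE is the standard error of the mean, the helper name `gap_in_combined_se` is hypothetical, and the pgbench figures (together with the Timed HMMer Search figures that appear later in this report) are copied from the graphs, assuming the SE values are listed in the same order as the results. With only three runs per configuration, the ratio is a rule of thumb rather than a formal significance test.

```python
import math


def gap_in_combined_se(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means, in units of their combined standard error."""
    diff = abs(mean_b - mean_a)
    combined = math.hypot(se_a, se_b)  # sqrt(se_a**2 + se_b**2)
    if combined == 0:
        # Both SEs are reported as 0.00: any non-zero gap is unbounded in these units.
        return 0.0 if diff == 0 else math.inf
    return diff / combined


# pgbench: Default 324.77 (SE 0.52) vs. Loop Vectorization 336.65 (SE 3.97)
pgbench_gap = gap_in_combined_se(324.77, 0.52, 336.65, 3.97)

# Timed HMMer Search: Default 15.35 (SE 0.05) vs. Loop Vectorization 15.37 (SE 0.02)
hmmer_gap = gap_in_combined_se(15.35, 0.05, 15.37, 0.02)

print(f"pgbench gap: {pgbench_gap:.2f} combined SEs")
print(f"hmmer gap:   {hmmer_gap:.2f} combined SEs")
```

A ratio well above 1 suggests the difference is larger than the measured noise, while a ratio below 1 means the two results are effectively indistinguishable at this sample size.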

Smallpt

Global Illumination Renderer; 100 Samples

Smallpt 1.0, Global Illumination Renderer; 100 Samples (Seconds, fewer is better)
Default: 153 (SE +/- 0.33, N = 3)
Loop Vectorization: 157 (SE +/- 0.33, N = 3)
Additional build options noted on the graph: -mllvm -vectorize
(CXX) g++ options: -fopenmp -O3 -march=native

C-Ray

Total Time

C-Ray 1.1, Total Time (Seconds, fewer is better)
Default: 20.92 (SE +/- 0.03, N = 3)
Loop Vectorization: 23.15 (SE +/- 0.04, N = 3)
Additional build options noted on the graph: -mllvm -vectorize
(CC) gcc options: -lm -lpthread -O3 -march=native

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, more is better)
Default: 1595.35 (SE +/- 3.05, N = 3)
Loop Vectorization: 1554.66 (SE +/- 6.98, N = 3)
Additional build options noted on the graph: -mllvm -vectorize
(CC) gcc options: -O3 -march=native

GraphicsMagick

Operation: Local Adaptive Thresholding

GraphicsMagick 1.3.16, Operation: Local Adaptive Thresholding (Iterations Per Minute, more is better)
Default: 44 (SE +/- 0.00, N = 3)
Loop Vectorization: 20 (SE +/- 0.00, N = 3)
Additional build options noted on the graph: -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.16, Operation: HWB Color Space (Iterations Per Minute, more is better)
Default: 123 (SE +/- 0.00, N = 3)
Loop Vectorization: 122 (SE +/- 0.33, N = 3)
Additional build options noted on the graph: -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.16, Operation: Resizing (Iterations Per Minute, more is better)
Default: 89 (SE +/- 0.00, N = 3)
Loop Vectorization: 84 (SE +/- 0.00, N = 3)
Additional build options noted on the graph: -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.16, Operation: Sharpen (Iterations Per Minute, more is better)
Default: 31 (SE +/- 0.00, N = 3)
Loop Vectorization: 31 (SE +/- 0.00, N = 3)
Additional build options noted on the graph: -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Blur

GraphicsMagick 1.3.16, Operation: Blur (Iterations Per Minute, more is better)
Default: 82 (SE +/- 0.00, N = 3)
Loop Vectorization: 73 (SE +/- 0.00, N = 3)
Additional build options noted on the graph: -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

Timed HMMer Search

Pfam Database Search

Timed HMMer Search 2.3.2, Pfam Database Search (Seconds, fewer is better)
Default: 15.35 (SE +/- 0.05, N = 3)
Loop Vectorization: 15.37 (SE +/- 0.02, N = 3)
Additional build options noted on the graph: -mllvm -vectorize
(CC) gcc options: -O3 -march=native -pthread -lhmmer -lsquid -lm


Phoronix Test Suite v10.8.4