LLVM Clang 3.2 Loop Vectorizer

Intel Core i7-3960X testing of the automatic loop vectorizer in LLVM 3.2 with the Clang compiler. Benchmarking by Michael Larabel for a future article on phoronix.com.

HTML result view exported from: https://openbenchmarking.org/result/1210264-RA-LLVMLOOPV66&rdt&gru.

System configuration (shared by both runs: Loop Vectorization, Default)

Processor: Intel Core i7-3960X @ 3.30GHz (12 Cores)
Motherboard: Intel DX79SI
Chipset: Intel Xeon E5/Core
Memory: 8192MB
Disk: 64GB OCZ VERTEX
Graphics: AMD Radeon HD 4650 512MB
Audio: Realtek ALC892
Monitor: DELL S2409W
Network: Intel 82579LM Gigabit Connection
OS: Ubuntu 12.10
Kernel: 3.5.0-17-generic (x86_64)
Desktop: Unity 6.8.0
Display Server: X Server 1.13.0
Display Driver: radeon 6.99.99
OpenGL: 2.1 Mesa 9.0 Gallium 0.4
Compiler: Clang 3.2 (SVN 166775) + LLVM 3.2svn
File-System: ext4
Screen Resolution: 1920x1080

Compiler Details - Optimized build; Built Oct 26 2012 (10:29:50); Default target: x86_64-unknown-linux-gnu; Host CPU: corei7-avx
Processor Details - Scaling Governor: ondemand
System Details - Compiz was running on this system.

Results overview

Test                                                  Loop Vectorization   Default
graphics-magick: Blur                                 73                   82
graphics-magick: Sharpen                              31                   31
graphics-magick: Resizing                             84                   89
graphics-magick: HWB Color Space                      122                  123
graphics-magick: Local Adaptive Thresholding          20                   44
himeno: Poisson Pressure Solver                       1554.66              1595.35
pgbench: TPC-B Transactions Per Second                336.65               324.77
hmmer: Pfam Database Search                           15.37                15.35
c-ray: Total Time                                     23.15                20.92
smallpt: Global Illumination Renderer; 100 Samples    157                  153
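The overview is easier to compare once each pair of values is expressed as a relative change. The following is a minimal sketch, not part of the Phoronix Test Suite: the `results` table and the `relative_change` helper are names introduced here for illustration, and the sign convention (positive means the Loop Vectorization run did better than Default) is an assumption of this example. The "higher is better" flags follow the units stated with each result (iterations per minute, MFLOPS and TPS are higher-is-better; seconds are fewer-is-better).

```python
# Values copied from the results overview:
# (loop_vectorization, default, higher_is_better)
results = {
    "graphics-magick: Blur": (73, 82, True),
    "graphics-magick: Sharpen": (31, 31, True),
    "graphics-magick: Resizing": (84, 89, True),
    "graphics-magick: HWB Color Space": (122, 123, True),
    "graphics-magick: Local Adaptive Thresholding": (20, 44, True),
    "himeno: Poisson Pressure Solver": (1554.66, 1595.35, True),
    "pgbench: TPC-B Transactions Per Second": (336.65, 324.77, True),
    "hmmer: Pfam Database Search": (15.37, 15.35, False),
    "c-ray: Total Time": (23.15, 20.92, False),
    "smallpt: Global Illumination Renderer; 100 Samples": (157, 153, False),
}


def relative_change(vectorized, default, higher_is_better):
    """Percent change versus the Default run; positive means the vectorized run did better."""
    if higher_is_better:
        return (vectorized - default) / default * 100.0
    # For timings, a smaller value is the better one, so the sign is flipped.
    return (default - vectorized) / default * 100.0


for name, (vectorized, default, higher_is_better) in results.items():
    change = relative_change(vectorized, default, higher_is_better)
    print(f"{name:52s} {change:+7.2f}%")
```

A negative figure therefore always means the build with loop vectorization enabled was slower than the default build in that test, regardless of the unit.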

GraphicsMagick

Operation: Blur

GraphicsMagick 1.3.16, Iterations Per Minute (More Is Better)
  Loop Vectorization: 73 (SE +/- 0.00, N = 3)
  Default: 82 (SE +/- 0.00, N = 3)
Options listed with the results: -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.16, Iterations Per Minute (More Is Better)
  Loop Vectorization: 31 (SE +/- 0.00, N = 3)
  Default: 31 (SE +/- 0.00, N = 3)
Options listed with the results: -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.16, Iterations Per Minute (More Is Better)
  Loop Vectorization: 84 (SE +/- 0.00, N = 3)
  Default: 89 (SE +/- 0.00, N = 3)
Options listed with the results: -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.16, Iterations Per Minute (More Is Better)
  Loop Vectorization: 122 (SE +/- 0.33, N = 3)
  Default: 123 (SE +/- 0.00, N = 3)
Options listed with the results: -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Local Adaptive Thresholding

GraphicsMagick 1.3.16, Iterations Per Minute (More Is Better)
  Loop Vectorization: 20 (SE +/- 0.00, N = 3)
  Default: 44 (SE +/- 0.00, N = 3)
Options listed with the results: -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0, MFLOPS (More Is Better)
  Loop Vectorization: 1554.66 (SE +/- 6.98, N = 3)
  Default: 1595.35 (SE +/- 3.05, N = 3)
Options listed with the results: -mllvm -vectorize
(CC) gcc options: -O3 -march=native

PostgreSQL pgbench

TPC-B Transactions Per Second

PostgreSQL pgbench 8.4.11, TPS (More Is Better)
  Loop Vectorization: 336.65 (SE +/- 3.97, N = 3)
  Default: 324.77 (SE +/- 0.52, N = 3)
Options listed with the results: -mllvm -vectorize
(CC) gcc options: -O3 -march=native -fno-strict-aliasing -fwrapv -lpgport -lpq -lcrypt -ldl -lm

Timed HMMer Search

Pfam Database Search

Timed HMMer Search 2.3.2, Seconds (Fewer Is Better)
  Loop Vectorization: 15.37 (SE +/- 0.02, N = 3)
  Default: 15.35 (SE +/- 0.05, N = 3)
Options listed with the results: -mllvm -vectorize
(CC) gcc options: -O3 -march=native -pthread -lhmmer -lsquid -lm
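Each result is reported with a standard error (SE) over N = 3 runs, which gives a rough way to judge whether a gap between the two configurations is larger than run-to-run noise. The sketch below is an illustration only: `difference_in_standard_errors` is a hypothetical helper introduced here, and dividing the gap by the combined standard error is a simple heuristic assumed for this example, not a formal significance test and not something the Phoronix Test Suite reports. The input values are taken from the HMMer and C-Ray results in this document.

```python
import math


def difference_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Absolute gap between two means, in units of their combined standard error."""
    gap = abs(mean_a - mean_b)
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined_se == 0.0:
        # Both runs reported SE 0.00: any nonzero gap is unbounded on this scale.
        return 0.0 if gap == 0.0 else math.inf
    return gap / combined_se


# (Loop Vectorization mean, SE, Default mean, SE), both in seconds.
hmmer = difference_in_standard_errors(15.37, 0.02, 15.35, 0.05)
c_ray = difference_in_standard_errors(23.15, 0.04, 20.92, 0.03)

print(f"hmmer: gap is {hmmer:.2f} combined standard errors")
print(f"c-ray: gap is {c_ray:.2f} combined standard errors")
```

On this scale the HMMer gap is well under one combined standard error, so the two builds are effectively tied there, while the C-Ray gap is many times larger than the reported noise.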

C-Ray

Total Time

C-Ray 1.1, Seconds (Fewer Is Better)
  Loop Vectorization: 23.15 (SE +/- 0.04, N = 3)
  Default: 20.92 (SE +/- 0.03, N = 3)
Options listed with the results: -mllvm -vectorize
(CC) gcc options: -lm -lpthread -O3 -march=native

Smallpt

Global Illumination Renderer; 100 Samples

Smallpt 1.0, Seconds (Fewer Is Better)
  Loop Vectorization: 157 (SE +/- 0.33, N = 3)
  Default: 153 (SE +/- 0.33, N = 3)
Options listed with the results: -mllvm -vectorize
(CXX) g++ options: -fopenmp -O3 -march=native


Phoronix Test Suite v10.8.4