LLVM Clang 3.2 Loop Vectorizer

Intel Core i7-3960X testing of the automatic loop vectorizer in LLVM 3.2 with the Clang compiler. Benchmarking by Michael Larabel for a future article on phoronix.com.

HTML result view exported from: https://openbenchmarking.org/result/1210278-BY-1210264RA90&rdt.

System configuration

Loop Vectorization, Default:
- Processor: Intel Core i7-3960X @ 3.30GHz (12 Cores)
- Motherboard: Intel DX79SI
- Chipset: Intel Xeon E5/Core
- Memory: 8192MB
- Disk: 64GB OCZ VERTEX
- Graphics: AMD Radeon HD 4650 512MB
- Audio: Realtek ALC892
- Monitor: DELL S2409W
- Network: Intel 82579LM Gigabit Connection
- OS: Ubuntu 12.10
- Kernel: 3.5.0-17-generic (x86_64)
- Desktop: Unity 6.8.0
- Display Server: X Server 1.13.0
- Display Driver: radeon 6.99.99
- OpenGL: 2.1 Mesa 9.0 Gallium 0.4
- Compiler: Clang 3.2 (SVN 166775) + LLVM 3.2svn
- File-System: ext4
- Screen Resolution: 1920x1080

2012-10-27 15:48:
- Processor: AMD Athlon II X3 455 @ 3.30GHz (3 Cores)
- Motherboard: Gigabyte GA-MA790X-DS4
- Chipset: AMD nee ATI RD780 + SB600
- Memory: 4096MB
- Disk: 500GB Seagate ST3500418AS + 250GB 2500BEV External + 500GB My Passport 0730
- Graphics: ATI Radeon HD 3800 512MB (851/1143MHz)
- Audio: Realtek ALC889A
- Monitor: ASUS VH242H
- Network: Realtek RTL8111/8168B + Ralink RT3060 Wireless 802.11n 1T/1R
- OS: SUSE LINUX 12.2
- Kernel: 3.4.11-2.16-desktop (x86_64)
- Desktop: KDE 4.8.5
- Display Server: X Server 1.12.3
- Display Driver: fglrx 8.97.2
- OpenGL: 3.3.11653
- Compiler: GCC 4.7
- File-System: btrfs

Compiler Details
- Loop Vectorization: Optimized build; Built Oct 26 2012 (10:29:50); Default target: x86_64-unknown-linux-gnu; Host CPU: corei7-avx
- Default: Optimized build; Built Oct 26 2012 (10:29:50); Default target: x86_64-unknown-linux-gnu; Host CPU: corei7-avx
- 2012-10-27 15:48: --build=x86_64-suse-linux --disable-libgcj --disable-libitm --disable-libmudflap --disable-libssp --disable-libstdcxx-pch --disable-plugin --enable-__cxa_atexit --enable-checking=release --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-libstdcxx-allocator=new --enable-linux-futex --enable-ssp --enable-version-specific-runtime-libs --mandir=/usr/share/man --with-arch-32=i586 --with-slibdir=/lib64 --with-tune=generic --without-system-libunwind

Processor Details
- Loop Vectorization, Default: Scaling Governor: ondemand

System Details
- Loop Vectorization, Default: Compiz was running on this system.

Results overview

Test                                                  Loop Vectorization  Default   2012-10-27 15:48
hmmer: Pfam Database Search                           15.37               15.35     25.61
graphics-magick: Blur                                 73                  82
graphics-magick: Sharpen                              31                  31
graphics-magick: Resizing                             84                  89
graphics-magick: HWB Color Space                      122                 123
graphics-magick: Local Adaptive Thresholding          20                  44
himeno: Poisson Pressure Solver                       1554.66             1595.35   619.36
c-ray: Total Time                                     23.15               20.92     80.86
smallpt: Global Illumination Renderer; 100 Samples    157                 153       265
pgbench: TPC-B Transactions Per Second                336.65              324.77    20.85
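To read the two Clang columns side by side, it can help to express each Loop Vectorization result as a percentage change relative to Default. The sketch below is one possible way to do that and is not part of the exported result: the function name `relative_change`, the sign convention (positive means the vectorized build did better), and the `results` dictionary layout are all assumptions made for illustration. The numbers are copied from the results in this export, and the higher/lower-is-better flags follow the units stated for each test.

```python
# Illustrative only: values copied from the exported results
# (Loop Vectorization vs. Default on the Core i7-3960X).
# Tuple layout (an assumption of this sketch):
#   (loop_vectorization, default, higher_is_better)
results = {
    "hmmer: Pfam Database Search": (15.37, 15.35, False),         # seconds
    "graphics-magick: Blur": (73, 82, True),                      # iterations/min
    "graphics-magick: Sharpen": (31, 31, True),
    "graphics-magick: Resizing": (84, 89, True),
    "graphics-magick: HWB Color Space": (122, 123, True),
    "graphics-magick: Local Adaptive Thresholding": (20, 44, True),
    "himeno: Poisson Pressure Solver": (1554.66, 1595.35, True),  # MFLOPS
    "c-ray: Total Time": (23.15, 20.92, False),                   # seconds
    "smallpt: Global Illumination Renderer": (157, 153, False),   # seconds
    "pgbench: TPC-B Transactions Per Second": (336.65, 324.77, True),  # TPS
}


def relative_change(vectorized, default, higher_is_better):
    """Percent change of the vectorized run relative to the default run.

    Positive means the vectorized build did better, negative means worse.
    """
    if higher_is_better:
        return (vectorized - default) / default * 100.0
    # For time-based tests a smaller number is the better result.
    return (default - vectorized) / default * 100.0


if __name__ == "__main__":
    for name, (vec, base, higher) in results.items():
        print(f"{name:50s} {relative_change(vec, base, higher):+7.1f}%")
```

Differences of this kind should be read against the standard errors reported for each test; a change smaller than the run-to-run spread says little about the vectorizer.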

Timed HMMer Search

Pfam Database Search

Timed HMMer Search 2.3.2, Pfam Database Search (Seconds, Fewer Is Better)
- Loop Vectorization: 15.37 (SE +/- 0.02, N = 3); -O3 -march=native -mllvm -vectorize
- Default: 15.35 (SE +/- 0.05, N = 3); -O3 -march=native
- 2012-10-27 15:48: 25.61 (SE +/- 0.46, N = 3); -O2
1. (CC) gcc options: -pthread -lhmmer -lsquid -lm

GraphicsMagick

Operation: Blur

GraphicsMagick 1.3.16, Operation: Blur (Iterations Per Minute, More Is Better)
- Loop Vectorization: 73 (SE +/- 0.00, N = 3)
- Default: 82 (SE +/- 0.00, N = 3)
Per-run options: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.16, Operation: Sharpen (Iterations Per Minute, More Is Better)
- Loop Vectorization: 31 (SE +/- 0.00, N = 3)
- Default: 31 (SE +/- 0.00, N = 3)
Per-run options: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.16, Operation: Resizing (Iterations Per Minute, More Is Better)
- Loop Vectorization: 84 (SE +/- 0.00, N = 3)
- Default: 89 (SE +/- 0.00, N = 3)
Per-run options: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.16, Operation: HWB Color Space (Iterations Per Minute, More Is Better)
- Loop Vectorization: 122 (SE +/- 0.33, N = 3)
- Default: 123 (SE +/- 0.00, N = 3)
Per-run options: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Local Adaptive Thresholding

GraphicsMagick 1.3.16, Operation: Local Adaptive Thresholding (Iterations Per Minute, More Is Better)
- Loop Vectorization: 20 (SE +/- 0.00, N = 3)
- Default: 44 (SE +/- 0.00, N = 3)
Per-run options: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, More Is Better)
- Loop Vectorization: 1554.66 (SE +/- 6.98, N = 3); -march=native -mllvm -vectorize
- Default: 1595.35 (SE +/- 3.05, N = 3); -march=native
- 2012-10-27 15:48: 619.36 (SE +/- 0.87, N = 3)
1. (CC) gcc options: -O3

C-Ray

Total Time

C-Ray 1.1, Total Time (Seconds, Fewer Is Better)
- Loop Vectorization: 23.15 (SE +/- 0.04, N = 3); -march=native -mllvm -vectorize
- Default: 20.92 (SE +/- 0.03, N = 3); -march=native
- 2012-10-27 15:48: 80.86 (SE +/- 0.08, N = 3)
1. (CC) gcc options: -lm -lpthread -O3

Smallpt

Global Illumination Renderer; 100 Samples

Smallpt 1.0, Global Illumination Renderer; 100 Samples (Seconds, Fewer Is Better)
- Loop Vectorization: 157 (SE +/- 0.33, N = 3); -O3 -march=native -mllvm -vectorize
- Default: 153 (SE +/- 0.33, N = 3); -O3 -march=native
- 2012-10-27 15:48: 265 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp

PostgreSQL pgbench

TPC-B Transactions Per Second

PostgreSQL pgbench 8.4.11, TPC-B Transactions Per Second (TPS, More Is Better)
- Loop Vectorization: 336.65 (SE +/- 3.97, N = 3); -O3 -march=native -mllvm -vectorize
- Default: 324.77 (SE +/- 0.52, N = 3); -O3 -march=native
- 2012-10-27 15:48: 20.85 (SE +/- 0.10, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -lpgport -lpq -lcrypt -ldl -lm


Phoronix Test Suite v10.8.4