LLVM Clang 3.2 Loop Vectorizer

Intel Core i7-3960X testing of the automatic loop vectorizer in LLVM 3.2 with the Clang compiler. Benchmarking by Michael Larabel for a future article on phoronix.com.

HTML result view exported from: https://openbenchmarking.org/result/1210278-BY-1210264RA90.

Default, Loop Vectorization
Processor: Intel Core i7-3960X @ 3.30GHz (12 Cores)
Motherboard: Intel DX79SI
Chipset: Intel Xeon E5/Core
Memory: 8192MB
Disk: 64GB OCZ VERTEX
Graphics: AMD Radeon HD 4650 512MB
Audio: Realtek ALC892
Monitor: DELL S2409W
Network: Intel 82579LM Gigabit Connection
OS: Ubuntu 12.10
Kernel: 3.5.0-17-generic (x86_64)
Desktop: Unity 6.8.0
Display Server: X Server 1.13.0
Display Driver: radeon 6.99.99
OpenGL: 2.1 Mesa 9.0 Gallium 0.4
Compiler: Clang 3.2 (SVN 166775) + LLVM 3.2svn
File-System: ext4
Screen Resolution: 1920x1080

2012-10-27 15:48
Processor: AMD Athlon II X3 455 @ 3.30GHz (3 Cores)
Motherboard: Gigabyte GA-MA790X-DS4
Chipset: AMD nee ATI RD780 + SB600
Memory: 4096MB
Disk: 500GB Seagate ST3500418AS + 250GB 2500BEV External + 500GB My Passport 0730
Graphics: ATI Radeon HD 3800 512MB (851/1143MHz)
Audio: Realtek ALC889A
Monitor: ASUS VH242H
Network: Realtek RTL8111/8168B + Ralink RT3060 Wireless 802.11n 1T/1R
OS: SUSE LINUX 12.2
Kernel: 3.4.11-2.16-desktop (x86_64)
Desktop: KDE 4.8.5
Display Server: X Server 1.12.3
Display Driver: fglrx 8.97.2
OpenGL: 3.3.11653
Compiler: GCC 4.7
File-System: btrfs

Compiler Details
- Default: Optimized build; Built Oct 26 2012 (10:29:50); Default target: x86_64-unknown-linux-gnu; Host CPU: corei7-avx
- Loop Vectorization: Optimized build; Built Oct 26 2012 (10:29:50); Default target: x86_64-unknown-linux-gnu; Host CPU: corei7-avx
- 2012-10-27 15:48: --build=x86_64-suse-linux --disable-libgcj --disable-libitm --disable-libmudflap --disable-libssp --disable-libstdcxx-pch --disable-plugin --enable-__cxa_atexit --enable-checking=release --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-libstdcxx-allocator=new --enable-linux-futex --enable-ssp --enable-version-specific-runtime-libs --mandir=/usr/share/man --with-arch-32=i586 --with-slibdir=/lib64 --with-tune=generic --without-system-libunwind

Processor Details
- Default, Loop Vectorization: Scaling Governor: ondemand

System Details
- Default, Loop Vectorization: Compiz was running on this system.

Test                                                Default   Loop Vectorization   2012-10-27 15:48
hmmer: Pfam Database Search                         15.35     15.37                25.61
graphics-magick: Blur                               82        73                   -
graphics-magick: Sharpen                            31        31                   -
graphics-magick: Resizing                           89        84                   -
graphics-magick: HWB Color Space                    123       122                  -
graphics-magick: Local Adaptive Thresholding        44        20                   -
himeno: Poisson Pressure Solver                     1595.35   1554.66              619.36
c-ray: Total Time                                   20.92     23.15                80.86
smallpt: Global Illumination Renderer; 100 Samples  153       157                  265
pgbench: TPC-B Transactions Per Second              324.77    336.65               20.85
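To make the Default and Loop Vectorization columns easier to compare, the relative change per test can be worked out as below. This is a minimal sketch, not part of the exported result: the names `RESULTS`, `percent_change` and `summarize` are hypothetical, the numbers are copied from the summary above, and the "higher is better" flag for each test is taken from the unit stated in its per-test section.

```python
# Values copied from the summary above: (Default, Loop Vectorization, higher_is_better).
# The third field follows the "More Is Better" / "Fewer Is Better" label of each test.
RESULTS = {
    "hmmer: Pfam Database Search": (15.35, 15.37, False),
    "graphics-magick: Blur": (82, 73, True),
    "graphics-magick: Sharpen": (31, 31, True),
    "graphics-magick: Resizing": (89, 84, True),
    "graphics-magick: HWB Color Space": (123, 122, True),
    "graphics-magick: Local Adaptive Thresholding": (44, 20, True),
    "himeno: Poisson Pressure Solver": (1595.35, 1554.66, True),
    "c-ray: Total Time": (20.92, 23.15, False),
    "smallpt: Global Illumination Renderer; 100 Samples": (153, 157, False),
    "pgbench: TPC-B Transactions Per Second": (324.77, 336.65, True),
}


def percent_change(default, vectorized, higher_is_better):
    """Signed change in percent; positive means the vectorized build did better."""
    raw = (vectorized - default) / default * 100.0
    # For time-based tests a larger number is worse, so flip the sign.
    return raw if higher_is_better else -raw


def summarize(results=RESULTS):
    """Map each test name to its percent change, rounded to one decimal."""
    return {name: round(percent_change(*row), 1) for name, row in results.items()}


if __name__ == "__main__":
    for name, delta in summarize().items():
        print(f"{name}: {delta:+.1f}%")
```

Read this way, the largest swing is Local Adaptive Thresholding, where 20 against 44 iterations per minute is a drop of roughly 55%, while pgbench gains a little under 4%.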

Timed HMMer Search

Pfam Database Search

Timed HMMer Search 2.3.2, Pfam Database Search (Seconds, Fewer Is Better)
Default: 15.35 (SE +/- 0.05, N = 3), options: -O3 -march=native
Loop Vectorization: 15.37 (SE +/- 0.02, N = 3), options: -O3 -march=native -mllvm -vectorize
2012-10-27 15:48: 25.61 (SE +/- 0.46, N = 3), options: -O2
1. (CC) gcc options: -pthread -lhmmer -lsquid -lm
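One way to read the "SE +/-" figures is to compare the gap between two results with their combined standard error. The sketch below does that for the Default and Loop Vectorization HMMer times. It assumes the runs are independent so the two standard errors can be combined in quadrature, which is a common convention rather than anything stated by the result file, and `difference_in_se_units` is a hypothetical helper name.

```python
import math


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Gap between two means divided by their combined standard error.

    Assumption: the two sets of runs are independent, so the standard
    errors add in quadrature.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_b - mean_a) / combined_se


# Default (15.35 s, SE 0.05) vs. Loop Vectorization (15.37 s, SE 0.02),
# both taken from the Pfam Database Search result.
score = difference_in_se_units(15.35, 0.05, 15.37, 0.02)

if __name__ == "__main__":
    print(f"difference = {score:.2f} combined standard errors")
```

The gap here is well under one combined standard error, which suggests the two builds are indistinguishable on this test. With only three runs per result this is a rough guide, not a formal significance test.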

GraphicsMagick

Operation: Blur

GraphicsMagick 1.3.16, Operation: Blur (Iterations Per Minute, More Is Better)
Default: 82 (SE +/- 0.00, N = 3)
Loop Vectorization: 73 (SE +/- 0.00, N = 3), options: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.16, Operation: Sharpen (Iterations Per Minute, More Is Better)
Default: 31 (SE +/- 0.00, N = 3)
Loop Vectorization: 31 (SE +/- 0.00, N = 3), options: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.16, Operation: Resizing (Iterations Per Minute, More Is Better)
Default: 89 (SE +/- 0.00, N = 3)
Loop Vectorization: 84 (SE +/- 0.00, N = 3), options: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.16, Operation: HWB Color Space (Iterations Per Minute, More Is Better)
Default: 123 (SE +/- 0.00, N = 3)
Loop Vectorization: 122 (SE +/- 0.33, N = 3), options: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick

Operation: Local Adaptive Thresholding

GraphicsMagick 1.3.16, Operation: Local Adaptive Thresholding (Iterations Per Minute, More Is Better)
Default: 44 (SE +/- 0.00, N = 3)
Loop Vectorization: 20 (SE +/- 0.00, N = 3), options: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, More Is Better)
Default: 1595.35 (SE +/- 3.05, N = 3), options: -march=native
Loop Vectorization: 1554.66 (SE +/- 6.98, N = 3), options: -march=native -mllvm -vectorize
2012-10-27 15:48: 619.36 (SE +/- 0.87, N = 3)
1. (CC) gcc options: -O3

C-Ray

Total Time

C-Ray 1.1, Total Time (Seconds, Fewer Is Better)
Default: 20.92 (SE +/- 0.03, N = 3), options: -march=native
Loop Vectorization: 23.15 (SE +/- 0.04, N = 3), options: -march=native -mllvm -vectorize
2012-10-27 15:48: 80.86 (SE +/- 0.08, N = 3)
1. (CC) gcc options: -lm -lpthread -O3

Smallpt

Global Illumination Renderer; 100 Samples

Smallpt 1.0, Global Illumination Renderer; 100 Samples (Seconds, Fewer Is Better)
Default: 153 (SE +/- 0.33, N = 3), options: -O3 -march=native
Loop Vectorization: 157 (SE +/- 0.33, N = 3), options: -O3 -march=native -mllvm -vectorize
2012-10-27 15:48: 265 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp

PostgreSQL pgbench

TPC-B Transactions Per Second

PostgreSQL pgbench 8.4.11, TPC-B Transactions Per Second (TPS, More Is Better)
Default: 324.77 (SE +/- 0.52, N = 3), options: -O3 -march=native
Loop Vectorization: 336.65 (SE +/- 3.97, N = 3), options: -O3 -march=native -mllvm -vectorize
2012-10-27 15:48: 20.85 (SE +/- 0.10, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -lpgport -lpq -lcrypt -ldl -lm


Phoronix Test Suite v10.8.4