LLVM Clang 3.2 Loop Vectorizer

Intel Core i7-3960X testing of the automatic loop vectorizer in LLVM 3.2 with the Clang compiler. Benchmarking by Michael Larabel for a future article on phoronix.com.
HTML result view exported from: https://openbenchmarking.org/result/1210278-BY-1210264RA90&sor&grw .

Default, Loop Vectorization:
  Processor: Intel Core i7-3960X @ 3.30GHz (12 Cores)
  Motherboard: Intel DX79SI
  Chipset: Intel Xeon E5/Core
  Memory: 8192MB
  Disk: 64GB OCZ VERTEX
  Graphics: AMD Radeon HD 4650 512MB
  Audio: Realtek ALC892
  Monitor: DELL S2409W
  Network: Intel 82579LM Gigabit Connection
  OS: Ubuntu 12.10
  Kernel: 3.5.0-17-generic (x86_64)
  Desktop: Unity 6.8.0
  Display Server: X Server 1.13.0
  Display Driver: radeon 6.99.99
  OpenGL: 2.1 Mesa 9.0 Gallium 0.4
  Compiler: Clang 3.2 (SVN 166775) + LLVM 3.2svn
  File-System: ext4
  Screen Resolution: 1920x1080

2012-10-27 15:48:
  Processor: AMD Athlon II X3 455 @ 3.30GHz (3 Cores)
  Motherboard: Gigabyte GA-MA790X-DS4
  Chipset: AMD nee ATI RD780 + SB600
  Memory: 4096MB
  Disk: 500GB Seagate ST3500418AS + 250GB 2500BEV External + 500GB My Passport 0730
  Graphics: ATI Radeon HD 3800 512MB (851/1143MHz)
  Audio: Realtek ALC889A
  Monitor: ASUS VH242H
  Network: Realtek RTL8111/8168B + Ralink RT3060 Wireless 802.11n 1T/1R
  OS: SUSE LINUX 12.2
  Kernel: 3.4.11-2.16-desktop (x86_64)
  Desktop: KDE 4.8.5
  Display Server: X Server 1.12.3
  Display Driver: fglrx 8.97.2
  OpenGL: 3.3.11653
  Compiler: GCC 4.7
  File-System: btrfs

Compiler Details
  - Default: Optimized build; Built Oct 26 2012 (10:29:50); Default target: x86_64-unknown-linux-gnu; Host CPU: corei7-avx
  - Loop Vectorization: Optimized build; Built Oct 26 2012 (10:29:50); Default target: x86_64-unknown-linux-gnu; Host CPU: corei7-avx
  - 2012-10-27 15:48: --build=x86_64-suse-linux --disable-libgcj --disable-libitm --disable-libmudflap --disable-libssp --disable-libstdcxx-pch --disable-plugin --enable-__cxa_atexit --enable-checking=release --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-libstdcxx-allocator=new --enable-linux-futex --enable-ssp --enable-version-specific-runtime-libs --mandir=/usr/share/man --with-arch-32=i586 --with-slibdir=/lib64 --with-tune=generic --without-system-libunwind
Processor Details
  - Default, Loop Vectorization: Scaling Governor: ondemand
System Details
  - Default, Loop Vectorization: Compiz was running on this system.

Test                                                Default   Loop Vectorization   2012-10-27 15:48
hmmer: Pfam Database Search                         15.35     15.37                25.61
himeno: Poisson Pressure Solver                     1595.35   1554.66              619.36
graphics-magick: Blur                               82        73
graphics-magick: Sharpen                            31        31
graphics-magick: Resizing                           89        84
graphics-magick: HWB Color Space                    123       122
graphics-magick: Local Adaptive Thresholding        44        20
c-ray: Total Time                                   20.92     23.15                80.86
smallpt: Global Illumination Renderer; 100 Samples  153       157                  265
pgbench: TPC-B Transactions Per Second              324.77    336.65               20.85
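
To make the gap between the Default and Loop Vectorization runs easier to read, the overview figures can be turned into percentage changes. The following is a minimal Python sketch and is not part of the original result file: the names `RESULTS` and `relative_change` are hypothetical, and the direction of each metric is assumed from the "More Is Better" / "Fewer Is Better" label on the corresponding test result.

```python
# Illustrative sketch only: RESULTS and relative_change are hypothetical names.
# Each entry maps a test to (Default, Loop Vectorization, higher_is_better),
# with the values copied from the results overview.
RESULTS = {
    "hmmer: Pfam Database Search": (15.35, 15.37, False),
    "himeno: Poisson Pressure Solver": (1595.35, 1554.66, True),
    "graphics-magick: Blur": (82, 73, True),
    "graphics-magick: Sharpen": (31, 31, True),
    "graphics-magick: Resizing": (89, 84, True),
    "graphics-magick: HWB Color Space": (123, 122, True),
    "graphics-magick: Local Adaptive Thresholding": (44, 20, True),
    "c-ray: Total Time": (20.92, 23.15, False),
    "smallpt: Global Illumination Renderer; 100 Samples": (153, 157, False),
    "pgbench: TPC-B Transactions Per Second": (324.77, 336.65, True),
}


def relative_change(default, vectorized, higher_is_better):
    """Percent change of the Loop Vectorization run relative to Default.

    A positive value means the Loop Vectorization run did better on that metric.
    """
    if higher_is_better:
        return (vectorized - default) / default * 100.0
    # For timings a lower value is better, so the sign is flipped.
    return (default - vectorized) / default * 100.0


if __name__ == "__main__":
    for test, (default, vectorized, higher) in RESULTS.items():
        change = relative_change(default, vectorized, higher)
        print(f"{test}: {change:+.1f}%")
```

Under these assumptions, only the pgbench entry comes out positive, Sharpen is unchanged, and Local Adaptive Thresholding shows the largest drop. The sketch ignores the reported standard errors, so small differences should not be read as significant.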

Timed HMMer Search 2.3.2 - Pfam Database Search
Seconds, Fewer Is Better
  Default: 15.35 (SE +/- 0.05, N = 3); -O3 -march=native
  Loop Vectorization: 15.37 (SE +/- 0.02, N = 3); -O3 -march=native -mllvm -vectorize
  2012-10-27 15:48: 25.61 (SE +/- 0.46, N = 3); -O2
1. (CC) gcc options: -pthread -lhmmer -lsquid -lm

Himeno Benchmark 3.0 - Poisson Pressure Solver
MFLOPS, More Is Better
  Default: 1595.35 (SE +/- 3.05, N = 3); -march=native
  Loop Vectorization: 1554.66 (SE +/- 6.98, N = 3); -march=native -mllvm -vectorize
  2012-10-27 15:48: 619.36 (SE +/- 0.87, N = 3)
1. (CC) gcc options: -O3

GraphicsMagick 1.3.16 - Operation: Blur
Iterations Per Minute, More Is Better
  Default: 82 (SE +/- 0.00, N = 3)
  Loop Vectorization: 73 (SE +/- 0.00, N = 3)
Per-result flags: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick 1.3.16 - Operation: Sharpen
Iterations Per Minute, More Is Better
  Loop Vectorization: 31 (SE +/- 0.00, N = 3)
  Default: 31 (SE +/- 0.00, N = 3)
Per-result flags: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick 1.3.16 - Operation: Resizing
Iterations Per Minute, More Is Better
  Default: 89 (SE +/- 0.00, N = 3)
  Loop Vectorization: 84 (SE +/- 0.00, N = 3)
Per-result flags: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick 1.3.16 - Operation: HWB Color Space
Iterations Per Minute, More Is Better
  Default: 123 (SE +/- 0.00, N = 3)
  Loop Vectorization: 122 (SE +/- 0.33, N = 3)
Per-result flags: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick 1.3.16 - Operation: Local Adaptive Thresholding
Iterations Per Minute, More Is Better
  Default: 44 (SE +/- 0.00, N = 3)
  Loop Vectorization: 20 (SE +/- 0.00, N = 3)
Per-result flags: -mllvm -vectorize -lpng12
1. (CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

C-Ray 1.1 - Total Time
Seconds, Fewer Is Better
  Default: 20.92 (SE +/- 0.03, N = 3); -march=native
  Loop Vectorization: 23.15 (SE +/- 0.04, N = 3); -march=native -mllvm -vectorize
  2012-10-27 15:48: 80.86 (SE +/- 0.08, N = 3)
1. (CC) gcc options: -lm -lpthread -O3

Smallpt 1.0 - Global Illumination Renderer; 100 Samples
Seconds, Fewer Is Better
  Default: 153 (SE +/- 0.33, N = 3); -O3 -march=native
  Loop Vectorization: 157 (SE +/- 0.33, N = 3); -O3 -march=native -mllvm -vectorize
  2012-10-27 15:48: 265 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp

PostgreSQL pgbench 8.4.11 - TPC-B Transactions Per Second
TPS, More Is Better
  Loop Vectorization: 336.65 (SE +/- 3.97, N = 3); -O3 -march=native -mllvm -vectorize
  Default: 324.77 (SE +/- 0.52, N = 3); -O3 -march=native
  2012-10-27 15:48: 20.85 (SE +/- 0.10, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -lpgport -lpq -lcrypt -ldl -lm

Phoronix Test Suite v10.8.4