LLVM Clang 3.2 Loop Vectorizer

Intel Core i7-3960X testing of the automatic loop vectorizer in LLVM 3.2 with the Clang compiler. Benchmarking by Michael Larabel for a future article on phoronix.com.
HTML result view exported from: https://openbenchmarking.org/result/1210278-BY-1210264RA90&grs&rdt

LLVM Clang 3.2 Loop Vectorizer: system table

Loop Vectorization, Default:
  Processor: Intel Core i7-3960X @ 3.30GHz (12 Cores)
  Motherboard: Intel DX79SI
  Chipset: Intel Xeon E5/Core
  Memory: 8192MB
  Disk: 64GB OCZ VERTEX
  Graphics: AMD Radeon HD 4650 512MB
  Audio: Realtek ALC892
  Monitor: DELL S2409W
  Network: Intel 82579LM Gigabit Connection
  OS: Ubuntu 12.10
  Kernel: 3.5.0-17-generic (x86_64)
  Desktop: Unity 6.8.0
  Display Server: X Server 1.13.0
  Display Driver: radeon 6.99.99
  OpenGL: 2.1 Mesa 9.0 Gallium 0.4
  Compiler: Clang 3.2 (SVN 166775) + LLVM 3.2svn
  File-System: ext4
  Screen Resolution: 1920x1080

2012-10-27 15:48:
  Processor: AMD Athlon II X3 455 @ 3.30GHz (3 Cores)
  Motherboard: Gigabyte GA-MA790X-DS4
  Chipset: AMD nee ATI RD780 + SB600
  Memory: 4096MB
  Disk: 500GB Seagate ST3500418AS + 250GB 2500BEV External + 500GB My Passport 0730
  Graphics: ATI Radeon HD 3800 512MB (851/1143MHz)
  Audio: Realtek ALC889A
  Monitor: ASUS VH242H
  Network: Realtek RTL8111/8168B + Ralink RT3060 Wireless 802.11n 1T/1R
  OS: SUSE LINUX 12.2
  Kernel: 3.4.11-2.16-desktop (x86_64)
  Desktop: KDE 4.8.5
  Display Server: X Server 1.12.3
  Display Driver: fglrx 8.97.2
  OpenGL: 3.3.11653
  Compiler: GCC 4.7
  File-System: btrfs

Compiler Details
  - Loop Vectorization: Optimized build; Built Oct 26 2012 (10:29:50); Default target: x86_64-unknown-linux-gnu; Host CPU: corei7-avx
  - Default: Optimized build; Built Oct 26 2012 (10:29:50); Default target: x86_64-unknown-linux-gnu; Host CPU: corei7-avx
  - 2012-10-27 15:48: --build=x86_64-suse-linux --disable-libgcj --disable-libitm --disable-libmudflap --disable-libssp --disable-libstdcxx-pch --disable-plugin --enable-__cxa_atexit --enable-checking=release --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-libstdcxx-allocator=new --enable-linux-futex --enable-ssp --enable-version-specific-runtime-libs --mandir=/usr/share/man --with-arch-32=i586 --with-slibdir=/lib64 --with-tune=generic --without-system-libunwind

Processor Details
  - Loop Vectorization, Default: Scaling Governor: ondemand

System Details
  - Loop Vectorization, Default: Compiz was running on this system.

LLVM Clang 3.2 Loop Vectorizer: results overview

Test                                                Loop Vectorization   Default    2012-10-27 15:48
c-ray: Total Time                                   23.15                20.92      80.86
himeno: Poisson Pressure Solver                     1554.66              1595.35    619.36
graphics-magick: Local Adaptive Thresholding        20                   44         -
pgbench: TPC-B Transactions Per Second              336.65               324.77     20.85
smallpt: Global Illumination Renderer; 100 Samples  157                  153        265
hmmer: Pfam Database Search                         15.37                15.35      25.61
graphics-magick: Blur                               73                   82         -
graphics-magick: Resizing                           84                   89         -
graphics-magick: HWB Color Space                    122                  123        -
graphics-magick: Sharpen                            31                   31         -
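
To compare the Loop Vectorization and Default runs at a glance, the relative difference per test can be computed from the overview figures. The following is a minimal sketch in Python, assuming the values listed in the overview and the "More Is Better" / "Fewer Is Better" direction given in each chart caption. The names RESULTS and relative_change are illustrative only and are not part of the Phoronix Test Suite.

```python
# Values copied from the results overview; the direction flag follows each
# chart caption ("More Is Better" -> True, "Fewer Is Better" -> False).
# Tuple layout: (test, loop_vectorization, default, higher_is_better)
RESULTS = [
    ("c-ray: Total Time", 23.15, 20.92, False),
    ("himeno: Poisson Pressure Solver", 1554.66, 1595.35, True),
    ("graphics-magick: Local Adaptive Thresholding", 20, 44, True),
    ("pgbench: TPC-B Transactions Per Second", 336.65, 324.77, True),
    ("smallpt: Global Illumination Renderer; 100 Samples", 157, 153, False),
    ("hmmer: Pfam Database Search", 15.37, 15.35, False),
    ("graphics-magick: Blur", 73, 82, True),
    ("graphics-magick: Resizing", 84, 89, True),
    ("graphics-magick: HWB Color Space", 122, 123, True),
    ("graphics-magick: Sharpen", 31, 31, True),
]


def relative_change(vectorized, default, higher_is_better):
    """Percent difference of the vectorized run relative to the default run.

    A positive value means the vectorized build did better, a negative
    value means it did worse.
    """
    if higher_is_better:
        return (vectorized - default) / default * 100.0
    # For timings, a smaller number is the better result.
    return (default - vectorized) / default * 100.0


if __name__ == "__main__":
    for test, vectorized, default, higher_is_better in RESULTS:
        change = relative_change(vectorized, default, higher_is_better)
        print(f"{test:52s} {change:+7.2f}%")
```

The sign convention is a design choice: normalizing both directions so that positive always means "vectorized build did better" avoids having to re-read each caption when scanning the output. The script does not account for the reported standard errors, so small differences should not be read as significant.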

C-Ray 1.1: Total Time
Seconds, Fewer Is Better
  Loop Vectorization: 23.15 (SE +/- 0.04, N = 3)
  Default: 20.92 (SE +/- 0.03, N = 3)
  2012-10-27 15:48: 80.86 (SE +/- 0.08, N = 3)
Per-run compiler flags (in listed order): -march=native -mllvm -vectorize -march=native
(CC) gcc options: -lm -lpthread -O3

Himeno Benchmark 3.0: Poisson Pressure Solver
MFLOPS, More Is Better
  Loop Vectorization: 1554.66 (SE +/- 6.98, N = 3)
  Default: 1595.35 (SE +/- 3.05, N = 3)
  2012-10-27 15:48: 619.36 (SE +/- 0.87, N = 3)
Per-run compiler flags (in listed order): -march=native -mllvm -vectorize -march=native
(CC) gcc options: -O3

GraphicsMagick 1.3.16: Operation: Local Adaptive Thresholding
Iterations Per Minute, More Is Better
  Loop Vectorization: 20 (SE +/- 0.00, N = 3)
  Default: 44 (SE +/- 0.00, N = 3)
Per-run compiler flags (in listed order): -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

PostgreSQL pgbench 8.4.11: TPC-B Transactions Per Second
TPS, More Is Better
  Loop Vectorization: 336.65 (SE +/- 3.97, N = 3)
  Default: 324.77 (SE +/- 0.52, N = 3)
  2012-10-27 15:48: 20.85 (SE +/- 0.10, N = 3)
Per-run compiler flags (in listed order): -O3 -march=native -mllvm -vectorize -O3 -march=native
(CC) gcc options: -fno-strict-aliasing -fwrapv -lpgport -lpq -lcrypt -ldl -lm

Smallpt 1.0: Global Illumination Renderer; 100 Samples
Seconds, Fewer Is Better
  Loop Vectorization: 157 (SE +/- 0.33, N = 3)
  Default: 153 (SE +/- 0.33, N = 3)
  2012-10-27 15:48: 265 (SE +/- 0.33, N = 3)
Per-run compiler flags (in listed order): -O3 -march=native -mllvm -vectorize -O3 -march=native
(CXX) g++ options: -fopenmp

Timed HMMer Search 2.3.2: Pfam Database Search
Seconds, Fewer Is Better
  Loop Vectorization: 15.37 (SE +/- 0.02, N = 3)
  Default: 15.35 (SE +/- 0.05, N = 3)
  2012-10-27 15:48: 25.61 (SE +/- 0.46, N = 3)
Per-run compiler flags (in listed order): -O3 -march=native -mllvm -vectorize -O3 -march=native -O2
(CC) gcc options: -pthread -lhmmer -lsquid -lm

GraphicsMagick 1.3.16: Operation: Blur
Iterations Per Minute, More Is Better
  Loop Vectorization: 73 (SE +/- 0.00, N = 3)
  Default: 82 (SE +/- 0.00, N = 3)
Per-run compiler flags (in listed order): -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick 1.3.16: Operation: Resizing
Iterations Per Minute, More Is Better
  Loop Vectorization: 84 (SE +/- 0.00, N = 3)
  Default: 89 (SE +/- 0.00, N = 3)
Per-run compiler flags (in listed order): -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick 1.3.16: Operation: HWB Color Space
Iterations Per Minute, More Is Better
  Loop Vectorization: 122 (SE +/- 0.33, N = 3)
  Default: 123 (SE +/- 0.00, N = 3)
Per-run compiler flags (in listed order): -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

GraphicsMagick 1.3.16: Operation: Sharpen
Iterations Per Minute, More Is Better
  Loop Vectorization: 31 (SE +/- 0.00, N = 3)
  Default: 31 (SE +/- 0.00, N = 3)
Per-run compiler flags (in listed order): -mllvm -vectorize -lpng12
(CC) gcc options: -O3 -march=native -pthread -lXext -lX11 -lz -lm -lpthread

Phoronix Test Suite v10.8.4