GCC Compiler Intel Sandy Bridge AVX fpmath
GCC testing for a future article on Phoronix.com. Testing was done with CFLAGS/CXXFLAGS set to -O3 and -march=native. These benchmarks of GCC 4.7 RC1 compare a stock build of the GNU Compiler Collection against one configured with --with-fpmath=avx, to see how GCC 4.7 is affected by AVX floating-point arithmetic.
HTML result view exported from: https://openbenchmarking.org/result/1203140-BY-MATHAVX1508&grw&sor .
Test system (shared by the Stock and AVX fpmath runs):
    Processor: Intel Core i7-3960X @ 3.20GHz (12 Cores)
    Motherboard: Intel DX79SI
    Chipset: Intel Sandy DMI2
    Memory: 16384MB
    Disk: 240GB OCZ VERTEX3
    Graphics: AMD Radeon HD 5700 1024MB
    Audio: Realtek ALC892
    Monitor: DELL P2210H
    Network: Intel 82579LM Gigabit Connection
    OS: Ubuntu 12.04
    Kernel: 3.2.0-18-generic (x86_64)
    Desktop: Unity 5.4.0
    Display Server: X Server 1.11.3
    Display Driver: radeon 6.14.99
    OpenGL: 2.1 Mesa 8.0.1 Gallium 0.4
    Compiler: GCC 4.7.0
    File-System: ext4
    Screen Resolution: 1920x1080
Compiler Details
    Stock: --enable-checking=release --enable-languages=c,c++,fortran --enable-lto
    AVX fpmath: --enable-checking=release --enable-languages=c,c++,fortran --enable-lto --with-fpmath=avx
System Details
    Compiz was running on this system.
Result overview:
Test                                                       Stock         AVX fpmath
clomp: Static OMP Speedup                                  6.38          6.34
minion: Bibd                                               161.22        163.42
minion: Graceful                                           90.02         89.23
minion: Quasigroup                                         188.09        189.65
minion: Solitaire                                          145.82        148.53
encode-flac: WAV To FLAC                                   6.29          6.38
encode-ogg: WAV To Ogg                                     9.43          9.44
hmmer: Pfam Database Search                                9.25          9.22
mafft: Multiple Sequence Alignment                         4.89          5.06
himeno: Poisson Pressure Solver                            1377.66       1378.74
npb: BT.A                                                  17009.49      18631.50
npb: FT.B                                                  10069.85      10613.43
npb: LU.A                                                  16396.35      16999.52
npb: MG.B                                                  10409.75      10408.69
compress-lzma: 256MB File Compression                      155.14        155.14
npb: SP.A                                                  9410.41       9625.70
npb: UA.A                                                  73.66         74.75
build-php: Time To Compile                                 24.08         24.17
nero2d: Total Time                                         534.09        532.10
build-linux-kernel: Time To Compile                        70.92         70.92
vpxenc: vpxenc                                             23.58         23.42
graphics-magick: HWB Color Space                           163           163
graphics-magick: Blur                                      123           123
graphics-magick: Local Adaptive Thresholding               85            85
graphics-magick: Resizing                                  151           151
graphics-magick: Sharpen                                   98            98
x264: H.264 Video Encoding                                 171.68        172.89
c-ray: Total Time                                          25.70         25.74
ttsiod-renderer: Phong Rendering With Soft-Shadow Mapping  158.67        158.95
ffmpeg: AVI To NTSC VCD                                    12.34         12.31
smallpt: Global Illumination Renderer; 100 Samples         20            20
apache: Static Web Page Serving                            16263.52      16427.85
openssl: RSA 4096-bit Performance                          75.45         75.45
byte: Dhrystone 2                                          21255660.73   21254777.57
fhourstones: Complex Connect-4 Solving                     10353.83      10235.50
gmpbench: Total Time                                       2833.40       2834
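To make the size of each difference easier to read, the overview values can be turned into a percentage gain of AVX fpmath over Stock. The following is only a minimal sketch: the numbers are copied from the overview for a handful of tests, while the `relative_gain` helper, the `RESULTS` dictionary and the "higher is better" flags (taken from the units shown in the per-test charts) are illustrative names that are not part of the exported result.

```python
# Illustrative sketch: percentage gain of the AVX fpmath build over Stock.
# Values are copied from the result overview; the names are hypothetical.

# test name -> (stock, avx_fpmath, higher_is_better)
RESULTS = {
    "npb: BT.A": (17009.49, 18631.50, True),
    "npb: FT.B": (10069.85, 10613.43, True),
    "npb: LU.A": (16396.35, 16999.52, True),
    "npb: MG.B": (10409.75, 10408.69, True),
    "mafft: Multiple Sequence Alignment": (4.89, 5.06, False),
    "minion: Solitaire": (145.82, 148.53, False),
}


def relative_gain(stock, avx, higher_is_better):
    """Return the percent improvement of AVX fpmath over Stock.

    A negative value means the AVX fpmath build was slower.
    """
    if higher_is_better:
        return (avx / stock - 1.0) * 100.0
    # For time-based results a smaller number is the better one.
    return (stock / avx - 1.0) * 100.0


if __name__ == "__main__":
    for name, (stock, avx, higher_is_better) in RESULTS.items():
        gain = relative_gain(stock, avx, higher_is_better)
        print(f"{name:40s} {gain:+6.2f}%")
```

Treating time-based tests as `stock / avx` keeps the sign convention the same for every row, so a positive number always favors the AVX fpmath build.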
CLOMP 3.3, Static OMP Speedup (Speedup, More Is Better)
    Stock: 6.38 (SE +/- 0.03, N = 5)
    AVX fpmath: 6.34 (SE +/- 0.05, N = 5)
    (CC) gcc options: --openmp -O3 -lm
Minion 0.12, Benchmark: Bibd (Seconds, Fewer Is Better)
    Stock: 161.22 (SE +/- 0.53, N = 3)
    AVX fpmath: 163.42 (SE +/- 1.07, N = 3)
    (CXX) g++ options: -O3 -fomit-frame-pointer -rdynamic -lboost_iostreams-mt
Minion 0.12, Benchmark: Graceful (Seconds, Fewer Is Better)
    AVX fpmath: 89.23 (SE +/- 0.45, N = 3)
    Stock: 90.02 (SE +/- 0.49, N = 3)
    (CXX) g++ options: -O3 -fomit-frame-pointer -rdynamic -lboost_iostreams-mt
Minion 0.12, Benchmark: Quasigroup (Seconds, Fewer Is Better)
    Stock: 188.09 (SE +/- 0.12, N = 3)
    AVX fpmath: 189.65 (SE +/- 0.64, N = 3)
    (CXX) g++ options: -O3 -fomit-frame-pointer -rdynamic -lboost_iostreams-mt
Minion 0.12, Benchmark: Solitaire (Seconds, Fewer Is Better)
    Stock: 145.82 (SE +/- 0.80, N = 3)
    AVX fpmath: 148.53 (SE +/- 0.86, N = 3)
    (CXX) g++ options: -O3 -fomit-frame-pointer -rdynamic -lboost_iostreams-mt
FLAC Audio Encoding 1.2.1, WAV To FLAC (Seconds, Fewer Is Better)
    Stock: 6.29 (SE +/- 0.00, N = 5)
    AVX fpmath: 6.38 (SE +/- 0.09, N = 6)
    (CXX) g++ options: -O3 -march=native -logg -lm
Ogg Encoding 1.3.0, WAV To Ogg (Seconds, Fewer Is Better)
    Stock: 9.43 (SE +/- 0.00, N = 5)
    AVX fpmath: 9.44 (SE +/- 0.01, N = 5)
    (CC) gcc options: -O2 -ffast-math -fsigned-char -O3 -march=native -lvorbis -lm -logg
Timed HMMer Search 2.3.2, Pfam Database Search (Seconds, Fewer Is Better)
    AVX fpmath: 9.22 (SE +/- 0.05, N = 3)
    Stock: 9.25 (SE +/- 0.03, N = 3)
    (CC) gcc options: -O3 -march=native -pthread -lhmmer -lsquid -lm
Timed MAFFT Alignment 6.864, Multiple Sequence Alignment (Seconds, Fewer Is Better)
    Stock: 4.89 (SE +/- 0.05, N = 3)
    AVX fpmath: 5.06 (SE +/- 0.06, N = 3)
    (CC) gcc options: -O3 -lm -lpthread
Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, More Is Better)
    AVX fpmath: 1378.74 (SE +/- 1.49, N = 3)
    Stock: 1377.66 (SE +/- 0.77, N = 3)
    (CC) gcc options: -O3 -march=native
NAS Parallel Benchmarks 3.3, Test / Class: BT.A (Total Mop/s, More Is Better)
    AVX fpmath: 18631.50 (SE +/- 23.03, N = 3)
    Stock: 17009.49 (SE +/- 15.33, N = 3)
    (F9X) gfortran options: -fopenmp
NAS Parallel Benchmarks 3.3, Test / Class: FT.B (Total Mop/s, More Is Better)
    AVX fpmath: 10613.43 (SE +/- 20.86, N = 3)
    Stock: 10069.85 (SE +/- 11.49, N = 3)
    (F9X) gfortran options: -fopenmp
NAS Parallel Benchmarks 3.3, Test / Class: LU.A (Total Mop/s, More Is Better)
    AVX fpmath: 16999.52 (SE +/- 42.68, N = 3)
    Stock: 16396.35 (SE +/- 20.47, N = 3)
    (F9X) gfortran options: -fopenmp
NAS Parallel Benchmarks 3.3, Test / Class: MG.B (Total Mop/s, More Is Better)
    Stock: 10409.75 (SE +/- 21.92, N = 3)
    AVX fpmath: 10408.69 (SE +/- 37.33, N = 3)
    (F9X) gfortran options: -fopenmp
LZMA Compression, 256MB File Compression (Seconds, Fewer Is Better)
    Stock: 155.14 (SE +/- 0.29, N = 3)
    AVX fpmath: 155.14 (SE +/- 0.38, N = 3)
    (CC) gcc options: -O3 -march=native
NAS Parallel Benchmarks 3.3, Test / Class: SP.A (Total Mop/s, More Is Better)
    AVX fpmath: 9625.70 (SE +/- 26.51, N = 3)
    Stock: 9410.41 (SE +/- 18.78, N = 3)
    (F9X) gfortran options: -fopenmp
NAS Parallel Benchmarks 3.3, Test / Class: UA.A (Total Mop/s, More Is Better)
    AVX fpmath: 74.75 (SE +/- 0.15, N = 3)
    Stock: 73.66 (SE +/- 0.22, N = 3)
    (F9X) gfortran options: -fopenmp
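The standard errors reported with each NAS Parallel Benchmarks result give a rough way to judge whether a gap between the two builds is larger than run-to-run noise. The sketch below divides the difference of the two means by their combined standard error. It is a heuristic only: with N = 3 runs per configuration it is not a formal significance test, the threshold of 2 is an assumed rule of thumb, and the `separation` helper and `NPB` dictionary are illustrative names, not part of the Phoronix Test Suite.

```python
# Illustrative sketch: how far apart two results are, in units of their
# combined standard error. Means and SE values are copied from the charts.
import math

# test -> ((stock_mean, stock_se), (avx_mean, avx_se))
NPB = {
    "BT.A": ((17009.49, 15.33), (18631.50, 23.03)),
    "FT.B": ((10069.85, 11.49), (10613.43, 20.86)),
    "LU.A": ((16396.35, 20.47), (16999.52, 42.68)),
    "MG.B": ((10409.75, 21.92), (10408.69, 37.33)),
    "SP.A": ((9410.41, 18.78), (9625.70, 26.51)),
    "UA.A": ((73.66, 0.22), (74.75, 0.15)),
}

THRESHOLD = 2.0  # assumed rule of thumb, not taken from the result file


def separation(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means divided by the combined standard error."""
    return abs(mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


if __name__ == "__main__":
    for test, ((s_mean, s_se), (a_mean, a_se)) in NPB.items():
        score = separation(s_mean, s_se, a_mean, a_se)
        verdict = "likely real" if score > THRESHOLD else "within noise"
        print(f"{test}: {score:8.2f}  ({verdict})")
```

Under this heuristic a result such as MG.B, where the two means differ by about 1 Mop/s against standard errors in the tens, would be read as no measurable change.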
Timed PHP Compilation 5.2.9, Time To Compile (Seconds, Fewer Is Better)
    Stock: 24.08 (SE +/- 0.04, N = 3)
    AVX fpmath: 24.17 (SE +/- 0.04, N = 3)
    (CC) gcc options: -O3 -march=native -pedantic -ldl -lz -lm
Open FMM Nero2D 2.0.2, Total Time (Seconds, Fewer Is Better)
    AVX fpmath: 532.10
    Stock: 534.09
    (CXX) g++ options: -O3 -march=native -lfftw3 -llapack -lblas -lm
Timed Linux Kernel Compilation 3.1, Time To Compile (Seconds, Fewer Is Better)
    Stock: 70.92 (SE +/- 0.74, N = 3)
    AVX fpmath: 70.92 (SE +/- 0.59, N = 3)
VP8 libvpx Encoding 0.9.7-p1, vpxenc (Frames Per Second, More Is Better)
    Stock: 23.58 (SE +/- 0.08, N = 3)
    AVX fpmath: 23.42 (SE +/- 0.01, N = 3)
    (CC) gcc options: -m64 -lvpx -lm -lpthread
GraphicsMagick 1.3.12, Operation: HWB Color Space (Iterations Per Minute, More Is Better)
    AVX fpmath: 163 (SE +/- 0.00, N = 3)
    Stock: 163 (SE +/- 0.00, N = 3)
    (CC) gcc options: -std=gnu99 -fopenmp -O3 -march=native -pthread -ltiff -lfreetype -lXext -lSM -lICE -lX11 -lbz2 -lz -lm -lrt -lpthread
GraphicsMagick 1.3.12, Operation: Blur (Iterations Per Minute, More Is Better)
    AVX fpmath: 123 (SE +/- 0.00, N = 3)
    Stock: 123 (SE +/- 0.00, N = 3)
    (CC) gcc options: -std=gnu99 -fopenmp -O3 -march=native -pthread -ltiff -lfreetype -lXext -lSM -lICE -lX11 -lbz2 -lz -lm -lrt -lpthread
GraphicsMagick 1.3.12, Operation: Local Adaptive Thresholding (Iterations Per Minute, More Is Better)
    AVX fpmath: 85 (SE +/- 0.00, N = 3)
    Stock: 85 (SE +/- 0.00, N = 3)
    (CC) gcc options: -std=gnu99 -fopenmp -O3 -march=native -pthread -ltiff -lfreetype -lXext -lSM -lICE -lX11 -lbz2 -lz -lm -lrt -lpthread
GraphicsMagick 1.3.12, Operation: Resizing (Iterations Per Minute, More Is Better)
    AVX fpmath: 151 (SE +/- 0.33, N = 3)
    Stock: 151 (SE +/- 0.00, N = 3)
    (CC) gcc options: -std=gnu99 -fopenmp -O3 -march=native -pthread -ltiff -lfreetype -lXext -lSM -lICE -lX11 -lbz2 -lz -lm -lrt -lpthread
GraphicsMagick 1.3.12, Operation: Sharpen (Iterations Per Minute, More Is Better)
    AVX fpmath: 98 (SE +/- 0.00, N = 3)
    Stock: 98 (SE +/- 0.00, N = 3)
    (CC) gcc options: -std=gnu99 -fopenmp -O3 -march=native -pthread -ltiff -lfreetype -lXext -lSM -lICE -lX11 -lbz2 -lz -lm -lrt -lpthread
x264 2011-12-06, H.264 Video Encoding (Frames Per Second, More Is Better)
    AVX fpmath: 172.89 (SE +/- 1.56, N = 3)
    Stock: 171.68 (SE +/- 1.29, N = 3)
C-Ray 1.1, Total Time (Seconds, Fewer Is Better)
    Stock: 25.70 (SE +/- 0.04, N = 3)
    AVX fpmath: 25.74 (SE +/- 0.08, N = 3)
    (CC) gcc options: -lm -lpthread -O3 -march=native
TTSIOD 3D Renderer 2.2w, Phong Rendering With Soft-Shadow Mapping (FPS, More Is Better)
    AVX fpmath: 158.95 (SE +/- 0.47, N = 3)
    Stock: 158.67 (SE +/- 0.44, N = 3)
    (CXX) g++ options: -O3 -march=native -fomit-frame-pointer -ffast-math -mtune=native -flto -msse -mrecip -mfpmath=sse -msse2 -mssse3 -lSDL -lstdc++
FFmpeg 0.10, AVI To NTSC VCD (Seconds, Fewer Is Better)
    AVX fpmath: 12.31 (SE +/- 0.01, N = 3)
    Stock: 12.34 (SE +/- 0.06, N = 3)
    (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -ldl -lasound -lSDL -lm -pthread -lbz2
Smallpt 1.0, Global Illumination Renderer; 100 Samples (Seconds, Fewer Is Better)
    Stock: 20 (SE +/- 0.33, N = 3)
    AVX fpmath: 20 (SE +/- 0.33, N = 3)
    (CXX) g++ options: -fopenmp -O3 -march=native
Apache Benchmark 2.2.21, Static Web Page Serving (Requests Per Second, More Is Better)
    AVX fpmath: 16427.85 (SE +/- 71.91, N = 3)
    Stock: 16263.52 (SE +/- 70.28, N = 3)
    (CC) gcc options: -pthread -O3 -march=native -lm -lexpat -lrt -lcrypt -lpthread -ldl
OpenSSL 1.0.0e, RSA 4096-bit Performance (Signs Per Second, More Is Better)
    AVX fpmath: 75.45 (SE +/- 0.25, N = 4)
    Stock: 75.45 (SE +/- 0.19, N = 4)
    (CC) gcc options: -m64 -O3 -lssl -lcrypto -ldl
BYTE Unix Benchmark 3.6, Computational Test: Dhrystone 2 (LPS, More Is Better)
    Stock: 21255660.73 (SE +/- 40039.93, N = 3)
    AVX fpmath: 21254777.57 (SE +/- 32934.69, N = 3)
    (CC) gcc options: -O3 -march=native
Fhourstones 3.1, Complex Connect-4 Solving (Kpos / sec, More Is Better)
    Stock: 10353.83 (SE +/- 13.21, N = 3)
    AVX fpmath: 10235.50 (SE +/- 32.53, N = 3)
    (CC) gcc options: -O3
GMPbench 0.2, Total Time (GMPbench Score, More Is Better)
    AVX fpmath: 2834.00
    Stock: 2833.40
    (CC) gcc options: -O3 -march=native
Phoronix Test Suite v10.8.4