GCC 4.9 Snapshot Compiler Flag Tests
GCC 4.9 compiler optimization tests on an Intel Core i7 Haswell CPU, applying different CFLAGS/CXXFLAGS to look at the impact of core-avx2 CPU optimizations and other x86_64 -march= options. Benchmarks by Michael Larabel.
HTML result view exported from: https://openbenchmarking.org/result/1308319-SO-GCC49SNAP71&sro&grr .
System details (identical for the Core-AVX2, Core-AVX-I, Corei7-AVX, Corei7, Core2 and Nocona runs):
Processor: Intel Core i7-4900MQ @ 2.80GHz (8 Cores)
Motherboard: System76 Gazelle Professional
Chipset: Intel Xeon E3-1200 v3/4th
Memory: 8192MB
Disk: 120GB INTEL SSDSC2CW12
Graphics: Intel 4th Gen Core IGP (1300MHz)
Audio: Intel Haswell HDMI
Network: Realtek RTL8111/8168/8411 + Intel Centrino Advanced-N 6235
OS: Ubuntu 13.10
Kernel: 3.11.0-4-generic (x86_64)
Desktop: Unity 7.1.0
Display Server: X Server 1.14.2.901 (1.14.3 RC 1)
Display Driver: intel 2.21.14
OpenGL: 3.1 Mesa 9.2.0
Compiler: GCC 4.9.0 20130731
File-System: ext4
Screen Resolution: 1920x1080

Compiler Details: --build=x86_64-linux-gnu --disable-browser-plugin --disable-nls --disable-werror --enable-checking=yes --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-objc-gc --enable-plugin --enable-shared --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-4.9-snap-amd64/jre --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-4.9-snap-amd64 --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-4.9-snap-amd64 --with-multilib-list=m32,m64,mx32 --with-tune=generic -v
Processor Details: Scaling Governor: acpi-cpufreq ondemand
Results overview:
Test | Core-AVX2 | Core-AVX-I | Corei7-AVX | Corei7 | Core2 | Nocona
apache: Static Web Page Serving | 31291.52 | 31007.19 | 30875.41 | 30666.53 | 31064.17 | 29830.02
n-queens: Elapsed Time | 42.54 | 44.51 | 44.50 | 44.54 | 44.51 | 45.76
ffmpeg: H.264 HD To NTSC DV | 13.17 | 13.31 | 13.37 | 13.50 | 13.33 | 13.30
encode-ogg: WAV To Ogg | 7.06 | 7.06 | 7.06 | 7.08 | 7.06 | 7.07
encode-mp3: WAV To MP3 | 14.30 | 14.08 | 14.07 | 12.57 | 12.88 | 12.64
encode-flac: WAV To FLAC | 5.31 | 5.59 | 5.59 | 5.09 | 5.21 | 5.29
smallpt: Global Illumination Renderer; 100 Samples | 29 | 29 | 29 | 30 | 30 | 30
primesieve: 1e12 Prime Number Generation | 93.19 | 93.17 | 93.15 | 93.14 | 93.15 | 93.31
c-ray: Total Time | 19.71 | 26.38 | 26.40 | 26.52 | 26.53 | 26.66
himeno: Poisson Pressure Solver | 1770.48 | 1868.47 | 1803.82 | 1746.95 | 1747.23 | 1695.27
graphics-magick: Local Adaptive Thresholding | 116 | 116 | 116 | 115 | 115 | 114
graphics-magick: HWB Color Space | 200 | 201 | 201 | 200 | 201 | 198
graphics-magick: Resizing | 180 | 176 | 176 | 175 | 176 | 172
graphics-magick: Sharpen | 126 | 116 | 116 | 116 | 116 | 115
graphics-magick: Blur | 149 | 149 | 149 | 149 | 149 | 148
x264: H.264 Video Encoding | 162.96 | 163.77 | 162.86 | 162.76 | 162.10 | 162.14
ttsiod-renderer: Phong Rendering With Soft-Shadow Mapping | 133.95 | 133.36 | 132.85 | 150.07 | 146.09 | 144.87
scimark2: Jacobi Successive Over-Relaxation | 1142.92 | 1142.92 | 1142.92 | 1140.79 | 1140.79 | 1140.79
scimark2: Dense LU Matrix Factorization | 2783.23 | 2794.16 | 2786.85 | 2842.55 | 2853.98 | 2589.06
scimark2: Sparse Matrix Multiply | 2332.25 | 2184.67 | 2322.34 | 2388.35 | 2379.68 | 2346.98
scimark2: Fast Fourier Transform | 335.89 | 329.12 | 315.45 | 316.62 | 319.57 | 318.08
scimark2: Monte Carlo | 595.70 | 595.28 | 594.05 | 596.52 | 595.70 | 698.37
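As a reading aid, the relative gaps between configurations can be worked out with a few lines of Python. This is a minimal sketch and not part of the original result export: the helper name `percent_better` is hypothetical, and the inputs are the C-Ray times and Apache request rates for Core-AVX2 and Nocona copied from the results above.

```python
def percent_better(candidate, baseline, lower_is_better):
    """Return how much better `candidate` is than `baseline`, in percent.

    A positive result means the candidate wins. For time-based results
    (fewer is better) this is the share of the baseline time saved; for
    throughput results (more is better) it is the extra throughput
    relative to the baseline.
    """
    if lower_is_better:
        return (baseline - candidate) / baseline * 100.0
    return (candidate - baseline) / baseline * 100.0


# C-Ray total time in seconds: Core-AVX2 vs. Nocona (fewer is better).
c_ray_gain = percent_better(19.71, 26.66, lower_is_better=True)

# Apache requests per second: Core-AVX2 vs. Nocona (more is better).
apache_gain = percent_better(31291.52, 29830.02, lower_is_better=False)

print(f"C-Ray:  Core-AVX2 needs {c_ray_gain:.1f}% less time than Nocona")
print(f"Apache: Core-AVX2 serves {apache_gain:.1f}% more requests than Nocona")
```

Keeping the "fewer is better" and "more is better" cases in one helper avoids mixing up the sign when comparing tests that report seconds with tests that report throughput.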
Apache Benchmark 2.4.3 - Static Web Page Serving
Requests Per Second, More Is Better
Core-AVX-I (-march=core-avx-i): 31007.19 (SE +/- 162.72, N = 3)
Core-AVX2 (-march=core-avx2): 31291.52 (SE +/- 88.76, N = 3)
Core2 (-march=core2): 31064.17 (SE +/- 43.24, N = 3)
Corei7 (-march=corei7): 30666.53 (SE +/- 225.32, N = 3)
Corei7-AVX (-march=corei7-avx): 30875.41 (SE +/- 191.20, N = 3)
Nocona (-march=nocona): 29830.02 (SE +/- 111.38, N = 3)
1. (CC) gcc options: -shared -fPIC -pthread -O3
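The SE figures give a rough way to judge whether a gap between two configurations stands out from run-to-run noise. The sketch below is an illustration only, not something the result export computes: the function name is hypothetical, it assumes the runs are independent, and it assumes the SE values are listed in the same order as the configurations. With only N = 3 runs per configuration this is a coarse screen rather than a formal significance test.

```python
import math


def difference_in_standard_errors(mean_a, se_a, mean_b, se_b):
    """Express the gap between two means in units of their combined SE.

    The combined standard error of a difference of two independent means
    is sqrt(se_a^2 + se_b^2).
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# Apache requests per second and SE values as reported for this test.
avx2_vs_nocona = difference_in_standard_errors(31291.52, 88.76, 29830.02, 111.38)
avx2_vs_avx_i = difference_in_standard_errors(31291.52, 88.76, 31007.19, 162.72)

print(f"Core-AVX2 vs Nocona:     {avx2_vs_nocona:.1f} combined SEs apart")
print(f"Core-AVX2 vs Core-AVX-I: {avx2_vs_avx_i:.1f} combined SEs apart")
```

A gap of many combined standard errors (as for Core-AVX2 against Nocona) is unlikely to be noise, while a gap of one or two (as for Core-AVX2 against Core-AVX-I) should be read with caution.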
N-Queens 1.0 - Elapsed Time
Seconds, Fewer Is Better
Core-AVX-I (-march=core-avx-i): 44.51 (SE +/- 0.00, N = 3)
Core-AVX2 (-march=core-avx2): 42.54 (SE +/- 0.01, N = 3)
Core2 (-march=core2): 44.51 (SE +/- 0.00, N = 3)
Corei7 (-march=corei7): 44.54 (SE +/- 0.02, N = 3)
Corei7-AVX (-march=corei7-avx): 44.50 (SE +/- 0.00, N = 3)
Nocona (-march=nocona): 45.76 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -static -fopenmp -O3
FFmpeg 2.0 - H.264 HD To NTSC DV
Seconds, Fewer Is Better
Core-AVX-I (-march=core-avx-i): 13.31 (SE +/- 0.04, N = 3)
Core-AVX2 (-march=core-avx2): 13.17 (SE +/- 0.05, N = 3)
Core2 (-march=core2): 13.33 (SE +/- 0.03, N = 3)
Corei7 (-march=corei7): 13.50 (SE +/- 0.06, N = 3)
Corei7-AVX (-march=corei7-avx): 13.37 (SE +/- 0.07, N = 3)
Nocona (-march=nocona): 13.30 (SE +/- 0.04, N = 3)
1. (CC) gcc options: -lavdevice -lavfilter -lavformat -lavcodec -lswresample -lswscale -lavutil -ldl -lasound -lSDL -lm -pthread -lbz2 -O3 -std=c99 -fomit-frame-pointer -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -MMD -MF -MT
Ogg Encoding 1.3.0 - WAV To Ogg
Seconds, Fewer Is Better
Core-AVX-I (-march=core-avx-i): 7.06 (SE +/- 0.01, N = 5)
Core-AVX2 (-march=core-avx2): 7.06 (SE +/- 0.00, N = 5)
Core2 (-march=core2): 7.06 (SE +/- 0.01, N = 5)
Corei7 (-march=corei7): 7.08 (SE +/- 0.01, N = 5)
Corei7-AVX (-march=corei7-avx): 7.06 (SE +/- 0.00, N = 5)
Nocona (-march=nocona): 7.07 (SE +/- 0.01, N = 5)
1. (CC) gcc options: -O2 -ffast-math -fsigned-char -O3 -lm -logg
LAME MP3 Encoding 3.99.3 - WAV To MP3
Seconds, Fewer Is Better
Core-AVX-I (-march=core-avx-i): 14.08 (SE +/- 0.01, N = 5)
Core-AVX2 (-march=core-avx2): 14.30 (SE +/- 0.02, N = 5)
Core2 (-march=core2): 12.88 (SE +/- 0.01, N = 5)
Corei7 (-march=corei7): 12.57 (SE +/- 0.00, N = 5)
Corei7-AVX (-march=corei7-avx): 14.07 (SE +/- 0.00, N = 5)
Nocona (-march=nocona): 12.64 (SE +/- 0.01, N = 5)
1. (CC) gcc options: -pipe -O3 -lm
FLAC Audio Encoding 1.3.0 - WAV To FLAC
Seconds, Fewer Is Better
Core-AVX-I (-march=core-avx-i): 5.59 (SE +/- 0.00, N = 5)
Core-AVX2 (-march=core-avx2): 5.31 (SE +/- 0.00, N = 5)
Core2 (-march=core2): 5.21 (SE +/- 0.00, N = 5)
Corei7 (-march=corei7): 5.09 (SE +/- 0.00, N = 5)
Corei7-AVX (-march=corei7-avx): 5.59 (SE +/- 0.00, N = 5)
Nocona (-march=nocona): 5.29 (SE +/- 0.00, N = 5)
1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm
Smallpt 1.0 - Global Illumination Renderer; 100 Samples
Seconds, Fewer Is Better
Core-AVX-I (-march=core-avx-i): 29 (SE +/- 0.33, N = 3)
Core-AVX2 (-march=core-avx2): 29 (SE +/- 0.33, N = 3)
Core2 (-march=core2): 30 (SE +/- 0.00, N = 3)
Corei7 (-march=corei7): 30 (SE +/- 0.00, N = 3)
Corei7-AVX (-march=corei7-avx): 29 (SE +/- 0.33, N = 3)
Nocona (-march=nocona): 30 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3
Primesieve 4.2 - 1e12 Prime Number Generation
Seconds, Fewer Is Better
Core-AVX-I: 93.17 (SE +/- 0.43, N = 3)
Core-AVX2: 93.19 (SE +/- 0.44, N = 3)
Core2: 93.15 (SE +/- 0.41, N = 3)
Corei7: 93.14 (SE +/- 0.37, N = 3)
Corei7-AVX: 93.15 (SE +/- 0.35, N = 3)
Nocona: 93.31 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -O2 -fopenmp
C-Ray 1.1 - Total Time
Seconds, Fewer Is Better
Core-AVX-I (-march=core-avx-i): 26.38 (SE +/- 0.01, N = 3)
Core-AVX2 (-march=core-avx2): 19.71 (SE +/- 0.01, N = 3)
Core2 (-march=core2): 26.53 (SE +/- 0.02, N = 3)
Corei7 (-march=corei7): 26.52 (SE +/- 0.01, N = 3)
Corei7-AVX (-march=corei7-avx): 26.40 (SE +/- 0.02, N = 3)
Nocona (-march=nocona): 26.66 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -lm -lpthread -O3
Himeno Benchmark 3.0 - Poisson Pressure Solver
MFLOPS, More Is Better
Core-AVX-I (-march=core-avx-i): 1868.47 (SE +/- 0.59, N = 3)
Core-AVX2 (-march=core-avx2): 1770.48 (SE +/- 29.31, N = 4)
Core2 (-march=core2): 1747.23 (SE +/- 1.11, N = 3)
Corei7 (-march=corei7): 1746.95 (SE +/- 1.79, N = 3)
Corei7-AVX (-march=corei7-avx): 1803.82 (SE +/- 24.88, N = 6)
Nocona (-march=nocona): 1695.27 (SE +/- 0.23, N = 3)
1. (CC) gcc options: -O3
GraphicsMagick 1.3.16 - Operation: Local Adaptive Thresholding
Iterations Per Minute, More Is Better
Core-AVX-I (-march=core-avx-i): 116 (SE +/- 0.33, N = 3)
Core-AVX2 (-march=core-avx2): 116 (SE +/- 0.33, N = 3)
Core2 (-march=core2): 115 (SE +/- 0.00, N = 3)
Corei7 (-march=corei7): 115 (SE +/- 0.00, N = 3)
Corei7-AVX (-march=corei7-avx): 116 (SE +/- 0.33, N = 3)
Nocona (-march=nocona): 114 (SE +/- 0.33, N = 3)
1. (CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -lXext -lSM -lICE -lX11 -lbz2 -lz -lm -lgomp -lpthread
GraphicsMagick 1.3.16 - Operation: HWB Color Space
Iterations Per Minute, More Is Better
Core-AVX-I (-march=core-avx-i): 201 (SE +/- 0.00, N = 3)
Core-AVX2 (-march=core-avx2): 200 (SE +/- 0.33, N = 3)
Core2 (-march=core2): 201 (SE +/- 0.33, N = 3)
Corei7 (-march=corei7): 200 (SE +/- 0.33, N = 3)
Corei7-AVX (-march=corei7-avx): 201 (SE +/- 0.33, N = 3)
Nocona (-march=nocona): 198 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -lXext -lSM -lICE -lX11 -lbz2 -lz -lm -lgomp -lpthread
GraphicsMagick 1.3.16 - Operation: Resizing
Iterations Per Minute, More Is Better
Core-AVX-I (-march=core-avx-i): 176 (SE +/- 0.00, N = 3)
Core-AVX2 (-march=core-avx2): 180 (SE +/- 0.67, N = 3)
Core2 (-march=core2): 176 (SE +/- 0.00, N = 3)
Corei7 (-march=corei7): 175 (SE +/- 0.00, N = 3)
Corei7-AVX (-march=corei7-avx): 176 (SE +/- 0.33, N = 3)
Nocona (-march=nocona): 172 (SE +/- 0.33, N = 3)
1. (CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -lXext -lSM -lICE -lX11 -lbz2 -lz -lm -lgomp -lpthread
GraphicsMagick 1.3.16 - Operation: Sharpen
Iterations Per Minute, More Is Better
Core-AVX-I (-march=core-avx-i): 116 (SE +/- 0.00, N = 3)
Core-AVX2 (-march=core-avx2): 126 (SE +/- 0.00, N = 3)
Core2 (-march=core2): 116 (SE +/- 0.00, N = 3)
Corei7 (-march=corei7): 116 (SE +/- 0.00, N = 3)
Corei7-AVX (-march=corei7-avx): 116 (SE +/- 0.00, N = 3)
Nocona (-march=nocona): 115 (SE +/- 0.33, N = 3)
1. (CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -lXext -lSM -lICE -lX11 -lbz2 -lz -lm -lgomp -lpthread
GraphicsMagick 1.3.16 - Operation: Blur
Iterations Per Minute, More Is Better
Core-AVX-I (-march=core-avx-i): 149 (SE +/- 0.33, N = 3)
Core-AVX2 (-march=core-avx2): 149 (SE +/- 0.00, N = 3)
Core2 (-march=core2): 149 (SE +/- 0.00, N = 3)
Corei7 (-march=corei7): 149 (SE +/- 0.00, N = 3)
Corei7-AVX (-march=corei7-avx): 149 (SE +/- 0.00, N = 3)
Nocona (-march=nocona): 148 (SE +/- 0.33, N = 3)
1. (CC) gcc options: -std=gnu99 -fopenmp -O3 -pthread -lXext -lSM -lICE -lX11 -lbz2 -lz -lm -lgomp -lpthread
x264 2013-06-08 - H.264 Video Encoding
Frames Per Second, More Is Better
Core-AVX-I (-march=core-avx-i): 163.77 (SE +/- 0.36, N = 5)
Core-AVX2 (-march=core-avx2): 162.96 (SE +/- 0.32, N = 5)
Core2 (-march=core2): 162.10 (SE +/- 0.45, N = 5)
Corei7 (-march=corei7): 162.76 (SE +/- 0.56, N = 5)
Corei7-AVX (-march=corei7-avx): 162.86 (SE +/- 0.58, N = 5)
Nocona (-march=nocona): 162.14 (SE +/- 0.30, N = 5)
1. (CC) gcc options: -ldl -m64 -lm -lpthread -O3 -ffast-math -std=gnu99 -fomit-frame-pointer -fno-tree-vectorize
TTSIOD 3D Renderer 2.2z - Phong Rendering With Soft-Shadow Mapping
FPS, More Is Better
Core-AVX-I (-march=core-avx-i): 133.36 (SE +/- 0.40, N = 3)
Core-AVX2 (-march=core-avx2): 133.95 (SE +/- 0.65, N = 3)
Core2 (-march=core2): 146.09 (SE +/- 0.25, N = 3)
Corei7 (-march=corei7): 150.07 (SE +/- 0.19, N = 3)
Corei7-AVX (-march=corei7-avx): 132.85 (SE +/- 0.31, N = 3)
Nocona (-march=nocona): 144.87 (SE +/- 0.90, N = 3)
1. (CXX) g++ options: -O3 -fomit-frame-pointer -ffast-math -mtune=native -flto -msse -mrecip -mfpmath=sse -msse2 -mssse3 -lSDL -lstdc++
SciMark 2.0 - Computational Test: Jacobi Successive Over-Relaxation
Mflops, More Is Better
Core-AVX-I (-march=core-avx-i): 1142.92 (SE +/- 1.07, N = 4)
Core-AVX2 (-march=core-avx2): 1142.92 (SE +/- 1.07, N = 4)
Core2 (-march=core2): 1140.79 (SE +/- 1.07, N = 4)
Corei7 (-march=corei7): 1140.79 (SE +/- 1.07, N = 4)
Corei7-AVX (-march=corei7-avx): 1142.92 (SE +/- 1.07, N = 4)
Nocona (-march=nocona): 1140.79 (SE +/- 1.07, N = 4)
1. (CXX) g++ options: -O3
SciMark 2.0 - Computational Test: Dense LU Matrix Factorization
Mflops, More Is Better
Core-AVX-I (-march=core-avx-i): 2794.16 (SE +/- 3.50, N = 4)
Core-AVX2 (-march=core-avx2): 2783.23 (SE +/- 3.48, N = 4)
Core2 (-march=core2): 2853.98 (SE +/- 4.81, N = 4)
Corei7 (-march=corei7): 2842.55 (SE +/- 1.89, N = 4)
Corei7-AVX (-march=corei7-avx): 2786.85 (SE +/- 1.82, N = 4)
Nocona (-march=nocona): 2589.06 (SE +/- 12.57, N = 2)
1. (CXX) g++ options: -O3
SciMark 2.0 - Computational Test: Sparse Matrix Multiply
Mflops, More Is Better
Core-AVX-I (-march=core-avx-i): 2184.67 (SE +/- 9.72, N = 4)
Core-AVX2 (-march=core-avx2): 2332.25 (SE +/- 3.18, N = 4)
Core2 (-march=core2): 2379.68 (SE +/- 3.31, N = 4)
Corei7 (-march=corei7): 2388.35 (SE +/- 2.84, N = 4)
Corei7-AVX (-march=corei7-avx): 2322.34 (SE +/- 3.15, N = 4)
Nocona (-march=nocona): 2346.98 (SE +/- 32.43, N = 4)
1. (CXX) g++ options: -O3
SciMark 2.0 - Computational Test: Fast Fourier Transform
Mflops, More Is Better
Core-AVX-I (-march=core-avx-i): 329.12 (SE +/- 1.63, N = 4)
Core-AVX2 (-march=core-avx2): 335.89 (SE +/- 1.12, N = 4)
Core2 (-march=core2): 319.57 (SE +/- 0.57, N = 4)
Corei7 (-march=corei7): 316.62 (SE +/- 1.20, N = 4)
Corei7-AVX (-march=corei7-avx): 315.45 (SE +/- 0.73, N = 4)
Nocona (-march=nocona): 318.08 (SE +/- 0.48, N = 4)
1. (CXX) g++ options: -O3
SciMark 2.0 - Computational Test: Monte Carlo
Mflops, More Is Better
Core-AVX-I (-march=core-avx-i): 595.28 (SE +/- 0.41, N = 4)
Core-AVX2 (-march=core-avx2): 595.70 (SE +/- 0.48, N = 4)
Core2 (-march=core2): 595.70 (SE +/- 0.48, N = 4)
Corei7 (-march=corei7): 596.52 (SE +/- 0.00, N = 4)
Corei7-AVX (-march=corei7-avx): 594.05 (SE +/- 0.47, N = 4)
Nocona (-march=nocona): 698.37 (SE +/- 0.66, N = 4)
1. (CXX) g++ options: -O3
Phoronix Test Suite v10.8.5