GCC 6.1 Compiler Optimization Benchmarks

GCC 6.1.0 compiler benchmarks with different optimization flags. Intel Xeon E5-2687W v3 GCC compiler benchmarks on Debian. Tests by Michael Larabel of Phoronix for a future article.

-O0, -Os, -Og, -O1, -O2, -O3, -O3 -march=native, -O3 -march=native -flto, -Ofast -march=native (identical system for all nine configurations):

  Processor: Intel Xeon E5-2687W v3 @ 3.50GHz (20 Cores), Motherboard: MSI X99S SLI PLUS (MS-7885) v1.0, Chipset: Intel Xeon E7 v3/Xeon, Memory: 16384MB, Disk: PNY CS1211 120GB + 80GB INTEL SSDSCKGW08, Graphics: AMD FirePro V7900 2048MB, Audio: Realtek ALC892, Monitor: ASUS PB278, Network: Intel Connection

  OS: Debian testing, Kernel: 4.5.0-1-amd64 (x86_64), Display Server: X Server 1.18.3, Display Driver: modesetting 1.18.3, OpenGL: 3.3 Mesa 11.1.3 Gallium 0.4, Compiler: GCC 6.1.0, File-System: ext4, Screen Resolution: 2560x1440

Timed HMMer Search 2.3.2
Pfam Database Search
Seconds < Lower Is Better
-O0 .................. 13.82 |=================================================
-Os .................. 10.68 |======================================
-Og .................. 8.23 |=============================
-O1 .................. 10.19 |====================================
-O2 .................. 11.63 |=========================================
-O3 .................. 13.08 |==============================================
-O3 -march=native .... 13.04 |==============================================
-Ofast -march=native . 8.32 |=============================

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
-O0 ..................... 1407.02 |===========================================
-Os ..................... 1426.79 |===========================================
-Og ..................... 1427.70 |===========================================
-O1 ..................... 1437.73 |============================================
-O2 ..................... 1426.80 |===========================================
-O3 ..................... 1442.30 |============================================
-O3 -march=native ....... 1388.10 |==========================================
-O3 -march=native -flto . 1445.10 |============================================
-Ofast -march=native .... 1421.80 |===========================================

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
-O0 ..................... 545.51 |========================================
-Os ..................... 552.04 |========================================
-Og ..................... 546.26 |========================================
-O1 ..................... 551.44 |========================================
-O2 ..................... 537.90 |=======================================
-O3 ..................... 547.85 |========================================
-O3 -march=native ....... 547.58 |========================================
-O3 -march=native -flto . 614.59 |=============================================
-Ofast -march=native .... 553.28 |=========================================

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
-O0 ..................... 440.57 |==========================================
-Os ..................... 456.35 |============================================
-Og ..................... 456.61 |============================================
-O1 ..................... 447.31 |===========================================
-O2 ..................... 461.55 |============================================
-O3 ..................... 458.56 |============================================
-O3 -march=native ....... 443.71 |===========================================
-O3 -march=native -flto . 465.50 |=============================================
-Ofast -march=native .... 468.61 |=============================================

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
-O0 ..................... 2565.94 |===========================================
-Os ..................... 2589.56 |===========================================
-Og ..................... 2580.32 |===========================================
-O1 ..................... 2609.45 |============================================
-O2 ..................... 2571.06 |===========================================
-O3 ..................... 2622.50 |============================================
-O3 -march=native ....... 2440.96 |=========================================
-O3 -march=native -flto . 2511.39 |==========================================
-Ofast -march=native .... 2517.37 |==========================================

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
-O0 ..................... 2454.12 |==========================================
-Os ..................... 2482.13 |==========================================
-Og ..................... 2521.75 |===========================================
-O1 ..................... 2534.08 |===========================================
-O2 ..................... 2534.23 |===========================================
-O3 ..................... 2531.29 |===========================================
-O3 -march=native ....... 2468.30 |==========================================
-O3 -march=native -flto . 2586.62 |============================================
-Ofast -march=native .... 2519.72 |===========================================

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
-O0 ..................... 1028.95 |===========================================
-Os ..................... 1053.88 |============================================
-Og ..................... 1033.56 |===========================================
-O1 ..................... 1046.36 |============================================
-O2 ..................... 1029.27 |===========================================
-O3 ..................... 1051.27 |============================================
-O3 -march=native ....... 1039.94 |===========================================
-O3 -march=native -flto . 1047.43 |============================================
-Ofast -march=native .... 1050.05 |============================================

GraphicsMagick 1.3.19
Operation: Blur
Iterations Per Minute > Higher Is Better
-O0 .................. 82 |=============================
-Os .................. 110 |=======================================
-Og .................. 113 |========================================
-O1 .................. 137 |=================================================
-O2 .................. 131 |==============================================
-O3 .................. 130 |==============================================
-O3 -march=native .... 138 |=================================================
-Ofast -march=native . 144 |===================================================

GraphicsMagick 1.3.19
Operation: Sharpen
Iterations Per Minute > Higher Is Better
-O0 .................. 71 |=========================
-Os .................. 124 |============================================
-Og .................. 100 |===================================
-O1 .................. 135 |===============================================
-O2 .................. 134 |===============================================
-O3 .................. 136 |================================================
-O3 -march=native .... 143 |==================================================
-Ofast -march=native . 145 |===================================================

GraphicsMagick 1.3.19
Operation: Resizing
Iterations Per Minute > Higher Is Better
-O0 .................. 97 |===========================
-Os .................. 168 |===============================================
-Og .................. 149 |==========================================
-O1 .................. 168 |===============================================
-O2 .................. 174 |=================================================
-O3 .................. 171 |================================================
-O3 -march=native .... 180 |==================================================
-Ofast -march=native . 182 |===================================================

GraphicsMagick 1.3.19
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
-O0 .................. 110 |============================
-Os .................. 188 |===============================================
-Og .................. 168 |==========================================
-O1 .................. 187 |===============================================
-O2 .................. 186 |===============================================
-O3 .................. 185 |==============================================
-O3 -march=native .... 190 |================================================
-Ofast -march=native . 204 |===================================================

GraphicsMagick 1.3.19
Operation: Local Adaptive Thresholding
Iterations Per Minute > Higher Is Better
-O0 .................. 17 |==========
-Os .................. 68 |=========================================
-Og .................. 54 |=================================
-O1 .................. 76 |==============================================
-O2 .................. 82 |==================================================
-O3 .................. 83 |==================================================
-O3 -march=native .... 85 |===================================================
-Ofast -march=native . 86 |====================================================

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
-O0 ..................... 424.57 |=========
-Os ..................... 1181.18 |========================
-Og ..................... 1102.64 |=======================
-O1 ..................... 1060.93 |======================
-O2 ..................... 1916.56 |=======================================
-O3 ..................... 1895.45 |=======================================
-O3 -march=native ....... 2113.04 |===========================================
-O3 -march=native -flto . 2150.96 |============================================
-Ofast -march=native .... 2019.61 |=========================================

Timed ImageMagick Compilation 6.9.0
Time To Compile
Seconds < Lower Is Better
-O0 ..................... 9.34 |===
-Os ..................... 32.43 |============
-Og ..................... 13.35 |=====
-O1 ..................... 27.24 |==========
-O2 ..................... 38.55 |==============
-O3 ..................... 55.45 |=====================
-O3 -march=native ....... 55.40 |=====================
-O3 -march=native -flto . 121.45 |=============================================
-Ofast -march=native .... 55.89 |=====================

Timed PHP Compilation 5.2.9
Time To Compile
Seconds < Lower Is Better
-O0 ..................... 5.58 |===
-Os ..................... 11.61 |======
-Og ..................... 8.18 |=====
-O1 ..................... 9.76 |=====
-O2 ..................... 16.08 |=========
-O3 ..................... 17.59 |==========
-O3 -march=native ....... 18.10 |==========
-O3 -march=native -flto . 82.86 |==============================================
-Ofast -march=native .... 17.99 |==========

FLAC Audio Encoding 1.3.1
WAV To FLAC
Seconds < Lower Is Better
-O0 .................. 46.74 |=================================================
-Os .................. 10.62 |===========
-Og .................. 8.11 |=========
-O1 .................. 7.68 |========
-O2 .................. 6.68 |=======
-O3 .................. 6.83 |=======
-O3 -march=native .... 7.01 |=======
-Ofast -march=native . 7.03 |=======

LAME MP3 Encoding 3.99.3
WAV To MP3
Seconds < Lower Is Better
-O0 .................. 36.02 |=================================================
-Os .................. 16.28 |======================
-Og .................. 17.15 |=======================
-O1 .................. 15.14 |=====================
-O2 .................. 14.26 |===================
-O3 .................. 12.52 |=================
-O3 -march=native .... 12.45 |=================
-Ofast -march=native . 11.34 |===============

PostgreSQL pgbench 9.4.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Write
TPS > Higher Is Better
-O0 ............... 4468.97 |==================================================
-Os ............... 4275.85 |================================================
-Og ............... 4364.38 |=================================================
-O1 ............... 4257.86 |===============================================
-O2 ............... 4322.67 |================================================
-O3 ............... 4495.93 |==================================================
-O3 -march=native . 4281.30 |================================================

PostgreSQL pgbench 9.4.3
Scaling: Buffer Test - Test: Single Thread - Mode: Read Write
TPS > Higher Is Better
-O0 ............... 303.93 |===========================================
-Os ............... 346.32 |=================================================
-Og ............... 353.15 |=================================================
-O1 ............... 351.47 |=================================================
-O2 ............... 363.87 |===================================================
-O3 ............... 351.89 |=================================================
-O3 -march=native . 349.97 |=================================================

PostgreSQL pgbench 9.4.3
Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Write
TPS > Higher Is Better
-O0 ............... 4840.76 |==================================================
-Os ............... 4497.15 |==============================================
-Og ............... 4538.18 |===============================================
-O1 ............... 4494.38 |==============================================
-O2 ............... 4494.08 |==============================================
-O3 ............... 4720.08 |=================================================
-O3 -march=native . 4539.62 |===============================================

Redis 3.0.1
Test: LPOP
Requests Per Second > Higher Is Better
-O0 .................. 547091.88 |=====================================
-Os .................. 642758.08 |============================================
-Og .................. 637156.51 |============================================
-O1 .................. 655900.13 |=============================================
-O2 .................. 649030.83 |============================================
-O3 .................. 646935.48 |============================================
-O3 -march=native .... 655097.69 |=============================================
-Ofast -march=native . 656696.79 |=============================================

Redis 3.0.1
Test: SADD
Requests Per Second > Higher Is Better
-O0 .................. 491891.31 |====================================
-Os .................. 607478.25 |============================================
-Og .................. 582272.77 |===========================================
-O1 .................. 605722.93 |============================================
-O2 .................. 598759.46 |============================================
-O3 .................. 605861.48 |============================================
-O3 -march=native .... 615258.45 |=============================================
-Ofast -march=native . 616016.48 |=============================================

Redis 3.0.1
Test: LPUSH
Requests Per Second > Higher Is Better
-O0 .................. 476295.97 |====================================
-Os .................. 598808.52 |=============================================
-Og .................. 602047.96 |=============================================
-O1 .................. 589230.57 |============================================
-O2 .................. 599526.56 |=============================================
-O3 .................. 593409.02 |============================================
-O3 -march=native .... 584299.37 |============================================
-Ofast -march=native . 598935.85 |=============================================

Redis 3.0.1
Test: GET
Requests Per Second > Higher Is Better
-O0 .................. 548655.63 |=====================================
-Os .................. 645755.96 |===========================================
-Og .................. 652904.39 |============================================
-O1 .................. 655681.23 |============================================
-O2 .................. 628643.96 |==========================================
-O3 .................. 669846.73 |=============================================
-O3 -march=native .... 631189.52 |==========================================
-Ofast -march=native . 631191.87 |==========================================

Redis 3.0.1
Test: SET
Requests Per Second > Higher Is Better
-O0 .................. 479934.92 |====================================
-Os .................. 596757.10 |=============================================
-Og .................. 592312.44 |=============================================
-O1 .................. 597265.48 |=============================================
-O2 .................. 586099.29 |============================================
-O3 .................. 587251.75 |============================================
-O3 -march=native .... 584905.04 |============================================
-Ofast -march=native . 588019.67 |============================================

Hierarchical INTegration 1.0
Test: FLOAT
QUIPs > Higher Is Better
-O0 ..................... 103731655.85 |============
-Os ..................... 303914359.33 |====================================
-Og ..................... 326497871.93 |=======================================
-O1 ..................... 242450705.97 |=============================
-O2 ..................... 317711776.83 |======================================
-O3 ..................... 312279718.27 |=====================================
-O3 -march=native ....... 310268777.87 |=====================================
-O3 -march=native -flto . 312975471.93 |=====================================
-Ofast -march=native .... 309403432.89 |=====================================
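
For readers who want to compare configurations numerically rather than by bar length, the following is a minimal sketch, not part of the original results. It assumes each result row has the form "label, dot leader, value, bar" and sits on its own line; the names ROW, parse_rows and relative_to are hypothetical helpers introduced only for illustration. The sample rows are copied from the Himeno results above.

```python
import re

# Assumed row layout: "<label> <dots> <value> |<bar>", one row per line.
ROW = re.compile(r"^(?P<label>\S.*?) \.+ (?P<value>[0-9.]+) \|=+$")

def parse_rows(text):
    """Return {label: value} for every line that looks like a result row."""
    rows = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            rows[match.group("label")] = float(match.group("value"))
    return rows

def relative_to(rows, baseline="-O0"):
    """Divide every value by the baseline configuration's value."""
    base = rows[baseline]
    return {label: value / base for label, value in rows.items()}

# Three rows taken from the Himeno Benchmark 3.0 results (MFLOPS, higher is better).
sample = """\
-O0 ..................... 424.57 |=========
-O2 ..................... 1916.56 |=======================================
-O3 -march=native -flto . 2150.96 |============================================
"""

rows = parse_rows(sample)
for label, ratio in relative_to(rows).items():
    print(f"{label}: {ratio:.2f}x of -O0")
```

For tests marked "Lower Is Better" (the timed runs), a ratio below 1 indicates an improvement over -O0, so the interpretation of the ratio flips.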