GCC 6.1 Compiler Optimization Benchmarks

GCC 6.1.0 compiler benchmarks with different optimization flags. Intel Xeon E5-2687W v3 GCC compiler benchmarks on Debian. Tests by Michael Larabel of Phoronix for a future article.

-O0, -Os, -Og, -O1, -O2, -O3, -O3 -march=native, -O3 -march=native -flto, -Ofast -march=native (same system for all nine configurations):

  Processor: Intel Xeon E5-2687W v3 @ 3.50GHz (20 Cores), Motherboard: MSI X99S SLI PLUS (MS-7885) v1.0, Chipset: Intel Xeon E7 v3/Xeon, Memory: 16384MB, Disk: PNY CS1211 120GB + 80GB INTEL SSDSCKGW08, Graphics: AMD FirePro V7900 2048MB, Audio: Realtek ALC892, Monitor: ASUS PB278, Network: Intel Connection

  OS: Debian testing, Kernel: 4.5.0-1-amd64 (x86_64), Display Server: X Server 1.18.3, Display Driver: modesetting 1.18.3, OpenGL: 3.3 Mesa 11.1.3 Gallium 0.4, Compiler: GCC 6.1.0, File-System: ext4, Screen Resolution: 2560x1440

s10:

  Processor: Intel Xeon E31245 @ 3.70GHz (8 Cores), Motherboard: ASUS P8B WS, Memory: 16384MB, Disk: 3001GB Hitachi HDS72303 + 128GB SAMSUNG MZNTE128, Graphics: Intel Sandybridge Server (1350MHz), Audio: Realtek Generic, Monitor: SyncMaster

  OS: Gentoo 2.2, Kernel: 4.5.0-gentoo (x86_64), Desktop: KDE Frameworks 5, Display Server: X Server 1.18.3, Display Driver: intel 2.99.917, OpenGL: 3.3 Mesa 11.2.2, Compiler: GCC 5.3.0 + Clang 3.8.0 + LLVM 3.8.0, File-System: ext4, Screen Resolution: 1920x1080

Timed HMMer Search 2.3.2
Pfam Database Search
Seconds < Lower Is Better
-O0 .................. 13.82 |============================================
-Os .................. 10.68 |==================================
-Og .................. 8.23 |==========================
-O1 .................. 10.19 |=================================
-O2 .................. 11.63 |=====================================
-O3 .................. 13.08 |==========================================
-O3 -march=native .... 13.04 |==========================================
-Ofast -march=native . 8.32 |===========================
s10 .................. 15.29 |=================================================

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
-O0 ..................... 1407.02 |===========================================
-Os ..................... 1426.79 |===========================================
-Og ..................... 1427.70 |===========================================
-O1 ..................... 1437.73 |============================================
-O2 ..................... 1426.80 |===========================================
-O3 ..................... 1442.30 |============================================
-O3 -march=native ....... 1388.10 |==========================================
-O3 -march=native -flto . 1445.10 |============================================
-Ofast -march=native .... 1421.80 |===========================================
s10 ..................... 1023.06 |===============================

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
-O0 ..................... 545.51 |========================================
-Os ..................... 552.04 |========================================
-Og ..................... 546.26 |========================================
-O1 ..................... 551.44 |========================================
-O2 ..................... 537.90 |=======================================
-O3 ..................... 547.85 |========================================
-O3 -march=native ....... 547.58 |========================================
-O3 -march=native -flto . 614.59 |=============================================
-Ofast -march=native .... 553.28 |=========================================
s10 ..................... 464.45 |==================================

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
-O0 ..................... 440.57 |==========================================
-Os ..................... 456.35 |============================================
-Og ..................... 456.61 |============================================
-O1 ..................... 447.31 |===========================================
-O2 ..................... 461.55 |============================================
-O3 ..................... 458.56 |============================================
-O3 -march=native ....... 443.71 |===========================================
-O3 -march=native -flto . 465.50 |=============================================
-Ofast -march=native .... 468.61 |=============================================
s10 ..................... 231.80 |======================

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
-O0 ..................... 2565.94 |===========================================
-Os ..................... 2589.56 |===========================================
-Og ..................... 2580.32 |===========================================
-O1 ..................... 2609.45 |============================================
-O2 ..................... 2571.06 |===========================================
-O3 ..................... 2622.50 |============================================
-O3 -march=native ....... 2440.96 |=========================================
-O3 -march=native -flto . 2511.39 |==========================================
-Ofast -march=native .... 2517.37 |==========================================
s10 ..................... 1544.66 |==========================

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
-O0 ..................... 2454.12 |==========================================
-Os ..................... 2482.13 |==========================================
-Og ..................... 2521.75 |===========================================
-O1 ..................... 2534.08 |===========================================
-O2 ..................... 2534.23 |===========================================
-O3 ..................... 2531.29 |===========================================
-O3 -march=native ....... 2468.30 |==========================================
-O3 -march=native -flto . 2586.62 |============================================
-Ofast -march=native .... 2519.72 |===========================================
s10 ..................... 1835.64 |===============================

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
-O0 ..................... 1028.95 |===========================================
-Os ..................... 1053.88 |============================================
-Og ..................... 1033.56 |===========================================
-O1 ..................... 1046.36 |============================================
-O2 ..................... 1029.27 |===========================================
-O3 ..................... 1051.27 |============================================
-O3 -march=native ....... 1039.94 |===========================================
-O3 -march=native -flto . 1047.43 |============================================
-Ofast -march=native .... 1050.05 |============================================
s10 ..................... 1038.75 |===========================================

GraphicsMagick 1.3.19
Operation: Blur
Iterations Per Minute > Higher Is Better
-O0 .................. 82 |=============================
-Os .................. 110 |=======================================
-Og .................. 113 |========================================
-O1 .................. 137 |=================================================
-O2 .................. 131 |==============================================
-O3 .................. 130 |==============================================
-O3 -march=native .... 138 |=================================================
-Ofast -march=native . 144 |===================================================
s10 .................. 125 |============================================

GraphicsMagick 1.3.19
Operation: Sharpen
Iterations Per Minute > Higher Is Better
-O0 .................. 71 |=========================
-Os .................. 124 |============================================
-Og .................. 100 |===================================
-O1 .................. 135 |===============================================
-O2 .................. 134 |===============================================
-O3 .................. 136 |================================================
-O3 -march=native .... 143 |==================================================
-Ofast -march=native . 145 |===================================================
s10 .................. 103 |====================================

GraphicsMagick 1.3.19
Operation: Resizing
Iterations Per Minute > Higher Is Better
-O0 .................. 97 |===========================
-Os .................. 168 |===============================================
-Og .................. 149 |==========================================
-O1 .................. 168 |===============================================
-O2 .................. 174 |=================================================
-O3 .................. 171 |================================================
-O3 -march=native .... 180 |==================================================
-Ofast -march=native . 182 |===================================================
s10 .................. 151 |==========================================

GraphicsMagick 1.3.19
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
-O0 .................. 110 |============================
-Os .................. 188 |===============================================
-Og .................. 168 |==========================================
-O1 .................. 187 |===============================================
-O2 .................. 186 |===============================================
-O3 .................. 185 |==============================================
-O3 -march=native .... 190 |================================================
-Ofast -march=native . 204 |===================================================
s10 .................. 159 |========================================

GraphicsMagick 1.3.19
Operation: Local Adaptive Thresholding
Iterations Per Minute > Higher Is Better
-O0 .................. 17 |==========
-Os .................. 68 |=========================================
-Og .................. 54 |=================================
-O1 .................. 76 |==============================================
-O2 .................. 82 |==================================================
-O3 .................. 83 |==================================================
-O3 -march=native .... 85 |===================================================
-Ofast -march=native . 86 |====================================================
s10 .................. 78 |===============================================

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
-O0 ..................... 424.57 |=========
-Os ..................... 1181.18 |========================
-Og ..................... 1102.64 |=======================
-O1 ..................... 1060.93 |======================
-O2 ..................... 1916.56 |=======================================
-O3 ..................... 1895.45 |=======================================
-O3 -march=native ....... 2113.04 |===========================================
-O3 -march=native -flto . 2150.96 |============================================
-Ofast -march=native .... 2019.61 |=========================================
s10 ..................... 1474.95 |==============================

Timed ImageMagick Compilation 6.9.0
Time To Compile
Seconds < Lower Is Better
-O0 ..................... 9.34 |===
-Os ..................... 32.43 |============
-Og ..................... 13.35 |=====
-O1 ..................... 27.24 |==========
-O2 ..................... 38.55 |==============
-O3 ..................... 55.45 |=====================
-O3 -march=native ....... 55.40 |=====================
-O3 -march=native -flto . 121.45 |=============================================
-Ofast -march=native .... 55.89 |=====================
s10 ..................... 74.78 |============================

Timed PHP Compilation 5.2.9
Time To Compile
Seconds < Lower Is Better
-O0 ..................... 5.58 |===
-Os ..................... 11.61 |======
-Og ..................... 8.18 |=====
-O1 ..................... 9.76 |=====
-O2 ..................... 16.08 |=========
-O3 ..................... 17.59 |==========
-O3 -march=native ....... 18.10 |==========
-O3 -march=native -flto . 82.86 |==============================================
-Ofast -march=native .... 17.99 |==========
s10 ..................... 34.49 |===================

FLAC Audio Encoding 1.3.1
WAV To FLAC
Seconds < Lower Is Better
-O0 .................. 46.74 |=================================================
-Os .................. 10.62 |===========
-Og .................. 8.11 |=========
-O1 .................. 7.68 |========
-O2 .................. 6.68 |=======
-O3 .................. 6.83 |=======
-O3 -march=native .... 7.01 |=======
-Ofast -march=native . 7.03 |=======
s10 .................. 8.97 |=========

LAME MP3 Encoding 3.99.3
WAV To MP3
Seconds < Lower Is Better
-O0 .................. 36.02 |=================================================
-Os .................. 16.28 |======================
-Og .................. 17.15 |=======================
-O1 .................. 15.14 |=====================
-O2 .................. 14.26 |===================
-O3 .................. 12.52 |=================
-O3 -march=native .... 12.45 |=================
-Ofast -march=native . 11.34 |===============
s10 .................. 14.45 |====================

PostgreSQL pgbench 9.4.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Write
TPS > Higher Is Better
-O0 ............... 4468.97 |==================================================
-Os ............... 4275.85 |================================================
-Og ............... 4364.38 |=================================================
-O1 ............... 4257.86 |===============================================
-O2 ............... 4322.67 |================================================
-O3 ............... 4495.93 |==================================================
-O3 -march=native . 4281.30 |================================================
s10 ............... 683.11 |========

PostgreSQL pgbench 9.4.3
Scaling: Buffer Test - Test: Single Thread - Mode: Read Write
TPS > Higher Is Better
-O0 ............... 303.93 |===========================================
-Os ............... 346.32 |=================================================
-Og ............... 353.15 |=================================================
-O1 ............... 351.47 |=================================================
-O2 ............... 363.87 |===================================================
-O3 ............... 351.89 |=================================================
-O3 -march=native . 349.97 |=================================================
s10 ............... 91.81 |=============

PostgreSQL pgbench 9.4.3
Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Write
TPS > Higher Is Better
-O0 ............... 4840.76 |==================================================
-Os ............... 4497.15 |==============================================
-Og ............... 4538.18 |===============================================
-O1 ............... 4494.38 |==============================================
-O2 ............... 4494.08 |==============================================
-O3 ............... 4720.08 |=================================================
-O3 -march=native . 4539.62 |===============================================
s10 ............... 791.05 |========

Redis 3.0.1
Test: LPOP
Requests Per Second > Higher Is Better
-O0 .................. 547091.88 |===============
-Os .................. 642758.08 |==================
-Og .................. 637156.51 |==================
-O1 .................. 655900.13 |===================
-O2 .................. 649030.83 |==================
-O3 .................. 646935.48 |==================
-O3 -march=native .... 655097.69 |==================
-Ofast -march=native . 656696.79 |===================
s10 .................. 1559720.21 |============================================

Redis 3.0.1
Test: SADD
Requests Per Second > Higher Is Better
-O0 .................. 491891.31 |==================
-Os .................. 607478.25 |======================
-Og .................. 582272.77 |=====================
-O1 .................. 605722.93 |======================
-O2 .................. 598759.46 |======================
-O3 .................. 605861.48 |======================
-O3 -march=native .... 615258.45 |======================
-Ofast -march=native . 616016.48 |======================
s10 .................. 1216564.21 |============================================

Redis 3.0.1
Test: LPUSH
Requests Per Second > Higher Is Better
-O0 .................. 476295.97 |==================
-Os .................. 598808.52 |=======================
-Og .................. 602047.96 |=======================
-O1 .................. 589230.57 |======================
-O2 .................. 599526.56 |=======================
-O3 .................. 593409.02 |======================
-O3 -march=native .... 584299.37 |======================
-Ofast -march=native . 598935.85 |=======================
s10 .................. 1169238.33 |============================================

Redis 3.0.1
Test: GET
Requests Per Second > Higher Is Better
-O0 .................. 548655.63 |===============
-Os .................. 645755.96 |==================
-Og .................. 652904.39 |==================
-O1 .................. 655681.23 |==================
-O2 .................. 628643.96 |==================
-O3 .................. 669846.73 |===================
-O3 -march=native .... 631189.52 |==================
-Ofast -march=native . 631191.87 |==================
s10 .................. 1570800.87 |============================================

Redis 3.0.1
Test: SET
Requests Per Second > Higher Is Better
-O0 .................. 479934.92 |==================
-Os .................. 596757.10 |=======================
-Og .................. 592312.44 |=======================
-O1 .................. 597265.48 |=======================
-O2 .................. 586099.29 |======================
-O3 .................. 587251.75 |======================
-O3 -march=native .... 584905.04 |======================
-Ofast -march=native . 588019.67 |=======================
s10 .................. 1149452.71 |============================================

Hierarchical INTegration 1.0
Test: FLOAT
QUIPs > Higher Is Better
-O0 ..................... 103731655.85 |============
-Os ..................... 303914359.33 |====================================
-Og ..................... 326497871.93 |=======================================
-O1 ..................... 242450705.97 |=============================
-O2 ..................... 317711776.83 |======================================
-O3 ..................... 312279718.27 |=====================================
-O3 -march=native ....... 310268777.87 |=====================================
-O3 -march=native -flto . 312975471.93 |=====================================
-Ofast -march=native .... 309403432.89 |=====================================
s10 ..................... 287050317.81 |==================================
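
To compare configurations across tests with different units, it can help to express each result as a factor relative to the -O0 run. The following is a minimal Python sketch, not part of the original result file. It assumes each result row sits on its own line in the "label, dot leader, value, bar" layout used by the tables above. The names `parse_rows` and `relative_to` are hypothetical helpers introduced only for illustration, and the sample rows are copied from the FLAC Audio Encoding table.

```python
import re

# Assumed row layout: "<label> <dots> <value> |<bar>", one row per line.
ROW = re.compile(r"^(?P<label>.+?) \.+ (?P<value>[0-9.]+) \|=*$")


def parse_rows(text):
    """Return {label: value} for every result row found in text."""
    results = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            results[match.group("label")] = float(match.group("value"))
    return results


def relative_to(results, baseline, lower_is_better):
    """Express each result as a speedup factor over the baseline row."""
    base = results[baseline]
    if lower_is_better:
        # Time-like metric: a smaller value means a larger speedup.
        return {label: base / value for label, value in results.items()}
    # Throughput-like metric: a larger value means a larger speedup.
    return {label: value / base for label, value in results.items()}


# Rows copied from the FLAC Audio Encoding table (Seconds, lower is better).
FLAC = """\
-O0 .................. 46.74 |=================================================
-Os .................. 10.62 |===========
-O2 .................. 6.68 |=======
-O3 .................. 6.83 |=======
-Ofast -march=native . 7.03 |=======
"""

flac = parse_rows(FLAC)
speedups = relative_to(flac, "-O0", lower_is_better=True)
for label, factor in speedups.items():
    print(f"{label}: {factor:.2f}x")
```

For "Higher Is Better" tests such as Himeno or Redis, pass `lower_is_better=False` so the factor is the value divided by the baseline. Note that the s10 rows come from a different machine and compiler, so a factor computed against -O0 for s10 mixes hardware and compiler effects.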