GCC 4.8 Link-Time Optimization LTO

Preliminary test data for a future article.

System (identical for all four configurations apart from the compiler):

    Processor: Intel Core i7-3770K @ 3.50GHz (8 Cores), Motherboard: ECS Z77H2-A2X v1.0, Chipset: Intel Xeon E3-1200 v2/3rd, Memory: 8192MB, Disk: 60GB OCZ VERTEX2, Graphics: NVIDIA GeForce GTX 550 Ti 1024MB (405/324MHz), Audio: Realtek ALC892, Monitor: DELL P2210H, Network: Realtek RTL8111/8168B + Intel Centrino Advanced-N 6205

    OS: Ubuntu 13.04, Kernel: 3.8.0-4-generic (x86_64), Desktop: Unity 6.6.0, Display Server: X Server 1.13.2, Display Driver: nouveau 1.0.6, OpenGL: 3.0 Mesa 9.0.2 Gallium 0.4, File-System: ext4, Screen Resolution: 1920x1080

Configurations:

    GCC 4.7.2: Stock . Compiler: GCC 4.7
    GCC 4.7.2: LTO ... Compiler: GCC 4.7
    GCC 4.8.0: Stock . Compiler: GCC 4.8.0 20130121
    GCC 4.8.0: LTO ... Compiler: GCC 4.8.0 20130121

Rodinia 2.2
Test: OpenMP LavaMD
Seconds < Lower Is Better
GCC 4.7.2: Stock . 100.09
GCC 4.7.2: LTO ... 99.43
GCC 4.8.0: Stock . 100.24
GCC 4.8.0: LTO ... 100.03

Rodinia 2.2
Test: OpenMP Leukocyte
Seconds < Lower Is Better
GCC 4.7.2: Stock . 36.97
GCC 4.7.2: LTO ... 36.55
GCC 4.8.0: Stock . 37.36
GCC 4.8.0: LTO ... 37.66

Rodinia 2.2
Test: OpenMP CFD Solver
Seconds < Lower Is Better
GCC 4.7.2: Stock . 88.56
GCC 4.7.2: LTO ... 87.54
GCC 4.8.0: Stock . 89.09
GCC 4.8.0: LTO ... 89.55

Rodinia 2.2
Test: OpenMP Streamcluster
Seconds < Lower Is Better
GCC 4.7.2: Stock . 31.46
GCC 4.7.2: LTO ... 30.99
GCC 4.8.0: Stock . 31.33
GCC 4.8.0: LTO ... 31.36

NAS Parallel Benchmarks 3.3
Test / Class: BT.A
Total Mop/s > Higher Is Better
GCC 4.7.2: Stock . 13953.33
GCC 4.7.2: LTO ... 13991.07
GCC 4.8.0: Stock . 13935.87
GCC 4.8.0: LTO ... 13890.39

NAS Parallel Benchmarks 3.3
Test / Class: CG.B
Total Mop/s > Higher Is Better
GCC 4.7.2: Stock . 2919.03
GCC 4.7.2: LTO ... 2960.01
GCC 4.8.0: Stock . 2952.17
GCC 4.8.0: LTO ... 2882.45

NAS Parallel Benchmarks 3.3
Test / Class: EP.B
Total Mop/s > Higher Is Better
GCC 4.7.2: Stock . 274.16
GCC 4.7.2: LTO ... 275.03
GCC 4.8.0: Stock . 258.85
GCC 4.8.0: LTO ... 258.41

NAS Parallel Benchmarks 3.3
Test / Class: FT.B
Total Mop/s > Higher Is Better
GCC 4.7.2: Stock . 7686.09
GCC 4.7.2: LTO ... 7797.11
GCC 4.8.0: Stock . 7728.30
GCC 4.8.0: LTO ... 7669.98

NAS Parallel Benchmarks 3.3
Test / Class: IS.C
Total Mop/s > Higher Is Better
GCC 4.7.2: Stock . 205.13
GCC 4.7.2: LTO ... 205.13
GCC 4.8.0: Stock . 205.66
GCC 4.8.0: LTO ... 206.50

NAS Parallel Benchmarks 3.3
Test / Class: LU.A
Total Mop/s > Higher Is Better
GCC 4.7.2: Stock . 14051.29
GCC 4.7.2: LTO ... 14113.38
GCC 4.8.0: Stock . 14146.77
GCC 4.8.0: LTO ... 13952.21

NAS Parallel Benchmarks 3.3
Test / Class: MG.B
Total Mop/s > Higher Is Better
GCC 4.7.2: Stock . 6062.56
GCC 4.7.2: LTO ... 6283.14
GCC 4.8.0: Stock . 6216.09
GCC 4.8.0: LTO ... 6189.04

NAS Parallel Benchmarks 3.3
Test / Class: SP.A
Total Mop/s > Higher Is Better
GCC 4.7.2: Stock . 6126.81
GCC 4.7.2: LTO ... 6190.89
GCC 4.8.0: Stock . 6119.68
GCC 4.8.0: LTO ... 6200.41

NAS Parallel Benchmarks 3.3
Test / Class: UA.A
Total Mop/s > Higher Is Better
GCC 4.7.2: Stock . 45.05
GCC 4.7.2: LTO ... 45.84
GCC 4.8.0: Stock . 45.23
GCC 4.8.0: LTO ... 44.55

LAMMPS Molecular Dynamics Simulator 1.0
Test: Rhodopsin Protein
Loop Time < Lower Is Better
GCC 4.7.2: Stock . 37.44
GCC 4.7.2: LTO ... 37.45
GCC 4.8.0: Stock . 36.08
GCC 4.8.0: LTO ... 36.16

FFTE 5.0
Test: N=64, 1D Complex FFT Routine
MFLOPS > Higher Is Better
GCC 4.7.2: Stock . 5838.22
GCC 4.7.2: LTO ... 5857.02
GCC 4.8.0: Stock . 5899.41
GCC 4.8.0: LTO ... 5890.24

Timed HMMer Search 2.3.2
Pfam Database Search
Seconds < Lower Is Better
GCC 4.7.2: Stock . 10.12
GCC 4.7.2: LTO ... 10.06
GCC 4.8.0: Stock . 10.10
GCC 4.8.0: LTO ... 10.04

Timed MAFFT Alignment 6.864
Multiple Sequence Alignment
Seconds < Lower Is Better
GCC 4.7.2: Stock . 5.96
GCC 4.7.2: LTO ... 5.67
GCC 4.8.0: Stock . 5.72
GCC 4.8.0: LTO ... 5.53

BLAKE2 20121223
Phoronix Test Suite v4.4.0m2
Cycles Per Byte < Lower Is Better
GCC 4.7.2: Stock . 5.33
GCC 4.7.2: LTO ... 5.32
GCC 4.8.0: Stock . 5.31
GCC 4.8.0: LTO ... 5.31

GMPbench 0.2
Total Time
GMPbench Score > Higher Is Better
GCC 4.7.2: Stock . 3583.50
GCC 4.7.2: LTO ... 3588.50
GCC 4.8.0: Stock . 3581.30
GCC 4.8.0: LTO ... 3572.30

BYTE Unix Benchmark 3.6
Computational Test: Dhrystone 2
LPS > Higher Is Better
GCC 4.7.2: Stock . 28580136.73
GCC 4.7.2: LTO ... 39569578.13
GCC 4.8.0: Stock . 30134631.90
GCC 4.8.0: LTO ... 37703356.83

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
GCC 4.7.2: Stock . 418.62
GCC 4.7.2: LTO ... 419.02
GCC 4.8.0: Stock . 555.98
GCC 4.8.0: LTO ... 555.63

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
GCC 4.7.2: Stock . 344.67
GCC 4.7.2: LTO ... 334.57
GCC 4.8.0: Stock . 328.78
GCC 4.8.0: LTO ... 339.22

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
GCC 4.7.2: Stock . 2296.31
GCC 4.7.2: LTO ... 2277.20
GCC 4.8.0: Stock . 2220.19
GCC 4.8.0: LTO ... 2239.84

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
GCC 4.7.2: Stock . 2410.55
GCC 4.7.2: LTO ... 2410.58
GCC 4.8.0: Stock . 2399.71
GCC 4.8.0: LTO ... 2405.14

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
GCC 4.7.2: Stock . 1181.46
GCC 4.7.2: LTO ... 1180.32
GCC 4.8.0: Stock . 1180.32
GCC 4.8.0: LTO ... 1179.18

VP8 libvpx Encoding 1.1.0
vpxenc
Frames Per Second > Higher Is Better
GCC 4.7.2: Stock . 28.07
GCC 4.7.2: LTO ... 27.97
GCC 4.8.0: Stock . 27.10
GCC 4.8.0: LTO ... 27.73

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
GCC 4.7.2: Stock . 1676.54
GCC 4.7.2: LTO ... 1652.54
GCC 4.8.0: Stock . 1610.63
GCC 4.8.0: LTO ... 1709.87

7-Zip Compression 9.20.1
Compress Speed Test
MIPS > Higher Is Better
GCC 4.7.2: Stock . 21064
GCC 4.7.2: LTO ... 21706
GCC 4.8.0: Stock . 20896
GCC 4.8.0: LTO ... 21158

Timed ImageMagick Compilation 6.8.1-10
Time To Compile
Seconds < Lower Is Better
GCC 4.7.2: Stock . 65.82
GCC 4.7.2: LTO ... 135.51
GCC 4.8.0: Stock . 250.48
GCC 4.8.0: LTO ... 379.88

Timed PHP Compilation 5.2.9
Time To Compile
Seconds < Lower Is Better
GCC 4.7.2: Stock . 27.32
GCC 4.7.2: LTO ... 101.10
GCC 4.8.0: Stock . 43.75
GCC 4.8.0: LTO ... 101.71

Parallel BZIP2 Compression 1.1.6
256MB File Compression
Seconds < Lower Is Better
GCC 4.7.2: Stock . 8.09
GCC 4.7.2: LTO ... 8.52
GCC 4.8.0: Stock . 7.69
GCC 4.8.0: LTO ... 7.75

Smallpt 1.0
Global Illumination Renderer; 100 Samples
Seconds < Lower Is Better
GCC 4.7.2: Stock . 38
GCC 4.7.2: LTO ... 38
GCC 4.8.0: Stock . 38
GCC 4.8.0: LTO ... 38

Bullet Physics Engine 2.81
Test: Raytests
Seconds < Lower Is Better
GCC 4.7.2: Stock . 3.39
GCC 4.7.2: LTO ... 3.46
GCC 4.8.0: Stock . 3.31
GCC 4.8.0: LTO ... 3.37

Bullet Physics Engine 2.81
Test: 3000 Fall
Seconds < Lower Is Better
GCC 4.7.2: Stock . 5.05
GCC 4.7.2: LTO ... 5.29
GCC 4.8.0: Stock . 4.99
GCC 4.8.0: LTO ... 5.44

Bullet Physics Engine 2.81
Test: 1000 Stack
Seconds < Lower Is Better
GCC 4.7.2: Stock . 6.06
GCC 4.7.2: LTO ... 6.32
GCC 4.8.0: Stock . 5.89
GCC 4.8.0: LTO ... 7.01

Bullet Physics Engine 2.81
Test: 1000 Convex
Seconds < Lower Is Better
GCC 4.7.2: Stock . 5.92
GCC 4.7.2: LTO ... 6.03
GCC 4.8.0: Stock . 5.77
GCC 4.8.0: LTO ... 6.12

Bullet Physics Engine 2.81
Test: 136 Ragdolls
Seconds < Lower Is Better
GCC 4.7.2: Stock . 3.59
GCC 4.7.2: LTO ... 3.77
GCC 4.8.0: Stock . 3.60
GCC 4.8.0: LTO ... 3.95

Bullet Physics Engine 2.81
Test: Prim Trimesh
Seconds < Lower Is Better
GCC 4.7.2: Stock . 1.10
GCC 4.7.2: LTO ... 1.15
GCC 4.8.0: Stock . 1.08
GCC 4.8.0: LTO ... 1.13

Bullet Physics Engine 2.81
Test: Convex Trimesh
Seconds < Lower Is Better
GCC 4.7.2: Stock . 1.38
GCC 4.7.2: LTO ... 1.47
GCC 4.8.0: Stock . 1.36
GCC 4.8.0: LTO ... 1.44

LAME MP3 Encoding 3.99.3
WAV To MP3
Seconds < Lower Is Better
GCC 4.7.2: Stock . 13.59
GCC 4.7.2: LTO ... 13.88
GCC 4.8.0: Stock . 13.45
GCC 4.8.0: LTO ... 13.68

FFmpeg 1.1
H.264 HD To NTSC DV
Seconds < Lower Is Better
GCC 4.7.2: Stock . 16.97
GCC 4.7.2: LTO ... 16.88
GCC 4.8.0: Stock . 16.85
GCC 4.8.0: LTO ... 16.74

Mencoder 1.1
AVI To LAVC
Seconds < Lower Is Better
GCC 4.7.2: Stock . 20.56
GCC 4.8.0: Stock . 20.59

Minion 0.15
Benchmark: Graceful
Seconds < Lower Is Better
GCC 4.7.2: Stock . 57.47
GCC 4.7.2: LTO ... 57.59
GCC 4.8.0: Stock . 59.27
GCC 4.8.0: LTO ... 59.14

Minion 0.15
Benchmark: Solitaire
Seconds < Lower Is Better
GCC 4.7.2: Stock . 74.60
GCC 4.7.2: LTO ... 74.62
GCC 4.8.0: Stock . 73.75
GCC 4.8.0: LTO ... 73.54

Minion 0.15
Benchmark: Quasigroup
Seconds < Lower Is Better
GCC 4.7.2: Stock . 124.64
GCC 4.7.2: LTO ... 125.31
GCC 4.8.0: Stock . 124.03
GCC 4.8.0: LTO ... 123.43

Open FMM Nero2D 2.0.2
Total Time
Seconds < Lower Is Better
GCC 4.7.2: Stock . 1104.04
GCC 4.7.2: LTO ... 1064.01
GCC 4.8.0: Stock . 1092.29

Tachyon 0.98.9
Total Time
Seconds < Lower Is Better
GCC 4.7.2: Stock . 13.30
GCC 4.7.2: LTO ... 13.45
GCC 4.8.0: Stock . 13.06
GCC 4.8.0: LTO ... 13.04

Apache Benchmark 2.4.3
Static Web Page Serving
Requests Per Second > Higher Is Better
GCC 4.7.2: Stock . 29921.42
GCC 4.7.2: LTO ... 30778.21
GCC 4.8.0: Stock . 30794.73
GCC 4.8.0: LTO ... 30772.14
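
Because some results are "Higher Is Better" and others "Lower Is Better", comparing LTO against Stock by eye is easy to get backwards. The following is a minimal sketch, not part of the Phoronix Test Suite, of one way to normalize a Stock/LTO pair into a single signed percentage. The helper name `lto_gain_percent` and the choice of sample rows are assumptions made for illustration; the numbers themselves are copied from the GCC 4.7.2 results in this data.

```python
def lto_gain_percent(stock, lto, higher_is_better):
    """Percent by which the LTO build beats the Stock build.

    Positive means LTO is better, negative means LTO is worse,
    regardless of whether the metric is higher- or lower-is-better.
    (Hypothetical helper, not a Phoronix Test Suite function.)
    """
    if stock == 0:
        raise ValueError("stock result must be non-zero")
    delta = (lto - stock) if higher_is_better else (stock - lto)
    return 100.0 * delta / stock


# (test name, GCC 4.7.2 Stock, GCC 4.7.2 LTO, higher_is_better)
samples = [
    ("Dhrystone 2 (LPS)", 28580136.73, 39569578.13, True),
    ("Timed PHP Compilation (s)", 27.32, 101.10, False),
    ("Timed MAFFT Alignment (s)", 5.96, 5.67, False),
]

for name, stock, lto, higher in samples:
    print(f"{name}: {lto_gain_percent(stock, lto, higher):+.1f}%")
```

Dividing by the Stock result treats Stock as the baseline in every case; a single run per configuration says nothing about run-to-run variance, so small percentages should be read with caution.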