LLVM Clang 3.8 Compiler Tuning

Intel Xeon E5-2687W v3 testing with a MSI X99S SLI PLUS (MS-7885) v1.0 and AMD FirePro V7900 2048MB on Ubuntu 16.04 via the Phoronix Test Suite.

-O0, -O1, -O2, -Oz, -O3, -O3 -march=native (system details are identical for all six configurations):

  Processor: Intel Xeon E5-2687W v3 @ 3.50GHz (20 Cores)
  Motherboard: MSI X99S SLI PLUS (MS-7885) v1.0
  Chipset: Intel Xeon E7 v3/Xeon
  Memory: 16384MB
  Disk: PNY CS1211 120GB + 80GB INTEL SSDSCKGW08
  Graphics: AMD FirePro V7900 2048MB
  Audio: Realtek ALC892
  Monitor: ASUS PB278
  Network: Intel Connection

  OS: Ubuntu 16.04
  Kernel: 4.5.0-040500rc1-generic (x86_64) 20160124
  Desktop: Unity 7.4.0
  Display Server: X Server 1.17.3
  Display Driver: radeon 7.6.1
  OpenGL: 3.3 Mesa 11.0.8 Gallium 0.4
  Compiler: Clang 3.8.0 (SVN 259676) + LLVM 3.8.0
  File-System: ext4
  Screen Resolution: 2560x1440

FLAC Audio Encoding 1.3.1
WAV To FLAC
Seconds < Lower Is Better
-O0 ............... 57.44
-O1 ............... 9.12
-O2 ............... 8.65
-Oz ............... 12.27
-O3 ............... 8.66
-O3 -march=native . 7.11

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
-O0 ............... 284.58
-O1 ............... 1334.47
-O2 ............... 1359.01
-Oz ............... 1002.24
-O3 ............... 1354.29
-O3 -march=native . 1342.94

GraphicsMagick 1.3.19
Operation: Local Adaptive Thresholding
Iterations Per Minute > Higher Is Better
-O0 ............... 18
-O1 ............... 70
-O2 ............... 81
-Oz ............... 71
-O3 ............... 80
-O3 -march=native . 84

Timed PHP Compilation 5.2.9
Time To Compile
Seconds < Lower Is Better
-O0 ............... 4.24
-O1 ............... 8.93
-O2 ............... 11.82
-Oz ............... 9.55
-O3 ............... 15.79
-O3 -march=native . 15.93

Hierarchical INTegration 1.0
Test: FLOAT
QUIPs > Higher Is Better
-O0 ............... 112554681.82
-O1 ............... 266823791.59
-O2 ............... 322378022.36
-Oz ............... 314508487.41
-O3 ............... 321885788.84
-O3 -march=native . 268576816.54

LAME MP3 Encoding 3.99.3
WAV To MP3
Seconds < Lower Is Better
-O0 ............... 37.10
-O1 ............... 15.68
-O2 ............... 13.95
-Oz ............... 16.88
-O3 ............... 14.38
-O3 -march=native . 15.22

Timed Apache Compilation 2.4.7
Time To Compile
Seconds < Lower Is Better
-O0 ............... 9.25
-O1 ............... 17.83
-O2 ............... 21.27
-Oz ............... 19.22
-O3 ............... 21.67
-O3 -march=native . 21.87

C-Ray 1.1
Total Time
Seconds < Lower Is Better
-O0 ............... 27.58
-O1 ............... 16.30
-O2 ............... 19.81
-Oz ............... 19.70
-O3 ............... 13.21
-O3 -march=native . 12.78

GraphicsMagick 1.3.19
Operation: Sharpen
Iterations Per Minute > Higher Is Better
-O0 ............... 61
-O1 ............... 111
-O2 ............... 113
-Oz ............... 108
-O3 ............... 113
-O3 -march=native . 107

GraphicsMagick 1.3.19
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
-O0 ............... 83
-O1 ............... 150
-O2 ............... 150
-Oz ............... 144
-O3 ............... 150
-O3 -march=native . 150

GraphicsMagick 1.3.19
Operation: Blur
Iterations Per Minute > Higher Is Better
-O0 ............... 67
-O1 ............... 116
-O2 ............... 119
-Oz ............... 113
-O3 ............... 119
-O3 -march=native . 117

PostgreSQL pgbench 9.4.3
Scaling: Buffer Test - Test: Single Thread - Mode: Read Write
TPS > Higher Is Better
-O0 ............... 284.51
-O1 ............... 348.86
-O2 ............... 379.00
-Oz ............... 357.32
-O3 ............... 351.75
-O3 -march=native . 362.58

Redis 3.0.1
Test: LPUSH
Requests Per Second > Higher Is Better
-O0 ............... 456864.10
-O1 ............... 585266.59
-O2 ............... 570314.27
-Oz ............... 589078.56
-O3 ............... 587392.23
-O3 -march=native . 575188.23

Redis 3.0.1
Test: SET
Requests Per Second > Higher Is Better
-O0 ............... 468101.81
-O1 ............... 575054.87
-O2 ............... 579653.25
-Oz ............... 585438.60
-O3 ............... 582801.37
-O3 -march=native . 580208.84

Redis 3.0.1
Test: SADD
Requests Per Second > Higher Is Better
-O0 ............... 481943.16
-O1 ............... 587106.65
-O2 ............... 599041.21
-Oz ............... 581864.31
-O3 ............... 592835.98
-O3 -march=native . 587747.73

Redis 3.0.1
Test: GET
Requests Per Second > Higher Is Better
-O0 ............... 527734.53
-O1 ............... 625334.40
-O2 ............... 624757.29
-Oz ............... 632178.77
-O3 ............... 649056.62
-O3 -march=native . 642443.67

Redis 3.0.1
Test: LPOP
Requests Per Second > Higher Is Better
-O0 ............... 541334.66
-O1 ............... 642965.38
-O2 ............... 633016.31
-Oz ............... 631316.90
-O3 ............... 642800.77
-O3 -march=native . 624901.33

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
-O0 ............... 4133.95
-O1 ............... 4187.61
-O2 ............... 4137.16
-Oz ............... 4181.89
-O3 ............... 4127.10
-O3 -march=native . 4901.04

PostgreSQL pgbench 9.4.3
Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Write
TPS > Higher Is Better
-O0 ............... 5176.52
-O1 ............... 5209.08
-O2 ............... 5050.83
-Oz ............... 4911.10
-O3 ............... 4630.05
-O3 -march=native . 5342.15

PostgreSQL pgbench 9.4.3
Scaling: Buffer Test - Test: Normal Load - Mode: Read Write
TPS > Higher Is Better
-O0 ............... 5007.75
-O1 ............... 5049.34
-O2 ............... 4968.10
-Oz ............... 4736.91
-O3 ............... 4597.65
-O3 -march=native . 5092.20

Smallpt 1.0
Global Illumination Renderer; 100 Samples
Seconds < Lower Is Better
-O2 ............... 12
-Oz ............... 13
-O3 ............... 12
-O3 -march=native . 12

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
-O0 ............... 2807.96
-O1 ............... 2776.50
-O2 ............... 2798.01
-Oz ............... 2806.62
-O3 ............... 2766.53
-O3 -march=native . 2613.77

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
-O0 ............... 1803.18
-O1 ............... 1807.18
-O2 ............... 1802.36
-Oz ............... 1812.18
-O3 ............... 1793.50
-O3 -march=native . 1918.30

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
-O0 ............... 358.28
-O1 ............... 357.19
-O2 ............... 363.59
-Oz ............... 357.50
-O3 ............... 358.05
-O3 -march=native . 362.39

Apache Benchmark 2.4.7
Static Web Page Serving
Requests Per Second > Higher Is Better
-O0 ............... 23021.42
-O1 ............... 23380.23
-O2 ............... 23360.85
-Oz ............... 23342.16
-O3 ............... 23283.07
-O3 -march=native . 23355.46

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
-O0 ............... 238.14
-O1 ............... 237.83
-O2 ............... 234.76
-Oz ............... 237.11
-O3 ............... 237.63
-O3 -march=native . 235.58

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
-O0 ............... 1477.57
-O1 ............... 1476.74
-O2 ............... 1478.30
-Oz ............... 1477.78
-O3 ............... 1478.19
-O3 -march=native . 1478.72

Timed HMMer Search 2.3.2
Pfam Database Search
Seconds < Lower Is Better
-O0 ............... 12.18
-O1 ............... 15.99
-O2 ............... 14.02
-Oz ............... 15.09
-O3 ............... 15.13
-O3 -march=native . 14.80
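
Because the tests mix "lower is better" and "higher is better" units, the raw numbers are hard to compare across benchmarks. One way to put them on a common scale is to express each configuration as a speedup factor over the -O0 run. The sketch below does this for the FLAC and Himeno results only. The values are copied from the tables above, while the helper name `relative_to_baseline` and the choice of -O0 as the baseline are assumptions made for illustration and are not part of the Phoronix Test Suite.

```python
# Minimal sketch: normalise results to a speedup factor over a baseline run.
# relative_to_baseline is a hypothetical helper, not a Phoronix Test Suite API.

def relative_to_baseline(results, baseline="-O0", lower_is_better=False):
    """Return each configuration's speedup over the baseline (>1 means faster)."""
    base = results[baseline]
    if lower_is_better:
        # For timings, a smaller value is faster, so divide baseline by value.
        return {cfg: base / value for cfg, value in results.items()}
    # For throughput, a larger value is faster, so divide value by baseline.
    return {cfg: value / base for cfg, value in results.items()}


# FLAC Audio Encoding 1.3.1, seconds (lower is better), from the results above.
flac_seconds = {
    "-O0": 57.44, "-O1": 9.12, "-O2": 8.65,
    "-Oz": 12.27, "-O3": 8.66, "-O3 -march=native": 7.11,
}

# Himeno Benchmark 3.0, MFLOPS (higher is better), from the results above.
himeno_mflops = {
    "-O0": 284.58, "-O1": 1334.47, "-O2": 1359.01,
    "-Oz": 1002.24, "-O3": 1354.29, "-O3 -march=native": 1342.94,
}

flac_speedup = relative_to_baseline(flac_seconds, lower_is_better=True)
himeno_speedup = relative_to_baseline(himeno_mflops)

for cfg in flac_seconds:
    print(f"{cfg:<18} FLAC {flac_speedup[cfg]:5.2f}x  Himeno {himeno_speedup[cfg]:5.2f}x")
```

The same helper applies to any other table in this file by passing its values and setting `lower_is_better` to match the unit line. The Smallpt table has no -O0 entry, so it would need a different baseline.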