GCC 8.0 vs. Clang 6.0 AMD EPYC Tuning Comparison

Tests for a future article on Phoronix.

System details (identical for all four configurations apart from the compiler):

  Processor: AMD EPYC 7601 32-Core @ 2.20GHz (64 Cores)
  Motherboard: TYAN B8026T70AE24HR
  Chipset: AMD Device 1450
  Memory: 126976MB
  Disk: 280GB INTEL SSDPE21D280GA
  Graphics: ASPEED ASPEED Family
  Monitor: VE228
  Network: Broadcom Limited NetXtreme BCM5720 Gigabit PCIe

  OS: Ubuntu 17.10
  Kernel: 4.13.0-21-generic (x86_64)
  Desktop: GNOME Shell 3.26.1
  Display Driver: modesetting 1.19.5
  OpenCL: OpenCL 1.2 pocl 1.0 LLVM 5.0.0
  File-System: ext4
  Screen Resolution: 1920x1080
  System Layer: vm-other Xen 4.9.0 Hypervisor

Compiler per configuration:

  GCC 8.0: x86-64 ..... GCC 8.0.0 20171231 + clang (GCC) 8.0.0 20171231 (experimental) + LLVM 5.0.0
  GCC 8.0: znver1 ..... GCC 8.0.0 20171231 + clang (GCC) 8.0.0 20171231 (experimental) + LLVM 5.0.0
  Clang 6.0: x86-64 ... Clang 6.0.0 (SVN 321623) + LLVM 6.0.0svn
  Clang 6.0: znver1 ... Clang 6.0.0 (SVN 321623) + LLVM 6.0.0svn


SQLite 3.8.10.2
Test Target: Default Test Directory
Seconds < Lower Is Better
GCC 8.0: x86-64 ... 7.61
GCC 8.0: znver1 ... 7.16
Clang 6.0: x86-64 . 7.53
Clang 6.0: znver1 . 7.48

PolyBench-C 3.2
Test: 3 Matrix Multiplications
Seconds < Lower Is Better
GCC 8.0: x86-64 ... 60.68
GCC 8.0: znver1 ... 65.45
Clang 6.0: x86-64 . 62.98
Clang 6.0: znver1 . 62.75

FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 4096
Mflops > Higher Is Better
GCC 8.0: x86-64 ... 4959.73
GCC 8.0: znver1 ... 5627.83
Clang 6.0: x86-64 . 4660.83
Clang 6.0: znver1 . 5031.60

FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 4096
Mflops > Higher Is Better
GCC 8.0: znver1 ... 13630
Clang 6.0: x86-64 . 13649
Clang 6.0: znver1 . 12481

Timed HMMer Search 2.3.2
Pfam Database Search
Seconds < Lower Is Better
GCC 8.0: x86-64 ... 13.65
GCC 8.0: znver1 ... 12.40
Clang 6.0: x86-64 . 12.85
Clang 6.0: znver1 . 11.09

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
GCC 8.0: x86-64 ... 1579.48
GCC 8.0: znver1 ... 1680.45
Clang 6.0: x86-64 . 1479.53
Clang 6.0: znver1 . 1699.32

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
GCC 8.0: x86-64 ... 561.03
GCC 8.0: znver1 ... 555.76
Clang 6.0: x86-64 . 531.38
Clang 6.0: znver1 . 552.19

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
GCC 8.0: x86-64 ... 233.89
GCC 8.0: znver1 ... 231.09
Clang 6.0: x86-64 . 179.29
Clang 6.0: znver1 . 226.68

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
GCC 8.0: x86-64 ... 2263.87
GCC 8.0: znver1 ... 2259.95
Clang 6.0: x86-64 . 2190.10
Clang 6.0: znver1 . 2258.64

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
GCC 8.0: x86-64 ... 3513.11
GCC 8.0: znver1 ... 3678.86
Clang 6.0: x86-64 . 3190.43
Clang 6.0: znver1 . 4034.89

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
GCC 8.0: x86-64 ... 1423.14
GCC 8.0: znver1 ... 1676.62
Clang 6.0: x86-64 . 1110.65
Clang 6.0: znver1 . 1424.21

TSCP 1.81
AI Chess Performance
Nodes Per Second > Higher Is Better
GCC 8.0: x86-64 ... 874251
GCC 8.0: znver1 ... 875085
Clang 6.0: x86-64 . 917658
Clang 6.0: znver1 . 918269

GraphicsMagick 1.3.19
Operation: Blur
Iterations Per Minute > Higher Is Better
GCC 8.0: x86-64 ... 116
GCC 8.0: znver1 ... 123
Clang 6.0: x86-64 . 101
Clang 6.0: znver1 . 104

GraphicsMagick 1.3.19
Operation: Sharpen
Iterations Per Minute > Higher Is Better
GCC 8.0: x86-64 ... 157
GCC 8.0: znver1 ... 165
Clang 6.0: x86-64 . 131
Clang 6.0: znver1 . 136

GraphicsMagick 1.3.19
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
GCC 8.0: x86-64 ... 177
GCC 8.0: znver1 ... 186
Clang 6.0: x86-64 . 150
Clang 6.0: znver1 . 155

GraphicsMagick 1.3.19
Operation: Local Adaptive Thresholding
Iterations Per Minute > Higher Is Better
GCC 8.0: x86-64 ... 92
GCC 8.0: znver1 ... 95
Clang 6.0: x86-64 . 97
Clang 6.0: znver1 . 98

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
GCC 8.0: x86-64 ... 949.19
GCC 8.0: znver1 ... 935.64
Clang 6.0: x86-64 . 1032.71
Clang 6.0: znver1 . 1052.47

ebizzy 0.3
Records/s > Higher Is Better
GCC 8.0: x86-64 ... 1126032
GCC 8.0: znver1 ... 1101176
Clang 6.0: x86-64 . 1076648
Clang 6.0: znver1 . 1145405

C-Ray 1.1
Total Time
Seconds < Lower Is Better
GCC 8.0: x86-64 ... 3.93
GCC 8.0: znver1 ... 3.37
Clang 6.0: x86-64 . 4.53
Clang 6.0: znver1 . 4.48

Bullet Physics Engine 2.81
Test: Raytests
Seconds < Lower Is Better
GCC 8.0: x86-64 ... 3.12
GCC 8.0: znver1 ... 3.06
Clang 6.0: x86-64 . 3.22
Clang 6.0: znver1 . 3.18

Bullet Physics Engine 2.81
Test: 3000 Fall
Seconds < Lower Is Better
GCC 8.0: x86-64 ... 5.34
GCC 8.0: znver1 ... 5.27
Clang 6.0: x86-64 . 5.48
Clang 6.0: znver1 . 5.34

Bullet Physics Engine 2.81
Test: 1000 Stack
Seconds < Lower Is Better
GCC 8.0: x86-64 ... 6.18
GCC 8.0: znver1 ... 5.93
Clang 6.0: x86-64 . 6.30
Clang 6.0: znver1 . 6.08

Bullet Physics Engine 2.81
Test: 1000 Convex
Seconds < Lower Is Better
GCC 8.0: x86-64 ... 5.44
GCC 8.0: znver1 ... 5.28
Clang 6.0: x86-64 . 5.43
Clang 6.0: znver1 . 5.31

Bullet Physics Engine 2.81
Test: 136 Ragdolls
Seconds < Lower Is Better
GCC 8.0: x86-64 ... 3.26
GCC 8.0: znver1 ... 3.19
Clang 6.0: x86-64 . 3.28
Clang 6.0: znver1 . 3.23

Bullet Physics Engine 2.81
Test: Prim Trimesh
Seconds < Lower Is Better
GCC 8.0: x86-64 ... 1.10
GCC 8.0: znver1 ... 1.10
Clang 6.0: x86-64 . 1.10
Clang 6.0: znver1 . 1.09

Bullet Physics Engine 2.81
Test: Convex Trimesh
Seconds < Lower Is Better
GCC 8.0: x86-64 ... 1.34
GCC 8.0: znver1 ... 1.30
Clang 6.0: x86-64 . 1.33
Clang 6.0: znver1 . 1.32

FLAC Audio Encoding 1.3.1
WAV To FLAC
Seconds < Lower Is Better
GCC 8.0: x86-64 ... 7.12
GCC 8.0: znver1 ... 7.45
Clang 6.0: x86-64 . 7.94
Clang 6.0: znver1 . 6.63

LAME MP3 Encoding 3.99.5
WAV To MP3
Seconds < Lower Is Better
GCC 8.0: x86-64 ... 11.10
GCC 8.0: znver1 ... 10.81
Clang 6.0: x86-64 . 11.33
Clang 6.0: znver1 . 12.81

Apache Benchmark 2.4.7
Static Web Page Serving
Requests Per Second > Higher Is Better
GCC 8.0: x86-64 ... 9841.30
GCC 8.0: znver1 ... 9791.23
Clang 6.0: x86-64 . 9531.43
Clang 6.0: znver1 . 9663.93
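
Because the results mix "Lower Is Better" (seconds) and "Higher Is Better" (Mflops, requests per second) metrics, the znver1 and x86-64 figures are easiest to compare as a percentage gain that accounts for the scoring direction. The following is a minimal sketch of that arithmetic, not part of the test suite output. The function names and the choice of sample rows are illustrative assumptions. The numbers themselves are copied from the results above, and the convention used for time-based tests (baseline time divided by tuned time) is one of several reasonable choices.

```python
def percent_gain(baseline, tuned, higher_is_better):
    """Return how much better `tuned` is than `baseline`, in percent.

    A positive value means the tuned build did better, regardless of
    whether the benchmark is scored as higher-is-better or lower-is-better.
    """
    if higher_is_better:
        return (tuned / baseline - 1.0) * 100.0
    # For time-based results a smaller value is better, so invert the ratio.
    return (baseline / tuned - 1.0) * 100.0


# (test, compiler, x86-64 result, znver1 result, higher_is_better)
# Values taken from the results listed above; the selection is arbitrary.
SAMPLES = [
    ("C-Ray 1.1 (seconds)", "GCC 8.0", 3.93, 3.37, False),
    ("C-Ray 1.1 (seconds)", "Clang 6.0", 4.53, 4.48, False),
    ("SciMark 2.0 Composite (Mflops)", "GCC 8.0", 1579.48, 1680.45, True),
    ("SciMark 2.0 Composite (Mflops)", "Clang 6.0", 1479.53, 1699.32, True),
    ("LAME MP3 Encoding (seconds)", "Clang 6.0", 11.33, 12.81, False),
]


def report(samples):
    """Format one line per sample showing the znver1 gain over x86-64."""
    lines = []
    for test, compiler, generic, znver1, higher_is_better in samples:
        gain = percent_gain(generic, znver1, higher_is_better)
        lines.append(f"{test:32} {compiler:10} {gain:+6.1f}%")
    return lines


if __name__ == "__main__":
    print("\n".join(report(SAMPLES)))
```

A negative value, as for the Clang 6.0 LAME MP3 result, indicates that the znver1 build was slower than the generic x86-64 build in that test.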