GCC 9.1 PGO Optimizations AMD Threadripper

AMD Ryzen Threadripper 2990WX GCC 9 PGO benchmarks by Michael Larabel (Profile Guided Optimizations).

-O3 -march=native:

  Processor: AMD Ryzen Threadripper 2990WX 32-Core @ 3.00GHz (32 Cores / 64 Threads), Motherboard: ASUS ROG ZENITH EXTREME (1701 BIOS), Chipset: AMD 17h, Memory: 32768MB, Disk: Samsung SSD 970 EVO 500GB, Graphics: AMD Radeon RX 64 8GB (1590/800MHz), Audio: Realtek ALC1220, Monitor: ASUS VP28U, Network: Intel I211 + Qualcomm Atheros QCA6174 802.11ac + Wilocity Wil6200 802.11ad

  OS: Ubuntu 18.04, Kernel: 4.18.0-18-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.20.1, Display Driver: amdgpu 18.1.0, OpenGL: 4.5 Mesa 18.2.8 (LLVM 7.0.0), Compiler: GCC 9.1.0, File-System: ext4, Screen Resolution: 3840x2160

-O3 -march=native + PGO:

  Processor: AMD Ryzen Threadripper 2990WX 32-Core @ 3.00GHz (32 Cores / 64 Threads), Motherboard: ASUS ROG ZENITH EXTREME (1701 BIOS), Chipset: AMD 17h, Memory: 32768MB, Disk: Samsung SSD 970 EVO 500GB, Graphics: AMD Radeon RX 64 8GB (1590/800MHz), Audio: Realtek ALC1220, Monitor: ASUS VP28U, Network: Intel I211 + Qualcomm Atheros QCA6174 802.11ac + Wilocity Wil6200 802.11ad

  OS: Ubuntu 18.04, Kernel: 4.18.0-18-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.20.1, Display Driver: amdgpu 18.1.0, OpenGL: 4.5 Mesa 18.2.8 (LLVM 7.0.0), Compiler: GCC 9.1.0, File-System: ext4, Screen Resolution: 3840x2160

t-test1 2017-01-13
Threads: 1
Seconds < Lower Is Better
-O3 -march=native ....... 28.74 |==============================================
-O3 -march=native + PGO . 9.62  |===============

High Performance Conjugate Gradient 3.0
GFLOP/s > Higher Is Better
-O3 -march=native ....... 0.91 |===============================================
-O3 -march=native + PGO . 0.86 |============================================

Timed MAFFT Alignment 7.392
Multiple Sequence Alignment
Seconds < Lower Is Better
-O3 -march=native ....... 2.63 |===============================================
-O3 -march=native + PGO . 2.64 |===============================================

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
-O3 -march=native ....... 2555 |===============================================
-O3 -march=native + PGO . 2456 |=============================================

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
-O3 -march=native ....... 728 |================================================
-O3 -march=native + PGO . 271 |==================

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
-O3 -march=native ....... 261 |===============================================
-O3 -march=native + PGO . 269 |================================================

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
-O3 -march=native ....... 3220 |===============================================
-O3 -march=native + PGO . 3139 |==============================================

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
-O3 -march=native ....... 6356 |===============================================
-O3 -march=native + PGO . 6408 |===============================================

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
-O3 -march=native ....... 2208 |===============================================
-O3 -march=native + PGO . 2195 |===============================================

TSCP 1.81
AI Chess Performance
Nodes Per Second > Higher Is Better
-O3 -march=native ....... 1109114 |=======================================
-O3 -march=native + PGO . 1250067 |============================================

AOM AV1 2019-02-11
AV1 Video Encoding
Frames Per Second > Higher Is Better
-O3 -march=native ....... 0.22 |=============================================
-O3 -march=native + PGO . 0.23 |===============================================

x265 3.0
H.265 1080p Video Encoding
Frames Per Second > Higher Is Better
-O3 -march=native ....... 33.79 |==============================================
-O3 -march=native + PGO . 33.88 |==============================================

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
-O3 -march=native ....... 1321 |===============================================
-O3 -march=native + PGO . 1307 |===============================================

Stockfish 9
Total Time
Nodes Per Second > Higher Is Better
-O3 -march=native ....... 67841877 |===========================================
-O3 -march=native + PGO . 65102152 |=========================================

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
-O3 -march=native ....... 17.96 |==============================================
-O3 -march=native + PGO . 17.73 |=============================================

Smallpt 1.0
Global Illumination Renderer; 128 Samples
Seconds < Lower Is Better
-O3 -march=native ....... 3.83 |===============================================
-O3 -march=native + PGO . 3.82 |===============================================

AOBench
Size: 2048 x 2048 - Total Time
Seconds < Lower Is Better
-O3 -march=native ....... 39.10 |==============================================
-O3 -march=native + PGO . 36.86 |===========================================

CppPerformanceBenchmarks 9
Test: Atol
Seconds < Lower Is Better
-O3 -march=native ....... 69.31 |==============================================
-O3 -march=native + PGO . 68.92 |==============================================

CppPerformanceBenchmarks 9
Test: Ctype
Seconds < Lower Is Better
-O3 -march=native ....... 34.02 |==============================================
-O3 -march=native + PGO . 29.01 |=======================================

CppPerformanceBenchmarks 9
Test: Math Library
Seconds < Lower Is Better
-O3 -march=native ....... 353 |================================================
-O3 -march=native + PGO . 348 |===============================================

CppPerformanceBenchmarks 9
Test: Random Numbers
Seconds < Lower Is Better
-O3 -march=native ....... 1027 |==============================================
-O3 -march=native + PGO . 1051 |===============================================

CppPerformanceBenchmarks 9
Test: Stepanov Vector
Seconds < Lower Is Better
-O3 -march=native ....... 75.27 |==============================================
-O3 -march=native + PGO . 72.87 |=============================================

CppPerformanceBenchmarks 9
Test: Function Objects
Seconds < Lower Is Better
-O3 -march=native ....... 15.46 |==============================================
-O3 -march=native + PGO . 14.14 |==========================================

CppPerformanceBenchmarks 9
Test: Stepanov Abstraction
Seconds < Lower Is Better
-O3 -march=native ....... 28.36 |==============================================
-O3 -march=native + PGO . 27.75 |=============================================

Memcached mcperf 1.5.10
Method: Add
Operations Per Second > Higher Is Better
-O3 -march=native . 47774 |====================================================

Geometric Mean Of All Test Results
Result Composite - GCC 9.1 PGO Optimizations AMD Threadripper
Geometric Mean > Higher Is Better
-O3 -march=native ....... 58.49 |=============================================
-O3 -march=native + PGO . 59.73 |==============================================
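
Because the results mix "Lower Is Better" and "Higher Is Better" units, the paired numbers are easier to compare once each is turned into a speedup ratio. The following is a minimal sketch of one way to do that, not the method used to produce the Geometric Mean composite reported above. The names `RESULTS`, `speedup`, and `geometric_mean` are hypothetical, and only four of the reported tests are included for brevity.

```python
from math import prod  # math.prod requires Python 3.8 or newer

# (test name, -O3 -march=native value, + PGO value, higher_is_better)
# Values are copied from the results above; the selection of tests is arbitrary.
RESULTS = [
    ("t-test1 (Threads: 1)", 28.74, 9.62, False),
    ("SciMark 2.0 Monte Carlo", 728.0, 271.0, True),
    ("TSCP 1.81", 1109114.0, 1250067.0, True),
    ("CppPerformanceBenchmarks 9 Ctype", 34.02, 29.01, False),
]


def speedup(baseline, pgo, higher_is_better):
    """Return the PGO speedup as a ratio; a value above 1.0 means PGO was faster."""
    return pgo / baseline if higher_is_better else baseline / pgo


def geometric_mean(values):
    """Geometric mean of a non-empty sequence of positive numbers."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


if __name__ == "__main__":
    ratios = []
    for name, baseline, pgo, higher_is_better in RESULTS:
        ratio = speedup(baseline, pgo, higher_is_better)
        ratios.append(ratio)
        print(f"{name}: {ratio:.2f}x")
    print(f"Geometric mean of these four ratios: {geometric_mean(ratios):.2f}x")
```

A ratio below 1.0 marks a regression under PGO, as with the SciMark Monte Carlo result, where the PGO build scored 271 Mflops against 728 for the baseline.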