AMD Ryzen Threadripper GCC 10 PGO benchmarks by Michael Larabel for a future article.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 1912222-PTS-GCC10AMD29
GCC 10 AMD Threadripper 3960X PGO Optimization
GCC 10:
Processor: AMD Ryzen Threadripper 3960X 24-Core @ 3.80GHz (24 Cores / 48 Threads), Motherboard: MSI Creator TRX40 (MS-7C59) v1.0 (1.12N1 BIOS), Chipset: AMD Starship/Matisse, Memory: 32768MB, Disk: 1000GB Sabrent Rocket 4.0 1TB, Graphics: Gigabyte AMD Radeon 540/540X/550/550X / RX 540X/550/550X 2GB (1206/1750MHz), Audio: AMD Baffin HDMI/DP, Monitor: ASUS VP28U, Network: Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Device 2723
OS: Ubuntu 19.10, Kernel: 5.4.0-nvme-hwmon (x86_64), Desktop: GNOME Shell 3.34.1, Display Server: X Server 1.20.5, Display Driver: modesetting 1.20.5, OpenGL: 4.5 Mesa 19.2.1 (LLVM 9.0.0), Compiler: GCC 10.0.0 20191208, File-System: ext4, Screen Resolution: 3840x2160
GCC 10 - PGO:
Processor: AMD Ryzen Threadripper 3960X 24-Core @ 3.80GHz (24 Cores / 48 Threads), Motherboard: MSI Creator TRX40 (MS-7C59) v1.0 (1.12N1 BIOS), Chipset: AMD Starship/Matisse, Memory: 32768MB, Disk: 1000GB Sabrent Rocket 4.0 1TB, Graphics: Gigabyte AMD Radeon 540/540X/550/550X / RX 540X/550/550X 2GB (1206/1750MHz), Audio: AMD Baffin HDMI/DP, Monitor: ASUS VP28U, Network: Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Device 2723
OS: Ubuntu 19.10, Kernel: 5.4.0-nvme-hwmon (x86_64), Desktop: GNOME Shell 3.34.1, Display Server: X Server 1.20.5, Display Driver: modesetting 1.20.5, OpenGL: 4.5 Mesa 19.2.1 (LLVM 9.0.0), Compiler: GCC 10.0.0 20191208, File-System: ext4, Screen Resolution: 3840x2160
SQLite 3.30.1
Threads / Copies: 1
Seconds < Lower Is Better
GCC 10 . 14.18 |===============================================================
HPC Challenge 1.5.0
Test / Class: G-HPL
GFLOPS > Higher Is Better
GCC 10 ....... 63.63 |=========================================================
GCC 10 - PGO . 63.48 |=========================================================
HPC Challenge 1.5.0
Test / Class: G-Ffte
GFLOPS > Higher Is Better
GCC 10 ....... 10.49 |========================================================
GCC 10 - PGO . 10.64 |=========================================================
HPC Challenge 1.5.0
Test / Class: EP-DGEMM
GFLOPS > Higher Is Better
GCC 10 ....... 32.93 |=========================================================
GCC 10 - PGO . 32.68 |=========================================================
HPC Challenge 1.5.0
Test / Class: G-Ptrans
GB/s > Higher Is Better
GCC 10 ....... 5.47737 |=======================================================
GCC 10 - PGO . 5.52421 |=======================================================
HPC Challenge 1.5.0
Test / Class: EP-STREAM Triad
GB/s > Higher Is Better
GCC 10 ....... 1.79750 |=======================================================
GCC 10 - PGO . 1.79549 |=======================================================
HPC Challenge 1.5.0
Test / Class: G-Random Access
GUP/s > Higher Is Better
GCC 10 ....... 0.14278 |=======================================================
GCC 10 - PGO . 0.14161 |=======================================================
HPC Challenge 1.5.0
Test / Class: Random Ring Latency
usecs < Lower Is Better
GCC 10 ....... 0.45863 |=======================================================
GCC 10 - PGO . 0.45680 |=======================================================
HPC Challenge 1.5.0
Test / Class: Random Ring Bandwidth
GB/s > Higher Is Better
GCC 10 ....... 3.40678 |=======================================================
GCC 10 - PGO . 3.36248 |======================================================
HPC Challenge 1.5.0
Test / Class: Max Ping Pong Bandwidth
MB/s > Higher Is Better
GCC 10 ....... 22977.00 |=====================================================
GCC 10 - PGO . 23248.09 |======================================================
miniFE 2.2
Problem Size: Small
CG Mflops > Higher Is Better
GCC 10 ....... 7740.10 |=======================================================
GCC 10 - PGO . 7728.25 |=======================================================
FFTW 3.3.6
Build: Stock - Size: 1D FFT Size 32
Mflops > Higher Is Better
GCC 10 . 10443 |===============================================================
FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 32
Mflops > Higher Is Better
GCC 10 . 10512 |===============================================================
FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 4096
Mflops > Higher Is Better
GCC 10 . 6687.3 |==============================================================
FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 32
Mflops > Higher Is Better
GCC 10 . 15396 |===============================================================
FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 32
Mflops > Higher Is Better
GCC 10 . 45404 |===============================================================
FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 4096
Mflops > Higher Is Better
GCC 10 . 22667 |===============================================================
Timed MrBayes Analysis 3.2.7
Primate Phylogeny Analysis
Seconds < Lower Is Better
GCC 10 . 70.01 |===============================================================
QMCPACK 3.8
Total Execution Time - Seconds < Lower Is Better
GCC 10 ....... 1878 |=========================================================
GCC 10 - PGO . 1896 |==========================================================
BYTE Unix Benchmark 3.6
Computational Test: Dhrystone 2
LPS > Higher Is Better
GCC 10 ....... 48055276.3 |====================================================
GCC 10 - PGO . 48511999.7 |====================================================
Crafty 25.2
Elapsed Time
Nodes Per Second > Higher Is Better
GCC 10 ....... 9234824 |=======================================================
GCC 10 - PGO . 9309942 |=======================================================
TSCP 1.81
AI Chess Performance
Nodes Per Second > Higher Is Better
GCC 10 ....... 1346651 |================================================
GCC 10 - PGO . 1533348 |=======================================================
MKL-DNN DNNL 1.1
Harness: Deconvolution Batch deconv_1d - Data Type: f32
ms < Lower Is Better
GCC 10 ....... 2.32419 |=======================================================
GCC 10 - PGO . 2.31720 |=======================================================
MKL-DNN DNNL 1.1
Harness: Convolution Batch conv_alexnet - Data Type: f32
ms < Lower Is Better
GCC 10 ....... 123.99 |========================================================
GCC 10 - PGO . 123.30 |========================================================
MKL-DNN DNNL 1.1
Harness: Recurrent Neural Network Training - Data Type: f32
ms < Lower Is Better
GCC 10 ....... 194.25 |========================================================
GCC 10 - PGO . 195.32 |========================================================
MKL-DNN DNNL 1.1
Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32
ms < Lower Is Better
GCC 10 ....... 52.33 |========================================================
GCC 10 - PGO . 53.41 |=========================================================
TTSIOD 3D Renderer 2.3b
Phong Rendering With Soft-Shadow Mapping
FPS > Higher Is Better
GCC 10 . 938.47 |==============================================================
ACES DGEMM 1.0
Sustained Floating-Point Rate
GFLOP/s > Higher Is Better
GCC 10 ....... 8.567282 |================================================
GCC 10 - PGO . 9.669028 |======================================================
Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
GCC 10 ....... 4684.30 |=====================================================
GCC 10 - PGO . 4848.09 |=======================================================
Stockfish 9
Total Time
Nodes Per Second > Higher Is Better
GCC 10 ....... 79359613 |======================================================
GCC 10 - PGO . 78501983 |=====================================================
Timed ImageMagick Compilation 6.9.0
Time To Compile
Seconds < Lower Is Better
GCC 10 . 16.47 |===============================================================
XZ Compression 5.2.4
Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9
Seconds < Lower Is Better
GCC 10 . 19.87 |===============================================================
FLAC Audio Encoding 1.3.2
WAV To FLAC
Seconds < Lower Is Better
GCC 10 . 7.719 |===============================================================
LAME MP3 Encoding 3.100
WAV To MP3
Seconds < Lower Is Better
GCC 10 . 7.297 |===============================================================
Radiance Benchmark 5.0
Test: Serial
Seconds < Lower Is Better
GCC 10 ....... 555.94 |========================================================
GCC 10 - PGO . 556.48 |========================================================
Radiance Benchmark 5.0
Test: SMP Parallel
Seconds < Lower Is Better
GCC 10 ....... 171.30 |========================================================
GCC 10 - PGO . 172.35 |========================================================
OpenSSL 1.1.1
RSA 4096-bit Performance
Signs Per Second > Higher Is Better
GCC 10 ....... 7180.6 |========================================================
GCC 10 - PGO . 7072.4 |=======================================================
ASKAP 2018-11-10
Test: tConvolve MT - Gridding
Million Grid Points Per Second > Higher Is Better
GCC 10 ....... 1947.24 |=======================================================
GCC 10 - PGO . 1947.23 |=======================================================
ASKAP 2018-11-10
Test: tConvolve MT - Degridding
Million Grid Points Per Second > Higher Is Better
GCC 10 ....... 3359.12 |=======================================================
GCC 10 - PGO . 3359.70 |=======================================================
ASKAP 2018-11-10
Test: tConvolve OpenMP - Gridding
Million Grid Points Per Second > Higher Is Better
GCC 10 ....... 5433.80 |=======================================================
GCC 10 - PGO . 5361.35 |======================================================
ASKAP 2018-11-10
Test: tConvolve OpenMP - Degridding
Million Grid Points Per Second > Higher Is Better
GCC 10 ....... 4096.25 |=======================================================
GCC 10 - PGO . 3995.25 |======================================================
GROMACS 2019.4
Water Benchmark
Ns Per Day > Higher Is Better
GCC 10 . 2.505 |===============================================================
PostgreSQL pgbench 12.0
Scaling: Buffer Test - Test: Normal Load - Mode: Read Only
TPS > Higher Is Better
GCC 10 . 669039.84 |===========================================================
PostgreSQL pgbench 12.0
Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Only
TPS > Higher Is Better
GCC 10 . 676349.71 |===========================================================
SQLite Speedtest 3.30
Timed Time - Size 1,000
Seconds < Lower Is Better
GCC 10 . 57.26 |===============================================================
Facebook RocksDB 6.3.6
Test: Random Fill
Op/s > Higher Is Better
GCC 10 ....... 938039 |========================================================
GCC 10 - PGO . 921081 |=======================================================
Facebook RocksDB 6.3.6
Test: Random Read
Op/s > Higher Is Better
GCC 10 ....... 145207827 |=====================================================
GCC 10 - PGO . 144768694 |=====================================================
Facebook RocksDB 6.3.6
Test: Sequential Fill
Op/s > Higher Is Better
GCC 10 ....... 1019862 |=======================================================
GCC 10 - PGO . 1020657 |=======================================================
Facebook RocksDB 6.3.6
Test: Random Fill Sync
Op/s > Higher Is Better
GCC 10 ....... 24588 |=========================================================
GCC 10 - PGO . 24588 |=========================================================
Facebook RocksDB 6.3.6
Test: Read While Writing
Op/s > Higher Is Better
GCC 10 ....... 4889956 |=======================================================
GCC 10 - PGO . 4868440 |=======================================================
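The paired rows above can be turned into a signed percentage difference between the two builds. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper names `parse_row` and `pgo_change_percent` are hypothetical, and the regular expression assumes the row layout used in this file (label, run of dots, value, bar).

```python
import re

# Assumed row layout, as seen in this file: "GCC 10 - PGO . 1533348 |=====".
ROW = re.compile(r"^(?P<label>.+?) \.+ (?P<value>[\d.]+) \|=+$")


def parse_row(line):
    """Return (label, value) for a result row, or None for any other line."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    return match.group("label"), float(match.group("value"))


def pgo_change_percent(base, pgo, lower_is_better):
    """Signed change of the PGO build relative to the base build, in percent.

    A positive number means the PGO build did better, whichever direction
    the test's "Lower Is Better" / "Higher Is Better" header indicates.
    """
    if lower_is_better:
        return (base - pgo) / base * 100.0
    return (pgo - base) / base * 100.0


if __name__ == "__main__":
    # TSCP 1.81 rows from this file (Nodes Per Second, higher is better).
    _, base = parse_row("GCC 10 ....... 1346651 |================")
    _, pgo = parse_row("GCC 10 - PGO . 1533348 |===================")
    print(f"TSCP: {pgo_change_percent(base, pgo, lower_is_better=False):+.2f}%")

    # QMCPACK 3.8 rows from this file (Seconds, lower is better).
    print(f"QMCPACK: {pgo_change_percent(1878, 1896, lower_is_better=True):+.2f}%")
```

The `lower_is_better` flag has to be read from each test's header line, since the rows themselves do not carry the direction. Tests that list only a GCC 10 row have no PGO counterpart and cannot be compared this way.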