Threadripper 3960X GCC 10 LTO Testing

AMD Ryzen Threadripper 3960X GCC 10 LTO benchmarking by Michael Larabel for a future article.

-O3 -march=native -flto

Environment Notes: CXXFLAGS="-O3 -march=native -flto" CFLAGS="-O3 -march=native -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Disk Notes: NONE / errors=remount-ro,relatime,rw
Processor Notes: Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8301025
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling + tsx_async_abort: Not affected

-O3 -march=native -flto -fwhole-program

Environment Notes: CXXFLAGS="-O3 -march=native -flto -fwhole-program" CFLAGS="-O3 -march=native -flto -fwhole-program"
Compiler Notes: --disable-multilib --enable-checking=release
Disk Notes: NONE / errors=remount-ro,relatime,rw
Processor Notes: Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8301025
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling + tsx_async_abort: Not affected

-O3 -march=native

Processor: AMD Ryzen Threadripper 3960X 24-Core @ 3.80GHz (24 Cores / 48 Threads), Motherboard: MSI Creator TRX40 (MS-7C59) v1.0 (1.12N1 BIOS), Chipset: AMD Starship/Matisse, Memory: 32768MB, Disk: 1000GB Sabrent Rocket 4.0 1TB, Graphics: Gigabyte AMD Radeon 540/540X/550/550X / RX 540X/550/550X 2GB (1206/1750MHz), Audio: AMD Baffin HDMI/DP, Monitor: ASUS VP28U, Network: Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Device 2723

OS: Ubuntu 19.10, Kernel: 5.4.0-nvme-hwmon (x86_64), Desktop: GNOME Shell 3.34.1, Display Server: X Server 1.20.5, Display Driver: modesetting 1.20.5, OpenGL: 4.5 Mesa 19.2.1 (LLVM 9.0.0), Compiler: GCC 10.0.0 20191208, File-System: ext4, Screen Resolution: 3840x2160

Environment Notes: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
Compiler Notes: --disable-multilib --enable-checking=release
Disk Notes: NONE / errors=remount-ro,relatime,rw
Processor Notes: Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8301025
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling + tsx_async_abort: Not affected

HPC Challenge

HPC Challenge (HPCC) is a cluster-focused benchmark consisting of the HPL Linpack TPP benchmark, DGEMM, STREAM, PTRANS, RandomAccess, FFT, and communication bandwidth and latency. This HPC Challenge test profile attempts to ship with standard yet versatile configuration/input files though they can be modified. Learn more via the OpenBenchmarking.org test page.

QMCPACK

QMCPACK is a modern, high-performance, open-source Quantum Monte Carlo (QMC) simulation code. This benchmark runs the H2O example code using MPI. Learn more via the OpenBenchmarking.org test page.

FFTW

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
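
For reference, the transform that FFTW computes can be written out directly. The following is a minimal sketch of a naive O(n^2) one-dimensional DFT in Python, not FFTW's algorithm (FFTW uses fast O(n log n) methods); the function name `dft` is an illustrative assumption.

```python
import cmath


def dft(x):
    """Naive 1D DFT: X[k] = sum_j x[j] * exp(-2*pi*i*k*j/n).

    Illustrative only; FFTW computes the same transform with fast algorithms.
    """
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    # A unit impulse transforms to a flat spectrum.
    print(dft([1.0, 0.0, 0.0, 0.0]))
```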

Radiance Benchmark

This is a benchmark of NREL Radiance, a synthetic imaging system that is open-source and developed by the Lawrence Berkeley National Laboratory in California. Learn more via the OpenBenchmarking.org test page.

Timed MrBayes Analysis

This test performs a Bayesian analysis of a set of primate genome sequences in order to estimate their phylogeny. Learn more via the OpenBenchmarking.org test page.

MKL-DNN DNNL

This is a test of Intel MKL-DNN (DNNL, the Deep Neural Network Library), an Intel-optimized library for deep neural networks, using its built-in benchdnn functionality. The result is the total perf time reported. Learn more via the OpenBenchmarking.org test page.

BYTE Unix Benchmark

This is a test of the BYTE Unix Benchmark. Learn more via the OpenBenchmarking.org test page.

PostgreSQL pgbench

This is a simple benchmark of PostgreSQL using pgbench. Learn more via the OpenBenchmarking.org test page.

ASKAP

This is a CUDA benchmark of ATNF's ASKAP Benchmark, currently using the tConvolveCuda sub-test. Learn more via the OpenBenchmarking.org test page.

GROMACS

This test runs the GROMACS molecular dynamics package on the CPU with the water_GMX50 data. Learn more via the OpenBenchmarking.org test page.

Himeno Benchmark

The Himeno benchmark is a linear solver for the pressure Poisson equation using a point-Jacobi method. Learn more via the OpenBenchmarking.org test page.
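
To illustrate the point-Jacobi idea, here is a minimal Python sketch for a one-dimensional Poisson problem, -u'' = f with u = 0 at both ends. This is a simplification and an assumption for illustration: the Himeno benchmark itself works on a 3D grid, and the function name `jacobi_poisson_1d` is hypothetical.

```python
def jacobi_poisson_1d(f, h, iterations):
    """Point-Jacobi sweeps for -u'' = f on a uniform grid with spacing h.

    Boundary values are held at zero. Each sweep updates every interior
    point using only values from the previous sweep.
    """
    n = len(f)
    u = [0.0] * n
    for _ in range(iterations):
        new = u[:]  # Jacobi: read the old iterate, write a fresh copy
        for i in range(1, n - 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f[i])
        u = new
    return u


if __name__ == "__main__":
    points = 11
    h = 1.0 / (points - 1)
    # With f = 2 the exact solution is u(x) = x * (1 - x).
    u = jacobi_poisson_1d([2.0] * points, h, 2000)
    print(u[points // 2])
```

Because every update reads only the previous iterate, the inner loop has no dependencies between grid points, which is what makes the method simple to vectorize and parallelize.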

ACES DGEMM

This is a multi-threaded DGEMM benchmark. Learn more via the OpenBenchmarking.org test page.
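
DGEMM is the BLAS double-precision general matrix multiply, C = alpha * A * B + beta * C. The sketch below spells out that operation in plain, single-threaded Python for small lists of lists; it is an illustration of the arithmetic only, not the benchmark's implementation, and the function name `dgemm` is chosen here for convenience.

```python
def dgemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for row-major lists of lists."""
    m, k, n = len(a), len(b), len(b[0])
    return [
        [
            alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
            for j in range(n)
        ]
        for i in range(m)
    ]


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    c = [[1.0, 1.0], [1.0, 1.0]]
    print(dgemm(2.0, a, b, 3.0, c))
```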

Facebook RocksDB

This is a benchmark of Facebook's RocksDB, an embeddable persistent key-value store for fast storage that is based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.

SQLite Speedtest

This is a benchmark of SQLite's speedtest1 benchmark program with an increased problem size of 1,000. Learn more via the OpenBenchmarking.org test page.

Timed ImageMagick Compilation

This test times how long it takes to build ImageMagick. Learn more via the OpenBenchmarking.org test page.

Stockfish

This is a test of Stockfish, an advanced C++11 chess engine whose built-in benchmark can scale up to 128 CPU cores. Learn more via the OpenBenchmarking.org test page.

NGINX Benchmark

This is a test of ab, which is the Apache Benchmark program running against nginx. This test profile measures how many requests per second a given system can sustain when carrying out 2,000,000 requests with 500 requests being carried out concurrently. Learn more via the OpenBenchmarking.org test page.

miniFE

miniFE is a finite element proxy application for unstructured implicit finite element codes. Learn more via the OpenBenchmarking.org test page.

Crafty

This is a performance test of Crafty, an advanced open-source chess engine. Learn more via the OpenBenchmarking.org test page.

TTSIOD 3D Renderer

A portable GPL 3D software renderer that supports OpenMP and Intel Threading Building Blocks with many different rendering modes. This version does not use OpenGL but is entirely CPU/software based. Learn more via the OpenBenchmarking.org test page.

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test measures the RSA 4096-bit performance of OpenSSL. Learn more via the OpenBenchmarking.org test page.

XZ Compression

This test measures the time needed to compress a sample file (an Ubuntu file-system image) using XZ compression. Learn more via the OpenBenchmarking.org test page.

SQLite

This is a simple benchmark of SQLite. At present this test profile just measures the time to perform a pre-defined number of insertions on an indexed database. Learn more via the OpenBenchmarking.org test page.
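
The kind of measurement described above can be sketched with Python's standard `sqlite3` module. This is a minimal illustration under assumptions, not the test profile's actual workload: the table name, index name, row contents, row count, and the use of an in-memory database are all hypothetical.

```python
import sqlite3
import time


def timed_inserts(n_rows):
    """Insert n_rows into an indexed table and return (row_count, seconds)."""
    conn = sqlite3.connect(":memory:")  # assumption: in-memory, not on disk
    conn.execute("CREATE TABLE items (id INTEGER, label TEXT)")
    conn.execute("CREATE INDEX idx_items_id ON items (id)")

    start = time.perf_counter()
    with conn:  # one transaction, committed on exit
        conn.executemany(
            "INSERT INTO items (id, label) VALUES (?, ?)",
            ((i, f"row-{i}") for i in range(n_rows)),
        )
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    conn.close()
    return count, elapsed


if __name__ == "__main__":
    count, seconds = timed_inserts(10000)
    print(f"{count} rows inserted in {seconds:.4f} s")
```

An on-disk database with one transaction per insert would stress storage and fsync behavior far more than this in-memory variant, so timings from this sketch are not comparable to the benchmark's results.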

FLAC Audio Encoding

This test times how long it takes to encode a sample WAV file to FLAC format five times. Learn more via the OpenBenchmarking.org test page.

Zstd Compression

This test measures the time needed to compress a sample file (an Ubuntu file-system image) using Zstd compression. Learn more via the OpenBenchmarking.org test page.

LAME MP3 Encoding

LAME is an MP3 encoder licensed under the LGPL. This test measures the time required to encode a WAV file to MP3 format. Learn more via the OpenBenchmarking.org test page.

TSCP

This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.

53 Results Shown

HPC Challenge
QMCPACK
FFTW
Radiance Benchmark
Timed MrBayes Analysis
FFTW
MKL-DNN DNNL
BYTE Unix Benchmark
PostgreSQL pgbench:
  Buffer Test - Normal Load - Read Only
  Buffer Test - Heavy Contention - Read Only
ASKAP:
  tConvolve MT - Degridding
  tConvolve MT - Gridding
GROMACS
Himeno Benchmark
ACES DGEMM
Facebook RocksDB:
  Rand Fill Sync
  Rand Fill
  Rand Read
  Read While Writing
Radiance Benchmark
SQLite Speedtest
Timed ImageMagick Compilation
Stockfish
Facebook RocksDB
NGINX Benchmark
MKL-DNN DNNL
miniFE
MKL-DNN DNNL
Crafty
TTSIOD 3D Renderer
OpenSSL
XZ Compression
MKL-DNN DNNL
SQLite
FLAC Audio Encoding
Zstd Compression
ASKAP:
  tConvolve OpenMP - Degridding
  tConvolve OpenMP - Gridding
LAME MP3 Encoding
FFTW:
  Float + SSE - 2D FFT Size 32
  Stock - 1D FFT Size 32
  Stock - 2D FFT Size 32
  Float + SSE - 1D FFT Size 32
TSCP
HPC Challenge:
  Max Ping Pong Bandwidth
  Rand Ring Bandwidth
  Rand Ring Latency
  G-Rand Access
  EP-STREAM Triad
  G-Ptrans
  EP-DGEMM
  G-Ffte
  G-Ffte

-O3 -march=native -flto

Testing initiated at 20 December 2019 20:56 by user pts.

-O3 -march=native -flto -fwhole-program

Testing initiated at 21 December 2019 06:49 by user pts.

-O3 -march=native

Testing initiated at 21 December 2019 11:30 by user pts.
