GCC 10 AMD Threadripper 3960X PGO Optimization

AMD Ryzen Threadripper 3960X 24-Core tested with an MSI Creator TRX40 (MS-7C59) v1.0 (1.12N1 BIOS) and Gigabyte AMD Radeon 540/540X/550/550X / RX 540X/550/550X 2GB on Ubuntu 19.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1912220-PTS-GCC10AMD97
Run Management

Result Identifier: GCC 10
Date Run: December 21 2019
Test Duration: 4 Hours, 20 Minutes

Result Identifier: Sabrent Rocket 4.0 1TB
Date Run: December 22 2019
Test Duration: 5 Hours, 32 Minutes

Average Test Duration: 4 Hours, 56 Minutes

GCC 10 AMD Threadripper 3960X PGO Optimization Benchmarks

Processor: AMD Ryzen Threadripper 3960X 24-Core @ 3.80GHz (24 Cores / 48 Threads)
Motherboard: MSI Creator TRX40 (MS-7C59) v1.0 (1.12N1 BIOS)
Chipset: AMD Starship/Matisse
Memory: 32768MB
Disk: 1000GB Sabrent Rocket 4.0 1TB
Graphics: Gigabyte AMD Radeon 540/540X/550/550X / RX 540X/550/550X 2GB (1206/1750MHz)
Audio: AMD Baffin HDMI/DP
Monitor: ASUS VP28U
Network: Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Device 2723
OS: Ubuntu 19.10
Kernel: 5.4.0-nvme-hwmon (x86_64)
Desktop: GNOME Shell 3.34.1
Display Server: X Server 1.20.5
Display Driver: modesetting 1.20.5
OpenGL: 4.5 Mesa 19.2.1 (LLVM 9.0.0)
Compiler: GCC 10.0.0 20191208
File-System: ext4
Screen Resolution: 3840x2160

System Logs
- --disable-multilib --enable-checking=release
- NONE / errors=remount-ro,relatime,rw
- Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8301025
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling + tsx_async_abort: Not affected

GCC 10 vs. Sabrent Rocket 4.0 1TB Comparison (Phoronix Test Suite)

Relative difference between the two runs, per test:

FFTW, Float + SSE - 2D FFT Size 4096: 501.7%
XZ Compression, Compression Level 9: 3922.7%
FFTW, Float + SSE - 1D FFT Size 32: 365.2%
Stockfish, Total Time: 3149.1%
QMCPACK: 280%
FFTW, Stock - 1D FFT Size 32: 203.9%
Timed ImageMagick Compilation, Time To Compile: 184%
FFTW, Stock - 2D FFT Size 4096: 177.8%
FFTW, Stock - 2D FFT Size 32: 172.2%
PostgreSQL pgbench, Buffer Test - Heavy Contention - Read Only: 159.6%
PostgreSQL pgbench, Buffer Test - Normal Load - Read Only: 153.3%
SQLite Speedtest, Timed Time - Size 1,000: 127.4%
BYTE Unix Benchmark, Dhrystone 2: 122%
TTSIOD 3D Renderer, Phong Rendering With Soft-Shadow Mapping: 18599%
Zstd Compression, Compression Level 19: 15446.7%
FFTW, Float + SSE - 2D FFT Size 32: 1124.4%
ACES DGEMM, Sustained Floating-Point Rate: 10815.3%
LAME MP3 Encoding, WAV To MP3: 26.7%
GROMACS, Water Benchmark: 25.6%
Timed MrBayes Analysis, Primate Phylogeny Analysis: 18.9%
TSCP, AI Chess Performance: 13.2%
FLAC Audio Encoding, WAV To FLAC: 3.8%
MKL-DNN DNNL, Convolution Batch conv_alexnet - f32: 3.2%
Himeno Benchmark, Poisson Pressure Solver: 2.9%
SQLite, Threads / Copies: 1: 2.9%
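The percentages in the comparison above appear to express the gap between the two runs relative to the lower of the two results. A minimal Python sketch of that calculation follows; the function name `percent_spread` is hypothetical, and the formula is an assumption that happens to match the figures on this page, not a statement of how the Phoronix Test Suite computes them internally.

```python
def percent_spread(result_a: float, result_b: float) -> float:
    """Gap between two positive results, as a percentage of the smaller one.

    Assumed formula: (larger / smaller - 1) * 100.
    """
    if result_a <= 0 or result_b <= 0:
        raise ValueError("results must be positive")
    low, high = sorted((result_a, result_b))
    return (high / low - 1.0) * 100.0


# FFTW, Float + SSE, 2D FFT Size 4096 (Mflops), values as reported on this page.
print(f"{percent_spread(3767.3, 22667.0):.1f}%")

# Timed ImageMagick Compilation (seconds), values as reported on this page.
print(f"{percent_spread(5.799, 16.469):.1f}%")
```

With the two pairs of values taken from this page, the sketch lands on the same 501.7% and 184% that the comparison lists for those tests.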

FFTW

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.

FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 4096
Mflops, More Is Better
Sabrent Rocket 4.0 1TB: 3767.3
GCC 10: 22667.0
SE +/- 285.77, N = 3
Additional flags shown: -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math
1. (CC) gcc options: -pthread -lm
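Many results on this page carry an "SE +/- x, N = n" note, that is, a standard error over n trial runs. The sketch below shows one common way to obtain such a figure in Python: sample standard deviation divided by the square root of the run count. Whether the Phoronix Test Suite uses exactly this estimator is an assumption, and the sample values are hypothetical, since the page does not list individual trials.

```python
import math
import statistics


def standard_error(samples: list[float]) -> float:
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-trial throughputs (Mflops); not taken from this page.
trials = [22100.0, 22900.0, 23001.0]
mean = statistics.mean(trials)
se = standard_error(trials)
print(f"mean = {mean:.1f}, SE +/- {se:.2f}, N = {len(trials)}")
```

A small SE relative to the mean suggests the trials agreed closely; a large one is a hint that the result is noisy.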

XZ Compression

This test measures the time needed to compress a sample file (an Ubuntu file-system image) using XZ compression. Learn more via the OpenBenchmarking.org test page.

XZ Compression 5.2.4
Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9
Seconds, Fewer Is Better
Sabrent Rocket 4.0 1TB: 799.11
GCC 10: 19.87
SE +/- 0.02, N = 3
1. (CC) gcc options: -pthread -fvisibility=hidden
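Results on this page mix "More Is Better" units (Mflops, TPS) with "Fewer Is Better" units (seconds, ms), so a ratio between two runs has to be oriented before it can be read as a speedup. A minimal Python sketch, assuming a hypothetical helper `speedup` and using values as reported on this page:

```python
def speedup(baseline: float, candidate: float, more_is_better: bool) -> float:
    """How many times better `candidate` is than `baseline` (> 1 means better)."""
    if baseline <= 0 or candidate <= 0:
        raise ValueError("results must be positive")
    if more_is_better:
        return candidate / baseline  # throughput: a larger value wins
    return baseline / candidate      # elapsed time: a smaller value wins


# XZ Compression, seconds (Fewer Is Better).
print(round(speedup(799.11, 19.87, more_is_better=False), 1))

# FFTW Float + SSE 2D FFT Size 4096, Mflops (More Is Better).
print(round(speedup(3767.3, 22667.0, more_is_better=True), 1))
```

Which run is treated as the baseline is a choice made by the caller; swapping the arguments yields the reciprocal.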

FFTW

FFTW 3.3.6
Build: Float + SSE - Size: 1D FFT Size 32
Mflops, More Is Better
Sabrent Rocket 4.0 1TB: 3309.7
GCC 10: 15396.0
SE +/- 15.37, N = 3
Additional flags shown: -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math
1. (CC) gcc options: -pthread -lm

Stockfish

This is a test of Stockfish, an advanced C++11 chess benchmark that can scale up to 128 CPU cores. Learn more via the OpenBenchmarking.org test page.

Stockfish 9
Total Time
Nodes Per Second, More Is Better
Sabrent Rocket 4.0 1TB: 2442535
GCC 10: 79359613
SE +/- 526550.53, N = 3
1. (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++11 -pedantic -O3 -msse -msse3 -mpopcnt -flto

QMCPACK

QMCPACK is a modern high-performance open-source Quantum Monte Carlo (QMC) simulation code making use of MPI for this benchmark of the H2O example code. Learn more via the OpenBenchmarking.org test page.

QMCPACK 3.8
Total Execution Time - Seconds, Fewer Is Better
Sabrent Rocket 4.0 1TB: 7137.2
GCC 10: 1878.0
1. (CXX) g++ options: -fopenmp -fomit-frame-pointer -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -march=native -O3 -ffast-math -lm

FFTW

FFTW 3.3.6
Build: Stock - Size: 1D FFT Size 32
Mflops, More Is Better
Sabrent Rocket 4.0 1TB: 3436.6
GCC 10: 10443.0
SE +/- 16.77, N = 3
Additional flags shown: -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math
1. (CC) gcc options: -pthread -lm

Timed ImageMagick Compilation

This test times how long it takes to build ImageMagick. Learn more via the OpenBenchmarking.org test page.

Timed ImageMagick Compilation 6.9.0
Time To Compile
Seconds, Fewer Is Better
Sabrent Rocket 4.0 1TB: 5.799
GCC 10: 16.469
SE +/- 0.078, N = 3

FFTW

FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 4096
Mflops, More Is Better
Sabrent Rocket 4.0 1TB: 2407.0
GCC 10: 6687.3
SE +/- 7.80, N = 3
Additional flags shown: -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math
1. (CC) gcc options: -pthread -lm

FFTW 3.3.6
Build: Stock - Size: 2D FFT Size 32
Mflops, More Is Better
Sabrent Rocket 4.0 1TB: 3862.4
GCC 10: 10512.0
SE +/- 11.02, N = 3
Additional flags shown: -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math
1. (CC) gcc options: -pthread -lm

PostgreSQL pgbench

This is a simple benchmark of PostgreSQL using pgbench. Learn more via the OpenBenchmarking.org test page.

PostgreSQL pgbench 12.0
Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Only
TPS, More Is Better
Sabrent Rocket 4.0 1TB: 260493.71
GCC 10: 676349.71
SE +/- 4887.19, N = 3
Additional flags shown: -O2 -lpq
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -lpgcommon -lpgport -lpthread -lrt -lcrypt -ldl -lm

PostgreSQL pgbench 12.0
Scaling: Buffer Test - Test: Normal Load - Mode: Read Only
TPS, More Is Better
Sabrent Rocket 4.0 1TB: 264129.49
GCC 10: 669039.84
SE +/- 622.09, N = 3
Additional flags shown: -O2 -lpq
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -lpgcommon -lpgport -lpthread -lrt -lcrypt -ldl -lm

SQLite Speedtest

This is a benchmark of SQLite's speedtest1 benchmark program with an increased problem size of 1,000. Learn more via the OpenBenchmarking.org test page.

SQLite Speedtest 3.30
Timed Time - Size 1,000
Seconds, Fewer Is Better
Sabrent Rocket 4.0 1TB: 130.22
GCC 10: 57.26
SE +/- 0.13, N = 3
Additional flags shown: -O2
1. (CC) gcc options: -ldl -lz -lpthread

BYTE Unix Benchmark

This is a test of BYTE. Learn more via the OpenBenchmarking.org test page.

BYTE Unix Benchmark 3.6
Computational Test: Dhrystone 2
LPS, More Is Better
Sabrent Rocket 4.0 1TB: 21645648.2
GCC 10: 48055276.3
SE +/- 550382.53, N = 3

TTSIOD 3D Renderer

A portable GPL 3D software renderer that supports OpenMP and Intel Threading Building Blocks with many different rendering modes. This version does not use OpenGL but is entirely CPU/software based. Learn more via the OpenBenchmarking.org test page.

TTSIOD 3D Renderer 2.3b
Phong Rendering With Soft-Shadow Mapping
FPS, More Is Better
Sabrent Rocket 4.0 1TB: 5.01883
GCC 10: 938.47100
SE +/- 1.20634, N = 3
Additional flags shown: -O3 -fopenmp -fwhole-program
1. (CXX) g++ options: -fomit-frame-pointer -ffast-math -mtune=native -flto -msse -mrecip -mfpmath=sse -msse2 -mssse3 -lSDL -lstdc++

Zstd Compression

This test measures the time needed to compress a sample file (an Ubuntu file-system image) using Zstd compression. Learn more via the OpenBenchmarking.org test page.

Zstd Compression 1.3.4
Compressing ubuntu-16.04.3-server-i386.img, Compression Level 19
Seconds, Fewer Is Better
Sabrent Rocket 4.0 1TB: 1595.10
GCC 10: 10.26
SE +/- 0.03, N = 3
1. (CC) gcc options: -pthread -lz -llzma

FFTW

FFTW 3.3.6
Build: Float + SSE - Size: 2D FFT Size 32
Mflops, More Is Better
Sabrent Rocket 4.0 1TB: 3708.4
GCC 10: 45404.0
SE +/- 56.20, N = 3
Additional flags shown: -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math
1. (CC) gcc options: -pthread -lm

LAME MP3 Encoding

LAME is an MP3 encoder licensed under the LGPL. This test measures the time required to encode a WAV file to MP3 format. Learn more via the OpenBenchmarking.org test page.

LAME MP3 Encoding 3.100
WAV To MP3
Seconds, Fewer Is Better
Sabrent Rocket 4.0 1TB: 9.243
GCC 10: 7.297
SE +/- 0.002, N = 3
1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -lm

GROMACS

The GROMACS molecular dynamics package, tested on the CPU with the water_GMX50 data. Learn more via the OpenBenchmarking.org test page.

GROMACS 2019.4
Water Benchmark
Ns Per Day, More Is Better
Sabrent Rocket 4.0 1TB: 1.995
GCC 10: 2.505
SE +/- 0.002, N = 3
1. (CXX) g++ options: -mavx2 -mfma -std=c++11 -O3 -funroll-all-loops -pthread -lrt -lpthread -lm

Timed MrBayes Analysis

This test performs a Bayesian analysis of a set of primate genome sequences in order to estimate their phylogeny. Learn more via the OpenBenchmarking.org test page.

Timed MrBayes Analysis 3.2.7
Primate Phylogeny Analysis
Seconds, Fewer Is Better
Sabrent Rocket 4.0 1TB: 83.27
GCC 10: 70.01
SE +/- 0.25, N = 3
1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -madx -mabm -O3 -std=c99 -pedantic -lm

TSCP

This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.

TSCP 1.81
AI Chess Performance
Nodes Per Second, More Is Better
Sabrent Rocket 4.0 1TB: 1189585
GCC 10: 1346651
SE +/- 1472.68, N = 5
1. (CC) gcc options: -O3 -march=native

FLAC Audio Encoding

This test times how long it takes to encode a sample WAV file to FLAC format five times. Learn more via the OpenBenchmarking.org test page.

FLAC Audio Encoding 1.3.2
WAV To FLAC
Seconds, Fewer Is Better
Sabrent Rocket 4.0 1TB: 8.016
GCC 10: 7.719
SE +/- 0.009, N = 5
Additional flags shown: -O2
1. (CXX) g++ options: -fvisibility=hidden -logg -lm

MKL-DNN DNNL

This is a test of the Intel MKL-DNN (DNNL / Deep Neural Network Library) as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Learn more via the OpenBenchmarking.org test page.

MKL-DNN DNNL 1.1
Harness: Convolution Batch conv_alexnet - Data Type: f32
ms, Fewer Is Better
Sabrent Rocket 4.0 1TB: 120.15 (MIN: 119.49)
GCC 10: 123.99 (MIN: 121.92)
SE +/- 1.44, N = 3
Additional flags shown: -lm
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

Himeno Benchmark

The Himeno benchmark is a linear solver of pressure Poisson using a point-Jacobi method. Learn more via the OpenBenchmarking.org test page.

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS, More Is Better
Sabrent Rocket 4.0 1TB: 4820.05
GCC 10: 4684.30
SE +/- 55.99, N = 5
1. (CC) gcc options: -O3 -mavx2

SQLite

This is a simple benchmark of SQLite. At present this test profile just measures the time to perform a pre-defined number of insertions on an indexed database. Learn more via the OpenBenchmarking.org test page.

SQLite 3.30.1
Threads / Copies: 1
Seconds, Fewer Is Better
Sabrent Rocket 4.0 1TB: 14.60
GCC 10: 14.18
SE +/- 0.01, N = 3
Additional flags shown: -O2
1. (CC) gcc options: -lz -lm -ldl -lpthread

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test measures the RSA 4096-bit performance of OpenSSL. Learn more via the OpenBenchmarking.org test page.

OpenSSL 1.1.1
RSA 4096-bit Performance
Signs Per Second, More Is Better
Sabrent Rocket 4.0 1TB: 7060.1
GCC 10: 7180.6
SE +/- 21.70, N = 3
Additional flags shown: -O3 -lssl
1. (CC) gcc options: -pthread -m64 -lcrypto -ldl

Facebook RocksDB

This is a benchmark of Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.

Facebook RocksDB 6.3.6
Test: Sequential Fill
Op/s, More Is Better
Sabrent Rocket 4.0 1TB: 1035308
GCC 10: 1019862
SE +/- 3135.06, N = 3
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

Facebook RocksDB 6.3.6
Test: Random Read
Op/s, More Is Better
Sabrent Rocket 4.0 1TB: 143335505
GCC 10: 145207827
SE +/- 1800355.11, N = 3
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

Radiance Benchmark

This is a benchmark of NREL Radiance, a synthetic imaging system that is open-source and developed by the Lawrence Berkeley National Laboratory in California. Learn more via the OpenBenchmarking.org test page.

Radiance Benchmark 5.0
Test: SMP Parallel
Seconds, Fewer Is Better
Sabrent Rocket 4.0 1TB: 169.59
GCC 10: 171.30

Facebook RocksDB

Facebook RocksDB 6.3.6
Test: Read While Writing
Op/s, More Is Better
Sabrent Rocket 4.0 1TB: 4937048
GCC 10: 4889956
SE +/- 20082.88, N = 3
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

MKL-DNN DNNL

MKL-DNN DNNL 1.1
Harness: Deconvolution Batch deconv_1d - Data Type: f32
ms, Fewer Is Better
Sabrent Rocket 4.0 1TB: 2.30704 (MIN: 2.25)
GCC 10: 2.32419 (MIN: 2.26)
SE +/- 0.00388, N = 3
Additional flags shown: -lm
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

Crafty

This is a performance test of Crafty, an advanced open-source chess engine. Learn more via the OpenBenchmarking.org test page.

Crafty 25.2
Elapsed Time
Nodes Per Second, More Is Better
Sabrent Rocket 4.0 1TB: 9301788
GCC 10: 9234824
SE +/- 7954.66, N = 3
1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm

Facebook RocksDB

Facebook RocksDB 6.3.6
Test: Random Fill
Op/s, More Is Better
Sabrent Rocket 4.0 1TB: 932120
GCC 10: 938039
SE +/- 16043.44, N = 3
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

Facebook RocksDB 6.3.6
Test: Random Fill Sync
Op/s, More Is Better
Sabrent Rocket 4.0 1TB: 24460
GCC 10: 24588
SE +/- 19.92, N = 3
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

ASKAP

This is a CUDA benchmark of ATNF's ASKAP Benchmark, currently using the tConvolveCuda sub-test. Learn more via the OpenBenchmarking.org test page.

ASKAP 2018-11-10
Test: tConvolve MT - Gridding
Million Grid Points Per Second, More Is Better
Sabrent Rocket 4.0 1TB: 1937.58
GCC 10: 1947.24
SE +/- 3.33, N = 3
1. (CXX) g++ options: -lpthread

Radiance Benchmark

Radiance Benchmark 5.0
Test: Serial
Seconds, Fewer Is Better
Sabrent Rocket 4.0 1TB: 554.96
GCC 10: 555.94

miniFE

MiniFE Finite Element is an application for unstructured implicit finite element codes. Learn more via the OpenBenchmarking.org test page.

miniFE 2.2
Problem Size: Small
CG Mflops, More Is Better
Sabrent Rocket 4.0 1TB: 7737.78
GCC 10: 7740.10
SE +/- 11.29, N = 3
1. (CXX) g++ options: -O3 -fopenmp -pthread -lmpi_cxx -lmpi

ASKAP

ASKAP 2018-11-10
Test: tConvolve MT - Degridding
Million Grid Points Per Second, More Is Better
Sabrent Rocket 4.0 1TB: 3359.70
GCC 10: 3359.12
SE +/- 3.58, N = 3
1. (CXX) g++ options: -lpthread

MKL-DNN DNNL

MKL-DNN DNNL 1.1
Harness: Recurrent Neural Network Training - Data Type: f32
ms, Fewer Is Better
Sabrent Rocket 4.0 1TB: 194.23 (MIN: 193.18)
GCC 10: 194.25 (MIN: 192.53)
SE +/- 0.35, N = 3
Additional flags shown: -lm
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL 1.1
Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32
ms, Fewer Is Better
Sabrent Rocket 4.0 1TB: 52.33 (MIN: 51.67)
GCC 10: 52.33 (MIN: 51.48)
SE +/- 0.14, N = 3
Additional flags shown: -lm
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

ASKAP

ASKAP 2018-11-10
Test: tConvolve OpenMP - Degridding
Million Grid Points Per Second, More Is Better
Sabrent Rocket 4.0 1TB: 4096.25
GCC 10: 4096.25
SE +/- 0.00, N = 3
1. (CXX) g++ options: -lpthread

ASKAP 2018-11-10
Test: tConvolve OpenMP - Gridding
Million Grid Points Per Second, More Is Better
Sabrent Rocket 4.0 1TB: 5433.8
GCC 10: 5433.8
SE +/- 0.00, N = 3
1. (CXX) g++ options: -lpthread

BYTE Unix Benchmark

BYTE Unix Benchmark 3.6
Computational Test: Floating-Point Arithmetic
LPS, More Is Better
Sabrent Rocket 4.0 1TB: 1
GCC 10: 1

BYTE Unix Benchmark 3.6
Computational Test: Register Arithmetic
LPS, More Is Better
Sabrent Rocket 4.0 1TB: 1
GCC 10: 1

BYTE Unix Benchmark 3.6
Computational Test: Integer Arithmetic
LPS, More Is Better
Sabrent Rocket 4.0 1TB: 1
GCC 10: 1

HPC Challenge

HPC Challenge (HPCC) is a cluster-focused benchmark consisting of the HPL Linpack TPP benchmark, DGEMM, STREAM, PTRANS, RandomAccess, FFT, and communication bandwidth and latency. This HPC Challenge test profile attempts to ship with standard yet versatile configuration/input files though they can be modified. Learn more via the OpenBenchmarking.org test page.

HPC Challenge 1.5.0 (results reported for GCC 10 only)

Test / Class: Max Ping Pong Bandwidth (MB/s, More Is Better): 22977.00, SE +/- 313.29, N = 3
Test / Class: Random Ring Bandwidth (GB/s, More Is Better): 3.40678, SE +/- 0.01038, N = 3
Test / Class: Random Ring Latency (usecs, Fewer Is Better): 0.45863, SE +/- 0.00067, N = 3
Test / Class: G-Random Access (GUP/s, More Is Better): 0.14278, SE +/- 0.00039, N = 3
Test / Class: EP-STREAM Triad (GB/s, More Is Better): 1.79750, SE +/- 0.00127, N = 3
Test / Class: G-Ptrans (GB/s, More Is Better): 5.47737, SE +/- 0.00581, N = 3
Test / Class: EP-DGEMM (GFLOPS, More Is Better): 32.93, SE +/- 0.38, N = 3
Test / Class: G-Ffte (GFLOP/s, More Is Better): 10.49, SE +/- 0.05, N = 3
Test / Class: G-HPL (GFLOPS, More Is Better): 63.63, SE +/- 0.23, N = 3

1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

ACES DGEMM

This is a multi-threaded DGEMM benchmark. Learn more via the OpenBenchmarking.org test page.

ACES DGEMM 1.0
Sustained Floating-Point Rate
GFLOP/s, More Is Better
Sabrent Rocket 4.0 1TB: 0.078489
GCC 10: 8.567282
SE +/- 0.158518, N = 12
1. (CC) gcc options: -O3 -march=native -fopenmp