Threadripper 3960X GCC 10 LTO Testing

AMD Ryzen Threadripper 3960X GCC 10 LTO benchmarking by Michael Larabel for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1912225-PTS-THREADRI29

Run Management

-O3 -march=native: December 21 2019, test duration 4 Hours, 8 Minutes
-O3 -march=native -flto: December 20 2019, test duration 3 Hours, 59 Minutes
-O3 -march=native -flto -fwhole-program: December 21 2019, test duration 3 Hours, 56 Minutes
Average test duration: 4 Hours, 1 Minute
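
The 4 Hours, 1 Minute figure is consistent with the arithmetic mean of the three run durations. A minimal Python sketch of that check follows; the dictionary layout and function names are illustrative assumptions, not part of the Phoronix Test Suite.

```python
# Test durations per run as (hours, minutes), copied from the run list above.
durations = {
    "-O3 -march=native": (4, 8),
    "-O3 -march=native -flto": (3, 59),
    "-O3 -march=native -flto -fwhole-program": (3, 56),
}


def to_minutes(hours, minutes):
    """Convert an (hours, minutes) pair to total minutes."""
    return hours * 60 + minutes


def mean_duration(runs):
    """Return the mean duration as (hours, minutes), rounded to the nearest minute."""
    total = sum(to_minutes(h, m) for h, m in runs.values())
    return divmod(round(total / len(runs)), 60)


average = mean_duration(durations)
print(average)
```

The three runs total 723 minutes, so the mean is 241 minutes.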


Processor: AMD Ryzen Threadripper 3960X 24-Core @ 3.80GHz (24 Cores / 48 Threads)
Motherboard: MSI Creator TRX40 (MS-7C59) v1.0 (1.12N1 BIOS)
Chipset: AMD Starship/Matisse
Memory: 32768MB
Disk: 1000GB Sabrent Rocket 4.0 1TB
Graphics: Gigabyte AMD Radeon 540/540X/550/550X / RX 540X/550/550X 2GB (1206/1750MHz)
Audio: AMD Baffin HDMI/DP
Monitor: ASUS VP28U
Network: Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Device 2723
OS: Ubuntu 19.10
Kernel: 5.4.0-nvme-hwmon (x86_64)
Desktop: GNOME Shell 3.34.1
Display Server: X Server 1.20.5
Display Driver: modesetting 1.20.5
OpenGL: 4.5 Mesa 19.2.1 (LLVM 9.0.0)
Compiler: GCC 10.0.0 20191208
File-System: ext4
Screen Resolution: 3840x2160

System Logs

- -O3 -march=native: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
- -O3 -march=native -flto: CXXFLAGS="-O3 -march=native -flto" CFLAGS="-O3 -march=native -flto"
- -O3 -march=native -flto -fwhole-program: CXXFLAGS="-O3 -march=native -flto -fwhole-program" CFLAGS="-O3 -march=native -flto -fwhole-program"
- --disable-multilib --enable-checking=release
- NONE / errors=remount-ro,relatime,rw
- Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8301025
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling + tsx_async_abort: Not affected
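
The system logs show that the three configurations differ only in the values of CFLAGS and CXXFLAGS. As a rough sketch, the per-configuration environments could be built in Python as below; `build_env` and `CONFIGS` are hypothetical names, and how the flags were actually exported for the original runs is not stated in the result file.

```python
import os

# Flag sets copied from the system logs above.
CONFIGS = [
    "-O3 -march=native",
    "-O3 -march=native -flto",
    "-O3 -march=native -flto -fwhole-program",
]


def build_env(flags, base=None):
    """Return a copy of the environment with CFLAGS and CXXFLAGS set to flags."""
    env = dict(os.environ) if base is None else dict(base)
    env["CFLAGS"] = flags
    env["CXXFLAGS"] = flags
    return env


# An empty base keeps the printed output independent of the local machine.
envs = {flags: build_env(flags, base={}) for flags in CONFIGS}
for flags, env in envs.items():
    print(flags, "->", env)
```

Calling `build_env(flags)` without a base copies the current environment, and the result could then be passed as the `env` argument of `subprocess.run` when launching `phoronix-test-suite benchmark 1912225-PTS-THREADRI29`.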

Logarithmic Result Overview (chart comparing -O3 -march=native, -O3 -march=native -flto, and -O3 -march=native -flto -fwhole-program) for: Timed ImageMagick Compilation, BYTE Unix Benchmark, Timed MrBayes Analysis, TSCP, Himeno Benchmark, Crafty, FFTW, Stockfish, SQLite Speedtest, ACES DGEMM, Zstd Compression, Radiance Benchmark, LAME MP3 Encoding, NGINX Benchmark, HPC Challenge, XZ Compression, MKL-DNN DNNL, SQLite, Facebook RocksDB, TTSIOD 3D Renderer, FLAC Audio Encoding, ASKAP, miniFE, GROMACS, QMCPACK
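
An overview across tests with different units needs each result normalized against a baseline, with the ratio inverted for "Fewer Is Better" tests, before any averaging. The sketch below does this with a geometric mean for three tests taken from this result file (TSCP, XZ Compression, Himeno Benchmark), comparing -flto against -O3 -march=native. The choice of these three tests and all names in the code are assumptions for illustration; this is not how OpenBenchmarking.org computes its own overview.

```python
import math

# (baseline -O3 -march=native, -flto result, higher_is_better)
# Values are taken from the per-test results in this file.
SAMPLE = {
    "TSCP": (1350615, 1422472, True),
    "XZ Compression": (20.02, 19.87, False),
    "Himeno Benchmark": (4893.66, 4766.40, True),
}


def speedup(baseline, candidate, higher_is_better):
    """Ratio above 1.0 means the candidate is better than the baseline."""
    if higher_is_better:
        return candidate / baseline
    return baseline / candidate


def geometric_mean(values):
    """Geometric mean computed through logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


ratios = [speedup(*row) for row in SAMPLE.values()]
overall = geometric_mean(ratios)
print(f"{overall:.4f}")
```

A geometric mean is used because the ratios are multiplicative, so a gain in one test and an equal-sized loss in another cancel out.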

[Table: Threadripper 3960X GCC 10 LTO Testing, all results by configuration. Each value is repeated in the per-test results below.]

TSCP

This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.

TSCP 1.81, AI Chess Performance (Nodes Per Second, More Is Better)
-O3 -march=native: 1350615 (SE +/- 1620.83, N = 5)
-O3 -march=native -flto -fwhole-program: 1418074 (SE +/- 1462.93, N = 5)
-O3 -march=native -flto: 1422472 (SE +/- 1795.91, N = 5)
1. (CC) gcc options: -O3 -march=native (per-run additions: -flto; -flto -fwhole-program)
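
To put the TSCP numbers in relative terms, the sketch below computes the percentage change of each LTO configuration against the -O3 -march=native baseline. The function name is illustrative.

```python
def percent_change(baseline, value):
    """Percentage change of value relative to baseline (positive means higher)."""
    return (value / baseline - 1.0) * 100.0


# TSCP nodes per second from the result above.
baseline = 1350615
flto = percent_change(baseline, 1422472)
whole_program = percent_change(baseline, 1418074)

print(f"-flto: {flto:+.1f}%")
print(f"-flto -fwhole-program: {whole_program:+.1f}%")
```

For a "More Is Better" test a positive change is an improvement; for "Fewer Is Better" tests the sign has to be read the other way.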

Crafty

This is a performance test of Crafty, an advanced open-source chess engine. Learn more via the OpenBenchmarking.org test page.

Crafty 25.2, Elapsed Time (Nodes Per Second, More Is Better)
-O3 -march=native -flto -fwhole-program: 8978346 (SE +/- 5847.06, N = 3)
-O3 -march=native -flto: 9209287 (SE +/- 13239.35, N = 3)
-O3 -march=native: 9240651 (SE +/- 19679.93, N = 3)
1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm

XZ Compression

This test measures the time needed to compress a sample file (an Ubuntu file-system image) using XZ compression. Learn more via the OpenBenchmarking.org test page.

XZ Compression 5.2.4, Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9 (Seconds, Fewer Is Better)
-O3 -march=native: 20.02 (SE +/- 0.03, N = 3)
-O3 -march=native -flto: 19.87 (SE +/- 0.06, N = 3)
-O3 -march=native -flto -fwhole-program: 19.83 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -pthread -fvisibility=hidden -O3 -march=native (per-run additions: -flto; -flto -fwhole-program)
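
Differences this small are easier to judge against the reported standard errors. As a rough heuristic only (not a formal significance test, and not something the result file itself reports), the sketch below expresses the gap between two XZ results in units of their combined standard error.

```python
import math


def difference_in_se_units(a, se_a, b, se_b):
    """Absolute difference of two means divided by their combined standard error."""
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(a - b) / combined


# XZ Compression times in seconds with their standard errors, from the result above.
flto_ratio = difference_in_se_units(20.02, 0.03, 19.87, 0.06)
whole_ratio = difference_in_se_units(20.02, 0.03, 19.83, 0.03)

print(f"native vs -flto: {flto_ratio:.2f}")
print(f"native vs -flto -fwhole-program: {whole_ratio:.2f}")
```

With only N = 3 runs per configuration, a ratio of around 2 is weak evidence, so these values are best read as a screening aid.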

FLAC Audio Encoding

This test times how long it takes to encode a sample WAV file to FLAC format five times. Learn more via the OpenBenchmarking.org test page.

FLAC Audio Encoding 1.3.2, WAV To FLAC (Seconds, Fewer Is Better)
-O3 -march=native: 8.079 (SE +/- 0.005, N = 5)
-O3 -march=native -flto -fwhole-program: 8.073 (SE +/- 0.004, N = 5)
-O3 -march=native -flto: 8.044 (SE +/- 0.006, N = 5)
1. (CXX) g++ options: -O3 -march=native -fvisibility=hidden -logg -lm (per-run additions: -flto; -flto -fwhole-program)

LAME MP3 Encoding

LAME is an MP3 encoder licensed under the LGPL. This test measures the time required to encode a WAV file to MP3 format. Learn more via the OpenBenchmarking.org test page.

LAME MP3 Encoding 3.100, WAV To MP3 (Seconds, Fewer Is Better)
-O3 -march=native: 6.710 (SE +/- 0.016, N = 3)
-O3 -march=native -flto -fwhole-program: 6.697 (SE +/- 0.007, N = 3)
-O3 -march=native -flto: 6.622 (SE +/- 0.067, N = 3)
1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -march=native -lm (per-run additions: -flto; -flto -fwhole-program)

FFTW

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.

FFTW 3.3.6, Build: Stock - Size: 1D FFT Size 32 (Mflops, More Is Better)
-O3 -march=native -flto -fwhole-program: 11069 (SE +/- 3.21, N = 3)
-O3 -march=native -flto: 11201 (SE +/- 55.87, N = 3)
-O3 -march=native: 13149 (SE +/- 27.74, N = 3)

FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 32 (Mflops, More Is Better)
-O3 -march=native -flto -fwhole-program: 12595 (SE +/- 72.89, N = 3)
-O3 -march=native -flto: 12677 (SE +/- 14.19, N = 3)
-O3 -march=native: 13217 (SE +/- 6.49, N = 3)

FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 4096 (Mflops, More Is Better)
-O3 -march=native: 8209.1 (SE +/- 138.82, N = 3)
-O3 -march=native -flto -fwhole-program: 8540.7 (SE +/- 111.43, N = 3)
-O3 -march=native -flto: 8969.7 (SE +/- 155.20, N = 3)

FFTW 3.3.6, Build: Float + SSE - Size: 1D FFT Size 32 (Mflops, More Is Better)
-O3 -march=native -flto -fwhole-program: 15271 (SE +/- 111.55, N = 3)
-O3 -march=native -flto: 15488 (SE +/- 25.21, N = 3)
-O3 -march=native: 15513 (SE +/- 84.89, N = 3)

FFTW 3.3.6, Build: Float + SSE - Size: 2D FFT Size 32 (Mflops, More Is Better)
-O3 -march=native -flto -fwhole-program: 45734 (SE +/- 49.75, N = 3)
-O3 -march=native: 45957 (SE +/- 23.68, N = 3)
-O3 -march=native -flto: 46350 (SE +/- 82.72, N = 3)

FFTW 3.3.6, Build: Float + SSE - Size: 2D FFT Size 4096 (Mflops, More Is Better)
-O3 -march=native: 23281 (SE +/- 190.93, N = 3)
-O3 -march=native -flto -fwhole-program: 24289 (SE +/- 143.81, N = 3)
-O3 -march=native -flto: 24505 (SE +/- 153.78, N = 3)

All FFTW results: 1. (CC) gcc options: -pthread -O3 -march=native -lm (per-run additions: -flto; -flto -fwhole-program)

Timed MrBayes Analysis

This test performs a Bayesian analysis of a set of primate genome sequences in order to estimate their phylogeny. Learn more via the OpenBenchmarking.org test page.

Timed MrBayes Analysis 3.2.7, Primate Phylogeny Analysis (Seconds, Fewer Is Better)
-O3 -march=native: 71.83 (SE +/- 1.38, N = 13)
-O3 -march=native -flto: 69.13 (SE +/- 0.87, N = 4)
-O3 -march=native -flto -fwhole-program: 68.02 (SE +/- 0.14, N = 3)
1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -madx -mabm -O3 -std=c99 -pedantic -march=native -lm (per-run additions: -flto; -flto -fwhole-program)
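
The MrBayes runs used different sample counts (N = 13, 4, and 3), so their standard errors are not directly comparable as measures of run-to-run spread. Assuming SE here is the usual standard error of the mean, the sample standard deviation can be recovered as SE × sqrt(N). The sketch below parses an "SE +/- x, N = y" annotation and applies that relation; the regular expression and function names are illustrative.

```python
import math
import re

# Matches annotations of the form "SE +/- 1.38, N = 13".
SE_PATTERN = re.compile(r"SE \+/- ([0-9.]+), N = (\d+)")


def parse_se(text):
    """Extract (standard_error, sample_count) from an annotation string."""
    match = SE_PATTERN.search(text)
    if match is None:
        raise ValueError(f"no SE annotation found in {text!r}")
    return float(match.group(1)), int(match.group(2))


def sample_std_dev(se, n):
    """Recover the sample standard deviation, assuming SE = SD / sqrt(N)."""
    return se * math.sqrt(n)


se, n = parse_se("SE +/- 1.38, N = 13")
sd = sample_std_dev(se, n)
print(f"SE={se}, N={n}, SD={sd:.2f}")
```

Under that assumption, the -O3 -march=native run has a standard deviation of roughly 5 seconds on a 71.83 second mean.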

Himeno Benchmark

The Himeno benchmark is a linear solver for the pressure Poisson equation using a point-Jacobi method. Learn more via the OpenBenchmarking.org test page.

Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, More Is Better)
-O3 -march=native -flto -fwhole-program: 4684.70 (SE +/- 63.96, N = 4)
-O3 -march=native -flto: 4766.40 (SE +/- 22.94, N = 3)
-O3 -march=native: 4893.66 (SE +/- 32.73, N = 3)
1. (CC) gcc options: -O3 -march=native -mavx2 (per-run additions: -flto; -flto -fwhole-program)

HPC Challenge

HPC Challenge (HPCC) is a cluster-focused benchmark consisting of the HPL Linpack TPP benchmark, DGEMM, STREAM, PTRANS, RandomAccess, FFT, and communication bandwidth and latency. This HPC Challenge test profile attempts to ship with standard yet versatile configuration/input files though they can be modified. Learn more via the OpenBenchmarking.org test page.

HPC Challenge 1.5.0, Test / Class: G-HPL (GFLOPS, More Is Better)
-O3 -march=native: 63.54 (SE +/- 0.05, N = 3)
-O3 -march=native -flto: 63.68 (SE +/- 0.14, N = 3)
-O3 -march=native -flto -fwhole-program: 63.72 (SE +/- 0.17, N = 3)

HPC Challenge 1.5.0, Test / Class: G-Ffte (GFLOPS, More Is Better)
-O3 -march=native: 15.20 (SE +/- 0.07, N = 3)
-O3 -march=native -flto: 15.22 (SE +/- 0.45, N = 3)
-O3 -march=native -flto -fwhole-program: 15.98 (SE +/- 0.37, N = 3)

Both results: 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops (per-run additions: -flto; -flto -fwhole-program) 2. ATLAS + Open MPI 3.1.3

HPC Challenge 1.5.0, Test / Class: EP-DGEMM (GFLOPS, More Is Better)
-O3 -march=native: 32.50 (SE +/- 0.16, N = 3)
-O3 -march=native -flto: 32.76 (SE +/- 0.64, N = 3)
-O3 -march=native -flto -fwhole-program: 33.60 (SE +/- 0.27, N = 3)

HPC Challenge 1.5.0, Test / Class: G-Ptrans (GB/s, More Is Better)
-O3 -march=native -flto: 5.78791 (SE +/- 0.01376, N = 3)
-O3 -march=native: 5.79663 (SE +/- 0.02501, N = 3)
-O3 -march=native -flto -fwhole-program: 5.80751 (SE +/- 0.01388, N = 3)

HPC Challenge 1.5.0, Test / Class: EP-STREAM Triad (GB/s, More Is Better)
-O3 -march=native: 1.68426 (SE +/- 0.00200, N = 3)
-O3 -march=native -flto: 1.68874 (SE +/- 0.00593, N = 3)
-O3 -march=native -flto -fwhole-program: 1.69770 (SE +/- 0.01665, N = 3)

HPC Challenge 1.5.0, Test / Class: G-Random Access (GUP/s, More Is Better)
-O3 -march=native: 0.16431 (SE +/- 0.00098, N = 3)
-O3 -march=native -flto -fwhole-program: 0.16679 (SE +/- 0.00041, N = 3)
-O3 -march=native -flto: 0.16722 (SE +/- 0.00027, N = 3)

HPC Challenge 1.5.0, Test / Class: Random Ring Latency (usecs, Fewer Is Better)
-O3 -march=native -flto -fwhole-program: 0.46971 (SE +/- 0.01816, N = 3)
-O3 -march=native: 0.45479 (SE +/- 0.00087, N = 3)
-O3 -march=native -flto: 0.45234 (SE +/- 0.00082, N = 3)

All five results: 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops (per-run additions: -flto; -flto -fwhole-program) 2. ATLAS + Open MPI 3.1.3

MKL-DNN DNNL

This is a test of the Intel MKL-DNN (DNNL / Deep Neural Network Library) as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Learn more via the OpenBenchmarking.org test page.

MKL-DNN DNNL 1.1, Harness: Recurrent Neural Network Training - Data Type: f32 (ms, Fewer Is Better)
-O3 -march=native -flto -fwhole-program: 195.07 (SE +/- 0.25, N = 3, MIN: 193.44)
-O3 -march=native -flto: 194.67 (SE +/- 0.33, N = 3, MIN: 192.64)
-O3 -march=native: 194.17 (SE +/- 0.42, N = 3, MIN: 192.29)

MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_alexnet - Data Type: f32 (ms, Fewer Is Better)
-O3 -march=native -flto: 126.97 (SE +/- 0.18, N = 3, MIN: 125.84)
-O3 -march=native: 126.14 (SE +/- 0.75, N = 3, MIN: 124.08)
-O3 -march=native -flto -fwhole-program: 125.13 (SE +/- 1.81, N = 3, MIN: 120.75)

MKL-DNN DNNL 1.1, Harness: Deconvolution Batch deconv_1d - Data Type: f32 (ms, Fewer Is Better)
-O3 -march=native -flto: 2.31627 (SE +/- 0.00389, N = 4, MIN: 2.25)
-O3 -march=native -flto -fwhole-program: 2.31590 (SE +/- 0.00798, N = 3, MIN: 2.24)
-O3 -march=native: 2.30735 (SE +/- 0.00889, N = 3, MIN: 2.22)

All three results: 1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

HPC Challenge

HPC Challenge 1.5.0, Test / Class: Random Ring Bandwidth (GB/s, More Is Better)
-O3 -march=native -flto -fwhole-program: 3.31499 (SE +/- 0.08613, N = 3)
-O3 -march=native -flto: 3.33117 (SE +/- 0.02086, N = 3)
-O3 -march=native: 3.42188 (SE +/- 0.03737, N = 3)

HPC Challenge 1.5.0, Test / Class: Max Ping Pong Bandwidth (MB/s, More Is Better)
-O3 -march=native: 22614.63 (SE +/- 238.70, N = 3)
-O3 -march=native -flto: 22951.43 (SE +/- 301.57, N = 3)
-O3 -march=native -flto -fwhole-program: 23030.71 (SE +/- 509.86, N = 3)

Both results: 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops (per-run additions: -flto; -flto -fwhole-program) 2. ATLAS + Open MPI 3.1.3

GROMACS

This test runs the GROMACS molecular dynamics package on the CPU with the water_GMX50 data. Learn more via the OpenBenchmarking.org test page.

GROMACS 2019.4, Water Benchmark (Ns Per Day, More Is Better)
-O3 -march=native: 2.514 (SE +/- 0.004, N = 3)
-O3 -march=native -flto -fwhole-program: 2.515 (SE +/- 0.004, N = 3)
-O3 -march=native -flto: 2.517 (SE +/- 0.001, N = 3)
1. (CXX) g++ options: -mavx2 -mfma -O3 -march=native -std=c++11 -funroll-all-loops -pthread -lrt -lpthread -lm (per-run additions: -flto; -flto -fwhole-program)

ASKAP

This is a benchmark of ATNF's ASKAP Benchmark. The results in this file use the tConvolve MT and tConvolve OpenMP sub-tests. Learn more via the OpenBenchmarking.org test page.

ASKAP 2018-11-10, Test: tConvolve MT - Gridding (Million Grid Points Per Second, More Is Better)
-O3 -march=native: 1946.68 (SE +/- 6.64, N = 3)
-O3 -march=native -flto -fwhole-program: 1948.42 (SE +/- 3.43, N = 3)
-O3 -march=native -flto: 1949.81 (SE +/- 2.80, N = 3)

ASKAP 2018-11-10, Test: tConvolve MT - Degridding (Million Grid Points Per Second, More Is Better)
-O3 -march=native -flto -fwhole-program: 3363.82 (SE +/- 1.18, N = 3)
-O3 -march=native -flto: 3366.19 (SE +/- 2.36, N = 3)
-O3 -march=native: 3369.74 (SE +/- 3.60, N = 3)

Both results: 1. (CXX) g++ options: -lpthread

MKL-DNN DNNL

MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32 (ms, Fewer Is Better)
-O3 -march=native -flto -fwhole-program: 53.25 (SE +/- 0.50, N = 3, MIN: 51.66)
-O3 -march=native -flto: 52.83 (SE +/- 0.34, N = 3, MIN: 51.44)
-O3 -march=native: 52.23 (SE +/- 0.18, N = 3, MIN: 51.26)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

ASKAP

ASKAP 2018-11-10, Test: tConvolve OpenMP - Gridding (Million Grid Points Per Second, More Is Better)
-O3 -march=native -flto -fwhole-program: 5433.80 (SE +/- 0.00, N = 3)
-O3 -march=native -flto: 5435.31 (SE +/- 64.06, N = 3)
-O3 -march=native: 5471.50 (SE +/- 37.73, N = 3)

ASKAP 2018-11-10, Test: tConvolve OpenMP - Degridding (Million Grid Points Per Second, More Is Better)
-O3 -march=native -flto: 4117.58 (SE +/- 21.33, N = 3)
-O3 -march=native -flto -fwhole-program: 4117.58 (SE +/- 21.33, N = 3)
-O3 -march=native: 4138.92 (SE +/- 21.33, N = 3)

Both results: 1. (CXX) g++ options: -lpthread

ACES DGEMM

This is a multi-threaded DGEMM benchmark. Learn more via the OpenBenchmarking.org test page.

ACES DGEMM 1.0, Sustained Floating-Point Rate (GFLOP/s, More Is Better)
-O3 -march=native -flto -fwhole-program: 8.625776 (SE +/- 0.076167, N = 15)
-O3 -march=native -flto: 8.761475 (SE +/- 0.132221, N = 3)
-O3 -march=native: 8.781111 (SE +/- 0.116188, N = 3)
1. (CC) gcc options: -O3 -march=native -fopenmp (per-run additions: -flto; -flto -fwhole-program)

QMCPACK

QMCPACK is a modern high-performance open-source Quantum Monte Carlo (QMC) simulation code. This benchmark runs the H2O example code using MPI. Learn more via the OpenBenchmarking.org test page.

QMCPACK 3.8 (Total Execution Time - Seconds, Fewer Is Better)
-O3 -march=native -flto -fwhole-program: 1900.3
-O3 -march=native -flto: 1895.1
-O3 -march=native: 1878.0
1. (CXX) g++ options: -O3 -march=native -fopenmp -fomit-frame-pointer -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -lm (per-run additions: -flto; -flto -fwhole-program)

miniFE

MiniFE Finite Element is an application for unstructured implicit finite element codes. Learn more via the OpenBenchmarking.org test page.

miniFE 2.2, Problem Size: Small (CG Mflops, More Is Better)
-O3 -march=native -flto: 7720.98 (SE +/- 12.76, N = 3)
-O3 -march=native -flto -fwhole-program: 7735.16 (SE +/- 8.54, N = 3)
-O3 -march=native: 7745.54 (SE +/- 18.94, N = 3)
1. (CXX) g++ options: -O3 -fopenmp -pthread -lmpi_cxx -lmpi

Timed ImageMagick Compilation

This test times how long it takes to build ImageMagick. Learn more via the OpenBenchmarking.org test page.

Timed ImageMagick Compilation 6.9.0, Time To Compile (Seconds, Fewer Is Better)
-O3 -march=native -flto: 75.25 (SE +/- 0.02, N = 3)
-O3 -march=native -flto -fwhole-program: 74.87 (SE +/- 0.25, N = 3)
-O3 -march=native: 15.34 (SE +/- 0.16, N = 3)

Stockfish

This is a test of Stockfish, an advanced C++11 chess benchmark that can scale up to 128 CPU cores. Learn more via the OpenBenchmarking.org test page.

Stockfish 9, Total Time (Nodes Per Second, More Is Better)
-O3 -march=native -flto: 79613988 (SE +/- 774628.71, N = 3)
-O3 -march=native: 79813741 (SE +/- 1040327.44, N = 3)
-O3 -march=native -flto -fwhole-program: 81375940 (SE +/- 729395.20, N = 3)
1. (CXX) g++ options: -m64 -lpthread -O3 -march=native -flto -fno-exceptions -std=c++11 -pedantic -msse -msse3 -mpopcnt (per-run additions: -fwhole-program)

Zstd Compression

This test measures the time needed to compress a sample file (an Ubuntu file-system image) using Zstd compression. Learn more via the OpenBenchmarking.org test page.

Zstd Compression 1.3.4, Compressing ubuntu-16.04.3-server-i386.img, Compression Level 19 (Seconds, Fewer Is Better)
-O3 -march=native -flto: 10.168 (SE +/- 0.025, N = 3)
-O3 -march=native -flto -fwhole-program: 10.140 (SE +/- 0.092, N = 3)
-O3 -march=native: 9.994 (SE +/- 0.119, N = 5)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma (per-run additions: -flto; -flto -fwhole-program)

TTSIOD 3D Renderer

A portable GPL 3D software renderer that supports OpenMP and Intel Threading Building Blocks with many different rendering modes. This version does not use OpenGL but is entirely CPU/software based. Learn more via the OpenBenchmarking.org test page.

TTSIOD 3D Renderer 2.3b, Phong Rendering With Soft-Shadow Mapping (FPS, More Is Better)
-O3 -march=native: 946.11 (SE +/- 3.96, N = 3)
-O3 -march=native -flto -fwhole-program: 950.18 (SE +/- 1.61, N = 3)
-O3 -march=native -flto: 950.33 (SE +/- 0.32, N = 3)
1. (CXX) g++ options: -O3 -march=native -fomit-frame-pointer -ffast-math -mtune=native -flto -msse -mrecip -mfpmath=sse -msse2 -mssse3 -lSDL -fopenmp -fwhole-program -lstdc++

Radiance Benchmark

This is a benchmark of NREL Radiance, a synthetic imaging system that is open-source and developed by the Lawrence Berkeley National Laboratory in California. Learn more via the OpenBenchmarking.org test page.

Radiance Benchmark 5.0, Test: Serial (Seconds, Fewer Is Better)
-O3 -march=native: 559.02
-O3 -march=native -flto: 556.14
-O3 -march=native -flto -fwhole-program: 555.69

Radiance Benchmark 5.0, Test: SMP Parallel (Seconds, Fewer Is Better)
-O3 -march=native -flto: 174.53
-O3 -march=native -flto -fwhole-program: 170.63
-O3 -march=native: 168.59

NGINX Benchmark

This is a test of ab, which is the Apache Benchmark program running against nginx. This test profile measures how many requests per second a given system can sustain when carrying out 2,000,000 requests with 500 requests being carried out concurrently. Learn more via the OpenBenchmarking.org test page.

NGINX Benchmark 1.9.9, Static Web Page Serving (Requests Per Second, More Is Better)
-O3 -march=native: 43138.29 (SE +/- 628.72, N = 3)
-O3 -march=native -flto -fwhole-program: 43510.97 (SE +/- 103.71, N = 3)
-O3 -march=native -flto: 43673.89 (SE +/- 294.79, N = 3)
1. (CC) gcc options: -lpthread -lcrypt -lcrypto -lz -O3 -march=native (per-run additions: -flto; -flto -fwhole-program)

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test measures the RSA 4096-bit performance of OpenSSL. Learn more via the OpenBenchmarking.org test page.

OpenSSL 1.1.1, RSA 4096-bit Performance (Signs Per Second, More Is Better)
-O3 -march=native: 7178.7 (SE +/- 21.22, N = 3)
-O3 -march=native -flto: 7182.9 (SE +/- 21.57, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -march=native -lssl -lcrypto -ldl (per-run additions: -flto)

SQLite

This is a simple benchmark of SQLite. At present this test profile just measures the time to perform a pre-defined number of insertions on an indexed database. Learn more via the OpenBenchmarking.org test page.

SQLite 3.30.1, Threads / Copies: 1 (Seconds, Fewer Is Better)
-O3 -march=native -flto -fwhole-program: 14.28 (SE +/- 0.05, N = 3)
-O3 -march=native -flto: 14.23 (SE +/- 0.05, N = 3)
-O3 -march=native: 14.20 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -O3 -march=native -lz -lm -ldl -lpthread (per-run additions: -flto; -flto -fwhole-program)

Facebook RocksDB

This is a benchmark of Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.

Facebook RocksDB 6.3.6, Test: Random Fill (Op/s, More Is Better)
-O3 -march=native -flto: 916114 (SE +/- 8977.10, N = 3)
-O3 -march=native -flto -fwhole-program: 919822 (SE +/- 2437.07, N = 3)
-O3 -march=native: 927119 (SE +/- 7063.47, N = 3)

Facebook RocksDB 6.3.6, Test: Random Read (Op/s, More Is Better)
-O3 -march=native -flto -fwhole-program: 141628836 (SE +/- 84413.94, N = 3)
-O3 -march=native: 145113856 (SE +/- 154500.46, N = 3)
-O3 -march=native -flto: 147319777 (SE +/- 1236531.46, N = 3)

Facebook RocksDB 6.3.6, Test: Sequential Fill (Op/s, More Is Better)
-O3 -march=native -flto: 1010840 (SE +/- 923.75, N = 3)
-O3 -march=native: 1018223 (SE +/- 3160.54, N = 3)
-O3 -march=native -flto -fwhole-program: 1019279 (SE +/- 1987.24, N = 3)

Facebook RocksDB 6.3.6, Test: Random Fill Sync (Op/s, More Is Better)
-O3 -march=native -flto: 24409 (SE +/- 31.00, N = 3)
-O3 -march=native -flto -fwhole-program: 24495 (SE +/- 34.81, N = 3)
-O3 -march=native: 24502 (SE +/- 29.87, N = 3)

Facebook RocksDB 6.3.6, Test: Read While Writing (Op/s, More Is Better)
-O3 -march=native: 4868266 (SE +/- 30555.74, N = 3)
-O3 -march=native -flto -fwhole-program: 4898750 (SE +/- 22374.25, N = 3)
-O3 -march=native -flto: 4901767 (SE +/- 9839.07, N = 3)

All five results: 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

SQLite Speedtest

This is a benchmark of SQLite's speedtest1 benchmark program with an increased problem size of 1,000. Learn more via the OpenBenchmarking.org test page.

SQLite Speedtest 3.30, Timed Time - Size 1,000 (Seconds, Fewer Is Better)
-O3 -march=native: 57.37 (SE +/- 0.11, N = 3)
-O3 -march=native -flto: 56.44 (SE +/- 0.10, N = 3)
-O3 -march=native -flto -fwhole-program: 56.22 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -O3 -march=native -ldl -lz -lpthread (per-run additions: -flto; -flto -fwhole-program)

PostgreSQL pgbench

This is a simple benchmark of PostgreSQL using pgbench. Learn more via the OpenBenchmarking.org test page.

PostgreSQL pgbench 12.0, Scaling: Buffer Test - Test: Normal Load - Mode: Read Only (TPS, More Is Better)
-O3 -march=native: 670670.78 (SE +/- 654.90, N = 3)
-O3 -march=native -flto: 701920.06 (SE +/- 375.35, N = 3)

PostgreSQL pgbench 12.0, Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Only (TPS, More Is Better)
-O3 -march=native: 676431.71 (SE +/- 3573.89, N = 3)
-O3 -march=native -flto: 703431.18 (SE +/- 5480.07, N = 3)

Both results: 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm (per-run additions: -flto)
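
The pgbench results (like OpenSSL earlier) have no value for the -O3 -march=native -flto -fwhole-program run, so any cross-configuration summary has to account for incomplete data. The sketch below flags such tests; the data structure and names are assumptions for illustration, and the SQLite Speedtest entry is included only as a complete counter-example.

```python
CONFIGS = (
    "-O3 -march=native",
    "-O3 -march=native -flto",
    "-O3 -march=native -flto -fwhole-program",
)

# Values copied from this result file; keys are the run identifiers.
results = {
    "pgbench Normal Load": {CONFIGS[0]: 670670.78, CONFIGS[1]: 701920.06},
    "pgbench Heavy Contention": {CONFIGS[0]: 676431.71, CONFIGS[1]: 703431.18},
    "SQLite Speedtest": {CONFIGS[0]: 57.37, CONFIGS[1]: 56.44, CONFIGS[2]: 56.22},
}


def missing_configs(entry):
    """List the configurations that have no value for this test."""
    return [config for config in CONFIGS if config not in entry]


incomplete = {
    name: missing_configs(entry)
    for name, entry in results.items()
    if missing_configs(entry)
}

for name, missing in incomplete.items():
    print(f"{name}: missing {missing}")
```

Tests flagged this way can be excluded from averages or compared only across the configurations they share.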

BYTE Unix Benchmark

This is a test of BYTE. Learn more via the OpenBenchmarking.org test page.

BYTE Unix Benchmark 3.6, Computational Test: Dhrystone 2 (LPS, More Is Better)
-O3 -march=native: 47812720.8 (SE +/- 256405.96, N = 3)
-O3 -march=native -flto -fwhole-program: 64880292.8 (SE +/- 276889.23, N = 3)
-O3 -march=native -flto: 67070357.6 (SE +/- 508456.63, N = 3)
1. (CC) gcc options: -O3 -march=native (per-run additions: -flto; -flto -fwhole-program)