Threadripper 3960X GCC 10 LTO Testing

AMD Ryzen Threadripper 3960X GCC 10 LTO benchmarking by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/1912225-PTS-THREADRI29&grw.

Configurations: -O3 -march=native; -O3 -march=native -flto; -O3 -march=native -flto -fwhole-program

Processor: AMD Ryzen Threadripper 3960X 24-Core @ 3.80GHz (24 Cores / 48 Threads)
Motherboard: MSI Creator TRX40 (MS-7C59) v1.0 (1.12N1 BIOS)
Chipset: AMD Starship/Matisse
Memory: 32768MB
Disk: 1000GB Sabrent Rocket 4.0 1TB
Graphics: Gigabyte AMD Radeon 540/540X/550/550X / RX 540X/550/550X 2GB (1206/1750MHz)
Audio: AMD Baffin HDMI/DP
Monitor: ASUS VP28U
Network: Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Device 2723
OS: Ubuntu 19.10
Kernel: 5.4.0-nvme-hwmon (x86_64)
Desktop: GNOME Shell 3.34.1
Display Server: X Server 1.20.5
Display Driver: modesetting 1.20.5
OpenGL: 4.5 Mesa 19.2.1 (LLVM 9.0.0)
Compiler: GCC 10.0.0 20191208
File-System: ext4
Screen Resolution: 3840x2160

Environment Details
- -O3 -march=native: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
- -O3 -march=native -flto: CXXFLAGS="-O3 -march=native -flto" CFLAGS="-O3 -march=native -flto"
- -O3 -march=native -flto -fwhole-program: CXXFLAGS="-O3 -march=native -flto -fwhole-program" CFLAGS="-O3 -march=native -flto -fwhole-program"

Compiler Details
- --disable-multilib --enable-checking=release

Disk Details
- NONE / errors=remount-ro,relatime,rw

Processor Details
- Scaling Governor: acpi-cpufreq ondemand
- CPU Microcode: 0x8301025

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling
- tsx_async_abort: Not affected


TSCP

AI Chess Performance

Nodes Per Second, More Is Better (TSCP 1.81)
-O3 -march=native: 1350615 (SE +/- 1620.83, N = 5)
-O3 -march=native -flto: 1422472 (SE +/- 1795.91, N = 5)
-O3 -march=native -flto -fwhole-program: 1418074 (SE +/- 1462.93, N = 5)
1. (CC) gcc options: -O3 -march=native
Options added per configuration: -flto (second), -flto -fwhole-program (third)
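One way to read the three TSCP figures is as a percentage change against the plain -O3 -march=native build. The sketch below is not part of the exported result: `percent_change` is a hypothetical helper introduced only for illustration, and the numbers are copied from the TSCP result above.

```python
def percent_change(baseline: float, value: float) -> float:
    """Return the relative change of value versus baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / baseline * 100.0


# TSCP nodes per second, copied from the result above (more is better).
tscp = {
    "-O3 -march=native": 1350615,
    "-O3 -march=native -flto": 1422472,
    "-O3 -march=native -flto -fwhole-program": 1418074,
}

baseline = tscp["-O3 -march=native"]
for flags, nodes_per_second in tscp.items():
    # A positive change is an improvement for "More Is Better" tests.
    print(f"{flags}: {percent_change(baseline, nodes_per_second):+.2f}%")
```

For the "Fewer Is Better" tests in this result (for example the timed compression runs), the sign flips: a negative change is the improvement.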

Crafty

Elapsed Time

Nodes Per Second, More Is Better (Crafty 25.2)
-O3 -march=native: 9240651 (SE +/- 19679.93, N = 3)
-O3 -march=native -flto: 9209287 (SE +/- 13239.35, N = 3)
-O3 -march=native -flto -fwhole-program: 8978346 (SE +/- 5847.06, N = 3)
1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm

XZ Compression

Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9

Seconds, Fewer Is Better (XZ Compression 5.2.4)
-O3 -march=native: 20.02 (SE +/- 0.03, N = 3)
-O3 -march=native -flto: 19.87 (SE +/- 0.06, N = 3)
-O3 -march=native -flto -fwhole-program: 19.83 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -pthread -fvisibility=hidden -O3 -march=native
Options added per configuration: -flto (second), -flto -fwhole-program (third)

FLAC Audio Encoding

WAV To FLAC

Seconds, Fewer Is Better (FLAC Audio Encoding 1.3.2)
-O3 -march=native: 8.079 (SE +/- 0.005, N = 5)
-O3 -march=native -flto: 8.044 (SE +/- 0.006, N = 5)
-O3 -march=native -flto -fwhole-program: 8.073 (SE +/- 0.004, N = 5)
1. (CXX) g++ options: -O3 -march=native -fvisibility=hidden -logg -lm
Options added per configuration: -flto (second), -flto -fwhole-program (third)

LAME MP3 Encoding

WAV To MP3

Seconds, Fewer Is Better (LAME MP3 Encoding 3.100)
-O3 -march=native: 6.710 (SE +/- 0.016, N = 3)
-O3 -march=native -flto: 6.622 (SE +/- 0.067, N = 3)
-O3 -march=native -flto -fwhole-program: 6.697 (SE +/- 0.007, N = 3)
1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -march=native -lm
Options added per configuration: -flto (second), -flto -fwhole-program (third)

FFTW

Build: Stock - Size: 1D FFT Size 32

Mflops, More Is Better (FFTW 3.3.6)
-O3 -march=native: 13149 (SE +/- 27.74, N = 3)
-O3 -march=native -flto: 11201 (SE +/- 55.87, N = 3)
-O3 -march=native -flto -fwhole-program: 11069 (SE +/- 3.21, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
Options added per configuration: -flto (second), -flto -fwhole-program (third)

FFTW

Build: Stock - Size: 2D FFT Size 32

Mflops, More Is Better (FFTW 3.3.6)
-O3 -march=native: 13217 (SE +/- 6.49, N = 3)
-O3 -march=native -flto: 12677 (SE +/- 14.19, N = 3)
-O3 -march=native -flto -fwhole-program: 12595 (SE +/- 72.89, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
Options added per configuration: -flto (second), -flto -fwhole-program (third)

FFTW

Build: Stock - Size: 2D FFT Size 4096

Mflops, More Is Better (FFTW 3.3.6)
-O3 -march=native: 8209.1 (SE +/- 138.82, N = 3)
-O3 -march=native -flto: 8969.7 (SE +/- 155.20, N = 3)
-O3 -march=native -flto -fwhole-program: 8540.7 (SE +/- 111.43, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
Options added per configuration: -flto (second), -flto -fwhole-program (third)

FFTW

Build: Float + SSE - Size: 1D FFT Size 32

Mflops, More Is Better (FFTW 3.3.6)
-O3 -march=native: 15513 (SE +/- 84.89, N = 3)
-O3 -march=native -flto: 15488 (SE +/- 25.21, N = 3)
-O3 -march=native -flto -fwhole-program: 15271 (SE +/- 111.55, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
Options added per configuration: -flto (second), -flto -fwhole-program (third)

FFTW

Build: Float + SSE - Size: 2D FFT Size 32

Mflops, More Is Better (FFTW 3.3.6)
-O3 -march=native: 45957 (SE +/- 23.68, N = 3)
-O3 -march=native -flto: 46350 (SE +/- 82.72, N = 3)
-O3 -march=native -flto -fwhole-program: 45734 (SE +/- 49.75, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
Options added per configuration: -flto (second), -flto -fwhole-program (third)

FFTW

Build: Float + SSE - Size: 2D FFT Size 4096

Mflops, More Is Better (FFTW 3.3.6)
-O3 -march=native: 23281 (SE +/- 190.93, N = 3)
-O3 -march=native -flto: 24505 (SE +/- 153.78, N = 3)
-O3 -march=native -flto -fwhole-program: 24289 (SE +/- 143.81, N = 3)
1. (CC) gcc options: -pthread -O3 -march=native -lm
Options added per configuration: -flto (second), -flto -fwhole-program (third)

Timed MrBayes Analysis

Primate Phylogeny Analysis

Seconds, Fewer Is Better (Timed MrBayes Analysis 3.2.7)
-O3 -march=native: 71.83 (SE +/- 1.38, N = 13)
-O3 -march=native -flto: 69.13 (SE +/- 0.87, N = 4)
-O3 -march=native -flto -fwhole-program: 68.02 (SE +/- 0.14, N = 3)
1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -madx -mabm -O3 -std=c99 -pedantic -march=native -lm
Options added per configuration: -flto (second), -flto -fwhole-program (third)

Himeno Benchmark

Poisson Pressure Solver

MFLOPS, More Is Better (Himeno Benchmark 3.0)
-O3 -march=native: 4893.66 (SE +/- 32.73, N = 3)
-O3 -march=native -flto: 4766.40 (SE +/- 22.94, N = 3)
-O3 -march=native -flto -fwhole-program: 4684.70 (SE +/- 63.96, N = 4)
1. (CC) gcc options: -O3 -march=native -mavx2
Options added per configuration: -flto (second), -flto -fwhole-program (third)

HPC Challenge

Test / Class: G-HPL

GFLOPS, More Is Better (HPC Challenge 1.5.0)
-O3 -march=native: 63.54 (SE +/- 0.05, N = 3)
-O3 -march=native -flto: 63.68 (SE +/- 0.14, N = 3)
-O3 -march=native -flto -fwhole-program: 63.72 (SE +/- 0.17, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops
2. ATLAS + Open MPI 3.1.3
Options added per configuration: -flto (second), -flto -fwhole-program (third)

HPC Challenge

Test / Class: G-Ffte

GFLOPS, More Is Better (HPC Challenge 1.5.0)
-O3 -march=native: 15.20 (SE +/- 0.07, N = 3)
-O3 -march=native -flto: 15.22 (SE +/- 0.45, N = 3)
-O3 -march=native -flto -fwhole-program: 15.98 (SE +/- 0.37, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops
2. ATLAS + Open MPI 3.1.3
Options added per configuration: -flto (second), -flto -fwhole-program (third)

HPC Challenge

Test / Class: EP-DGEMM

GFLOPS, More Is Better (HPC Challenge 1.5.0)
-O3 -march=native: 32.50 (SE +/- 0.16, N = 3)
-O3 -march=native -flto: 32.76 (SE +/- 0.64, N = 3)
-O3 -march=native -flto -fwhole-program: 33.60 (SE +/- 0.27, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops
2. ATLAS + Open MPI 3.1.3
Options added per configuration: -flto (second), -flto -fwhole-program (third)

HPC Challenge

Test / Class: G-Ptrans

GB/s, More Is Better (HPC Challenge 1.5.0)
-O3 -march=native: 5.79663 (SE +/- 0.02501, N = 3)
-O3 -march=native -flto: 5.78791 (SE +/- 0.01376, N = 3)
-O3 -march=native -flto -fwhole-program: 5.80751 (SE +/- 0.01388, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops
2. ATLAS + Open MPI 3.1.3
Options added per configuration: -flto (second), -flto -fwhole-program (third)

HPC Challenge

Test / Class: EP-STREAM Triad

GB/s, More Is Better (HPC Challenge 1.5.0)
-O3 -march=native: 1.68426 (SE +/- 0.00200, N = 3)
-O3 -march=native -flto: 1.68874 (SE +/- 0.00593, N = 3)
-O3 -march=native -flto -fwhole-program: 1.69770 (SE +/- 0.01665, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops
2. ATLAS + Open MPI 3.1.3
Options added per configuration: -flto (second), -flto -fwhole-program (third)

HPC Challenge

Test / Class: G-Random Access

GUP/s, More Is Better (HPC Challenge 1.5.0)
-O3 -march=native: 0.16431 (SE +/- 0.00098, N = 3)
-O3 -march=native -flto: 0.16722 (SE +/- 0.00027, N = 3)
-O3 -march=native -flto -fwhole-program: 0.16679 (SE +/- 0.00041, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops
2. ATLAS + Open MPI 3.1.3
Options added per configuration: -flto (second), -flto -fwhole-program (third)

HPC Challenge

Test / Class: Random Ring Latency

usecs, Fewer Is Better (HPC Challenge 1.5.0)
-O3 -march=native: 0.45479 (SE +/- 0.00087, N = 3)
-O3 -march=native -flto: 0.45234 (SE +/- 0.00082, N = 3)
-O3 -march=native -flto -fwhole-program: 0.46971 (SE +/- 0.01816, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops
2. ATLAS + Open MPI 3.1.3
Options added per configuration: -flto (second), -flto -fwhole-program (third)

MKL-DNN DNNL

Harness: Recurrent Neural Network Training - Data Type: f32

ms, Fewer Is Better (MKL-DNN DNNL 1.1)
-O3 -march=native: 194.17 (SE +/- 0.42, N = 3, MIN: 192.29)
-O3 -march=native -flto: 194.67 (SE +/- 0.33, N = 3, MIN: 192.64)
-O3 -march=native -flto -fwhole-program: 195.07 (SE +/- 0.25, N = 3, MIN: 193.44)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Convolution Batch conv_alexnet - Data Type: f32

ms, Fewer Is Better (MKL-DNN DNNL 1.1)
-O3 -march=native: 126.14 (SE +/- 0.75, N = 3, MIN: 124.08)
-O3 -march=native -flto: 126.97 (SE +/- 0.18, N = 3, MIN: 125.84)
-O3 -march=native -flto -fwhole-program: 125.13 (SE +/- 1.81, N = 3, MIN: 120.75)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Deconvolution Batch deconv_1d - Data Type: f32

ms, Fewer Is Better (MKL-DNN DNNL 1.1)
-O3 -march=native: 2.30735 (SE +/- 0.00889, N = 3, MIN: 2.22)
-O3 -march=native -flto: 2.31627 (SE +/- 0.00389, N = 4, MIN: 2.25)
-O3 -march=native -flto -fwhole-program: 2.31590 (SE +/- 0.00798, N = 3, MIN: 2.24)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

HPC Challenge

Test / Class: Random Ring Bandwidth

GB/s, More Is Better (HPC Challenge 1.5.0)
-O3 -march=native: 3.42188 (SE +/- 0.03737, N = 3)
-O3 -march=native -flto: 3.33117 (SE +/- 0.02086, N = 3)
-O3 -march=native -flto -fwhole-program: 3.31499 (SE +/- 0.08613, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops
2. ATLAS + Open MPI 3.1.3
Options added per configuration: -flto (second), -flto -fwhole-program (third)

HPC Challenge

Test / Class: Max Ping Pong Bandwidth

MB/s, More Is Better (HPC Challenge 1.5.0)
-O3 -march=native: 22614.63 (SE +/- 238.70, N = 3)
-O3 -march=native -flto: 22951.43 (SE +/- 301.57, N = 3)
-O3 -march=native -flto -fwhole-program: 23030.71 (SE +/- 509.86, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -O3 -march=native -funroll-loops
2. ATLAS + Open MPI 3.1.3
Options added per configuration: -flto (second), -flto -fwhole-program (third)

GROMACS

Water Benchmark

Ns Per Day, More Is Better (GROMACS 2019.4)
-O3 -march=native: 2.514 (SE +/- 0.004, N = 3)
-O3 -march=native -flto: 2.517 (SE +/- 0.001, N = 3)
-O3 -march=native -flto -fwhole-program: 2.515 (SE +/- 0.004, N = 3)
1. (CXX) g++ options: -mavx2 -mfma -O3 -march=native -std=c++11 -funroll-all-loops -pthread -lrt -lpthread -lm
Options added per configuration: -flto (second), -flto -fwhole-program (third)

ASKAP

Test: tConvolve MT - Gridding

Million Grid Points Per Second, More Is Better (ASKAP 2018-11-10)
-O3 -march=native: 1946.68 (SE +/- 6.64, N = 3)
-O3 -march=native -flto: 1949.81 (SE +/- 2.80, N = 3)
-O3 -march=native -flto -fwhole-program: 1948.42 (SE +/- 3.43, N = 3)
1. (CXX) g++ options: -lpthread

ASKAP

Test: tConvolve MT - Degridding

Million Grid Points Per Second, More Is Better (ASKAP 2018-11-10)
-O3 -march=native: 3369.74 (SE +/- 3.60, N = 3)
-O3 -march=native -flto: 3366.19 (SE +/- 2.36, N = 3)
-O3 -march=native -flto -fwhole-program: 3363.82 (SE +/- 1.18, N = 3)
1. (CXX) g++ options: -lpthread

MKL-DNN DNNL

Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32

ms, Fewer Is Better (MKL-DNN DNNL 1.1)
-O3 -march=native: 52.23 (SE +/- 0.18, N = 3, MIN: 51.26)
-O3 -march=native -flto: 52.83 (SE +/- 0.34, N = 3, MIN: 51.44)
-O3 -march=native -flto -fwhole-program: 53.25 (SE +/- 0.50, N = 3, MIN: 51.66)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

ASKAP

Test: tConvolve OpenMP - Gridding

Million Grid Points Per Second, More Is Better (ASKAP 2018-11-10)
-O3 -march=native: 5471.50 (SE +/- 37.73, N = 3)
-O3 -march=native -flto: 5435.31 (SE +/- 64.06, N = 3)
-O3 -march=native -flto -fwhole-program: 5433.80 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -lpthread

ASKAP

Test: tConvolve OpenMP - Degridding

Million Grid Points Per Second, More Is Better (ASKAP 2018-11-10)
-O3 -march=native: 4138.92 (SE +/- 21.33, N = 3)
-O3 -march=native -flto: 4117.58 (SE +/- 21.33, N = 3)
-O3 -march=native -flto -fwhole-program: 4117.58 (SE +/- 21.33, N = 3)
1. (CXX) g++ options: -lpthread

ACES DGEMM

Sustained Floating-Point Rate

GFLOP/s, More Is Better (ACES DGEMM 1.0)
-O3 -march=native: 8.781111 (SE +/- 0.116188, N = 3)
-O3 -march=native -flto: 8.761475 (SE +/- 0.132221, N = 3)
-O3 -march=native -flto -fwhole-program: 8.625776 (SE +/- 0.076167, N = 15)
1. (CC) gcc options: -O3 -march=native -fopenmp
Options added per configuration: -flto (second), -flto -fwhole-program (third)

QMCPACK

Total Execution Time - Seconds, Fewer Is Better (QMCPACK 3.8)
-O3 -march=native: 1878.0
-O3 -march=native -flto: 1895.1
-O3 -march=native -flto -fwhole-program: 1900.3
1. (CXX) g++ options: -O3 -march=native -fopenmp -fomit-frame-pointer -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -lm
Options added per configuration: -flto (second), -flto -fwhole-program (third)

miniFE

Problem Size: Small

CG Mflops, More Is Better (miniFE 2.2)
-O3 -march=native: 7745.54 (SE +/- 18.94, N = 3)
-O3 -march=native -flto: 7720.98 (SE +/- 12.76, N = 3)
-O3 -march=native -flto -fwhole-program: 7735.16 (SE +/- 8.54, N = 3)
1. (CXX) g++ options: -O3 -fopenmp -pthread -lmpi_cxx -lmpi

Timed ImageMagick Compilation

Time To Compile

Seconds, Fewer Is Better (Timed ImageMagick Compilation 6.9.0)
-O3 -march=native: 15.34 (SE +/- 0.16, N = 3)
-O3 -march=native -flto: 75.25 (SE +/- 0.02, N = 3)
-O3 -march=native -flto -fwhole-program: 74.87 (SE +/- 0.25, N = 3)

Stockfish

Total Time

Nodes Per Second, More Is Better (Stockfish 9)
-O3 -march=native: 79813741 (SE +/- 1040327.44, N = 3)
-O3 -march=native -flto: 79613988 (SE +/- 774628.71, N = 3)
-O3 -march=native -flto -fwhole-program: 81375940 (SE +/- 729395.20, N = 3)
1. (CXX) g++ options: -m64 -lpthread -O3 -march=native -fno-exceptions -std=c++11 -pedantic -msse -msse3 -mpopcnt -flto
Options added per configuration: -fwhole-program (third)

Zstd Compression

Compressing ubuntu-16.04.3-server-i386.img, Compression Level 19

Seconds, Fewer Is Better (Zstd Compression 1.3.4)
-O3 -march=native: 9.994 (SE +/- 0.119, N = 5)
-O3 -march=native -flto: 10.168 (SE +/- 0.025, N = 3)
-O3 -march=native -flto -fwhole-program: 10.140 (SE +/- 0.092, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma
Options added per configuration: -flto (second), -flto -fwhole-program (third)

TTSIOD 3D Renderer

Phong Rendering With Soft-Shadow Mapping

FPS, More Is Better (TTSIOD 3D Renderer 2.3b)
-O3 -march=native: 946.11 (SE +/- 3.96, N = 3)
-O3 -march=native -flto: 950.33 (SE +/- 0.32, N = 3)
-O3 -march=native -flto -fwhole-program: 950.18 (SE +/- 1.61, N = 3)
1. (CXX) g++ options: -O3 -march=native -fomit-frame-pointer -ffast-math -mtune=native -flto -msse -mrecip -mfpmath=sse -msse2 -mssse3 -lSDL -fopenmp -fwhole-program -lstdc++

Radiance Benchmark

Test: Serial

Seconds, Fewer Is Better (Radiance Benchmark 5.0)
-O3 -march=native: 559.02
-O3 -march=native -flto: 556.14
-O3 -march=native -flto -fwhole-program: 555.69

Radiance Benchmark

Test: SMP Parallel

Seconds, Fewer Is Better (Radiance Benchmark 5.0)
-O3 -march=native: 168.59
-O3 -march=native -flto: 174.53
-O3 -march=native -flto -fwhole-program: 170.63

NGINX Benchmark

Static Web Page Serving

Requests Per Second, More Is Better (NGINX Benchmark 1.9.9)
-O3 -march=native: 43138.29 (SE +/- 628.72, N = 3)
-O3 -march=native -flto: 43673.89 (SE +/- 294.79, N = 3)
-O3 -march=native -flto -fwhole-program: 43510.97 (SE +/- 103.71, N = 3)
1. (CC) gcc options: -lpthread -lcrypt -lcrypto -lz -O3 -march=native
Options added per configuration: -flto (second), -flto -fwhole-program (third)

OpenSSL

RSA 4096-bit Performance

Signs Per Second, More Is Better (OpenSSL 1.1.1)
-O3 -march=native: 7178.7 (SE +/- 21.22, N = 3)
-O3 -march=native -flto: 7182.9 (SE +/- 21.57, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -march=native -lssl -lcrypto -ldl
Options added per configuration: -flto (second)

SQLite

Threads / Copies: 1

Seconds, Fewer Is Better (SQLite 3.30.1)
-O3 -march=native: 14.20 (SE +/- 0.01, N = 3)
-O3 -march=native -flto: 14.23 (SE +/- 0.05, N = 3)
-O3 -march=native -flto -fwhole-program: 14.28 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -O3 -march=native -lz -lm -ldl -lpthread
Options added per configuration: -flto (second), -flto -fwhole-program (third)

Facebook RocksDB

Test: Random Fill

Op/s, More Is Better (Facebook RocksDB 6.3.6)
-O3 -march=native: 927119 (SE +/- 7063.47, N = 3)
-O3 -march=native -flto: 916114 (SE +/- 8977.10, N = 3)
-O3 -march=native -flto -fwhole-program: 919822 (SE +/- 2437.07, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

Facebook RocksDB

Test: Random Read

Op/s, More Is Better (Facebook RocksDB 6.3.6)
-O3 -march=native: 145113856 (SE +/- 154500.46, N = 3)
-O3 -march=native -flto: 147319777 (SE +/- 1236531.46, N = 3)
-O3 -march=native -flto -fwhole-program: 141628836 (SE +/- 84413.94, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

Facebook RocksDB

Test: Sequential Fill

Op/s, More Is Better (Facebook RocksDB 6.3.6)
-O3 -march=native: 1018223 (SE +/- 3160.54, N = 3)
-O3 -march=native -flto: 1010840 (SE +/- 923.75, N = 3)
-O3 -march=native -flto -fwhole-program: 1019279 (SE +/- 1987.24, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

Facebook RocksDB

Test: Random Fill Sync

Op/s, More Is Better (Facebook RocksDB 6.3.6)
-O3 -march=native: 24502 (SE +/- 29.87, N = 3)
-O3 -march=native -flto: 24409 (SE +/- 31.00, N = 3)
-O3 -march=native -flto -fwhole-program: 24495 (SE +/- 34.81, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

Facebook RocksDB

Test: Read While Writing

Op/s, More Is Better (Facebook RocksDB 6.3.6)
-O3 -march=native: 4868266 (SE +/- 30555.74, N = 3)
-O3 -march=native -flto: 4901767 (SE +/- 9839.07, N = 3)
-O3 -march=native -flto -fwhole-program: 4898750 (SE +/- 22374.25, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

SQLite Speedtest

Timed Time - Size 1,000

Seconds, Fewer Is Better (SQLite Speedtest 3.30)
-O3 -march=native: 57.37 (SE +/- 0.11, N = 3)
-O3 -march=native -flto: 56.44 (SE +/- 0.10, N = 3)
-O3 -march=native -flto -fwhole-program: 56.22 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -O3 -march=native -ldl -lz -lpthread
Options added per configuration: -flto (second), -flto -fwhole-program (third)

PostgreSQL pgbench

Scaling: Buffer Test - Test: Normal Load - Mode: Read Only

TPS, More Is Better (PostgreSQL pgbench 12.0)
-O3 -march=native: 670670.78 (SE +/- 654.90, N = 3)
-O3 -march=native -flto: 701920.06 (SE +/- 375.35, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
Options added per configuration: -flto (second)

PostgreSQL pgbench

Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Only

TPS, More Is Better (PostgreSQL pgbench 12.0)
-O3 -march=native: 676431.71 (SE +/- 3573.89, N = 3)
-O3 -march=native -flto: 703431.18 (SE +/- 5480.07, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O3 -march=native -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm
Options added per configuration: -flto (second)

BYTE Unix Benchmark

Computational Test: Dhrystone 2

LPS, More Is Better (BYTE Unix Benchmark 3.6)
-O3 -march=native: 47812720.8 (SE +/- 256405.96, N = 3)
-O3 -march=native -flto: 67070357.6 (SE +/- 508456.63, N = 3)
-O3 -march=native -flto -fwhole-program: 64880292.8 (SE +/- 276889.23, N = 3)
1. (CC) gcc options: -O3 -march=native
Options added per configuration: -flto (second), -flto -fwhole-program (third)
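Each result reports a standard error (SE) and a run count (N). A rough way to judge whether two configurations really differ is to compare the gap between their means with the combined standard error. The sketch below is an illustration only and not something the Phoronix Test Suite output contains: `gap_in_standard_errors` is a hypothetical helper, and with only N = 3 runs per configuration the ratio is a heuristic rather than a formal significance test.

```python
import math


def gap_in_standard_errors(mean_a: float, se_a: float,
                           mean_b: float, se_b: float) -> float:
    """Return |mean_b - mean_a| divided by the combined standard error."""
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined_se == 0:
        raise ValueError("at least one standard error must be non-zero")
    return abs(mean_b - mean_a) / combined_se


# Dhrystone 2 loops per second and SE, copied from the result above.
baseline_mean, baseline_se = 47812720.8, 256405.96   # -O3 -march=native
lto_mean, lto_se = 67070357.6, 508456.63             # -O3 -march=native -flto

ratio = gap_in_standard_errors(baseline_mean, baseline_se, lto_mean, lto_se)
print(f"Gap between the two means: {ratio:.1f} combined standard errors")
```

A ratio far above 1, as with Dhrystone here, suggests the gap is not run-to-run noise. A ratio near or below 1 means the two configurations are hard to tell apart from these runs.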


Phoronix Test Suite v10.8.5