GCC 10 AMD Threadripper 3960X PGO Optimization

AMD Ryzen Threadripper GCC 10 PGO benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/1912222-PTS-GCC10AMD29.

System configuration (GCC 10 and GCC 10 - PGO runs)

Processor: AMD Ryzen Threadripper 3960X 24-Core @ 3.80GHz (24 Cores / 48 Threads)
Motherboard: MSI Creator TRX40 (MS-7C59) v1.0 (1.12N1 BIOS)
Chipset: AMD Starship/Matisse
Memory: 32768MB
Disk: 1000GB Sabrent Rocket 4.0 1TB
Graphics: Gigabyte AMD Radeon 540/540X/550/550X / RX 540X/550/550X 2GB (1206/1750MHz)
Audio: AMD Baffin HDMI/DP
Monitor: ASUS VP28U
Network: Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Device 2723
OS: Ubuntu 19.10
Kernel: 5.4.0-nvme-hwmon (x86_64)
Desktop: GNOME Shell 3.34.1
Display Server: X Server 1.20.5
Display Driver: modesetting 1.20.5
OpenGL: 4.5 Mesa 19.2.1 (LLVM 9.0.0)
Compiler: GCC 10.0.0 20191208
File-System: ext4
Screen Resolution: 3840x2160

Compiler Details: --disable-multilib --enable-checking=release
Disk Details: NONE / errors=remount-ro,relatime,rw
Processor Details: Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8301025
Security Details:
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling
  tsx_async_abort: Not affected

Result overview (a dash means no result was recorded for that configuration)

GCC 10          GCC 10 - PGO    Test
14.184          -               sqlite: 1
63.62933        63.47507        hpcc: G-HPL
10.49127        10.63663        hpcc: G-Ffte
32.92793        32.67613        hpcc: EP-DGEMM
5.47737         5.52421         hpcc: G-Ptrans
1.79750         1.79549         hpcc: EP-STREAM Triad
0.14278         0.14161         hpcc: G-Rand Access
0.45863         0.45680         hpcc: Rand Ring Latency
3.40678         3.36248         hpcc: Rand Ring Bandwidth
22976.998       23248.086       hpcc: Max Ping Pong Bandwidth
7740.10         7728.25         minife: Small
10443           -               fftw: Stock - 1D FFT Size 32
10512           -               fftw: Stock - 2D FFT Size 32
6687.3          -               fftw: Stock - 2D FFT Size 4096
15396           -               fftw: Float + SSE - 1D FFT Size 32
45404           -               fftw: Float + SSE - 2D FFT Size 32
22667           -               fftw: Float + SSE - 2D FFT Size 4096
70.008          -               mrbayes: Primate Phylogeny Analysis
1878            1896            qmcpack
48055276.3      48511999.7      byte: Dhrystone 2
9234824         9309942         crafty: Elapsed Time
1346651         1533348         tscp: AI Chess Performance
2.32419         2.31720         mkl-dnn: Deconvolution Batch deconv_1d - f32
123.990         123.296         mkl-dnn: Convolution Batch conv_alexnet - f32
194.248         195.321         mkl-dnn: Recurrent Neural Network Training - f32
52.3291         53.4081         mkl-dnn: Convolution Batch conv_googlenet_v3 - f32
938.471         -               ttsiod-renderer: Phong Rendering With Soft-Shadow Mapping
8.567282        9.669028        mt-dgemm: Sustained Floating-Point Rate
4684.299494     4848.093522     himeno: Poisson Pressure Solver
79359613        78501983        stockfish: Total Time
16.469          -               build-imagemagick: Time To Compile
19.865          -               compress-xz: Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9
7.719           -               encode-flac: WAV To FLAC
7.297           -               encode-mp3: WAV To MP3
555.936         556.477         radiance: Serial
171.295         172.353         radiance: SMP Parallel
7180.6          7072.4          openssl: RSA 4096-bit Performance
1947.24         1947.23         askap: tConvolve MT - Gridding
3359.12         3359.70         askap: tConvolve MT - Degridding
5433.8          5361.35         askap: tConvolve OpenMP - Gridding
4096.25         3995.25         askap: tConvolve OpenMP - Degridding
2.505           -               gromacs: Water Benchmark
669039.844251   -               pgbench: Buffer Test - Normal Load - Read Only
676349.711031   -               pgbench: Buffer Test - Heavy Contention - Read Only
57.263          -               sqlite-speedtest: Timed Time - Size 1,000
938039          921081          rocksdb: Rand Fill
145207827       144768694       rocksdb: Rand Read
1019862         1020657         rocksdb: Seq Fill
24588           24588           rocksdb: Rand Fill Sync
4889956         4868440         rocksdb: Read While Writing
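
To compare the two configurations, the raw numbers can be turned into a relative PGO gain. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper name `pgo_gain_percent` and the `RESULTS` dictionary are hypothetical, and the values are a handful of results copied from this export. It assumes the gain is defined as a throughput ratio for "More Is Better" tests and as a time ratio (baseline over PGO) for "Fewer Is Better" tests, so a positive number always means the PGO build did better.

```python
# Hypothetical helper for illustration; not part of the Phoronix Test Suite.
# Each entry: test name -> (GCC 10 result, GCC 10 - PGO result, higher_is_better).
# Values are copied from this result export.
RESULTS = {
    "tscp: AI Chess Performance": (1346651, 1533348, True),
    "mt-dgemm: Sustained Floating-Point Rate": (8.567282, 9.669028, True),
    "himeno: Poisson Pressure Solver": (4684.299494, 4848.093522, True),
    "openssl: RSA 4096-bit Performance": (7180.6, 7072.4, True),
    "qmcpack": (1878, 1896, False),  # Total Execution Time in seconds
}


def pgo_gain_percent(baseline, pgo, higher_is_better):
    """Return the PGO gain in percent; positive means the PGO build was better."""
    if higher_is_better:
        # Throughput-style metric: compare PGO against the baseline directly.
        return (pgo / baseline - 1.0) * 100.0
    # Time-style metric: a smaller PGO time yields a positive gain.
    return (baseline / pgo - 1.0) * 100.0


if __name__ == "__main__":
    for name, (baseline, pgo, higher_is_better) in RESULTS.items():
        gain = pgo_gain_percent(baseline, pgo, higher_is_better)
        print(f"{name}: {gain:+.2f}%")
```

Differences smaller than the reported standard errors (for example the ASKAP tConvolve MT results) are best read as noise rather than as a PGO effect.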

SQLite

Threads / Copies: 1

SQLite 3.30.1, Threads / Copies: 1 (Seconds, Fewer Is Better)
GCC 10: 14.18 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -O2 -lz -lm -ldl -lpthread

HPC Challenge

Test / Class: G-HPL

HPC Challenge 1.5.0, Test / Class: G-HPL (GFLOPS, More Is Better)
GCC 10: 63.63 (SE +/- 0.23, N = 3)
GCC 10 - PGO: 63.48 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: G-Ffte

HPC Challenge 1.5.0, Test / Class: G-Ffte (GFLOPS, More Is Better)
GCC 10: 10.49 (SE +/- 0.05, N = 3)
GCC 10 - PGO: 10.64 (SE +/- 0.11, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: EP-DGEMM

HPC Challenge 1.5.0, Test / Class: EP-DGEMM (GFLOPS, More Is Better)
GCC 10: 32.93 (SE +/- 0.38, N = 3)
GCC 10 - PGO: 32.68 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: G-Ptrans

HPC Challenge 1.5.0, Test / Class: G-Ptrans (GB/s, More Is Better)
GCC 10: 5.47737 (SE +/- 0.00581, N = 3)
GCC 10 - PGO: 5.52421 (SE +/- 0.03180, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: EP-STREAM Triad

HPC Challenge 1.5.0, Test / Class: EP-STREAM Triad (GB/s, More Is Better)
GCC 10: 1.79750 (SE +/- 0.00127, N = 3)
GCC 10 - PGO: 1.79549 (SE +/- 0.00146, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: G-Random Access

HPC Challenge 1.5.0, Test / Class: G-Random Access (GUP/s, More Is Better)
GCC 10: 0.14278 (SE +/- 0.00039, N = 3)
GCC 10 - PGO: 0.14161 (SE +/- 0.00060, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: Random Ring Latency

HPC Challenge 1.5.0, Test / Class: Random Ring Latency (usecs, Fewer Is Better)
GCC 10: 0.45863 (SE +/- 0.00067, N = 3)
GCC 10 - PGO: 0.45680 (SE +/- 0.00047, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: Random Ring Bandwidth

HPC Challenge 1.5.0, Test / Class: Random Ring Bandwidth (GB/s, More Is Better)
GCC 10: 3.40678 (SE +/- 0.01038, N = 3)
GCC 10 - PGO: 3.36248 (SE +/- 0.04431, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: Max Ping Pong Bandwidth

HPC Challenge 1.5.0, Test / Class: Max Ping Pong Bandwidth (MB/s, More Is Better)
GCC 10: 22977.00 (SE +/- 313.29, N = 3)
GCC 10 - PGO: 23248.09 (SE +/- 58.02, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

miniFE

Problem Size: Small

miniFE 2.2, Problem Size: Small (CG Mflops, More Is Better)
GCC 10: 7740.10 (SE +/- 11.29, N = 3)
GCC 10 - PGO: 7728.25 (SE +/- 8.78, N = 3)
1. (CXX) g++ options: -O3 -fopenmp -pthread -lmpi_cxx -lmpi

FFTW

Build: Stock - Size: 1D FFT Size 32

FFTW 3.3.6, Build: Stock - Size: 1D FFT Size 32 (Mflops, More Is Better)
GCC 10: 10443 (SE +/- 16.77, N = 3)
1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm

FFTW

Build: Stock - Size: 2D FFT Size 32

FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 32 (Mflops, More Is Better)
GCC 10: 10512 (SE +/- 11.02, N = 3)
1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm

FFTW

Build: Stock - Size: 2D FFT Size 4096

FFTW 3.3.6, Build: Stock - Size: 2D FFT Size 4096 (Mflops, More Is Better)
GCC 10: 6687.3 (SE +/- 7.80, N = 3)
1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm

FFTW

Build: Float + SSE - Size: 1D FFT Size 32

FFTW 3.3.6, Build: Float + SSE - Size: 1D FFT Size 32 (Mflops, More Is Better)
GCC 10: 15396 (SE +/- 15.37, N = 3)
1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm

FFTW

Build: Float + SSE - Size: 2D FFT Size 32

FFTW 3.3.6, Build: Float + SSE - Size: 2D FFT Size 32 (Mflops, More Is Better)
GCC 10: 45404 (SE +/- 56.20, N = 3)
1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm

FFTW

Build: Float + SSE - Size: 2D FFT Size 4096

FFTW 3.3.6, Build: Float + SSE - Size: 2D FFT Size 4096 (Mflops, More Is Better)
GCC 10: 22667 (SE +/- 285.77, N = 3)
1. (CC) gcc options: -pthread -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math -lm

Timed MrBayes Analysis

Primate Phylogeny Analysis

Timed MrBayes Analysis 3.2.7, Primate Phylogeny Analysis (Seconds, Fewer Is Better)
GCC 10: 70.01 (SE +/- 0.25, N = 3)
1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -madx -mabm -O3 -std=c99 -pedantic -lm

QMCPACK

QMCPACK 3.8 (Total Execution Time - Seconds, Fewer Is Better)
GCC 10: 1878
GCC 10 - PGO: 1896
Per-run flag reported on the graph: -fprofile-correction
1. (CXX) g++ options: -fopenmp -fomit-frame-pointer -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -march=native -O3 -ffast-math -lm

BYTE Unix Benchmark

Computational Test: Dhrystone 2

BYTE Unix Benchmark 3.6, Computational Test: Dhrystone 2 (LPS, More Is Better)
GCC 10: 48055276.3 (SE +/- 550382.53, N = 3)
GCC 10 - PGO: 48511999.7 (SE +/- 627061.00, N = 3)

Crafty

Elapsed Time

Crafty 25.2, Elapsed Time (Nodes Per Second, More Is Better)
GCC 10: 9234824 (SE +/- 7954.66, N = 3)
GCC 10 - PGO: 9309942 (SE +/- 73907.13, N = 3)
1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm

TSCP

AI Chess Performance

TSCP 1.81, AI Chess Performance (Nodes Per Second, More Is Better)
GCC 10: 1346651 (SE +/- 1472.68, N = 5)
GCC 10 - PGO: 1533348 (SE +/- 852.40, N = 5)
Per-run flag reported on the graph: -fprofile-correction
1. (CC) gcc options: -O3 -march=native

MKL-DNN DNNL

Harness: Deconvolution Batch deconv_1d - Data Type: f32

MKL-DNN DNNL 1.1, Harness: Deconvolution Batch deconv_1d - Data Type: f32 (ms, Fewer Is Better)
GCC 10: 2.32419 (SE +/- 0.00388, N = 3; MIN: 2.26)
GCC 10 - PGO: 2.31720 (SE +/- 0.00905, N = 3; -lm - MIN: 2.24)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Convolution Batch conv_alexnet - Data Type: f32

MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_alexnet - Data Type: f32 (ms, Fewer Is Better)
GCC 10: 123.99 (SE +/- 1.44, N = 3; MIN: 121.92)
GCC 10 - PGO: 123.30 (SE +/- 0.70, N = 3; -lm - MIN: 121.28)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Recurrent Neural Network Training - Data Type: f32

MKL-DNN DNNL 1.1, Harness: Recurrent Neural Network Training - Data Type: f32 (ms, Fewer Is Better)
GCC 10: 194.25 (SE +/- 0.35, N = 3; MIN: 192.53)
GCC 10 - PGO: 195.32 (SE +/- 0.82, N = 3; -lm - MIN: 192.26)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32

MKL-DNN DNNL 1.1, Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32 (ms, Fewer Is Better)
GCC 10: 52.33 (SE +/- 0.14, N = 3; MIN: 51.48)
GCC 10 - PGO: 53.41 (SE +/- 0.66, N = 3; -lm - MIN: 51.6)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

TTSIOD 3D Renderer

Phong Rendering With Soft-Shadow Mapping

TTSIOD 3D Renderer 2.3b, Phong Rendering With Soft-Shadow Mapping (FPS, More Is Better)
GCC 10: 938.47 (SE +/- 1.21, N = 3)
1. (CXX) g++ options: -O3 -fomit-frame-pointer -ffast-math -mtune=native -flto -msse -mrecip -mfpmath=sse -msse2 -mssse3 -lSDL -fopenmp -fwhole-program -lstdc++

ACES DGEMM

Sustained Floating-Point Rate

ACES DGEMM 1.0, Sustained Floating-Point Rate (GFLOP/s, More Is Better)
GCC 10: 8.567282 (SE +/- 0.158518, N = 12)
GCC 10 - PGO: 9.669028 (SE +/- 0.073545, N = 3)
1. (CC) gcc options: -O3 -march=native -fopenmp

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, More Is Better)
GCC 10: 4684.30 (SE +/- 55.99, N = 5)
GCC 10 - PGO: 4848.09 (SE +/- 30.05, N = 3)
1. (CC) gcc options: -O3 -mavx2

Stockfish

Total Time

Stockfish 9, Total Time (Nodes Per Second, More Is Better)
GCC 10: 79359613 (SE +/- 526550.53, N = 3)
GCC 10 - PGO: 78501983 (SE +/- 536300.93, N = 3)
1. (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++11 -pedantic -O3 -msse -msse3 -mpopcnt -flto

Timed ImageMagick Compilation

Time To Compile

Timed ImageMagick Compilation 6.9.0, Time To Compile (Seconds, Fewer Is Better)
GCC 10: 16.47 (SE +/- 0.08, N = 3)

XZ Compression

Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9

XZ Compression 5.2.4, Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9 (Seconds, Fewer Is Better)
GCC 10: 19.87 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -pthread -fvisibility=hidden -O2

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.2, WAV To FLAC (Seconds, Fewer Is Better)
GCC 10: 7.719 (SE +/- 0.009, N = 5)
1. (CXX) g++ options: -O2 -fvisibility=hidden -logg -lm

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.100, WAV To MP3 (Seconds, Fewer Is Better)
GCC 10: 7.297 (SE +/- 0.002, N = 3)
1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -lm

Radiance Benchmark

Test: Serial

Radiance Benchmark 5.0, Test: Serial (Seconds, Fewer Is Better)
GCC 10: 555.94
GCC 10 - PGO: 556.48

Radiance Benchmark

Test: SMP Parallel

Radiance Benchmark 5.0, Test: SMP Parallel (Seconds, Fewer Is Better)
GCC 10: 171.30
GCC 10 - PGO: 172.35

OpenSSL

RSA 4096-bit Performance

OpenSSL 1.1.1, RSA 4096-bit Performance (Signs Per Second, More Is Better)
GCC 10: 7180.6 (SE +/- 21.70, N = 3)
GCC 10 - PGO: 7072.4 (SE +/- 20.62, N = 3)
Per-run flags reported on the graph: -O3 -lssl
1. (CC) gcc options: -pthread -m64 -lcrypto -ldl

ASKAP

Test: tConvolve MT - Gridding

ASKAP 2018-11-10, Test: tConvolve MT - Gridding (Million Grid Points Per Second, More Is Better)
GCC 10: 1947.24 (SE +/- 3.33, N = 3)
GCC 10 - PGO: 1947.23 (SE +/- 2.66, N = 3)
1. (CXX) g++ options: -lpthread

ASKAP

Test: tConvolve MT - Degridding

ASKAP 2018-11-10, Test: tConvolve MT - Degridding (Million Grid Points Per Second, More Is Better)
GCC 10: 3359.12 (SE +/- 3.58, N = 3)
GCC 10 - PGO: 3359.70 (SE +/- 2.69, N = 3)
1. (CXX) g++ options: -lpthread

ASKAP

Test: tConvolve OpenMP - Gridding

ASKAP 2018-11-10, Test: tConvolve OpenMP - Gridding (Million Grid Points Per Second, More Is Better)
GCC 10: 5433.80 (SE +/- 0.00, N = 3)
GCC 10 - PGO: 5361.35 (SE +/- 36.23, N = 3)
1. (CXX) g++ options: -lpthread

ASKAP

Test: tConvolve OpenMP - Degridding

ASKAP 2018-11-10, Test: tConvolve OpenMP - Degridding (Million Grid Points Per Second, More Is Better)
GCC 10: 4096.25 (SE +/- 0.00, N = 3)
GCC 10 - PGO: 3995.25 (SE +/- 53.24, N = 3)
1. (CXX) g++ options: -lpthread

GROMACS

Water Benchmark

GROMACS 2019.4, Water Benchmark (Ns Per Day, More Is Better)
GCC 10: 2.505 (SE +/- 0.002, N = 3)
1. (CXX) g++ options: -mavx2 -mfma -std=c++11 -O3 -funroll-all-loops -pthread -lrt -lpthread -lm

PostgreSQL pgbench

Scaling: Buffer Test - Test: Normal Load - Mode: Read Only

PostgreSQL pgbench 12.0, Scaling: Buffer Test - Test: Normal Load - Mode: Read Only (TPS, More Is Better)
GCC 10: 669039.84 (SE +/- 622.09, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm

PostgreSQL pgbench

Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Only

PostgreSQL pgbench 12.0, Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Only (TPS, More Is Better)
GCC 10: 676349.71 (SE +/- 4887.19, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lpthread -lrt -lcrypt -ldl -lm

SQLite Speedtest

Timed Time - Size 1,000

SQLite Speedtest 3.30, Timed Time - Size 1,000 (Seconds, Fewer Is Better)
GCC 10: 57.26 (SE +/- 0.13, N = 3)
1. (CC) gcc options: -O2 -ldl -lz -lpthread

Facebook RocksDB

Test: Random Fill

Facebook RocksDB 6.3.6, Test: Random Fill (Op/s, More Is Better)
GCC 10: 938039 (SE +/- 16043.44, N = 3)
GCC 10 - PGO: 921081 (SE +/- 1253.44, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

Facebook RocksDB

Test: Random Read

Facebook RocksDB 6.3.6, Test: Random Read (Op/s, More Is Better)
GCC 10: 145207827 (SE +/- 1800355.11, N = 3)
GCC 10 - PGO: 144768694 (SE +/- 727018.18, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

Facebook RocksDB

Test: Sequential Fill

Facebook RocksDB 6.3.6, Test: Sequential Fill (Op/s, More Is Better)
GCC 10: 1019862 (SE +/- 3135.06, N = 3)
GCC 10 - PGO: 1020657 (SE +/- 3754.87, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

Facebook RocksDB

Test: Random Fill Sync

Facebook RocksDB 6.3.6, Test: Random Fill Sync (Op/s, More Is Better)
GCC 10: 24588 (SE +/- 19.92, N = 3)
GCC 10 - PGO: 24588 (SE +/- 28.62, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

Facebook RocksDB

Test: Read While Writing

Facebook RocksDB 6.3.6, Test: Read While Writing (Op/s, More Is Better)
GCC 10: 4889956 (SE +/- 20082.88, N = 3)
GCC 10 - PGO: 4868440 (SE +/- 26586.51, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread


Phoronix Test Suite v10.8.4