GCC 10 AMD Threadripper 3960X PGO Optimization

AMD Ryzen Threadripper 3960X 24-Core testing with an MSI Creator TRX40 (MS-7C59) v1.0 (1.12N1 BIOS) and Gigabyte AMD Radeon 540/540X/550/550X / RX 540X/550/550X 2GB on Ubuntu 19.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/1912220-PTS-GCC10AMD97&grs&sor.

Result identifiers: GCC 10, Sabrent Rocket 4.0 1TB

Processor: AMD Ryzen Threadripper 3960X 24-Core @ 3.80GHz (24 Cores / 48 Threads)
Motherboard: MSI Creator TRX40 (MS-7C59) v1.0 (1.12N1 BIOS)
Chipset: AMD Starship/Matisse
Memory: 32768MB
Disk: 1000GB Sabrent Rocket 4.0 1TB
Graphics: Gigabyte AMD Radeon 540/540X/550/550X / RX 540X/550/550X 2GB (1206/1750MHz)
Audio: AMD Baffin HDMI/DP
Monitor: ASUS VP28U
Network: Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Device 2723
OS: Ubuntu 19.10
Kernel: 5.4.0-nvme-hwmon (x86_64)
Desktop: GNOME Shell 3.34.1
Display Server: X Server 1.20.5
Display Driver: modesetting 1.20.5
OpenGL: 4.5 Mesa 19.2.1 (LLVM 9.0.0)
Compiler: GCC 10.0.0 20191208
File-System: ext4
Screen Resolution: 3840x2160

Compiler Details: --disable-multilib --enable-checking=release
Disk Details: NONE / errors=remount-ro,relatime,rw
Processor Details: Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8301025
Security Details:
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling
  tsx_async_abort: Not affected


FFTW

Build: Float + SSE - Size: 2D FFT Size 4096

FFTW 3.3.6 - Mflops, More Is Better
GCC 10: 22667.0
Sabrent Rocket 4.0 1TB: 3767.3
SE +/- 285.77, N = 3
Additional options: -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math
1. (CC) gcc options: -pthread -lm
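The "SE +/- ..., N = ..." figures reported with each result read as a standard error over N runs. The export does not state how the Phoronix Test Suite computes it, so the following is only a minimal sketch assuming the conventional standard error of the mean (sample standard deviation divided by the square root of N). The function name `standard_error` and the run values are hypothetical and are not taken from this result file.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-run values, for illustration only (NOT from this result file).
runs = [100.0, 102.0, 104.0]
mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.1f} (SE +/- {se:.2f}, N = {len(runs)})")
```

A small SE relative to the mean suggests the runs were consistent; a large one suggests the reported average should be read with more caution.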

XZ Compression

Compressing ubuntu-16.04.3-server-i386.img, Compression Level 9

XZ Compression 5.2.4 - Seconds, Fewer Is Better
GCC 10: 19.87
Sabrent Rocket 4.0 1TB: 799.11
SE +/- 0.02, N = 3
1. (CC) gcc options: -pthread -fvisibility=hidden
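Because some tests here are "More Is Better" and others "Fewer Is Better", comparing the two results requires flipping the ratio depending on the metric direction. The sketch below shows one way to do that, using the XZ Compression and FFTW (Float + SSE, 2D FFT Size 4096) values from this page. The helper name `relative_performance` is an illustrative assumption, not part of the Phoronix Test Suite.

```python
def relative_performance(a, b, more_is_better):
    """Return how many times better result `a` is than result `b`.

    For "More Is Better" metrics this is a / b; for "Fewer Is Better"
    metrics (such as seconds) the ratio is inverted to b / a.
    """
    if a <= 0 or b <= 0:
        raise ValueError("results must be positive")
    return a / b if more_is_better else b / a


# XZ Compression, seconds (fewer is better): GCC 10 vs. Sabrent Rocket 4.0 1TB
xz_ratio = relative_performance(19.87, 799.11, more_is_better=False)

# FFTW Float + SSE, 2D FFT Size 4096, Mflops (more is better)
fftw_ratio = relative_performance(22667.0, 3767.3, more_is_better=True)

print(f"XZ:   GCC 10 is {xz_ratio:.1f}x the other result")
print(f"FFTW: GCC 10 is {fftw_ratio:.1f}x the other result")
```

A ratio above 1 means the first result is better in that test; a ratio below 1 means the second is.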

FFTW

Build: Float + SSE - Size: 1D FFT Size 32

FFTW 3.3.6 - Mflops, More Is Better
GCC 10: 15396.0
Sabrent Rocket 4.0 1TB: 3309.7
SE +/- 15.37, N = 3
Additional options: -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math
1. (CC) gcc options: -pthread -lm

Stockfish

Total Time

Stockfish 9 - Nodes Per Second, More Is Better
GCC 10: 79359613
Sabrent Rocket 4.0 1TB: 2442535
SE +/- 526550.53, N = 3
1. (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++11 -pedantic -O3 -msse -msse3 -mpopcnt -flto

QMCPACK

QMCPACK 3.8 - Total Execution Time - Seconds, Fewer Is Better
GCC 10: 1878.0
Sabrent Rocket 4.0 1TB: 7137.2
1. (CXX) g++ options: -fopenmp -fomit-frame-pointer -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -march=native -O3 -ffast-math -lm

FFTW

Build: Stock - Size: 1D FFT Size 32

FFTW 3.3.6 - Mflops, More Is Better
GCC 10: 10443.0
Sabrent Rocket 4.0 1TB: 3436.6
SE +/- 16.77, N = 3
Additional options: -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math
1. (CC) gcc options: -pthread -lm

Timed ImageMagick Compilation

Time To Compile

Timed ImageMagick Compilation 6.9.0 - Seconds, Fewer Is Better
Sabrent Rocket 4.0 1TB: 5.799
GCC 10: 16.469
SE +/- 0.078, N = 3

FFTW

Build: Stock - Size: 2D FFT Size 4096

FFTW 3.3.6 - Mflops, More Is Better
GCC 10: 6687.3
Sabrent Rocket 4.0 1TB: 2407.0
SE +/- 7.80, N = 3
Additional options: -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math
1. (CC) gcc options: -pthread -lm

FFTW

Build: Stock - Size: 2D FFT Size 32

FFTW 3.3.6 - Mflops, More Is Better
GCC 10: 10512.0
Sabrent Rocket 4.0 1TB: 3862.4
SE +/- 11.02, N = 3
Additional options: -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math
1. (CC) gcc options: -pthread -lm

PostgreSQL pgbench

Scaling: Buffer Test - Test: Heavy Contention - Mode: Read Only

PostgreSQL pgbench 12.0 - TPS, More Is Better
GCC 10: 676349.71
Sabrent Rocket 4.0 1TB: 260493.71
SE +/- 4887.19, N = 3
Additional options: -O2 -lpq
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -lpgcommon -lpgport -lpthread -lrt -lcrypt -ldl -lm

PostgreSQL pgbench

Scaling: Buffer Test - Test: Normal Load - Mode: Read Only

PostgreSQL pgbench 12.0 - TPS, More Is Better
GCC 10: 669039.84
Sabrent Rocket 4.0 1TB: 264129.49
SE +/- 622.09, N = 3
Additional options: -O2 -lpq
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -lpgcommon -lpgport -lpthread -lrt -lcrypt -ldl -lm

SQLite Speedtest

Timed Time - Size 1,000

SQLite Speedtest 3.30 - Seconds, Fewer Is Better
GCC 10: 57.26
Sabrent Rocket 4.0 1TB: 130.22
SE +/- 0.13, N = 3
Additional options: -O2
1. (CC) gcc options: -ldl -lz -lpthread

BYTE Unix Benchmark

Computational Test: Dhrystone 2

BYTE Unix Benchmark 3.6 - LPS, More Is Better
GCC 10: 48055276.3
Sabrent Rocket 4.0 1TB: 21645648.2
SE +/- 550382.53, N = 3

TTSIOD 3D Renderer

Phong Rendering With Soft-Shadow Mapping

TTSIOD 3D Renderer 2.3b - FPS, More Is Better
GCC 10: 938.47100
Sabrent Rocket 4.0 1TB: 5.01883
SE +/- 1.20634, N = 3
Additional options: -O3 -fopenmp -fwhole-program
1. (CXX) g++ options: -fomit-frame-pointer -ffast-math -mtune=native -flto -msse -mrecip -mfpmath=sse -msse2 -mssse3 -lSDL -lstdc++

Zstd Compression

Compressing ubuntu-16.04.3-server-i386.img, Compression Level 19

Zstd Compression 1.3.4 - Seconds, Fewer Is Better
GCC 10: 10.26
Sabrent Rocket 4.0 1TB: 1595.10
SE +/- 0.03, N = 3
1. (CC) gcc options: -pthread -lz -llzma

FFTW

Build: Float + SSE - Size: 2D FFT Size 32

FFTW 3.3.6 - Mflops, More Is Better
GCC 10: 45404.0
Sabrent Rocket 4.0 1TB: 3708.4
SE +/- 56.20, N = 3
Additional options: -O3 -fomit-frame-pointer -mtune=native -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math
1. (CC) gcc options: -pthread -lm

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.100 - Seconds, Fewer Is Better
GCC 10: 7.297
Sabrent Rocket 4.0 1TB: 9.243
SE +/- 0.002, N = 3
1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -lm

GROMACS

Water Benchmark

GROMACS 2019.4 - Ns Per Day, More Is Better
GCC 10: 2.505
Sabrent Rocket 4.0 1TB: 1.995
SE +/- 0.002, N = 3
1. (CXX) g++ options: -mavx2 -mfma -std=c++11 -O3 -funroll-all-loops -pthread -lrt -lpthread -lm

Timed MrBayes Analysis

Primate Phylogeny Analysis

Timed MrBayes Analysis 3.2.7 - Seconds, Fewer Is Better
GCC 10: 70.01
Sabrent Rocket 4.0 1TB: 83.27
SE +/- 0.25, N = 3
1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -madx -mabm -O3 -std=c99 -pedantic -lm

TSCP

AI Chess Performance

TSCP 1.81 - Nodes Per Second, More Is Better
GCC 10: 1346651
Sabrent Rocket 4.0 1TB: 1189585
SE +/- 1472.68, N = 5
1. (CC) gcc options: -O3 -march=native

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.2 - Seconds, Fewer Is Better
GCC 10: 7.719
Sabrent Rocket 4.0 1TB: 8.016
SE +/- 0.009, N = 5
Additional options: -O2
1. (CXX) g++ options: -fvisibility=hidden -logg -lm

MKL-DNN DNNL

Harness: Convolution Batch conv_alexnet - Data Type: f32

MKL-DNN DNNL 1.1 - ms, Fewer Is Better
Sabrent Rocket 4.0 1TB: 120.15 (-lm - MIN: 119.49)
GCC 10: 123.99 (MIN: 121.92)
SE +/- 1.44, N = 3
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0 - MFLOPS, More Is Better
Sabrent Rocket 4.0 1TB: 4820.05
GCC 10: 4684.30
SE +/- 55.99, N = 5
1. (CC) gcc options: -O3 -mavx2

SQLite

Threads / Copies: 1

SQLite 3.30.1 - Seconds, Fewer Is Better
GCC 10: 14.18
Sabrent Rocket 4.0 1TB: 14.60
SE +/- 0.01, N = 3
Additional options: -O2
1. (CC) gcc options: -lz -lm -ldl -lpthread

OpenSSL

RSA 4096-bit Performance

OpenSSL 1.1.1 - Signs Per Second, More Is Better
GCC 10: 7180.6
Sabrent Rocket 4.0 1TB: 7060.1
SE +/- 21.70, N = 3
Additional options: -O3 -lssl
1. (CC) gcc options: -pthread -m64 -lcrypto -ldl

Facebook RocksDB

Test: Sequential Fill

Facebook RocksDB 6.3.6 - Op/s, More Is Better
Sabrent Rocket 4.0 1TB: 1035308
GCC 10: 1019862
SE +/- 3135.06, N = 3
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

Facebook RocksDB

Test: Random Read

Facebook RocksDB 6.3.6 - Op/s, More Is Better
GCC 10: 145207827
Sabrent Rocket 4.0 1TB: 143335505
SE +/- 1800355.11, N = 3
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

Radiance Benchmark

Test: SMP Parallel

Radiance Benchmark 5.0 - Seconds, Fewer Is Better
Sabrent Rocket 4.0 1TB: 169.59
GCC 10: 171.30

Facebook RocksDB

Test: Read While Writing

Facebook RocksDB 6.3.6 - Op/s, More Is Better
Sabrent Rocket 4.0 1TB: 4937048
GCC 10: 4889956
SE +/- 20082.88, N = 3
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

MKL-DNN DNNL

Harness: Deconvolution Batch deconv_1d - Data Type: f32

MKL-DNN DNNL 1.1 - ms, Fewer Is Better
Sabrent Rocket 4.0 1TB: 2.30704 (-lm - MIN: 2.25)
GCC 10: 2.32419 (MIN: 2.26)
SE +/- 0.00388, N = 3
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

Crafty

Elapsed Time

Crafty 25.2 - Nodes Per Second, More Is Better
Sabrent Rocket 4.0 1TB: 9301788
GCC 10: 9234824
SE +/- 7954.66, N = 3
1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm

Facebook RocksDB

Test: Random Fill

Facebook RocksDB 6.3.6 - Op/s, More Is Better
GCC 10: 938039
Sabrent Rocket 4.0 1TB: 932120
SE +/- 16043.44, N = 3
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

Facebook RocksDB

Test: Random Fill Sync

Facebook RocksDB 6.3.6 - Op/s, More Is Better
GCC 10: 24588
Sabrent Rocket 4.0 1TB: 24460
SE +/- 19.92, N = 3
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fno-builtin-memcmp -fno-rtti -rdynamic -lpthread

ASKAP

Test: tConvolve MT - Gridding

ASKAP 2018-11-10 - Million Grid Points Per Second, More Is Better
GCC 10: 1947.24
Sabrent Rocket 4.0 1TB: 1937.58
SE +/- 3.33, N = 3
1. (CXX) g++ options: -lpthread

Radiance Benchmark

Test: Serial

Radiance Benchmark 5.0 - Seconds, Fewer Is Better
Sabrent Rocket 4.0 1TB: 554.96
GCC 10: 555.94

miniFE

Problem Size: Small

miniFE 2.2 - CG Mflops, More Is Better
GCC 10: 7740.10
Sabrent Rocket 4.0 1TB: 7737.78
SE +/- 11.29, N = 3
1. (CXX) g++ options: -O3 -fopenmp -pthread -lmpi_cxx -lmpi

ASKAP

Test: tConvolve MT - Degridding

ASKAP 2018-11-10 - Million Grid Points Per Second, More Is Better
Sabrent Rocket 4.0 1TB: 3359.70
GCC 10: 3359.12
SE +/- 3.58, N = 3
1. (CXX) g++ options: -lpthread

MKL-DNN DNNL

Harness: Recurrent Neural Network Training - Data Type: f32

MKL-DNN DNNL 1.1 - ms, Fewer Is Better
Sabrent Rocket 4.0 1TB: 194.23 (-lm - MIN: 193.18)
GCC 10: 194.25 (MIN: 192.53)
SE +/- 0.35, N = 3
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32

MKL-DNN DNNL 1.1 - ms, Fewer Is Better
GCC 10: 52.33 (MIN: 51.48)
Sabrent Rocket 4.0 1TB: 52.33 (-lm - MIN: 51.67)
SE +/- 0.14, N = 3
1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

ASKAP

Test: tConvolve OpenMP - Degridding

ASKAP 2018-11-10 - Million Grid Points Per Second, More Is Better
Sabrent Rocket 4.0 1TB: 4096.25
GCC 10: 4096.25
SE +/- 0.00, N = 3
1. (CXX) g++ options: -lpthread

ASKAP

Test: tConvolve OpenMP - Gridding

ASKAP 2018-11-10 - Million Grid Points Per Second, More Is Better
Sabrent Rocket 4.0 1TB: 5433.8
GCC 10: 5433.8
SE +/- 0.00, N = 3
1. (CXX) g++ options: -lpthread

BYTE Unix Benchmark

Computational Test: Floating-Point Arithmetic

BYTE Unix Benchmark 3.6 - LPS, More Is Better
Sabrent Rocket 4.0 1TB: 1
GCC 10: 1

BYTE Unix Benchmark

Computational Test: Register Arithmetic

BYTE Unix Benchmark 3.6 - LPS, More Is Better
Sabrent Rocket 4.0 1TB: 1
GCC 10: 1

BYTE Unix Benchmark

Computational Test: Integer Arithmetic

BYTE Unix Benchmark 3.6 - LPS, More Is Better
Sabrent Rocket 4.0 1TB: 1
GCC 10: 1

HPC Challenge

Test / Class: Max Ping Pong Bandwidth

HPC Challenge 1.5.0 - MB/s, More Is Better
GCC 10: 22977.00 (SE +/- 313.29, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: Random Ring Bandwidth

HPC Challenge 1.5.0 - GB/s, More Is Better
GCC 10: 3.40678 (SE +/- 0.01038, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: Random Ring Latency

HPC Challenge 1.5.0 - usecs, Fewer Is Better
GCC 10: 0.45863 (SE +/- 0.00067, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: G-Random Access

HPC Challenge 1.5.0 - GUP/s, More Is Better
GCC 10: 0.14278 (SE +/- 0.00039, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: EP-STREAM Triad

HPC Challenge 1.5.0 - GB/s, More Is Better
GCC 10: 1.79750 (SE +/- 0.00127, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: G-Ptrans

HPC Challenge 1.5.0 - GB/s, More Is Better
GCC 10: 5.47737 (SE +/- 0.00581, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: EP-DGEMM

HPC Challenge 1.5.0 - GFLOPS, More Is Better
GCC 10: 32.93 (SE +/- 0.38, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: G-Ffte

HPC Challenge 1.5.0 - GFLOP/s, More Is Better
GCC 10: 10.49 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: G-Ffte

HPC Challenge 1.5.0 - GFLOPS, More Is Better
GCC 10: 10.49 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

HPC Challenge

Test / Class: G-HPL

HPC Challenge 1.5.0 - GFLOPS, More Is Better
GCC 10: 63.63 (SE +/- 0.23, N = 3)
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 3.1.3

ACES DGEMM

Sustained Floating-Point Rate

ACES DGEMM 1.0 - GFLOP/s, More Is Better
GCC 10: 8.567282
Sabrent Rocket 4.0 1TB: 0.078489
SE +/- 0.158518, N = 12
1. (CC) gcc options: -O3 -march=native -fopenmp


Phoronix Test Suite v10.8.5