mpitest

AMD Ryzen Threadripper PRO 3995WX 64-Cores testing with a GIGABYTE WRX80-SU8 N/A (WRX80SU8-F2 BIOS) and Gigabyte NVIDIA GeForce RTX 3080 Ti 12GB on Ubuntu 24.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2501206-NE-MPITEST7573&grt.

AMD Ryzen Threadripper PRO 3995WX 64-Cores

Processor: AMD Ryzen Threadripper PRO 3995WX 64-Cores @ 2.70GHz (64 Cores / 128 Threads)
Motherboard: GIGABYTE WRX80-SU8 N/A (WRX80SU8-F2 BIOS)
Chipset: AMD Starship/Matisse
Memory: 4 x 16GB DDR4-2133MT/s CM4X16GC3200C16K2E
Disk: 1000GB CT1000P3SSD8 + 1000GB KINGSTON SNV3S1000G
Graphics: Gigabyte NVIDIA GeForce RTX 3080 Ti 12GB
Audio: AMD Starship/Matisse
Monitor: 2 x LG TV
Network: 2 x Realtek RTL8111/8168/8211/8411 + 2 x Intel I210 + 2 x Intel X550 + Intel Wi-Fi 6 AX200
OS: Ubuntu 24.04
Kernel: 6.8.0-51-generic (x86_64)
Desktop: GNOME Shell 46.0
Display Server: X Server 1.21.1.11
Display Driver: NVIDIA
OpenCL: OpenCL 3.0 CUDA 12.4.131
Compiler: GCC 13.3.0 + CUDA 12.6
File-System: ext4
Screen Resolution: 1920x1080

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-fG75Ri/gcc-13-13.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-fG75Ri/gcc-13-13.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x830107c
- Python 3.12.4
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Test                                              AMD Ryzen Threadripper PRO 3995WX 64-Cores
askap: tConvolve MPI - Degridding                 11744.39
askap: tConvolve MPI - Gridding                   10871.70
gpaw: Carbon Nanotube                             136.787
gromacs: MPI CPU - water_GMX50_bare               3.171
gromacs: NVIDIA CUDA GPU - water_GMX50_bare       23.801
hpcg: 104 104 104 - 60                            5.59916
hpcg: 104 104 104 - 1800                          6.33226
hpcc: G-HPL                                       152.53333
hpcc: G-Ffte                                      11.19347
hpcc: EP-DGEMM                                    7.25358
hpcc: G-Ptrans                                    2.54122
hpcc: EP-STREAM Triad                             0.56827
hpcc: G-Rand Access                               0.19271
hpcc: Rand Ring Latency                           1.14125
hpcc: Rand Ring Bandwidth                         0.71653
hpcc: Max Ping Pong Bandwidth                     9246.266
intel-mpi: IMB-P2P PingPong                       25366668
intel-mpi: IMB-MPI1 Exchange                      2490.11
intel-mpi: IMB-MPI1 Exchange                      557.43
intel-mpi: IMB-MPI1 PingPong                      2411.51
intel-mpi: IMB-MPI1 Sendrecv                      1859.20
intel-mpi: IMB-MPI1 Sendrecv                      292.17
lammps: 20k Atoms                                 24.598
lammps: Rhodopsin Protein                         21.838
minife: Small                                     6809.84
mocassin: Gas HII40                               19.249
mocassin: Dust 2D tau100.0                        145.597
npb: BT.C                                         49906.82
npb: CG.C                                         5819.63
npb: EP.C                                         5140.81
npb: EP.D                                         5322.84
npb: FT.C                                         21478.66
npb: IS.D                                         853.88
npb: LU.C                                         49703.3
npb: MG.C                                         19608.97
npb: SP.B                                         30075.20
npb: SP.C                                         16955.27
pennant: sedovbig                                 17.35116
pennant: leblancbig                               5.925810
mrbayes: Primate Phylogeny Analysis               131.461
incompact3d: input.i3d 129 Cells Per Direction    13.4860032
incompact3d: input.i3d 193 Cells Per Direction    59.3955548

ASKAP

Test: tConvolve MPI - Degridding

ASKAP 1.0 (Mpix/sec, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 11744.39 (SE +/- 430.92, N = 12)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve MPI - Gridding

ASKAP 1.0 (Mpix/sec, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 10871.70 (SE +/- 392.31, N = 12)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

GPAW

Input: Carbon Nanotube

GPAW 23.6 (Seconds, Fewer Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 136.79 (SE +/- 0.41, N = 3)
1. (CC) gcc options: -pthread -shared -lxc -lblas -lmpi -fno-strict-overflow -O2 -fPIC -isystem -UNDEBUG -std=c99

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2024 (Ns Per Day, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 3.171 (SE +/- 0.036, N = 3)
1. (CXX) g++ options: -O3 -lm

GROMACS

Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare

GROMACS 2024 (Ns Per Day, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 23.80 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -lm

High Performance Conjugate Gradient

X Y Z: 104 104 104 - RT: 60

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 5.59916 (SE +/- 0.17243, N = 9)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

High Performance Conjugate Gradient

X Y Z: 104 104 104 - RT: 1800

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 6.33226 (SE +/- 0.00769, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

HPC Challenge

Test / Class: G-HPL

HPC Challenge 1.5.0 (GFLOPS, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 152.53 (SE +/- 0.48, N = 3)
1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
2. OpenBLAS + Open MPI 4.1.6

HPC Challenge

Test / Class: G-Ffte

HPC Challenge 1.5.0 (GFLOPS, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 11.19 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
2. OpenBLAS + Open MPI 4.1.6

HPC Challenge

Test / Class: EP-DGEMM

HPC Challenge 1.5.0 (GFLOPS, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 7.25358 (SE +/- 0.08554, N = 3)
1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
2. OpenBLAS + Open MPI 4.1.6

HPC Challenge

Test / Class: G-Ptrans

HPC Challenge 1.5.0 (GB/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 2.54122 (SE +/- 0.01461, N = 3)
1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
2. OpenBLAS + Open MPI 4.1.6

HPC Challenge

Test / Class: EP-STREAM Triad

HPC Challenge 1.5.0 (GB/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 0.56827 (SE +/- 0.00036, N = 3)
1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
2. OpenBLAS + Open MPI 4.1.6

HPC Challenge

Test / Class: G-Random Access

HPC Challenge 1.5.0 (GUP/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 0.19271 (SE +/- 0.00082, N = 3)
1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
2. OpenBLAS + Open MPI 4.1.6

HPC Challenge

Test / Class: Random Ring Latency

HPC Challenge 1.5.0 (usecs, Fewer Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 1.14125 (SE +/- 0.00848, N = 3)
1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
2. OpenBLAS + Open MPI 4.1.6

HPC Challenge

Test / Class: Random Ring Bandwidth

HPC Challenge 1.5.0 (GB/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 0.71653 (SE +/- 0.01836, N = 3)
1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
2. OpenBLAS + Open MPI 4.1.6

HPC Challenge

Test / Class: Max Ping Pong Bandwidth

HPC Challenge 1.5.0 (MB/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 9246.27 (SE +/- 9.69, N = 3)
1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
2. OpenBLAS + Open MPI 4.1.6

Intel MPI Benchmarks

Test: IMB-P2P PingPong

Intel MPI Benchmarks 2019.3 (Average Msg/sec, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 25366668 (SE +/- 223826.40, N = 3; MIN: 2089 / MAX: 67855108)
1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 Exchange

Intel MPI Benchmarks 2019.3 (Average Mbytes/sec, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 2490.11 (SE +/- 29.82, N = 3; MAX: 9884.84)
1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 Exchange

Intel MPI Benchmarks 2019.3 (Average usec, Fewer Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 557.43 (SE +/- 8.21, N = 3; MIN: 0.97 / MAX: 22328.95)
1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 PingPong

Intel MPI Benchmarks 2019.3 (Average Mbytes/sec, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 2411.51 (SE +/- 10.55, N = 3; MIN: 4.38 / MAX: 7065.21)
1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 Sendrecv

Intel MPI Benchmarks 2019.3 (Average Mbytes/sec, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 1859.20 (SE +/- 6.66, N = 3; MAX: 6882.76)
1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi

Intel MPI Benchmarks

Test: IMB-MPI1 Sendrecv

Intel MPI Benchmarks 2019.3 (Average usec, Fewer Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 292.17 (SE +/- 2.93, N = 3; MIN: 0.5 / MAX: 9957.71)
1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi

LAMMPS Molecular Dynamics Simulator

Model: 20k Atoms

LAMMPS Molecular Dynamics Simulator 23Jun2022 (ns/day, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 24.60 (SE +/- 0.32, N = 3)
1. (CXX) g++ options: -O3 -lm -ldl

LAMMPS Molecular Dynamics Simulator

Model: Rhodopsin Protein

LAMMPS Molecular Dynamics Simulator 23Jun2022 (ns/day, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 21.84 (SE +/- 0.24, N = 5)
1. (CXX) g++ options: -O3 -lm -ldl

miniFE

Problem Size: Small

miniFE 2.2 (CG Mflops, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 6809.84 (SE +/- 73.24, N = 5)
1. (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi

Monte Carlo Simulations of Ionised Nebulae

Input: Gas HII40

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 (Seconds, Fewer Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 19.25 (SE +/- 0.08, N = 3)
1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz

Monte Carlo Simulations of Ionised Nebulae

Input: Dust 2D tau100.0

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 (Seconds, Fewer Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 145.60 (SE +/- 0.98, N = 3)
1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz

NAS Parallel Benchmarks

Test / Class: BT.C

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 49906.82 (SE +/- 287.94, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

NAS Parallel Benchmarks

Test / Class: CG.C

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 5819.63 (SE +/- 70.34, N = 4)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

NAS Parallel Benchmarks

Test / Class: EP.C

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 5140.81 (SE +/- 57.80, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

NAS Parallel Benchmarks

Test / Class: EP.D

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 5322.84 (SE +/- 13.91, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

NAS Parallel Benchmarks

Test / Class: FT.C

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 21478.66 (SE +/- 53.71, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

NAS Parallel Benchmarks

Test / Class: IS.D

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 853.88 (SE +/- 5.18, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

NAS Parallel Benchmarks

Test / Class: LU.C

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 49703.3 (SE +/- 171.34, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

NAS Parallel Benchmarks

Test / Class: MG.C

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 19608.97 (SE +/- 58.72, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

NAS Parallel Benchmarks

Test / Class: SP.B

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 30075.20 (SE +/- 331.56, N = 4)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

NAS Parallel Benchmarks

Test / Class: SP.C

NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 16955.27 (SE +/- 70.76, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.6

Pennant

Test: sedovbig

Pennant 1.0.1 (Hydro Cycle Time - Seconds, Fewer Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 17.35 (SE +/- 0.22, N = 4)
1. (CXX) g++ options: -fopenmp -lmpi_cxx -lmpi

Pennant

Test: leblancbig

Pennant 1.0.1 (Hydro Cycle Time - Seconds, Fewer Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 5.925810 (SE +/- 0.074192, N = 15)
1. (CXX) g++ options: -fopenmp -lmpi_cxx -lmpi

Timed MrBayes Analysis

Primate Phylogeny Analysis

Timed MrBayes Analysis 3.2.7 (Seconds, Fewer Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 131.46 (SE +/- 0.17, N = 3)
1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -madx -mabm -O3 -std=c99 -pedantic -lm

Xcompact3d Incompact3d

Input: input.i3d 129 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11 (Seconds, Fewer Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 13.49 (SE +/- 0.10, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Xcompact3d Incompact3d

Input: input.i3d 193 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11 (Seconds, Fewer Is Better)
AMD Ryzen Threadripper PRO 3995WX 64-Cores: 59.40 (SE +/- 0.13, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
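
The SE figures reported with each result are in the unit of that test, so they are hard to compare across tests. One rough way to put them on a common scale is to express each SE as a percentage of its mean result. The sketch below does this for a few results copied from this page; the helper name relative_se_percent is hypothetical and not part of the Phoronix Test Suite.

```python
# Relative standard error: SE expressed as a percentage of the mean result.
# Hypothetical helper for illustration; not a Phoronix Test Suite function.

def relative_se_percent(mean: float, se: float) -> float:
    """Return the standard error as a percentage of the mean."""
    if mean == 0:
        raise ValueError("mean must be non-zero")
    return abs(se / mean) * 100.0


# (test, mean result, SE, run count N) as reported in this result file
results = [
    ("ASKAP tConvolve MPI - Degridding", 11744.39, 430.92, 12),
    ("HPCG 104 104 104 - RT: 60", 5.59916, 0.17243, 9),
    ("HPCG 104 104 104 - RT: 1800", 6.33226, 0.00769, 3),
    ("Pennant leblancbig", 5.925810, 0.074192, 15),
]

for name, mean, se, n in results:
    print(f"{name}: {relative_se_percent(mean, se):.2f}% of mean (N = {n})")
```

Tests with a larger relative SE here are also the ones with a run count above the usual N = 3.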


Phoronix Test Suite v10.8.5