mpitest
AMD Ryzen Threadripper PRO 3995WX 64-Cores testing with a GIGABYTE WRX80-SU8 N/A (WRX80SU8-F2 BIOS) and Gigabyte NVIDIA GeForce RTX 3080 Ti 12GB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2501206-NE-MPITEST7573&grr .
mpitest: system details for AMD Ryzen Threadripper PRO 3995WX 64-Cores
Processor: AMD Ryzen Threadripper PRO 3995WX 64-Cores @ 2.70GHz (64 Cores / 128 Threads)
Motherboard: GIGABYTE WRX80-SU8 N/A (WRX80SU8-F2 BIOS)
Chipset: AMD Starship/Matisse
Memory: 4 x 16GB DDR4-2133MT/s CM4X16GC3200C16K2E
Disk: 1000GB CT1000P3SSD8 + 1000GB KINGSTON SNV3S1000G
Graphics: Gigabyte NVIDIA GeForce RTX 3080 Ti 12GB
Audio: AMD Starship/Matisse
Monitor: 2 x LG TV
Network: 2 x Realtek RTL8111/8168/8211/8411 + 2 x Intel I210 + 2 x Intel X550 + Intel Wi-Fi 6 AX200
OS: Ubuntu 24.04
Kernel: 6.8.0-51-generic (x86_64)
Desktop: GNOME Shell 46.0
Display Server: X Server 1.21.1.11
Display Driver: NVIDIA
OpenCL: OpenCL 3.0 CUDA 12.4.131
Compiler: GCC 13.3.0 + CUDA 12.6
File-System: ext4
Screen Resolution: 1920x1080
- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-fG75Ri/gcc-13-13.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-fG75Ri/gcc-13-13.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0x830107c
- Python 3.12.4
- gather_data_sampling: Not affected
  + itlb_multihit: Not affected
  + l1tf: Not affected
  + mds: Not affected
  + meltdown: Not affected
  + mmio_stale_data: Not affected
  + reg_file_data_sampling: Not affected
  + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection
  + spec_rstack_overflow: Mitigation of Safe RET
  + spec_store_bypass: Mitigation of SSB disabled via prctl
  + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  + spectre_v2: Mitigation of Retpolines; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
  + srbds: Not affected
  + tsx_async_abort: Not affected
mpitest: result summary
Test                                                AMD Ryzen Threadripper PRO 3995WX 64-Cores
hpcg: 104 104 104 - 60                              5.59916
hpcc: G-HPL                                         152.53333
hpcg: 104 104 104 - 1800                            6.33226
askap: tConvolve MPI - Gridding                     10871.70
askap: tConvolve MPI - Degridding                   11744.39
lammps: 20k Atoms                                   24.598
mocassin: Dust 2D tau100.0                          145.597
gpaw: Carbon Nanotube                               136.787
mrbayes: Primate Phylogeny Analysis                 131.461
intel-mpi: IMB-MPI1 Exchange (Average usec)         557.43
intel-mpi: IMB-MPI1 Exchange (Average Mbytes/sec)   2490.11
npb: SP.C                                           16955.27
intel-mpi: IMB-P2P PingPong                         25366668
gromacs: MPI CPU - water_GMX50_bare                 3.171
minife: Small                                       6809.84
incompact3d: input.i3d 193 Cells Per Direction      59.3955548
npb: BT.C                                           49906.82
intel-mpi: IMB-MPI1 Sendrecv (Average usec)         292.17
intel-mpi: IMB-MPI1 Sendrecv (Average Mbytes/sec)   1859.20
npb: LU.C                                           49703.3
pennant: leblancbig                                 5.925810
gromacs: NVIDIA CUDA GPU - water_GMX50_bare         23.801
npb: CG.C                                           5819.63
npb: IS.D                                           853.88
npb: EP.D                                           5322.84
pennant: sedovbig                                   17.35116
npb: FT.C                                           21478.66
npb: SP.B                                           30075.20
mocassin: Gas HII40                                 19.249
incompact3d: input.i3d 129 Cells Per Direction      13.4860032
intel-mpi: IMB-MPI1 PingPong                        2411.51
npb: MG.C                                           19608.97
lammps: Rhodopsin Protein                           21.838
npb: EP.C                                           5140.81
hpcc: Max Ping Pong Bandwidth                       9246.266
hpcc: Rand Ring Bandwidth                           0.71653
hpcc: Rand Ring Latency                             1.14125
hpcc: G-Rand Access                                 0.19271
hpcc: EP-STREAM Triad                               0.56827
hpcc: G-Ptrans                                      2.54122
hpcc: EP-DGEMM                                      7.25358
hpcc: G-Ffte                                        11.19347
High Performance Conjugate Gradient 3.1
  X Y Z: 104 104 104 - RT: 60
  GFLOP/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 5.59916 (SE +/- 0.17243, N = 9)
  1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
HPC Challenge 1.5.0
  Test / Class: G-HPL
  GFLOPS, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 152.53 (SE +/- 0.48, N = 3)
  1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.1.6
High Performance Conjugate Gradient 3.1
  X Y Z: 104 104 104 - RT: 1800
  GFLOP/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 6.33226 (SE +/- 0.00769, N = 3)
  1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
ASKAP 1.0
  Test: tConvolve MPI - Gridding
  Mpix/sec, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 10871.70 (SE +/- 392.31, N = 12)
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0
  Test: tConvolve MPI - Degridding
  Mpix/sec, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 11744.39 (SE +/- 430.92, N = 12)
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
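The SE and N values reported with each result can be put on a common scale by expressing the standard error as a percentage of the mean. The sketch below does this for the three entries above that needed more than the usual run count (N = 9 and N = 12). The helper name relative_se_percent and the dictionary layout are illustrative assumptions and are not part of the Phoronix Test Suite output.

```python
# Mean and standard error (SE) copied from the result entries above.
# The dictionary keys are informal labels chosen for this sketch.
results = {
    "hpcg 104 104 104 - RT: 60": (5.59916, 0.17243),
    "askap tConvolve MPI - Gridding": (10871.70, 392.31),
    "askap tConvolve MPI - Degridding": (11744.39, 430.92),
}


def relative_se_percent(mean: float, se: float) -> float:
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


relative = {
    name: relative_se_percent(mean, se) for name, (mean, se) in results.items()
}

for name, pct in relative.items():
    print(f"{name}: {pct:.2f}% relative SE")
```

All three entries come out at roughly 3 to 4 percent of the mean, noticeably noisier than most of the N = 3 results in this file.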
LAMMPS Molecular Dynamics Simulator 23Jun2022
  Model: 20k Atoms
  ns/day, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 24.60 (SE +/- 0.32, N = 3)
  1. (CXX) g++ options: -O3 -lm -ldl
Monte Carlo Simulations of Ionised Nebulae 2.02.73.3
  Input: Dust 2D tau100.0
  Seconds, Fewer Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 145.60 (SE +/- 0.98, N = 3)
  1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz
GPAW 23.6
  Input: Carbon Nanotube
  Seconds, Fewer Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 136.79 (SE +/- 0.41, N = 3)
  1. (CC) gcc options: -pthread -shared -lxc -lblas -lmpi -fno-strict-overflow -O2 -fPIC -isystem -UNDEBUG -std=c99
Timed MrBayes Analysis 3.2.7
  Primate Phylogeny Analysis
  Seconds, Fewer Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 131.46 (SE +/- 0.17, N = 3)
  1. (CC) gcc options: -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4a -msha -maes -mavx -mfma -mavx2 -mrdrnd -mbmi -mbmi2 -madx -mabm -O3 -std=c99 -pedantic -lm
Intel MPI Benchmarks 2019.3
  Test: IMB-MPI1 Exchange
  Average usec, Fewer Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 557.43 (SE +/- 8.21, N = 3)
  MIN: 0.97 / MAX: 22328.95
  1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi
Intel MPI Benchmarks 2019.3
  Test: IMB-MPI1 Exchange
  Average Mbytes/sec, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 2490.11 (SE +/- 29.82, N = 3)
  MAX: 9884.84
  1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi
NAS Parallel Benchmarks 3.4
  Test / Class: SP.C
  Total Mop/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 16955.27 (SE +/- 70.76, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
Intel MPI Benchmarks 2019.3
  Test: IMB-P2P PingPong
  Average Msg/sec, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 25366668 (SE +/- 223826.40, N = 3)
  MIN: 2089 / MAX: 67855108
  1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi
GROMACS 2024
  Implementation: MPI CPU - Input: water_GMX50_bare
  Ns Per Day, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 3.171 (SE +/- 0.036, N = 3)
  1. (CXX) g++ options: -O3 -lm
miniFE 2.2
  Problem Size: Small
  CG Mflops, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 6809.84 (SE +/- 73.24, N = 5)
  1. (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi
Xcompact3d Incompact3d 2021-03-11
  Input: input.i3d 193 Cells Per Direction
  Seconds, Fewer Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 59.40 (SE +/- 0.13, N = 3)
  1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
NAS Parallel Benchmarks 3.4
  Test / Class: BT.C
  Total Mop/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 49906.82 (SE +/- 287.94, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
Intel MPI Benchmarks 2019.3
  Test: IMB-MPI1 Sendrecv
  Average usec, Fewer Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 292.17 (SE +/- 2.93, N = 3)
  MIN: 0.5 / MAX: 9957.71
  1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi
Intel MPI Benchmarks 2019.3
  Test: IMB-MPI1 Sendrecv
  Average Mbytes/sec, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 1859.20 (SE +/- 6.66, N = 3)
  MAX: 6882.76
  1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi
NAS Parallel Benchmarks 3.4
  Test / Class: LU.C
  Total Mop/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 49703.3 (SE +/- 171.34, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
Pennant 1.0.1
  Test: leblancbig
  Hydro Cycle Time - Seconds, Fewer Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 5.925810 (SE +/- 0.074192, N = 15)
  1. (CXX) g++ options: -fopenmp -lmpi_cxx -lmpi
GROMACS 2024
  Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare
  Ns Per Day, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 23.80 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -O3 -lm
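The two GROMACS water_GMX50_bare results (MPI CPU and NVIDIA CUDA GPU) share the same unit, so their ratio gives a rough GPU-over-CPU speedup for this input on this system. A minimal sketch of that arithmetic follows, using the values from the result summary; the variable names are illustrative only.

```python
# Ns Per Day for the water_GMX50_bare input, taken from the result summary.
cpu_ns_per_day = 3.171   # GROMACS, MPI CPU
gpu_ns_per_day = 23.801  # GROMACS, NVIDIA CUDA GPU

# Higher is better for both, so GPU / CPU is the speedup factor.
speedup = gpu_ns_per_day / cpu_ns_per_day

print(f"GPU over CPU speedup: {speedup:.2f}x")
```

The ratio describes only this single input and hardware pairing and should not be read as a general CPU versus GPU comparison.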
NAS Parallel Benchmarks 3.4
  Test / Class: CG.C
  Total Mop/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 5819.63 (SE +/- 70.34, N = 4)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
NAS Parallel Benchmarks 3.4
  Test / Class: IS.D
  Total Mop/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 853.88 (SE +/- 5.18, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
NAS Parallel Benchmarks 3.4
  Test / Class: EP.D
  Total Mop/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 5322.84 (SE +/- 13.91, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
Pennant 1.0.1
  Test: sedovbig
  Hydro Cycle Time - Seconds, Fewer Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 17.35 (SE +/- 0.22, N = 4)
  1. (CXX) g++ options: -fopenmp -lmpi_cxx -lmpi
NAS Parallel Benchmarks 3.4
  Test / Class: FT.C
  Total Mop/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 21478.66 (SE +/- 53.71, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
NAS Parallel Benchmarks 3.4
  Test / Class: SP.B
  Total Mop/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 30075.20 (SE +/- 331.56, N = 4)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
Monte Carlo Simulations of Ionised Nebulae 2.02.73.3
  Input: Gas HII40
  Seconds, Fewer Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 19.25 (SE +/- 0.08, N = 3)
  1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz
Xcompact3d Incompact3d 2021-03-11
  Input: input.i3d 129 Cells Per Direction
  Seconds, Fewer Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 13.49 (SE +/- 0.10, N = 3)
  1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Intel MPI Benchmarks 2019.3
  Test: IMB-MPI1 PingPong
  Average Mbytes/sec, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 2411.51 (SE +/- 10.55, N = 3)
  MIN: 4.38 / MAX: 7065.21
  1. (CXX) g++ options: -O0 -pedantic -fopenmp -lmpi_cxx -lmpi
NAS Parallel Benchmarks 3.4
  Test / Class: MG.C
  Total Mop/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 19608.97 (SE +/- 58.72, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
LAMMPS Molecular Dynamics Simulator 23Jun2022
  Model: Rhodopsin Protein
  ns/day, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 21.84 (SE +/- 0.24, N = 5)
  1. (CXX) g++ options: -O3 -lm -ldl
NAS Parallel Benchmarks 3.4
  Test / Class: EP.C
  Total Mop/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 5140.81 (SE +/- 57.80, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.6
HPC Challenge 1.5.0
  Test / Class: Max Ping Pong Bandwidth
  MB/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 9246.27 (SE +/- 9.69, N = 3)
  1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.1.6
HPC Challenge 1.5.0
  Test / Class: Random Ring Bandwidth
  GB/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 0.71653 (SE +/- 0.01836, N = 3)
  1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.1.6
HPC Challenge 1.5.0
  Test / Class: Random Ring Latency
  usecs, Fewer Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 1.14125 (SE +/- 0.00848, N = 3)
  1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.1.6
HPC Challenge 1.5.0
  Test / Class: G-Random Access
  GUP/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 0.19271 (SE +/- 0.00082, N = 3)
  1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.1.6
HPC Challenge 1.5.0
  Test / Class: EP-STREAM Triad
  GB/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 0.56827 (SE +/- 0.00036, N = 3)
  1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.1.6
HPC Challenge 1.5.0
  Test / Class: G-Ptrans
  GB/s, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 2.54122 (SE +/- 0.01461, N = 3)
  1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.1.6
HPC Challenge 1.5.0
  Test / Class: EP-DGEMM
  GFLOPS, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 7.25358 (SE +/- 0.08554, N = 3)
  1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.1.6
HPC Challenge 1.5.0
  Test / Class: G-Ffte
  GFLOPS, More Is Better
  AMD Ryzen Threadripper PRO 3995WX 64-Cores: 11.19 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -lblas -lm -lmpi -fomit-frame-pointer -funroll-loops
  2. OpenBLAS + Open MPI 4.1.6
Phoronix Test Suite v10.8.5