EPYC 9684X 1P

AMD EPYC 9684X 96-Core testing with an AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2307202-NE-EPYC9684X88

Run Management

Result Identifier: EPYC 9684X | Date Run: July 20 2023 | Test Duration: 2 Hours, 5 Minutes
Result Identifier: AMD 9684X | Date Run: July 20 2023 | Test Duration: 2 Hours, 6 Minutes


EPYC 9684X 1P

Processor: AMD EPYC 9684X 96-Core @ 2.55GHz (96 Cores / 192 Threads)
Motherboard: AMD Titanite_4G (RTI1007B BIOS)
Chipset: AMD Device 14a4
Memory: 768GB
Disk: 2 x 1920GB SAMSUNG MZWLJ1T9HBJR-00007
Graphics: ASPEED
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 22.04
Kernel: 5.19.0-41-generic (x86_64)
Desktop: GNOME Shell 42.5
Display Server: X Server 1.21.1.4
Vulkan: 1.3.224
Compiler: GCC 11.3.0
File-System: ext4
Screen Resolution: 1024x768

EPYC 9684X 1P Benchmarks - System Logs

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa101121
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

[Chart: EPYC 9684X vs. AMD 9684X Comparison (Phoronix Test Suite). Per-test spread between the two runs relative to the baseline. The largest spreads are High Performance Conjugate Gradient 104 104 104 - 60 at 48.5% and ASKAP tConvolve OpenMP - Degridding at 25%.]
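The percentages in the comparison chart can be checked against the raw results in this file. The following is a minimal sketch, assuming the chart reports how far the larger of the two results exceeds the smaller one. The helper name `spread_percent` is hypothetical and is not part of the Phoronix Test Suite. The input values are the HPCG 104 104 104 - 60 and NPB IS.D results for the two runs.

```python
def spread_percent(a: float, b: float) -> float:
    """Return the percentage by which the larger result exceeds the smaller."""
    hi, lo = max(a, b), min(a, b)
    return (hi / lo - 1.0) * 100.0


# HPCG 104 104 104 - 60 (GFLOP/s): EPYC 9684X vs. AMD 9684X
hpcg_104 = spread_percent(34.3553, 23.1377)

# NAS Parallel Benchmarks IS.D (Total Mop/s): EPYC 9684X vs. AMD 9684X
npb_is_d = spread_percent(5839.25, 5572.16)

print(f"HPCG 104 104 104 - 60: {hpcg_104:.1f}%")
print(f"NPB IS.D: {npb_is_d:.1f}%")
```

Because the helper always divides the larger value by the smaller one, it gives the same spread for "more is better" and "fewer is better" tests. It does not say which run was ahead.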

EPYC 9684X 1P

Test | EPYC 9684X | AMD 9684X
hpcg: 192 192 192 - 60 | 22.7211 | 22.4173
hpcg: 160 160 160 - 60 | 22.8174 | 22.7753
hpcg: 144 144 144 - 60 | 23.546 | 22.8476
libxsmm: 128 | 3119.7 | 3091.5
incompact3d: X3D-benchmarking input.i3d | 377.588806 | 388.701202
openfoam: drivaerFastback, Medium Mesh Size - Execution Time | 184.45409 | 185.1475
openfoam: drivaerFastback, Medium Mesh Size - Mesh Time | 107.493 | 107.82577
askap: tConvolve MT - Degridding | 15691 | 15691
askap: tConvolve MT - Gridding | 13617.8 | 13606.9
hpcg: 104 104 104 - 60 | 34.3553 | 23.1377
libxsmm: 256 | 3177.4 | 3131.5
blender: Barbershop - CPU-Only | 142.1 | 141.96
openfoam: drivaerFastback, Small Mesh Size - Execution Time | 28.957931 | 29.259894
openfoam: drivaerFastback, Small Mesh Size - Mesh Time | 23.301871 | 23.166335
blender: Pabellon Barcelona - CPU-Only | 49.42 | 49.5
blender: Classroom - CPU-Only | 40.48 | 40.4
stress-ng: CPU Cache | 1373344.98 | 1426891.13
stress-ng: Vector Shuffle | 63766.48 | 63760.46
stress-ng: Wide Vector Math | 3480531.79 | 3484985.88
stress-ng: CPU Stress | 213036.81 | 214798.32
stress-ng: Vector Math | 545786.37 | 545747.29
stress-ng: Vector Floating Point | 256936.31 | 254718.1
stress-ng: Matrix Math | 418126.43 | 418120.08
askap: tConvolve MPI - Gridding | 73226.7 | 74970.2
askap: tConvolve MPI - Degridding | 59410.3 | 58310.1
heffte: c2c - FFTW - double - 512 | 68.1627 | 68.2756
heffte: c2c - Stock - double - 512 | 68.8242 | 68.3745
heffte: c2c - FFTW - double-long - 512 | 68.445 | 68.6279
heffte: c2c - Stock - double-long - 512 | 68.8118 | 68.6366
gromacs: MPI CPU - water_GMX50_bare | 11.801 | 11.798
blender: Fishy Cat - CPU-Only | 20.61 | 20.44
blender: BMW27 - CPU-Only | 16.26 | 16.46
namd: ATPase Simulation - 327,506 Atoms | 0.25068 | 0.24623
lulesh | 30858.462 | 31065.289
heffte: r2c - FFTW - double-long - 512 | 135.542 | 136.27
heffte: r2c - FFTW - double - 512 | 136.458 | 135.556
xmrig: Monero - 1M | 69478.2 | 69671.8
embree: Pathtracer - Asian Dragon Obj | 114.1719 | 115.0435
heffte: r2c - Stock - double - 512 | 144.622 | 143.312
heffte: r2c - Stock - double-long - 512 | 145.496 | 143.534
embree: Pathtracer ISPC - Asian Dragon Obj | 122.0413 | 122.3101
xmrig: Wownero - 1M | 74195 | 74288.7
heffte: c2c - Stock - float-long - 512 | 148.562 | 149.973
heffte: c2c - Stock - float - 512 | 149.982 | 149.163
npb: EP.D | 10620.59 | 10476.83
heffte: c2c - FFTW - float - 512 | 154.311 | 155.387
heffte: c2c - FFTW - float-long - 512 | 155.713 | 155.664
askap: Hogbom Clean OpenMP | 1204.82 | 1204.82
astcenc: Exhaustive | 6.1412 | 6.1406
npb: BT.C | 305166.4 | 314982.58
minife: Small | 53878.4 | 53906.3
incompact3d: input.i3d 193 Cells Per Direction | 7.65203524 | 7.65393496
npb: IS.D | 5839.25 | 5572.16
astcenc: Thorough | 56.8898 | 56.8975
npb: SP.C | 211454.6 | 208198.13
npb: LU.C | 340754.87 | 335538.3
heffte: r2c - FFTW - float-long - 512 | 332.908 | 327.73
heffte: r2c - FFTW - float - 512 | 336.451 | 340.748
astcenc: Fast | 1019.0656 | 1009.8949
heffte: r2c - Stock - float-long - 512 | 341.923 | 336.837
heffte: r2c - Stock - float - 512 | 343.723 | 339.165
embree: Pathtracer - Crown | 110.6946 | 109.6474
askap: tConvolve OpenMP - Degridding | 66564 | 53251.2
askap: tConvolve OpenMP - Gridding | 26625.6 | 29584
embree: Pathtracer ISPC - Crown | 117.0799 | 117.2488
embree: Pathtracer - Asian Dragon | 126.2454 | 126.4095
embree: Pathtracer ISPC - Asian Dragon | 142.2792 | 142.2995
libxsmm: 64 | 2479.3 | 2472.2
libxsmm: 32 | 1299.8 | 1300.7
astcenc: Medium | 421.4357 | 421.9692
npb: FT.C | 118915.97 | 121389.76
heffte: c2c - Stock - double-long - 256 | 83.5304 | 82.8587
heffte: c2c - Stock - double - 256 | 83.8913 | 82.8517
heffte: c2c - FFTW - double - 256 | 82.5599 | 85.1501
heffte: c2c - FFTW - double-long - 256 | 89.7978 | 82.3926
npb: CG.C | 62397.22 | 58099.4
incompact3d: input.i3d 129 Cells Per Direction | 2.24493504 | 1.96318495
npb: SP.B | 170601.48 | 169864.87
npb: MG.C | 141081.08 | 132064.96
heffte: c2c - Stock - float - 256 | 170.734 | 167.193
heffte: r2c - Stock - double-long - 256 | 175.594 | 182.393
heffte: r2c - Stock - double - 256 | 174 | 176.447
heffte: c2c - FFTW - float - 256 | 177.706 | 176.295
heffte: r2c - FFTW - double-long - 256 | 184.786 | 184.654
heffte: c2c - Stock - float-long - 256 | 177.123 | 178.56
heffte: c2c - FFTW - float-long - 256 | 182.81 | 179.544
heffte: r2c - FFTW - double - 256 | 194.017 | 189.443
npb: EP.C | 7839.23 | 7952
heffte: r2c - Stock - float-long - 256 | 327.355 | 328.778
heffte: r2c - FFTW - float - 256 | 315.15 | 312.386
heffte: r2c - FFTW - float-long - 256 | 318.534 | 332.764
heffte: r2c - Stock - float - 256 | 331.609 | 321.42
heffte: c2c - Stock - double - 128 | 70.1221 | 68.8689
heffte: c2c - Stock - double-long - 128 | 68.6498 | 70.2296
heffte: c2c - FFTW - double-long - 128 | 84.3653 | 76.337
heffte: c2c - FFTW - double - 128 | 81.8473 | 78.4937
heffte: r2c - FFTW - double - 128 | 125.241 | 125.7
heffte: r2c - Stock - double - 128 | 115.816 | 116.285
heffte: c2c - Stock - float - 128 | 110.016 | 111.535
heffte: c2c - FFTW - float-long - 128 | 126.301 | 130.477
heffte: r2c - FFTW - double-long - 128 | 129.282 | 124.932
heffte: c2c - FFTW - float - 128 | 130.752 | 128.608
heffte: r2c - Stock - double-long - 128 | 115.339 | 116.264
heffte: c2c - Stock - float-long - 128 | 110.998 | 114.309
heffte: r2c - Stock - float-long - 128 | 176.529 | 176.556
heffte: r2c - FFTW - float-long - 128 | 184.354 | 187.191
heffte: r2c - FFTW - float - 128 | 186.051 | 188.224
heffte: r2c - Stock - float - 128 | 178.478 | 175.82

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient, a scientific benchmark from Sandia National Labs aimed at supercomputer testing with modern real-world workloads, in contrast to HPCC. Learn more via the OpenBenchmarking.org test page.

High Performance Conjugate Gradient 3.1 - X Y Z: 192 192 192 - RT: 60
GFLOP/s, More Is Better: EPYC 9684X 22.72 | AMD 9684X 22.42
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

High Performance Conjugate Gradient 3.1 - X Y Z: 160 160 160 - RT: 60
GFLOP/s, More Is Better: EPYC 9684X 22.82 | AMD 9684X 22.78
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

High Performance Conjugate Gradient 3.1 - X Y Z: 144 144 144 - RT: 60
GFLOP/s, More Is Better: EPYC 9684X 23.55 | AMD 9684X 22.85
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

libxsmm 2-1.17-3645 - M N K: 128
GFLOPS/s, More Is Better: EPYC 9684X 3119.7 | AMD 9684X 3091.5
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

Xcompact3d Incompact3d

Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equations together with as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.

Xcompact3d Incompact3d 2021-03-11 - Input: X3D-benchmarking input.i3d
Seconds, Fewer Is Better: EPYC 9684X 377.59 | AMD 9684X 388.70
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

OpenFOAM

OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.

OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Execution Time
Seconds, Fewer Is Better: EPYC 9684X 184.45 | AMD 9684X 185.15
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Mesh Time
Seconds, Fewer Is Better: EPYC 9684X 107.49 | AMD 9684X 107.83
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

ASKAP

ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and the Convolutional Resampling Benchmark (tConvolve), along with some earlier ASKAP benchmarks for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.

ASKAP 1.0 - Test: tConvolve MT - Degridding
Million Grid Points Per Second, More Is Better: EPYC 9684X 15691 | AMD 9684X 15691
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP 1.0 - Test: tConvolve MT - Gridding
Million Grid Points Per Second, More Is Better: EPYC 9684X 13617.8 | AMD 9684X 13606.9
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

High Performance Conjugate Gradient

High Performance Conjugate Gradient 3.1 - X Y Z: 104 104 104 - RT: 60
GFLOP/s, More Is Better: EPYC 9684X 34.36 | AMD 9684X 23.14
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

libxsmm

libxsmm 2-1.17-3645 - M N K: 256
GFLOPS/s, More Is Better: EPYC 9684X 3177.4 | AMD 9684X 3131.5
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

Blender

Blender 3.6 - Blend File: Barbershop - Compute: CPU-Only
Seconds, Fewer Is Better: EPYC 9684X 142.10 | AMD 9684X 141.96

OpenFOAM

OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Execution Time
Seconds, Fewer Is Better: EPYC 9684X 28.96 | AMD 9684X 29.26
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Mesh Time
Seconds, Fewer Is Better: EPYC 9684X 23.30 | AMD 9684X 23.17
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

Blender

Blender 3.6 - Blend File: Pabellon Barcelona - Compute: CPU-Only
Seconds, Fewer Is Better: EPYC 9684X 49.42 | AMD 9684X 49.50

Blender 3.6 - Blend File: Classroom - Compute: CPU-Only
Seconds, Fewer Is Better: EPYC 9684X 40.48 | AMD 9684X 40.40

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.15.10 - Test: CPU Cache
Bogo Ops/s, More Is Better: EPYC 9684X 1373344.98 | AMD 9684X 1426891.13
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10 - Test: Vector Shuffle
Bogo Ops/s, More Is Better: EPYC 9684X 63766.48 | AMD 9684X 63760.46
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10 - Test: Wide Vector Math
Bogo Ops/s, More Is Better: EPYC 9684X 3480531.79 | AMD 9684X 3484985.88
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10 - Test: CPU Stress
Bogo Ops/s, More Is Better: EPYC 9684X 213036.81 | AMD 9684X 214798.32
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10 - Test: Vector Math
Bogo Ops/s, More Is Better: EPYC 9684X 545786.37 | AMD 9684X 545747.29
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10 - Test: Vector Floating Point
Bogo Ops/s, More Is Better: EPYC 9684X 256936.31 | AMD 9684X 254718.10
1. (CXX) g++ options: -O2 -std=gnu99 -lc

Stress-NG 0.15.10 - Test: Matrix Math
Bogo Ops/s, More Is Better: EPYC 9684X 418126.43 | AMD 9684X 418120.08
1. (CXX) g++ options: -O2 -std=gnu99 -lc

ASKAP

ASKAP 1.0 - Test: tConvolve MPI - Gridding
Mpix/sec, More Is Better: EPYC 9684X 73226.7 | AMD 9684X 74970.2
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP 1.0 - Test: tConvolve MPI - Degridding
Mpix/sec, More Is Better: EPYC 9684X 59410.3 | AMD 9684X 58310.1
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512
GFLOP/s, More Is Better: EPYC 9684X 68.16 | AMD 9684X 68.28
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: Stock - Precision: double - X Y Z: 512
GFLOP/s, More Is Better: EPYC 9684X 68.82 | AMD 9684X 68.37
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 512
GFLOP/s, More Is Better: EPYC 9684X 68.45 | AMD 9684X 68.63
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 512
GFLOP/s, More Is Better: EPYC 9684X 68.81 | AMD 9684X 68.64
1. (CXX) g++ options: -O3

GROMACS

This test runs the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. The test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2023 - Implementation: MPI CPU - Input: water_GMX50_bare
Ns Per Day, More Is Better: EPYC 9684X 11.80 | AMD 9684X 11.80
1. (CXX) g++ options: -O3

Blender

Blender 3.6 - Blend File: Fishy Cat - Compute: CPU-Only
Seconds, Fewer Is Better: EPYC 9684X 20.61 | AMD 9684X 20.44

Blender 3.6 - Blend File: BMW27 - Compute: CPU-Only
Seconds, Fewer Is Better: EPYC 9684X 16.26 | AMD 9684X 16.46

NAMD

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.

NAMD 2.14 - ATPase Simulation - 327,506 Atoms
days/ns, Fewer Is Better: EPYC 9684X 0.25068 | AMD 9684X 0.24623
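NAMD reports days/ns (fewer is better), whereas GROMACS above reports ns per day (more is better). The two units are reciprocals of each other, so a NAMD result can be restated as simulated nanoseconds per day. A minimal sketch follows. The function name `days_per_ns_to_ns_per_day` is hypothetical, and the two workloads remain different simulations, so the converted figures are not directly comparable to the GROMACS numbers.

```python
def days_per_ns_to_ns_per_day(days_per_ns: float) -> float:
    """Convert a days/ns figure into ns/day by taking the reciprocal."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# NAMD 2.14, ATPase Simulation - 327,506 Atoms
epyc_ns_per_day = days_per_ns_to_ns_per_day(0.25068)  # EPYC 9684X run
amd_ns_per_day = days_per_ns_to_ns_per_day(0.24623)   # AMD 9684X run

print(f"EPYC 9684X: {epyc_ns_per_day:.3f} ns/day")
print(f"AMD 9684X: {amd_ns_per_day:.3f} ns/day")
```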

LULESH

LULESH is the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. Learn more via the OpenBenchmarking.org test page.

LULESH 2.0.3
z/s, More Is Better: EPYC 9684X 30858.46 | AMD 9684X 31065.29
1. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512
GFLOP/s, More Is Better: EPYC 9684X 135.54 | AMD 9684X 136.27
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512
GFLOP/s, More Is Better: EPYC 9684X 136.46 | AMD 9684X 135.56
1. (CXX) g++ options: -O3

Xmrig

Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.

Xmrig 6.18.1 - Variant: Monero - Hash Count: 1M
H/s, More Is Better: EPYC 9684X 69478.2 | AMD 9684X 69671.8
1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

Embree

Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL) and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.

Embree 4.1 - Binary: Pathtracer - Model: Asian Dragon Obj
Frames Per Second, More Is Better: EPYC 9684X 114.17 (MIN: 112.65 / MAX: 116.39) | AMD 9684X 115.04 (MIN: 113.62 / MAX: 117.34)

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: Stock - Precision: double - X Y Z: 512
GFLOP/s, More Is Better: EPYC 9684X 144.62 | AMD 9684X 143.31
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 512
GFLOP/s, More Is Better: EPYC 9684X 145.50 | AMD 9684X 143.53
1. (CXX) g++ options: -O3

Embree

Embree 4.1 - Binary: Pathtracer ISPC - Model: Asian Dragon Obj
Frames Per Second, More Is Better: EPYC 9684X 122.04 (MIN: 120.36 / MAX: 124.72) | AMD 9684X 122.31 (MIN: 120.61 / MAX: 124.64)

Xmrig

Xmrig 6.18.1 - Variant: Wownero - Hash Count: 1M
H/s, More Is Better: EPYC 9684X 74195.0 | AMD 9684X 74288.7
1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 512
GFLOP/s, More Is Better: EPYC 9684X 148.56 | AMD 9684X 149.97
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: Stock - Precision: float - X Y Z: 512
GFLOP/s, More Is Better: EPYC 9684X 149.98 | AMD 9684X 149.16
1. (CXX) g++ options: -O3

NAS Parallel Benchmarks

NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.

NAS Parallel Benchmarks 3.4 - Test / Class: EP.D
Total Mop/s, More Is Better: EPYC 9684X 10620.59 | AMD 9684X 10476.83
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512
GFLOP/s, More Is Better: EPYC 9684X 154.31 | AMD 9684X 155.39
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 512
GFLOP/s, More Is Better: EPYC 9684X 155.71 | AMD 9684X 155.66
1. (CXX) g++ options: -O3

ASKAP

ASKAP 1.0 - Test: Hogbom Clean OpenMP
Iterations Per Second, More Is Better: EPYC 9684X 1204.82 | AMD 9684X 1204.82
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASTC Encoder

ASTC Encoder (astcenc) is an encoder for the Adaptive Scalable Texture Compression (ASTC) format commonly used with the OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile tests both compression and decompression. Learn more via the OpenBenchmarking.org test page.

ASTC Encoder 4.0 - Preset: Exhaustive
MT/s, More Is Better: EPYC 9684X 6.1412 | AMD 9684X 6.1406
1. (CXX) g++ options: -O3 -flto -pthread

NAS Parallel Benchmarks

NAS Parallel Benchmarks 3.4 - Test / Class: BT.C
Total Mop/s, More Is Better: EPYC 9684X 305166.40 | AMD 9684X 314982.58
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

miniFE

MiniFE Finite Element is an application for unstructured implicit finite element codes. Learn more via the OpenBenchmarking.org test page.

miniFE 2.2 - Problem Size: Small
CG Mflops, More Is Better: EPYC 9684X 53878.4 | AMD 9684X 53906.3
1. (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi

Xcompact3d Incompact3d

Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 193 Cells Per Direction
Seconds, Fewer Is Better: EPYC 9684X 7.65203524 | AMD 9684X 7.65393496
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

NAS Parallel Benchmarks

NAS Parallel Benchmarks 3.4 - Test / Class: IS.D
Total Mop/s, More Is Better: EPYC 9684X 5839.25 | AMD 9684X 5572.16
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

ASTC Encoder

ASTC Encoder 4.0 - Preset: Thorough
MT/s, More Is Better: EPYC 9684X 56.89 | AMD 9684X 56.90
1. (CXX) g++ options: -O3 -flto -pthread

NAS Parallel Benchmarks

NAS Parallel Benchmarks 3.4 - Test / Class: SP.C
Total Mop/s, More Is Better: EPYC 9684X 211454.60 | AMD 9684X 208198.13
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

NAS Parallel Benchmarks 3.4 - Test / Class: LU.C
Total Mop/s, More Is Better: EPYC 9684X 340754.87 | AMD 9684X 335538.30
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 512 (GFLOP/s, More Is Better)
  EPYC 9684X: 332.91
  AMD 9684X: 327.73
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better)
  EPYC 9684X: 336.45
  AMD 9684X: 340.75
  1. (CXX) g++ options: -O3

ASTC Encoder

ASTC Encoder (astcenc) is an encoder for the Adaptive Scalable Texture Compression (ASTC) format commonly used with the OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile runs a coding test of both compression and decompression. Learn more via the OpenBenchmarking.org test page.

ASTC Encoder 4.0, Preset: Fast (MT/s, More Is Better)
  EPYC 9684X: 1019.07
  AMD 9684X: 1009.89
  1. (CXX) g++ options: -O3 -flto -pthread

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 512 (GFLOP/s, More Is Better)
  EPYC 9684X: 341.92
  AMD 9684X: 336.84
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better)
  EPYC 9684X: 343.72
  AMD 9684X: 339.17
  1. (CXX) g++ options: -O3

Embree

Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL) and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.

Embree 4.1, Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better)
  EPYC 9684X: 110.69 (MIN: 108.41 / MAX: 113.64)
  AMD 9684X: 109.65 (MIN: 107.65 / MAX: 113.51)

ASKAP

ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and the Convolutional Resampling Benchmark (tConvolve); some earlier ASKAP benchmarks are also included for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.

ASKAP 1.0, Test: tConvolve OpenMP - Degridding (Million Grid Points Per Second, More Is Better)
  EPYC 9684X: 66564.0
  AMD 9684X: 53251.2
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP 1.0, Test: tConvolve OpenMP - Gridding (Million Grid Points Per Second, More Is Better)
  EPYC 9684X: 26625.6
  AMD 9684X: 29584.0
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
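
The two tConvolve OpenMP results above differ between the runs by more than most other tests in this file, so it may help to put a number on the gap. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper name `percent_change` is hypothetical, and it assumes the first value of each result belongs to the "EPYC 9684X" run and the second to the "AMD 9684X" run.

```python
def percent_change(baseline: float, other: float) -> float:
    """Percent change of `other` relative to `baseline` (positive means higher)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (other - baseline) / baseline * 100.0


# (EPYC 9684X, AMD 9684X) in Million Grid Points Per Second, copied from the results above.
# Assumption: values are listed in the same order as the run identifiers.
askap_openmp = {
    "tConvolve OpenMP - Degridding": (66564.0, 53251.2),
    "tConvolve OpenMP - Gridding": (26625.6, 29584.0),
}

for test, (epyc, amd) in askap_openmp.items():
    print(f"{test}: AMD 9684X vs EPYC 9684X = {percent_change(epyc, amd):+.1f}%")
```

Because both metrics are "More Is Better", a negative percentage means the second run was slower on that test.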

Embree

Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL) and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.

Embree 4.1, Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better)
  EPYC 9684X: 117.08 (MIN: 114.57 / MAX: 120.79)
  AMD 9684X: 117.25 (MIN: 114.61 / MAX: 121.35)

Embree 4.1, Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better)
  EPYC 9684X: 126.25 (MIN: 124.97 / MAX: 128.11)
  AMD 9684X: 126.41 (MIN: 124.93 / MAX: 128.49)

Embree 4.1, Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better)
  EPYC 9684X: 142.28 (MIN: 140.41 / MAX: 144.57)
  AMD 9684X: 142.30 (MIN: 140.74 / MAX: 144.53)

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

libxsmm 2-1.17-3645, M N K: 64 (GFLOPS/s, More Is Better)
  EPYC 9684X: 2479.3
  AMD 9684X: 2472.2
  1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

libxsmm 2-1.17-3645, M N K: 32 (GFLOPS/s, More Is Better)
  EPYC 9684X: 1299.8
  AMD 9684X: 1300.7
  1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

ASTC Encoder

ASTC Encoder (astcenc) is an encoder for the Adaptive Scalable Texture Compression (ASTC) format commonly used with the OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile runs a coding test of both compression and decompression. Learn more via the OpenBenchmarking.org test page.

ASTC Encoder 4.0, Preset: Medium (MT/s, More Is Better)
  EPYC 9684X: 421.44
  AMD 9684X: 421.97
  1. (CXX) g++ options: -O3 -flto -pthread

NAS Parallel Benchmarks

NPB, the NAS Parallel Benchmarks, is a benchmark suite developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB and allows selecting among the different NPB tests/problems and problem sizes. Learn more via the OpenBenchmarking.org test page.

NAS Parallel Benchmarks 3.4, Test / Class: FT.C (Total Mop/s, More Is Better)
  EPYC 9684X: 118915.97
  AMD 9684X: 121389.76
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.2

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 83.53
  AMD 9684X: 82.86
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 83.89
  AMD 9684X: 82.85
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 82.56
  AMD 9684X: 85.15
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 89.80
  AMD 9684X: 82.39
  1. (CXX) g++ options: -O3

NAS Parallel Benchmarks

NPB, the NAS Parallel Benchmarks, is a benchmark suite developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB and allows selecting among the different NPB tests/problems and problem sizes. Learn more via the OpenBenchmarking.org test page.

NAS Parallel Benchmarks 3.4, Test / Class: CG.C (Total Mop/s, More Is Better)
  EPYC 9684X: 62397.22
  AMD 9684X: 58099.40
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.2

Xcompact3d Incompact3d

Xcompact3d Incompact3d is a Fortran-MPI based, finite-difference high-performance code for solving the incompressible Navier-Stokes equations along with as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.

Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 129 Cells Per Direction (Seconds, Fewer Is Better)
  EPYC 9684X: 2.24493504
  AMD 9684X: 1.96318495
  1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
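
Unlike most results in this file, the Xcompact3d Incompact3d result above is reported in seconds ("Fewer Is Better"), so a ratio between the two runs has to be inverted before it can be compared with throughput-style results. A minimal sketch of that normalization follows; the function name `speedup` and its `higher_is_better` flag are hypothetical, and the mapping of the first value to "EPYC 9684X" and the second to "AMD 9684X" is an assumption based on the order of the run identifiers.

```python
def speedup(baseline: float, other: float, higher_is_better: bool) -> float:
    """Return how many times better `other` is than `baseline` (> 1.0 means better)."""
    if baseline <= 0 or other <= 0:
        raise ValueError("results must be positive")
    # For time-like metrics a smaller value is better, so the ratio is inverted.
    return other / baseline if higher_is_better else baseline / other


# Xcompact3d Incompact3d, input.i3d 129 Cells Per Direction, in seconds (from the result above).
epyc_seconds = 2.24493504
amd_seconds = 1.96318495

ratio = speedup(epyc_seconds, amd_seconds, higher_is_better=False)
print(f"AMD 9684X relative to EPYC 9684X: {ratio:.3f}x")
```

With this convention, a value above 1.0 always means the second run did better, regardless of whether the underlying metric is a time or a rate.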

NAS Parallel Benchmarks

NPB, the NAS Parallel Benchmarks, is a benchmark suite developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB and allows selecting among the different NPB tests/problems and problem sizes. Learn more via the OpenBenchmarking.org test page.

NAS Parallel Benchmarks 3.4, Test / Class: SP.B (Total Mop/s, More Is Better)
  EPYC 9684X: 170601.48
  AMD 9684X: 169864.87
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.2

NAS Parallel Benchmarks 3.4, Test / Class: MG.C (Total Mop/s, More Is Better)
  EPYC 9684X: 141081.08
  AMD 9684X: 132064.96
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.2

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 170.73
  AMD 9684X: 167.19
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 175.59
  AMD 9684X: 182.39
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 174.00
  AMD 9684X: 176.45
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 177.71
  AMD 9684X: 176.30
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 184.79
  AMD 9684X: 184.65
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 177.12
  AMD 9684X: 178.56
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 182.81
  AMD 9684X: 179.54
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 194.02
  AMD 9684X: 189.44
  1. (CXX) g++ options: -O3

NAS Parallel Benchmarks

NPB, the NAS Parallel Benchmarks, is a benchmark suite developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB and allows selecting among the different NPB tests/problems and problem sizes. Learn more via the OpenBenchmarking.org test page.

NAS Parallel Benchmarks 3.4, Test / Class: EP.C (Total Mop/s, More Is Better)
  EPYC 9684X: 7839.23
  AMD 9684X: 7952.00
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.2

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 327.36
  AMD 9684X: 328.78
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 315.15
  AMD 9684X: 312.39
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 318.53
  AMD 9684X: 332.76
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better)
  EPYC 9684X: 331.61
  AMD 9684X: 321.42
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 70.12
  AMD 9684X: 68.87
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 68.65
  AMD 9684X: 70.23
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 84.37
  AMD 9684X: 76.34
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 81.85
  AMD 9684X: 78.49
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 125.24
  AMD 9684X: 125.70
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 115.82
  AMD 9684X: 116.29
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 110.02
  AMD 9684X: 111.54
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 126.30
  AMD 9684X: 130.48
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 129.28
  AMD 9684X: 124.93
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 130.75
  AMD 9684X: 128.61
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 115.34
  AMD 9684X: 116.26
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 111.00
  AMD 9684X: 114.31
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 176.53
  AMD 9684X: 176.56
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 184.35
  AMD 9684X: 187.19
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 186.05
  AMD 9684X: 188.22
  1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better)
  EPYC 9684X: 178.48
  AMD 9684X: 175.82
  1. (CXX) g++ options: -O3
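
Many of the HeFFTe results above differ by only a percent or two in either direction, so a single summary figure can be easier to read than the individual pairs. The sketch below takes the geometric mean of per-test ratios for the last four r2c single-precision results at X Y Z: 128. It is an illustration only: it is not how the Phoronix Test Suite computes its own overall geometric mean, and it assumes the first value of each result belongs to the "EPYC 9684X" run and the second to the "AMD 9684X" run.

```python
from statistics import geometric_mean

# (EPYC 9684X, AMD 9684X) in GFLOP/s, copied from the four results directly above.
# Assumption: values are listed in the same order as the run identifiers.
heffte_r2c_128 = {
    "r2c - Stock - float-long - 128": (176.53, 176.56),
    "r2c - FFTW - float-long - 128": (184.35, 187.19),
    "r2c - FFTW - float - 128": (186.05, 188.22),
    "r2c - Stock - float - 128": (178.48, 175.82),
}

# Ratio > 1.0 means the AMD 9684X run scored higher on that test.
ratios = [amd / epyc for epyc, amd in heffte_r2c_128.values()]

# The geometric mean is the usual choice for averaging ratios, since it treats
# a 2x gain and a 2x loss as cancelling out.
overall = geometric_mean(ratios)
print(f"Geometric mean of AMD 9684X / EPYC 9684X ratios: {overall:.4f}")
```

A result close to 1.0 suggests the two runs are effectively equivalent on these tests, which is consistent with both identifiers referring to the same hardware.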

108 Results Shown

High Performance Conjugate Gradient:
  192 192 192 - 60
  160 160 160 - 60
  144 144 144 - 60
libxsmm
Xcompact3d Incompact3d
OpenFOAM:
  drivaerFastback, Medium Mesh Size - Execution Time
  drivaerFastback, Medium Mesh Size - Mesh Time
ASKAP:
  tConvolve MT - Degridding
  tConvolve MT - Gridding
High Performance Conjugate Gradient
libxsmm
Blender
OpenFOAM:
  drivaerFastback, Small Mesh Size - Execution Time
  drivaerFastback, Small Mesh Size - Mesh Time
Blender:
  Pabellon Barcelona - CPU-Only
  Classroom - CPU-Only
Stress-NG:
  CPU Cache
  Vector Shuffle
  Wide Vector Math
  CPU Stress
  Vector Math
  Vector Floating Point
  Matrix Math
ASKAP:
  tConvolve MPI - Gridding
  tConvolve MPI - Degridding
HeFFTe - Highly Efficient FFT for Exascale:
  c2c - FFTW - double - 512
  c2c - Stock - double - 512
  c2c - FFTW - double-long - 512
  c2c - Stock - double-long - 512
GROMACS
Blender:
  Fishy Cat - CPU-Only
  BMW27 - CPU-Only
NAMD
LULESH
HeFFTe - Highly Efficient FFT for Exascale:
  r2c - FFTW - double-long - 512
  r2c - FFTW - double - 512
Xmrig
Embree
HeFFTe - Highly Efficient FFT for Exascale:
  r2c - Stock - double - 512
  r2c - Stock - double-long - 512
Embree
Xmrig
HeFFTe - Highly Efficient FFT for Exascale:
  c2c - Stock - float-long - 512
  c2c - Stock - float - 512
NAS Parallel Benchmarks
HeFFTe - Highly Efficient FFT for Exascale:
  c2c - FFTW - float - 512
  c2c - FFTW - float-long - 512
ASKAP
ASTC Encoder
NAS Parallel Benchmarks
miniFE
Xcompact3d Incompact3d
NAS Parallel Benchmarks
ASTC Encoder
NAS Parallel Benchmarks:
  SP.C
  LU.C
HeFFTe - Highly Efficient FFT for Exascale:
  r2c - FFTW - float-long - 512
  r2c - FFTW - float - 512
ASTC Encoder
HeFFTe - Highly Efficient FFT for Exascale:
  r2c - Stock - float-long - 512
  r2c - Stock - float - 512
Embree
ASKAP:
  tConvolve OpenMP - Degridding
  tConvolve OpenMP - Gridding
Embree:
  Pathtracer ISPC - Crown
  Pathtracer - Asian Dragon
  Pathtracer ISPC - Asian Dragon
libxsmm:
  64
  32
ASTC Encoder
NAS Parallel Benchmarks
HeFFTe - Highly Efficient FFT for Exascale:
  c2c - Stock - double-long - 256
  c2c - Stock - double - 256
  c2c - FFTW - double - 256
  c2c - FFTW - double-long - 256
NAS Parallel Benchmarks
Xcompact3d Incompact3d
NAS Parallel Benchmarks:
  SP.B
  MG.C
HeFFTe - Highly Efficient FFT for Exascale:
  c2c - Stock - float - 256
  r2c - Stock - double-long - 256
  r2c - Stock - double - 256
  c2c - FFTW - float - 256
  r2c - FFTW - double-long - 256
  c2c - Stock - float-long - 256
  c2c - FFTW - float-long - 256
  r2c - FFTW - double - 256
NAS Parallel Benchmarks
HeFFTe - Highly Efficient FFT for Exascale:
  r2c - Stock - float-long - 256
  r2c - FFTW - float - 256
  r2c - FFTW - float-long - 256
  r2c - Stock - float - 256
  c2c - Stock - double - 128
  c2c - Stock - double-long - 128
  c2c - FFTW - double-long - 128
  c2c - FFTW - double - 128
  r2c - FFTW - double - 128
  r2c - Stock - double - 128
  c2c - Stock - float - 128
  c2c - FFTW - float-long - 128
  r2c - FFTW - double-long - 128
  c2c - FFTW - float - 128
  r2c - Stock - double-long - 128
  c2c - Stock - float-long - 128
  r2c - Stock - float-long - 128
  r2c - FFTW - float-long - 128
  r2c - FFTW - float - 128
  r2c - Stock - float - 128