EPYC 9684X 1P

Tests for a future article. AMD EPYC 9684X 96-Core testing with a AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2307209-NE-EPYC9684X51
Jump To Table - Results

View

Do Not Show Noisy Results
Do Not Show Results With Incomplete Data
Do Not Show Results With Little Change/Spread
List Notable Results
Show Result Confidence Charts
Allow Limiting Results To Certain Suite(s)

Statistics

Show Overall Harmonic Mean(s)
Show Overall Geometric Mean
Show Wins / Losses Counts (Pie Chart)
Normalize Results
Remove Outliers Before Calculating Averages

Graph Settings

Force Line Graphs Where Applicable
Convert To Scalar Where Applicable
Disable Color Branding
Prefer Vertical Bar Graphs

Multi-Way Comparison

Condense Multi-Option Tests Into Single Result Graphs

Table

Show Detailed System Result Table

Run Management

Highlight
Result
Toggle/Hide
Result
Result
Identifier
View Logs
Performance Per
Dollar
Date
Run
  Test
  Duration
EPYC 9684X
July 20 2023
  2 Hours, 5 Minutes
AMD 9684X
July 20 2023
  2 Hours, 6 Minutes
Invert Behavior (Only Show Selected Data)
  2 Hours, 6 Minutes
Only show results matching title/arguments (delimit multiple options with a comma):
Do not show results matching title/arguments (delimit multiple options with a comma):


EPYC 9684X 1POpenBenchmarking.orgPhoronix Test SuiteAMD EPYC 9684X 96-Core @ 2.55GHz (96 Cores / 192 Threads)AMD Titanite_4G (RTI1007B BIOS)AMD Device 14a4768GB2 x 1920GB SAMSUNG MZWLJ1T9HBJR-00007ASPEEDBroadcom NetXtreme BCM5720 PCIeUbuntu 22.045.19.0-41-generic (x86_64)GNOME Shell 42.5X Server 1.21.1.41.3.224GCC 11.3.0ext41024x768ProcessorMotherboardChipsetMemoryDiskGraphicsNetworkOSKernelDesktopDisplay ServerVulkanCompilerFile-SystemScreen ResolutionEPYC 9684X 1P BenchmarksSystem Logs- Transparent Huge Pages: madvise- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa101121 - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

EPYC 9684X vs. AMD 9684X ComparisonPhoronix Test SuiteBaseline+12.1%+12.1%+24.2%+24.2%+36.3%+36.3%+48.4%+48.4%14.4%11.1%4.5%3.9%3.9%3.3%3.2%3.1%3%2.4%2.3%2.1%104 104 104 - 6048.5%tConvolve OpenMP - Degridding25%i.i.1.C.P.DtConvolve OpenMP - Griddingc2c - FFTW - double-long - 12810.5%c2c - FFTW - double-long - 2569%CG.C7.4%MG.C6.8%IS.D4.8%r2c - FFTW - float-long - 256c2c - FFTW - double - 1284.3%CPU Cacher2c - Stock - double-long - 256r2c - FFTW - double-long - 1283.5%c2c - FFTW - float-long - 128BT.Cr2c - Stock - float - 2563.2%c2c - FFTW - double - 256144 144 144 - 603.1%c2c - Stock - float-long - 128X.b.i.i2.9%r2c - FFTW - double - 2562.4%tConvolve MPI - Griddingc2c - Stock - double-long - 128c2c - Stock - float - 2562.1%FT.CHigh Performance Conjugate GradientASKAPXcompact3d Incompact3dASKAPHeFFTe - Highly Efficient FFT for ExascaleHeFFTe - Highly Efficient FFT for ExascaleNAS Parallel BenchmarksNAS Parallel BenchmarksNAS Parallel BenchmarksHeFFTe - Highly Efficient FFT for ExascaleHeFFTe - Highly Efficient FFT for ExascaleStress-NGHeFFTe - Highly Efficient FFT for ExascaleHeFFTe - Highly Efficient FFT for ExascaleHeFFTe - Highly Efficient FFT for ExascaleNAS Parallel BenchmarksHeFFTe - Highly Efficient FFT for ExascaleHeFFTe - Highly Efficient FFT for ExascaleHigh Performance Conjugate GradientHeFFTe - Highly Efficient FFT for ExascaleXcompact3d Incompact3dHeFFTe - Highly Efficient FFT for ExascaleASKAPHeFFTe - Highly Efficient FFT for ExascaleHeFFTe - Highly Efficient FFT for ExascaleNAS Parallel BenchmarksEPYC 9684XAMD 9684X

EPYC 9684X 1Pstress-ng: CPU Cachestress-ng: CPU Stressstress-ng: Matrix Mathstress-ng: Vector Mathstress-ng: Vector Shufflestress-ng: Wide Vector Mathstress-ng: Vector Floating Pointastcenc: Fastastcenc: Mediumastcenc: Thoroughastcenc: Exhaustiveheffte: c2c - FFTW - double - 512xmrig: Monero - 1Mxmrig: Wownero - 1Mheffte: c2c - Stock - float-long - 128heffte: c2c - FFTW - double - 256heffte: r2c - FFTW - float-long - 512heffte: c2c - FFTW - double-long - 256heffte: c2c - FFTW - float-long - 256heffte: r2c - FFTW - float-long - 128heffte: r2c - Stock - double - 128heffte: r2c - Stock - double - 512heffte: r2c - Stock - float - 512heffte: c2c - Stock - double - 256heffte: r2c - Stock - float - 128heffte: r2c - FFTW - double - 256heffte: c2c - Stock - float - 512heffte: c2c - Stock - float - 128gromacs: MPI CPU - water_GMX50_bareheffte: c2c - Stock - float - 256libxsmm: 128heffte: r2c - FFTW - double - 128libxsmm: 256heffte: r2c - FFTW - double - 512libxsmm: 32heffte: r2c - Stock - float - 256libxsmm: 64heffte: c2c - Stock - double - 128heffte: c2c - FFTW - float - 128heffte: c2c - Stock - double - 512heffte: c2c - FFTW - float - 256heffte: r2c - Stock - double - 256heffte: c2c - FFTW - float - 512heffte: c2c - FFTW - float-long - 128heffte: r2c - FFTW - float - 128heffte: c2c - FFTW - float-long - 512heffte: r2c - FFTW - float - 256heffte: r2c - FFTW - float-long - 256heffte: r2c - FFTW - float - 512heffte: c2c - FFTW - double-long - 128heffte: c2c - FFTW - double - 128heffte: c2c - FFTW - double-long - 512heffte: c2c - Stock - float-long - 256hpcg: 104 104 104 - 60heffte: r2c - Stock - double-long - 512hpcg: 160 160 160 - 60heffte: r2c - Stock - double-long - 128heffte: r2c - Stock - double-long - 256heffte: c2c - Stock - double-long - 256heffte: c2c - Stock - double-long - 512heffte: r2c - Stock - float-long - 512heffte: c2c - Stock - double-long - 128heffte: r2c - Stock - float-long - 128heffte: r2c - Stock - float-long - 256heffte: r2c - FFTW - double-long - 512heffte: r2c - FFTW - double-long - 256heffte: r2c - FFTW - double-long - 128hpcg: 144 144 144 - 60hpcg: 192 192 192 - 60npb: BT.Cnpb: CG.Cnpb: EP.Cnpb: EP.Dnpb: FT.Cnpb: IS.Dnpb: LU.Cnpb: MG.Cheffte: c2c - Stock - float-long - 512npb: SP.Bnpb: SP.Cnamd: ATPase Simulation - 327,506 Atomsaskap: tConvolve MT - Griddingaskap: tConvolve MT - Degriddingaskap: tConvolve MPI - Degriddingaskap: tConvolve MPI - Griddingaskap: tConvolve OpenMP - Griddingaskap: tConvolve OpenMP - Degriddingaskap: Hogbom Clean OpenMPlulesh: openfoam: drivaerFastback, Small Mesh Size - Mesh Timeopenfoam: drivaerFastback, Small Mesh Size - Execution Timeopenfoam: drivaerFastback, Medium Mesh Size - Mesh Timeopenfoam: drivaerFastback, Medium Mesh Size - Execution Timeminife: Smallincompact3d: X3D-benchmarking input.i3dincompact3d: input.i3d 129 Cells Per Directionincompact3d: input.i3d 193 Cells Per Directionblender: BMW27 - CPU-Onlyblender: Classroom - CPU-Onlyblender: Fishy Cat - CPU-Onlyblender: Barbershop - CPU-Onlyblender: Pabellon Barcelona - CPU-Onlyembree: Pathtracer - Crownembree: Pathtracer ISPC - Crownembree: Pathtracer - Asian Dragonembree: Pathtracer - Asian Dragon Objembree: Pathtracer ISPC - Asian Dragonembree: Pathtracer ISPC - Asian Dragon ObjEPYC 9684XAMD 9684X1373344.98213036.81418126.43545786.3763766.483480531.79256936.311019.0656421.435756.88986.141268.162769478.274195110.99882.5599332.90889.7978182.81184.354115.816144.622343.72383.8913178.478194.017149.982110.01611.801170.7343119.7125.2413177.4136.4581299.8331.6092479.370.1221130.75268.8242177.706174154.311126.301186.051155.713315.15318.534336.45184.365381.847368.445177.12334.355145.49622.8174115.339175.59483.530468.8118341.92368.6498176.529327.355135.542184.786129.28223.54622.7211305166.462397.227839.2310620.59118915.975839.25340754.87141081.08148.562170601.48211454.60.2506813617.81569159410.373226.726625.6665641204.8230858.46223.30187128.957931107.493184.4540953878.4377.5888062.244935047.6520352416.2640.4820.61142.149.42110.6946117.0799126.2454114.1719142.2792122.04131426891.13214798.32418120.08545747.2963760.463484985.88254718.11009.8949421.969256.89756.140668.275669671.874288.7114.30985.1501327.7382.3926179.544187.191116.285143.312339.16582.8517175.82189.443149.163111.53511.798167.1933091.5125.73131.5135.5561300.7321.422472.268.8689128.60868.3745176.295176.447155.387130.477188.224155.664312.386332.764340.74876.33778.493768.6279178.5623.1377143.53422.7753116.264182.39382.858768.6366336.83770.2296176.556328.778136.27184.654124.93222.847622.4173314982.5858099.4795210476.83121389.765572.16335538.3132064.96149.973169864.87208198.130.2462313606.91569158310.174970.22958453251.21204.8231065.28923.16633529.259894107.82577185.147553906.3388.7012021.963184957.6539349616.4640.420.44141.9649.5109.6474117.2488126.4095115.0435142.2995122.3101OpenBenchmarking.org

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: CPU CacheEPYC 9684XAMD 9684X300K600K900K1200K1500K1373344.981426891.131. (CXX) g++ options: -O2 -std=gnu99 -lc

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: CPU StressEPYC 9684XAMD 9684X50K100K150K200K250K213036.81214798.321. (CXX) g++ options: -O2 -std=gnu99 -lc

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Matrix MathAMD 9684XEPYC 9684X90K180K270K360K450K418120.08418126.431. (CXX) g++ options: -O2 -std=gnu99 -lc

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Vector MathAMD 9684XEPYC 9684X120K240K360K480K600K545747.29545786.371. (CXX) g++ options: -O2 -std=gnu99 -lc

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Vector ShuffleAMD 9684XEPYC 9684X14K28K42K56K70K63760.4663766.481. (CXX) g++ options: -O2 -std=gnu99 -lc

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Wide Vector MathEPYC 9684XAMD 9684X700K1400K2100K2800K3500K3480531.793484985.881. (CXX) g++ options: -O2 -std=gnu99 -lc

OpenBenchmarking.orgBogo Ops/s, More Is BetterStress-NG 0.15.10Test: Vector Floating PointAMD 9684XEPYC 9684X60K120K180K240K300K254718.10256936.311. (CXX) g++ options: -O2 -std=gnu99 -lc

ASTC Encoder

ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgMT/s, More Is BetterASTC Encoder 4.0Preset: FastAMD 9684XEPYC 9684X20040060080010001009.891019.071. (CXX) g++ options: -O3 -flto -pthread

OpenBenchmarking.orgMT/s, More Is BetterASTC Encoder 4.0Preset: MediumEPYC 9684XAMD 9684X90180270360450421.44421.971. (CXX) g++ options: -O3 -flto -pthread

OpenBenchmarking.orgMT/s, More Is BetterASTC Encoder 4.0Preset: ThoroughEPYC 9684XAMD 9684X132639526556.8956.901. (CXX) g++ options: -O3 -flto -pthread

OpenBenchmarking.orgMT/s, More Is BetterASTC Encoder 4.0Preset: ExhaustiveAMD 9684XEPYC 9684X2468106.14066.14121. (CXX) g++ options: -O3 -flto -pthread

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512EPYC 9684XAMD 9684X153045607568.1668.281. (CXX) g++ options: -O3

Xmrig

Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is setup to measure the Xmlrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgH/s, More Is BetterXmrig 6.18.1Variant: Monero - Hash Count: 1MEPYC 9684XAMD 9684X15K30K45K60K75K69478.269671.81. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

OpenBenchmarking.orgH/s, More Is BetterXmrig 6.18.1Variant: Wownero - Hash Count: 1MEPYC 9684XAMD 9684X16K32K48K64K80K74195.074288.71. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 128EPYC 9684XAMD 9684X306090120150111.00114.311. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256EPYC 9684XAMD 9684X2040608010082.5685.151. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 512AMD 9684XEPYC 9684X70140210280350327.73332.911. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256AMD 9684XEPYC 9684X2040608010082.3989.801. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 256AMD 9684XEPYC 9684X4080120160200179.54182.811. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 128EPYC 9684XAMD 9684X4080120160200184.35187.191. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: double - X Y Z: 128EPYC 9684XAMD 9684X306090120150115.82116.291. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: double - X Y Z: 512AMD 9684XEPYC 9684X306090120150143.31144.621. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: float - X Y Z: 512AMD 9684XEPYC 9684X70140210280350339.17343.721. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: double - X Y Z: 256AMD 9684XEPYC 9684X2040608010082.8583.891. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: float - X Y Z: 128AMD 9684XEPYC 9684X4080120160200175.82178.481. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256AMD 9684XEPYC 9684X4080120160200189.44194.021. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: float - X Y Z: 512AMD 9684XEPYC 9684X306090120150149.16149.981. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: float - X Y Z: 128EPYC 9684XAMD 9684X20406080100110.02111.541. (CXX) g++ options: -O3

GROMACS

The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package testing with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgNs Per Day, More Is BetterGROMACS 2023Implementation: MPI CPU - Input: water_GMX50_bareAMD 9684XEPYC 9684X369121511.8011.801. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: float - X Y Z: 256AMD 9684XEPYC 9684X4080120160200167.19170.731. (CXX) g++ options: -O3

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 128AMD 9684XEPYC 9684X70014002100280035003091.53119.71. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128EPYC 9684XAMD 9684X306090120150125.24125.701. (CXX) g++ options: -O3

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 256AMD 9684XEPYC 9684X70014002100280035003131.53177.41. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512AMD 9684XEPYC 9684X306090120150135.56136.461. (CXX) g++ options: -O3

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 32EPYC 9684XAMD 9684X300600900120015001299.81300.71. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: float - X Y Z: 256AMD 9684XEPYC 9684X70140210280350321.42331.611. (CXX) g++ options: -O3

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 64AMD 9684XEPYC 9684X50010001500200025002472.22479.31. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: double - X Y Z: 128AMD 9684XEPYC 9684X163248648068.8770.121. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128AMD 9684XEPYC 9684X306090120150128.61130.751. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: double - X Y Z: 512AMD 9684XEPYC 9684X153045607568.3768.821. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256AMD 9684XEPYC 9684X4080120160200176.30177.711. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: double - X Y Z: 256EPYC 9684XAMD 9684X4080120160200174.00176.451. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512EPYC 9684XAMD 9684X306090120150154.31155.391. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 128EPYC 9684XAMD 9684X306090120150126.30130.481. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128EPYC 9684XAMD 9684X4080120160200186.05188.221. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 512AMD 9684XEPYC 9684X306090120150155.66155.711. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256AMD 9684XEPYC 9684X70140210280350312.39315.151. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 256EPYC 9684XAMD 9684X70140210280350318.53332.761. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512EPYC 9684XAMD 9684X70140210280350336.45340.751. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128AMD 9684XEPYC 9684X2040608010076.3484.371. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128AMD 9684XEPYC 9684X2040608010078.4981.851. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 512EPYC 9684XAMD 9684X153045607568.4568.631. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256EPYC 9684XAMD 9684X4080120160200177.12178.561. (CXX) g++ options: -O3

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient and is a new scientific benchmark from Sandia National Lans focused for super-computer testing with modern real-world workloads compared to HPCC. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOP/s, More Is BetterHigh Performance Conjugate Gradient 3.1X Y Z: 104 104 104 - RT: 60AMD 9684XEPYC 9684X81624324023.1434.361. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 512AMD 9684XEPYC 9684X306090120150143.53145.501. (CXX) g++ options: -O3

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient and is a new scientific benchmark from Sandia National Lans focused for super-computer testing with modern real-world workloads compared to HPCC. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOP/s, More Is BetterHigh Performance Conjugate Gradient 3.1X Y Z: 160 160 160 - RT: 60AMD 9684XEPYC 9684X51015202522.7822.821. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128EPYC 9684XAMD 9684X306090120150115.34116.261. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256EPYC 9684XAMD 9684X4080120160200175.59182.391. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256AMD 9684XEPYC 9684X2040608010082.8683.531. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 512AMD 9684XEPYC 9684X153045607568.6468.811. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 512AMD 9684XEPYC 9684X70140210280350336.84341.921. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128EPYC 9684XAMD 9684X163248648068.6570.231. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 128EPYC 9684XAMD 9684X4080120160200176.53176.561. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 256EPYC 9684XAMD 9684X70140210280350327.36328.781. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512EPYC 9684XAMD 9684X306090120150135.54136.271. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256AMD 9684XEPYC 9684X4080120160200184.65184.791. (CXX) g++ options: -O3

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128AMD 9684XEPYC 9684X306090120150124.93129.281. (CXX) g++ options: -O3

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient and is a new scientific benchmark from Sandia National Lans focused for super-computer testing with modern real-world workloads compared to HPCC. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOP/s, More Is BetterHigh Performance Conjugate Gradient 3.1X Y Z: 144 144 144 - RT: 60AMD 9684XEPYC 9684X61218243022.8523.551. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

OpenBenchmarking.orgGFLOP/s, More Is BetterHigh Performance Conjugate Gradient 3.1X Y Z: 192 192 192 - RT: 60AMD 9684XEPYC 9684X51015202522.4222.721. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

NAS Parallel Benchmarks

NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: BT.CEPYC 9684XAMD 9684X70K140K210K280K350K305166.40314982.581. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: CG.CAMD 9684XEPYC 9684X13K26K39K52K65K58099.4062397.221. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: EP.CEPYC 9684XAMD 9684X2K4K6K8K10K7839.237952.001. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: EP.DAMD 9684XEPYC 9684X2K4K6K8K10K10476.8310620.591. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: FT.CEPYC 9684XAMD 9684X30K60K90K120K150K118915.97121389.761. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: IS.DAMD 9684XEPYC 9684X130026003900520065005572.165839.251. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: LU.CAMD 9684XEPYC 9684X70K140K210K280K350K335538.30340754.871. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: MG.CAMD 9684XEPYC 9684X30K60K90K120K150K132064.96141081.081. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently catering to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 512EPYC 9684XAMD 9684X306090120150148.56149.971. (CXX) g++ options: -O3

NAS Parallel Benchmarks

NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: SP.BAMD 9684XEPYC 9684X40K80K120K160K200K169864.87170601.481. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

OpenBenchmarking.orgTotal Mop/s, More Is BetterNAS Parallel Benchmarks 3.4Test / Class: SP.CAMD 9684XEPYC 9684X50K100K150K200K250K208198.13211454.601. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2

NAMD

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgdays/ns, Fewer Is BetterNAMD 2.14ATPase Simulation - 327,506 AtomsEPYC 9684XAMD 9684X0.05640.11280.16920.22560.2820.250680.24623

ASKAP

ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and Convolutional Resamping Benchmark (tConvolve) as well as some previous ASKAP benchmarks being included as well for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgMillion Grid Points Per Second, More Is BetterASKAP 1.0Test: tConvolve MT - GriddingAMD 9684XEPYC 9684X3K6K9K12K15K13606.913617.81. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

OpenBenchmarking.orgMillion Grid Points Per Second, More Is BetterASKAP 1.0Test: tConvolve MT - DegriddingEPYC 9684XAMD 9684X3K6K9K12K15K15691156911. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

OpenBenchmarking.orgMpix/sec, More Is BetterASKAP 1.0Test: tConvolve MPI - DegriddingAMD 9684XEPYC 9684X13K26K39K52K65K58310.159410.31. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

OpenBenchmarking.orgMpix/sec, More Is BetterASKAP 1.0Test: tConvolve MPI - GriddingEPYC 9684XAMD 9684X16K32K48K64K80K73226.774970.21. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

OpenBenchmarking.orgMillion Grid Points Per Second, More Is BetterASKAP 1.0Test: tConvolve OpenMP - GriddingEPYC 9684XAMD 9684X6K12K18K24K30K26625.629584.01. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

OpenBenchmarking.orgMillion Grid Points Per Second, More Is BetterASKAP 1.0Test: tConvolve OpenMP - DegriddingAMD 9684XEPYC 9684X14K28K42K56K70K53251.266564.01. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

OpenBenchmarking.orgIterations Per Second, More Is BetterASKAP 1.0Test: Hogbom Clean OpenMPEPYC 9684XAMD 9684X300600900120015001204.821204.821. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

LULESH

LULESH is the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgz/s, More Is BetterLULESH 2.0.3EPYC 9684XAMD 9684X7K14K21K28K35K30858.4631065.291. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi

OpenFOAM

OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterOpenFOAM 10Input: drivaerFastback, Small Mesh Size - Mesh TimeEPYC 9684XAMD 9684X61218243023.3023.171. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenBenchmarking.orgSeconds, Fewer Is BetterOpenFOAM 10Input: drivaerFastback, Small Mesh Size - Execution TimeAMD 9684XEPYC 9684X71421283529.2628.961. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenBenchmarking.orgSeconds, Fewer Is BetterOpenFOAM 10Input: drivaerFastback, Medium Mesh Size - Mesh TimeAMD 9684XEPYC 9684X20406080100107.83107.491. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenBenchmarking.orgSeconds, Fewer Is BetterOpenFOAM 10Input: drivaerFastback, Medium Mesh Size - Execution TimeAMD 9684XEPYC 9684X4080120160200185.15184.451. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

miniFE

MiniFE Finite Element is an application for unstructured implicit finite element codes. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgCG Mflops, More Is BetterminiFE 2.2Problem Size: SmallEPYC 9684XAMD 9684X12K24K36K48K60K53878.453906.31. (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi

Xcompact3d Incompact3d

Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equation and as many as you need scalar transport equations. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterXcompact3d Incompact3d 2021-03-11Input: X3D-benchmarking input.i3dAMD 9684XEPYC 9684X80160240320400388.70377.591. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

OpenBenchmarking.orgSeconds, Fewer Is BetterXcompact3d Incompact3d 2021-03-11Input: input.i3d 129 Cells Per DirectionEPYC 9684XAMD 9684X0.50511.01021.51532.02042.52552.244935041.963184951. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

OpenBenchmarking.orgSeconds, Fewer Is BetterXcompact3d Incompact3d 2021-03-11Input: input.i3d 193 Cells Per DirectionAMD 9684XEPYC 9684X2468107.653934967.652035241. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Blender

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 3.6Blend File: BMW27 - Compute: CPU-OnlyAMD 9684XEPYC 9684X4812162016.4616.26

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 3.6Blend File: Classroom - Compute: CPU-OnlyEPYC 9684XAMD 9684X91827364540.4840.40

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 3.6Blend File: Fishy Cat - Compute: CPU-OnlyEPYC 9684XAMD 9684X51015202520.6120.44

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 3.6Blend File: Barbershop - Compute: CPU-OnlyEPYC 9684XAMD 9684X306090120150142.10141.96

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 3.6Blend File: Pabellon Barcelona - Compute: CPU-OnlyAMD 9684XEPYC 9684X112233445549.5049.42

Embree

Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL) and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.1Binary: Pathtracer - Model: CrownAMD 9684XEPYC 9684X20406080100109.65110.69MIN: 107.65 / MAX: 113.51MIN: 108.41 / MAX: 113.64

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.1Binary: Pathtracer ISPC - Model: CrownEPYC 9684XAMD 9684X306090120150117.08117.25MIN: 114.57 / MAX: 120.79MIN: 114.61 / MAX: 121.35

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.1Binary: Pathtracer - Model: Asian DragonEPYC 9684XAMD 9684X306090120150126.25126.41MIN: 124.97 / MAX: 128.11MIN: 124.93 / MAX: 128.49

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.1Binary: Pathtracer - Model: Asian Dragon ObjEPYC 9684XAMD 9684X306090120150114.17115.04MIN: 112.65 / MAX: 116.39MIN: 113.62 / MAX: 117.34

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.1Binary: Pathtracer ISPC - Model: Asian DragonEPYC 9684XAMD 9684X306090120150142.28142.30MIN: 140.41 / MAX: 144.57MIN: 140.74 / MAX: 144.53

OpenBenchmarking.orgFrames Per Second, More Is BetterEmbree 4.1Binary: Pathtracer ISPC - Model: Asian Dragon ObjEPYC 9684XAMD 9684X306090120150122.04122.31MIN: 120.36 / MAX: 124.72MIN: 120.61 / MAX: 124.64

108 Results Shown

Stress-NG:
  CPU Cache
  CPU Stress
  Matrix Math
  Vector Math
  Vector Shuffle
  Wide Vector Math
  Vector Floating Point
ASTC Encoder:
  Fast
  Medium
  Thorough
  Exhaustive
HeFFTe - Highly Efficient FFT for Exascale
Xmrig:
  Monero - 1M
  Wownero - 1M
HeFFTe - Highly Efficient FFT for Exascale:
  c2c - Stock - float-long - 128
  c2c - FFTW - double - 256
  r2c - FFTW - float-long - 512
  c2c - FFTW - double-long - 256
  c2c - FFTW - float-long - 256
  r2c - FFTW - float-long - 128
  r2c - Stock - double - 128
  r2c - Stock - double - 512
  r2c - Stock - float - 512
  c2c - Stock - double - 256
  r2c - Stock - float - 128
  r2c - FFTW - double - 256
  c2c - Stock - float - 512
  c2c - Stock - float - 128
GROMACS
HeFFTe - Highly Efficient FFT for Exascale
libxsmm
HeFFTe - Highly Efficient FFT for Exascale
libxsmm
HeFFTe - Highly Efficient FFT for Exascale
libxsmm
HeFFTe - Highly Efficient FFT for Exascale
libxsmm
HeFFTe - Highly Efficient FFT for Exascale:
  c2c - Stock - double - 128
  c2c - FFTW - float - 128
  c2c - Stock - double - 512
  c2c - FFTW - float - 256
  r2c - Stock - double - 256
  c2c - FFTW - float - 512
  c2c - FFTW - float-long - 128
  r2c - FFTW - float - 128
  c2c - FFTW - float-long - 512
  r2c - FFTW - float - 256
  r2c - FFTW - float-long - 256
  r2c - FFTW - float - 512
  c2c - FFTW - double-long - 128
  c2c - FFTW - double - 128
  c2c - FFTW - double-long - 512
  c2c - Stock - float-long - 256
High Performance Conjugate Gradient
HeFFTe - Highly Efficient FFT for Exascale
High Performance Conjugate Gradient
HeFFTe - Highly Efficient FFT for Exascale:
  r2c - Stock - double-long - 128
  r2c - Stock - double-long - 256
  c2c - Stock - double-long - 256
  c2c - Stock - double-long - 512
  r2c - Stock - float-long - 512
  c2c - Stock - double-long - 128
  r2c - Stock - float-long - 128
  r2c - Stock - float-long - 256
  r2c - FFTW - double-long - 512
  r2c - FFTW - double-long - 256
  r2c - FFTW - double-long - 128
High Performance Conjugate Gradient:
  144 144 144 - 60
  192 192 192 - 60
NAS Parallel Benchmarks:
  BT.C
  CG.C
  EP.C
  EP.D
  FT.C
  IS.D
  LU.C
  MG.C
HeFFTe - Highly Efficient FFT for Exascale
NAS Parallel Benchmarks:
  SP.B
  SP.C
NAMD
ASKAP:
  tConvolve MT - Gridding
  tConvolve MT - Degridding
  tConvolve MPI - Degridding
  tConvolve MPI - Gridding
  tConvolve OpenMP - Gridding
  tConvolve OpenMP - Degridding
  Hogbom Clean OpenMP
LULESH
OpenFOAM:
  drivaerFastback, Small Mesh Size - Mesh Time
  drivaerFastback, Small Mesh Size - Execution Time
  drivaerFastback, Medium Mesh Size - Mesh Time
  drivaerFastback, Medium Mesh Size - Execution Time
miniFE
Xcompact3d Incompact3d:
  X3D-benchmarking input.i3d
  input.i3d 129 Cells Per Direction
  input.i3d 193 Cells Per Direction
Blender:
  BMW27 - CPU-Only
  Classroom - CPU-Only
  Fishy Cat - CPU-Only
  Barbershop - CPU-Only
  Pabellon Barcelona - CPU-Only
Embree:
  Pathtracer - Crown
  Pathtracer ISPC - Crown
  Pathtracer - Asian Dragon
  Pathtracer - Asian Dragon Obj
  Pathtracer ISPC - Asian Dragon
  Pathtracer ISPC - Asian Dragon Obj