AMD EPYC 9575F HPC Tuning Guide

Benchmarks for a future article by Michael Larabel.

HTML result view exported from: https://openbenchmarking.org/result/2411294-NE-AMDEPYC9542&sor&grs.

System details (Default configuration unless noted)

Processor: AMD EPYC 9575F 64-Core @ 3.30GHz (64 Cores / 128 Threads)
Processor (HPC Tuning Recommendations): AMD EPYC 9575F 64-Core @ 3.30GHz (64 Cores)
Motherboard: Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS)
Chipset: AMD 1Ah
Memory: 12 x 64GB DDR5-6000MT/s Micron MTC40F2046S1RC64BDY QSFF
Disk: 3201GB Micron_7450_MTFDKCB3T2TFS
Graphics: ASPEED
Network: 2 x Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 24.10
Kernel: 6.12.0-rc7-linux-pm-next-phx (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 1024x768

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
- CPU Microcode: 0xb002116

Python Details
- Python 3.12.7

Security Details
- Default: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
- HPC Tuning Recommendations: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: disabled; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Result overview (Test | Default | HPC Tuning Recommendations)

xnnpack: FP16MobileNetV3Small | 5449 | 2842
xnnpack: FP32MobileNetV3Large | 7258 | 3959
xnnpack: FP16MobileNetV3Large | 7046 | 3856
xnnpack: FP32MobileNetV2 | 4758 | 2651
openfoam: drivaerFastback, Large Mesh Size - Execution Time | 8380.5323 | 6426.0644
openfoam: drivaerFastback, Medium Mesh Size - Execution Time | 283.1187 | 218.12775
libxsmm: 256 | 2422.7 | 3098.9
openradioss: Bumper Beam | 66.09 | 52.73
hpcg: 160 160 160 - 60 | 40.3481 | 49.0908
openfoam: drivaerFastback, Small Mesh Size - Mesh Time | 18.974323 | 15.967165
heffte: c2c - FFTW - float - 256 | 178.295 | 210.419
openfoam: drivaerFastback, Large Mesh Size - Mesh Time | 603.69492 | 517.58403
openradioss: Chrysler Neon 1M | 125.14 | 110.25
incompact3d: input.i3d 193 Cells Per Direction | 7.95593405 | 7.01252317
heffte: r2c - FFTW - float - 256 | 374.902 | 411.072
graph500: 26 (bfs max_TEPS) | 1335290000 | 1461890000
graph500: 26 (bfs median_TEPS) | 1305560000 | 1420090000
graph500: 26 (sssp max_TEPS) | 600153000 | 646942000
specfem3d: Tomographic Model | 7.310008888 | 6.872884974
openfoam: drivaerFastback, Medium Mesh Size - Mesh Time | 84.119326 | 79.468032
specfem3d: Homogeneous Halfspace | 9.007696830 | 8.514378931
openradioss: Bird Strike on Windshield | 82.77 | 78.27
graph500: 26 (sssp median_TEPS) | 470378000 | 495711000
openradioss: INIVOL and Fluid Structure Interaction Drop Container | 90.69 | 94.31
nwchem: C240 Buckyball | 1249.1 | 1298.2
easywave: e2Asean Grid + BengkuluSept2007 Source - 1200 | 20.814 | 21.265
gromacs: MPI CPU - water_GMX50_bare | 14.533 | 14.229
openradioss: Cell Phone Drop Test | 17.76 | 18.07
easywave: e2Asean Grid + BengkuluSept2007 Source - 2400 | 51.643 | 52.469
cp2k: H20-256 | 137.802 | 135.945
openfoam: drivaerFastback, Small Mesh Size - Execution Time | 24.579344 | 24.256714
openradioss: Rubber O-Ring Seal Installation | 38.44 | 38.33
heffte: r2c - FFTW - float - 128 | 332.013 | 332.913
gpaw: Carbon Nanotube | 27.857 | 27.837
xnnpack: QS8MobileNetV2 | 5032 | 2617
xnnpack: FP16MobileNetV2 | 4681 | 2553
xnnpack: FP16MobileNetV1 | 2467 | 1489
xnnpack: FP32MobileNetV3Small | 5398 | 2941
xnnpack: FP32MobileNetV1 | 2429 | 1686
heffte: c2c - FFTW - float - 128 | 223.464 | 235.079
incompact3d: X3D-benchmarking input.i3d | 325.052568 | 245.679475
hpcg: 144 144 144 - 60 | 41.0940 | 51.0171
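The results in this file mix fewer-is-better metrics (microseconds, seconds) with more-is-better metrics (GFLOP/s, TEPS, Ns Per Day), so the two configurations cannot be compared with a single ratio. The following is a minimal Python sketch, assuming a hypothetical helper named `speedup` (it is not part of the Phoronix Test Suite), that normalizes both cases so a value above 1.0 favors the HPC Tuning Recommendations run. The sample numbers are copied from the results in this file.

```python
def speedup(default: float, tuned: float, higher_is_better: bool) -> float:
    """Return how many times better the tuned result is than the default.

    Hypothetical helper for illustration only; values > 1.0 favor the tuned run.
    """
    if default <= 0 or tuned <= 0:
        raise ValueError("results must be positive")
    # For throughput-style metrics a larger tuned value is a win;
    # for time-style metrics a smaller tuned value is a win.
    return tuned / default if higher_is_better else default / tuned


# (test name, Default, HPC Tuning Recommendations, higher_is_better)
samples = [
    ("xnnpack: FP16MobileNetV3Small (us)", 5449, 2842, False),
    ("hpcg: 160 160 160 - 60 (GFLOP/s)", 40.3481, 49.0908, True),
    ("gromacs: MPI CPU - water_GMX50_bare (Ns Per Day)", 14.533, 14.229, True),
]

for name, default, tuned, higher_is_better in samples:
    print(f"{name}: {speedup(default, tuned, higher_is_better):.2f}x")
```

A ratio below 1.0, as for GROMACS here, marks a test where the Default configuration was ahead.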

XNNPACK

Model: FP16MobileNetV3Small

XNNPACK b7b048 (us, Fewer Is Better)
HPC Tuning Recommendations: 2842 (SE +/- 15.58, N = 10)
Default: 5449 (SE +/- 65.16, N = 3)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV3Large

XNNPACK b7b048 (us, Fewer Is Better)
HPC Tuning Recommendations: 3959 (SE +/- 23.40, N = 10)
Default: 7258 (SE +/- 43.97, N = 3)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV3Large

XNNPACK b7b048 (us, Fewer Is Better)
HPC Tuning Recommendations: 3856 (SE +/- 55.41, N = 10)
Default: 7046 (SE +/- 11.89, N = 3)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV2

XNNPACK b7b048 (us, Fewer Is Better)
HPC Tuning Recommendations: 2651 (SE +/- 49.78, N = 10)
Default: 4758 (SE +/- 24.31, N = 3)
1. (CXX) g++ options: -O3 -lrt -lm

OpenFOAM

Input: drivaerFastback, Large Mesh Size - Execution Time

OpenFOAM 10 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 6426.06
Default: 8380.53
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenFOAM

Input: drivaerFastback, Medium Mesh Size - Execution Time

OpenFOAM 10 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 218.13
Default: 283.12
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

libxsmm

M N K: 256

libxsmm 2-1.17-3645 (GFLOPS/s, More Is Better)
HPC Tuning Recommendations: 3098.9 (SE +/- 23.61, N = 3)
Default: 2422.7 (SE +/- 4.19, N = 3)
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

OpenRadioss

Model: Bumper Beam

OpenRadioss 2023.09.15 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 52.73 (SE +/- 0.19, N = 3)
Default: 66.09 (SE +/- 0.61, N = 15)

High Performance Conjugate Gradient

X Y Z: 160 160 160 - RT: 60

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
HPC Tuning Recommendations: 49.09 (SE +/- 0.11, N = 3)
Default: 40.35 (SE +/- 0.27, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

OpenFOAM

Input: drivaerFastback, Small Mesh Size - Mesh Time

OpenFOAM 10 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 15.97
Default: 18.97
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256

HeFFTe - Highly Efficient FFT for Exascale 2.4 (GFLOP/s, More Is Better)
HPC Tuning Recommendations: 210.42 (SE +/- 0.34, N = 13)
Default: 178.30 (SE +/- 2.31, N = 15)
1. (CXX) g++ options: -O3

OpenFOAM

Input: drivaerFastback, Large Mesh Size - Mesh Time

OpenFOAM 10 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 517.58
Default: 603.69
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenRadioss

Model: Chrysler Neon 1M

OpenRadioss 2023.09.15 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 110.25 (SE +/- 0.10, N = 3)
Default: 125.14 (SE +/- 1.75, N = 12)

Xcompact3d Incompact3d

Input: input.i3d 193 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 7.01252317 (SE +/- 0.02216234, N = 6)
Default: 7.95593405 (SE +/- 0.05610512, N = 5)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256

HeFFTe - Highly Efficient FFT for Exascale 2.4 (GFLOP/s, More Is Better)
HPC Tuning Recommendations: 411.07 (SE +/- 2.78, N = 15)
Default: 374.90 (SE +/- 2.51, N = 13)
1. (CXX) g++ options: -O3

Graph500

Scale: 26

Graph500 3.0 (bfs max_TEPS, More Is Better)
HPC Tuning Recommendations: 1461890000
Default: 1335290000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500

Scale: 26

Graph500 3.0 (bfs median_TEPS, More Is Better)
HPC Tuning Recommendations: 1420090000
Default: 1305560000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500

Scale: 26

Graph500 3.0 (sssp max_TEPS, More Is Better)
HPC Tuning Recommendations: 646942000
Default: 600153000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

SPECFEM3D

Model: Tomographic Model

SPECFEM3D 4.1.1 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 6.872884974 (SE +/- 0.017950948, N = 5)
Default: 7.310008888 (SE +/- 0.069507248, N = 6)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

OpenFOAM

Input: drivaerFastback, Medium Mesh Size - Mesh Time

OpenFOAM 10 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 79.47
Default: 84.12
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

SPECFEM3D

Model: Homogeneous Halfspace

SPECFEM3D 4.1.1 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 8.514378931 (SE +/- 0.018784174, N = 5)
Default: 9.007696830 (SE +/- 0.067842113, N = 5)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2008 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

OpenRadioss

Model: Bird Strike on Windshield

OpenRadioss 2023.09.15 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 78.27 (SE +/- 0.26, N = 3)
Default: 82.77 (SE +/- 0.12, N = 3)

Graph500

Scale: 26

Graph500 3.0 (sssp median_TEPS, More Is Better)
HPC Tuning Recommendations: 495711000
Default: 470378000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

OpenRadioss

Model: INIVOL and Fluid Structure Interaction Drop Container

OpenRadioss 2023.09.15 (Seconds, Fewer Is Better)
Default: 90.69 (SE +/- 0.35, N = 3)
HPC Tuning Recommendations: 94.31 (SE +/- 0.16, N = 3)

NWChem

Input: C240 Buckyball

NWChem 7.2.3 (Seconds, Fewer Is Better)
Default: 1249.1
HPC Tuning Recommendations: 1298.2
1. (F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lfcidump -lgwmol -lga -larmci -lpeigs -l64to32 -llapack -lopenblas -lpthread -lrt -lcomex -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz -ffast-math -std=legacy -fdefault-integer-8 -O0

easyWave

Input: e2Asean Grid + BengkuluSept2007 Source - Time: 1200

easyWave r34 (Seconds, Fewer Is Better)
Default: 20.81 (SE +/- 0.23, N = 3)
HPC Tuning Recommendations: 21.27 (SE +/- 0.16, N = 3)
1. (CXX) g++ options: -O3 -fopenmp

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2024 (Ns Per Day, More Is Better)
Default: 14.53 (SE +/- 0.00, N = 3)
HPC Tuning Recommendations: 14.23 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -lm

OpenRadioss

Model: Cell Phone Drop Test

OpenRadioss 2023.09.15 (Seconds, Fewer Is Better)
Default: 17.76 (SE +/- 0.16, N = 3)
HPC Tuning Recommendations: 18.07 (SE +/- 0.02, N = 3)

easyWave

Input: e2Asean Grid + BengkuluSept2007 Source - Time: 2400

easyWave r34 (Seconds, Fewer Is Better)
Default: 51.64 (SE +/- 0.13, N = 3)
HPC Tuning Recommendations: 52.47 (SE +/- 0.49, N = 15)
1. (CXX) g++ options: -O3 -fopenmp

CP2K Molecular Dynamics

Input: H20-256

CP2K Molecular Dynamics 2024.3 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 135.95 (SE +/- 0.47, N = 3)
Default: 137.80 (SE +/- 0.29, N = 3)
1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm

OpenFOAM

Input: drivaerFastback, Small Mesh Size - Execution Time

OpenFOAM 10 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 24.26
Default: 24.58
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenRadioss

Model: Rubber O-Ring Seal Installation

OpenRadioss 2023.09.15 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 38.33 (SE +/- 0.02, N = 3)
Default: 38.44 (SE +/- 0.41, N = 4)

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128

HeFFTe - Highly Efficient FFT for Exascale 2.4 (GFLOP/s, More Is Better)
HPC Tuning Recommendations: 332.91 (SE +/- 3.79, N = 15)
Default: 332.01 (SE +/- 1.37, N = 14)
1. (CXX) g++ options: -O3

GPAW

Input: Carbon Nanotube

GPAW 23.6 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 27.84 (SE +/- 0.07, N = 3)
Default: 27.86 (SE +/- 0.17, N = 3)
1. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi

NWChem

System Power Consumption Monitor

NWChem 7.2.3 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 98 / Avg: 530 / Max: 586
Default: Min: 100 / Avg: 602 / Max: 654

NWChem

CPU Power Consumption Monitor

NWChem 7.2.3 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 2.0 / Avg: 339.3 / Max: 355.6
Default: Min: 83.7 / Avg: 391.1 / Max: 402.0
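The NWChem results show a trade-off: the HPC Tuning Recommendations run takes longer (1298.2 versus 1249.1 seconds) but draws less average CPU power (339.3 versus 391.1 Watts). One rough way to weigh the two is average power multiplied by run time. The Python sketch below does that under a strong assumption: the monitor's average, which covers the whole monitored session, is treated as if it applied to exactly one timed run. `energy_joules` is a hypothetical helper for illustration, not a Phoronix Test Suite function.

```python
def energy_joules(avg_watts: float, seconds: float) -> float:
    """Estimate energy as average power times duration (1 W * 1 s = 1 J).

    Assumption: avg_watts is representative of the timed run itself.
    """
    return avg_watts * seconds


# NWChem C240 Buckyball: CPU power average (Watts) and run time (Seconds),
# both copied from the results in this file.
default_j = energy_joules(391.1, 1249.1)
tuned_j = energy_joules(339.3, 1298.2)

print(f"Default: {default_j / 1000:.1f} kJ")
print(f"HPC Tuning Recommendations: {tuned_j / 1000:.1f} kJ")
print(f"Tuned / Default energy ratio: {tuned_j / default_j:.2f}")
```

Under that assumption the tuned run uses less CPU energy despite the longer run time; the estimate says nothing about memory, fans, or other system power.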

libxsmm

System Power Consumption Monitor

libxsmm 2-1.17-3645 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 98 / Avg: 387 / Max: 556
Default: Min: 103 / Avg: 486 / Max: 644

libxsmm

CPU Power Consumption Monitor

libxsmm 2-1.17-3645 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 1.9 / Avg: 236.7 / Max: 327.5
Default: Min: 107.8 / Avg: 296.6 / Max: 387.2

XNNPACK

System Power Consumption Monitor

XNNPACK b7b048 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 97.7 / Avg: 353.6 / Max: 403.1
Default: Min: 94.9 / Avg: 438.7 / Max: 489.6

XNNPACK

CPU Power Consumption Monitor

XNNPACK b7b048 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 1.9 / Avg: 224.7 / Max: 253.8
Default: Min: 126.0 / Avg: 294.5 / Max: 327.8

XNNPACK

Model: QS8MobileNetV2

XNNPACK b7b048 (us, Fewer Is Better)
HPC Tuning Recommendations: 2617 (SE +/- 75.96, N = 10)
Default: 5032 (SE +/- 15.76, N = 3)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV2

XNNPACK b7b048 (us, Fewer Is Better)
HPC Tuning Recommendations: 2553 (SE +/- 73.59, N = 10)
Default: 4681 (SE +/- 14.43, N = 3)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV1

XNNPACK b7b048 (us, Fewer Is Better)
HPC Tuning Recommendations: 1489 (SE +/- 75.71, N = 10)
Default: 2467 (SE +/- 1.45, N = 3)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV3Small

XNNPACK b7b048 (us, Fewer Is Better)
HPC Tuning Recommendations: 2941 (SE +/- 99.39, N = 10)
Default: 5398 (SE +/- 13.20, N = 3)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV1

XNNPACK b7b048 (us, Fewer Is Better)
HPC Tuning Recommendations: 1686 (SE +/- 170.15, N = 10)
Default: 2429 (SE +/- 5.13, N = 3)
1. (CXX) g++ options: -O3 -lrt -lm

HeFFTe - Highly Efficient FFT for Exascale

System Power Consumption Monitor

HeFFTe - Highly Efficient FFT for Exascale 2.4 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 95.2 / Avg: 130.9 / Max: 273.3
Default: Min: 96.5 / Avg: 140.7 / Max: 338.4

HeFFTe - Highly Efficient FFT for Exascale

CPU Power Consumption Monitor

HeFFTe - Highly Efficient FFT for Exascale 2.4 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 1.7 / Avg: 59.5 / Max: 82.0
Default: Min: 57.0 / Avg: 71.9 / Max: 82.1

HeFFTe - Highly Efficient FFT for Exascale

System Power Consumption Monitor

HeFFTe - Highly Efficient FFT for Exascale 2.4 (Watts, Fewer Is Better)
Default: Min: 95.4 / Avg: 105.9 / Max: 168.3
HPC Tuning Recommendations: Min: 95.2 / Avg: 128.0 / Max: 236.1

HeFFTe - Highly Efficient FFT for Exascale

CPU Power Consumption Monitor

HeFFTe - Highly Efficient FFT for Exascale 2.4 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 1.4 / Avg: 54.6 / Max: 72.9
Default: Min: 56.7 / Avg: 61.6 / Max: 71.7

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128

HeFFTe - Highly Efficient FFT for Exascale 2.4 (GFLOP/s, More Is Better)
HPC Tuning Recommendations: 235.08 (SE +/- 3.65, N = 15)
Default: 223.46 (SE +/- 0.74, N = 14)
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

System Power Consumption Monitor

HeFFTe - Highly Efficient FFT for Exascale 2.4 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 95.8 / Avg: 110.6 / Max: 193.0
Default: Min: 97.1 / Avg: 122.4 / Max: 223.2

HeFFTe - Highly Efficient FFT for Exascale

CPU Power Consumption Monitor

HeFFTe - Highly Efficient FFT for Exascale 2.4 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 1.2 / Avg: 57.8 / Max: 77.6
Default: Min: 0.8 / Avg: 60.9 / Max: 67.4

HeFFTe - Highly Efficient FFT for Exascale

System Power Consumption Monitor

HeFFTe - Highly Efficient FFT for Exascale 2.4 (Watts, Fewer Is Better)
Default: Min: 98.6 / Avg: 122.6 / Max: 264.0
HPC Tuning Recommendations: Min: 98.0 / Avg: 128.7 / Max: 243.3

HeFFTe - Highly Efficient FFT for Exascale

CPU Power Consumption Monitor

HeFFTe - Highly Efficient FFT for Exascale 2.4 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 1.0 / Avg: 54.8 / Max: 73.7
Default: Min: 0.6 / Avg: 58.3 / Max: 75.3

GPAW

System Power Consumption Monitor

GPAW 23.6 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 105 / Avg: 526 / Max: 605
Default: Min: 105 / Avg: 585 / Max: 660

GPAW

CPU Power Consumption Monitor

GPAW 23.6 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 2.7 / Avg: 292.4 / Max: 361.6
Default: Min: 89.8 / Avg: 350.3 / Max: 401.7

Xcompact3d Incompact3d

System Power Consumption Monitor

Xcompact3d Incompact3d 2021-03-11 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 101 / Avg: 581 / Max: 613
Default: Min: 106 / Avg: 642 / Max: 685

Xcompact3d Incompact3d

CPU Power Consumption Monitor

Xcompact3d Incompact3d 2021-03-11 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 1.0 / Avg: 333.4 / Max: 344.1
Default: Min: 0.1 / Avg: 387.2 / Max: 401.6

Xcompact3d Incompact3d

Input: X3D-benchmarking input.i3d

Xcompact3d Incompact3d 2021-03-11 (Seconds, Fewer Is Better)
HPC Tuning Recommendations: 245.68 (SE +/- 3.22, N = 9)
Default: 325.05 (SE +/- 7.62, N = 9)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Xcompact3d Incompact3d

System Power Consumption Monitor

Xcompact3d Incompact3d 2021-03-11 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 104 / Avg: 332 / Max: 604
Default: Min: 105 / Avg: 436 / Max: 674

Xcompact3d Incompact3d

CPU Power Consumption Monitor

Xcompact3d Incompact3d 2021-03-11 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 0.0 / Avg: 209.2 / Max: 345.9
Default: Min: 138.1 / Avg: 269.9 / Max: 401.8

OpenFOAM

System Power Consumption Monitor

OpenFOAM 10 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 104 / Avg: 599.41 / Max: 635.5
Default: Min: 109.8 / Avg: 642.84 / Max: 689.8

OpenFOAM

CPU Power Consumption Monitor

OpenFOAM 10 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 76.13 / Avg: 350.87 / Max: 357.15
Default: Min: 54.24 / Avg: 392.08 / Max: 401.87

OpenFOAM

System Power Consumption Monitor

OpenFOAM 10 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 97 / Avg: 589 / Max: 629
Default: Min: 101 / Avg: 638 / Max: 679

OpenFOAM

CPU Power Consumption Monitor

OpenFOAM 10 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 1.8 / Avg: 339.6 / Max: 360.1
Default: Min: 53.0 / Avg: 386.9 / Max: 401.6

OpenFOAM

System Power Consumption Monitor

OpenFOAM 10 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 104 / Avg: 481 / Max: 577
Default: Min: 106 / Avg: 555 / Max: 638

OpenFOAM

CPU Power Consumption Monitor

OpenFOAM 10 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 0.3 / Avg: 277.4 / Max: 356.0
Default: Min: 0.6 / Avg: 302.0 / Max: 401.3

OpenRadioss

System Power Consumption Monitor

OpenRadioss 2023.09.15 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 99 / Avg: 546 / Max: 616
Default: Min: 103 / Avg: 603 / Max: 683

OpenRadioss

CPU Power Consumption Monitor

OpenRadioss 2023.09.15 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 1.4 / Avg: 308.0 / Max: 354.1
Default: Min: 50.9 / Avg: 356.6 / Max: 402.2

OpenRadioss

System Power Consumption Monitor

OpenRadioss 2023.09.15 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 98.6 / Avg: 507.4 / Max: 545.1
Default: Min: 98.8 / Avg: 573.0 / Max: 621.3

OpenRadioss

CPU Power Consumption Monitor

OpenRadioss 2023.09.15 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 4.8 / Avg: 314.4 / Max: 347.8
Default: Min: 56.9 / Avg: 369.5 / Max: 401.9

OpenRadioss

System Power Consumption Monitor

OpenRadioss 2023.09.15 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 98.9 / Avg: 475.2 / Max: 518.6
Default: Min: 98.6 / Avg: 542.4 / Max: 592.6

OpenRadioss

CPU Power Consumption Monitor

OpenRadioss 2023.09.15 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 56.2 / Avg: 308.7 / Max: 341.2
Default: Min: 0.7 / Avg: 356.9 / Max: 393.1

OpenRadioss

System Power Consumption Monitor

OpenRadioss 2023.09.15 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 98.1 / Avg: 405.0 / Max: 536.5
Default: Min: 100.3 / Avg: 457.4 / Max: 611.9

OpenRadioss

CPU Power Consumption Monitor

OpenRadioss 2023.09.15 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 4.0 / Avg: 233.8 / Max: 346.0
Default: Min: 56.8 / Avg: 286.3 / Max: 399.5

OpenRadioss

System Power Consumption Monitor

OpenRadioss 2023.09.15 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 99.2 / Avg: 467.1 / Max: 534.7
Default: Min: 98.9 / Avg: 546.0 / Max: 607.6

OpenRadioss

CPU Power Consumption Monitor

OpenRadioss 2023.09.15 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 56.5 / Avg: 301.4 / Max: 345.7
Default: Min: 50.3 / Avg: 347.4 / Max: 399.7

OpenRadioss

System Power Consumption Monitor

OpenRadioss 2023.09.15 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 97.9 / Avg: 494.3 / Max: 544.0
Default: Min: 107.5 / Avg: 572.9 / Max: 612.9

OpenRadioss

CPU Power Consumption Monitor

OpenRadioss 2023.09.15 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 5.2 / Avg: 308.6 / Max: 349.2
Default: Min: 2.1 / Avg: 363.4 / Max: 399.6

SPECFEM3D

System Power Consumption Monitor

SPECFEM3D 4.1.1 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 97.1 / Avg: 345.2 / Max: 505.6
Default: Min: 97.8 / Avg: 409.9 / Max: 612.7

SPECFEM3D

CPU Power Consumption Monitor

SPECFEM3D 4.1.1 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 3.6 / Avg: 200.8 / Max: 332.3
Default: Min: 100.1 / Avg: 257.1 / Max: 401.9

SPECFEM3D

System Power Consumption Monitor

SPECFEM3D 4.1.1 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 96.9 / Avg: 356.2 / Max: 508.1
Default: Min: 98.2 / Avg: 419.3 / Max: 614.1

SPECFEM3D

CPU Power Consumption Monitor

SPECFEM3D 4.1.1 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 0.0 / Avg: 210.7 / Max: 333.2
Default: Min: 1.6 / Avg: 245.0 / Max: 401.9

easyWave

System Power Consumption Monitor

easyWave r34 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 95.4 / Avg: 275.7 / Max: 353.2
Default: Min: 102.8 / Avg: 309.3 / Max: 342.8

easyWave

CPU Power Consumption Monitor

easyWave r34 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 3.6 / Avg: 150.4 / Max: 185.3
Default: Min: 0.5 / Avg: 167.2 / Max: 192.5

easyWave

System Power Consumption Monitor

easyWave r34 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 101.6 / Avg: 250.7 / Max: 312.6
Default: Min: 101.7 / Avg: 282.5 / Max: 387.8

easyWave

CPU Power Consumption Monitor

easyWave r34 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 2.0 / Avg: 130.9 / Max: 167.2
Default: Min: 55.9 / Avg: 166.9 / Max: 218.7

CP2K Molecular Dynamics

System Power Consumption Monitor

CP2K Molecular Dynamics 2024.3 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 99 / Avg: 552 / Max: 582
Default: Min: 102 / Avg: 626 / Max: 653

CP2K Molecular Dynamics

CPU Power Consumption Monitor

CP2K Molecular Dynamics 2024.3 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 0.7 / Avg: 328.5 / Max: 346.7
Default: Min: 2.9 / Avg: 382.1 / Max: 400.9

Graph500

System Power Consumption Monitor

Graph500 3.0 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 108 / Avg: 564 / Max: 576
Default: Min: 109 / Avg: 636 / Max: 653

Graph500

CPU Power Consumption Monitor

Graph500 3.0 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 8.0 / Avg: 337.5 / Max: 349.4
Default: Min: 137.4 / Avg: 396.7 / Max: 401.7

High Performance Conjugate Gradient

System Power Consumption Monitor

High Performance Conjugate Gradient 3.1 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 109 / Avg: 580 / Max: 607
Default: Min: 106 / Avg: 637 / Max: 679

High Performance Conjugate Gradient

CPU Power Consumption Monitor

High Performance Conjugate Gradient 3.1 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 6.1 / Avg: 337.3 / Max: 347.9
Default: Min: 84.2 / Avg: 387.0 / Max: 402.1

High Performance Conjugate Gradient

System Power Consumption Monitor

High Performance Conjugate Gradient 3.1 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 100 / Avg: 578 / Max: 611
Default: Min: 101 / Avg: 639 / Max: 679

High Performance Conjugate Gradient

CPU Power Consumption Monitor

High Performance Conjugate Gradient 3.1 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 25.4 / Avg: 336.6 / Max: 347.9
Default: Min: 83.7 / Avg: 387.3 / Max: 402.2

High Performance Conjugate Gradient

X Y Z: 144 144 144 - RT: 60

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
HPC Tuning Recommendations: 51.02 (SE +/- 1.04, N = 12)
Default: 41.09 (SE +/- 0.58, N = 9)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

GROMACS

System Power Consumption Monitor

GROMACS 2024 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 98 / Avg: 443 / Max: 562
Default: Min: 96 / Avg: 505 / Max: 637

GROMACS

CPU Power Consumption Monitor

GROMACS 2024 (Watts, Fewer Is Better)
HPC Tuning Recommendations: Min: 36.9 / Avg: 248.8 / Max: 346.5
Default: Min: 36.9 / Avg: 298.3 / Max: 401.9


Phoronix Test Suite v10.8.5