EPYC 9684X 1P Tests for a future article. AMD EPYC 9684X 96-Core testing with a AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2307209-NE-EPYC9684X51&sor&gru .
EPYC 9684X 1P Processor Motherboard Chipset Memory Disk Graphics Network OS Kernel Desktop Display Server Vulkan Compiler File-System Screen Resolution EPYC 9684X AMD 9684X AMD EPYC 9684X 96-Core @ 2.55GHz (96 Cores / 192 Threads) AMD Titanite_4G (RTI1007B BIOS) AMD Device 14a4 768GB 2 x 1920GB SAMSUNG MZWLJ1T9HBJR-00007 ASPEED Broadcom NetXtreme BCM5720 PCIe Ubuntu 22.04 5.19.0-41-generic (x86_64) GNOME Shell 42.5 X Server 1.21.1.4 1.3.224 GCC 11.3.0 ext4 1024x768 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa101121 Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
EPYC 9684X 1P stress-ng: CPU Cache stress-ng: CPU Stress stress-ng: Matrix Math stress-ng: Vector Math stress-ng: Vector Shuffle stress-ng: Wide Vector Math stress-ng: Vector Floating Point minife: Small embree: Pathtracer - Crown embree: Pathtracer ISPC - Crown embree: Pathtracer - Asian Dragon embree: Pathtracer - Asian Dragon Obj embree: Pathtracer ISPC - Asian Dragon embree: Pathtracer ISPC - Asian Dragon Obj hpcg: 104 104 104 - 60 hpcg: 144 144 144 - 60 hpcg: 160 160 160 - 60 hpcg: 192 192 192 - 60 heffte: c2c - FFTW - float - 128 heffte: c2c - FFTW - float - 256 heffte: c2c - FFTW - float - 512 heffte: r2c - FFTW - float - 128 heffte: r2c - FFTW - float - 256 heffte: r2c - FFTW - float - 512 heffte: c2c - FFTW - double - 128 heffte: c2c - FFTW - double - 256 heffte: c2c - FFTW - double - 512 heffte: c2c - Stock - float - 128 heffte: c2c - Stock - float - 256 heffte: c2c - Stock - float - 512 heffte: r2c - FFTW - double - 128 heffte: r2c - FFTW - double - 256 heffte: r2c - FFTW - double - 512 heffte: r2c - Stock - float - 128 heffte: r2c - Stock - float - 256 heffte: r2c - Stock - float - 512 heffte: c2c - Stock - double - 128 heffte: c2c - Stock - double - 256 heffte: c2c - Stock - double - 512 heffte: r2c - Stock - double - 128 heffte: r2c - Stock - double - 256 heffte: r2c - Stock - double - 512 heffte: c2c - FFTW - float-long - 128 heffte: c2c - FFTW - float-long - 256 heffte: c2c - FFTW - float-long - 512 heffte: r2c - FFTW - float-long - 128 heffte: r2c - FFTW - float-long - 256 heffte: r2c - FFTW - float-long - 512 heffte: c2c - FFTW - double-long - 128 heffte: c2c - FFTW - double-long - 256 heffte: c2c - FFTW - double-long - 512 heffte: c2c - Stock - float-long - 128 heffte: c2c - Stock - float-long - 256 heffte: c2c - Stock - float-long - 512 heffte: r2c - FFTW - double-long - 128 heffte: r2c - FFTW - double-long - 256 heffte: r2c - FFTW - double-long - 512 heffte: r2c - Stock - float-long - 128 heffte: r2c - Stock - float-long - 256 heffte: r2c - Stock - float-long - 512 heffte: c2c - Stock - double-long - 128 heffte: c2c - Stock - double-long - 256 heffte: c2c - Stock - double-long - 512 heffte: r2c - Stock - double-long - 128 heffte: r2c - Stock - double-long - 256 heffte: r2c - Stock - double-long - 512 libxsmm: 128 libxsmm: 256 libxsmm: 32 libxsmm: 64 xmrig: Monero - 1M xmrig: Wownero - 1M askap: Hogbom Clean OpenMP askap: tConvolve MT - Gridding askap: tConvolve MT - Degridding askap: tConvolve OpenMP - Gridding askap: tConvolve OpenMP - Degridding askap: tConvolve MPI - Degridding askap: tConvolve MPI - Gridding astcenc: Fast astcenc: Medium astcenc: Thorough astcenc: Exhaustive gromacs: MPI CPU - water_GMX50_bare npb: BT.C npb: CG.C npb: EP.C npb: EP.D npb: FT.C npb: IS.D npb: LU.C npb: MG.C npb: SP.B npb: SP.C lulesh: namd: ATPase Simulation - 327,506 Atoms incompact3d: X3D-benchmarking input.i3d incompact3d: input.i3d 129 Cells Per Direction incompact3d: input.i3d 193 Cells Per Direction openfoam: drivaerFastback, Small Mesh Size - Mesh Time openfoam: drivaerFastback, Small Mesh Size - Execution Time openfoam: drivaerFastback, Medium Mesh Size - Mesh Time openfoam: drivaerFastback, Medium Mesh Size - Execution Time blender: BMW27 - CPU-Only blender: Classroom - CPU-Only blender: Fishy Cat - CPU-Only blender: Barbershop - CPU-Only blender: Pabellon Barcelona - CPU-Only EPYC 9684X AMD 9684X 1373344.98 213036.81 418126.43 545786.37 63766.48 3480531.79 256936.31 53878.4 110.6946 117.0799 126.2454 114.1719 142.2792 122.0413 34.355 23.546 22.8174 22.7211 130.752 177.706 154.311 186.051 315.15 336.451 81.8473 82.5599 68.1627 110.016 170.734 149.982 125.241 194.017 136.458 178.478 331.609 343.723 70.1221 83.8913 68.8242 115.816 174 144.622 126.301 182.81 155.713 184.354 318.534 332.908 84.3653 89.7978 68.445 110.998 177.123 148.562 129.282 184.786 135.542 176.529 327.355 341.923 68.6498 83.5304 68.8118 115.339 175.594 145.496 3119.7 3177.4 1299.8 2479.3 69478.2 74195 1204.82 13617.8 15691 26625.6 66564 59410.3 73226.7 1019.0656 421.4357 56.8898 6.1412 11.801 305166.4 62397.22 7839.23 10620.59 118915.97 5839.25 340754.87 141081.08 170601.48 211454.6 30858.462 0.25068 377.588806 2.24493504 7.65203524 23.301871 28.957931 107.493 184.45409 16.26 40.48 20.61 142.1 49.42 1426891.13 214798.32 418120.08 545747.29 63760.46 3484985.88 254718.1 53906.3 109.6474 117.2488 126.4095 115.0435 142.2995 122.3101 23.1377 22.8476 22.7753 22.4173 128.608 176.295 155.387 188.224 312.386 340.748 78.4937 85.1501 68.2756 111.535 167.193 149.163 125.7 189.443 135.556 175.82 321.42 339.165 68.8689 82.8517 68.3745 116.285 176.447 143.312 130.477 179.544 155.664 187.191 332.764 327.73 76.337 82.3926 68.6279 114.309 178.56 149.973 124.932 184.654 136.27 176.556 328.778 336.837 70.2296 82.8587 68.6366 116.264 182.393 143.534 3091.5 3131.5 1300.7 2472.2 69671.8 74288.7 1204.82 13606.9 15691 29584 53251.2 58310.1 74970.2 1009.8949 421.9692 56.8975 6.1406 11.798 314982.58 58099.4 7952 10476.83 121389.76 5572.16 335538.3 132064.96 169864.87 208198.13 31065.289 0.24623 388.701202 1.96318495 7.65393496 23.166335 29.259894 107.82577 185.1475 16.46 40.4 20.44 141.96 49.5 OpenBenchmarking.org
Stress-NG Test: CPU Cache OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: CPU Cache AMD 9684X EPYC 9684X 300K 600K 900K 1200K 1500K 1426891.13 1373344.98 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG Test: CPU Stress OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: CPU Stress AMD 9684X EPYC 9684X 50K 100K 150K 200K 250K 214798.32 213036.81 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG Test: Matrix Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Matrix Math EPYC 9684X AMD 9684X 90K 180K 270K 360K 450K 418126.43 418120.08 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG Test: Vector Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Vector Math EPYC 9684X AMD 9684X 120K 240K 360K 480K 600K 545786.37 545747.29 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG Test: Vector Shuffle OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Vector Shuffle EPYC 9684X AMD 9684X 14K 28K 42K 56K 70K 63766.48 63760.46 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG Test: Wide Vector Math OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Wide Vector Math AMD 9684X EPYC 9684X 700K 1400K 2100K 2800K 3500K 3484985.88 3480531.79 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG Test: Vector Floating Point OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Vector Floating Point EPYC 9684X AMD 9684X 60K 120K 180K 240K 300K 256936.31 254718.10 1. (CXX) g++ options: -O2 -std=gnu99 -lc
miniFE Problem Size: Small OpenBenchmarking.org CG Mflops, More Is Better miniFE 2.2 Problem Size: Small AMD 9684X EPYC 9684X 12K 24K 36K 48K 60K 53906.3 53878.4 1. (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi
Embree Binary: Pathtracer - Model: Crown OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer - Model: Crown EPYC 9684X AMD 9684X 20 40 60 80 100 110.69 109.65 MIN: 108.41 / MAX: 113.64 MIN: 107.65 / MAX: 113.51
Embree Binary: Pathtracer ISPC - Model: Crown OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer ISPC - Model: Crown AMD 9684X EPYC 9684X 30 60 90 120 150 117.25 117.08 MIN: 114.61 / MAX: 121.35 MIN: 114.57 / MAX: 120.79
Embree Binary: Pathtracer - Model: Asian Dragon OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer - Model: Asian Dragon AMD 9684X EPYC 9684X 30 60 90 120 150 126.41 126.25 MIN: 124.93 / MAX: 128.49 MIN: 124.97 / MAX: 128.11
Embree Binary: Pathtracer - Model: Asian Dragon Obj OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer - Model: Asian Dragon Obj AMD 9684X EPYC 9684X 30 60 90 120 150 115.04 114.17 MIN: 113.62 / MAX: 117.34 MIN: 112.65 / MAX: 116.39
Embree Binary: Pathtracer ISPC - Model: Asian Dragon OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer ISPC - Model: Asian Dragon AMD 9684X EPYC 9684X 30 60 90 120 150 142.30 142.28 MIN: 140.74 / MAX: 144.53 MIN: 140.41 / MAX: 144.57
Embree Binary: Pathtracer ISPC - Model: Asian Dragon Obj OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer ISPC - Model: Asian Dragon Obj AMD 9684X EPYC 9684X 30 60 90 120 150 122.31 122.04 MIN: 120.61 / MAX: 124.64 MIN: 120.36 / MAX: 124.72
High Performance Conjugate Gradient X Y Z: 104 104 104 - RT: 60 OpenBenchmarking.org GFLOP/s, More Is Better High Performance Conjugate Gradient 3.1 X Y Z: 104 104 104 - RT: 60 EPYC 9684X AMD 9684X 8 16 24 32 40 34.36 23.14 1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
High Performance Conjugate Gradient X Y Z: 144 144 144 - RT: 60 OpenBenchmarking.org GFLOP/s, More Is Better High Performance Conjugate Gradient 3.1 X Y Z: 144 144 144 - RT: 60 EPYC 9684X AMD 9684X 6 12 18 24 30 23.55 22.85 1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
High Performance Conjugate Gradient X Y Z: 160 160 160 - RT: 60 OpenBenchmarking.org GFLOP/s, More Is Better High Performance Conjugate Gradient 3.1 X Y Z: 160 160 160 - RT: 60 EPYC 9684X AMD 9684X 5 10 15 20 25 22.82 22.78 1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
High Performance Conjugate Gradient X Y Z: 192 192 192 - RT: 60 OpenBenchmarking.org GFLOP/s, More Is Better High Performance Conjugate Gradient 3.1 X Y Z: 192 192 192 - RT: 60 EPYC 9684X AMD 9684X 5 10 15 20 25 22.72 22.42 1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 EPYC 9684X AMD 9684X 30 60 90 120 150 130.75 128.61 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 EPYC 9684X AMD 9684X 40 80 120 160 200 177.71 176.30 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 AMD 9684X EPYC 9684X 30 60 90 120 150 155.39 154.31 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128 AMD 9684X EPYC 9684X 40 80 120 160 200 188.22 186.05 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 EPYC 9684X AMD 9684X 70 140 210 280 350 315.15 312.39 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512 AMD 9684X EPYC 9684X 70 140 210 280 350 340.75 336.45 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128 EPYC 9684X AMD 9684X 20 40 60 80 100 81.85 78.49 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256 AMD 9684X EPYC 9684X 20 40 60 80 100 85.15 82.56 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512 AMD 9684X EPYC 9684X 15 30 45 60 75 68.28 68.16 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float - X Y Z: 128 AMD 9684X EPYC 9684X 20 40 60 80 100 111.54 110.02 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float - X Y Z: 256 EPYC 9684X AMD 9684X 40 80 120 160 200 170.73 167.19 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float - X Y Z: 512 EPYC 9684X AMD 9684X 30 60 90 120 150 149.98 149.16 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 AMD 9684X EPYC 9684X 30 60 90 120 150 125.70 125.24 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 EPYC 9684X AMD 9684X 40 80 120 160 200 194.02 189.44 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 EPYC 9684X AMD 9684X 30 60 90 120 150 136.46 135.56 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float - X Y Z: 128 EPYC 9684X AMD 9684X 40 80 120 160 200 178.48 175.82 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float - X Y Z: 256 EPYC 9684X AMD 9684X 70 140 210 280 350 331.61 321.42 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float - X Y Z: 512 EPYC 9684X AMD 9684X 70 140 210 280 350 343.72 339.17 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double - X Y Z: 128 EPYC 9684X AMD 9684X 16 32 48 64 80 70.12 68.87 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double - X Y Z: 256 EPYC 9684X AMD 9684X 20 40 60 80 100 83.89 82.85 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double - X Y Z: 512 EPYC 9684X AMD 9684X 15 30 45 60 75 68.82 68.37 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double - X Y Z: 128 AMD 9684X EPYC 9684X 30 60 90 120 150 116.29 115.82 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double - X Y Z: 256 AMD 9684X EPYC 9684X 40 80 120 160 200 176.45 174.00 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double - X Y Z: 512 EPYC 9684X AMD 9684X 30 60 90 120 150 144.62 143.31 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 128 AMD 9684X EPYC 9684X 30 60 90 120 150 130.48 126.30 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 256 EPYC 9684X AMD 9684X 40 80 120 160 200 182.81 179.54 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 512 EPYC 9684X AMD 9684X 30 60 90 120 150 155.71 155.66 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 128 AMD 9684X EPYC 9684X 40 80 120 160 200 187.19 184.35 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 256 AMD 9684X EPYC 9684X 70 140 210 280 350 332.76 318.53 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 512 EPYC 9684X AMD 9684X 70 140 210 280 350 332.91 327.73 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128 EPYC 9684X AMD 9684X 20 40 60 80 100 84.37 76.34 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256 EPYC 9684X AMD 9684X 20 40 60 80 100 89.80 82.39 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 512 AMD 9684X EPYC 9684X 15 30 45 60 75 68.63 68.45 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 128 AMD 9684X EPYC 9684X 30 60 90 120 150 114.31 111.00 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256 AMD 9684X EPYC 9684X 40 80 120 160 200 178.56 177.12 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 512 AMD 9684X EPYC 9684X 30 60 90 120 150 149.97 148.56 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128 EPYC 9684X AMD 9684X 30 60 90 120 150 129.28 124.93 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256 EPYC 9684X AMD 9684X 40 80 120 160 200 184.79 184.65 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512 AMD 9684X EPYC 9684X 30 60 90 120 150 136.27 135.54 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 128 AMD 9684X EPYC 9684X 40 80 120 160 200 176.56 176.53 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 256 AMD 9684X EPYC 9684X 70 140 210 280 350 328.78 327.36 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 512 EPYC 9684X AMD 9684X 70 140 210 280 350 341.92 336.84 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128 AMD 9684X EPYC 9684X 16 32 48 64 80 70.23 68.65 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256 EPYC 9684X AMD 9684X 20 40 60 80 100 83.53 82.86 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 512 EPYC 9684X AMD 9684X 15 30 45 60 75 68.81 68.64 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128 AMD 9684X EPYC 9684X 30 60 90 120 150 116.26 115.34 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256 AMD 9684X EPYC 9684X 40 80 120 160 200 182.39 175.59 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 512 EPYC 9684X AMD 9684X 30 60 90 120 150 145.50 143.53 1. (CXX) g++ options: -O3
libxsmm M N K: 128 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 128 EPYC 9684X AMD 9684X 700 1400 2100 2800 3500 3119.7 3091.5 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
libxsmm M N K: 256 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 256 EPYC 9684X AMD 9684X 700 1400 2100 2800 3500 3177.4 3131.5 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
libxsmm M N K: 32 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 32 AMD 9684X EPYC 9684X 300 600 900 1200 1500 1300.7 1299.8 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
libxsmm M N K: 64 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 64 EPYC 9684X AMD 9684X 500 1000 1500 2000 2500 2479.3 2472.2 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
Xmrig Variant: Monero - Hash Count: 1M OpenBenchmarking.org H/s, More Is Better Xmrig 6.18.1 Variant: Monero - Hash Count: 1M AMD 9684X EPYC 9684X 15K 30K 45K 60K 75K 69671.8 69478.2 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
Xmrig Variant: Wownero - Hash Count: 1M OpenBenchmarking.org H/s, More Is Better Xmrig 6.18.1 Variant: Wownero - Hash Count: 1M AMD 9684X EPYC 9684X 16K 32K 48K 64K 80K 74288.7 74195.0 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
ASKAP Test: Hogbom Clean OpenMP OpenBenchmarking.org Iterations Per Second, More Is Better ASKAP 1.0 Test: Hogbom Clean OpenMP AMD 9684X EPYC 9684X 300 600 900 1200 1500 1204.82 1204.82 1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP Test: tConvolve MT - Gridding OpenBenchmarking.org Million Grid Points Per Second, More Is Better ASKAP 1.0 Test: tConvolve MT - Gridding EPYC 9684X AMD 9684X 3K 6K 9K 12K 15K 13617.8 13606.9 1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP Test: tConvolve MT - Degridding OpenBenchmarking.org Million Grid Points Per Second, More Is Better ASKAP 1.0 Test: tConvolve MT - Degridding AMD 9684X EPYC 9684X 3K 6K 9K 12K 15K 15691 15691 1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP Test: tConvolve OpenMP - Gridding OpenBenchmarking.org Million Grid Points Per Second, More Is Better ASKAP 1.0 Test: tConvolve OpenMP - Gridding AMD 9684X EPYC 9684X 6K 12K 18K 24K 30K 29584.0 26625.6 1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP Test: tConvolve OpenMP - Degridding OpenBenchmarking.org Million Grid Points Per Second, More Is Better ASKAP 1.0 Test: tConvolve OpenMP - Degridding EPYC 9684X AMD 9684X 14K 28K 42K 56K 70K 66564.0 53251.2 1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP Test: tConvolve MPI - Degridding OpenBenchmarking.org Mpix/sec, More Is Better ASKAP 1.0 Test: tConvolve MPI - Degridding EPYC 9684X AMD 9684X 13K 26K 39K 52K 65K 59410.3 58310.1 1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP Test: tConvolve MPI - Gridding OpenBenchmarking.org Mpix/sec, More Is Better ASKAP 1.0 Test: tConvolve MPI - Gridding AMD 9684X EPYC 9684X 16K 32K 48K 64K 80K 74970.2 73226.7 1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASTC Encoder Preset: Fast OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 4.0 Preset: Fast EPYC 9684X AMD 9684X 200 400 600 800 1000 1019.07 1009.89 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder Preset: Medium OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 4.0 Preset: Medium AMD 9684X EPYC 9684X 90 180 270 360 450 421.97 421.44 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder Preset: Thorough OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 4.0 Preset: Thorough AMD 9684X EPYC 9684X 13 26 39 52 65 56.90 56.89 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder Preset: Exhaustive OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 4.0 Preset: Exhaustive EPYC 9684X AMD 9684X 2 4 6 8 10 6.1412 6.1406 1. (CXX) g++ options: -O3 -flto -pthread
GROMACS Implementation: MPI CPU - Input: water_GMX50_bare OpenBenchmarking.org Ns Per Day, More Is Better GROMACS 2023 Implementation: MPI CPU - Input: water_GMX50_bare EPYC 9684X AMD 9684X 3 6 9 12 15 11.80 11.80 1. (CXX) g++ options: -O3
NAS Parallel Benchmarks Test / Class: BT.C OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: BT.C AMD 9684X EPYC 9684X 70K 140K 210K 280K 350K 314982.58 305166.40 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2
NAS Parallel Benchmarks Test / Class: CG.C OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: CG.C EPYC 9684X AMD 9684X 13K 26K 39K 52K 65K 62397.22 58099.40 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2
NAS Parallel Benchmarks Test / Class: EP.C OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: EP.C AMD 9684X EPYC 9684X 2K 4K 6K 8K 10K 7952.00 7839.23 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2
NAS Parallel Benchmarks Test / Class: EP.D OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: EP.D EPYC 9684X AMD 9684X 2K 4K 6K 8K 10K 10620.59 10476.83 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2
NAS Parallel Benchmarks Test / Class: FT.C OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: FT.C AMD 9684X EPYC 9684X 30K 60K 90K 120K 150K 121389.76 118915.97 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2
NAS Parallel Benchmarks Test / Class: IS.D OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: IS.D EPYC 9684X AMD 9684X 1300 2600 3900 5200 6500 5839.25 5572.16 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2
NAS Parallel Benchmarks Test / Class: LU.C OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: LU.C EPYC 9684X AMD 9684X 70K 140K 210K 280K 350K 340754.87 335538.30 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2
NAS Parallel Benchmarks Test / Class: MG.C OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: MG.C EPYC 9684X AMD 9684X 30K 60K 90K 120K 150K 141081.08 132064.96 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2
NAS Parallel Benchmarks Test / Class: SP.B OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: SP.B EPYC 9684X AMD 9684X 40K 80K 120K 160K 200K 170601.48 169864.87 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2
NAS Parallel Benchmarks Test / Class: SP.C OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: SP.C EPYC 9684X AMD 9684X 50K 100K 150K 200K 250K 211454.60 208198.13 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2
LULESH OpenBenchmarking.org z/s, More Is Better LULESH 2.0.3 AMD 9684X EPYC 9684X 7K 14K 21K 28K 35K 31065.29 30858.46 1. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi
NAMD ATPase Simulation - 327,506 Atoms OpenBenchmarking.org days/ns, Fewer Is Better NAMD 2.14 ATPase Simulation - 327,506 Atoms AMD 9684X EPYC 9684X 0.0564 0.1128 0.1692 0.2256 0.282 0.24623 0.25068
Xcompact3d Incompact3d Input: X3D-benchmarking input.i3d OpenBenchmarking.org Seconds, Fewer Is Better Xcompact3d Incompact3d 2021-03-11 Input: X3D-benchmarking input.i3d EPYC 9684X AMD 9684X 80 160 240 320 400 377.59 388.70 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Xcompact3d Incompact3d Input: input.i3d 129 Cells Per Direction OpenBenchmarking.org Seconds, Fewer Is Better Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 129 Cells Per Direction AMD 9684X EPYC 9684X 0.5051 1.0102 1.5153 2.0204 2.5255 1.96318495 2.24493504 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Xcompact3d Incompact3d Input: input.i3d 193 Cells Per Direction OpenBenchmarking.org Seconds, Fewer Is Better Xcompact3d Incompact3d 2021-03-11 Input: input.i3d 193 Cells Per Direction EPYC 9684X AMD 9684X 2 4 6 8 10 7.65203524 7.65393496 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
OpenFOAM Input: drivaerFastback, Small Mesh Size - Mesh Time OpenBenchmarking.org Seconds, Fewer Is Better OpenFOAM 10 Input: drivaerFastback, Small Mesh Size - Mesh Time AMD 9684X EPYC 9684X 6 12 18 24 30 23.17 23.30 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
OpenFOAM Input: drivaerFastback, Small Mesh Size - Execution Time OpenBenchmarking.org Seconds, Fewer Is Better OpenFOAM 10 Input: drivaerFastback, Small Mesh Size - Execution Time EPYC 9684X AMD 9684X 7 14 21 28 35 28.96 29.26 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
OpenFOAM Input: drivaerFastback, Medium Mesh Size - Mesh Time OpenBenchmarking.org Seconds, Fewer Is Better OpenFOAM 10 Input: drivaerFastback, Medium Mesh Size - Mesh Time EPYC 9684X AMD 9684X 20 40 60 80 100 107.49 107.83 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
OpenFOAM Input: drivaerFastback, Medium Mesh Size - Execution Time OpenBenchmarking.org Seconds, Fewer Is Better OpenFOAM 10 Input: drivaerFastback, Medium Mesh Size - Execution Time EPYC 9684X AMD 9684X 40 80 120 160 200 184.45 185.15 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
Blender Blend File: BMW27 - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: BMW27 - Compute: CPU-Only EPYC 9684X AMD 9684X 4 8 12 16 20 16.26 16.46
Blender Blend File: Classroom - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: Classroom - Compute: CPU-Only AMD 9684X EPYC 9684X 9 18 27 36 45 40.40 40.48
Blender Blend File: Fishy Cat - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: Fishy Cat - Compute: CPU-Only AMD 9684X EPYC 9684X 5 10 15 20 25 20.44 20.61
Blender Blend File: Barbershop - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: Barbershop - Compute: CPU-Only AMD 9684X EPYC 9684X 30 60 90 120 150 141.96 142.10
Blender Blend File: Pabellon Barcelona - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: Pabellon Barcelona - Compute: CPU-Only EPYC 9684X AMD 9684X 11 22 33 44 55 49.42 49.50
Phoronix Test Suite v10.8.5