EPYC 9684X 1P
Tests for a future article. AMD EPYC 9684X 96-Core testing with an AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2307209-NE-EPYC9684X51&rdt&grs .
Result identifiers: EPYC 9684X, AMD 9684X

Processor: AMD EPYC 9684X 96-Core @ 2.55GHz (96 Cores / 192 Threads)
Motherboard: AMD Titanite_4G (RTI1007B BIOS)
Chipset: AMD Device 14a4
Memory: 768GB
Disk: 2 x 1920GB SAMSUNG MZWLJ1T9HBJR-00007
Graphics: ASPEED
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 22.04
Kernel: 5.19.0-41-generic (x86_64)
Desktop: GNOME Shell 42.5
Display Server: X Server 1.21.1.4
Vulkan: 1.3.224
Compiler: GCC 11.3.0
File-System: ext4
Screen Resolution: 1024x768

Kernel Details
  Transparent Huge Pages: madvise
Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
  CPU Microcode: 0xa101121
Security Details
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  mmio_stale_data: Not affected
  retbleed: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
  srbds: Not affected
  tsx_async_abort: Not affected
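To compare another machine against the kernel, processor, and security details listed above, the same kinds of values can be read from Linux sysfs. The following is a minimal Python sketch, not taken from the Phoronix Test Suite. The sysfs paths are standard kernel interfaces, while the helper names `active_option`, `read_text`, and `collect_details` are illustrative assumptions.

```python
from pathlib import Path
from typing import Dict, Optional


def active_option(sysfs_text: str) -> str:
    """Return the bracketed choice from text such as 'always [madvise] never'.

    If no token is bracketed, the stripped text is returned unchanged.
    """
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return sysfs_text.strip()


def read_text(path: str) -> Optional[str]:
    """Read a small sysfs file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def collect_details() -> Dict[str, str]:
    """Gather THP mode, cpu0 scaling governor, and CPU vulnerability status."""
    details: Dict[str, str] = {}

    thp = read_text("/sys/kernel/mm/transparent_hugepage/enabled")
    if thp is not None:
        details["Transparent Huge Pages"] = active_option(thp)

    governor = read_text("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")
    if governor is not None:
        details["Scaling Governor"] = governor

    vuln_dir = Path("/sys/devices/system/cpu/vulnerabilities")
    if vuln_dir.is_dir():
        for entry in sorted(vuln_dir.iterdir()):
            status = read_text(str(entry))
            if status is not None:
                details[entry.name] = status
    return details


if __name__ == "__main__":
    for key, value in collect_details().items():
        print(f"{key}: {value}")
```

On systems without these sysfs files (non-Linux hosts, some containers) the sketch simply returns fewer entries instead of failing.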
Test | EPYC 9684X | AMD 9684X
hpcg: 104 104 104 - 60 | 34.355 | 23.1377
askap: tConvolve OpenMP - Degridding | 66564 | 53251.2
incompact3d: input.i3d 129 Cells Per Direction | 2.24493504 | 1.96318495
askap: tConvolve OpenMP - Gridding | 26625.6 | 29584
heffte: c2c - FFTW - double-long - 128 | 84.3653 | 76.337
heffte: c2c - FFTW - double-long - 256 | 89.7978 | 82.3926
npb: CG.C | 62397.22 | 58099.4
npb: MG.C | 141081.08 | 132064.96
npb: IS.D | 5839.25 | 5572.16
heffte: r2c - FFTW - float-long - 256 | 318.534 | 332.764
heffte: c2c - FFTW - double - 128 | 81.8473 | 78.4937
stress-ng: CPU Cache | 1373344.98 | 1426891.13
heffte: r2c - Stock - double-long - 256 | 175.594 | 182.393
heffte: r2c - FFTW - double-long - 128 | 129.282 | 124.932
heffte: c2c - FFTW - float-long - 128 | 126.301 | 130.477
npb: BT.C | 305166.4 | 314982.58
heffte: r2c - Stock - float - 256 | 331.609 | 321.42
heffte: c2c - FFTW - double - 256 | 82.5599 | 85.1501
hpcg: 144 144 144 - 60 | 23.546 | 22.8476
heffte: c2c - Stock - float-long - 128 | 110.998 | 114.309
incompact3d: X3D-benchmarking input.i3d | 377.588806 | 388.701202
heffte: r2c - FFTW - double - 256 | 194.017 | 189.443
askap: tConvolve MPI - Gridding | 73226.7 | 74970.2
heffte: c2c - Stock - double-long - 128 | 68.6498 | 70.2296
heffte: c2c - Stock - float - 256 | 170.734 | 167.193
npb: FT.C | 118915.97 | 121389.76
askap: tConvolve MPI - Degridding | 59410.3 | 58310.1
heffte: c2c - Stock - double - 128 | 70.1221 | 68.8689
heffte: c2c - FFTW - float-long - 256 | 182.81 | 179.544
namd: ATPase Simulation - 327,506 Atoms | 0.25068 | 0.24623
heffte: c2c - FFTW - float - 128 | 130.752 | 128.608
heffte: r2c - FFTW - float-long - 512 | 332.908 | 327.73
npb: SP.C | 211454.6 | 208198.13
npb: LU.C | 340754.87 | 335538.3
heffte: r2c - FFTW - float-long - 128 | 184.354 | 187.191
heffte: r2c - Stock - float - 128 | 178.478 | 175.82
heffte: r2c - Stock - float-long - 512 | 341.923 | 336.837
libxsmm: 256 | 3177.4 | 3131.5
npb: EP.C | 7839.23 | 7952
heffte: r2c - Stock - double - 256 | 174 | 176.447
heffte: c2c - Stock - float - 128 | 110.016 | 111.535
npb: EP.D | 10620.59 | 10476.83
heffte: r2c - Stock - double-long - 512 | 145.496 | 143.534
hpcg: 192 192 192 - 60 | 22.7211 | 22.4173
heffte: r2c - Stock - float - 512 | 343.723 | 339.165
heffte: r2c - FFTW - float - 512 | 336.451 | 340.748
heffte: c2c - Stock - double - 256 | 83.8913 | 82.8517
blender: BMW27 - CPU-Only | 16.26 | 16.46
heffte: r2c - FFTW - float - 128 | 186.051 | 188.224
openfoam: drivaerFastback, Small Mesh Size - Execution Time | 28.957931 | 29.259894
embree: Pathtracer - Crown | 110.6946 | 109.6474
heffte: c2c - Stock - float-long - 512 | 148.562 | 149.973
heffte: r2c - Stock - double - 512 | 144.622 | 143.312
libxsmm: 128 | 3119.7 | 3091.5
astcenc: Fast | 1019.0656 | 1009.8949
heffte: r2c - FFTW - float - 256 | 315.15 | 312.386
stress-ng: Vector Floating Point | 256936.31 | 254718.1
blender: Fishy Cat - CPU-Only | 20.61 | 20.44
stress-ng: CPU Stress | 213036.81 | 214798.32
heffte: c2c - Stock - float-long - 256 | 177.123 | 178.56
heffte: c2c - Stock - double-long - 256 | 83.5304 | 82.8587
heffte: r2c - Stock - double-long - 128 | 115.339 | 116.264
heffte: c2c - FFTW - float - 256 | 177.706 | 176.295
embree: Pathtracer - Asian Dragon Obj | 114.1719 | 115.0435
heffte: c2c - FFTW - float - 512 | 154.311 | 155.387
lulesh: | 30858.462 | 31065.289
heffte: r2c - FFTW - double - 512 | 136.458 | 135.556
heffte: c2c - Stock - double - 512 | 68.8242 | 68.3745
openfoam: drivaerFastback, Small Mesh Size - Mesh Time | 23.301871 | 23.166335
heffte: c2c - Stock - float - 512 | 149.982 | 149.163
heffte: r2c - FFTW - double-long - 512 | 135.542 | 136.27
heffte: r2c - Stock - float-long - 256 | 327.355 | 328.778
npb: SP.B | 170601.48 | 169864.87
heffte: r2c - Stock - double - 128 | 115.816 | 116.285
openfoam: drivaerFastback, Medium Mesh Size - Execution Time | 184.45409 | 185.1475
heffte: r2c - FFTW - double - 128 | 125.241 | 125.7
openfoam: drivaerFastback, Medium Mesh Size - Mesh Time | 107.493 | 107.82577
libxsmm: 64 | 2479.3 | 2472.2
xmrig: Monero - 1M | 69478.2 | 69671.8
heffte: c2c - FFTW - double-long - 512 | 68.445 | 68.6279
heffte: c2c - Stock - double-long - 512 | 68.8118 | 68.6366
embree: Pathtracer ISPC - Asian Dragon Obj | 122.0413 | 122.3101
blender: Classroom - CPU-Only | 40.48 | 40.4
hpcg: 160 160 160 - 60 | 22.8174 | 22.7753
heffte: c2c - FFTW - double - 512 | 68.1627 | 68.2756
blender: Pabellon Barcelona - CPU-Only | 49.42 | 49.5
embree: Pathtracer ISPC - Crown | 117.0799 | 117.2488
embree: Pathtracer - Asian Dragon | 126.2454 | 126.4095
stress-ng: Wide Vector Math | 3480531.79 | 3484985.88
astcenc: Medium | 421.4357 | 421.9692
xmrig: Wownero - 1M | 74195 | 74288.7
blender: Barbershop - CPU-Only | 142.1 | 141.96
askap: tConvolve MT - Gridding | 13617.8 | 13606.9
heffte: r2c - FFTW - double-long - 256 | 184.786 | 184.654
libxsmm: 32 | 1299.8 | 1300.7
minife: Small | 53878.4 | 53906.3
heffte: c2c - FFTW - float-long - 512 | 155.713 | 155.664
gromacs: MPI CPU - water_GMX50_bare | 11.801 | 11.798
incompact3d: input.i3d 193 Cells Per Direction | 7.65203524 | 7.65393496
heffte: r2c - Stock - float-long - 128 | 176.529 | 176.556
embree: Pathtracer ISPC - Asian Dragon | 142.2792 | 142.2995
astcenc: Thorough | 56.8898 | 56.8975
astcenc: Exhaustive | 6.1412 | 6.1406
stress-ng: Vector Shuffle | 63766.48 | 63760.46
stress-ng: Vector Math | 545786.37 | 545747.29
stress-ng: Matrix Math | 418126.43 | 418120.08
askap: Hogbom Clean OpenMP | 1204.82 | 1204.82
askap: tConvolve MT - Degridding | 15691 | 15691
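The overview mixes metrics where more is better (for example GFLOP/s) with metrics where fewer is better (for example seconds), so a raw ratio between the two runs is easy to misread. The following is a minimal Python sketch of a direction-aware comparison. It assumes the direction of each metric is taken from that test's result header. The function name `relative_gain` and the three hand-copied sample rows are illustrative, not part of the export.

```python
from typing import List, Tuple


def relative_gain(first: float, second: float, more_is_better: bool) -> float:
    """Percent by which the second run is better (+) or worse (-) than the first.

    For "more is better" metrics the ratio is second/first; for "fewer is
    better" metrics it is inverted so that a positive result always means
    the second run did better.
    """
    if first <= 0 or second <= 0:
        raise ValueError("results must be positive")
    ratio = second / first if more_is_better else first / second
    return (ratio - 1.0) * 100.0


# (test name, EPYC 9684X result, AMD 9684X result, more_is_better)
SAMPLES: List[Tuple[str, float, float, bool]] = [
    ("hpcg: 104 104 104 - 60", 34.355, 23.1377, True),
    ("incompact3d: input.i3d 129 Cells Per Direction", 2.24493504, 1.96318495, False),
    ("askap: tConvolve OpenMP - Gridding", 26625.6, 29584.0, True),
]

if __name__ == "__main__":
    for name, epyc_run, amd_run, more_is_better in SAMPLES:
        gain = relative_gain(epyc_run, amd_run, more_is_better)
        print(f"{name}: AMD 9684X vs EPYC 9684X {gain:+.1f}%")
```

Inverting the ratio for time-based metrics keeps the sign convention uniform, which makes large run-to-run gaps such as the first HPCG row stand out regardless of unit.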
High Performance Conjugate Gradient 3.1
X Y Z: 104 104 104 - RT: 60
GFLOP/s, More Is Better
EPYC 9684X: 34.36
AMD 9684X: 23.14
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

ASKAP 1.0
Test: tConvolve OpenMP - Degridding
Million Grid Points Per Second, More Is Better
EPYC 9684X: 66564.0
AMD 9684X: 53251.2
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Xcompact3d Incompact3d 2021-03-11
Input: input.i3d 129 Cells Per Direction
Seconds, Fewer Is Better
EPYC 9684X: 2.24493504
AMD 9684X: 1.96318495
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

ASKAP 1.0
Test: tConvolve OpenMP - Gridding
Million Grid Points Per Second, More Is Better
EPYC 9684X: 26625.6
AMD 9684X: 29584.0
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128
GFLOP/s, More Is Better
EPYC 9684X: 84.37
AMD 9684X: 76.34
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256
GFLOP/s, More Is Better
EPYC 9684X: 89.80
AMD 9684X: 82.39
1. (CXX) g++ options: -O3

NAS Parallel Benchmarks 3.4
Test / Class: CG.C
Total Mop/s, More Is Better
EPYC 9684X: 62397.22
AMD 9684X: 58099.40
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

NAS Parallel Benchmarks 3.4
Test / Class: MG.C
Total Mop/s, More Is Better
EPYC 9684X: 141081.08
AMD 9684X: 132064.96
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

NAS Parallel Benchmarks 3.4
Test / Class: IS.D
Total Mop/s, More Is Better
EPYC 9684X: 5839.25
AMD 9684X: 5572.16
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 256
GFLOP/s, More Is Better
EPYC 9684X: 318.53
AMD 9684X: 332.76
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128
GFLOP/s, More Is Better
EPYC 9684X: 81.85
AMD 9684X: 78.49
1. (CXX) g++ options: -O3

Stress-NG 0.15.10
Test: CPU Cache
Bogo Ops/s, More Is Better
EPYC 9684X: 1373344.98
AMD 9684X: 1426891.13
1. (CXX) g++ options: -O2 -std=gnu99 -lc

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256
GFLOP/s, More Is Better
EPYC 9684X: 175.59
AMD 9684X: 182.39
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128
GFLOP/s, More Is Better
EPYC 9684X: 129.28
AMD 9684X: 124.93
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 128
GFLOP/s, More Is Better
EPYC 9684X: 126.30
AMD 9684X: 130.48
1. (CXX) g++ options: -O3

NAS Parallel Benchmarks 3.4
Test / Class: BT.C
Total Mop/s, More Is Better
EPYC 9684X: 305166.40
AMD 9684X: 314982.58
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: Stock - Precision: float - X Y Z: 256
GFLOP/s, More Is Better
EPYC 9684X: 331.61
AMD 9684X: 321.42
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256
GFLOP/s, More Is Better
EPYC 9684X: 82.56
AMD 9684X: 85.15
1. (CXX) g++ options: -O3

High Performance Conjugate Gradient 3.1
X Y Z: 144 144 144 - RT: 60
GFLOP/s, More Is Better
EPYC 9684X: 23.55
AMD 9684X: 22.85
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 128
GFLOP/s, More Is Better
EPYC 9684X: 111.00
AMD 9684X: 114.31
1. (CXX) g++ options: -O3

Xcompact3d Incompact3d 2021-03-11
Input: X3D-benchmarking input.i3d
Seconds, Fewer Is Better
EPYC 9684X: 377.59
AMD 9684X: 388.70
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256
GFLOP/s, More Is Better
EPYC 9684X: 194.02
AMD 9684X: 189.44
1. (CXX) g++ options: -O3

ASKAP 1.0
Test: tConvolve MPI - Gridding
Mpix/sec, More Is Better
EPYC 9684X: 73226.7
AMD 9684X: 74970.2
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128
GFLOP/s, More Is Better
EPYC 9684X: 68.65
AMD 9684X: 70.23
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: Stock - Precision: float - X Y Z: 256
GFLOP/s, More Is Better
EPYC 9684X: 170.73
AMD 9684X: 167.19
1. (CXX) g++ options: -O3

NAS Parallel Benchmarks 3.4
Test / Class: FT.C
Total Mop/s, More Is Better
EPYC 9684X: 118915.97
AMD 9684X: 121389.76
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

ASKAP 1.0
Test: tConvolve MPI - Degridding
Mpix/sec, More Is Better
EPYC 9684X: 59410.3
AMD 9684X: 58310.1
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: Stock - Precision: double - X Y Z: 128
GFLOP/s, More Is Better
EPYC 9684X: 70.12
AMD 9684X: 68.87
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 256
GFLOP/s, More Is Better
EPYC 9684X: 182.81
AMD 9684X: 179.54
1. (CXX) g++ options: -O3

NAMD 2.14
ATPase Simulation - 327,506 Atoms
days/ns, Fewer Is Better
EPYC 9684X: 0.25068
AMD 9684X: 0.24623
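NAMD reports days/ns (fewer is better), while molecular dynamics throughput is often quoted the other way round as ns/day. The conversion is a plain reciprocal. A minimal Python sketch follows, with `ns_per_day` as an illustrative helper name and the two inputs copied from the NAMD result above.

```python
def ns_per_day(days_per_ns: float) -> float:
    """Convert NAMD's days/ns figure into simulated nanoseconds per day."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


if __name__ == "__main__":
    # Values as reported for the ATPase Simulation - 327,506 Atoms test.
    for label, days in (("EPYC 9684X", 0.25068), ("AMD 9684X", 0.24623)):
        print(f"{label}: {ns_per_day(days):.3f} ns/day")
```

Because the relationship is reciprocal, the lower days/ns result corresponds to the higher ns/day figure.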
HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128
GFLOP/s, More Is Better
EPYC 9684X: 130.75
AMD 9684X: 128.61
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 512
GFLOP/s, More Is Better
EPYC 9684X: 332.91
AMD 9684X: 327.73
1. (CXX) g++ options: -O3

NAS Parallel Benchmarks 3.4
Test / Class: SP.C
Total Mop/s, More Is Better
EPYC 9684X: 211454.60
AMD 9684X: 208198.13
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

NAS Parallel Benchmarks 3.4
Test / Class: LU.C
Total Mop/s, More Is Better
EPYC 9684X: 340754.87
AMD 9684X: 335538.30
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 128
GFLOP/s, More Is Better
EPYC 9684X: 184.35
AMD 9684X: 187.19
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: Stock - Precision: float - X Y Z: 128
GFLOP/s, More Is Better
EPYC 9684X: 178.48
AMD 9684X: 175.82
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 512
GFLOP/s, More Is Better
EPYC 9684X: 341.92
AMD 9684X: 336.84
1. (CXX) g++ options: -O3

libxsmm 2-1.17-3645
M N K: 256
GFLOPS/s, More Is Better
EPYC 9684X: 3177.4
AMD 9684X: 3131.5
1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

NAS Parallel Benchmarks 3.4
Test / Class: EP.C
Total Mop/s, More Is Better
EPYC 9684X: 7839.23
AMD 9684X: 7952.00
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: Stock - Precision: double - X Y Z: 256
GFLOP/s, More Is Better
EPYC 9684X: 174.00
AMD 9684X: 176.45
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: Stock - Precision: float - X Y Z: 128
GFLOP/s, More Is Better
EPYC 9684X: 110.02
AMD 9684X: 111.54
1. (CXX) g++ options: -O3

NAS Parallel Benchmarks 3.4
Test / Class: EP.D
Total Mop/s, More Is Better
EPYC 9684X: 10620.59
AMD 9684X: 10476.83
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.2

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 512
GFLOP/s, More Is Better
EPYC 9684X: 145.50
AMD 9684X: 143.53
1. (CXX) g++ options: -O3

High Performance Conjugate Gradient 3.1
X Y Z: 192 192 192 - RT: 60
GFLOP/s, More Is Better
EPYC 9684X: 22.72
AMD 9684X: 22.42
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: Stock - Precision: float - X Y Z: 512
GFLOP/s, More Is Better
EPYC 9684X: 343.72
AMD 9684X: 339.17
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512
GFLOP/s, More Is Better
EPYC 9684X: 336.45
AMD 9684X: 340.75
1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: Stock - Precision: double - X Y Z: 256
GFLOP/s, More Is Better
EPYC 9684X: 83.89
AMD 9684X: 82.85
1. (CXX) g++ options: -O3

Blender 3.6
Blend File: BMW27 - Compute: CPU-Only
Seconds, Fewer Is Better
EPYC 9684X: 16.26
AMD 9684X: 16.46

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128
GFLOP/s, More Is Better
EPYC 9684X: 186.05
AMD 9684X: 188.22
1. (CXX) g++ options: -O3

OpenFOAM 10
Input: drivaerFastback, Small Mesh Size - Execution Time
Seconds, Fewer Is Better
EPYC 9684X: 28.96
AMD 9684X: 29.26
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm

Embree 4.1
Binary: Pathtracer - Model: Crown
Frames Per Second, More Is Better
EPYC 9684X: 110.69 (MIN: 108.41 / MAX: 113.64)
AMD 9684X: 109.65 (MIN: 107.65 / MAX: 113.51)
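The Embree results carry a MIN/MAX pair per run, which allows a rough check of whether two runs are distinguishable at all. The following is a minimal Python sketch of that heuristic, using the Pathtracer - Crown numbers above. The names `RunResult`, `ranges_overlap`, and `value_within` are illustrative, and overlapping ranges are only a coarse signal, not a statistical test.

```python
from typing import NamedTuple


class RunResult(NamedTuple):
    """Reported result of one run together with its MIN/MAX bounds."""
    value: float
    low: float
    high: float


def ranges_overlap(a: RunResult, b: RunResult) -> bool:
    """True when the [low, high] intervals of the two runs intersect."""
    return a.low <= b.high and b.low <= a.high


def value_within(a: RunResult, b: RunResult) -> bool:
    """True when run a's reported value lies inside run b's MIN/MAX range."""
    return b.low <= a.value <= b.high


if __name__ == "__main__":
    epyc = RunResult(value=110.69, low=108.41, high=113.64)
    amd = RunResult(value=109.65, low=107.65, high=113.51)
    print("ranges overlap:", ranges_overlap(epyc, amd))
    print("EPYC value inside AMD range:", value_within(epyc, amd))
    print("AMD value inside EPYC range:", value_within(amd, epyc))
```

When each run's value falls inside the other run's range, as with these two entries, the gap between them is smaller than the spread observed within a single run.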
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 512 EPYC 9684X AMD 9684X 30 60 90 120 150 148.56 149.97 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double - X Y Z: 512 EPYC 9684X AMD 9684X 30 60 90 120 150 144.62 143.31 1. (CXX) g++ options: -O3
libxsmm M N K: 128 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 128 EPYC 9684X AMD 9684X 700 1400 2100 2800 3500 3119.7 3091.5 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
ASTC Encoder Preset: Fast OpenBenchmarking.org MT/s, More Is Better ASTC Encoder 4.0 Preset: Fast EPYC 9684X AMD 9684X 200 400 600 800 1000 1019.07 1009.89 1. (CXX) g++ options: -O3 -flto -pthread
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 EPYC 9684X AMD 9684X 70 140 210 280 350 315.15 312.39 1. (CXX) g++ options: -O3
Stress-NG Test: Vector Floating Point OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: Vector Floating Point EPYC 9684X AMD 9684X 60K 120K 180K 240K 300K 256936.31 254718.10 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Blender Blend File: Fishy Cat - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 3.6 Blend File: Fishy Cat - Compute: CPU-Only EPYC 9684X AMD 9684X 5 10 15 20 25 20.61 20.44
Stress-NG Test: CPU Stress OpenBenchmarking.org Bogo Ops/s, More Is Better Stress-NG 0.15.10 Test: CPU Stress EPYC 9684X AMD 9684X 50K 100K 150K 200K 250K 213036.81 214798.32 1. (CXX) g++ options: -O2 -std=gnu99 -lc
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256 EPYC 9684X AMD 9684X 40 80 120 160 200 177.12 178.56 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256 EPYC 9684X AMD 9684X 20 40 60 80 100 83.53 82.86 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128 EPYC 9684X AMD 9684X 30 60 90 120 150 115.34 116.26 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 EPYC 9684X AMD 9684X 40 80 120 160 200 177.71 176.30 1. (CXX) g++ options: -O3
Embree Binary: Pathtracer - Model: Asian Dragon Obj OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.1 Binary: Pathtracer - Model: Asian Dragon Obj EPYC 9684X AMD 9684X 30 60 90 120 150 114.17 115.04 MIN: 112.65 / MAX: 116.39 MIN: 113.62 / MAX: 117.34
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 EPYC 9684X AMD 9684X 30 60 90 120 150 154.31 155.39 1. (CXX) g++ options: -O3
LULESH OpenBenchmarking.org z/s, More Is Better LULESH 2.0.3 EPYC 9684X AMD 9684X 7K 14K 21K 28K 35K 30858.46 31065.29 1. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 EPYC 9684X AMD 9684X 30 60 90 120 150 136.46 135.56 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double - X Y Z: 512 EPYC 9684X AMD 9684X 15 30 45 60 75 68.82 68.37 1. (CXX) g++ options: -O3
OpenFOAM Input: drivaerFastback, Small Mesh Size - Mesh Time OpenBenchmarking.org Seconds, Fewer Is Better OpenFOAM 10 Input: drivaerFastback, Small Mesh Size - Mesh Time EPYC 9684X AMD 9684X 6 12 18 24 30 23.30 23.17 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float - X Y Z: 512 EPYC 9684X AMD 9684X 30 60 90 120 150 149.98 149.16 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512 EPYC 9684X AMD 9684X 30 60 90 120 150 135.54 136.27 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 256 EPYC 9684X AMD 9684X 70 140 210 280 350 327.36 328.78 1. (CXX) g++ options: -O3
NAS Parallel Benchmarks Test / Class: SP.B OpenBenchmarking.org Total Mop/s, More Is Better NAS Parallel Benchmarks 3.4 Test / Class: SP.B EPYC 9684X AMD 9684X 40K 80K 120K 160K 200K 170601.48 169864.87 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.2
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double - X Y Z: 128 EPYC 9684X AMD 9684X 30 60 90 120 150 115.82 116.29 1. (CXX) g++ options: -O3
OpenFOAM Input: drivaerFastback, Medium Mesh Size - Execution Time OpenBenchmarking.org Seconds, Fewer Is Better OpenFOAM 10 Input: drivaerFastback, Medium Mesh Size - Execution Time EPYC 9684X AMD 9684X 40 80 120 160 200 184.45 185.15 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better). EPYC 9684X: 125.24; AMD 9684X: 125.70. 1. (CXX) g++ options: -O3
OpenFOAM 10, Input: drivaerFastback, Medium Mesh Size - Mesh Time (Seconds, Fewer Is Better). EPYC 9684X: 107.49; AMD 9684X: 107.83. 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
libxsmm 2-1.17-3645, M N K: 64 (GFLOPS/s, More Is Better). EPYC 9684X: 2479.3; AMD 9684X: 2472.2. 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
Xmrig 6.18.1, Variant: Monero - Hash Count: 1M (H/s, More Is Better). EPYC 9684X: 69478.2; AMD 9684X: 69671.8. 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 512 (GFLOP/s, More Is Better). EPYC 9684X: 68.45; AMD 9684X: 68.63. 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 512 (GFLOP/s, More Is Better). EPYC 9684X: 68.81; AMD 9684X: 68.64. 1. (CXX) g++ options: -O3
Embree 4.1, Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better). EPYC 9684X: 122.04 (MIN: 120.36 / MAX: 124.72); AMD 9684X: 122.31 (MIN: 120.61 / MAX: 124.64).
Blender 3.6, Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better). EPYC 9684X: 40.48; AMD 9684X: 40.40.
High Performance Conjugate Gradient 3.1, X Y Z: 160 160 160 - RT: 60 (GFLOP/s, More Is Better). EPYC 9684X: 22.82; AMD 9684X: 22.78. 1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better). EPYC 9684X: 68.16; AMD 9684X: 68.28. 1. (CXX) g++ options: -O3
Blender 3.6, Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better). EPYC 9684X: 49.42; AMD 9684X: 49.50.
Embree 4.1, Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better). EPYC 9684X: 117.08 (MIN: 114.57 / MAX: 120.79); AMD 9684X: 117.25 (MIN: 114.61 / MAX: 121.35).
Embree 4.1, Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better). EPYC 9684X: 126.25 (MIN: 124.97 / MAX: 128.11); AMD 9684X: 126.41 (MIN: 124.93 / MAX: 128.49).
Stress-NG 0.15.10, Test: Wide Vector Math (Bogo Ops/s, More Is Better). EPYC 9684X: 3480531.79; AMD 9684X: 3484985.88. 1. (CXX) g++ options: -O2 -std=gnu99 -lc
ASTC Encoder 4.0, Preset: Medium (MT/s, More Is Better). EPYC 9684X: 421.44; AMD 9684X: 421.97. 1. (CXX) g++ options: -O3 -flto -pthread
Xmrig 6.18.1, Variant: Wownero - Hash Count: 1M (H/s, More Is Better). EPYC 9684X: 74195.0; AMD 9684X: 74288.7. 1. (CXX) g++ options: -fexceptions -fno-rtti -maes -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
Blender 3.6, Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better). EPYC 9684X: 142.10; AMD 9684X: 141.96.
ASKAP 1.0, Test: tConvolve MT - Gridding (Million Grid Points Per Second, More Is Better). EPYC 9684X: 13617.8; AMD 9684X: 13606.9. 1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256 (GFLOP/s, More Is Better). EPYC 9684X: 184.79; AMD 9684X: 184.65. 1. (CXX) g++ options: -O3
libxsmm 2-1.17-3645, M N K: 32 (GFLOPS/s, More Is Better). EPYC 9684X: 1299.8; AMD 9684X: 1300.7. 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
miniFE 2.2, Problem Size: Small (CG Mflops, More Is Better). EPYC 9684X: 53878.4; AMD 9684X: 53906.3. 1. (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 512 (GFLOP/s, More Is Better). EPYC 9684X: 155.71; AMD 9684X: 155.66. 1. (CXX) g++ options: -O3
GROMACS 2023, Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better). EPYC 9684X: 11.80; AMD 9684X: 11.80. 1. (CXX) g++ options: -O3
Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better). EPYC 9684X: 7.65203524; AMD 9684X: 7.65393496. 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
HeFFTe - Highly Efficient FFT for Exascale 2.3, Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 128 (GFLOP/s, More Is Better). EPYC 9684X: 176.53; AMD 9684X: 176.56. 1. (CXX) g++ options: -O3
Embree 4.1, Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better). EPYC 9684X: 142.28 (MIN: 140.41 / MAX: 144.57); AMD 9684X: 142.30 (MIN: 140.74 / MAX: 144.53).
ASTC Encoder 4.0, Preset: Thorough (MT/s, More Is Better). EPYC 9684X: 56.89; AMD 9684X: 56.90. 1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 4.0, Preset: Exhaustive (MT/s, More Is Better). EPYC 9684X: 6.1412; AMD 9684X: 6.1406. 1. (CXX) g++ options: -O3 -flto -pthread
Stress-NG 0.15.10, Test: Vector Shuffle (Bogo Ops/s, More Is Better). EPYC 9684X: 63766.48; AMD 9684X: 63760.46. 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Vector Math (Bogo Ops/s, More Is Better). EPYC 9684X: 545786.37; AMD 9684X: 545747.29. 1. (CXX) g++ options: -O2 -std=gnu99 -lc
Stress-NG 0.15.10, Test: Matrix Math (Bogo Ops/s, More Is Better). EPYC 9684X: 418126.43; AMD 9684X: 418120.08. 1. (CXX) g++ options: -O2 -std=gnu99 -lc
ASKAP 1.0, Test: Hogbom Clean OpenMP (Iterations Per Second, More Is Better). EPYC 9684X: 1204.82; AMD 9684X: 1204.82. 1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0, Test: tConvolve MT - Degridding (Million Grid Points Per Second, More Is Better). EPYC 9684X: 15691; AMD 9684X: 15691. 1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Phoronix Test Suite v10.8.5