AMD Zen 3 Zen 4 Linux Server Performance

AMD Ryzen 7 5800X3D 8-Core testing with a ASUS ROG CROSSHAIR VIII HERO (4501 BIOS) and Sapphire AMD Radeon RX 6500 XT 4GB on Ubuntu 22.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2307028-NE-AMDZEN3ZE93&grw.

AMD Zen 3 Zen 4 Linux Server PerformanceProcessorMotherboardChipsetMemoryDiskGraphicsAudioMonitorNetworkOSKernelDesktopDisplay ServerOpenGLVulkanCompilerFile-SystemScreen ResolutionRyzen 7 5800X3DAMD Ryzen 7 5800X3D 8-Core @ 3.40GHz (8 Cores / 16 Threads)ASUS ROG CROSSHAIR VIII HERO (4501 BIOS)AMD Starship/Matisse64GB3201GB Micron_7450_MTFDKCC3T2TFSSapphire AMD Radeon RX 6500 XT 4GB (2975/1124MHz)AMD Navi 21 HDMI AudioASUS VP28URealtek RTL8125 2.5GbE + Intel I211Ubuntu 22.045.19.0-46-generic (x86_64)GNOME Shell 42.5X Server + Wayland4.6 Mesa 22.2.5-0ubuntu0.1~22.04.3 (LLVM 15.0.7 DRM 3.47)1.3.224GCC 11.3.0ext43840x2160OpenBenchmarking.org- Transparent Huge Pages: madvise- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa20120a - OpenJDK Runtime Environment (build 11.0.19+7-post-Ubuntu-0ubuntu122.04.1)- Python 3.10.6- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

AMD Zen 3 Zen 4 Linux Server Performanceminibude: OpenMP - BM1minibude: OpenMP - BM1heffte: c2c - FFTW - double - 128hpcg: 104 104 104 - 60heffte: r2c - Stock - double - 128heffte: r2c - Stock - float - 128heffte: c2c - Stock - double - 128heffte: c2c - Stock - float - 128heffte: r2c - FFTW - double - 128heffte: r2c - FFTW - float - 128heffte: c2c - FFTW - float - 128libxsmm: 64libxsmm: 32namd: ATPase Simulation - 327,506 Atomsamg: qmcpack: Li2_STO_aeqmcpack: simple-H2Oqmcpack: FeCO6_b3lyp_gmsqmcpack: FeCO6_b3lyp_gmsminife: SmallRyzen 7 5800X3D436.05817.44219.05824.9446932.602768.785014.186838.832447.0565122.66976.5519143.671.11.84456260886167243.3921.668141.07147.455275.93OpenBenchmarking.org

miniBUDE

Implementation: OpenMP - Input Deck: BM1

OpenBenchmarking.orgGFInst/s, More Is BetterminiBUDE 20210901Implementation: OpenMP - Input Deck: BM1Ryzen 7 5800X3D90180270360450SE +/- 0.07, N = 3436.061. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM1

OpenBenchmarking.orgBillion Interactions/s, More Is BetterminiBUDE 20210901Implementation: OpenMP - Input Deck: BM1Ryzen 7 5800X3D48121620SE +/- 0.00, N = 317.441. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128Ryzen 7 5800X3D510152025SE +/- 0.05, N = 319.061. (CXX) g++ options: -O3

High Performance Conjugate Gradient

X Y Z: 104 104 104 - RT: 60

OpenBenchmarking.orgGFLOP/s, More Is BetterHigh Performance Conjugate Gradient 3.1X Y Z: 104 104 104 - RT: 60Ryzen 7 5800X3D1.11262.22523.33784.45045.563SE +/- 0.01797, N = 34.944691. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: Stock - Precision: double - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: double - X Y Z: 128Ryzen 7 5800X3D816243240SE +/- 0.27, N = 332.601. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: Stock - Precision: float - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: float - X Y Z: 128Ryzen 7 5800X3D1530456075SE +/- 0.52, N = 1568.791. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: Stock - Precision: double - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: double - X Y Z: 128Ryzen 7 5800X3D48121620SE +/- 0.06, N = 314.191. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: Stock - Precision: float - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: float - X Y Z: 128Ryzen 7 5800X3D918273645SE +/- 0.31, N = 338.831. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128Ryzen 7 5800X3D1122334455SE +/- 0.53, N = 347.061. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128Ryzen 7 5800X3D306090120150SE +/- 1.12, N = 3122.671. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128Ryzen 7 5800X3D20406080100SE +/- 0.66, N = 1576.551. (CXX) g++ options: -O3

libxsmm

M N K: 64

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 64Ryzen 7 5800X3D306090120150SE +/- 0.00, N = 3143.61. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -pedantic -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2

libxsmm

M N K: 32

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 32Ryzen 7 5800X3D1632486480SE +/- 0.00, N = 371.11. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -pedantic -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2

NAMD

ATPase Simulation - 327,506 Atoms

OpenBenchmarking.orgdays/ns, Fewer Is BetterNAMD 2.14ATPase Simulation - 327,506 AtomsRyzen 7 5800X3D0.4150.831.2451.662.075SE +/- 0.00086, N = 31.84456

Algebraic Multi-Grid Benchmark

OpenBenchmarking.orgFigure Of Merit, More Is BetterAlgebraic Multi-Grid Benchmark 1.2Ryzen 7 5800X3D60M120M180M240M300MSE +/- 170926.48, N = 32608861671. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi

QMCPACK

Input: Li2_STO_ae

OpenBenchmarking.orgTotal Execution Time - Seconds, Fewer Is BetterQMCPACK 3.16Input: Li2_STO_aeRyzen 7 5800X3D50100150200250SE +/- 2.83, N = 3243.391. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl

QMCPACK

Input: simple-H2O

OpenBenchmarking.orgTotal Execution Time - Seconds, Fewer Is BetterQMCPACK 3.16Input: simple-H2ORyzen 7 5800X3D510152025SE +/- 0.04, N = 321.671. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl

QMCPACK

Input: FeCO6_b3lyp_gms

OpenBenchmarking.orgTotal Execution Time - Seconds, Fewer Is BetterQMCPACK 3.16Input: FeCO6_b3lyp_gmsRyzen 7 5800X3D306090120150SE +/- 0.06, N = 3141.071. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl

QMCPACK

Input: FeCO6_b3lyp_gms

OpenBenchmarking.orgTotal Execution Time - Seconds, Fewer Is BetterQMCPACK 3.16Input: FeCO6_b3lyp_gmsRyzen 7 5800X3D306090120150SE +/- 0.39, N = 3147.451. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl

miniFE

Problem Size: Small

OpenBenchmarking.orgCG Mflops, More Is BetterminiFE 2.2Problem Size: SmallRyzen 7 5800X3D11002200330044005500SE +/- 5.20, N = 35275.931. (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi


Phoronix Test Suite v10.8.5