NVIDIA GH200 Benchmarks Smoke Run Comparison
Benchmarks by Michael Larabel for a future article. These are initial smoke-run benchmarks comparing NVIDIA GH200 CPU performance against other server CPUs; more interesting benchmarks are to come.
HTML result view exported from: https://openbenchmarking.org/result/2401256-NE-NVIDIAGH254&grr&sor .
NVIDIA GH200 Benchmarks Smoke Run Comparison

Systems: EPYC 9554, EPYC 9654, EPYC 9684X, EPYC 9754, Xeon Platinum 8490H, Ampere Altra Max 128c, GPTshop GH200
Fields: Processor, Motherboard, Chipset, Memory, Disk, Graphics, Network, Monitor, OS, Kernel, Desktop, Display Server, Compiler, File-System, Screen Resolution
Entries after the first list only the values the export shows for that system.

EPYC 9554
  Processor: AMD EPYC 9554 64-Core @ 3.10GHz (64 Cores / 128 Threads)
  Motherboard: AMD Titanite_4G (RTI1007B BIOS)
  Chipset: AMD Device 14a4
  Memory: 768GB
  Disk: 3201GB Micron_7450_MTFDKCC3T2TFS
  Graphics: ASPEED
  Network: Broadcom NetXtreme BCM5720 PCIe
  OS: Ubuntu 23.10
  Kernel: 6.6.0-rc5-phx-patched (x86_64)
  Desktop: GNOME Shell 45.0
  Display Server: X Server 1.21.1.7
  Compiler: GCC 13.2.0
  File-System: ext4
  Screen Resolution: 1920x1200
EPYC 9654
  Processor: AMD EPYC 9654 96-Core @ 2.40GHz (96 Cores / 192 Threads)
EPYC 9684X
  Processor: AMD EPYC 9684X 96-Core @ 2.55GHz (96 Cores / 192 Threads)
EPYC 9754
  Processor: AMD EPYC 9754 128-Core @ 2.25GHz (128 Cores / 256 Threads)
Xeon Platinum 8490H
  Processor: Intel Xeon Platinum 8490H @ 3.50GHz (60 Cores / 120 Threads)
  Motherboard: Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS)
  Chipset: Intel Device 1bce
  Memory: 512GB
Ampere Altra Max 128c
  Processor: ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores)
  Motherboard: GIGABYTE MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS)
  Chipset: Ampere Computing LLC Altra PCI Root Complex A
  Monitor: VGA HDMI
  Network: 2 x Intel I350
  Kernel: 6.6.0-rc5-phx-patched (aarch64)
  Screen Resolution: 1920x1080
GPTshop GH200
  Processor: ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores)
  Motherboard: Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS)
  Memory: 1 x 480 GB DRAM-6400MT/s
  Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9
  Network: 2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
  Kernel: 6.5.0-15-generic (aarch64)
  Screen Resolution: 1920x1200

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- EPYC 9554, EPYC 9654, EPYC 9684X, EPYC 9754, Xeon Platinum 8490H (identical configuration): --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Ampere Altra Max 128c, GPTshop GH200 (identical configuration): --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Processor Details
- EPYC 9554: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
- EPYC 9654: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
- EPYC 9684X: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
- EPYC 9754: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xaa00116
- Xeon Platinum 8490H: Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0x2b0004b1
- Ampere Altra Max 128c: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
- GPTshop GH200: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)

Java Details
- EPYC 9554, EPYC 9654, EPYC 9684X, EPYC 9754, Xeon Platinum 8490H, Ampere Altra Max 128c: OpenJDK Runtime Environment (build 11.0.20+8-post-Ubuntu-1ubuntu1)
- GPTshop GH200: OpenJDK Runtime Environment (build 11.0.21+9-post-Ubuntu-0ubuntu123.10)

Python Details
- Python 3.11.6

Security Details
- EPYC 9554, EPYC 9654, EPYC 9684X, EPYC 9754 (identical): gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
- Xeon Platinum 8490H: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
- Ampere Altra Max 128c: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
- GPTshop GH200: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
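On a Linux host, the transparent huge page setting, the CPU frequency scaling governor, and the security mitigation status listed above are all exposed through sysfs. The following is a minimal sketch of collecting them, and not necessarily how the Phoronix Test Suite gathers its data. The helper names `read_sysfs` and `collect_system_details` are hypothetical, and the `root` parameter is an assumption added only so the function can be pointed at a test directory.

```python
from pathlib import Path


def read_sysfs(path):
    """Return the stripped contents of a sysfs file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def collect_system_details(root="/"):
    """Collect a few kernel-reported settings (hypothetical helper, illustration only)."""
    root = Path(root)
    cpu = root / "sys/devices/system/cpu"
    details = {
        # The active THP mode is shown in brackets, e.g. "always [madvise] never".
        "transparent_hugepage": read_sysfs(
            root / "sys/kernel/mm/transparent_hugepage/enabled"
        ),
        "scaling_driver": read_sysfs(cpu / "cpu0/cpufreq/scaling_driver"),
        "scaling_governor": read_sysfs(cpu / "cpu0/cpufreq/scaling_governor"),
    }
    # One file per known vulnerability, e.g. spectre_v2 -> "Not affected".
    vulnerabilities = {}
    vuln_dir = cpu / "vulnerabilities"
    if vuln_dir.is_dir():
        for entry in sorted(vuln_dir.iterdir()):
            vulnerabilities[entry.name] = read_sysfs(entry)
    details["vulnerabilities"] = vulnerabilities
    return details


if __name__ == "__main__":
    for key, value in collect_system_details().items():
        print(f"{key}: {value}")
```

Entries that the running kernel or platform does not expose are reported as `None` rather than raising an error, which keeps the sketch usable across the x86_64 and aarch64 systems compared here.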
NVIDIA GH200 Benchmarks Smoke Run Comparison: Result Summary

| Test | EPYC 9554 | EPYC 9654 | EPYC 9684X | EPYC 9754 | Xeon Platinum 8490H | Ampere Altra Max 128c | GPTshop GH200 |
|---|---|---|---|---|---|---|---|
| hpcg: 144 144 144 - 60 | 22.4511 | 32.1283 | 23.7784 | 25.8918 | 31.2432 | 21.2399 | 41.5526 |
| incompact3d: X3D-benchmarking input.i3d | 515.792758 | 430.080329 | 440.548584 | 493.502452 | 353.557595 | 606.521708 | 257.342539 |
| cloverleaf: clover_bm16 | 460.24 | 429.46 | 298.29 | 459.16 | 324.17 | 550.30 | 247.28 |
| build-gem5: Time To Compile | 179.743 | 180.857 | 180.061 | 208.576 | 197.759 | 260.326 | 184.661 |
| duckdb: TPC-H Parquet | 143.306 | 146.276 | 147.489 | 177.134 | 148.099 | 237.161 | 156.987 |
| build-linux-kernel: allmodconfig | 238.289 | 284.249 | 283.865 | 319.426 | 289.966 | 307.878 | 283.339 |
| askap: tConvolve MT - Gridding | 10327.5 | 12332.4 | 14731.5 | 12660.8 | 8112.07 | 4295.72 | 9561.31 |
| graph500: 26 (sssp max_TEPS) | 467453000 | 499347000 | 501222000 | 502151000 | 454483000 | 333239000 | 463087000 |
| graph500: 26 (sssp median_TEPS) | 368016000 | 391167000 | 383754000 | 377033000 | 333129000 | 224032000 | 298983000 |
| graph500: 26 (bfs max_TEPS) | 865152000 | 839917000 | 882181000 | 1184510000 | 1113120000 | 987770000 | 1313880000 |
| graph500: 26 (bfs median_TEPS) | 836159000 | 816172000 | 854029000 | 1147090000 | 1073860000 | 980463000 | 1214290000 |
| openssl: SHA512 | 33622028530 | 40459323610 | 40105639580 | 53648078203 | 22445820447 | 34434960797 | 45292038457 |
| duckdb: IMDB | 103.720 | 115.805 | 116.050 | 147.601 | 99.249 | 143.781 | 91.633 |
| build-llvm: Ninja | 131.203 | 133.070 | 131.990 | 146.386 | 169.528 | 266.543 | 195.598 |
| build-nodejs: Time To Compile | 121.928 | 120.156 | 119.203 | 130.460 | 151.567 | 268.704 | 173.834 |
| rocksdb: Rand Read | 376983415 | 429310263 | 486862785 | 611259161 | 268374486 | 435437276 | 429416658 |
| cloverleaf: clover_bm64_short | 45.82 | 47.98 | 51.31 | 49.03 | 38.93 | 62.77 | 28.36 |
| build-godot: Time To Compile | 103.500 | 101.563 | 98.555 | 118.247 | 128.467 | 227.783 | 138.891 |
| cassandra: Writes | 301963 | 285324 | 286460 | 236320 | 139654 | 227548 | 371439 |
| easywave: e2Asean Grid + BengkuluSept2007 Source - 2400 | 56.883 | 59.535 | 60.252 | 72.355 | 118.554 | 326.368 | 98.719 |
| askap: tConvolve MPI - Gridding | 37265.9 | 47003.2 | 67965.5 | 27269.2 | 22365.2 | 2416.54 | 9678.85 |
| askap: tConvolve MPI - Degridding | 31177.2 | 41081.6 | 58310.1 | 26593.4 | 19616.0 | 5285.36 | 12429.5 |
| xmrig: Wownero - 1M | 60032.4 | 75127.4 | 73567.3 | 77607.2 | 35870.2 | 1909.2 | 21993.3 |
| xmrig: Monero - 1M | 47926.4 | 59047.7 | 69205.6 | 29356.1 | 27608.5 | 4322.1 | 17290.3 |
| easywave: e2Asean Grid + BengkuluSept2007 Source - 1200 | 22.452 | 23.507 | 24.579 | 29.588 | 45.850 | 122.550 | 36.562 |
| graphics-magick: Sharpen | 669 | 809 | 775 | 924 | 691 | 1274 | 1244 |
| graphics-magick: Enhanced | 1142 | 1319 | 1277 | 1451 | 1124 | 1230 | 1584 |
| rawtherapee: Total Benchmark Time | 49.943 | 52.051 | 52.991 | 66.132 | 46.578 | 66.045 | 46.546 |
| amg: | 2321156000 | 2296730667 | 2374853333 | 2291049667 | 1611826000 | 1060656000 | 1993579333 |
| compress-7zip: Compression Rating | 489736 | 564950 | 554411 | 586571 | 389690 | 328378 | 341300 |
| primesieve: 1e13 | 28.136 | 24.164 | 25.969 | 21.756 | 52.375 | 41.675 | 36.049 |
| rodinia: OpenMP LavaMD | 34.014 | 30.240 | 30.317 | 25.150 | 42.713 | 31.791 | 30.959 |
| stress-ng: NUMA | 967.56 | 1237.36 | 1173.07 | 1397.61 | 1144.04 | 1454.68 | 6279.41 |
| stress-ng: Memory Copying | 22827.20 | 28310.82 | 26495.63 | 34497.51 | 18809.55 | 27163.45 | 27302.27 |
| lulesh: | 24037.442 | 23701.442 | 24712.884 | 22356.746 | 23997.715 | 16218.331 | 23433.373 |
| mt-dgemm: Sustained Floating-Point Rate | 30.174904 | 39.282158 | 40.320062 | 43.681923 | 23.257388 | 18.114994 | 17.563016 |
| npb: IS.D | 5184.36 | 5195.24 | 6138.54 | 5773.25 | 2885.71 | 1196.33 | 1770.65 |
| incompact3d: input.i3d 193 Cells Per Direction | 9.74551621 | 8.45212936 | 7.30814419 | 9.02806168 | 12.7983420 | 23.8852240 | 9.83151112 |
| npb: CG.C | 45867.44 | 54465.04 | 59557.08 | 44431.58 | 35560.25 | 17272.73 | 24350.32 |
| minife: Small | 50351.4 | 50320.2 | 52937.6 | 50393.7 | 35452.5 | 24122.0 | 46477.3 |
| npb: FT.C | 124827.75 | 125164.03 | 124391.00 | 141255.07 | 70989.97 | 35153.27 | 47948.15 |
| npb: MG.C | 125921.29 | 118608.07 | 137608.68 | 127404.16 | 88049.90 | 42291.84 | 57191.85 |
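Because the summary mixes metrics where more is better (for example GFLOP/s) with metrics where fewer is better (for example seconds), raw values cannot be compared across rows directly. The sketch below normalizes a few rows into a ratio where values above 1.0 favor the GH200. The figures are copied from the summary for the EPYC 9554 and GPTshop GH200; the direction of each metric is taken from the per-test result headings, and the function name `relative_performance` is a hypothetical helper rather than anything provided by OpenBenchmarking.org.

```python
# (test, higher_is_better, EPYC 9554 result, GPTshop GH200 result)
# Values are copied from the result summary; direction comes from the
# "More Is Better" / "Fewer Is Better" labels on the per-test results.
RESULTS = [
    ("hpcg: 144 144 144 - 60", True, 22.4511, 41.5526),
    ("cloverleaf: clover_bm16", False, 460.24, 247.28),
    ("build-llvm: Ninja", False, 131.203, 195.598),
]


def relative_performance(baseline, candidate, higher_is_better):
    """Return a ratio above 1.0 when the candidate outperforms the baseline."""
    if baseline <= 0 or candidate <= 0:
        raise ValueError("results must be positive")
    if higher_is_better:
        return candidate / baseline
    # For time-like metrics a smaller value is better, so invert the ratio.
    return baseline / candidate


if __name__ == "__main__":
    for test, higher_is_better, epyc_9554, gh200 in RESULTS:
        ratio = relative_performance(epyc_9554, gh200, higher_is_better)
        print(f"{test}: GH200 at {ratio:.2f}x the EPYC 9554")
```

Inverting the ratio for time-based tests keeps "above 1.0 means faster" true for every row, which makes it straightforward to see that the GH200 leads in HPCG and CloverLeaf but trails in the LLVM build.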
High Performance Conjugate Gradient 3.1 - X Y Z: 144 144 144 - RT: 60
GFLOP/s, More Is Better
  GPTshop GH200: 41.55 (SE +/- 0.36, N = 3)
  EPYC 9654: 32.13 (SE +/- 0.35, N = 3)
  Xeon Platinum 8490H: 31.24 (SE +/- 0.11, N = 3)
  EPYC 9754: 25.89 (SE +/- 0.27, N = 3)
  EPYC 9684X: 23.78 (SE +/- 0.28, N = 9)
  EPYC 9554: 22.45 (SE +/- 0.04, N = 3)
  Ampere Altra Max 128c: 21.24 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
Xcompact3d Incompact3d 2021-03-11 - Input: X3D-benchmarking input.i3d
Seconds, Fewer Is Better
  GPTshop GH200: 257.34 (SE +/- 1.42, N = 3)
  Xeon Platinum 8490H: 353.56 (SE +/- 4.12, N = 4)
  EPYC 9654: 430.08 (SE +/- 8.73, N = 9)
  EPYC 9684X: 440.55 (SE +/- 5.21, N = 9)
  EPYC 9754: 493.50 (SE +/- 14.11, N = 3)
  EPYC 9554: 515.79 (SE +/- 7.98, N = 9)
  Ampere Altra Max 128c: 606.52 (SE +/- 0.07, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
CloverLeaf 1.3 - Input: clover_bm16
Seconds, Fewer Is Better
  GPTshop GH200: 247.28 (SE +/- 0.52, N = 3)
  EPYC 9684X: 298.29 (SE +/- 14.65, N = 9)
  Xeon Platinum 8490H: 324.17 (SE +/- 0.02, N = 3)
  EPYC 9654: 429.46 (SE +/- 15.29, N = 9)
  EPYC 9754: 459.16 (SE +/- 14.10, N = 9)
  EPYC 9554: 460.24 (SE +/- 4.26, N = 7)
  Ampere Altra Max 128c: 550.30 (SE +/- 0.15, N = 3)
1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
Timed Gem5 Compilation 23.0.1 - Time To Compile
Seconds, Fewer Is Better
  EPYC 9554: 179.74 (SE +/- 1.67, N = 3)
  EPYC 9684X: 180.06 (SE +/- 1.97, N = 3)
  EPYC 9654: 180.86 (SE +/- 1.09, N = 3)
  GPTshop GH200: 184.66 (SE +/- 2.18, N = 12)
  Xeon Platinum 8490H: 197.76 (SE +/- 1.53, N = 3)
  EPYC 9754: 208.58 (SE +/- 2.85, N = 3)
  Ampere Altra Max 128c: 260.33 (SE +/- 3.13, N = 4)
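The SE figures reported with each result give a rough way to judge whether two systems really differ or are within run-to-run noise. The sketch below is a heuristic only: it assumes the reported SE is the standard error of the mean and that runs on different systems are independent, and the threshold of 2 is a common rule of thumb rather than anything the export specifies. With N as small as 3, a proper t-test on the raw runs would be more rigorous. The helper name `se_gap` is hypothetical; the numbers are the Gem5 results above.

```python
import math

# (mean seconds, standard error) copied from the Timed Gem5 Compilation results.
GEM5 = {
    "EPYC 9554": (179.74, 1.67),
    "EPYC 9684X": (180.06, 1.97),
    "GPTshop GH200": (184.66, 2.18),
    "Ampere Altra Max 128c": (260.33, 3.13),
}


def se_gap(mean_a, se_a, mean_b, se_b):
    """Return the gap between two means in units of their combined standard error."""
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined == 0:
        # Both errors are reported as zero: any difference is unbounded in SE units.
        return math.inf if mean_a != mean_b else 0.0
    return abs(mean_a - mean_b) / combined


if __name__ == "__main__":
    base_mean, base_se = GEM5["EPYC 9554"]
    for name, (mean, se) in GEM5.items():
        if name == "EPYC 9554":
            continue
        gap = se_gap(base_mean, base_se, mean, se)
        verdict = "likely a real difference" if gap > 2 else "within noise"
        print(f"EPYC 9554 vs {name}: {gap:.2f} combined SEs ({verdict})")
```

Under these assumptions the three fastest EPYC results are effectively tied, while the gap to the Ampere Altra Max is far larger than the measurement noise.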
DuckDB 0.9.1 - Benchmark: TPC-H Parquet
Seconds, Fewer Is Better
  EPYC 9554: 143.31 (SE +/- 0.53, N = 3)
  EPYC 9654: 146.28 (SE +/- 0.47, N = 3)
  EPYC 9684X: 147.49 (SE +/- 0.26, N = 3)
  Xeon Platinum 8490H: 148.10 (SE +/- 0.30, N = 3)
  GPTshop GH200: 156.99 (SE +/- 0.69, N = 3)
  EPYC 9754: 177.13 (SE +/- 0.64, N = 3)
  Ampere Altra Max 128c: 237.16 (SE +/- 4.97, N = 9)
1. (CXX) g++ options: -O3 -rdynamic -lssl -lcrypto -ldl
Timed Linux Kernel Compilation 6.1 - Build: allmodconfig
Seconds, Fewer Is Better
  EPYC 9554: 238.29 (SE +/- 0.61, N = 3)
  GPTshop GH200: 283.34 (SE +/- 1.02, N = 3)
  EPYC 9684X: 283.87 (SE +/- 0.50, N = 3)
  EPYC 9654: 284.25 (SE +/- 0.41, N = 3)
  Xeon Platinum 8490H: 289.97 (SE +/- 0.60, N = 3)
  Ampere Altra Max 128c: 307.88 (SE +/- 0.75, N = 3)
  EPYC 9754: 319.43 (SE +/- 2.48, N = 3)
ASKAP 1.0 - Test: tConvolve MT - Gridding
Million Grid Points Per Second, More Is Better
  EPYC 9684X: 14731.50 (SE +/- 240.55, N = 12)
  EPYC 9754: 12660.80 (SE +/- 9.06, N = 3)
  EPYC 9654: 12332.40 (SE +/- 32.54, N = 3)
  EPYC 9554: 10327.50 (SE +/- 9.54, N = 3)
  GPTshop GH200: 9561.31 (SE +/- 2.75, N = 3)
  Xeon Platinum 8490H: 8112.07 (SE +/- 4.50, N = 3)
  Ampere Altra Max 128c: 4295.72 (SE +/- 0.36, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Graph500 3.0 - Scale: 26
sssp max_TEPS, More Is Better
  EPYC 9754: 502151000
  EPYC 9684X: 501222000
  EPYC 9654: 499347000
  EPYC 9554: 467453000
  GPTshop GH200: 463087000
  Xeon Platinum 8490H: 454483000
  Ampere Altra Max 128c: 333239000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
Graph500 3.0 - Scale: 26
sssp median_TEPS, More Is Better
  EPYC 9654: 391167000
  EPYC 9684X: 383754000
  EPYC 9754: 377033000
  EPYC 9554: 368016000
  Xeon Platinum 8490H: 333129000
  GPTshop GH200: 298983000
  Ampere Altra Max 128c: 224032000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
Graph500 3.0 - Scale: 26
bfs max_TEPS, More Is Better
  GPTshop GH200: 1313880000
  EPYC 9754: 1184510000
  Xeon Platinum 8490H: 1113120000
  Ampere Altra Max 128c: 987770000
  EPYC 9684X: 882181000
  EPYC 9554: 865152000
  EPYC 9654: 839917000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
Graph500 3.0 - Scale: 26
bfs median_TEPS, More Is Better
  GPTshop GH200: 1214290000
  EPYC 9754: 1147090000
  Xeon Platinum 8490H: 1073860000
  Ampere Altra Max 128c: 980463000
  EPYC 9684X: 854029000
  EPYC 9554: 836159000
  EPYC 9654: 816172000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
OpenSSL 3.1 - Algorithm: SHA512
byte/s, More Is Better
  EPYC 9754: 53648078203 (SE +/- 1530571.35, N = 3)
  GPTshop GH200: 45292038457 (SE +/- 90952676.69, N = 3)
  EPYC 9654: 40459323610 (SE +/- 16069151.20, N = 3)
  EPYC 9684X: 40105639580 (SE +/- 6401405.16, N = 3)
  Ampere Altra Max 128c: 34434960797 (SE +/- 36142057.88, N = 3)
  EPYC 9554: 33622028530 (SE +/- 4732001.41, N = 3)
  Xeon Platinum 8490H: 22445820447 (SE +/- 147259946.50, N = 3)
Per-system options as exported: -m64 -lssl -lcrypto -m64 -m64 -lssl -lcrypto -m64 -m64 -lssl -lcrypto
1. (CC) gcc options: -pthread -O3 -ldl
DuckDB 0.9.1 - Benchmark: IMDB
Seconds, Fewer Is Better
  GPTshop GH200: 91.63 (SE +/- 0.75, N = 3)
  Xeon Platinum 8490H: 99.25 (SE +/- 0.24, N = 3)
  EPYC 9554: 103.72 (SE +/- 0.16, N = 3)
  EPYC 9654: 115.81 (SE +/- 0.18, N = 3)
  EPYC 9684X: 116.05 (SE +/- 0.26, N = 3)
  Ampere Altra Max 128c: 143.78 (SE +/- 0.18, N = 3)
  EPYC 9754: 147.60 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lssl -lcrypto -ldl
Timed LLVM Compilation 16.0 - Build System: Ninja
Seconds, Fewer Is Better
  EPYC 9554: 131.20 (SE +/- 0.11, N = 3)
  EPYC 9684X: 131.99 (SE +/- 0.07, N = 3)
  EPYC 9654: 133.07 (SE +/- 0.08, N = 3)
  EPYC 9754: 146.39 (SE +/- 0.28, N = 3)
  Xeon Platinum 8490H: 169.53 (SE +/- 0.08, N = 3)
  GPTshop GH200: 195.60 (SE +/- 0.88, N = 3)
  Ampere Altra Max 128c: 266.54 (SE +/- 0.95, N = 3)
Timed Node.js Compilation 19.8.1 - Time To Compile
Seconds, Fewer Is Better
  EPYC 9684X: 119.20 (SE +/- 0.09, N = 3)
  EPYC 9654: 120.16 (SE +/- 0.08, N = 3)
  EPYC 9554: 121.93 (SE +/- 0.32, N = 3)
  EPYC 9754: 130.46 (SE +/- 0.08, N = 3)
  Xeon Platinum 8490H: 151.57 (SE +/- 0.27, N = 3)
  GPTshop GH200: 173.83 (SE +/- 0.08, N = 3)
  Ampere Altra Max 128c: 268.70 (SE +/- 0.19, N = 3)
RocksDB 8.0 - Test: Random Read
Op/s, More Is Better
  EPYC 9754: 611259161 (SE +/- 817605.37, N = 3)
  EPYC 9684X: 486862785 (SE +/- 539077.40, N = 3)
  Ampere Altra Max 128c: 435437276 (SE +/- 3039704.34, N = 15)
  GPTshop GH200: 429416658 (SE +/- 3229592.93, N = 11)
  EPYC 9654: 429310263 (SE +/- 1006626.36, N = 3)
  EPYC 9554: 376983415 (SE +/- 106225.72, N = 3)
  Xeon Platinum 8490H: 268374486 (SE +/- 6039899.37, N = 15)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
CloverLeaf 1.3 - Input: clover_bm64_short
Seconds, Fewer Is Better
  GPTshop GH200: 28.36 (SE +/- 0.01, N = 3)
  Xeon Platinum 8490H: 38.93 (SE +/- 0.00, N = 3)
  EPYC 9554: 45.82 (SE +/- 1.40, N = 12)
  EPYC 9654: 47.98 (SE +/- 1.47, N = 12)
  EPYC 9754: 49.03 (SE +/- 1.50, N = 12)
  EPYC 9684X: 51.31 (SE +/- 1.39, N = 12)
  Ampere Altra Max 128c: 62.77 (SE +/- 0.02, N = 3)
1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
Timed Godot Game Engine Compilation 4.0 - Time To Compile
Seconds, Fewer Is Better
  EPYC 9684X: 98.56 (SE +/- 0.06, N = 3)
  EPYC 9654: 101.56 (SE +/- 0.16, N = 3)
  EPYC 9554: 103.50 (SE +/- 0.03, N = 3)
  EPYC 9754: 118.25 (SE +/- 0.26, N = 3)
  Xeon Platinum 8490H: 128.47 (SE +/- 0.43, N = 3)
  GPTshop GH200: 138.89 (SE +/- 0.65, N = 3)
  Ampere Altra Max 128c: 227.78 (SE +/- 0.35, N = 3)
Apache Cassandra 4.1.3 - Test: Writes
Op/s, More Is Better
  GPTshop GH200: 371439 (SE +/- 2382.80, N = 3)
  EPYC 9554: 301963 (SE +/- 554.18, N = 3)
  EPYC 9684X: 286460 (SE +/- 1195.73, N = 3)
  EPYC 9654: 285324 (SE +/- 162.88, N = 3)
  EPYC 9754: 236320 (SE +/- 1223.06, N = 3)
  Ampere Altra Max 128c: 227548 (SE +/- 2776.53, N = 3)
  Xeon Platinum 8490H: 139654 (SE +/- 883.78, N = 3)
easyWave r34 - Input: e2Asean Grid + BengkuluSept2007 Source - Time: 2400
Seconds, Fewer Is Better
  EPYC 9554: 56.88 (SE +/- 0.33, N = 3)
  EPYC 9654: 59.54 (SE +/- 0.40, N = 3)
  EPYC 9684X: 60.25 (SE +/- 0.14, N = 3)
  EPYC 9754: 72.36 (SE +/- 0.45, N = 3)
  GPTshop GH200: 98.72 (SE +/- 0.34, N = 3)
  Xeon Platinum 8490H: 118.55 (SE +/- 0.36, N = 3)
  Ampere Altra Max 128c: 326.37 (SE +/- 1.24, N = 3)
1. (CXX) g++ options: -O3 -fopenmp
ASKAP 1.0 - Test: tConvolve MPI - Gridding
Mpix/sec, More Is Better
  EPYC 9684X: 67965.50 (SE +/- 485.47, N = 3)
  EPYC 9654: 47003.20 (SE +/- 405.08, N = 3)
  EPYC 9554: 37265.90 (SE +/- 219.20, N = 3)
  EPYC 9754: 27269.20 (SE +/- 655.18, N = 14)
  Xeon Platinum 8490H: 22365.20 (SE +/- 146.74, N = 3)
  GPTshop GH200: 9678.85 (SE +/- 39.51, N = 3)
  Ampere Altra Max 128c: 2416.54 (SE +/- 0.46, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0 - Test: tConvolve MPI - Degridding
Mpix/sec, More Is Better
  EPYC 9684X: 58310.10 (SE +/- 0.00, N = 3)
  EPYC 9654: 41081.60 (SE +/- 475.58, N = 3)
  EPYC 9554: 31177.20 (SE +/- 153.57, N = 3)
  EPYC 9754: 26593.40 (SE +/- 574.27, N = 15)
  Xeon Platinum 8490H: 19616.00 (SE +/- 131.23, N = 3)
  GPTshop GH200: 12429.50 (SE +/- 37.76, N = 3)
  Ampere Altra Max 128c: 5285.36 (SE +/- 4.44, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Xmrig 6.18.1 - Variant: Wownero - Hash Count: 1M
H/s, More Is Better
  EPYC 9754: 77607.2 (SE +/- 604.78, N = 4)
  EPYC 9654: 75127.4 (SE +/- 60.06, N = 4)
  EPYC 9684X: 73567.3 (SE +/- 35.19, N = 4)
  EPYC 9554: 60032.4 (SE +/- 18.19, N = 3)
  Xeon Platinum 8490H: 35870.2 (SE +/- 13.06, N = 3)
  GPTshop GH200: 21993.3 (SE +/- 2.75, N = 3)
  Ampere Altra Max 128c: 1909.2 (SE +/- 4.67, N = 3)
Per-system options as exported: -maes -maes -maes -maes -maes
1. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
Xmrig 6.18.1 - Variant: Monero - Hash Count: 1M
H/s, More Is Better
  EPYC 9684X: 69205.6 (SE +/- 80.10, N = 4)
  EPYC 9654: 59047.7 (SE +/- 142.78, N = 3)
  EPYC 9554: 47926.4 (SE +/- 42.25, N = 3)
  EPYC 9754: 29356.1 (SE +/- 2550.91, N = 12)
  Xeon Platinum 8490H: 27608.5 (SE +/- 7.51, N = 3)
  GPTshop GH200: 17290.3 (SE +/- 5.62, N = 3)
  Ampere Altra Max 128c: 4322.1 (SE +/- 16.52, N = 3)
Per-system options as exported: -maes -maes -maes -maes -maes
1. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
easyWave r34 - Input: e2Asean Grid + BengkuluSept2007 Source - Time: 1200
Seconds, Fewer Is Better
  EPYC 9554: 22.45 (SE +/- 0.06, N = 3)
  EPYC 9654: 23.51 (SE +/- 0.14, N = 3)
  EPYC 9684X: 24.58 (SE +/- 0.32, N = 3)
  EPYC 9754: 29.59 (SE +/- 0.20, N = 15)
  GPTshop GH200: 36.56 (SE +/- 0.11, N = 3)
  Xeon Platinum 8490H: 45.85 (SE +/- 0.04, N = 3)
  Ampere Altra Max 128c: 122.55 (SE +/- 1.12, N = 3)
1. (CXX) g++ options: -O3 -fopenmp
GraphicsMagick 1.3.38 - Operation: Sharpen
Iterations Per Minute, More Is Better
  Ampere Altra Max 128c: 1274 (SE +/- 1.20, N = 3)
  GPTshop GH200: 1244 (SE +/- 5.13, N = 3)
  EPYC 9754: 924 (SE +/- 0.67, N = 3)
  EPYC 9654: 809 (SE +/- 0.33, N = 3)
  EPYC 9684X: 775 (SE +/- 0.33, N = 3)
  Xeon Platinum 8490H: 691 (SE +/- 0.33, N = 3)
  EPYC 9554: 669 (SE +/- 0.33, N = 3)
Per-system options as exported: -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2
1. (CC) gcc options: -fopenmp -O2 -ljpeg -lXext -lSM -lICE -lX11 -lz -lzstd -lm -lpthread
GraphicsMagick 1.3.38 - Operation: Enhanced
Iterations Per Minute, More Is Better
  GPTshop GH200: 1584 (SE +/- 0.33, N = 3)
  EPYC 9754: 1451 (SE +/- 0.58, N = 3)
  EPYC 9654: 1319 (SE +/- 0.88, N = 3)
  EPYC 9684X: 1277 (SE +/- 1.67, N = 3)
  Ampere Altra Max 128c: 1230 (SE +/- 2.73, N = 3)
  EPYC 9554: 1142 (SE +/- 0.88, N = 3)
  Xeon Platinum 8490H: 1124 (SE +/- 0.33, N = 3)
Per-system options as exported: -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2
1. (CC) gcc options: -fopenmp -O2 -ljpeg -lXext -lSM -lICE -lX11 -lz -lzstd -lm -lpthread
RawTherapee - Total Benchmark Time
Seconds, Fewer Is Better
  GPTshop GH200: 46.55 (SE +/- 0.10, N = 3)
  Xeon Platinum 8490H: 46.58 (SE +/- 0.05, N = 3)
  EPYC 9554: 49.94 (SE +/- 0.04, N = 3)
  EPYC 9654: 52.05 (SE +/- 0.07, N = 3)
  EPYC 9684X: 52.99 (SE +/- 0.03, N = 3)
  Ampere Altra Max 128c: 66.05 (SE +/- 0.02, N = 3)
  EPYC 9754: 66.13 (SE +/- 0.22, N = 3)
1. RawTherapee, version 5.9, command line.
Algebraic Multi-Grid Benchmark 1.2
Figure Of Merit, More Is Better
  EPYC 9684X: 2374853333 (SE +/- 1953221.73, N = 3)
  EPYC 9554: 2321156000 (SE +/- 3180941.37, N = 3)
  EPYC 9654: 2296730667 (SE +/- 5502047.33, N = 3)
  EPYC 9754: 2291049667 (SE +/- 2811300.43, N = 3)
  GPTshop GH200: 1993579333 (SE +/- 22467067.78, N = 3)
  Xeon Platinum 8490H: 1611826000 (SE +/- 269139.99, N = 3)
  Ampere Altra Max 128c: 1060656000 (SE +/- 31942.66, N = 3)
1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi
7-Zip Compression 22.01 - Test: Compression Rating (MIPS, More Is Better)
  EPYC 9754: 586571 (SE +/- 439.85, N = 3)
  EPYC 9654: 564950 (SE +/- 311.94, N = 3)
  EPYC 9684X: 554411 (SE +/- 366.95, N = 3)
  EPYC 9554: 489736 (SE +/- 224.17, N = 3)
  Xeon Platinum 8490H: 389690 (SE +/- 629.56, N = 3)
  GPTshop GH200: 341300 (SE +/- 222.71, N = 3)
  Ampere Altra Max 128c: 328378 (SE +/- 1588.70, N = 3)
  1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
Primesieve 8.0 - Length: 1e13 (Seconds, Fewer Is Better)
  EPYC 9754: 21.76 (SE +/- 0.01, N = 3)
  EPYC 9654: 24.16 (SE +/- 0.01, N = 3)
  EPYC 9684X: 25.97 (SE +/- 0.02, N = 3)
  EPYC 9554: 28.14 (SE +/- 0.03, N = 3)
  GPTshop GH200: 36.05 (SE +/- 0.51, N = 3)
  Ampere Altra Max 128c: 41.68 (SE +/- 0.07, N = 3)
  Xeon Platinum 8490H: 52.38 (SE +/- 0.07, N = 3)
  1. (CXX) g++ options: -O3
Rodinia 3.1 - Test: OpenMP LavaMD (Seconds, Fewer Is Better)
  EPYC 9754: 25.15 (SE +/- 0.10, N = 3)
  EPYC 9654: 30.24 (SE +/- 0.06, N = 3)
  EPYC 9684X: 30.32 (SE +/- 0.05, N = 3)
  GPTshop GH200: 30.96 (SE +/- 0.07, N = 3)
  Ampere Altra Max 128c: 31.79 (SE +/- 0.01, N = 3)
  EPYC 9554: 34.01 (SE +/- 0.05, N = 3)
  Xeon Platinum 8490H: 42.71 (SE +/- 0.53, N = 3)
  1. (CXX) g++ options: -O2 -lOpenCL
Stress-NG 0.16.04 - Test: NUMA (Bogo Ops/s, More Is Better)
  GPTshop GH200: 6279.41 (SE +/- 10.32, N = 3)
  Ampere Altra Max 128c: 1454.68 (SE +/- 2.51, N = 3)
  EPYC 9754: 1397.61 (SE +/- 2.90, N = 3)
  EPYC 9654: 1237.36 (SE +/- 1.10, N = 3)
  EPYC 9684X: 1173.07 (SE +/- 1.38, N = 3)
  Xeon Platinum 8490H: 1144.04 (SE +/- 1.16, N = 3)
  EPYC 9554: 967.56 (SE +/- 4.03, N = 3)
  Additional per-system flags: -O2 -std=gnu99 (GPTshop GH200, Ampere Altra Max 128c); -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz (EPYC 9754, EPYC 9654, EPYC 9684X, Xeon Platinum 8490H, EPYC 9554)
  1. (CXX) g++ options: -lc
Stress-NG 0.16.04 - Test: Memory Copying (Bogo Ops/s, More Is Better)
  EPYC 9754: 34497.51 (SE +/- 23.24, N = 3)
  EPYC 9654: 28310.82 (SE +/- 22.89, N = 3)
  GPTshop GH200: 27302.27 (SE +/- 58.68, N = 3)
  Ampere Altra Max 128c: 27163.45 (SE +/- 0.31, N = 3)
  EPYC 9684X: 26495.63 (SE +/- 12.67, N = 3)
  EPYC 9554: 22827.20 (SE +/- 19.64, N = 3)
  Xeon Platinum 8490H: 18809.55 (SE +/- 67.98, N = 3)
  Additional per-system flags: -O2 -std=gnu99 (GPTshop GH200, Ampere Altra Max 128c); -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz (EPYC 9754, EPYC 9654, EPYC 9684X, EPYC 9554, Xeon Platinum 8490H)
  1. (CXX) g++ options: -lc
LULESH 2.0.3 (z/s, More Is Better)
  EPYC 9684X: 24712.88 (SE +/- 90.30, N = 3)
  EPYC 9554: 24037.44 (SE +/- 113.24, N = 3)
  Xeon Platinum 8490H: 23997.72 (SE +/- 49.50, N = 5)
  EPYC 9654: 23701.44 (SE +/- 84.04, N = 3)
  GPTshop GH200: 23433.37 (SE +/- 40.15, N = 3)
  EPYC 9754: 22356.75 (SE +/- 59.94, N = 3)
  Ampere Altra Max 128c: 16218.33 (SE +/- 24.21, N = 3)
  1. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi
ACES DGEMM 1.0 - Sustained Floating-Point Rate (GFLOP/s, More Is Better)
  EPYC 9754: 43.68 (SE +/- 0.13, N = 7)
  EPYC 9684X: 40.32 (SE +/- 0.12, N = 7)
  EPYC 9654: 39.28 (SE +/- 0.18, N = 7)
  EPYC 9554: 30.17 (SE +/- 0.28, N = 6)
  Xeon Platinum 8490H: 23.26 (SE +/- 0.13, N = 5)
  Ampere Altra Max 128c: 18.11 (SE +/- 0.09, N = 4)
  GPTshop GH200: 17.56 (SE +/- 0.19, N = 15)
  1. (CC) gcc options: -O3 -march=native -fopenmp
NAS Parallel Benchmarks 3.4 - Test / Class: IS.D (Total Mop/s, More Is Better)
  EPYC 9684X: 6138.54 (SE +/- 65.80, N = 5)
  EPYC 9754: 5773.25 (SE +/- 18.26, N = 5)
  EPYC 9654: 5195.24 (SE +/- 13.89, N = 5)
  EPYC 9554: 5184.36 (SE +/- 14.93, N = 5)
  Xeon Platinum 8490H: 2885.71 (SE +/- 6.63, N = 4)
  GPTshop GH200: 1770.65 (SE +/- 0.92, N = 3)
  Ampere Altra Max 128c: 1196.33 (SE +/- 0.34, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.5
Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better)
  EPYC 9684X: 7.30814419 (SE +/- 0.03937117, N = 5)
  EPYC 9654: 8.45212936 (SE +/- 0.07670149, N = 5)
  EPYC 9754: 9.02806168 (SE +/- 0.04030589, N = 5)
  EPYC 9554: 9.74551621 (SE +/- 0.03040517, N = 5)
  GPTshop GH200: 9.83151112 (SE +/- 0.03425295, N = 5)
  Xeon Platinum 8490H: 12.79834200 (SE +/- 0.04713862, N = 4)
  Ampere Altra Max 128c: 23.88522400 (SE +/- 0.05179277, N = 3)
  1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
NAS Parallel Benchmarks 3.4 - Test / Class: CG.C (Total Mop/s, More Is Better)
  EPYC 9684X: 59557.08 (SE +/- 469.70, N = 9)
  EPYC 9654: 54465.04 (SE +/- 537.84, N = 15)
  EPYC 9554: 45867.44 (SE +/- 363.39, N = 8)
  EPYC 9754: 44431.58 (SE +/- 391.92, N = 15)
  Xeon Platinum 8490H: 35560.25 (SE +/- 268.67, N = 15)
  GPTshop GH200: 24350.32 (SE +/- 58.76, N = 6)
  Ampere Altra Max 128c: 17272.73 (SE +/- 15.14, N = 5)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.5
miniFE 2.2 - Problem Size: Small (CG Mflops, More Is Better)
  EPYC 9684X: 52937.6 (SE +/- 507.76, N = 5)
  EPYC 9754: 50393.7 (SE +/- 105.96, N = 5)
  EPYC 9554: 50351.4 (SE +/- 51.65, N = 5)
  EPYC 9654: 50320.2 (SE +/- 45.75, N = 5)
  GPTshop GH200: 46477.3 (SE +/- 54.33, N = 5)
  Xeon Platinum 8490H: 35452.5 (SE +/- 14.30, N = 4)
  Ampere Altra Max 128c: 24122.0 (SE +/- 5.88, N = 4)
  1. (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi
NAS Parallel Benchmarks 3.4 - Test / Class: FT.C (Total Mop/s, More Is Better)
  EPYC 9754: 141255.07 (SE +/- 1190.42, N = 8)
  EPYC 9654: 125164.03 (SE +/- 698.81, N = 8)
  EPYC 9554: 124827.75 (SE +/- 1162.69, N = 15)
  EPYC 9684X: 124391.00 (SE +/- 686.49, N = 8)
  Xeon Platinum 8490H: 70989.97 (SE +/- 445.75, N = 6)
  GPTshop GH200: 47948.15 (SE +/- 163.34, N = 5)
  Ampere Altra Max 128c: 35153.27 (SE +/- 41.38, N = 4)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.5
NAS Parallel Benchmarks 3.4 - Test / Class: MG.C (Total Mop/s, More Is Better)
  EPYC 9684X: 137608.68 (SE +/- 1538.33, N = 15)
  EPYC 9754: 127404.16 (SE +/- 667.76, N = 9)
  EPYC 9554: 125921.29 (SE +/- 1712.13, N = 15)
  EPYC 9654: 118608.07 (SE +/- 1159.21, N = 15)
  Xeon Platinum 8490H: 88049.90 (SE +/- 405.89, N = 9)
  GPTshop GH200: 57191.85 (SE +/- 113.65, N = 8)
  Ampere Altra Max 128c: 42291.84 (SE +/- 10.27, N = 7)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
  2. Open MPI 4.1.5
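The result charts report each figure together with "SE +/- x, N = n". As a minimal sketch, assuming SE denotes the standard error of the mean over the N recorded runs (sample standard deviation divided by the square root of N), the calculation could look like this in Python. The function name and the per-run timings are hypothetical; the export lists only the aggregated values.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-run timings in seconds (NOT taken from the result file).
runs = [46.4, 46.5, 46.8]
mean = statistics.mean(runs)
se = standard_error(runs)

# Print in the same style as the chart annotations.
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(runs)})")
```

A small SE relative to the mean suggests the runs were consistent; charts with a larger N may indicate that additional runs were recorded for that system.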
Phoronix Test Suite System Monitoring, GPTshop GH200 (whole run):
  System Temperature Monitor (Celsius): Min 71.9 / Avg 92.77 / Max 102.5
  System Power Consumption Monitor (Watts): Min 167.62 / Avg 281.77 / Max 416.43
  CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz): Min 2047 / Avg 3332.19 / Max 4104
Timed Gem5 Compilation 23.0.1, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 80.1 / Avg 87.5 / Max 97.8
  System Power Consumption (Watts, Fewer Is Better): Min 178.1 / Avg 252.6 / Max 366.7
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
RocksDB 8.0, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 85.0 / Avg 98.3 / Max 101.2
  System Power Consumption (Watts, Fewer Is Better): Min 178.0 / Avg 319.3 / Max 382.4
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): Min 2729 / Avg 3179 / Max 3411
Apache Cassandra 4.1.3, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 78.5 / Avg 89.2 / Max 92.7
  System Power Consumption (Watts, Fewer Is Better): Min 183.3 / Avg 287.6 / Max 335.2
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
Stress-NG 0.16.04, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 86.5 / Avg 91.8 / Max 93.5
  System Power Consumption (Watts, Fewer Is Better): Min 193.8 / Avg 299.1 / Max 319.5
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
Stress-NG 0.16.04, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 83.2 / Avg 88.8 / Max 92.9
  System Power Consumption (Watts, Fewer Is Better): Min 188.5 / Avg 282.8 / Max 319.4
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
RawTherapee, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 82.8 / Avg 83.8 / Max 85.2
  System Power Consumption (Watts, Fewer Is Better): Min 188.6 / Avg 208.0 / Max 251.3
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
DuckDB 0.9.1, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 84.5 / Avg 84.8 / Max 86.6
  System Power Consumption (Watts, Fewer Is Better): Min 183.3 / Avg 207.4 / Max 351.0
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
DuckDB 0.9.1, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 79.0 / Avg 83.1 / Max 94.0
  System Power Consumption (Watts, Fewer Is Better): Min 183.3 / Avg 212.6 / Max 298.5
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
Graph500 3.0, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 89.8 / Avg 97.8 / Max 100.1
  System Power Consumption (Watts, Fewer Is Better): Min 272.3 / Avg 321.3 / Max 377.1
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): Min 2729 / Avg 3246 / Max 3411
ASKAP 1.0, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 85.6 / Avg 92.0 / Max 95.1
  System Power Consumption (Watts, Fewer Is Better): Min 188.6 / Avg 287.3 / Max 335.1
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
OpenSSL 3.1, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 89.0 / Avg 97.8 / Max 99.8
  System Power Consumption (Watts, Fewer Is Better): Min 188.8 / Avg 307.5 / Max 345.7
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): Min 2729 / Avg 3344 / Max 3411
Primesieve 8.0, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 85.2 / Avg 96.6 / Max 99.6
  System Power Consumption (Watts, Fewer Is Better): Min 337.8 / Avg 371.2 / Max 387.2
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): Min 2729 / Avg 3155 / Max 3411
Timed Node.js Compilation 19.8.1, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 87.0 / Avg 92.8 / Max 97.8
  System Power Consumption (Watts, Fewer Is Better): Min 185.9 / Avg 300.4 / Max 358.7
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
Timed LLVM Compilation 16.0, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 86.9 / Avg 95.0 / Max 99.5
  System Power Consumption (Watts, Fewer Is Better): Min 183.3 / Avg 296.8 / Max 358.9
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): Min 2729 / Avg 3349 / Max 3411
Timed Linux Kernel Compilation 6.1, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 84.2 / Avg 97.0 / Max 100.1
  System Power Consumption (Watts, Fewer Is Better): Min 183.5 / Avg 302.8 / Max 371.9
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): Min 2729 / Avg 3254 / Max 3411
Timed Godot Game Engine Compilation 4.0, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 83.7 / Avg 91.2 / Max 99.3
  System Power Consumption (Watts, Fewer Is Better): Min 188.5 / Avg 260.8 / Max 379.7
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
ACES DGEMM 1.0, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 85.8 / Avg 92.3 / Max 95.8
  System Power Consumption (Watts, Fewer Is Better): Min 185.9 / Avg 277.8 / Max 345.7
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
easyWave r34, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 81.6 / Avg 85.6 / Max 88.0
  System Power Consumption (Watts, Fewer Is Better): Min 186.0 / Avg 272.7 / Max 361.5
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
easyWave r34, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 83.6 / Avg 85.9 / Max 87.5
  System Power Consumption (Watts, Fewer Is Better): Min 191.2 / Avg 250.4 / Max 348.8
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
GraphicsMagick 1.3.38, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 87.9 / Avg 94.2 / Max 97.3
  System Power Consumption (Watts, Fewer Is Better): Min 185.9 / Avg 296.1 / Max 322.4
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
GraphicsMagick 1.3.38, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 83.9 / Avg 93.9 / Max 99.3
  System Power Consumption (Watts, Fewer Is Better): Min 185.9 / Avg 311.9 / Max 340.5
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): Min 2729 / Avg 3381 / Max 3411
Xmrig 6.18.1, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 87.1 / Avg 91.7 / Max 93.2
  System Power Consumption (Watts, Fewer Is Better): Min 298.5 / Avg 310.4 / Max 314.3
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
Xmrig 6.18.1, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 84.8 / Avg 90.4 / Max 93.0
  System Power Consumption (Watts, Fewer Is Better): Min 188.6 / Avg 296.1 / Max 316.9
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
LULESH 2.0.3, GPTshop GH200 monitors:
  System Power Consumption (Watts, Fewer Is Better): Min 188.5 / Avg 250.2 / Max 308.4
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
Xcompact3d Incompact3d 2021-03-11, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 92.0 / Avg 94.7 / Max 98.5
  System Power Consumption (Watts, Fewer Is Better): Min 191.2 / Avg 283.9 / Max 358.9
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
Xcompact3d Incompact3d 2021-03-11, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 82.5 / Avg 96.2 / Max 101.8
  System Power Consumption (Watts, Fewer Is Better): Min 188.6 / Avg 323.5 / Max 395.2
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): Min 2729 / Avg 3268 / Max 3411
Rodinia 3.1, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 88.9 / Avg 93.9 / Max 100.1
  System Power Consumption (Watts, Fewer Is Better): Min 186.0 / Avg 267.7 / Max 379.8
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): Min 2729 / Avg 3119 / Max 3411
CloverLeaf 1.3, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 91.9 / Avg 94.6 / Max 96.6
  System Power Consumption (Watts, Fewer Is Better): Min 191.2 / Avg 268.3 / Max 332.7
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
CloverLeaf 1.3, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 88.5 / Avg 96.9 / Max 98.8
  System Power Consumption (Watts, Fewer Is Better): Min 185.9 / Avg 315.8 / Max 337.8
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): Min 2729 / Avg 3346 / Max 3411
NAS Parallel Benchmarks 3.4, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 88.8 / Avg 91.3 / Max 97.1
  System Power Consumption (Watts, Fewer Is Better): Min 185.9 / Avg 237.6 / Max 381.2
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
NAS Parallel Benchmarks 3.4, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 87.5 / Avg 90.4 / Max 92.8
  System Power Consumption (Watts, Fewer Is Better): Min 184.6 / Avg 277.7 / Max 351.4
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
NAS Parallel Benchmarks 3.4, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 89.5 / Avg 92.2 / Max 98.0
  System Power Consumption (Watts, Fewer Is Better): Min 185.9 / Avg 287.4 / Max 378.3
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
NAS Parallel Benchmarks 3.4, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 91.6 / Avg 95.4 / Max 99.9
  System Power Consumption (Watts, Fewer Is Better): Min 189.8 / Avg 265.5 / Max 392.9
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): 3411
High Performance Conjugate Gradient 3.1, GPTshop GH200 monitors:
  System Temperature (Celsius, Fewer Is Better): Min 71.9 / Avg 96.0 / Max 99.3
  System Power Consumption (Watts, Fewer Is Better): Min 189.8 / Avg 324.5 / Max 414.7
  CPU Peak Freq, Highest CPU Core Frequency (Megahertz, More Is Better): Min 2729 / Avg 3345 / Max 4104
RocksDB 8.0 - Test: Random Read (Op/s Per Watt, More Is Better)
  Ampere Altra Max 128c: 2282816.53
  EPYC 9754: 2068476.16
  EPYC 9654: 1599837.63
  EPYC 9684X: 1587869.93
  EPYC 9554: 1450487.41
  Xeon Platinum 8490H: 808152.35
Apache Cassandra 4.1.3 - Test: Writes (Op/s Per Watt, More Is Better)
  Ampere Altra Max 128c: 2068.00
  EPYC 9554: 1700.28
  EPYC 9754: 1432.67
  EPYC 9654: 1349.72
  EPYC 9684X: 1282.09
  Xeon Platinum 8490H: 557.85
Stress-NG 0.16.04 - Test: Memory Copying (Bogo Ops/s Per Watt, More Is Better)
  Ampere Altra Max 128c: 166.58
  EPYC 9754: 123.67
  EPYC 9684X: 97.03
  EPYC 9654: 95.60
  EPYC 9554: 78.72
  Xeon Platinum 8490H: 59.82
Stress-NG 0.16.04 - Test: NUMA (Bogo Ops/s Per Watt, More Is Better)
  Ampere Altra Max 128c: 20.369
  EPYC 9754: 4.951
  EPYC 9684X: 4.106
  EPYC 9654: 4.088
  Xeon Platinum 8490H: 3.603
  EPYC 9554: 3.450
Graph500 3.0 - Scale: 26 (sssp max_TEPS Per Watt, More Is Better)
  Ampere Altra Max 128c: 1918278.63
  EPYC 9554: 1818554.40
  EPYC 9754: 1711121.29
  EPYC 9684X: 1663398.87
  EPYC 9654: 1644954.69
  Xeon Platinum 8490H: 1312538.58
ASKAP 1.0 - Test: tConvolve MPI - Gridding (Mpix/sec Per Watt, More Is Better)
  EPYC 9684X: 275.76
  EPYC 9654: 196.62
  EPYC 9554: 191.48
  EPYC 9754: 129.89
  Xeon Platinum 8490H: 71.04
  Ampere Altra Max 128c: 17.51
OpenSSL 3.1 - Algorithm: SHA512 (byte/s Per Watt, More Is Better)
  Ampere Altra Max 128c: 192898458.48
  EPYC 9754: 160226634.86
  EPYC 9654: 133650046.77
  EPYC 9684X: 131019550.91
  EPYC 9554: 112644110.72
  Xeon Platinum 8490H: 65707856.51
ACES DGEMM 1.0 - Sustained Floating-Point Rate (GFLOP/s Per Watt, More Is Better)
  EPYC 9754: 0.223
  EPYC 9654: 0.220
  EPYC 9684X: 0.219
  EPYC 9554: 0.171
  Ampere Altra Max 128c: 0.136
  Xeon Platinum 8490H: 0.091
GraphicsMagick 1.3.38, Operation: Enhanced (Iterations Per Minute Per Watt, More Is Better)
  Ampere Altra Max 128c: 9.833
  EPYC 9754: 6.328
  EPYC 9654: 5.323
  EPYC 9684X: 5.193
  EPYC 9554: 4.871
  Xeon Platinum 8490H: 3.405
GraphicsMagick 1.3.38, Operation: Sharpen (Iterations Per Minute Per Watt, More Is Better)
  Ampere Altra Max 128c: 8.016
  EPYC 9754: 3.752
  EPYC 9654: 3.127
  EPYC 9684X: 3.042
  EPYC 9554: 2.691
  Xeon Platinum 8490H: 2.081
Xmrig 6.18.1, Variant: Wownero - Hash Count: 1M (H/s Per Watt, More Is Better)
  EPYC 9754: 284.00
  EPYC 9654: 270.21
  EPYC 9684X: 265.76
  EPYC 9554: 212.04
  Xeon Platinum 8490H: 114.36
  Ampere Altra Max 128c: 18.61
Xmrig 6.18.1, Variant: Monero - Hash Count: 1M (H/s Per Watt, More Is Better)
  EPYC 9684X: 245.11
  EPYC 9654: 208.70
  EPYC 9554: 166.56
  EPYC 9754: 124.40
  Xeon Platinum 8490H: 86.36
  Ampere Altra Max 128c: 36.63
LULESH 2.0.3 (z/s Per Watt, More Is Better)
  EPYC 9554: 148.93
  Ampere Altra Max 128c: 144.69
  EPYC 9684X: 139.86
  EPYC 9654: 136.97
  EPYC 9754: 124.36
  Xeon Platinum 8490H: 106.69
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit Per Watt, More Is Better)
  EPYC 9554: 12260061.49
  EPYC 9754: 11292392.93
  EPYC 9684X: 10140759.12
  EPYC 9654: 9775985.20
  Ampere Altra Max 128c: 7481610.11
  Xeon Platinum 8490H: 5406835.94
miniFE 2.2, Problem Size: Small (CG Mflops Per Watt, More Is Better)
  EPYC 9754: 416.05
  EPYC 9554: 396.88
  EPYC 9684X: 351.89
  EPYC 9654: 340.61
  Ampere Altra Max 128c: 248.68
  Xeon Platinum 8490H: 154.24
NAS Parallel Benchmarks 3.4, Test / Class: MG.C (Total Mop/s Per Watt, More Is Better)
  EPYC 9554: 1439.34
  EPYC 9754: 1333.53
  EPYC 9684X: 1291.84
  EPYC 9654: 1167.53
  Xeon Platinum 8490H: 549.13
  Ampere Altra Max 128c: 407.26
NAS Parallel Benchmarks 3.4, Test / Class: IS.D (Total Mop/s Per Watt, More Is Better)
  EPYC 9754: 39.264
  EPYC 9684X: 36.170
  EPYC 9554: 32.759
  EPYC 9654: 30.167
  Xeon Platinum 8490H: 11.604
  Ampere Altra Max 128c: 8.928
NAS Parallel Benchmarks 3.4, Test / Class: FT.C (Total Mop/s Per Watt, More Is Better)
  EPYC 9754: 1081.10
  EPYC 9554: 904.17
  EPYC 9654: 837.59
  EPYC 9684X: 812.37
  Xeon Platinum 8490H: 306.03
  Ampere Altra Max 128c: 244.94
NAS Parallel Benchmarks 3.4, Test / Class: CG.C (Total Mop/s Per Watt, More Is Better)
  EPYC 9684X: 437.49
  EPYC 9654: 380.13
  EPYC 9554: 332.67
  EPYC 9754: 297.51
  Xeon Platinum 8490H: 162.31
  Ampere Altra Max 128c: 116.77
High Performance Conjugate Gradient 3.1, X Y Z: 144 144 144 - RT: 60 (GFLOP/s Per Watt, More Is Better)
  Ampere Altra Max 128c: 0.132
  EPYC 9654: 0.117
  EPYC 9754: 0.109
  EPYC 9554: 0.101
  Xeon Platinum 8490H: 0.091
  EPYC 9684X: 0.087
Phoronix Test Suite v10.8.5