NVIDIA GH200 Benchmarks Smoke Run Comparison
Benchmarks by Michael Larabel for a future article: initial smoke-run benchmarks comparing NVIDIA GH200 CPU performance with other server CPUs. More interesting benchmarks to come.
HTML result view exported from: https://openbenchmarking.org/result/2401256-NE-NVIDIAGH254&rdt&grs .
System table fields: Processor, Motherboard, Chipset, Memory, Disk, Graphics, Network, Monitor, OS, Kernel, Desktop, Display Server, Compiler, File-System, Screen Resolution
Values listed in the export for each system, in export order:
- Xeon Platinum 8490H: Intel Xeon Platinum 8490H @ 3.50GHz (60 Cores / 120 Threads); Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS); Intel Device 1bce; 512GB; 3201GB Micron_7450_MTFDKCC3T2TFS; ASPEED; Ubuntu 23.10; 6.6.0-rc5-phx-patched (x86_64); GNOME Shell 45.0; X Server 1.21.1.7; GCC 13.2.0; ext4; 1920x1200
- EPYC 9684X: AMD EPYC 9684X 96-Core @ 2.55GHz (96 Cores / 192 Threads); AMD Titanite_4G (RTI1007B BIOS); AMD Device 14a4; 768GB; Broadcom NetXtreme BCM5720 PCIe
- EPYC 9754: AMD EPYC 9754 128-Core @ 2.25GHz (128 Cores / 256 Threads)
- EPYC 9554: AMD EPYC 9554 64-Core @ 3.10GHz (64 Cores / 128 Threads)
- EPYC 9654: AMD EPYC 9654 96-Core @ 2.40GHz (96 Cores / 192 Threads)
- Ampere Altra Max 128c: ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores); GIGABYTE MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS); Ampere Computing LLC Altra PCI Root Complex A; 512GB; VGA HDMI; 2 x Intel I350; 6.6.0-rc5-phx-patched (aarch64); 1920x1080
- GPTshop GH200: ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores); Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS); 1 x 480 GB DRAM-6400MT/s; 960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9; 2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE; 6.5.0-15-generic (aarch64); 1920x1200

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- Xeon Platinum 8490H, EPYC 9684X, EPYC 9754, EPYC 9554, EPYC 9654 (identical for all five): --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Ampere Altra Max 128c, GPTshop GH200 (identical for both): --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Processor Details
- Xeon Platinum 8490H: Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0x2b0004b1
- EPYC 9684X: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
- EPYC 9754: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xaa00116
- EPYC 9554: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
- EPYC 9654: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
- Ampere Altra Max 128c: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
- GPTshop GH200: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)

Java Details
- Xeon Platinum 8490H, EPYC 9684X, EPYC 9754, EPYC 9554, EPYC 9654, Ampere Altra Max 128c: OpenJDK Runtime Environment (build 11.0.20+8-post-Ubuntu-1ubuntu1)
- GPTshop GH200: OpenJDK Runtime Environment (build 11.0.21+9-post-Ubuntu-0ubuntu123.10)

Python Details
- Python 3.11.6

Security Details
- Xeon Platinum 8490H: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
- EPYC 9684X, EPYC 9754, EPYC 9554, EPYC 9654 (identical for all four): gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
- Ampere Altra Max 128c: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
- GPTshop GH200: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected

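The kernel, processor, and security details above correspond to values that Linux exposes through sysfs. The following is a minimal sketch of reading them on a comparable system, not a description of how the export itself was generated. The helper names (`selected_option`, `read_text`, `collect_details`) are hypothetical; the sysfs paths are the standard kernel locations, and any of them may be absent on a given machine.

```python
from pathlib import Path


def selected_option(text):
    """Return the bracketed choice from a value such as 'always [madvise] never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    # No bracketed token: the file holds a single plain value.
    return text.strip()


def read_text(path):
    """Read a sysfs file, returning None when it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def collect_details(root="/"):
    """Gather THP mode, cpu0 scaling governor, and vulnerability status lines."""
    root = Path(root)
    details = {}

    thp = read_text(root / "sys/kernel/mm/transparent_hugepage/enabled")
    if thp is not None:
        details["Transparent Huge Pages"] = selected_option(thp)

    governor = read_text(root / "sys/devices/system/cpu/cpu0/cpufreq/scaling_governor")
    if governor is not None:
        details["Scaling Governor"] = governor

    vuln_dir = root / "sys/devices/system/cpu/vulnerabilities"
    if vuln_dir.is_dir():
        for entry in sorted(vuln_dir.iterdir()):
            value = read_text(entry)
            if value is not None:
                details[entry.name] = value  # e.g. spectre_v2, retbleed
    return details


if __name__ == "__main__":
    for key, value in collect_details().items():
        print(f"{key}: {value}")
```

The `root` parameter exists only so the function can be pointed at a copied sysfs tree; on a live system the default is sufficient.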
Test | Xeon Platinum 8490H | EPYC 9684X | EPYC 9754 | EPYC 9554 | EPYC 9654 | Ampere Altra Max 128c | GPTshop GH200
stress-ng: NUMA | 1144.04 | 1173.07 | 1397.61 | 967.56 | 1237.36 | 1454.68 | 6279.41
easywave: e2Asean Grid + BengkuluSept2007 Source - 2400 | 118.554 | 60.252 | 72.355 | 56.883 | 59.535 | 326.368 | 98.719
easywave: e2Asean Grid + BengkuluSept2007 Source - 1200 | 45.850 | 24.579 | 29.588 | 22.452 | 23.507 | 122.550 | 36.562
npb: IS.D | 2885.71 | 6138.54 | 5773.25 | 5184.36 | 5195.24 | 1196.33 | 1770.65
xmrig: Wownero - 1M | 35870.2 | 73567.3 | 77607.2 | 60032.4 | 75127.4 | 1909.2 | 21993.3
npb: FT.C | 70989.97 | 124391.00 | 141255.07 | 124827.75 | 125164.03 | 35153.27 | 47948.15
npb: CG.C | 35560.25 | 59557.08 | 44431.58 | 45867.44 | 54465.04 | 17272.73 | 24350.32
askap: tConvolve MT - Gridding | 8112.07 | 14731.5 | 12660.8 | 10327.5 | 12332.4 | 4295.72 | 9561.31
incompact3d: input.i3d 193 Cells Per Direction | 12.7983420 | 7.30814419 | 9.02806168 | 9.74551621 | 8.45212936 | 23.8852240 | 9.83151112
npb: MG.C | 88049.90 | 137608.68 | 127404.16 | 125921.29 | 118608.07 | 42291.84 | 57191.85
askap: tConvolve MPI - Gridding | 22365.2 | 67965.5 | 27269.2 | 37265.9 | 47003.2 | 2416.54 | 9678.85
cassandra: Writes | 139654 | 286460 | 236320 | 301963 | 285324 | 227548 | 371439
mt-dgemm: Sustained Floating-Point Rate | 23.257388 | 40.320062 | 43.681923 | 30.174904 | 39.282158 | 18.114994 | 17.563016
primesieve: 1e13 | 52.375 | 25.969 | 21.756 | 28.136 | 24.164 | 41.675 | 36.049
openssl: SHA512 | 22445820447 | 40105639580 | 53648078203 | 33622028530 | 40459323610 | 34434960797 | 45292038457
incompact3d: X3D-benchmarking input.i3d | 353.557595 | 440.548584 | 493.502452 | 515.792758 | 430.080329 | 606.521708 | 257.342539
build-godot: Time To Compile | 128.467 | 98.555 | 118.247 | 103.500 | 101.563 | 227.783 | 138.891
rocksdb: Rand Read | 268374486 | 486862785 | 611259161 | 376983415 | 429310263 | 435437276 | 429416658
build-nodejs: Time To Compile | 151.567 | 119.203 | 130.460 | 121.928 | 120.156 | 268.704 | 173.834
amg: | 1611826000 | 2374853333 | 2291049667 | 2321156000 | 2296730667 | 1060656000 | 1993579333
minife: Small | 35452.5 | 52937.6 | 50393.7 | 50351.4 | 50320.2 | 24122.0 | 46477.3
build-llvm: Ninja | 169.528 | 131.990 | 146.386 | 131.203 | 133.070 | 266.543 | 195.598
xmrig: Monero - 1M | 27608.5 | 69205.6 | 29356.1 | 47926.4 | 59047.7 | 4322.1 | 17290.3
askap: tConvolve MPI - Degridding | 19616.0 | 58310.1 | 26593.4 | 31177.2 | 41081.6 | 5285.36 | 12429.5
hpcg: 144 144 144 - 60 | 31.2432 | 23.7784 | 25.8918 | 22.4511 | 32.1283 | 21.2399 | 41.5526
graphics-magick: Sharpen | 691 | 775 | 924 | 669 | 809 | 1274 | 1244
stress-ng: Memory Copying | 18809.55 | 26495.63 | 34497.51 | 22827.20 | 28310.82 | 27163.45 | 27302.27
compress-7zip: Compression Rating | 389690 | 554411 | 586571 | 489736 | 564950 | 328378 | 341300
graph500: 26 | 333129000 | 383754000 | 377033000 | 368016000 | 391167000 | 224032000 | 298983000
rodinia: OpenMP LavaMD | 42.713 | 30.317 | 25.150 | 34.014 | 30.240 | 31.791 | 30.959
duckdb: TPC-H Parquet | 148.099 | 147.489 | 177.134 | 143.306 | 146.276 | 237.161 | 156.987
duckdb: IMDB | 99.249 | 116.050 | 147.601 | 103.720 | 115.805 | 143.781 | 91.633
graph500: 26 | 1113120000 | 882181000 | 1184510000 | 865152000 | 839917000 | 987770000 | 1313880000
lulesh: | 23997.715 | 24712.884 | 22356.746 | 24037.442 | 23701.442 | 16218.331 | 23433.373
graph500: 26 | 454483000 | 501222000 | 502151000 | 467453000 | 499347000 | 333239000 | 463087000
graph500: 26 | 1073860000 | 854029000 | 1147090000 | 836159000 | 816172000 | 980463000 | 1214290000
build-gem5: Time To Compile | 197.759 | 180.061 | 208.576 | 179.743 | 180.857 | 260.326 | 184.661
rawtherapee: Total Benchmark Time | 46.578 | 52.991 | 66.132 | 49.943 | 52.051 | 66.045 | 46.546
graphics-magick: Enhanced | 1124 | 1277 | 1451 | 1142 | 1319 | 1230 | 1584
build-linux-kernel: allmodconfig | 289.966 | 283.865 | 319.426 | 238.289 | 284.249 | 307.878 | 283.339
cloverleaf: clover_bm64_short | 38.93 | 51.31 | 49.03 | 45.82 | 47.98 | 62.77 | 28.36
cloverleaf: clover_bm16 | 324.17 | 298.29 | 459.16 | 460.24 | 429.46 | 550.30 | 247.28

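Because the overview mixes "More Is Better" and "Fewer Is Better" tests, raw values cannot be compared across rows without normalizing direction. The sketch below shows one way to do that for four rows, using the Xeon Platinum 8490H and GPTshop GH200 values from the overview. The choice of rows, the Xeon baseline, and the names `speedup` and `SAMPLE` are illustrative assumptions, so the geometric mean it prints is not a summary of the full run.

```python
from statistics import geometric_mean


def speedup(baseline, candidate, higher_is_better):
    """Ratio > 1 means the candidate beats the baseline, whatever the unit."""
    if higher_is_better:
        return candidate / baseline
    return baseline / candidate


# (test, higher_is_better, Xeon Platinum 8490H, GPTshop GH200), values from the overview.
SAMPLE = [
    ("stress-ng: NUMA (Bogo Ops/s)", True, 1144.04, 6279.41),
    ("easywave 2400 (Seconds)", False, 118.554, 98.719),
    ("npb: IS.D (Total Mop/s)", True, 2885.71, 1770.65),
    ("hpcg: 144 144 144 - 60 (GFLOP/s)", True, 31.2432, 41.5526),
]

if __name__ == "__main__":
    ratios = []
    for name, higher_is_better, xeon, gh200 in SAMPLE:
        ratio = speedup(xeon, gh200, higher_is_better)
        ratios.append(ratio)
        print(f"{name}: GH200 at {ratio:.2f}x the Xeon result")
    # Geometric mean is the usual way to average ratios; here it covers 4 rows only.
    print(f"geometric mean over these rows: {geometric_mean(ratios):.2f}x")
```

A ratio below 1, as for npb: IS.D, indicates that the GH200 trails the baseline on that test.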
Stress-NG 0.16.04 - Test: NUMA
Bogo Ops/s, More Is Better
Xeon Platinum 8490H: 1144.04 (SE +/- 1.16, N = 3)
EPYC 9684X: 1173.07 (SE +/- 1.38, N = 3)
EPYC 9754: 1397.61 (SE +/- 2.90, N = 3)
EPYC 9554: 967.56 (SE +/- 4.03, N = 3)
EPYC 9654: 1237.36 (SE +/- 1.10, N = 3)
Ampere Altra Max 128c: 1454.68 (SE +/- 2.51, N = 3)
GPTshop GH200: 6279.41 (SE +/- 10.32, N = 3)
Per-system options listed in the export: -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz (5 times); -O2 -std=gnu99 (2 times)
1. (CXX) g++ options: -lc

easyWave r34 - Input: e2Asean Grid + BengkuluSept2007 Source - Time: 2400
Seconds, Fewer Is Better
Xeon Platinum 8490H: 118.55 (SE +/- 0.36, N = 3)
EPYC 9684X: 60.25 (SE +/- 0.14, N = 3)
EPYC 9754: 72.36 (SE +/- 0.45, N = 3)
EPYC 9554: 56.88 (SE +/- 0.33, N = 3)
EPYC 9654: 59.54 (SE +/- 0.40, N = 3)
Ampere Altra Max 128c: 326.37 (SE +/- 1.24, N = 3)
GPTshop GH200: 98.72 (SE +/- 0.34, N = 3)
1. (CXX) g++ options: -O3 -fopenmp

easyWave r34 - Input: e2Asean Grid + BengkuluSept2007 Source - Time: 1200
Seconds, Fewer Is Better
Xeon Platinum 8490H: 45.85 (SE +/- 0.04, N = 3)
EPYC 9684X: 24.58 (SE +/- 0.32, N = 3)
EPYC 9754: 29.59 (SE +/- 0.20, N = 15)
EPYC 9554: 22.45 (SE +/- 0.06, N = 3)
EPYC 9654: 23.51 (SE +/- 0.14, N = 3)
Ampere Altra Max 128c: 122.55 (SE +/- 1.12, N = 3)
GPTshop GH200: 36.56 (SE +/- 0.11, N = 3)
1. (CXX) g++ options: -O3 -fopenmp

NAS Parallel Benchmarks 3.4 - Test / Class: IS.D
Total Mop/s, More Is Better
Xeon Platinum 8490H: 2885.71 (SE +/- 6.63, N = 4)
EPYC 9684X: 6138.54 (SE +/- 65.80, N = 5)
EPYC 9754: 5773.25 (SE +/- 18.26, N = 5)
EPYC 9554: 5184.36 (SE +/- 14.93, N = 5)
EPYC 9654: 5195.24 (SE +/- 13.89, N = 5)
Ampere Altra Max 128c: 1196.33 (SE +/- 0.34, N = 3)
GPTshop GH200: 1770.65 (SE +/- 0.92, N = 3)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

Xmrig 6.18.1 - Variant: Wownero - Hash Count: 1M
H/s, More Is Better
Xeon Platinum 8490H: 35870.2 (SE +/- 13.06, N = 3)
EPYC 9684X: 73567.3 (SE +/- 35.19, N = 4)
EPYC 9754: 77607.2 (SE +/- 604.78, N = 4)
EPYC 9554: 60032.4 (SE +/- 18.19, N = 3)
EPYC 9654: 75127.4 (SE +/- 60.06, N = 4)
Ampere Altra Max 128c: 1909.2 (SE +/- 4.67, N = 3)
GPTshop GH200: 21993.3 (SE +/- 2.75, N = 3)
Per-system options listed in the export: -maes (5 times)
1. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

NAS Parallel Benchmarks 3.4 - Test / Class: FT.C
Total Mop/s, More Is Better
Xeon Platinum 8490H: 70989.97 (SE +/- 445.75, N = 6)
EPYC 9684X: 124391.00 (SE +/- 686.49, N = 8)
EPYC 9754: 141255.07 (SE +/- 1190.42, N = 8)
EPYC 9554: 124827.75 (SE +/- 1162.69, N = 15)
EPYC 9654: 125164.03 (SE +/- 698.81, N = 8)
Ampere Altra Max 128c: 35153.27 (SE +/- 41.38, N = 4)
GPTshop GH200: 47948.15 (SE +/- 163.34, N = 5)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

NAS Parallel Benchmarks 3.4 - Test / Class: CG.C
Total Mop/s, More Is Better
Xeon Platinum 8490H: 35560.25 (SE +/- 268.67, N = 15)
EPYC 9684X: 59557.08 (SE +/- 469.70, N = 9)
EPYC 9754: 44431.58 (SE +/- 391.92, N = 15)
EPYC 9554: 45867.44 (SE +/- 363.39, N = 8)
EPYC 9654: 54465.04 (SE +/- 537.84, N = 15)
Ampere Altra Max 128c: 17272.73 (SE +/- 15.14, N = 5)
GPTshop GH200: 24350.32 (SE +/- 58.76, N = 6)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

ASKAP 1.0 - Test: tConvolve MT - Gridding
Million Grid Points Per Second, More Is Better
Xeon Platinum 8490H: 8112.07 (SE +/- 4.50, N = 3)
EPYC 9684X: 14731.50 (SE +/- 240.55, N = 12)
EPYC 9754: 12660.80 (SE +/- 9.06, N = 3)
EPYC 9554: 10327.50 (SE +/- 9.54, N = 3)
EPYC 9654: 12332.40 (SE +/- 32.54, N = 3)
Ampere Altra Max 128c: 4295.72 (SE +/- 0.36, N = 3)
GPTshop GH200: 9561.31 (SE +/- 2.75, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 193 Cells Per Direction
Seconds, Fewer Is Better
Xeon Platinum 8490H: 12.79834200 (SE +/- 0.04713862, N = 4)
EPYC 9684X: 7.30814419 (SE +/- 0.03937117, N = 5)
EPYC 9754: 9.02806168 (SE +/- 0.04030589, N = 5)
EPYC 9554: 9.74551621 (SE +/- 0.03040517, N = 5)
EPYC 9654: 8.45212936 (SE +/- 0.07670149, N = 5)
Ampere Altra Max 128c: 23.88522400 (SE +/- 0.05179277, N = 3)
GPTshop GH200: 9.83151112 (SE +/- 0.03425295, N = 5)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

NAS Parallel Benchmarks 3.4 - Test / Class: MG.C
Total Mop/s, More Is Better
Xeon Platinum 8490H: 88049.90 (SE +/- 405.89, N = 9)
EPYC 9684X: 137608.68 (SE +/- 1538.33, N = 15)
EPYC 9754: 127404.16 (SE +/- 667.76, N = 9)
EPYC 9554: 125921.29 (SE +/- 1712.13, N = 15)
EPYC 9654: 118608.07 (SE +/- 1159.21, N = 15)
Ampere Altra Max 128c: 42291.84 (SE +/- 10.27, N = 7)
GPTshop GH200: 57191.85 (SE +/- 113.65, N = 8)
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

ASKAP 1.0 - Test: tConvolve MPI - Gridding
Mpix/sec, More Is Better
Xeon Platinum 8490H: 22365.20 (SE +/- 146.74, N = 3)
EPYC 9684X: 67965.50 (SE +/- 485.47, N = 3)
EPYC 9754: 27269.20 (SE +/- 655.18, N = 14)
EPYC 9554: 37265.90 (SE +/- 219.20, N = 3)
EPYC 9654: 47003.20 (SE +/- 405.08, N = 3)
Ampere Altra Max 128c: 2416.54 (SE +/- 0.46, N = 3)
GPTshop GH200: 9678.85 (SE +/- 39.51, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Apache Cassandra 4.1.3 - Test: Writes
Op/s, More Is Better
Xeon Platinum 8490H: 139654 (SE +/- 883.78, N = 3)
EPYC 9684X: 286460 (SE +/- 1195.73, N = 3)
EPYC 9754: 236320 (SE +/- 1223.06, N = 3)
EPYC 9554: 301963 (SE +/- 554.18, N = 3)
EPYC 9654: 285324 (SE +/- 162.88, N = 3)
Ampere Altra Max 128c: 227548 (SE +/- 2776.53, N = 3)
GPTshop GH200: 371439 (SE +/- 2382.80, N = 3)

ACES DGEMM 1.0 - Sustained Floating-Point Rate
GFLOP/s, More Is Better
Xeon Platinum 8490H: 23.26 (SE +/- 0.13, N = 5)
EPYC 9684X: 40.32 (SE +/- 0.12, N = 7)
EPYC 9754: 43.68 (SE +/- 0.13, N = 7)
EPYC 9554: 30.17 (SE +/- 0.28, N = 6)
EPYC 9654: 39.28 (SE +/- 0.18, N = 7)
Ampere Altra Max 128c: 18.11 (SE +/- 0.09, N = 4)
GPTshop GH200: 17.56 (SE +/- 0.19, N = 15)
1. (CC) gcc options: -O3 -march=native -fopenmp

Primesieve 8.0 - Length: 1e13
Seconds, Fewer Is Better
Xeon Platinum 8490H: 52.38 (SE +/- 0.07, N = 3)
EPYC 9684X: 25.97 (SE +/- 0.02, N = 3)
EPYC 9754: 21.76 (SE +/- 0.01, N = 3)
EPYC 9554: 28.14 (SE +/- 0.03, N = 3)
EPYC 9654: 24.16 (SE +/- 0.01, N = 3)
Ampere Altra Max 128c: 41.68 (SE +/- 0.07, N = 3)
GPTshop GH200: 36.05 (SE +/- 0.51, N = 3)
1. (CXX) g++ options: -O3

OpenSSL 3.1 - Algorithm: SHA512
byte/s, More Is Better
Xeon Platinum 8490H: 22445820447 (SE +/- 147259946.50, N = 3)
EPYC 9684X: 40105639580 (SE +/- 6401405.16, N = 3)
EPYC 9754: 53648078203 (SE +/- 1530571.35, N = 3)
EPYC 9554: 33622028530 (SE +/- 4732001.41, N = 3)
EPYC 9654: 40459323610 (SE +/- 16069151.20, N = 3)
Ampere Altra Max 128c: 34434960797 (SE +/- 36142057.88, N = 3)
GPTshop GH200: 45292038457 (SE +/- 90952676.69, N = 3)
Per-system options listed in the export, in export order: -m64 -lssl -lcrypto -m64 -m64 -m64 -m64 -lssl -lcrypto -lssl -lcrypto
1. (CC) gcc options: -pthread -O3 -ldl

Xcompact3d Incompact3d 2021-03-11 - Input: X3D-benchmarking input.i3d
Seconds, Fewer Is Better
Xeon Platinum 8490H: 353.56 (SE +/- 4.12, N = 4)
EPYC 9684X: 440.55 (SE +/- 5.21, N = 9)
EPYC 9754: 493.50 (SE +/- 14.11, N = 3)
EPYC 9554: 515.79 (SE +/- 7.98, N = 9)
EPYC 9654: 430.08 (SE +/- 8.73, N = 9)
Ampere Altra Max 128c: 606.52 (SE +/- 0.07, N = 3)
GPTshop GH200: 257.34 (SE +/- 1.42, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Timed Godot Game Engine Compilation 4.0 - Time To Compile
Seconds, Fewer Is Better
Xeon Platinum 8490H: 128.47 (SE +/- 0.43, N = 3)
EPYC 9684X: 98.56 (SE +/- 0.06, N = 3)
EPYC 9754: 118.25 (SE +/- 0.26, N = 3)
EPYC 9554: 103.50 (SE +/- 0.03, N = 3)
EPYC 9654: 101.56 (SE +/- 0.16, N = 3)
Ampere Altra Max 128c: 227.78 (SE +/- 0.35, N = 3)
GPTshop GH200: 138.89 (SE +/- 0.65, N = 3)

RocksDB 8.0 - Test: Random Read
Op/s, More Is Better
Xeon Platinum 8490H: 268374486 (SE +/- 6039899.37, N = 15)
EPYC 9684X: 486862785 (SE +/- 539077.40, N = 3)
EPYC 9754: 611259161 (SE +/- 817605.37, N = 3)
EPYC 9554: 376983415 (SE +/- 106225.72, N = 3)
EPYC 9654: 429310263 (SE +/- 1006626.36, N = 3)
Ampere Altra Max 128c: 435437276 (SE +/- 3039704.34, N = 15)
GPTshop GH200: 429416658 (SE +/- 3229592.93, N = 11)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Timed Node.js Compilation 19.8.1 - Time To Compile
Seconds, Fewer Is Better
Xeon Platinum 8490H: 151.57 (SE +/- 0.27, N = 3)
EPYC 9684X: 119.20 (SE +/- 0.09, N = 3)
EPYC 9754: 130.46 (SE +/- 0.08, N = 3)
EPYC 9554: 121.93 (SE +/- 0.32, N = 3)
EPYC 9654: 120.16 (SE +/- 0.08, N = 3)
Ampere Altra Max 128c: 268.70 (SE +/- 0.19, N = 3)
GPTshop GH200: 173.83 (SE +/- 0.08, N = 3)

Algebraic Multi-Grid Benchmark 1.2
Figure Of Merit, More Is Better
Xeon Platinum 8490H: 1611826000 (SE +/- 269139.99, N = 3)
EPYC 9684X: 2374853333 (SE +/- 1953221.73, N = 3)
EPYC 9754: 2291049667 (SE +/- 2811300.43, N = 3)
EPYC 9554: 2321156000 (SE +/- 3180941.37, N = 3)
EPYC 9654: 2296730667 (SE +/- 5502047.33, N = 3)
Ampere Altra Max 128c: 1060656000 (SE +/- 31942.66, N = 3)
GPTshop GH200: 1993579333 (SE +/- 22467067.78, N = 3)
1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi

miniFE 2.2 - Problem Size: Small
CG Mflops, More Is Better
Xeon Platinum 8490H: 35452.5 (SE +/- 14.30, N = 4)
EPYC 9684X: 52937.6 (SE +/- 507.76, N = 5)
EPYC 9754: 50393.7 (SE +/- 105.96, N = 5)
EPYC 9554: 50351.4 (SE +/- 51.65, N = 5)
EPYC 9654: 50320.2 (SE +/- 45.75, N = 5)
Ampere Altra Max 128c: 24122.0 (SE +/- 5.88, N = 4)
GPTshop GH200: 46477.3 (SE +/- 54.33, N = 5)
1. (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi

Timed LLVM Compilation 16.0 - Build System: Ninja
Seconds, Fewer Is Better
Xeon Platinum 8490H: 169.53 (SE +/- 0.08, N = 3)
EPYC 9684X: 131.99 (SE +/- 0.07, N = 3)
EPYC 9754: 146.39 (SE +/- 0.28, N = 3)
EPYC 9554: 131.20 (SE +/- 0.11, N = 3)
EPYC 9654: 133.07 (SE +/- 0.08, N = 3)
Ampere Altra Max 128c: 266.54 (SE +/- 0.95, N = 3)
GPTshop GH200: 195.60 (SE +/- 0.88, N = 3)

Xmrig 6.18.1 - Variant: Monero - Hash Count: 1M
H/s, More Is Better
Xeon Platinum 8490H: 27608.5 (SE +/- 7.51, N = 3)
EPYC 9684X: 69205.6 (SE +/- 80.10, N = 4)
EPYC 9754: 29356.1 (SE +/- 2550.91, N = 12)
EPYC 9554: 47926.4 (SE +/- 42.25, N = 3)
EPYC 9654: 59047.7 (SE +/- 142.78, N = 3)
Ampere Altra Max 128c: 4322.1 (SE +/- 16.52, N = 3)
GPTshop GH200: 17290.3 (SE +/- 5.62, N = 3)
Per-system options listed in the export: -maes (5 times)
1. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

ASKAP 1.0 - Test: tConvolve MPI - Degridding
Mpix/sec, More Is Better
Xeon Platinum 8490H: 19616.00 (SE +/- 131.23, N = 3)
EPYC 9684X: 58310.10 (SE +/- 0.00, N = 3)
EPYC 9754: 26593.40 (SE +/- 574.27, N = 15)
EPYC 9554: 31177.20 (SE +/- 153.57, N = 3)
EPYC 9654: 41081.60 (SE +/- 475.58, N = 3)
Ampere Altra Max 128c: 5285.36 (SE +/- 4.44, N = 3)
GPTshop GH200: 12429.50 (SE +/- 37.76, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

High Performance Conjugate Gradient 3.1 - X Y Z: 144 144 144 - RT: 60
GFLOP/s, More Is Better
Xeon Platinum 8490H: 31.24 (SE +/- 0.11, N = 3)
EPYC 9684X: 23.78 (SE +/- 0.28, N = 9)
EPYC 9754: 25.89 (SE +/- 0.27, N = 3)
EPYC 9554: 22.45 (SE +/- 0.04, N = 3)
EPYC 9654: 32.13 (SE +/- 0.35, N = 3)
Ampere Altra Max 128c: 21.24 (SE +/- 0.00, N = 3)
GPTshop GH200: 41.55 (SE +/- 0.36, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi

GraphicsMagick 1.3.38 - Operation: Sharpen
Iterations Per Minute, More Is Better
Xeon Platinum 8490H: 691 (SE +/- 0.33, N = 3)
EPYC 9684X: 775 (SE +/- 0.33, N = 3)
EPYC 9754: 924 (SE +/- 0.67, N = 3)
EPYC 9554: 669 (SE +/- 0.33, N = 3)
EPYC 9654: 809 (SE +/- 0.33, N = 3)
Ampere Altra Max 128c: 1274 (SE +/- 1.20, N = 3)
GPTshop GH200: 1244 (SE +/- 5.13, N = 3)
Per-system options listed in the export: -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 (5 times); -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 (1 time)
1. (CC) gcc options: -fopenmp -O2 -ljpeg -lXext -lSM -lICE -lX11 -lz -lzstd -lm -lpthread

Stress-NG 0.16.04 - Test: Memory Copying (Bogo Ops/s, More Is Better)
  Xeon Platinum 8490H: 18809.55 (SE +/- 67.98, N = 3)
  EPYC 9684X: 26495.63 (SE +/- 12.67, N = 3)
  EPYC 9754: 34497.51 (SE +/- 23.24, N = 3)
  EPYC 9554: 22827.20 (SE +/- 19.64, N = 3)
  EPYC 9654: 28310.82 (SE +/- 22.89, N = 3)
  Ampere Altra Max 128c: 27163.45 (SE +/- 0.31, N = 3)
  GPTshop GH200: 27302.27 (SE +/- 58.68, N = 3)
  Additional per-result flags: -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz (x5); -O2 -std=gnu99 (x2)
  1. (CXX) g++ options: -lc
7-Zip Compression 22.01 - Test: Compression Rating (MIPS, More Is Better)
  Xeon Platinum 8490H: 389690 (SE +/- 629.56, N = 3)
  EPYC 9684X: 554411 (SE +/- 366.95, N = 3)
  EPYC 9754: 586571 (SE +/- 439.85, N = 3)
  EPYC 9554: 489736 (SE +/- 224.17, N = 3)
  EPYC 9654: 564950 (SE +/- 311.94, N = 3)
  Ampere Altra Max 128c: 328378 (SE +/- 1588.70, N = 3)
  GPTshop GH200: 341300 (SE +/- 222.71, N = 3)
  1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
Graph500 3.0 - Scale: 26 (sssp median_TEPS, More Is Better)
  Xeon Platinum 8490H: 333129000
  EPYC 9684X: 383754000
  EPYC 9754: 377033000
  EPYC 9554: 368016000
  EPYC 9654: 391167000
  Ampere Altra Max 128c: 224032000
  GPTshop GH200: 298983000
  1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
Rodinia 3.1 - Test: OpenMP LavaMD (Seconds, Fewer Is Better)
  Xeon Platinum 8490H: 42.71 (SE +/- 0.53, N = 3)
  EPYC 9684X: 30.32 (SE +/- 0.05, N = 3)
  EPYC 9754: 25.15 (SE +/- 0.10, N = 3)
  EPYC 9554: 34.01 (SE +/- 0.05, N = 3)
  EPYC 9654: 30.24 (SE +/- 0.06, N = 3)
  Ampere Altra Max 128c: 31.79 (SE +/- 0.01, N = 3)
  GPTshop GH200: 30.96 (SE +/- 0.07, N = 3)
  1. (CXX) g++ options: -O2 -lOpenCL
DuckDB 0.9.1 - Benchmark: TPC-H Parquet (Seconds, Fewer Is Better)
  Xeon Platinum 8490H: 148.10 (SE +/- 0.30, N = 3)
  EPYC 9684X: 147.49 (SE +/- 0.26, N = 3)
  EPYC 9754: 177.13 (SE +/- 0.64, N = 3)
  EPYC 9554: 143.31 (SE +/- 0.53, N = 3)
  EPYC 9654: 146.28 (SE +/- 0.47, N = 3)
  Ampere Altra Max 128c: 237.16 (SE +/- 4.97, N = 9)
  GPTshop GH200: 156.99 (SE +/- 0.69, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lssl -lcrypto -ldl
DuckDB 0.9.1 - Benchmark: IMDB (Seconds, Fewer Is Better)
  Xeon Platinum 8490H: 99.25 (SE +/- 0.24, N = 3)
  EPYC 9684X: 116.05 (SE +/- 0.26, N = 3)
  EPYC 9754: 147.60 (SE +/- 0.08, N = 3)
  EPYC 9554: 103.72 (SE +/- 0.16, N = 3)
  EPYC 9654: 115.81 (SE +/- 0.18, N = 3)
  Ampere Altra Max 128c: 143.78 (SE +/- 0.18, N = 3)
  GPTshop GH200: 91.63 (SE +/- 0.75, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lssl -lcrypto -ldl
Graph500 3.0 - Scale: 26 (bfs max_TEPS, More Is Better)
  Xeon Platinum 8490H: 1113120000
  EPYC 9684X: 882181000
  EPYC 9754: 1184510000
  EPYC 9554: 865152000
  EPYC 9654: 839917000
  Ampere Altra Max 128c: 987770000
  GPTshop GH200: 1313880000
  1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
LULESH 2.0.3 (z/s, More Is Better)
  Xeon Platinum 8490H: 23997.72 (SE +/- 49.50, N = 5)
  EPYC 9684X: 24712.88 (SE +/- 90.30, N = 3)
  EPYC 9754: 22356.75 (SE +/- 59.94, N = 3)
  EPYC 9554: 24037.44 (SE +/- 113.24, N = 3)
  EPYC 9654: 23701.44 (SE +/- 84.04, N = 3)
  Ampere Altra Max 128c: 16218.33 (SE +/- 24.21, N = 3)
  GPTshop GH200: 23433.37 (SE +/- 40.15, N = 3)
  1. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi
Graph500 3.0 - Scale: 26 (sssp max_TEPS, More Is Better)
  Xeon Platinum 8490H: 454483000
  EPYC 9684X: 501222000
  EPYC 9754: 502151000
  EPYC 9554: 467453000
  EPYC 9654: 499347000
  Ampere Altra Max 128c: 333239000
  GPTshop GH200: 463087000
  1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
Graph500 3.0 - Scale: 26 (bfs median_TEPS, More Is Better)
  Xeon Platinum 8490H: 1073860000
  EPYC 9684X: 854029000
  EPYC 9754: 1147090000
  EPYC 9554: 836159000
  EPYC 9654: 816172000
  Ampere Altra Max 128c: 980463000
  GPTshop GH200: 1214290000
  1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
Timed Gem5 Compilation 23.0.1 - Time To Compile (Seconds, Fewer Is Better)
  Xeon Platinum 8490H: 197.76 (SE +/- 1.53, N = 3)
  EPYC 9684X: 180.06 (SE +/- 1.97, N = 3)
  EPYC 9754: 208.58 (SE +/- 2.85, N = 3)
  EPYC 9554: 179.74 (SE +/- 1.67, N = 3)
  EPYC 9654: 180.86 (SE +/- 1.09, N = 3)
  Ampere Altra Max 128c: 260.33 (SE +/- 3.13, N = 4)
  GPTshop GH200: 184.66 (SE +/- 2.18, N = 12)
RawTherapee - Total Benchmark Time (Seconds, Fewer Is Better)
  Xeon Platinum 8490H: 46.58 (SE +/- 0.05, N = 3)
  EPYC 9684X: 52.99 (SE +/- 0.03, N = 3)
  EPYC 9754: 66.13 (SE +/- 0.22, N = 3)
  EPYC 9554: 49.94 (SE +/- 0.04, N = 3)
  EPYC 9654: 52.05 (SE +/- 0.07, N = 3)
  Ampere Altra Max 128c: 66.05 (SE +/- 0.02, N = 3)
  GPTshop GH200: 46.55 (SE +/- 0.10, N = 3)
  1. RawTherapee, version 5.9, command line.
GraphicsMagick 1.3.38 - Operation: Enhanced (Iterations Per Minute, More Is Better)
  Xeon Platinum 8490H: 1124 (SE +/- 0.33, N = 3)
  EPYC 9684X: 1277 (SE +/- 1.67, N = 3)
  EPYC 9754: 1451 (SE +/- 0.58, N = 3)
  EPYC 9554: 1142 (SE +/- 0.88, N = 3)
  EPYC 9654: 1319 (SE +/- 0.88, N = 3)
  Ampere Altra Max 128c: 1230 (SE +/- 2.73, N = 3)
  GPTshop GH200: 1584 (SE +/- 0.33, N = 3)
  Additional per-result flags: -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 (x5); -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 (x1)
  1. (CC) gcc options: -fopenmp -O2 -ljpeg -lXext -lSM -lICE -lX11 -lz -lzstd -lm -lpthread
Timed Linux Kernel Compilation 6.1 - Build: allmodconfig (Seconds, Fewer Is Better)
  Xeon Platinum 8490H: 289.97 (SE +/- 0.60, N = 3)
  EPYC 9684X: 283.87 (SE +/- 0.50, N = 3)
  EPYC 9754: 319.43 (SE +/- 2.48, N = 3)
  EPYC 9554: 238.29 (SE +/- 0.61, N = 3)
  EPYC 9654: 284.25 (SE +/- 0.41, N = 3)
  Ampere Altra Max 128c: 307.88 (SE +/- 0.75, N = 3)
  GPTshop GH200: 283.34 (SE +/- 1.02, N = 3)
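The charts above mix "More Is Better" and "Fewer Is Better" scales, which makes it awkward to compare systems across tests. As a rough sketch (the helper name `relative_to_baseline` is hypothetical and not part of the Phoronix Test Suite), each result can be normalized to one baseline system so that a value above 1.0 always means faster than the baseline. The numbers are copied from the HPCG and Linux kernel allmodconfig charts in this export.

```python
# Hypothetical helper, not part of the Phoronix Test Suite: normalize results
# to a baseline system so that values above 1.0 always mean "better".

def relative_to_baseline(results, baseline, more_is_better):
    """Return each system's performance relative to the baseline system."""
    base = results[baseline]
    if more_is_better:
        # Throughput-style metric: a higher score is better.
        return {name: value / base for name, value in results.items()}
    # Time-style metric: a lower score is better, so invert the ratio.
    return {name: base / value for name, value in results.items()}


# High Performance Conjugate Gradient 3.1, GFLOP/s, More Is Better.
hpcg = {
    "Xeon Platinum 8490H": 31.24,
    "EPYC 9684X": 23.78,
    "EPYC 9754": 25.89,
    "EPYC 9554": 22.45,
    "EPYC 9654": 32.13,
    "Ampere Altra Max 128c": 21.24,
    "GPTshop GH200": 41.55,
}

# Timed Linux Kernel Compilation 6.1, allmodconfig, Seconds, Fewer Is Better.
kernel = {
    "Xeon Platinum 8490H": 289.97,
    "EPYC 9684X": 283.87,
    "EPYC 9754": 319.43,
    "EPYC 9554": 238.29,
    "EPYC 9654": 284.25,
    "Ampere Altra Max 128c": 307.88,
    "GPTshop GH200": 283.34,
}

hpcg_rel = relative_to_baseline(hpcg, "Xeon Platinum 8490H", more_is_better=True)
kernel_rel = relative_to_baseline(kernel, "Xeon Platinum 8490H", more_is_better=False)

for name in hpcg:
    print(f"{name:24s} HPCG x{hpcg_rel[name]:.2f}  kernel build x{kernel_rel[name]:.2f}")
```

The choice of the Xeon Platinum 8490H as the baseline is arbitrary; any system in the comparison would work equally well.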
System Temperature Monitor (Phoronix Test Suite System Monitoring, Celsius) - GPTshop GH200: Min: 71.9 / Avg: 92.77 / Max: 102.5
System Power Consumption Monitor (Phoronix Test Suite System Monitoring, Watts) - GPTshop GH200: Min: 167.62 / Avg: 281.77 / Max: 416.43
CPU Peak Freq (Highest CPU Core Frequency) Monitor (Phoronix Test Suite System Monitoring, Megahertz) - GPTshop GH200: Min: 2047 / Avg: 3332.19 / Max: 4104
Timed Gem5 Compilation 23.0.1 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 80.1 / Avg: 87.5 / Max: 97.8
Timed Gem5 Compilation 23.0.1 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 178.1 / Avg: 252.6 / Max: 366.7
Timed Gem5 Compilation 23.0.1 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
RocksDB 8.0 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 85.0 / Avg: 98.3 / Max: 101.2
RocksDB 8.0 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 178.0 / Avg: 319.3 / Max: 382.4
RocksDB 8.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: Min: 2729 / Avg: 3179 / Max: 3411
Apache Cassandra 4.1.3 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 78.5 / Avg: 89.2 / Max: 92.7
Apache Cassandra 4.1.3 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 183.3 / Avg: 287.6 / Max: 335.2
Apache Cassandra 4.1.3 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
Stress-NG 0.16.04 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 86.5 / Avg: 91.8 / Max: 93.5
Stress-NG 0.16.04 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 193.8 / Avg: 299.1 / Max: 319.5
Stress-NG 0.16.04 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
Stress-NG 0.16.04 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 83.2 / Avg: 88.8 / Max: 92.9
Stress-NG 0.16.04 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 188.5 / Avg: 282.8 / Max: 319.4
Stress-NG 0.16.04 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
RawTherapee - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 82.8 / Avg: 83.8 / Max: 85.2
RawTherapee - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 188.6 / Avg: 208.0 / Max: 251.3
RawTherapee - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
DuckDB 0.9.1 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 84.5 / Avg: 84.8 / Max: 86.6
DuckDB 0.9.1 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 183.3 / Avg: 207.4 / Max: 351.0
DuckDB 0.9.1 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
DuckDB 0.9.1 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 79.0 / Avg: 83.1 / Max: 94.0
DuckDB 0.9.1 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 183.3 / Avg: 212.6 / Max: 298.5
DuckDB 0.9.1 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
Graph500 3.0 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 89.8 / Avg: 97.8 / Max: 100.1
Graph500 3.0 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 272.3 / Avg: 321.3 / Max: 377.1
Graph500 3.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: Min: 2729 / Avg: 3246 / Max: 3411
ASKAP 1.0 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 85.6 / Avg: 92.0 / Max: 95.1
ASKAP 1.0 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 188.6 / Avg: 287.3 / Max: 335.1
ASKAP 1.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
OpenSSL 3.1 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 89.0 / Avg: 97.8 / Max: 99.8
OpenSSL 3.1 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 188.8 / Avg: 307.5 / Max: 345.7
OpenSSL 3.1 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: Min: 2729 / Avg: 3344 / Max: 3411
Primesieve 8.0 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 85.2 / Avg: 96.6 / Max: 99.6
Primesieve 8.0 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 337.8 / Avg: 371.2 / Max: 387.2
Primesieve 8.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: Min: 2729 / Avg: 3155 / Max: 3411
Timed Node.js Compilation 19.8.1 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 87.0 / Avg: 92.8 / Max: 97.8
Timed Node.js Compilation 19.8.1 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 185.9 / Avg: 300.4 / Max: 358.7
Timed Node.js Compilation 19.8.1 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
Timed LLVM Compilation 16.0 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 86.9 / Avg: 95.0 / Max: 99.5
Timed LLVM Compilation 16.0 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 183.3 / Avg: 296.8 / Max: 358.9
Timed LLVM Compilation 16.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: Min: 2729 / Avg: 3349 / Max: 3411
Timed Linux Kernel Compilation 6.1 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 84.2 / Avg: 97.0 / Max: 100.1
Timed Linux Kernel Compilation 6.1 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 183.5 / Avg: 302.8 / Max: 371.9
Timed Linux Kernel Compilation 6.1 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: Min: 2729 / Avg: 3254 / Max: 3411
Timed Godot Game Engine Compilation 4.0 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 83.7 / Avg: 91.2 / Max: 99.3
Timed Godot Game Engine Compilation 4.0 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 188.5 / Avg: 260.8 / Max: 379.7
Timed Godot Game Engine Compilation 4.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
ACES DGEMM 1.0 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 85.8 / Avg: 92.3 / Max: 95.8
ACES DGEMM 1.0 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 185.9 / Avg: 277.8 / Max: 345.7
ACES DGEMM 1.0 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
easyWave r34 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 81.6 / Avg: 85.6 / Max: 88.0
easyWave r34 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 186.0 / Avg: 272.7 / Max: 361.5
easyWave r34 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
easyWave r34 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 83.6 / Avg: 85.9 / Max: 87.5
easyWave r34 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 191.2 / Avg: 250.4 / Max: 348.8
easyWave r34 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
GraphicsMagick 1.3.38 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 87.9 / Avg: 94.2 / Max: 97.3
GraphicsMagick 1.3.38 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 185.9 / Avg: 296.1 / Max: 322.4
GraphicsMagick 1.3.38 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
GraphicsMagick 1.3.38 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 83.9 / Avg: 93.9 / Max: 99.3
GraphicsMagick 1.3.38 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 185.9 / Avg: 311.9 / Max: 340.5
GraphicsMagick 1.3.38 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: Min: 2729 / Avg: 3381 / Max: 3411
Xmrig 6.18.1 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 87.1 / Avg: 91.7 / Max: 93.2
Xmrig 6.18.1 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 298.5 / Avg: 310.4 / Max: 314.3
Xmrig 6.18.1 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
Xmrig 6.18.1 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 84.8 / Avg: 90.4 / Max: 93.0
Xmrig 6.18.1 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 188.6 / Avg: 296.1 / Max: 316.9
Xmrig 6.18.1 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
LULESH 2.0.3 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 188.5 / Avg: 250.2 / Max: 308.4
LULESH 2.0.3 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
Xcompact3d Incompact3d 2021-03-11 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 92.0 / Avg: 94.7 / Max: 98.5
Xcompact3d Incompact3d 2021-03-11 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 191.2 / Avg: 283.9 / Max: 358.9
Xcompact3d Incompact3d 2021-03-11 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
Xcompact3d Incompact3d 2021-03-11 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 82.5 / Avg: 96.2 / Max: 101.8
Xcompact3d Incompact3d 2021-03-11 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 188.6 / Avg: 323.5 / Max: 395.2
Xcompact3d Incompact3d 2021-03-11 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: Min: 2729 / Avg: 3268 / Max: 3411
Rodinia 3.1 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 88.9 / Avg: 93.9 / Max: 100.1
Rodinia 3.1 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 186.0 / Avg: 267.7 / Max: 379.8
Rodinia 3.1 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: Min: 2729 / Avg: 3119 / Max: 3411
CloverLeaf 1.3 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 91.9 / Avg: 94.6 / Max: 96.6
CloverLeaf 1.3 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 191.2 / Avg: 268.3 / Max: 332.7
CloverLeaf 1.3 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
CloverLeaf 1.3 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 88.5 / Avg: 96.9 / Max: 98.8
CloverLeaf 1.3 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 185.9 / Avg: 315.8 / Max: 337.8
CloverLeaf 1.3 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: Min: 2729 / Avg: 3346 / Max: 3411
NAS Parallel Benchmarks 3.4 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 88.8 / Avg: 91.3 / Max: 97.1
NAS Parallel Benchmarks 3.4 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 185.9 / Avg: 237.6 / Max: 381.2
NAS Parallel Benchmarks 3.4 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
NAS Parallel Benchmarks 3.4 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 87.5 / Avg: 90.4 / Max: 92.8
NAS Parallel Benchmarks 3.4 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 184.6 / Avg: 277.7 / Max: 351.4
NAS Parallel Benchmarks 3.4 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
NAS Parallel Benchmarks 3.4 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 89.5 / Avg: 92.2 / Max: 98.0
NAS Parallel Benchmarks 3.4 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 185.9 / Avg: 287.4 / Max: 378.3
NAS Parallel Benchmarks 3.4 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
NAS Parallel Benchmarks 3.4 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 91.6 / Avg: 95.4 / Max: 99.9
NAS Parallel Benchmarks 3.4 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 189.8 / Avg: 265.5 / Max: 392.9
NAS Parallel Benchmarks 3.4 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: 3411
High Performance Conjugate Gradient 3.1 - System Temperature Monitor (Celsius, Fewer Is Better) - GPTshop GH200: Min: 71.9 / Avg: 96.0 / Max: 99.3
High Performance Conjugate Gradient 3.1 - System Power Consumption Monitor (Watts, Fewer Is Better) - GPTshop GH200: Min: 189.8 / Avg: 324.5 / Max: 414.7
High Performance Conjugate Gradient 3.1 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better) - GPTshop GH200: Min: 2729 / Avg: 3345 / Max: 4104
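For timed tests, the GH200 run time and the average reading from the matching power monitor can be combined into a rough energy-per-run estimate. The sketch below assumes energy is simply average watts times seconds; the function name `estimated_energy_kj` is hypothetical, and the monitor average may cover more than the timed portion of a test (setup, repeated runs), so the output is only a ballpark figure. Times and average power values are copied from the GPTshop GH200 entries in this export.

```python
# Rough sketch, assuming energy ~= average power x elapsed time.
# The monitor averages may include time outside the measured run, so treat
# the results as order-of-magnitude estimates only.

def estimated_energy_kj(seconds, avg_watts):
    """Approximate energy in kilojoules for one run."""
    return seconds * avg_watts / 1000.0


# (run time in seconds, average system power in watts) for the GPTshop GH200.
gh200_runs = {
    "Timed Gem5 Compilation": (184.66, 252.6),
    "Timed Linux Kernel Compilation (allmodconfig)": (283.34, 302.8),
    "RawTherapee": (46.55, 208.0),
}

energy = {
    name: estimated_energy_kj(seconds, watts)
    for name, (seconds, watts) in gh200_runs.items()
}

for name, kilojoules in energy.items():
    print(f"{name}: about {kilojoules:.1f} kJ per run")
```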
RocksDB 8.0 - Test: Random Read (Op/s Per Watt, More Is Better)
  Xeon Platinum 8490H: 808152.35
  EPYC 9684X: 1587869.93
  EPYC 9754: 2068476.16
  EPYC 9554: 1450487.41
  EPYC 9654: 1599837.63
  Ampere Altra Max 128c: 2282816.53
Apache Cassandra 4.1.3 - Test: Writes (Op/s Per Watt, More Is Better)
  Xeon Platinum 8490H: 557.85
  EPYC 9684X: 1282.09
  EPYC 9754: 1432.67
  EPYC 9554: 1700.28
  EPYC 9654: 1349.72
  Ampere Altra Max 128c: 2068.00
Stress-NG 0.16.04 - Test: Memory Copying (Bogo Ops/s Per Watt, More Is Better)
  Xeon Platinum 8490H: 59.82
  EPYC 9684X: 97.03
  EPYC 9754: 123.67
  EPYC 9554: 78.72
  EPYC 9654: 95.60
  Ampere Altra Max 128c: 166.58
Stress-NG 0.16.04 - Test: NUMA (Bogo Ops/s Per Watt, More Is Better)
  Xeon Platinum 8490H: 3.603
  EPYC 9684X: 4.106
  EPYC 9754: 4.951
  EPYC 9554: 3.450
  EPYC 9654: 4.088
  Ampere Altra Max 128c: 20.369
Graph500 3.0 - Scale: 26 (sssp max_TEPS Per Watt, More Is Better)
  Xeon Platinum 8490H: 1312538.58
  EPYC 9684X: 1663398.87
  EPYC 9754: 1711121.29
  EPYC 9554: 1818554.40
  EPYC 9654: 1644954.69
  Ampere Altra Max 128c: 1918278.63
ASKAP 1.0 - Test: tConvolve MPI - Gridding (Mpix/sec Per Watt, More Is Better)
  Xeon Platinum 8490H: 71.04
  EPYC 9684X: 275.76
  EPYC 9754: 129.89
  EPYC 9554: 191.48
  EPYC 9654: 196.62
  Ampere Altra Max 128c: 17.51
OpenSSL 3.1 - Algorithm: SHA512 (byte/s Per Watt, More Is Better)
  Xeon Platinum 8490H: 65707856.51
  EPYC 9684X: 131019550.91
  EPYC 9754: 160226634.86
  EPYC 9554: 112644110.72
  EPYC 9654: 133650046.77
  Ampere Altra Max 128c: 192898458.48
ACES DGEMM Sustained Floating-Point Rate OpenBenchmarking.org GFLOP/s Per Watt, More Is Better ACES DGEMM 1.0 Sustained Floating-Point Rate Xeon Platinum 8490H EPYC 9684X EPYC 9754 EPYC 9554 EPYC 9654 Ampere Altra Max 128c 0.0502 0.1004 0.1506 0.2008 0.251 0.091 0.219 0.223 0.171 0.220 0.136
GraphicsMagick Operation: Enhanced OpenBenchmarking.org Iterations Per Minute Per Watt, More Is Better GraphicsMagick 1.3.38 Operation: Enhanced Xeon Platinum 8490H EPYC 9684X EPYC 9754 EPYC 9554 EPYC 9654 Ampere Altra Max 128c 3 6 9 12 15 3.405 5.193 6.328 4.871 5.323 9.833
GraphicsMagick Operation: Sharpen OpenBenchmarking.org Iterations Per Minute Per Watt, More Is Better GraphicsMagick 1.3.38 Operation: Sharpen Xeon Platinum 8490H EPYC 9684X EPYC 9754 EPYC 9554 EPYC 9654 Ampere Altra Max 128c 2 4 6 8 10 2.081 3.042 3.752 2.691 3.127 8.016
Xmrig Variant: Wownero - Hash Count: 1M OpenBenchmarking.org H/s Per Watt, More Is Better Xmrig 6.18.1 Variant: Wownero - Hash Count: 1M Xeon Platinum 8490H EPYC 9684X EPYC 9754 EPYC 9554 EPYC 9654 Ampere Altra Max 128c 60 120 180 240 300 114.36 265.76 284.00 212.04 270.21 18.61
Xmrig 6.18.1 - Variant: Monero - Hash Count: 1M
OpenBenchmarking.org - H/s Per Watt, More Is Better
Xeon Platinum 8490H: 86.36
EPYC 9684X: 245.11
EPYC 9754: 124.40
EPYC 9554: 166.56
EPYC 9654: 208.70
Ampere Altra Max 128c: 36.63
LULESH 2.0.3
OpenBenchmarking.org - z/s Per Watt, More Is Better
Xeon Platinum 8490H: 106.69
EPYC 9684X: 139.86
EPYC 9754: 124.36
EPYC 9554: 148.93
EPYC 9654: 136.97
Ampere Altra Max 128c: 144.69
Algebraic Multi-Grid Benchmark 1.2
OpenBenchmarking.org - Figure Of Merit Per Watt, More Is Better
Xeon Platinum 8490H: 5406835.94
EPYC 9684X: 10140759.12
EPYC 9754: 11292392.93
EPYC 9554: 12260061.49
EPYC 9654: 9775985.20
Ampere Altra Max 128c: 7481610.11
CloverLeaf 1.3 - Input: clover_bm64_short
OpenBenchmarking.org - Seconds, Fewer Is Better
Xeon Platinum 8490H: 38.93 (SE +/- 0.00, N = 3)
EPYC 9684X: 51.31 (SE +/- 1.39, N = 12)
EPYC 9754: 49.03 (SE +/- 1.50, N = 12)
EPYC 9554: 45.82 (SE +/- 1.40, N = 12)
EPYC 9654: 47.98 (SE +/- 1.47, N = 12)
Ampere Altra Max 128c: 62.77 (SE +/- 0.02, N = 3)
GPTshop GH200: 28.36 (SE +/- 0.01, N = 3)
1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
CloverLeaf 1.3 - Input: clover_bm16
OpenBenchmarking.org - Seconds, Fewer Is Better
Xeon Platinum 8490H: 324.17 (SE +/- 0.02, N = 3)
EPYC 9684X: 298.29 (SE +/- 14.65, N = 9)
EPYC 9754: 459.16 (SE +/- 14.10, N = 9)
EPYC 9554: 460.24 (SE +/- 4.26, N = 7)
EPYC 9654: 429.46 (SE +/- 15.29, N = 9)
Ampere Altra Max 128c: 550.30 (SE +/- 0.15, N = 3)
GPTshop GH200: 247.28 (SE +/- 0.52, N = 3)
1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
miniFE 2.2 - Problem Size: Small
OpenBenchmarking.org - CG Mflops Per Watt, More Is Better
Xeon Platinum 8490H: 154.24
EPYC 9684X: 351.89
EPYC 9754: 416.05
EPYC 9554: 396.88
EPYC 9654: 340.61
Ampere Altra Max 128c: 248.68
NAS Parallel Benchmarks 3.4 - Test / Class: MG.C
OpenBenchmarking.org - Total Mop/s Per Watt, More Is Better
Xeon Platinum 8490H: 549.13
EPYC 9684X: 1291.84
EPYC 9754: 1333.53
EPYC 9554: 1439.34
EPYC 9654: 1167.53
Ampere Altra Max 128c: 407.26
NAS Parallel Benchmarks 3.4 - Test / Class: IS.D
OpenBenchmarking.org - Total Mop/s Per Watt, More Is Better
Xeon Platinum 8490H: 11.604
EPYC 9684X: 36.170
EPYC 9754: 39.264
EPYC 9554: 32.759
EPYC 9654: 30.167
Ampere Altra Max 128c: 8.928
NAS Parallel Benchmarks 3.4 - Test / Class: FT.C
OpenBenchmarking.org - Total Mop/s Per Watt, More Is Better
Xeon Platinum 8490H: 306.03
EPYC 9684X: 812.37
EPYC 9754: 1081.10
EPYC 9554: 904.17
EPYC 9654: 837.59
Ampere Altra Max 128c: 244.94
NAS Parallel Benchmarks 3.4 - Test / Class: CG.C
OpenBenchmarking.org - Total Mop/s Per Watt, More Is Better
Xeon Platinum 8490H: 162.31
EPYC 9684X: 437.49
EPYC 9754: 297.51
EPYC 9554: 332.67
EPYC 9654: 380.13
Ampere Altra Max 128c: 116.77
High Performance Conjugate Gradient 3.1 - X Y Z: 144 144 144 - RT: 60
OpenBenchmarking.org - GFLOP/s Per Watt, More Is Better
Xeon Platinum 8490H: 0.091
EPYC 9684X: 0.087
EPYC 9754: 0.109
EPYC 9554: 0.101
EPYC 9654: 0.117
Ampere Altra Max 128c: 0.132
Phoronix Test Suite v10.8.5