NVIDIA GH200 Benchmarks Smoke Run Comparison
Benchmarks by Michael Larabel for a future article. These are initial smoke-run benchmarks comparing NVIDIA GH200 CPU performance against other server CPUs. More interesting benchmarks are to come.
HTML result view exported from: https://openbenchmarking.org/result/2401256-NE-NVIDIAGH254&sro&grt
Systems compared (each entry lists only the fields shown for it in the export):

EPYC 9554
Processor: AMD EPYC 9554 64-Core @ 3.10GHz (64 Cores / 128 Threads)
Motherboard: AMD Titanite_4G (RTI1007B BIOS)
Chipset: AMD Device 14a4
Memory: 768GB
Disk: 3201GB Micron_7450_MTFDKCC3T2TFS
Graphics: ASPEED
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.10
Kernel: 6.6.0-rc5-phx-patched (x86_64)
Desktop: GNOME Shell 45.0
Display Server: X Server 1.21.1.7
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1200

EPYC 9654
Processor: AMD EPYC 9654 96-Core @ 2.40GHz (96 Cores / 192 Threads)

EPYC 9684X
Processor: AMD EPYC 9684X 96-Core @ 2.55GHz (96 Cores / 192 Threads)

EPYC 9754
Processor: AMD EPYC 9754 128-Core @ 2.25GHz (128 Cores / 256 Threads)

Xeon Platinum 8490H
Processor: Intel Xeon Platinum 8490H @ 3.50GHz (60 Cores / 120 Threads)
Motherboard: Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS)
Chipset: Intel Device 1bce
Memory: 512GB

Ampere Altra Max 128c
Processor: ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores)
Motherboard: GIGABYTE MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS)
Chipset: Ampere Computing LLC Altra PCI Root Complex A
Monitor: VGA HDMI
Network: 2 x Intel I350
Kernel: 6.6.0-rc5-phx-patched (aarch64)
Screen Resolution: 1920x1080

GPTshop GH200
Processor: ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores)
Motherboard: Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS)
Memory: 1 x 480 GB DRAM-6400MT/s
Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9
Network: 2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
Kernel: 6.5.0-15-generic (aarch64)
Screen Resolution: 1920x1200

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- EPYC 9554, EPYC 9654, EPYC 9684X, EPYC 9754, Xeon Platinum 8490H: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Ampere Altra Max 128c, GPTshop GH200: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Processor Details
- EPYC 9554: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
- EPYC 9654: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
- EPYC 9684X: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
- EPYC 9754: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xaa00116
- Xeon Platinum 8490H: Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0x2b0004b1
- Ampere Altra Max 128c: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
- GPTshop GH200: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)

Java Details
- EPYC 9554, EPYC 9654, EPYC 9684X, EPYC 9754, Xeon Platinum 8490H, Ampere Altra Max 128c: OpenJDK Runtime Environment (build 11.0.20+8-post-Ubuntu-1ubuntu1)
- GPTshop GH200: OpenJDK Runtime Environment (build 11.0.21+9-post-Ubuntu-0ubuntu123.10)

Python Details
- Python 3.11.6

Security Details
- EPYC 9554, EPYC 9654, EPYC 9684X, EPYC 9754: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
- Xeon Platinum 8490H: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
- Ampere Altra Max 128c: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
- GPTshop GH200: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Results overview (one row per test, one column per system):

Test | EPYC 9554 | EPYC 9654 | EPYC 9684X | EPYC 9754 | Xeon Platinum 8490H | Ampere Altra Max 128c | GPTshop GH200
compress-7zip: Compression Rating | 489736 | 564950 | 554411 | 586571 | 389690 | 328378 | 341300
mt-dgemm: Sustained Floating-Point Rate | 30.174904 | 39.282158 | 40.320062 | 43.681923 | 23.257388 | 18.114994 | 17.563016
amg: | 2321156000 | 2296730667 | 2374853333 | 2291049667 | 1611826000 | 1060656000 | 1993579333
cassandra: Writes | 301963 | 285324 | 286460 | 236320 | 139654 | 227548 | 371439
askap: tConvolve MT - Gridding | 10327.5 | 12332.4 | 14731.5 | 12660.8 | 8112.07 | 4295.72 | 9561.31
askap: tConvolve MPI - Degridding | 31177.2 | 41081.6 | 58310.1 | 26593.4 | 19616.0 | 5285.36 | 12429.5
askap: tConvolve MPI - Gridding | 37265.9 | 47003.2 | 67965.5 | 27269.2 | 22365.2 | 2416.54 | 9678.85
cloverleaf: clover_bm16 | 460.24 | 429.46 | 298.29 | 459.16 | 324.17 | 550.30 | 247.28
cloverleaf: clover_bm64_short | 45.82 | 47.98 | 51.31 | 49.03 | 38.93 | 62.77 | 28.36
duckdb: IMDB | 103.720 | 115.805 | 116.050 | 147.601 | 99.249 | 143.781 | 91.633
duckdb: TPC-H Parquet | 143.306 | 146.276 | 147.489 | 177.134 | 148.099 | 237.161 | 156.987
easywave: e2Asean Grid + BengkuluSept2007 Source - 1200 | 22.452 | 23.507 | 24.579 | 29.588 | 45.850 | 122.550 | 36.562
easywave: e2Asean Grid + BengkuluSept2007 Source - 2400 | 56.883 | 59.535 | 60.252 | 72.355 | 118.554 | 326.368 | 98.719
graph500: 26 (bfs median_TEPS) | 836159000 | 816172000 | 854029000 | 1147090000 | 1073860000 | 980463000 | 1214290000
graph500: 26 (bfs max_TEPS) | 865152000 | 839917000 | 882181000 | 1184510000 | 1113120000 | 987770000 | 1313880000
graph500: 26 (sssp median_TEPS) | 368016000 | 391167000 | 383754000 | 377033000 | 333129000 | 224032000 | 298983000
graph500: 26 (sssp max_TEPS) | 467453000 | 499347000 | 501222000 | 502151000 | 454483000 | 333239000 | 463087000
graphics-magick: Sharpen | 669 | 809 | 775 | 924 | 691 | 1274 | 1244
graphics-magick: Enhanced | 1142 | 1319 | 1277 | 1451 | 1124 | 1230 | 1584
hpcg: 144 144 144 - 60 | 22.4511 | 32.1283 | 23.7784 | 25.8918 | 31.2432 | 21.2399 | 41.5526
lulesh: | 24037.442 | 23701.442 | 24712.884 | 22356.746 | 23997.715 | 16218.331 | 23433.373
minife: Small | 50351.4 | 50320.2 | 52937.6 | 50393.7 | 35452.5 | 24122.0 | 46477.3
npb: CG.C | 45867.44 | 54465.04 | 59557.08 | 44431.58 | 35560.25 | 17272.73 | 24350.32
npb: FT.C | 124827.75 | 125164.03 | 124391.00 | 141255.07 | 70989.97 | 35153.27 | 47948.15
npb: IS.D | 5184.36 | 5195.24 | 6138.54 | 5773.25 | 2885.71 | 1196.33 | 1770.65
npb: MG.C | 125921.29 | 118608.07 | 137608.68 | 127404.16 | 88049.90 | 42291.84 | 57191.85
openssl: SHA512 | 33622028530 | 40459323610 | 40105639580 | 53648078203 | 22445820447 | 34434960797 | 45292038457
primesieve: 1e13 | 28.136 | 24.164 | 25.969 | 21.756 | 52.375 | 41.675 | 36.049
rawtherapee: Total Benchmark Time | 49.943 | 52.051 | 52.991 | 66.132 | 46.578 | 66.045 | 46.546
rocksdb: Rand Read | 376983415 | 429310263 | 486862785 | 611259161 | 268374486 | 435437276 | 429416658
rodinia: OpenMP LavaMD | 34.014 | 30.240 | 30.317 | 25.150 | 42.713 | 31.791 | 30.959
stress-ng: NUMA | 967.56 | 1237.36 | 1173.07 | 1397.61 | 1144.04 | 1454.68 | 6279.41
stress-ng: Memory Copying | 22827.20 | 28310.82 | 26495.63 | 34497.51 | 18809.55 | 27163.45 | 27302.27
build-gem5: Time To Compile | 179.743 | 180.857 | 180.061 | 208.576 | 197.759 | 260.326 | 184.661
build-godot: Time To Compile | 103.500 | 101.563 | 98.555 | 118.247 | 128.467 | 227.783 | 138.891
build-linux-kernel: allmodconfig | 238.289 | 284.249 | 283.865 | 319.426 | 289.966 | 307.878 | 283.339
build-llvm: Ninja | 131.203 | 133.070 | 131.990 | 146.386 | 169.528 | 266.543 | 195.598
build-nodejs: Time To Compile | 121.928 | 120.156 | 119.203 | 130.460 | 151.567 | 268.704 | 173.834
incompact3d: X3D-benchmarking input.i3d | 515.792758 | 430.080329 | 440.548584 | 493.502452 | 353.557595 | 606.521708 | 257.342539
incompact3d: input.i3d 193 Cells Per Direction | 9.74551621 | 8.45212936 | 7.30814419 | 9.02806168 | 12.7983420 | 23.8852240 | 9.83151112
xmrig: Monero - 1M | 47926.4 | 59047.7 | 69205.6 | 29356.1 | 27608.5 | 4322.1 | 17290.3
xmrig: Wownero - 1M | 60032.4 | 75127.4 | 73567.3 | 77607.2 | 35870.2 | 1909.2 | 21993.3
7-Zip Compression 22.01, Test: Compression Rating
MIPS, More Is Better
Ampere Altra Max 128c: 328378 (SE +/- 1588.70, N = 3)
EPYC 9554: 489736 (SE +/- 224.17, N = 3)
EPYC 9654: 564950 (SE +/- 311.94, N = 3)
EPYC 9684X: 554411 (SE +/- 366.95, N = 3)
EPYC 9754: 586571 (SE +/- 439.85, N = 3)
GPTshop GH200: 341300 (SE +/- 222.71, N = 3)
Xeon Platinum 8490H: 389690 (SE +/- 629.56, N = 3)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
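The results in this export are reported as "SE +/- x, N = y". Assuming SE denotes the standard error of the mean over N runs (sample standard deviation divided by the square root of N), which the export itself does not spell out, such a figure could be computed roughly as in the sketch below. The per-run values are hypothetical and are not taken from the result file.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N).

    Assumption: this matches the "SE +/-" figure shown next to each result;
    the export does not state which estimator is used.
    """
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run MIPS values for illustration only.
runs = [341000.0, 341300.0, 341600.0]
print(f"mean = {statistics.mean(runs):.0f}, "
      f"SE +/- {standard_error(runs):.2f}, N = {len(runs)}")
```

A small SE relative to the mean suggests the runs were consistent; results with a larger N in this export may reflect tests that were repeated more often.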
ACES DGEMM 1.0, Sustained Floating-Point Rate
GFLOP/s, More Is Better
Ampere Altra Max 128c: 18.11 (SE +/- 0.09, N = 4)
EPYC 9554: 30.17 (SE +/- 0.28, N = 6)
EPYC 9654: 39.28 (SE +/- 0.18, N = 7)
EPYC 9684X: 40.32 (SE +/- 0.12, N = 7)
EPYC 9754: 43.68 (SE +/- 0.13, N = 7)
GPTshop GH200: 17.56 (SE +/- 0.19, N = 15)
Xeon Platinum 8490H: 23.26 (SE +/- 0.13, N = 5)
1. (CC) gcc options: -O3 -march=native -fopenmp

ACES DGEMM 1.0, Sustained Floating-Point Rate
GFLOP/s Per Watt, More Is Better
Ampere Altra Max 128c: 0.136
EPYC 9554: 0.171
EPYC 9654: 0.220
EPYC 9684X: 0.219
EPYC 9754: 0.223
Xeon Platinum 8490H: 0.091

ACES DGEMM 1.0 system monitors (GPTshop GH200)
CPU Peak Freq (Highest CPU Core Frequency) Monitor, Megahertz, More Is Better: 3411
System Power Consumption Monitor, Watts, Fewer Is Better: Min: 185.9 / Avg: 277.8 / Max: 345.7
System Temperature Monitor, Celsius, Fewer Is Better: Min: 85.8 / Avg: 92.3 / Max: 95.8
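The per-Watt chart for ACES DGEMM lists no entry for the GPTshop GH200, although its system power monitor reports an average of 277.8 Watts. As a rough sketch, and assuming a per-Watt figure is simply the benchmark result divided by average power draw (the export does not state how its per-Watt values are derived), an estimate could be formed as follows. The function name is hypothetical. The GH200 reading is whole-system power and the export does not say what the other systems' figures are based on, so the result should not be read as directly comparable to the charted values.

```python
def per_watt(result, average_watts):
    """Divide a benchmark result by average power draw.

    Assumption: per-Watt = result / average Watts. This is an illustrative
    definition, not one confirmed by the export.
    """
    if average_watts <= 0:
        raise ValueError("average power must be positive")
    return result / average_watts


gh200_dgemm_gflops = 17.56       # ACES DGEMM result for GPTshop GH200
gh200_avg_system_watts = 277.8   # average from the system power monitor

estimate = per_watt(gh200_dgemm_gflops, gh200_avg_system_watts)
print(f"GH200 rough estimate: {estimate:.3f} GFLOP/s per Watt (system power)")
```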
Algebraic Multi-Grid Benchmark 1.2
Figure Of Merit, More Is Better
Ampere Altra Max 128c: 1060656000 (SE +/- 31942.66, N = 3)
EPYC 9554: 2321156000 (SE +/- 3180941.37, N = 3)
EPYC 9654: 2296730667 (SE +/- 5502047.33, N = 3)
EPYC 9684X: 2374853333 (SE +/- 1953221.73, N = 3)
EPYC 9754: 2291049667 (SE +/- 2811300.43, N = 3)
GPTshop GH200: 1993579333 (SE +/- 22467067.78, N = 3)
Xeon Platinum 8490H: 1611826000 (SE +/- 269139.99, N = 3)
1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi

Algebraic Multi-Grid Benchmark 1.2
Figure Of Merit Per Watt, More Is Better
Ampere Altra Max 128c: 7481610.11
EPYC 9554: 12260061.49
EPYC 9654: 9775985.20
EPYC 9684X: 10140759.12
EPYC 9754: 11292392.93
Xeon Platinum 8490H: 5406835.94
Apache Cassandra 4.1.3, Test: Writes
Op/s, More Is Better
Ampere Altra Max 128c: 227548 (SE +/- 2776.53, N = 3)
EPYC 9554: 301963 (SE +/- 554.18, N = 3)
EPYC 9654: 285324 (SE +/- 162.88, N = 3)
EPYC 9684X: 286460 (SE +/- 1195.73, N = 3)
EPYC 9754: 236320 (SE +/- 1223.06, N = 3)
GPTshop GH200: 371439 (SE +/- 2382.80, N = 3)
Xeon Platinum 8490H: 139654 (SE +/- 883.78, N = 3)

Apache Cassandra 4.1.3, Test: Writes
Op/s Per Watt, More Is Better
Ampere Altra Max 128c: 2068.00
EPYC 9554: 1700.28
EPYC 9654: 1349.72
EPYC 9684X: 1282.09
EPYC 9754: 1432.67
Xeon Platinum 8490H: 557.85

Apache Cassandra 4.1.3 system monitors (GPTshop GH200)
CPU Peak Freq (Highest CPU Core Frequency) Monitor, Megahertz, More Is Better: 3411
System Power Consumption Monitor, Watts, Fewer Is Better: Min: 183.3 / Avg: 287.6 / Max: 335.2
System Temperature Monitor, Celsius, Fewer Is Better: Min: 78.5 / Avg: 89.2 / Max: 92.7
ASKAP 1.0, Test: tConvolve MT - Gridding
Million Grid Points Per Second, More Is Better
Ampere Altra Max 128c: 4295.72 (SE +/- 0.36, N = 3)
EPYC 9554: 10327.50 (SE +/- 9.54, N = 3)
EPYC 9654: 12332.40 (SE +/- 32.54, N = 3)
EPYC 9684X: 14731.50 (SE +/- 240.55, N = 12)
EPYC 9754: 12660.80 (SE +/- 9.06, N = 3)
GPTshop GH200: 9561.31 (SE +/- 2.75, N = 3)
Xeon Platinum 8490H: 8112.07 (SE +/- 4.50, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP 1.0, Test: tConvolve MPI - Degridding
Mpix/sec, More Is Better
Ampere Altra Max 128c: 5285.36 (SE +/- 4.44, N = 3)
EPYC 9554: 31177.20 (SE +/- 153.57, N = 3)
EPYC 9654: 41081.60 (SE +/- 475.58, N = 3)
EPYC 9684X: 58310.10 (SE +/- 0.00, N = 3)
EPYC 9754: 26593.40 (SE +/- 574.27, N = 15)
GPTshop GH200: 12429.50 (SE +/- 37.76, N = 3)
Xeon Platinum 8490H: 19616.00 (SE +/- 131.23, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP 1.0, Test: tConvolve MPI - Gridding
Mpix/sec, More Is Better
Ampere Altra Max 128c: 2416.54 (SE +/- 0.46, N = 3)
EPYC 9554: 37265.90 (SE +/- 219.20, N = 3)
EPYC 9654: 47003.20 (SE +/- 405.08, N = 3)
EPYC 9684X: 67965.50 (SE +/- 485.47, N = 3)
EPYC 9754: 27269.20 (SE +/- 655.18, N = 14)
GPTshop GH200: 9678.85 (SE +/- 39.51, N = 3)
Xeon Platinum 8490H: 22365.20 (SE +/- 146.74, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP 1.0, Test: tConvolve MPI - Gridding
Mpix/sec Per Watt, More Is Better
Ampere Altra Max 128c: 17.51
EPYC 9554: 191.48
EPYC 9654: 196.62
EPYC 9684X: 275.76
EPYC 9754: 129.89
Xeon Platinum 8490H: 71.04

ASKAP 1.0 system monitors (GPTshop GH200)
CPU Peak Freq (Highest CPU Core Frequency) Monitor, Megahertz, More Is Better: 3411
System Power Consumption Monitor, Watts, Fewer Is Better: Min: 188.6 / Avg: 287.3 / Max: 335.1
System Temperature Monitor, Celsius, Fewer Is Better: Min: 85.6 / Avg: 92.0 / Max: 95.1
CloverLeaf 1.3, Input: clover_bm16
Seconds, Fewer Is Better
Ampere Altra Max 128c: 550.30 (SE +/- 0.15, N = 3)
EPYC 9554: 460.24 (SE +/- 4.26, N = 7)
EPYC 9654: 429.46 (SE +/- 15.29, N = 9)
EPYC 9684X: 298.29 (SE +/- 14.65, N = 9)
EPYC 9754: 459.16 (SE +/- 14.10, N = 9)
GPTshop GH200: 247.28 (SE +/- 0.52, N = 3)
Xeon Platinum 8490H: 324.17 (SE +/- 0.02, N = 3)
1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

CloverLeaf 1.3, Input: clover_bm64_short
Seconds, Fewer Is Better
Ampere Altra Max 128c: 62.77 (SE +/- 0.02, N = 3)
EPYC 9554: 45.82 (SE +/- 1.40, N = 12)
EPYC 9654: 47.98 (SE +/- 1.47, N = 12)
EPYC 9684X: 51.31 (SE +/- 1.39, N = 12)
EPYC 9754: 49.03 (SE +/- 1.50, N = 12)
GPTshop GH200: 28.36 (SE +/- 0.01, N = 3)
Xeon Platinum 8490H: 38.93 (SE +/- 0.00, N = 3)
1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

CloverLeaf 1.3 system monitors (GPTshop GH200)
CPU Peak Freq (Highest CPU Core Frequency) Monitor, Megahertz, More Is Better: Min: 2729 / Avg: 3346 / Max: 3411
System Power Consumption Monitor, Watts, Fewer Is Better: Min: 185.9 / Avg: 315.8 / Max: 337.8
System Temperature Monitor, Celsius, Fewer Is Better: Min: 88.5 / Avg: 96.9 / Max: 98.8

CloverLeaf 1.3 system monitors (GPTshop GH200)
CPU Peak Freq (Highest CPU Core Frequency) Monitor, Megahertz, More Is Better: 3411
System Power Consumption Monitor, Watts, Fewer Is Better: Min: 191.2 / Avg: 268.3 / Max: 332.7
System Temperature Monitor, Celsius, Fewer Is Better: Min: 91.9 / Avg: 94.6 / Max: 96.6
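Because this export mixes "More Is Better" metrics (such as MIPS) with "Fewer Is Better" metrics (such as the CloverLeaf run times in seconds), a ratio between two systems has to be inverted for the latter. The sketch below shows one way to normalize both cases to "times as fast as the baseline". The helper name is hypothetical, and the input values are taken from the CloverLeaf clover_bm64_short and 7-Zip results in this export.

```python
def relative_performance(candidate, baseline, more_is_better):
    """Return candidate performance as a multiple of the baseline.

    For more-is-better metrics this is candidate / baseline; for
    fewer-is-better metrics (e.g. seconds) the ratio is inverted so that
    values above 1.0 always mean the candidate is ahead.
    """
    if candidate <= 0 or baseline <= 0:
        raise ValueError("results must be positive")
    if more_is_better:
        return candidate / baseline
    return baseline / candidate


# CloverLeaf clover_bm64_short, seconds (fewer is better).
clover = relative_performance(28.36, 38.93, more_is_better=False)
# 7-Zip compression rating, MIPS (more is better).
sevenzip = relative_performance(341300, 389690, more_is_better=True)

print(f"GH200 vs Xeon Platinum 8490H, CloverLeaf bm64_short: {clover:.2f}x")
print(f"GH200 vs Xeon Platinum 8490H, 7-Zip compression: {sevenzip:.2f}x")
```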
CPU Peak Freq (Highest CPU Core Frequency) Monitor, Phoronix Test Suite System Monitoring
Megahertz
GPTshop GH200: Min: 2047 / Avg: 3332.19 / Max: 4104
DuckDB 0.9.1, Benchmark: IMDB
Seconds, Fewer Is Better
Ampere Altra Max 128c: 143.78 (SE +/- 0.18, N = 3)
EPYC 9554: 103.72 (SE +/- 0.16, N = 3)
EPYC 9654: 115.81 (SE +/- 0.18, N = 3)
EPYC 9684X: 116.05 (SE +/- 0.26, N = 3)
EPYC 9754: 147.60 (SE +/- 0.08, N = 3)
GPTshop GH200: 91.63 (SE +/- 0.75, N = 3)
Xeon Platinum 8490H: 99.25 (SE +/- 0.24, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lssl -lcrypto -ldl

DuckDB 0.9.1, Benchmark: TPC-H Parquet
Seconds, Fewer Is Better
Ampere Altra Max 128c: 237.16 (SE +/- 4.97, N = 9)
EPYC 9554: 143.31 (SE +/- 0.53, N = 3)
EPYC 9654: 146.28 (SE +/- 0.47, N = 3)
EPYC 9684X: 147.49 (SE +/- 0.26, N = 3)
EPYC 9754: 177.13 (SE +/- 0.64, N = 3)
GPTshop GH200: 156.99 (SE +/- 0.69, N = 3)
Xeon Platinum 8490H: 148.10 (SE +/- 0.30, N = 3)
1. (CXX) g++ options: -O3 -rdynamic -lssl -lcrypto -ldl

DuckDB 0.9.1 system monitors (GPTshop GH200)
CPU Peak Freq (Highest CPU Core Frequency) Monitor, Megahertz, More Is Better: 3411
System Power Consumption Monitor, Watts, Fewer Is Better: Min: 183.3 / Avg: 212.6 / Max: 298.5
System Temperature Monitor, Celsius, Fewer Is Better: Min: 79.0 / Avg: 83.1 / Max: 94.0

DuckDB 0.9.1 system monitors (GPTshop GH200)
CPU Peak Freq (Highest CPU Core Frequency) Monitor, Megahertz, More Is Better: 3411
System Power Consumption Monitor, Watts, Fewer Is Better: Min: 183.3 / Avg: 207.4 / Max: 351.0
System Temperature Monitor, Celsius, Fewer Is Better: Min: 84.5 / Avg: 84.8 / Max: 86.6
easyWave r34, Input: e2Asean Grid + BengkuluSept2007 Source - Time: 1200
Seconds, Fewer Is Better
Ampere Altra Max 128c: 122.55 (SE +/- 1.12, N = 3)
EPYC 9554: 22.45 (SE +/- 0.06, N = 3)
EPYC 9654: 23.51 (SE +/- 0.14, N = 3)
EPYC 9684X: 24.58 (SE +/- 0.32, N = 3)
EPYC 9754: 29.59 (SE +/- 0.20, N = 15)
GPTshop GH200: 36.56 (SE +/- 0.11, N = 3)
Xeon Platinum 8490H: 45.85 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O3 -fopenmp

easyWave r34, Input: e2Asean Grid + BengkuluSept2007 Source - Time: 2400
Seconds, Fewer Is Better
Ampere Altra Max 128c: 326.37 (SE +/- 1.24, N = 3)
EPYC 9554: 56.88 (SE +/- 0.33, N = 3)
EPYC 9654: 59.54 (SE +/- 0.40, N = 3)
EPYC 9684X: 60.25 (SE +/- 0.14, N = 3)
EPYC 9754: 72.36 (SE +/- 0.45, N = 3)
GPTshop GH200: 98.72 (SE +/- 0.34, N = 3)
Xeon Platinum 8490H: 118.55 (SE +/- 0.36, N = 3)
1. (CXX) g++ options: -O3 -fopenmp

easyWave r34 system monitors (GPTshop GH200)
CPU Peak Freq (Highest CPU Core Frequency) Monitor, Megahertz, More Is Better: 3411
System Power Consumption Monitor, Watts, Fewer Is Better: Min: 191.2 / Avg: 250.4 / Max: 348.8
System Temperature Monitor, Celsius, Fewer Is Better: Min: 83.6 / Avg: 85.9 / Max: 87.5

easyWave r34 system monitors (GPTshop GH200)
CPU Peak Freq (Highest CPU Core Frequency) Monitor, Megahertz, More Is Better: 3411
System Power Consumption Monitor, Watts, Fewer Is Better: Min: 186.0 / Avg: 272.7 / Max: 361.5
System Temperature Monitor, Celsius, Fewer Is Better: Min: 81.6 / Avg: 85.6 / Max: 88.0
Graph500 3.0, Scale: 26
bfs median_TEPS, More Is Better
Ampere Altra Max 128c: 980463000
EPYC 9554: 836159000
EPYC 9654: 816172000
EPYC 9684X: 854029000
EPYC 9754: 1147090000
GPTshop GH200: 1214290000
Xeon Platinum 8490H: 1073860000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500 3.0, Scale: 26
bfs max_TEPS, More Is Better
Ampere Altra Max 128c: 987770000
EPYC 9554: 865152000
EPYC 9654: 839917000
EPYC 9684X: 882181000
EPYC 9754: 1184510000
GPTshop GH200: 1313880000
Xeon Platinum 8490H: 1113120000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500 3.0, Scale: 26
sssp median_TEPS, More Is Better
Ampere Altra Max 128c: 224032000
EPYC 9554: 368016000
EPYC 9654: 391167000
EPYC 9684X: 383754000
EPYC 9754: 377033000
GPTshop GH200: 298983000
Xeon Platinum 8490H: 333129000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500 3.0, Scale: 26
sssp max_TEPS, More Is Better
Ampere Altra Max 128c: 333239000
EPYC 9554: 467453000
EPYC 9654: 499347000
EPYC 9684X: 501222000
EPYC 9754: 502151000
GPTshop GH200: 463087000
Xeon Platinum 8490H: 454483000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500 3.0, Scale: 26
sssp max_TEPS Per Watt, More Is Better
Ampere Altra Max 128c: 1918278.63
EPYC 9554: 1818554.40
EPYC 9654: 1644954.69
EPYC 9684X: 1663398.87
EPYC 9754: 1711121.29
Xeon Platinum 8490H: 1312538.58

Graph500 3.0 system monitors (GPTshop GH200)
CPU Peak Freq (Highest CPU Core Frequency) Monitor, Megahertz, More Is Better: Min: 2729 / Avg: 3246 / Max: 3411
System Power Consumption Monitor, Watts, Fewer Is Better: Min: 272.3 / Avg: 321.3 / Max: 377.1
System Temperature Monitor, Celsius, Fewer Is Better: Min: 89.8 / Avg: 97.8 / Max: 100.1
GraphicsMagick 1.3.38, Operation: Sharpen (Iterations Per Minute, More Is Better): Ampere Altra Max 128c: 1274 (SE +/- 1.20, N = 3); EPYC 9554: 669 (SE +/- 0.33, N = 3); EPYC 9654: 809 (SE +/- 0.33, N = 3); EPYC 9684X: 775 (SE +/- 0.33, N = 3); EPYC 9754: 924 (SE +/- 0.67, N = 3); GPTshop GH200: 1244 (SE +/- 5.13, N = 3); Xeon Platinum 8490H: 691 (SE +/- 0.33, N = 3). Per-system additional flags, in export order: -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 | -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 | -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 | -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 | -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 | -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2. Notes: 1. (CC) gcc options: -fopenmp -O2 -ljpeg -lXext -lSM -lICE -lX11 -lz -lzstd -lm -lpthread
GraphicsMagick 1.3.38, Operation: Sharpen (Iterations Per Minute Per Watt, More Is Better): Ampere Altra Max 128c: 8.016; EPYC 9554: 2.691; EPYC 9654: 3.127; EPYC 9684X: 3.042; EPYC 9754: 3.752; Xeon Platinum 8490H: 2.081
GraphicsMagick 1.3.38, Operation: Enhanced (Iterations Per Minute, More Is Better): Ampere Altra Max 128c: 1230 (SE +/- 2.73, N = 3); EPYC 9554: 1142 (SE +/- 0.88, N = 3); EPYC 9654: 1319 (SE +/- 0.88, N = 3); EPYC 9684X: 1277 (SE +/- 1.67, N = 3); EPYC 9754: 1451 (SE +/- 0.58, N = 3); GPTshop GH200: 1584 (SE +/- 0.33, N = 3); Xeon Platinum 8490H: 1124 (SE +/- 0.33, N = 3). Per-system additional flags, in export order: -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 | -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 | -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 | -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 | -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 | -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2. Notes: 1. (CC) gcc options: -fopenmp -O2 -ljpeg -lXext -lSM -lICE -lX11 -lz -lzstd -lm -lpthread
GraphicsMagick 1.3.38, Operation: Enhanced (Iterations Per Minute Per Watt, More Is Better): Ampere Altra Max 128c: 9.833; EPYC 9554: 4.871; EPYC 9654: 5.323; EPYC 9684X: 5.193; EPYC 9754: 6.328; Xeon Platinum 8490H: 3.405
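The Per Watt charts can be cross-checked against the raw results. The sketch below uses the GraphicsMagick "Operation: Enhanced" numbers and assumes that a Per Watt value is the raw result divided by the average power draw recorded during the run. That formula is an assumption, since the export states neither the formula nor which power sensor was used. Under that assumption, dividing the raw result by the Per Watt value gives an implied average power for each system. GPTshop GH200 is omitted because the export lists no Per Watt value for it.

```python
# Results copied from the GraphicsMagick 1.3.38 "Operation: Enhanced" charts.
# GPTshop GH200 is left out: the export has no Per Watt entry for it.
iterations_per_minute = {
    "Ampere Altra Max 128c": 1230,
    "EPYC 9554": 1142,
    "EPYC 9654": 1319,
    "EPYC 9684X": 1277,
    "EPYC 9754": 1451,
    "Xeon Platinum 8490H": 1124,
}
iterations_per_minute_per_watt = {
    "Ampere Altra Max 128c": 9.833,
    "EPYC 9554": 4.871,
    "EPYC 9654": 5.323,
    "EPYC 9684X": 5.193,
    "EPYC 9754": 6.328,
    "Xeon Platinum 8490H": 3.405,
}

# Assumption: per_watt = result / average_watts,
# hence average_watts = result / per_watt.
implied_average_watts = {
    name: iterations_per_minute[name] / iterations_per_minute_per_watt[name]
    for name in iterations_per_minute
}

for name, watts in implied_average_watts.items():
    print(f"{name}: about {watts:.1f} W implied average power")
```

The implied figures are only as precise as the rounded Per Watt values, and they say nothing about the GH200, whose monitors in this export report whole-system power.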
GraphicsMagick 1.3.38, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200 Min: 2729 / Avg: 3381 / Max: 3411
GraphicsMagick 1.3.38, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 185.9 / Avg: 311.9 / Max: 340.5
GraphicsMagick 1.3.38, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 83.9 / Avg: 93.9 / Max: 99.3
GraphicsMagick 1.3.38, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200: 3411
GraphicsMagick 1.3.38, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 185.9 / Avg: 296.1 / Max: 322.4
GraphicsMagick 1.3.38, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 87.9 / Avg: 94.2 / Max: 97.3
High Performance Conjugate Gradient 3.1, X Y Z: 144 144 144 - RT: 60 (GFLOP/s, More Is Better): Ampere Altra Max 128c: 21.24 (SE +/- 0.00, N = 3); EPYC 9554: 22.45 (SE +/- 0.04, N = 3); EPYC 9654: 32.13 (SE +/- 0.35, N = 3); EPYC 9684X: 23.78 (SE +/- 0.28, N = 9); EPYC 9754: 25.89 (SE +/- 0.27, N = 3); GPTshop GH200: 41.55 (SE +/- 0.36, N = 3); Xeon Platinum 8490H: 31.24 (SE +/- 0.11, N = 3). Notes: 1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
High Performance Conjugate Gradient 3.1, X Y Z: 144 144 144 - RT: 60 (GFLOP/s Per Watt, More Is Better): Ampere Altra Max 128c: 0.132; EPYC 9554: 0.101; EPYC 9654: 0.117; EPYC 9684X: 0.087; EPYC 9754: 0.109; Xeon Platinum 8490H: 0.091
High Performance Conjugate Gradient 3.1, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200 Min: 2729 / Avg: 3345 / Max: 4104
High Performance Conjugate Gradient 3.1, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 189.8 / Avg: 324.5 / Max: 414.7
High Performance Conjugate Gradient 3.1, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 71.9 / Avg: 96.0 / Max: 99.3
LULESH 2.0.3 (z/s, More Is Better): Ampere Altra Max 128c: 16218.33 (SE +/- 24.21, N = 3); EPYC 9554: 24037.44 (SE +/- 113.24, N = 3); EPYC 9654: 23701.44 (SE +/- 84.04, N = 3); EPYC 9684X: 24712.88 (SE +/- 90.30, N = 3); EPYC 9754: 22356.75 (SE +/- 59.94, N = 3); GPTshop GH200: 23433.37 (SE +/- 40.15, N = 3); Xeon Platinum 8490H: 23997.72 (SE +/- 49.50, N = 5). Notes: 1. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi
LULESH 2.0.3 (z/s Per Watt, More Is Better): Ampere Altra Max 128c: 144.69; EPYC 9554: 148.93; EPYC 9654: 136.97; EPYC 9684X: 139.86; EPYC 9754: 124.36; Xeon Platinum 8490H: 106.69
LULESH 2.0.3, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200: 3411
LULESH 2.0.3, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 188.5 / Avg: 250.2 / Max: 308.4
miniFE 2.2, Problem Size: Small (CG Mflops, More Is Better): Ampere Altra Max 128c: 24122.0 (SE +/- 5.88, N = 4); EPYC 9554: 50351.4 (SE +/- 51.65, N = 5); EPYC 9654: 50320.2 (SE +/- 45.75, N = 5); EPYC 9684X: 52937.6 (SE +/- 507.76, N = 5); EPYC 9754: 50393.7 (SE +/- 105.96, N = 5); GPTshop GH200: 46477.3 (SE +/- 54.33, N = 5); Xeon Platinum 8490H: 35452.5 (SE +/- 14.30, N = 4). Notes: 1. (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi
miniFE 2.2, Problem Size: Small (CG Mflops Per Watt, More Is Better): Ampere Altra Max 128c: 248.68; EPYC 9554: 396.88; EPYC 9654: 340.61; EPYC 9684X: 351.89; EPYC 9754: 416.05; Xeon Platinum 8490H: 154.24
NAS Parallel Benchmarks 3.4, Test / Class: CG.C (Total Mop/s, More Is Better): Ampere Altra Max 128c: 17272.73 (SE +/- 15.14, N = 5); EPYC 9554: 45867.44 (SE +/- 363.39, N = 8); EPYC 9654: 54465.04 (SE +/- 537.84, N = 15); EPYC 9684X: 59557.08 (SE +/- 469.70, N = 9); EPYC 9754: 44431.58 (SE +/- 391.92, N = 15); GPTshop GH200: 24350.32 (SE +/- 58.76, N = 6); Xeon Platinum 8490H: 35560.25 (SE +/- 268.67, N = 15). Notes: 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.5
NAS Parallel Benchmarks 3.4, Test / Class: CG.C (Total Mop/s Per Watt, More Is Better): Ampere Altra Max 128c: 116.77; EPYC 9554: 332.67; EPYC 9654: 380.13; EPYC 9684X: 437.49; EPYC 9754: 297.51; Xeon Platinum 8490H: 162.31
NAS Parallel Benchmarks 3.4, Test / Class: FT.C (Total Mop/s, More Is Better): Ampere Altra Max 128c: 35153.27 (SE +/- 41.38, N = 4); EPYC 9554: 124827.75 (SE +/- 1162.69, N = 15); EPYC 9654: 125164.03 (SE +/- 698.81, N = 8); EPYC 9684X: 124391.00 (SE +/- 686.49, N = 8); EPYC 9754: 141255.07 (SE +/- 1190.42, N = 8); GPTshop GH200: 47948.15 (SE +/- 163.34, N = 5); Xeon Platinum 8490H: 70989.97 (SE +/- 445.75, N = 6). Notes: 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.5
NAS Parallel Benchmarks 3.4, Test / Class: FT.C (Total Mop/s Per Watt, More Is Better): Ampere Altra Max 128c: 244.94; EPYC 9554: 904.17; EPYC 9654: 837.59; EPYC 9684X: 812.37; EPYC 9754: 1081.10; Xeon Platinum 8490H: 306.03
NAS Parallel Benchmarks 3.4, Test / Class: IS.D (Total Mop/s, More Is Better): Ampere Altra Max 128c: 1196.33 (SE +/- 0.34, N = 3); EPYC 9554: 5184.36 (SE +/- 14.93, N = 5); EPYC 9654: 5195.24 (SE +/- 13.89, N = 5); EPYC 9684X: 6138.54 (SE +/- 65.80, N = 5); EPYC 9754: 5773.25 (SE +/- 18.26, N = 5); GPTshop GH200: 1770.65 (SE +/- 0.92, N = 3); Xeon Platinum 8490H: 2885.71 (SE +/- 6.63, N = 4). Notes: 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.5
NAS Parallel Benchmarks 3.4, Test / Class: IS.D (Total Mop/s Per Watt, More Is Better): Ampere Altra Max 128c: 8.928; EPYC 9554: 32.759; EPYC 9654: 30.167; EPYC 9684X: 36.170; EPYC 9754: 39.264; Xeon Platinum 8490H: 11.604
NAS Parallel Benchmarks 3.4, Test / Class: MG.C (Total Mop/s, More Is Better): Ampere Altra Max 128c: 42291.84 (SE +/- 10.27, N = 7); EPYC 9554: 125921.29 (SE +/- 1712.13, N = 15); EPYC 9654: 118608.07 (SE +/- 1159.21, N = 15); EPYC 9684X: 137608.68 (SE +/- 1538.33, N = 15); EPYC 9754: 127404.16 (SE +/- 667.76, N = 9); GPTshop GH200: 57191.85 (SE +/- 113.65, N = 8); Xeon Platinum 8490H: 88049.90 (SE +/- 405.89, N = 9). Notes: 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz 2. Open MPI 4.1.5
NAS Parallel Benchmarks 3.4, Test / Class: MG.C (Total Mop/s Per Watt, More Is Better): Ampere Altra Max 128c: 407.26; EPYC 9554: 1439.34; EPYC 9654: 1167.53; EPYC 9684X: 1291.84; EPYC 9754: 1333.53; Xeon Platinum 8490H: 549.13
NAS Parallel Benchmarks 3.4, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200: 3411
NAS Parallel Benchmarks 3.4, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 189.8 / Avg: 265.5 / Max: 392.9
NAS Parallel Benchmarks 3.4, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 91.6 / Avg: 95.4 / Max: 99.9
NAS Parallel Benchmarks 3.4, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200: 3411
NAS Parallel Benchmarks 3.4, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 185.9 / Avg: 287.4 / Max: 378.3
NAS Parallel Benchmarks 3.4, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 89.5 / Avg: 92.2 / Max: 98.0
NAS Parallel Benchmarks 3.4, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200: 3411
NAS Parallel Benchmarks 3.4, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 184.6 / Avg: 277.7 / Max: 351.4
NAS Parallel Benchmarks 3.4, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 87.5 / Avg: 90.4 / Max: 92.8
NAS Parallel Benchmarks 3.4, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200: 3411
NAS Parallel Benchmarks 3.4, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 185.9 / Avg: 237.6 / Max: 381.2
NAS Parallel Benchmarks 3.4, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 88.8 / Avg: 91.3 / Max: 97.1
OpenSSL 3.1, Algorithm: SHA512 (byte/s, More Is Better): Ampere Altra Max 128c: 34434960797 (SE +/- 36142057.88, N = 3); EPYC 9554: 33622028530 (SE +/- 4732001.41, N = 3); EPYC 9654: 40459323610 (SE +/- 16069151.20, N = 3); EPYC 9684X: 40105639580 (SE +/- 6401405.16, N = 3); EPYC 9754: 53648078203 (SE +/- 1530571.35, N = 3); GPTshop GH200: 45292038457 (SE +/- 90952676.69, N = 3); Xeon Platinum 8490H: 22445820447 (SE +/- 147259946.50, N = 3). Per-system additional flags, in export order: -lssl -lcrypto -m64 -m64 -m64 -m64 -lssl -lcrypto -m64 -lssl -lcrypto. Notes: 1. (CC) gcc options: -pthread -O3 -ldl
OpenSSL 3.1, Algorithm: SHA512 (byte/s Per Watt, More Is Better): Ampere Altra Max 128c: 192898458.48; EPYC 9554: 112644110.72; EPYC 9654: 133650046.77; EPYC 9684X: 131019550.91; EPYC 9754: 160226634.86; Xeon Platinum 8490H: 65707856.51
OpenSSL 3.1, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200 Min: 2729 / Avg: 3344 / Max: 3411
OpenSSL 3.1, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 188.8 / Avg: 307.5 / Max: 345.7
OpenSSL 3.1, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 89.0 / Avg: 97.8 / Max: 99.8
Primesieve 8.0, Length: 1e13 (Seconds, Fewer Is Better): Ampere Altra Max 128c: 41.68 (SE +/- 0.07, N = 3); EPYC 9554: 28.14 (SE +/- 0.03, N = 3); EPYC 9654: 24.16 (SE +/- 0.01, N = 3); EPYC 9684X: 25.97 (SE +/- 0.02, N = 3); EPYC 9754: 21.76 (SE +/- 0.01, N = 3); GPTshop GH200: 36.05 (SE +/- 0.51, N = 3); Xeon Platinum 8490H: 52.38 (SE +/- 0.07, N = 3). Notes: 1. (CXX) g++ options: -O3
Primesieve 8.0, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200 Min: 2729 / Avg: 3155 / Max: 3411
Primesieve 8.0, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 337.8 / Avg: 371.2 / Max: 387.2
Primesieve 8.0, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 85.2 / Avg: 96.6 / Max: 99.6
RawTherapee, Total Benchmark Time (Seconds, Fewer Is Better): Ampere Altra Max 128c: 66.05 (SE +/- 0.02, N = 3); EPYC 9554: 49.94 (SE +/- 0.04, N = 3); EPYC 9654: 52.05 (SE +/- 0.07, N = 3); EPYC 9684X: 52.99 (SE +/- 0.03, N = 3); EPYC 9754: 66.13 (SE +/- 0.22, N = 3); GPTshop GH200: 46.55 (SE +/- 0.10, N = 3); Xeon Platinum 8490H: 46.58 (SE +/- 0.05, N = 3). Notes: 1. RawTherapee, version 5.9, command line.
RawTherapee, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200: 3411
RawTherapee, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 188.6 / Avg: 208.0 / Max: 251.3
RawTherapee, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 82.8 / Avg: 83.8 / Max: 85.2
RocksDB 8.0, Test: Random Read (Op/s, More Is Better): Ampere Altra Max 128c: 435437276 (SE +/- 3039704.34, N = 15); EPYC 9554: 376983415 (SE +/- 106225.72, N = 3); EPYC 9654: 429310263 (SE +/- 1006626.36, N = 3); EPYC 9684X: 486862785 (SE +/- 539077.40, N = 3); EPYC 9754: 611259161 (SE +/- 817605.37, N = 3); GPTshop GH200: 429416658 (SE +/- 3229592.93, N = 11); Xeon Platinum 8490H: 268374486 (SE +/- 6039899.37, N = 15). Notes: 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 8.0, Test: Random Read (Op/s Per Watt, More Is Better): Ampere Altra Max 128c: 2282816.53; EPYC 9554: 1450487.41; EPYC 9654: 1599837.63; EPYC 9684X: 1587869.93; EPYC 9754: 2068476.16; Xeon Platinum 8490H: 808152.35
RocksDB 8.0, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200 Min: 2729 / Avg: 3179 / Max: 3411
RocksDB 8.0, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 178.0 / Avg: 319.3 / Max: 382.4
RocksDB 8.0, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 85.0 / Avg: 98.3 / Max: 101.2
Rodinia 3.1, Test: OpenMP LavaMD (Seconds, Fewer Is Better): Ampere Altra Max 128c: 31.79 (SE +/- 0.01, N = 3); EPYC 9554: 34.01 (SE +/- 0.05, N = 3); EPYC 9654: 30.24 (SE +/- 0.06, N = 3); EPYC 9684X: 30.32 (SE +/- 0.05, N = 3); EPYC 9754: 25.15 (SE +/- 0.10, N = 3); GPTshop GH200: 30.96 (SE +/- 0.07, N = 3); Xeon Platinum 8490H: 42.71 (SE +/- 0.53, N = 3). Notes: 1. (CXX) g++ options: -O2 -lOpenCL
Rodinia 3.1, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200 Min: 2729 / Avg: 3119 / Max: 3411
Rodinia 3.1, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 186.0 / Avg: 267.7 / Max: 379.8
Rodinia 3.1, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 88.9 / Avg: 93.9 / Max: 100.1
Stress-NG 0.16.04, Test: NUMA (Bogo Ops/s, More Is Better): Ampere Altra Max 128c: 1454.68 (SE +/- 2.51, N = 3); EPYC 9554: 967.56 (SE +/- 4.03, N = 3); EPYC 9654: 1237.36 (SE +/- 1.10, N = 3); EPYC 9684X: 1173.07 (SE +/- 1.38, N = 3); EPYC 9754: 1397.61 (SE +/- 2.90, N = 3); GPTshop GH200: 6279.41 (SE +/- 10.32, N = 3); Xeon Platinum 8490H: 1144.04 (SE +/- 1.16, N = 3). Per-system additional flags, in export order: -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz | -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz | -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz | -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz | -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz. Notes: 1. (CXX) g++ options: -lc
Stress-NG 0.16.04, Test: NUMA (Bogo Ops/s Per Watt, More Is Better): Ampere Altra Max 128c: 20.369; EPYC 9554: 3.450; EPYC 9654: 4.088; EPYC 9684X: 4.106; EPYC 9754: 4.951; Xeon Platinum 8490H: 3.603
Stress-NG 0.16.04, Test: Memory Copying (Bogo Ops/s, More Is Better): Ampere Altra Max 128c: 27163.45 (SE +/- 0.31, N = 3); EPYC 9554: 22827.20 (SE +/- 19.64, N = 3); EPYC 9654: 28310.82 (SE +/- 22.89, N = 3); EPYC 9684X: 26495.63 (SE +/- 12.67, N = 3); EPYC 9754: 34497.51 (SE +/- 23.24, N = 3); GPTshop GH200: 27302.27 (SE +/- 58.68, N = 3); Xeon Platinum 8490H: 18809.55 (SE +/- 67.98, N = 3). Per-system additional flags, in export order: -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz | -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz | -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz | -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz | -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz. Notes: 1. (CXX) g++ options: -lc
Stress-NG 0.16.04, Test: Memory Copying (Bogo Ops/s Per Watt, More Is Better): Ampere Altra Max 128c: 166.58; EPYC 9554: 78.72; EPYC 9654: 95.60; EPYC 9684X: 97.03; EPYC 9754: 123.67; Xeon Platinum 8490H: 59.82
Stress-NG 0.16.04, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200: 3411
Stress-NG 0.16.04, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 188.5 / Avg: 282.8 / Max: 319.4
Stress-NG 0.16.04, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 83.2 / Avg: 88.8 / Max: 92.9
Stress-NG 0.16.04, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200: 3411
Stress-NG 0.16.04, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 193.8 / Avg: 299.1 / Max: 319.5
Stress-NG 0.16.04, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 86.5 / Avg: 91.8 / Max: 93.5
Phoronix Test Suite System Monitoring, System Power Consumption Monitor (Watts): GPTshop GH200 Min: 167.62 / Avg: 281.77 / Max: 416.43
Phoronix Test Suite System Monitoring, System Temperature Monitor (Celsius): GPTshop GH200 Min: 71.9 / Avg: 92.77 / Max: 102.5
Timed Gem5 Compilation 23.0.1, Time To Compile (Seconds, Fewer Is Better): Ampere Altra Max 128c: 260.33 (SE +/- 3.13, N = 4); EPYC 9554: 179.74 (SE +/- 1.67, N = 3); EPYC 9654: 180.86 (SE +/- 1.09, N = 3); EPYC 9684X: 180.06 (SE +/- 1.97, N = 3); EPYC 9754: 208.58 (SE +/- 2.85, N = 3); GPTshop GH200: 184.66 (SE +/- 2.18, N = 12); Xeon Platinum 8490H: 197.76 (SE +/- 1.53, N = 3)
Timed Gem5 Compilation 23.0.1, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200: 3411
Timed Gem5 Compilation 23.0.1, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 178.1 / Avg: 252.6 / Max: 366.7
Timed Gem5 Compilation 23.0.1, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 80.1 / Avg: 87.5 / Max: 97.8
Timed Godot Game Engine Compilation 4.0, Time To Compile (Seconds, Fewer Is Better): Ampere Altra Max 128c: 227.78 (SE +/- 0.35, N = 3); EPYC 9554: 103.50 (SE +/- 0.03, N = 3); EPYC 9654: 101.56 (SE +/- 0.16, N = 3); EPYC 9684X: 98.56 (SE +/- 0.06, N = 3); EPYC 9754: 118.25 (SE +/- 0.26, N = 3); GPTshop GH200: 138.89 (SE +/- 0.65, N = 3); Xeon Platinum 8490H: 128.47 (SE +/- 0.43, N = 3)
Timed Godot Game Engine Compilation 4.0, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200: 3411
Timed Godot Game Engine Compilation 4.0, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 188.5 / Avg: 260.8 / Max: 379.7
Timed Godot Game Engine Compilation 4.0, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 83.7 / Avg: 91.2 / Max: 99.3
Timed Linux Kernel Compilation 6.1, Build: allmodconfig (Seconds, Fewer Is Better): Ampere Altra Max 128c: 307.88 (SE +/- 0.75, N = 3); EPYC 9554: 238.29 (SE +/- 0.61, N = 3); EPYC 9654: 284.25 (SE +/- 0.41, N = 3); EPYC 9684X: 283.87 (SE +/- 0.50, N = 3); EPYC 9754: 319.43 (SE +/- 2.48, N = 3); GPTshop GH200: 283.34 (SE +/- 1.02, N = 3); Xeon Platinum 8490H: 289.97 (SE +/- 0.60, N = 3)
Timed Linux Kernel Compilation 6.1, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200 Min: 2729 / Avg: 3254 / Max: 3411
Timed Linux Kernel Compilation 6.1, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 183.5 / Avg: 302.8 / Max: 371.9
Timed Linux Kernel Compilation 6.1, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 84.2 / Avg: 97.0 / Max: 100.1
Timed LLVM Compilation 16.0, Build System: Ninja (Seconds, Fewer Is Better): Ampere Altra Max 128c: 266.54 (SE +/- 0.95, N = 3); EPYC 9554: 131.20 (SE +/- 0.11, N = 3); EPYC 9654: 133.07 (SE +/- 0.08, N = 3); EPYC 9684X: 131.99 (SE +/- 0.07, N = 3); EPYC 9754: 146.39 (SE +/- 0.28, N = 3); GPTshop GH200: 195.60 (SE +/- 0.88, N = 3); Xeon Platinum 8490H: 169.53 (SE +/- 0.08, N = 3)
Timed LLVM Compilation 16.0, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200 Min: 2729 / Avg: 3349 / Max: 3411
Timed LLVM Compilation 16.0, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 183.3 / Avg: 296.8 / Max: 358.9
Timed LLVM Compilation 16.0, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 86.9 / Avg: 95.0 / Max: 99.5
Timed Node.js Compilation 19.8.1, Time To Compile (Seconds, Fewer Is Better): Ampere Altra Max 128c: 268.70 (SE +/- 0.19, N = 3); EPYC 9554: 121.93 (SE +/- 0.32, N = 3); EPYC 9654: 120.16 (SE +/- 0.08, N = 3); EPYC 9684X: 119.20 (SE +/- 0.09, N = 3); EPYC 9754: 130.46 (SE +/- 0.08, N = 3); GPTshop GH200: 173.83 (SE +/- 0.08, N = 3); Xeon Platinum 8490H: 151.57 (SE +/- 0.27, N = 3)
Timed Node.js Compilation 19.8.1, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200: 3411
Timed Node.js Compilation 19.8.1, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 185.9 / Avg: 300.4 / Max: 358.7
Timed Node.js Compilation 19.8.1, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 87.0 / Avg: 92.8 / Max: 97.8
Xcompact3d Incompact3d 2021-03-11, Input: X3D-benchmarking input.i3d (Seconds, Fewer Is Better): Ampere Altra Max 128c: 606.52 (SE +/- 0.07, N = 3); EPYC 9554: 515.79 (SE +/- 7.98, N = 9); EPYC 9654: 430.08 (SE +/- 8.73, N = 9); EPYC 9684X: 440.55 (SE +/- 5.21, N = 9); EPYC 9754: 493.50 (SE +/- 14.11, N = 3); GPTshop GH200: 257.34 (SE +/- 1.42, N = 3); Xeon Platinum 8490H: 353.56 (SE +/- 4.12, N = 4). Notes: 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better): Ampere Altra Max 128c: 23.88522400 (SE +/- 0.05179277, N = 3); EPYC 9554: 9.74551621 (SE +/- 0.03040517, N = 5); EPYC 9654: 8.45212936 (SE +/- 0.07670149, N = 5); EPYC 9684X: 7.30814419 (SE +/- 0.03937117, N = 5); EPYC 9754: 9.02806168 (SE +/- 0.04030589, N = 5); GPTshop GH200: 9.83151112 (SE +/- 0.03425295, N = 5); Xeon Platinum 8490H: 12.79834200 (SE +/- 0.04713862, N = 4). Notes: 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Xcompact3d Incompact3d 2021-03-11, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200 Min: 2729 / Avg: 3268 / Max: 3411
Xcompact3d Incompact3d 2021-03-11, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 188.6 / Avg: 323.5 / Max: 395.2
Xcompact3d Incompact3d 2021-03-11, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 82.5 / Avg: 96.2 / Max: 101.8
Xcompact3d Incompact3d 2021-03-11, CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better): GPTshop GH200: 3411
Xcompact3d Incompact3d 2021-03-11, System Power Consumption Monitor (Watts, Fewer Is Better): GPTshop GH200 Min: 191.2 / Avg: 283.9 / Max: 358.9
Xcompact3d Incompact3d 2021-03-11, System Temperature Monitor (Celsius, Fewer Is Better): GPTshop GH200 Min: 92.0 / Avg: 94.7 / Max: 98.5
Xmrig 6.18.1
Variant: Monero - Hash Count: 1M
H/s, More Is Better

| System | H/s | SE +/- | N |
|---|---|---|---|
| Ampere Altra Max 128c | 4322.1 | 16.52 | 3 |
| EPYC 9554 | 47926.4 | 42.25 | 3 |
| EPYC 9654 | 59047.7 | 142.78 | 3 |
| EPYC 9684X | 69205.6 | 80.10 | 4 |
| EPYC 9754 | 29356.1 | 2550.91 | 12 |
| GPTshop GH200 | 17290.3 | 5.62 | 3 |
| Xeon Platinum 8490H | 27608.5 | 7.51 | 3 |

Per-system flag reported for five of the seven systems: -maes
1. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
Xmrig 6.18.1
Variant: Monero - Hash Count: 1M
H/s Per Watt, More Is Better

| System | H/s Per Watt |
|---|---|
| Ampere Altra Max 128c | 36.63 |
| EPYC 9554 | 166.56 |
| EPYC 9654 | 208.70 |
| EPYC 9684X | 245.11 |
| EPYC 9754 | 124.40 |
| Xeon Platinum 8490H | 86.36 |
Xmrig 6.18.1
Variant: Wownero - Hash Count: 1M
H/s, More Is Better

| System | H/s | SE +/- | N |
|---|---|---|---|
| Ampere Altra Max 128c | 1909.2 | 4.67 | 3 |
| EPYC 9554 | 60032.4 | 18.19 | 3 |
| EPYC 9654 | 75127.4 | 60.06 | 4 |
| EPYC 9684X | 73567.3 | 35.19 | 4 |
| EPYC 9754 | 77607.2 | 604.78 | 4 |
| GPTshop GH200 | 21993.3 | 2.75 | 3 |
| Xeon Platinum 8490H | 35870.2 | 13.06 | 3 |

Per-system flag reported for five of the seven systems: -maes
1. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
Xmrig 6.18.1
Variant: Wownero - Hash Count: 1M
H/s Per Watt, More Is Better

| System | H/s Per Watt |
|---|---|
| Ampere Altra Max 128c | 18.61 |
| EPYC 9554 | 212.04 |
| EPYC 9654 | 270.21 |
| EPYC 9684X | 265.76 |
| EPYC 9754 | 284.00 |
| Xeon Platinum 8490H | 114.36 |
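The Xmrig H/s and H/s Per Watt figures above can be combined to back out an approximate average power draw per system. The sketch below assumes that each per-watt figure is simply the hash rate divided by the average system power during that run; the export does not state how the value was derived, so treat the output as an estimate, not a measured result. The names `XMRIG` and `implied_avg_watts` are hypothetical. The GPTshop GH200 is left out because the export lists no H/s Per Watt value for it.

```python
# (H/s, H/s per watt) pairs copied from the Xmrig results above.
XMRIG = {
    "Monero": {
        "Ampere Altra Max 128c": (4322.1, 36.63),
        "EPYC 9554": (47926.4, 166.56),
        "EPYC 9654": (59047.7, 208.70),
        "EPYC 9684X": (69205.6, 245.11),
        "EPYC 9754": (29356.1, 124.40),
        "Xeon Platinum 8490H": (27608.5, 86.36),
    },
    "Wownero": {
        "Ampere Altra Max 128c": (1909.2, 18.61),
        "EPYC 9554": (60032.4, 212.04),
        "EPYC 9654": (75127.4, 270.21),
        "EPYC 9684X": (73567.3, 265.76),
        "EPYC 9754": (77607.2, 284.00),
        "Xeon Platinum 8490H": (35870.2, 114.36),
    },
}


def implied_avg_watts(hashes_per_second, hashes_per_second_per_watt):
    """Estimate average power as throughput divided by throughput-per-watt.

    Assumption: the per-watt figure was computed as H/s over average watts.
    """
    return hashes_per_second / hashes_per_second_per_watt


if __name__ == "__main__":
    for variant, systems in XMRIG.items():
        print(f"Xmrig {variant}")
        for system, (hs, hs_per_watt) in systems.items():
            watts = implied_avg_watts(hs, hs_per_watt)
            print(f"  {system:<24} ~{watts:6.1f} W (estimated)")
```

Because both inputs are rounded in the export, the estimates carry rounding error of their own and are only useful for rough comparison between systems.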
Xmrig 6.18.1: GPTshop GH200 monitors (first monitor set in the export)

CPU Peak Freq (Highest CPU Core Frequency) Monitor, Megahertz, More Is Better: 3411 (single value reported)

| Monitor | Unit | Min | Avg | Max |
|---|---|---|---|---|
| System Power Consumption | Watts, Fewer Is Better | 188.6 | 296.1 | 316.9 |
| System Temperature | Celsius, Fewer Is Better | 84.8 | 90.4 | 93.0 |
Xmrig 6.18.1: GPTshop GH200 monitors (second monitor set in the export)

CPU Peak Freq (Highest CPU Core Frequency) Monitor, Megahertz, More Is Better: 3411 (single value reported)

| Monitor | Unit | Min | Avg | Max |
|---|---|---|---|---|
| System Power Consumption | Watts, Fewer Is Better | 298.5 | 310.4 | 314.3 |
| System Temperature | Celsius, Fewer Is Better | 87.1 | 91.7 | 93.2 |
Phoronix Test Suite v10.8.5