NVIDIA GH200 Benchmarks Smoke Run Comparison
Benchmarks by Michael Larabel for a future article. These are initial smoke-run benchmarks comparing the NVIDIA GH200 CPU performance against other server CPUs. More interesting benchmarks to come.
HTML result view exported from: https://openbenchmarking.org/result/2401256-NE-NVIDIAGH254&sro&gru
System Details (each later system lists only the fields the export gives for it)

EPYC 9554
- Processor: AMD EPYC 9554 64-Core @ 3.10GHz (64 Cores / 128 Threads)
- Motherboard: AMD Titanite_4G (RTI1007B BIOS)
- Chipset: AMD Device 14a4
- Memory: 768GB
- Disk: 3201GB Micron_7450_MTFDKCC3T2TFS
- Graphics: ASPEED
- Network: Broadcom NetXtreme BCM5720 PCIe
- OS: Ubuntu 23.10
- Kernel: 6.6.0-rc5-phx-patched (x86_64)
- Desktop: GNOME Shell 45.0
- Display Server: X Server 1.21.1.7
- Compiler: GCC 13.2.0
- File-System: ext4
- Screen Resolution: 1920x1200

EPYC 9654
- Processor: AMD EPYC 9654 96-Core @ 2.40GHz (96 Cores / 192 Threads)

EPYC 9684X
- Processor: AMD EPYC 9684X 96-Core @ 2.55GHz (96 Cores / 192 Threads)

EPYC 9754
- Processor: AMD EPYC 9754 128-Core @ 2.25GHz (128 Cores / 256 Threads)

Xeon Platinum 8490H
- Processor: Intel Xeon Platinum 8490H @ 3.50GHz (60 Cores / 120 Threads)
- Motherboard: Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS)
- Chipset: Intel Device 1bce
- Memory: 512GB

Ampere Altra Max 128c
- Processor: ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores)
- Motherboard: GIGABYTE MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS)
- Chipset: Ampere Computing LLC Altra PCI Root Complex A
- Monitor: VGA HDMI
- Network: 2 x Intel I350
- Kernel: 6.6.0-rc5-phx-patched (aarch64)
- Screen Resolution: 1920x1080

GPTshop GH200
- Processor: ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores)
- Motherboard: Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS)
- Memory: 1 x 480 GB DRAM-6400MT/s
- Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9
- Network: 2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
- Kernel: 6.5.0-15-generic (aarch64)
- Screen Resolution: 1920x1200

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- EPYC 9554, EPYC 9654, EPYC 9684X, EPYC 9754, Xeon Platinum 8490H (identical on all five): --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Ampere Altra Max 128c, GPTshop GH200 (identical on both): --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Processor Details
- EPYC 9554: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
- EPYC 9654: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
- EPYC 9684X: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e
- EPYC 9754: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xaa00116
- Xeon Platinum 8490H: Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0x2b0004b1
- Ampere Altra Max 128c: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
- GPTshop GH200: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)

Java Details
- EPYC 9554, EPYC 9654, EPYC 9684X, EPYC 9754, Xeon Platinum 8490H, Ampere Altra Max 128c: OpenJDK Runtime Environment (build 11.0.20+8-post-Ubuntu-1ubuntu1)
- GPTshop GH200: OpenJDK Runtime Environment (build 11.0.21+9-post-Ubuntu-0ubuntu123.10)

Python Details
- Python 3.11.6

Security Details
- EPYC 9554, EPYC 9654, EPYC 9684X, EPYC 9754 (identical on all four): gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
- Xeon Platinum 8490H: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
- Ampere Altra Max 128c: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
- GPTshop GH200: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Result Overview

Test | EPYC 9554 | EPYC 9654 | EPYC 9684X | EPYC 9754 | Xeon Platinum 8490H | Ampere Altra Max 128c | GPTshop GH200
graph500: 26 | 865152000 | 839917000 | 882181000 | 1184510000 | 1113120000 | 987770000 | 1313880000
graph500: 26 | 836159000 | 816172000 | 854029000 | 1147090000 | 1073860000 | 980463000 | 1214290000
stress-ng: NUMA | 967.56 | 1237.36 | 1173.07 | 1397.61 | 1144.04 | 1454.68 | 6279.41
stress-ng: Memory Copying | 22827.20 | 28310.82 | 26495.63 | 34497.51 | 18809.55 | 27163.45 | 27302.27
openssl: SHA512 | 33622028530 | 40459323610 | 40105639580 | 53648078203 | 22445820447 | 34434960797 | 45292038457
minife: Small | 50351.4 | 50320.2 | 52937.6 | 50393.7 | 35452.5 | 24122.0 | 46477.3
amg: | 2321156000 | 2296730667 | 2374853333 | 2291049667 | 1611826000 | 1060656000 | 1993579333
hpcg: 144 144 144 - 60 | 22.4511 | 32.1283 | 23.7784 | 25.8918 | 31.2432 | 21.2399 | 41.5526
mt-dgemm: Sustained Floating-Point Rate | 30.174904 | 39.282158 | 40.320062 | 43.681923 | 23.257388 | 18.114994 | 17.563016
xmrig: Monero - 1M | 47926.4 | 59047.7 | 69205.6 | 29356.1 | 27608.5 | 4322.1 | 17290.3
xmrig: Wownero - 1M | 60032.4 | 75127.4 | 73567.3 | 77607.2 | 35870.2 | 1909.2 | 21993.3
graphics-magick: Sharpen | 669 | 809 | 775 | 924 | 691 | 1274 | 1244
graphics-magick: Enhanced | 1142 | 1319 | 1277 | 1451 | 1124 | 1230 | 1584
askap: tConvolve MT - Gridding | 10327.5 | 12332.4 | 14731.5 | 12660.8 | 8112.07 | 4295.72 | 9561.31
compress-7zip: Compression Rating | 489736 | 564950 | 554411 | 586571 | 389690 | 328378 | 341300
askap: tConvolve MPI - Degridding | 31177.2 | 41081.6 | 58310.1 | 26593.4 | 19616.0 | 5285.36 | 12429.5
askap: tConvolve MPI - Gridding | 37265.9 | 47003.2 | 67965.5 | 27269.2 | 22365.2 | 2416.54 | 9678.85
cassandra: Writes | 301963 | 285324 | 286460 | 236320 | 139654 | 227548 | 371439
rocksdb: Rand Read | 376983415 | 429310263 | 486862785 | 611259161 | 268374486 | 435437276 | 429416658
graph500: 26 | 467453000 | 499347000 | 501222000 | 502151000 | 454483000 | 333239000 | 463087000
graph500: 26 | 368016000 | 391167000 | 383754000 | 377033000 | 333129000 | 224032000 | 298983000
npb: CG.C | 45867.44 | 54465.04 | 59557.08 | 44431.58 | 35560.25 | 17272.73 | 24350.32
npb: FT.C | 124827.75 | 125164.03 | 124391.00 | 141255.07 | 70989.97 | 35153.27 | 47948.15
npb: IS.D | 5184.36 | 5195.24 | 6138.54 | 5773.25 | 2885.71 | 1196.33 | 1770.65
npb: MG.C | 125921.29 | 118608.07 | 137608.68 | 127404.16 | 88049.90 | 42291.84 | 57191.85
lulesh: | 24037.442 | 23701.442 | 24712.884 | 22356.746 | 23997.715 | 16218.331 | 23433.373
cloverleaf: clover_bm16 | 460.24 | 429.46 | 298.29 | 459.16 | 324.17 | 550.30 | 247.28
cloverleaf: clover_bm64_short | 45.82 | 47.98 | 51.31 | 49.03 | 38.93 | 62.77 | 28.36
rodinia: OpenMP LavaMD | 34.014 | 30.240 | 30.317 | 25.150 | 42.713 | 31.791 | 30.959
incompact3d: X3D-benchmarking input.i3d | 515.792758 | 430.080329 | 440.548584 | 493.502452 | 353.557595 | 606.521708 | 257.342539
incompact3d: input.i3d 193 Cells Per Direction | 9.74551621 | 8.45212936 | 7.30814419 | 9.02806168 | 12.7983420 | 23.8852240 | 9.83151112
easywave: e2Asean Grid + BengkuluSept2007 Source - 1200 | 22.452 | 23.507 | 24.579 | 29.588 | 45.850 | 122.550 | 36.562
easywave: e2Asean Grid + BengkuluSept2007 Source - 2400 | 56.883 | 59.535 | 60.252 | 72.355 | 118.554 | 326.368 | 98.719
build-godot: Time To Compile | 103.500 | 101.563 | 98.555 | 118.247 | 128.467 | 227.783 | 138.891
build-linux-kernel: allmodconfig | 238.289 | 284.249 | 283.865 | 319.426 | 289.966 | 307.878 | 283.339
build-llvm: Ninja | 131.203 | 133.070 | 131.990 | 146.386 | 169.528 | 266.543 | 195.598
build-nodejs: Time To Compile | 121.928 | 120.156 | 119.203 | 130.460 | 151.567 | 268.704 | 173.834
primesieve: 1e13 | 28.136 | 24.164 | 25.969 | 21.756 | 52.375 | 41.675 | 36.049
duckdb: IMDB | 103.720 | 115.805 | 116.050 | 147.601 | 99.249 | 143.781 | 91.633
duckdb: TPC-H Parquet | 143.306 | 146.276 | 147.489 | 177.134 | 148.099 | 237.161 | 156.987
rawtherapee: Total Benchmark Time | 49.943 | 52.051 | 52.991 | 66.132 | 46.578 | 66.045 | 46.546
build-gem5: Time To Compile | 179.743 | 180.857 | 180.061 | 208.576 | 197.759 | 260.326 | 184.661
System Temperature Monitor (Phoronix Test Suite System Monitoring, Celsius)
GPTshop GH200: Min: 71.9 / Avg: 92.77 / Max: 102.5
CPU Peak Freq (Highest CPU Core Frequency) Monitor (Phoronix Test Suite System Monitoring, Megahertz)
GPTshop GH200: Min: 2047 / Avg: 3332.19 / Max: 4104
System Power Consumption Monitor (Phoronix Test Suite System Monitoring, Watts)
GPTshop GH200: Min: 167.62 / Avg: 281.77 / Max: 416.43
Graph500 3.0, Scale: 26 (bfs max_TEPS, More Is Better)
- Ampere Altra Max 128c: 987770000
- EPYC 9554: 865152000
- EPYC 9654: 839917000
- EPYC 9684X: 882181000
- EPYC 9754: 1184510000
- GPTshop GH200: 1313880000
- Xeon Platinum 8490H: 1113120000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
Graph500 3.0, Scale: 26 (bfs median_TEPS, More Is Better)
- Ampere Altra Max 128c: 980463000
- EPYC 9554: 836159000
- EPYC 9654: 816172000
- EPYC 9684X: 854029000
- EPYC 9754: 1147090000
- GPTshop GH200: 1214290000
- Xeon Platinum 8490H: 1073860000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi
Stress-NG 0.16.04, Test: NUMA (Bogo Ops/s, More Is Better)
- Ampere Altra Max 128c: 1454.68 (SE +/- 2.51, N = 3)
- EPYC 9554: 967.56 (SE +/- 4.03, N = 3)
- EPYC 9654: 1237.36 (SE +/- 1.10, N = 3)
- EPYC 9684X: 1173.07 (SE +/- 1.38, N = 3)
- EPYC 9754: 1397.61 (SE +/- 2.90, N = 3)
- GPTshop GH200: 6279.41 (SE +/- 10.32, N = 3)
- Xeon Platinum 8490H: 1144.04 (SE +/- 1.16, N = 3)
Additional per-system build options, in export order: -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz
1. (CXX) g++ options: -lc
Stress-NG 0.16.04, Test: NUMA (Bogo Ops/s Per Watt, More Is Better)
- Ampere Altra Max 128c: 20.369
- EPYC 9554: 3.450
- EPYC 9654: 4.088
- EPYC 9684X: 4.106
- EPYC 9754: 4.951
- Xeon Platinum 8490H: 3.603
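The two Stress-NG NUMA charts above can be combined to estimate how much power each system drew during that test. The following is a minimal sketch, assuming (the export does not state this) that each per-watt figure is the raw result divided by the average power measured during the run; the function name `implied_average_watts` is hypothetical.

```python
# Values transcribed from the two Stress-NG NUMA charts above.
# GPTshop GH200 is omitted because the export lists no per-watt figure for it.
bogo_ops_per_s = {
    "Ampere Altra Max 128c": 1454.68,
    "EPYC 9554": 967.56,
    "EPYC 9654": 1237.36,
    "EPYC 9684X": 1173.07,
    "EPYC 9754": 1397.61,
    "Xeon Platinum 8490H": 1144.04,
}
bogo_ops_per_s_per_watt = {
    "Ampere Altra Max 128c": 20.369,
    "EPYC 9554": 3.450,
    "EPYC 9654": 4.088,
    "EPYC 9684X": 4.106,
    "EPYC 9754": 4.951,
    "Xeon Platinum 8490H": 3.603,
}


def implied_average_watts(rate, rate_per_watt):
    """Estimate average power, ASSUMING per-watt = rate / average watts."""
    return {name: rate[name] / rate_per_watt[name] for name in rate_per_watt}


implied_watts = implied_average_watts(bogo_ops_per_s, bogo_ops_per_s_per_watt)
for name, watts in implied_watts.items():
    print(f"{name}: ~{watts:.1f} W")
```

If the assumption holds, the same division works for any other result/per-watt chart pair in this export.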
Stress-NG 0.16.04, Test: Memory Copying (Bogo Ops/s, More Is Better)
- Ampere Altra Max 128c: 27163.45 (SE +/- 0.31, N = 3)
- EPYC 9554: 22827.20 (SE +/- 19.64, N = 3)
- EPYC 9654: 28310.82 (SE +/- 22.89, N = 3)
- EPYC 9684X: 26495.63 (SE +/- 12.67, N = 3)
- EPYC 9754: 34497.51 (SE +/- 23.24, N = 3)
- GPTshop GH200: 27302.27 (SE +/- 58.68, N = 3)
- Xeon Platinum 8490H: 18809.55 (SE +/- 67.98, N = 3)
Additional per-system build options, in export order: -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz -O2 -std=gnu99 -lm -laio -lapparmor -latomic -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz
1. (CXX) g++ options: -lc
Stress-NG 0.16.04, Test: Memory Copying (Bogo Ops/s Per Watt, More Is Better)
- Ampere Altra Max 128c: 166.58
- EPYC 9554: 78.72
- EPYC 9654: 95.60
- EPYC 9684X: 97.03
- EPYC 9754: 123.67
- Xeon Platinum 8490H: 59.82
OpenSSL 3.1, Algorithm: SHA512 (byte/s, More Is Better)
- Ampere Altra Max 128c: 34434960797 (SE +/- 36142057.88, N = 3)
- EPYC 9554: 33622028530 (SE +/- 4732001.41, N = 3)
- EPYC 9654: 40459323610 (SE +/- 16069151.20, N = 3)
- EPYC 9684X: 40105639580 (SE +/- 6401405.16, N = 3)
- EPYC 9754: 53648078203 (SE +/- 1530571.35, N = 3)
- GPTshop GH200: 45292038457 (SE +/- 90952676.69, N = 3)
- Xeon Platinum 8490H: 22445820447 (SE +/- 147259946.50, N = 3)
Additional per-system build options, in export order: -lssl -lcrypto -m64 -m64 -m64 -m64 -lssl -lcrypto -m64 -lssl -lcrypto
1. (CC) gcc options: -pthread -O3 -ldl
OpenSSL 3.1, Algorithm: SHA512 (byte/s Per Watt, More Is Better)
- Ampere Altra Max 128c: 192898458.48
- EPYC 9554: 112644110.72
- EPYC 9654: 133650046.77
- EPYC 9684X: 131019550.91
- EPYC 9754: 160226634.86
- Xeon Platinum 8490H: 65707856.51
miniFE 2.2, Problem Size: Small (CG Mflops Per Watt, More Is Better)
- Ampere Altra Max 128c: 248.68
- EPYC 9554: 396.88
- EPYC 9654: 340.61
- EPYC 9684X: 351.89
- EPYC 9754: 416.05
- Xeon Platinum 8490H: 154.24
miniFE 2.2, Problem Size: Small (CG Mflops, More Is Better)
- Ampere Altra Max 128c: 24122.0 (SE +/- 5.88, N = 4)
- EPYC 9554: 50351.4 (SE +/- 51.65, N = 5)
- EPYC 9654: 50320.2 (SE +/- 45.75, N = 5)
- EPYC 9684X: 52937.6 (SE +/- 507.76, N = 5)
- EPYC 9754: 50393.7 (SE +/- 105.96, N = 5)
- GPTshop GH200: 46477.3 (SE +/- 54.33, N = 5)
- Xeon Platinum 8490H: 35452.5 (SE +/- 14.30, N = 4)
1. (CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit Per Watt, More Is Better)
- Ampere Altra Max 128c: 7481610.11
- EPYC 9554: 12260061.49
- EPYC 9654: 9775985.20
- EPYC 9684X: 10140759.12
- EPYC 9754: 11292392.93
- Xeon Platinum 8490H: 5406835.94
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
- Ampere Altra Max 128c: 1060656000 (SE +/- 31942.66, N = 3)
- EPYC 9554: 2321156000 (SE +/- 3180941.37, N = 3)
- EPYC 9654: 2296730667 (SE +/- 5502047.33, N = 3)
- EPYC 9684X: 2374853333 (SE +/- 1953221.73, N = 3)
- EPYC 9754: 2291049667 (SE +/- 2811300.43, N = 3)
- GPTshop GH200: 1993579333 (SE +/- 22467067.78, N = 3)
- Xeon Platinum 8490H: 1611826000 (SE +/- 269139.99, N = 3)
1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi
High Performance Conjugate Gradient 3.1, X Y Z: 144 144 144 - RT: 60 (GFLOP/s, More Is Better)
- Ampere Altra Max 128c: 21.24 (SE +/- 0.00, N = 3)
- EPYC 9554: 22.45 (SE +/- 0.04, N = 3)
- EPYC 9654: 32.13 (SE +/- 0.35, N = 3)
- EPYC 9684X: 23.78 (SE +/- 0.28, N = 9)
- EPYC 9754: 25.89 (SE +/- 0.27, N = 3)
- GPTshop GH200: 41.55 (SE +/- 0.36, N = 3)
- Xeon Platinum 8490H: 31.24 (SE +/- 0.11, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
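To read the HPCG chart above as relative performance rather than absolute GFLOP/s, each result can be divided by a chosen baseline. This is only an illustrative sketch: the choice of EPYC 9554 as the baseline and the helper name `relative_to` are assumptions, not something the export prescribes.

```python
# HPCG 3.1 results in GFLOP/s (more is better), transcribed from the chart above.
hpcg_gflops = {
    "Ampere Altra Max 128c": 21.24,
    "EPYC 9554": 22.45,
    "EPYC 9654": 32.13,
    "EPYC 9684X": 23.78,
    "EPYC 9754": 25.89,
    "GPTshop GH200": 41.55,
    "Xeon Platinum 8490H": 31.24,
}


def relative_to(results, baseline):
    """Return each result as a multiple of the baseline system's result."""
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


# Baseline chosen arbitrarily for illustration.
relative = relative_to(hpcg_gflops, "EPYC 9554")
for name, ratio in sorted(relative.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {ratio:.2f}x")
```

For "fewer is better" tests such as the timed compilations, the ratio would need to be inverted before comparing.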
High Performance Conjugate Gradient 3.1, X Y Z: 144 144 144 - RT: 60 (GFLOP/s Per Watt, More Is Better)
- Ampere Altra Max 128c: 0.132
- EPYC 9554: 0.101
- EPYC 9654: 0.117
- EPYC 9684X: 0.087
- EPYC 9754: 0.109
- Xeon Platinum 8490H: 0.091
ACES DGEMM 1.0, Sustained Floating-Point Rate (GFLOP/s, More Is Better)
- Ampere Altra Max 128c: 18.11 (SE +/- 0.09, N = 4)
- EPYC 9554: 30.17 (SE +/- 0.28, N = 6)
- EPYC 9654: 39.28 (SE +/- 0.18, N = 7)
- EPYC 9684X: 40.32 (SE +/- 0.12, N = 7)
- EPYC 9754: 43.68 (SE +/- 0.13, N = 7)
- GPTshop GH200: 17.56 (SE +/- 0.19, N = 15)
- Xeon Platinum 8490H: 23.26 (SE +/- 0.13, N = 5)
1. (CC) gcc options: -O3 -march=native -fopenmp
ACES DGEMM 1.0, Sustained Floating-Point Rate (GFLOP/s Per Watt, More Is Better)
- Ampere Altra Max 128c: 0.136
- EPYC 9554: 0.171
- EPYC 9654: 0.220
- EPYC 9684X: 0.219
- EPYC 9754: 0.223
- Xeon Platinum 8490H: 0.091
Xmrig 6.18.1, Variant: Monero - Hash Count: 1M (H/s Per Watt, More Is Better)
- Ampere Altra Max 128c: 36.63
- EPYC 9554: 166.56
- EPYC 9654: 208.70
- EPYC 9684X: 245.11
- EPYC 9754: 124.40
- Xeon Platinum 8490H: 86.36
Xmrig 6.18.1, Variant: Wownero - Hash Count: 1M (H/s Per Watt, More Is Better)
- Ampere Altra Max 128c: 18.61
- EPYC 9554: 212.04
- EPYC 9654: 270.21
- EPYC 9684X: 265.76
- EPYC 9754: 284.00
- Xeon Platinum 8490H: 114.36
Xmrig 6.18.1, Variant: Monero - Hash Count: 1M (H/s, More Is Better)
- Ampere Altra Max 128c: 4322.1 (SE +/- 16.52, N = 3)
- EPYC 9554: 47926.4 (SE +/- 42.25, N = 3)
- EPYC 9654: 59047.7 (SE +/- 142.78, N = 3)
- EPYC 9684X: 69205.6 (SE +/- 80.10, N = 4)
- EPYC 9754: 29356.1 (SE +/- 2550.91, N = 12)
- GPTshop GH200: 17290.3 (SE +/- 5.62, N = 3)
- Xeon Platinum 8490H: 27608.5 (SE +/- 7.51, N = 3)
Additional per-system build options, in export order: -maes -maes -maes -maes -maes
1. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
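The SE figures in the Xmrig Monero chart above are easier to compare once expressed as a percentage of each mean. The sketch below does that, assuming the SE entries in the export are listed in the same order as the systems (the flattened export does not make this explicit); `relative_standard_error_pct` is a hypothetical helper name.

```python
# (mean H/s, standard error) per system for Xmrig Monero 1M, from the chart above.
# ASSUMPTION: SE entries are paired with systems in the order both are listed.
xmrig_monero = {
    "Ampere Altra Max 128c": (4322.1, 16.52),
    "EPYC 9554": (47926.4, 42.25),
    "EPYC 9654": (59047.7, 142.78),
    "EPYC 9684X": (69205.6, 80.10),
    "EPYC 9754": (29356.1, 2550.91),
    "GPTshop GH200": (17290.3, 5.62),
    "Xeon Platinum 8490H": (27608.5, 7.51),
}


def relative_standard_error_pct(results):
    """Express each standard error as a percentage of its mean."""
    return {name: 100.0 * se / mean for name, (mean, se) in results.items()}


rse_pct = relative_standard_error_pct(xmrig_monero)
noisiest = max(rse_pct, key=rse_pct.get)
for name, pct in rse_pct.items():
    print(f"{name}: {pct:.2f}%")
print(f"Highest run-to-run variation: {noisiest}")
```

Under that pairing assumption, a system whose relative SE is far above the others is a candidate for a rerun before drawing conclusions from its mean.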
Xmrig 6.18.1, Variant: Wownero - Hash Count: 1M (H/s, More Is Better)
- Ampere Altra Max 128c: 1909.2 (SE +/- 4.67, N = 3)
- EPYC 9554: 60032.4 (SE +/- 18.19, N = 3)
- EPYC 9654: 75127.4 (SE +/- 60.06, N = 4)
- EPYC 9684X: 73567.3 (SE +/- 35.19, N = 4)
- EPYC 9754: 77607.2 (SE +/- 604.78, N = 4)
- GPTshop GH200: 21993.3 (SE +/- 2.75, N = 3)
- Xeon Platinum 8490H: 35870.2 (SE +/- 13.06, N = 3)
Additional per-system build options, in export order: -maes -maes -maes -maes -maes
1. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
GraphicsMagick 1.3.38, Operation: Sharpen (Iterations Per Minute Per Watt, More Is Better)
- Ampere Altra Max 128c: 8.016
- EPYC 9554: 2.691
- EPYC 9654: 3.127
- EPYC 9684X: 3.042
- EPYC 9754: 3.752
- Xeon Platinum 8490H: 2.081
GraphicsMagick 1.3.38, Operation: Sharpen (Iterations Per Minute, More Is Better)
- Ampere Altra Max 128c: 1274 (SE +/- 1.20, N = 3)
- EPYC 9554: 669 (SE +/- 0.33, N = 3)
- EPYC 9654: 809 (SE +/- 0.33, N = 3)
- EPYC 9684X: 775 (SE +/- 0.33, N = 3)
- EPYC 9754: 924 (SE +/- 0.67, N = 3)
- GPTshop GH200: 1244 (SE +/- 5.13, N = 3)
- Xeon Platinum 8490H: 691 (SE +/- 0.33, N = 3)
Additional per-system build options, in export order: -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2
1. (CC) gcc options: -fopenmp -O2 -ljpeg -lXext -lSM -lICE -lX11 -lz -lzstd -lm -lpthread
GraphicsMagick 1.3.38, Operation: Enhanced (Iterations Per Minute Per Watt, More Is Better)
- Ampere Altra Max 128c: 9.833
- EPYC 9554: 4.871
- EPYC 9654: 5.323
- EPYC 9684X: 5.193
- EPYC 9754: 6.328
- Xeon Platinum 8490H: 3.405
GraphicsMagick 1.3.38, Operation: Enhanced (Iterations Per Minute, More Is Better)
- Ampere Altra Max 128c: 1230 (SE +/- 2.73, N = 3)
- EPYC 9554: 1142 (SE +/- 0.88, N = 3)
- EPYC 9654: 1319 (SE +/- 0.88, N = 3)
- EPYC 9684X: 1277 (SE +/- 1.67, N = 3)
- EPYC 9754: 1451 (SE +/- 0.58, N = 3)
- GPTshop GH200: 1584 (SE +/- 0.33, N = 3)
- Xeon Platinum 8490H: 1124 (SE +/- 0.33, N = 3)
Additional per-system build options, in export order: -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2 -ljbig -lwebp -lwebpmux -ltiff -lfreetype -llzma -lbz2 -lxml2
1. (CC) gcc options: -fopenmp -O2 -ljpeg -lXext -lSM -lICE -lX11 -lz -lzstd -lm -lpthread
CPU Peak Freq (Highest CPU Core Frequency) Monitor, per-test charts for GPTshop GH200 (Megahertz, More Is Better), one line per chart in export order:
- High Performance Conjugate Gradient 3.1: Min 2729 / Avg 3345 / Max 4104
- NAS Parallel Benchmarks 3.4: 3411
- NAS Parallel Benchmarks 3.4: 3411
- NAS Parallel Benchmarks 3.4: 3411
- NAS Parallel Benchmarks 3.4: 3411
- CloverLeaf 1.3: Min 2729 / Avg 3346 / Max 3411
- CloverLeaf 1.3: 3411
- Rodinia 3.1: Min 2729 / Avg 3119 / Max 3411
- Xcompact3d Incompact3d 2021-03-11: Min 2729 / Avg 3268 / Max 3411
- Xcompact3d Incompact3d 2021-03-11: 3411
- LULESH 2.0.3: 3411
- Xmrig 6.18.1: 3411
- Xmrig 6.18.1: 3411
- GraphicsMagick 1.3.38: Min 2729 / Avg 3381 / Max 3411
- GraphicsMagick 1.3.38: 3411
- easyWave r34: 3411
- easyWave r34: 3411
- ACES DGEMM 1.0: 3411
- Timed Godot Game Engine Compilation 4.0: 3411
- Timed Linux Kernel Compilation 6.1: Min 2729 / Avg 3254 / Max 3411
- Timed LLVM Compilation 16.0: Min 2729 / Avg 3349 / Max 3411
Timed Node.js Compilation CPU Peak Freq (Highest CPU Core Frequency) Monitor OpenBenchmarking.org Megahertz, More Is Better Timed Node.js Compilation 19.8.1 CPU Peak Freq (Highest CPU Core Frequency) Monitor GPTshop GH200 700 1400 2100 2800 3500 3411
Primesieve CPU Peak Freq (Highest CPU Core Frequency) Monitor Min Avg Max GPTshop GH200 2729 3155 3411 OpenBenchmarking.org Megahertz, More Is Better Primesieve 8.0 CPU Peak Freq (Highest CPU Core Frequency) Monitor 800 1600 2400 3200 4000
OpenSSL CPU Peak Freq (Highest CPU Core Frequency) Monitor Min Avg Max GPTshop GH200 2729 3344 3411 OpenBenchmarking.org Megahertz, More Is Better OpenSSL 3.1 CPU Peak Freq (Highest CPU Core Frequency) Monitor 800 1600 2400 3200 4000
ASKAP CPU Peak Freq (Highest CPU Core Frequency) Monitor OpenBenchmarking.org Megahertz, More Is Better ASKAP 1.0 CPU Peak Freq (Highest CPU Core Frequency) Monitor GPTshop GH200 700 1400 2100 2800 3500 3411
Graph500 CPU Peak Freq (Highest CPU Core Frequency) Monitor Min Avg Max GPTshop GH200 2729 3246 3411 OpenBenchmarking.org Megahertz, More Is Better Graph500 3.0 CPU Peak Freq (Highest CPU Core Frequency) Monitor 800 1600 2400 3200 4000
DuckDB CPU Peak Freq (Highest CPU Core Frequency) Monitor OpenBenchmarking.org Megahertz, More Is Better DuckDB 0.9.1 CPU Peak Freq (Highest CPU Core Frequency) Monitor GPTshop GH200 700 1400 2100 2800 3500 3411
DuckDB CPU Peak Freq (Highest CPU Core Frequency) Monitor OpenBenchmarking.org Megahertz, More Is Better DuckDB 0.9.1 CPU Peak Freq (Highest CPU Core Frequency) Monitor GPTshop GH200 700 1400 2100 2800 3500 3411
RawTherapee CPU Peak Freq (Highest CPU Core Frequency) Monitor OpenBenchmarking.org Megahertz, More Is Better RawTherapee CPU Peak Freq (Highest CPU Core Frequency) Monitor GPTshop GH200 700 1400 2100 2800 3500 3411
Stress-NG CPU Peak Freq (Highest CPU Core Frequency) Monitor OpenBenchmarking.org Megahertz, More Is Better Stress-NG 0.16.04 CPU Peak Freq (Highest CPU Core Frequency) Monitor GPTshop GH200 700 1400 2100 2800 3500 3411
Stress-NG CPU Peak Freq (Highest CPU Core Frequency) Monitor OpenBenchmarking.org Megahertz, More Is Better Stress-NG 0.16.04 CPU Peak Freq (Highest CPU Core Frequency) Monitor GPTshop GH200 700 1400 2100 2800 3500 3411
Apache Cassandra CPU Peak Freq (Highest CPU Core Frequency) Monitor OpenBenchmarking.org Megahertz, More Is Better Apache Cassandra 4.1.3 CPU Peak Freq (Highest CPU Core Frequency) Monitor GPTshop GH200 700 1400 2100 2800 3500 3411
RocksDB CPU Peak Freq (Highest CPU Core Frequency) Monitor Min Avg Max GPTshop GH200 2729 3179 3411 OpenBenchmarking.org Megahertz, More Is Better RocksDB 8.0 CPU Peak Freq (Highest CPU Core Frequency) Monitor 800 1600 2400 3200 4000
Timed Gem5 Compilation CPU Peak Freq (Highest CPU Core Frequency) Monitor OpenBenchmarking.org Megahertz, More Is Better Timed Gem5 Compilation 23.0.1 CPU Peak Freq (Highest CPU Core Frequency) Monitor GPTshop GH200 700 1400 2100 2800 3500 3411
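One way to read the frequency monitor figures above is to compare each average with the 3411 MHz maximum. The following is a minimal sketch, assuming the "Avg" figure is the mean of the sampled highest-core clock, as the chart label suggests. The ratio and the name `sustained_fraction` are illustrative and are not part of the original result file.

```python
# Average and maximum peak-core frequency (MHz) for the GPTshop GH200,
# copied from the monitor charts above that reported a Min/Avg/Max range.
freq_mhz = {
    "CloverLeaf 1.3": (3346, 3411),
    "Rodinia 3.1": (3119, 3411),
    "Xcompact3d Incompact3d 2021-03-11": (3268, 3411),
    "GraphicsMagick 1.3.38": (3381, 3411),
    "Timed Linux Kernel Compilation 6.1": (3254, 3411),
    "Timed LLVM Compilation 16.0": (3349, 3411),
    "Primesieve 8.0": (3155, 3411),
    "OpenSSL 3.1": (3344, 3411),
    "Graph500 3.0": (3246, 3411),
    "RocksDB 8.0": (3179, 3411),
}


def sustained_fraction(avg, peak):
    """Return the average clock as a fraction of the maximum clock (hypothetical helper)."""
    return avg / peak


fractions = {
    test: sustained_fraction(avg, peak) for test, (avg, peak) in freq_mhz.items()
}

# Print the tests from lowest to highest sustained fraction.
for test, frac in sorted(fractions.items(), key=lambda item: item[1]):
    print(f"{test:<40} {frac:6.1%}")
```

Sorting by the fraction makes it easy to see which workloads held the highest core closest to its maximum clock during the run.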
ASKAP 1.0 - Test: tConvolve MT - Gridding
Million Grid Points Per Second, More Is Better
System                   Result       SE +/-    N
Ampere Altra Max 128c    4295.72      0.36      3
EPYC 9554                10327.50     9.54      3
EPYC 9654                12332.40     32.54     3
EPYC 9684X               14731.50     240.55    12
EPYC 9754                12660.80     9.06      3
GPTshop GH200            9561.31      2.75      3
Xeon Platinum 8490H      8112.07      4.50      3
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

7-Zip Compression 22.01 - Test: Compression Rating
MIPS, More Is Better
System                   Result    SE +/-     N
Ampere Altra Max 128c    328378    1588.70    3
EPYC 9554                489736    224.17     3
EPYC 9654                564950    311.94     3
EPYC 9684X               554411    366.95     3
EPYC 9754                586571    439.85     3
GPTshop GH200            341300    222.71     3
Xeon Platinum 8490H      389690    629.56     3
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

ASKAP 1.0 - Test: tConvolve MPI - Gridding
Mpix/sec Per Watt, More Is Better
System                   Result
Ampere Altra Max 128c    17.51
EPYC 9554                191.48
EPYC 9654                196.62
EPYC 9684X               275.76
EPYC 9754                129.89
Xeon Platinum 8490H      71.04

ASKAP 1.0 - Test: tConvolve MPI - Degridding
Mpix/sec, More Is Better
System                   Result      SE +/-    N
Ampere Altra Max 128c    5285.36     4.44      3
EPYC 9554                31177.20    153.57    3
EPYC 9654                41081.60    475.58    3
EPYC 9684X               58310.10    0.00      3
EPYC 9754                26593.40    574.27    15
GPTshop GH200            12429.50    37.76     3
Xeon Platinum 8490H      19616.00    131.23    3
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP 1.0 - Test: tConvolve MPI - Gridding
Mpix/sec, More Is Better
System                   Result      SE +/-    N
Ampere Altra Max 128c    2416.54     0.46      3
EPYC 9554                37265.90    219.20    3
EPYC 9654                47003.20    405.08    3
EPYC 9684X               67965.50    485.47    3
EPYC 9754                27269.20    655.18    14
GPTshop GH200            9678.85     39.51     3
Xeon Platinum 8490H      22365.20    146.74    3
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

Apache Cassandra 4.1.3 - Test: Writes
Op/s Per Watt, More Is Better
System                   Result
Ampere Altra Max 128c    2068.00
EPYC 9554                1700.28
EPYC 9654                1349.72
EPYC 9684X               1282.09
EPYC 9754                1432.67
Xeon Platinum 8490H      557.85

Apache Cassandra 4.1.3 - Test: Writes
Op/s, More Is Better
System                   Result    SE +/-     N
Ampere Altra Max 128c    227548    2776.53    3
EPYC 9554                301963    554.18     3
EPYC 9654                285324    162.88     3
EPYC 9684X               286460    1195.73    3
EPYC 9754                236320    1223.06    3
GPTshop GH200            371439    2382.80    3
Xeon Platinum 8490H      139654    883.78     3

RocksDB 8.0 - Test: Random Read
Op/s, More Is Better
System                   Result       SE +/-        N
Ampere Altra Max 128c    435437276    3039704.34    15
EPYC 9554                376983415    106225.72     3
EPYC 9654                429310263    1006626.36    3
EPYC 9684X               486862785    539077.40     3
EPYC 9754                611259161    817605.37     3
GPTshop GH200            429416658    3229592.93    11
Xeon Platinum 8490H      268374486    6039899.37    15
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

RocksDB 8.0 - Test: Random Read
Op/s Per Watt, More Is Better
System                   Result
Ampere Altra Max 128c    2282816.53
EPYC 9554                1450487.41
EPYC 9654                1599837.63
EPYC 9684X               1587869.93
EPYC 9754                2068476.16
Xeon Platinum 8490H      808152.35

Graph500 3.0 - Scale: 26
sssp max_TEPS, More Is Better
System                   Result
Ampere Altra Max 128c    333239000
EPYC 9554                467453000
EPYC 9654                499347000
EPYC 9684X               501222000
EPYC 9754                502151000
GPTshop GH200            463087000
Xeon Platinum 8490H      454483000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

Graph500 3.0 - Scale: 26
sssp max_TEPS Per Watt, More Is Better
System                   Result
Ampere Altra Max 128c    1918278.63
EPYC 9554                1818554.40
EPYC 9654                1644954.69
EPYC 9684X               1663398.87
EPYC 9754                1711121.29
Xeon Platinum 8490H      1312538.58

Graph500 3.0 - Scale: 26
sssp median_TEPS, More Is Better
System                   Result
Ampere Altra Max 128c    224032000
EPYC 9554                368016000
EPYC 9654                391167000
EPYC 9684X               383754000
EPYC 9754                377033000
GPTshop GH200            298983000
Xeon Platinum 8490H      333129000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

NAS Parallel Benchmarks 3.4 - Test / Class: FT.C
Total Mop/s Per Watt, More Is Better
System                   Result
Ampere Altra Max 128c    244.94
EPYC 9554                904.17
EPYC 9654                837.59
EPYC 9684X               812.37
EPYC 9754                1081.10
Xeon Platinum 8490H      306.03

NAS Parallel Benchmarks 3.4 - Test / Class: IS.D
Total Mop/s Per Watt, More Is Better
System                   Result
Ampere Altra Max 128c    8.928
EPYC 9554                32.759
EPYC 9654                30.167
EPYC 9684X               36.170
EPYC 9754                39.264
Xeon Platinum 8490H      11.604

NAS Parallel Benchmarks 3.4 - Test / Class: CG.C
Total Mop/s, More Is Better
System                   Result      SE +/-    N
Ampere Altra Max 128c    17272.73    15.14     5
EPYC 9554                45867.44    363.39    8
EPYC 9654                54465.04    537.84    15
EPYC 9684X               59557.08    469.70    9
EPYC 9754                44431.58    391.92    15
GPTshop GH200            24350.32    58.76     6
Xeon Platinum 8490H      35560.25    268.67    15
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

NAS Parallel Benchmarks 3.4 - Test / Class: FT.C
Total Mop/s, More Is Better
System                   Result       SE +/-     N
Ampere Altra Max 128c    35153.27     41.38      4
EPYC 9554                124827.75    1162.69    15
EPYC 9654                125164.03    698.81     8
EPYC 9684X               124391.00    686.49     8
EPYC 9754                141255.07    1190.42    8
GPTshop GH200            47948.15     163.34     5
Xeon Platinum 8490H      70989.97     445.75     6
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5
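The "SE +/-" figures reported with each result can be put on a common scale by expressing them as a percentage of the result. The following is a minimal sketch, assuming "SE" denotes the standard error of the mean over the N runs, which the chart labels suggest but do not state. The FT.C figures are copied from the result above, and the helper name `relative_se_percent` is illustrative.

```python
# NAS Parallel Benchmarks 3.4, FT.C: (Total Mop/s, SE +/-) per system,
# copied from the result above.
ft_c = {
    "Ampere Altra Max 128c": (35153.27, 41.38),
    "EPYC 9554": (124827.75, 1162.69),
    "EPYC 9654": (125164.03, 698.81),
    "EPYC 9684X": (124391.00, 686.49),
    "EPYC 9754": (141255.07, 1190.42),
    "GPTshop GH200": (47948.15, 163.34),
    "Xeon Platinum 8490H": (70989.97, 445.75),
}


def relative_se_percent(result, se):
    """Return the standard error as a percentage of the reported result."""
    return 100.0 * se / result


rel_se = {
    name: relative_se_percent(result, se) for name, (result, se) in ft_c.items()
}

for name, pct in rel_se.items():
    print(f"{name:<24} {pct:5.2f}%")
```

A small percentage indicates that the runs agreed closely. Systems that needed more runs (larger N) tend to show a larger relative error here.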
NAS Parallel Benchmarks 3.4 - Test / Class: MG.C
Total Mop/s Per Watt, More Is Better
System                   Result
Ampere Altra Max 128c    407.26
EPYC 9554                1439.34
EPYC 9654                1167.53
EPYC 9684X               1291.84
EPYC 9754                1333.53
Xeon Platinum 8490H      549.13

NAS Parallel Benchmarks 3.4 - Test / Class: IS.D
Total Mop/s, More Is Better
System                   Result     SE +/-    N
Ampere Altra Max 128c    1196.33    0.34      3
EPYC 9554                5184.36    14.93     5
EPYC 9654                5195.24    13.89     5
EPYC 9684X               6138.54    65.80     5
EPYC 9754                5773.25    18.26     5
GPTshop GH200            1770.65    0.92      3
Xeon Platinum 8490H      2885.71    6.63      4
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

NAS Parallel Benchmarks 3.4 - Test / Class: CG.C
Total Mop/s Per Watt, More Is Better
System                   Result
Ampere Altra Max 128c    116.77
EPYC 9554                332.67
EPYC 9654                380.13
EPYC 9684X               437.49
EPYC 9754                297.51
Xeon Platinum 8490H      162.31

NAS Parallel Benchmarks 3.4 - Test / Class: MG.C
Total Mop/s, More Is Better
System                   Result       SE +/-     N
Ampere Altra Max 128c    42291.84     10.27      7
EPYC 9554                125921.29    1712.13    15
EPYC 9654                118608.07    1159.21    15
EPYC 9684X               137608.68    1538.33    15
EPYC 9754                127404.16    667.76     9
GPTshop GH200            57191.85     113.65     8
Xeon Platinum 8490H      88049.90     405.89     9
1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
2. Open MPI 4.1.5

LULESH 2.0.3
z/s, More Is Better
System                   Result      SE +/-    N
Ampere Altra Max 128c    16218.33    24.21     3
EPYC 9554                24037.44    113.24    3
EPYC 9654                23701.44    84.04     3
EPYC 9684X               24712.88    90.30     3
EPYC 9754                22356.75    59.94     3
GPTshop GH200            23433.37    40.15     3
Xeon Platinum 8490H      23997.72    49.50     5
1. (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi

LULESH 2.0.3
z/s Per Watt, More Is Better
System                   Result
Ampere Altra Max 128c    144.69
EPYC 9554                148.93
EPYC 9654                136.97
EPYC 9684X               139.86
EPYC 9754                124.36
Xeon Platinum 8490H      106.69
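For "More Is Better" results, a quick way to compare systems is to divide every result by a chosen baseline. The following is a minimal sketch using the LULESH z/s figures above with the GPTshop GH200 as the baseline. The helper name `relative_to` and the choice of baseline are illustrative, not something the result file prescribes.

```python
# LULESH 2.0.3 throughput in z/s (more is better), copied from the result above.
lulesh_zs = {
    "Ampere Altra Max 128c": 16218.33,
    "EPYC 9554": 24037.44,
    "EPYC 9654": 23701.44,
    "EPYC 9684X": 24712.88,
    "EPYC 9754": 22356.75,
    "GPTshop GH200": 23433.37,
    "Xeon Platinum 8490H": 23997.72,
}


def relative_to(results, baseline):
    """Scale higher-is-better results so that the baseline system equals 1.0."""
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


relative = relative_to(lulesh_zs, "GPTshop GH200")

# Print from fastest to slowest relative to the baseline.
for name, ratio in sorted(relative.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<24} {ratio:.3f}x")
```

Values above 1.0 mean the system was faster than the baseline in this test. For "Fewer Is Better" results the division would need to be inverted.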
System Temperature Monitor - GPTshop GH200
Celsius, Fewer Is Better

Test                                       Min     Avg     Max
High Performance Conjugate Gradient 3.1    71.9    96.0    99.3
NAS Parallel Benchmarks 3.4                91.6    95.4    99.9
NAS Parallel Benchmarks 3.4                89.5    92.2    98.0
NAS Parallel Benchmarks 3.4                87.5    90.4    92.8
NAS Parallel Benchmarks 3.4                88.8    91.3    97.1
CloverLeaf 1.3                             88.5    96.9    98.8
CloverLeaf 1.3                             91.9    94.6    96.6
Rodinia 3.1                                88.9    93.9    100.1
Xcompact3d Incompact3d 2021-03-11          82.5    96.2    101.8
Xcompact3d Incompact3d 2021-03-11          92.0    94.7    98.5
Xmrig 6.18.1                               84.8    90.4    93.0
Xmrig 6.18.1                               87.1    91.7    93.2
GraphicsMagick 1.3.38                      83.9    93.9    99.3
GraphicsMagick 1.3.38                      87.9    94.2    97.3
easyWave r34                               83.6    85.9    87.5
easyWave r34                               81.6    85.6    88.0
ACES DGEMM 1.0                             85.8    92.3    95.8
Timed Godot Game Engine Compilation 4.0    83.7    91.2    99.3
Timed Linux Kernel Compilation 6.1         84.2    97.0    100.1
Timed LLVM Compilation 16.0                86.9    95.0    99.5
Timed Node.js Compilation 19.8.1           87.0    92.8    97.8
Primesieve 8.0                             85.2    96.6    99.6
OpenSSL 3.1                                89.0    97.8    99.8
ASKAP 1.0                                  85.6    92.0    95.1
Graph500 3.0                               89.8    97.8    100.1
DuckDB 0.9.1                               79.0    83.1    94.0
DuckDB 0.9.1                               84.5    84.8    86.6
RawTherapee                                82.8    83.8    85.2
Stress-NG 0.16.04                          83.2    88.8    92.9
Stress-NG 0.16.04                          86.5    91.8    93.5
Apache Cassandra 4.1.3                     78.5    89.2    92.7
RocksDB 8.0                                85.0    98.3    101.2
Timed Gem5 Compilation 23.0.1              80.1    87.5    97.8
CloverLeaf 1.3 - Input: clover_bm16
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    550.30    0.15      3
EPYC 9554                460.24    4.26      7
EPYC 9654                429.46    15.29     9
EPYC 9684X               298.29    14.65     9
EPYC 9754                459.16    14.10     9
GPTshop GH200            247.28    0.52      3
Xeon Platinum 8490H      324.17    0.02      3
1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

CloverLeaf 1.3 - Input: clover_bm64_short
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    62.77     0.02      3
EPYC 9554                45.82     1.40      12
EPYC 9654                47.98     1.47      12
EPYC 9684X               51.31     1.39      12
EPYC 9754                49.03     1.50      12
GPTshop GH200            28.36     0.01      3
Xeon Platinum 8490H      38.93     0.00      3
1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

Rodinia 3.1 - Test: OpenMP LavaMD
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    31.79     0.01      3
EPYC 9554                34.01     0.05      3
EPYC 9654                30.24     0.06      3
EPYC 9684X               30.32     0.05      3
EPYC 9754                25.15     0.10      3
GPTshop GH200            30.96     0.07      3
Xeon Platinum 8490H      42.71     0.53      3
1. (CXX) g++ options: -O2 -lOpenCL

Xcompact3d Incompact3d 2021-03-11 - Input: X3D-benchmarking input.i3d
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    606.52    0.07      3
EPYC 9554                515.79    7.98      9
EPYC 9654                430.08    8.73      9
EPYC 9684X               440.55    5.21      9
EPYC 9754                493.50    14.11     3
GPTshop GH200            257.34    1.42      3
Xeon Platinum 8490H      353.56    4.12      4
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 193 Cells Per Direction
Seconds, Fewer Is Better
System                   Result         SE +/-        N
Ampere Altra Max 128c    23.88522400    0.05179277    3
EPYC 9554                9.74551621     0.03040517    5
EPYC 9654                8.45212936     0.07670149    5
EPYC 9684X               7.30814419     0.03937117    5
EPYC 9754                9.02806168     0.04030589    5
GPTshop GH200            9.83151112     0.03425295    5
Xeon Platinum 8490H      12.79834200    0.04713862    4
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

easyWave r34 - Input: e2Asean Grid + BengkuluSept2007 Source - Time: 1200
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    122.55    1.12      3
EPYC 9554                22.45     0.06      3
EPYC 9654                23.51     0.14      3
EPYC 9684X               24.58     0.32      3
EPYC 9754                29.59     0.20      15
GPTshop GH200            36.56     0.11      3
Xeon Platinum 8490H      45.85     0.04      3
1. (CXX) g++ options: -O3 -fopenmp

easyWave r34 - Input: e2Asean Grid + BengkuluSept2007 Source - Time: 2400
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    326.37    1.24      3
EPYC 9554                56.88     0.33      3
EPYC 9654                59.54     0.40      3
EPYC 9684X               60.25     0.14      3
EPYC 9754                72.36     0.45      3
GPTshop GH200            98.72     0.34      3
Xeon Platinum 8490H      118.55    0.36      3
1. (CXX) g++ options: -O3 -fopenmp

Timed Godot Game Engine Compilation 4.0 - Time To Compile
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    227.78    0.35      3
EPYC 9554                103.50    0.03      3
EPYC 9654                101.56    0.16      3
EPYC 9684X               98.56     0.06      3
EPYC 9754                118.25    0.26      3
GPTshop GH200            138.89    0.65      3
Xeon Platinum 8490H      128.47    0.43      3

Timed Linux Kernel Compilation 6.1 - Build: allmodconfig
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    307.88    0.75      3
EPYC 9554                238.29    0.61      3
EPYC 9654                284.25    0.41      3
EPYC 9684X               283.87    0.50      3
EPYC 9754                319.43    2.48      3
GPTshop GH200            283.34    1.02      3
Xeon Platinum 8490H      289.97    0.60      3

Timed LLVM Compilation 16.0 - Build System: Ninja
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    266.54    0.95      3
EPYC 9554                131.20    0.11      3
EPYC 9654                133.07    0.08      3
EPYC 9684X               131.99    0.07      3
EPYC 9754                146.39    0.28      3
GPTshop GH200            195.60    0.88      3
Xeon Platinum 8490H      169.53    0.08      3

Timed Node.js Compilation 19.8.1 - Time To Compile
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    268.70    0.19      3
EPYC 9554                121.93    0.32      3
EPYC 9654                120.16    0.08      3
EPYC 9684X               119.20    0.09      3
EPYC 9754                130.46    0.08      3
GPTshop GH200            173.83    0.08      3
Xeon Platinum 8490H      151.57    0.27      3

Primesieve 8.0 - Length: 1e13
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    41.68     0.07      3
EPYC 9554                28.14     0.03      3
EPYC 9654                24.16     0.01      3
EPYC 9684X               25.97     0.02      3
EPYC 9754                21.76     0.01      3
GPTshop GH200            36.05     0.51      3
Xeon Platinum 8490H      52.38     0.07      3
1. (CXX) g++ options: -O3

DuckDB 0.9.1 - Benchmark: IMDB
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    143.78    0.18      3
EPYC 9554                103.72    0.16      3
EPYC 9654                115.81    0.18      3
EPYC 9684X               116.05    0.26      3
EPYC 9754                147.60    0.08      3
GPTshop GH200            91.63     0.75      3
Xeon Platinum 8490H      99.25     0.24      3
1. (CXX) g++ options: -O3 -rdynamic -lssl -lcrypto -ldl

DuckDB 0.9.1 - Benchmark: TPC-H Parquet
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    237.16    4.97      9
EPYC 9554                143.31    0.53      3
EPYC 9654                146.28    0.47      3
EPYC 9684X               147.49    0.26      3
EPYC 9754                177.13    0.64      3
GPTshop GH200            156.99    0.69      3
Xeon Platinum 8490H      148.10    0.30      3
1. (CXX) g++ options: -O3 -rdynamic -lssl -lcrypto -ldl

RawTherapee - Total Benchmark Time
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    66.05     0.02      3
EPYC 9554                49.94     0.04      3
EPYC 9654                52.05     0.07      3
EPYC 9684X               52.99     0.03      3
EPYC 9754                66.13     0.22      3
GPTshop GH200            46.55     0.10      3
Xeon Platinum 8490H      46.58     0.05      3
1. RawTherapee, version 5.9, command line.

Timed Gem5 Compilation 23.0.1 - Time To Compile
Seconds, Fewer Is Better
System                   Result    SE +/-    N
Ampere Altra Max 128c    260.33    3.13      4
EPYC 9554                179.74    1.67      3
EPYC 9654                180.86    1.09      3
EPYC 9684X               180.06    1.97      3
EPYC 9754                208.58    2.85      3
GPTshop GH200            184.66    2.18      12
Xeon Platinum 8490H      197.76    1.53      3
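For "Fewer Is Better" timings the comparison has to be inverted: a speedup is the baseline time divided by the system's time. The following is a minimal sketch that summarizes the five timed compilation results above with a geometric mean, using the Xeon Platinum 8490H as an arbitrary baseline for the GPTshop GH200. The geometric mean is a common way to average ratios, but the original result file does not compute it; the helper names and the choice of tests are illustrative.

```python
import math

# Wall-clock seconds (fewer is better), copied from the timed compilation
# results above.
compile_times = {
    "Godot 4.0": {"GPTshop GH200": 138.89, "Xeon Platinum 8490H": 128.47},
    "Linux Kernel 6.1 allmodconfig": {"GPTshop GH200": 283.34, "Xeon Platinum 8490H": 289.97},
    "LLVM 16.0 Ninja": {"GPTshop GH200": 195.60, "Xeon Platinum 8490H": 169.53},
    "Node.js 19.8.1": {"GPTshop GH200": 173.83, "Xeon Platinum 8490H": 151.57},
    "Gem5 23.0.1": {"GPTshop GH200": 184.66, "Xeon Platinum 8490H": 197.76},
}


def speedup(times, system, baseline):
    """Shorter time is better, so divide the baseline time by the system time."""
    return times[baseline] / times[system]


def geometric_mean(values):
    """Geometric mean computed via logarithms to avoid overflow on long lists."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


speedups = {
    test: speedup(times, "GPTshop GH200", "Xeon Platinum 8490H")
    for test, times in compile_times.items()
}
overall = geometric_mean(speedups.values())

for test, ratio in speedups.items():
    print(f"{test:<32} {ratio:.3f}x")
print(f"{'Geometric mean':<32} {overall:.3f}x")
```

A ratio above 1.0 means the GH200 finished the build faster than the baseline. The same two helpers work for any pair of systems in the results above.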
System Power Consumption Monitor - GPTshop GH200
Watts, Fewer Is Better

Test                                       Min      Avg      Max
High Performance Conjugate Gradient 3.1    189.8    324.5    414.7
NAS Parallel Benchmarks 3.4                189.8    265.5    392.9
NAS Parallel Benchmarks 3.4                185.9    287.4    378.3
NAS Parallel Benchmarks 3.4                184.6    277.7    351.4
NAS Parallel Benchmarks 3.4                185.9    237.6    381.2
CloverLeaf 1.3                             185.9    315.8    337.8
CloverLeaf 1.3                             191.2    268.3    332.7
Rodinia 3.1                                186.0    267.7    379.8
Xcompact3d Incompact3d 2021-03-11          188.6    323.5    395.2
Xcompact3d Incompact3d 2021-03-11          191.2    283.9    358.9
LULESH 2.0.3                               188.5    250.2    308.4
Xmrig 6.18.1                               188.6    296.1    316.9
Xmrig 6.18.1                               298.5    310.4    314.3
GraphicsMagick 1.3.38                      185.9    311.9    340.5
GraphicsMagick 1.3.38                      185.9    296.1    322.4
easyWave r34                               191.2    250.4    348.8
easyWave r34                               186.0    272.7    361.5
ACES DGEMM 1.0                             185.9    277.8    345.7
Timed Godot Game Engine Compilation 4.0    188.5    260.8    379.7
Timed Linux Kernel Compilation 6.1         183.5    302.8    371.9
Timed LLVM Compilation 16.0                183.3    296.8    358.9
Timed Node.js Compilation 19.8.1           185.9    300.4    358.7
Primesieve 8.0                             337.8    371.2    387.2
OpenSSL 3.1                                188.8    307.5    345.7
ASKAP 1.0                                  188.6    287.3    335.1
Graph500 3.0                               272.3    321.3    377.1
DuckDB 0.9.1                               183.3    212.6    298.5
DuckDB System Power Consumption Monitor Min Avg Max GPTshop GH200 183.3 207.4 351.0 OpenBenchmarking.org Watts, Fewer Is Better DuckDB 0.9.1 System Power Consumption Monitor 100 200 300 400 500
RawTherapee System Power Consumption Monitor Min Avg Max GPTshop GH200 188.6 208.0 251.3 OpenBenchmarking.org Watts, Fewer Is Better RawTherapee System Power Consumption Monitor 60 120 180 240 300
Stress-NG System Power Consumption Monitor Min Avg Max GPTshop GH200 188.5 282.8 319.4 OpenBenchmarking.org Watts, Fewer Is Better Stress-NG 0.16.04 System Power Consumption Monitor 80 160 240 320 400
Stress-NG System Power Consumption Monitor Min Avg Max GPTshop GH200 193.8 299.1 319.5 OpenBenchmarking.org Watts, Fewer Is Better Stress-NG 0.16.04 System Power Consumption Monitor 80 160 240 320 400
Apache Cassandra System Power Consumption Monitor Min Avg Max GPTshop GH200 183.3 287.6 335.2 OpenBenchmarking.org Watts, Fewer Is Better Apache Cassandra 4.1.3 System Power Consumption Monitor 80 160 240 320 400
RocksDB System Power Consumption Monitor Min Avg Max GPTshop GH200 178.0 319.3 382.4 OpenBenchmarking.org Watts, Fewer Is Better RocksDB 8.0 System Power Consumption Monitor 100 200 300 400 500
Timed Gem5 Compilation System Power Consumption Monitor Min Avg Max GPTshop GH200 178.1 252.6 366.7 OpenBenchmarking.org Watts, Fewer Is Better Timed Gem5 Compilation 23.0.1 System Power Consumption Monitor 100 200 300 400 500
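For readers who want to compare the GH200 power readings above programmatically, the following is a minimal Python sketch rather than anything produced by the Phoronix Test Suite. It copies four of the Min/Avg/Max rows by hand and reports which test had the highest average draw, the lowest average draw, and the highest peak. The names `PowerSample`, `SAMPLES`, and `summarize` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class PowerSample(NamedTuple):
    """One system power monitor reading for the GPTshop GH200, in Watts."""
    test: str
    min_w: float
    avg_w: float
    max_w: float


# Four rows copied by hand from the power readings above (illustrative subset).
SAMPLES = [
    PowerSample("Xcompact3d Incompact3d 2021-03-11", 188.6, 323.5, 395.2),
    PowerSample("Primesieve 8.0", 337.8, 371.2, 387.2),
    PowerSample("DuckDB 0.9.1", 183.3, 207.4, 351.0),
    PowerSample("RawTherapee", 188.6, 208.0, 251.3),
]


def summarize(samples):
    """Return the samples with the highest/lowest average and the highest peak."""
    return {
        "highest_avg": max(samples, key=lambda s: s.avg_w),
        "lowest_avg": min(samples, key=lambda s: s.avg_w),
        "highest_peak": max(samples, key=lambda s: s.max_w),
    }


if __name__ == "__main__":
    for label, sample in summarize(SAMPLES).items():
        print(f"{label}: {sample.test} "
              f"(min {sample.min_w} W, avg {sample.avg_w} W, max {sample.max_w} W)")
```

Extending `SAMPLES` with the remaining rows would give the same summary over the full set; the subset is kept short here only to make the sketch easy to check by eye.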
Phoronix Test Suite v10.8.5