NVIDIA GH200 memory bandwidth system

ARMv8 Neoverse-V2 testing with a Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS) and NVIDIA GH200 480GB on Ubuntu 22.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2402276-NE-NVIDIAGH242&sro&grs.

System Details

Runs: a, b, c

Processor: ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores)
Motherboard: Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS)
Memory: 1 x 480GB DRAM-6400MT/s
Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9
Graphics: NVIDIA GH200 480GB
Network: 2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
OS: Ubuntu 22.04
Kernel: 6.5.0-1007-NVIDIA-64k (aarch64)
Display Driver: NVIDIA
OpenCL: OpenCL 3.0 CUDA 12.4.89
Vulkan: 1.3.277
Compiler: GCC 11.4.0 + CUDA 11.5
File-System: ext4
Screen Resolution: 1920x1200

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Processor Details
- Scaling Governor: cppc_cpufreq performance (Boost: Disabled)

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of __user pointer sanitization
- spectre_v2: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

Result Overview

Test                                                      a            b            c
stream: Add                                        313161.3     341627.2     340226.7
stream: Triad                                      311997.8     337166.4     336556.6
tinymembench: Standard Memset                       68827.5      65367.2      70609.9
stream: Copy                                       321183.9     342107.4     342567.8
stream: Scale                                      319543.8     338616.6     337805.2
mbw: Memory Copy - 128 MiB                        16741.266    16179.790    17104.441
ramspeed: Copy - Floating Point                    77595.32     74212.45     74633.38
mbw: Memory Copy, Fixed Block Size - 8192 MiB     13429.535    12999.403    13490.987
ramspeed: Scale - Integer                          53823.14     54778.69     53893.69
mbw: Memory Copy, Fixed Block Size - 128 MiB      16806.058    16881.744    17095.964
ramspeed: Average - Integer                        50216.61     49961.21     49510.70
mbw: Memory Copy, Fixed Block Size - 4096 MiB     13214.407    13184.271    13325.486
ramspeed: Copy - Integer                           74997.25     75559.86     74789.13
mbw: Memory Copy - 4096 MiB                       13139.174    13190.154    13258.184
mbw: Memory Copy, Fixed Block Size - 512 MiB      13991.543    14040.383    14105.604
mbw: Memory Copy - 512 MiB                        14018.313    14099.967    13987.139
mbw: Memory Copy - 1024 MiB                       13621.048    13721.316    13726.316
ramspeed: Triad - Integer                          34645.14     34759.77     34882.37
ramspeed: Scale - Floating Point                   77928.96     78075.30     78268.45
ramspeed: Add - Integer                            35195.30     35312.41     35348.29
mbw: Memory Copy - 8192 MiB                       13432.382    13487.665    13489.447
ramspeed: Average - Floating Point                 55752.10     55527.02     55658.87
ramspeed: Triad - Floating Point                   34714.34     34806.77     34842.29
ramspeed: Add - Floating Point                     34873.11     34985.36     34891.91
mbw: Memory Copy, Fixed Block Size - 1024 MiB     13684.468    13705.804    13718.736
tinymembench: Standard Memcpy                       25198.3      24982.6      26996.1
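To compare the three runs on a given test, one option is to express each run as a percentage difference from run a. The sketch below does this for the four Stream figures reported in this result file. The helper name pct_vs_baseline is hypothetical and is not part of the Phoronix Test Suite; choosing run a as the baseline is likewise only an assumption for illustration.

```python
# Stream results in MB/s for runs a, b, c, copied from this result file.
stream_results = {
    "Add":   {"a": 313161.3, "b": 341627.2, "c": 340226.7},
    "Triad": {"a": 311997.8, "b": 337166.4, "c": 336556.6},
    "Copy":  {"a": 321183.9, "b": 342107.4, "c": 342567.8},
    "Scale": {"a": 319543.8, "b": 338616.6, "c": 337805.2},
}


def pct_vs_baseline(runs, baseline="a"):
    """Return each run's percentage difference relative to the baseline run."""
    base = runs[baseline]
    return {name: (value - base) / base * 100.0 for name, value in runs.items()}


for test, runs in stream_results.items():
    deltas = pct_vs_baseline(runs)
    print(test, {name: round(delta, 1) for name, delta in deltas.items()})
```

The same helper can be applied to any other row of the overview by passing a dictionary of its a, b, c values.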

Stream

Type: Add

Stream 2013-01-17 (MB/s, More Is Better)
a: 313161.3 (SE +/- 2548.52, N = 5)
b: 341627.2 (SE +/- 1241.84, N = 5)
c: 340226.7 (SE +/- 1117.82, N = 5)
1. (CC) gcc options: -O3 -march=native -fopenmp

Stream

Type: Triad

Stream 2013-01-17 (MB/s, More Is Better)
a: 311997.8 (SE +/- 2534.25, N = 5)
b: 337166.4 (SE +/- 561.20, N = 5)
c: 336556.6 (SE +/- 1078.16, N = 5)
1. (CC) gcc options: -O3 -march=native -fopenmp

Tinymembench

Standard Memset

Tinymembench 2018-05-28 (MB/s, More Is Better)
a: 68827.5 (SE +/- 52.90, N = 3)
b: 65367.2 (SE +/- 376.32, N = 6)
c: 70609.9 (SE +/- 292.74, N = 9)
1. (CC) gcc options: -O2 -lm
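Each result is reported with an SE value and a sample count N. Assuming SE denotes the standard error of the mean (SE = SD / sqrt(N)), the sample standard deviation and the relative error can be recovered as in the sketch below. The function names are hypothetical, and the figures are the Tinymembench Standard Memset values from this result file.

```python
import math


def std_dev_from_se(se, n):
    """Recover the sample standard deviation, assuming SE = SD / sqrt(N)."""
    return se * math.sqrt(n)


def relative_se_pct(mean, se):
    """Express the standard error as a percentage of the reported mean."""
    return se / mean * 100.0


# (mean MB/s, SE, N) per run for Tinymembench Standard Memset.
memset_runs = {
    "a": (68827.5, 52.90, 3),
    "b": (65367.2, 376.32, 6),
    "c": (70609.9, 292.74, 9),
}

for run, (mean, se, n) in memset_runs.items():
    sd = std_dev_from_se(se, n)
    print(f"{run}: SD ~ {sd:.1f} MB/s, SE = {relative_se_pct(mean, se):.2f}% of mean")
```

Comparing the relative SE between runs gives a rough sense of whether a gap between two means is larger than the run-to-run noise.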

Stream

Type: Copy

Stream 2013-01-17 (MB/s, More Is Better)
a: 321183.9 (SE +/- 2114.35, N = 5)
b: 342107.4 (SE +/- 1265.00, N = 5)
c: 342567.8 (SE +/- 981.27, N = 5)
1. (CC) gcc options: -O3 -march=native -fopenmp

Stream

Type: Scale

Stream 2013-01-17 (MB/s, More Is Better)
a: 319543.8 (SE +/- 1618.66, N = 5)
b: 338616.6 (SE +/- 819.28, N = 5)
c: 337805.2 (SE +/- 1113.20, N = 5)
1. (CC) gcc options: -O3 -march=native -fopenmp
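For context on what the four Stream figures measure, the sketch below writes out the Copy, Scale, Add and Triad kernels in plain Python, together with the bytes each one is counted as moving per element. It assumes 8-byte double-precision elements and a scalar of 3.0, as in the reference STREAM source. It is illustrative only: the benchmark itself is C built with the gcc options listed above, and pure Python will come nowhere near the bandwidths reported here. The names run_kernel and BYTES_PER_ELEMENT are hypothetical.

```python
import time

# Bytes counted per element for each kernel, assuming 8-byte doubles:
# Copy and Scale touch two arrays (one read, one write); Add and Triad touch three.
BYTES_PER_ELEMENT = {"Copy": 16, "Scale": 16, "Add": 24, "Triad": 24}


def run_kernel(name, a, b, c, scalar=3.0):
    """Run one pass of the named kernel in place and return its rate in MB/s."""
    n = len(a)
    start = time.perf_counter()
    if name == "Copy":
        for j in range(n):
            c[j] = a[j]
    elif name == "Scale":
        for j in range(n):
            b[j] = scalar * c[j]
    elif name == "Add":
        for j in range(n):
            c[j] = a[j] + b[j]
    elif name == "Triad":
        for j in range(n):
            a[j] = b[j] + scalar * c[j]
    else:
        raise ValueError(f"unknown kernel: {name}")
    elapsed = time.perf_counter() - start
    return BYTES_PER_ELEMENT[name] * n / elapsed / 1e6  # MB taken as 10^6 bytes


n = 100_000
a, b, c = [1.0] * n, [2.0] * n, [0.0] * n
rates = {name: run_kernel(name, a, b, c) for name in ("Copy", "Scale", "Add", "Triad")}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} MB/s")
```

Because Add and Triad are credited with three array accesses per element while Copy and Scale are credited with two, the four results are comparable as bandwidth figures even though the kernels do different amounts of arithmetic.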

MBW

Test: Memory Copy - Array Size: 128 MiB

MBW 2018-09-08 (MiB/s, More Is Better)
a: 16741.27 (SE +/- 168.54, N = 3)
b: 16179.79 (SE +/- 216.80, N = 12)
c: 17104.44 (SE +/- 127.70, N = 10)
1. (CC) gcc options: -O3 -march=native
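The MBW results are labeled MiB/s, whereas the Stream results are labeled MB/s. Assuming MiB means 2^20 bytes and MB means 10^6 bytes (an assumption about how each tool defines its unit), the two can be put on the same scale with the small conversion below. The function name is hypothetical, and the input values are the MBW Memory Copy 128 MiB figures from this result file.

```python
MIB = 2**20   # bytes per mebibyte
MB = 10**6    # bytes per (decimal) megabyte, assumed for the MB/s results


def mib_s_to_mb_s(value_mib_s):
    """Convert a rate in MiB/s to MB/s under the unit assumptions above."""
    return value_mib_s * MIB / MB


# MBW Memory Copy, array size 128 MiB, in MiB/s for runs a, b, c.
mbw_copy_128 = {"a": 16741.27, "b": 16179.79, "c": 17104.44}

for run, rate in mbw_copy_128.items():
    print(f"{run}: {rate:.2f} MiB/s ~ {mib_s_to_mb_s(rate):.1f} MB/s")
```

The conversion changes the numbers by under 5%, so it does not alter the overall picture; note also that MBW and Stream use different copy methods and thread counts, so the converted figures are not a like-for-like comparison.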

RAMspeed SMP

Type: Copy - Benchmark: Floating Point

RAMspeed SMP 3.5.0 (MB/s, More Is Better)
a: 77595.32 (SE +/- 147.53, N = 3)
b: 74212.45 (SE +/- 483.74, N = 3)
c: 74633.38 (SE +/- 957.21, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy, Fixed Block Size - Array Size: 8192 MiB

MBW 2018-09-08 (MiB/s, More Is Better)
a: 13429.54 (SE +/- 1.46, N = 3)
b: 12999.40 (SE +/- 1.46, N = 3)
c: 13490.99 (SE +/- 1.15, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Scale - Benchmark: Integer

RAMspeed SMP 3.5.0 (MB/s, More Is Better)
a: 53823.14 (SE +/- 135.79, N = 3)
b: 54778.69 (SE +/- 175.46, N = 3)
c: 53893.69 (SE +/- 368.97, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy, Fixed Block Size - Array Size: 128 MiB

MBW 2018-09-08 (MiB/s, More Is Better)
a: 16806.06 (SE +/- 160.35, N = 15)
b: 16881.74 (SE +/- 151.66, N = 15)
c: 17095.96 (SE +/- 153.98, N = 15)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Average - Benchmark: Integer

RAMspeed SMP 3.5.0 (MB/s, More Is Better)
a: 50216.61 (SE +/- 523.54, N = 3)
b: 49961.21 (SE +/- 339.06, N = 3)
c: 49510.70 (SE +/- 120.67, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy, Fixed Block Size - Array Size: 4096 MiB

MBW 2018-09-08 (MiB/s, More Is Better)
a: 13214.41 (SE +/- 26.21, N = 3)
b: 13184.27 (SE +/- 83.60, N = 3)
c: 13325.49 (SE +/- 50.01, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Copy - Benchmark: Integer

RAMspeed SMP 3.5.0 (MB/s, More Is Better)
a: 74997.25 (SE +/- 532.99, N = 3)
b: 75559.86 (SE +/- 518.30, N = 15)
c: 74789.13 (SE +/- 1063.03, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy - Array Size: 4096 MiB

MBW 2018-09-08 (MiB/s, More Is Better)
a: 13139.17 (SE +/- 20.59, N = 3)
b: 13190.15 (SE +/- 24.22, N = 3)
c: 13258.18 (SE +/- 62.34, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy, Fixed Block Size - Array Size: 512 MiB

MBW 2018-09-08 (MiB/s, More Is Better)
a: 13991.54 (SE +/- 46.16, N = 3)
b: 14040.38 (SE +/- 60.13, N = 3)
c: 14105.60 (SE +/- 13.55, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy - Array Size: 512 MiB

MBW 2018-09-08 (MiB/s, More Is Better)
a: 14018.31 (SE +/- 22.87, N = 3)
b: 14099.97 (SE +/- 21.00, N = 3)
c: 13987.14 (SE +/- 48.48, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy - Array Size: 1024 MiB

MBW 2018-09-08 (MiB/s, More Is Better)
a: 13621.05 (SE +/- 46.36, N = 3)
b: 13721.32 (SE +/- 8.31, N = 3)
c: 13726.32 (SE +/- 9.44, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Triad - Benchmark: Integer

RAMspeed SMP 3.5.0 (MB/s, More Is Better)
a: 34645.14 (SE +/- 14.04, N = 3)
b: 34759.77 (SE +/- 24.34, N = 3)
c: 34882.37 (SE +/- 106.35, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Scale - Benchmark: Floating Point

RAMspeed SMP 3.5.0 (MB/s, More Is Better)
a: 77928.96 (SE +/- 54.22, N = 3)
b: 78075.30 (SE +/- 42.08, N = 3)
c: 78268.45 (SE +/- 53.91, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Add - Benchmark: Integer

RAMspeed SMP 3.5.0 (MB/s, More Is Better)
a: 35195.30 (SE +/- 12.14, N = 3)
b: 35312.41 (SE +/- 27.82, N = 3)
c: 35348.29 (SE +/- 16.40, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy - Array Size: 8192 MiB

MBW 2018-09-08 (MiB/s, More Is Better)
a: 13432.38 (SE +/- 1.93, N = 3)
b: 13487.67 (SE +/- 0.98, N = 3)
c: 13489.45 (SE +/- 1.24, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Average - Benchmark: Floating Point

RAMspeed SMP 3.5.0 (MB/s, More Is Better)
a: 55752.10 (SE +/- 274.93, N = 3)
b: 55527.02 (SE +/- 360.38, N = 3)
c: 55658.87 (SE +/- 333.14, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Triad - Benchmark: Floating Point

RAMspeed SMP 3.5.0 (MB/s, More Is Better)
a: 34714.34 (SE +/- 11.29, N = 3)
b: 34806.77 (SE +/- 52.63, N = 3)
c: 34842.29 (SE +/- 27.38, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Add - Benchmark: Floating Point

RAMspeed SMP 3.5.0 (MB/s, More Is Better)
a: 34873.11 (SE +/- 14.35, N = 3)
b: 34985.36 (SE +/- 31.66, N = 3)
c: 34891.91 (SE +/- 9.27, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy, Fixed Block Size - Array Size: 1024 MiB

MBW 2018-09-08 (MiB/s, More Is Better)
a: 13684.47 (SE +/- 4.14, N = 3)
b: 13705.80 (SE +/- 9.34, N = 3)
c: 13718.74 (SE +/- 15.68, N = 3)
1. (CC) gcc options: -O3 -march=native

Tinymembench

Standard Memcpy

Tinymembench 2018-05-28 (MB/s, More Is Better)
a: 25198.3 (SE +/- 180.61, N = 3)
b: 24982.6 (SE +/- 896.59, N = 6)
c: 26996.1 (SE +/- 720.83, N = 9)
1. (CC) gcc options: -O2 -lm


Phoronix Test Suite v10.8.5