NVIDIA GH200 memory bandwidth system

ARMv8 Neoverse-V2 testing with a Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS) and NVIDIA GH200 480GB on Ubuntu 22.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2402276-NE-NVIDIAGH242&grr&sor.

NVIDIA GH200 memory bandwidth system

Runs: a, b, c

Processor: ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores)
Motherboard: Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS)
Memory: 1 x 480GB DRAM-6400MT/s
Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9
Graphics: NVIDIA GH200 480GB
Network: 2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
OS: Ubuntu 22.04
Kernel: 6.5.0-1007-NVIDIA-64k (aarch64)
Display Driver: NVIDIA
OpenCL: OpenCL 3.0 CUDA 12.4.89
Vulkan: 1.3.277
Compiler: GCC 11.4.0 + CUDA 11.5
File-System: ext4
Screen Resolution: 1920x1200

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Processor Details
- Scaling Governor: cppc_cpufreq performance (Boost: Disabled)

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of __user pointer sanitization
- spectre_v2: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

Result overview

Test                                            a           b           c           Unit
tinymembench: Standard Memset                   68827.5     65367.2     70609.9     MB/s
tinymembench: Standard Memcpy                   25198.3     24982.6     26996.1     MB/s
ramspeed: Copy - Integer                        74997.25    75559.86    74789.13    MB/s
mbw: Memory Copy, Fixed Block Size - 8192 MiB   13429.535   12999.403   13490.987   MiB/s
mbw: Memory Copy - 8192 MiB                     13432.382   13487.665   13489.447   MiB/s
ramspeed: Add - Integer                         35195.30    35312.41    35348.29    MB/s
ramspeed: Average - Integer                     50216.61    49961.21    49510.70    MB/s
ramspeed: Triad - Integer                       34645.14    34759.77    34882.37    MB/s
ramspeed: Scale - Integer                       53823.14    54778.69    53893.69    MB/s
ramspeed: Scale - Floating Point                77928.96    78075.30    78268.45    MB/s
ramspeed: Average - Floating Point              55752.10    55527.02    55658.87    MB/s
ramspeed: Copy - Floating Point                 77595.32    74212.45    74633.38    MB/s
ramspeed: Triad - Floating Point                34714.34    34806.77    34842.29    MB/s
ramspeed: Add - Floating Point                  34873.11    34985.36    34891.91    MB/s
mbw: Memory Copy - 4096 MiB                     13139.174   13190.154   13258.184   MiB/s
mbw: Memory Copy, Fixed Block Size - 4096 MiB   13214.407   13184.271   13325.486   MiB/s
mbw: Memory Copy - 1024 MiB                     13621.048   13721.316   13726.316   MiB/s
mbw: Memory Copy, Fixed Block Size - 1024 MiB   13684.468   13705.804   13718.736   MiB/s
stream: Copy                                    321183.9    342107.4    342567.8    MB/s
mbw: Memory Copy, Fixed Block Size - 128 MiB    16806.058   16881.744   17095.964   MiB/s
mbw: Memory Copy - 512 MiB                      14018.313   14099.967   13987.139   MiB/s
mbw: Memory Copy, Fixed Block Size - 512 MiB    13991.543   14040.383   14105.604   MiB/s
mbw: Memory Copy - 128 MiB                      16741.266   16179.790   17104.441   MiB/s
stream: Add                                     313161.3    341627.2    340226.7    MB/s
stream: Triad                                   311997.8    337166.4    336556.6    MB/s
stream: Scale                                   319543.8    338616.6    337805.2    MB/s

Tinymembench

Standard Memset

Tinymembench 2018-05-28, Standard Memset (MB/s, More Is Better)
c: 70609.9 (SE +/- 292.74, N = 9)
a: 68827.5 (SE +/- 52.90, N = 3)
b: 65367.2 (SE +/- 376.32, N = 6)
1. (CC) gcc options: -O2 -lm

Tinymembench

Standard Memcpy

Tinymembench 2018-05-28, Standard Memcpy (MB/s, More Is Better)
c: 26996.1 (SE +/- 720.83, N = 9)
a: 25198.3 (SE +/- 180.61, N = 3)
b: 24982.6 (SE +/- 896.59, N = 6)
1. (CC) gcc options: -O2 -lm

RAMspeed SMP

Type: Copy - Benchmark: Integer

RAMspeed SMP 3.5.0, Type: Copy - Benchmark: Integer (MB/s, More Is Better)
b: 75559.86 (SE +/- 518.30, N = 15)
a: 74997.25 (SE +/- 532.99, N = 3)
c: 74789.13 (SE +/- 1063.03, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy, Fixed Block Size - Array Size: 8192 MiB

MBW 2018-09-08, Test: Memory Copy, Fixed Block Size - Array Size: 8192 MiB (MiB/s, More Is Better)
c: 13490.99 (SE +/- 1.15, N = 3)
a: 13429.54 (SE +/- 1.46, N = 3)
b: 12999.40 (SE +/- 1.46, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy - Array Size: 8192 MiB

MBW 2018-09-08, Test: Memory Copy - Array Size: 8192 MiB (MiB/s, More Is Better)
c: 13489.45 (SE +/- 1.24, N = 3)
b: 13487.67 (SE +/- 0.98, N = 3)
a: 13432.38 (SE +/- 1.93, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Add - Benchmark: Integer

RAMspeed SMP 3.5.0, Type: Add - Benchmark: Integer (MB/s, More Is Better)
c: 35348.29 (SE +/- 16.40, N = 3)
b: 35312.41 (SE +/- 27.82, N = 3)
a: 35195.30 (SE +/- 12.14, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Average - Benchmark: Integer

RAMspeed SMP 3.5.0, Type: Average - Benchmark: Integer (MB/s, More Is Better)
a: 50216.61 (SE +/- 523.54, N = 3)
b: 49961.21 (SE +/- 339.06, N = 3)
c: 49510.70 (SE +/- 120.67, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Triad - Benchmark: Integer

RAMspeed SMP 3.5.0, Type: Triad - Benchmark: Integer (MB/s, More Is Better)
c: 34882.37 (SE +/- 106.35, N = 3)
b: 34759.77 (SE +/- 24.34, N = 3)
a: 34645.14 (SE +/- 14.04, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Scale - Benchmark: Integer

RAMspeed SMP 3.5.0, Type: Scale - Benchmark: Integer (MB/s, More Is Better)
b: 54778.69 (SE +/- 175.46, N = 3)
c: 53893.69 (SE +/- 368.97, N = 3)
a: 53823.14 (SE +/- 135.79, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Scale - Benchmark: Floating Point

RAMspeed SMP 3.5.0, Type: Scale - Benchmark: Floating Point (MB/s, More Is Better)
c: 78268.45 (SE +/- 53.91, N = 3)
b: 78075.30 (SE +/- 42.08, N = 3)
a: 77928.96 (SE +/- 54.22, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Average - Benchmark: Floating Point

RAMspeed SMP 3.5.0, Type: Average - Benchmark: Floating Point (MB/s, More Is Better)
a: 55752.10 (SE +/- 274.93, N = 3)
c: 55658.87 (SE +/- 333.14, N = 3)
b: 55527.02 (SE +/- 360.38, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Copy - Benchmark: Floating Point

RAMspeed SMP 3.5.0, Type: Copy - Benchmark: Floating Point (MB/s, More Is Better)
a: 77595.32 (SE +/- 147.53, N = 3)
c: 74633.38 (SE +/- 957.21, N = 3)
b: 74212.45 (SE +/- 483.74, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Triad - Benchmark: Floating Point

RAMspeed SMP 3.5.0, Type: Triad - Benchmark: Floating Point (MB/s, More Is Better)
c: 34842.29 (SE +/- 27.38, N = 3)
b: 34806.77 (SE +/- 52.63, N = 3)
a: 34714.34 (SE +/- 11.29, N = 3)
1. (CC) gcc options: -O3 -march=native

RAMspeed SMP

Type: Add - Benchmark: Floating Point

RAMspeed SMP 3.5.0, Type: Add - Benchmark: Floating Point (MB/s, More Is Better)
b: 34985.36 (SE +/- 31.66, N = 3)
c: 34891.91 (SE +/- 9.27, N = 3)
a: 34873.11 (SE +/- 14.35, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy - Array Size: 4096 MiB

MBW 2018-09-08, Test: Memory Copy - Array Size: 4096 MiB (MiB/s, More Is Better)
c: 13258.18 (SE +/- 62.34, N = 3)
b: 13190.15 (SE +/- 24.22, N = 3)
a: 13139.17 (SE +/- 20.59, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy, Fixed Block Size - Array Size: 4096 MiB

MBW 2018-09-08, Test: Memory Copy, Fixed Block Size - Array Size: 4096 MiB (MiB/s, More Is Better)
c: 13325.49 (SE +/- 50.01, N = 3)
a: 13214.41 (SE +/- 26.21, N = 3)
b: 13184.27 (SE +/- 83.60, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy - Array Size: 1024 MiB

MBW 2018-09-08, Test: Memory Copy - Array Size: 1024 MiB (MiB/s, More Is Better)
c: 13726.32 (SE +/- 9.44, N = 3)
b: 13721.32 (SE +/- 8.31, N = 3)
a: 13621.05 (SE +/- 46.36, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy, Fixed Block Size - Array Size: 1024 MiB

MBW 2018-09-08, Test: Memory Copy, Fixed Block Size - Array Size: 1024 MiB (MiB/s, More Is Better)
c: 13718.74 (SE +/- 15.68, N = 3)
b: 13705.80 (SE +/- 9.34, N = 3)
a: 13684.47 (SE +/- 4.14, N = 3)
1. (CC) gcc options: -O3 -march=native

Stream

Type: Copy

Stream 2013-01-17, Type: Copy (MB/s, More Is Better)
c: 342567.8 (SE +/- 981.27, N = 5)
b: 342107.4 (SE +/- 1265.00, N = 5)
a: 321183.9 (SE +/- 2114.35, N = 5)
1. (CC) gcc options: -O3 -march=native -fopenmp

MBW

Test: Memory Copy, Fixed Block Size - Array Size: 128 MiB

MBW 2018-09-08, Test: Memory Copy, Fixed Block Size - Array Size: 128 MiB (MiB/s, More Is Better)
c: 17095.96 (SE +/- 153.98, N = 15)
b: 16881.74 (SE +/- 151.66, N = 15)
a: 16806.06 (SE +/- 160.35, N = 15)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy - Array Size: 512 MiB

MBW 2018-09-08, Test: Memory Copy - Array Size: 512 MiB (MiB/s, More Is Better)
b: 14099.97 (SE +/- 21.00, N = 3)
a: 14018.31 (SE +/- 22.87, N = 3)
c: 13987.14 (SE +/- 48.48, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy, Fixed Block Size - Array Size: 512 MiB

MBW 2018-09-08, Test: Memory Copy, Fixed Block Size - Array Size: 512 MiB (MiB/s, More Is Better)
c: 14105.60 (SE +/- 13.55, N = 3)
b: 14040.38 (SE +/- 60.13, N = 3)
a: 13991.54 (SE +/- 46.16, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy - Array Size: 128 MiB

MBW 2018-09-08, Test: Memory Copy - Array Size: 128 MiB (MiB/s, More Is Better)
c: 17104.44 (SE +/- 127.70, N = 10)
a: 16741.27 (SE +/- 168.54, N = 3)
b: 16179.79 (SE +/- 216.80, N = 12)
1. (CC) gcc options: -O3 -march=native

Stream

Type: Add

Stream 2013-01-17, Type: Add (MB/s, More Is Better)
b: 341627.2 (SE +/- 1241.84, N = 5)
c: 340226.7 (SE +/- 1117.82, N = 5)
a: 313161.3 (SE +/- 2548.52, N = 5)
1. (CC) gcc options: -O3 -march=native -fopenmp

Stream

Type: Triad

Stream 2013-01-17, Type: Triad (MB/s, More Is Better)
b: 337166.4 (SE +/- 561.20, N = 5)
c: 336556.6 (SE +/- 1078.16, N = 5)
a: 311997.8 (SE +/- 2534.25, N = 5)
1. (CC) gcc options: -O3 -march=native -fopenmp

Stream

Type: Scale

Stream 2013-01-17, Type: Scale (MB/s, More Is Better)
b: 338616.6 (SE +/- 819.28, N = 5)
c: 337805.2 (SE +/- 1113.20, N = 5)
a: 319543.8 (SE +/- 1618.66, N = 5)
1. (CC) gcc options: -O3 -march=native -fopenmp
Phoronix Test Suite v10.8.5