NVIDIA GH200 memory bandwidth system

ARMv8 Neoverse-V2 testing with a Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS) and NVIDIA GH200 480GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2402276-NE-NVIDIAGH242&grs&rdt .
System details (runs a, b, c)

Processor: ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores)
Motherboard: Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS)
Memory: 1 x 480GB DRAM-6400MT/s
Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9
Graphics: NVIDIA GH200 480GB
Network: 2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
OS: Ubuntu 22.04
Kernel: 6.5.0-1007-NVIDIA-64k (aarch64)
Display Driver: NVIDIA
OpenCL: OpenCL 3.0 CUDA 12.4.89
Vulkan: 1.3.277
Compiler: GCC 11.4.0 + CUDA 11.5
File-System: ext4
Screen Resolution: 1920x1200

Kernel Details - Transparent Huge Pages: madvise

Compiler Details - --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Processor Details - Scaling Governor: cppc_cpufreq performance (Boost: Disabled)

Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Results overview

Test                                              a           b           c
stream: Add                                       313161.3    341627.2    340226.7
stream: Triad                                     311997.8    337166.4    336556.6
tinymembench: Standard Memset                     68827.5     65367.2     70609.9
stream: Copy                                      321183.9    342107.4    342567.8
stream: Scale                                     319543.8    338616.6    337805.2
mbw: Memory Copy - 128 MiB                        16741.266   16179.790   17104.441
ramspeed: Copy - Floating Point                   77595.32    74212.45    74633.38
mbw: Memory Copy, Fixed Block Size - 8192 MiB     13429.535   12999.403   13490.987
ramspeed: Scale - Integer                         53823.14    54778.69    53893.69
mbw: Memory Copy, Fixed Block Size - 128 MiB      16806.058   16881.744   17095.964
ramspeed: Average - Integer                       50216.61    49961.21    49510.70
mbw: Memory Copy, Fixed Block Size - 4096 MiB     13214.407   13184.271   13325.486
ramspeed: Copy - Integer                          74997.25    75559.86    74789.13
mbw: Memory Copy - 4096 MiB                       13139.174   13190.154   13258.184
mbw: Memory Copy, Fixed Block Size - 512 MiB      13991.543   14040.383   14105.604
mbw: Memory Copy - 512 MiB                        14018.313   14099.967   13987.139
mbw: Memory Copy - 1024 MiB                       13621.048   13721.316   13726.316
ramspeed: Triad - Integer                         34645.14    34759.77    34882.37
ramspeed: Scale - Floating Point                  77928.96    78075.30    78268.45
ramspeed: Add - Integer                           35195.30    35312.41    35348.29
mbw: Memory Copy - 8192 MiB                       13432.382   13487.665   13489.447
ramspeed: Average - Floating Point                55752.10    55527.02    55658.87
ramspeed: Triad - Floating Point                  34714.34    34806.77    34842.29
ramspeed: Add - Floating Point                    34873.11    34985.36    34891.91
mbw: Memory Copy, Fixed Block Size - 1024 MiB     13684.468   13705.804   13718.736
tinymembench: Standard Memcpy                     25198.3     24982.6     26996.1
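The overview lists three runs (a, b, c) per test without stating how far apart they are. As a minimal sketch, the relative difference of runs b and c against run a can be computed from the STREAM figures in the overview. The helper name `percent_vs_baseline` and the choice of run a as the baseline are illustrative assumptions, not part of the Phoronix Test Suite output.

```python
# STREAM results (MB/s) for runs a, b, c, copied from the overview.
stream_mb_s = {
    "Add":   (313161.3, 341627.2, 340226.7),
    "Triad": (311997.8, 337166.4, 336556.6),
    "Copy":  (321183.9, 342107.4, 342567.8),
    "Scale": (319543.8, 338616.6, 337805.2),
}


def percent_vs_baseline(value, baseline):
    """Return how much `value` differs from `baseline`, in percent."""
    return (value / baseline - 1.0) * 100.0


# Treat run a as the baseline (an arbitrary choice for illustration).
for test, (a, b, c) in stream_mb_s.items():
    print(f"{test:<6} b: {percent_vs_baseline(b, a):+.2f}%  "
          f"c: {percent_vs_baseline(c, a):+.2f}%")
```

The same helper applies to any row of the overview, since every test here is reported as "More Is Better".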

Stream 2013-01-17, Type: Add (MB/s, More Is Better)
  a: 313161.3 (SE +/- 2548.52, N = 5)
  b: 341627.2 (SE +/- 1241.84, N = 5)
  c: 340226.7 (SE +/- 1117.82, N = 5)
  1. (CC) gcc options: -O3 -march=native -fopenmp
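
Each result carries an "SE +/- x, N = n" figure. Assuming SE denotes the standard error of the mean over N trial runs (the usual reading, but an assumption here), the sample standard deviation of the individual trials can be recovered as SE times the square root of N. The function name below is hypothetical.

```python
import math


def stddev_from_se(se, n):
    """Recover the sample standard deviation from a standard error.

    Assumes se = stddev / sqrt(n), i.e. SE is the standard error of the mean.
    """
    return se * math.sqrt(n)


# Stream Add, run a: SE +/- 2548.52 over N = 5 trials.
print(round(stddev_from_se(2548.52, 5), 2))
```

Under that assumption, run a of Stream Add varied by several thousand MB/s between trials, which is worth keeping in mind when comparing it with runs b and c.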

Stream 2013-01-17, Type: Triad (MB/s, More Is Better)
  a: 311997.8 (SE +/- 2534.25, N = 5)
  b: 337166.4 (SE +/- 561.20, N = 5)
  c: 336556.6 (SE +/- 1078.16, N = 5)
  1. (CC) gcc options: -O3 -march=native -fopenmp

Tinymembench 2018-05-28, Standard Memset (MB/s, More Is Better)
  a: 68827.5 (SE +/- 52.90, N = 3)
  b: 65367.2 (SE +/- 376.32, N = 6)
  c: 70609.9 (SE +/- 292.74, N = 9)
  1. (CC) gcc options: -O2 -lm

Stream 2013-01-17, Type: Copy (MB/s, More Is Better)
  a: 321183.9 (SE +/- 2114.35, N = 5)
  b: 342107.4 (SE +/- 1265.00, N = 5)
  c: 342567.8 (SE +/- 981.27, N = 5)
  1. (CC) gcc options: -O3 -march=native -fopenmp

Stream 2013-01-17, Type: Scale (MB/s, More Is Better)
  a: 319543.8 (SE +/- 1618.66, N = 5)
  b: 338616.6 (SE +/- 819.28, N = 5)
  c: 337805.2 (SE +/- 1113.20, N = 5)
  1. (CC) gcc options: -O3 -march=native -fopenmp

MBW 2018-09-08, Test: Memory Copy - Array Size: 128 MiB (MiB/s, More Is Better)
  a: 16741.27 (SE +/- 168.54, N = 3)
  b: 16179.79 (SE +/- 216.80, N = 12)
  c: 17104.44 (SE +/- 127.70, N = 10)
  1. (CC) gcc options: -O3 -march=native
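
The MBW results are labeled MiB/s while the Stream, RAMspeed and Tinymembench results are labeled MB/s. Taking the labels literally (1 MiB = 2^20 bytes, 1 MB = 10^6 bytes, which is an assumption about how each tool counts), a rough conversion for side-by-side reading could look like the sketch below. The function name is hypothetical.

```python
BYTES_PER_MIB = 1024 * 1024   # binary mebibyte
BYTES_PER_MB = 1000 * 1000    # decimal megabyte


def mib_per_s_to_mb_per_s(mib_per_s):
    """Convert a bandwidth in MiB/s to MB/s, taking the unit labels literally."""
    return mib_per_s * BYTES_PER_MIB / BYTES_PER_MB


# MBW Memory Copy, 128 MiB array, run a: 16741.27 MiB/s.
print(round(mib_per_s_to_mb_per_s(16741.27), 2))
```

The difference is under 5 percent, so it does not change the overall picture, but it matters when comparing MBW numbers against the other tools at fine granularity.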

RAMspeed SMP 3.5.0, Type: Copy - Benchmark: Floating Point (MB/s, More Is Better)
  a: 77595.32 (SE +/- 147.53, N = 3)
  b: 74212.45 (SE +/- 483.74, N = 3)
  c: 74633.38 (SE +/- 957.21, N = 3)
  1. (CC) gcc options: -O3 -march=native

MBW 2018-09-08, Test: Memory Copy, Fixed Block Size - Array Size: 8192 MiB (MiB/s, More Is Better)
  a: 13429.54 (SE +/- 1.46, N = 3)
  b: 12999.40 (SE +/- 1.46, N = 3)
  c: 13490.99 (SE +/- 1.15, N = 3)
  1. (CC) gcc options: -O3 -march=native

RAMspeed SMP 3.5.0, Type: Scale - Benchmark: Integer (MB/s, More Is Better)
  a: 53823.14 (SE +/- 135.79, N = 3)
  b: 54778.69 (SE +/- 175.46, N = 3)
  c: 53893.69 (SE +/- 368.97, N = 3)
  1. (CC) gcc options: -O3 -march=native

MBW 2018-09-08, Test: Memory Copy, Fixed Block Size - Array Size: 128 MiB (MiB/s, More Is Better)
  a: 16806.06 (SE +/- 160.35, N = 15)
  b: 16881.74 (SE +/- 151.66, N = 15)
  c: 17095.96 (SE +/- 153.98, N = 15)
  1. (CC) gcc options: -O3 -march=native

RAMspeed SMP 3.5.0, Type: Average - Benchmark: Integer (MB/s, More Is Better)
  a: 50216.61 (SE +/- 523.54, N = 3)
  b: 49961.21 (SE +/- 339.06, N = 3)
  c: 49510.70 (SE +/- 120.67, N = 3)
  1. (CC) gcc options: -O3 -march=native

MBW 2018-09-08, Test: Memory Copy, Fixed Block Size - Array Size: 4096 MiB (MiB/s, More Is Better)
  a: 13214.41 (SE +/- 26.21, N = 3)
  b: 13184.27 (SE +/- 83.60, N = 3)
  c: 13325.49 (SE +/- 50.01, N = 3)
  1. (CC) gcc options: -O3 -march=native

RAMspeed SMP 3.5.0, Type: Copy - Benchmark: Integer (MB/s, More Is Better)
  a: 74997.25 (SE +/- 532.99, N = 3)
  b: 75559.86 (SE +/- 518.30, N = 15)
  c: 74789.13 (SE +/- 1063.03, N = 3)
  1. (CC) gcc options: -O3 -march=native

MBW 2018-09-08, Test: Memory Copy - Array Size: 4096 MiB (MiB/s, More Is Better)
  a: 13139.17 (SE +/- 20.59, N = 3)
  b: 13190.15 (SE +/- 24.22, N = 3)
  c: 13258.18 (SE +/- 62.34, N = 3)
  1. (CC) gcc options: -O3 -march=native

MBW 2018-09-08, Test: Memory Copy, Fixed Block Size - Array Size: 512 MiB (MiB/s, More Is Better)
  a: 13991.54 (SE +/- 46.16, N = 3)
  b: 14040.38 (SE +/- 60.13, N = 3)
  c: 14105.60 (SE +/- 13.55, N = 3)
  1. (CC) gcc options: -O3 -march=native

MBW 2018-09-08, Test: Memory Copy - Array Size: 512 MiB (MiB/s, More Is Better)
  a: 14018.31 (SE +/- 22.87, N = 3)
  b: 14099.97 (SE +/- 21.00, N = 3)
  c: 13987.14 (SE +/- 48.48, N = 3)
  1. (CC) gcc options: -O3 -march=native

MBW 2018-09-08, Test: Memory Copy - Array Size: 1024 MiB (MiB/s, More Is Better)
  a: 13621.05 (SE +/- 46.36, N = 3)
  b: 13721.32 (SE +/- 8.31, N = 3)
  c: 13726.32 (SE +/- 9.44, N = 3)
  1. (CC) gcc options: -O3 -march=native

RAMspeed SMP 3.5.0, Type: Triad - Benchmark: Integer (MB/s, More Is Better)
  a: 34645.14 (SE +/- 14.04, N = 3)
  b: 34759.77 (SE +/- 24.34, N = 3)
  c: 34882.37 (SE +/- 106.35, N = 3)
  1. (CC) gcc options: -O3 -march=native

RAMspeed SMP 3.5.0, Type: Scale - Benchmark: Floating Point (MB/s, More Is Better)
  a: 77928.96 (SE +/- 54.22, N = 3)
  b: 78075.30 (SE +/- 42.08, N = 3)
  c: 78268.45 (SE +/- 53.91, N = 3)
  1. (CC) gcc options: -O3 -march=native

RAMspeed SMP 3.5.0, Type: Add - Benchmark: Integer (MB/s, More Is Better)
  a: 35195.30 (SE +/- 12.14, N = 3)
  b: 35312.41 (SE +/- 27.82, N = 3)
  c: 35348.29 (SE +/- 16.40, N = 3)
  1. (CC) gcc options: -O3 -march=native

MBW 2018-09-08, Test: Memory Copy - Array Size: 8192 MiB (MiB/s, More Is Better)
  a: 13432.38 (SE +/- 1.93, N = 3)
  b: 13487.67 (SE +/- 0.98, N = 3)
  c: 13489.45 (SE +/- 1.24, N = 3)
  1. (CC) gcc options: -O3 -march=native

RAMspeed SMP 3.5.0, Type: Average - Benchmark: Floating Point (MB/s, More Is Better)
  a: 55752.10 (SE +/- 274.93, N = 3)
  b: 55527.02 (SE +/- 360.38, N = 3)
  c: 55658.87 (SE +/- 333.14, N = 3)
  1. (CC) gcc options: -O3 -march=native

RAMspeed SMP 3.5.0, Type: Triad - Benchmark: Floating Point (MB/s, More Is Better)
  a: 34714.34 (SE +/- 11.29, N = 3)
  b: 34806.77 (SE +/- 52.63, N = 3)
  c: 34842.29 (SE +/- 27.38, N = 3)
  1. (CC) gcc options: -O3 -march=native

RAMspeed SMP 3.5.0, Type: Add - Benchmark: Floating Point (MB/s, More Is Better)
  a: 34873.11 (SE +/- 14.35, N = 3)
  b: 34985.36 (SE +/- 31.66, N = 3)
  c: 34891.91 (SE +/- 9.27, N = 3)
  1. (CC) gcc options: -O3 -march=native

MBW 2018-09-08, Test: Memory Copy, Fixed Block Size - Array Size: 1024 MiB (MiB/s, More Is Better)
  a: 13684.47 (SE +/- 4.14, N = 3)
  b: 13705.80 (SE +/- 9.34, N = 3)
  c: 13718.74 (SE +/- 15.68, N = 3)
  1. (CC) gcc options: -O3 -march=native

Tinymembench 2018-05-28, Standard Memcpy (MB/s, More Is Better)
  a: 25198.3 (SE +/- 180.61, N = 3)
  b: 24982.6 (SE +/- 896.59, N = 6)
  c: 26996.1 (SE +/- 720.83, N = 9)
  1. (CC) gcc options: -O2 -lm

Phoronix Test Suite v10.8.5