NVIDIA GH200 memory bandwidth system

ARMv8 Neoverse-V2 testing with a Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS) and NVIDIA GH200 480GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2402276-NE-NVIDIAGH242&sro .
System details

Runs: a, b, c

  Processor:          ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores)
  Motherboard:        Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS)
  Memory:             1 x 480GB DRAM-6400MT/s
  Disk:               960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9
  Graphics:           NVIDIA GH200 480GB
  Network:            2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
  OS:                 Ubuntu 22.04
  Kernel:             6.5.0-1007-NVIDIA-64k (aarch64)
  Display Driver:     NVIDIA
  OpenCL:             OpenCL 3.0 CUDA 12.4.89
  Vulkan:             1.3.277
  Compiler:           GCC 11.4.0 + CUDA 11.5
  File-System:        ext4
  Screen Resolution:  1920x1200

Kernel Details
  Transparent Huge Pages: madvise

Compiler Details
  --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Processor Details
  Scaling Governor: cppc_cpufreq performance (Boost: Disabled)

Security Details
  gather_data_sampling:  Not affected
  itlb_multihit:         Not affected
  l1tf:                  Not affected
  mds:                   Not affected
  meltdown:              Not affected
  mmio_stale_data:       Not affected
  retbleed:              Not affected
  spec_rstack_overflow:  Not affected
  spec_store_bypass:     Mitigation of SSB disabled via prctl
  spectre_v1:            Mitigation of __user pointer sanitization
  spectre_v2:            Not affected
  srbds:                 Not affected
  tsx_async_abort:       Not affected
Results overview

  Test                                            Unit    a           b           c
  ramspeed: Add - Integer                         MB/s    35195.30    35312.41    35348.29
  ramspeed: Copy - Integer                        MB/s    74997.25    75559.86    74789.13
  ramspeed: Scale - Integer                       MB/s    53823.14    54778.69    53893.69
  ramspeed: Triad - Integer                       MB/s    34645.14    34759.77    34882.37
  ramspeed: Average - Integer                     MB/s    50216.61    49961.21    49510.70
  ramspeed: Add - Floating Point                  MB/s    34873.11    34985.36    34891.91
  ramspeed: Copy - Floating Point                 MB/s    77595.32    74212.45    74633.38
  ramspeed: Scale - Floating Point                MB/s    77928.96    78075.30    78268.45
  ramspeed: Triad - Floating Point                MB/s    34714.34    34806.77    34842.29
  ramspeed: Average - Floating Point              MB/s    55752.10    55527.02    55658.87
  stream: Copy                                    MB/s    321183.9    342107.4    342567.8
  stream: Scale                                   MB/s    319543.8    338616.6    337805.2
  stream: Triad                                   MB/s    311997.8    337166.4    336556.6
  stream: Add                                     MB/s    313161.3    341627.2    340226.7
  tinymembench: Standard Memcpy                   MB/s    25198.3     24982.6     26996.1
  tinymembench: Standard Memset                   MB/s    68827.5     65367.2     70609.9
  mbw: Memory Copy - 128 MiB                      MiB/s   16741.266   16179.790   17104.441
  mbw: Memory Copy - 512 MiB                      MiB/s   14018.313   14099.967   13987.139
  mbw: Memory Copy - 1024 MiB                     MiB/s   13621.048   13721.316   13726.316
  mbw: Memory Copy - 4096 MiB                     MiB/s   13139.174   13190.154   13258.184
  mbw: Memory Copy - 8192 MiB                     MiB/s   13432.382   13487.665   13489.447
  mbw: Memory Copy, Fixed Block Size - 128 MiB    MiB/s   16806.058   16881.744   17095.964
  mbw: Memory Copy, Fixed Block Size - 512 MiB    MiB/s   13991.543   14040.383   14105.604
  mbw: Memory Copy, Fixed Block Size - 1024 MiB   MiB/s   13684.468   13705.804   13718.736
  mbw: Memory Copy, Fixed Block Size - 4096 MiB   MiB/s   13214.407   13184.271   13325.486
  mbw: Memory Copy, Fixed Block Size - 8192 MiB   MiB/s   13429.535   12999.403   13490.987
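One way to read the three runs side by side is to look at the relative spread between the slowest and fastest run of each test. The following is a minimal sketch, not part of the Phoronix Test Suite output: the helper name `spread_pct` is hypothetical, and the two sample entries are copied from the overview figures for `stream: Copy` and `ramspeed: Add - Integer`.

```python
# Values (MB/s) copied from the results overview; the helper name is illustrative.
stream_copy = {"a": 321183.9, "b": 342107.4, "c": 342567.8}
ramspeed_add_int = {"a": 35195.30, "b": 35312.41, "c": 35348.29}


def spread_pct(runs):
    """Return (max - min) / min across the runs, as a percentage."""
    lo = min(runs.values())
    hi = max(runs.values())
    return (hi - lo) / lo * 100.0


for name, runs in [
    ("stream: Copy", stream_copy),
    ("ramspeed: Add - Integer", ramspeed_add_int),
]:
    print(f"{name}: {spread_pct(runs):.2f}% spread across runs a, b, c")
```

A larger spread (as for the Stream results, where run a trails b and c) suggests run-to-run variation worth checking against the per-test standard errors before drawing conclusions.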
RAMspeed SMP 3.5.0
MB/s, More Is Better. (CC) gcc options: -O3 -march=native

Type: Add - Benchmark: Integer
  a: 35195.30 (SE +/- 12.14, N = 3)
  b: 35312.41 (SE +/- 27.82, N = 3)
  c: 35348.29 (SE +/- 16.40, N = 3)

Type: Copy - Benchmark: Integer
  a: 74997.25 (SE +/- 532.99, N = 3)
  b: 75559.86 (SE +/- 518.30, N = 15)
  c: 74789.13 (SE +/- 1063.03, N = 3)

Type: Scale - Benchmark: Integer
  a: 53823.14 (SE +/- 135.79, N = 3)
  b: 54778.69 (SE +/- 175.46, N = 3)
  c: 53893.69 (SE +/- 368.97, N = 3)

Type: Triad - Benchmark: Integer
  a: 34645.14 (SE +/- 14.04, N = 3)
  b: 34759.77 (SE +/- 24.34, N = 3)
  c: 34882.37 (SE +/- 106.35, N = 3)

Type: Average - Benchmark: Integer
  a: 50216.61 (SE +/- 523.54, N = 3)
  b: 49961.21 (SE +/- 339.06, N = 3)
  c: 49510.70 (SE +/- 120.67, N = 3)

Type: Add - Benchmark: Floating Point
  a: 34873.11 (SE +/- 14.35, N = 3)
  b: 34985.36 (SE +/- 31.66, N = 3)
  c: 34891.91 (SE +/- 9.27, N = 3)

Type: Copy - Benchmark: Floating Point
  a: 77595.32 (SE +/- 147.53, N = 3)
  b: 74212.45 (SE +/- 483.74, N = 3)
  c: 74633.38 (SE +/- 957.21, N = 3)

Type: Scale - Benchmark: Floating Point
  a: 77928.96 (SE +/- 54.22, N = 3)
  b: 78075.30 (SE +/- 42.08, N = 3)
  c: 78268.45 (SE +/- 53.91, N = 3)

Type: Triad - Benchmark: Floating Point
  a: 34714.34 (SE +/- 11.29, N = 3)
  b: 34806.77 (SE +/- 52.63, N = 3)
  c: 34842.29 (SE +/- 27.38, N = 3)

Type: Average - Benchmark: Floating Point
  a: 55752.10 (SE +/- 274.93, N = 3)
  b: 55527.02 (SE +/- 360.38, N = 3)
  c: 55658.87 (SE +/- 333.14, N = 3)
Stream 2013-01-17
MB/s, More Is Better. (CC) gcc options: -O3 -march=native -fopenmp

Type: Copy
  a: 321183.9 (SE +/- 2114.35, N = 5)
  b: 342107.4 (SE +/- 1265.00, N = 5)
  c: 342567.8 (SE +/- 981.27, N = 5)

Type: Scale
  a: 319543.8 (SE +/- 1618.66, N = 5)
  b: 338616.6 (SE +/- 819.28, N = 5)
  c: 337805.2 (SE +/- 1113.20, N = 5)

Type: Triad
  a: 311997.8 (SE +/- 2534.25, N = 5)
  b: 337166.4 (SE +/- 561.20, N = 5)
  c: 336556.6 (SE +/- 1078.16, N = 5)

Type: Add
  a: 313161.3 (SE +/- 2548.52, N = 5)
  b: 341627.2 (SE +/- 1241.84, N = 5)
  c: 340226.7 (SE +/- 1117.82, N = 5)
Tinymembench 2018-05-28
MB/s, More Is Better. (CC) gcc options: -O2 -lm

Standard Memcpy
  a: 25198.3 (SE +/- 180.61, N = 3)
  b: 24982.6 (SE +/- 896.59, N = 6)
  c: 26996.1 (SE +/- 720.83, N = 9)

Standard Memset
  a: 68827.5 (SE +/- 52.90, N = 3)
  b: 65367.2 (SE +/- 376.32, N = 6)
  c: 70609.9 (SE +/- 292.74, N = 9)
MBW 2018-09-08
MiB/s, More Is Better. (CC) gcc options: -O3 -march=native

Test: Memory Copy - Array Size: 128 MiB
  a: 16741.27 (SE +/- 168.54, N = 3)
  b: 16179.79 (SE +/- 216.80, N = 12)
  c: 17104.44 (SE +/- 127.70, N = 10)

Test: Memory Copy - Array Size: 512 MiB
  a: 14018.31 (SE +/- 22.87, N = 3)
  b: 14099.97 (SE +/- 21.00, N = 3)
  c: 13987.14 (SE +/- 48.48, N = 3)

Test: Memory Copy - Array Size: 1024 MiB
  a: 13621.05 (SE +/- 46.36, N = 3)
  b: 13721.32 (SE +/- 8.31, N = 3)
  c: 13726.32 (SE +/- 9.44, N = 3)

Test: Memory Copy - Array Size: 4096 MiB
  a: 13139.17 (SE +/- 20.59, N = 3)
  b: 13190.15 (SE +/- 24.22, N = 3)
  c: 13258.18 (SE +/- 62.34, N = 3)

Test: Memory Copy - Array Size: 8192 MiB
  a: 13432.38 (SE +/- 1.93, N = 3)
  b: 13487.67 (SE +/- 0.98, N = 3)
  c: 13489.45 (SE +/- 1.24, N = 3)

Test: Memory Copy, Fixed Block Size - Array Size: 128 MiB
  a: 16806.06 (SE +/- 160.35, N = 15)
  b: 16881.74 (SE +/- 151.66, N = 15)
  c: 17095.96 (SE +/- 153.98, N = 15)

Test: Memory Copy, Fixed Block Size - Array Size: 512 MiB
  a: 13991.54 (SE +/- 46.16, N = 3)
  b: 14040.38 (SE +/- 60.13, N = 3)
  c: 14105.60 (SE +/- 13.55, N = 3)

Test: Memory Copy, Fixed Block Size - Array Size: 1024 MiB
  a: 13684.47 (SE +/- 4.14, N = 3)
  b: 13705.80 (SE +/- 9.34, N = 3)
  c: 13718.74 (SE +/- 15.68, N = 3)

Test: Memory Copy, Fixed Block Size - Array Size: 4096 MiB
  a: 13214.41 (SE +/- 26.21, N = 3)
  b: 13184.27 (SE +/- 83.60, N = 3)
  c: 13325.49 (SE +/- 50.01, N = 3)

Test: Memory Copy, Fixed Block Size - Array Size: 8192 MiB
  a: 13429.54 (SE +/- 1.46, N = 3)
  b: 12999.40 (SE +/- 1.46, N = 3)
  c: 13490.99 (SE +/- 1.15, N = 3)
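MBW results are labeled MiB/s, while RAMspeed SMP, Stream, and Tinymembench results are labeled MB/s. To put the figures on one scale, a conversion along the following lines may help. This is a sketch that assumes the labels follow the usual definitions (1 MiB = 2^20 bytes, 1 MB = 10^6 bytes); whether each tool applies those definitions strictly is not confirmed by this result file. The function name `mib_to_mb` is hypothetical.

```python
# Assumed unit definitions; not confirmed by the result file itself.
BYTES_PER_MIB = 2**20
BYTES_PER_MB = 10**6


def mib_to_mb(rate_mib_per_s):
    """Convert a rate in MiB/s to MB/s under the assumed definitions."""
    return rate_mib_per_s * BYTES_PER_MIB / BYTES_PER_MB


# MBW "Memory Copy - Array Size: 128 MiB" results, copied from this report.
mbw_copy_128 = {"a": 16741.27, "b": 16179.79, "c": 17104.44}

for run, rate in mbw_copy_128.items():
    print(f"run {run}: {rate:.2f} MiB/s is about {mib_to_mb(rate):.2f} MB/s")
```

The conversion is a factor of roughly 1.049, so it does not change how MBW ranks against the other tests, only the scale of the numbers.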
Phoronix Test Suite v10.8.5