bandwidth

ARMv8 Neoverse-V2 testing with a Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS) and NVIDIA GH200 480GB on Ubuntu 22.04 via the Phoronix Test Suite.

ARMv8 Neoverse-V2:

  Processor: ARMv8 Neoverse-V2 @ 3.39GHz (72 Cores), Motherboard: Quanta Cloud QuantaGrid S74G-2U 1S7GZ9Z0000 S7G MB (CG1) (3A06 BIOS), Memory: 1 x 480GB DRAM-6400MT/s, Disk: 960GB SAMSUNG MZ1L2960HCJR-00A07 + 1920GB SAMSUNG MZTL21T9, Graphics: NVIDIA GH200 480GB, Network: 2 x Mellanox MT2910 + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE

  OS: Ubuntu 22.04, Kernel: 6.5.0-1007-NVIDIA-64k (aarch64), Display Driver: NVIDIA, OpenCL: OpenCL 3.0 CUDA 12.4.89, Vulkan: 1.3.277, Compiler: GCC 11.4.0 + CUDA 11.5, File-System: ext4, Screen Resolution: 1920x1200

b, c, d, e, f:

  Same hardware and software configuration as ARMv8 Neoverse-V2 above.

High Performance Conjugate Gradient 3.1
X Y Z: 104 104 104 - RT: 60
GFLOP/s > Higher Is Better
ARMv8 Neoverse-V2 . 44.71 |====================================================
b ................. 39.57 |==============================================
c ................. 38.95 |=============================================
d ................. 38.63 |=============================================
e ................. 38.69 |=============================================
f ................. 38.71 |=============================================

High Performance Conjugate Gradient 3.1
X Y Z: 144 144 144 - RT: 60
GFLOP/s > Higher Is Better
ARMv8 Neoverse-V2 . 41.92 |====================================================
b ................. 38.96 |================================================
c ................. 38.23 |===============================================
d ................. 38.02 |===============================================
e ................. 37.99 |===============================================
f ................. 38.11 |===============================================

High Performance Conjugate Gradient 3.1
X Y Z: 160 160 160 - RT: 60
GFLOP/s > Higher Is Better
ARMv8 Neoverse-V2 . 39.92 |====================================================
b ................. 38.78 |===================================================
c ................. 38.51 |==================================================
d ................. 38.17 |==================================================
e ................. 38.00 |=================================================
f ................. 38.03 |==================================================

High Performance Conjugate Gradient 3.1
X Y Z: 192 192 192 - RT: 60
GFLOP/s > Higher Is Better
ARMv8 Neoverse-V2 . 38.73 |====================================================
b ................. 38.48 |====================================================
c ................. 38.11 |===================================================
d ................. 38.06 |===================================================
e ................. 37.98 |===================================================
f ................. 37.85 |===================================================

Graph500 3.0
Scale: 26
bfs median_TEPS > Higher Is Better
ARMv8 Neoverse-V2 . 1505570000 |===============================================
b ................. 1503510000 |===============================================
c ................. 1498670000 |===============================================
d ................. 1489270000 |==============================================
e ................. 1469720000 |==============================================
f ................. 1481400000 |==============================================

Graph500 3.0
Scale: 26
bfs max_TEPS > Higher Is Better
ARMv8 Neoverse-V2 . 1573470000 |===============================================
b ................. 1574260000 |===============================================
c ................. 1571860000 |===============================================
d ................. 1562170000 |===============================================
e ................. 1547980000 |==============================================
f ................. 1559480000 |===============================================

Graph500 3.0
Scale: 26
sssp median_TEPS > Higher Is Better
ARMv8 Neoverse-V2 . 333905000 |================================================
b ................. 322051000 |==============================================
c ................. 327451000 |===============================================
d ................. 322704000 |==============================================
e ................. 322883000 |==============================================
f ................. 318469000 |==============================================

Graph500 3.0
Scale: 26
sssp max_TEPS > Higher Is Better
ARMv8 Neoverse-V2 . 511909000 |================================================
b ................. 501380000 |===============================================
c ................. 492413000 |==============================================
d ................. 498666000 |===============================================
e ................. 491501000 |==============================================
f ................. 483974000 |=============================================

GROMACS 2024
Implementation: MPI CPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better
ARMv8 Neoverse-V2 . 5.429 |===================================================
b ................. 5.520 |====================================================
c ................. 5.529 |====================================================
d ................. 5.530 |====================================================
e ................. 5.515 |====================================================
f ................. 5.508 |====================================================

GROMACS 2024
Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better