Amazon AWS amazon testing on Ubuntu 22.04 via the Phoronix Test Suite.

m7g.16xlarge Graviton3:

Processor: ARMv8 Neoverse-V1 (64 Cores), Motherboard: Amazon EC2 m7g.16xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 256GB, Disk: 215GB Amazon Elastic Block Store, Network: Amazon Elastic

OS: Ubuntu 22.04, Kernel: 5.19.0-1025-aws (aarch64), Compiler: GCC 11.3.0, File-System: ext4, System Layer: amazon

BRL-CAD 7.34
VGR Performance Metric
VGR Performance Metric > Higher Is Better
m7g.16xlarge Graviton3 . 783777

Kripke 1.2.6
Throughput FoM > Higher Is Better
m7g.16xlarge Graviton3 . 339000400

Apache HTTP Server 2.4.56
Concurrent Requests: 1000
Requests Per Second > Higher Is Better
m7g.16xlarge Graviton3 . 60965.70

Apache HTTP Server 2.4.56
Concurrent Requests: 500
Requests Per Second > Higher Is Better
m7g.16xlarge Graviton3 . 71754.89

nginx 1.23.2
Connections: 1000
Requests Per Second > Higher Is Better
m7g.16xlarge Graviton3 . 255616.04

nginx 1.23.2
Connections: 500
Requests Per Second > Higher Is Better
m7g.16xlarge Graviton3 . 255768.44

GPAW 23.6
Input: Carbon Nanotube
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 61.83

Stress-NG 0.15.10
Test: Vector Floating Point
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 76102.55

Stress-NG 0.15.10
Test: Fused Multiply-Add
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 63762252.76

Stress-NG 0.15.10
Test: Wide Vector Math
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 1542834.94

Stress-NG 0.15.10
Test: Vector Shuffle
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 54143.40

Stress-NG 0.15.10
Test: Memory Copying
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 20484.24

Stress-NG 0.15.10
Test: Matrix 3D Math
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 10403.93

Stress-NG 0.15.10
Test: Vector Math
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 217235.59

Stress-NG 0.15.10
Test: Matrix Math
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 368750.67

Stress-NG 0.15.10
Test: CPU Cache
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 3892396.34

Stress-NG 0.15.10
Test: NUMA
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 3759.10

GROMACS 2023
Implementation: MPI CPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better
m7g.16xlarge Graviton3 . 4.223

Graph500 3.0
Scale: 26
sssp max_TEPS > Higher Is Better
m7g.16xlarge Graviton3 . 419754000

Graph500 3.0
Scale: 26
sssp median_TEPS > Higher Is Better
m7g.16xlarge Graviton3 . 299497000

Graph500 3.0
Scale: 26
bfs max_TEPS > Higher Is Better
m7g.16xlarge Graviton3 . 1227790000

Graph500 3.0
Scale: 26
bfs median_TEPS > Higher Is Better
m7g.16xlarge Graviton3 . 1194320000

Liquid-DSP 1.6
Threads: 64 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
m7g.16xlarge Graviton3 . 162753333

Liquid-DSP 1.6
Threads: 32 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
m7g.16xlarge Graviton3 . 81396667

Liquid-DSP 1.6
Threads: 64 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
m7g.16xlarge Graviton3 . 1442400000

Liquid-DSP 1.6
Threads: 64 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
m7g.16xlarge Graviton3 . 2270500000

Liquid-DSP 1.6
Threads: 32 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
m7g.16xlarge Graviton3 . 721493333

Liquid-DSP 1.6
Threads: 32 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
m7g.16xlarge Graviton3 . 1136066667

OpenSSL 3.1
Algorithm: ChaCha20-Poly1305
byte/s > Higher Is Better
m7g.16xlarge Graviton3 . 74287460990

OpenSSL 3.1
Algorithm: AES-256-GCM
byte/s > Higher Is Better
m7g.16xlarge Graviton3 . 283333113630

OpenSSL 3.1
Algorithm: AES-128-GCM
byte/s > Higher Is Better
m7g.16xlarge Graviton3 . 332033171900

OpenSSL 3.1
Algorithm: ChaCha20
byte/s > Higher Is Better
m7g.16xlarge Graviton3 . 103226784517

OpenSSL 3.1
Algorithm: RSA4096
verify/s > Higher Is Better
m7g.16xlarge Graviton3 . 713859.5

OpenSSL 3.1
Algorithm: RSA4096
sign/s > Higher Is Better
m7g.16xlarge Graviton3 . 10181.9

OpenSSL 3.1
Algorithm: SHA512
byte/s > Higher Is Better
m7g.16xlarge Graviton3 . 32125448870

OpenSSL 3.1
Algorithm: SHA256
byte/s > Higher Is Better
m7g.16xlarge Graviton3 . 54212515580

Timed Node.js Compilation 19.8.1
Time To Compile
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 237.78

Timed Godot Game Engine Compilation 4.0
Time To Compile
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 154.38

Timed Gem5 Compilation 21.2
Time To Compile
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 180.25

7-Zip Compression 22.01
Test: Decompression Rating
MIPS > Higher Is Better
m7g.16xlarge Graviton3 . 285540

7-Zip Compression 22.01
Test: Compression Rating
MIPS > Higher Is Better
m7g.16xlarge Graviton3 . 316825

Coremark 1.0
CoreMark Size 666 - Iterations Per Second
Iterations/Sec > Higher Is Better
m7g.16xlarge Graviton3 . 1601880.34

ACES DGEMM 1.0
Sustained Floating-Point Rate
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 24.36

srsRAN Project 23.5
Test: PUSCH Processor Benchmark, Throughput Thread
Mbps > Higher Is Better
m7g.16xlarge Graviton3 . 95.8

srsRAN Project 23.5
Test: PUSCH Processor Benchmark, Throughput Total
Mbps > Higher Is Better
m7g.16xlarge Graviton3 . 5413.8

srsRAN Project 23.5
Test: Downlink Processor Benchmark
Mbps > Higher Is Better
m7g.16xlarge Graviton3 . 318.5

LULESH 2.0.3
z/s > Higher Is Better
m7g.16xlarge Graviton3 . 28296.38

LAMMPS Molecular Dynamics Simulator 23Jun2022
Model: Rhodopsin Protein
ns/day > Higher Is Better
m7g.16xlarge Graviton3 . 37.56

LAMMPS Molecular Dynamics Simulator 23Jun2022
Model: 20k Atoms
ns/day > Higher Is Better
m7g.16xlarge Graviton3 . 36.93

nekRS 23.0
Input: TurboPipe Periodic
flops/rank > Higher Is Better
m7g.16xlarge Graviton3 . 3976300000

nekRS 23.0
Input: Kershaw
flops/rank > Higher Is Better
m7g.16xlarge Graviton3 . 3150680000

Remhos 1.0
Test: Sample Remap Example
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 14.04

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3
Input: Dust 2D tau100.0
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 82.67

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3
Input: Gas HII40
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 13.58

Xcompact3d Incompact3d 2021-03-11
Input: input.i3d 193 Cells Per Direction
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 13.95

Xcompact3d Incompact3d 2021-03-11
Input: input.i3d 129 Cells Per Direction
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 3.09871038

QMCPACK 3.16
Input: FeCO6_b3lyp_gms
Total Execution Time - Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 205.72

QMCPACK 3.16
Input: FeCO6_b3lyp_gms
Total Execution Time - Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 211.60

QMCPACK 3.16
Input: simple-H2O
Total Execution Time - Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 28.04

QMCPACK 3.16
Input: Li2_STO_ae
Total Execution Time - Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 112.61

NWChem 7.0.2
Input: C240 Buckyball
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 1940.2

Pennant 1.0.1
Test: leblancbig
Hydro Cycle Time - Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 6.720537

Pennant 1.0.1
Test: sedovbig
Hydro Cycle Time - Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 9.206490

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 84.47

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 78.50

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 138.01

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 46.25

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 40.89

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 57.15

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 162.96

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 164.87

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 306.54

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 88.05

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 81.44

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 186.36

Laghos 3.1
Test: Sedov Blast Wave, ube_922_hex.mesh
Major Kernels Total Rate > Higher Is Better
m7g.16xlarge Graviton3 . 410.55

Laghos 3.1
Test: Triple Point Problem
Major Kernels Total Rate > Higher Is Better
m7g.16xlarge Graviton3 . 232.01

Algebraic Multi-Grid Benchmark 1.2
Figure Of Merit > Higher Is Better
m7g.16xlarge Graviton3 . 1646761667

Rodinia 3.1
Test: OpenMP Streamcluster
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 11.66

Rodinia 3.1
Test: OpenMP CFD Solver
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 4.375

Rodinia 3.1
Test: OpenMP LavaMD
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 43.79

LeelaChessZero 0.28
Backend: Eigen
Nodes Per Second > Higher Is Better
m7g.16xlarge Graviton3 . 1398

LeelaChessZero 0.28
Backend: BLAS
Nodes Per Second > Higher Is Better
m7g.16xlarge Graviton3 . 1301

NAS Parallel Benchmarks 3.4
Test / Class: SP.C
Total Mop/s > Higher Is Better
m7g.16xlarge Graviton3 . 17244.85

NAS Parallel Benchmarks 3.4
Test / Class: MG.C
Total Mop/s > Higher Is Better
m7g.16xlarge Graviton3 . 50126.29

NAS Parallel Benchmarks 3.4
Test / Class: LU.C
Total Mop/s > Higher Is Better
m7g.16xlarge Graviton3 . 28341.68

NAS Parallel Benchmarks 3.4
Test / Class: EP.D
Total Mop/s > Higher Is Better
m7g.16xlarge Graviton3 . 3738.98

NAS Parallel Benchmarks 3.4
Test / Class: CG.C
Total Mop/s > Higher Is Better
m7g.16xlarge Graviton3 . 21988.99

High Performance Conjugate Gradient 3.1
X Y Z: 160 160 160 - RT: 60
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 33.82

High Performance Conjugate Gradient 3.1
X Y Z: 144 144 144 - RT: 60
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 33.79

Stockfish 15
Total Time
Nodes Per Second > Higher Is Better
m7g.16xlarge Graviton3 . 112119711

Rodinia 3.1
Test: OpenMP Leukocyte
Seconds < Lower Is Better

Rodinia 3.1
Test: OpenMP HotSpot3D
Seconds < Lower Is Better