Amazon AWS

amazon testing on Ubuntu 22.04 via the Phoronix Test Suite.

m7g.16xlarge Graviton3:

  Processor: ARMv8 Neoverse-V1 (64 Cores), Motherboard: Amazon EC2 m7g.16xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 256GB, Disk: 215GB Amazon Elastic Block Store, Network: Amazon Elastic

  OS: Ubuntu 22.04, Kernel: 5.19.0-1025-aws (aarch64), Compiler: GCC 11.3.0, File-System: ext4, System Layer: amazon

c6g.16xlarge Graviton2:

  Processor: ARMv8 Neoverse-N1 (64 Cores), Motherboard: Amazon EC2 c6g.16xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 128GB, Disk: 215GB Amazon Elastic Block Store, Network: Amazon Elastic

  OS: Ubuntu 22.04, Kernel: 5.19.0-1025-aws (aarch64), Compiler: GCC 11.3.0, File-System: ext4, System Layer: amazon

OpenSSL 3.1
Algorithm: RSA4096
sign/s > Higher Is Better
m7g.16xlarge Graviton3 . 10181.9 |=============================================
c6g.16xlarge Graviton2 . 2624.3 |============

OpenSSL 3.1
Algorithm: RSA4096
verify/s > Higher Is Better
m7g.16xlarge Graviton3 . 713859.5 |============================================
c6g.16xlarge Graviton2 . 214040.9 |=============

OpenSSL 3.1
Algorithm: SHA512
byte/s > Higher Is Better
m7g.16xlarge Graviton3 . 32125448870 |=========================================
c6g.16xlarge Graviton2 . 14393925490 |==================

OpenSSL 3.1
Algorithm: AES-256-GCM
byte/s > Higher Is Better
m7g.16xlarge Graviton3 . 283333113630 |========================================
c6g.16xlarge Graviton2 . 129199593157 |==================

OpenSSL 3.1
Algorithm: AES-128-GCM
byte/s > Higher Is Better
m7g.16xlarge Graviton3 . 332033171900 |========================================
c6g.16xlarge Graviton2 . 158436163857 |===================

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 88.05 |===============================================
c6g.16xlarge Graviton2 . 42.83 |=======================

Stress-NG 0.15.10
Test: CPU Cache
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 3892396.34 |==========================================
c6g.16xlarge Graviton2 . 1921785.20 |=====================

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 162.96 |==============================================
c6g.16xlarge Graviton2 . 81.94 |=======================

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 40.89 |===============================================
c6g.16xlarge Graviton2 . 20.63 |========================

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 78.50 |===============================================
c6g.16xlarge Graviton2 . 40.11 |========================

NAS Parallel Benchmarks 3.4
Test / Class: MG.C
Total Mop/s > Higher Is Better
m7g.16xlarge Graviton3 . 50126.29 |============================================
c6g.16xlarge Graviton2 . 25671.29 |=======================

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 81.44 |===============================================
c6g.16xlarge Graviton2 . 41.98 |========================

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 46.25 |===============================================
c6g.16xlarge Graviton2 . 24.27 |=========================

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 84.47 |===============================================
c6g.16xlarge Graviton2 . 44.93 |=========================

Xcompact3d Incompact3d 2021-03-11
Input: input.i3d 193 Cells Per Direction
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 13.95 |=========================
c6g.16xlarge Graviton2 . 25.88 |===============================================

Xcompact3d Incompact3d 2021-03-11
Input: input.i3d 129 Cells Per Direction
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 3.09871038 |=======================
c6g.16xlarge Graviton2 . 5.63720735 |==========================================

Pennant 1.0.1
Test: leblancbig
Hydro Cycle Time - Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 6.720537 |========================
c6g.16xlarge Graviton2 . 12.176830 |===========================================

Stress-NG 0.15.10
Test: Memory Copying
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 20484.24 |============================================
c6g.16xlarge Graviton2 . 11324.79 |========================

Stress-NG 0.15.10
Test: Matrix 3D Math
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 10403.93 |============================================
c6g.16xlarge Graviton2 . 5752.17 |========================

nekRS 23.0
Input: TurboPipe Periodic
flops/rank > Higher Is Better
m7g.16xlarge Graviton3 . 3976300000 |==========================================
c6g.16xlarge Graviton2 . 2220190000 |=======================

Pennant 1.0.1
Test: sedovbig
Hydro Cycle Time - Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 9.206490 |========================
c6g.16xlarge Graviton2 . 16.480500 |===========================================

nekRS 23.0
Input: Kershaw
flops/rank > Higher Is Better
m7g.16xlarge Graviton3 . 3150680000 |==========================================
c6g.16xlarge Graviton2 . 1760336667 |=======================

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 164.87 |==============================================
c6g.16xlarge Graviton2 . 92.40 |==========================

Stress-NG 0.15.10
Test: NUMA
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 3759.10 |=============================================
c6g.16xlarge Graviton2 . 2112.66 |=========================

Stress-NG 0.15.10
Test: Vector Floating Point
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 76102.55 |============================================
c6g.16xlarge Graviton2 . 42850.82 |=========================

NAS Parallel Benchmarks 3.4
Test / Class: SP.C
Total Mop/s > Higher Is Better
m7g.16xlarge Graviton3 . 17244.85 |============================================
c6g.16xlarge Graviton2 . 9711.70 |=========================

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3
Input: Dust 2D tau100.0
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 82.67 |==========================
c6g.16xlarge Graviton2 . 145.37 |==============================================

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 57.15 |===============================================
c6g.16xlarge Graviton2 . 32.75 |===========================

nginx 1.23.2
Connections: 500
Requests Per Second > Higher Is Better
m7g.16xlarge Graviton3 . 255768.44 |===========================================
c6g.16xlarge Graviton2 . 148964.69 |=========================

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 138.01 |==============================================
c6g.16xlarge Graviton2 . 81.45 |===========================

Stress-NG 0.15.10
Test: Fused Multiply-Add
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 63762252.76 |=========================================
c6g.16xlarge Graviton2 . 37732190.54 |========================

NAS Parallel Benchmarks 3.4
Test / Class: EP.D
Total Mop/s > Higher Is Better
m7g.16xlarge Graviton3 . 3738.98 |=============================================
c6g.16xlarge Graviton2 . 2216.26 |===========================

NAS Parallel Benchmarks 3.4
Test / Class: CG.C
Total Mop/s > Higher Is Better
m7g.16xlarge Graviton3 . 21988.99 |============================================
c6g.16xlarge Graviton2 . 13103.62 |==========================

srsRAN Project 23.5
Test: Downlink Processor Benchmark
Mbps > Higher Is Better
m7g.16xlarge Graviton3 . 318.5 |===============================================
c6g.16xlarge Graviton2 . 197.2 |=============================

QMCPACK 3.16
Input: simple-H2O
Total Execution Time - Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 28.04 |=============================
c6g.16xlarge Graviton2 . 45.23 |===============================================

LULESH 2.0.3
z/s > Higher Is Better
m7g.16xlarge Graviton3 . 28296.38 |============================================
c6g.16xlarge Graviton2 . 17557.49 |===========================

nginx 1.23.2
Connections: 1000
Requests Per Second > Higher Is Better
m7g.16xlarge Graviton3 . 255616.04 |===========================================
c6g.16xlarge Graviton2 . 158676.40 |===========================

Algebraic Multi-Grid Benchmark 1.2
Figure Of Merit > Higher Is Better
m7g.16xlarge Graviton3 . 1646761667 |==========================================
c6g.16xlarge Graviton2 . 1035586333 |==========================

OpenSSL 3.1
Algorithm: ChaCha20-Poly1305
byte/s > Higher Is Better
m7g.16xlarge Graviton3 . 74287460990 |=========================================
c6g.16xlarge Graviton2 . 46717636807 |==========================

LeelaChessZero 0.28
Backend: Eigen
Nodes Per Second > Higher Is Better
m7g.16xlarge Graviton3 . 1398 |================================================
c6g.16xlarge Graviton2 . 891 |===============================

Stress-NG 0.15.10
Test: Wide Vector Math
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 1542834.94 |==========================================
c6g.16xlarge Graviton2 . 997272.65 |===========================

Kripke 1.2.6
Throughput FoM > Higher Is Better
m7g.16xlarge Graviton3 . 339000400 |===========================================
c6g.16xlarge Graviton2 . 220120233 |============================

NWChem 7.0.2
Input: C240 Buckyball
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 1940.2 |==============================
c6g.16xlarge Graviton2 . 2976.9 |==============================================

OpenSSL 3.1
Algorithm: ChaCha20
byte/s > Higher Is Better
m7g.16xlarge Graviton3 . 103226784517 |========================================
c6g.16xlarge Graviton2 . 67292541203 |==========================

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3
Input: Gas HII40
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 13.58 |===============================
c6g.16xlarge Graviton2 . 20.76 |===============================================

GROMACS 2023
Implementation: MPI CPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better
m7g.16xlarge Graviton3 . 4.223 |===============================================
c6g.16xlarge Graviton2 . 2.767 |===============================

Stress-NG 0.15.10
Test: Vector Shuffle
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 54143.40 |============================================
c6g.16xlarge Graviton2 . 35614.51 |=============================

NAS Parallel Benchmarks 3.4
Test / Class: LU.C
Total Mop/s > Higher Is Better
m7g.16xlarge Graviton3 . 28341.68 |============================================
c6g.16xlarge Graviton2 . 18741.90 |=============================

srsRAN Project 23.5
Test: PUSCH Processor Benchmark, Throughput Thread
Mbps > Higher Is Better
m7g.16xlarge Graviton3 . 95.8 |================================================
c6g.16xlarge Graviton2 . 63.8 |================================

GPAW 23.6
Input: Carbon Nanotube
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 61.83 |===============================
c6g.16xlarge Graviton2 . 92.76 |===============================================

Liquid-DSP 1.6
Threads: 32 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
m7g.16xlarge Graviton3 . 1136066667 |==========================================
c6g.16xlarge Graviton2 . 765466667 |============================

Liquid-DSP 1.6
Threads: 64 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
m7g.16xlarge Graviton3 . 2270500000 |==========================================
c6g.16xlarge Graviton2 . 1531400000 |============================

Remhos 1.0
Test: Sample Remap Example
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 14.04 |================================
c6g.16xlarge Graviton2 . 20.74 |===============================================

Liquid-DSP 1.6
Threads: 32 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
m7g.16xlarge Graviton3 . 721493333 |===========================================
c6g.16xlarge Graviton2 . 489270000 |=============================

Liquid-DSP 1.6
Threads: 64 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
m7g.16xlarge Graviton3 . 1442400000 |==========================================
c6g.16xlarge Graviton2 . 978200000 |============================

Graph500 3.0
Scale: 26
sssp max_TEPS > Higher Is Better
m7g.16xlarge Graviton3 . 419754000 |===========================================
c6g.16xlarge Graviton2 . 284689000 |=============================

BRL-CAD 7.34
VGR Performance Metric
VGR Performance Metric > Higher Is Better
m7g.16xlarge Graviton3 . 783777 |==============================================
c6g.16xlarge Graviton2 . 533020 |===============================

Stress-NG 0.15.10
Test: Vector Math
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 217235.59 |===========================================
c6g.16xlarge Graviton2 . 147886.14 |=============================

LAMMPS Molecular Dynamics Simulator 23Jun2022
Model: 20k Atoms
ns/day > Higher Is Better
m7g.16xlarge Graviton3 . 36.93 |===============================================
c6g.16xlarge Graviton2 . 25.17 |================================

QMCPACK 3.16
Input: Li2_STO_ae
Total Execution Time - Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 112.61 |===============================
c6g.16xlarge Graviton2 . 165.12 |==============================================

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 306.54 |==============================================
c6g.16xlarge Graviton2 . 209.50 |===============================

QMCPACK 3.16
Input: FeCO6_b3lyp_gms
Total Execution Time - Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 205.72 |================================
c6g.16xlarge Graviton2 . 297.94 |==============================================

LAMMPS Molecular Dynamics Simulator 23Jun2022
Model: Rhodopsin Protein
ns/day > Higher Is Better
m7g.16xlarge Graviton3 . 37.56 |===============================================
c6g.16xlarge Graviton2 . 25.95 |================================

Graph500 3.0
Scale: 26
sssp median_TEPS > Higher Is Better
m7g.16xlarge Graviton3 . 299497000 |===========================================
c6g.16xlarge Graviton2 . 209350000 |==============================

QMCPACK 3.16
Input: FeCO6_b3lyp_gms
Total Execution Time - Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 211.60 |================================
c6g.16xlarge Graviton2 . 302.19 |==============================================

Rodinia 3.1
Test: OpenMP LavaMD
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 43.79 |=================================
c6g.16xlarge Graviton2 . 62.22 |===============================================

Timed Godot Game Engine Compilation 4.0
Time To Compile
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 154.38 |=================================
c6g.16xlarge Graviton2 . 218.28 |==============================================

Graph500 3.0
Scale: 26
bfs max_TEPS > Higher Is Better
m7g.16xlarge Graviton3 . 1227790000 |==========================================
c6g.16xlarge Graviton2 . 874389000 |==============================

Graph500 3.0
Scale: 26
bfs median_TEPS > Higher Is Better
m7g.16xlarge Graviton3 . 1194320000 |==========================================
c6g.16xlarge Graviton2 . 860432000 |==============================

Rodinia 3.1
Test: OpenMP CFD Solver
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 4.375 |==================================
c6g.16xlarge Graviton2 . 6.051 |===============================================

HeFFTe - Highly Efficient FFT for Exascale 2.3
Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 186.36 |==============================================
c6g.16xlarge Graviton2 . 135.36 |=================================

srsRAN Project 23.5
Test: PUSCH Processor Benchmark, Throughput Total
Mbps > Higher Is Better
m7g.16xlarge Graviton3 . 5413.8 |==============================================
c6g.16xlarge Graviton2 . 3938.7 |=================================

LeelaChessZero 0.28
Backend: BLAS
Nodes Per Second > Higher Is Better
m7g.16xlarge Graviton3 . 1301 |================================================
c6g.16xlarge Graviton2 . 947 |===================================

7-Zip Compression 22.01
Test: Compression Rating
MIPS > Higher Is Better
m7g.16xlarge Graviton3 . 316825 |==============================================
c6g.16xlarge Graviton2 . 240702 |===================================

Stress-NG 0.15.10
Test: Matrix Math
Bogo Ops/s > Higher Is Better
m7g.16xlarge Graviton3 . 368750.67 |===========================================
c6g.16xlarge Graviton2 . 284713.63 |=================================

Laghos 3.1
Test: Triple Point Problem
Major Kernels Total Rate > Higher Is Better
m7g.16xlarge Graviton3 . 232.01 |==============================================
c6g.16xlarge Graviton2 . 180.80 |====================================

OpenSSL 3.1
Algorithm: SHA256
byte/s > Higher Is Better
m7g.16xlarge Graviton3 . 54212515580 |=========================================
c6g.16xlarge Graviton2 . 42472798847 |================================

Laghos 3.1
Test: Sedov Blast Wave, ube_922_hex.mesh
Major Kernels Total Rate > Higher Is Better
m7g.16xlarge Graviton3 . 410.55 |==============================================
c6g.16xlarge Graviton2 . 322.37 |====================================

Coremark 1.0
CoreMark Size 666 - Iterations Per Second
Iterations/Sec > Higher Is Better
m7g.16xlarge Graviton3 . 1601880.34 |==========================================
c6g.16xlarge Graviton2 . 1260642.18 |=================================

Timed Gem5 Compilation 21.2
Time To Compile
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 180.25 |=====================================
c6g.16xlarge Graviton2 . 225.31 |==============================================

7-Zip Compression 22.01
Test: Decompression Rating
MIPS > Higher Is Better
m7g.16xlarge Graviton3 . 285540 |==============================================
c6g.16xlarge Graviton2 . 234202 |======================================

Timed Node.js Compilation 19.8.1
Time To Compile
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 237.78 |======================================
c6g.16xlarge Graviton2 . 287.81 |==============================================

Liquid-DSP 1.6
Threads: 64 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
m7g.16xlarge Graviton3 . 162753333 |===========================================
c6g.16xlarge Graviton2 . 134926667 |====================================

Liquid-DSP 1.6
Threads: 32 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
m7g.16xlarge Graviton3 . 81396667 |============================================
c6g.16xlarge Graviton2 . 67486333 |====================================

ACES DGEMM 1.0
Sustained Floating-Point Rate
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 24.36 |===============================================
c6g.16xlarge Graviton2 . 20.42 |=======================================

Rodinia 3.1
Test: OpenMP Streamcluster
Seconds < Lower Is Better
m7g.16xlarge Graviton3 . 11.66 |========================================
c6g.16xlarge Graviton2 . 13.74 |===============================================

Apache HTTP Server 2.4.56
Concurrent Requests: 1000
Requests Per Second > Higher Is Better
m7g.16xlarge Graviton3 . 60965.70 |========================================
c6g.16xlarge Graviton2 . 67276.83 |============================================

Apache HTTP Server 2.4.56
Concurrent Requests: 500
Requests Per Second > Higher Is Better
m7g.16xlarge Graviton3 . 71754.89 |============================================
c6g.16xlarge Graviton2 . 66640.93 |=========================================

High Performance Conjugate Gradient 3.1
X Y Z: 160 160 160 - RT: 60
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 33.82 |===============================================

High Performance Conjugate Gradient 3.1
X Y Z: 144 144 144 - RT: 60
GFLOP/s > Higher Is Better
m7g.16xlarge Graviton3 . 33.79 |===============================================

Stockfish 15
Total Time
Nodes Per Second > Higher Is Better
m7g.16xlarge Graviton3 . 112119711 |===========================================
c6g.16xlarge Graviton2 . 86609284 |=================================
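
The results above mix "Higher Is Better" and "Lower Is Better" metrics, so the raw numbers cannot be compared across tests directly. As a rough sketch (not part of the Phoronix Test Suite output), the Python below normalizes a handful of the listed results into a Graviton3-over-Graviton2 ratio and takes a geometric mean. The `RESULTS` table, `speedup`, and `geometric_mean` are hypothetical names introduced for illustration; the four sampled values are copied from the listing, and the choice of which tests to sample is arbitrary.

```python
import math

# (Graviton3 value, Graviton2 value, higher_is_better), copied from the results above.
# This is a small hand-picked sample, not the full result set.
RESULTS = {
    "OpenSSL 3.1 RSA4096 (sign/s)": (10181.9, 2624.3, True),
    "nginx 1.23.2, 500 connections (req/s)": (255768.44, 148964.69, True),
    "Timed Gem5 Compilation 21.2 (s)": (180.25, 225.31, False),
    "Apache HTTP Server 2.4.56, 1000 concurrent (req/s)": (60965.70, 67276.83, True),
}


def speedup(g3, g2, higher_is_better):
    """Return the Graviton3 / Graviton2 performance ratio (>1 means Graviton3 is faster)."""
    # For time-based metrics a smaller value is better, so the ratio is inverted.
    return g3 / g2 if higher_is_better else g2 / g3


def geometric_mean(values):
    """Geometric mean of positive ratios, the usual way to average speedups."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    ratios = {name: speedup(*row) for name, row in RESULTS.items()}
    # Print the sampled tests from largest to smallest Graviton3 advantage.
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:55s} {ratio:5.2f}x")
    gm = geometric_mean(ratios.values())
    print(f"geometric mean over {len(ratios)} sampled tests: {gm:.2f}x")
```

A ratio below 1 (as for the Apache run at 1000 concurrent requests) means Graviton2 scored higher on that test. The geometric mean is used rather than the arithmetic mean so that a single very large ratio does not dominate the summary.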