Benchmarks by Michael Larabel for a future article on Phoronix.com.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2308110-NE-2307106NE96
Amazon AWS Graviton3E vs. Graviton 2/3 benchmarks
,,"m7g.16xlarge Graviton3","c6g.16xlarge Graviton2","c7g.16xlarge Graviton3","c7gn.16xlarge Graviton3E","c6a.16xlarge AMD Zen 3"
Processor,,ARMv8 Neoverse-V1 (64 Cores),ARMv8 Neoverse-N1 (64 Cores),ARMv8 Neoverse-V1 (64 Cores),ARMv8 Neoverse-V1 (64 Cores),AMD EPYC 7R13 (32 Cores / 64 Threads)
Motherboard,,Amazon EC2 m7g.16xlarge (1.0 BIOS),Amazon EC2 c6g.16xlarge (1.0 BIOS),Amazon EC2 c7g.16xlarge (1.0 BIOS),Amazon EC2 c7gn.16xlarge (1.0 BIOS),Amazon EC2 c6a.16xlarge (1.0 BIOS)
Chipset,,Amazon Device 0200,Amazon Device 0200,Amazon Device 0200,Amazon Device 0200,Intel 440FX 82441FX PMC
Memory,,256GB,128GB,128GB,128GB,128GB
Disk,,215GB Amazon Elastic Block Store,215GB Amazon Elastic Block Store,215GB Amazon Elastic Block Store,215GB Amazon Elastic Block Store,322GB Amazon Elastic Block Store
Network,,Amazon Elastic,Amazon Elastic,Amazon Elastic,Amazon Elastic,Amazon Elastic
OS,,Ubuntu 22.04,Ubuntu 22.04,Ubuntu 22.04,Ubuntu 22.04,Ubuntu 22.04
Kernel,,5.19.0-1025-aws (aarch64),5.19.0-1025-aws (aarch64),5.19.0-1025-aws (aarch64),5.19.0-1025-aws (aarch64),5.19.0-1025-aws (x86_64)
Compiler,,GCC 11.3.0,GCC 11.3.0,GCC 11.3.0,GCC 11.3.0,GCC 11.4.0
File-System,,ext4,ext4,ext4,ext4,ext4
System Layer,,amazon,amazon,amazon,amazon,amazon
Vulkan,,,,,,1.3.238
,,"m7g.16xlarge Graviton3","c6g.16xlarge Graviton2","c7g.16xlarge Graviton3","c7gn.16xlarge Graviton3E","c6a.16xlarge AMD Zen 3"
"NAS Parallel Benchmarks - Test / Class: LU.C (Mop/s)",HIB,28341.68,18741.90,28375.71,28369.11,95221.40
"Liquid-DSP - Threads: 32 - Buffer Length: 256 - Filter Length: 512 (samples/s)",HIB,81396667,67486333,81412000,81394000,274803333
"OpenSSL - Algorithm: RSA4096 (sign/s)",HIB,10181.9,2624.3,10181.4,10183.3,8392.4
"srsRAN Project - Test: Downlink Processor Benchmark (Mbps)",HIB,318.5,197.2,319.7,323.2,691.3
"NAS Parallel Benchmarks - Test / Class: SP.C (Mop/s)",HIB,17244.85,9711.70,17219.95,17163.11,34025.35
"Liquid-DSP - Threads: 64 - Buffer Length: 256 - Filter Length: 512 (samples/s)",HIB,162753333,134926667,162766667,162756667,460076667
"srsRAN Project - Test: PUSCH Processor Benchmark, Throughput Thread (Mbps)",HIB,95.8,63.8,95.7,97.4,215.9
"OpenSSL - Algorithm: RSA4096 (verify/s)",HIB,713859.5,214040.9,713945.9,713754.8,548396.5
"Liquid-DSP - Threads: 32 - Buffer Length: 256 - Filter Length: 57 (samples/s)",HIB,721493333,489270000,721386667,721380000,1444266667
"Graph500 - Scale: 26 (bfs max_TEPS)",HIB,1227790000,874389000,1206990000,1207760000,417777000
"Graph500 - Scale: 26 (bfs median_TEPS)",HIB,1194320000,860432000,1177710000,1175640000,410571000
"OpenSSL - Algorithm: AES-256-GCM (byte/s)",HIB,283333113630,129199593157,283373795737,351152465420,138457889450
"OpenSSL - Algorithm: AES-128-GCM (byte/s)",HIB,332033171900,158436163857,332064349843,411130469943,151449269317
"ACES DGEMM - Sustained Floating-Point Rate (GFLOP/s)",HIB,24.362353,20.417952,24.140605,24.078529,9.388050
"Stress-NG - Test: Memory Copying (Bogo Ops/s)",HIB,20484.24,11324.79,20478.67,20475.96,8080.43
"Stress-NG - Test: Matrix Math (Bogo Ops/s)",HIB,368750.67,284713.63,368671.39,369258.89,147576.41
"Stress-NG - Test: Vector Shuffle (Bogo Ops/s)",HIB,54143.40,35614.51,54472.07,54695.04,22255.84
"nekRS - Input: Kershaw (flops/rank)",HIB,3150680000,1760336667,3261853333,3302823333,4308810000
"Stress-NG - Test: Matrix 3D Math (Bogo Ops/s)",HIB,10403.93,5752.17,10813.59,10882.02,4571.96
"Monte Carlo Simulations of Ionised Nebulae - Input: Dust 2D tau100.0 (sec)",LIB,82.669,145.374,82.822,82.974,194.435
"Xcompact3d Incompact3d - Input: input.i3d 129 Cells Per Direction (sec)",LIB,3.09871038,5.63720735,3.14447999,3.11489828,7.01975288
"Stress-NG - Test: Vector Floating Point (Bogo Ops/s)",HIB,76102.55,42850.82,76178.46,76911.74,96529.51
"OpenSSL - Algorithm: SHA512 (byte/s)",HIB,32125448870,14393925490,32145914147,32126059040,15291283297
"Xcompact3d Incompact3d - Input: input.i3d 193 Cells Per Direction (sec)",LIB,13.9454180,25.8825658,13.8326693,13.7606726,30.3145288
"Rodinia - Test: OpenMP CFD Solver (sec)",LIB,4.375,6.051,4.442,4.429,9.342
"Algebraic Multi-Grid Benchmark - (Figure Of Merit)",HIB,1646761667,1035586333,1765277667,1765966333,836999300
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 (GFLOP/s)",HIB,88.0482,42.8284,88.1842,88.4551,44.3176
"Stress-NG - Test: Fused Multiply-Add (Bogo Ops/s)",HIB,63762252.76,37732190.54,63818458.61,63723431.55,30920910.92
"OpenSSL - Algorithm: ChaCha20 (byte/s)",HIB,103226784517,67292541203,103275516997,114118119423,138389378753
"Graph500 - Scale: 26 (sssp max_TEPS)",HIB,419754000,284689000,415758000,411762000,204550000
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s)",HIB,84.4739,44.9297,84.7451,85.0060,42.4394
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512 (GFLOP/s)",HIB,162.956,81.9412,163.276,163.559,82.7584
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256 (GFLOP/s)",HIB,40.8923,20.6279,40.8283,40.9708,20.8719
"OpenSSL - Algorithm: ChaCha20-Poly1305 (byte/s)",HIB,74287460990,46717636807,74318842213,79969465487,92522999373
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s)",HIB,46.2504,24.2658,46.3706,46.5300,23.5212
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 (GFLOP/s)",HIB,78.5049,40.1104,77.7685,78.1658,41.5868
"nekRS - Input: TurboPipe Periodic (flops/rank)",HIB,3976300000,2220190000,3978983333,4141440000,4337536667
"NAS Parallel Benchmarks - Test / Class: MG.C (Mop/s)",HIB,50126.29,25671.29,49742.30,49860.68,45946.81
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 (GFLOP/s)",HIB,81.4442,41.9816,81.0096,81.1671,43.5907
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128 (GFLOP/s)",HIB,306.540,209.496,301.418,300.396,158.858
"LAMMPS Molecular Dynamics Simulator - Model: Rhodopsin Protein (ns/day)",HIB,37.558,25.950,37.412,37.482,19.563
"Graph500 - Scale: 26 (sssp median_TEPS)",HIB,299497000,209350000,293826000,296164000,157688000
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 (GFLOP/s)",HIB,186.356,135.358,184.026,184.110,98.7026
"LAMMPS Molecular Dynamics Simulator - Model: 20k Atoms (ns/day)",HIB,36.927,25.171,36.862,36.838,20.342
"Pennant - Test: leblancbig (Hydro Cycle Time - sec)",LIB,6.720537,12.17683,6.961345,6.839998,9.917565
"NWChem - Input: C240 Buckyball (sec)",LIB,1940.2,2976.9,1962.7,1914,3440.4
"Pennant - Test: sedovbig (Hydro Cycle Time - sec)",LIB,9.206490,16.48050,9.422270,9.340953,16.53050
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 (GFLOP/s)",HIB,164.873,92.3996,162.010,162.361,102.652
"Liquid-DSP - Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s)",HIB,1442400000,978200000,1442366667,1442666667,1710800000
"GROMACS - Implementation: MPI CPU - Input: water_GMX50_bare (Ns/Day)",HIB,4.223,2.767,4.200,4.820,3.965
"LULESH - (z/s)",HIB,28296.378,17557.485,28708.656,28736.226,16708.258
"nginx - Connections: 500 (Reqs/sec)",HIB,255768.44,148964.69,255145.52,253518.51,165847.75
"HeFFTe - Highly Efficient FFT for Exascale - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 (GFLOP/s)",HIB,138.014,81.4498,133.514,133.422,86.3730
"NAS Parallel Benchmarks - Test / Class: CG.C (Mop/s)",HIB,21988.99,13103.62,21911.02,22155.36,20210.00
"NAS Parallel Benchmarks - Test / Class: EP.D (Mop/s)",HIB,3738.98,2216.26,3664.54,3657.67,3061.42
"QMCPACK - Input: simple-H2O (Execution Time - sec)",LIB,28.041,45.225,27.990,27.999,26.867
"srsRAN Project - Test: PUSCH Processor Benchmark, Throughput Total (Mbps)",HIB,5413.8,3938.7,5356.8,5431.2,6479.1
"GPAW - Input: Carbon Nanotube (sec)",LIB,61.831,92.760,62.083,56.440,89.818
"QMCPACK - Input: FeCO6_b3lyp_gms (Execution Time - sec)",LIB,211.60,302.19,211.32,188.28,184.10
"Monte Carlo Simulations of Ionised Nebulae - Input: Gas HII40 (sec)",LIB,13.575,20.758,13.659,13.525,12.669
"BRL-CAD - VGR Performance Metric (VGR Performance Metric)",HIB,783777,533020,789066,744743,485038
"LeelaChessZero - Backend: Eigen (Nodes/s)",HIB,1398,891,1382,1444,1152
"nginx - Connections: 1000 (Reqs/sec)",HIB,255616.04,158676.40,255552.05,256585.83,163178.67
"Kripke - (Throughput FoM)",HIB,339000400,220120233,354442733,354234067,237087650
"QMCPACK - Input: FeCO6_b3lyp_gms (Execution Time - sec)",LIB,205.72,297.94,204.77,204.25,187.32
"Remhos - Test: Sample Remap Example (sec)",LIB,14.040,20.740,14.120,14.082,22.104
"Liquid-DSP - Threads: 32 - Buffer Length: 256 - Filter Length: 32 (samples/s)",HIB,1136066667,765466667,1136133333,1136000000,1193966667
"Stress-NG - Test: Wide Vector Math (Bogo Ops/s)",HIB,1542834.94,997272.65,1535336.57,1530043.52,1380146.63
"Laghos - Test: Sedov Blast Wave, ube_922_hex.mesh (Major Kernels Rate)",HIB,410.55,322.37,408.01,423.11,275.92
"Stress-NG - Test: Vector Math (Bogo Ops/s)",HIB,217235.59,147886.14,217446.12,217567.10,221776.15
"Liquid-DSP - Threads: 64 - Buffer Length: 256 - Filter Length: 32 (samples/s)",HIB,2270500000,1531400000,2271966667,2266833333,2184866667
"Timed Godot Game Engine Compilation - Time To Compile (sec)",LIB,154.378,218.276,156.687,155.951,147.737
"LeelaChessZero - Backend: BLAS (Nodes/s)",HIB,1301,947,1333,1392,1316
"QMCPACK - Input: Li2_STO_ae (Execution Time - sec)",LIB,112.61,165.12,112.64,113.20,123.95
"Rodinia - Test: OpenMP LavaMD (sec)",LIB,43.788,62.224,43.963,44.044,64.179
"7-Zip Compression - Test: Compression Rating (MIPS)",HIB,316825,240702,311056,312009,230970
"Laghos - Test: Triple Point Problem (Major Kernels Rate)",HIB,232.01,180.80,230.68,236.22,227.40
"Coremark - CoreMark Size 666 - Iterations Per Second (Iterations/Sec)",HIB,1601880.342264,1260642.177024,1605948.674645,1611801.559265,1466587.036580
"OpenSSL - Algorithm: SHA256 (byte/s)",HIB,54212515580,42472798847,54216561263,54154218593,45857534777
"Timed Gem5 Compilation - Time To Compile (sec)",LIB,180.247,225.305,181.779,182.471,192.118
"Timed Node.js Compilation - Time To Compile (sec)",LIB,237.783,287.814,238.543,238.636,230.423
"7-Zip Compression - Test: Decompression Rating (MIPS)",HIB,285540,234202,285633,285677,235787
"Stress-NG - Test: CPU Cache (Bogo Ops/s)",HIB,3892396.34,1921785.20,3844101.98,3860335.38,1447265.35
"Stress-NG - Test: NUMA (Bogo Ops/s)",HIB,3759.10,2112.66,3523.58,3525.17,552.68
"Stockfish - Total Time (Nodes/s)",HIB,112119711,86609284,117316476,117027121,96905609
"HeFFTe - Highly Efficient FFT for Exascale - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128 (GFLOP/s)",HIB,57.1503,32.7468,55.1055,55.1038,48.9432
"Rodinia - Test: OpenMP Streamcluster (sec)",LIB,11.663,13.735,11.625,10.690,8.396
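
The raw figures above mix throughput and timing metrics, so they are easier to compare after normalizing each row to one system. The following is a minimal sketch, not part of the Phoronix Test Suite: it assumes the second column marks direction (HIB = higher is better, LIB = lower is better) and that the five value columns follow the header row order. The function name `relative_to_baseline` and the choice of the c6g.16xlarge Graviton2 column as baseline are illustrative assumptions.

```python
import csv
import io

# Two rows copied verbatim from the results table above.
SAMPLE = '''"OpenSSL - Algorithm: RSA4096 (sign/s)",HIB,10181.9,2624.3,10181.4,10183.3,8392.4
"Rodinia - Test: OpenMP CFD Solver (sec)",LIB,4.375,6.051,4.442,4.429,9.342
'''

# Column order taken from the header row of the results table.
SYSTEMS = [
    "m7g.16xlarge Graviton3",
    "c6g.16xlarge Graviton2",
    "c7g.16xlarge Graviton3",
    "c7gn.16xlarge Graviton3E",
    "c6a.16xlarge AMD Zen 3",
]


def relative_to_baseline(text, baseline_index=1):
    """Return {test name: [speedup per system]}, with the baseline at 1.0.

    Assumption: HIB rows are divided by the baseline value, LIB rows are
    inverted, so a number above 1.0 always means "faster than baseline".
    """
    results = {}
    for row in csv.reader(io.StringIO(text)):
        # Skip header rows, system-information rows, and prose lines.
        if len(row) < 3 or row[1] not in ("HIB", "LIB"):
            continue
        # Skip rows where any system has no recorded result.
        if any(cell.strip() == "" for cell in row[2:]):
            continue
        values = [float(cell) for cell in row[2:]]
        base = values[baseline_index]
        if row[1] == "HIB":
            results[row[0]] = [v / base for v in values]
        else:
            results[row[0]] = [base / v for v in values]
    return results


if __name__ == "__main__":
    for name, speedups in relative_to_baseline(SAMPLE).items():
        print(name)
        for system, speedup in zip(SYSTEMS, speedups):
            print(f"  {system}: {speedup:.2f}x")
```

Passing the full contents of this file instead of `SAMPLE` should work the same way, since rows without an HIB or LIB marker are skipped.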