Amazon EC2 c7g.4xlarge Graviton3

Graviton3 benchmarks by Michael Larabel.

c7g.4xlarge:

  Processor: ARMv8 Neoverse-V1 (16 Cores), Motherboard: Amazon EC2 c7g.4xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 32GB, Disk: 193GB Amazon Elastic Block Store, Network: Amazon Elastic

  OS: Ubuntu 22.04, Kernel: 5.15.0-1004-aws (aarch64), Compiler: GCC 11.2.0, File-System: ext4, System Layer: amazon

ampere c7g.4xlarge compar:

  Processor: Ampere ARMv8 Neoverse-N1 @ 3.00GHz (160 Cores), Motherboard: FOXCONN Mt. Collins (0ACOC017 SCP: 1.08.20210825 BIOS), Chipset: Ampere Computing LLC Device e100, Memory: 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE, Disk: 2 x 1920GB SAMSUNG MZQL21T9HCJR-00A07, Graphics: ASPEED, Monitor: PL2294H, Network: 4 x Mellanox MT27710 + 2 x Intel I350

  OS: Ubuntu 20.04, Kernel: 5.4.0-100-generic (aarch64), Vulkan: 1.1.182, Compiler: GCC 9.4.0, File-System: ext4, Screen Resolution: 1920x1080

LeelaChessZero 0.28
Backend: Eigen
Nodes Per Second > Higher Is Better
c7g.4xlarge ............... 1189 |==========================
ampere c7g.4xlarge compar . 2026 |=============================================

LeelaChessZero 0.28
Backend: BLAS
Nodes Per Second > Higher Is Better
c7g.4xlarge ............... 1103 |=========================
ampere c7g.4xlarge compar . 1978 |=============================================

Timed Node.js Compilation 17.3
Time To Compile
Seconds < Lower Is Better
c7g.4xlarge ............... 497.58 |===========================================
ampere c7g.4xlarge compar . 197.76 |=================

Timed Gem5 Compilation 21.2
Time To Compile
Seconds < Lower Is Better
c7g.4xlarge ............... 391.17 |===========================================
ampere c7g.4xlarge compar . 280.19 |===============================

Timed LLVM Compilation 13.0
Build System: Ninja
Seconds < Lower Is Better
c7g.4xlarge ............... 544.93 |===========================================
ampere c7g.4xlarge compar . 174.55 |==============

asmFish 2018-07-23
1024 Hash Memory, 26 Depth
Nodes/second > Higher Is Better
c7g.4xlarge ............... 32134123 |===========
ampere c7g.4xlarge compar . 119173049 |========================================

POV-Ray 3.7.0.7
Trace Time
Seconds < Lower Is Better
c7g.4xlarge ............... 37.86 |===========
ampere c7g.4xlarge compar . 148.59 |===========================================

TensorFlow Lite 2022-05-18
Model: NASNet Mobile
Microseconds < Lower Is Better
c7g.4xlarge ............... 11591.9 |=
ampere c7g.4xlarge compar . 661777.0 |=========================================

Timed MrBayes Analysis 3.2.7
Primate Phylogeny Analysis
Seconds < Lower Is Better
c7g.4xlarge ............... 251.40 |=================================
ampere c7g.4xlarge compar . 331.96 |===========================================

Build2 0.13
Time To Compile
Seconds < Lower Is Better
c7g.4xlarge ............... 115.02 |===========================================
ampere c7g.4xlarge compar . 91.34 |==================================

SecureMark 1.0.4
Benchmark: SecureMark-TLS
marks > Higher Is Better
c7g.4xlarge ............... 183708 |===========================================
ampere c7g.4xlarge compar . 141771 |=================================

LAMMPS Molecular Dynamics Simulator 29Oct2020
Model: 20k Atoms
ns/day > Higher Is Better
c7g.4xlarge ............... 11.43 |================
ampere c7g.4xlarge compar . 30.71 |============================================

libavif avifenc 0.10
Encoder Speed: 0
Seconds < Lower Is Better
c7g.4xlarge ............... 256.84 |==========================================
ampere c7g.4xlarge compar . 264.20 |===========================================

Ngspice 34
Circuit: C7552
Seconds < Lower Is Better
c7g.4xlarge ............... 191.29 |===================================
ampere c7g.4xlarge compar . 235.95 |===========================================

Ngspice 34
Circuit: C2670
Seconds < Lower Is Better
c7g.4xlarge ............... 198.22 |=====================================
ampere c7g.4xlarge compar . 232.16 |===========================================

ACES DGEMM 1.0
Sustained Floating-Point Rate
GFLOP/s > Higher Is Better
c7g.4xlarge ............... 5.853864 |=========================================
ampere c7g.4xlarge compar . 3.048297 |=====================

Timed PHP Compilation 7.4.2
Time To Compile
Seconds < Lower Is Better
c7g.4xlarge ............... 69.48 |============================================
ampere c7g.4xlarge compar . 70.00 |============================================

NAS Parallel Benchmarks 3.4
Test / Class: SP.C
Total Mop/s > Higher Is Better
c7g.4xlarge ............... 4467.19 |=======
ampere c7g.4xlarge compar . 25174.49 |=========================================

TensorFlow Lite 2022-05-18
Model: Inception V4
Microseconds < Lower Is Better
c7g.4xlarge ............... 41855.1 |=====
ampere c7g.4xlarge compar . 330312.0 |=========================================

TensorFlow Lite 2022-05-18
Model: SqueezeNet
Microseconds < Lower Is Better
c7g.4xlarge ............... 3257.94 |==
ampere c7g.4xlarge compar . 57204.10 |=========================================

TensorFlow Lite 2022-05-18
Model: Mobilenet Float
Microseconds < Lower Is Better
c7g.4xlarge ............... 2156.60 |===
ampere c7g.4xlarge compar . 34625.50 |=========================================

High Performance Conjugate Gradient 3.1
GFLOP/s > Higher Is Better
c7g.4xlarge ............... 26.31 |==========================
ampere c7g.4xlarge compar . 44.72 |============================================

OpenSSL 3.0
Algorithm: SHA256
byte/s > Higher Is Better
c7g.4xlarge ............... 13722045973 |====
ampere c7g.4xlarge compar . 126737105807 |=====================================

TensorFlow Lite 2022-05-18
Model: Inception ResNet V2
Microseconds < Lower Is Better
c7g.4xlarge ............... 40051.3 |====
ampere c7g.4xlarge compar . 401349.0 |=========================================

Zstd Compression 1.5.0
Compression Level: 19, Long Mode - Decompression Speed
MB/s > Higher Is Better
c7g.4xlarge ............... 3240.6 |===========================================
ampere c7g.4xlarge compar . 2465.2 |=================================

Zstd Compression 1.5.0
Compression Level: 19, Long Mode - Compression Speed
MB/s > Higher Is Better
c7g.4xlarge ............... 39.5 |=============================================
ampere c7g.4xlarge compar . 28.9 |=================================

NAS Parallel Benchmarks 3.4
Test / Class: BT.C
Total Mop/s > Higher Is Better
c7g.4xlarge ............... 10339.53 |=======
ampere c7g.4xlarge compar . 63065.20 |=========================================

NAS Parallel Benchmarks 3.4
Test / Class: LU.C
Total Mop/s > Higher Is Better
c7g.4xlarge ............... 7730.41 |========
ampere c7g.4xlarge compar . 40864.21 |=========================================

libavif avifenc 0.10
Encoder Speed: 2
Seconds < Lower Is Better
c7g.4xlarge ............... 141.70 |===================================
ampere c7g.4xlarge compar . 173.06 |===========================================

DaCapo Benchmark 9.12-MR1
Java Test: Tradebeans
msec < Lower Is Better
c7g.4xlarge ............... 3203 |=================
ampere c7g.4xlarge compar . 8612 |=============================================

7-Zip Compression 21.06
Test: Decompression Rating
MIPS > Higher Is Better
c7g.4xlarge ............... 73054 |=======
ampere c7g.4xlarge compar . 435194 |===========================================

7-Zip Compression 21.06
Test: Compression Rating
MIPS > Higher Is Better
c7g.4xlarge ............... 97824 |==================
ampere c7g.4xlarge compar . 227543 |===========================================

GROMACS 2022.1
Implementation: MPI CPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better
c7g.4xlarge ............... 1.128 |=========
ampere c7g.4xlarge compar . 5.565 |============================================

Zstd Compression 1.5.0
Compression Level: 19 - Decompression Speed
MB/s > Higher Is Better
c7g.4xlarge ............... 3050.3 |===========================================
ampere c7g.4xlarge compar . 2330.1 |=================================

Zstd Compression 1.5.0
Compression Level: 19 - Compression Speed
MB/s > Higher Is Better
c7g.4xlarge ............... 41.2 |================================
ampere c7g.4xlarge compar . 58.0 |=============================================

Zstd Compression 1.5.0
Compression Level: 3 - Decompression Speed
MB/s > Higher Is Better
c7g.4xlarge ............... 3508.5 |===========================================
ampere c7g.4xlarge compar . 2806.6 |==================================

Zstd Compression 1.5.0
Compression Level: 3 - Compression Speed
MB/s > Higher Is Better
c7g.4xlarge ............... 4639.1 |===========================================
ampere c7g.4xlarge compar . 3012.8 |============================

QuantLib 1.21
MFLOPS > Higher Is Better
c7g.4xlarge ............... 2512.7 |===========================================
ampere c7g.4xlarge compar . 1974.8 |==================================

GPAW 22.1
Input: Carbon Nanotube
Seconds < Lower Is Better
c7g.4xlarge ............... 155.18 |===========================================
ampere c7g.4xlarge compar . 65.40 |==================

NAS Parallel Benchmarks 3.4
Test / Class: IS.D
Total Mop/s > Higher Is Better
c7g.4xlarge ............... 1041.90 |=======================================
ampere c7g.4xlarge compar . 1116.18 |==========================================

Stress-NG 0.14
Test: CPU Cache
Bogo Ops/s > Higher Is Better
c7g.4xlarge ............... 64.31 |====
ampere c7g.4xlarge compar . 641.90 |===========================================

Rodinia 3.1
Test: OpenMP LavaMD
Seconds < Lower Is Better
c7g.4xlarge ............... 143.33 |===========================================
ampere c7g.4xlarge compar . 31.35 |=========

NAS Parallel Benchmarks 3.4
Test / Class: EP.D
Total Mop/s > Higher Is Better
c7g.4xlarge ............... 934.72 |======
ampere c7g.4xlarge compar . 6522.95 |==========================================

ASTC Encoder 3.2
Preset: Exhaustive
Seconds < Lower Is Better
c7g.4xlarge ............... 139.38 |===========================================
ampere c7g.4xlarge compar . 18.05 |======

simdjson 1.0
Throughput Test: DistinctUserID
GB/s > Higher Is Better
c7g.4xlarge ............... 2.69 |=============================================
ampere c7g.4xlarge compar . 1.88 |===============================

Coremark 1.0
CoreMark Size 666 - Iterations Per Second
Iterations/Sec > Higher Is Better
c7g.4xlarge ............... 405413.86 |=====
ampere c7g.4xlarge compar . 2933123.68 |=======================================

simdjson 1.0
Throughput Test: PartialTweets
GB/s > Higher Is Better
c7g.4xlarge ............... 2.62 |=============================================
ampere c7g.4xlarge compar . 1.84 |================================

ONNX Runtime 1.11
Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
c7g.4xlarge ............... 38 |===============================================

ONNX Runtime 1.11
Model: GPT-2 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
c7g.4xlarge ............... 7990 |=============================================

ONNX Runtime 1.11
Model: bertsquad-12 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
c7g.4xlarge ............... 407 |==============================================

TensorFlow Lite 2022-05-18
Model: Mobilenet Quant
Microseconds < Lower Is Better
c7g.4xlarge ............... 1502.95 |==
ampere c7g.4xlarge compar . 36866.10 |=========================================

ONNX Runtime 1.11
Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
c7g.4xlarge ............... 609 |==============================================

OpenSSL 3.0
Algorithm: RSA4096
verify/s > Higher Is Better
c7g.4xlarge ............... 178460.4 |===========
ampere c7g.4xlarge compar . 646811.2 |=========================================

OpenSSL 3.0
Algorithm: RSA4096
sign/s > Higher Is Better
c7g.4xlarge ............... 2546.4 |===============
ampere c7g.4xlarge compar . 7387.0 |===========================================

ONNX Runtime 1.11
Model: super-resolution-10 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
c7g.4xlarge ............... 2817 |=============================================

simdjson 1.0
Throughput Test: Kostya
GB/s > Higher Is Better
c7g.4xlarge ............... 1.94 |=============================================
ampere c7g.4xlarge compar . 1.48 |==================================

simdjson 1.0
Throughput Test: LargeRandom
GB/s > Higher Is Better
c7g.4xlarge ............... 0.70 |=============================================
ampere c7g.4xlarge compar . 0.56 |====================================

WebP Image Encode 1.1
Encode Settings: Quality 100, Lossless, Highest Compression
Encode Time - Seconds < Lower Is Better
c7g.4xlarge ............... 48.21 |=====================================
ampere c7g.4xlarge compar . 56.89 |============================================

Rodinia 3.1
Test: OpenMP Streamcluster
Seconds < Lower Is Better
c7g.4xlarge ............... 13.30 |=============
ampere c7g.4xlarge compar . 44.70 |============================================

DaCapo Benchmark 9.12-MR1
Java Test: H2
msec < Lower Is Better
c7g.4xlarge ............... 2951 |=======================
ampere c7g.4xlarge compar . 5747 |=============================================

Apache HTTP Server 2.4.48
Concurrent Requests: 1000
Requests Per Second > Higher Is Better
c7g.4xlarge ............... 72719.33 |=========================================

nginx 1.21.1
Concurrent Requests: 1000
Requests Per Second > Higher Is Better
c7g.4xlarge ............... 346814.75 |========================================

nginx 1.21.1
Concurrent Requests: 200
Requests Per Second > Higher Is Better
c7g.4xlarge ............... 352380.98 |========================================

Apache HTTP Server 2.4.48
Concurrent Requests: 200
Requests Per Second > Higher Is Better
c7g.4xlarge ............... 73676.95 |=========================================

nginx 1.21.1
Concurrent Requests: 100
Requests Per Second > Higher Is Better
c7g.4xlarge ............... 345710.87 |========================================

Apache HTTP Server 2.4.48
Concurrent Requests: 500
Requests Per Second > Higher Is Better
c7g.4xlarge ............... 73546.32 |=========================================

nginx 1.21.1
Concurrent Requests: 500
Requests Per Second > Higher Is Better
c7g.4xlarge ............... 346613.34 |========================================

Apache HTTP Server 2.4.48
Concurrent Requests: 100
Requests Per Second > Higher Is Better
c7g.4xlarge ............... 67231.88 |=========================================

m-queens 1.2
Time To Solve
Seconds < Lower Is Better
c7g.4xlarge ............... 66.822 |===========================================
ampere c7g.4xlarge compar . 8.161 |=====

PHPBench 0.8.1
PHP Benchmark Suite
Score > Higher Is Better
c7g.4xlarge ............... 666484 |===========================================
ampere c7g.4xlarge compar . 487006 |===============================

Stress-NG 0.14
Test: Crypto
Bogo Ops/s > Higher Is Better
c7g.4xlarge ............... 23181.81 |=====
ampere c7g.4xlarge compar . 198623.07 |========================================

Stockfish 13
Total Time
Nodes Per Second > Higher Is Better
c7g.4xlarge ............... 27608891 |=======
ampere c7g.4xlarge compar . 156545134 |========================================

ASTC Encoder 3.2
Preset: Thorough
Seconds < Lower Is Better
c7g.4xlarge ............... 13.9248 |==========================================
ampere c7g.4xlarge compar . 8.2208 |=========================

Stress-NG 0.14
Test: CPU Stress
Bogo Ops/s > Higher Is Better
c7g.4xlarge ............... 5029.71 |=====
ampere c7g.4xlarge compar . 39188.63 |=========================================

Stress-NG 0.14
Test: Memory Copying
Bogo Ops/s > Higher Is Better
c7g.4xlarge ............... 6693.32 |=====================
ampere c7g.4xlarge compar . 12885.53 |=========================================

Google SynthMark 20201109
Test: VoiceMark_100
Voices > Higher Is Better
c7g.4xlarge ............... 675.64 |===========================================
ampere c7g.4xlarge compar . 561.60 |====================================

Stress-NG 0.14
Test: Vector Math
Bogo Ops/s > Higher Is Better
c7g.4xlarge ............... 55258.17 |======
ampere c7g.4xlarge compar . 394967.53 |========================================

Stress-NG 0.14
Test: Matrix Math
Bogo Ops/s > Higher Is Better
c7g.4xlarge ............... 80088.74 |====
ampere c7g.4xlarge compar . 735128.78 |========================================

PyBench 2018-02-16
Total For Average Test Times
Milliseconds < Lower Is Better
c7g.4xlarge ............... 1185 |======================================
ampere c7g.4xlarge compar . 1421 |=============================================

Timed Apache Compilation 2.4.41
Time To Compile
Seconds < Lower Is Better
c7g.4xlarge ............... 26.94 |====================================
ampere c7g.4xlarge compar . 32.92 |============================================

Algebraic Multi-Grid Benchmark 1.2
Figure Of Merit > Higher Is Better
c7g.4xlarge ............... 1258807333 |======================
ampere c7g.4xlarge compar . 2194008000 |=======================================

Timed ImageMagick Compilation 6.9.0
Time To Compile
Seconds < Lower Is Better
c7g.4xlarge ............... 27.90 |============================================
ampere c7g.4xlarge compar . 25.00 |=======================================

WebP Image Encode 1.1
Encode Settings: Quality 100, Lossless
Encode Time - Seconds < Lower Is Better
c7g.4xlarge ............... 22.77 |======================================
ampere c7g.4xlarge compar . 26.54 |============================================

Xcompact3d Incompact3d 2021-03-11
Input: input.i3d 193 Cells Per Direction
Seconds < Lower Is Better
c7g.4xlarge ............... 29.13 |============================================
ampere c7g.4xlarge compar . 12.09 |==================

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
c7g.4xlarge ............... 38.517 |===========================================
ampere c7g.4xlarge compar . 5.258 |======

NAS Parallel Benchmarks 3.4
Test / Class: FT.C
Total Mop/s > Higher Is Better
c7g.4xlarge ............... 11791.77 |==========
ampere c7g.4xlarge compar . 46649.69 |=========================================

Liquid-DSP 2021.01.31
Threads: 16 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
c7g.4xlarge ............... 383606667 |========================================
ampere c7g.4xlarge compar . 319906667 |=================================

Rodinia 3.1
Test: OpenMP CFD Solver
Seconds < Lower Is Better
c7g.4xlarge ............... 10.48 |================
ampere c7g.4xlarge compar . 28.48 |============================================

LULESH 2.0.3
z/s > Higher Is Better
c7g.4xlarge ............... 10940.94 |==============
ampere c7g.4xlarge compar . 31552.73 |=========================================

N-Queens 1.0
Elapsed Time
Seconds < Lower Is Better
c7g.4xlarge ............... 21.536 |===========================================
ampere c7g.4xlarge compar . 1.936 |====

NAS Parallel Benchmarks 3.4
Test / Class: CG.C
Total Mop/s > Higher Is Better
c7g.4xlarge ............... 6571.95 |===========
ampere c7g.4xlarge compar . 24449.50 |=========================================

Stress-NG 0.14
Test: IO_uring
Bogo Ops/s > Higher Is Better
c7g.4xlarge ............... 843015.78 |========================================

LAMMPS Molecular Dynamics Simulator 29Oct2020
Model: Rhodopsin Protein
ns/day > Higher Is Better
c7g.4xlarge ............... 11.29 |======================
ampere c7g.4xlarge compar . 22.27 |============================================

libavif avifenc 0.10
Encoder Speed: 6, Lossless
Seconds < Lower Is Better
c7g.4xlarge ............... 11.908 |===========================================
ampere c7g.4xlarge compar . 8.359 |==============================

WebP Image Encode 1.1
Encode Settings: Quality 100, Highest Compression
Encode Time - Seconds < Lower Is Better
c7g.4xlarge ............... 9.346 |=======================================
ampere c7g.4xlarge compar . 10.206 |===========================================

DaCapo Benchmark 9.12-MR1
Java Test: Jython
msec < Lower Is Better
c7g.4xlarge ............... 3940 |=================================
ampere c7g.4xlarge compar . 5404 |=============================================

NAS Parallel Benchmarks 3.4
Test / Class: MG.C
Total Mop/s > Higher Is Better
c7g.4xlarge ............... 13481.61 |==========
ampere c7g.4xlarge compar . 52823.39 |=========================================

libavif avifenc 0.10
Encoder Speed: 6
Seconds < Lower Is Better
c7g.4xlarge ............... 9.385 |============================================
ampere c7g.4xlarge compar . 4.910 |=======================

DaCapo Benchmark 9.12-MR1
Java Test: Tradesoap
msec < Lower Is Better
c7g.4xlarge ............... 3524 |=============================================

Xcompact3d Incompact3d 2021-03-11
Input: input.i3d 129 Cells Per Direction
Seconds < Lower Is Better
c7g.4xlarge ............... 8.01671425 |=======================================
ampere c7g.4xlarge compar . 3.06693709 |===============

libavif avifenc 0.10
Encoder Speed: 10, Lossless
Seconds < Lower Is Better
c7g.4xlarge ............... 5.765 |=========================================
ampere c7g.4xlarge compar . 6.170 |============================================

TSCP 1.81
AI Chess Performance
Nodes Per Second > Higher Is Better
c7g.4xlarge ............... 1370094 |==========================================
ampere c7g.4xlarge compar . 1041565 |================================
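
Because some results are "Higher Is Better" and others "Lower Is Better", the raw numbers cannot be compared in one direction. The sketch below is one possible way to normalize a pair of results into a single ratio where a value above 1 means c7g.4xlarge came out ahead. The function name `speedup` and the choice of sample rows are illustrative assumptions, not part of the Phoronix Test Suite output; the numbers are copied from the results above. Note that the two systems have 16 and 160 cores respectively, so the ratio says nothing about per-core performance.

```python
def speedup(c7g: float, ampere: float, higher_is_better: bool) -> float:
    """Ratio of c7g.4xlarge to the Ampere system; > 1 means c7g.4xlarge is ahead.

    Hypothetical helper: for "Lower Is Better" results (times), the ratio is
    inverted so that a shorter time still yields a value above 1.
    """
    return c7g / ampere if higher_is_better else ampere / c7g


# (test, c7g.4xlarge result, ampere result, higher_is_better), taken from the listing
samples = [
    ("LeelaChessZero 0.28, Backend: Eigen", 1189, 2026, True),
    ("Timed LLVM Compilation 13.0, Ninja", 544.93, 174.55, False),
    ("POV-Ray 3.7.0.7, Trace Time", 37.86, 148.59, False),
    ("simdjson 1.0, Kostya", 1.94, 1.48, True),
]

for name, c7g, ampere, higher_is_better in samples:
    ratio = speedup(c7g, ampere, higher_is_better)
    print(f"{name}: {ratio:.2f}x")
```

Tests that list only a c7g.4xlarge result (for example the ONNX Runtime, nginx, and Apache HTTP Server runs) have no counterpart to compare against and would need to be skipped.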