Graviton3 benchmarks by Michael Larabel.
c7g.4xlarge Processor: ARMv8 Neoverse-V1 (16 Cores), Motherboard: Amazon EC2 c7g.4xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 32GB, Disk: 193GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Ubuntu 22.04, Kernel: 5.15.0-1004-aws (aarch64), Compiler: GCC 11.2.0, File-System: ext4, System Layer: amazon
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Java Notes: OpenJDK Runtime Environment (build 11.0.15+10-Ubuntu-0ubuntu0.22.04.1)
Python Notes: Python 3.10.4
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
ampere c7g.4xlarge compar Processor: Ampere ARMv8 Neoverse-N1 @ 3.00GHz (160 Cores), Motherboard: FOXCONN Mt. Collins (0ACOC017 SCP: 1.08.20210825 BIOS), Chipset: Ampere Computing LLC Device e100, Memory: 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE, Disk: 2 x 1920GB SAMSUNG MZQL21T9HCJR-00A07, Graphics: ASPEED, Monitor: PL2294H, Network: 4 x Mellanox MT27710 + 2 x Intel I350
OS: Ubuntu 20.04, Kernel: 5.4.0-100-generic (aarch64), Vulkan: 1.1.182, Compiler: GCC 9.4.0, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Processor Notes: Scaling Governor: cppc_cpufreq performance
Java Notes: OpenJDK Runtime Environment (build 11.0.15+10-Ubuntu-0ubuntu0.20.04.1)
Python Notes: Python 3.8.10
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
[Figure: c7g.4xlarge vs. ampere c7g.4xlarge compar Comparison (Phoronix Test Suite). Bar chart of the per-test percentage difference between the two systems relative to the baseline, covering the tests listed in the result overview.]
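The comparison chart expresses each test as a percentage lead of one system over the other. The following is a minimal sketch of that arithmetic, assuming the lead is computed as (larger / smaller - 1) x 100 and that the winner depends on whether the metric is "More Is Better" or "Fewer Is Better". The function name `relative_lead` and the "a"/"b" labels are illustrative choices, not part of the Phoronix Test Suite.

```python
def relative_lead(value_a: float, value_b: float, more_is_better: bool = True):
    """Return (winner, lead_percent) for two benchmark results.

    winner is "a" or "b"; lead_percent is how far the winner is ahead,
    computed as (larger / smaller - 1) * 100 (an assumed formula).
    """
    if value_a <= 0 or value_b <= 0:
        raise ValueError("benchmark results must be positive")
    if more_is_better:
        winner = "a" if value_a >= value_b else "b"
    else:
        winner = "a" if value_a <= value_b else "b"
    larger, smaller = max(value_a, value_b), min(value_a, value_b)
    return winner, (larger / smaller - 1.0) * 100.0


# Values taken from this page; "a" is c7g.4xlarge, "b" is the Ampere system.
examples = [
    ("7-Zip Decompression Rating (MIPS)", 73054, 435194, True),
    ("ASTC Encoder Exhaustive (seconds)", 139.3797, 18.0535, False),
    ("C-Ray Total Time (seconds)", 38.517, 5.258, False),
]
for name, c7g, ampere, more_is_better in examples:
    winner, lead = relative_lead(c7g, ampere, more_is_better)
    label = "c7g.4xlarge" if winner == "a" else "ampere c7g.4xlarge compar"
    print(f"{name}: {label} leads by {lead:.1f}%")
```

With the 7-Zip decompression ratings reported on this page (73054 vs. 435194 MIPS), this gives a lead of roughly 495.7% for the Ampere system.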
Amazon EC2 c7g.4xlarge Graviton3 result overview
Test | c7g.4xlarge | ampere c7g.4xlarge compar
compress-7zip: Compression Rating | 97824 | 227543
compress-7zip: Decompression Rating | 73054 | 435194
mt-dgemm: Sustained Floating-Point Rate | 5.853864 | 3.048297
amg: | 1258807333 | 2194008000
apache: 100 | 67231.88 | no result
apache: 200 | 73676.95 | no result
apache: 500 | 73546.32 | no result
apache: 1000 | 72719.33 | no result
asmfish: 1024 Hash Memory, 26 Depth | 32134123 | 119173049
astcenc: Thorough | 13.9248 | 8.2208
astcenc: Exhaustive | 139.3797 | 18.0535
build2: Time To Compile | 115.020 | 91.335
c-ray: Total Time - 4K, 16 Rays Per Pixel | 38.517 | 5.258
coremark: CoreMark Size 666 - Iterations Per Second | 405413.860554 | 2933123.678773
dacapobench: H2 | 2951 | 5747
dacapobench: Jython | 3940 | 5404
dacapobench: Tradesoap | 3524 | no result
dacapobench: Tradebeans | 3203 | 8612
synthmark: VoiceMark_100 | 675.635 | 561.595
gpaw: Carbon Nanotube | 155.180 | 65.402
gromacs: MPI CPU - water_GMX50_bare | 1.128 | 5.565
hpcg: | 26.3058 | 44.7232
lammps: 20k Atoms | 11.425 | 30.713
lammps: Rhodopsin Protein | 11.291 | 22.272
lczero: BLAS | 1103 | 1978
lczero: Eigen | 1189 | 2026
avifenc: 0 | 256.841 | 264.203
avifenc: 2 | 141.698 | 173.056
avifenc: 6 | 9.385 | 4.910
avifenc: 6, Lossless | 11.908 | 8.359
avifenc: 10, Lossless | 5.765 | 6.170
liquid-dsp: 16 - 256 - 57 | 383606667 | 319906667
lulesh: | 10940.939 | 31552.726
m-queens: Time To Solve | 66.822 | 8.161
n-queens: Elapsed Time | 21.536 | 1.936
npb: BT.C | 10339.53 | 63065.20
npb: CG.C | 6571.95 | 24449.50
npb: EP.D | 934.72 | 6522.95
npb: FT.C | 11791.77 | 46649.69
npb: IS.D | 1041.90 | 1116.18
npb: LU.C | 7730.41 | 40864.21
npb: MG.C | 13481.61 | 52823.39
npb: SP.C | 4467.19 | 25174.49
nginx: 100 | 345710.87 | no result
nginx: 200 | 352380.98 | no result
nginx: 500 | 346613.34 | no result
nginx: 1000 | 346814.75 | no result
ngspice: C2670 | 198.224 | 232.159
ngspice: C7552 | 191.286 | 235.945
onnx: GPT-2 - CPU - Standard | 7990 | no result
onnx: bertsquad-12 - CPU - Standard | 407 | no result
onnx: fcn-resnet101-11 - CPU - Standard | 38 | no result
onnx: ArcFace ResNet-100 - CPU - Standard | 609 | no result
onnx: super-resolution-10 - CPU - Standard | 2817 | no result
openssl: SHA256 | 13722045973 | 126737105807
openssl: RSA4096 (sign/s) | 2546.4 | 7387.0
openssl: RSA4096 (verify/s) | 178460.4 | 646811.2
phpbench: PHP Benchmark Suite | 666484 | 487006
povray: Trace Time | 37.863 | 148.591
pybench: Total For Average Test Times | 1185 | 1421
quantlib: | 2512.7 | 1974.8
rodinia: OpenMP LavaMD | 143.334 | 31.353
rodinia: OpenMP CFD Solver | 10.478 | 28.484
rodinia: OpenMP Streamcluster | 13.296 | 44.695
securemark: SecureMark-TLS | 183708 | 141771
simdjson: Kostya | 1.94 | 1.48
simdjson: LargeRand | 0.7 | 0.56
simdjson: PartialTweets | 2.62 | 1.84
simdjson: DistinctUserID | 2.69 | 1.88
stockfish: Total Time | 27608891 | 156545134
stress-ng: Crypto | 23181.81 | 198623.07
stress-ng: IO_uring | 843015.78 | no result
stress-ng: CPU Cache | 64.31 | 641.90
stress-ng: CPU Stress | 5029.71 | 39188.63
stress-ng: Matrix Math | 80088.74 | 735128.78
stress-ng: Vector Math | 55258.17 | 394967.53
stress-ng: Memory Copying | 6693.32 | 12885.53
tensorflow-lite: SqueezeNet | 3257.94 | 57204.1
tensorflow-lite: Inception V4 | 41855.1 | 330312
tensorflow-lite: NASNet Mobile | 11591.9 | 661777
tensorflow-lite: Mobilenet Float | 2156.60 | 34625.5
tensorflow-lite: Mobilenet Quant | 1502.95 | 36866.1
tensorflow-lite: Inception ResNet V2 | 40051.3 | 401349
build-apache: Time To Compile | 26.940 | 32.922
build-gem5: Time To Compile | 391.171 | 280.188
build-imagemagick: Time To Compile | 27.904 | 24.997
build-llvm: Ninja | 544.929 | 174.549
mrbayes: Primate Phylogeny Analysis | 251.397 | 331.963
build-nodejs: Time To Compile | 497.579 | 197.761
build-php: Time To Compile | 69.483 | 70.000
tscp: AI Chess Performance | 1370094 | 1041565
webp: Quality 100, Lossless | 22.769 | 26.536
webp: Quality 100, Highest Compression | 9.346 | 10.206
webp: Quality 100, Lossless, Highest Compression | 48.208 | 56.888
incompact3d: input.i3d 129 Cells Per Direction | 8.01671425 | 3.06693709
incompact3d: input.i3d 193 Cells Per Direction | 29.1258570 | 12.0899995
compress-zstd: 3 - Compression Speed | 4639.1 | 3012.8
compress-zstd: 3 - Decompression Speed | 3508.5 | 2806.6
compress-zstd: 19 - Compression Speed | 41.2 | 58.0
compress-zstd: 19 - Decompression Speed | 3050.3 | 2330.1
compress-zstd: 19, Long Mode - Compression Speed | 39.5 | 28.9
compress-zstd: 19, Long Mode - Decompression Speed | 3240.6 | 2465.2
7-Zip Compression 21.06 - Test: Decompression Rating (MIPS, More Is Better): ampere c7g.4xlarge compar 435194 (SE +/- 4896.50, N = 15); c7g.4xlarge 73054 (SE +/- 12.88, N = 3). (CXX) g++ options: -lpthread -ldl -O2 -fPIC
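Each result on this page is reported with a standard error (SE) and a run count N. As a rough sketch of how such a figure can be derived from individual runs, the snippet below assumes the SE is the sample standard deviation divided by the square root of N; whether the Phoronix Test Suite uses exactly this estimator is an assumption here, and the three run values are hypothetical, chosen only so that their mean matches the reported 73054 MIPS.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample stdev / sqrt(N) (assumed estimator)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-run MIPS values (not from the source page).
runs = [73030.0, 73060.0, 73072.0]
mean = statistics.fmean(runs)
se = standard_error(runs)
print(f"{mean:.0f} MIPS, SE +/- {se:.2f}, N = {len(runs)}")
```

A small SE relative to the mean, as in most results here, indicates that the runs were tightly clustered; a larger N (such as 15) means the test was repeated more often than the usual 3 runs.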
Algebraic Multi-Grid Benchmark AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better): ampere c7g.4xlarge compar 2194008000 (SE +/- 3186022.65, N = 3); c7g.4xlarge 1258807333 (SE +/- 952437.28, N = 3). (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi; additionally listed on the chart: -pthread
Apache HTTP Server This is a test of the Apache HTTPD web server. This Apache HTTPD web server benchmark test profile makes use of the Golang "Bombardier" program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.
Apache HTTP Server 2.4.48 (Requests Per Second, More Is Better). (CC) gcc options: -shared -fPIC -O2
Concurrent Requests: 100: c7g.4xlarge 67231.88 (SE +/- 38.09, N = 3)
Concurrent Requests: 200: c7g.4xlarge 73676.95 (SE +/- 649.31, N = 3)
Concurrent Requests: 500: c7g.4xlarge 73546.32 (SE +/- 89.82, N = 3)
Concurrent Requests: 1000: c7g.4xlarge 72719.33 (SE +/- 83.83, N = 3)
ampere c7g.4xlarge compar (all four concurrency levels): The test quit with a non-zero exit status (reported three times per level). E: ./apache: 2: /go/bin/bombardier: not found
ASTC Encoder ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.
ASTC Encoder 3.2 (Seconds, Fewer Is Better). (CXX) g++ options: -O3 -flto -pthread
Preset: Thorough: ampere c7g.4xlarge compar 8.2208 (SE +/- 0.0795, N = 15); c7g.4xlarge 13.9248 (SE +/- 0.0011, N = 3)
Preset: Exhaustive: ampere c7g.4xlarge compar 18.05 (SE +/- 0.15, N = 3); c7g.4xlarge 139.38 (SE +/- 0.01, N = 3)
Build2 This test profile measures the time to bootstrap/install the build2 C++ build toolchain from source. Build2 is a cross-platform build toolchain for C/C++ code that offers Cargo-like features. Learn more via the OpenBenchmarking.org test page.
Build2 0.13 - Time To Compile (Seconds, Fewer Is Better): ampere c7g.4xlarge compar 91.34 (SE +/- 1.87, N = 15); c7g.4xlarge 115.02 (SE +/- 0.64, N = 3)
C-Ray This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.
C-Ray 1.1 - Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better): ampere c7g.4xlarge compar 5.258 (SE +/- 0.014, N = 3); c7g.4xlarge 38.517 (SE +/- 0.016, N = 3). (CC) gcc options: -lm -lpthread -O3
DaCapo Benchmark 9.12-MR1 - Java Test: Tradesoap (msec, Fewer Is Better): c7g.4xlarge 3524 (SE +/- 14.95, N = 4)
ampere c7g.4xlarge compar: The test run did not produce a result (reported four times).
Google SynthMark SynthMark is a cross platform tool for benchmarking CPU performance under a variety of real-time audio workloads. It uses a polyphonic synthesizer model to provide standardized tests for latency, jitter and computational throughput. Learn more via the OpenBenchmarking.org test page.
Google SynthMark 20201109 - Test: VoiceMark_100 (Voices, More Is Better): c7g.4xlarge 675.64 (SE +/- 0.32, N = 3); ampere c7g.4xlarge compar 561.60 (SE +/- 0.18, N = 3). (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast
GPAW GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). Learn more via the OpenBenchmarking.org test page.
GPAW 22.1 - Input: Carbon Nanotube (Seconds, Fewer Is Better): ampere c7g.4xlarge compar 65.40 (SE +/- 0.11, N = 3); c7g.4xlarge 155.18 (SE +/- 0.08, N = 3). (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi; additionally listed on the chart: -pthread
GROMACS The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package, tested here with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
GROMACS 2022.1 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better): ampere c7g.4xlarge compar 5.565 (SE +/- 0.027, N = 2); c7g.4xlarge 1.128 (SE +/- 0.002, N = 3). (CXX) g++ options: -O3; additionally listed on the chart: -pthread
LeelaChessZero LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.
LeelaChessZero 0.28 (Nodes Per Second, More Is Better). (CXX) g++ options: -flto -pthread
Backend: BLAS: ampere c7g.4xlarge compar 1978 (SE +/- 29.70, N = 9); c7g.4xlarge 1103 (SE +/- 6.44, N = 3)
Backend: Eigen: ampere c7g.4xlarge compar 2026 (SE +/- 28.10, N = 9); c7g.4xlarge 1189 (SE +/- 9.70, N = 3)
libavif avifenc 0.10 (Seconds, Fewer Is Better). (CXX) g++ options: -O3 -fPIC -lm
Encoder Speed: 2: c7g.4xlarge 141.70 (SE +/- 0.11, N = 3); ampere c7g.4xlarge compar 173.06 (SE +/- 1.54, N = 3)
Encoder Speed: 6: ampere c7g.4xlarge compar 4.910 (SE +/- 0.061, N = 4); c7g.4xlarge 9.385 (SE +/- 0.025, N = 3)
Encoder Speed: 6, Lossless: ampere c7g.4xlarge compar 8.359 (SE +/- 0.039, N = 3); c7g.4xlarge 11.908 (SE +/- 0.011, N = 3)
Encoder Speed: 10, Lossless: c7g.4xlarge 5.765 (SE +/- 0.021, N = 3); ampere c7g.4xlarge compar 6.170 (SE +/- 0.015, N = 3)
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 2021.01.31 - Threads: 16 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better): c7g.4xlarge 383606667 (SE +/- 400097.21, N = 3); ampere c7g.4xlarge compar 319906667 (SE +/- 234828.54, N = 3). (CC) gcc options: -O3 -pthread -lm -lc -lliquid
NAS Parallel Benchmarks NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.
NAS Parallel Benchmarks 3.4 (Total Mop/s, More Is Better). (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi; additionally listed on the BT.C chart: -pthread; additionally listed on the other charts: -pthread -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Test / Class: BT.C: ampere c7g.4xlarge compar 63065.20 (SE +/- 56.15, N = 3); c7g.4xlarge 10339.53 (SE +/- 7.36, N = 3)
Test / Class: CG.C: ampere c7g.4xlarge compar 24449.50 (SE +/- 224.01, N = 3); c7g.4xlarge 6571.95 (SE +/- 17.12, N = 3)
Test / Class: EP.D: ampere c7g.4xlarge compar 6522.95 (SE +/- 37.11, N = 3); c7g.4xlarge 934.72 (SE +/- 0.39, N = 3)
Test / Class: FT.C: ampere c7g.4xlarge compar 46649.69 (SE +/- 33.90, N = 2); c7g.4xlarge 11791.77 (SE +/- 1.17, N = 3)
Test / Class: IS.D: ampere c7g.4xlarge compar 1116.18 (SE +/- 24.32, N = 15); c7g.4xlarge 1041.90 (SE +/- 2.29, N = 3)
Test / Class: LU.C: ampere c7g.4xlarge compar 40864.21 (SE +/- 28.42, N = 3); c7g.4xlarge 7730.41 (SE +/- 1.96, N = 3)
Test / Class: MG.C: ampere c7g.4xlarge compar 52823.39 (SE +/- 24.22, N = 3); c7g.4xlarge 13481.61 (SE +/- 4.69, N = 3)
Test / Class: SP.C: ampere c7g.4xlarge compar 25174.49 (SE +/- 10.76, N = 3); c7g.4xlarge 4467.19 (SE +/- 9.61, N = 3)
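To summarize the eight NAS Parallel Benchmarks results in one figure, a common approach is the geometric mean of the per-test throughput ratios. The sketch below uses the Total Mop/s values reported on this page; the choice of a geometric mean and the variable names are illustrative assumptions, not something the result page itself computes here. Note that the ratios compare whole systems (160 cores vs. 16 cores), not per-core performance.

```python
import statistics

# Total Mop/s from this page: test -> (c7g.4xlarge, ampere c7g.4xlarge compar)
npb_mops = {
    "BT.C": (10339.53, 63065.20),
    "CG.C": (6571.95, 24449.50),
    "EP.D": (934.72, 6522.95),
    "FT.C": (11791.77, 46649.69),
    "IS.D": (1041.90, 1116.18),
    "LU.C": (7730.41, 40864.21),
    "MG.C": (13481.61, 52823.39),
    "SP.C": (4467.19, 25174.49),
}

# Ratio > 1 means the Ampere system delivered more Mop/s on that test.
ratios = {test: ampere / c7g for test, (c7g, ampere) in npb_mops.items()}
geomean = statistics.geometric_mean(list(ratios.values()))

for test, ratio in ratios.items():
    print(f"{test}: {ratio:.2f}x")
print(f"Geometric mean of ratios: {geomean:.2f}x")
```

The geometric mean is used rather than the arithmetic mean so that a single outlier such as EP.D or IS.D does not dominate the summary.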
nginx This is a benchmark of the lightweight Nginx HTTP(S) web-server. This Nginx web server benchmark test profile makes use of the Golang "Bombardier" program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.
nginx 1.21.1 (Requests Per Second, More Is Better). (CC) gcc options: -lcrypt -lz -O3 -march=native
Concurrent Requests: 100: c7g.4xlarge 345710.87 (SE +/- 2009.97, N = 3)
Concurrent Requests: 200: c7g.4xlarge 352380.98 (SE +/- 3986.77, N = 3)
Concurrent Requests: 500: c7g.4xlarge 346613.34 (SE +/- 1017.52, N = 3)
Concurrent Requests: 1000: c7g.4xlarge 346814.75 (SE +/- 1410.11, N = 3)
ampere c7g.4xlarge compar (all four concurrency levels): The test quit with a non-zero exit status (reported three times per level). E: ./nginx: 2: /go/bin/bombardier: not found
Ngspice Ngspice is an open-source SPICE circuit simulator. Ngspice was originally based on the Berkeley SPICE electronic circuit simulator. Ngspice supports basic threading using OpenMP. This test profile is making use of the ISCAS 85 benchmark circuits. Learn more via the OpenBenchmarking.org test page.
Ngspice 34 (Seconds, Fewer Is Better). (CC) gcc options: -O0 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lSM -lICE; additionally listed on the charts: -lXft -lfontconfig -lXrender -lfreetype
Circuit: C2670: c7g.4xlarge 198.22 (SE +/- 0.86, N = 3); ampere c7g.4xlarge compar 232.16 (SE +/- 0.94, N = 3)
Circuit: C7552: c7g.4xlarge 191.29 (SE +/- 1.94, N = 3); ampere c7g.4xlarge compar 235.95 (SE +/- 2.35, N = 3)
ONNX Runtime ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Zoo. Learn more via the OpenBenchmarking.org test page.
ONNX Runtime 1.11 - Device: CPU - Executor: Standard (Inferences Per Minute, More Is Better). (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
Model: GPT-2: c7g.4xlarge 7990 (SE +/- 2.40, N = 3)
Model: bertsquad-12: c7g.4xlarge 407 (SE +/- 0.17, N = 3)
Model: fcn-resnet101-11: c7g.4xlarge 38 (SE +/- 0.00, N = 3)
Model: ArcFace ResNet-100: c7g.4xlarge 609 (SE +/- 0.00, N = 3)
Model: super-resolution-10: c7g.4xlarge 2817 (SE +/- 1.86, N = 3)
ampere c7g.4xlarge compar (all five models): The test quit with a non-zero exit status (reported three times per model). E: ./onnx: line 2: ./onnxruntime/build/Linux/Release/onnxruntime_perf_test: No such file or directory
OpenSSL
OpenSSL is an open-source toolkit that implements the SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities.

OpenSSL 3.0 - Algorithm: SHA256 (byte/s, More Is Better)
ampere c7g.4xlarge compar: 126737105807 (SE +/- 487835541.05, N = 3)
c7g.4xlarge: 13722045973 (SE +/- 7739237.92, N = 3)

OpenSSL 3.0 - Algorithm: RSA4096 (sign/s, More Is Better)
ampere c7g.4xlarge compar: 7387.0 (SE +/- 20.84, N = 3)
c7g.4xlarge: 2546.4 (SE +/- 0.23, N = 3)

OpenSSL 3.0 - Algorithm: RSA4096 (verify/s, More Is Better)
ampere c7g.4xlarge compar: 646811.2 (SE +/- 61.45, N = 3)
c7g.4xlarge: 178460.4 (SE +/- 82.61, N = 3)
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
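The two systems differ by a factor of ten in core count (16 vs. 160), so the aggregate OpenSSL numbers are easier to read when normalized per core. The sketch below does that arithmetic on the SHA256 and RSA4096 sign results above. It assumes the benchmark was run across all available cores on both systems, which the report does not state explicitly; the dictionary layout and function name are illustrative only.

```python
# Values copied from the OpenSSL 3.0 charts above; core counts from the system details.
RESULTS = {
    "c7g.4xlarge": {
        "cores": 16,
        "sha256_bytes_per_s": 13722045973,
        "rsa4096_sign_per_s": 2546.4,
    },
    "ampere c7g.4xlarge compar": {
        "cores": 160,
        "sha256_bytes_per_s": 126737105807,
        "rsa4096_sign_per_s": 7387.0,
    },
}


def per_core(system, metric):
    """Divide an aggregate result by the core count (assumes every core was used)."""
    entry = RESULTS[system]
    return entry[metric] / entry["cores"]


for name in RESULTS:
    sha_mb = per_core(name, "sha256_bytes_per_s") / 1e6
    sign = per_core(name, "rsa4096_sign_per_s")
    print(f"{name}: {sha_mb:.1f} MB/s per core (SHA256), {sign:.2f} sign/s per core (RSA4096)")
```

Under that assumption, the 160-core system leads in absolute throughput while the c7g.4xlarge comes out ahead per core on both metrics.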
PHPBench
PHPBench is a benchmark suite for PHP. It performs a large number of simple tests in order to bench various aspects of the PHP interpreter. PHPBench can be used to compare hardware, operating systems, PHP versions, PHP accelerators and caches, compiler options, etc.

PHPBench 0.8.1 - PHP Benchmark Suite (Score, More Is Better)
c7g.4xlarge: 666484 (SE +/- 525.83, N = 3)
ampere c7g.4xlarge compar: 487006 (SE +/- 1848.20, N = 3)
POV-Ray
This is a test of POV-Ray, the Persistence of Vision Raytracer. POV-Ray is used to create 3D graphics using ray-tracing.

POV-Ray 3.7.0.7 - Trace Time (Seconds, Fewer Is Better)
c7g.4xlarge: 37.86 (SE +/- 0.01, N = 3)
ampere c7g.4xlarge compar: 148.59 (SE +/- 2.69, N = 12)
1. (CXX) g++ options: -pipe -O3 -ffast-math -lXpm -lSM -lICE -lX11 -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system (chart also lists: -R/usr/lib -pthread)
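Because some charts are "More Is Better" and others "Fewer Is Better", comparing the two systems across tests requires flipping the ratio for time-based results. A minimal sketch of that normalization, using the PHPBench and POV-Ray values above (the `speedup` helper is a hypothetical name, not part of any tool used in this report):

```python
def speedup(a, b, higher_is_better):
    """Return how many times better result `a` is than result `b`."""
    if a <= 0 or b <= 0:
        raise ValueError("results must be positive")
    # For throughput-style metrics a larger value wins; for timings a smaller one does.
    return a / b if higher_is_better else b / a


# c7g.4xlarge vs. ampere c7g.4xlarge compar, values from the charts above.
phpbench = speedup(666484, 487006, higher_is_better=True)   # Score
povray = speedup(37.86, 148.59, higher_is_better=False)     # Seconds
print(f"PHPBench: {phpbench:.2f}x, POV-Ray: {povray:.2f}x")
```

A value above 1 means the first system did better on that test regardless of the chart's direction.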
PyBench
This test profile reports the sum of the average times of the individual PyBench tests. PyBench reports average test times for different functions such as BuiltinFunctionCalls and NestedForLoops, and this total provides a rough estimate of Python's average performance on a given system. This test profile runs PyBench for 20 rounds each time.

PyBench 2018-02-16 - Total For Average Test Times (Milliseconds, Fewer Is Better)
c7g.4xlarge: 1185 (SE +/- 0.33, N = 3)
ampere c7g.4xlarge compar: 1421 (SE +/- 2.96, N = 3)
QuantLib
QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost, and the built-in benchmark used here reports the QuantLib Benchmark Index score.

QuantLib 1.21 (MFLOPS, More Is Better)
c7g.4xlarge: 2512.7 (SE +/- 0.15, N = 3)
ampere c7g.4xlarge compar: 1974.8 (SE +/- 13.49, N = 15)
1. (CXX) g++ options: -O3 -march=native -rdynamic
Rodinia
Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment.

Rodinia 3.1 - Test: OpenMP LavaMD (Seconds, Fewer Is Better)
ampere c7g.4xlarge compar: 31.35 (SE +/- 0.19, N = 3)
c7g.4xlarge: 143.33 (SE +/- 0.15, N = 3)

Rodinia 3.1 - Test: OpenMP CFD Solver (Seconds, Fewer Is Better)
c7g.4xlarge: 10.48 (SE +/- 0.02, N = 3)
ampere c7g.4xlarge compar: 28.48 (SE +/- 0.28, N = 3)

Rodinia 3.1 - Test: OpenMP Streamcluster (Seconds, Fewer Is Better)
c7g.4xlarge: 13.30 (SE +/- 0.33, N = 12)
ampere c7g.4xlarge compar: 44.70 (SE +/- 0.41, N = 3)
1. (CXX) g++ options: -O2 -lOpenCL
SecureMark
SecureMark is an objective, standardized benchmarking framework for measuring the efficiency of cryptographic processing solutions developed by EEMBC. SecureMark-TLS is benchmarking Transport Layer Security performance with a focus on IoT/edge computing.

SecureMark 1.0.4 - Benchmark: SecureMark-TLS (marks, More Is Better)
c7g.4xlarge: 183708 (SE +/- 773.26, N = 3)
ampere c7g.4xlarge compar: 141771 (SE +/- 23.89, N = 3)
1. (CC) gcc options: -pedantic -O3
simdjson
This is a benchmark of SIMDJSON, a high performance JSON parser. SIMDJSON aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others.

simdjson 1.0 - Throughput Test: Kostya (GB/s, More Is Better)
c7g.4xlarge: 1.94 (SE +/- 0.00, N = 3)
ampere c7g.4xlarge compar: 1.48 (SE +/- 0.00, N = 3)

simdjson 1.0 - Throughput Test: LargeRandom (GB/s, More Is Better)
c7g.4xlarge: 0.70 (SE +/- 0.00, N = 3)
ampere c7g.4xlarge compar: 0.56 (SE +/- 0.00, N = 3)

simdjson 1.0 - Throughput Test: PartialTweets (GB/s, More Is Better)
c7g.4xlarge: 2.62 (SE +/- 0.00, N = 3)
ampere c7g.4xlarge compar: 1.84 (SE +/- 0.00, N = 3)

simdjson 1.0 - Throughput Test: DistinctUserID (GB/s, More Is Better)
c7g.4xlarge: 2.69 (SE +/- 0.00, N = 3)
ampere c7g.4xlarge compar: 1.88 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 (charts also list: -pthread)
Stockfish
This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 512 CPU threads.

Stockfish 13 - Total Time (Nodes Per Second, More Is Better)
ampere c7g.4xlarge compar: 156545134 (SE +/- 742399.94, N = 3)
c7g.4xlarge: 27608891 (SE +/- 153578.64, N = 3)
1. (CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fprofile-use -fno-peel-loops -fno-tracer -pedantic -O3 -flto -flto=jobserver
Stress-NG 0.14 - Test: IO_uring (Bogo Ops/s, More Is Better)
c7g.4xlarge: 843015.78 (SE +/- 614.16, N = 3)
ampere c7g.4xlarge compar: The test run did not produce a result (reported three times).

Stress-NG 0.14 - Test: CPU Cache (Bogo Ops/s, More Is Better)
ampere c7g.4xlarge compar: 641.90 (SE +/- 5.56, N = 7)
c7g.4xlarge: 64.31 (SE +/- 3.64, N = 12)

Stress-NG 0.14 - Test: CPU Stress (Bogo Ops/s, More Is Better)
ampere c7g.4xlarge compar: 39188.63 (SE +/- 393.27, N = 3)
c7g.4xlarge: 5029.71 (SE +/- 0.41, N = 3)

Stress-NG 0.14 - Test: Matrix Math (Bogo Ops/s, More Is Better)
ampere c7g.4xlarge compar: 735128.78 (SE +/- 7975.03, N = 3)
c7g.4xlarge: 80088.74 (SE +/- 3.18, N = 3)

Stress-NG 0.14 - Test: Vector Math (Bogo Ops/s, More Is Better)
ampere c7g.4xlarge compar: 394967.53 (SE +/- 2465.32, N = 3)
c7g.4xlarge: 55258.17 (SE +/- 17.05, N = 3)

Stress-NG 0.14 - Test: Memory Copying (Bogo Ops/s, More Is Better)
ampere c7g.4xlarge compar: 12885.53 (SE +/- 53.04, N = 3)
c7g.4xlarge: 6693.32 (SE +/- 3.52, N = 3)
1. (CC) gcc options: -O2 -std=gnu99 -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lrt -lz -pthread
TensorFlow Lite
This is a benchmark of the TensorFlow Lite implementation focused on TensorFlow machine learning for mobile, IoT, edge, and other cases. The current Linux support is limited to running on CPUs. This test profile is measuring the average inference time.

TensorFlow Lite 2022-05-18 - Model: SqueezeNet (Microseconds, Fewer Is Better)
c7g.4xlarge: 3257.94 (SE +/- 22.07, N = 3)
ampere c7g.4xlarge compar: 57204.10 (SE +/- 1420.97, N = 15)
TSCP
This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark.

TSCP 1.81 - AI Chess Performance (Nodes Per Second, More Is Better)
c7g.4xlarge: 1370094 (SE +/- 0.00, N = 5)
ampere c7g.4xlarge compar: 1041565 (SE +/- 966.51, N = 5)
1. (CC) gcc options: -O3 -march=native
WebP Image Encode
This is a test of Google's libwebp with the cwebp image encode utility and using a sample 6000x4000 pixel JPEG image as the input.

WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless (Encode Time - Seconds, Fewer Is Better)
c7g.4xlarge: 22.77 (SE +/- 0.09, N = 3)
ampere c7g.4xlarge compar: 26.54 (SE +/- 0.00, N = 3)

WebP Image Encode 1.1 - Encode Settings: Quality 100, Highest Compression (Encode Time - Seconds, Fewer Is Better)
c7g.4xlarge: 9.346 (SE +/- 0.007, N = 3)
ampere c7g.4xlarge compar: 10.206 (SE +/- 0.103, N = 3)

WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless, Highest Compression (Encode Time - Seconds, Fewer Is Better)
c7g.4xlarge: 48.21 (SE +/- 0.01, N = 3)
ampere c7g.4xlarge compar: 56.89 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O2 -lm -ljpeg -lpng16 (charts also list: -pthread -ltiff)
Xcompact3d Incompact3d
Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equation and as many scalar transport equations as needed.

Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 129 Cells Per Direction (Seconds, Fewer Is Better)
ampere c7g.4xlarge compar: 3.06693709 (SE +/- 0.00831997, N = 2)
c7g.4xlarge: 8.01671425 (SE +/- 0.01401446, N = 3)

Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better)
ampere c7g.4xlarge compar: 12.09 (SE +/- 0.03, N = 3)
c7g.4xlarge: 29.13 (SE +/- 0.03, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi (charts also list: -pthread -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz)
Zstd Compression
This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image - FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels / settings.

Zstd Compression 1.5.0 - Compression Level: 3 - Compression Speed (MB/s, More Is Better)
c7g.4xlarge: 4639.1 (SE +/- 9.57, N = 3)
ampere c7g.4xlarge compar: 3012.8 (SE +/- 207.50, N = 15)

Zstd Compression 1.5.0 - Compression Level: 3 - Decompression Speed (MB/s, More Is Better)
c7g.4xlarge: 3508.5 (SE +/- 2.07, N = 3)
ampere c7g.4xlarge compar: 2806.6 (SE +/- 5.11, N = 12)

Zstd Compression 1.5.0 - Compression Level: 19 - Compression Speed (MB/s, More Is Better)
ampere c7g.4xlarge compar: 58.0 (SE +/- 1.05, N = 15)
c7g.4xlarge: 41.2 (SE +/- 0.00, N = 3)

Zstd Compression 1.5.0 - Compression Level: 19 - Decompression Speed (MB/s, More Is Better)
c7g.4xlarge: 3050.3 (SE +/- 7.75, N = 3)
ampere c7g.4xlarge compar: 2330.1 (SE +/- 2.68, N = 15)

Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better)
c7g.4xlarge: 39.5 (SE +/- 0.23, N = 3)
ampere c7g.4xlarge compar: 28.9 (SE +/- 1.04, N = 15)

Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better)
c7g.4xlarge: 3240.6 (SE +/- 6.93, N = 3)
ampere c7g.4xlarge compar: 2465.2 (SE +/- 4.45, N = 15)
1. (CC) gcc options: -O3 -pthread -lz (charts also list: -llzma)
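The six Zstd results can be condensed into one figure by taking the geometric mean of the per-test ratios, a common way to summarize benchmark suites. The sketch below does this for c7g.4xlarge relative to ampere c7g.4xlarge compar using only the MB/s values listed above; the report itself does not publish such a summary here, so this is an illustrative calculation rather than an official result.

```python
import math

# (c7g.4xlarge, ampere c7g.4xlarge compar) in MB/s, from the Zstd 1.5.0 charts above.
ZSTD_MB_S = {
    "Level 3 - Compression": (4639.1, 3012.8),
    "Level 3 - Decompression": (3508.5, 2806.6),
    "Level 19 - Compression": (41.2, 58.0),
    "Level 19 - Decompression": (3050.3, 2330.1),
    "Level 19, Long Mode - Compression": (39.5, 28.9),
    "Level 19, Long Mode - Decompression": (3240.6, 2465.2),
}


def geometric_mean(values):
    """Geometric mean via logarithms; all values must be positive."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Ratio > 1 means c7g.4xlarge was faster on that test.
ratios = {test: c7g / ampere for test, (c7g, ampere) in ZSTD_MB_S.items()}
overall = geometric_mean(ratios.values())

for test, ratio in ratios.items():
    print(f"{test}: {ratio:.2f}x")
print(f"Geometric mean: {overall:.2f}x")
```

The geometric mean is used instead of the arithmetic mean so that a 2x win and a 2x loss cancel out rather than averaging to a net gain.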
c7g.4xlarge Processor: ARMv8 Neoverse-V1 (16 Cores), Motherboard: Amazon EC2 c7g.4xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 32GB, Disk: 193GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Ubuntu 22.04, Kernel: 5.15.0-1004-aws (aarch64), Compiler: GCC 11.2.0, File-System: ext4, System Layer: amazon
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Java Notes: OpenJDK Runtime Environment (build 11.0.15+10-Ubuntu-0ubuntu0.22.04.1)
Python Notes: Python 3.10.4
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 24 May 2022 11:30 by user ubuntu.
ampere c7g.4xlarge compar Processor: Ampere ARMv8 Neoverse-N1 @ 3.00GHz (160 Cores), Motherboard: FOXCONN Mt. Collins (0ACOC017 SCP: 1.08.20210825 BIOS), Chipset: Ampere Computing LLC Device e100, Memory: 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE, Disk: 2 x 1920GB SAMSUNG MZQL21T9HCJR-00A07, Graphics: ASPEED, Monitor: PL2294H, Network: 4 x Mellanox MT27710 + 2 x Intel I350
OS: Ubuntu 20.04, Kernel: 5.4.0-100-generic (aarch64), Vulkan: 1.1.182, Compiler: GCC 9.4.0, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Processor Notes: Scaling Governor: cppc_cpufreq performance
Java Notes: OpenJDK Runtime Environment (build 11.0.15+10-Ubuntu-0ubuntu0.20.04.1)
Python Notes: Python 3.8.10
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 25 May 2022 13:32 by user root.