Amazon AWS

Amazon testing on Ubuntu 22.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2306232-NE-2306227NE39
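For scripted runs, the command above can be wrapped in a small helper. This is a minimal sketch, not part of the Phoronix Test Suite itself: the function names `build_benchmark_command` and `run_benchmark` are hypothetical, and the only thing taken from this page is the `phoronix-test-suite benchmark <result id>` invocation.

```python
import shutil
import subprocess

# Result file identifier quoted on this page.
RESULT_ID = "2306232-NE-2306227NE39"


def build_benchmark_command(result_id: str) -> list[str]:
    """Return the argv list for comparing a local system against a result file."""
    return ["phoronix-test-suite", "benchmark", result_id]


def run_benchmark(result_id: str) -> int:
    """Run the comparison and return the exit code (hypothetical helper)."""
    if shutil.which("phoronix-test-suite") is None:
        # The Phoronix Test Suite has to be installed separately.
        raise FileNotFoundError("phoronix-test-suite was not found on PATH")
    return subprocess.run(build_benchmark_command(result_id), check=False).returncode
```

Calling `run_benchmark(RESULT_ID)` starts the same comparison as the command line shown above. Expect a long run: the two result sets on this page each took more than seven hours.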
Runs

m7g.16xlarge Graviton3: run June 22 2023, test duration 7 Hours, 33 Minutes
c6g.16xlarge Graviton2: run June 23 2023, test duration 8 Hours, 28 Minutes

Amazon AWS

m7g.16xlarge Graviton3
Processor: ARMv8 Neoverse-V1 (64 Cores)
Motherboard: Amazon EC2 m7g.16xlarge (1.0 BIOS)
Chipset: Amazon Device 0200
Memory: 256GB
Disk: 215GB Amazon Elastic Block Store
Network: Amazon Elastic
OS: Ubuntu 22.04
Kernel: 5.19.0-1025-aws (aarch64)
Compiler: GCC 11.3.0
File-System: ext4
System Layer: amazon

c6g.16xlarge Graviton2 (other fields as for m7g.16xlarge Graviton3)
Processor: ARMv8 Neoverse-N1 (64 Cores)
Motherboard: Amazon EC2 c6g.16xlarge (1.0 BIOS)
Memory: 128GB

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Python Details: Python 3.10.6

Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected

[Chart: m7g.16xlarge Graviton3 vs. c6g.16xlarge Graviton2 Comparison (Phoronix Test Suite), showing the percentage difference per test relative to the baseline. Largest differences: OpenSSL RSA4096 288% and 233.5%, OpenSSL SHA512 123.2%, OpenSSL AES-256-GCM 119.3%, OpenSSL AES-128-GCM 109.6%, HeFFTe c2c - FFTW - float - 512 105.6%, Stress-NG CPU Cache 102.5%.]
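The percentages in the comparison chart appear to be simple ratios of the two results, with the ratio inverted for tests where fewer is better. The sketch below rests on that assumption, which was inferred from the published numbers rather than taken from Phoronix Test Suite documentation. The function name `percent_advantage` is hypothetical.

```python
def percent_advantage(graviton3: float, graviton2: float, more_is_better: bool = True) -> float:
    """Percentage by which the Graviton3 result is ahead of the Graviton2 result.

    Assumption: for "More Is Better" tests the ratio is graviton3 / graviton2;
    for "Fewer Is Better" tests (for example, seconds) it is inverted.
    """
    ratio = graviton3 / graviton2 if more_is_better else graviton2 / graviton3
    return (ratio - 1.0) * 100.0


# Values taken from this result file.
numa = percent_advantage(3759.10, 2112.66)                        # Stress-NG NUMA, Bogo Ops/s
lavamd = percent_advantage(43.79, 62.22, more_is_better=False)    # Rodinia OpenMP LavaMD, seconds
```

With these inputs, `numa` comes out at roughly 77.9 and `lavamd` at roughly 42.1, matching the chart entries for NUMA and OpenMP LavaMD.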

Amazon AWS

Test | m7g.16xlarge Graviton3 | c6g.16xlarge Graviton2
stress-ng: NUMA | 3759.10 | 2112.66
stress-ng: CPU Cache | 3892396.34 | 1921785.20
stress-ng: Matrix Math | 368750.67 | 284713.63
stress-ng: Vector Math | 217235.59 | 147886.14
stress-ng: Matrix 3D Math | 10403.93 | 5752.17
stress-ng: Memory Copying | 20484.24 | 11324.79
stress-ng: Vector Shuffle | 54143.40 | 35614.51
stress-ng: Wide Vector Math | 1542834.94 | 997272.65
stress-ng: Fused Multiply-Add | 63762252.76 | 37732190.54
stress-ng: Vector Floating Point | 76102.55 | 42850.82
heffte: r2c - FFTW - float - 128 | 306.540 | 209.496
brl-cad: VGR Performance Metric | 783777 | 533020
heffte: r2c - FFTW - double - 256 | 78.5049 | 40.1104
heffte: r2c - FFTW - double - 512 | 84.4739 | 44.9297
heffte: c2c - FFTW - double - 512 | 46.2504 | 24.2658
laghos: Triple Point Problem | 232.01 | 180.80
graph500: 26 | 299497000 | 209350000
heffte: c2c - FFTW - double - 256 | 40.8923 | 20.6279
heffte: r2c - FFTW - float - 256 | 164.873 | 92.3996
heffte: r2c - FFTW - float - 512 | 162.956 | 81.9412
heffte: c2c - FFTW - float - 512 | 88.0482 | 42.8284
graph500: 26 | 1227790000 | 874389000
heffte: c2c - FFTW - float - 256 | 81.4442 | 41.9816
heffte: c2c - FFTW - float - 128 | 186.356 | 135.358
laghos: Sedov Blast Wave, ube_922_hex.mesh | 410.55 | 322.37
graph500: 26 | 1194320000 | 860432000
heffte: c2c - FFTW - double - 128 | 57.1503 | 32.7468
graph500: 26 | 419754000 | 284689000
heffte: r2c - FFTW - double - 128 | 138.014 | 81.4498
nekrs: Kershaw | 3150680000 | 1760336667
nekrs: TurboPipe Periodic | 3976300000 | 2220190000
lczero: BLAS | 1301 | 947
lczero: Eigen | 1398 | 891
gromacs: MPI CPU - water_GMX50_bare | 4.223 | 2.767
lammps: 20k Atoms | 36.927 | 25.171
lammps: Rhodopsin Protein | 37.558 | 25.950
hpcg: 144 144 144 - 60 | 33.7901 | -
hpcg: 160 160 160 - 60 | 33.8195 | -
npb: CG.C | 21988.99 | 13103.62
npb: EP.D | 3738.98 | 2216.26
npb: LU.C | 28341.68 | 18741.90
npb: MG.C | 50126.29 | 25671.29
npb: SP.C | 17244.85 | 9711.70
rodinia: OpenMP LavaMD | 43.788 | 62.224
rodinia: OpenMP CFD Solver | 4.375 | 6.051
rodinia: OpenMP Streamcluster | 11.663 | 13.735
mt-dgemm: Sustained Floating-Point Rate | 24.362353 | 20.417952
pennant: sedovbig | 9.206490 | 16.48050
pennant: leblancbig | 6.720537 | 12.17683
amg | 1646761667 | 1035586333
kripke | 339000400 | 220120233
lulesh | 28296.378 | 17557.485
nwchem: C240 Buckyball | 1940.2 | 2976.9
mocassin: Gas HII40 | 13.575 | 20.758
mocassin: Dust 2D tau100.0 | 82.669 | 145.374
qmcpack: Li2_STO_ae | 112.61 | 165.12
qmcpack: simple-H2O | 28.041 | 45.225
qmcpack: FeCO6_b3lyp_gms | 211.60 | 302.19
qmcpack: FeCO6_b3lyp_gms | 205.72 | 297.94
incompact3d: input.i3d 129 Cells Per Direction | 3.09871038 | 5.63720735
incompact3d: input.i3d 193 Cells Per Direction | 13.9454180 | 25.8825658
remhos: Sample Remap Example | 14.040 | 20.740
gpaw: Carbon Nanotube | 61.831 | 92.760
coremark: CoreMark Size 666 - Iterations Per Second | 1601880.342264 | 1260642.177024
stockfish: Total Time | 112119711 | 86609284
compress-7zip: Compression Rating | 316825 | 240702
compress-7zip: Decompression Rating | 285540 | 234202
build-godot: Time To Compile | 154.378 | 218.276
build-gem5: Time To Compile | 180.247 | 225.305
build-nodejs: Time To Compile | 237.783 | 287.814
liquid-dsp: 32 - 256 - 32 | 1136066667 | 765466667
liquid-dsp: 32 - 256 - 57 | 721493333 | 489270000
liquid-dsp: 64 - 256 - 32 | 2270500000 | 1531400000
liquid-dsp: 64 - 256 - 57 | 1442400000 | 978200000
liquid-dsp: 32 - 256 - 512 | 81396667 | 67486333
liquid-dsp: 64 - 256 - 512 | 162753333 | 134926667
srsran: Downlink Processor Benchmark | 318.5 | 197.2
srsran: PUSCH Processor Benchmark, Throughput Total | 5413.8 | 3938.7
srsran: PUSCH Processor Benchmark, Throughput Thread | 95.8 | 63.8
nginx: 500 | 255768.44 | 148964.69
nginx: 1000 | 255616.04 | 158676.40
apache: 500 | 71754.89 | 66640.93
apache: 1000 | 60965.70 | 67276.83
openssl: SHA256 | 54212515580 | 42472798847
openssl: SHA512 | 32125448870 | 14393925490
openssl: RSA4096 | 10181.9 | 2624.3
openssl: RSA4096 | 713859.5 | 214040.9
openssl: ChaCha20 | 103226784517 | 67292541203
openssl: AES-128-GCM | 332033171900 | 158436163857
openssl: AES-256-GCM | 283333113630 | 129199593157
openssl: ChaCha20-Poly1305 | 74287460990 | 46717636807

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

Stress-NG 0.15.10 - Test: NUMA (Bogo Ops/s, More Is Better)
m7g.16xlarge Graviton3: 3759.10 (SE +/- 5.17, N = 3)
c6g.16xlarge Graviton2: 2112.66 (SE +/- 1.53, N = 3)

Stress-NG 0.15.10 - Test: CPU Cache (Bogo Ops/s, More Is Better)
m7g.16xlarge Graviton3: 3892396.34 (SE +/- 57217.78, N = 15)
c6g.16xlarge Graviton2: 1921785.20 (SE +/- 21905.72, N = 15)

Stress-NG 0.15.10 - Test: Matrix Math (Bogo Ops/s, More Is Better)
m7g.16xlarge Graviton3: 368750.67 (SE +/- 53.44, N = 3)
c6g.16xlarge Graviton2: 284713.63 (SE +/- 8.13, N = 3)

Stress-NG 0.15.10 - Test: Vector Math (Bogo Ops/s, More Is Better)
m7g.16xlarge Graviton3: 217235.59 (SE +/- 47.94, N = 3)
c6g.16xlarge Graviton2: 147886.14 (SE +/- 37.96, N = 3)

Stress-NG 0.15.10 - Test: Matrix 3D Math (Bogo Ops/s, More Is Better)
m7g.16xlarge Graviton3: 10403.93 (SE +/- 6.38, N = 3)
c6g.16xlarge Graviton2: 5752.17 (SE +/- 1.40, N = 3)

Stress-NG 0.15.10 - Test: Memory Copying (Bogo Ops/s, More Is Better)
m7g.16xlarge Graviton3: 20484.24 (SE +/- 3.80, N = 3)
c6g.16xlarge Graviton2: 11324.79 (SE +/- 1.12, N = 3)

Stress-NG 0.15.10 - Test: Vector Shuffle (Bogo Ops/s, More Is Better)
m7g.16xlarge Graviton3: 54143.40 (SE +/- 21.44, N = 3)
c6g.16xlarge Graviton2: 35614.51 (SE +/- 74.80, N = 3)

Stress-NG 0.15.10 - Test: Wide Vector Math (Bogo Ops/s, More Is Better)
m7g.16xlarge Graviton3: 1542834.94 (SE +/- 16116.93, N = 15)
c6g.16xlarge Graviton2: 997272.65 (SE +/- 505.84, N = 3)

Stress-NG 0.15.10 - Test: Fused Multiply-Add (Bogo Ops/s, More Is Better)
m7g.16xlarge Graviton3: 63762252.76 (SE +/- 4870.19, N = 3)
c6g.16xlarge Graviton2: 37732190.54 (SE +/- 3687.67, N = 3)

Stress-NG 0.15.10 - Test: Vector Floating Point (Bogo Ops/s, More Is Better)
m7g.16xlarge Graviton3: 76102.55 (SE +/- 190.19, N = 3)
c6g.16xlarge Graviton2: 42850.82 (SE +/- 31.31, N = 3)

Compiler note for the Stress-NG results above: 1. (CXX) g++ options: -lm -lapparmor -latomic -lc -lcrypt -ldl -ljpeg -lpthread -lrt -lsctp -lz
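Each result is reported with a standard error (SE) and a run count (N). To judge how noisy a result is, the SE can be expressed as a percentage of the reported value. The sketch below assumes the usual definition SE = s / sqrt(N), where s is the sample standard deviation. The page does not state how the Phoronix Test Suite computes SE, so treat this as an illustration only. Both function names are hypothetical.

```python
import math


def relative_standard_error(mean: float, standard_error: float) -> float:
    """Standard error as a percentage of the reported mean."""
    return standard_error / mean * 100.0


def sample_std_from_se(standard_error: float, n: int) -> float:
    """Recover the sample standard deviation, assuming SE = s / sqrt(N)."""
    return standard_error * math.sqrt(n)


# Stress-NG NUMA on m7g.16xlarge Graviton3: 3759.10, SE +/- 5.17, N = 3
numa_rse = relative_standard_error(3759.10, 5.17)
# Stress-NG CPU Cache on m7g.16xlarge Graviton3: 3892396.34, SE +/- 57217.78, N = 15
cache_rse = relative_standard_error(3892396.34, 57217.78)
```

On these inputs the NUMA result has a relative SE of about 0.14%, while CPU Cache is noticeably noisier at about 1.5%. That would be consistent with CPU Cache having been run 15 times rather than 3, although the page does not say why the run count was raised.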

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 306.54 (SE +/- 0.83, N = 3)
c6g.16xlarge Graviton2: 209.50 (SE +/- 0.64, N = 3)
1. (CXX) g++ options: -O3

BRL-CAD

BRL-CAD is a cross-platform, open-source solid modeling system with built-in benchmark mode. Learn more via the OpenBenchmarking.org test page.

BRL-CAD 7.34 - VGR Performance Metric (VGR Performance Metric, More Is Better)
m7g.16xlarge Graviton3: 783777
c6g.16xlarge Graviton2: 533020
1. (CXX) g++ options: -std=c++14 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lregex_brl -lz_brl -lnetpbm -ldl -lm -ltk8.6

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 78.50 (SE +/- 0.02, N = 3)
c6g.16xlarge Graviton2: 40.11 (SE +/- 0.01, N = 3)

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 84.47 (SE +/- 0.02, N = 3)
c6g.16xlarge Graviton2: 44.93 (SE +/- 0.03, N = 3)

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512 (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 46.25 (SE +/- 0.01, N = 3)
c6g.16xlarge Graviton2: 24.27 (SE +/- 0.01, N = 3)

Compiler note for the three HeFFTe results above: 1. (CXX) g++ options: -O3

Laghos

Laghos (LAGrangian High-Order Solver) is a miniapp that solves the time-dependent Euler equations of compressible gas dynamics in a moving Lagrangian frame using unstructured high-order finite element spatial discretization and explicit high-order time-stepping. Learn more via the OpenBenchmarking.org test page.

Laghos 3.1 - Test: Triple Point Problem (Major Kernels Total Rate, More Is Better)
m7g.16xlarge Graviton3: 232.01 (SE +/- 0.28, N = 3)
c6g.16xlarge Graviton2: 180.80 (SE +/- 0.48, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

Graph500

This is a benchmark of the reference implementation of Graph500, an HPC benchmark focused on data intensive loads and commonly tested on supercomputers for complex data problems. Graph500 primarily stresses the communication subsystem of the hardware under test. Learn more via the OpenBenchmarking.org test page.

Graph500 3.0 - Scale: 26 (sssp median_TEPS, More Is Better)
m7g.16xlarge Graviton3: 299497000
c6g.16xlarge Graviton2: 209350000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256 (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 40.89 (SE +/- 0.01, N = 3)
c6g.16xlarge Graviton2: 20.63 (SE +/- 0.01, N = 3)

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 164.87 (SE +/- 0.27, N = 3)
c6g.16xlarge Graviton2: 92.40 (SE +/- 0.19, N = 3)

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 162.96 (SE +/- 0.13, N = 3)
c6g.16xlarge Graviton2: 81.94 (SE +/- 0.03, N = 3)

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 88.05 (SE +/- 0.02, N = 3)
c6g.16xlarge Graviton2: 42.83 (SE +/- 0.01, N = 3)

Compiler note for the four HeFFTe results above: 1. (CXX) g++ options: -O3

Graph500

Graph500 3.0 - Scale: 26 (bfs max_TEPS, More Is Better)
m7g.16xlarge Graviton3: 1227790000
c6g.16xlarge Graviton2: 874389000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 81.44 (SE +/- 0.01, N = 3)
c6g.16xlarge Graviton2: 41.98 (SE +/- 0.05, N = 3)

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 186.36 (SE +/- 0.27, N = 3)
c6g.16xlarge Graviton2: 135.36 (SE +/- 0.35, N = 3)

Compiler note for the two HeFFTe results above: 1. (CXX) g++ options: -O3

Laghos

Laghos 3.1 - Test: Sedov Blast Wave, ube_922_hex.mesh (Major Kernels Total Rate, More Is Better)
m7g.16xlarge Graviton3: 410.55 (SE +/- 0.42, N = 3)
c6g.16xlarge Graviton2: 322.37 (SE +/- 0.89, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

Graph500

Graph500 3.0 - Scale: 26 (bfs median_TEPS, More Is Better)
m7g.16xlarge Graviton3: 1194320000
c6g.16xlarge Graviton2: 860432000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 57.15 (SE +/- 0.28, N = 3)
c6g.16xlarge Graviton2: 32.75 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -O3

Graph500

Graph500 3.0 - Scale: 26 (sssp max_TEPS, More Is Better)
m7g.16xlarge Graviton3: 419754000
c6g.16xlarge Graviton2: 284689000
1. (CC) gcc options: -fcommon -O3 -lpthread -lm -lmpi

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3 - Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 138.01 (SE +/- 0.12, N = 3)
c6g.16xlarge Graviton2: 81.45 (SE +/- 0.61, N = 3)
1. (CXX) g++ options: -O3

nekRS

nekRS is an open-source Navier-Stokes solver based on the spectral element method. nekRS supports both CPU and GPU/accelerator execution, though this test profile is currently configured for CPU execution. nekRS is part of Nek5000 from the Mathematics and Computer Science (MCS) division at Argonne National Laboratory. This nekRS benchmark is primarily relevant to large core count HPC servers and may otherwise be very time consuming on smaller systems. Learn more via the OpenBenchmarking.org test page.

nekRS 23.0 - Input: Kershaw (flops/rank, More Is Better)
m7g.16xlarge Graviton3: 3150680000 (SE +/- 1575066.14, N = 3)
c6g.16xlarge Graviton2: 1760336667 (SE +/- 737119.02, N = 3)

nekRS 23.0 - Input: TurboPipe Periodic (flops/rank, More Is Better)
m7g.16xlarge Graviton3: 3976300000 (SE +/- 1199180.28, N = 3)
c6g.16xlarge Graviton2: 2220190000 (SE +/- 144222.05, N = 3)

Compiler note for the nekRS results above: 1. (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -rdynamic -lmpi_cxx -lmpi

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.28 - Backend: BLAS (Nodes Per Second, More Is Better)
m7g.16xlarge Graviton3: 1301 (SE +/- 4.67, N = 3)
c6g.16xlarge Graviton2: 947 (SE +/- 11.79, N = 3)

LeelaChessZero 0.28 - Backend: Eigen (Nodes Per Second, More Is Better)
m7g.16xlarge Graviton3: 1398 (SE +/- 8.74, N = 3)
c6g.16xlarge Graviton2: 891 (SE +/- 4.73, N = 3)

Compiler note for the LeelaChessZero results above: 1. (CXX) g++ options: -flto -pthread

GROMACS

This test runs the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2023 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
m7g.16xlarge Graviton3: 4.223 (SE +/- 0.003, N = 3)
c6g.16xlarge Graviton2: 2.767 (SE +/- 0.002, N = 3)
1. (CXX) g++ options: -O3

LAMMPS Molecular Dynamics Simulator

LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. Learn more via the OpenBenchmarking.org test page.

LAMMPS Molecular Dynamics Simulator 23Jun2022 - Model: 20k Atoms (ns/day, More Is Better)
m7g.16xlarge Graviton3: 36.93 (SE +/- 0.03, N = 3)
c6g.16xlarge Graviton2: 25.17 (SE +/- 0.01, N = 3)

LAMMPS Molecular Dynamics Simulator 23Jun2022 - Model: Rhodopsin Protein (ns/day, More Is Better)
m7g.16xlarge Graviton3: 37.56 (SE +/- 0.06, N = 3)
c6g.16xlarge Graviton2: 25.95 (SE +/- 0.08, N = 3)

Compiler note for the LAMMPS results above: 1. (CXX) g++ options: -O3 -ldl

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient benchmark, a new scientific benchmark from Sandia National Labs aimed at supercomputer testing with modern real-world workloads compared to HPCC. Learn more via the OpenBenchmarking.org test page.

High Performance Conjugate Gradient 3.1 - X Y Z: 144 144 144 - RT: 60 (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 33.79 (SE +/- 0.00, N = 3)
c6g.16xlarge Graviton2: The test quit with a non-zero exit status. E: cat: 'HPCG-Benchmark*.txt': No such file or directory

High Performance Conjugate Gradient 3.1 - X Y Z: 160 160 160 - RT: 60 (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 33.82 (SE +/- 0.00, N = 3)
c6g.16xlarge Graviton2: The test quit with a non-zero exit status. E: cat: 'HPCG-Benchmark*.txt': No such file or directory

Compiler note for the HPCG results above: 1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
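The error text suggests that no file matching `HPCG-Benchmark*.txt` existed when the test tried to read it, so `cat` received the unexpanded pattern. Why HPCG produced no report on the Graviton2 instance is not stated on this page. The sketch below shows one way a wrapper could check for the report before reading it. It is an illustration only and not how the test profile works. The function names are hypothetical, and the working directory is whatever the caller passes in.

```python
import glob
import os
from typing import Optional


def find_hpcg_reports(directory: str) -> list[str]:
    """Return the sorted paths matching HPCG-Benchmark*.txt in the directory."""
    return sorted(glob.glob(os.path.join(directory, "HPCG-Benchmark*.txt")))


def read_hpcg_report(directory: str) -> Optional[str]:
    """Return the text of the first matching report, or None if none exists."""
    reports = find_hpcg_reports(directory)
    if not reports:
        # The situation the error message points to: the pattern matched
        # nothing, so there is no report to read.
        return None
    with open(reports[0], encoding="utf-8") as handle:
        return handle.read()
```

Returning `None` lets the caller report a missing result explicitly instead of failing inside `cat`.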

NAS Parallel Benchmarks

NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.

NAS Parallel Benchmarks 3.4 - Test / Class: CG.C (Total Mop/s, More Is Better)
m7g.16xlarge Graviton3: 21988.99 (SE +/- 130.18, N = 3)
c6g.16xlarge Graviton2: 13103.62 (SE +/- 31.56, N = 3)

NAS Parallel Benchmarks 3.4 - Test / Class: EP.D (Total Mop/s, More Is Better)
m7g.16xlarge Graviton3: 3738.98 (SE +/- 1.69, N = 3)
c6g.16xlarge Graviton2: 2216.26 (SE +/- 2.22, N = 3)

NAS Parallel Benchmarks 3.4 - Test / Class: LU.C (Total Mop/s, More Is Better)
m7g.16xlarge Graviton3: 28341.68 (SE +/- 48.62, N = 3)
c6g.16xlarge Graviton2: 18741.90 (SE +/- 26.12, N = 3)

NAS Parallel Benchmarks 3.4 - Test / Class: MG.C (Total Mop/s, More Is Better)
m7g.16xlarge Graviton3: 50126.29 (SE +/- 24.30, N = 3)
c6g.16xlarge Graviton2: 25671.29 (SE +/- 7.02, N = 3)

NAS Parallel Benchmarks 3.4 - Test / Class: SP.C (Total Mop/s, More Is Better)
m7g.16xlarge Graviton3: 17244.85 (SE +/- 10.19, N = 3)
c6g.16xlarge Graviton2: 9711.70 (SE +/- 1.54, N = 3)

Compiler note for the NAS Parallel Benchmarks results above: 1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Rodinia

Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.

Rodinia 3.1 - Test: OpenMP LavaMD (Seconds, Fewer Is Better)
m7g.16xlarge Graviton3: 43.79 (SE +/- 0.15, N = 3)
c6g.16xlarge Graviton2: 62.22 (SE +/- 0.04, N = 3)

Rodinia 3.1 - Test: OpenMP CFD Solver (Seconds, Fewer Is Better)
m7g.16xlarge Graviton3: 4.375 (SE +/- 0.011, N = 3)
c6g.16xlarge Graviton2: 6.051 (SE +/- 0.016, N = 3)

Rodinia 3.1 - Test: OpenMP Streamcluster (Seconds, Fewer Is Better)
m7g.16xlarge Graviton3: 11.66 (SE +/- 0.14, N = 3)
c6g.16xlarge Graviton2: 13.74 (SE +/- 0.21, N = 15)

Compiler note for the Rodinia results above: 1. (CXX) g++ options: -O2 -lOpenCL

ACES DGEMM

This is a multi-threaded DGEMM benchmark. Learn more via the OpenBenchmarking.org test page.

ACES DGEMM 1.0 - Sustained Floating-Point Rate (GFLOP/s, More Is Better)
m7g.16xlarge Graviton3: 24.36 (SE +/- 0.17, N = 13)
c6g.16xlarge Graviton2: 20.42 (SE +/- 0.15, N = 3)
1. (CC) gcc options: -O3 -march=native -fopenmp

Pennant

Pennant is an application focused on hydrodynamics on general unstructured meshes in 2D. Learn more via the OpenBenchmarking.org test page.

Pennant 1.0.1 - Test: sedovbig (Hydro Cycle Time - Seconds, Fewer Is Better)
m7g.16xlarge Graviton3: 9.206490 (SE +/- 0.011347, N = 3)
c6g.16xlarge Graviton2: 16.480500 (SE +/- 0.018218, N = 3)

Pennant 1.0.1 - Test: leblancbig (Hydro Cycle Time - Seconds, Fewer Is Better)
m7g.16xlarge Graviton3: 6.720537 (SE +/- 0.000869, N = 3)
c6g.16xlarge Graviton2: 12.176830 (SE +/- 0.018924, N = 3)

Compiler note for the Pennant results above: 1. (CXX) g++ options: -fopenmp -lmpi_cxx -lmpi

Algebraic Multi-Grid Benchmark

AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.

Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
m7g.16xlarge Graviton3: 1646761667 (SE +/- 103191.30, N = 3)
c6g.16xlarge Graviton2: 1035586333 (SE +/- 140169.34, N = 3)
1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi

Kripke

Kripke is a simple, scalable, 3D Sn deterministic particle transport code. Its primary purpose is to research how data layout, programming paradigms and architectures affect the implementation and performance of Sn transport. Kripke is developed by LLNL. Learn more via the OpenBenchmarking.org test page.

Kripke 1.2.6 (Throughput FoM, More Is Better)
m7g.16xlarge Graviton3: 339000400 (SE +/- 619419.33, N = 3)
c6g.16xlarge Graviton2: 220120233 (SE +/- 102787.75, N = 3)
1. (CXX) g++ options: -O3 -fopenmp -ldl

LULESH

LULESH is the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. Learn more via the OpenBenchmarking.org test page.

LULESH 2.0.3 (z/s, More Is Better)
  m7g.16xlarge Graviton3: 28296.38 (SE +/- 27.09, N = 3)
  c6g.16xlarge Graviton2: 17557.49 (SE +/- 38.55, N = 3)
  (CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi

NWChem

NWChem is an open-source high performance computational chemistry package. Per NWChem's documentation, "NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters." Learn more via the OpenBenchmarking.org test page.

NWChem 7.0.2 - Input: C240 Buckyball (Seconds, Fewer Is Better)
  m7g.16xlarge Graviton3: 1940.2
  c6g.16xlarge Graviton2: 2976.9
  (F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz -lcomex -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2

Monte Carlo Simulations of Ionised Nebulae

Mocassin is the Monte Carlo Simulations of Ionised Nebulae. MOCASSIN is a fully 3D or 2D photoionisation and dust radiative transfer code which employs a Monte Carlo approach to the transfer of radiation through media of arbitrary geometry and density distribution. Learn more via the OpenBenchmarking.org test page.

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 - Input: Gas HII40 (Seconds, Fewer Is Better)
  m7g.16xlarge Graviton3: 13.58 (SE +/- 0.05, N = 3)
  c6g.16xlarge Graviton2: 20.76 (SE +/- 0.17, N = 3)
  (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 - Input: Dust 2D tau100.0 (Seconds, Fewer Is Better)
  m7g.16xlarge Graviton3: 82.67 (SE +/- 0.01, N = 3)
  c6g.16xlarge Graviton2: 145.37 (SE +/- 0.86, N = 3)
  (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz

QMCPACK

QMCPACK is a modern high-performance open-source Quantum Monte Carlo (QMC) simulation code making use of MPI for this benchmark of the H2O example code. QMCPACK is an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids. QMCPACK is supported by the U.S. Department of Energy. Learn more via the OpenBenchmarking.org test page.

QMCPACK 3.16 - Input: Li2_STO_ae (Total Execution Time - Seconds, Fewer Is Better)
  m7g.16xlarge Graviton3: 112.61 (SE +/- 0.08, N = 3)
  c6g.16xlarge Graviton2: 165.12 (SE +/- 1.13, N = 3)
  (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -mcpu=native -O3 -lm -ldl

QMCPACK 3.16 - Input: simple-H2O (Total Execution Time - Seconds, Fewer Is Better)
  m7g.16xlarge Graviton3: 28.04 (SE +/- 0.03, N = 3)
  c6g.16xlarge Graviton2: 45.23 (SE +/- 0.24, N = 3)
  (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -mcpu=native -O3 -lm -ldl

QMCPACK 3.16 - Input: FeCO6_b3lyp_gms (Total Execution Time - Seconds, Fewer Is Better)
  m7g.16xlarge Graviton3: 211.60 (SE +/- 0.22, N = 3)
  c6g.16xlarge Graviton2: 302.19 (SE +/- 0.37, N = 3)
  (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -mcpu=native -O3 -lm -ldl

QMCPACK 3.16 - Input: FeCO6_b3lyp_gms (Total Execution Time - Seconds, Fewer Is Better)
  m7g.16xlarge Graviton3: 205.72 (SE +/- 0.45, N = 3)
  c6g.16xlarge Graviton2: 297.94 (SE +/- 1.75, N = 3)
  (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -mcpu=native -O3 -lm -ldl

Xcompact3d Incompact3d

Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equation and as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.

Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 129 Cells Per Direction (Seconds, Fewer Is Better)
  m7g.16xlarge Graviton3: 3.09871038 (SE +/- 0.02702838, N = 3)
  c6g.16xlarge Graviton2: 5.63720735 (SE +/- 0.02560507, N = 3)
  (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better)
  m7g.16xlarge Graviton3: 13.95 (SE +/- 0.02, N = 3)
  c6g.16xlarge Graviton2: 25.88 (SE +/- 0.03, N = 3)
  (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Remhos

Remhos (REMap High-Order Solver) is a miniapp that solves the pure advection equations that are used to perform monotonic and conservative discontinuous field interpolation (remap) as part of the Eulerian phase in Arbitrary Lagrangian Eulerian (ALE) simulations. Learn more via the OpenBenchmarking.org test page.

Remhos 1.0 - Test: Sample Remap Example (Seconds, Fewer Is Better)
  m7g.16xlarge Graviton3: 14.04 (SE +/- 0.04, N = 3)
  c6g.16xlarge Graviton2: 20.74 (SE +/- 0.08, N = 3)
  (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi

GPAW

GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). Learn more via the OpenBenchmarking.org test page.

GPAW 23.6 - Input: Carbon Nanotube (Seconds, Fewer Is Better)
  m7g.16xlarge Graviton3: 61.83 (SE +/- 0.03, N = 3)
  c6g.16xlarge Graviton2: 92.76 (SE +/- 0.02, N = 3)
  (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi

Coremark

This is a test of the EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.

Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
  m7g.16xlarge Graviton3: 1601880.34 (SE +/- 11449.37, N = 15)
  c6g.16xlarge Graviton2: 1260642.18 (SE +/- 153.60, N = 3)
  (CC) gcc options: -O2 -lrt" -lrt

Stockfish

This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 512 CPU threads. Learn more via the OpenBenchmarking.org test page.

Stockfish 15 - Total Time (Nodes Per Second, More Is Better)
  m7g.16xlarge Graviton3: 112119711 (SE +/- 2854071.93, N = 15)
  c6g.16xlarge Graviton2: 86609284 (SE +/- 2597495.37, N = 15)
  (CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -flto -flto=jobserver

7-Zip Compression

This is a test of 7-Zip compression/decompression with its integrated benchmark feature. Learn more via the OpenBenchmarking.org test page.

7-Zip Compression 22.01 - Test: Compression Rating (MIPS, More Is Better)
  m7g.16xlarge Graviton3: 316825 (SE +/- 154.72, N = 3)
  c6g.16xlarge Graviton2: 240702 (SE +/- 209.44, N = 3)
  (CXX) g++ options: -lpthread -ldl -O2 -fPIC

7-Zip Compression 22.01 - Test: Decompression Rating (MIPS, More Is Better)
  m7g.16xlarge Graviton3: 285540 (SE +/- 93.51, N = 3)
  c6g.16xlarge Graviton2: 234202 (SE +/- 15.43, N = 3)
  (CXX) g++ options: -lpthread -ldl -O2 -fPIC

Timed Godot Game Engine Compilation

This test times how long it takes to compile the Godot Game Engine. Godot is a popular, open-source, cross-platform 2D/3D game engine and is built using the SCons build system and targeting the X11 platform. Learn more via the OpenBenchmarking.org test page.

Timed Godot Game Engine Compilation 4.0 - Time To Compile (Seconds, Fewer Is Better)
  m7g.16xlarge Graviton3: 154.38 (SE +/- 0.32, N = 3)
  c6g.16xlarge Graviton2: 218.28 (SE +/- 0.30, N = 3)

Timed Gem5 Compilation

This test times how long it takes to compile Gem5. Gem5 is a simulator for computer system architecture research. Gem5 is widely used for computer architecture research within the industry, academia, and more. Learn more via the OpenBenchmarking.org test page.

Timed Gem5 Compilation 21.2 - Time To Compile (Seconds, Fewer Is Better)
  m7g.16xlarge Graviton3: 180.25 (SE +/- 0.13, N = 3)
  c6g.16xlarge Graviton2: 225.31 (SE +/- 0.35, N = 3)

Timed Node.js Compilation

This test profile times how long it takes to build/compile Node.js itself from source. Node.js is a JavaScript runtime built on the Chrome V8 JavaScript engine and is itself written in C/C++. Learn more via the OpenBenchmarking.org test page.

Timed Node.js Compilation 19.8.1 - Time To Compile (Seconds, Fewer Is Better)
  m7g.16xlarge Graviton3: 237.78 (SE +/- 0.33, N = 3)
  c6g.16xlarge Graviton2: 287.81 (SE +/- 0.16, N = 3)

Liquid-DSP

LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.

Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better)
  m7g.16xlarge Graviton3: 1136066667 (SE +/- 233333.33, N = 3)
  c6g.16xlarge Graviton2: 765466667 (SE +/- 456520.66, N = 3)
  (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  m7g.16xlarge Graviton3: 721493333 (SE +/- 3333.33, N = 3)
  c6g.16xlarge Graviton2: 489270000 (SE +/- 23094.01, N = 3)
  (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better)
  m7g.16xlarge Graviton3: 2270500000 (SE +/- 435889.89, N = 3)
  c6g.16xlarge Graviton2: 1531400000 (SE +/- 251661.15, N = 3)
  (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
  m7g.16xlarge Graviton3: 1442400000 (SE +/- 152752.52, N = 3)
  c6g.16xlarge Graviton2: 978200000 (SE +/- 11547.01, N = 3)
  (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 32 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better)
  m7g.16xlarge Graviton3: 81396667 (SE +/- 1855.92, N = 3)
  c6g.16xlarge Graviton2: 67486333 (SE +/- 333.33, N = 3)
  (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP 1.6 - Threads: 64 - Buffer Length: 256 - Filter Length: 512 (samples/s, More Is Better)
  m7g.16xlarge Graviton3: 162753333 (SE +/- 6666.67, N = 3)
  c6g.16xlarge Graviton2: 134926667 (SE +/- 3333.33, N = 3)
  (CC) gcc options: -O3 -pthread -lm -lc -lliquid

srsRAN Project

srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.

srsRAN Project 23.5 - Test: Downlink Processor Benchmark (Mbps, More Is Better)
  m7g.16xlarge Graviton3: 318.5 (SE +/- 0.91, N = 3)
  c6g.16xlarge Graviton2: 197.2 (SE +/- 0.25, N = 3)
  (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -lgtest

srsRAN Project 23.5 - Test: PUSCH Processor Benchmark, Throughput Total (Mbps, More Is Better)
  m7g.16xlarge Graviton3: 5413.8 (SE +/- 4.08, N = 3)
  c6g.16xlarge Graviton2: 3938.7 (SE +/- 2.53, N = 3)
  (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -lgtest

srsRAN Project 23.5 - Test: PUSCH Processor Benchmark, Throughput Thread (Mbps, More Is Better)
  m7g.16xlarge Graviton3: 95.8 (SE +/- 0.03, N = 3)
  c6g.16xlarge Graviton2: 63.8 (SE +/- 0.03, N = 3)
  (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -lgtest

nginx

This is a benchmark of the lightweight Nginx HTTP(S) web server. This Nginx web server benchmark test profile makes use of the wrk program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients/connections. HTTPS with a self-signed OpenSSL certificate is used by this test for local benchmarking. Learn more via the OpenBenchmarking.org test page.

nginx 1.23.2 - Connections: 500 (Requests Per Second, More Is Better)
  m7g.16xlarge Graviton3: 255768.44 (SE +/- 323.56, N = 3)
  c6g.16xlarge Graviton2: 148964.69 (SE +/- 90.87, N = 3)
  (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx 1.23.2 - Connections: 1000 (Requests Per Second, More Is Better)
  m7g.16xlarge Graviton3: 255616.04 (SE +/- 137.20, N = 3)
  c6g.16xlarge Graviton2: 158676.40 (SE +/- 185.79, N = 3)
  (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

Apache HTTP Server

This is a test of the Apache HTTPD web server. This Apache HTTPD web server benchmark test profile makes use of the wrk program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.

Apache HTTP Server 2.4.56 - Concurrent Requests: 500 (Requests Per Second, More Is Better)
  m7g.16xlarge Graviton3: 71754.89 (SE +/- 116.32, N = 3)
  c6g.16xlarge Graviton2: 66640.93 (SE +/- 181.58, N = 3)
  (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

Apache HTTP Server 2.4.56 - Concurrent Requests: 1000 (Requests Per Second, More Is Better)
  c6g.16xlarge Graviton2: 67276.83 (SE +/- 107.55, N = 3)
  m7g.16xlarge Graviton3: 60965.70 (SE +/- 72.21, N = 3)
  (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

OpenSSL 3.1 - Algorithm: SHA256 (byte/s, More Is Better)
  m7g.16xlarge Graviton3: 54212515580 (SE +/- 18610524.10, N = 3)
  c6g.16xlarge Graviton2: 42472798847 (SE +/- 245440310.03, N = 3)
  (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: SHA512 (byte/s, More Is Better)
  m7g.16xlarge Graviton3: 32125448870 (SE +/- 17714077.14, N = 3)
  c6g.16xlarge Graviton2: 14393925490 (SE +/- 9173912.49, N = 3)
  (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: RSA4096 (sign/s, More Is Better)
  m7g.16xlarge Graviton3: 10181.9 (SE +/- 1.27, N = 3)
  c6g.16xlarge Graviton2: 2624.3 (SE +/- 1.71, N = 3)
  (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: RSA4096 (verify/s, More Is Better)
  m7g.16xlarge Graviton3: 713859.5 (SE +/- 21.82, N = 3)
  c6g.16xlarge Graviton2: 214040.9 (SE +/- 88.30, N = 3)
  (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: ChaCha20 (byte/s, More Is Better)
  m7g.16xlarge Graviton3: 103226784517 (SE +/- 1293723.80, N = 3)
  c6g.16xlarge Graviton2: 67292541203 (SE +/- 35952887.59, N = 3)
  (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: AES-128-GCM (byte/s, More Is Better)
  m7g.16xlarge Graviton3: 332033171900 (SE +/- 81289574.27, N = 3)
  c6g.16xlarge Graviton2: 158436163857 (SE +/- 9833681.11, N = 3)
  (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: AES-256-GCM (byte/s, More Is Better)
  m7g.16xlarge Graviton3: 283333113630 (SE +/- 6411836.47, N = 3)
  c6g.16xlarge Graviton2: 129199593157 (SE +/- 2312792.64, N = 3)
  (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: ChaCha20-Poly1305 (byte/s, More Is Better)
  m7g.16xlarge Graviton3: 74287460990 (SE +/- 1340503.89, N = 3)
  c6g.16xlarge Graviton2: 46717636807 (SE +/- 1132293.08, N = 3)
  (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
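
The OpenSSL numbers above can be condensed into per-algorithm ratios and a geometric mean. The following is a minimal Python sketch, not part of the Phoronix Test Suite: the dictionary layout and the helper names `ratio` and `geometric_mean` are illustrative assumptions, and the figures are copied by hand from the OpenSSL 3.1 results in this file.

```python
import math

# OpenSSL 3.1 results copied from the results above, as
# (m7g.16xlarge Graviton3, c6g.16xlarge Graviton2). All are "More Is Better".
results = {
    "SHA256": (54212515580, 42472798847),
    "SHA512": (32125448870, 14393925490),
    "RSA4096 sign/s": (10181.9, 2624.3),
    "RSA4096 verify/s": (713859.5, 214040.9),
    "ChaCha20": (103226784517, 67292541203),
    "AES-128-GCM": (332033171900, 158436163857),
    "AES-256-GCM": (283333113630, 129199593157),
    "ChaCha20-Poly1305": (74287460990, 46717636807),
}


def ratio(graviton3, graviton2):
    """Graviton3 throughput relative to Graviton2 (hypothetical helper)."""
    return graviton3 / graviton2


def geometric_mean(values):
    """Geometric mean via the mean of logarithms (hypothetical helper)."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


ratios = {name: ratio(*pair) for name, pair in results.items()}

for name, r in ratios.items():
    print(f"{name}: {r:.2f}x")
print(f"Geometric mean: {geometric_mean(ratios.values()):.2f}x")
```

A geometric mean is used here rather than an arithmetic mean because the inputs are ratios. For "Fewer Is Better" results the ratio would need to be inverted before averaging.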

91 Results Shown

Stress-NG:
  NUMA
  CPU Cache
  Matrix Math
  Vector Math
  Matrix 3D Math
  Memory Copying
  Vector Shuffle
  Wide Vector Math
  Fused Multiply-Add
  Vector Floating Point
HeFFTe - Highly Efficient FFT for Exascale
BRL-CAD
HeFFTe - Highly Efficient FFT for Exascale:
  r2c - FFTW - double - 256
  r2c - FFTW - double - 512
  c2c - FFTW - double - 512
Laghos
Graph500
HeFFTe - Highly Efficient FFT for Exascale:
  c2c - FFTW - double - 256
  r2c - FFTW - float - 256
  r2c - FFTW - float - 512
  c2c - FFTW - float - 512
Graph500
HeFFTe - Highly Efficient FFT for Exascale:
  c2c - FFTW - float - 256
  c2c - FFTW - float - 128
Laghos
Graph500
HeFFTe - Highly Efficient FFT for Exascale
Graph500
HeFFTe - Highly Efficient FFT for Exascale
nekRS:
  Kershaw
  TurboPipe Periodic
LeelaChessZero:
  BLAS
  Eigen
GROMACS
LAMMPS Molecular Dynamics Simulator:
  20k Atoms
  Rhodopsin Protein
High Performance Conjugate Gradient:
  144 144 144 - 60
  160 160 160 - 60
NAS Parallel Benchmarks:
  CG.C
  EP.D
  LU.C
  MG.C
  SP.C
Rodinia:
  OpenMP LavaMD
  OpenMP CFD Solver
  OpenMP Streamcluster
ACES DGEMM
Pennant:
  sedovbig
  leblancbig
Algebraic Multi-Grid Benchmark
Kripke
LULESH
NWChem
Monte Carlo Simulations of Ionised Nebulae:
  Gas HII40
  Dust 2D tau100.0
QMCPACK:
  Li2_STO_ae
  simple-H2O
  FeCO6_b3lyp_gms
  FeCO6_b3lyp_gms
Xcompact3d Incompact3d:
  input.i3d 129 Cells Per Direction
  input.i3d 193 Cells Per Direction
Remhos
GPAW
Coremark
Stockfish
7-Zip Compression:
  Compression Rating
  Decompression Rating
Timed Godot Game Engine Compilation
Timed Gem5 Compilation
Timed Node.js Compilation
Liquid-DSP:
  32 - 256 - 32
  32 - 256 - 57
  64 - 256 - 32
  64 - 256 - 57
  32 - 256 - 512
  64 - 256 - 512
srsRAN Project:
  Downlink Processor Benchmark
  PUSCH Processor Benchmark, Throughput Total
  PUSCH Processor Benchmark, Throughput Thread
nginx:
  500
  1000
Apache HTTP Server:
  500
  1000
OpenSSL:
  SHA256
  SHA512
  RSA4096
  RSA4096
  ChaCha20
  AES-128-GCM
  AES-256-GCM
  ChaCha20-Poly1305