AmpereOne benchmarks by Michael Larabel for a future article review on Phoronix.
AmpereOne A192-32X @ 128 Cores Processor: AmpereOne @ 3.20GHz (128 Cores), Motherboard: Supermicro ARS-211M-NR R13SPD v1.02 (T20240726102529 BIOS), Chipset: Ampere Computing LLC Device e208, Memory: 8 x 64GB DDR5-5200MT/s, Disk: 3841GB SAMSUNG MZQL23T8HCLS-00A07 + 960GB SAMSUNG MZ1L2960HCJR-00A07, Graphics: ASPEED, Monitor: VGA HDMI, Network: 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb + 2 x Mellanox MT2892
OS: Ubuntu 24.04, Kernel: 6.8.0-39-generic-64k (aarch64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: always
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-dIwDw0/gcc-13-13.2.0/debian/tmp-nvptx/usr --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto --without-cuda-driver -v
Processor Notes: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
Python Notes: Python 3.12.3
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Not affected + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
AmpereOne A192-32X @ 64 Cores: Changed Processor to AmpereOne @ 3.20GHz (64 Cores).
AmpereOne A192-32X @ 96 Cores: Changed Processor to AmpereOne @ 3.20GHz (96 Cores).
AmpereOne A192-32X @ 160 Cores: Changed Processor to AmpereOne @ 3.20GHz (160 Cores).
AmpereOne A192-32X @ 32 Cores: Changed Processor to AmpereOne @ 3.20GHz (32 Cores).
WRF WRF, the Weather Research and Forecasting Model, is a "next-generation mesoscale numerical weather prediction system designed for both atmospheric research and operational forecasting applications. It features two dynamical cores, a data assimilation system, and a software architecture supporting parallel computation and system extensibility." Learn more via the OpenBenchmarking.org test page.
WRF 4.2.2 - Input: conus 2.5km (Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 21231.06
  AmpereOne A192-32X @ 64 Cores: 12930.95
  AmpereOne A192-32X @ 96 Cores: 10563.09
  AmpereOne A192-32X @ 128 Cores: 9608.64
  AmpereOne A192-32X @ 160 Cores: 9224.59
  1. (F9X) gfortran options: -O2 -ftree-vectorize -funroll-loops -ffree-form -fconvert=big-endian -frecord-marker=4 -fallow-invalid-boz -lesmf_time -lwrfio_nf -lnetcdff -lnetcdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
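The WRF timings above lend themselves to a quick scaling check. The following is a minimal Python sketch, not part of the Phoronix Test Suite: it assumes the 32-core run as the baseline, and the names `wrf_seconds` and `scaling_table` are purely illustrative.

```python
# WRF 4.2.2 (conus 2.5km) run times in seconds, copied from the result above.
wrf_seconds = {32: 21231.06, 64: 12930.95, 96: 10563.09, 128: 9608.64, 160: 9224.59}


def scaling_table(times, baseline_cores=32):
    """Return (cores, speedup, efficiency) tuples relative to the baseline run.

    Speedup is baseline_time / time; efficiency divides that speedup by the
    ideal speedup (cores / baseline_cores).
    """
    base_time = times[baseline_cores]
    rows = []
    for cores in sorted(times):
        speedup = base_time / times[cores]
        ideal = cores / baseline_cores
        rows.append((cores, speedup, speedup / ideal))
    return rows


for cores, speedup, efficiency in scaling_table(wrf_seconds):
    print(f"{cores:>3} cores: {speedup:.2f}x speedup, {efficiency:.0%} of ideal")
```

The same helper works for any "Fewer Is Better" result in this file; for "More Is Better" results the ratio would need to be inverted.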
NWChem NWChem is an open-source high performance computational chemistry package. Per NWChem's documentation, "NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters." Learn more via the OpenBenchmarking.org test page.
NWChem 7.0.2 - Input: C240 Buckyball (Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 4230.1
  AmpereOne A192-32X @ 64 Cores: 2374.0
  AmpereOne A192-32X @ 160 Cores: 2239.7
  AmpereOne A192-32X @ 128 Cores: 2239.1
  AmpereOne A192-32X @ 96 Cores: 2214.2
  1. (F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz -lcomex -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2
High Performance Conjugate Gradient HPCG is the High Performance Conjugate Gradient, a newer scientific benchmark from Sandia National Labs that targets supercomputer testing with modern real-world workloads, in contrast to HPCC. Learn more via the OpenBenchmarking.org test page.
High Performance Conjugate Gradient 3.1 - X Y Z: 144 144 144 - RT: 60 (GFLOP/s, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 11.26 (SE +/- 0.00, N = 3)
  AmpereOne A192-32X @ 64 Cores: 20.72 (SE +/- 0.02, N = 3)
  AmpereOne A192-32X @ 96 Cores: 27.78 (SE +/- 0.02, N = 3)
  AmpereOne A192-32X @ 128 Cores: 31.97 (SE +/- 0.02, N = 3)
  AmpereOne A192-32X @ 160 Cores: 33.63 (SE +/- 0.08, N = 3)
  1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -lmpi_cxx -lmpi
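For readers unfamiliar with the method behind the benchmark's name, here is a minimal sketch of the plain (unpreconditioned) conjugate gradient iteration on a tiny dense system. It is an illustration only: HPCG's own sparse 3D problem, preconditioner, and MPI decomposition are not reproduced, and the function name `conjugate_gradient` and the 2x2 example system are assumptions made for this sketch.

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for a symmetric positive-definite matrix A (list of lists)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)            # residual b - A x, with the initial guess x = 0
    p = list(r)            # first search direction is the residual
    rs_old = sum(v * v for v in r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break          # residual norm is small enough
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rs_old / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        rs_new = sum(v * v for v in r)
        # New direction: residual plus a multiple of the previous direction.
        p = [r[i] + (rs_new / rs_old) * p[i] for i in range(n)]
        rs_old = rs_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
solution = conjugate_gradient(A, b)
print(solution)
```

The dominant cost per iteration is the matrix-vector product, which in a sparse setting is typically bound by memory bandwidth rather than arithmetic.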
Xcompact3d Incompact3d Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equations along with as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.
Xcompact3d Incompact3d 2021-03-11 - Input: X3D-benchmarking input.i3d (Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 549.04 (SE +/- 0.12, N = 3)
  AmpereOne A192-32X @ 64 Cores: 367.41 (SE +/- 0.81, N = 3)
  AmpereOne A192-32X @ 96 Cores: 318.38 (SE +/- 0.37, N = 3)
  AmpereOne A192-32X @ 128 Cores: 307.12 (SE +/- 1.73, N = 3)
  AmpereOne A192-32X @ 160 Cores: 293.60 (SE +/- 0.46, N = 3)
  1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
CloverLeaf CloverLeaf is a Lagrangian-Eulerian hydrodynamics benchmark. This test profile currently makes use of CloverLeaf's OpenMP version. Learn more via the OpenBenchmarking.org test page.
CloverLeaf 1.3 - Input: clover_bm16 (Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 499.16 (SE +/- 0.21, N = 3)
  AmpereOne A192-32X @ 64 Cores: 354.28 (SE +/- 1.09, N = 3)
  AmpereOne A192-32X @ 160 Cores: 339.34 (SE +/- 0.31, N = 3)
  AmpereOne A192-32X @ 96 Cores: 330.75 (SE +/- 0.22, N = 3)
  AmpereOne A192-32X @ 128 Cores: 329.15 (SE +/- 0.28, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
Stockfish This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 1024 CPU threads. Learn more via the OpenBenchmarking.org test page.
Stockfish 16.1 - Chess Benchmark (Nodes Per Second, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 21721077 (SE +/- 111315.22, N = 3)
  AmpereOne A192-32X @ 64 Cores: 43751905 (SE +/- 193507.24, N = 3)
  AmpereOne A192-32X @ 96 Cores: 66379026 (SE +/- 1643187.92, N = 12)
  AmpereOne A192-32X @ 128 Cores: 89911572 (SE +/- 2128472.89, N = 12)
  AmpereOne A192-32X @ 160 Cores: 119077506 (SE +/- 2347181.46, N = 10)
  1. (CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -flto -flto-partition=one -flto=jobserver
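The Stockfish result reports larger standard errors and more runs (N = 10 to 12) at the higher core counts. One way to put those figures on a common footing is to express each SE as a percentage of its mean. The sketch below assumes the "SE +/-" figure is the standard error of the reported mean; the names `stockfish` and `relative_se_percent` are illustrative.

```python
# (mean nodes per second, standard error) per core count, from the Stockfish result above.
stockfish = {
    32: (21721077, 111315.22),
    64: (43751905, 193507.24),
    96: (66379026, 1643187.92),
    128: (89911572, 2128472.89),
    160: (119077506, 2347181.46),
}


def relative_se_percent(results):
    """Map each core count to its standard error as a percentage of the mean."""
    return {cores: 100.0 * se / mean for cores, (mean, se) in results.items()}


rel = relative_se_percent(stockfish)
for cores in sorted(rel):
    print(f"{cores:>3} cores: SE is {rel[cores]:.2f}% of the mean")
```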
Timed Node.js Compilation This test profile times how long it takes to build/compile Node.js itself from source. Node.js is a JavaScript runtime built on the Chrome V8 JavaScript engine and is itself written in C/C++. Learn more via the OpenBenchmarking.org test page.
Timed Node.js Compilation 21.7.2 - Time To Compile (Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 494.12 (SE +/- 0.19, N = 3)
  AmpereOne A192-32X @ 64 Cores: 312.62 (SE +/- 0.38, N = 3)
  AmpereOne A192-32X @ 96 Cores: 254.38 (SE +/- 0.31, N = 3)
  AmpereOne A192-32X @ 128 Cores: 233.62 (SE +/- 0.12, N = 3)
  AmpereOne A192-32X @ 160 Cores: 217.67 (SE +/- 0.26, N = 3)
LAMMPS Molecular Dynamics Simulator LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. Learn more via the OpenBenchmarking.org test page.
LAMMPS Molecular Dynamics Simulator 23Jun2022 - Model: 20k Atoms (ns/day, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 18.52 (SE +/- 0.02, N = 3)
  AmpereOne A192-32X @ 64 Cores: 31.05 (SE +/- 0.04, N = 3)
  AmpereOne A192-32X @ 96 Cores: 39.88 (SE +/- 0.08, N = 3)
  AmpereOne A192-32X @ 128 Cores: 48.09 (SE +/- 0.08, N = 3)
  AmpereOne A192-32X @ 160 Cores: 51.56 (SE +/- 0.06, N = 3)
  1. (CXX) g++ options: -O3 -lm -ldl
Timed LLVM Compilation This test times how long it takes to compile/build the LLVM compiler stack. Learn more via the OpenBenchmarking.org test page.
Timed LLVM Compilation 16.0 - Build System: Ninja (Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 366.01 (SE +/- 0.30, N = 3)
  AmpereOne A192-32X @ 64 Cores: 239.77 (SE +/- 0.07, N = 3)
  AmpereOne A192-32X @ 96 Cores: 209.93 (SE +/- 0.18, N = 3)
  AmpereOne A192-32X @ 128 Cores: 192.20 (SE +/- 0.79, N = 3)
  AmpereOne A192-32X @ 160 Cores: 182.74 (SE +/- 0.28, N = 3)
Timed Gem5 Compilation This test times how long it takes to compile Gem5. Gem5 is a simulator for computer system architecture research. Gem5 is widely used for computer architecture research within the industry, academia, and more. Learn more via the OpenBenchmarking.org test page.
Timed Gem5 Compilation 23.0.1 - Time To Compile (Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 311.65 (SE +/- 0.29, N = 3)
  AmpereOne A192-32X @ 64 Cores: 235.77 (SE +/- 0.34, N = 3)
  AmpereOne A192-32X @ 96 Cores: 222.56 (SE +/- 0.49, N = 3)
  AmpereOne A192-32X @ 128 Cores: 210.57 (SE +/- 0.87, N = 3)
  AmpereOne A192-32X @ 160 Cores: 202.67 (SE +/- 0.45, N = 3)
OpenFOAM OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.
OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Execution Time (Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 995.62
  AmpereOne A192-32X @ 64 Cores: 563.93
  AmpereOne A192-32X @ 96 Cores: 442.27
  AmpereOne A192-32X @ 128 Cores: 376.84
  AmpereOne A192-32X @ 160 Cores: 335.45
  1. (CXX) g++ options: -std=c++14 -O3 -mcpu=native -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
OpenFOAM 10 - Input: drivaerFastback, Medium Mesh Size - Mesh Time (Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 199.91
  AmpereOne A192-32X @ 64 Cores: 160.16
  AmpereOne A192-32X @ 96 Cores: 147.85
  AmpereOne A192-32X @ 160 Cores: 146.86
  AmpereOne A192-32X @ 128 Cores: 143.93
  1. (CXX) g++ options: -std=c++14 -O3 -mcpu=native -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
PostgreSQL This is a benchmark of PostgreSQL using the integrated pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.
PostgreSQL 16 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 0.839 (SE +/- 0.002, N = 3)
  AmpereOne A192-32X @ 64 Cores: 0.388 (SE +/- 0.003, N = 3)
  AmpereOne A192-32X @ 128 Cores: 0.369 (SE +/- 0.005, N = 3)
  AmpereOne A192-32X @ 96 Cores: 0.365 (SE +/- 0.003, N = 3)
  AmpereOne A192-32X @ 160 Cores: 0.361 (SE +/- 0.007, N = 12)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL 16 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only (TPS, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 1191340 (SE +/- 3213.57, N = 3)
  AmpereOne A192-32X @ 64 Cores: 2575090 (SE +/- 22685.45, N = 3)
  AmpereOne A192-32X @ 128 Cores: 2713588 (SE +/- 38685.91, N = 3)
  AmpereOne A192-32X @ 96 Cores: 2742242 (SE +/- 23815.94, N = 3)
  AmpereOne A192-32X @ 160 Cores: 2778784 (SE +/- 51808.62, N = 12)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
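The two PostgreSQL results are closely linked: with a fixed number of concurrent clients, average latency is roughly the client count divided by throughput (Little's law). The sketch below checks that relationship against the reported figures. It is an assumption-based illustration, not a description of how pgbench computes its output, and the names used are hypothetical.

```python
CLIENTS = 1000

# Transactions per second and reported average latency (ms) per core count,
# copied from the two PostgreSQL results above.
tps = {32: 1191340, 64: 2575090, 96: 2742242, 128: 2713588, 160: 2778784}
reported_latency_ms = {32: 0.839, 64: 0.388, 96: 0.365, 128: 0.369, 160: 0.361}


def derived_latency_ms(clients, transactions_per_second):
    """Average latency in milliseconds implied by Little's law."""
    return 1000.0 * clients / transactions_per_second


for cores in sorted(tps):
    derived = derived_latency_ms(CLIENTS, tps[cores])
    print(f"{cores:>3} cores: derived {derived:.3f} ms, reported {reported_latency_ms[cores]:.3f} ms")
```

The small gap at 160 cores is plausible because each figure is averaged separately over 12 runs, and the mean of a reciprocal is not the reciprocal of the mean.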
Xmrig Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.
Xmrig 6.21 - Variant: GhostRider - Hash Count: 1M (H/s, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 3072.9 (SE +/- 9.24, N = 3)
  AmpereOne A192-32X @ 64 Cores: 6065.1 (SE +/- 83.23, N = 3)
  AmpereOne A192-32X @ 96 Cores: 9542.9 (SE +/- 63.96, N = 3)
  AmpereOne A192-32X @ 128 Cores: 12738.0 (SE +/- 96.21, N = 15)
  AmpereOne A192-32X @ 160 Cores: 15534.2 (SE +/- 93.26, N = 3)
  1. (CXX) g++ options: -fexceptions -fno-rtti -O3 -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
QMCPACK QMCPACK is a modern high-performance open-source Quantum Monte Carlo (QMC) simulation code making use of MPI for this benchmark of the H2O example code. QMCPACK is an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids. QMCPACK is supported by the U.S. Department of Energy. Learn more via the OpenBenchmarking.org test page.
QMCPACK 3.17.1 - Input: Li2_STO_ae (Total Execution Time - Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 164.23 (SE +/- 0.15, N = 3)
  AmpereOne A192-32X @ 64 Cores: 127.00 (SE +/- 0.23, N = 3)
  AmpereOne A192-32X @ 96 Cores: 114.96 (SE +/- 0.17, N = 3)
  AmpereOne A192-32X @ 128 Cores: 109.88 (SE +/- 0.64, N = 3)
  AmpereOne A192-32X @ 160 Cores: 106.32 (SE +/- 0.37, N = 3)
  1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -mcpu=native -O3 -lm -ldl
NAS Parallel Benchmarks NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.
NAS Parallel Benchmarks 3.4 - Test / Class: SP.C (Total Mop/s, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 5519.45 (SE +/- 7.19, N = 3)
  AmpereOne A192-32X @ 64 Cores: 13715.04 (SE +/- 8.73, N = 3)
  AmpereOne A192-32X @ 96 Cores: 17166.69 (SE +/- 12.08, N = 3)
  AmpereOne A192-32X @ 128 Cores: 25255.32 (SE +/- 14.45, N = 3)
  AmpereOne A192-32X @ 160 Cores: 28844.11 (SE +/- 18.61, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
ASTC Encoder ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.
ASTC Encoder 4.7 - Preset: Very Thorough (MT/s, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 1.2361 (SE +/- 0.0002, N = 3)
  AmpereOne A192-32X @ 64 Cores: 2.4698 (SE +/- 0.0002, N = 3)
  AmpereOne A192-32X @ 96 Cores: 3.7003 (SE +/- 0.0003, N = 3)
  AmpereOne A192-32X @ 128 Cores: 4.9295 (SE +/- 0.0008, N = 3)
  AmpereOne A192-32X @ 160 Cores: 6.1547 (SE +/- 0.0006, N = 3)
  1. (CXX) g++ options: -O3 -flto -pthread
ASTC Encoder 4.7 - Preset: Exhaustive (MT/s, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 0.7696 (SE +/- 0.0002, N = 3)
  AmpereOne A192-32X @ 64 Cores: 1.5378 (SE +/- 0.0003, N = 3)
  AmpereOne A192-32X @ 96 Cores: 2.3044 (SE +/- 0.0002, N = 3)
  AmpereOne A192-32X @ 128 Cores: 3.0713 (SE +/- 0.0004, N = 3)
  AmpereOne A192-32X @ 160 Cores: 3.8353 (SE +/- 0.0003, N = 3)
  1. (CXX) g++ options: -O3 -flto -pthread
PyTorch This is a benchmark of PyTorch making use of pytorch-benchmark [https://github.com/LukasHedegaard/pytorch-benchmark]. Learn more via the OpenBenchmarking.org test page.
PyTorch 2.2.1 - Device: CPU - Batch Size: 512 - Model: ResNet-50 (batches/sec, More Is Better)
  AmpereOne A192-32X @ 160 Cores: 22.17 (SE +/- 0.08, N = 3; MIN: 20.09 / MAX: 22.55)
  AmpereOne A192-32X @ 32 Cores: 22.32 (SE +/- 0.09, N = 3; MIN: 21.91 / MAX: 22.77)
  AmpereOne A192-32X @ 128 Cores: 24.06 (SE +/- 0.22, N = 3; MIN: 20.95 / MAX: 24.81)
  AmpereOne A192-32X @ 64 Cores: 25.57 (SE +/- 0.09, N = 3; MIN: 19.76 / MAX: 26.07)
  AmpereOne A192-32X @ 96 Cores: 26.22 (SE +/- 0.21, N = 3; MIN: 24.56 / MAX: 26.93)
GROMACS This is a test of the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package using the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
GROMACS 2024 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 1.468 (SE +/- 0.016, N = 5)
  AmpereOne A192-32X @ 64 Cores: 2.851 (SE +/- 0.001, N = 3)
  AmpereOne A192-32X @ 96 Cores: 4.079 (SE +/- 0.002, N = 3)
  AmpereOne A192-32X @ 128 Cores: 5.206 (SE +/- 0.001, N = 3)
  AmpereOne A192-32X @ 160 Cores: 6.256 (SE +/- 0.003, N = 3)
  1. (CXX) g++ options: -O3 -lm
Helsing Helsing is an open-source POSIX vampire number generator. This test profile measures the time it takes to generate vampire numbers between varying numbers of digits. Learn more via the OpenBenchmarking.org test page.
Helsing 1.0-beta - Digit Range: 14 digit (Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 174.21 (SE +/- 0.01, N = 3)
  AmpereOne A192-32X @ 64 Cores: 87.01 (SE +/- 0.09, N = 3)
  AmpereOne A192-32X @ 96 Cores: 58.08 (SE +/- 0.10, N = 3)
  AmpereOne A192-32X @ 128 Cores: 43.40 (SE +/- 0.03, N = 3)
  AmpereOne A192-32X @ 160 Cores: 35.18 (SE +/- 0.04, N = 3)
  1. (CC) gcc options: -O2 -pthread
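For context on what Helsing searches for: a vampire number has an even number of digits and can be written as a product of two "fangs", each with half as many digits, that together use exactly the digits of the number and do not both end in zero. The brute-force sketch below illustrates that definition for small inputs. It is not Helsing's algorithm, and the function name `fangs` is an assumption of this example.

```python
def fangs(n):
    """Return all fang pairs (a, b) with a <= b for n; empty if n is not a vampire number."""
    digits = str(n)
    if len(digits) % 2:
        return []                      # an odd digit count can never qualify
    half = len(digits) // 2
    lo, hi = 10 ** (half - 1), 10 ** half
    pairs = []
    for a in range(lo, hi):
        if a * a > n:
            break                      # beyond this point a would exceed b
        if n % a:
            continue
        b = n // a
        if not lo <= b < hi:
            continue                   # both fangs need exactly `half` digits
        if a % 10 == 0 and b % 10 == 0:
            continue                   # fangs may not both end in zero
        if sorted(str(a) + str(b)) == sorted(digits):
            pairs.append((a, b))
    return pairs


four_digit = [n for n in range(1000, 10000) if fangs(n)]
print(four_digit)
print(fangs(1260))
```

Each candidate number can be tested independently of the others, so a search of this kind parallelizes naturally across cores.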
GPAW GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). Learn more via the OpenBenchmarking.org test page.
GPAW 23.6 - Input: Carbon Nanotube (Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 117.88 (SE +/- 0.05, N = 3)
  AmpereOne A192-32X @ 64 Cores: 70.91 (SE +/- 0.05, N = 3)
  AmpereOne A192-32X @ 96 Cores: 53.10 (SE +/- 0.01, N = 3)
  AmpereOne A192-32X @ 128 Cores: 44.98 (SE +/- 0.01, N = 3)
  AmpereOne A192-32X @ 160 Cores: 41.88 (SE +/- 0.05, N = 3)
  1. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi
ASKAP ASKAP is a set of benchmarks from the Australian SKA Pathfinder. The principal ASKAP benchmarks are the Hogbom Clean Benchmark (tHogbomClean) and Convolutional Resampling Benchmark (tConvolve); some previous ASKAP benchmarks are also included for OpenCL and CUDA execution of tConvolve. Learn more via the OpenBenchmarking.org test page.
ASKAP 1.0 - Test: tConvolve MPI - Gridding (Mpix/sec, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 8920.11 (SE +/- 25.34, N = 3)
  AmpereOne A192-32X @ 64 Cores: 13258.70 (SE +/- 73.63, N = 3)
  AmpereOne A192-32X @ 96 Cores: 16514.50 (SE +/- 28.93, N = 3)
  AmpereOne A192-32X @ 160 Cores: 18121.70 (SE +/- 203.80, N = 3)
  AmpereOne A192-32X @ 128 Cores: 18386.90 (SE +/- 26.80, N = 3)
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
ASKAP 1.0 - Test: tConvolve MPI - Degridding (Mpix/sec, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 7012.87 (SE +/- 15.65, N = 3)
  AmpereOne A192-32X @ 64 Cores: 11286.00 (SE +/- 35.02, N = 3)
  AmpereOne A192-32X @ 96 Cores: 14696.00 (SE +/- 192.84, N = 3)
  AmpereOne A192-32X @ 128 Cores: 18018.80 (SE +/- 44.66, N = 3)
  AmpereOne A192-32X @ 160 Cores: 20081.70 (SE +/- 67.66, N = 3)
  1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp
Speedb Speedb is a next-generation key-value storage engine that is RocksDB compatible and aims for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.
Speedb 2.7 - Test: Random Read (Op/s, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 126847842 (SE +/- 17250.81, N = 3)
  AmpereOne A192-32X @ 64 Cores: 249704806 (SE +/- 1848285.25, N = 3)
  AmpereOne A192-32X @ 96 Cores: 380234305 (SE +/- 119433.65, N = 3)
  AmpereOne A192-32X @ 128 Cores: 507348157 (SE +/- 114538.19, N = 3)
  AmpereOne A192-32X @ 160 Cores: 633736504 (SE +/- 430545.53, N = 3)
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample high resolution (currently 15400 x 6940) JPEG image. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.43 - Operation: Sharpen (Iterations Per Minute, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 91 (SE +/- 0.00, N = 3)
  AmpereOne A192-32X @ 64 Cores: 176 (SE +/- 0.33, N = 3)
  AmpereOne A192-32X @ 96 Cores: 249 (SE +/- 3.18, N = 3)
  AmpereOne A192-32X @ 128 Cores: 308 (SE +/- 2.60, N = 3)
  AmpereOne A192-32X @ 160 Cores: 387 (SE +/- 2.60, N = 3)
  1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp
GraphicsMagick 1.3.43 - Operation: Noise-Gaussian (Iterations Per Minute, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 85 (SE +/- 0.00, N = 3)
  AmpereOne A192-32X @ 64 Cores: 142 (SE +/- 0.33, N = 3)
  AmpereOne A192-32X @ 96 Cores: 178 (SE +/- 0.88, N = 3)
  AmpereOne A192-32X @ 128 Cores: 203 (SE +/- 0.67, N = 3)
  AmpereOne A192-32X @ 160 Cores: 226 (SE +/- 0.58, N = 3)
  1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp
GraphicsMagick 1.3.43 - Operation: Enhanced (Iterations Per Minute, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 66 (SE +/- 0.00, N = 3)
  AmpereOne A192-32X @ 64 Cores: 131 (SE +/- 0.00, N = 3)
  AmpereOne A192-32X @ 96 Cores: 194 (SE +/- 0.33, N = 3)
  AmpereOne A192-32X @ 128 Cores: 253 (SE +/- 0.33, N = 3)
  AmpereOne A192-32X @ 160 Cores: 302 (SE +/- 1.20, N = 3)
  1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp
GraphicsMagick 1.3.43 - Operation: Swirl (Iterations Per Minute, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 179 (SE +/- 0.00, N = 3)
  AmpereOne A192-32X @ 64 Cores: 348 (SE +/- 0.00, N = 3)
  AmpereOne A192-32X @ 96 Cores: 507 (SE +/- 0.33, N = 3)
  AmpereOne A192-32X @ 128 Cores: 662 (SE +/- 1.76, N = 3)
  AmpereOne A192-32X @ 160 Cores: 807 (SE +/- 2.40, N = 3)
  1. (CC) gcc options: -fopenmp -O2 -ltiff -ljbig -lsharpyuv -lwebp -lwebpmux -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -lxml2 -lzstd -llzma -lbz2 -lz -lm -lpthread -lgomp
RocksDB This is a benchmark of Meta/Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.
RocksDB 9.0 - Test: Read While Writing (Op/s, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 3436049 (SE +/- 19036.70, N = 3)
  AmpereOne A192-32X @ 64 Cores: 6655730 (SE +/- 91773.39, N = 3)
  AmpereOne A192-32X @ 96 Cores: 8068069 (SE +/- 38617.89, N = 3)
  AmpereOne A192-32X @ 128 Cores: 8460095 (SE +/- 119192.66, N = 3)
  AmpereOne A192-32X @ 160 Cores: 9156334 (SE +/- 75896.27, N = 3)
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
RocksDB 9.0 - Test: Random Read (Op/s, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 124567637 (SE +/- 13970.84, N = 3)
  AmpereOne A192-32X @ 64 Cores: 248932877 (SE +/- 84410.44, N = 3)
  AmpereOne A192-32X @ 96 Cores: 372309429 (SE +/- 915048.87, N = 3)
  AmpereOne A192-32X @ 128 Cores: 498075884 (SE +/- 156369.77, N = 3)
  AmpereOne A192-32X @ 160 Cores: 621920651 (SE +/- 69890.71, N = 3)
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
NAS Parallel Benchmarks NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.
NAS Parallel Benchmarks 3.4 - Test / Class: EP.D (Total Mop/s, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 1286.86 (SE +/- 3.57, N = 3)
  AmpereOne A192-32X @ 64 Cores: 2542.48 (SE +/- 12.37, N = 3)
  AmpereOne A192-32X @ 96 Cores: 3791.79 (SE +/- 4.87, N = 3)
  AmpereOne A192-32X @ 128 Cores: 5087.14 (SE +/- 24.16, N = 3)
  AmpereOne A192-32X @ 160 Cores: 6325.04 (SE +/- 2.29, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
QuantLib QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost, and the built-in benchmark used here reports the QuantLib Benchmark Index score. Learn more via the OpenBenchmarking.org test page.
QuantLib 1.32 - Configuration: Multi-Threaded (MFLOPS, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 50295.1 (SE +/- 17.09, N = 3)
  AmpereOne A192-32X @ 64 Cores: 100367.9 (SE +/- 1.35, N = 3)
  AmpereOne A192-32X @ 96 Cores: 150449.5 (SE +/- 43.40, N = 3)
  AmpereOne A192-32X @ 128 Cores: 200797.8 (SE +/- 57.74, N = 3)
  AmpereOne A192-32X @ 160 Cores: 250463.6 (SE +/- 30.07, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fPIE -pie
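For a throughput result such as QuantLib's, dividing each score by the active core count gives a quick view of how evenly the work scales. The sketch below does that for the figures above; the names `quantlib_mflops` and `per_core` are illustrative and not part of QuantLib or the test profile.

```python
# QuantLib 1.32 multi-threaded scores in MFLOPS, copied from the result above.
quantlib_mflops = {32: 50295.1, 64: 100367.9, 96: 150449.5, 128: 200797.8, 160: 250463.6}


def per_core(results):
    """Return the score divided by the number of active cores for each configuration."""
    return {cores: value / cores for cores, value in results.items()}


per_core_mflops = per_core(quantlib_mflops)
for cores in sorted(per_core_mflops):
    print(f"{cores:>3} cores: {per_core_mflops[cores]:.1f} MFLOPS per core")

# Ratio of the best to the worst per-core figure; close to 1.0 means near-linear scaling.
spread = max(per_core_mflops.values()) / min(per_core_mflops.values())
print(f"best/worst per-core ratio: {spread:.4f}")
```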
srsRAN Project srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.
srsRAN Project 23.10.1-20240325 - Test: PUSCH Processor Benchmark, Throughput Total (Mbps, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 551.8 (SE +/- 0.00, N = 3; MIN: 328.7)
  AmpereOne A192-32X @ 64 Cores: 994.3 (SE +/- 0.06, N = 3; MIN: 582.2 / MAX: 994.4)
  AmpereOne A192-32X @ 96 Cores: 1470.3 (SE +/- 0.00, N = 3; MIN: 848.9)
  AmpereOne A192-32X @ 128 Cores: 1883.4 (SE +/- 0.10, N = 3; MIN: 1131.8 / MAX: 1883.5)
  AmpereOne A192-32X @ 160 Cores: 2354.0 (SE +/- 0.09, N = 3; MIN: 1414.7 / MAX: 2354.1)
  1. (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
Coremark This is a test of EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 771783.50 (SE +/- 135.81, N = 3)
  AmpereOne A192-32X @ 64 Cores: 1520096.93 (SE +/- 17648.10, N = 4)
  AmpereOne A192-32X @ 96 Cores: 2243103.32 (SE +/- 24157.99, N = 3)
  AmpereOne A192-32X @ 128 Cores: 2717686.05 (SE +/- 12021.71, N = 3)
  AmpereOne A192-32X @ 160 Cores: 3069687.78 (SE +/- 29681.14, N = 3)
  1. (CC) gcc options: -O2 -lrt" -lrt
CloverLeaf CloverLeaf is a Lagrangian-Eulerian hydrodynamics benchmark. This test profile currently makes use of CloverLeaf's OpenMP version. Learn more via the OpenBenchmarking.org test page.
CloverLeaf 1.3 - Input: clover_bm64_short (Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 54.69 (SE +/- 0.09, N = 3)
  AmpereOne A192-32X @ 64 Cores: 40.18 (SE +/- 0.06, N = 3)
  AmpereOne A192-32X @ 96 Cores: 38.49 (SE +/- 0.09, N = 3)
  AmpereOne A192-32X @ 160 Cores: 38.23 (SE +/- 0.21, N = 3)
  AmpereOne A192-32X @ 128 Cores: 37.85 (SE +/- 0.14, N = 3)
  1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
Primesieve Primesieve generates prime numbers using a highly optimized sieve of Eratosthenes implementation. Primesieve primarily benchmarks the CPU's L1/L2 cache performance. Learn more via the OpenBenchmarking.org test page.
Primesieve 12.1 - Length: 1e13 (Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 75.36 (SE +/- 0.02, N = 3)
  AmpereOne A192-32X @ 64 Cores: 38.00 (SE +/- 0.09, N = 3)
  AmpereOne A192-32X @ 96 Cores: 25.56 (SE +/- 0.02, N = 3)
  AmpereOne A192-32X @ 128 Cores: 19.49 (SE +/- 0.03, N = 3)
  AmpereOne A192-32X @ 160 Cores: 15.97 (SE +/- 0.05, N = 4)
  1. (CXX) g++ options: -O3
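As a reminder of the underlying technique, here is a textbook sieve of Eratosthenes in Python. It is a deliberately simple sketch and makes no attempt to mirror Primesieve's optimized, cache-aware implementation; the function name `primes_up_to` is an assumption of this example.

```python
def primes_up_to(limit):
    """Return all primes <= limit using a basic sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)   # one flag byte per integer 0..limit
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross off multiples of p starting at p*p; smaller multiples
            # were already removed by smaller primes.
            count = len(range(p * p, limit + 1, p))
            is_prime[p * p :: p] = bytearray(count)
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


print(primes_up_to(30))
print(len(primes_up_to(100)))
```

The flag array here grows linearly with the limit, so this form is impractical for a range like 1e13; sieving in fixed-size segments is the usual way around that.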
Xcompact3d Incompact3d Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equations along with as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.
Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better)
  AmpereOne A192-32X @ 32 Cores: 22.70747760 (SE +/- 0.22492051, N = 3)
  AmpereOne A192-32X @ 64 Cores: 13.75409600 (SE +/- 0.00881027, N = 4)
  AmpereOne A192-32X @ 96 Cores: 11.17631460 (SE +/- 0.00898947, N = 4)
  AmpereOne A192-32X @ 128 Cores: 10.20729636 (SE +/- 0.22363750, N = 12)
  AmpereOne A192-32X @ 160 Cores: 9.41105728 (SE +/- 0.25501410, N = 15)
  1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Algebraic Multi-Grid Benchmark AMG is a parallel algebraic multigrid solver for linear systems arising from problems on unstructured grids. The driver provided with AMG builds linear systems for various 3-dimensional problems. Learn more via the OpenBenchmarking.org test page.
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 774298500 (SE +/- 568382.82, N = 3)
  AmpereOne A192-32X @ 64 Cores: 1354105667 (SE +/- 1045977.11, N = 3)
  AmpereOne A192-32X @ 96 Cores: 1692750000 (SE +/- 739129.89, N = 3)
  AmpereOne A192-32X @ 128 Cores: 1791333000 (SE +/- 541472.38, N = 3)
  AmpereOne A192-32X @ 160 Cores: 1832053667 (SE +/- 626107.91, N = 3)
  1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -lmpi
ASTC Encoder ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.
ASTC Encoder 4.7 - Preset: Thorough (MT/s, More Is Better)
  AmpereOne A192-32X @ 32 Cores: 8.5766 (SE +/- 0.0000, N = 3)
  AmpereOne A192-32X @ 64 Cores: 17.0717 (SE +/- 0.0020, N = 3)
  AmpereOne A192-32X @ 96 Cores: 25.4784 (SE +/- 0.0007, N = 3)
  AmpereOne A192-32X @ 128 Cores: 33.7871 (SE +/- 0.0055, N = 3)
  AmpereOne A192-32X @ 160 Cores: 41.9686 (SE +/- 0.0029, N = 4)
  1. (CXX) g++ options: -O3 -flto -pthread
OpenFOAM OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.
OpenFOAM 10, Input: drivaerFastback, Small Mesh Size - Execution Time (Seconds, Fewer Is Better)
AmpereOne A192-32X @ 32 Cores: 104.62
AmpereOne A192-32X @ 64 Cores: 57.37
AmpereOne A192-32X @ 96 Cores: 42.96
AmpereOne A192-32X @ 128 Cores: 37.73
AmpereOne A192-32X @ 160 Cores: 32.82
(CXX) g++ options: -std=c++14 -O3 -mcpu=native -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
OpenFOAM 10, Input: drivaerFastback, Small Mesh Size - Mesh Time (Seconds, Fewer Is Better)
AmpereOne A192-32X @ 32 Cores: 33.48
AmpereOne A192-32X @ 64 Cores: 29.47
AmpereOne A192-32X @ 96 Cores: 28.29
AmpereOne A192-32X @ 160 Cores: 27.81
AmpereOne A192-32X @ 128 Cores: 27.26
(CXX) g++ options: -std=c++14 -O3 -mcpu=native -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfiniteVolume -lmeshTools -lparallel -llagrangian -lregionModels -lgenericPatchFields -lOpenFOAM -ldl -lm
NAS Parallel Benchmarks NPB, NAS Parallel Benchmarks, is a benchmark developed by NASA for high-end computer systems. This test profile currently uses the MPI version of NPB. This test profile offers selecting the different NPB tests/problems and varying problem sizes. Learn more via the OpenBenchmarking.org test page.
NAS Parallel Benchmarks 3.4, Test / Class: IS.D (Total Mop/s, More Is Better)
AmpereOne A192-32X @ 32 Cores: 1102.12 (SE +/- 8.53, N = 3)
AmpereOne A192-32X @ 96 Cores: 1492.18 (SE +/- 18.69, N = 3)
AmpereOne A192-32X @ 64 Cores: 1500.88 (SE +/- 18.24, N = 4)
AmpereOne A192-32X @ 160 Cores: 2420.50 (SE +/- 13.09, N = 3)
AmpereOne A192-32X @ 128 Cores: 2420.96 (SE +/- 4.94, N = 3)
(F9X) gfortran options: -O3 -march=native -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 1.6, Threads: 128 - Buffer Length: 256 - Filter Length: 32 (samples/s, More Is Better)
AmpereOne A192-32X @ 32 Cores: 754193333 (SE +/- 218581.28, N = 3)
AmpereOne A192-32X @ 64 Cores: 1495033333 (SE +/- 317979.73, N = 3)
AmpereOne A192-32X @ 96 Cores: 2238066667 (SE +/- 1156623.44, N = 3)
AmpereOne A192-32X @ 128 Cores: 2982133333 (SE +/- 688799.28, N = 3)
AmpereOne A192-32X @ 160 Cores: 2983933333 (SE +/- 176383.42, N = 3)
(CC) gcc options: -O3 -pthread -lm -lc -lliquid
John The Ripper This is a benchmark of John The Ripper, which is a password cracker. Learn more via the OpenBenchmarking.org test page.
John The Ripper 2023.03.14, Test: Blowfish (Real C/S, More Is Better)
AmpereOne A192-32X @ 32 Cores: 30472 (SE +/- 14.11, N = 3)
AmpereOne A192-32X @ 64 Cores: 60882 (SE +/- 15.84, N = 3)
AmpereOne A192-32X @ 96 Cores: 91071 (SE +/- 92.18, N = 3)
AmpereOne A192-32X @ 128 Cores: 121100 (SE +/- 257.46, N = 3)
AmpereOne A192-32X @ 160 Cores: 150613 (SE +/- 347.77, N = 3)
(CC) gcc options: -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt -lbz2
John The Ripper 2023.03.14, Test: bcrypt (Real C/S, More Is Better)
AmpereOne A192-32X @ 32 Cores: 30487 (SE +/- 2.00, N = 3)
AmpereOne A192-32X @ 64 Cores: 60787 (SE +/- 128.00, N = 3)
AmpereOne A192-32X @ 96 Cores: 91068 (SE +/- 87.11, N = 3)
AmpereOne A192-32X @ 128 Cores: 120886 (SE +/- 376.38, N = 3)
AmpereOne A192-32X @ 160 Cores: 149891 (SE +/- 778.55, N = 3)
(CC) gcc options: -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt -lbz2
7-Zip Compression This is a test of 7-Zip compression/decompression with its integrated benchmark feature. Learn more via the OpenBenchmarking.org test page.
7-Zip Compression 22.01, Test: Decompression Rating (MIPS, More Is Better)
AmpereOne A192-32X @ 32 Cores: 154754 (SE +/- 699.19, N = 3)
AmpereOne A192-32X @ 64 Cores: 305644 (SE +/- 1384.56, N = 3)
AmpereOne A192-32X @ 96 Cores: 458613 (SE +/- 191.34, N = 3)
AmpereOne A192-32X @ 128 Cores: 601553 (SE +/- 5325.21, N = 3)
AmpereOne A192-32X @ 160 Cores: 748883 (SE +/- 3252.04, N = 3)
(CXX) g++ options: -lpthread -ldl -O2 -fPIC
7-Zip Compression 22.01, Test: Compression Rating (MIPS, More Is Better)
AmpereOne A192-32X @ 32 Cores: 189272 (SE +/- 521.69, N = 3)
AmpereOne A192-32X @ 64 Cores: 344626 (SE +/- 3457.21, N = 3)
AmpereOne A192-32X @ 96 Cores: 481827 (SE +/- 4489.83, N = 3)
AmpereOne A192-32X @ 128 Cores: 604723 (SE +/- 2815.31, N = 3)
AmpereOne A192-32X @ 160 Cores: 698343 (SE +/- 3030.31, N = 3)
(CXX) g++ options: -lpthread -ldl -O2 -fPIC
Pennant Pennant is an application focused on hydrodynamics on general unstructured meshes in 2D. Learn more via the OpenBenchmarking.org test page.
Pennant 1.0.1, Test: sedovbig (Hydro Cycle Time - Seconds, Fewer Is Better)
AmpereOne A192-32X @ 32 Cores: 25.908500 (SE +/- 0.031676, N = 3)
AmpereOne A192-32X @ 64 Cores: 13.161580 (SE +/- 0.059346, N = 4)
AmpereOne A192-32X @ 96 Cores: 8.135246 (SE +/- 0.029025, N = 5)
AmpereOne A192-32X @ 128 Cores: 6.060620 (SE +/- 0.034211, N = 6)
AmpereOne A192-32X @ 160 Cores: 4.984562 (SE +/- 0.095285, N = 15)
(CXX) g++ options: -fopenmp -lmpi_cxx -lmpi
Pennant 1.0.1, Test: leblancbig (Hydro Cycle Time - Seconds, Fewer Is Better)
AmpereOne A192-32X @ 32 Cores: 19.546460 (SE +/- 0.002751, N = 3)
AmpereOne A192-32X @ 64 Cores: 9.334206 (SE +/- 0.018943, N = 5)
AmpereOne A192-32X @ 96 Cores: 5.885511 (SE +/- 0.038839, N = 6)
AmpereOne A192-32X @ 128 Cores: 4.780215 (SE +/- 0.149446, N = 12)
AmpereOne A192-32X @ 160 Cores: 3.794667 (SE +/- 0.062223, N = 15)
(CXX) g++ options: -fopenmp -lmpi_cxx -lmpi
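For fewer-is-better results such as the Pennant timings above, speedup is the baseline time divided by the measured time. The sketch below is an illustration only, using the sedovbig figures; the names `pennant_sedovbig_seconds` and `speedup_and_efficiency` are hypothetical, and treating the 32-core run as the baseline is an assumption.

```python
# Hydro cycle time in seconds per active core count, copied from the
# Pennant 1.0.1 sedovbig result above.
pennant_sedovbig_seconds = {
    32: 25.908500,
    64: 13.161580,
    96: 8.135246,
    128: 6.060620,
    160: 4.984562,
}


def speedup_and_efficiency(times, baseline_cores=32):
    """Return {cores: (speedup, efficiency)} for a fewer-is-better metric.

    speedup    = baseline time / measured time
    efficiency = speedup / (cores / baseline cores)
    An efficiency above 1.0 means the run scaled better than linearly
    relative to the baseline.
    """
    base = times[baseline_cores]
    out = {}
    for cores, seconds in times.items():
        speedup = base / seconds
        out[cores] = (speedup, speedup / (cores / baseline_cores))
    return out


if __name__ == "__main__":
    for cores, (s, e) in sorted(speedup_and_efficiency(pennant_sedovbig_seconds).items()):
        print(f"{cores:>3} cores: speedup {s:.2f}x, efficiency {e:.2f}")
```

With these figures the computed efficiency is slightly above 1.0 from 96 cores upward, so the 32-core run may not be a clean single-unit baseline for this test.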
miniFE MiniFE Finite Element is an application for unstructured implicit finite element codes. Learn more via the OpenBenchmarking.org test page.
miniFE 2.2, Problem Size: Small (CG Mflops, More Is Better)
AmpereOne A192-32X @ 32 Cores: 16927.6 (SE +/- 6.97, N = 3)
AmpereOne A192-32X @ 64 Cores: 29130.1 (SE +/- 89.27, N = 4)
AmpereOne A192-32X @ 96 Cores: 36739.6 (SE +/- 106.36, N = 4)
AmpereOne A192-32X @ 128 Cores: 39605.8 (SE +/- 60.85, N = 4)
AmpereOne A192-32X @ 160 Cores: 40530.1 (SE +/- 162.64, N = 4)
(CXX) g++ options: -O3 -fopenmp -lmpi_cxx -lmpi
srsRAN Project srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.
srsRAN Project 23.10.1-20240325, Test: PDSCH Processor Benchmark, Throughput Total (Mbps, More Is Better)
AmpereOne A192-32X @ 32 Cores: 5962.0 (SE +/- 28.69, N = 4)
AmpereOne A192-32X @ 64 Cores: 11144.9 (SE +/- 47.88, N = 3)
AmpereOne A192-32X @ 96 Cores: 15292.2 (SE +/- 27.16, N = 3)
AmpereOne A192-32X @ 128 Cores: 18948.2 (SE +/- 45.75, N = 3)
AmpereOne A192-32X @ 160 Cores: 21927.6 (SE +/- 93.65, N = 3)
(CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
m-queens A solver for the N-queens problem with multi-threading support via the OpenMP library. Learn more via the OpenBenchmarking.org test page.
m-queens 1.2, Time To Solve (Seconds, Fewer Is Better)
AmpereOne A192-32X @ 32 Cores: 30.963 (SE +/- 0.013, N = 3)
AmpereOne A192-32X @ 64 Cores: 15.562 (SE +/- 0.009, N = 4)
AmpereOne A192-32X @ 96 Cores: 10.437 (SE +/- 0.009, N = 5)
AmpereOne A192-32X @ 128 Cores: 7.913 (SE +/- 0.019, N = 6)
AmpereOne A192-32X @ 160 Cores: 6.371 (SE +/- 0.018, N = 6)
(CXX) g++ options: -fopenmp -O2 -march=native
LULESH LULESH is the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. Learn more via the OpenBenchmarking.org test page.
LULESH 2.0.3 (z/s, More Is Better)
AmpereOne A192-32X @ 32 Cores: 18959.66 (SE +/- 202.33, N = 5)
AmpereOne A192-32X @ 64 Cores: 33097.41 (SE +/- 164.49, N = 4)
AmpereOne A192-32X @ 96 Cores: 33783.38 (SE +/- 252.99, N = 4)
AmpereOne A192-32X @ 128 Cores: 41666.59 (SE +/- 158.62, N = 3)
AmpereOne A192-32X @ 160 Cores: 41723.92 (SE +/- 83.25, N = 3)
(CXX) g++ options: -O3 -fopenmp -lm -lmpi_cxx -lmpi
Timed Mesa Compilation This test profile times how long it takes to compile Mesa with Meson/Ninja. To minimize build dependencies and avoid versioning conflicts, this test builds only the core of Mesa, without LLVM or the extra Gallium3D/Mesa drivers enabled. Learn more via the OpenBenchmarking.org test page.
Timed Mesa Compilation 24.0, Time To Compile (Seconds, Fewer Is Better)
AmpereOne A192-32X @ 32 Cores: 23.21 (SE +/- 0.12, N = 3)
AmpereOne A192-32X @ 64 Cores: 19.05 (SE +/- 0.07, N = 3)
AmpereOne A192-32X @ 96 Cores: 17.91 (SE +/- 0.06, N = 3)
AmpereOne A192-32X @ 128 Cores: 17.50 (SE +/- 0.11, N = 3)
AmpereOne A192-32X @ 160 Cores: 17.13 (SE +/- 0.05, N = 3)
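The Mesa build times above flatten out quickly as cores are added. One rough way to quantify that is to assume an Amdahl-style model, T(n) = serial + parallel / n, and solve it from two of the measured points. This is a simplifying assumption made for illustration, not an analysis from the result file; the names `mesa_seconds` and `fit_amdahl` are hypothetical.

```python
# Time to compile in seconds per active core count, copied from the
# Timed Mesa Compilation 24.0 result above.
mesa_seconds = {32: 23.21, 64: 19.05, 96: 17.91, 128: 17.50, 160: 17.13}


def fit_amdahl(times, n_lo, n_hi):
    """Solve T(n) = serial + parallel / n exactly through two data points.

    Returns (serial, parallel): 'serial' is the part of the run time that
    does not shrink with more cores, 'parallel' is the perfectly divisible
    work expressed in core-seconds. This is a two-point solve, not a
    least-squares fit.
    """
    t_lo, t_hi = times[n_lo], times[n_hi]
    parallel = (t_lo - t_hi) / (1.0 / n_lo - 1.0 / n_hi)
    serial = t_lo - parallel / n_lo
    return serial, parallel


if __name__ == "__main__":
    serial, parallel = fit_amdahl(mesa_seconds, 32, 160)
    print(f"non-scaling part: {serial:.2f} s, divisible work: {parallel:.1f} core-seconds")
    for cores, measured in sorted(mesa_seconds.items()):
        predicted = serial + parallel / cores
        print(f"{cores:>3} cores: measured {measured:.2f} s, model {predicted:.2f} s")
```

Solving through the 32-core and 160-core points puts the non-scaling part at roughly 15.6 seconds under this model, which suggests why the last 32 cores save well under a second.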
LAMMPS Molecular Dynamics Simulator LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. Learn more via the OpenBenchmarking.org test page.
LAMMPS Molecular Dynamics Simulator 23Jun2022, Model: Rhodopsin Protein (ns/day, More Is Better)
AmpereOne A192-32X @ 32 Cores: 17.78 (SE +/- 0.01, N = 12)
AmpereOne A192-32X @ 64 Cores: 30.79 (SE +/- 0.02, N = 11)
AmpereOne A192-32X @ 96 Cores: 39.03 (SE +/- 1.11, N = 15)
AmpereOne A192-32X @ 128 Cores: 45.75 (SE +/- 1.42, N = 15)
AmpereOne A192-32X @ 160 Cores: 50.85 (SE +/- 0.05, N = 9)
(CXX) g++ options: -O3 -lm -ldl
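The standard errors reported with each result indicate how noisy a run was. A minimal sketch of comparing them across configurations is to express each SE as a fraction of its mean, shown here on the LAMMPS figures above. The 1% cut-off and the names `lammps_results`, `relative_se` and `noisy_configs` are assumptions chosen for illustration.

```python
# (mean ns/day, standard error, number of runs) per active core count,
# copied from the LAMMPS Rhodopsin Protein result above.
lammps_results = {
    32: (17.78, 0.01, 12),
    64: (30.79, 0.02, 11),
    96: (39.03, 1.11, 15),
    128: (45.75, 1.42, 15),
    160: (50.85, 0.05, 9),
}


def relative_se(results):
    """Return {cores: SE / mean} so noise is comparable across configurations."""
    return {cores: se / mean for cores, (mean, se, _n) in results.items()}


def noisy_configs(results, threshold=0.01):
    """List core counts whose relative SE exceeds the (hypothetical) threshold."""
    rel = relative_se(results)
    return sorted(cores for cores, value in rel.items() if value > threshold)


if __name__ == "__main__":
    for cores, value in sorted(relative_se(lammps_results).items()):
        print(f"{cores:>3} cores: relative SE {value:.2%}")
    print("above 1%:", noisy_configs(lammps_results))
```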
oneDNN This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, using its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
oneDNN 3.4, Harness: Deconvolution Batch shapes_3d - Engine: CPU (ms, Fewer Is Better)
AmpereOne A192-32X @ 32 Cores: 8.04302 (SE +/- 0.00301, N = 9, MIN: 7.97)
AmpereOne A192-32X @ 64 Cores: 4.16372 (SE +/- 0.01038, N = 9, MIN: 4.05)
AmpereOne A192-32X @ 96 Cores: 2.94166 (SE +/- 0.00486, N = 9, MIN: 2.85)
AmpereOne A192-32X @ 160 Cores: 2.73224 (SE +/- 0.01136, N = 9, MIN: 2.59)
AmpereOne A192-32X @ 128 Cores: 2.41716 (SE +/- 0.00575, N = 9, MIN: 2.3)
(CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
Parallel BZIP2 Compression This test measures the time needed to compress a file (FreeBSD-13.0-RELEASE-amd64-memstick.img) using Parallel BZIP2 compression. Learn more via the OpenBenchmarking.org test page.
Parallel BZIP2 Compression 1.1.13, FreeBSD-13.0-RELEASE-amd64-memstick.img Compression (Seconds, Fewer Is Better)
AmpereOne A192-32X @ 32 Cores: 5.303238 (SE +/- 0.003938, N = 7)
AmpereOne A192-32X @ 64 Cores: 3.128529 (SE +/- 0.008052, N = 9)
AmpereOne A192-32X @ 96 Cores: 2.539720 (SE +/- 0.014149, N = 10)
AmpereOne A192-32X @ 128 Cores: 2.054855 (SE +/- 0.008019, N = 11)
AmpereOne A192-32X @ 160 Cores: 1.837818 (SE +/- 0.005048, N = 11)
(CXX) g++ options: -O2 -pthread -lbz2 -lpthread
AmpereOne A192-32X @ 128 Cores Processor: AmpereOne @ 3.20GHz (128 Cores), Motherboard: Supermicro ARS-211M-NR R13SPD v1.02 (T20240726102529 BIOS), Chipset: Ampere Computing LLC Device e208, Memory: 8 x 64GB DDR5-5200MT/s, Disk: 3841GB SAMSUNG MZQL23T8HCLS-00A07 + 960GB SAMSUNG MZ1L2960HCJR-00A07, Graphics: ASPEED, Monitor: VGA HDMI, Network: 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb + 2 x Mellanox MT2892
OS: Ubuntu 24.04, Kernel: 6.8.0-39-generic-64k (aarch64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: always
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-dIwDw0/gcc-13-13.2.0/debian/tmp-nvptx/usr --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto --without-cuda-driver -v
Processor Notes: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
Python Notes: Python 3.12.3
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Not affected + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 26 August 2024 18:45 by user ubuntu.
AmpereOne A192-32X @ 64 Cores Processor: AmpereOne @ 3.20GHz (64 Cores), Motherboard: Supermicro ARS-211M-NR R13SPD v1.02 (T20240726102529 BIOS), Chipset: Ampere Computing LLC Device e208, Memory: 8 x 64GB DDR5-5200MT/s, Disk: 3841GB SAMSUNG MZQL23T8HCLS-00A07 + 960GB SAMSUNG MZ1L2960HCJR-00A07, Graphics: ASPEED, Monitor: VGA HDMI, Network: 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb + 2 x Mellanox MT2892
OS: Ubuntu 24.04, Kernel: 6.8.0-39-generic-64k (aarch64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1080
Testing initiated at 27 August 2024 09:36 by user ubuntu.
AmpereOne A192-32X @ 96 Cores Processor: AmpereOne @ 3.20GHz (96 Cores), Motherboard: Supermicro ARS-211M-NR R13SPD v1.02 (T20240726102529 BIOS), Chipset: Ampere Computing LLC Device e208, Memory: 8 x 64GB DDR5-5200MT/s, Disk: 3841GB SAMSUNG MZQL23T8HCLS-00A07 + 960GB SAMSUNG MZ1L2960HCJR-00A07, Graphics: ASPEED, Monitor: VGA HDMI, Network: 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb + 2 x Mellanox MT2892
OS: Ubuntu 24.04, Kernel: 6.8.0-39-generic-64k (aarch64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1080
Testing initiated at 27 August 2024 19:15 by user ubuntu.
AmpereOne A192-32X @ 160 Cores Processor: AmpereOne @ 3.20GHz (160 Cores), Motherboard: Supermicro ARS-211M-NR R13SPD v1.02 (T20240726102529 BIOS), Chipset: Ampere Computing LLC Device e208, Memory: 8 x 64GB DDR5-5200MT/s, Disk: 3841GB SAMSUNG MZQL23T8HCLS-00A07 + 960GB SAMSUNG MZ1L2960HCJR-00A07, Graphics: ASPEED, Monitor: VGA HDMI, Network: 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb + 2 x Mellanox MT2892
OS: Ubuntu 24.04, Kernel: 6.8.0-39-generic-64k (aarch64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1080
Testing initiated at 28 August 2024 09:35 by user ubuntu.
AmpereOne A192-32X @ 32 Cores Processor: AmpereOne @ 3.20GHz (32 Cores), Motherboard: Supermicro ARS-211M-NR R13SPD v1.02 (T20240726102529 BIOS), Chipset: Ampere Computing LLC Device e208, Memory: 8 x 64GB DDR5-5200MT/s, Disk: 3841GB SAMSUNG MZQL23T8HCLS-00A07 + 960GB SAMSUNG MZ1L2960HCJR-00A07, Graphics: ASPEED, Monitor: VGA HDMI, Network: 2 x Broadcom BCM57414 NetXtreme-E 10Gb/25Gb + 2 x Mellanox MT2892
OS: Ubuntu 24.04, Kernel: 6.8.0-39-generic-64k (aarch64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1080
Testing initiated at 28 August 2024 17:45 by user ubuntu.