Benchmarks by Michael Larabel for a future article. Amazon testing on Ubuntu 22.04 via the Phoronix Test Suite.
-march=armv8.4-a+sve
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=armv8.4-a+sve" CFLAGS="-O3 -march=armv8.4-a+sve"
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-nls --disable-werror --enable-checking=yes,extra,rtl --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-objc-gc=auto --enable-plugin --enable-shared --host=aarch64-linux-gnu --program-prefix= --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Python Notes: Python 3.10.4
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
-march=armv8.4-a
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=armv8.4-a" CFLAGS="-O3 -march=armv8.4-a"
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-nls --disable-werror --enable-checking=yes,extra,rtl --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-objc-gc=auto --enable-plugin --enable-shared --host=aarch64-linux-gnu --program-prefix= --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Python Notes: Python 3.10.4
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
-march=armv8.4-a+sve -mcpu=neoverse-v1
Processor: ARMv8 Neoverse-V1 (32 Cores), Motherboard: Amazon EC2 c7g.8xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 62GB, Disk: 301GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Ubuntu 22.04, Kernel: 5.15.0-1004-aws (aarch64), Compiler: GCC 12.0.0 20220117, File-System: ext4, System Layer: amazon
Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=armv8.4-a+sve -mcpu=neoverse-v1" CFLAGS="-O3 -march=armv8.4-a+sve -mcpu=neoverse-v1"
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-nls --disable-werror --enable-checking=yes,extra,rtl --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-objc-gc=auto --enable-plugin --enable-shared --host=aarch64-linux-gnu --program-prefix= --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Python Notes: Python 3.10.4
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
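The three configurations differ only in the CFLAGS/CXXFLAGS values quoted in the Environment Notes. As a minimal sketch of how such a per-configuration build environment could be prepared, the snippet below builds one environment per flag set. The names `CONFIGS` and `build_env` are hypothetical helpers introduced for illustration; they are not part of the Phoronix Test Suite.

```python
import os

# Flag sets copied from the Environment Notes; the dict itself is illustrative.
CONFIGS = {
    "-march=armv8.4-a": "-O3 -march=armv8.4-a",
    "-march=armv8.4-a+sve": "-O3 -march=armv8.4-a+sve",
    "-march=armv8.4-a+sve -mcpu=neoverse-v1": "-O3 -march=armv8.4-a+sve -mcpu=neoverse-v1",
}


def build_env(flags, base=None):
    """Return a copy of the environment with CFLAGS and CXXFLAGS set to flags."""
    env = dict(os.environ if base is None else base)
    env["CFLAGS"] = flags
    env["CXXFLAGS"] = flags
    return env


if __name__ == "__main__":
    for label, flags in CONFIGS.items():
        env = build_env(flags)
        print(f"{label}: CFLAGS={env['CFLAGS']!r} CXXFLAGS={env['CXXFLAGS']!r}")
```

The returned dictionary can be handed to `subprocess.run(..., env=env)` for whatever build or test command is used; the exact command is not stated in these notes.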
Crypto++ Crypto++ is a C++ class library of cryptographic algorithms. Learn more via the OpenBenchmarking.org test page.
Crypto++ 8.2 - Test: Unkeyed Algorithms (MiB/second, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 270.93 (SE +/- 1.67, N = 3)
-march=armv8.4-a: 459.87 (SE +/- 0.25, N = 3)
-march=armv8.4-a+sve: 449.02 (SE +/- 0.25, N = 3)
1. (CXX) g++ options: -O3 -fPIC -pthread -pipe
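To put the Crypto++ gap in relative terms, the following sketch computes the percentage change of each configuration against the plain -march=armv8.4-a build. It assumes the three values in the chart are listed in the same order as the configuration labels; `percent_change` is a hypothetical helper, not something the test suite provides.

```python
def percent_change(value, baseline):
    """Percentage change of value relative to baseline (positive means higher)."""
    return (value - baseline) / baseline * 100.0


# Crypto++ 8.2, Unkeyed Algorithms, MiB/second (values taken from the chart;
# the label-to-value mapping assumes labels and values share the same order).
crypto_unkeyed = {
    "-march=armv8.4-a+sve -mcpu=neoverse-v1": 270.93,
    "-march=armv8.4-a": 459.87,
    "-march=armv8.4-a+sve": 449.02,
}

baseline = crypto_unkeyed["-march=armv8.4-a"]

if __name__ == "__main__":
    for label, value in crypto_unkeyed.items():
        print(f"{label}: {percent_change(value, baseline):+.2f}% vs -march=armv8.4-a")
```

Because this metric is "More Is Better", a negative percentage is a regression; for "Fewer Is Better" results such as encode times, the sign reads the other way.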
LeelaChessZero LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.
LeelaChessZero 0.28 - Backend: BLAS (Nodes Per Second, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 1280 (SE +/- 16.50, N = 3)
-march=armv8.4-a: 1281 (SE +/- 13.99, N = 5)
-march=armv8.4-a+sve: 1297 (SE +/- 6.96, N = 3)

LeelaChessZero 0.28 - Backend: Eigen (Nodes Per Second, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 1337 (SE +/- 3.18, N = 3)
-march=armv8.4-a: 1311 (SE +/- 13.65, N = 3)
-march=armv8.4-a+sve: 1333 (SE +/- 14.64, N = 5)

(CXX) g++ options for both LeelaChessZero results: -flto -O3 -pthread
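Each result carries an "SE +/- x, N = n" annotation. As a rough sketch, assuming SE is the conventional standard error of the mean over N runs, the snippet below shows how such a figure is computed and how two results can be checked against their combined error. The threshold `k=2.0` and the helper names are assumptions made for illustration, not how the Phoronix Test Suite reports significance.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


def within_noise(mean_a, se_a, mean_b, se_b, k=2.0):
    """True if the two means differ by less than k combined standard errors."""
    combined = math.hypot(se_a, se_b)  # sqrt(se_a**2 + se_b**2)
    return abs(mean_a - mean_b) < k * combined


if __name__ == "__main__":
    # LeelaChessZero BLAS: 1280 (SE 16.50) vs 1297 (SE 6.96), from the chart.
    print(within_noise(1280, 16.50, 1297, 6.96))
    # Crypto++ Unkeyed Algorithms: 270.93 (SE 1.67) vs 459.87 (SE 0.25).
    print(within_noise(270.93, 1.67, 459.87, 0.25))
```

Under this rule of thumb the LeelaChessZero BLAS difference falls inside run-to-run variation, while the Crypto++ difference does not.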
Timed MrBayes Analysis This test performs a Bayesian analysis of a set of primate genome sequences in order to estimate their phylogeny. Learn more via the OpenBenchmarking.org test page.
Timed MrBayes Analysis 3.2.7 - Primate Phylogeny Analysis (Seconds, Fewer Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 241.54 (SE +/- 0.10, N = 3)
-march=armv8.4-a: 237.54 (SE +/- 0.07, N = 3)
-march=armv8.4-a+sve: 234.26 (SE +/- 0.18, N = 3)
1. (CC) gcc options: -O3 -std=c99 -pedantic -lm
LAMMPS Molecular Dynamics Simulator LAMMPS is a classical molecular dynamics code, and an acronym for Large-scale Atomic/Molecular Massively Parallel Simulator. Learn more via the OpenBenchmarking.org test page.
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: Rhodopsin Protein (ns/day, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 21.02 (SE +/- 0.04, N = 3)
-march=armv8.4-a: 21.33 (SE +/- 0.09, N = 3)
-march=armv8.4-a+sve: 21.15 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -lm
WebP Image Encode This is a test of Google's libwebp with the cwebp image encode utility and using a sample 6000x4000 pixel JPEG image as the input. Learn more via the OpenBenchmarking.org test page.
WebP Image Encode 1.1 - Encode Settings: Quality 100, Lossless (Encode Time - Seconds, Fewer Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 23.85 (SE +/- 0.23, N = 3)
-march=armv8.4-a: 23.85 (SE +/- 0.01, N = 3)
-march=armv8.4-a+sve: 23.40 (SE +/- 0.18, N = 3)

WebP Image Encode 1.1 - Encode Settings: Quality 100, Highest Compression (Encode Time - Seconds, Fewer Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 8.602 (SE +/- 0.006, N = 3)
-march=armv8.4-a: 8.630 (SE +/- 0.006, N = 3)
-march=armv8.4-a+sve: 8.640 (SE +/- 0.017, N = 3)

(CC) gcc options for both WebP Image Encode results: -fvisibility=hidden -O3 -lm -ljpeg -lpng16 -ltiff
GNU GMP GMPbench GMPbench is a test of the GNU Multiple Precision Arithmetic (GMP) Library. GMPbench is a single-threaded integer benchmark that leverages the GMP library to stress the CPU with widening integer multiplication. Learn more via the OpenBenchmarking.org test page.
GNU GMP GMPbench 6.2.1 - Total Time (GMPbench Score, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 4152.7
-march=armv8.4-a: 4152.3
-march=armv8.4-a+sve: 4155.6
1. (CC) gcc options: -O3 -lm
Xmrig Xmrig is an open-source cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.
Xmrig 6.12.1 - Variant: Monero - Hash Count: 1M (H/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 8681.4 (SE +/- 3.15, N = 3)
-march=armv8.4-a: 8645.4 (SE +/- 9.56, N = 3)
-march=armv8.4-a+sve: 8669.8 (SE +/- 6.18, N = 3)

Xmrig 6.12.1 - Variant: Wownero - Hash Count: 1M (H/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 11842.0 (SE +/- 0.35, N = 3)
-march=armv8.4-a: 11811.2 (SE +/- 27.59, N = 3)
-march=armv8.4-a+sve: 11877.8 (SE +/- 26.88, N = 3)

(CXX) g++ options for both Xmrig results: -O3 -fexceptions -fno-rtti -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc
Zstd Compression This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image - FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.
Zstd Compression 1.5.0 - Compression Level: 3 - Compression Speed (MB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 6938.3 (SE +/- 9.61, N = 3)
-march=armv8.4-a: 6937.8 (SE +/- 14.45, N = 3)
-march=armv8.4-a+sve: 7027.2 (SE +/- 15.42, N = 3)

Zstd Compression 1.5.0 - Compression Level: 19 - Compression Speed (MB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 73.0 (SE +/- 0.09, N = 3)
-march=armv8.4-a: 72.9 (SE +/- 0.23, N = 3)
-march=armv8.4-a+sve: 74.0 (SE +/- 0.03, N = 3)

Zstd Compression 1.5.0 - Compression Level: 19 - Decompression Speed (MB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 3167.5 (SE +/- 8.27, N = 3)
-march=armv8.4-a: 3083.4 (SE +/- 8.46, N = 3)
-march=armv8.4-a+sve: 3094.8 (SE +/- 6.60, N = 3)

Zstd Compression 1.5.0 - Compression Level: 3, Long Mode - Compression Speed (MB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 1243.8 (SE +/- 4.22, N = 3)
-march=armv8.4-a: 1241.3 (SE +/- 5.37, N = 3)
-march=armv8.4-a+sve: 1242.7 (SE +/- 4.88, N = 3)

Zstd Compression 1.5.0 - Compression Level: 3, Long Mode - Decompression Speed (MB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 3882.3 (SE +/- 40.64, N = 3)
-march=armv8.4-a: 3820.8 (SE +/- 3.95, N = 3)
-march=armv8.4-a+sve: 3824.8 (SE +/- 1.28, N = 3)

Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 39.0 (SE +/- 0.06, N = 3)
-march=armv8.4-a: 40.0 (SE +/- 0.03, N = 3)
-march=armv8.4-a+sve: 40.3 (SE +/- 0.00, N = 3)

Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 3339.7 (SE +/- 5.09, N = 3)
-march=armv8.4-a: 3250.9 (SE +/- 7.62, N = 3)
-march=armv8.4-a+sve: 3263.7 (SE +/- 0.59, N = 3)

(CC) gcc options for all Zstd Compression results: -O3 -pthread -lz -llzma
JPEG XL libjxl The JPEG XL Image Coding System is designed to provide next-generation JPEG image capabilities with JPEG XL offering better image quality and compression over legacy JPEG. This test profile is currently focused on the multi-threaded JPEG XL image encode performance using the reference libjxl library. Learn more via the OpenBenchmarking.org test page.
JPEG XL libjxl 0.6.1 - Input: PNG - Encode Speed: 7 (MP/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 8.07 (SE +/- 0.01, N = 3)
-march=armv8.4-a: 8.32 (SE +/- 0.02, N = 3)
-march=armv8.4-a+sve: 8.36 (SE +/- 0.01, N = 3)

JPEG XL libjxl 0.6.1 - Input: PNG - Encode Speed: 8 (MP/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 0.67 (SE +/- 0.00, N = 3)
-march=armv8.4-a: 0.67 (SE +/- 0.00, N = 3)
-march=armv8.4-a+sve: 0.67 (SE +/- 0.00, N = 3)

JPEG XL libjxl 0.6.1 - Input: JPEG - Encode Speed: 7 (MP/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 78.80 (SE +/- 0.22, N = 3)
-march=armv8.4-a: 73.21 (SE +/- 0.13, N = 3)
-march=armv8.4-a+sve: 79.58 (SE +/- 0.12, N = 3)

JPEG XL libjxl 0.6.1 - Input: JPEG - Encode Speed: 8 (MP/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 27.09 (SE +/- 0.02, N = 3)
-march=armv8.4-a: 26.30 (SE +/- 0.01, N = 3)
-march=armv8.4-a+sve: 27.34 (SE +/- 0.03, N = 3)

(CXX) g++ options for all JPEG XL libjxl results: -O3 -funwind-tables -O2 -fPIE -pie
Nettle GNU Nettle is a low-level cryptographic library used by GnuTLS and other software. Learn more via the OpenBenchmarking.org test page.
Nettle 3.8 - Test: aes256 (Mbyte/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 4438.27 (SE +/- 0.72, N = 3; MIN: 3923.68 / MAX: 5627.25)
-march=armv8.4-a (-lgmp): 4435.91 (SE +/- 3.11, N = 3; MIN: 3927.32 / MAX: 5628.86)
-march=armv8.4-a+sve (-lgmp): 4447.04 (SE +/- 0.39, N = 3; MIN: 3925.11 / MAX: 5627.84)

Nettle 3.8 - Test: chacha (Mbyte/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 731.07 (SE +/- 0.15, N = 3; MIN: 446.51 / MAX: 951.28)
-march=armv8.4-a (-lgmp): 740.25 (SE +/- 0.61, N = 3; MIN: 454.21 / MAX: 956.53)
-march=armv8.4-a+sve (-lgmp): 733.59 (SE +/- 0.55, N = 3; MIN: 442.26 / MAX: 956.22)

Nettle 3.8 - Test: sha512 (Mbyte/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 481.90 (SE +/- 0.01, N = 3)
-march=armv8.4-a (-lgmp): 498.83 (SE +/- 0.04, N = 3)
-march=armv8.4-a+sve (-lgmp): 504.33 (SE +/- 0.07, N = 3)

Nettle 3.8 - Test: poly1305-aes (Mbyte/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 859.93 (SE +/- 0.05, N = 3)
-march=armv8.4-a (-lgmp): 871.90 (SE +/- 1.52, N = 3)
-march=armv8.4-a+sve (-lgmp): 820.51 (SE +/- 5.37, N = 3)

(CC) gcc options for all Nettle results: -O3 -ggdb3 -lnettle -lm -lcrypto
LuaJIT This test profile is a collection of Lua scripts/benchmarks run against a locally-built copy of LuaJIT upstream. Learn more via the OpenBenchmarking.org test page.
LuaJIT 2.1-git - Test: Composite (Mflops, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 1303.89 (SE +/- 13.31, N = 6)
-march=armv8.4-a: 1282.59 (SE +/- 0.41, N = 3)
-march=armv8.4-a+sve: 1309.03 (SE +/- 18.19, N = 3)

LuaJIT 2.1-git - Test: Monte Carlo (Mflops, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 343.90 (SE +/- 0.56, N = 3)
-march=armv8.4-a: 343.27 (SE +/- 0.35, N = 3)
-march=armv8.4-a+sve: 343.85 (SE +/- 0.54, N = 3)

LuaJIT 2.1-git - Test: Fast Fourier Transform (Mflops, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 668.40 (SE +/- 0.07, N = 3)
-march=armv8.4-a: 661.55 (SE +/- 0.39, N = 3)
-march=armv8.4-a+sve: 615.71 (SE +/- 10.69, N = 3)

LuaJIT 2.1-git - Test: Sparse Matrix Multiply (Mflops, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 1164.28 (SE +/- 7.84, N = 3)
-march=armv8.4-a: 1151.57 (SE +/- 3.14, N = 3)
-march=armv8.4-a+sve: 1162.33 (SE +/- 7.20, N = 3)

LuaJIT 2.1-git - Test: Dense LU Matrix Factorization (Mflops, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 3547.90 (SE +/- 93.82, N = 3)
-march=armv8.4-a: 3355.53 (SE +/- 6.02, N = 3)
-march=armv8.4-a+sve: 3521.13 (SE +/- 86.66, N = 3)

LuaJIT 2.1-git - Test: Jacobi Successive Over-Relaxation (Mflops, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 902.14 (SE +/- 1.01, N = 3)
-march=armv8.4-a: 901.02 (SE +/- 0.68, N = 3)
-march=armv8.4-a+sve: 902.16 (SE +/- 1.00, N = 3)

(CC) gcc options for all LuaJIT results: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector
Botan Botan is a BSD-licensed cross-platform open-source C++ crypto library "cryptography toolkit" that supports most publicly known cryptographic algorithms. The project's stated goal is to be "the best option for cryptography in C++ by offering the tools necessary to implement a range of practical systems, such as TLS protocol, X.509 certificates, modern AEAD ciphers, PKCS#11 and TPM hardware support, password hashing, and post quantum crypto schemes." Learn more via the OpenBenchmarking.org test page.
Botan 2.17.3 - Test: KASUMI (MiB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 65.29 (SE +/- 0.01, N = 3)
-march=armv8.4-a: 62.02 (SE +/- 0.00, N = 3)
-march=armv8.4-a+sve: 62.00 (SE +/- 0.00, N = 3)

Botan 2.17.3 - Test: KASUMI - Decrypt (MiB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 67.51 (SE +/- 0.00, N = 3)
-march=armv8.4-a: 62.28 (SE +/- 0.00, N = 3)
-march=armv8.4-a+sve: 62.26 (SE +/- 0.00, N = 3)

Botan 2.17.3 - Test: AES-256 (MiB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 5415.19 (SE +/- 3.07, N = 3)
-march=armv8.4-a: 5494.31 (SE +/- 14.75, N = 3)
-march=armv8.4-a+sve: 5442.65 (SE +/- 9.30, N = 3)

Botan 2.17.3 - Test: AES-256 - Decrypt (MiB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 5409.82 (SE +/- 1.75, N = 3)
-march=armv8.4-a: 5477.57 (SE +/- 5.58, N = 3)
-march=armv8.4-a+sve: 5474.32 (SE +/- 8.64, N = 3)

Botan 2.17.3 - Test: Twofish (MiB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 244.21 (SE +/- 0.12, N = 3)
-march=armv8.4-a: 239.70 (SE +/- 0.12, N = 3)
-march=armv8.4-a+sve: 248.89 (SE +/- 0.26, N = 3)

Botan 2.17.3 - Test: Twofish - Decrypt (MiB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 246.61 (SE +/- 0.06, N = 3)
-march=armv8.4-a: 246.16 (SE +/- 0.20, N = 3)
-march=armv8.4-a+sve: 258.15 (SE +/- 0.11, N = 3)

Botan 2.17.3 - Test: Blowfish (MiB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 280.99 (SE +/- 0.03, N = 3)
-march=armv8.4-a: 278.87 (SE +/- 0.29, N = 3)
-march=armv8.4-a+sve: 280.57 (SE +/- 0.10, N = 3)

Botan 2.17.3 - Test: Blowfish - Decrypt (MiB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 283.41 (SE +/- 0.04, N = 3)
-march=armv8.4-a: 288.51 (SE +/- 0.07, N = 3)
-march=armv8.4-a+sve: 289.03 (SE +/- 0.03, N = 3)

Botan 2.17.3 - Test: CAST-256 (MiB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 109.29 (SE +/- 0.00, N = 3)
-march=armv8.4-a: 108.79 (SE +/- 0.02, N = 3)
-march=armv8.4-a+sve: 108.75 (SE +/- 0.01, N = 3)

Botan 2.17.3 - Test: CAST-256 - Decrypt (MiB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 109.11 (SE +/- 0.00, N = 3)
-march=armv8.4-a: 108.60 (SE +/- 0.02, N = 3)
-march=armv8.4-a+sve: 108.62 (SE +/- 0.01, N = 3)

Botan 2.17.3 - Test: ChaCha20Poly1305 (MiB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 385.38 (SE +/- 0.03, N = 3)
-march=armv8.4-a: 389.38 (SE +/- 0.04, N = 3)
-march=armv8.4-a+sve: 390.31 (SE +/- 0.07, N = 3)

Botan 2.17.3 - Test: ChaCha20Poly1305 - Decrypt (MiB/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 378.98 (SE +/- 0.04, N = 3)
-march=armv8.4-a: 382.51 (SE +/- 0.02, N = 3)
-march=armv8.4-a+sve: 383.95 (SE +/- 0.13, N = 3)

(CXX) g++ options for all Botan results: -fstack-protector -pthread -lbotan-2 -ldl -lrt
GraphicsMagick This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.
GraphicsMagick 1.3.33 - Operation: Swirl (Iterations Per Minute, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 1257 (SE +/- 0.88, N = 3)
-march=armv8.4-a: 1225 (SE +/- 0.33, N = 3)
-march=armv8.4-a+sve: 1272 (SE +/- 1.33, N = 3)

GraphicsMagick 1.3.33 - Operation: Rotate (Iterations Per Minute, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 591 (SE +/- 0.67, N = 3)
-march=armv8.4-a: 577 (SE +/- 0.00, N = 3)
-march=armv8.4-a+sve: 611 (SE +/- 1.20, N = 3)

GraphicsMagick 1.3.33 - Operation: Enhanced (Iterations Per Minute, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 741 (SE +/- 0.67, N = 3)
-march=armv8.4-a: 732 (SE +/- 0.33, N = 3)
-march=armv8.4-a+sve: 718 (SE +/- 0.33, N = 3)

GraphicsMagick 1.3.33 - Operation: Resizing (Iterations Per Minute, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 2402 (SE +/- 0.33, N = 3)
-march=armv8.4-a: 2339 (SE +/- 22.36, N = 3)
-march=armv8.4-a+sve: 2414 (SE +/- 1.86, N = 3)

GraphicsMagick 1.3.33 - Operation: Noise-Gaussian (Iterations Per Minute, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 509 (SE +/- 0.00, N = 3)
-march=armv8.4-a: 494 (SE +/- 0.67, N = 3)
-march=armv8.4-a+sve: 515 (SE +/- 0.58, N = 3)

GraphicsMagick 1.3.33 - Operation: HWB Color Space (Iterations Per Minute, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 1016 (SE +/- 1.00, N = 3)
-march=armv8.4-a: 978 (SE +/- 1.00, N = 3)
-march=armv8.4-a+sve: 1067 (SE +/- 0.33, N = 3)

(CC) gcc options for all GraphicsMagick results: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread
AOM AV1
AOM AV1 3.3 - Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 61.56 (SE +/- 0.62, N = 6)
-march=armv8.4-a: 62.13 (SE +/- 0.43, N = 3)
-march=armv8.4-a+sve: 62.19 (SE +/- 0.61, N = 3)

AOM AV1 3.3 - Encoder Mode: Speed 10 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 63.25 (SE +/- 0.29, N = 3)
-march=armv8.4-a: 61.88 (SE +/- 0.47, N = 3)
-march=armv8.4-a+sve: 65.62 (SE +/- 0.22, N = 3)

AOM AV1 3.3 - Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 124.22 (SE +/- 0.01, N = 3)
-march=armv8.4-a: 120.13 (SE +/- 0.03, N = 3)
-march=armv8.4-a+sve: 123.95 (SE +/- 0.09, N = 3)

AOM AV1 3.3 - Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 156.63 (SE +/- 0.12, N = 3)
-march=armv8.4-a: 152.46 (SE +/- 0.03, N = 3)
-march=armv8.4-a+sve: 156.71 (SE +/- 0.20, N = 3)

AOM AV1 3.3 - Encoder Mode: Speed 10 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 193.55 (SE +/- 0.17, N = 3)
-march=armv8.4-a: 190.27 (SE +/- 0.28, N = 3)
-march=armv8.4-a+sve: 193.68 (SE +/- 0.20, N = 3)

(CXX) g++ options for all AOM AV1 results: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm
x264 This is a multi-threaded test of the x264 video encoder run on the CPU with a choice of 1080p or 4K video input. Learn more via the OpenBenchmarking.org test page.
x264 2022-02-22 - Video Input: Bosphorus 4K (Frames Per Second, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 48.56 (SE +/- 0.01, N = 3)
-march=armv8.4-a: 48.43 (SE +/- 0.08, N = 3)
-march=armv8.4-a+sve: 48.51 (SE +/- 0.01, N = 3)

x264 2022-02-22 - Video Input: Bosphorus 1080p (Frames Per Second, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 169.30 (SE +/- 0.09, N = 3)
-march=armv8.4-a: 168.92 (SE +/- 0.07, N = 3)
-march=armv8.4-a+sve: 169.58 (SE +/- 0.10, N = 3)

(CC) gcc options for both x264 results: -ldl -lavformat -lavcodec -lavutil -lswscale -lm -lpthread -O3 -flto
ACES DGEMM This is a multi-threaded DGEMM benchmark. Learn more via the OpenBenchmarking.org test page.
ACES DGEMM 1.0 - Sustained Floating-Point Rate (GFLOP/s, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 13.35 (SE +/- 0.11, N = 3)
-march=armv8.4-a: 12.81 (SE +/- 0.04, N = 3)
-march=armv8.4-a+sve: 13.44 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -O3 -march=native -fopenmp
Coremark This is a test of EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 798137.71 (SE +/- 108.80, N = 3)
-march=armv8.4-a: 789646.92 (SE +/- 169.09, N = 3)
-march=armv8.4-a+sve: 762066.50 (SE +/- 416.47, N = 3)
1. (CC) gcc options: -O2 -O3 -lrt" -lrt
Himeno Benchmark The Himeno benchmark is a linear solver of pressure Poisson using a point-Jacobi method. Learn more via the OpenBenchmarking.org test page.
Himeno Benchmark 3.0 - Poisson Pressure Solver (MFLOPS, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 5538.74 (SE +/- 13.44, N = 3)
-march=armv8.4-a: 5561.56 (SE +/- 2.66, N = 3)
-march=armv8.4-a+sve: 5508.28 (SE +/- 5.76, N = 3)
1. (CC) gcc options: -O3
Stockfish This is a test of Stockfish, an advanced open-source C++11 chess benchmark that can scale up to 512 CPU threads. Learn more via the OpenBenchmarking.org test page.
Stockfish 13 - Total Time (Nodes Per Second, More Is Better)
-march=armv8.4-a+sve -mcpu=neoverse-v1: 59966785 (SE +/- 655325.68, N = 15)
-march=armv8.4-a: 57485680 (SE +/- 721518.45, N = 14)
-march=armv8.4-a+sve: 55823340 (SE +/- 645132.01, N = 3)
1. (CXX) g++ options: -lgcov -lpthread -O3 -fno-exceptions -std=c++17 -pedantic -flto -fprofile-use -fno-peel-loops -fno-tracer -flto=jobserver
Stargate Digital Audio Workstation Stargate is an open-source, cross-platform digital audio workstation (DAW) software package with "a unique and carefully curated experience" with scalability from old systems up through modern multi-core systems. Stargate is GPLv3 licensed and makes use of Qt5 (PyQt5) for its user-interface. Learn more via the OpenBenchmarking.org test page.
Stargate Digital Audio Workstation 21.10.9 (Render Ratio, More Is Better)
Sample Rate: 44100 - Buffer Size: 512
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 6.068650 (SE +/- 0.002556, N = 3)
  -march=armv8.4-a: 6.072916 (SE +/- 0.002564, N = 3)
  -march=armv8.4-a+sve: 6.213500 (SE +/- 0.003428, N = 3)
Sample Rate: 96000 - Buffer Size: 512
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 4.412487 (SE +/- 0.001339, N = 3)
  -march=armv8.4-a: 4.414055 (SE +/- 0.003502, N = 3)
  -march=armv8.4-a+sve: 4.474848 (SE +/- 0.002191, N = 3)
Sample Rate: 44100 - Buffer Size: 1024
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 6.360138 (SE +/- 0.002583, N = 3)
  -march=armv8.4-a: 6.370035 (SE +/- 0.002515, N = 3)
  -march=armv8.4-a+sve: 6.538163 (SE +/- 0.002411, N = 3)
Sample Rate: 480000 - Buffer Size: 512
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 5.981662 (SE +/- 0.002132, N = 3)
  -march=armv8.4-a: 6.005386 (SE +/- 0.001956, N = 3)
  -march=armv8.4-a+sve: 6.124455 (SE +/- 0.002030, N = 3)
Sample Rate: 96000 - Buffer Size: 1024
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 4.729756 (SE +/- 0.000548, N = 3)
  -march=armv8.4-a: 4.729848 (SE +/- 0.000903, N = 3)
  -march=armv8.4-a+sve: 4.807797 (SE +/- 0.002826, N = 3)
Sample Rate: 480000 - Buffer Size: 1024
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 6.288006 (SE +/- 0.001046, N = 3)
  -march=armv8.4-a: 6.322684 (SE +/- 0.002118, N = 3)
  -march=armv8.4-a+sve: 6.450182 (SE +/- 0.002250, N = 3)
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions
C-Ray This is a test of C-Ray, a simple raytracer designed to test floating-point CPU performance. This test is multi-threaded (16 threads per core); the configuration benchmarked here renders a 4K image with 16 rays per pixel for anti-aliasing. Learn more via the OpenBenchmarking.org test page.
C-Ray 1.1 - Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 19.56 (SE +/- 0.01, N = 3)
  -march=armv8.4-a: 19.30 (SE +/- 0.03, N = 3)
  -march=armv8.4-a+sve: 19.30 (SE +/- 0.00, N = 3)
  1. (CC) gcc options: -lm -lpthread -O3
POV-Ray This is a test of POV-Ray, the Persistence of Vision Raytracer. POV-Ray is used to create 3D graphics using ray-tracing. Learn more via the OpenBenchmarking.org test page.
POV-Ray 3.7.0.7 - Trace Time (Seconds, Fewer Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 19.61 (SE +/- 0.01, N = 3)
  -march=armv8.4-a: 19.85 (SE +/- 0.03, N = 3)
  -march=armv8.4-a+sve: 20.26 (SE +/- 0.04, N = 3)
  1. (CXX) g++ options: -pipe -O3 -ffast-math -R/usr/lib -lXpm -lSM -lICE -lX11 -lIlmImf -lIlmImf-2_5 -lImath-2_5 -lHalf-2_5 -lIex-2_5 -lIexMath-2_5 -lIlmThread-2_5 -lIlmThread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system
Primesieve Primesieve generates prime numbers using a highly optimized sieve of Eratosthenes implementation. Primesieve benchmarks the CPU's L1/L2 cache performance. Learn more via the OpenBenchmarking.org test page.
Primesieve 7.7 - 1e12 Prime Number Generation (Seconds, Fewer Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 8.435 (SE +/- 0.022, N = 3)
  -march=armv8.4-a: 8.438 (SE +/- 0.022, N = 3)
  -march=armv8.4-a+sve: 8.533 (SE +/- 0.043, N = 3)
  1. (CXX) g++ options: -O3
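As a reminder of the algorithm the Primesieve description refers to, here is a minimal sketch of a basic sieve of Eratosthenes. It is only the textbook form, not Primesieve's own highly optimized implementation, and the function name `primes_up_to` is an assumption made for this example.

```python
def primes_up_to(limit):
    """Return all primes <= limit using a basic sieve of Eratosthenes.

    Hypothetical helper for illustration; not Primesieve's implementation.
    """
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    i = 2
    while i * i <= limit:
        if is_prime[i]:
            # Cross off multiples starting at i*i; smaller multiples were
            # already crossed off by smaller primes.
            count = len(range(i * i, limit + 1, i))
            is_prime[i * i :: i] = bytes(count)
        i += 1
    return [n for n, flag in enumerate(is_prime) if flag]


print(primes_up_to(30))
```

The whole flag array is touched repeatedly with growing strides, which hints at why the description says the benchmark exercises L1/L2 cache performance.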
Smallpt Smallpt is a C++ global illumination renderer written in less than 100 lines of code. Global illumination is done via unbiased Monte Carlo path tracing and there is multi-threading support via the OpenMP library. Learn more via the OpenBenchmarking.org test page.
Smallpt 1.0 - Global Illumination Renderer; 128 Samples (Seconds, Fewer Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 3.892 (SE +/- 0.003, N = 3)
  -march=armv8.4-a: 3.895 (SE +/- 0.002, N = 3)
  -march=armv8.4-a+sve: 3.896 (SE +/- 0.003, N = 3)
  1. (CXX) g++ options: -fopenmp -O3
AOBench AOBench is a lightweight ambient occlusion renderer, written in C. The test profile is using a size of 2048 x 2048. Learn more via the OpenBenchmarking.org test page.
AOBench - Size: 2048 x 2048 - Total Time (Seconds, Fewer Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 33.55 (SE +/- 0.01, N = 3)
  -march=armv8.4-a: 33.48 (SE +/- 0.00, N = 3)
  -march=armv8.4-a+sve: 33.49 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -lm -O3
FLAC Audio Encoding This test times how long it takes to encode a sample WAV file to FLAC format ten times. Learn more via the OpenBenchmarking.org test page.
FLAC Audio Encoding 1.3.3 - WAV To FLAC (Seconds, Fewer Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 38.60 (SE +/- 0.01, N = 5)
  -march=armv8.4-a: 38.31 (SE +/- 0.02, N = 5)
  -march=armv8.4-a+sve: 38.52 (SE +/- 0.01, N = 5)
  1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm
LAME MP3 Encoding LAME is an MP3 encoder licensed under the LGPL. This test measures the time required to encode a WAV file to MP3 format. Learn more via the OpenBenchmarking.org test page.
LAME MP3 Encoding 3.100 - WAV To MP3 (Seconds, Fewer Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 7.446 (SE +/- 0.001, N = 3)
  -march=armv8.4-a: 8.054 (SE +/- 0.004, N = 3)
  -march=armv8.4-a+sve: 7.440 (SE +/- 0.002, N = 3)
  1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -lm
Opus Codec Encoding Opus is an open audio codec. Opus is a lossy audio compression format designed primarily for interactive real-time applications over the Internet. This test uses Opus-Tools and measures the time required to encode a WAV file to Opus. Learn more via the OpenBenchmarking.org test page.
Opus Codec Encoding 1.3.1 - WAV To Opus Encode (Seconds, Fewer Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 14.38 (SE +/- 0.01, N = 5)
  -march=armv8.4-a: 18.32 (SE +/- 0.00, N = 5)
  -march=armv8.4-a+sve: 14.40 (SE +/- 0.02, N = 5)
  1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm
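The spread in the Opus result is easier to judge as a percentage. The sketch below computes how much slower each build is than the fastest one, using the three mean times from the Opus chart. It assumes the values are listed in the same order as the chart legend, and the helper name `percent_slower` is hypothetical.

```python
# Mean encode times in seconds from the Opus Codec Encoding 1.3.1 result.
# Assumption: values appear in the same order as the chart legend.
opus_seconds = {
    "-march=armv8.4-a+sve -mcpu=neoverse-v1": 14.38,
    "-march=armv8.4-a": 18.32,
    "-march=armv8.4-a+sve": 14.40,
}


def percent_slower(times, config, baseline):
    """Extra time taken by `config` relative to `baseline`, in percent."""
    return (times[config] / times[baseline] - 1.0) * 100.0


# For a "fewer is better" metric, the baseline is the smallest time.
baseline = min(opus_seconds, key=opus_seconds.get)
for config, seconds in opus_seconds.items():
    delta = percent_slower(opus_seconds, config, baseline)
    print(f"{config}: {seconds:.2f} s ({delta:+.1f}% vs fastest)")
```

The same helper applies to any of the "Fewer Is Better" results here; for "More Is Better" metrics the ratio would need to be inverted.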
eSpeak-NG Speech Engine This test times how long it takes the eSpeak speech synthesizer to read Project Gutenberg's The Outline of Science and output to a WAV file. This test profile is now tracking the eSpeak-NG version of eSpeak. Learn more via the OpenBenchmarking.org test page.
eSpeak-NG Speech Engine 20200907 - Text-To-Speech Synthesis (Seconds, Fewer Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 29.99 (SE +/- 0.28, N = 20)
  -march=armv8.4-a: 36.59 (SE +/- 0.31, N = 16)
  -march=armv8.4-a+sve: 29.98 (SE +/- 0.30, N = 20)
  1. (CC) gcc options: -O3 -std=c99
Ngspice Ngspice is an open-source SPICE circuit simulator. Ngspice was originally based on the Berkeley SPICE electronic circuit simulator. Ngspice supports basic threading using OpenMP. This test profile is making use of the ISCAS 85 benchmark circuits. Learn more via the OpenBenchmarking.org test page.
Ngspice 34 (Seconds, Fewer Is Better)
Circuit: C2670
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 104.10 (SE +/- 0.12, N = 3)
  -march=armv8.4-a: 102.56 (SE +/- 0.06, N = 3)
  -march=armv8.4-a+sve: 106.91 (SE +/- 0.18, N = 3)
Circuit: C7552
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 106.77 (SE +/- 1.20, N = 3)
  -march=armv8.4-a: 103.93 (SE +/- 0.51, N = 3)
  -march=armv8.4-a+sve: 111.64 (SE +/- 1.04, N = 3)
1. (CC) gcc options: -O3 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE
RNNoise RNNoise is a recurrent neural network for audio noise reduction developed by Mozilla and Xiph.Org. This test profile is a single-threaded test measuring the time to denoise a sample 26 minute long 16-bit RAW audio file using this recurrent neural network noise suppression library. Learn more via the OpenBenchmarking.org test page.
RNNoise 2020-06-28 (Seconds, Fewer Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 17.29 (SE +/- 0.04, N = 3)
  -march=armv8.4-a: 17.62 (SE +/- 0.05, N = 3)
  -march=armv8.4-a+sve: 17.39 (SE +/- 0.05, N = 3)
  1. (CC) gcc options: -O3 -pedantic -fvisibility=hidden
OpenJPEG OpenJPEG is an open-source JPEG 2000 codec written in the C programming language. The default input for this test profile is the NASA/JPL-Caltech/MSSS Curiosity panorama 717MB TIFF image file converting to JPEG2000 format. Learn more via the OpenBenchmarking.org test page.
OpenJPEG 2.4 - Encode: NASA Curiosity Panorama M34 (ms, Fewer Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 55415 (SE +/- 7.80, N = 3)
  -march=armv8.4-a: 57205 (SE +/- 19.06, N = 3)
  -march=armv8.4-a+sve: 55196 (SE +/- 89.48, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
OpenSSL 3.0 (More Is Better)
Algorithm: SHA256 (byte/s)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 27681290600 (SE +/- 24491056.96, N = 3)
  -march=armv8.4-a: 27603943570 (SE +/- 25639974.71, N = 3)
  -march=armv8.4-a+sve: 27428176880 (SE +/- 32102278.66, N = 3)
Algorithm: RSA4096 (sign/s)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 5114.1 (SE +/- 4.19, N = 3)
  -march=armv8.4-a: 5090.5 (SE +/- 0.78, N = 3)
  -march=armv8.4-a+sve: 5088.1 (SE +/- 0.53, N = 3)
Algorithm: RSA4096 (verify/s)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 355966.7 (SE +/- 38.59, N = 3)
  -march=armv8.4-a: 356359.6 (SE +/- 8.85, N = 3)
  -march=armv8.4-a+sve: 356407.8 (SE +/- 10.52, N = 3)
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
Liquid-DSP LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.
Liquid-DSP 2021.01.31 (samples/s, More Is Better)
Threads: 8 - Buffer Length: 256 - Filter Length: 57
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 150673333 (SE +/- 12018.50, N = 3)
  -march=armv8.4-a: 176363333 (SE +/- 12018.50, N = 3)
  -march=armv8.4-a+sve: 167733333 (SE +/- 26666.67, N = 3)
Threads: 16 - Buffer Length: 256 - Filter Length: 57
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 301330000 (SE +/- 64291.01, N = 3)
  -march=armv8.4-a: 352700000 (SE +/- 30550.50, N = 3)
  -march=armv8.4-a+sve: 335423333 (SE +/- 20275.88, N = 3)
Threads: 32 - Buffer Length: 256 - Filter Length: 57
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 602443333 (SE +/- 16666.67, N = 3)
  -march=armv8.4-a: 705233333 (SE +/- 125476.87, N = 3)
  -march=armv8.4-a+sve: 668636667 (SE +/- 1978807.16, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
GROMACS This test runs the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
GROMACS 2022.1 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 2.279 (SE +/- 0.002, N = 3)
  -march=armv8.4-a: 2.277 (SE +/- 0.001, N = 3)
  -march=armv8.4-a+sve: 2.275 (SE +/- 0.002, N = 3)
  1. (CXX) g++ options: -O3
ASTC Encoder ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.
ASTC Encoder 3.2 (Seconds, Fewer Is Better)
Preset: Medium
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 4.7290 (SE +/- 0.0099, N = 3)
  -march=armv8.4-a: 4.8833 (SE +/- 0.0125, N = 3)
  -march=armv8.4-a+sve: 4.8092 (SE +/- 0.0080, N = 3)
Preset: Thorough
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 8.9427 (SE +/- 0.0034, N = 3)
  -march=armv8.4-a: 9.1435 (SE +/- 0.0046, N = 3)
  -march=armv8.4-a+sve: 9.0131 (SE +/- 0.0027, N = 3)
Preset: Exhaustive
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 34.68 (SE +/- 0.02, N = 3)
  -march=armv8.4-a: 35.36 (SE +/- 0.02, N = 3)
  -march=armv8.4-a+sve: 35.19 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread
Google Draco Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression. Learn more via the OpenBenchmarking.org test page.
Google Draco 1.5.0 (ms, Fewer Is Better)
Model: Lion
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 5297 (SE +/- 7.84, N = 3)
  -march=armv8.4-a: 5354 (SE +/- 2.65, N = 3)
  -march=armv8.4-a+sve: 5309 (SE +/- 2.40, N = 3)
Model: Church Facade
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 7797 (SE +/- 26.24, N = 3)
  -march=armv8.4-a: 7935 (SE +/- 6.64, N = 3)
  -march=armv8.4-a+sve: 7843 (SE +/- 7.00, N = 3)
1. (CXX) g++ options: -O3
Redis Redis is an open-source in-memory data structure store, used as a database, cache, and message broker. Learn more via the OpenBenchmarking.org test page.
Redis 6.0.9 (Requests Per Second, More Is Better)
Test: GET
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 2546595.58 (SE +/- 8325.87, N = 3)
  -march=armv8.4-a: 2523377.92 (SE +/- 1605.36, N = 3)
  -march=armv8.4-a+sve: 2513289.20 (SE +/- 9056.40, N = 3)
Test: SET
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 1879962.00 (SE +/- 1178.58, N = 3)
  -march=armv8.4-a: 1865840.13 (SE +/- 794.78, N = 3)
  -march=armv8.4-a+sve: 1861924.13 (SE +/- 7427.93, N = 3)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Caffe This is a benchmark of the Caffe deep learning framework. It currently supports the AlexNet and GoogleNet models and execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.
Caffe 2020-02-13 (Milli-Seconds, Fewer Is Better)
Model: AlexNet - Acceleration: CPU - Iterations: 200
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 42262 (SE +/- 6.51, N = 3)
  -march=armv8.4-a: 43634 (SE +/- 31.22, N = 3)
  -march=armv8.4-a+sve: 43931 (SE +/- 12.55, N = 3)
Model: GoogleNet - Acceleration: CPU - Iterations: 200
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 120428 (SE +/- 69.18, N = 3)
  -march=armv8.4-a: 123807 (SE +/- 49.72, N = 3)
  -march=armv8.4-a+sve: 125125 (SE +/- 105.70, N = 3)
1. (CXX) g++ options: -O3 -fPIC -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
TNN TNN is an open-source deep learning reasoning framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.
TNN 0.3 (ms, Fewer Is Better)
Target: CPU - Model: DenseNet
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 2390.26 (SE +/- 24.72, N = 3; MIN: 2289.32 / MAX: 2501.92)
  -march=armv8.4-a: 2730.40 (SE +/- 12.63, N = 3; MIN: 2665.42 / MAX: 2834.73)
  -march=armv8.4-a+sve: 2346.32 (SE +/- 22.16, N = 3; MIN: 2268.4 / MAX: 2446.6)
Target: CPU - Model: MobileNet v2
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 273.40 (SE +/- 0.24, N = 3; MIN: 272.22 / MAX: 274.95)
  -march=armv8.4-a: 260.78 (SE +/- 0.10, N = 3; MIN: 259.13 / MAX: 262.38)
  -march=armv8.4-a+sve: 280.24 (SE +/- 0.69, N = 3; MIN: 277.9 / MAX: 282.31)
Target: CPU - Model: SqueezeNet v2
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 76.21 (SE +/- 0.09, N = 3; MIN: 75.89 / MAX: 76.52)
  -march=armv8.4-a: 71.13 (SE +/- 0.07, N = 3; MIN: 70.76 / MAX: 71.58)
  -march=armv8.4-a+sve: 76.30 (SE +/- 0.09, N = 3; MIN: 76.07 / MAX: 76.53)
Target: CPU - Model: SqueezeNet v1.1
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 205.28 (SE +/- 0.25, N = 3; MIN: 204.74 / MAX: 206.1)
  -march=armv8.4-a: 257.70 (SE +/- 0.08, N = 3; MIN: 256.95 / MAX: 258.39)
  -march=armv8.4-a+sve: 205.80 (SE +/- 0.14, N = 3; MIN: 205.46 / MAX: 206.28)
1. (CXX) g++ options: -O3 -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl
Sysbench This is a benchmark of Sysbench with the built-in CPU and memory sub-tests. Sysbench is a scriptable multi-threaded benchmark tool based on LuaJIT. Learn more via the OpenBenchmarking.org test page.
Sysbench 1.0.20 - Test: CPU (Events Per Second, More Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 96702.81 (SE +/- 7.82, N = 3)
  -march=armv8.4-a: 96726.40 (SE +/- 8.50, N = 3)
  -march=armv8.4-a+sve: 96666.76 (SE +/- 2.72, N = 3)
  1. (CC) gcc options: -O2 -funroll-loops -O3 -rdynamic -ldl -laio -lm
ONNX Runtime ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Zoo. Learn more via the OpenBenchmarking.org test page.
ONNX Runtime 1.11 (Inferences Per Minute, More Is Better)
Model: GPT-2 - Device: CPU - Executor: Standard
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 12460 (SE +/- 10.09, N = 3)
  -march=armv8.4-a: 12364 (SE +/- 63.90, N = 3)
  -march=armv8.4-a+sve: 12317 (SE +/- 12.91, N = 3)
Model: bertsquad-12 - Device: CPU - Executor: Standard
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 774 (SE +/- 0.44, N = 3)
  -march=armv8.4-a: 773 (SE +/- 0.44, N = 3)
  -march=armv8.4-a+sve: 772 (SE +/- 0.50, N = 3)
Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 73 (SE +/- 0.00, N = 3)
  -march=armv8.4-a: 73 (SE +/- 0.00, N = 3)
  -march=armv8.4-a+sve: 73 (SE +/- 0.00, N = 3)
Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 938 (SE +/- 0.67, N = 3)
  -march=armv8.4-a: 938 (SE +/- 0.88, N = 3)
  -march=armv8.4-a+sve: 935 (SE +/- 0.17, N = 3)
Model: super-resolution-10 - Device: CPU - Executor: Standard
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 5416 (SE +/- 0.67, N = 3)
  -march=armv8.4-a: 5413 (SE +/- 0.93, N = 3)
  -march=armv8.4-a+sve: 5411 (SE +/- 2.17, N = 3)
1. (CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt
WavPack Audio Encoding This test times how long it takes to encode a sample WAV file to WavPack format with very high quality settings. Learn more via the OpenBenchmarking.org test page.
WavPack Audio Encoding 5.3 - WAV To WavPack (Seconds, Fewer Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 20.76 (SE +/- 0.03, N = 5)
  -march=armv8.4-a: 20.49 (SE +/- 0.00, N = 5)
  -march=armv8.4-a+sve: 20.52 (SE +/- 0.03, N = 5)
  1. (CXX) g++ options: -O3 -rdynamic
Kripke Kripke is a simple, scalable, 3D Sn deterministic particle transport code. Its primary purpose is to research how data layout, programming paradigms, and architectures affect the implementation and performance of Sn transport. Kripke is developed by LLNL. Learn more via the OpenBenchmarking.org test page.
Kripke 1.2.4 (Throughput FoM, More Is Better)
  -march=armv8.4-a+sve -mcpu=neoverse-v1: 194776367 (SE +/- 200700.56, N = 3)
  -march=armv8.4-a: 204143167 (SE +/- 226703.11, N = 3)
  -march=armv8.4-a+sve: 192709233 (SE +/- 298633.53, N = 3)
  1. (CXX) g++ options: -O3 -fopenmp
-march=armv8.4-a+sve
  Environment Notes: CXXFLAGS="-O3 -march=armv8.4-a+sve" CFLAGS="-O3 -march=armv8.4-a+sve"
  Testing initiated at 6 June 2022 17:44 by user ubuntu.
-march=armv8.4-a
  Environment Notes: CXXFLAGS="-O3 -march=armv8.4-a" CFLAGS="-O3 -march=armv8.4-a"
  Testing initiated at 7 June 2022 00:47 by user ubuntu.
-march=armv8.4-a+sve -mcpu=neoverse-v1
  Processor: ARMv8 Neoverse-V1 (32 Cores), Motherboard: Amazon EC2 c7g.8xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 62GB, Disk: 301GB Amazon Elastic Block Store, Network: Amazon Elastic
  OS: Ubuntu 22.04, Kernel: 5.15.0-1004-aws (aarch64), Compiler: GCC 12.0.0 20220117, File-System: ext4, System Layer: amazon
  Environment Notes: CXXFLAGS="-O3 -march=armv8.4-a+sve -mcpu=neoverse-v1" CFLAGS="-O3 -march=armv8.4-a+sve -mcpu=neoverse-v1"
  Testing initiated at 8 June 2022 13:12 by user ubuntu.
Notes reported identically for all three configurations:
  Kernel Notes: Transparent Huge Pages: madvise
  Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-nls --disable-werror --enable-checking=yes,extra,rtl --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-objc-gc=auto --enable-plugin --enable-shared --host=aarch64-linux-gnu --program-prefix= --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
  Python Notes: Python 3.10.4
  Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected