Neoverse-V1 Compiler Tests

Amazon testing on Ubuntu 22.04 via the Phoronix Test Suite by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2206087-PTS-SVE2607864&rdt&grr.

Neoverse-V1 Compiler Tests

Configurations: armv8.4-a, armv8.4-a+sve
Processor: ARMv8 Neoverse-V1 (32 Cores)
Motherboard: Amazon EC2 c7g.8xlarge (1.0 BIOS)
Chipset: Amazon Device 0200
Memory: 62GB
Disk: 301GB Amazon Elastic Block Store
Network: Amazon Elastic
OS: Ubuntu 22.04
Kernel: 5.15.0-1004-aws (aarch64)
Compiler: GCC 12.0.0 20220117
File-System: ext4
System Layer: amazon

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- armv8.4-a: CXXFLAGS="-O3 -march=armv8.4-a" CFLAGS="-O3 -march=armv8.4-a"
- armv8.4-a+sve: CXXFLAGS="-O3 -march=armv8.4-a+sve" CFLAGS="-O3 -march=armv8.4-a+sve"

Compiler Details
- --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-nls --disable-werror --enable-checking=yes,extra,rtl --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-objc-gc=auto --enable-plugin --enable-shared --host=aarch64-linux-gnu --program-prefix= --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Python Details
- Python 3.10.4

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of __user pointer sanitization
- spectre_v2: Mitigation of CSV2 BHB
- srbds: Not affected
- tsx_async_abort: Not affected

Neoverse-V1 Compiler Tests

Test | armv8.4-a | armv8.4-a+sve
lczero: BLAS | 1281 | 1297
lczero: Eigen | 1311 | 1333
jpegxl: PNG - 8 | 0.67 | 0.67
mrbayes: Primate Phylogeny Analysis | 237.542 | 234.255
espeak: Text-To-Speech Synthesis | 36.587 | 29.982
openssl: SHA256 | 27603943570 | 27428176880
tnn: CPU - DenseNet | 2730.404 | 2346.322
gmpbench: Total Time | 4152.3 | 4155.6
jpegxl: PNG - 7 | 8.32 | 8.36
caffe: GoogleNet - CPU - 200 | 123807 | 125125
onnx: fcn-resnet101-11 - CPU - Standard | 73 | 73
onnx: GPT-2 - CPU - Standard | 12364 | 12317
onnx: bertsquad-12 - CPU - Standard | 773 | 772
onnx: ArcFace ResNet-100 - CPU - Standard | 938 | 935
onnx: super-resolution-10 - CPU - Standard | 5413 | 5411
xmrig: Monero - 1M | 8645.4 | 8669.8
ngspice: C7552 | 103.933 | 111.644
ngspice: C2670 | 102.558 | 106.910
sysbench: CPU | 96726.40 | 96666.76
xmrig: Wownero - 1M | 11811.2 | 11877.8
gromacs: MPI CPU - water_GMX50_bare | 2.277 | 2.275
stockfish: Total Time | 57485680 | 55823340
cryptopp: Unkeyed Algorithms | 459.870163 | 449.024208
encode-flac: WAV To FLAC | 38.311 | 38.515
graphics-magick: Resizing | 2339 | 2414
graphics-magick: Noise-Gaussian | 494 | 515
graphics-magick: Enhanced | 732 | 718
graphics-magick: Rotate | 577 | 611
graphics-magick: Swirl | 1225 | 1272
graphics-magick: HWB Color Space | 978 | 1067
openssl: RSA4096 (verify/s) | 356359.6 | 356407.8
openssl: RSA4096 (sign/s) | 5090.5 | 5088.1
himeno: Poisson Pressure Solver | 5561.559287 | 5508.282435
compress-zstd: 19 - Decompression Speed | 3083.4 | 3094.8
compress-zstd: 19 - Compression Speed | 72.9 | 74.0
compress-zstd: 19, Long Mode - Decompression Speed | 3250.9 | 3263.7
compress-zstd: 19, Long Mode - Compression Speed | 40.0 | 40.3
caffe: AlexNet - CPU - 200 | 43634 | 43931
stargate: 96000 - 512 | 4.414055 | 4.474848
astcenc: Exhaustive | 35.3647 | 35.1926
stargate: 96000 - 1024 | 4.729848 | 4.807797
botan: AES-256 - Decrypt | 5477.571 | 5474.321
botan: AES-256 | 5494.313 | 5442.649
compress-zstd: 3 - Compression Speed | 6937.8 | 7027.2
compress-zstd: 3, Long Mode - Decompression Speed | 3820.8 | 3824.8
compress-zstd: 3, Long Mode - Compression Speed | 1241.3 | 1242.7
encode-wavpack: WAV To WavPack | 20.488 | 20.515
aobench: 2048 x 2048 - Total Time | 33.481 | 33.494
luajit: Composite | 1282.59 | 1309.03
botan: Blowfish - Decrypt | 288.505 | 289.032
botan: Blowfish | 278.874 | 280.570
botan: Twofish - Decrypt | 246.155 | 258.148
botan: Twofish | 239.703 | 248.887
botan: ChaCha20Poly1305 - Decrypt | 382.514 | 383.945
botan: ChaCha20Poly1305 | 389.375 | 390.311
botan: CAST-256 - Decrypt | 108.599 | 108.623
botan: CAST-256 | 108.786 | 108.754
botan: KASUMI - Decrypt | 62.277 | 62.261
botan: KASUMI | 62.017 | 62.00
stargate: 480000 - 512 | 6.005386 | 6.124455
stargate: 44100 - 512 | 6.072916 | 6.213500
stargate: 480000 - 1024 | 6.322684 | 6.450182
stargate: 44100 - 1024 | 6.370035 | 6.538163
encode-opus: WAV To Opus Encode | 18.320 | 14.402
webp: Quality 100, Lossless | 23.852 | 23.403
povray: Trace Time | 19.847 | 20.263
liquid-dsp: 32 - 256 - 57 | 705233333 | 668636667
liquid-dsp: 16 - 256 - 57 | 352700000 | 335423333
liquid-dsp: 8 - 256 - 57 | 176363333 | 167733333
c-ray: Total Time - 4K, 16 Rays Per Pixel | 19.296 | 19.299
tnn: CPU - MobileNet v2 | 260.778 | 280.243
mt-dgemm: Sustained Floating-Point Rate | 12.813927 | 13.442233
rnnoise | 17.622 | 17.385
kripke | 204143167 | 192709233
coremark: CoreMark Size 666 - Iterations Per Second | 789646.924800 | 762066.501163
tnn: CPU - SqueezeNet v1.1 | 257.704 | 205.799
redis: SET | 1865840.13 | 1861924.13
jpegxl: JPEG - 7 | 73.21 | 79.58
redis: GET | 2523377.92 | 2513289.2
x264: Bosphorus 4K | 48.43 | 48.51
jpegxl: JPEG - 8 | 26.30 | 27.34
aom-av1: Speed 8 Realtime - Bosphorus 4K | 62.13 | 62.19
aom-av1: Speed 10 Realtime - Bosphorus 4K | 61.88 | 65.62
draco: Church Facade | 7935 | 7843
astcenc: Thorough | 9.1435 | 9.0131
webp: Quality 100, Highest Compression | 8.630 | 8.640
primesieve: 1e12 Prime Number Generation | 8.438 | 8.533
encode-mp3: WAV To MP3 | 8.054 | 7.440
draco: Lion | 5354 | 5309
tnn: CPU - SqueezeNet v2 | 71.130 | 76.301
aom-av1: Speed 8 Realtime - Bosphorus 1080p | 120.13 | 123.95
openjpeg: NASA Curiosity Panorama M34 | 57205 | 55196
astcenc: Medium | 4.8833 | 4.8092
nettle: aes256 | 4435.91 | 4447.04
aom-av1: Speed 9 Realtime - Bosphorus 1080p | 152.46 | 156.71
smallpt: Global Illumination Renderer; 128 Samples | 3.895 | 3.896
x264: Bosphorus 1080p | 168.92 | 169.58
aom-av1: Speed 10 Realtime - Bosphorus 1080p | 190.27 | 193.68
nettle: sha512 | 498.83 | 504.33
nettle: chacha | 740.25 | 733.59
lammps: Rhodopsin Protein | 21.330 | 21.149
nettle: poly1305-aes | 871.90 | 820.51
luajit: Jacobi Successive Over-Relaxation | 901.02 | 902.16
luajit: Dense LU Matrix Factorization | 3355.53 | 3521.13
luajit: Sparse Matrix Multiply | 1151.57 | 1162.33
luajit: Fast Fourier Transform | 661.55 | 615.71
luajit: Monte Carlo | 343.27 | 343.85
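To read the two result columns side by side, it can help to express each pair as a relative change that respects the metric's direction (more is better versus fewer is better). The Python sketch below is illustrative only: the helper name `relative_gain` is hypothetical, and the sample values are copied from the per-test results in this export.

```python
def relative_gain(base, new, higher_is_better):
    """Percent improvement of `new` over `base`; positive means `new` is faster."""
    if higher_is_better:
        return (new / base - 1.0) * 100.0
    # For time-like metrics, a smaller value is the improvement.
    return (base / new - 1.0) * 100.0


# (test, armv8.4-a, armv8.4-a+sve, higher_is_better), values from this export
samples = [
    ("eSpeak-NG Text-To-Speech Synthesis (s)", 36.59, 29.98, False),
    ("Opus WAV To Opus Encode (s)", 18.32, 14.40, False),
    ("Ngspice C7552 (s)", 103.93, 111.64, False),
    ("GraphicsMagick HWB Color Space (iter/min)", 978, 1067, True),
]

for name, plain, sve, higher in samples:
    print(f"{name}: {relative_gain(plain, sve, higher):+.1f}% with +sve")
```

A negative figure marks a case where the +sve build was slower than the plain armv8.4-a build.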

LeelaChessZero

Backend: BLAS

LeelaChessZero 0.28: Nodes Per Second, More Is Better
armv8.4-a (-march=armv8.4-a): 1281, SE +/- 13.99, N = 5
armv8.4-a+sve (-march=armv8.4-a+sve): 1297, SE +/- 6.96, N = 3
1. (CXX) g++ options: -flto -O3 -pthread

LeelaChessZero

Backend: Eigen

LeelaChessZero 0.28: Nodes Per Second, More Is Better
armv8.4-a (-march=armv8.4-a): 1311, SE +/- 13.65, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 1333, SE +/- 14.64, N = 5
1. (CXX) g++ options: -flto -O3 -pthread
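
Whether a small gap like the LeelaChessZero ones above is meaningful depends on the reported standard errors. A rough check, assuming the two configurations were measured independently and that "SE" is the standard error of the mean, divides the difference by the combined standard error. The function name `gap_in_se` is hypothetical and not part of the Phoronix Test Suite.

```python
import math


def gap_in_se(a, se_a, b, se_b):
    """Absolute difference between two means, in units of their combined SE."""
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    diff = abs(a - b)
    if combined == 0.0:
        # Both SEs reported as zero: avoid dividing by zero.
        return math.inf if diff else 0.0
    return diff / combined


# LeelaChessZero values from this export (nodes per second).
blas = gap_in_se(1281, 13.99, 1297, 6.96)
eigen = gap_in_se(1311, 13.65, 1333, 14.64)
print(f"BLAS gap: {blas:.2f} SE, Eigen gap: {eigen:.2f} SE")
```

As a rule of thumb rather than a formal significance test, a gap well below about 2 combined standard errors is hard to distinguish from run-to-run noise.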

JPEG XL libjxl

Input: PNG - Encode Speed: 8

JPEG XL libjxl 0.6.1: MP/s, More Is Better
armv8.4-a (-march=armv8.4-a): 0.67, SE +/- 0.00, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 0.67, SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -funwind-tables -O2 -fPIE -pie

Timed MrBayes Analysis

Primate Phylogeny Analysis

Timed MrBayes Analysis 3.2.7: Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 237.54, SE +/- 0.07, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 234.26, SE +/- 0.18, N = 3
1. (CC) gcc options: -O3 -std=c99 -pedantic -lm

eSpeak-NG Speech Engine

Text-To-Speech Synthesis

eSpeak-NG Speech Engine 20200907: Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 36.59, SE +/- 0.31, N = 16
armv8.4-a+sve (-march=armv8.4-a+sve): 29.98, SE +/- 0.30, N = 20
1. (CC) gcc options: -O3 -std=c99

OpenSSL

Algorithm: SHA256

OpenSSL 3.0: byte/s, More Is Better
armv8.4-a (-march=armv8.4-a): 27603943570, SE +/- 25639974.71, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 27428176880, SE +/- 32102278.66, N = 3
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

TNN

Target: CPU - Model: DenseNet

TNN 0.3: ms, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 2730.40, SE +/- 12.63, N = 3, MIN: 2665.42 / MAX: 2834.73
armv8.4-a+sve (-march=armv8.4-a+sve): 2346.32, SE +/- 22.16, N = 3, MIN: 2268.4 / MAX: 2446.6
1. (CXX) g++ options: -O3 -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

GNU GMP GMPbench

Total Time

GNU GMP GMPbench 6.2.1: GMPbench Score, More Is Better
armv8.4-a (-march=armv8.4-a): 4152.3
armv8.4-a+sve (-march=armv8.4-a+sve): 4155.6
1. (CC) gcc options: -O3 -lm

JPEG XL libjxl

Input: PNG - Encode Speed: 7

JPEG XL libjxl 0.6.1: MP/s, More Is Better
armv8.4-a (-march=armv8.4-a): 8.32, SE +/- 0.02, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 8.36, SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -funwind-tables -O2 -fPIE -pie

Caffe

Model: GoogleNet - Acceleration: CPU - Iterations: 200

Caffe 2020-02-13: Milli-Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 123807, SE +/- 49.72, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 125125, SE +/- 105.70, N = 3
1. (CXX) g++ options: -O3 -fPIC -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

ONNX Runtime 1.11: Inferences Per Minute, More Is Better
armv8.4-a (-march=armv8.4-a): 73, SE +/- 0.00, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 73, SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

ONNX Runtime 1.11: Inferences Per Minute, More Is Better
armv8.4-a (-march=armv8.4-a): 12364, SE +/- 63.90, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 12317, SE +/- 12.91, N = 3
1. (CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

ONNX Runtime 1.11: Inferences Per Minute, More Is Better
armv8.4-a (-march=armv8.4-a): 773, SE +/- 0.44, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 772, SE +/- 0.50, N = 3
1. (CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

ONNX Runtime 1.11: Inferences Per Minute, More Is Better
armv8.4-a (-march=armv8.4-a): 938, SE +/- 0.88, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 935, SE +/- 0.17, N = 3
1. (CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

ONNX Runtime 1.11: Inferences Per Minute, More Is Better
armv8.4-a (-march=armv8.4-a): 5413, SE +/- 0.93, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 5411, SE +/- 2.17, N = 3
1. (CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

Xmrig

Variant: Monero - Hash Count: 1M

Xmrig 6.12.1: H/s, More Is Better
armv8.4-a (-march=armv8.4-a): 8645.4, SE +/- 9.56, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 8669.8, SE +/- 6.18, N = 3
1. (CXX) g++ options: -O3 -fexceptions -fno-rtti -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

Ngspice

Circuit: C7552

Ngspice 34: Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 103.93, SE +/- 0.51, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 111.64, SE +/- 1.04, N = 3
1. (CC) gcc options: -O3 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

Ngspice

Circuit: C2670

Ngspice 34: Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 102.56, SE +/- 0.06, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 106.91, SE +/- 0.18, N = 3
1. (CC) gcc options: -O3 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

Sysbench

Test: CPU

Sysbench 1.0.20: Events Per Second, More Is Better
armv8.4-a (-march=armv8.4-a): 96726.40, SE +/- 8.50, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 96666.76, SE +/- 2.72, N = 3
1. (CC) gcc options: -O2 -funroll-loops -O3 -rdynamic -ldl -laio -lm

Xmrig

Variant: Wownero - Hash Count: 1M

Xmrig 6.12.1: H/s, More Is Better
armv8.4-a (-march=armv8.4-a): 11811.2, SE +/- 27.59, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 11877.8, SE +/- 26.88, N = 3
1. (CXX) g++ options: -O3 -fexceptions -fno-rtti -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2022.1: Ns Per Day, More Is Better
armv8.4-a (-march=armv8.4-a): 2.277, SE +/- 0.001, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 2.275, SE +/- 0.002, N = 3
1. (CXX) g++ options: -O3

Stockfish

Total Time

Stockfish 13: Nodes Per Second, More Is Better
armv8.4-a (-march=armv8.4-a): 57485680, SE +/- 721518.45, N = 14
armv8.4-a+sve (-march=armv8.4-a+sve): 55823340, SE +/- 645132.01, N = 3
1. (CXX) g++ options: -lgcov -lpthread -O3 -fno-exceptions -std=c++17 -pedantic -flto -fprofile-use -fno-peel-loops -fno-tracer -flto=jobserver

Crypto++

Test: Unkeyed Algorithms

Crypto++ 8.2: MiB/second, More Is Better
armv8.4-a (-march=armv8.4-a): 459.87, SE +/- 0.25, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 449.02, SE +/- 0.25, N = 3
1. (CXX) g++ options: -O3 -fPIC -pthread -pipe

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.3: Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 38.31, SE +/- 0.02, N = 5
armv8.4-a+sve (-march=armv8.4-a+sve): 38.52, SE +/- 0.01, N = 5
1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.33: Iterations Per Minute, More Is Better
armv8.4-a (-march=armv8.4-a): 2339, SE +/- 22.36, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 2414, SE +/- 1.86, N = 3
1. (CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: Noise-Gaussian

GraphicsMagick 1.3.33: Iterations Per Minute, More Is Better
armv8.4-a (-march=armv8.4-a): 494, SE +/- 0.67, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 515, SE +/- 0.58, N = 3
1. (CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.33: Iterations Per Minute, More Is Better
armv8.4-a (-march=armv8.4-a): 732, SE +/- 0.33, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 718, SE +/- 0.33, N = 3
1. (CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.33: Iterations Per Minute, More Is Better
armv8.4-a (-march=armv8.4-a): 577, SE +/- 0.00, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 611, SE +/- 1.20, N = 3
1. (CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: Swirl

GraphicsMagick 1.3.33: Iterations Per Minute, More Is Better
armv8.4-a (-march=armv8.4-a): 1225, SE +/- 0.33, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 1272, SE +/- 1.33, N = 3
1. (CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.33: Iterations Per Minute, More Is Better
armv8.4-a (-march=armv8.4-a): 978, SE +/- 1.00, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 1067, SE +/- 0.33, N = 3
1. (CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

OpenSSL

Algorithm: RSA4096

OpenSSL 3.0: verify/s, More Is Better
armv8.4-a (-march=armv8.4-a): 356359.6, SE +/- 8.85, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 356407.8, SE +/- 10.52, N = 3
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: RSA4096

OpenSSL 3.0: sign/s, More Is Better
armv8.4-a (-march=armv8.4-a): 5090.5, SE +/- 0.78, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 5088.1, SE +/- 0.53, N = 3
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0: MFLOPS, More Is Better
armv8.4-a (-march=armv8.4-a): 5561.56, SE +/- 2.66, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 5508.28, SE +/- 5.76, N = 3
1. (CC) gcc options: -O3

Zstd Compression

Compression Level: 19 - Decompression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
armv8.4-a (-march=armv8.4-a): 3083.4, SE +/- 8.46, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 3094.8, SE +/- 6.60, N = 3
1. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Compression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
armv8.4-a (-march=armv8.4-a): 72.9, SE +/- 0.23, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 74.0, SE +/- 0.03, N = 3
1. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
armv8.4-a (-march=armv8.4-a): 3250.9, SE +/- 7.62, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 3263.7, SE +/- 0.59, N = 3
1. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
armv8.4-a (-march=armv8.4-a): 40.0, SE +/- 0.03, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 40.3, SE +/- 0.00, N = 3
1. (CC) gcc options: -O3 -pthread -lz -llzma

Caffe

Model: AlexNet - Acceleration: CPU - Iterations: 200

Caffe 2020-02-13: Milli-Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 43634, SE +/- 31.22, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 43931, SE +/- 12.55, N = 3
1. (CXX) g++ options: -O3 -fPIC -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Stargate Digital Audio Workstation

Sample Rate: 96000 - Buffer Size: 512

Stargate Digital Audio Workstation 21.10.9: Render Ratio, More Is Better
armv8.4-a: 4.414055, SE +/- 0.003502, N = 3
armv8.4-a+sve: 4.474848, SE +/- 0.002191, N = 3
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 3.2: Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 35.36, SE +/- 0.02, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 35.19, SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -flto -pthread

Stargate Digital Audio Workstation

Sample Rate: 96000 - Buffer Size: 1024

Stargate Digital Audio Workstation 21.10.9: Render Ratio, More Is Better
armv8.4-a: 4.729848, SE +/- 0.000903, N = 3
armv8.4-a+sve: 4.807797, SE +/- 0.002826, N = 3
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Botan

Test: AES-256 - Decrypt

Botan 2.17.3: MiB/s, More Is Better
armv8.4-a: 5477.57, SE +/- 5.58, N = 3
armv8.4-a+sve: 5474.32, SE +/- 8.64, N = 3
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: AES-256

Botan 2.17.3: MiB/s, More Is Better
armv8.4-a: 5494.31, SE +/- 14.75, N = 3
armv8.4-a+sve: 5442.65, SE +/- 9.30, N = 3
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Zstd Compression

Compression Level: 3 - Compression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
armv8.4-a (-march=armv8.4-a): 6937.8, SE +/- 14.45, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 7027.2, SE +/- 15.42, N = 3
1. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 3, Long Mode - Decompression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
armv8.4-a (-march=armv8.4-a): 3820.8, SE +/- 3.95, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 3824.8, SE +/- 1.28, N = 3
1. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 3, Long Mode - Compression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
armv8.4-a (-march=armv8.4-a): 1241.3, SE +/- 5.37, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 1242.7, SE +/- 4.88, N = 3
1. (CC) gcc options: -O3 -pthread -lz -llzma

WavPack Audio Encoding

WAV To WavPack

WavPack Audio Encoding 5.3: Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 20.49, SE +/- 0.00, N = 5
armv8.4-a+sve (-march=armv8.4-a+sve): 20.52, SE +/- 0.03, N = 5
1. (CXX) g++ options: -O3 -rdynamic

AOBench

Size: 2048 x 2048 - Total Time

AOBench: Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 33.48, SE +/- 0.00, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 33.49, SE +/- 0.01, N = 3
1. (CC) gcc options: -lm -O3

LuaJIT

Test: Composite

LuaJIT 2.1-git: Mflops, More Is Better
armv8.4-a (-march=armv8.4-a): 1282.59, SE +/- 0.41, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 1309.03, SE +/- 18.19, N = 3
1. (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

Botan

Test: Blowfish - Decrypt

Botan 2.17.3: MiB/s, More Is Better
armv8.4-a: 288.51, SE +/- 0.07, N = 3
armv8.4-a+sve: 289.03, SE +/- 0.03, N = 3
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: Blowfish

Botan 2.17.3: MiB/s, More Is Better
armv8.4-a: 278.87, SE +/- 0.29, N = 3
armv8.4-a+sve: 280.57, SE +/- 0.10, N = 3
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: Twofish - Decrypt

Botan 2.17.3: MiB/s, More Is Better
armv8.4-a: 246.16, SE +/- 0.20, N = 3
armv8.4-a+sve: 258.15, SE +/- 0.11, N = 3
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: Twofish

Botan 2.17.3: MiB/s, More Is Better
armv8.4-a: 239.70, SE +/- 0.12, N = 3
armv8.4-a+sve: 248.89, SE +/- 0.26, N = 3
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: ChaCha20Poly1305 - Decrypt

Botan 2.17.3: MiB/s, More Is Better
armv8.4-a: 382.51, SE +/- 0.02, N = 3
armv8.4-a+sve: 383.95, SE +/- 0.13, N = 3
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: ChaCha20Poly1305

Botan 2.17.3: MiB/s, More Is Better
armv8.4-a: 389.38, SE +/- 0.04, N = 3
armv8.4-a+sve: 390.31, SE +/- 0.07, N = 3
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: CAST-256 - Decrypt

Botan 2.17.3: MiB/s, More Is Better
armv8.4-a: 108.60, SE +/- 0.02, N = 3
armv8.4-a+sve: 108.62, SE +/- 0.01, N = 3
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: CAST-256

Botan 2.17.3: MiB/s, More Is Better
armv8.4-a: 108.79, SE +/- 0.02, N = 3
armv8.4-a+sve: 108.75, SE +/- 0.01, N = 3
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: KASUMI - Decrypt

Botan 2.17.3: MiB/s, More Is Better
armv8.4-a: 62.28, SE +/- 0.00, N = 3
armv8.4-a+sve: 62.26, SE +/- 0.00, N = 3
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: KASUMI

Botan 2.17.3: MiB/s, More Is Better
armv8.4-a: 62.02, SE +/- 0.00, N = 3
armv8.4-a+sve: 62.00, SE +/- 0.00, N = 3
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Stargate Digital Audio Workstation

Sample Rate: 480000 - Buffer Size: 512

Stargate Digital Audio Workstation 21.10.9: Render Ratio, More Is Better
armv8.4-a: 6.005386, SE +/- 0.001956, N = 3
armv8.4-a+sve: 6.124455, SE +/- 0.002030, N = 3
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Stargate Digital Audio Workstation

Sample Rate: 44100 - Buffer Size: 512

Stargate Digital Audio Workstation 21.10.9: Render Ratio, More Is Better
armv8.4-a: 6.072916, SE +/- 0.002564, N = 3
armv8.4-a+sve: 6.213500, SE +/- 0.003428, N = 3
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Stargate Digital Audio Workstation

Sample Rate: 480000 - Buffer Size: 1024

Stargate Digital Audio Workstation 21.10.9: Render Ratio, More Is Better
armv8.4-a: 6.322684, SE +/- 0.002118, N = 3
armv8.4-a+sve: 6.450182, SE +/- 0.002250, N = 3
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Stargate Digital Audio Workstation

Sample Rate: 44100 - Buffer Size: 1024

Stargate Digital Audio Workstation 21.10.9: Render Ratio, More Is Better
armv8.4-a: 6.370035, SE +/- 0.002515, N = 3
armv8.4-a+sve: 6.538163, SE +/- 0.002411, N = 3
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Opus Codec Encoding

WAV To Opus Encode

Opus Codec Encoding 1.3.1: Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 18.32, SE +/- 0.00, N = 5
armv8.4-a+sve (-march=armv8.4-a+sve): 14.40, SE +/- 0.02, N = 5
1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm

WebP Image Encode

Encode Settings: Quality 100, Lossless

WebP Image Encode 1.1: Encode Time - Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 23.85, SE +/- 0.01, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 23.40, SE +/- 0.18, N = 3
1. (CC) gcc options: -fvisibility=hidden -O3 -lm -ljpeg -lpng16 -ltiff

POV-Ray

Trace Time

POV-Ray 3.7.0.7: Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 19.85, SE +/- 0.03, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 20.26, SE +/- 0.04, N = 3
1. (CXX) g++ options: -pipe -O3 -ffast-math -R/usr/lib -lXpm -lSM -lICE -lX11 -lIlmImf -lIlmImf-2_5 -lImath-2_5 -lHalf-2_5 -lIex-2_5 -lIexMath-2_5 -lIlmThread-2_5 -lIlmThread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31: samples/s, More Is Better
armv8.4-a (-march=armv8.4-a): 705233333, SE +/- 125476.87, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 668636667, SE +/- 1978807.16, N = 3
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31: samples/s, More Is Better
armv8.4-a (-march=armv8.4-a): 352700000, SE +/- 30550.50, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 335423333, SE +/- 20275.88, N = 3
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 8 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31: samples/s, More Is Better
armv8.4-a (-march=armv8.4-a): 176363333, SE +/- 12018.50, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 167733333, SE +/- 26666.67, N = 3
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

C-Ray

Total Time - 4K, 16 Rays Per Pixel

C-Ray 1.1: Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 19.30, SE +/- 0.03, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 19.30, SE +/- 0.00, N = 3
1. (CC) gcc options: -lm -lpthread -O3

TNN

Target: CPU - Model: MobileNet v2

TNN 0.3: ms, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 260.78, SE +/- 0.10, N = 3, MIN: 259.13 / MAX: 262.38
armv8.4-a+sve (-march=armv8.4-a+sve): 280.24, SE +/- 0.69, N = 3, MIN: 277.9 / MAX: 282.31
1. (CXX) g++ options: -O3 -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

ACES DGEMM

Sustained Floating-Point Rate

ACES DGEMM 1.0: GFLOP/s, More Is Better
armv8.4-a (-march=armv8.4-a): 12.81, SE +/- 0.04, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 13.44, SE +/- 0.05, N = 3
1. (CC) gcc options: -O3 -march=native -fopenmp

RNNoise

RNNoise 2020-06-28: Seconds, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 17.62, SE +/- 0.05, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 17.39, SE +/- 0.05, N = 3
1. (CC) gcc options: -O3 -pedantic -fvisibility=hidden

Kripke

Kripke 1.2.4: Throughput FoM, More Is Better
armv8.4-a (-march=armv8.4-a): 204143167, SE +/- 226703.11, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 192709233, SE +/- 298633.53, N = 3
1. (CXX) g++ options: -O3 -fopenmp

Coremark

CoreMark Size 666 - Iterations Per Second

Coremark 1.0: Iterations/Sec, More Is Better
armv8.4-a (-march=armv8.4-a): 789646.92, SE +/- 169.09, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 762066.50, SE +/- 416.47, N = 3
1. (CC) gcc options: -O2 -O3 -lrt" -lrt

TNN

Target: CPU - Model: SqueezeNet v1.1

TNN 0.3: ms, Fewer Is Better
armv8.4-a (-march=armv8.4-a): 257.70, SE +/- 0.08, N = 3, MIN: 256.95 / MAX: 258.39
armv8.4-a+sve (-march=armv8.4-a+sve): 205.80, SE +/- 0.14, N = 3, MIN: 205.46 / MAX: 206.28
1. (CXX) g++ options: -O3 -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

Redis

Test: SET

Redis 6.0.9: Requests Per Second, More Is Better
armv8.4-a (-march=armv8.4-a): 1865840.13, SE +/- 794.78, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 1861924.13, SE +/- 7427.93, N = 3
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3

JPEG XL libjxl

Input: JPEG - Encode Speed: 7

JPEG XL libjxl 0.6.1: MP/s, More Is Better
armv8.4-a (-march=armv8.4-a): 73.21, SE +/- 0.13, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 79.58, SE +/- 0.12, N = 3
1. (CXX) g++ options: -O3 -funwind-tables -O2 -fPIE -pie

Redis

Test: GET

Redis 6.0.9 (Requests Per Second, More Is Better)
armv8.4-a: 2523377.92 (SE +/- 1605.36, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 2513289.20 (SE +/- 9056.40, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3

x264

Video Input: Bosphorus 4K

x264 2022-02-22 (Frames Per Second, More Is Better)
armv8.4-a: 48.43 (SE +/- 0.08, N = 3)
armv8.4-a+sve: 48.51 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -ldl -lavformat -lavcodec -lavutil -lswscale -lm -lpthread -O3 -flto

JPEG XL libjxl

Input: JPEG - Encode Speed: 8

JPEG XL libjxl 0.6.1 (MP/s, More Is Better)
armv8.4-a: 26.30 (SE +/- 0.01, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 27.34 (SE +/- 0.03, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -O3 -funwind-tables -O2 -fPIE -pie

AOM AV1

Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K

AOM AV1 3.3 (Frames Per Second, More Is Better)
armv8.4-a: 62.13 (SE +/- 0.43, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 62.19 (SE +/- 0.61, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

AOM AV1

Encoder Mode: Speed 10 Realtime - Input: Bosphorus 4K

AOM AV1 3.3 (Frames Per Second, More Is Better)
armv8.4-a: 61.88 (SE +/- 0.47, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 65.62 (SE +/- 0.22, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

Google Draco

Model: Church Facade

Google Draco 1.5.0 (ms, Fewer Is Better)
armv8.4-a: 7935 (SE +/- 6.64, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 7843 (SE +/- 7.00, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -O3

ASTC Encoder

Preset: Thorough

ASTC Encoder 3.2 (Seconds, Fewer Is Better)
armv8.4-a: 9.1435 (SE +/- 0.0046, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 9.0131 (SE +/- 0.0027, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -O3 -flto -pthread

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
armv8.4-a: 8.630 (SE +/- 0.006, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 8.640 (SE +/- 0.017, N = 3; -march=armv8.4-a+sve)
1. (CC) gcc options: -fvisibility=hidden -O3 -lm -ljpeg -lpng16 -ltiff

Primesieve

1e12 Prime Number Generation

Primesieve 7.7 (Seconds, Fewer Is Better)
armv8.4-a: 8.438 (SE +/- 0.022, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 8.533 (SE +/- 0.043, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -O3

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.100 (Seconds, Fewer Is Better)
armv8.4-a: 8.054 (SE +/- 0.004, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 7.440 (SE +/- 0.002, N = 3; -march=armv8.4-a+sve)
1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -lm

Google Draco

Model: Lion

Google Draco 1.5.0 (ms, Fewer Is Better)
armv8.4-a: 5354 (SE +/- 2.65, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 5309 (SE +/- 2.40, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -O3

TNN

Target: CPU - Model: SqueezeNet v2

TNN 0.3 (ms, Fewer Is Better)
armv8.4-a: 71.13 (SE +/- 0.07, N = 3; MIN: 70.76 / MAX: 71.58; -march=armv8.4-a)
armv8.4-a+sve: 76.30 (SE +/- 0.09, N = 3; MIN: 76.07 / MAX: 76.53; -march=armv8.4-a+sve)
1. (CXX) g++ options: -O3 -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

AOM AV1

Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p

AOM AV1 3.3 (Frames Per Second, More Is Better)
armv8.4-a: 120.13 (SE +/- 0.03, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 123.95 (SE +/- 0.09, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

OpenJPEG

Encode: NASA Curiosity Panorama M34

OpenJPEG 2.4 (ms, Fewer Is Better)
armv8.4-a: 57205 (SE +/- 19.06, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 55196 (SE +/- 89.48, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -O3 -rdynamic

ASTC Encoder

Preset: Medium

ASTC Encoder 3.2 (Seconds, Fewer Is Better)
armv8.4-a: 4.8833 (SE +/- 0.0125, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 4.8092 (SE +/- 0.0080, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -O3 -flto -pthread

Nettle

Test: aes256

Nettle 3.8 (Mbyte/s, More Is Better)
armv8.4-a: 4435.91 (SE +/- 3.11, N = 3; MIN: 3927.32 / MAX: 5628.86; -march=armv8.4-a)
armv8.4-a+sve: 4447.04 (SE +/- 0.39, N = 3; MIN: 3925.11 / MAX: 5627.84; -march=armv8.4-a+sve)
1. (CC) gcc options: -O3 -ggdb3 -lnettle -lgmp -lm -lcrypto

AOM AV1

Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p

AOM AV1 3.3 (Frames Per Second, More Is Better)
armv8.4-a: 152.46 (SE +/- 0.03, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 156.71 (SE +/- 0.20, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

Smallpt

Global Illumination Renderer; 128 Samples

Smallpt 1.0 (Seconds, Fewer Is Better)
armv8.4-a: 3.895 (SE +/- 0.002, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 3.896 (SE +/- 0.003, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -fopenmp -O3

x264

Video Input: Bosphorus 1080p

x264 2022-02-22 (Frames Per Second, More Is Better)
armv8.4-a: 168.92 (SE +/- 0.07, N = 3)
armv8.4-a+sve: 169.58 (SE +/- 0.10, N = 3)
1. (CC) gcc options: -ldl -lavformat -lavcodec -lavutil -lswscale -lm -lpthread -O3 -flto

AOM AV1

Encoder Mode: Speed 10 Realtime - Input: Bosphorus 1080p

AOM AV1 3.3 (Frames Per Second, More Is Better)
armv8.4-a: 190.27 (SE +/- 0.28, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 193.68 (SE +/- 0.20, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

Nettle

Test: sha512

Nettle 3.8 (Mbyte/s, More Is Better)
armv8.4-a: 498.83 (SE +/- 0.04, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 504.33 (SE +/- 0.07, N = 3; -march=armv8.4-a+sve)
1. (CC) gcc options: -O3 -ggdb3 -lnettle -lgmp -lm -lcrypto

Nettle

Test: chacha

Nettle 3.8 (Mbyte/s, More Is Better)
armv8.4-a: 740.25 (SE +/- 0.61, N = 3; MIN: 454.21 / MAX: 956.53; -march=armv8.4-a)
armv8.4-a+sve: 733.59 (SE +/- 0.55, N = 3; MIN: 442.26 / MAX: 956.22; -march=armv8.4-a+sve)
1. (CC) gcc options: -O3 -ggdb3 -lnettle -lgmp -lm -lcrypto

LAMMPS Molecular Dynamics Simulator

Model: Rhodopsin Protein

LAMMPS Molecular Dynamics Simulator 29Oct2020 (ns/day, More Is Better)
armv8.4-a: 21.33 (SE +/- 0.09, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 21.15 (SE +/- 0.01, N = 3; -march=armv8.4-a+sve)
1. (CXX) g++ options: -O3 -lm

Nettle

Test: poly1305-aes

Nettle 3.8 (Mbyte/s, More Is Better)
armv8.4-a: 871.90 (SE +/- 1.52, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 820.51 (SE +/- 5.37, N = 3; -march=armv8.4-a+sve)
1. (CC) gcc options: -O3 -ggdb3 -lnettle -lgmp -lm -lcrypto

LuaJIT

Test: Jacobi Successive Over-Relaxation

LuaJIT 2.1-git (Mflops, More Is Better)
armv8.4-a: 901.02 (SE +/- 0.68, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 902.16 (SE +/- 1.00, N = 3; -march=armv8.4-a+sve)
1. (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

LuaJIT

Test: Dense LU Matrix Factorization

LuaJIT 2.1-git (Mflops, More Is Better)
armv8.4-a: 3355.53 (SE +/- 6.02, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 3521.13 (SE +/- 86.66, N = 3; -march=armv8.4-a+sve)
1. (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

LuaJIT

Test: Sparse Matrix Multiply

LuaJIT 2.1-git (Mflops, More Is Better)
armv8.4-a: 1151.57 (SE +/- 3.14, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 1162.33 (SE +/- 7.20, N = 3; -march=armv8.4-a+sve)
1. (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

LuaJIT

Test: Fast Fourier Transform

LuaJIT 2.1-git (Mflops, More Is Better)
armv8.4-a: 661.55 (SE +/- 0.39, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 615.71 (SE +/- 10.69, N = 3; -march=armv8.4-a+sve)
1. (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

LuaJIT

Test: Monte Carlo

LuaJIT 2.1-git (Mflops, More Is Better)
armv8.4-a: 343.27 (SE +/- 0.35, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 343.85 (SE +/- 0.54, N = 3; -march=armv8.4-a+sve)
1. (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector
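
The results in this export mix "More Is Better" and "Fewer Is Better" metrics, so comparing the armv8.4-a and armv8.4-a+sve runs needs a sign convention. The sketch below is one possible way to do it and is not part of the Phoronix Test Suite; the function name `sve_gain_percent` and the `SAMPLES` list are hypothetical, and the three sample rows are copied from the ACES DGEMM, TNN SqueezeNet v1.1, and LuaJIT Fast Fourier Transform results reported in this export.

```python
def sve_gain_percent(baseline: float, sve: float, more_is_better: bool) -> float:
    """Percent improvement of the +sve build over the baseline build.

    Positive means the +sve build did better; negative means it regressed.
    """
    if baseline <= 0 or sve <= 0:
        raise ValueError("results must be positive")
    if more_is_better:
        return (sve - baseline) / baseline * 100.0
    # For time-like metrics (seconds, ms) the smaller value is the better one.
    return (baseline - sve) / baseline * 100.0


# (test name, armv8.4-a result, armv8.4-a+sve result, more_is_better)
# Values are taken from this result file; the list itself is illustrative.
SAMPLES = [
    ("ACES DGEMM (GFLOP/s)", 12.81, 13.44, True),
    ("TNN SqueezeNet v1.1 (ms)", 257.70, 205.80, False),
    ("LuaJIT Fast Fourier Transform (Mflops)", 661.55, 615.71, True),
]

if __name__ == "__main__":
    for name, base, sve, more_is_better in SAMPLES:
        gain = sve_gain_percent(base, sve, more_is_better)
        print(f"{name}: {gain:+.2f}%")
```

A difference smaller than the reported standard errors (for example Smallpt or the x264 runs) is probably best read as noise rather than as an SVE effect.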


Phoronix Test Suite v10.8.5