Neoverse-V1 Compiler Tests

Amazon testing on Ubuntu 22.04 via the Phoronix Test Suite by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2206087-PTS-SVE2607864&sor&grs.

Neoverse-V1 Compiler Tests: system details (identical for the armv8.4-a and armv8.4-a+sve runs)

Processor: ARMv8 Neoverse-V1 (32 Cores)
Motherboard: Amazon EC2 c7g.8xlarge (1.0 BIOS)
Chipset: Amazon Device 0200
Memory: 62GB
Disk: 301GB Amazon Elastic Block Store
Network: Amazon Elastic
OS: Ubuntu 22.04
Kernel: 5.15.0-1004-aws (aarch64)
Compiler: GCC 12.0.0 20220117
File-System: ext4
System Layer: amazon

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- armv8.4-a: CXXFLAGS="-O3 -march=armv8.4-a" CFLAGS="-O3 -march=armv8.4-a"
- armv8.4-a+sve: CXXFLAGS="-O3 -march=armv8.4-a+sve" CFLAGS="-O3 -march=armv8.4-a+sve"

Compiler Details
- --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-nls --disable-werror --enable-checking=yes,extra,rtl --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-objc-gc=auto --enable-plugin --enable-shared --host=aarch64-linux-gnu --program-prefix= --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Python Details
- Python 3.10.4

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of __user pointer sanitization
- spectre_v2: Mitigation of CSV2 BHB
- srbds: Not affected
- tsx_async_abort: Not affected

Neoverse-V1 Compiler Tests: results overview table (columns: armv8.4-a, armv8.4-a+sve; one row per test, with the per-test results given below).

Opus Codec Encoding

WAV To Opus Encode

Opus Codec Encoding 1.3.1, WAV To Opus Encode (Seconds, Fewer Is Better)
armv8.4-a+sve: 14.40 (SE +/- 0.02, N = 5)
armv8.4-a: 18.32 (SE +/- 0.00, N = 5)
1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm
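To compare the two builds across tests whose units point in different directions ("Fewer Is Better" versus "More Is Better"), each pair of results can be reduced to a single speedup factor. The following is a minimal sketch, not part of the Phoronix Test Suite; the function name `speedup` and its parameters are hypothetical. It uses the Opus encode times (18.32 s for armv8.4-a, 14.40 s for armv8.4-a+sve) as input.

```python
def speedup(baseline: float, candidate: float, higher_is_better: bool) -> float:
    """Return how many times faster the candidate is than the baseline.

    A value above 1.0 means the candidate build wins, below 1.0 means it loses.
    """
    if baseline <= 0 or candidate <= 0:
        raise ValueError("results must be positive")
    if higher_is_better:
        # Throughput-style metric (MP/s, Mflops, FPS, ...): larger is faster.
        return candidate / baseline
    # Time-style metric (seconds, ms): smaller is faster, so invert the ratio.
    return baseline / candidate


# Opus Codec Encoding, WAV To Opus Encode: Seconds, Fewer Is Better.
opus = speedup(baseline=18.32, candidate=14.40, higher_is_better=False)
print(f"armv8.4-a+sve vs armv8.4-a, Opus encode: {opus:.2f}x")
```

With these inputs the +sve build comes out roughly 1.27 times as fast as the plain armv8.4-a build.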

TNN

Target: CPU - Model: SqueezeNet v1.1

TNN 0.3, Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better)
armv8.4-a+sve: 205.80 (SE +/- 0.14, N = 3; MIN: 205.46 / MAX: 206.28)
armv8.4-a: 257.70 (SE +/- 0.08, N = 3; MIN: 256.95 / MAX: 258.39)
1. (CXX) g++ options: -O3 -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

eSpeak-NG Speech Engine

Text-To-Speech Synthesis

eSpeak-NG Speech Engine 20200907, Text-To-Speech Synthesis (Seconds, Fewer Is Better)
armv8.4-a+sve: 29.98 (SE +/- 0.30, N = 20)
armv8.4-a: 36.59 (SE +/- 0.31, N = 16)
1. (CC) gcc options: -O3 -std=c99

TNN

Target: CPU - Model: DenseNet

TNN 0.3, Target: CPU - Model: DenseNet (ms, Fewer Is Better)
armv8.4-a+sve: 2346.32 (SE +/- 22.16, N = 3; MIN: 2268.4 / MAX: 2446.6)
armv8.4-a: 2730.40 (SE +/- 12.63, N = 3; MIN: 2665.42 / MAX: 2834.73)
1. (CXX) g++ options: -O3 -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.33, Operation: HWB Color Space (Iterations Per Minute, More Is Better)
armv8.4-a+sve: 1067 (SE +/- 0.33, N = 3)
armv8.4-a: 978 (SE +/- 1.00, N = 3)
1. (CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

JPEG XL libjxl

Input: JPEG - Encode Speed: 7

JPEG XL libjxl 0.6.1, Input: JPEG - Encode Speed: 7 (MP/s, More Is Better)
armv8.4-a+sve: 79.58 (SE +/- 0.12, N = 3)
armv8.4-a: 73.21 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -O3 -funwind-tables -O2 -fPIE -pie

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.100, WAV To MP3 (Seconds, Fewer Is Better)
armv8.4-a+sve: 7.440 (SE +/- 0.002, N = 3)
armv8.4-a: 8.054 (SE +/- 0.004, N = 3)
1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -lm

TNN

Target: CPU - Model: MobileNet v2

TNN 0.3, Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better)
armv8.4-a: 260.78 (SE +/- 0.10, N = 3; MIN: 259.13 / MAX: 262.38)
armv8.4-a+sve: 280.24 (SE +/- 0.69, N = 3; MIN: 277.9 / MAX: 282.31)
1. (CXX) g++ options: -O3 -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

LuaJIT

Test: Fast Fourier Transform

LuaJIT 2.1-git, Test: Fast Fourier Transform (Mflops, More Is Better)
armv8.4-a: 661.55 (SE +/- 0.39, N = 3)
armv8.4-a+sve: 615.71 (SE +/- 10.69, N = 3)
1. (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

Ngspice

Circuit: C7552

Ngspice 34, Circuit: C7552 (Seconds, Fewer Is Better)
armv8.4-a: 103.93 (SE +/- 0.51, N = 3)
armv8.4-a+sve: 111.64 (SE +/- 1.04, N = 3)
1. (CC) gcc options: -O3 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

TNN

Target: CPU - Model: SqueezeNet v2

TNN 0.3, Target: CPU - Model: SqueezeNet v2 (ms, Fewer Is Better)
armv8.4-a: 71.13 (SE +/- 0.07, N = 3; MIN: 70.76 / MAX: 71.58)
armv8.4-a+sve: 76.30 (SE +/- 0.09, N = 3; MIN: 76.07 / MAX: 76.53)
1. (CXX) g++ options: -O3 -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

Nettle

Test: poly1305-aes

Nettle 3.8, Test: poly1305-aes (Mbyte/s, More Is Better)
armv8.4-a: 871.90 (SE +/- 1.52, N = 3)
armv8.4-a+sve: 820.51 (SE +/- 5.37, N = 3)
1. (CC) gcc options: -O3 -ggdb3 -lnettle -lgmp -lm -lcrypto

AOM AV1

Encoder Mode: Speed 10 Realtime - Input: Bosphorus 4K

AOM AV1 3.3, Encoder Mode: Speed 10 Realtime - Input: Bosphorus 4K (Frames Per Second, More Is Better)
armv8.4-a+sve: 65.62 (SE +/- 0.22, N = 3)
armv8.4-a: 61.88 (SE +/- 0.47, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

Kripke

Kripke 1.2.4 (Throughput FoM, More Is Better)
armv8.4-a: 204143167 (SE +/- 226703.11, N = 3)
armv8.4-a+sve: 192709233 (SE +/- 298633.53, N = 3)
1. (CXX) g++ options: -O3 -fopenmp

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.33, Operation: Rotate (Iterations Per Minute, More Is Better)
armv8.4-a+sve: 611 (SE +/- 1.20, N = 3)
armv8.4-a: 577 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31, Threads: 32 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
armv8.4-a: 705233333 (SE +/- 125476.87, N = 3)
armv8.4-a+sve: 668636667 (SE +/- 1978807.16, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31, Threads: 16 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
armv8.4-a: 352700000 (SE +/- 30550.50, N = 3)
armv8.4-a+sve: 335423333 (SE +/- 20275.88, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 8 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31, Threads: 8 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
armv8.4-a: 176363333 (SE +/- 12018.50, N = 3)
armv8.4-a+sve: 167733333 (SE +/- 26666.67, N = 3)
1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid

LuaJIT

Test: Dense LU Matrix Factorization

LuaJIT 2.1-git, Test: Dense LU Matrix Factorization (Mflops, More Is Better)
armv8.4-a+sve: 3521.13 (SE +/- 86.66, N = 3)
armv8.4-a: 3355.53 (SE +/- 6.02, N = 3)
1. (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

ACES DGEMM

Sustained Floating-Point Rate

ACES DGEMM 1.0, Sustained Floating-Point Rate (GFLOP/s, More Is Better)
armv8.4-a+sve: 13.44 (SE +/- 0.05, N = 3)
armv8.4-a: 12.81 (SE +/- 0.04, N = 3)
1. (CC) gcc options: -O3 -march=native -fopenmp

Botan

Test: Twofish - Decrypt

Botan 2.17.3, Test: Twofish - Decrypt (MiB/s, More Is Better)
armv8.4-a+sve: 258.15 (SE +/- 0.11, N = 3)
armv8.4-a: 246.16 (SE +/- 0.20, N = 3)
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

GraphicsMagick

Operation: Noise-Gaussian

GraphicsMagick 1.3.33, Operation: Noise-Gaussian (Iterations Per Minute, More Is Better)
armv8.4-a+sve: 515 (SE +/- 0.58, N = 3)
armv8.4-a: 494 (SE +/- 0.67, N = 3)
1. (CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

Ngspice

Circuit: C2670

Ngspice 34, Circuit: C2670 (Seconds, Fewer Is Better)
armv8.4-a: 102.56 (SE +/- 0.06, N = 3)
armv8.4-a+sve: 106.91 (SE +/- 0.18, N = 3)
1. (CC) gcc options: -O3 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

JPEG XL libjxl

Input: JPEG - Encode Speed: 8

JPEG XL libjxl 0.6.1, Input: JPEG - Encode Speed: 8 (MP/s, More Is Better)
armv8.4-a+sve: 27.34 (SE +/- 0.03, N = 3)
armv8.4-a: 26.30 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -funwind-tables -O2 -fPIE -pie

GraphicsMagick

Operation: Swirl

GraphicsMagick 1.3.33, Operation: Swirl (Iterations Per Minute, More Is Better)
armv8.4-a+sve: 1272 (SE +/- 1.33, N = 3)
armv8.4-a: 1225 (SE +/- 0.33, N = 3)
1. (CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

Botan

Test: Twofish

Botan 2.17.3, Test: Twofish (MiB/s, More Is Better)
armv8.4-a+sve: 248.89 (SE +/- 0.26, N = 3)
armv8.4-a: 239.70 (SE +/- 0.12, N = 3)
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

OpenJPEG

Encode: NASA Curiosity Panorama M34

OpenJPEG 2.4, Encode: NASA Curiosity Panorama M34 (ms, Fewer Is Better)
armv8.4-a+sve: 55196 (SE +/- 89.48, N = 3)
armv8.4-a: 57205 (SE +/- 19.06, N = 3)
1. (CXX) g++ options: -O3 -rdynamic

Coremark

CoreMark Size 666 - Iterations Per Second

Coremark 1.0, CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
armv8.4-a: 789646.92 (SE +/- 169.09, N = 3)
armv8.4-a+sve: 762066.50 (SE +/- 416.47, N = 3)
1. (CC) gcc options: -O2 -O3 -lrt" -lrt

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.33, Operation: Resizing (Iterations Per Minute, More Is Better)
armv8.4-a+sve: 2414 (SE +/- 1.86, N = 3)
armv8.4-a: 2339 (SE +/- 22.36, N = 3)
1. (CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

AOM AV1

Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p

AOM AV1 3.3, Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
armv8.4-a+sve: 123.95 (SE +/- 0.09, N = 3)
armv8.4-a: 120.13 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

Stockfish

Total Time

Stockfish 13, Total Time (Nodes Per Second, More Is Better)
armv8.4-a: 57485680 (SE +/- 721518.45, N = 14)
armv8.4-a+sve: 55823340 (SE +/- 645132.01, N = 3)
1. (CXX) g++ options: -lgcov -lpthread -O3 -fno-exceptions -std=c++17 -pedantic -flto -fprofile-use -fno-peel-loops -fno-tracer -flto=jobserver

AOM AV1

Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p

AOM AV1 3.3, Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
armv8.4-a+sve: 156.71 (SE +/- 0.20, N = 3)
armv8.4-a: 152.46 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

Stargate Digital Audio Workstation

Sample Rate: 44100 - Buffer Size: 1024

Stargate Digital Audio Workstation 21.10.9, Sample Rate: 44100 - Buffer Size: 1024 (Render Ratio, More Is Better)
armv8.4-a+sve: 6.538163 (SE +/- 0.002411, N = 3)
armv8.4-a: 6.370035 (SE +/- 0.002515, N = 3)
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Crypto++

Test: Unkeyed Algorithms

Crypto++ 8.2, Test: Unkeyed Algorithms (MiB/second, More Is Better)
armv8.4-a: 459.87 (SE +/- 0.25, N = 3)
armv8.4-a+sve: 449.02 (SE +/- 0.25, N = 3)
1. (CXX) g++ options: -O3 -fPIC -pthread -pipe

Stargate Digital Audio Workstation

Sample Rate: 44100 - Buffer Size: 512

Stargate Digital Audio Workstation 21.10.9, Sample Rate: 44100 - Buffer Size: 512 (Render Ratio, More Is Better)
armv8.4-a+sve: 6.213500 (SE +/- 0.003428, N = 3)
armv8.4-a: 6.072916 (SE +/- 0.002564, N = 3)
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

POV-Ray

Trace Time

POV-Ray 3.7.0.7, Trace Time (Seconds, Fewer Is Better)
armv8.4-a: 19.85 (SE +/- 0.03, N = 3)
armv8.4-a+sve: 20.26 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -pipe -O3 -ffast-math -R/usr/lib -lXpm -lSM -lICE -lX11 -lIlmImf -lIlmImf-2_5 -lImath-2_5 -lHalf-2_5 -lIex-2_5 -lIexMath-2_5 -lIlmThread-2_5 -lIlmThread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system

LuaJIT

Test: Composite

LuaJIT 2.1-git, Test: Composite (Mflops, More Is Better)
armv8.4-a+sve: 1309.03 (SE +/- 18.19, N = 3)
armv8.4-a: 1282.59 (SE +/- 0.41, N = 3)
1. (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

Stargate Digital Audio Workstation

Sample Rate: 480000 - Buffer Size: 1024

Stargate Digital Audio Workstation 21.10.9, Sample Rate: 480000 - Buffer Size: 1024 (Render Ratio, More Is Better)
armv8.4-a+sve: 6.450182 (SE +/- 0.002250, N = 3)
armv8.4-a: 6.322684 (SE +/- 0.002118, N = 3)
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Stargate Digital Audio Workstation

Sample Rate: 480000 - Buffer Size: 512

Stargate Digital Audio Workstation 21.10.9, Sample Rate: 480000 - Buffer Size: 512 (Render Ratio, More Is Better)
armv8.4-a+sve: 6.124455 (SE +/- 0.002030, N = 3)
armv8.4-a: 6.005386 (SE +/- 0.001956, N = 3)
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.33, Operation: Enhanced (Iterations Per Minute, More Is Better)
armv8.4-a: 732 (SE +/- 0.33, N = 3)
armv8.4-a+sve: 718 (SE +/- 0.33, N = 3)
1. (CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

WebP Image Encode

Encode Settings: Quality 100, Lossless

WebP Image Encode 1.1, Encode Settings: Quality 100, Lossless (Encode Time - Seconds, Fewer Is Better)
armv8.4-a+sve: 23.40 (SE +/- 0.18, N = 3)
armv8.4-a: 23.85 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -lm -ljpeg -lpng16 -ltiff

AOM AV1

Encoder Mode: Speed 10 Realtime - Input: Bosphorus 1080p

AOM AV1 3.3, Encoder Mode: Speed 10 Realtime - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
armv8.4-a+sve: 193.68 (SE +/- 0.20, N = 3)
armv8.4-a: 190.27 (SE +/- 0.28, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

LeelaChessZero

Backend: Eigen

LeelaChessZero 0.28, Backend: Eigen (Nodes Per Second, More Is Better)
armv8.4-a+sve: 1333 (SE +/- 14.64, N = 5)
armv8.4-a: 1311 (SE +/- 13.65, N = 3)
1. (CXX) g++ options: -flto -O3 -pthread

Stargate Digital Audio Workstation

Sample Rate: 96000 - Buffer Size: 1024

Stargate Digital Audio Workstation 21.10.9, Sample Rate: 96000 - Buffer Size: 1024 (Render Ratio, More Is Better)
armv8.4-a+sve: 4.807797 (SE +/- 0.002826, N = 3)
armv8.4-a: 4.729848 (SE +/- 0.000903, N = 3)
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

ASTC Encoder

Preset: Medium

ASTC Encoder 3.2, Preset: Medium (Seconds, Fewer Is Better)
armv8.4-a+sve: 4.8092 (SE +/- 0.0080, N = 3)
armv8.4-a: 4.8833 (SE +/- 0.0125, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

Zstd Compression

Compression Level: 19 - Compression Speed

Zstd Compression 1.5.0, Compression Level: 19 - Compression Speed (MB/s, More Is Better)
armv8.4-a+sve: 74.0 (SE +/- 0.03, N = 3)
armv8.4-a: 72.9 (SE +/- 0.23, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma

ASTC Encoder

Preset: Thorough

ASTC Encoder 3.2, Preset: Thorough (Seconds, Fewer Is Better)
armv8.4-a+sve: 9.0131 (SE +/- 0.0027, N = 3)
armv8.4-a: 9.1435 (SE +/- 0.0046, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

Timed MrBayes Analysis

Primate Phylogeny Analysis

Timed MrBayes Analysis 3.2.7, Primate Phylogeny Analysis (Seconds, Fewer Is Better)
armv8.4-a+sve: 234.26 (SE +/- 0.18, N = 3)
armv8.4-a: 237.54 (SE +/- 0.07, N = 3)
1. (CC) gcc options: -O3 -std=c99 -pedantic -lm

Stargate Digital Audio Workstation

Sample Rate: 96000 - Buffer Size: 512

Stargate Digital Audio Workstation 21.10.9, Sample Rate: 96000 - Buffer Size: 512 (Render Ratio, More Is Better)
armv8.4-a+sve: 4.474848 (SE +/- 0.002191, N = 3)
armv8.4-a: 4.414055 (SE +/- 0.003502, N = 3)
1. (CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

RNNoise

RNNoise 2020-06-28 (Seconds, Fewer Is Better)
armv8.4-a+sve: 17.39 (SE +/- 0.05, N = 3)
armv8.4-a: 17.62 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -O3 -pedantic -fvisibility=hidden

Zstd Compression

Compression Level: 3 - Compression Speed

Zstd Compression 1.5.0, Compression Level: 3 - Compression Speed (MB/s, More Is Better)
armv8.4-a+sve: 7027.2 (SE +/- 15.42, N = 3)
armv8.4-a: 6937.8 (SE +/- 14.45, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma

LeelaChessZero

Backend: BLAS

LeelaChessZero 0.28, Backend: BLAS (Nodes Per Second, More Is Better)
armv8.4-a+sve: 1297 (SE +/- 6.96, N = 3)
armv8.4-a: 1281 (SE +/- 13.99, N = 5)
1. (CXX) g++ options: -flto -O3 -pthread

Google Draco

Model: Church Facade

Google Draco 1.5.0, Model: Church Facade (ms, Fewer Is Better)
armv8.4-a+sve: 7843 (SE +/- 7.00, N = 3)
armv8.4-a: 7935 (SE +/- 6.64, N = 3)
1. (CXX) g++ options: -O3

Primesieve

1e12 Prime Number Generation

Primesieve 7.7, 1e12 Prime Number Generation (Seconds, Fewer Is Better)
armv8.4-a: 8.438 (SE +/- 0.022, N = 3)
armv8.4-a+sve: 8.533 (SE +/- 0.043, N = 3)
1. (CXX) g++ options: -O3

Nettle

Test: sha512

Nettle 3.8, Test: sha512 (Mbyte/s, More Is Better)
armv8.4-a+sve: 504.33 (SE +/- 0.07, N = 3)
armv8.4-a: 498.83 (SE +/- 0.04, N = 3)
1. (CC) gcc options: -O3 -ggdb3 -lnettle -lgmp -lm -lcrypto

Caffe

Model: GoogleNet - Acceleration: CPU - Iterations: 200

Caffe 2020-02-13, Model: GoogleNet - Acceleration: CPU - Iterations: 200 (Milli-Seconds, Fewer Is Better)
armv8.4-a: 123807 (SE +/- 49.72, N = 3)
armv8.4-a+sve: 125125 (SE +/- 105.70, N = 3)
1. (CXX) g++ options: -O3 -fPIC -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0, Poisson Pressure Solver (MFLOPS, More Is Better)
armv8.4-a: 5561.56 (SE +/- 2.66, N = 3)
armv8.4-a+sve: 5508.28 (SE +/- 5.76, N = 3)
1. (CC) gcc options: -O3

Botan

Test: AES-256

Botan 2.17.3, Test: AES-256 (MiB/s, More Is Better)
armv8.4-a: 5494.31 (SE +/- 14.75, N = 3)
armv8.4-a+sve: 5442.65 (SE +/- 9.30, N = 3)
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

LuaJIT

Test: Sparse Matrix Multiply

LuaJIT 2.1-git, Test: Sparse Matrix Multiply (Mflops, More Is Better)
armv8.4-a+sve: 1162.33 (SE +/- 7.20, N = 3)
armv8.4-a: 1151.57 (SE +/- 3.14, N = 3)
1. (CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

Nettle

Test: chacha

Nettle 3.8, Test: chacha (Mbyte/s, More Is Better)
armv8.4-a: 740.25 (SE +/- 0.61, N = 3; MIN: 454.21 / MAX: 956.53)
armv8.4-a+sve: 733.59 (SE +/- 0.55, N = 3; MIN: 442.26 / MAX: 956.22)
1. (CC) gcc options: -O3 -ggdb3 -lnettle -lgmp -lm -lcrypto

LAMMPS Molecular Dynamics Simulator

Model: Rhodopsin Protein

LAMMPS Molecular Dynamics Simulator 29Oct2020, Model: Rhodopsin Protein (ns/day, More Is Better)
armv8.4-a: 21.33 (SE +/- 0.09, N = 3)
armv8.4-a+sve: 21.15 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -lm

Google Draco

Model: Lion

Google Draco 1.5.0, Model: Lion (ms, Fewer Is Better)
armv8.4-a+sve: 5309 (SE +/- 2.40, N = 3)
armv8.4-a: 5354 (SE +/- 2.65, N = 3)
1. (CXX) g++ options: -O3

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.0, Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better)
armv8.4-a+sve: 40.3 (SE +/- 0.00, N = 3)
armv8.4-a: 40.0 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma

Caffe

Model: AlexNet - Acceleration: CPU - Iterations: 200

Caffe 2020-02-13, Model: AlexNet - Acceleration: CPU - Iterations: 200 (Milli-Seconds, Fewer Is Better)
armv8.4-a: 43634 (SE +/- 31.22, N = 3)
armv8.4-a+sve: 43931 (SE +/- 12.55, N = 3)
1. (CXX) g++ options: -O3 -fPIC -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

OpenSSL

Algorithm: SHA256

OpenSSL 3.0, Algorithm: SHA256 (byte/s, More Is Better)
armv8.4-a: 27603943570 (SE +/- 25639974.71, N = 3)
armv8.4-a+sve: 27428176880 (SE +/- 32102278.66, N = 3)
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

Botan

Test: Blowfish

Botan 2.17.3, Test: Blowfish (MiB/s, More Is Better)
armv8.4-a+sve: 280.57 (SE +/- 0.10, N = 3)
armv8.4-a: 278.87 (SE +/- 0.29, N = 3)
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Xmrig

Variant: Wownero - Hash Count: 1M

Xmrig 6.12.1, Variant: Wownero - Hash Count: 1M (H/s, More Is Better)
armv8.4-a+sve: 11877.8 (SE +/- 26.88, N = 3)
armv8.4-a: 11811.2 (SE +/- 27.59, N = 3)
1. (CXX) g++ options: -O3 -fexceptions -fno-rtti -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.3, WAV To FLAC (Seconds, Fewer Is Better)
armv8.4-a: 38.31 (SE +/- 0.02, N = 5)
armv8.4-a+sve: 38.52 (SE +/- 0.01, N = 5)
1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 3.2, Preset: Exhaustive (Seconds, Fewer Is Better)
armv8.4-a+sve: 35.19 (SE +/- 0.01, N = 3)
armv8.4-a: 35.36 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

JPEG XL libjxl

Input: PNG - Encode Speed: 7

JPEG XL libjxl 0.6.1, Input: PNG - Encode Speed: 7 (MP/s, More Is Better)
armv8.4-a+sve: 8.36 (SE +/- 0.01, N = 3)
armv8.4-a: 8.32 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3 -funwind-tables -O2 -fPIE -pie

Redis

Test: GET

Redis 6.0.9, Test: GET (Requests Per Second, More Is Better)
armv8.4-a: 2523377.92 (SE +/- 1605.36, N = 3)
armv8.4-a+sve: 2513289.20 (SE +/- 9056.40, N = 3)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

Zstd Compression 1.5.0, Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better)
armv8.4-a+sve: 3263.7 (SE +/- 0.59, N = 3)
armv8.4-a: 3250.9 (SE +/- 7.62, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma

x264

Video Input: Bosphorus 1080p

x264 2022-02-22, Video Input: Bosphorus 1080p (Frames Per Second, More Is Better)
armv8.4-a+sve: 169.58 (SE +/- 0.10, N = 3)
armv8.4-a: 168.92 (SE +/- 0.07, N = 3)
1. (CC) gcc options: -ldl -lavformat -lavcodec -lavutil -lswscale -lm -lpthread -O3 -flto

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

ONNX Runtime 1.11, Model: GPT-2 - Device: CPU - Executor: Standard (Inferences Per Minute, More Is Better)
armv8.4-a: 12364 (SE +/- 63.90, N = 3)
armv8.4-a+sve: 12317 (SE +/- 12.91, N = 3)
1. (CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

Botan

Test: ChaCha20Poly1305 - Decrypt

Botan 2.17.3, Test: ChaCha20Poly1305 - Decrypt (MiB/s, More Is Better)
armv8.4-a+sve: 383.95 (SE +/- 0.13, N = 3)
armv8.4-a: 382.51 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Zstd Compression

Compression Level: 19 - Decompression Speed

Zstd Compression 1.5.0, Compression Level: 19 - Decompression Speed (MB/s, More Is Better)
armv8.4-a+sve: 3094.8 (SE +/- 6.60, N = 3)
armv8.4-a: 3083.4 (SE +/- 8.46, N = 3)
1. (CC) gcc options: -O3 -pthread -lz -llzma

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

ONNX Runtime 1.11, Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inferences Per Minute, More Is Better)
armv8.4-a: 938 (SE +/- 0.88, N = 3)
armv8.4-a+sve: 935 (SE +/- 0.17, N = 3)
1. (CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

Xmrig

Variant: Monero - Hash Count: 1M

Xmrig 6.12.1 (H/s, More Is Better)
armv8.4-a+sve: 8669.8 (SE +/- 6.18, N = 3; -march=armv8.4-a+sve)
armv8.4-a: 8645.4 (SE +/- 9.56, N = 3; -march=armv8.4-a)
(CXX) g++ options: -O3 -fexceptions -fno-rtti -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

Nettle

Test: aes256

Nettle 3.8 (Mbyte/s, More Is Better)
armv8.4-a+sve: 4447.04 (SE +/- 0.39, N = 3; MIN: 3925.11 / MAX: 5627.84; -march=armv8.4-a+sve)
armv8.4-a: 4435.91 (SE +/- 3.11, N = 3; MIN: 3927.32 / MAX: 5628.86; -march=armv8.4-a)
(CC) gcc options: -O3 -ggdb3 -lnettle -lgmp -lm -lcrypto

Botan

Test: ChaCha20Poly1305

Botan 2.17.3 (MiB/s, More Is Better)
armv8.4-a+sve: 390.31 (SE +/- 0.07, N = 3)
armv8.4-a: 389.38 (SE +/- 0.04, N = 3)
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Redis

Test: SET

Redis 6.0.9 (Requests Per Second, More Is Better)
armv8.4-a: 1865840.13 (SE +/- 794.78, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 1861924.13 (SE +/- 7427.93, N = 3; -march=armv8.4-a+sve)
(CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3

Botan

Test: Blowfish - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
armv8.4-a+sve: 289.03 (SE +/- 0.03, N = 3)
armv8.4-a: 288.51 (SE +/- 0.07, N = 3)
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

LuaJIT

Test: Monte Carlo

LuaJIT 2.1-git (Mflops, More Is Better)
armv8.4-a+sve: 343.85 (SE +/- 0.54, N = 3; -march=armv8.4-a+sve)
armv8.4-a: 343.27 (SE +/- 0.35, N = 3; -march=armv8.4-a)
(CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

x264

Video Input: Bosphorus 4K

x264 2022-02-22 (Frames Per Second, More Is Better)
armv8.4-a+sve: 48.51 (SE +/- 0.01, N = 3)
armv8.4-a: 48.43 (SE +/- 0.08, N = 3)
(CC) gcc options: -ldl -lavformat -lavcodec -lavutil -lswscale -lm -lpthread -O3 -flto

WavPack Audio Encoding

WAV To WavPack

WavPack Audio Encoding 5.3 (Seconds, Fewer Is Better)
armv8.4-a: 20.49 (SE +/- 0.00, N = 5; -march=armv8.4-a)
armv8.4-a+sve: 20.52 (SE +/- 0.03, N = 5; -march=armv8.4-a+sve)
(CXX) g++ options: -O3 -rdynamic
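
Because the results mix "More Is Better" and "Fewer Is Better" metrics, it can help to normalise each pair before comparing them. The following is a minimal Python sketch, not part of the Phoronix Test Suite: the function name `compare` and the rule of treating a gap as notable only when it exceeds twice the combined standard error are assumptions made for illustration.

```python
from math import sqrt


def compare(sve, base, se_sve, se_base, higher_is_better=True, k=2.0):
    """Compare an armv8.4-a+sve result against the armv8.4-a baseline.

    Returns (gain_percent, notable):
      gain_percent > 0 always means the +sve build did better.
      notable is True when the raw gap exceeds k combined standard errors
      (k = 2.0 is an assumed rule of thumb, not a Phoronix Test Suite setting).
    """
    # Orient the ratio so that values above 1.0 favour the +sve build.
    ratio = sve / base if higher_is_better else base / sve
    gain_percent = (ratio - 1.0) * 100.0

    # Combine the two reported standard errors in quadrature.
    combined_se = sqrt(se_sve ** 2 + se_base ** 2)

    # If both SEs were reported as 0.00 (rounded), any non-zero gap passes
    # this check, so treat such cases with caution.
    notable = abs(sve - base) > k * combined_se
    return gain_percent, notable


# Zstd level 19, long mode, decompression speed (MB/s, more is better).
print(compare(3263.7, 3250.9, 0.59, 7.62))

# WavPack encode time (seconds, fewer is better).
print(compare(20.52, 20.49, 0.03, 0.00, higher_is_better=False))
```

Under these assumptions neither example clears the threshold, so both gaps would be read as within run-to-run noise.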

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

ONNX Runtime 1.11 (Inferences Per Minute, More Is Better)
armv8.4-a: 773 (SE +/- 0.44, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 772 (SE +/- 0.50, N = 3; -march=armv8.4-a+sve)
(CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

LuaJIT

Test: Jacobi Successive Over-Relaxation

LuaJIT 2.1-git (Mflops, More Is Better)
armv8.4-a+sve: 902.16 (SE +/- 1.00, N = 3; -march=armv8.4-a+sve)
armv8.4-a: 901.02 (SE +/- 0.68, N = 3; -march=armv8.4-a)
(CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
armv8.4-a: 8.630 (SE +/- 0.006, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 8.640 (SE +/- 0.017, N = 3; -march=armv8.4-a+sve)
(CC) gcc options: -fvisibility=hidden -O3 -lm -ljpeg -lpng16 -ltiff

Zstd Compression

Compression Level: 3, Long Mode - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
armv8.4-a+sve: 1242.7 (SE +/- 4.88, N = 3; -march=armv8.4-a+sve)
armv8.4-a: 1241.3 (SE +/- 5.37, N = 3; -march=armv8.4-a)
(CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 3, Long Mode - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
armv8.4-a+sve: 3824.8 (SE +/- 1.28, N = 3; -march=armv8.4-a+sve)
armv8.4-a: 3820.8 (SE +/- 3.95, N = 3; -march=armv8.4-a)
(CC) gcc options: -O3 -pthread -lz -llzma

AOM AV1

Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K

AOM AV1 3.3 (Frames Per Second, More Is Better)
armv8.4-a+sve: 62.19 (SE +/- 0.61, N = 3; -march=armv8.4-a+sve)
armv8.4-a: 62.13 (SE +/- 0.43, N = 3; -march=armv8.4-a)
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2022.1 (Ns Per Day, More Is Better)
armv8.4-a: 2.277 (SE +/- 0.001, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 2.275 (SE +/- 0.002, N = 3; -march=armv8.4-a+sve)
(CXX) g++ options: -O3

GNU GMP GMPbench

Total Time

GNU GMP GMPbench 6.2.1 (GMPbench Score, More Is Better)
armv8.4-a+sve: 4155.6 (-march=armv8.4-a+sve)
armv8.4-a: 4152.3 (-march=armv8.4-a)
(CC) gcc options: -O3 -lm

Sysbench

Test: CPU

Sysbench 1.0.20 (Events Per Second, More Is Better)
armv8.4-a: 96726.40 (SE +/- 8.50, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 96666.76 (SE +/- 2.72, N = 3; -march=armv8.4-a+sve)
(CC) gcc options: -O2 -funroll-loops -O3 -rdynamic -ldl -laio -lm

Botan

Test: AES-256 - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
armv8.4-a: 5477.57 (SE +/- 5.58, N = 3)
armv8.4-a+sve: 5474.32 (SE +/- 8.64, N = 3)
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

OpenSSL

Algorithm: RSA4096

OpenSSL 3.0 (sign/s, More Is Better)
armv8.4-a: 5090.5 (SE +/- 0.78, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 5088.1 (SE +/- 0.53, N = 3; -march=armv8.4-a+sve)
(CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

AOBench

Size: 2048 x 2048 - Total Time

AOBench (Seconds, Fewer Is Better)
armv8.4-a: 33.48 (SE +/- 0.00, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 33.49 (SE +/- 0.01, N = 3; -march=armv8.4-a+sve)
(CC) gcc options: -lm -O3

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

ONNX Runtime 1.11 (Inferences Per Minute, More Is Better)
armv8.4-a: 5413 (SE +/- 0.93, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 5411 (SE +/- 2.17, N = 3; -march=armv8.4-a+sve)
(CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

Botan

Test: CAST-256

Botan 2.17.3 (MiB/s, More Is Better)
armv8.4-a: 108.79 (SE +/- 0.02, N = 3)
armv8.4-a+sve: 108.75 (SE +/- 0.01, N = 3)
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: KASUMI

Botan 2.17.3 (MiB/s, More Is Better)
armv8.4-a: 62.02 (SE +/- 0.00, N = 3)
armv8.4-a+sve: 62.00 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: KASUMI - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
armv8.4-a: 62.28 (SE +/- 0.00, N = 3)
armv8.4-a+sve: 62.26 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Smallpt

Global Illumination Renderer; 128 Samples

Smallpt 1.0 (Seconds, Fewer Is Better)
armv8.4-a: 3.895 (SE +/- 0.002, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 3.896 (SE +/- 0.003, N = 3; -march=armv8.4-a+sve)
(CXX) g++ options: -fopenmp -O3

Botan

Test: CAST-256 - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
armv8.4-a+sve: 108.62 (SE +/- 0.01, N = 3)
armv8.4-a: 108.60 (SE +/- 0.02, N = 3)
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

C-Ray

Total Time - 4K, 16 Rays Per Pixel

C-Ray 1.1 (Seconds, Fewer Is Better)
armv8.4-a: 19.30 (SE +/- 0.03, N = 3; -march=armv8.4-a)
armv8.4-a+sve: 19.30 (SE +/- 0.00, N = 3; -march=armv8.4-a+sve)
(CC) gcc options: -lm -lpthread -O3

OpenSSL

Algorithm: RSA4096

OpenSSL 3.0 (verify/s, More Is Better)
armv8.4-a+sve: 356407.8 (SE +/- 10.52, N = 3; -march=armv8.4-a+sve)
armv8.4-a: 356359.6 (SE +/- 8.85, N = 3; -march=armv8.4-a)
(CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

ONNX Runtime 1.11 (Inferences Per Minute, More Is Better)
armv8.4-a+sve: 73 (SE +/- 0.00, N = 3; -march=armv8.4-a+sve)
armv8.4-a: 73 (SE +/- 0.00, N = 3; -march=armv8.4-a)
(CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

JPEG XL libjxl

Input: PNG - Encode Speed: 8

JPEG XL libjxl 0.6.1 (MP/s, More Is Better)
armv8.4-a+sve: 0.67 (SE +/- 0.00, N = 3; -march=armv8.4-a+sve)
armv8.4-a: 0.67 (SE +/- 0.00, N = 3; -march=armv8.4-a)
(CXX) g++ options: -O3 -funwind-tables -O2 -fPIE -pie


Phoronix Test Suite v10.8.5