Neoverse-V1 Compiler Tests

Amazon testing on Ubuntu 22.04 via the Phoronix Test Suite by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2206087-PTS-SVE2607864&gru.

System Details

Runs: armv8.4-a, armv8.4-a+sve
Processor: ARMv8 Neoverse-V1 (32 Cores)
Motherboard: Amazon EC2 c7g.8xlarge (1.0 BIOS)
Chipset: Amazon Device 0200
Memory: 62GB
Disk: 301GB Amazon Elastic Block Store
Network: Amazon Elastic
OS: Ubuntu 22.04
Kernel: 5.15.0-1004-aws (aarch64)
Compiler: GCC 12.0.0 20220117
File-System: ext4
System Layer: amazon

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- armv8.4-a: CXXFLAGS="-O3 -march=armv8.4-a" CFLAGS="-O3 -march=armv8.4-a"
- armv8.4-a+sve: CXXFLAGS="-O3 -march=armv8.4-a+sve" CFLAGS="-O3 -march=armv8.4-a+sve"

Compiler Details
- --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-nls --disable-werror --enable-checking=yes,extra,rtl --enable-clocale=gnu --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++ --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-objc-gc=auto --enable-plugin --enable-shared --host=aarch64-linux-gnu --program-prefix= --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Python Details
- Python 3.10.4

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of __user pointer sanitization
- spectre_v2: Mitigation of CSV2 BHB
- srbds: Not affected
- tsx_async_abort: Not affected

[Results overview table: all tests, armv8.4-a vs. armv8.4-a+sve]

OpenSSL

Algorithm: SHA256

OpenSSL 3.0 (byte/s, more is better)
armv8.4-a (-march=armv8.4-a): 27603943570, SE +/- 25639974.71, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 27428176880, SE +/- 32102278.66, N = 3
(CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl
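To judge whether a gap between the two builds is larger than run-to-run noise, the difference can be compared against the reported standard errors. The following is a minimal sketch using the SHA256 figures above; the helper names `relative_change` and `exceeds_noise` and the two-standard-error threshold are illustrative assumptions, not part of the Phoronix Test Suite, and with N = 3 the check is only a rough guide.

```python
import math


def relative_change(base, new):
    """Percent change of `new` relative to `base`."""
    return (new - base) / base * 100.0


def exceeds_noise(base, base_se, new, new_se, k=2.0):
    """True if |new - base| is larger than k combined standard errors.

    The threshold k=2.0 is an assumed rule of thumb.
    """
    combined_se = math.sqrt(base_se ** 2 + new_se ** 2)
    return abs(new - base) > k * combined_se


# OpenSSL SHA256 results (byte/s) and their standard errors.
base, base_se = 27603943570, 25639974.71   # armv8.4-a
sve, sve_se = 27428176880, 32102278.66     # armv8.4-a+sve

change = relative_change(base, sve)
significant = exceeds_noise(base, base_se, sve, sve_se)
print(f"change: {change:+.2f}%, exceeds noise: {significant}")
```

The same two helpers apply to any "more is better" result in this report that lists a standard error for both runs.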

Sysbench

Test: CPU

Sysbench 1.0.20 (Events Per Second, more is better)
armv8.4-a (-march=armv8.4-a): 96726.40, SE +/- 8.50, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 96666.76, SE +/- 2.72, N = 3
(CC) gcc options: -O2 -funroll-loops -O3 -rdynamic -ldl -laio -lm

AOM AV1

Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K

AOM AV1 3.3 (Frames Per Second, more is better)
armv8.4-a (-march=armv8.4-a): 62.13, SE +/- 0.43, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 62.19, SE +/- 0.61, N = 3
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

AOM AV1

Encoder Mode: Speed 10 Realtime - Input: Bosphorus 4K

AOM AV1 3.3 (Frames Per Second, more is better)
armv8.4-a (-march=armv8.4-a): 61.88, SE +/- 0.47, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 65.62, SE +/- 0.22, N = 3
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

AOM AV1

Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p

AOM AV1 3.3 (Frames Per Second, more is better)
armv8.4-a (-march=armv8.4-a): 120.13, SE +/- 0.03, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 123.95, SE +/- 0.09, N = 3
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

AOM AV1

Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p

AOM AV1 3.3 (Frames Per Second, more is better)
armv8.4-a (-march=armv8.4-a): 152.46, SE +/- 0.03, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 156.71, SE +/- 0.20, N = 3
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

AOM AV1

Encoder Mode: Speed 10 Realtime - Input: Bosphorus 1080p

AOM AV1 3.3 (Frames Per Second, more is better)
armv8.4-a (-march=armv8.4-a): 190.27, SE +/- 0.28, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 193.68, SE +/- 0.20, N = 3
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

x264

Video Input: Bosphorus 4K

x264 2022-02-22 (Frames Per Second, more is better)
armv8.4-a: 48.43, SE +/- 0.08, N = 3
armv8.4-a+sve: 48.51, SE +/- 0.01, N = 3
(CC) gcc options: -ldl -lavformat -lavcodec -lavutil -lswscale -lm -lpthread -O3 -flto

x264

Video Input: Bosphorus 1080p

x264 2022-02-22 (Frames Per Second, more is better)
armv8.4-a: 168.92, SE +/- 0.07, N = 3
armv8.4-a+sve: 169.58, SE +/- 0.10, N = 3
(CC) gcc options: -ldl -lavformat -lavcodec -lavutil -lswscale -lm -lpthread -O3 -flto

ACES DGEMM

Sustained Floating-Point Rate

ACES DGEMM 1.0 (GFLOP/s, more is better)
armv8.4-a (-march=armv8.4-a): 12.81, SE +/- 0.04, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 13.44, SE +/- 0.05, N = 3
(CC) gcc options: -O3 -march=native -fopenmp

GNU GMP GMPbench

Total Time

GNU GMP GMPbench 6.2.1 (GMPbench Score, more is better)
armv8.4-a (-march=armv8.4-a): 4152.3
armv8.4-a+sve (-march=armv8.4-a+sve): 4155.6
(CC) gcc options: -O3 -lm

Xmrig

Variant: Monero - Hash Count: 1M

Xmrig 6.12.1 (H/s, more is better)
armv8.4-a (-march=armv8.4-a): 8645.4, SE +/- 9.56, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 8669.8, SE +/- 6.18, N = 3
(CXX) g++ options: -O3 -fexceptions -fno-rtti -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

Xmrig

Variant: Wownero - Hash Count: 1M

Xmrig 6.12.1 (H/s, more is better)
armv8.4-a (-march=armv8.4-a): 11811.2, SE +/- 27.59, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 11877.8, SE +/- 26.88, N = 3
(CXX) g++ options: -O3 -fexceptions -fno-rtti -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

ONNX Runtime 1.11 (Inferences Per Minute, more is better)
armv8.4-a (-march=armv8.4-a): 12364, SE +/- 63.90, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 12317, SE +/- 12.91, N = 3
(CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

ONNX Runtime 1.11 (Inferences Per Minute, more is better)
armv8.4-a (-march=armv8.4-a): 773, SE +/- 0.44, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 772, SE +/- 0.50, N = 3
(CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

ONNX Runtime 1.11 (Inferences Per Minute, more is better)
armv8.4-a (-march=armv8.4-a): 73, SE +/- 0.00, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 73, SE +/- 0.00, N = 3
(CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

ONNX Runtime 1.11 (Inferences Per Minute, more is better)
armv8.4-a (-march=armv8.4-a): 938, SE +/- 0.88, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 935, SE +/- 0.17, N = 3
(CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

ONNX Runtime 1.11 (Inferences Per Minute, more is better)
armv8.4-a (-march=armv8.4-a): 5413, SE +/- 0.93, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 5411, SE +/- 2.17, N = 3
(CXX) g++ options: -O3 -ffunction-sections -fdata-sections -march=native -mtune=native -flto -fno-fat-lto-objects -ldl -lrt

GraphicsMagick

Operation: Swirl

GraphicsMagick 1.3.33 (Iterations Per Minute, more is better)
armv8.4-a (-march=armv8.4-a): 1225, SE +/- 0.33, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 1272, SE +/- 1.33, N = 3
(CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.33 (Iterations Per Minute, more is better)
armv8.4-a (-march=armv8.4-a): 577, SE +/- 0.00, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 611, SE +/- 1.20, N = 3
(CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.33 (Iterations Per Minute, more is better)
armv8.4-a (-march=armv8.4-a): 732, SE +/- 0.33, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 718, SE +/- 0.33, N = 3
(CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.33 (Iterations Per Minute, more is better)
armv8.4-a (-march=armv8.4-a): 2339, SE +/- 22.36, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 2414, SE +/- 1.86, N = 3
(CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: Noise-Gaussian

GraphicsMagick 1.3.33 (Iterations Per Minute, more is better)
armv8.4-a (-march=armv8.4-a): 494, SE +/- 0.67, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 515, SE +/- 0.58, N = 3
(CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

GraphicsMagick

Operation: HWB Color Space

GraphicsMagick 1.3.33 (Iterations Per Minute, more is better)
armv8.4-a (-march=armv8.4-a): 978, SE +/- 1.00, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 1067, SE +/- 0.33, N = 3
(CC) gcc options: -fopenmp -O3 -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lz -lm -lpthread

Coremark

CoreMark Size 666 - Iterations Per Second

Coremark 1.0 (Iterations/Sec, more is better)
armv8.4-a (-march=armv8.4-a): 789646.92, SE +/- 169.09, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 762066.50, SE +/- 416.47, N = 3
(CC) gcc options: -O2 -O3 -lrt" -lrt

Zstd Compression

Compression Level: 3 - Compression Speed

Zstd Compression 1.5.0 (MB/s, more is better)
armv8.4-a (-march=armv8.4-a): 6937.8, SE +/- 14.45, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 7027.2, SE +/- 15.42, N = 3
(CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Compression Speed

Zstd Compression 1.5.0 (MB/s, more is better)
armv8.4-a (-march=armv8.4-a): 72.9, SE +/- 0.23, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 74.0, SE +/- 0.03, N = 3
(CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Decompression Speed

Zstd Compression 1.5.0 (MB/s, more is better)
armv8.4-a (-march=armv8.4-a): 3083.4, SE +/- 8.46, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 3094.8, SE +/- 6.60, N = 3
(CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 3, Long Mode - Compression Speed

Zstd Compression 1.5.0 (MB/s, more is better)
armv8.4-a (-march=armv8.4-a): 1241.3, SE +/- 5.37, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 1242.7, SE +/- 4.88, N = 3
(CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 3, Long Mode - Decompression Speed

Zstd Compression 1.5.0 (MB/s, more is better)
armv8.4-a (-march=armv8.4-a): 3820.8, SE +/- 3.95, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 3824.8, SE +/- 1.28, N = 3
(CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.0 (MB/s, more is better)
armv8.4-a (-march=armv8.4-a): 40.0, SE +/- 0.03, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 40.3, SE +/- 0.00, N = 3
(CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

Zstd Compression 1.5.0 (MB/s, more is better)
armv8.4-a (-march=armv8.4-a): 3250.9, SE +/- 7.62, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 3263.7, SE +/- 0.59, N = 3
(CC) gcc options: -O3 -pthread -lz -llzma

Nettle

Test: aes256

Nettle 3.8 (Mbyte/s, more is better)
armv8.4-a (-march=armv8.4-a): 4435.91, SE +/- 3.11, N = 3, MIN: 3927.32 / MAX: 5628.86
armv8.4-a+sve (-march=armv8.4-a+sve): 4447.04, SE +/- 0.39, N = 3, MIN: 3925.11 / MAX: 5627.84
(CC) gcc options: -O3 -ggdb3 -lnettle -lgmp -lm -lcrypto

Nettle

Test: chacha

Nettle 3.8 (Mbyte/s, more is better)
armv8.4-a (-march=armv8.4-a): 740.25, SE +/- 0.61, N = 3, MIN: 454.21 / MAX: 956.53
armv8.4-a+sve (-march=armv8.4-a+sve): 733.59, SE +/- 0.55, N = 3, MIN: 442.26 / MAX: 956.22
(CC) gcc options: -O3 -ggdb3 -lnettle -lgmp -lm -lcrypto

Nettle

Test: sha512

Nettle 3.8 (Mbyte/s, more is better)
armv8.4-a (-march=armv8.4-a): 498.83, SE +/- 0.04, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 504.33, SE +/- 0.07, N = 3
(CC) gcc options: -O3 -ggdb3 -lnettle -lgmp -lm -lcrypto

Nettle

Test: poly1305-aes

Nettle 3.8 (Mbyte/s, more is better)
armv8.4-a (-march=armv8.4-a): 871.90, SE +/- 1.52, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 820.51, SE +/- 5.37, N = 3
(CC) gcc options: -O3 -ggdb3 -lnettle -lgmp -lm -lcrypto

LuaJIT

Test: Composite

LuaJIT 2.1-git (Mflops, more is better)
armv8.4-a (-march=armv8.4-a): 1282.59, SE +/- 0.41, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 1309.03, SE +/- 18.19, N = 3
(CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

LuaJIT

Test: Monte Carlo

LuaJIT 2.1-git (Mflops, more is better)
armv8.4-a (-march=armv8.4-a): 343.27, SE +/- 0.35, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 343.85, SE +/- 0.54, N = 3
(CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

LuaJIT

Test: Fast Fourier Transform

LuaJIT 2.1-git (Mflops, more is better)
armv8.4-a (-march=armv8.4-a): 661.55, SE +/- 0.39, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 615.71, SE +/- 10.69, N = 3
(CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

LuaJIT

Test: Sparse Matrix Multiply

LuaJIT 2.1-git (Mflops, more is better)
armv8.4-a (-march=armv8.4-a): 1151.57, SE +/- 3.14, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 1162.33, SE +/- 7.20, N = 3
(CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

LuaJIT

Test: Dense LU Matrix Factorization

LuaJIT 2.1-git (Mflops, more is better)
armv8.4-a (-march=armv8.4-a): 3355.53, SE +/- 6.02, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 3521.13, SE +/- 86.66, N = 3
(CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

LuaJIT

Test: Jacobi Successive Over-Relaxation

LuaJIT 2.1-git (Mflops, more is better)
armv8.4-a (-march=armv8.4-a): 901.02, SE +/- 0.68, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 902.16, SE +/- 1.00, N = 3
(CC) gcc options: -lm -ldl -O2 -fomit-frame-pointer -O3 -U_FORTIFY_SOURCE -fno-stack-protector

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0 (MFLOPS, more is better)
armv8.4-a (-march=armv8.4-a): 5561.56, SE +/- 2.66, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 5508.28, SE +/- 5.76, N = 3
(CC) gcc options: -O3

Botan

Test: KASUMI

Botan 2.17.3 (MiB/s, more is better)
armv8.4-a: 62.02, SE +/- 0.00, N = 3
armv8.4-a+sve: 62.00, SE +/- 0.00, N = 3
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: KASUMI - Decrypt

Botan 2.17.3 (MiB/s, more is better)
armv8.4-a: 62.28, SE +/- 0.00, N = 3
armv8.4-a+sve: 62.26, SE +/- 0.00, N = 3
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: AES-256

Botan 2.17.3 (MiB/s, more is better)
armv8.4-a: 5494.31, SE +/- 14.75, N = 3
armv8.4-a+sve: 5442.65, SE +/- 9.30, N = 3
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: AES-256 - Decrypt

Botan 2.17.3 (MiB/s, more is better)
armv8.4-a: 5477.57, SE +/- 5.58, N = 3
armv8.4-a+sve: 5474.32, SE +/- 8.64, N = 3
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: Twofish

Botan 2.17.3 (MiB/s, more is better)
armv8.4-a: 239.70, SE +/- 0.12, N = 3
armv8.4-a+sve: 248.89, SE +/- 0.26, N = 3
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: Twofish - Decrypt

Botan 2.17.3 (MiB/s, more is better)
armv8.4-a: 246.16, SE +/- 0.20, N = 3
armv8.4-a+sve: 258.15, SE +/- 0.11, N = 3
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: Blowfish

Botan 2.17.3 (MiB/s, more is better)
armv8.4-a: 278.87, SE +/- 0.29, N = 3
armv8.4-a+sve: 280.57, SE +/- 0.10, N = 3
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: Blowfish - Decrypt

Botan 2.17.3 (MiB/s, more is better)
armv8.4-a: 288.51, SE +/- 0.07, N = 3
armv8.4-a+sve: 289.03, SE +/- 0.03, N = 3
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: CAST-256

Botan 2.17.3 (MiB/s, more is better)
armv8.4-a: 108.79, SE +/- 0.02, N = 3
armv8.4-a+sve: 108.75, SE +/- 0.01, N = 3
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: CAST-256 - Decrypt

Botan 2.17.3 (MiB/s, more is better)
armv8.4-a: 108.60, SE +/- 0.02, N = 3
armv8.4-a+sve: 108.62, SE +/- 0.01, N = 3
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: ChaCha20Poly1305

Botan 2.17.3 (MiB/s, more is better)
armv8.4-a: 389.38, SE +/- 0.04, N = 3
armv8.4-a+sve: 390.31, SE +/- 0.07, N = 3
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Botan

Test: ChaCha20Poly1305 - Decrypt

Botan 2.17.3 (MiB/s, more is better)
armv8.4-a: 382.51, SE +/- 0.02, N = 3
armv8.4-a+sve: 383.95, SE +/- 0.13, N = 3
(CXX) g++ options: -fstack-protector -pthread -lbotan-2 -ldl -lrt

Crypto++

Test: Unkeyed Algorithms

Crypto++ 8.2 (MiB/second, more is better)
armv8.4-a (-march=armv8.4-a): 459.87, SE +/- 0.25, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 449.02, SE +/- 0.25, N = 3
(CXX) g++ options: -O3 -fPIC -pthread -pipe

JPEG XL libjxl

Input: PNG - Encode Speed: 7

JPEG XL libjxl 0.6.1 (MP/s, more is better)
armv8.4-a (-march=armv8.4-a): 8.32, SE +/- 0.02, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 8.36, SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -funwind-tables -O2 -fPIE -pie

JPEG XL libjxl

Input: PNG - Encode Speed: 8

JPEG XL libjxl 0.6.1 (MP/s, more is better)
armv8.4-a (-march=armv8.4-a): 0.67, SE +/- 0.00, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 0.67, SE +/- 0.00, N = 3
(CXX) g++ options: -O3 -funwind-tables -O2 -fPIE -pie

JPEG XL libjxl

Input: JPEG - Encode Speed: 7

JPEG XL libjxl 0.6.1 (MP/s, more is better)
armv8.4-a (-march=armv8.4-a): 73.21, SE +/- 0.13, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 79.58, SE +/- 0.12, N = 3
(CXX) g++ options: -O3 -funwind-tables -O2 -fPIE -pie

JPEG XL libjxl

Input: JPEG - Encode Speed: 8

JPEG XL libjxl 0.6.1 (MP/s, more is better)
armv8.4-a (-march=armv8.4-a): 26.30, SE +/- 0.01, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 27.34, SE +/- 0.03, N = 3
(CXX) g++ options: -O3 -funwind-tables -O2 -fPIE -pie

LeelaChessZero

Backend: BLAS

LeelaChessZero 0.28 (Nodes Per Second, more is better)
armv8.4-a (-march=armv8.4-a): 1281, SE +/- 13.99, N = 5
armv8.4-a+sve (-march=armv8.4-a+sve): 1297, SE +/- 6.96, N = 3
(CXX) g++ options: -flto -O3 -pthread

LeelaChessZero

Backend: Eigen

LeelaChessZero 0.28 (Nodes Per Second, more is better)
armv8.4-a (-march=armv8.4-a): 1311, SE +/- 13.65, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 1333, SE +/- 14.64, N = 5
(CXX) g++ options: -flto -O3 -pthread

Stockfish

Total Time

Stockfish 13 (Nodes Per Second, more is better)
armv8.4-a (-march=armv8.4-a): 57485680, SE +/- 721518.45, N = 14
armv8.4-a+sve (-march=armv8.4-a+sve): 55823340, SE +/- 645132.01, N = 3
(CXX) g++ options: -lgcov -lpthread -O3 -fno-exceptions -std=c++17 -pedantic -flto -fprofile-use -fno-peel-loops -fno-tracer -flto=jobserver

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2022.1 (Ns Per Day, more is better)
armv8.4-a (-march=armv8.4-a): 2.277, SE +/- 0.001, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 2.275, SE +/- 0.002, N = 3
(CXX) g++ options: -O3

LAMMPS Molecular Dynamics Simulator

Model: Rhodopsin Protein

LAMMPS Molecular Dynamics Simulator 29Oct2020 (ns/day, more is better)
armv8.4-a (-march=armv8.4-a): 21.33, SE +/- 0.09, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 21.15, SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -lm

Stargate Digital Audio Workstation

Sample Rate: 44100 - Buffer Size: 512

Stargate Digital Audio Workstation 21.10.9 (Render Ratio, more is better)
armv8.4-a: 6.072916, SE +/- 0.002564, N = 3
armv8.4-a+sve: 6.213500, SE +/- 0.003428, N = 3
(CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Stargate Digital Audio Workstation

Sample Rate: 96000 - Buffer Size: 512

Stargate Digital Audio Workstation 21.10.9 (Render Ratio, more is better)
armv8.4-a: 4.414055, SE +/- 0.003502, N = 3
armv8.4-a+sve: 4.474848, SE +/- 0.002191, N = 3
(CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Stargate Digital Audio Workstation

Sample Rate: 44100 - Buffer Size: 1024

Stargate Digital Audio Workstation 21.10.9 (Render Ratio, more is better)
armv8.4-a: 6.370035, SE +/- 0.002515, N = 3
armv8.4-a+sve: 6.538163, SE +/- 0.002411, N = 3
(CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Stargate Digital Audio Workstation

Sample Rate: 480000 - Buffer Size: 512

Stargate Digital Audio Workstation 21.10.9 (Render Ratio, more is better)
armv8.4-a: 6.005386, SE +/- 0.001956, N = 3
armv8.4-a+sve: 6.124455, SE +/- 0.002030, N = 3
(CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Stargate Digital Audio Workstation

Sample Rate: 96000 - Buffer Size: 1024

Stargate Digital Audio Workstation 21.10.9 (Render Ratio, more is better)
armv8.4-a: 4.729848, SE +/- 0.000903, N = 3
armv8.4-a+sve: 4.807797, SE +/- 0.002826, N = 3
(CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Stargate Digital Audio Workstation

Sample Rate: 480000 - Buffer Size: 1024

Stargate Digital Audio Workstation 21.10.9 (Render Ratio, more is better)
armv8.4-a: 6.322684, SE +/- 0.002118, N = 3
armv8.4-a+sve: 6.450182, SE +/- 0.002250, N = 3
(CXX) g++ options: -lpthread -lsndfile -lm -O3 -march=native -ffast-math -funroll-loops -fstrength-reduce -fstrict-aliasing -finline-functions

Redis

Test: GET

Redis 6.0.9 (Requests Per Second, more is better)
armv8.4-a (-march=armv8.4-a): 2523377.92, SE +/- 1605.36, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 2513289.20, SE +/- 9056.40, N = 3
(CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3

Redis

Test: SET

Redis 6.0.9 (Requests Per Second, more is better)
armv8.4-a (-march=armv8.4-a): 1865840.13, SE +/- 794.78, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 1861924.13, SE +/- 7427.93, N = 3
(CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3

Liquid-DSP

Threads: 8 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, more is better)
armv8.4-a (-march=armv8.4-a): 176363333, SE +/- 12018.50, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 167733333, SE +/- 26666.67, N = 3
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, more is better)
armv8.4-a (-march=armv8.4-a): 352700000, SE +/- 30550.50, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 335423333, SE +/- 20275.88, N = 3
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, more is better)
armv8.4-a (-march=armv8.4-a): 705233333, SE +/- 125476.87, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 668636667, SE +/- 1978807.16, N = 3
(CC) gcc options: -O3 -pthread -lm -lc -lliquid

OpenSSL

Algorithm: RSA4096

OpenSSL 3.0 (sign/s, more is better)
armv8.4-a (-march=armv8.4-a): 5090.5, SE +/- 0.78, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 5088.1, SE +/- 0.53, N = 3
(CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

Kripke

Kripke 1.2.4 (Throughput FoM, more is better)
armv8.4-a (-march=armv8.4-a): 204143167, SE +/- 226703.11, N = 3
armv8.4-a+sve (-march=armv8.4-a+sve): 192709233, SE +/- 298633.53, N = 3
(CXX) g++ options: -O3 -fopenmp

OpenSSL

Algorithm: RSA4096

OpenSSL 3.0, Algorithm: RSA4096 (verify/s, More Is Better)
armv8.4-a (-march=armv8.4-a): 356359.6 (SE +/- 8.85, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 356407.8 (SE +/- 10.52, N = 3)
1. (CC) gcc options: -pthread -O3 -lssl -lcrypto -ldl

WebP Image Encode

Encode Settings: Quality 100, Lossless

WebP Image Encode 1.1, Encode Settings: Quality 100, Lossless (Encode Time - Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 23.85 (SE +/- 0.01, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 23.40 (SE +/- 0.18, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -lm -ljpeg -lpng16 -ltiff

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.1, Encode Settings: Quality 100, Highest Compression (Encode Time - Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 8.630 (SE +/- 0.006, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 8.640 (SE +/- 0.017, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -lm -ljpeg -lpng16 -ltiff

Caffe

Model: AlexNet - Acceleration: CPU - Iterations: 200

Caffe 2020-02-13, Model: AlexNet - Acceleration: CPU - Iterations: 200 (Milli-Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 43634 (SE +/- 31.22, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 43931 (SE +/- 12.55, N = 3)
1. (CXX) g++ options: -O3 -fPIC -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe

Model: GoogleNet - Acceleration: CPU - Iterations: 200

Caffe 2020-02-13, Model: GoogleNet - Acceleration: CPU - Iterations: 200 (Milli-Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 123807 (SE +/- 49.72, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 125125 (SE +/- 105.70, N = 3)
1. (CXX) g++ options: -O3 -fPIC -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

OpenJPEG

Encode: NASA Curiosity Panorama M34

OpenJPEG 2.4, Encode: NASA Curiosity Panorama M34 (ms, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 57205 (SE +/- 19.06, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 55196 (SE +/- 89.48, N = 3)
1. (CXX) g++ options: -O3 -rdynamic

Google Draco

Model: Lion

Google Draco 1.5.0, Model: Lion (ms, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 5354 (SE +/- 2.65, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 5309 (SE +/- 2.40, N = 3)
1. (CXX) g++ options: -O3

Google Draco

Model: Church Facade

Google Draco 1.5.0, Model: Church Facade (ms, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 7935 (SE +/- 6.64, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 7843 (SE +/- 7.00, N = 3)
1. (CXX) g++ options: -O3

TNN

Target: CPU - Model: DenseNet

TNN 0.3, Target: CPU - Model: DenseNet (ms, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 2730.40 (SE +/- 12.63, N = 3; MIN: 2665.42 / MAX: 2834.73)
armv8.4-a+sve (-march=armv8.4-a+sve): 2346.32 (SE +/- 22.16, N = 3; MIN: 2268.4 / MAX: 2446.6)
1. (CXX) g++ options: -O3 -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

TNN

Target: CPU - Model: MobileNet v2

TNN 0.3, Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 260.78 (SE +/- 0.10, N = 3; MIN: 259.13 / MAX: 262.38)
armv8.4-a+sve (-march=armv8.4-a+sve): 280.24 (SE +/- 0.69, N = 3; MIN: 277.9 / MAX: 282.31)
1. (CXX) g++ options: -O3 -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

TNN

Target: CPU - Model: SqueezeNet v2

TNN 0.3, Target: CPU - Model: SqueezeNet v2 (ms, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 71.13 (SE +/- 0.07, N = 3; MIN: 70.76 / MAX: 71.58)
armv8.4-a+sve (-march=armv8.4-a+sve): 76.30 (SE +/- 0.09, N = 3; MIN: 76.07 / MAX: 76.53)
1. (CXX) g++ options: -O3 -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

TNN

Target: CPU - Model: SqueezeNet v1.1

TNN 0.3, Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 257.70 (SE +/- 0.08, N = 3; MIN: 256.95 / MAX: 258.39)
armv8.4-a+sve (-march=armv8.4-a+sve): 205.80 (SE +/- 0.14, N = 3; MIN: 205.46 / MAX: 206.28)
1. (CXX) g++ options: -O3 -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

Timed MrBayes Analysis

Primate Phylogeny Analysis

Timed MrBayes Analysis 3.2.7, Primate Phylogeny Analysis (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 237.54 (SE +/- 0.07, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 234.26 (SE +/- 0.18, N = 3)
1. (CC) gcc options: -O3 -std=c99 -pedantic -lm

C-Ray

Total Time - 4K, 16 Rays Per Pixel

C-Ray 1.1, Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 19.30 (SE +/- 0.03, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 19.30 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -lm -lpthread -O3

POV-Ray

Trace Time

POV-Ray 3.7.0.7, Trace Time (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 19.85 (SE +/- 0.03, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 20.26 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -pipe -O3 -ffast-math -R/usr/lib -lXpm -lSM -lICE -lX11 -lIlmImf -lIlmImf-2_5 -lImath-2_5 -lHalf-2_5 -lIex-2_5 -lIexMath-2_5 -lIlmThread-2_5 -lIlmThread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system

Primesieve

1e12 Prime Number Generation

Primesieve 7.7, 1e12 Prime Number Generation (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 8.438 (SE +/- 0.022, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 8.533 (SE +/- 0.043, N = 3)
1. (CXX) g++ options: -O3

Smallpt

Global Illumination Renderer; 128 Samples

Smallpt 1.0, Global Illumination Renderer; 128 Samples (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 3.895 (SE +/- 0.002, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 3.896 (SE +/- 0.003, N = 3)
1. (CXX) g++ options: -fopenmp -O3

AOBench

Size: 2048 x 2048 - Total Time

AOBench, Size: 2048 x 2048 - Total Time (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 33.48 (SE +/- 0.00, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 33.49 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -lm -O3

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.3, WAV To FLAC (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 38.31 (SE +/- 0.02, N = 5)
armv8.4-a+sve (-march=armv8.4-a+sve): 38.52 (SE +/- 0.01, N = 5)
1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.100, WAV To MP3 (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 8.054 (SE +/- 0.004, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 7.440 (SE +/- 0.002, N = 3)
1. (CC) gcc options: -O3 -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -pipe -lm

Opus Codec Encoding

WAV To Opus Encode

Opus Codec Encoding 1.3.1, WAV To Opus Encode (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 18.32 (SE +/- 0.00, N = 5)
armv8.4-a+sve (-march=armv8.4-a+sve): 14.40 (SE +/- 0.02, N = 5)
1. (CXX) g++ options: -O3 -fvisibility=hidden -logg -lm

eSpeak-NG Speech Engine

Text-To-Speech Synthesis

eSpeak-NG Speech Engine 20200907, Text-To-Speech Synthesis (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 36.59 (SE +/- 0.31, N = 16)
armv8.4-a+sve (-march=armv8.4-a+sve): 29.98 (SE +/- 0.30, N = 20)
1. (CC) gcc options: -O3 -std=c99

Ngspice

Circuit: C2670

Ngspice 34, Circuit: C2670 (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 102.56 (SE +/- 0.06, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 106.91 (SE +/- 0.18, N = 3)
1. (CC) gcc options: -O3 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

Ngspice

Circuit: C7552

Ngspice 34, Circuit: C7552 (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 103.93 (SE +/- 0.51, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 111.64 (SE +/- 1.04, N = 3)
1. (CC) gcc options: -O3 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

RNNoise

RNNoise 2020-06-28 (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 17.62 (SE +/- 0.05, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 17.39 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -O3 -pedantic -fvisibility=hidden

ASTC Encoder

Preset: Medium

ASTC Encoder 3.2, Preset: Medium (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 4.8833 (SE +/- 0.0125, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 4.8092 (SE +/- 0.0080, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder

Preset: Thorough

ASTC Encoder 3.2, Preset: Thorough (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 9.1435 (SE +/- 0.0046, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 9.0131 (SE +/- 0.0027, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 3.2, Preset: Exhaustive (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 35.36 (SE +/- 0.02, N = 3)
armv8.4-a+sve (-march=armv8.4-a+sve): 35.19 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -flto -pthread

WavPack Audio Encoding

WAV To WavPack

WavPack Audio Encoding 5.3, WAV To WavPack (Seconds, Fewer Is Better)
armv8.4-a (-march=armv8.4-a): 20.49 (SE +/- 0.00, N = 5)
armv8.4-a+sve (-march=armv8.4-a+sve): 20.52 (SE +/- 0.03, N = 5)
1. (CXX) g++ options: -O3 -rdynamic


Phoronix Test Suite v10.8.5