Neoverse-V1 Compiler Tests

amazon testing on Ubuntu 22.04 via the Phoronix Test Suite by Michael Larabel for a future article.

armv8.4-a:

  Processor: ARMv8 Neoverse-V1 (32 Cores), Motherboard: Amazon EC2 c7g.8xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 62GB, Disk: 301GB Amazon Elastic Block Store, Network: Amazon Elastic

  OS: Ubuntu 22.04, Kernel: 5.15.0-1004-aws (aarch64), Compiler: GCC 12.0.0 20220117, File-System: ext4, System Layer: amazon

armv8.4-a+sve:

  Processor: ARMv8 Neoverse-V1 (32 Cores), Motherboard: Amazon EC2 c7g.8xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 62GB, Disk: 301GB Amazon Elastic Block Store, Network: Amazon Elastic

  OS: Ubuntu 22.04, Kernel: 5.15.0-1004-aws (aarch64), Compiler: GCC 12.0.0 20220117, File-System: ext4, System Layer: amazon

Crypto++ 8.2
Test: Unkeyed Algorithms
MiB/second > Higher Is Better
armv8.4-a ..... 459.87
armv8.4-a+sve . 449.02

Nettle 3.8
Test: aes256
Mbyte/s > Higher Is Better
armv8.4-a ..... 4435.91
armv8.4-a+sve . 4447.04

Nettle 3.8
Test: chacha
Mbyte/s > Higher Is Better
armv8.4-a ..... 740.25
armv8.4-a+sve . 733.59

Nettle 3.8
Test: sha512
Mbyte/s > Higher Is Better
armv8.4-a ..... 498.83
armv8.4-a+sve . 504.33

Nettle 3.8
Test: poly1305-aes
Mbyte/s > Higher Is Better
armv8.4-a ..... 871.90
armv8.4-a+sve . 820.51

Botan 2.17.3
Test: KASUMI
MiB/s > Higher Is Better
armv8.4-a ..... 62.02
armv8.4-a+sve . 62.00

Botan 2.17.3
Test: KASUMI - Decrypt
MiB/s > Higher Is Better
armv8.4-a ..... 62.28
armv8.4-a+sve . 62.26

Botan 2.17.3
Test: AES-256
MiB/s > Higher Is Better
armv8.4-a ..... 5494.31
armv8.4-a+sve . 5442.65

Botan 2.17.3
Test: AES-256 - Decrypt
MiB/s > Higher Is Better
armv8.4-a ..... 5477.57
armv8.4-a+sve . 5474.32

Botan 2.17.3
Test: Twofish
MiB/s > Higher Is Better
armv8.4-a ..... 239.70
armv8.4-a+sve . 248.89

Botan 2.17.3
Test: Twofish - Decrypt
MiB/s > Higher Is Better
armv8.4-a ..... 246.16
armv8.4-a+sve . 258.15

Botan 2.17.3
Test: Blowfish
MiB/s > Higher Is Better
armv8.4-a ..... 278.87
armv8.4-a+sve . 280.57

Botan 2.17.3
Test: Blowfish - Decrypt
MiB/s > Higher Is Better
armv8.4-a ..... 288.51
armv8.4-a+sve . 289.03

Botan 2.17.3
Test: CAST-256
MiB/s > Higher Is Better
armv8.4-a ..... 108.79
armv8.4-a+sve . 108.75

Botan 2.17.3
Test: CAST-256 - Decrypt
MiB/s > Higher Is Better
armv8.4-a ..... 108.60
armv8.4-a+sve . 108.62

Botan 2.17.3
Test: ChaCha20Poly1305
MiB/s > Higher Is Better
armv8.4-a ..... 389.38
armv8.4-a+sve . 390.31

Botan 2.17.3
Test: ChaCha20Poly1305 - Decrypt
MiB/s > Higher Is Better
armv8.4-a ..... 382.51
armv8.4-a+sve . 383.95

FLAC Audio Encoding 1.3.3
WAV To FLAC
Seconds < Lower Is Better
armv8.4-a ..... 38.31
armv8.4-a+sve . 38.52

LAME MP3 Encoding 3.100
WAV To MP3
Seconds < Lower Is Better
armv8.4-a ..... 8.054
armv8.4-a+sve . 7.440

Ngspice 34
Circuit: C2670
Seconds < Lower Is Better
armv8.4-a ..... 102.56
armv8.4-a+sve . 106.91

Ngspice 34
Circuit: C7552
Seconds < Lower Is Better
armv8.4-a ..... 103.93
armv8.4-a+sve . 111.64

Opus Codec Encoding 1.3.1
WAV To Opus Encode
Seconds < Lower Is Better
armv8.4-a ..... 18.32
armv8.4-a+sve . 14.40

Stargate Digital Audio Workstation 21.10.9
Sample Rate: 44100 - Buffer Size: 512
Render Ratio > Higher Is Better
armv8.4-a ..... 6.072916
armv8.4-a+sve . 6.213500

Stargate Digital Audio Workstation 21.10.9
Sample Rate: 96000 - Buffer Size: 512
Render Ratio > Higher Is Better
armv8.4-a ..... 4.414055
armv8.4-a+sve . 4.474848

Stargate Digital Audio Workstation 21.10.9
Sample Rate: 44100 - Buffer Size: 1024
Render Ratio > Higher Is Better
armv8.4-a ..... 6.370035
armv8.4-a+sve . 6.538163

Stargate Digital Audio Workstation 21.10.9
Sample Rate: 480000 - Buffer Size: 512
Render Ratio > Higher Is Better
armv8.4-a ..... 6.005386
armv8.4-a+sve . 6.124455

Stargate Digital Audio Workstation 21.10.9
Sample Rate: 96000 - Buffer Size: 1024
Render Ratio > Higher Is Better
armv8.4-a ..... 4.729848
armv8.4-a+sve . 4.807797

Stargate Digital Audio Workstation 21.10.9
Sample Rate: 480000 - Buffer Size: 1024
Render Ratio > Higher Is Better
armv8.4-a ..... 6.322684
armv8.4-a+sve . 6.450182

WavPack Audio Encoding 5.3
WAV To WavPack
Seconds < Lower Is Better
armv8.4-a ..... 20.49
armv8.4-a+sve . 20.52

ASTC Encoder 3.2
Preset: Medium
Seconds < Lower Is Better
armv8.4-a ..... 4.8833
armv8.4-a+sve . 4.8092

ASTC Encoder 3.2
Preset: Thorough
Seconds < Lower Is Better
armv8.4-a ..... 9.1435
armv8.4-a+sve . 9.0131

ASTC Encoder 3.2
Preset: Exhaustive
Seconds < Lower Is Better
armv8.4-a ..... 35.36
armv8.4-a+sve . 35.19

Google Draco 1.5.0
Model: Lion
ms < Lower Is Better
armv8.4-a ..... 5354
armv8.4-a+sve . 5309

Google Draco 1.5.0
Model: Church Facade
ms < Lower Is Better
armv8.4-a ..... 7935
armv8.4-a+sve . 7843

JPEG XL libjxl 0.6.1
Input: PNG - Encode Speed: 7
MP/s > Higher Is Better
armv8.4-a ..... 8.32
armv8.4-a+sve . 8.36

JPEG XL libjxl 0.6.1
Input: PNG - Encode Speed: 8
MP/s > Higher Is Better
armv8.4-a ..... 0.67
armv8.4-a+sve . 0.67

JPEG XL libjxl 0.6.1
Input: JPEG - Encode Speed: 7
MP/s > Higher Is Better
armv8.4-a ..... 73.21
armv8.4-a+sve . 79.58

JPEG XL libjxl 0.6.1
Input: JPEG - Encode Speed: 8
MP/s > Higher Is Better
armv8.4-a ..... 26.30
armv8.4-a+sve . 27.34

OpenJPEG 2.4
Encode: NASA Curiosity Panorama M34
ms < Lower Is Better
armv8.4-a ..... 57205
armv8.4-a+sve . 55196

WebP Image Encode 1.1
Encode Settings: Quality 100, Lossless
Encode Time - Seconds < Lower Is Better
armv8.4-a ..... 23.85
armv8.4-a+sve . 23.40

WebP Image Encode 1.1
Encode Settings: Quality 100, Highest Compression
Encode Time - Seconds < Lower Is Better
armv8.4-a ..... 8.630
armv8.4-a+sve . 8.640

LuaJIT 2.1-git
Test: Composite
Mflops > Higher Is Better
armv8.4-a ..... 1282.59
armv8.4-a+sve . 1309.03

LuaJIT 2.1-git
Test: Monte Carlo
Mflops > Higher Is Better
armv8.4-a ..... 343.27
armv8.4-a+sve . 343.85

LuaJIT 2.1-git
Test: Fast Fourier Transform
Mflops > Higher Is Better
armv8.4-a ..... 661.55
armv8.4-a+sve . 615.71

LuaJIT 2.1-git
Test: Sparse Matrix Multiply
Mflops > Higher Is Better
armv8.4-a ..... 1151.57
armv8.4-a+sve . 1162.33

LuaJIT 2.1-git
Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
armv8.4-a ..... 3355.53
armv8.4-a+sve . 3521.13

LuaJIT 2.1-git
Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
armv8.4-a ..... 901.02
armv8.4-a+sve . 902.16

eSpeak-NG Speech Engine 20200907
Text-To-Speech Synthesis
Seconds < Lower Is Better
armv8.4-a ..... 36.59
armv8.4-a+sve . 29.98

Xmrig 6.12.1
Variant: Monero - Hash Count: 1M
H/s > Higher Is Better
armv8.4-a ..... 8645.4
armv8.4-a+sve . 8669.8

Xmrig 6.12.1
Variant: Wownero - Hash Count: 1M
H/s > Higher Is Better
armv8.4-a ..... 11811.2
armv8.4-a+sve . 11877.8

Timed MrBayes Analysis 3.2.7
Primate Phylogeny Analysis
Seconds < Lower Is Better
armv8.4-a ..... 237.54
armv8.4-a+sve . 234.26

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
armv8.4-a ..... 5561.56
armv8.4-a+sve . 5508.28

LeelaChessZero 0.28
Backend: BLAS
Nodes Per Second > Higher Is Better
armv8.4-a ..... 1281
armv8.4-a+sve . 1297

LeelaChessZero 0.28
Backend: Eigen
Nodes Per Second > Higher Is Better
armv8.4-a ..... 1311
armv8.4-a+sve . 1333

RNNoise 2020-06-28
Seconds < Lower Is Better
armv8.4-a ..... 17.62
armv8.4-a+sve . 17.39

ONNX Runtime 1.11
Model: GPT-2 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
armv8.4-a ..... 12364
armv8.4-a+sve . 12317

ONNX Runtime 1.11
Model: bertsquad-12 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
armv8.4-a ..... 773
armv8.4-a+sve . 772

ONNX Runtime 1.11
Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
armv8.4-a ..... 73
armv8.4-a+sve . 73

ONNX Runtime 1.11
Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
armv8.4-a ..... 938
armv8.4-a+sve . 935

ONNX Runtime 1.11
Model: super-resolution-10 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
armv8.4-a ..... 5413
armv8.4-a+sve . 5411

TNN 0.3
Target: CPU - Model: DenseNet
ms < Lower Is Better
armv8.4-a ..... 2730.40
armv8.4-a+sve . 2346.32

TNN 0.3
Target: CPU - Model: MobileNet v2
ms < Lower Is Better
armv8.4-a ..... 260.78
armv8.4-a+sve . 280.24

TNN 0.3
Target: CPU - Model: SqueezeNet v2
ms < Lower Is Better
armv8.4-a ..... 71.13
armv8.4-a+sve . 76.30

TNN 0.3
Target: CPU - Model: SqueezeNet v1.1
ms < Lower Is Better
armv8.4-a ..... 257.70
armv8.4-a+sve . 205.80

Caffe 2020-02-13
Model: AlexNet - Acceleration: CPU - Iterations: 200
Milli-Seconds < Lower Is Better
armv8.4-a ..... 43634
armv8.4-a+sve . 43931

Caffe 2020-02-13
Model: GoogleNet - Acceleration: CPU - Iterations: 200
Milli-Seconds < Lower Is Better
armv8.4-a ..... 123807
armv8.4-a+sve . 125125

GROMACS 2022.1
Implementation: MPI CPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better
armv8.4-a ..... 2.277
armv8.4-a+sve . 2.275

LAMMPS Molecular Dynamics Simulator 29Oct2020
Model: Rhodopsin Protein
ns/day > Higher Is Better
armv8.4-a ..... 21.33
armv8.4-a+sve . 21.15

ACES DGEMM 1.0
Sustained Floating-Point Rate
GFLOP/s > Higher Is Better
armv8.4-a ..... 12.81
armv8.4-a+sve . 13.44

Kripke 1.2.4
Throughput FoM > Higher Is Better
armv8.4-a ..... 204143167
armv8.4-a+sve . 192709233

Coremark 1.0
CoreMark Size 666 - Iterations Per Second
Iterations/Sec > Higher Is Better
armv8.4-a ..... 789646.92
armv8.4-a+sve . 762066.50

Primesieve 7.7
1e12 Prime Number Generation
Seconds < Lower Is Better
armv8.4-a ..... 8.438
armv8.4-a+sve . 8.533

Stockfish 13
Total Time
Nodes Per Second > Higher Is Better
armv8.4-a ..... 57485680
armv8.4-a+sve . 55823340

Zstd Compression 1.5.0
Compression Level: 3 - Compression Speed
MB/s > Higher Is Better
armv8.4-a ..... 6937.8
armv8.4-a+sve . 7027.2

Zstd Compression 1.5.0
Compression Level: 19 - Compression Speed
MB/s > Higher Is Better
armv8.4-a ..... 72.9
armv8.4-a+sve . 74.0

Zstd Compression 1.5.0
Compression Level: 19 - Decompression Speed
MB/s > Higher Is Better
armv8.4-a ..... 3083.4
armv8.4-a+sve . 3094.8

Zstd Compression 1.5.0
Compression Level: 3, Long Mode - Compression Speed
MB/s > Higher Is Better
armv8.4-a ..... 1241.3
armv8.4-a+sve . 1242.7

Zstd Compression 1.5.0
Compression Level: 3, Long Mode - Decompression Speed
MB/s > Higher Is Better
armv8.4-a ..... 3820.8
armv8.4-a+sve . 3824.8

Zstd Compression 1.5.0
Compression Level: 19, Long Mode - Compression Speed
MB/s > Higher Is Better
armv8.4-a ..... 40.0
armv8.4-a+sve . 40.3

Zstd Compression 1.5.0
Compression Level: 19, Long Mode - Decompression Speed
MB/s > Higher Is Better
armv8.4-a ..... 3250.9
armv8.4-a+sve . 3263.7

Sysbench 1.0.20
Test: CPU
Events Per Second > Higher Is Better
armv8.4-a ..... 96726.40
armv8.4-a+sve . 96666.76

AOM AV1 3.3
Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
armv8.4-a ..... 62.13
armv8.4-a+sve . 62.19

AOM AV1 3.3
Encoder Mode: Speed 10 Realtime - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
armv8.4-a ..... 61.88
armv8.4-a+sve . 65.62

AOM AV1 3.3
Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
armv8.4-a ..... 120.13
armv8.4-a+sve . 123.95

AOM AV1 3.3
Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
armv8.4-a ..... 152.46
armv8.4-a+sve . 156.71

AOM AV1 3.3
Encoder Mode: Speed 10 Realtime - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
armv8.4-a ..... 190.27
armv8.4-a+sve . 193.68

AOBench
Size: 2048 x 2048 - Total Time
Seconds < Lower Is Better
armv8.4-a ..... 33.48
armv8.4-a+sve . 33.49

GraphicsMagick 1.3.33
Operation: Swirl
Iterations Per Minute > Higher Is Better
armv8.4-a ..... 1225
armv8.4-a+sve . 1272

GraphicsMagick 1.3.33
Operation: Rotate
Iterations Per Minute > Higher Is Better
armv8.4-a ..... 577
armv8.4-a+sve . 611

GraphicsMagick 1.3.33
Operation: Enhanced
Iterations Per Minute > Higher Is Better
armv8.4-a ..... 732
armv8.4-a+sve . 718

GraphicsMagick 1.3.33
Operation: Resizing
Iterations Per Minute > Higher Is Better
armv8.4-a ..... 2339
armv8.4-a+sve . 2414

GraphicsMagick 1.3.33
Operation: Noise-Gaussian
Iterations Per Minute > Higher Is Better
armv8.4-a ..... 494
armv8.4-a+sve . 515

GraphicsMagick 1.3.33
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
armv8.4-a ..... 978
armv8.4-a+sve . 1067

x264 2022-02-22
Video Input: Bosphorus 4K
Frames Per Second > Higher Is Better
armv8.4-a ..... 48.43
armv8.4-a+sve . 48.51

x264 2022-02-22
Video Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
armv8.4-a ..... 168.92
armv8.4-a+sve . 169.58

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
armv8.4-a ..... 19.30
armv8.4-a+sve . 19.30

POV-Ray 3.7.0.7
Trace Time
Seconds < Lower Is Better
armv8.4-a ..... 19.85
armv8.4-a+sve . 20.26

Smallpt 1.0
Global Illumination Renderer; 128 Samples
Seconds < Lower Is Better
armv8.4-a ..... 3.895
armv8.4-a+sve . 3.896

Liquid-DSP 2021.01.31
Threads: 8 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
armv8.4-a ..... 176363333
armv8.4-a+sve . 167733333

Liquid-DSP 2021.01.31
Threads: 16 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
armv8.4-a ..... 352700000
armv8.4-a+sve . 335423333

Liquid-DSP 2021.01.31
Threads: 32 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
armv8.4-a ..... 705233333
armv8.4-a+sve . 668636667

OpenSSL 3.0
Algorithm: SHA256
byte/s > Higher Is Better
armv8.4-a ..... 27603943570
armv8.4-a+sve . 27428176880

OpenSSL 3.0
Algorithm: RSA4096
sign/s > Higher Is Better
armv8.4-a ..... 5090.5
armv8.4-a+sve . 5088.1

OpenSSL 3.0
Algorithm: RSA4096
verify/s > Higher Is Better
armv8.4-a ..... 356359.6
armv8.4-a+sve . 356407.8

Redis 6.0.9
Test: GET
Requests Per Second > Higher Is Better
armv8.4-a ..... 2523377.92
armv8.4-a+sve . 2513289.20

Redis 6.0.9
Test: SET
Requests Per Second > Higher Is Better
armv8.4-a ..... 1865840.13
armv8.4-a+sve . 1861924.13

GNU GMP GMPbench 6.2.1
Total Time
GMPbench Score > Higher Is Better
armv8.4-a ..... 4152.3
armv8.4-a+sve . 4155.6
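
Each result above pairs an armv8.4-a value with an armv8.4-a+sve value and states whether higher or lower is better. The Python sketch below shows one way to turn such a pair into a signed percentage, using the Opus and Crypto++ figures from this file. The function name `relative_change` and its sign convention (positive means the +sve build did better) are assumptions made for this illustration and are not part of the Phoronix Test Suite.

```python
def relative_change(baseline: float, candidate: float, higher_is_better: bool) -> float:
    """Return how much better the candidate is than the baseline, in percent.

    Positive values mean the candidate did better, negative values mean it
    did worse, regardless of whether the metric is a rate or a duration.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    delta = (candidate - baseline) / baseline * 100.0
    # For "lower is better" metrics (e.g. seconds), a decrease is a gain.
    return delta if higher_is_better else -delta


# Values copied from the results in this file (armv8.4-a, armv8.4-a+sve).
opus = relative_change(18.32, 14.40, higher_is_better=False)       # Seconds
cryptopp = relative_change(459.87, 449.02, higher_is_better=True)  # MiB/second

print(f"Opus Codec Encoding: {opus:+.1f}%")
print(f"Crypto++ Unkeyed Algorithms: {cryptopp:+.1f}%")
```

Single-run differences of a percent or two may be within run-to-run noise, so small values from this calculation should be read with caution.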