Graviton3 Neoverse-V1 Compiler Tests

Benchmarks by Michael Larabel for a future article. amazon testing on Ubuntu 22.04 via the Phoronix Test Suite.

-march=armv8.4-a:

Processor: ARMv8 Neoverse-V1 (32 Cores), Motherboard: Amazon EC2 c7g.8xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 62GB, Disk: 301GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Ubuntu 22.04, Kernel: 5.15.0-1004-aws (aarch64), Compiler: GCC 12.0.0 20220117, File-System: ext4, System Layer: amazon

-march=armv8.4-a+sve:

Processor: ARMv8 Neoverse-V1 (32 Cores), Motherboard: Amazon EC2 c7g.8xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 62GB, Disk: 301GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Ubuntu 22.04, Kernel: 5.15.0-1004-aws (aarch64), Compiler: GCC 12.0.0 20220117, File-System: ext4, System Layer: amazon

-march=armv8.4-a+sve -mcpu=neoverse-v1:

Processor: ARMv8 Neoverse-V1 (32 Cores), Motherboard: Amazon EC2 c7g.8xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 62GB, Disk: 301GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Ubuntu 22.04, Kernel: 5.15.0-1004-aws (aarch64), Compiler: GCC 12.0.0 20220117, File-System: ext4, System Layer: amazon

Crypto++ 8.2
Test: Unkeyed Algorithms
MiB/second > Higher Is Better
-march=armv8.4-a ....................... 459.87 |==============================
-march=armv8.4-a+sve ................... 449.02 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 270.93 |==================

LeelaChessZero 0.28
Backend: BLAS
Nodes Per Second > Higher Is Better
-march=armv8.4-a ....................... 1281 |================================
-march=armv8.4-a+sve ................... 1297 |================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 1280 |================================

LeelaChessZero 0.28
Backend: Eigen
Nodes Per Second > Higher Is Better
-march=armv8.4-a ....................... 1311 |===============================
-march=armv8.4-a+sve ................... 1333 |================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 1337 |================================

Timed MrBayes Analysis 3.2.7
Primate Phylogeny Analysis
Seconds < Lower Is Better
-march=armv8.4-a ....................... 237.54 |==============================
-march=armv8.4-a+sve ................... 234.26 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 241.54 |==============================

LAMMPS Molecular Dynamics Simulator 29Oct2020
Model: Rhodopsin Protein
ns/day > Higher Is Better
-march=armv8.4-a ....................... 21.33 |===============================
-march=armv8.4-a+sve ................... 21.15 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 21.02 |===============================

WebP Image Encode 1.1
Encode Settings: Quality 100, Lossless
Encode Time - Seconds < Lower Is Better
-march=armv8.4-a ....................... 23.85 |===============================
-march=armv8.4-a+sve ................... 23.40 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 23.85 |===============================

WebP Image Encode 1.1
Encode Settings: Quality 100, Highest Compression
Encode Time - Seconds < Lower Is Better
-march=armv8.4-a ....................... 8.630 |===============================
-march=armv8.4-a+sve ................... 8.640 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 8.602 |===============================

GNU GMP GMPbench 6.2.1
Total Time
GMPbench Score > Higher Is Better
-march=armv8.4-a ....................... 4152.3 |==============================
-march=armv8.4-a+sve ................... 4155.6 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 4152.7 |==============================

Xmrig 6.12.1
Variant: Monero - Hash Count: 1M
H/s > Higher Is Better
-march=armv8.4-a ....................... 8645.4 |==============================
-march=armv8.4-a+sve ................... 8669.8 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 8681.4 |==============================

Xmrig 6.12.1
Variant: Wownero - Hash Count: 1M
H/s > Higher Is Better
-march=armv8.4-a ....................... 11811.2 |=============================
-march=armv8.4-a+sve ................... 11877.8 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 11842.0 |=============================

Zstd Compression 1.5.0
Compression Level: 3 - Compression Speed
MB/s > Higher Is Better
-march=armv8.4-a ....................... 6937.8 |==============================
-march=armv8.4-a+sve ................... 7027.2 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 6938.3 |==============================

Zstd Compression 1.5.0
Compression Level: 19 - Compression Speed
MB/s > Higher Is Better
-march=armv8.4-a ....................... 72.9 |================================
-march=armv8.4-a+sve ................... 74.0 |================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 73.0 |================================

Zstd Compression 1.5.0
Compression Level: 19 - Decompression Speed
MB/s > Higher Is Better
-march=armv8.4-a ....................... 3083.4 |=============================
-march=armv8.4-a+sve ................... 3094.8 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 3167.5 |==============================

Zstd Compression 1.5.0
Compression Level: 3, Long Mode - Compression Speed
MB/s > Higher Is Better
-march=armv8.4-a ....................... 1241.3 |==============================
-march=armv8.4-a+sve ................... 1242.7 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 1243.8 |==============================

Zstd Compression 1.5.0
Compression Level: 3, Long Mode - Decompression Speed
MB/s > Higher Is Better
-march=armv8.4-a ....................... 3820.8 |==============================
-march=armv8.4-a+sve ................... 3824.8 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 3882.3 |==============================

Zstd Compression 1.5.0
Compression Level: 19, Long Mode - Compression Speed
MB/s > Higher Is Better
-march=armv8.4-a ....................... 40.0 |================================
-march=armv8.4-a+sve ................... 40.3 |================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 39.0 |===============================

Zstd Compression 1.5.0
Compression Level: 19, Long Mode - Decompression Speed
MB/s > Higher Is Better
-march=armv8.4-a ....................... 3250.9 |=============================
-march=armv8.4-a+sve ................... 3263.7 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 3339.7 |==============================

JPEG XL libjxl 0.6.1
Input: PNG - Encode Speed: 7
MP/s > Higher Is Better
-march=armv8.4-a ....................... 8.32 |================================
-march=armv8.4-a+sve ................... 8.36 |================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 8.07 |===============================

JPEG XL libjxl 0.6.1
Input: PNG - Encode Speed: 8
MP/s > Higher Is Better
-march=armv8.4-a ....................... 0.67 |================================
-march=armv8.4-a+sve ................... 0.67 |================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 0.67 |================================

JPEG XL libjxl 0.6.1
Input: JPEG - Encode Speed: 7
MP/s > Higher Is Better
-march=armv8.4-a ....................... 73.21 |=============================
-march=armv8.4-a+sve ................... 79.58 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 78.80 |===============================

JPEG XL libjxl 0.6.1
Input: JPEG - Encode Speed: 8
MP/s > Higher Is Better
-march=armv8.4-a ....................... 26.30 |==============================
-march=armv8.4-a+sve ................... 27.34 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 27.09 |===============================

Nettle 3.8
Test: aes256
Mbyte/s > Higher Is Better
-march=armv8.4-a ....................... 4435.91 |=============================
-march=armv8.4-a+sve ................... 4447.04 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 4438.27 |=============================

Nettle 3.8
Test: chacha
Mbyte/s > Higher Is Better
-march=armv8.4-a ....................... 740.25 |==============================
-march=armv8.4-a+sve ................... 733.59 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 731.07 |==============================

Nettle 3.8
Test: sha512
Mbyte/s > Higher Is Better
-march=armv8.4-a ....................... 498.83 |==============================
-march=armv8.4-a+sve ................... 504.33 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 481.90 |=============================

Nettle 3.8
Test: poly1305-aes
Mbyte/s > Higher Is Better
-march=armv8.4-a ....................... 871.90 |==============================
-march=armv8.4-a+sve ................... 820.51 |============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 859.93 |==============================

LuaJIT 2.1-git
Test: Composite
Mflops > Higher Is Better
-march=armv8.4-a ....................... 1282.59 |============================
-march=armv8.4-a+sve ................... 1309.03 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 1303.89 |=============================

LuaJIT 2.1-git
Test: Monte Carlo
Mflops > Higher Is Better
-march=armv8.4-a ....................... 343.27 |==============================
-march=armv8.4-a+sve ................... 343.85 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 343.90 |==============================

LuaJIT 2.1-git
Test: Fast Fourier Transform
Mflops > Higher Is Better
-march=armv8.4-a ....................... 661.55 |==============================
-march=armv8.4-a+sve ................... 615.71 |============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 668.40 |==============================

LuaJIT 2.1-git
Test: Sparse Matrix Multiply
Mflops > Higher Is Better
-march=armv8.4-a ....................... 1151.57 |=============================
-march=armv8.4-a+sve ................... 1162.33 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 1164.28 |=============================

LuaJIT 2.1-git
Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
-march=armv8.4-a ....................... 3355.53 |===========================
-march=armv8.4-a+sve ................... 3521.13 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 3547.90 |=============================

LuaJIT 2.1-git
Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
-march=armv8.4-a ....................... 901.02 |==============================
-march=armv8.4-a+sve ................... 902.16 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 902.14 |==============================

Botan 2.17.3
Test: KASUMI
MiB/s > Higher Is Better
-march=armv8.4-a ....................... 62.02 |=============================
-march=armv8.4-a+sve ................... 62.00 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 65.29 |===============================

Botan 2.17.3
Test: KASUMI - Decrypt
MiB/s > Higher Is Better
-march=armv8.4-a ....................... 62.28 |=============================
-march=armv8.4-a+sve ................... 62.26 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 67.51 |===============================

Botan 2.17.3
Test: AES-256
MiB/s > Higher Is Better
-march=armv8.4-a ....................... 5494.31 |=============================
-march=armv8.4-a+sve ................... 5442.65 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 5415.19 |=============================

Botan 2.17.3
Test: AES-256 - Decrypt
MiB/s > Higher Is Better
-march=armv8.4-a ....................... 5477.57 |=============================
-march=armv8.4-a+sve ................... 5474.32 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 5409.82 |=============================

Botan 2.17.3
Test: Twofish
MiB/s > Higher Is Better
-march=armv8.4-a ....................... 239.70 |=============================
-march=armv8.4-a+sve ................... 248.89 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 244.21 |=============================

Botan 2.17.3
Test: Twofish - Decrypt
MiB/s > Higher Is Better
-march=armv8.4-a ....................... 246.16 |=============================
-march=armv8.4-a+sve ................... 258.15 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 246.61 |=============================

Botan 2.17.3
Test: Blowfish
MiB/s > Higher Is Better
-march=armv8.4-a ....................... 278.87 |==============================
-march=armv8.4-a+sve ................... 280.57 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 280.99 |==============================

Botan 2.17.3
Test: Blowfish - Decrypt
MiB/s > Higher Is Better
-march=armv8.4-a ....................... 288.51 |==============================
-march=armv8.4-a+sve ................... 289.03 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 283.41 |=============================

Botan 2.17.3
Test: CAST-256
MiB/s > Higher Is Better
-march=armv8.4-a ....................... 108.79 |==============================
-march=armv8.4-a+sve ................... 108.75 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 109.29 |==============================

Botan 2.17.3
Test: CAST-256 - Decrypt
MiB/s > Higher Is Better
-march=armv8.4-a ....................... 108.60 |==============================
-march=armv8.4-a+sve ................... 108.62 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 109.11 |==============================

Botan 2.17.3
Test: ChaCha20Poly1305
MiB/s > Higher Is Better
-march=armv8.4-a ....................... 389.38 |==============================
-march=armv8.4-a+sve ................... 390.31 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 385.38 |==============================

Botan 2.17.3
Test: ChaCha20Poly1305 - Decrypt
MiB/s > Higher Is Better
-march=armv8.4-a ....................... 382.51 |==============================
-march=armv8.4-a+sve ................... 383.95 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 378.98 |==============================

GraphicsMagick 1.3.33
Operation: Swirl
Iterations Per Minute > Higher Is Better
-march=armv8.4-a ....................... 1225 |===============================
-march=armv8.4-a+sve ................... 1272 |================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 1257 |================================

GraphicsMagick 1.3.33
Operation: Rotate
Iterations Per Minute > Higher Is Better
-march=armv8.4-a ....................... 577 |===============================
-march=armv8.4-a+sve ................... 611 |=================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 591 |================================

GraphicsMagick 1.3.33
Operation: Enhanced
Iterations Per Minute > Higher Is Better
-march=armv8.4-a ....................... 732 |=================================
-march=armv8.4-a+sve ................... 718 |================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 741 |=================================

GraphicsMagick 1.3.33
Operation: Resizing
Iterations Per Minute > Higher Is Better
-march=armv8.4-a ....................... 2339 |===============================
-march=armv8.4-a+sve ................... 2414 |================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 2402 |================================

GraphicsMagick 1.3.33
Operation: Noise-Gaussian
Iterations Per Minute > Higher Is Better
-march=armv8.4-a ....................... 494 |================================
-march=armv8.4-a+sve ................... 515 |=================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 509 |=================================

GraphicsMagick 1.3.33
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
-march=armv8.4-a ....................... 978 |=============================
-march=armv8.4-a+sve ................... 1067 |================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 1016 |==============================

AOM AV1 3.3
Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
-march=armv8.4-a ....................... 62.13 |===============================
-march=armv8.4-a+sve ................... 62.19 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 61.56 |===============================

AOM AV1 3.3
Encoder Mode: Speed 10 Realtime - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
-march=armv8.4-a ....................... 61.88 |=============================
-march=armv8.4-a+sve ................... 65.62 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 63.25 |==============================

AOM AV1 3.3
Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
-march=armv8.4-a ....................... 120.13 |=============================
-march=armv8.4-a+sve ................... 123.95 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 124.22 |==============================

AOM AV1 3.3
Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
-march=armv8.4-a ....................... 152.46 |=============================
-march=armv8.4-a+sve ................... 156.71 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 156.63 |==============================

AOM AV1 3.3
Encoder Mode: Speed 10 Realtime - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
-march=armv8.4-a ....................... 190.27 |=============================
-march=armv8.4-a+sve ................... 193.68 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 193.55 |==============================

x264 2022-02-22
Video Input: Bosphorus 4K
Frames Per Second > Higher Is Better
-march=armv8.4-a ....................... 48.43 |===============================
-march=armv8.4-a+sve ................... 48.51 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 48.56 |===============================

x264 2022-02-22
Video Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
-march=armv8.4-a ....................... 168.92 |==============================
-march=armv8.4-a+sve ................... 169.58 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 169.30 |==============================

ACES DGEMM 1.0
Sustained Floating-Point Rate
GFLOP/s > Higher Is Better
-march=armv8.4-a ....................... 12.81 |==============================
-march=armv8.4-a+sve ................... 13.44 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 13.35 |===============================

Coremark 1.0
CoreMark Size 666 - Iterations Per Second
Iterations/Sec > Higher Is Better
-march=armv8.4-a ....................... 789646.92 |===========================
-march=armv8.4-a+sve ................... 762066.50 |==========================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 798137.71 |===========================

Himeno Benchmark 3.0
Poisson Pressure Solver
MFLOPS > Higher Is Better
-march=armv8.4-a ....................... 5561.56 |=============================
-march=armv8.4-a+sve ................... 5508.28 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 5538.74 |=============================

Stockfish 13
Total Time
Nodes Per Second > Higher Is Better
-march=armv8.4-a ....................... 57485680 |===========================
-march=armv8.4-a+sve ................... 55823340 |==========================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 59966785 |============================

Stargate Digital Audio Workstation 21.10.9
Sample Rate: 44100 - Buffer Size: 512
Render Ratio > Higher Is Better
-march=armv8.4-a ....................... 6.072916 |===========================
-march=armv8.4-a+sve ................... 6.213500 |============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 6.068650 |===========================

Stargate Digital Audio Workstation 21.10.9
Sample Rate: 96000 - Buffer Size: 512
Render Ratio > Higher Is Better
-march=armv8.4-a ....................... 4.414055 |============================
-march=armv8.4-a+sve ................... 4.474848 |============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 4.412487 |============================

Stargate Digital Audio Workstation 21.10.9
Sample Rate: 44100 - Buffer Size: 1024
Render Ratio > Higher Is Better
-march=armv8.4-a ....................... 6.370035 |===========================
-march=armv8.4-a+sve ................... 6.538163 |============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 6.360138 |===========================

Stargate Digital Audio Workstation 21.10.9
Sample Rate: 480000 - Buffer Size: 512
Render Ratio > Higher Is Better
-march=armv8.4-a ....................... 6.005386 |===========================
-march=armv8.4-a+sve ................... 6.124455 |============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 5.981662 |===========================

Stargate Digital Audio Workstation 21.10.9
Sample Rate: 96000 - Buffer Size: 1024
Render Ratio > Higher Is Better
-march=armv8.4-a ....................... 4.729848 |============================
-march=armv8.4-a+sve ................... 4.807797 |============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 4.729756 |============================

Stargate Digital Audio Workstation 21.10.9
Sample Rate: 480000 - Buffer Size: 1024
Render Ratio > Higher Is Better
-march=armv8.4-a ....................... 6.322684 |===========================
-march=armv8.4-a+sve ................... 6.450182 |============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 6.288006 |===========================

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
-march=armv8.4-a ....................... 19.30 |===============================
-march=armv8.4-a+sve ................... 19.30 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 19.56 |===============================

POV-Ray 3.7.0.7
Trace Time
Seconds < Lower Is Better
-march=armv8.4-a ....................... 19.85 |==============================
-march=armv8.4-a+sve ................... 20.26 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 19.61 |==============================

Primesieve 7.7
1e12 Prime Number Generation
Seconds < Lower Is Better
-march=armv8.4-a ....................... 8.438 |===============================
-march=armv8.4-a+sve ................... 8.533 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 8.435 |===============================

Smallpt 1.0
Global Illumination Renderer; 128 Samples
Seconds < Lower Is Better
-march=armv8.4-a ....................... 3.895 |===============================
-march=armv8.4-a+sve ................... 3.896 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 3.892 |===============================

AOBench
Size: 2048 x 2048 - Total Time
Seconds < Lower Is Better
-march=armv8.4-a ....................... 33.48 |===============================
-march=armv8.4-a+sve ................... 33.49 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 33.55 |===============================

FLAC Audio Encoding 1.3.3
WAV To FLAC
Seconds < Lower Is Better
-march=armv8.4-a ....................... 38.31 |===============================
-march=armv8.4-a+sve ................... 38.52 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 38.60 |===============================

LAME MP3 Encoding 3.100
WAV To MP3
Seconds < Lower Is Better
-march=armv8.4-a ....................... 8.054 |===============================
-march=armv8.4-a+sve ................... 7.440 |=============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 7.446 |=============================

Opus Codec Encoding 1.3.1
WAV To Opus Encode
Seconds < Lower Is Better
-march=armv8.4-a ....................... 18.32 |===============================
-march=armv8.4-a+sve ................... 14.40 |========================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 14.38 |========================

eSpeak-NG Speech Engine 20200907
Text-To-Speech Synthesis
Seconds < Lower Is Better
-march=armv8.4-a ....................... 36.59 |===============================
-march=armv8.4-a+sve ................... 29.98 |=========================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 29.99 |=========================

Ngspice 34
Circuit: C2670
Seconds < Lower Is Better
-march=armv8.4-a ....................... 102.56 |=============================
-march=armv8.4-a+sve ................... 106.91 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 104.10 |=============================

Ngspice 34
Circuit: C7552
Seconds < Lower Is Better
-march=armv8.4-a ....................... 103.93 |============================
-march=armv8.4-a+sve ................... 111.64 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 106.77 |=============================

RNNoise 2020-06-28
Seconds < Lower Is Better
-march=armv8.4-a ....................... 17.62 |===============================
-march=armv8.4-a+sve ................... 17.39 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 17.29 |==============================

OpenJPEG 2.4
Encode: NASA Curiosity Panorama M34
ms < Lower Is Better
-march=armv8.4-a ....................... 57205 |===============================
-march=armv8.4-a+sve ................... 55196 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 55415 |==============================

OpenSSL 3.0
Algorithm: SHA256
byte/s > Higher Is Better
-march=armv8.4-a ....................... 27603943570 |=========================
-march=armv8.4-a+sve ................... 27428176880 |=========================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 27681290600 |=========================

OpenSSL 3.0
Algorithm: RSA4096
sign/s > Higher Is Better
-march=armv8.4-a ....................... 5090.5 |==============================
-march=armv8.4-a+sve ................... 5088.1 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 5114.1 |==============================

OpenSSL 3.0
Algorithm: RSA4096
verify/s > Higher Is Better
-march=armv8.4-a ....................... 356359.6 |============================
-march=armv8.4-a+sve ................... 356407.8 |============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 355966.7 |============================

Liquid-DSP 2021.01.31
Threads: 8 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
-march=armv8.4-a ....................... 176363333 |===========================
-march=armv8.4-a+sve ................... 167733333 |==========================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 150673333 |=======================

Liquid-DSP 2021.01.31
Threads: 16 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
-march=armv8.4-a ....................... 352700000 |===========================
-march=armv8.4-a+sve ................... 335423333 |==========================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 301330000 |=======================

Liquid-DSP 2021.01.31
Threads: 32 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
-march=armv8.4-a ....................... 705233333 |===========================
-march=armv8.4-a+sve ................... 668636667 |==========================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 602443333 |=======================

GROMACS 2022.1
Implementation: MPI CPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better
-march=armv8.4-a ....................... 2.277 |===============================
-march=armv8.4-a+sve ................... 2.275 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 2.279 |===============================

ASTC Encoder 3.2
Preset: Medium
Seconds < Lower Is Better
-march=armv8.4-a ....................... 4.8833 |==============================
-march=armv8.4-a+sve ................... 4.8092 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 4.7290 |=============================

ASTC Encoder 3.2
Preset: Thorough
Seconds < Lower Is Better
-march=armv8.4-a ....................... 9.1435 |==============================
-march=armv8.4-a+sve ................... 9.0131 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 8.9427 |=============================

ASTC Encoder 3.2
Preset: Exhaustive
Seconds < Lower Is Better
-march=armv8.4-a ....................... 35.36 |===============================
-march=armv8.4-a+sve ................... 35.19 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 34.68 |==============================

Google Draco 1.5.0
Model: Lion
ms < Lower Is Better
-march=armv8.4-a ....................... 5354 |================================
-march=armv8.4-a+sve ................... 5309 |================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 5297 |================================

Google Draco 1.5.0
Model: Church Facade
ms < Lower Is Better
-march=armv8.4-a ....................... 7935 |================================
-march=armv8.4-a+sve ................... 7843 |================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 7797 |===============================

Redis 6.0.9
Test: GET
Requests Per Second > Higher Is Better
-march=armv8.4-a ....................... 2523377.92 |==========================
-march=armv8.4-a+sve ................... 2513289.20 |==========================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 2546595.58 |==========================

Redis 6.0.9
Test: SET
Requests Per Second > Higher Is Better
-march=armv8.4-a ....................... 1865840.13 |==========================
-march=armv8.4-a+sve ................... 1861924.13 |==========================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 1879962.00 |==========================

Caffe 2020-02-13
Model: AlexNet - Acceleration: CPU - Iterations: 200
Milli-Seconds < Lower Is Better
-march=armv8.4-a ....................... 43634 |===============================
-march=armv8.4-a+sve ................... 43931 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 42262 |==============================

Caffe 2020-02-13
Model: GoogleNet - Acceleration: CPU - Iterations: 200
Milli-Seconds < Lower Is Better
-march=armv8.4-a ....................... 123807 |==============================
-march=armv8.4-a+sve ................... 125125 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 120428 |=============================

TNN 0.3
Target: CPU - Model: DenseNet
ms < Lower Is Better
-march=armv8.4-a ....................... 2730.40 |=============================
-march=armv8.4-a+sve ................... 2346.32 |=========================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 2390.26 |=========================

TNN 0.3
Target: CPU - Model: MobileNet v2
ms < Lower Is Better
-march=armv8.4-a ....................... 260.78 |============================
-march=armv8.4-a+sve ................... 280.24 |==============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 273.40 |=============================

TNN 0.3
Target: CPU - Model: SqueezeNet v2
ms < Lower Is Better
-march=armv8.4-a ....................... 71.13 |=============================
-march=armv8.4-a+sve ................... 76.30 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 76.21 |===============================

TNN 0.3
Target: CPU - Model: SqueezeNet v1.1
ms < Lower Is Better
-march=armv8.4-a ....................... 257.70 |==============================
-march=armv8.4-a+sve ................... 205.80 |========================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 205.28 |========================

Sysbench 1.0.20
Test: CPU
Events Per Second > Higher Is Better
-march=armv8.4-a ....................... 96726.40 |============================
-march=armv8.4-a+sve ................... 96666.76 |============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 96702.81 |============================

ONNX Runtime 1.11
Model: GPT-2 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
-march=armv8.4-a ....................... 12364 |===============================
-march=armv8.4-a+sve ................... 12317 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 12460 |===============================

ONNX Runtime 1.11
Model: bertsquad-12 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
-march=armv8.4-a ....................... 773 |=================================
-march=armv8.4-a+sve ................... 772 |=================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 774 |=================================

ONNX Runtime 1.11
Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
-march=armv8.4-a ....................... 73 |==================================
-march=armv8.4-a+sve ................... 73 |==================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 73 |==================================

ONNX Runtime 1.11
Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
-march=armv8.4-a ....................... 938 |=================================
-march=armv8.4-a+sve ................... 935 |=================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 938 |=================================

ONNX Runtime 1.11
Model: super-resolution-10 - Device: CPU - Executor: Standard
Inferences Per Minute > Higher Is Better
-march=armv8.4-a ....................... 5413 |================================
-march=armv8.4-a+sve ................... 5411 |================================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 5416 |================================

WavPack Audio Encoding 5.3
WAV To WavPack
Seconds < Lower Is Better
-march=armv8.4-a ....................... 20.49 |===============================
-march=armv8.4-a+sve ................... 20.52 |===============================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 20.76 |===============================

Kripke 1.2.4
Throughput FoM > Higher Is Better
-march=armv8.4-a ....................... 204143167 |===========================
-march=armv8.4-a+sve ................... 192709233 |=========================
-march=armv8.4-a+sve -mcpu=neoverse-v1 . 194776367 |==========================
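
Because each test reports either "Higher Is Better" or "Lower Is Better", the raw numbers above are not directly comparable across tests. The following is a minimal Python sketch, not part of the Phoronix Test Suite, of how one result block could be parsed and turned into a percent gain relative to the -march=armv8.4-a baseline. The names ROW, parse_rows, and gain_percent are hypothetical helpers introduced for illustration, and the row layout is assumed to be exactly the "label, dots, value, bar" form shown in this file.

```python
import re

# One result row looks like:
#   "-march=armv8.4-a+sve ................... 14.40 |========"
# Assumption: label, one space, a run of dots, one space, the value, then the bar.
ROW = re.compile(r"^(?P<label>.+?) \.+ (?P<value>[0-9.]+) \|=*$")


def parse_rows(lines):
    """Return {configuration label: numeric value} for the rows of one test block."""
    rows = {}
    for line in lines:
        match = ROW.match(line.strip())
        if match:
            rows[match.group("label")] = float(match.group("value"))
    return rows


def gain_percent(baseline, candidate, higher_is_better):
    """Percent improvement of candidate over baseline; positive means better."""
    if higher_is_better:
        return (candidate / baseline - 1.0) * 100.0
    # For time-like metrics a smaller value is the improvement.
    return (baseline / candidate - 1.0) * 100.0


# Rows copied from the Opus Codec Encoding block above (Seconds, lower is better).
opus = parse_rows([
    "-march=armv8.4-a ....................... 18.32 |===============================",
    "-march=armv8.4-a+sve ................... 14.40 |========================",
    "-march=armv8.4-a+sve -mcpu=neoverse-v1 . 14.38 |========================",
])
base = opus["-march=armv8.4-a"]
for label, value in opus.items():
    print(f"{label}: {gain_percent(base, value, higher_is_better=False):+.1f}%")
```

With the Opus figures, enabling SVE comes out roughly 27% faster than the baseline, while the same helper applied to the Crypto++ block (higher is better) shows the -mcpu=neoverse-v1 build about 41% slower.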