a1.4xlarge

amazon testing on Amazon Linux 2 via the Phoronix Test Suite.

a1.4xlarge:

Processor: ARMv8 Cortex-A72 (16 Cores), Motherboard: Amazon EC2 a1.4xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 32GB, Disk: 86GB Amazon Elastic Block Store, Network: Amazon Elastic

OS: Amazon Linux 2, Kernel: 5.10.186-179.751.amzn2.aarch64 (aarch64) 20230801, Compiler: GCC 10.4.1 20221124, File-System: xfs, System Layer: amazon

Stream 2013-01-17
Type: Copy
MB/s > Higher Is Better
a1.4xlarge . 22601.6

Stream 2013-01-17
Type: Scale
MB/s > Higher Is Better
a1.4xlarge . 23474.6

Stream 2013-01-17
Type: Triad
MB/s > Higher Is Better
a1.4xlarge . 25486.8

Stream 2013-01-17
Type: Add
MB/s > Higher Is Better
a1.4xlarge . 25533.7

CacheBench
Test: Read
MB/s > Higher Is Better
a1.4xlarge . 5829.02

CacheBench
Test: Write
MB/s > Higher Is Better
a1.4xlarge . 8552.39

CacheBench
Test: Read / Modify / Write
MB/s > Higher Is Better
a1.4xlarge . 17036.97

GNU GMP GMPbench 6.2.1
Total Time
GMPbench Score > Higher Is Better
a1.4xlarge . 1269.5

Zstd Compression 1.5.4
Compression Level: 3 - Compression Speed
MB/s > Higher Is Better
a1.4xlarge . 252.4

Zstd Compression 1.5.4
Compression Level: 3 - Decompression Speed
MB/s > Higher Is Better
a1.4xlarge . 449.5

Zstd Compression 1.5.4
Compression Level: 8 - Compression Speed
MB/s > Higher Is Better
a1.4xlarge .
96.4

Zstd Compression 1.5.4
Compression Level: 8 - Decompression Speed
MB/s > Higher Is Better
a1.4xlarge . 480.4

Zstd Compression 1.5.4
Compression Level: 12 - Compression Speed
MB/s > Higher Is Better
a1.4xlarge . 37.4

Zstd Compression 1.5.4
Compression Level: 12 - Decompression Speed
MB/s > Higher Is Better
a1.4xlarge . 432.0

Zstd Compression 1.5.4
Compression Level: 19 - Compression Speed
MB/s > Higher Is Better
a1.4xlarge . 6.19

Zstd Compression 1.5.4
Compression Level: 19 - Decompression Speed
MB/s > Higher Is Better
a1.4xlarge . 355.6

Zstd Compression 1.5.4
Compression Level: 3, Long Mode - Compression Speed
MB/s > Higher Is Better
a1.4xlarge . 247.8

Zstd Compression 1.5.4
Compression Level: 3, Long Mode - Decompression Speed
MB/s > Higher Is Better
a1.4xlarge . 458.8

Zstd Compression 1.5.4
Compression Level: 8, Long Mode - Compression Speed
MB/s > Higher Is Better
a1.4xlarge . 103.9

Zstd Compression 1.5.4
Compression Level: 8, Long Mode - Decompression Speed
MB/s > Higher Is Better
a1.4xlarge . 473.8

Zstd Compression 1.5.4
Compression Level: 19, Long Mode - Compression Speed
MB/s > Higher Is Better
a1.4xlarge . 3.36

Zstd Compression 1.5.4
Compression Level: 19, Long Mode - Decompression Speed
MB/s > Higher Is Better
a1.4xlarge .
367.9

LZ4 Compression 1.9.3
Compression Level: 1 - Compression Speed
MB/s > Higher Is Better
a1.4xlarge . 3613.89

LZ4 Compression 1.9.3
Compression Level: 1 - Decompression Speed
MB/s > Higher Is Better
a1.4xlarge . 4137.2

LZ4 Compression 1.9.3
Compression Level: 3 - Compression Speed
MB/s > Higher Is Better
a1.4xlarge . 18.98

LZ4 Compression 1.9.3
Compression Level: 3 - Decompression Speed
MB/s > Higher Is Better
a1.4xlarge . 3741.3

LZ4 Compression 1.9.3
Compression Level: 9 - Compression Speed
MB/s > Higher Is Better
a1.4xlarge . 18.16

LZ4 Compression 1.9.3
Compression Level: 9 - Decompression Speed
MB/s > Higher Is Better
a1.4xlarge . 3743.3

OpenSSL 3.1
Algorithm: SHA256
byte/s > Higher Is Better
a1.4xlarge . 6705985087

OpenSSL 3.1
Algorithm: SHA512
byte/s > Higher Is Better
a1.4xlarge . 2609151510

OpenSSL 3.1
Algorithm: RSA4096
sign/s > Higher Is Better
a1.4xlarge . 588.9

OpenSSL 3.1
Algorithm: RSA4096
verify/s > Higher Is Better
a1.4xlarge . 45406.1

OpenSSL 3.1
Algorithm: ChaCha20
byte/s > Higher Is Better
a1.4xlarge . 8294192013

OpenSSL 3.1
Algorithm: AES-128-GCM
byte/s > Higher Is Better
a1.4xlarge . 30600740043

OpenSSL 3.1
Algorithm: AES-256-GCM
byte/s > Higher Is Better
a1.4xlarge .
25798807190

OpenSSL 3.1
Algorithm: ChaCha20-Poly1305
byte/s > Higher Is Better
a1.4xlarge . 6567410800

Botan 2.17.3
Test: KASUMI
MiB/s > Higher Is Better
a1.4xlarge . 40.37

Botan 2.17.3
Test: KASUMI - Decrypt
MiB/s > Higher Is Better
a1.4xlarge . 40.48

Botan 2.17.3
Test: AES-256
MiB/s > Higher Is Better
a1.4xlarge . 1817.83

Botan 2.17.3
Test: AES-256 - Decrypt
MiB/s > Higher Is Better
a1.4xlarge . 1817.98

Botan 2.17.3
Test: Twofish
MiB/s > Higher Is Better
a1.4xlarge . 138.00

Botan 2.17.3
Test: Twofish - Decrypt
MiB/s > Higher Is Better
a1.4xlarge . 135.85

Botan 2.17.3
Test: Blowfish
MiB/s > Higher Is Better
a1.4xlarge . 160.23

Botan 2.17.3
Test: Blowfish - Decrypt
MiB/s > Higher Is Better
a1.4xlarge . 163.20

Botan 2.17.3
Test: CAST-256
MiB/s > Higher Is Better
a1.4xlarge . 73.61

Botan 2.17.3
Test: CAST-256 - Decrypt
MiB/s > Higher Is Better
a1.4xlarge . 72.69

Botan 2.17.3
Test: ChaCha20Poly1305
MiB/s > Higher Is Better
a1.4xlarge . 161.34

Botan 2.17.3
Test: ChaCha20Poly1305 - Decrypt
MiB/s > Higher Is Better
a1.4xlarge . 160.17

x264 2022-02-22
Video Input: Bosphorus 4K
Frames Per Second > Higher Is Better
a1.4xlarge .
9.52

x264 2022-02-22
Video Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
a1.4xlarge . 39.98

x265 3.4
Video Input: Bosphorus 4K
Frames Per Second > Higher Is Better
a1.4xlarge . 2.95

x265 3.4
Video Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
a1.4xlarge . 10.74

PyPerformance 1.0.0
Benchmark: go
Milliseconds < Lower Is Better
a1.4xlarge . 1.18

PyPerformance 1.0.0
Benchmark: 2to3
Milliseconds < Lower Is Better
a1.4xlarge . 1.49

PyPerformance 1.0.0
Benchmark: chaos
Milliseconds < Lower Is Better
a1.4xlarge . 578

PyPerformance 1.0.0
Benchmark: float
Milliseconds < Lower Is Better
a1.4xlarge . 497

PyPerformance 1.0.0
Benchmark: nbody
Milliseconds < Lower Is Better
a1.4xlarge . 492

PyPerformance 1.0.0
Benchmark: pathlib
Milliseconds < Lower Is Better
a1.4xlarge . 131

PyPerformance 1.0.0
Benchmark: raytrace
Milliseconds < Lower Is Better
a1.4xlarge . 2.57

PyPerformance 1.0.0
Benchmark: json_loads
Milliseconds < Lower Is Better
a1.4xlarge . 93.7

PyPerformance 1.0.0
Benchmark: crypto_pyaes
Milliseconds < Lower Is Better
a1.4xlarge . 478

PyPerformance 1.0.0
Benchmark: regex_compile
Milliseconds < Lower Is Better
a1.4xlarge .
879

PyPerformance 1.0.0
Benchmark: python_startup
Milliseconds < Lower Is Better
a1.4xlarge . 28

PyPerformance 1.0.0
Benchmark: django_template
Milliseconds < Lower Is Better
a1.4xlarge . 335

PyPerformance 1.0.0
Benchmark: pickle_pure_python
Milliseconds < Lower Is Better
a1.4xlarge . 2.5

CppPerformanceBenchmarks 9
Test: Atol
Seconds < Lower Is Better
a1.4xlarge . 156.25

CppPerformanceBenchmarks 9
Test: Ctype
Seconds < Lower Is Better
a1.4xlarge . 100.57

CppPerformanceBenchmarks 9
Test: Math Library
Seconds < Lower Is Better
a1.4xlarge . 1216.10

CppPerformanceBenchmarks 9
Test: Random Numbers
Seconds < Lower Is Better
a1.4xlarge . 2710.48

CppPerformanceBenchmarks 9
Test: Stepanov Vector
Seconds < Lower Is Better
a1.4xlarge . 154.72

CppPerformanceBenchmarks 9
Test: Function Objects
Seconds < Lower Is Better
a1.4xlarge . 25.18

CppPerformanceBenchmarks 9
Test: Stepanov Abstraction
Seconds < Lower Is Better
a1.4xlarge . 59.29

GraphicsMagick 1.3.38
Operation: Swirl
Iterations Per Minute > Higher Is Better
a1.4xlarge . 235

GraphicsMagick 1.3.38
Operation: Rotate
Iterations Per Minute > Higher Is Better
a1.4xlarge .
257

GraphicsMagick 1.3.38
Operation: Sharpen
Iterations Per Minute > Higher Is Better
a1.4xlarge . 118

GraphicsMagick 1.3.38
Operation: Enhanced
Iterations Per Minute > Higher Is Better
a1.4xlarge . 104

GraphicsMagick 1.3.38
Operation: Resizing
Iterations Per Minute > Higher Is Better
a1.4xlarge . 418

GraphicsMagick 1.3.38
Operation: Noise-Gaussian
Iterations Per Minute > Higher Is Better
a1.4xlarge . 76

GraphicsMagick 1.3.38
Operation: HWB Color Space
Iterations Per Minute > Higher Is Better
a1.4xlarge . 425

Smallpt 1.0
Global Illumination Renderer; 128 Samples
Seconds < Lower Is Better
a1.4xlarge . 23.83

Stockfish 15
Total Time
Nodes Per Second > Higher Is Better
a1.4xlarge . 8068443

C-Ray 1.1
Total Time - 4K, 16 Rays Per Pixel
Seconds < Lower Is Better
a1.4xlarge . 99.23

SciMark 2.0
Computational Test: Composite
Mflops > Higher Is Better
a1.4xlarge . 210.17

SciMark 2.0
Computational Test: Monte Carlo
Mflops > Higher Is Better
a1.4xlarge . 62.21

SciMark 2.0
Computational Test: Fast Fourier Transform
Mflops > Higher Is Better
a1.4xlarge . 52.04

SciMark 2.0
Computational Test: Sparse Matrix Multiply
Mflops > Higher Is Better
a1.4xlarge .
209.34

SciMark 2.0
Computational Test: Dense LU Matrix Factorization
Mflops > Higher Is Better
a1.4xlarge . 245.94

SciMark 2.0
Computational Test: Jacobi Successive Over-Relaxation
Mflops > Higher Is Better
a1.4xlarge . 481.33

Renaissance 0.14
Test: Scala Dotty
ms < Lower Is Better
a1.4xlarge . 3231.9

Renaissance 0.14
Test: Random Forest
ms < Lower Is Better
a1.4xlarge . 2231.2

Renaissance 0.14
Test: ALS Movie Lens
ms < Lower Is Better
a1.4xlarge . 19333.6

Renaissance 0.14
Test: Apache Spark ALS
ms < Lower Is Better
a1.4xlarge . 7320.6

Renaissance 0.14
Test: Apache Spark Bayes
ms < Lower Is Better
a1.4xlarge . 3789.2

Renaissance 0.14
Test: Savina Reactors.IO
ms < Lower Is Better
a1.4xlarge . 47878.9

Renaissance 0.14
Test: Apache Spark PageRank
ms < Lower Is Better
a1.4xlarge . 8404.7

Renaissance 0.14
Test: Finagle HTTP Requests
ms < Lower Is Better
a1.4xlarge . 11688.9

Renaissance 0.14
Test: In-Memory Database Shootout
ms < Lower Is Better
a1.4xlarge . 10890.3

Renaissance 0.14
Test: Akka Unbalanced Cobwebbed Tree
ms < Lower Is Better
a1.4xlarge . 16270.1

Renaissance 0.14
Test: Genetic Algorithm Using Jenetics + Futures
ms < Lower Is Better
a1.4xlarge .
5729.5

DaCapo Benchmark 9.12-MR1
Java Test: H2
msec < Lower Is Better
a1.4xlarge . 8894

DaCapo Benchmark 9.12-MR1
Java Test: Jython
msec < Lower Is Better
a1.4xlarge . 12049

DaCapo Benchmark 9.12-MR1
Java Test: Eclipse
msec < Lower Is Better
a1.4xlarge . 48150

DaCapo Benchmark 9.12-MR1
Java Test: Tradesoap
msec < Lower Is Better
a1.4xlarge . 11737

DaCapo Benchmark 9.12-MR1
Java Test: Tradebeans
msec < Lower Is Better
a1.4xlarge . 8821

BlogBench 1.1
Test: Read
Final Score > Higher Is Better
a1.4xlarge . 1421893

BlogBench 1.1
Test: Write
Final Score > Higher Is Better
a1.4xlarge . 5982

nginx 1.23.2
Connections: 1
Requests Per Second > Higher Is Better

nginx 1.23.2
Connections: 20
Requests Per Second > Higher Is Better
a1.4xlarge . 17213.56

nginx 1.23.2
Connections: 100
Requests Per Second > Higher Is Better
a1.4xlarge . 20833.31

nginx 1.23.2
Connections: 200
Requests Per Second > Higher Is Better
a1.4xlarge . 21070.15

nginx 1.23.2
Connections: 500
Requests Per Second > Higher Is Better
a1.4xlarge . 20281.73

nginx 1.23.2
Connections: 1000
Requests Per Second > Higher Is Better
a1.4xlarge . 19286.86

nginx 1.23.2
Connections: 4000
Requests Per Second > Higher Is Better
a1.4xlarge .
16673.77

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 50 - Set To Get Ratio: 1:1
Ops/sec > Higher Is Better
a1.4xlarge . 570690.70

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 50 - Set To Get Ratio: 1:5
Ops/sec > Higher Is Better
a1.4xlarge . 624904.21

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 50 - Set To Get Ratio: 5:1
Ops/sec > Higher Is Better
a1.4xlarge . 503261.73

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 100 - Set To Get Ratio: 1:1
Ops/sec > Higher Is Better
a1.4xlarge . 508379.89

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 100 - Set To Get Ratio: 1:5
Ops/sec > Higher Is Better
a1.4xlarge . 588332.01

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 100 - Set To Get Ratio: 5:1
Ops/sec > Higher Is Better
a1.4xlarge . 492717.66

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 50 - Set To Get Ratio: 10:1
Ops/sec > Higher Is Better
a1.4xlarge . 489350.30

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 50 - Set To Get Ratio: 1:10
Ops/sec > Higher Is Better
a1.4xlarge . 604413.07

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 500 - Set To Get Ratio: 1:1
Ops/sec > Higher Is Better
a1.4xlarge . 571156.97

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 500 - Set To Get Ratio: 1:5
Ops/sec > Higher Is Better
a1.4xlarge .
627539.97

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 500 - Set To Get Ratio: 5:1
Ops/sec > Higher Is Better
a1.4xlarge . 537971.74

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 100 - Set To Get Ratio: 10:1
Ops/sec > Higher Is Better
a1.4xlarge . 482407.96

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 100 - Set To Get Ratio: 1:10
Ops/sec > Higher Is Better
a1.4xlarge . 558812.72

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 500 - Set To Get Ratio: 10:1
Ops/sec > Higher Is Better
a1.4xlarge . 531602.56

Redis 7.0.12 + memtier_benchmark 2.0
Protocol: Redis - Clients: 500 - Set To Get Ratio: 1:10
Ops/sec > Higher Is Better
a1.4xlarge . 617822.56

Apache Cassandra 4.1.3
Test: Writes
Op/s > Higher Is Better
a1.4xlarge . 24126

Apache Cassandra 4.1.3
Test: Mixed 1:1
Op/s > Higher Is Better
a1.4xlarge . 19008

Apache Cassandra 4.1.3
Test: Mixed 1:3
Op/s > Higher Is Better
a1.4xlarge . 17575

libjpeg-turbo tjbench 2.1.0
Test: Decompression Throughput
Megapixels/sec > Higher Is Better
a1.4xlarge . 75.80

VVenC 1.9
Video Input: Bosphorus 4K - Video Preset: Fast
Frames Per Second > Higher Is Better
a1.4xlarge . 1.087

VVenC 1.9
Video Input: Bosphorus 4K - Video Preset: Faster
Frames Per Second > Higher Is Better
a1.4xlarge .
2.141

VVenC 1.9
Video Input: Bosphorus 1080p - Video Preset: Fast
Frames Per Second > Higher Is Better
a1.4xlarge . 3.636

VVenC 1.9
Video Input: Bosphorus 1080p - Video Preset: Faster
Frames Per Second > Higher Is Better
a1.4xlarge . 7.613

NCNN 20230517
Target: CPU - Model: mobilenet
ms < Lower Is Better
a1.4xlarge . 44.13

NCNN 20230517
Target: CPU-v2-v2 - Model: mobilenet-v2
ms < Lower Is Better
a1.4xlarge . 14.67

NCNN 20230517
Target: CPU-v3-v3 - Model: mobilenet-v3
ms < Lower Is Better
a1.4xlarge . 11.32

NCNN 20230517
Target: CPU - Model: shufflenet-v2
ms < Lower Is Better
a1.4xlarge . 7.43

NCNN 20230517
Target: CPU - Model: mnasnet
ms < Lower Is Better
a1.4xlarge . 12.41

NCNN 20230517
Target: CPU - Model: efficientnet-b0
ms < Lower Is Better
a1.4xlarge . 17.73

NCNN 20230517
Target: CPU - Model: blazeface
ms < Lower Is Better
a1.4xlarge . 3.71

NCNN 20230517
Target: CPU - Model: googlenet
ms < Lower Is Better
a1.4xlarge . 28.65

NCNN 20230517
Target: CPU - Model: vgg16
ms < Lower Is Better
a1.4xlarge . 92.10

NCNN 20230517
Target: CPU - Model: resnet18
ms < Lower Is Better
a1.4xlarge . 20.32

NCNN 20230517
Target: CPU - Model: alexnet
ms < Lower Is Better
a1.4xlarge .
13.16

NCNN 20230517
Target: CPU - Model: resnet50
ms < Lower Is Better
a1.4xlarge . 52.01

NCNN 20230517
Target: CPU - Model: yolov4-tiny
ms < Lower Is Better
a1.4xlarge . 54.16

NCNN 20230517
Target: CPU - Model: squeezenet_ssd
ms < Lower Is Better
a1.4xlarge . 30.98

NCNN 20230517
Target: CPU - Model: regnety_400m
ms < Lower Is Better
a1.4xlarge . 33.81

NCNN 20230517
Target: CPU - Model: vision_transformer
ms < Lower Is Better
a1.4xlarge . 440.19

NCNN 20230517
Target: CPU - Model: FastestDet
ms < Lower Is Better
a1.4xlarge . 10.96

NCNN 20230517
Target: Vulkan GPU - Model: mobilenet
ms < Lower Is Better
a1.4xlarge . 44.90

NCNN 20230517
Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
ms < Lower Is Better
a1.4xlarge . 14.41

NCNN 20230517
Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
ms < Lower Is Better
a1.4xlarge . 11.59

NCNN 20230517
Target: Vulkan GPU - Model: shufflenet-v2
ms < Lower Is Better
a1.4xlarge . 7.33

NCNN 20230517
Target: Vulkan GPU - Model: mnasnet
ms < Lower Is Better
a1.4xlarge . 12.53

NCNN 20230517
Target: Vulkan GPU - Model: efficientnet-b0
ms < Lower Is Better
a1.4xlarge . 17.74

NCNN 20230517
Target: Vulkan GPU - Model: blazeface
ms < Lower Is Better
a1.4xlarge .
3.91

NCNN 20230517
Target: Vulkan GPU - Model: googlenet
ms < Lower Is Better
a1.4xlarge . 29.60

NCNN 20230517
Target: Vulkan GPU - Model: vgg16
ms < Lower Is Better
a1.4xlarge . 92.03

NCNN 20230517
Target: Vulkan GPU - Model: resnet18
ms < Lower Is Better
a1.4xlarge . 21.47

NCNN 20230517
Target: Vulkan GPU - Model: alexnet
ms < Lower Is Better
a1.4xlarge . 12.73

NCNN 20230517
Target: Vulkan GPU - Model: resnet50
ms < Lower Is Better
a1.4xlarge . 51.97

NCNN 20230517
Target: Vulkan GPU - Model: yolov4-tiny
ms < Lower Is Better
a1.4xlarge . 53.90

NCNN 20230517
Target: Vulkan GPU - Model: squeezenet_ssd
ms < Lower Is Better
a1.4xlarge . 31.11

NCNN 20230517
Target: Vulkan GPU - Model: regnety_400m
ms < Lower Is Better
a1.4xlarge . 33.65

NCNN 20230517
Target: Vulkan GPU - Model: vision_transformer
ms < Lower Is Better
a1.4xlarge . 438.33

NCNN 20230517
Target: Vulkan GPU - Model: FastestDet
ms < Lower Is Better
a1.4xlarge . 10.86

libxsmm 2-1.17-3645
M N K: 128
GFLOPS/s > Higher Is Better

libxsmm 2-1.17-3645
M N K: 256
GFLOPS/s > Higher Is Better

libxsmm 2-1.17-3645
M N K: 32
GFLOPS/s > Higher Is Better

libxsmm 2-1.17-3645
M N K: 64
GFLOPS/s > Higher Is Better

Apache Spark 3.3
Row Count: 1000000 - Partitions: 100 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge .
9.11

Apache Spark 3.3
Row Count: 1000000 - Partitions: 100 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 543.39

Apache Spark 3.3
Row Count: 1000000 - Partitions: 100 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge . 45.13

Apache Spark 3.3
Row Count: 1000000 - Partitions: 100 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge . 10.67

Apache Spark 3.3
Row Count: 1000000 - Partitions: 100 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 4.11

Apache Spark 3.3
Row Count: 1000000 - Partitions: 100 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 4.60

Apache Spark 3.3
Row Count: 1000000 - Partitions: 100 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 3.54

Apache Spark 3.3
Row Count: 1000000 - Partitions: 500 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge . 9.41

Apache Spark 3.3
Row Count: 1000000 - Partitions: 500 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 544.03

Apache Spark 3.3
Row Count: 1000000 - Partitions: 500 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge . 45.05

Apache Spark 3.3
Row Count: 1000000 - Partitions: 500 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge .
11.44

Apache Spark 3.3
Row Count: 1000000 - Partitions: 500 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 4.23

Apache Spark 3.3
Row Count: 1000000 - Partitions: 500 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 5.06

Apache Spark 3.3
Row Count: 1000000 - Partitions: 500 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 3.80

Apache Spark 3.3
Row Count: 1000000 - Partitions: 1000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge . 9.96

Apache Spark 3.3
Row Count: 1000000 - Partitions: 1000 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 553.51

Apache Spark 3.3
Row Count: 1000000 - Partitions: 1000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge . 45.16

Apache Spark 3.3
Row Count: 1000000 - Partitions: 1000 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge . 11.52

Apache Spark 3.3
Row Count: 1000000 - Partitions: 1000 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 4.73

Apache Spark 3.3
Row Count: 1000000 - Partitions: 1000 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 5.86

Apache Spark 3.3
Row Count: 1000000 - Partitions: 1000 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge .
4.72

Apache Spark 3.3
Row Count: 1000000 - Partitions: 2000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge . 10.69

Apache Spark 3.3
Row Count: 1000000 - Partitions: 2000 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 550.53

Apache Spark 3.3
Row Count: 1000000 - Partitions: 2000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge . 45.09

Apache Spark 3.3
Row Count: 1000000 - Partitions: 2000 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge . 12.53

Apache Spark 3.3
Row Count: 1000000 - Partitions: 2000 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 5.53

Apache Spark 3.3
Row Count: 1000000 - Partitions: 2000 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 7.88

Apache Spark 3.3
Row Count: 1000000 - Partitions: 2000 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 5.65

Apache Spark 3.3
Row Count: 10000000 - Partitions: 100 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge . 38.00

Apache Spark 3.3
Row Count: 10000000 - Partitions: 100 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 553.37

Apache Spark 3.3
Row Count: 10000000 - Partitions: 100 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge .
45.24 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 100 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge . 21.40 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 100 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 23.05 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 100 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 27.92 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 100 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 28.45 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 500 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge . 36.43 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 500 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 549.47 |==========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 500 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge . 44.96 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 500 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge . 21.33 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 500 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 22.75 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 500 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge .
26.35 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 500 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 27.15 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 100 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge . 69.18 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 100 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 546.49 |==========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 100 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge . 45.13 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 100 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge . 44.49 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 100 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 43.98 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 100 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 54.92 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 100 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 56.14 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 500 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge . 57.61 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 500 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge .
553.89 |==========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 500 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge . 45.15 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 500 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge . 36.68 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 500 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 40.80 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 500 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 47.82 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 500 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 45.89 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 100 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge . 107.75 |==========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 100 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 557.74 |==========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 100 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge . 45.10 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 100 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge . 66.44 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 100 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge .
80.69 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 100 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 94.19 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 100 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 92.42 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 500 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge . 110.36 |==========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 500 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 551.71 |==========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 500 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge . 45.06 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 500 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge . 70.24 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 500 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 81.14 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 500 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 94.27 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 500 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 94.93 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 1000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge .
34.69 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 1000 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 541.76 |==========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 1000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge . 45.03 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 1000 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge . 20.22 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 1000 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 22.34 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 1000 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 27.86 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 1000 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 26.95 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 2000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge . 34.76 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 2000 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 554.75 |==========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 2000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge . 45.14 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 2000 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge .
21.47 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 2000 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 23.17 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 2000 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 29.44 |===========================================================

Apache Spark 3.3
Row Count: 10000000 - Partitions: 2000 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 27.32 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 1000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge . 58.58 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 1000 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 545.05 |==========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 1000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge . 45.05 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 1000 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge . 35.99 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 1000 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 41.04 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 1000 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 51.03 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 1000 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge .
47.96 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 2000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge . 57.78 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 2000 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 551.36 |==========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 2000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge . 45.01 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 2000 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge . 35.92 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 2000 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 42.03 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 2000 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 52.05 |===========================================================

Apache Spark 3.3
Row Count: 20000000 - Partitions: 2000 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 47.78 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 1000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge . 112.60 |==========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 1000 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 551.56 |==========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 1000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge .
44.93 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 1000 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge . 60.53 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 1000 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 79.96 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 1000 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 93.24 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 1000 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 96.51 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 2000 - SHA-512 Benchmark Time
Seconds < Lower Is Better
a1.4xlarge . 111.58 |==========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 2000 - Calculate Pi Benchmark
Seconds < Lower Is Better
a1.4xlarge . 551.46 |==========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 2000 - Calculate Pi Benchmark Using Dataframe
Seconds < Lower Is Better
a1.4xlarge . 45.02 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 2000 - Group By Test Time
Seconds < Lower Is Better
a1.4xlarge . 58.29 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 2000 - Repartition Test Time
Seconds < Lower Is Better
a1.4xlarge . 78.92 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 2000 - Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge .
96.64 |===========================================================

Apache Spark 3.3
Row Count: 40000000 - Partitions: 2000 - Broadcast Inner Join Test Time
Seconds < Lower Is Better
a1.4xlarge . 93.98 |===========================================================