eoy2024 Tests for a future article.
AMD EPYC 4564P 16-Core testing with a Supermicro AS-3015A-I H13SAE-MF v1.00 (2.1 BIOS) and ASPEED on Ubuntu 24.04 via the Phoronix Test Suite.

System configuration (identical for runs a, b, and c):
  Processor: AMD EPYC 4564P 16-Core @ 5.88GHz (16 Cores / 32 Threads)
  Motherboard: Supermicro AS-3015A-I H13SAE-MF v1.00 (2.1 BIOS)
  Chipset: AMD Device 14d8
  Memory: 2 x 32GB DRAM-4800MT/s Micron MTC20C2085S1EC48BA1 BC
  Disk: 3201GB Micron_7450_MTFDKCC3T2TFS + 960GB SAMSUNG MZ1L2960HCJR-00A07
  Graphics: ASPEED
  Audio: AMD Rembrandt Radeon HD Audio
  Monitor: VA2431
  Network: 2 x Intel I210
  OS: Ubuntu 24.04
  Kernel: 6.8.0-11-generic (x86_64)
  Desktop: GNOME Shell 45.3
  Display Server: X Server 1.21.1.11
  Compiler: GCC 13.2.0
  File-System: ext4
  Screen Resolution: 1024x768

Results:

QuantLib 1.35-dev - Size: S
tasks/s > Higher Is Better
  a: 12.75 | b: 12.71 | c: 12.72

RELION 5.0 - Test: Basic - Device: CPU
Seconds < Lower Is Better
  a: 944.27 | b: 867.32 | c: 939.90

SVT-AV1 2.3 - Encoder Mode: Preset 3 - Input: Beauty 4K 10-bit
Frames Per Second > Higher Is Better
  a: 1.422 | b: 1.415 | c: 1.411

Whisper.cpp 1.6.2 - Model: ggml-medium.en - Input: 2016 State of the Union
Seconds < Lower Is Better
  a: 700.91 | b: 703.22

CP2K Molecular Dynamics 2024.3 - Input: H20-256
Seconds < Lower Is Better
  a: 592.86 | b: 629.56 | c: 624.57

Whisperfile 20Aug24 - Model Size: Medium
Seconds < Lower Is Better
  a: 534.92 | b: 532.81

Apache CouchDB 3.4.1 - Bulk Size: 500 - Inserts: 3000 - Rounds: 30
Seconds < Lower Is Better
  a: 511.78 | b: 517.15

Blender 4.3 - Blend File: Barbershop - Compute: CPU-Only
Seconds < Lower Is Better
  a: 506.2 | b: 509.3

Llamafile 0.8.16 - Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
  a: 12288 | b: 12288

QuantLib 1.35-dev - Size: XXS
tasks/s > Higher Is Better
  a: 13.43 | b: 13.43 | c: 13.49

Apache CouchDB 3.4.1 - Bulk Size: 300 - Inserts: 3000 - Rounds: 30
Seconds < Lower Is Better
  a: 367.83 | b: 368.66

Llamafile 0.8.16 - Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 128
Tokens Per Second > Higher Is Better
  a: 1.99 | b: 2.00

Llamafile 0.8.16 - Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
  a: 32768 | b: 32768

BYTE Unix Benchmark 5.1.3-git - Computational Test: Whetstone Double
MWIPS > Higher Is Better
  a: 343491.9 | b: 343113.0 | c: 343187.0

BYTE Unix Benchmark 5.1.3-git - Computational Test: Pipe
LPS > Higher Is Better
  a: 48806257.1 | b: 48718087.1 | c: 48613927.9

BYTE Unix Benchmark 5.1.3-git - Computational Test: Dhrystone 2
LPS > Higher Is Better
  a: 1866536062.7 | b: 1857795366.1 | c: 1862548305.4

BYTE Unix Benchmark 5.1.3-git - Computational Test: System Call
LPS > Higher Is Better
  a: 49140426.6 | b: 49062324.1 | c: 49016743.6

SVT-AV1 2.3 - Encoder Mode: Preset 3 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
  a: 9.590 | b: 9.554 | c: 9.495

Whisper.cpp 1.6.2 - Model: ggml-small.en - Input: 2016 State of the Union
Seconds < Lower Is Better
  a: 245.08 | b: 240.60

Apache CouchDB 3.4.1 - Bulk Size: 100 - Inserts: 3000 - Rounds: 30
Seconds < Lower Is Better
  a: 232.19 | b: 235.35

XNNPACK b7b048 - Model: QS8MobileNetV2
us < Lower Is Better
  a: 844 | b: 854

XNNPACK b7b048 - Model: FP16MobileNetV3Small
us < Lower Is Better
  a: 920 | b: 931

XNNPACK b7b048 - Model: FP16MobileNetV3Large
us < Lower Is Better
  a: 1498 | b: 1549

XNNPACK b7b048 - Model: FP16MobileNetV2
us < Lower Is Better
  a: 1190 | b: 1247

XNNPACK b7b048 - Model: FP16MobileNetV1
us < Lower Is Better
  a: 1143 | b: 1174

XNNPACK b7b048 - Model: FP32MobileNetV3Small
us < Lower Is Better
  a: 979 | b: 1005

XNNPACK b7b048 - Model: FP32MobileNetV3Large
us < Lower Is Better
  a: 1810 | b: 1877

XNNPACK b7b048 - Model: FP32MobileNetV2
us < Lower Is Better
  a: 1495 | b: 1559

XNNPACK b7b048 - Model: FP32MobileNetV1
us < Lower Is Better
  a: 1252 | b: 1290

Llama.cpp b4154 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
  a: 63.09 | b: 59.84

Llamafile 0.8.16 - Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
  a: 6144 | b: 6144

Llama.cpp b4154 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
  a: 62.97 | b: 65.27

Whisperfile 20Aug24 - Model Size: Small
Seconds < Lower Is Better
  a: 195.42 | b: 192.68

SVT-AV1 2.3 - Encoder Mode: Preset 5 - Input: Beauty 4K 10-bit
Frames Per Second > Higher Is Better
  a: 6.504 | b: 6.371 | c: 6.374

Llamafile 0.8.16 - Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 128
Tokens Per Second > Higher Is Better
  a: 10.47 | b: 10.64

OpenSSL - Algorithm: ChaCha20
byte/s > Higher Is Better
  a: 130588495050 | b: 130359884190

OpenSSL - Algorithm: ChaCha20-Poly1305
byte/s > Higher Is Better
  a: 92393529340 | b: 92216350580

OpenSSL - Algorithm: AES-256-GCM
byte/s > Higher Is Better
  a: 97172751700 | b: 96821737060

OpenSSL - Algorithm: AES-128-GCM
byte/s > Higher Is Better
  a: 104784522170 | b: 104404347840

Blender 4.3 - Blend File: Pabellon Barcelona - Compute: CPU-Only
Seconds < Lower Is Better
  a: 166.12 | b: 166.25

Rustls 0.23.17 - Benchmark: handshake-resume - Suite: TLS13_CHACHA20_POLY1305_SHA256
handshakes/s > Higher Is Better
  a: 388077.69 | b: 380493.86

Gcrypt Library 1.10.3
Seconds < Lower Is Better
  a: 162.13 | b: 154.53

Rustls 0.23.17 - Benchmark: handshake-ticket - Suite: TLS13_CHACHA20_POLY1305_SHA256
handshakes/s > Higher Is Better
  a: 404263.45 | b: 397022.40

OSPRay 3.2 - Benchmark: particle_volume/scivis/real_time
Items Per Second > Higher Is Better
  a: 8.98486 | b: 8.93245 | c: 8.97005

Apache CouchDB 3.4.1 - Bulk Size: 500 - Inserts: 1000 - Rounds: 30
Seconds < Lower Is Better
  a: 148.05 | b: 149.03

Blender 4.3 - Blend File: Classroom - Compute: CPU-Only
Seconds < Lower Is Better
  a: 143.36 | b: 144.41

Rustls 0.23.17 - Benchmark: handshake-ticket - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
handshakes/s > Higher Is Better
  a: 1553632.14 | b: 1536355.90

OSPRay 3.2 - Benchmark: particle_volume/pathtracer/real_time
Items Per Second > Higher Is Better
  a: 236.25 | b: 235.33 | c: 234.97

Rustls 0.23.17 - Benchmark: handshake-resume - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
handshakes/s > Higher Is Better
  a: 1820810.21 | b: 1821261.88

Apache Cassandra 5.0 - Test: Writes
Op/s > Higher Is Better
  a: 271333 | b: 271373

PyPerformance 1.11 - Benchmark: async_tree_io
Milliseconds < Lower Is Better
  a: 755 | b: 759

Llamafile 0.8.16 - Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
  a: 16384 | b: 16384

SVT-AV1 2.3 - Encoder Mode: Preset 3 - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
  a: 29.57 | b: 29.38 | c: 29.47

Llamafile 0.8.16 - Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 512
Tokens Per Second > Higher Is Better
  a: 3072 | b: 3072

OpenVINO GenAI 2024.5 - Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token
ms < Lower Is Better
  a: 101.72 | b: 100.94

OpenVINO GenAI 2024.5 - Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token
ms < Lower Is Better
  a: 106.62 | b: 107.03

OpenVINO GenAI 2024.5 - Model: Gemma-7b-int4-ov - Device: CPU
tokens/s > Higher Is Better
  a: 9.83 | b: 9.91

PyPerformance 1.11 - Benchmark: xml_etree
Milliseconds < Lower Is Better
  a: 35.8 | b: 35.7

PyPerformance 1.11 - Benchmark: asyncio_tcp_ssl
Milliseconds < Lower Is Better
  a: 645 | b: 672

GROMACS - Input: water_GMX50_bare
Ns Per Day > Higher Is Better
  a: 1.692 | b: 1.679

OSPRay 3.2 - Benchmark: particle_volume/ao/real_time
Items Per Second > Higher Is Better
  a: 9.00917 | b: 8.96632 | c: 8.98586

Apache CouchDB 3.4.1 - Bulk Size: 300 - Inserts: 1000 - Rounds: 30
Seconds < Lower Is Better
  a: 106.13 | b: 107.18

Numpy Benchmark
Score > Higher Is Better
  a: 775.75 | b: 765.35

CP2K Molecular Dynamics 2024.3 - Input: Fayalite-FIST
Seconds < Lower Is Better
  a: 94.03 | b: 102.42 | c: 105.22

simdjson 3.10 - Throughput Test: Kostya
GB/s > Higher Is Better
  a: 5.97 | b: 5.93 | c: 5.73

SVT-AV1 2.3 - Encoder Mode: Preset 8 - Input: Beauty 4K 10-bit
Frames Per Second > Higher Is Better
  a: 12.47 | b: 12.61 | c: 12.60

Llamafile 0.8.16 - Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
  a: 32768 | b: 32768

Llama.cpp b4154 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
  a: 6.88 | b: 6.85

Llamafile 0.8.16 - Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 128
Tokens Per Second > Higher Is Better
  a: 20.13 | b: 19.81

PyPerformance 1.11 - Benchmark: python_startup
Milliseconds < Lower Is Better
  a: 5.77 | b: 5.79

Llama.cpp b4154 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
  a: 70.85 | b: 66.00

Llama.cpp b4154 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
  a: 7.24 | b: 7.19

Build2 0.17 - Time To Compile
Seconds < Lower Is Better
  a: 92.05 | b: 92.29

ASTC Encoder 5.0 - Preset: Very Thorough
MT/s > Higher Is Better
  a: 2.7410 | b: 2.7248

Llama.cpp b4154 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
  a: 69.26 | b: 68.80

OpenVINO GenAI 2024.5 - Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token
ms < Lower Is Better
  a: 77.34 | b: 77.13

OpenVINO GenAI 2024.5 - Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token
ms < Lower Is Better
  a: 86.06 | b: 84.39

OpenVINO GenAI 2024.5 - Model: Falcon-7b-instruct-int4-ov - Device: CPU
tokens/s > Higher Is Better
  a: 12.93 | b: 12.97

ASTC Encoder 5.0 - Preset: Exhaustive
MT/s > Higher Is Better
  a: 1.6844 | b: 1.6728

Whisper.cpp 1.6.2 - Model: ggml-base.en - Input: 2016 State of the Union
Seconds < Lower Is Better
  a: 87.49 | b: 87.27

simdjson 3.10 - Throughput Test: LargeRandom
GB/s > Higher Is Better
  a: 1.83 | b: 1.81 | c: 1.74

Rustls 0.23.17 - Benchmark: handshake - Suite: TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
handshakes/s > Higher Is Better
  a: 423535.68 | b: 402625.06

Stockfish 17 - Chess Benchmark
Nodes Per Second > Higher Is Better
  a: 54752796 | b: 59130265 | c: 53623108

Llamafile 0.8.16 - Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 512
Tokens Per Second > Higher Is Better
  a: 8192 | b: 8192

Primesieve 12.6 - Length: 1e13
Seconds < Lower Is Better
  a: 78.50 | b: 78.95

Renaissance 0.16 - Test: ALS Movie Lens
ms < Lower Is Better
  a: 9805.7 | b: 9958.3 | c: 9907.4

NAMD 3.0 - Input: STMV with 1,066,628 Atoms
ns/day > Higher Is Better
  a: 0.75656 | b: 0.75634 | c: 0.75813

oneDNN 3.6 - Harness: Recurrent Neural Network Training - Engine: CPU
ms < Lower Is Better
  a: 1372.03 | b: 1383.64

Blender 4.3 - Blend File: Junkshop - Compute: CPU-Only
Seconds < Lower Is Better
  a: 73.56 | b: 74.26

oneDNN 3.6 - Harness: Recurrent Neural Network Inference - Engine: CPU
ms < Lower Is Better
  a: 700.86 | b: 711.43

Llamafile 0.8.16 - Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
  a: 26.28 | b: 25.83

SVT-AV1 2.3 - Encoder Mode: Preset 5 - Input: Bosphorus 4K
Frames Per Second > Higher Is Better
  a: 34.54 | b: 34.23 | c: 34.45

Blender 4.3 - Blend File: Fishy Cat - Compute: CPU-Only
Seconds < Lower Is Better
  a: 71.35 | b: 71.70

Renaissance 0.16 - Test: In-Memory Database Shootout
ms < Lower Is Better
  a: 3256.1 | b: 3081.5 | c: 3046.8

Apache CouchDB 3.4.1 - Bulk Size: 100 - Inserts: 1000 - Rounds: 30
Seconds < Lower Is Better
  a: 69.93 | b: 70.11

Renaissance 0.16 - Test: Akka Unbalanced Cobwebbed Tree
ms < Lower Is Better
  a: 4403.8 | b: 4439.9 | c: 4331.7

Renaissance 0.16 - Test: Apache Spark PageRank
ms < Lower Is Better
  a: 2412.2 | b: 2439.9 | c: 2439.2

Renaissance 0.16 - Test: Savina Reactors.IO
ms < Lower Is Better
  a: 3506.4 | b: 3594.3 | c: 3567.8

SVT-AV1 2.3 - Encoder Mode: Preset 13 - Input: Beauty 4K 10-bit
Frames Per Second > Higher Is Better
  a: 18.59 | b: 18.44 | c: 18.56

Renaissance 0.16 - Test: Gaussian Mixture Model
ms < Lower Is Better
  a: 3399.5 | b: 3494.8 | c: 3472.4

PyPerformance 1.11 - Benchmark: gc_collect
Milliseconds < Lower Is Better
  a: 677 | b: 681

Renaissance 0.16 - Test: Apache Spark Bayes
ms < Lower Is Better
  a: 490.0 | b: 529.5 | c: 500.3

Renaissance 0.16 - Test: Finagle HTTP Requests
ms < Lower Is Better
  a: 2319.4 | b: 2264.7 | c: 2296.6

Stockfish - Chess Benchmark
Nodes Per Second > Higher Is Better
  a: 46507038 | b: 45751747

Renaissance 0.16 - Test: Random Forest
ms < Lower Is Better
  a: 414.4 | b: 398.1 | c: 420.8

Renaissance 0.16 - Test: Scala Dotty
ms < Lower Is Better
  a: 477.0 | b: 447.0 | c: 458.5

ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
  a: 648.52 | b: 657.42

ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
  a: 1.54196 | b: 1.52109

Renaissance 0.16 - Test: Genetic Algorithm Using Jenetics + Futures
ms < Lower Is Better
  a: 732.8 | b: 744.3 | c: 719.1

Llamafile 0.8.16 - Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
  a: 32768 | b: 32768

simdjson 3.10 - Throughput Test: DistinctUserID
GB/s > Higher Is Better
  a: 10.46 | b: 10.38 | c: 10.43

ONNX Runtime 1.19 - Model: GPT-2 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
  a: 7.42776 | b: 7.44340

ONNX Runtime 1.19 - Model: GPT-2 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
  a: 134.60 | b: 134.31

OSPRay 3.2 - Benchmark: gravity_spheres_volume/dim_512/ao/real_time
Items Per Second > Higher Is Better
  a: 7.63944 | b: 7.57408 | c: 7.64282

ONNX Runtime 1.19 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
  a: 310.88 | b: 315.41

ONNX Runtime 1.19 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
  a: 3.2167 | b: 3.1705

ONNX Runtime 1.19 - Model: ZFNet-512 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
  a: 9.76985 | b: 9.62590

ONNX Runtime 1.19 - Model: ZFNet-512 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
  a: 102.33 | b: 103.86

OSPRay 3.2 - Benchmark: gravity_spheres_volume/dim_512/scivis/real_time
Items Per Second > Higher Is Better
  a: 7.58789 | b: 7.52875 | c: 7.55791

ONNX Runtime 1.19 - Model: bertsquad-12 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
  a: 64.14 | b: 65.28

ONNX Runtime 1.19 - Model: bertsquad-12 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
  a: 15.59 | b: 15.32

ONNX Runtime 1.19 - Model: T5 Encoder - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
  a: 6.39112 | b: 6.37566

ONNX Runtime 1.19 - Model: T5 Encoder - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
  a: 156.45 | b: 156.83

ONNX Runtime 1.19 - Model: yolov4 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
  a: 90.45 | b: 91.24

ONNX Runtime 1.19 - Model: yolov4 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
  a: 11.06 | b: 10.96

simdjson 3.10 - Throughput Test: TopTweet
GB/s > Higher Is Better
  a: 10.46 | b: 10.80 | c: 10.79

ONNX Runtime 1.19 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
  a: 23.55 | b: 23.83

ONNX Runtime 1.19 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard
Inferences Per Second > Higher Is Better
  a: 42.45 | b: 41.96

ONNX Runtime 1.19 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard
Inference Time Cost (ms) < Lower Is Better
  a: 21.24 | b:
20.95

ONNX Runtime 1.19 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard (Inferences Per Second; higher is better)
  a: 47.07  b: 47.73

ONNX Runtime 1.19 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard (Inference Time Cost, ms; lower is better)
  a: 1.57084  b: 1.58746

ONNX Runtime 1.19 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard (Inferences Per Second; higher is better)
  a: 636.32  b: 629.66

ONNX Runtime 1.19 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inference Time Cost, ms; lower is better)
  a: 2.55898  b: 2.58589

ONNX Runtime 1.19 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inferences Per Second; higher is better)
  a: 390.60  b: 386.58

simdjson 3.10 - Throughput Test: PartialTweets (GB/s; higher is better)
  a: 9.76  b: 9.82  c: 9.68

ONNX Runtime 1.19 - Model: super-resolution-10 - Device: CPU - Executor: Standard (Inference Time Cost, ms; lower is better)
  a: 7.08601  b: 7.09172

ONNX Runtime 1.19 - Model: super-resolution-10 - Device: CPU - Executor: Standard (Inferences Per Second; higher is better)
  a: 141.12  b: 141.00

Llamafile 0.8.16 - Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 1024 (Tokens Per Second; higher is better)
  a: 16384  b: 16384

Timed Eigen Compilation 3.4.0 - Time To Compile (Seconds; lower is better)
  a: 58.66  b: 59.87

OSPRay 3.2 - Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second; higher is better)
  a: 8.82093  b: 8.79096  c: 8.81199

CP2K Molecular Dynamics 2024.3 - Input: H20-64 (Seconds; lower is better)
  a: 58.19  b: 57.35  c: 58.65

Llamafile 0.8.16 - Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Prompt Processing 256 (Tokens Per Second; higher is better)
  a: 1536  b: 1536

PyPerformance 1.11 - Benchmark: asyncio_websockets (Milliseconds; lower is better)
  a: 315  b: 316

Blender 4.3 - Blend File: BMW27 - Compute: CPU-Only (Seconds; lower is better)
  a: 53.55  b: 53.75

LiteRT 2024-10-15 - Model: Inception V4 (Microseconds; lower is better)
  a: 21477.8  b: 23265.4

LiteRT 2024-10-15 - Model: Inception ResNet V2 (Microseconds; lower is better)
  a: 19530.2  b: 20375.7

LiteRT 2024-10-15 - Model: NASNet Mobile (Microseconds; lower is better)
  a: 16936.0  b: 21468.7

LiteRT 2024-10-15 - Model: DeepLab V3 (Microseconds; lower is better)
  a: 3579.67  b: 4287.06

LiteRT 2024-10-15 - Model: Mobilenet Float (Microseconds; lower is better)
  a: 1211.48  b: 1295.51

LiteRT 2024-10-15 - Model: SqueezeNet (Microseconds; lower is better)
  a: 1794.11  b: 1860.35

LiteRT 2024-10-15 - Model: Quantized COCO SSD MobileNet v1 (Microseconds; lower is better)
  a: 2129.52  b: 2958.48

LiteRT 2024-10-15 - Model: Mobilenet Quant (Microseconds; lower is better)
  a: 823.17  b: 933.18

Llama.cpp b4154 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second; higher is better)
  a: 68.40  b: 61.62

Llamafile 0.8.16 - Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 16 (Tokens Per Second; higher is better)
  a: 1.78  b: 1.79

Rustls 0.23.17 - Benchmark: handshake-resume - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (handshakes/s; higher is better)
  a: 3563852.57  b: 3504511.31

Rustls 0.23.17 - Benchmark: handshake-ticket - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (handshakes/s; higher is better)
  a: 2620332.00  b: 2589637.92

ACES DGEMM 1.0 - Sustained Floating-Point Rate (GFLOP/s; higher is better)
  a: 1141.19  b: 1137.39  c: 1127.27

Llama.cpp b4154 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second; higher is better)
  a: 70.76  b:
75.96

Llama.cpp b4154 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second; higher is better)
  a: 279.04  b: 285.71

Whisperfile 20Aug24 - Model Size: Tiny (Seconds; lower is better)
  a: 41.71  b: 42.20

FinanceBench 2016-07-25 - Benchmark: Bonds OpenMP (ms; lower is better)
  a: 33061.22  b: 33432.64

Llamafile 0.8.16 - Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Prompt Processing 256 (Tokens Per Second; higher is better)
  a: 4096  b: 4096

SVT-AV1 2.3 - Encoder Mode: Preset 5 - Input: Bosphorus 1080p (Frames Per Second; higher is better)
  a: 101.97  b: 100.89  c: 102.11

PyPerformance 1.11 - Benchmark: django_template (Milliseconds; lower is better)
  a: 20.7  b: 20.8

NAMD 3.0 - Input: ATPase with 327,506 Atoms (ns/day; higher is better)
  a: 2.79632  b: 2.79025  c: 2.82925

OpenVINO GenAI 2024.5 - Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token (ms; lower is better)
  a: 51.86  b: 52.10

OpenVINO GenAI 2024.5 - Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token (ms; lower is better)
  a: 55.93  b: 56.26

OpenVINO GenAI 2024.5 - Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU (tokens/s; higher is better)
  a: 19.28  b: 19.20

PyPerformance 1.11 - Benchmark: raytrace (Milliseconds; lower is better)
  a: 175  b: 177

Llamafile 0.8.16 - Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 512 (Tokens Per Second; higher is better)
  a: 8192  b: 8192

PyPerformance 1.11 - Benchmark: crypto_pyaes (Milliseconds; lower is better)
  a: 41.7  b: 41.8

PyPerformance 1.11 - Benchmark: go (Milliseconds; lower is better)
  a: 77.8  b: 77.0

FinanceBench 2016-07-25 - Benchmark: Repo OpenMP (ms; lower is better)
  a: 21418.45  b: 21522.07

PyPerformance 1.11 - Benchmark: chaos (Milliseconds; lower is better)
  a: 38.2  b: 38.7

PyPerformance 1.11 - Benchmark: regex_compile (Milliseconds; lower is better)
  a: 69.8  b: 70.2

ASTC Encoder 5.0 - Preset: Thorough (MT/s; higher is better)
  a: 20.30  b: 20.16

Etcpak 2.0 - Benchmark: Multi-Threaded - Configuration: ETC2 (Mpx/s; higher is better)
  a: 577.82  b: 575.02  c: 573.91

SVT-AV1 2.3 - Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second; higher is better)
  a: 102.01  b: 99.55  c: 100.89

PyPerformance 1.11 - Benchmark: pathlib (Milliseconds; lower is better)
  a: 14.2  b: 14.3

Llamafile 0.8.16 - Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 1024 (Tokens Per Second; higher is better)
  a: 16384  b: 16384

Rustls 0.23.17 - Benchmark: handshake - Suite: TLS13_CHACHA20_POLY1305_SHA256 (handshakes/s; higher is better)
  a: 76454.45  b: 76083.73

oneDNN 3.6 - Harness: Deconvolution Batch shapes_1d - Engine: CPU (ms; lower is better)
  a: 2.97612  b: 3.11260

Rustls 0.23.17 - Benchmark: handshake - Suite: TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (handshakes/s; higher is better)
  a: 80462.6  b: 79085.8

Llamafile 0.8.16 - Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 16 (Tokens Per Second; higher is better)
  a: 10.22  b: 10.26

PyPerformance 1.11 - Benchmark: json_loads (Milliseconds; lower is better)
  a: 12.1  b: 12.1

PyPerformance 1.11 - Benchmark: nbody (Milliseconds; lower is better)
  a: 59.0  b: 59.4

7-Zip Compression - Test: Decompression Rating (MIPS; higher is better)
  a: 165916  b: 166843  c: 167321

7-Zip Compression - Test: Compression Rating (MIPS; higher is better)
  a:
163859  b: 164050  c: 164313

Y-Cruncher 0.8.5 - Pi Digits To Calculate: 1B (Seconds; lower is better)
  a: 18.49  b: 18.40

POV-Ray - Trace Time (Seconds; lower is better)
  a: 18.54  b: 18.85

PyPerformance 1.11 - Benchmark: pickle_pure_python (Milliseconds; lower is better)
  a: 165  b: 166

PyPerformance 1.11 - Benchmark: float (Milliseconds; lower is better)
  a: 50.7  b: 50.1

x265 - Video Input: Bosphorus 4K (Frames Per Second; higher is better)
  a: 32.57  b: 32.04  c: 32.73

Llama.cpp b4154 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 (Tokens Per Second; higher is better)
  a: 355.09  b: 328.47

Llama.cpp b4154 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 (Tokens Per Second; higher is better)
  a: 47.72  b: 46.28

Llamafile 0.8.16 - Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 256 (Tokens Per Second; higher is better)
  a: 4096  b: 4096

oneDNN 3.6 - Harness: IP Shapes 1D - Engine: CPU (ms; lower is better)
  a: 1.12573  b: 1.15274

ASTC Encoder 5.0 - Preset: Fast (MT/s; higher is better)
  a: 396.65  b: 396.43

Llamafile 0.8.16 - Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 16 (Tokens Per Second; higher is better)
  a: 19.03  b: 18.32

SVT-AV1 2.3 - Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second; higher is better)
  a: 212.52  b: 209.77  c: 212.95

SVT-AV1 2.3 - Encoder Mode: Preset 8 - Input: Bosphorus 1080p (Frames Per Second; higher is better)
  a: 339.02  b: 330.87  c: 338.65

Y-Cruncher 0.8.5 - Pi Digits To Calculate: 500M (Seconds; lower is better)
  a: 8.772  b: 8.794

Llama.cpp b4154 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second; higher is better)
  a: 327.30  b: 324.21

Llamafile 0.8.16 - Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 16 (Tokens Per Second; higher is better)
  a: 24.59  b: 24.69

Llamafile 0.8.16 - Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 512 (Tokens Per Second; higher is better)
  a: 8192  b: 8192

oneDNN 3.6 - Harness: IP Shapes 3D - Engine: CPU (ms; lower is better)
  a: 4.05800  b: 4.15682

ASTC Encoder 5.0 - Preset: Medium (MT/s; higher is better)
  a: 156.22  b: 155.27

Renaissance 0.16 - Test: Apache Spark ALS (ms; lower is better)
  no result recorded

Primesieve 12.6 - Length: 1e12 (Seconds; lower is better)
  a: 6.347  b: 6.378

oneDNN 3.6 - Harness: Convolution Batch Shapes Auto - Engine: CPU (ms; lower is better)
  a: 6.67287  b: 6.81754

x265 - Video Input: Bosphorus 1080p (Frames Per Second; higher is better)
  a: 114.45  b: 112.85  c: 114.52

SVT-AV1 2.3 - Encoder Mode: Preset 13 - Input: Bosphorus 1080p (Frames Per Second; higher is better)
  a: 842.56  b: 824.81  c: 838.17

Llamafile 0.8.16 - Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 256 (Tokens Per Second; higher is better)
  a: 4096  b: 4096

oneDNN 3.6 - Harness: Deconvolution Batch shapes_3d - Engine: CPU (ms; lower is better)
  a: 2.41294  b: 2.46279

OpenVINO GenAI 2024.5 - Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU (tokens/s; higher is better)
  no result recorded

OpenSSL - Algorithm: RSA4096
  no result recorded

OpenSSL - Algorithm: SHA512
  no result recorded

OpenSSL - Algorithm: SHA256
  no result recorded
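Most of these tests land within a few percent across runs a, b, and c, but a handful (notably the LiteRT results) spread much wider, which is worth quantifying before drawing conclusions in the article. A minimal sketch of that check, using values taken verbatim from the LiteRT results above; the `run_spread` helper is hypothetical and not part of the Phoronix Test Suite:

```python
# Hypothetical helper (not part of the Phoronix Test Suite): given the
# per-run values recorded for one test, report the percent spread
# between the fastest and slowest run.
def run_spread(results: dict[str, float]) -> float:
    """Percent difference between the max and min run values."""
    lo, hi = min(results.values()), max(results.values())
    return (hi - lo) / lo * 100.0

# Microseconds, lower is better; values copied from the LiteRT section.
nasnet = {"a": 16936.0, "b": 21468.7}
coco_ssd = {"a": 2129.52, "b": 2958.48}

print(f"NASNet Mobile spread: {run_spread(nasnet):.1f}%")         # ~26.8%
print(f"Quantized COCO SSD spread: {run_spread(coco_ssd):.1f}%")  # ~38.9%
```

Spreads this large on identical hardware and software usually point to scheduling or thermal noise rather than a real configuration difference, so those specific results would merit re-running before publication.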