llama smoke

ARMv8 Neoverse-V2 testing with a Pegatron JIMBO P4352 (00022432 BIOS) and ASPEED on Ubuntu 24.04 via the Phoronix Test Suite.

a:

  Processor: ARMv8 Neoverse-V2 @ 3.47GHz (72 Cores), Motherboard: Pegatron JIMBO P4352 (00022432 BIOS), Memory: 1 x 480GB LPDDR5-6400MT/s NVIDIA 699-2G530-0236-RC1, Disk: 1000GB CT1000T700SSD3, Graphics: ASPEED, Network: 2 x Intel X550

  OS: Ubuntu 24.04, Kernel: 6.8.0-49-generic-64k (aarch64), Compiler: GCC 13.2.0 + Clang 18.1.3 + CUDA 11.8, File-System: ext4, Screen Resolution: 1920x1200

Llama.cpp b4154
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
a . 20.04 |====================================================================

Llama.cpp b4154
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
a . MIN: 284 AVG: 3320 MAX: 3492

Llama.cpp b4154
CPU Power Consumption Monitor
Watts < Lower Is Better
a . MIN: 2 AVG: 107 MAX: 181

Llama.cpp b4154
CPU Usage (Summary) Monitor
Percent < Lower Is Better
a . MIN: 0 AVG: 71 MAX: 98

Llama.cpp b4154
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512
Tokens Per Second > Higher Is Better
a . 121.12 |===================================================================

Llama.cpp b4154
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
a . MIN: 487 AVG: 3302 MAX: 3492

Llama.cpp b4154
CPU Power Consumption Monitor
Watts < Lower Is Better
a . MIN: 3 AVG: 199 MAX: 280

Llama.cpp b4154
CPU Usage (Summary) Monitor
Percent < Lower Is Better
a . MIN: 0.1 AVG: 79.8 MAX: 95.9

Llama.cpp b4154
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
a . 119.19 |===================================================================

Llama.cpp b4154
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
a . MIN: 319 AVG: 3428 MAX: 3492

Llama.cpp b4154
CPU Power Consumption Monitor
Watts < Lower Is Better
a . MIN: 3 AVG: 221 MAX: 291

Llama.cpp b4154
CPU Usage (Summary) Monitor
Percent < Lower Is Better
a . MIN: 0.1 AVG: 87.6 MAX: 96.3

Llama.cpp b4154
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
a . 106.01 |===================================================================

Llama.cpp b4154
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
a . MIN: 761 AVG: 3461 MAX: 3492

Llama.cpp b4154
CPU Power Consumption Monitor
Watts < Lower Is Better
a . MIN: 4 AVG: 229 MAX: 286

Llama.cpp b4154
CPU Usage (Summary) Monitor
Percent < Lower Is Better
a . MIN: 0.1 AVG: 91.9 MAX: 96.9

Llama.cpp b4154
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
a . 50.62 |====================================================================

Llama.cpp b4154
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
a . MIN: 656 AVG: 3270 MAX: 3492

Llama.cpp b4154
CPU Power Consumption Monitor
Watts < Lower Is Better
a . MIN: 3 AVG: 88 MAX: 133

Llama.cpp b4154
CPU Usage (Summary) Monitor
Percent < Lower Is Better
a . MIN: 0.1 AVG: 66.8 MAX: 95.3

Llama.cpp b4154
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512
Tokens Per Second > Higher Is Better
a . 126.98 |===================================================================

Llama.cpp b4154
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
a . MIN: 556 AVG: 3387 MAX: 3492

Llama.cpp b4154
CPU Power Consumption Monitor
Watts < Lower Is Better
a . MIN: 3 AVG: 146 MAX: 197

Llama.cpp b4154
CPU Usage (Summary) Monitor
Percent < Lower Is Better
a . MIN: 0 AVG: 77 MAX: 98

Llama.cpp b4154
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
a . 130.04 |===================================================================

Llama.cpp b4154
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
a . MIN: 589 AVG: 3425 MAX: 3492

Llama.cpp b4154
CPU Power Consumption Monitor
Watts < Lower Is Better
a . MIN: 3 AVG: 163 MAX: 203

Llama.cpp b4154
CPU Usage (Summary) Monitor
Percent < Lower Is Better
a . MIN: 0 AVG: 83 MAX: 98

Llama.cpp b4154
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
a . 131.38 |===================================================================

Llama.cpp b4154
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
a . MIN: 2010 AVG: 3474 MAX: 3492

Llama.cpp b4154
CPU Power Consumption Monitor
Watts < Lower Is Better
a . MIN: 3 AVG: 174 MAX: 203

Llama.cpp b4154
CPU Usage (Summary) Monitor
Percent < Lower Is Better
a . MIN: 0 AVG: 89 MAX: 98

Llama.cpp b4154
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
a . 20.99 |====================================================================

Llama.cpp b4154
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
a . MIN: 827 AVG: 3349 MAX: 3492

Llama.cpp b4154
CPU Power Consumption Monitor
Watts < Lower Is Better
a . MIN: 3 AVG: 100 MAX: 179

Llama.cpp b4154
CPU Usage (Summary) Monitor
Percent < Lower Is Better
a . MIN: 0.1 AVG: 70.7 MAX: 96.6

Llama.cpp b4154
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512
Tokens Per Second > Higher Is Better
a . 122.57 |===================================================================

Llama.cpp b4154
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
a . MIN: 1404 AVG: 3439 MAX: 3492

Llama.cpp b4154
CPU Power Consumption Monitor
Watts < Lower Is Better
a . MIN: 3 AVG: 203 MAX: 283

Llama.cpp b4154
CPU Usage (Summary) Monitor
Percent < Lower Is Better
a . MIN: 0.1 AVG: 80.3 MAX: 95.9

Llama.cpp b4154
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
a . 119.72 |===================================================================

Llama.cpp b4154
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
a . MIN: 521 AVG: 3438 MAX: 3492

Llama.cpp b4154
CPU Power Consumption Monitor
Watts < Lower Is Better
a . MIN: 4 AVG: 223 MAX: 285

Llama.cpp b4154
CPU Usage (Summary) Monitor
Percent < Lower Is Better
a . MIN: 0.1 AVG: 87.5 MAX: 96.3

Llama.cpp b4154
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
a . 106.83 |===================================================================

Llama.cpp b4154
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
a . MIN: 3456 AVG: 3491 MAX: 3492

Llama.cpp b4154
CPU Power Consumption Monitor
Watts < Lower Is Better
a . MIN: 4 AVG: 229 MAX: 284

Llama.cpp b4154
CPU Usage (Summary) Monitor
Percent < Lower Is Better
a . MIN: 0.1 AVG: 91.8 MAX: 97.0
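
The throughput and average CPU power figures reported above can be combined into a rough efficiency number. The sketch below is not part of the Phoronix Test Suite output: it divides each Tokens Per Second result by the corresponding AVG value of the CPU Power Consumption Monitor, giving tokens per joule. The names `RESULTS`, `tokens_per_joule`, and `rank_by_efficiency` are hypothetical. The power averages include near-idle samples (the reported MIN values are only 2 to 4 W), so the ratios should be read as approximate.

```python
# Hypothetical helper, not part of the Phoronix Test Suite.
# Each row: (model, test, tokens per second, average CPU power in watts),
# copied from the results reported for system "a".
RESULTS = [
    ("Llama-3.1-Tulu-3-8B-Q8_0", "Text Generation 128", 20.04, 107),
    ("Llama-3.1-Tulu-3-8B-Q8_0", "Prompt Processing 512", 121.12, 199),
    ("Llama-3.1-Tulu-3-8B-Q8_0", "Prompt Processing 1024", 119.19, 221),
    ("Llama-3.1-Tulu-3-8B-Q8_0", "Prompt Processing 2048", 106.01, 229),
    ("granite-3.0-3b-a800m-instruct-Q8_0", "Text Generation 128", 50.62, 88),
    ("granite-3.0-3b-a800m-instruct-Q8_0", "Prompt Processing 512", 126.98, 146),
    ("granite-3.0-3b-a800m-instruct-Q8_0", "Prompt Processing 1024", 130.04, 163),
    ("granite-3.0-3b-a800m-instruct-Q8_0", "Prompt Processing 2048", 131.38, 174),
    ("Mistral-7B-Instruct-v0.3-Q8_0", "Text Generation 128", 20.99, 100),
    ("Mistral-7B-Instruct-v0.3-Q8_0", "Prompt Processing 512", 122.57, 203),
    ("Mistral-7B-Instruct-v0.3-Q8_0", "Prompt Processing 1024", 119.72, 223),
    ("Mistral-7B-Instruct-v0.3-Q8_0", "Prompt Processing 2048", 106.83, 229),
]


def tokens_per_joule(tokens_per_second, avg_watts):
    """(tokens/s) / (J/s) = tokens/J; assumes avg_watts is representative."""
    return tokens_per_second / avg_watts


def rank_by_efficiency(results):
    """Return (model, test, tokens per joule) rows, most efficient first."""
    rows = [
        (model, test, tokens_per_joule(tps, watts))
        for model, test, tps, watts in results
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for model, test, efficiency in rank_by_efficiency(RESULTS):
        print(f"{efficiency:6.3f} tokens/J  {model}  {test}")
```

Dividing by average power rather than peak power is a deliberate simplification: the average is the only figure here that reflects the whole run, while the MAX values describe a single sample.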