ai-vino

Intel Core Ultra 7 155H testing with a MTL Swift SFG14-72T Coral_MTH (V1.01 BIOS) and Intel Arc MTL 8GB on Ubuntu 24.10 via the Phoronix Test Suite.

a, b, c:

  Processor: Intel Core Ultra 7 155H @ 4.80GHz (16 Cores / 22 Threads)
  Motherboard: MTL Swift SFG14-72T Coral_MTH (V1.01 BIOS)
  Chipset: Intel Device 7e7f
  Memory: 8 x 2GB LPDDR5-6400MT/s Micron MT62F1G32D2DS-026
  Disk: 1024GB Micron_2550_MTFDKBA1T0TGE
  Graphics: Intel Arc MTL 8GB
  Audio: Intel Meteor Lake-P HD Audio
  Network: Intel Meteor Lake PCH CNVi WiFi

  OS: Ubuntu 24.10
  Kernel: 6.11.0-rc6-phx (x86_64)
  Desktop: GNOME Shell 47.0
  Display Server: X Server + Wayland
  OpenGL: 4.6 Mesa 24.2.3-1ubuntu1
  OpenCL: OpenCL 3.0
  Compiler: GCC 14.2.0
  File-System: ext4
  Screen Resolution: 3840x1200
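To reproduce a comparable run, the Phoronix Test Suite can drive both benchmarks from their OpenBenchmarking.org test profiles. Below is a minimal sketch, assuming the profile names pts/openvino-genai and pts/llama-cpp; those names are an assumption, not taken from this result file, so confirm them with "phoronix-test-suite list-available-tests" first.

  # Minimal reproduction sketch. The profile names below are assumptions;
  # verify them with "phoronix-test-suite list-available-tests" before running.
  import subprocess

  for profile in ("pts/openvino-genai", "pts/llama-cpp"):
      # "benchmark" installs the test profile if needed, then runs it.
      subprocess.run(["phoronix-test-suite", "benchmark", profile], check=True)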
OpenVINO GenAI 2024.5
Model: Gemma-7b-int4-ov - Device: CPU
tokens/s > Higher Is Better
a . 269.02 |===================================================================

OpenVINO GenAI 2024.5
Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU
tokens/s > Higher Is Better
a . 48.28 |====================================================================

OpenVINO GenAI 2024.5
Model: Falcon-7b-instruct-int4-ov - Device: CPU
tokens/s > Higher Is Better
a . 98.60 |====================================================================

OpenVINO GenAI 2024.5
Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU
tokens/s > Higher Is Better
a . 105.94 |===================================================================

Llama.cpp b4154
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
a . 7.08 |===================================================================
b . 7.32 |=====================================================================
c . 7.10 |===================================================================

Llama.cpp b4154
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512
Tokens Per Second > Higher Is Better
a . 20.05 |===================================================================
b . 20.50 |====================================================================
c . 19.36 |================================================================

Llama.cpp b4154
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
a . 19.44 |====================================================================
b . 19.15 |===================================================================
c . 18.85 |==================================================================

Llama.cpp b4154
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
a . 18.86 |====================================================================
b . 18.28 |==================================================================
c . 18.30 |==================================================================

Llama.cpp b4154
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
a . 7.53 |=====================================================================
b . 7.57 |=====================================================================
c . 7.54 |=====================================================================

Llama.cpp b4154
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512
Tokens Per Second > Higher Is Better
a . 19.39 |====================================================================
b . 18.86 |==================================================================
c . 18.87 |==================================================================

Llama.cpp b4154
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
a . 19.20 |====================================================================
b . 18.99 |===================================================================
c . 18.83 |===================================================================

Llama.cpp b4154
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
a . 18.58 |====================================================================
b . 18.47 |====================================================================
c . 18.15 |==================================================================

Llama.cpp b4154
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
a . 24.34 |====================================================================
b . 23.83 |===================================================================
c . 23.25 |=================================================================

Llama.cpp b4154
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512
Tokens Per Second > Higher Is Better
a . 88.66 |====================================================================
b . 86.35 |==================================================================
c . 86.71 |===================================================================

Llama.cpp b4154
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
a . 82.10 |====================================================================
b . 81.17 |===================================================================
c . 79.51 |==================================================================
Llama.cpp b4154
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
a . 73.16 |====================================================================
b . 72.91 |====================================================================
c . 72.16 |===================================================================
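The b and c runs track run a closely throughout. As a quick sanity check on run-to-run variation, the short sketch below (not part of the result file, with values copied from the Llama-3.1-Tulu-3-8B-Q8_0 rows above) computes the mean and relative spread of the three runs.

  # Quick run-to-run variation check; values copied from the
  # Llama-3.1-Tulu-3-8B-Q8_0 results above. This is an illustration,
  # not output produced by the Phoronix Test Suite.
  from statistics import mean, stdev

  runs = {
      "Text Generation 128":    [7.08, 7.32, 7.10],
      "Prompt Processing 512":  [20.05, 20.50, 19.36],
      "Prompt Processing 2048": [18.86, 18.28, 18.30],
  }

  for test, vals in runs.items():
      rel = stdev(vals) / mean(vals) * 100  # relative standard deviation, %
      print(f"{test}: mean {mean(vals):.2f} tokens/s, spread {rel:.1f}%")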