ncnn llama

Intel Core Ultra 7 155H testing with a MTL Swift SFG14-72T Coral_MTH (V1.01 BIOS) and Intel Arc MTL 8GB on Ubuntu 24.10 via the Phoronix Test Suite.

a:

  Processor: Intel Core Ultra 7 155H @ 4.80GHz (16 Cores / 22 Threads)
  Motherboard: MTL Swift SFG14-72T Coral_MTH (V1.01 BIOS)
  Chipset: Intel Device 7e7f
  Memory: 8 x 2GB LPDDR5-6400MT/s Micron MT62F1G32D2DS-026
  Disk: 1024GB Micron_2550_MTFDKBA1T0TGE
  Graphics: Intel Arc MTL 8GB
  Audio: Intel Meteor Lake-P HD Audio
  Network: Intel Meteor Lake PCH CNVi WiFi

  OS: Ubuntu 24.10
  Kernel: 6.11.0-rc6-phx (x86_64)
  Desktop: GNOME Shell 47.0
  Display Server: X Server + Wayland
  OpenGL: 4.6 Mesa 25.0~git2411250600.45c523~oibaf~o (git-45c5231 2024-11-25 oracular-oibaf-pp
  OpenCL: OpenCL 3.0
  Compiler: GCC 14.2.0
  File-System: ext4
  Screen Resolution: 3840x1200

b: Same hardware and software configuration as a.

c: Same hardware and software configuration as a.


Llama.cpp b4397
Backend: CPU BLAS
Tokens Per Second > Higher Is Better

Model                               Test                          a       b       c
Mistral-7B-Instruct-v0.3-Q8_0       Prompt Processing 2048    19.85   19.88   19.70
Llama-3.1-Tulu-3-8B-Q8_0            Prompt Processing 2048    19.93   20.15   19.86
Mistral-7B-Instruct-v0.3-Q8_0       Prompt Processing 1024    20.35   20.39   20.38
Llama-3.1-Tulu-3-8B-Q8_0            Prompt Processing 1024    20.48   20.61   20.58
granite-3.0-3b-a800m-instruct-Q8_0  Prompt Processing 2048    75.56   76.13   75.56
Mistral-7B-Instruct-v0.3-Q8_0       Prompt Processing 512     20.52   20.79   20.62
Llama-3.1-Tulu-3-8B-Q8_0            Prompt Processing 512     20.88   20.87   20.78


NCNN 20241226
Target: Vulkan GPU
ms < Lower Is Better

Model                     a       b       c
FastestDet             9.27    9.23    9.30
vision_transformer   147.62  147.55  148.25
regnety_400m          41.53   41.56   41.71
squeezenet_ssd        13.10   13.08   13.05
yolov4-tiny           20.03   20.02   19.89
mobilenetv2-yolov3    16.57   16.69   16.79
resnet50              21.40   21.29   21.44
alexnet                6.44    6.43    6.37
resnet18               8.87    9.02    8.83
vgg16                 36.88   36.63   36.89
googlenet             14.45   14.48   14.44
blazeface              4.49    4.46    4.48
efficientnet-b0       13.25   13.14   12.99
mnasnet                7.89    7.67    7.67
shufflenet-v2          7.01    6.89    6.80
mobilenet-v3           8.03    7.95    7.86
mobilenet-v2           7.74    7.62    7.74
mobilenet             16.57   16.69   16.79


NCNN 20241226
Target: CPU
ms < Lower Is Better

Model                     a       b       c
FastestDet             9.15    9.20    9.18
vision_transformer   148.17  150.67  149.74
regnety_400m          42.43   39.90   40.40
squeezenet_ssd        12.96   12.81   13.02
yolov4-tiny           20.11   20.02   19.76
mobilenetv2-yolov3    16.75   16.60   16.69
resnet50              21.42   21.21   21.37
alexnet                6.39    6.29    6.39
resnet18               8.97    8.98    8.87
vgg16                 36.88   36.78   36.55
googlenet             14.47   14.28   14.40
blazeface              4.49    4.41    4.42
efficientnet-b0       12.83   12.36   12.35
mnasnet                7.44    7.39    7.43
shufflenet-v2          6.74    6.77    6.79
mobilenet-v3           7.78    7.71    7.82
mobilenet-v2           7.68    7.69    7.72
mobilenet             16.75   16.60   16.69


Llama.cpp b4397
Backend: CPU BLAS
Tokens Per Second > Higher Is Better

Model                               Test                          a       b       c
Llama-3.1-Tulu-3-8B-Q8_0            Text Generation 128        7.30    7.39    7.34
Mistral-7B-Instruct-v0.3-Q8_0       Text Generation 128        7.51    7.57    7.53
granite-3.0-3b-a800m-instruct-Q8_0  Prompt Processing 1024    82.04   82.17   82.01
granite-3.0-3b-a800m-instruct-Q8_0  Prompt Processing 512     88.09   88.31   89.19
granite-3.0-3b-a800m-instruct-Q8_0  Text Generation 128       22.91   23.25   23.33
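
Because a, b and c are three runs on the same machine, the differences between them are best read as run-to-run noise. The sketch below is one possible way to quantify that, not part of the Phoronix Test Suite output: it takes a hand-picked subset of the values reported above and computes, for each test, the mean and the relative spread, (max - min) / mean. The names `RESULTS`, `spread_percent` and `summarize` are illustrative choices, and the choice of spread metric is an assumption.

```python
from statistics import mean

# Hand-picked subset of the a/b/c values reported above.
# The labels are informal; units are noted in parentheses.
RESULTS = {
    "NCNN CPU regnety_400m (ms)": [42.43, 39.90, 40.40],
    "NCNN CPU vision_transformer (ms)": [148.17, 150.67, 149.74],
    "Llama.cpp granite-3.0-3b-a800m PP 512 (tokens/s)": [88.09, 88.31, 89.19],
    "Llama.cpp Mistral-7B PP 1024 (tokens/s)": [20.35, 20.39, 20.38],
}


def spread_percent(values):
    """Relative spread of a set of runs: (max - min) / mean, in percent."""
    return (max(values) - min(values)) / mean(values) * 100.0


def summarize(results):
    """Return (name, mean, spread %) rows, widest spread first."""
    rows = [(name, mean(v), spread_percent(v)) for name, v in results.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for name, avg, spread in summarize(RESULTS):
        print(f"{name:50s} mean={avg:8.2f}  spread={spread:5.2f}%")
```

Among these four tests, regnety_400m on the CPU target has the widest spread (roughly 6% of the mean), while the Mistral-7B prompt-processing runs agree to within a fraction of a percent. The same function can be applied to any other row by adding its three values to `RESULTS`.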