ncnn llama

Intel Core Ultra 9 285K testing with an ASUS ROG MAXIMUS Z890 HERO (1203 BIOS) and ASUS AMD Radeon RX 7900 XTX 24GB on Ubuntu 24.10 via the Phoronix Test Suite.

a, b, c, d (the same configuration is reported for all four runs):

  Processor: Intel Core Ultra 9 285K @ 5.10GHz (24 Cores)
  Motherboard: ASUS ROG MAXIMUS Z890 HERO (1203 BIOS)
  Chipset: Intel Device ae7f
  Memory: 2 x 16GB DDR5-6400MT/s Micron CP16G64C38U5B.M8D1
  Disk: Western Digital WD_BLACK SN850X 1000GB + 4001GB Western Digital WD_BLACK SN850X 4000GB
  Graphics: ASUS AMD Radeon RX 7900 XTX 24GB
  Audio: Intel Device 7f50
  Monitor: ASUS VP28U
  Network: Realtek Device 8126 + Intel I226-V + Intel Wi-Fi 7

  OS: Ubuntu 24.10
  Kernel: 6.11.0-13-generic (x86_64)
  Desktop: GNOME Shell 47.0
  Display Server: X Server + Wayland
  OpenGL: 4.6 Mesa 25.0~git2412210600.83a7d9~oibaf~o (git-83a7d9a 2024-12-21 oracular-oibaf-pp (LLVM 19.1.1 DRM 3.58)
  Compiler: GCC 14.2.0
  File-System: ext4
  Screen Resolution: 3840x2160

NCNN 20241226
Target: CPU
ms < Lower Is Better

Model                      a       b       c       d
mobilenet              71.23   70.33   69.57   69.72
mobilenet-v2           44.82   55.68   48.02   45.05
mobilenet-v3            5.17    9.53    6.31    4.15
shufflenet-v2          15.90    5.47   13.86   11.99
mnasnet                39.76   45.25   20.40   19.24
efficientnet-b0        85.60   73.64   84.88   85.14
blazeface               7.70    7.31   18.69   10.65
googlenet              50.48   56.81   47.12   47.79
vgg16                  42.22   41.28   41.92   42.45
resnet18               20.34   18.65   16.44   23.52
alexnet                18.78   18.38   17.86   17.64
resnet50               55.70   59.72   53.65   62.05
mobilenetv2-yolov3     71.23   70.33   69.57   69.72
yolov4-tiny            43.56   43.71   44.17   44.73
squeezenet_ssd         79.12   81.94   78.21   83.35
regnety_400m          164.12  169.79  147.09  134.37
vision_transformer    104.35  103.92  107.06  105.12
FastestDet             62.66   57.86   70.60   45.58

NCNN 20241226
Target: Vulkan GPU
ms < Lower Is Better

Model                      a       b       c       d
mobilenet              71.75   71.47   70.10   70.61
mobilenet-v2           54.48   51.27   49.18   50.50
mobilenet-v3            5.30   14.26    5.91    4.14
shufflenet-v2          19.28   31.22   31.51    8.22
mnasnet                30.96   51.29   38.53   34.98
efficientnet-b0        82.40   79.07   75.92   65.04
blazeface              11.13    8.83   17.35    6.12
googlenet              47.36   54.37   53.68   50.05
vgg16                  42.02   41.99   41.99   41.94
resnet18               20.83   17.40   23.88   17.27
alexnet                17.23   18.75   18.41   16.67
resnet50               48.28   52.88   48.93   49.84
mobilenetv2-yolov3     71.75   71.47   70.10   70.61
yolov4-tiny            44.47   43.93   43.51   44.61
squeezenet_ssd         77.93   78.04   79.20   79.67
regnety_400m          189.21  196.20  157.04  132.19
vision_transformer    104.88  101.49  102.97  103.92
FastestDet             54.20   54.47   43.48   67.85

Llama.cpp b4397
Backend: CPU BLAS
Tokens Per Second > Higher Is Better
(A dash means no result is reported for that run.)

Model: Llama-3.1-Tulu-3-8B-Q8_0

Test                           a       b       c       d
Text Generation 128         8.69    8.80    8.82    8.98
Prompt Processing 512      50.91   50.65   51.12   50.86
Prompt Processing 1024     47.19   47.30   47.26   47.39
Prompt Processing 2048     46.32   46.44   46.26       -

Model: Mistral-7B-Instruct-v0.3-Q8_0

Test                           a       b       c       d
Text Generation 128         8.90    9.19    9.15       -
Prompt Processing 512      51.20   51.62   51.35       -
Prompt Processing 1024     47.40   47.51   47.55       -
Prompt Processing 2048     46.53   46.54   46.58       -

Model: granite-3.0-3b-a800m-instruct-Q8_0

Test                           a       b       c       d
Text Generation 128        31.17   29.36   32.44       -
Prompt Processing 512     112.06  111.69  113.29       -
Prompt Processing 1024     97.95   97.78   98.11       -
Prompt Processing 2048     87.30   87.53   86.93       -
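
Runs a to d report the same hardware and software, so differences between them can be read as run-to-run variation. The following is a minimal Python sketch, not part of the Phoronix Test Suite output, that quantifies that variation for a few tests. The values are copied by hand from the results above; the dictionary keys and the `spread_percent` helper are names made up for this illustration.

```python
from statistics import mean

# Values copied by hand from the NCNN results above (ms, lower is better),
# in run order a, b, c, d. The keys are labels chosen for this sketch only.
results_ms = {
    "CPU / mobilenet-v3": [5.17, 9.53, 6.31, 4.15],
    "CPU / vgg16": [42.22, 41.28, 41.92, 42.45],
    "Vulkan GPU / regnety_400m": [189.21, 196.20, 157.04, 132.19],
}


def spread_percent(values):
    """Return the range (max - min) as a percentage of the mean."""
    return (max(values) - min(values)) / mean(values) * 100.0


for name, values in results_ms.items():
    print(f"{name}: mean {mean(values):.2f} ms, spread {spread_percent(values):.1f}%")
```

A small spread (as for vgg16) suggests a stable measurement, while a large one (as for mobilenet-v3) suggests the per-run numbers for that test should be compared with caution.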