ncnn llama ryzen ai

AMD Ryzen AI 9 HX 370 testing with an ASUS Zenbook S 16 UM5606WA_UM5606WA UM5606WA v1.0 (UM5606WA.308 BIOS) and AMD Radeon 512MB on Ubuntu 24.10 via the Phoronix Test Suite.

a:

  Processor: AMD Ryzen AI 9 HX 370 @ 4.37GHz (12 Cores / 24 Threads), Motherboard: ASUS Zenbook S 16 UM5606WA_UM5606WA UM5606WA v1.0 (UM5606WA.308 BIOS), Chipset: AMD Device 1507, Memory: 4 x 8GB LPDDR5-7500MT/s Samsung K3KL9L90CM-MGCT, Disk: 1024GB MTFDKBA1T0QFM-1BD1AABGB, Graphics: AMD Radeon 512MB, Audio: AMD Rembrandt Radeon HD Audio, Network: MEDIATEK Device 7925

  OS: Ubuntu 24.10, Kernel: 6.11.0-rc6-phx (x86_64), Desktop: GNOME Shell 47.0, Display Server: X Server + Wayland, OpenGL: 4.6 Mesa 24.2.3-1ubuntu1 (LLVM 19.1.0 DRM 3.58), Compiler: GCC 14.2.0, File-System: ext4, Screen Resolution: 2880x1800

b:

  Same hardware and software configuration as a.

c:

  Same hardware and software configuration as a.

NCNN 20241226
Target: CPU - Model: mobilenet
ms < Lower Is Better
a . 11.61
b . 11.49
c . 11.72

NCNN 20241226
Target: CPU - Model: mobilenet-v2
ms < Lower Is Better
a . 4.23
b . 4.05
c . 4.13

NCNN 20241226
Target: CPU - Model: mobilenet-v3
ms < Lower Is Better
a . 2.97
b . 3.03
c . 3.02

NCNN 20241226
Target: CPU - Model: shufflenet-v2
ms < Lower Is Better
a . 2.57
b . 2.59
c . 2.57

NCNN 20241226
Target: CPU - Model: mnasnet
ms < Lower Is Better
a . 2.99
b . 3.03
c . 3.05

NCNN 20241226
Target: CPU - Model: efficientnet-b0
ms < Lower Is Better
a . 4.57
b . 4.54
c . 4.59

NCNN 20241226
Target: CPU - Model: blazeface
ms < Lower Is Better
a . 1.00
b . 1.09
c . 1.00

NCNN 20241226
Target: CPU - Model: googlenet
ms < Lower Is Better
a . 8.02
b . 8.21
c . 7.85

NCNN 20241226
Target: CPU - Model: vgg16
ms < Lower Is Better
a . 34.92
b . 34.34
c . 34.05

NCNN 20241226
Target: CPU - Model: resnet18
ms < Lower Is Better
a . 5.74
b . 5.92
c . 5.63

NCNN 20241226
Target: CPU - Model: alexnet
ms < Lower Is Better
a . 4.65
b . 4.79
c . 4.39

NCNN 20241226
Target: CPU - Model: resnet50
ms < Lower Is Better
a . 13.79
b . 14.03
c . 14.35

NCNN 20241226
Target: CPU - Model: mobilenetv2-yolov3
ms < Lower Is Better
a . 11.61
b . 11.49
c . 11.72

NCNN 20241226
Target: CPU - Model: yolov4-tiny
ms < Lower Is Better
a . 16.66
b . 16.66
c . 16.23

NCNN 20241226
Target: CPU - Model: squeezenet_ssd
ms < Lower Is Better
a . 8.11
b . 8.33
c . 8.30

NCNN 20241226
Target: CPU - Model: regnety_400m
ms < Lower Is Better
a . 8.44
b . 8.96
c . 8.68

NCNN 20241226
Target: CPU - Model: vision_transformer
ms < Lower Is Better
a . 64.65
b . 67.13
c . 63.27

NCNN 20241226
Target: CPU - Model: FastestDet
ms < Lower Is Better
a . 3.28
b . 4.13
c . 3.81

NCNN 20241226
Target: Vulkan GPU - Model: mobilenet
ms < Lower Is Better
a . 11.33
b . 12.67
c . 11.27

NCNN 20241226
Target: Vulkan GPU - Model: mobilenet-v2
ms < Lower Is Better
a . 4.10
b . 4.54
c . 4.11

NCNN 20241226
Target: Vulkan GPU - Model: mobilenet-v3
ms < Lower Is Better
a . 3.02
b . 3.41
c . 3.13

NCNN 20241226
Target: Vulkan GPU - Model: shufflenet-v2
ms < Lower Is Better
a . 2.62
b . 2.96
c . 2.69

NCNN 20241226
Target: Vulkan GPU - Model: mnasnet
ms < Lower Is Better
a . 3.03
b . 3.65
c . 3.05

NCNN 20241226
Target: Vulkan GPU - Model: efficientnet-b0
ms < Lower Is Better
a . 4.56
b . 5.22
c . 4.76

NCNN 20241226
Target: Vulkan GPU - Model: blazeface
ms < Lower Is Better
a . 1.01
b . 1.19
c . 1.04

NCNN 20241226
Target: Vulkan GPU - Model: googlenet
ms < Lower Is Better
a . 8.51
b . 9.49
c . 8.47

NCNN 20241226
Target: Vulkan GPU - Model: vgg16
ms < Lower Is Better
a . 35.80
b . 35.20
c . 33.45

NCNN 20241226
Target: Vulkan GPU - Model: resnet18
ms < Lower Is Better
a . 6.55
b . 6.82
c . 6.46

NCNN 20241226
Target: Vulkan GPU - Model: alexnet
ms < Lower Is Better
a . 5.31
b . 5.40
c . 5.10

NCNN 20241226
Target: Vulkan GPU - Model: resnet50
ms < Lower Is Better
a . 14.72
b . 14.65
c . 14.06

NCNN 20241226
Target: Vulkan GPU - Model: mobilenetv2-yolov3
ms < Lower Is Better
a . 11.33
b . 12.67
c . 11.27

NCNN 20241226
Target: Vulkan GPU - Model: yolov4-tiny
ms < Lower Is Better
a . 15.57
b . 15.95
c . 15.77

NCNN 20241226
Target: Vulkan GPU - Model: squeezenet_ssd
ms < Lower Is Better
a . 8.93
b . 9.41
c . 8.69

NCNN 20241226
Target: Vulkan GPU - Model: regnety_400m
ms < Lower Is Better
a . 8.48
b . 9.30
c . 8.77

NCNN 20241226
Target: Vulkan GPU - Model: vision_transformer
ms < Lower Is Better
a . 65.73
b . 67.70
c . 64.41

NCNN 20241226
Target: Vulkan GPU - Model: FastestDet
ms < Lower Is Better
a . 3.81
b . 4.23
c . 4.12

Llama.cpp b4397
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
a . 10.16
b . 10.17
c . 10.12

Llama.cpp b4397
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512
Tokens Per Second > Higher Is Better
a . 34.67
b . 37.63
c . 37.59

Llama.cpp b4397
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
a . 30.41
b . 31.93
c . 33.19

Llama.cpp b4397
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
a . 29.61
b . 31.04
c . 30.88

Llama.cpp b4397
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
a . 10.26
b . 10.35
c . 10.36

Llama.cpp b4397
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512
Tokens Per Second > Higher Is Better
a . 31.23
b . 33.32
c . 33.20

Llama.cpp b4397
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
a . 30.77
b . 32.50
c . 32.44

Llama.cpp b4397
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
a . 29.70
b . 30.28
c . 30.57

Llama.cpp b4397
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
a . 53.26
b . 53.94
c . 54.39

Llama.cpp b4397
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512
Tokens Per Second > Higher Is Better
a . 124.53
b . 138.47
c . 136.99

Llama.cpp b4397
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
a . 122.15
b . 151.70
c . 144.13

Llama.cpp b4397
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
a . 114.91
b . 137.10
c . 135.14
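
Because runs a, b and c report the same system configuration, differences between them are best read as run-to-run variation. A minimal sketch of how that variation could be quantified is given here, using two of the Llama.cpp results listed above. The helper name spread_percent and the choice of (max - min) / mean as the metric are illustrative assumptions; neither is produced by the Phoronix Test Suite.

```python
from statistics import mean

# Tokens per second for runs a, b, c, copied from the Llama.cpp results above.
results = {
    "granite-3.0-3b-a800m-instruct-Q8_0 / Prompt Processing 1024": [122.15, 151.70, 144.13],
    "Llama-3.1-Tulu-3-8B-Q8_0 / Text Generation 128": [10.16, 10.17, 10.12],
}


def spread_percent(values):
    """Hypothetical helper: (max - min) as a percentage of the mean of the runs."""
    return (max(values) - min(values)) / mean(values) * 100.0


for name, values in results.items():
    print(f"{name}: mean={mean(values):.2f}, spread={spread_percent(values):.1f}%")
```

On these inputs the granite prompt-processing result varies by roughly a fifth of its mean across the three runs, while the Tulu text-generation result varies by well under one percent, so the former should be compared with more caution.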