ddda
AMD Ryzen AI 9 365 testing with an ASUS Zenbook S 16 UM5606WA_UM5606WA UM5606WA v1.0 (UM5606WA.308 BIOS) and AMD Radeon 512MB on Ubuntu 24.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2412307-NE-DDDA8846655 .
ddda: System Details
Runs: a, b, c, d

Processor: AMD Ryzen AI 9 365 @ 4.31GHz (10 Cores / 20 Threads)
Motherboard: ASUS Zenbook S 16 UM5606WA_UM5606WA UM5606WA v1.0 (UM5606WA.308 BIOS)
Chipset: AMD Device 1507
Memory: 4 x 6GB LPDDR5-7500MT/s Micron MT62F1536M32D4DS-026
Disk: 1024GB MTFDKBA1T0QFM-1BD1AABGB
Graphics: AMD Radeon 512MB
Audio: AMD Rembrandt Radeon HD Audio
Network: MEDIATEK Device 7925
OS: Ubuntu 24.10
Kernel: 6.12.0-rc7-phx-eraps (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 24.2.3-1ubuntu1 (LLVM 19.1.0 DRM 3.59)
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 2880x1800

Kernel Details:
- amdgpu.dcdebugmask=0x600
- Transparent Huge Pages: madvise

Compiler Details:
--build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details:
- Scaling Governor: amd-pstate-epp powersave (Boost: Enabled EPP: balance_performance)
- Platform Profile: balanced
- CPU Microcode: 0xb204011
- ACPI Profile: balanced

Security Details:
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- reg_file_data_sampling: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; PBRSB-eIBRS: Not affected; BHI: Not affected; ERAPS hardware RSB flush
- srbds: Not affected
- tsx_async_abort: Not affected
ddda: Results Overview

Test                                              a       b       c       d
ncnn: CPU - mobilenet                             11.88   10.73   11.87   10.51
ncnn: CPU-v2-v2 - mobilenet-v2                    4.68    4.42    4.47    4.54
ncnn: CPU-v3-v3 - mobilenet-v3                    3.76    3.4     3.5     3.51
ncnn: CPU - shufflenet-v2                         3.23    3.12    3.11    3.22
ncnn: CPU - mnasnet                               3.73    3.99    3.86    3.93
ncnn: CPU - efficientnet-b0                       5.14    5       4.97    4.99
ncnn: CPU - blazeface                             1.17    1.33    1.27    1.28
ncnn: CPU - googlenet                             10.22   10.57   9.49    9.19
ncnn: CPU - vgg16                                 33.83   33.42   32.75   33.19
ncnn: CPU - resnet18                              7.05    5.96    6.12    5.93
ncnn: CPU - alexnet                               5.83    5.29    4.97    5.08
ncnn: CPU - resnet50                              13.92   14.36   14.65   14.44
ncnn: CPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3  11.88   10.73   11.87   10.51
ncnn: CPU - yolov4-tiny                           15.88   14.69   16.19   14.85
ncnn: CPU - squeezenet_ssd                        8.44    8.04    9.13    8.03
ncnn: CPU - regnety_400m                          9.17    9.19    9.2     8.87
ncnn: CPU - vision_transformer                    70.92   71.62   71.61   71.74
ncnn: CPU - FastestDet                            4.12    4.24    4.17    4.25

Test                                                     a       b       c       d
ncnn: Vulkan GPU - mobilenet                             11.39   11.52   11.51   11.69
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2                    4.34    4.29    4.33    4.32
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3                    3.65    3.56    3.69    3.41
ncnn: Vulkan GPU - shufflenet-v2                         2.99    2.82    3       2.91
ncnn: Vulkan GPU - mnasnet                               3.49    3.41    3.46    3.26
ncnn: Vulkan GPU - efficientnet-b0                       4.84    4.92    5.05    4.8
ncnn: Vulkan GPU - blazeface                             1.24    1.12    1.19    1.07
ncnn: Vulkan GPU - googlenet                             9.01    9.35    9.53    8.74
ncnn: Vulkan GPU - vgg16                                 33.38   32.29   32.38   33.23
ncnn: Vulkan GPU - resnet18                              5.97    6.01    6.6     6.14
ncnn: Vulkan GPU - alexnet                               4.96    5.04    5.17    5.23
ncnn: Vulkan GPU - resnet50                              14.1    14      14.1    14.38
ncnn: Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3  11.39   11.52   11.51   11.69
ncnn: Vulkan GPU - yolov4-tiny                           15.18   17.06   16.17   16.52
ncnn: Vulkan GPU - squeezenet_ssd                        8.05    8.73    8.53    8.31
ncnn: Vulkan GPU - regnety_400m                          9.12    8.89    9.14    8.74
ncnn: Vulkan GPU - vision_transformer                    71.32   71.77   68.18   72.36
ncnn: Vulkan GPU - FastestDet                            2.77    3.49    2.75    3.98

Test                                                                               a       b       c       d
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128               10.3    10.33   10.32   10.3
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512             38.71   37.42   38.4    37.34
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024            37.28   37.42   36.94   37.38
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048            31.62   35.4    35.16   32.98
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128          10.75   10.86   10.8    10.81
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512        36.11   37.46   37.97   36.32
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024       34.3    37.73   36.57   35.25
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048       31.58   33.12   33.81   32.63
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128     59.22   60.7    60.75   58.28
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512   151.41  158.37  151.53  131.74
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024  151.53  158.82  150.48  141.29
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048  139.28  139.57  140.36  139.91
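The four runs a, b, c and d agree closely on some tests and diverge on others. A minimal Python sketch of one way to quantify that is shown here, using three rows copied from the overview above. The metric (range divided by mean) and the names `results` and `spread_percent` are illustrative choices, not something the Phoronix Test Suite reports.

```python
# Values for runs a, b, c, d, copied from the results overview.
results = {
    "ncnn Vulkan GPU FastestDet (ms)": [2.77, 3.49, 2.75, 3.98],
    "ncnn CPU vision_transformer (ms)": [70.92, 71.62, 71.61, 71.74],
    "llama-cpp granite PP512 (tokens/s)": [151.41, 158.37, 151.53, 131.74],
}


def spread_percent(values):
    """Return (max - min) / mean as a percentage (hypothetical helper)."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean * 100.0


spreads = {name: spread_percent(vals) for name, vals in results.items()}

for name, pct in spreads.items():
    print(f"{name}: {pct:.1f}% spread across runs a-d")
```

A large spread only flags a test whose runs disagree; it does not say which run is the representative one.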
NCNN 20241226, CPU targets (ms, Fewer Is Better)
Build flags for every NCNN test in this group: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Target: CPU - Model: mobilenet
  a: 11.88 (MIN: 11.76 / MAX: 17.05)
  b: 10.73 (MIN: 10.62 / MAX: 12.77)
  c: 11.87 (MIN: 11.6 / MAX: 35.77)
  d: 10.51 (MIN: 10.4 / MAX: 12.15)

Target: CPU-v2-v2 - Model: mobilenet-v2
  a: 4.68 (MIN: 4.11 / MAX: 35.42)
  b: 4.42 (MIN: 3.66 / MAX: 33.56)
  c: 4.47 (MIN: 3.78 / MAX: 33.88)
  d: 4.54 (MIN: 3.8 / MAX: 29.67)

Target: CPU-v3-v3 - Model: mobilenet-v3
  a: 3.76 (MIN: 3.53 / MAX: 28.89)
  b: 3.40 (MIN: 3.25 / MAX: 26.22)
  c: 3.50 (MIN: 3.3 / MAX: 45.42)
  d: 3.51 (MIN: 3.35 / MAX: 27.78)

Target: CPU - Model: shufflenet-v2
  a: 3.23 (MIN: 2.93 / MAX: 35.88)
  b: 3.12 (MIN: 2.99 / MAX: 25.69)
  c: 3.11 (MIN: 2.86 / MAX: 32.43)
  d: 3.22 (MIN: 3.07 / MAX: 34.03)

Target: CPU - Model: mnasnet
  a: 3.73 (MIN: 3.52 / MAX: 25.76)
  b: 3.99 (MIN: 3.66 / MAX: 30.25)
  c: 3.86 (MIN: 3.51 / MAX: 53.01)
  d: 3.93 (MIN: 3.64 / MAX: 32.62)

Target: CPU - Model: efficientnet-b0
  a: 5.14 (MIN: 5.1 / MAX: 5.68)
  b: 5.00 (MIN: 4.92 / MAX: 9.16)
  c: 4.97 (MIN: 4.92 / MAX: 6.74)
  d: 4.99 (MIN: 4.93 / MAX: 5.87)

Target: CPU - Model: blazeface
  a: 1.17 (MIN: 1.15 / MAX: 1.32)
  b: 1.33 (MIN: 1.3 / MAX: 2)
  c: 1.27 (MIN: 1.26 / MAX: 1.43)
  d: 1.28 (MIN: 1.26 / MAX: 1.87)

Target: CPU - Model: googlenet
  a: 10.22 (MIN: 10.02 / MAX: 14.16)
  b: 10.57 (MIN: 9.04 / MAX: 50.3)
  c: 9.49 (MIN: 8.97 / MAX: 57.01)
  d: 9.19 (MIN: 8.9 / MAX: 49.46)

Target: CPU - Model: vgg16
  a: 33.83 (MIN: 31.89 / MAX: 74.06)
  b: 33.42 (MIN: 32.44 / MAX: 75.15)
  c: 32.75 (MIN: 31.39 / MAX: 55.09)
  d: 33.19 (MIN: 31.59 / MAX: 64.74)

Target: CPU - Model: resnet18
  a: 7.05 (MIN: 7 / MAX: 7.99)
  b: 5.96 (MIN: 5.89 / MAX: 7.95)
  c: 6.12 (MIN: 5.81 / MAX: 30.87)
  d: 5.93 (MIN: 5.83 / MAX: 7.8)

Target: CPU - Model: alexnet
  a: 5.83 (MIN: 5.76 / MAX: 7.28)
  b: 5.29 (MIN: 5.01 / MAX: 17.98)
  c: 4.97 (MIN: 4.66 / MAX: 10.74)
  d: 5.08 (MIN: 4.97 / MAX: 6.57)

Target: CPU - Model: resnet50
  a: 13.92 (MIN: 13.33 / MAX: 58.77)
  b: 14.36 (MIN: 12.97 / MAX: 61.42)
  c: 14.65 (MIN: 14.54 / MAX: 15.47)
  d: 14.44 (MIN: 13.49 / MAX: 34.89)

Target: CPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3
  a: 11.88 (MIN: 11.76 / MAX: 17.05)
  b: 10.73 (MIN: 10.62 / MAX: 12.77)
  c: 11.87 (MIN: 11.6 / MAX: 35.77)
  d: 10.51 (MIN: 10.4 / MAX: 12.15)

Target: CPU - Model: yolov4-tiny
  a: 15.88 (MIN: 15.09 / MAX: 54.54)
  b: 14.69 (MIN: 14.27 / MAX: 69.64)
  c: 16.19 (MIN: 14.04 / MAX: 25.8)
  d: 14.85 (MIN: 14.14 / MAX: 85.67)

Target: CPU - Model: squeezenet_ssd
  a: 8.44 (MIN: 8.25 / MAX: 27.11)
  b: 8.04 (MIN: 7.86 / MAX: 25.17)
  c: 9.13 (MIN: 8.78 / MAX: 67.75)
  d: 8.03 (MIN: 7.87 / MAX: 12.13)

Target: CPU - Model: regnety_400m
  a: 9.17 (MIN: 9.11 / MAX: 11.17)
  b: 9.19 (MIN: 9.1 / MAX: 12.28)
  c: 9.20 (MIN: 8.93 / MAX: 57.09)
  d: 8.87 (MIN: 8.63 / MAX: 46.65)

Target: CPU - Model: vision_transformer
  a: 70.92 (MIN: 68.66 / MAX: 103.73)
  b: 71.62 (MIN: 69.18 / MAX: 102.55)
  c: 71.61 (MIN: 67.29 / MAX: 115.35)
  d: 71.74 (MIN: 69.59 / MAX: 91.14)

Target: CPU - Model: FastestDet
  a: 4.12 (MIN: 4.08 / MAX: 5.73)
  b: 4.24 (MIN: 4.12 / MAX: 25.65)
  c: 4.17 (MIN: 3.99 / MAX: 32.13)
  d: 4.25 (MIN: 4.21 / MAX: 8.33)
NCNN 20241226, Vulkan GPU targets (ms, Fewer Is Better)
Build flags for every NCNN test in this group: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Target: Vulkan GPU - Model: mobilenet
  a: 11.39 (MIN: 11.22 / MAX: 16.67)
  b: 11.52 (MIN: 11.29 / MAX: 40.04)
  c: 11.51 (MIN: 11.28 / MAX: 31.2)
  d: 11.69 (MIN: 11.56 / MAX: 17.13)

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
  a: 4.34 (MIN: 3.65 / MAX: 34.66)
  b: 4.29 (MIN: 3.64 / MAX: 29.66)
  c: 4.33 (MIN: 3.66 / MAX: 31.75)
  d: 4.32 (MIN: 3.64 / MAX: 27.83)

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
  a: 3.65 (MIN: 3.38 / MAX: 31.52)
  b: 3.56 (MIN: 3.32 / MAX: 26.9)
  c: 3.69 (MIN: 3.34 / MAX: 53.01)
  d: 3.41 (MIN: 3.11 / MAX: 34.27)

Target: Vulkan GPU - Model: shufflenet-v2
  a: 2.99 (MIN: 2.87 / MAX: 25.33)
  b: 2.82 (MIN: 2.69 / MAX: 25.04)
  c: 3.00 (MIN: 2.79 / MAX: 47.27)
  d: 2.91 (MIN: 2.66 / MAX: 51.34)

Target: Vulkan GPU - Model: mnasnet
  a: 3.49 (MIN: 3.36 / MAX: 29.39)
  b: 3.41 (MIN: 3.26 / MAX: 31.3)
  c: 3.46 (MIN: 3.33 / MAX: 25.82)
  d: 3.26 (MIN: 3.09 / MAX: 31.76)

Target: Vulkan GPU - Model: efficientnet-b0
  a: 4.84 (MIN: 4.8 / MAX: 6.4)
  b: 4.92 (MIN: 4.88 / MAX: 6.43)
  c: 5.05 (MIN: 4.86 / MAX: 28.25)
  d: 4.80 (MIN: 4.72 / MAX: 9.79)

Target: Vulkan GPU - Model: blazeface
  a: 1.24 (MIN: 1.18 / MAX: 5.63)
  b: 1.12 (MIN: 1.11 / MAX: 1.34)
  c: 1.19 (MIN: 1.18 / MAX: 1.23)
  d: 1.07 (MIN: 1.06 / MAX: 1.69)

Target: Vulkan GPU - Model: googlenet
  a: 9.01 (MIN: 8.55 / MAX: 49.29)
  b: 9.35 (MIN: 9.02 / MAX: 29.75)
  c: 9.53 (MIN: 8.98 / MAX: 50.27)
  d: 8.74 (MIN: 8.35 / MAX: 66.31)

Target: Vulkan GPU - Model: vgg16
  a: 33.38 (MIN: 31.56 / MAX: 65.09)
  b: 32.29 (MIN: 30.48 / MAX: 69.49)
  c: 32.38 (MIN: 31.27 / MAX: 63.71)
  d: 33.23 (MIN: 30.69 / MAX: 148.44)

Target: Vulkan GPU - Model: resnet18
  a: 5.97 (MIN: 5.88 / MAX: 7.65)
  b: 6.01 (MIN: 5.92 / MAX: 7.29)
  c: 6.60 (MIN: 5.87 / MAX: 53.8)
  d: 6.14 (MIN: 5.83 / MAX: 26.68)

Target: Vulkan GPU - Model: alexnet
  a: 4.96 (MIN: 4.89 / MAX: 6.2)
  b: 5.04 (MIN: 4.89 / MAX: 6.96)
  c: 5.17 (MIN: 4.98 / MAX: 20.08)
  d: 5.23 (MIN: 5.08 / MAX: 5.78)

Target: Vulkan GPU - Model: resnet50
  a: 14.10 (MIN: 13.4 / MAX: 38.99)
  b: 14.00 (MIN: 13.83 / MAX: 15.63)
  c: 14.10 (MIN: 13.45 / MAX: 36.9)
  d: 14.38 (MIN: 13.78 / MAX: 58.62)

Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3
  a: 11.39 (MIN: 11.22 / MAX: 16.67)
  b: 11.52 (MIN: 11.29 / MAX: 40.04)
  c: 11.51 (MIN: 11.28 / MAX: 31.2)
  d: 11.69 (MIN: 11.56 / MAX: 17.13)

Target: Vulkan GPU - Model: yolov4-tiny
  a: 15.18 (MIN: 14.42 / MAX: 56.53)
  b: 17.06 (MIN: 16.45 / MAX: 52.49)
  c: 16.17 (MIN: 15.1 / MAX: 60.38)
  d: 16.52 (MIN: 15.5 / MAX: 86.62)

Target: Vulkan GPU - Model: squeezenet_ssd
  a: 8.05 (MIN: 7.93 / MAX: 13.96)
  b: 8.73 (MIN: 8.63 / MAX: 9.35)
  c: 8.53 (MIN: 8.45 / MAX: 10.18)
  d: 8.31 (MIN: 8.2 / MAX: 13.89)

Target: Vulkan GPU - Model: regnety_400m
  a: 9.12 (MIN: 8.89 / MAX: 34.9)
  b: 8.89 (MIN: 8.74 / MAX: 10.45)
  c: 9.14 (MIN: 9.05 / MAX: 13.54)
  d: 8.74 (MIN: 8.64 / MAX: 9.66)

Target: Vulkan GPU - Model: vision_transformer
  a: 71.32 (MIN: 66.94 / MAX: 100.96)
  b: 71.77 (MIN: 66.86 / MAX: 121.7)
  c: 68.18 (MIN: 61.64 / MAX: 135.53)
  d: 72.36 (MIN: 70.52 / MAX: 91.57)

Target: Vulkan GPU - Model: FastestDet
  a: 2.77 (MIN: 2.72 / MAX: 4.61)
  b: 3.49 (MIN: 3.46 / MAX: 3.88)
  c: 2.75 (MIN: 2.71 / MAX: 4.51)
  d: 3.98 (MIN: 3.94 / MAX: 5.42)
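Since NCNN is run on both a CPU target and a Vulkan GPU target, one quick way to compare them is the ratio of mean latencies per model. The Python sketch below does this for three models using the a, b, c, d averages reported above. The choice of models and the helper name `cpu_to_vulkan_ratio` are illustrative assumptions; the ratio describes these four runs only and is not a general statement about either backend.

```python
from statistics import mean

# Average latency in ms for runs a, b, c, d, copied from the NCNN results.
cpu_ms = {
    "mobilenet": [11.88, 10.73, 11.87, 10.51],
    "vgg16": [33.83, 33.42, 32.75, 33.19],
    "FastestDet": [4.12, 4.24, 4.17, 4.25],
}
vulkan_ms = {
    "mobilenet": [11.39, 11.52, 11.51, 11.69],
    "vgg16": [33.38, 32.29, 32.38, 33.23],
    "FastestDet": [2.77, 3.49, 2.75, 3.98],
}


def cpu_to_vulkan_ratio(model):
    """Mean CPU latency divided by mean Vulkan GPU latency.

    A value above 1.0 means the Vulkan GPU target was faster on average
    (lower latency), since the unit is ms and fewer is better.
    """
    return mean(cpu_ms[model]) / mean(vulkan_ms[model])


ratios = {model: cpu_to_vulkan_ratio(model) for model in cpu_ms}

for model, ratio in ratios.items():
    print(f"{model}: CPU/Vulkan latency ratio = {ratio:.2f}")
```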
Llama.cpp b4397, Backend: CPU BLAS (Tokens Per Second, More Is Better)
Build flags for every Llama.cpp test in this group: (CXX) g++ options: -O3

Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128
  a: 10.30
  b: 10.33
  c: 10.32
  d: 10.30

Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512
  a: 38.71
  b: 37.42
  c: 38.40
  d: 37.34

Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024
  a: 37.28
  b: 37.42
  c: 36.94
  d: 37.38

Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048
  a: 31.62
  b: 35.40
  c: 35.16
  d: 32.98

Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128
  a: 10.75
  b: 10.86
  c: 10.80
  d: 10.81

Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512
  a: 36.11
  b: 37.46
  c: 37.97
  d: 36.32

Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024
  a: 34.30
  b: 37.73
  c: 36.57
  d: 35.25

Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048
  a: 31.58
  b: 33.12
  c: 33.81
  d: 32.63

Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128
  a: 59.22
  b: 60.70
  c: 60.75
  d: 58.28

Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512
  a: 151.41
  b: 158.37
  c: 151.53
  d: 131.74

Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024
  a: 151.53
  b: 158.82
  c: 150.48
  d: 141.29

Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048
  a: 139.28
  b: 139.57
  c: 140.36
  d: 139.91
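Tokens-per-second figures are easier to relate to real use once converted to wall-clock time. The Python sketch below does that for run a of Llama-3.1-Tulu-3-8B-Q8_0. It rests on two assumptions not stated in the export: that the number in each test name (128, 512, 1024, 2048) is a token count, and that the reported rate is an average over the whole test. The names `run_a` and `seconds_for` are hypothetical.

```python
# Run "a" rates in tokens per second for Llama-3.1-Tulu-3-8B-Q8_0,
# copied from the Llama.cpp results. Keys are (test name, assumed token count).
run_a = {
    ("Text Generation", 128): 10.30,
    ("Prompt Processing", 512): 38.71,
    ("Prompt Processing", 1024): 37.28,
    ("Prompt Processing", 2048): 31.62,
}


def seconds_for(tokens, tokens_per_second):
    """Time to handle `tokens` tokens at a constant average rate."""
    return tokens / tokens_per_second


durations = {key: seconds_for(key[1], rate) for key, rate in run_a.items()}

for (test, tokens), secs in durations.items():
    print(f"{test} {tokens}: about {secs:.1f} s at the reported rate")
```

Under those assumptions, processing a 2048-token prompt takes about a minute on this system, while generating 128 tokens takes roughly 12 seconds.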
Phoronix Test Suite v10.8.5