ncnn llama ryzen AMD Ryzen 7 7840HS testing with a Framework Laptop 16 (AMD Ryzen 7040 ) FRANMZCP07 (03.01 BIOS) and AMD Radeon 780M 512MB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2412296-NE-NCNNLLAMA88 .
ncnn llama ryzen Processor Motherboard Chipset Memory Disk Graphics Audio Network OS Kernel Desktop Display Server OpenGL Compiler File-System Screen Resolution a b c AMD Ryzen 7 7840HS @ 5.29GHz (8 Cores / 16 Threads) Framework Laptop 16 (AMD Ryzen 7040 ) FRANMZCP07 (03.01 BIOS) AMD Device 14e8 2 x 8GB DDR5-5600MT/s A-DATA AD5S56008G-B 512GB Western Digital PC SN810 SDCPNRY-512G AMD Radeon 780M 512MB AMD Navi 31 HDMI/DP MEDIATEK MT7922 802.11ax PCI Ubuntu 24.04 6.8.0-49-generic (x86_64) GNOME Shell 46.0 X Server + Wayland 4.6 Mesa 24.2~git2406200600.0ac0fb~oibaf~n (git-0ac0fbc 2024-06-20 noble-oibaf-ppa) (LLVM 17.0.6 DRM 3.57) GCC 13.2.0 ext4 2560x1600 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - Platform Profile: balanced - CPU Microcode: 0xa704103 - ACPI Profile: balanced Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Vulnerable: Safe RET no microcode + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
ncnn llama ryzen ncnn: CPU - mobilenet ncnn: CPU-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: CPU - shufflenet-v2 ncnn: CPU - mnasnet ncnn: CPU - efficientnet-b0 ncnn: CPU - blazeface ncnn: CPU - googlenet ncnn: CPU - vgg16 ncnn: CPU - resnet18 ncnn: CPU - alexnet ncnn: CPU - resnet50 ncnn: CPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3 ncnn: CPU - yolov4-tiny ncnn: CPU - squeezenet_ssd ncnn: CPU - regnety_400m ncnn: CPU - vision_transformer ncnn: CPU - FastestDet ncnn: Vulkan GPU - mobilenet ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU - shufflenet-v2 ncnn: Vulkan GPU - mnasnet ncnn: Vulkan GPU - efficientnet-b0 ncnn: Vulkan GPU - blazeface ncnn: Vulkan GPU - googlenet ncnn: Vulkan GPU - vgg16 ncnn: Vulkan GPU - resnet18 ncnn: Vulkan GPU - alexnet ncnn: Vulkan GPU - resnet50 ncnn: Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3 ncnn: Vulkan GPU - yolov4-tiny ncnn: Vulkan GPU - squeezenet_ssd ncnn: Vulkan GPU - regnety_400m ncnn: Vulkan GPU - vision_transformer ncnn: Vulkan GPU - FastestDet llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048 a b c 10.42 3.01 2.96 2.5 2.81 4.29 0.85 8.57 41.13 6.44 5.54 13.96 10.42 16.69 7.21 6.07 59.71 2.67 10.45 3.03 2.98 2.52 2.81 4.33 0.79 8.76 41.34 6.51 5.56 13.96 10.45 16.72 7.31 6.09 60.18 2.85 7.17 40.55 39.9 37.65 7.6 39.58 39.59 38.27 53.66 165.04 162.78 148.11 9.95 4.01 2.75 2.38 2.55 4.03 0.74 7.63 39.5 5.52 5.23 12.71 9.95 15.58 6.83 5.93 59.95 2.83 9.89 2.85 2.77 2.37 2.56 4.02 0.74 7.5 41.73 4.91 4.87 15.32 9.89 15.32 6.96 5.86 59.62 2.91 7.21 40.06 39.71 38.54 7.6 40.2 39.54 38.59 53.55 158.7 160.59 151.48 9.9 4.02 2.75 2.37 2.55 4.02 0.74 7.61 39.64 5.47 5.23 12.72 9.9 15.47 6.8 5.88 59.94 2.59 9.97 2.87 2.75 2.36 2.59 4.01 0.74 7.44 38.85 4.83 4.91 12.19 9.97 15.11 6.76 5.85 59.85 2.72 7.21 40.57 40.11 38.74 7.59 40.56 39.49 38.5 53.55 170.43 164.58 149.08 OpenBenchmarking.org
NCNN Target: CPU - Model: mobilenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: mobilenet a b c 3 6 9 12 15 10.42 9.95 9.90 MIN: 10.34 / MAX: 12.33 MIN: 9.85 / MAX: 14.03 MIN: 9.78 / MAX: 14.64 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU-v2-v2 - Model: mobilenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU-v2-v2 - Model: mobilenet-v2 a b c 0.9045 1.809 2.7135 3.618 4.5225 3.01 4.01 4.02 MIN: 2.94 / MAX: 6.3 MIN: 2.79 / MAX: 226.32 MIN: 2.79 / MAX: 226.81 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU-v3-v3 - Model: mobilenet-v3 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU-v3-v3 - Model: mobilenet-v3 a b c 0.666 1.332 1.998 2.664 3.33 2.96 2.75 2.75 MIN: 2.9 / MAX: 5.75 MIN: 2.67 / MAX: 5.61 MIN: 2.67 / MAX: 5.76 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: shufflenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: shufflenet-v2 a b c 0.5625 1.125 1.6875 2.25 2.8125 2.50 2.38 2.37 MIN: 2.46 / MAX: 4.33 MIN: 2.33 / MAX: 4.92 MIN: 2.31 / MAX: 5.07 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: mnasnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: mnasnet a b c 0.6323 1.2646 1.8969 2.5292 3.1615 2.81 2.55 2.55 MIN: 2.73 / MAX: 5.32 MIN: 2.49 / MAX: 5.11 MIN: 2.48 / MAX: 5.75 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: efficientnet-b0 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: efficientnet-b0 a b c 0.9653 1.9306 2.8959 3.8612 4.8265 4.29 4.03 4.02 MIN: 4.18 / MAX: 6.88 MIN: 3.95 / MAX: 6.79 MIN: 3.91 / MAX: 6.69 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: blazeface OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: blazeface a b c 0.1913 0.3826 0.5739 0.7652 0.9565 0.85 0.74 0.74 MIN: 0.78 / MAX: 10.11 MIN: 0.73 / MAX: 0.92 MIN: 0.73 / MAX: 0.93 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: googlenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: googlenet a b c 2 4 6 8 10 8.57 7.63 7.61 MIN: 8.44 / MAX: 11.96 MIN: 7.51 / MAX: 11 MIN: 7.5 / MAX: 9.49 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: vgg16 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: vgg16 a b c 9 18 27 36 45 41.13 39.50 39.64 MIN: 40.58 / MAX: 49.94 MIN: 38.97 / MAX: 48.87 MIN: 39.2 / MAX: 50.57 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: resnet18 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: resnet18 a b c 2 4 6 8 10 6.44 5.52 5.47 MIN: 6.37 / MAX: 9.26 MIN: 5.42 / MAX: 7.18 MIN: 5.38 / MAX: 7.66 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: alexnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: alexnet a b c 1.2465 2.493 3.7395 4.986 6.2325 5.54 5.23 5.23 MIN: 5.5 / MAX: 6.96 MIN: 5.15 / MAX: 6.92 MIN: 5.17 / MAX: 6.5 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: resnet50 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: resnet50 a b c 4 8 12 16 20 13.96 12.71 12.72 MIN: 13.83 / MAX: 16.06 MIN: 12.57 / MAX: 17.48 MIN: 12.59 / MAX: 15.03 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3 a b c 3 6 9 12 15 10.42 9.95 9.90 MIN: 10.34 / MAX: 12.33 MIN: 9.85 / MAX: 14.03 MIN: 9.78 / MAX: 14.64 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: yolov4-tiny OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: yolov4-tiny a b c 4 8 12 16 20 16.69 15.58 15.47 MIN: 16.45 / MAX: 20.64 MIN: 15.32 / MAX: 34.14 MIN: 15.25 / MAX: 18.74 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: squeezenet_ssd OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: squeezenet_ssd a b c 2 4 6 8 10 7.21 6.83 6.80 MIN: 7.1 / MAX: 9.12 MIN: 6.69 / MAX: 11.01 MIN: 6.68 / MAX: 11.21 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: regnety_400m OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: regnety_400m a b c 2 4 6 8 10 6.07 5.93 5.88 MIN: 5.98 / MAX: 10.09 MIN: 5.81 / MAX: 15.99 MIN: 5.8 / MAX: 14.93 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: vision_transformer OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: vision_transformer a b c 13 26 39 52 65 59.71 59.95 59.94 MIN: 58.71 / MAX: 68.89 MIN: 59.31 / MAX: 77.78 MIN: 58.92 / MAX: 78.16 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: FastestDet OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: CPU - Model: FastestDet a b c 0.6368 1.2736 1.9104 2.5472 3.184 2.67 2.83 2.59 MIN: 2.64 / MAX: 4.2 MIN: 2.78 / MAX: 7.44 MIN: 2.56 / MAX: 3.85 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: mobilenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: mobilenet a b c 3 6 9 12 15 10.45 9.89 9.97 MIN: 10.39 / MAX: 10.87 MIN: 9.79 / MAX: 12.12 MIN: 9.83 / MAX: 13.81 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 a b c 0.6818 1.3636 2.0454 2.7272 3.409 3.03 2.85 2.87 MIN: 2.95 / MAX: 5.6 MIN: 2.79 / MAX: 5.37 MIN: 2.79 / MAX: 5.93 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 a b c 0.6705 1.341 2.0115 2.682 3.3525 2.98 2.77 2.75 MIN: 2.92 / MAX: 5.52 MIN: 2.68 / MAX: 5.24 MIN: 2.69 / MAX: 5.4 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: shufflenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: shufflenet-v2 a b c 0.567 1.134 1.701 2.268 2.835 2.52 2.37 2.36 MIN: 2.47 / MAX: 4.97 MIN: 2.32 / MAX: 4.75 MIN: 2.29 / MAX: 5.34 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: mnasnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: mnasnet a b c 0.6323 1.2646 1.8969 2.5292 3.1615 2.81 2.56 2.59 MIN: 2.74 / MAX: 5.35 MIN: 2.5 / MAX: 5.05 MIN: 2.49 / MAX: 12.75 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: efficientnet-b0 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: efficientnet-b0 a b c 0.9743 1.9486 2.9229 3.8972 4.8715 4.33 4.02 4.01 MIN: 4.2 / MAX: 13.97 MIN: 3.93 / MAX: 6.78 MIN: 3.92 / MAX: 6.51 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: blazeface OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: blazeface a b c 0.1778 0.3556 0.5334 0.7112 0.889 0.79 0.74 0.74 MAX: 0.85 MIN: 0.73 / MAX: 0.82 MIN: 0.73 / MAX: 0.83 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: googlenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: googlenet a b c 2 4 6 8 10 8.76 7.50 7.44 MIN: 8.51 / MAX: 17.3 MIN: 7.37 / MAX: 9.08 MIN: 7.32 / MAX: 9.07 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: vgg16 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: vgg16 a b c 10 20 30 40 50 41.34 41.73 38.85 MIN: 40.71 / MAX: 55.66 MIN: 38.44 / MAX: 161 MIN: 38.52 / MAX: 49.2 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: resnet18 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: resnet18 a b c 2 4 6 8 10 6.51 4.91 4.83 MIN: 6.4 / MAX: 10.78 MIN: 4.82 / MAX: 10.58 MIN: 4.76 / MAX: 5.35 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: alexnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: alexnet a b c 1.251 2.502 3.753 5.004 6.255 5.56 4.87 4.91 MIN: 5.47 / MAX: 7.38 MIN: 4.77 / MAX: 8.68 MIN: 4.82 / MAX: 6.18 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: resnet50 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: resnet50 a b c 4 8 12 16 20 13.96 15.32 12.19 MIN: 13.81 / MAX: 18.95 MIN: 12.54 / MAX: 83.28 MIN: 12.08 / MAX: 13.91 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3 OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3 a b c 3 6 9 12 15 10.45 9.89 9.97 MIN: 10.39 / MAX: 10.87 MIN: 9.79 / MAX: 12.12 MIN: 9.83 / MAX: 13.81 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: yolov4-tiny OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: yolov4-tiny a b c 4 8 12 16 20 16.72 15.32 15.11 MIN: 16.58 / MAX: 20.56 MIN: 15.1 / MAX: 19.94 MIN: 14.91 / MAX: 16.77 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: squeezenet_ssd OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: squeezenet_ssd a b c 2 4 6 8 10 7.31 6.96 6.76 MIN: 7.2 / MAX: 11.67 MIN: 6.83 / MAX: 8.76 MIN: 6.63 / MAX: 8.4 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: regnety_400m OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: regnety_400m a b c 2 4 6 8 10 6.09 5.86 5.85 MIN: 6.04 / MAX: 7.35 MIN: 5.79 / MAX: 7.35 MIN: 5.77 / MAX: 9.59 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: vision_transformer OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: vision_transformer a b c 13 26 39 52 65 60.18 59.62 59.85 MIN: 58.25 / MAX: 98.64 MIN: 58.53 / MAX: 68.86 MIN: 58.6 / MAX: 81.55 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: FastestDet OpenBenchmarking.org ms, Fewer Is Better NCNN 20241226 Target: Vulkan GPU - Model: FastestDet a b c 0.6548 1.3096 1.9644 2.6192 3.274 2.85 2.91 2.72 MIN: 2.82 / MAX: 4.26 MIN: 2.87 / MAX: 6.78 MIN: 2.69 / MAX: 4.02 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 a b c 2 4 6 8 10 7.17 7.21 7.21 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 a b c 9 18 27 36 45 40.55 40.06 40.57 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 a b c 9 18 27 36 45 39.90 39.71 40.11 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 a b c 9 18 27 36 45 37.65 38.54 38.74 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 a b c 2 4 6 8 10 7.60 7.60 7.59 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 a b c 9 18 27 36 45 39.58 40.20 40.56 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 a b c 9 18 27 36 45 39.59 39.54 39.49 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 a b c 9 18 27 36 45 38.27 38.59 38.50 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 a b c 12 24 36 48 60 53.66 53.55 53.55 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 a b c 40 80 120 160 200 165.04 158.70 170.43 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 a b c 40 80 120 160 200 162.78 160.59 164.58 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 a b c 30 60 90 120 150 148.11 151.48 149.08 1. (CXX) g++ options: -O3
Phoronix Test Suite v10.8.5