ncnn llama ryzen
AMD Ryzen 7 7840HS testing with a Framework Laptop 16 (AMD Ryzen 7040) FRANMZCP07 (03.01 BIOS) and AMD Radeon 780M 512MB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2412296-NE-NCNNLLAMA88&sor .
ncnn llama ryzen
Runs: a b c
Processor: AMD Ryzen 7 7840HS @ 5.29GHz (8 Cores / 16 Threads)
Motherboard: Framework Laptop 16 (AMD Ryzen 7040) FRANMZCP07 (03.01 BIOS)
Chipset: AMD Device 14e8
Memory: 2 x 8GB DDR5-5600MT/s A-DATA AD5S56008G-B
Disk: 512GB Western Digital PC SN810 SDCPNRY-512G
Graphics: AMD Radeon 780M 512MB
Audio: AMD Navi 31 HDMI/DP
Network: MEDIATEK MT7922 802.11ax PCI
OS: Ubuntu 24.04
Kernel: 6.8.0-49-generic (x86_64)
Desktop: GNOME Shell 46.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 24.2~git2406200600.0ac0fb~oibaf~n (git-0ac0fbc 2024-06-20 noble-oibaf-ppa) (LLVM 17.0.6 DRM 3.57)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 2560x1600
Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - Platform Profile: balanced - CPU Microcode: 0xa704103 - ACPI Profile: balanced
Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Vulnerable: Safe RET no microcode + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
ncnn llama ryzen: results overview
ncnn results are in ms (fewer is better); llama-cpp results are in tokens per second (more is better).
       a       b       c  Test
   10.42    9.95    9.90  ncnn: CPU - mobilenet
    3.01    4.01    4.02  ncnn: CPU-v2-v2 - mobilenet-v2
    2.96    2.75    2.75  ncnn: CPU-v3-v3 - mobilenet-v3
    2.50    2.38    2.37  ncnn: CPU - shufflenet-v2
    2.81    2.55    2.55  ncnn: CPU - mnasnet
    4.29    4.03    4.02  ncnn: CPU - efficientnet-b0
    0.85    0.74    0.74  ncnn: CPU - blazeface
    8.57    7.63    7.61  ncnn: CPU - googlenet
   41.13   39.50   39.64  ncnn: CPU - vgg16
    6.44    5.52    5.47  ncnn: CPU - resnet18
    5.54    5.23    5.23  ncnn: CPU - alexnet
   13.96   12.71   12.72  ncnn: CPU - resnet50
   10.42    9.95    9.90  ncnn: CPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3
   16.69   15.58   15.47  ncnn: CPU - yolov4-tiny
    7.21    6.83    6.80  ncnn: CPU - squeezenet_ssd
    6.07    5.93    5.88  ncnn: CPU - regnety_400m
   59.71   59.95   59.94  ncnn: CPU - vision_transformer
    2.67    2.83    2.59  ncnn: CPU - FastestDet
   10.45    9.89    9.97  ncnn: Vulkan GPU - mobilenet
    3.03    2.85    2.87  ncnn: Vulkan GPU-v2-v2 - mobilenet-v2
    2.98    2.77    2.75  ncnn: Vulkan GPU-v3-v3 - mobilenet-v3
    2.52    2.37    2.36  ncnn: Vulkan GPU - shufflenet-v2
    2.81    2.56    2.59  ncnn: Vulkan GPU - mnasnet
    4.33    4.02    4.01  ncnn: Vulkan GPU - efficientnet-b0
    0.79    0.74    0.74  ncnn: Vulkan GPU - blazeface
    8.76    7.50    7.44  ncnn: Vulkan GPU - googlenet
   41.34   41.73   38.85  ncnn: Vulkan GPU - vgg16
    6.51    4.91    4.83  ncnn: Vulkan GPU - resnet18
    5.56    4.87    4.91  ncnn: Vulkan GPU - alexnet
   13.96   15.32   12.19  ncnn: Vulkan GPU - resnet50
   10.45    9.89    9.97  ncnn: Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3
   16.72   15.32   15.11  ncnn: Vulkan GPU - yolov4-tiny
    7.31    6.96    6.76  ncnn: Vulkan GPU - squeezenet_ssd
    6.09    5.86    5.85  ncnn: Vulkan GPU - regnety_400m
   60.18   59.62   59.85  ncnn: Vulkan GPU - vision_transformer
    2.85    2.91    2.72  ncnn: Vulkan GPU - FastestDet
    7.17    7.21    7.21  llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128
   40.55   40.06   40.57  llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512
   39.90   39.71   40.11  llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024
   37.65   38.54   38.74  llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048
    7.60    7.60    7.59  llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128
   39.58   40.20   40.56  llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512
   39.59   39.54   39.49  llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024
   38.27   38.59   38.50  llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048
   53.66   53.55   53.55  llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128
  165.04  158.70  170.43  llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512
  162.78  160.59  164.58  llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024
  148.11  151.48  149.08  llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048
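For readers who want to see how far the three runs differ on a given test, the sketch below computes the relative spread, (max - min) / min, across runs a, b, and c. It is only an illustration: the three tests were picked by hand from the results overview above, and the helper names `relative_spread` and `report` are hypothetical, not part of the Phoronix Test Suite.

```python
# Values copied from the results overview (NCNN, ms, fewer is better).
# The selection of tests is arbitrary and only serves as an example.
results_ms = {
    "Vulkan GPU - resnet18": {"a": 6.51, "b": 4.91, "c": 4.83},
    "Vulkan GPU - resnet50": {"a": 13.96, "b": 15.32, "c": 12.19},
    "CPU - vision_transformer": {"a": 59.71, "b": 59.95, "c": 59.94},
}


def relative_spread(runs):
    """Return (max - min) / min across the runs of one test, as a fraction."""
    values = list(runs.values())
    return (max(values) - min(values)) / min(values)


def report(results):
    """Return one formatted line per test, widest spread first."""
    ranked = sorted(
        results.items(),
        key=lambda item: relative_spread(item[1]),
        reverse=True,
    )
    return [
        f"{name}: {relative_spread(runs) * 100:.1f}% spread"
        for name, runs in ranked
    ]


if __name__ == "__main__":
    for line in report(results_ms):
        print(line)
```

A large spread on a single test says only that the runs disagree; the MIN/MAX figures in the per-test listings are the place to look for individual outlier samples.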
NCNN 20241226, Target: CPU - Model: mobilenet (ms, Fewer Is Better): c 9.90 (MIN: 9.78 / MAX: 14.64), b 9.95 (MIN: 9.85 / MAX: 14.03), a 10.42 (MIN: 10.34 / MAX: 12.33). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): a 3.01 (MIN: 2.94 / MAX: 6.3), b 4.01 (MIN: 2.79 / MAX: 226.32), c 4.02 (MIN: 2.79 / MAX: 226.81). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): b 2.75 (MIN: 2.67 / MAX: 5.61), c 2.75 (MIN: 2.67 / MAX: 5.76), a 2.96 (MIN: 2.9 / MAX: 5.75). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better): c 2.37 (MIN: 2.31 / MAX: 5.07), b 2.38 (MIN: 2.33 / MAX: 4.92), a 2.50 (MIN: 2.46 / MAX: 4.33). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU - Model: mnasnet (ms, Fewer Is Better): b 2.55 (MIN: 2.49 / MAX: 5.11), c 2.55 (MIN: 2.48 / MAX: 5.75), a 2.81 (MIN: 2.73 / MAX: 5.32). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better): c 4.02 (MIN: 3.91 / MAX: 6.69), b 4.03 (MIN: 3.95 / MAX: 6.79), a 4.29 (MIN: 4.18 / MAX: 6.88). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU - Model: blazeface (ms, Fewer Is Better): b 0.74 (MIN: 0.73 / MAX: 0.92), c 0.74 (MIN: 0.73 / MAX: 0.93), a 0.85 (MIN: 0.78 / MAX: 10.11). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU - Model: googlenet (ms, Fewer Is Better): c 7.61 (MIN: 7.5 / MAX: 9.49), b 7.63 (MIN: 7.51 / MAX: 11), a 8.57 (MIN: 8.44 / MAX: 11.96). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU - Model: vgg16 (ms, Fewer Is Better): b 39.50 (MIN: 38.97 / MAX: 48.87), c 39.64 (MIN: 39.2 / MAX: 50.57), a 41.13 (MIN: 40.58 / MAX: 49.94). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU - Model: resnet18 (ms, Fewer Is Better): c 5.47 (MIN: 5.38 / MAX: 7.66), b 5.52 (MIN: 5.42 / MAX: 7.18), a 6.44 (MIN: 6.37 / MAX: 9.26). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU - Model: alexnet (ms, Fewer Is Better): b 5.23 (MIN: 5.15 / MAX: 6.92), c 5.23 (MIN: 5.17 / MAX: 6.5), a 5.54 (MIN: 5.5 / MAX: 6.96). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU - Model: resnet50 (ms, Fewer Is Better): b 12.71 (MIN: 12.57 / MAX: 17.48), c 12.72 (MIN: 12.59 / MAX: 15.03), a 13.96 (MIN: 13.83 / MAX: 16.06). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3 (ms, Fewer Is Better): c 9.90 (MIN: 9.78 / MAX: 14.64), b 9.95 (MIN: 9.85 / MAX: 14.03), a 10.42 (MIN: 10.34 / MAX: 12.33). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better): c 15.47 (MIN: 15.25 / MAX: 18.74), b 15.58 (MIN: 15.32 / MAX: 34.14), a 16.69 (MIN: 16.45 / MAX: 20.64). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better): c 6.80 (MIN: 6.68 / MAX: 11.21), b 6.83 (MIN: 6.69 / MAX: 11.01), a 7.21 (MIN: 7.1 / MAX: 9.12). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU - Model: regnety_400m (ms, Fewer Is Better): c 5.88 (MIN: 5.8 / MAX: 14.93), b 5.93 (MIN: 5.81 / MAX: 15.99), a 6.07 (MIN: 5.98 / MAX: 10.09). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU - Model: vision_transformer (ms, Fewer Is Better): a 59.71 (MIN: 58.71 / MAX: 68.89), c 59.94 (MIN: 58.92 / MAX: 78.16), b 59.95 (MIN: 59.31 / MAX: 77.78). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: CPU - Model: FastestDet (ms, Fewer Is Better): c 2.59 (MIN: 2.56 / MAX: 3.85), a 2.67 (MIN: 2.64 / MAX: 4.2), b 2.83 (MIN: 2.78 / MAX: 7.44). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better): b 9.89 (MIN: 9.79 / MAX: 12.12), c 9.97 (MIN: 9.83 / MAX: 13.81), a 10.45 (MIN: 10.39 / MAX: 10.87). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): b 2.85 (MIN: 2.79 / MAX: 5.37), c 2.87 (MIN: 2.79 / MAX: 5.93), a 3.03 (MIN: 2.95 / MAX: 5.6). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): c 2.75 (MIN: 2.69 / MAX: 5.4), b 2.77 (MIN: 2.68 / MAX: 5.24), a 2.98 (MIN: 2.92 / MAX: 5.52). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better): c 2.36 (MIN: 2.29 / MAX: 5.34), b 2.37 (MIN: 2.32 / MAX: 4.75), a 2.52 (MIN: 2.47 / MAX: 4.97). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better): b 2.56 (MIN: 2.5 / MAX: 5.05), c 2.59 (MIN: 2.49 / MAX: 12.75), a 2.81 (MIN: 2.74 / MAX: 5.35). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better): c 4.01 (MIN: 3.92 / MAX: 6.51), b 4.02 (MIN: 3.93 / MAX: 6.78), a 4.33 (MIN: 4.2 / MAX: 13.97). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better): b 0.74 (MIN: 0.73 / MAX: 0.82), c 0.74 (MIN: 0.73 / MAX: 0.83), a 0.79 (MAX: 0.85). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better): c 7.44 (MIN: 7.32 / MAX: 9.07), b 7.50 (MIN: 7.37 / MAX: 9.08), a 8.76 (MIN: 8.51 / MAX: 17.3). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better): c 38.85 (MIN: 38.52 / MAX: 49.2), a 41.34 (MIN: 40.71 / MAX: 55.66), b 41.73 (MIN: 38.44 / MAX: 161). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better): c 4.83 (MIN: 4.76 / MAX: 5.35), b 4.91 (MIN: 4.82 / MAX: 10.58), a 6.51 (MIN: 6.4 / MAX: 10.78). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better): b 4.87 (MIN: 4.77 / MAX: 8.68), c 4.91 (MIN: 4.82 / MAX: 6.18), a 5.56 (MIN: 5.47 / MAX: 7.38). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better): c 12.19 (MIN: 12.08 / MAX: 13.91), a 13.96 (MIN: 13.81 / MAX: 18.95), b 15.32 (MIN: 12.54 / MAX: 83.28). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3 (ms, Fewer Is Better): b 9.89 (MIN: 9.79 / MAX: 12.12), c 9.97 (MIN: 9.83 / MAX: 13.81), a 10.45 (MIN: 10.39 / MAX: 10.87). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better): c 15.11 (MIN: 14.91 / MAX: 16.77), b 15.32 (MIN: 15.1 / MAX: 19.94), a 16.72 (MIN: 16.58 / MAX: 20.56). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better): c 6.76 (MIN: 6.63 / MAX: 8.4), b 6.96 (MIN: 6.83 / MAX: 8.76), a 7.31 (MIN: 7.2 / MAX: 11.67). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better): c 5.85 (MIN: 5.77 / MAX: 9.59), b 5.86 (MIN: 5.79 / MAX: 7.35), a 6.09 (MIN: 6.04 / MAX: 7.35). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better): b 59.62 (MIN: 58.53 / MAX: 68.86), c 59.85 (MIN: 58.6 / MAX: 81.55), a 60.18 (MIN: 58.25 / MAX: 98.64). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226, Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better): c 2.72 (MIN: 2.69 / MAX: 4.02), a 2.85 (MIN: 2.82 / MAX: 4.26), b 2.91 (MIN: 2.87 / MAX: 6.78). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Llama.cpp b4397, Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 (Tokens Per Second, More Is Better): c 7.21, b 7.21, a 7.17. (CXX) g++ options: -O3
Llama.cpp b4397, Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, More Is Better): c 40.57, a 40.55, b 40.06. (CXX) g++ options: -O3
Llama.cpp b4397, Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better): c 40.11, a 39.90, b 39.71. (CXX) g++ options: -O3
Llama.cpp b4397, Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better): c 38.74, b 38.54, a 37.65. (CXX) g++ options: -O3
Llama.cpp b4397, Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 (Tokens Per Second, More Is Better): b 7.60, a 7.60, c 7.59. (CXX) g++ options: -O3
Llama.cpp b4397, Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, More Is Better): c 40.56, b 40.20, a 39.58. (CXX) g++ options: -O3
Llama.cpp b4397, Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better): a 39.59, b 39.54, c 39.49. (CXX) g++ options: -O3
Llama.cpp b4397, Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better): b 38.59, c 38.50, a 38.27. (CXX) g++ options: -O3
Llama.cpp b4397, Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 (Tokens Per Second, More Is Better): a 53.66, c 53.55, b 53.55. (CXX) g++ options: -O3
Llama.cpp b4397, Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, More Is Better): c 170.43, a 165.04, b 158.70. (CXX) g++ options: -O3
Llama.cpp b4397, Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better): c 164.58, a 162.78, b 160.59. (CXX) g++ options: -O3
Llama.cpp b4397, Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better): b 151.48, c 149.08, a 148.11. (CXX) g++ options: -O3
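The Llama.cpp figures are throughputs, so a rough wall-clock estimate can be derived from them. The sketch below assumes that the number in each test name (128, 512, 1024, 2048) is the token count of that test and that throughput stays constant over the run; both are simplifying assumptions, and `seconds_for` is a hypothetical helper, not part of Llama.cpp or the Phoronix Test Suite.

```python
def seconds_for(tokens, tokens_per_second):
    """Estimate elapsed seconds for a token count at a constant throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


# Run c, Llama-3.1-Tulu-3-8B-Q8_0, tokens per second copied from the results.
# Keys are the assumed prompt lengths in tokens.
prompt_processing = {512: 40.57, 1024: 40.11, 2048: 38.74}
text_generation = {128: 7.21}

if __name__ == "__main__":
    for tokens, rate in prompt_processing.items():
        print(f"prompt of {tokens} tokens: about {seconds_for(tokens, rate):.1f} s")
    for tokens, rate in text_generation.items():
        print(f"generating {tokens} tokens: about {seconds_for(tokens, rate):.1f} s")
```

Under these assumptions, a 512-token prompt at 40.57 tokens per second takes roughly 12.6 s to process on this system.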
Phoronix Test Suite v10.8.5