adl ncnn llama
Intel Core i7-1280P testing with an MSI Prestige 14Evo A12M MS-14C6 (E14C6IMS.115 BIOS) and MSI Intel ADL GT2 8GB on Ubuntu 24.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2412307-NE-ADLNCNNLL88 .
System details (runs a, b, c)
Processor: Intel Core i7-1280P @ 4.70GHz (14 Cores / 20 Threads)
Motherboard: MSI Prestige 14Evo A12M MS-14C6 (E14C6IMS.115 BIOS)
Chipset: Intel Alder Lake PCH
Memory: 8 x 2GB LPDDR4-4267MT/s SK Hynix H9HCNNNCPMMLXR-
Disk: 1024GB Micron_3400_MTFDKBA1T0TFH
Graphics: MSI Intel ADL GT2 8GB
Audio: Realtek ALC274
Network: Intel Alder Lake-P PCH CNVi WiFi
OS: Ubuntu 24.10
Kernel: 6.11.0-rc6-phx (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 24.2.3-1ubuntu1
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x434 - Thermald 2.5.8
Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Mitigation of Clear Register File + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence; BHI: BHI_DIS_S + srbds: Not affected + tsx_async_abort: Not affected
Results overview (ncnn: ms, fewer is better; llama-cpp: tokens per second, more is better)

Test                                                                         a        b        c
ncnn: CPU - mobilenet                                                        17.19    17.38    17
ncnn: CPU-v2-v2 - mobilenet-v2                                               6.09     5.47     6.03
ncnn: CPU-v3-v3 - mobilenet-v3                                               6.28     5.8      5.46
ncnn: CPU - shufflenet-v2                                                    4.73     5.07     5.75
ncnn: CPU - mnasnet                                                          6.03     5.56     5.75
ncnn: CPU - efficientnet-b0                                                  10.98    10.24    10
ncnn: CPU - blazeface                                                        2.82     2.76     2.73
ncnn: CPU - googlenet                                                        12.29    12.05    12.79
ncnn: CPU - vgg16                                                            33.18    33.15    33.02
ncnn: CPU - resnet18                                                         7.5      7.32     7.54
ncnn: CPU - alexnet                                                          5.7      5.77     5.67
ncnn: CPU - resnet50                                                         22.23    20.04    18.05
ncnn: CPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3                             17.19    17.38    17
ncnn: CPU - yolov4-tiny                                                      21.03    21.25    20.46
ncnn: CPU - squeezenet_ssd                                                   13.15    13.36    11.96
ncnn: CPU - regnety_400m                                                     26.21    24.49    26.64
ncnn: CPU - vision_transformer                                               150.41   150.05   150.27
ncnn: CPU - FastestDet                                                       7.81     7.53     7.54
ncnn: Vulkan GPU - mobilenet                                                 17.06    17.45    16.75
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2                                        6.08     6.18     6.19
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3                                        5.89     5.95     5.75
ncnn: Vulkan GPU - shufflenet-v2                                             5.59     5.71     5.01
ncnn: Vulkan GPU - mnasnet                                                   5.93     6.16     5.95
ncnn: Vulkan GPU - efficientnet-b0                                           9.95     10.57    10.87
ncnn: Vulkan GPU - blazeface                                                 3.13     2.74     2.7
ncnn: Vulkan GPU - googlenet                                                 12.79    12.48    12.38
ncnn: Vulkan GPU - vgg16                                                     38.91    37.77    38.85
ncnn: Vulkan GPU - resnet18                                                  7.36     7.46     7.35
ncnn: Vulkan GPU - alexnet                                                   5.79     5.79     5.79
ncnn: Vulkan GPU - resnet50                                                  22.58    22.67    22.59
ncnn: Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3                      17.06    17.45    16.75
ncnn: Vulkan GPU - yolov4-tiny                                               20.88    21.13    20.83
ncnn: Vulkan GPU - squeezenet_ssd                                            13.28    13.17    13.15
ncnn: Vulkan GPU - regnety_400m                                              25.72    25.83    24.97
ncnn: Vulkan GPU - vision_transformer                                        151.36   151.91   149.34
ncnn: Vulkan GPU - FastestDet                                                7.43     7.62     7.35
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128         6.62     6.61     6.64
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512       15.72    15.86    15.74
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024      15.53    15.45    15.42
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048      15.07    15.03    15.06
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128    -        6.92     6.94
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512  15.76    15.75    15.75
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 15.56    15.57    15.54
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 15.11    15.1     15.09
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128      29.32    29.36    28.77
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512    65.82    65.6     65.58
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024   61.82    61.96    61.76
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048   57.11    57.22    57.4
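Run-to-run variation differs a lot between tests in the overview above. As a rough illustration, the sketch below computes the relative spread, (max - min) / min, across runs a, b and c for three of the NCNN CPU results. The values are copied from the overview; the choice of metric and the helper name `relative_spread` are assumptions made for this example, not something the Phoronix Test Suite reports.

```python
# Average latencies in ms for runs a, b, c, copied from the overview above.
results_ms = {
    "CPU - resnet50": (22.23, 20.04, 18.05),
    "CPU - vgg16": (33.18, 33.15, 33.02),
    "CPU - shufflenet-v2": (4.73, 5.07, 5.75),
}


def relative_spread(values):
    """Return (max - min) / min as a percentage (hypothetical helper)."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0


spreads = {name: relative_spread(v) for name, v in results_ms.items()}

# Print the tests with the largest spread first.
for name, pct in sorted(spreads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {pct:.1f}% spread across runs")
```

With only three runs per test this is a coarse indicator, not a statistical confidence measure.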
NCNN 20241226, Target: CPU (ms, Fewer Is Better)
Build options for the results in this group: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226 - Target: CPU - Model: mobilenet
  a: 17.19 (MIN: 13.1 / MAX: 23.64)
  b: 17.38 (MIN: 12.67 / MAX: 23.33)
  c: 17.00 (MIN: 12.69 / MAX: 26.74)

NCNN 20241226 - Target: CPU-v2-v2 - Model: mobilenet-v2
  a: 6.09 (MIN: 4.84 / MAX: 11.24)
  b: 5.47 (MIN: 4.85 / MAX: 9.73)
  c: 6.03 (MIN: 4.88 / MAX: 8.96)

NCNN 20241226 - Target: CPU-v3-v3 - Model: mobilenet-v3
  a: 6.28 (MIN: 4.94 / MAX: 8.77)
  b: 5.80 (MIN: 4.86 / MAX: 10.21)
  c: 5.46 (MIN: 4.91 / MAX: 9.06)

NCNN 20241226 - Target: CPU - Model: shufflenet-v2
  a: 4.73 (MIN: 4.33 / MAX: 10.8)
  b: 5.07 (MIN: 4.29 / MAX: 8.26)
  c: 5.75 (MIN: 4.43 / MAX: 10.32)

NCNN 20241226 - Target: CPU - Model: mnasnet
  a: 6.03 (MIN: 4.77 / MAX: 9.74)
  b: 5.56 (MIN: 4.67 / MAX: 8.73)
  c: 5.75 (MIN: 4.63 / MAX: 8.23)

NCNN 20241226 - Target: CPU - Model: efficientnet-b0
  a: 10.98 (MIN: 8.44 / MAX: 16.44)
  b: 10.24 (MIN: 8.11 / MAX: 16.53)
  c: 10.00 (MIN: 8.21 / MAX: 17.24)

NCNN 20241226 - Target: CPU - Model: blazeface
  a: 2.82 (MIN: 2.32 / MAX: 3.91)
  b: 2.76 (MIN: 2.36 / MAX: 7.25)
  c: 2.73 (MIN: 2.36 / MAX: 5.39)

NCNN 20241226 - Target: CPU - Model: googlenet
  a: 12.29 (MIN: 10.75 / MAX: 17.53)
  b: 12.05 (MIN: 10.89 / MAX: 19.08)
  c: 12.79 (MIN: 10.77 / MAX: 17.9)

NCNN 20241226 - Target: CPU - Model: vgg16
  a: 33.18 (MIN: 31.47 / MAX: 37.16)
  b: 33.15 (MIN: 31.13 / MAX: 37.11)
  c: 33.02 (MIN: 31.26 / MAX: 37.28)

NCNN 20241226 - Target: CPU - Model: resnet18
  a: 7.50 (MIN: 6.69 / MAX: 11.06)
  b: 7.32 (MIN: 6.63 / MAX: 11.23)
  c: 7.54 (MIN: 6.56 / MAX: 11.93)

NCNN 20241226 - Target: CPU - Model: alexnet
  a: 5.70 (MIN: 5.4 / MAX: 7.54)
  b: 5.77 (MIN: 5.44 / MAX: 8.59)
  c: 5.67 (MIN: 5.34 / MAX: 8.71)

NCNN 20241226 - Target: CPU - Model: resnet50
  a: 22.23 (MIN: 17.37 / MAX: 28.93)
  b: 20.04 (MIN: 16.75 / MAX: 27.08)
  c: 18.05 (MIN: 16.22 / MAX: 24.29)

NCNN 20241226 - Target: CPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3
  a: 17.19 (MIN: 13.1 / MAX: 23.64)
  b: 17.38 (MIN: 12.67 / MAX: 23.33)
  c: 17.00 (MIN: 12.69 / MAX: 26.74)

NCNN 20241226 - Target: CPU - Model: yolov4-tiny
  a: 21.03 (MIN: 17 / MAX: 29.43)
  b: 21.25 (MIN: 17.16 / MAX: 30.77)
  c: 20.46 (MIN: 17.01 / MAX: 28.28)

NCNN 20241226 - Target: CPU - Model: squeezenet_ssd
  a: 13.15 (MIN: 10.89 / MAX: 21.09)
  b: 13.36 (MIN: 10.53 / MAX: 19.04)
  c: 11.96 (MIN: 9.51 / MAX: 16.82)

NCNN 20241226 - Target: CPU - Model: regnety_400m
  a: 26.21 (MIN: 22.85 / MAX: 46.01)
  b: 24.49 (MIN: 22.54 / MAX: 34.78)
  c: 26.64 (MIN: 22.92 / MAX: 42.29)

NCNN 20241226 - Target: CPU - Model: vision_transformer
  a: 150.41 (MIN: 141.11 / MAX: 159.59)
  b: 150.05 (MIN: 138.47 / MAX: 159.78)
  c: 150.27 (MIN: 137.06 / MAX: 172.91)

NCNN 20241226 - Target: CPU - Model: FastestDet
  a: 7.81 (MIN: 6.05 / MAX: 11.95)
  b: 7.53 (MIN: 6.14 / MAX: 11.34)
  c: 7.54 (MIN: 6.05 / MAX: 10.7)
NCNN 20241226, Target: Vulkan GPU (ms, Fewer Is Better)
Build options for the results in this group: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226 - Target: Vulkan GPU - Model: mobilenet
  a: 17.06 (MIN: 12.71 / MAX: 26.89)
  b: 17.45 (MIN: 12.83 / MAX: 26.17)
  c: 16.75 (MIN: 12.86 / MAX: 26.54)

NCNN 20241226 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
  a: 6.08 (MIN: 4.89 / MAX: 10.69)
  b: 6.18 (MIN: 4.86 / MAX: 10.78)
  c: 6.19 (MIN: 4.91 / MAX: 9.03)

NCNN 20241226 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
  a: 5.89 (MIN: 4.95 / MAX: 8.92)
  b: 5.95 (MIN: 4.94 / MAX: 9.59)
  c: 5.75 (MIN: 4.88 / MAX: 9.83)

NCNN 20241226 - Target: Vulkan GPU - Model: shufflenet-v2
  a: 5.59 (MIN: 4.42 / MAX: 9.86)
  b: 5.71 (MIN: 4.42 / MAX: 9.88)
  c: 5.01 (MIN: 4.29 / MAX: 9.21)

NCNN 20241226 - Target: Vulkan GPU - Model: mnasnet
  a: 5.93 (MIN: 4.67 / MAX: 10.57)
  b: 6.16 (MIN: 5.22 / MAX: 10.38)
  c: 5.95 (MIN: 4.69 / MAX: 9.39)

NCNN 20241226 - Target: Vulkan GPU - Model: efficientnet-b0
  a: 9.95 (MIN: 8.18 / MAX: 15.88)
  b: 10.57 (MIN: 8.27 / MAX: 16.52)
  c: 10.87 (MIN: 8.13 / MAX: 16.83)

NCNN 20241226 - Target: Vulkan GPU - Model: blazeface
  a: 3.13 (MIN: 2.34 / MAX: 5.15)
  b: 2.74 (MIN: 2.35 / MAX: 3.86)
  c: 2.70 (MIN: 2.31 / MAX: 4.54)

NCNN 20241226 - Target: Vulkan GPU - Model: googlenet
  a: 12.79 (MIN: 11.27 / MAX: 20.73)
  b: 12.48 (MIN: 11.06 / MAX: 18.7)
  c: 12.38 (MIN: 10.67 / MAX: 17.64)

NCNN 20241226 - Target: Vulkan GPU - Model: vgg16
  a: 38.91 (MIN: 31.55 / MAX: 46.09)
  b: 37.77 (MIN: 31.74 / MAX: 43.77)
  c: 38.85 (MIN: 31.84 / MAX: 45.95)

NCNN 20241226 - Target: Vulkan GPU - Model: resnet18
  a: 7.36 (MIN: 6.62 / MAX: 11.47)
  b: 7.46 (MIN: 6.62 / MAX: 10.15)
  c: 7.35 (MIN: 6.65 / MAX: 10.98)

NCNN 20241226 - Target: Vulkan GPU - Model: alexnet
  a: 5.79 (MIN: 5.45 / MAX: 7.28)
  b: 5.79 (MIN: 5.42 / MAX: 8.41)
  c: 5.79 (MIN: 5.44 / MAX: 8.79)

NCNN 20241226 - Target: Vulkan GPU - Model: resnet50
  a: 22.58 (MIN: 19.86 / MAX: 32.21)
  b: 22.67 (MIN: 19.94 / MAX: 30.59)
  c: 22.59 (MIN: 19.98 / MAX: 34.26)

NCNN 20241226 - Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3
  a: 17.06 (MIN: 12.71 / MAX: 26.89)
  b: 17.45 (MIN: 12.83 / MAX: 26.17)
  c: 16.75 (MIN: 12.86 / MAX: 26.54)

NCNN 20241226 - Target: Vulkan GPU - Model: yolov4-tiny
  a: 20.88 (MIN: 17.35 / MAX: 30.42)
  b: 21.13 (MIN: 17.32 / MAX: 28.77)
  c: 20.83 (MIN: 17.17 / MAX: 29.18)

NCNN 20241226 - Target: Vulkan GPU - Model: squeezenet_ssd
  a: 13.28 (MIN: 10.53 / MAX: 17.94)
  b: 13.17 (MIN: 10.36 / MAX: 23.56)
  c: 13.15 (MIN: 10.49 / MAX: 19.1)

NCNN 20241226 - Target: Vulkan GPU - Model: regnety_400m
  a: 25.72 (MIN: 23.02 / MAX: 40.14)
  b: 25.83 (MIN: 22.88 / MAX: 39.18)
  c: 24.97 (MIN: 22.76 / MAX: 36.26)

NCNN 20241226 - Target: Vulkan GPU - Model: vision_transformer
  a: 151.36 (MIN: 139.94 / MAX: 160.92)
  b: 151.91 (MIN: 138.66 / MAX: 160.45)
  c: 149.34 (MIN: 139.53 / MAX: 160.51)

NCNN 20241226 - Target: Vulkan GPU - Model: FastestDet
  a: 7.43 (MIN: 6.25 / MAX: 12.15)
  b: 7.62 (MIN: 6.15 / MAX: 12.11)
  c: 7.35 (MIN: 6.33 / MAX: 12.18)
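To compare the two NCNN targets, one option is to average the three runs per target and take the ratio of Vulkan GPU latency to CPU latency. The sketch below does this for three models using values copied from the results above. The helper name `vulkan_to_cpu_ratio` and the choice of a simple mean over runs are assumptions for illustration; the ratio describes these numbers only and says nothing about why the targets differ.

```python
from statistics import mean

# (CPU runs a, b, c), (Vulkan GPU runs a, b, c): average latency in ms,
# copied from the NCNN results above.
latency_ms = {
    "vgg16": ((33.18, 33.15, 33.02), (38.91, 37.77, 38.85)),
    "resnet50": ((22.23, 20.04, 18.05), (22.58, 22.67, 22.59)),
    "alexnet": ((5.70, 5.77, 5.67), (5.79, 5.79, 5.79)),
}


def vulkan_to_cpu_ratio(cpu_runs, vulkan_runs):
    """Ratio of mean Vulkan GPU latency to mean CPU latency (hypothetical helper).

    A value above 1.0 means the Vulkan GPU target took longer on average.
    """
    return mean(vulkan_runs) / mean(cpu_runs)


ratios = {
    model: vulkan_to_cpu_ratio(cpu, vulkan)
    for model, (cpu, vulkan) in latency_ms.items()
}

for model, ratio in ratios.items():
    print(f"{model}: Vulkan GPU / CPU = {ratio:.2f}")
```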
Llama.cpp b4397, Backend: CPU BLAS (Tokens Per Second, More Is Better)
Build options for the results in this group: (CXX) g++ options: -O3

Llama.cpp b4397 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128
  a: 6.62
  b: 6.61
  c: 6.64

Llama.cpp b4397 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512
  a: 15.72
  b: 15.86
  c: 15.74

Llama.cpp b4397 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024
  a: 15.53
  b: 15.45
  c: 15.42

Llama.cpp b4397 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048
  a: 15.07
  b: 15.03
  c: 15.06

Llama.cpp b4397 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128
  b: 6.92
  c: 6.94

Llama.cpp b4397 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512
  a: 15.76
  b: 15.75
  c: 15.75

Llama.cpp b4397 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024
  a: 15.56
  b: 15.57
  c: 15.54

Llama.cpp b4397 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048
  a: 15.11
  b: 15.10
  c: 15.09

Llama.cpp b4397 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128
  a: 29.32
  b: 29.36
  c: 28.77

Llama.cpp b4397 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512
  a: 65.82
  b: 65.60
  c: 65.58

Llama.cpp b4397 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024
  a: 61.82
  b: 61.96
  c: 61.76

Llama.cpp b4397 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048
  a: 57.11
  b: 57.22
  c: 57.40
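Tokens-per-second figures are easier to judge as wall-clock time. The sketch below converts run a's throughput for two of the models above into the approximate seconds needed to process a 512-token prompt and then generate 128 tokens. It assumes throughput stays constant over the whole run and ignores model load time; `seconds_for` is a hypothetical helper, not part of llama.cpp or the Phoronix Test Suite.

```python
# Run "a" throughput in tokens per second, copied from the results above.
throughput = {
    "Llama-3.1-Tulu-3-8B-Q8_0": {"pp512": 15.72, "tg128": 6.62},
    "granite-3.0-3b-a800m-instruct-Q8_0": {"pp512": 65.82, "tg128": 29.32},
}


def seconds_for(tokens, tokens_per_second):
    """Time to handle `tokens` at a constant rate (hypothetical helper)."""
    return tokens / tokens_per_second


timings = {}
for model, rates in throughput.items():
    prompt_s = seconds_for(512, rates["pp512"])    # prompt processing
    generate_s = seconds_for(128, rates["tg128"])  # text generation
    timings[model] = (prompt_s, generate_s)
    print(f"{model}: prompt {prompt_s:.1f} s + generation {generate_s:.1f} s "
          f"= {prompt_s + generate_s:.1f} s")
```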
Phoronix Test Suite v10.8.5