adl ncnn llama

Intel Core i7-1280P testing with a MSI Prestige 14Evo A12M MS-14C6 (E14C6IMS.115 BIOS) and MSI Intel ADL GT2 8GB on Ubuntu 24.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2412307-NE-ADLNCNNLL88&sor.

adl ncnn llamaProcessorMotherboardChipsetMemoryDiskGraphicsAudioNetworkOSKernelDesktopDisplay ServerOpenGLCompilerFile-SystemScreen ResolutionabcIntel Core i7-1280P @ 4.70GHz (14 Cores / 20 Threads)MSI Prestige 14Evo A12M MS-14C6 (E14C6IMS.115 BIOS)Intel Alder Lake PCH8 x 2GB LPDDR4-4267MT/s SK Hynix H9HCNNNCPMMLXR-1024GB Micron_3400_MTFDKBA1T0TFHMSI Intel ADL GT2 8GBRealtek ALC274Intel Alder Lake-P PCH CNVi WiFiUbuntu 24.106.11.0-rc6-phx (x86_64)GNOME Shell 47.0X Server + Wayland4.6 Mesa 24.2.3-1ubuntu1GCC 14.2.0ext41920x1080OpenBenchmarking.orgKernel Details- Transparent Huge Pages: madviseCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x434 - Thermald 2.5.8 Security Details- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Mitigation of Clear Register File + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence; BHI: BHI_DIS_S + srbds: Not affected + tsx_async_abort: Not affected

adl ncnn llamancnn: CPU - mobilenetncnn: CPU-v2-v2 - mobilenet-v2ncnn: CPU-v3-v3 - mobilenet-v3ncnn: CPU - shufflenet-v2ncnn: CPU - mnasnetncnn: CPU - efficientnet-b0ncnn: CPU - blazefacencnn: CPU - googlenetncnn: CPU - vgg16ncnn: CPU - resnet18ncnn: CPU - alexnetncnn: CPU - resnet50ncnn: CPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3ncnn: CPU - yolov4-tinyncnn: CPU - squeezenet_ssdncnn: CPU - regnety_400mncnn: CPU - vision_transformerncnn: CPU - FastestDetncnn: Vulkan GPU - mobilenetncnn: Vulkan GPU-v2-v2 - mobilenet-v2ncnn: Vulkan GPU-v3-v3 - mobilenet-v3ncnn: Vulkan GPU - shufflenet-v2ncnn: Vulkan GPU - mnasnetncnn: Vulkan GPU - efficientnet-b0ncnn: Vulkan GPU - blazefacencnn: Vulkan GPU - googlenetncnn: Vulkan GPU - vgg16ncnn: Vulkan GPU - resnet18ncnn: Vulkan GPU - alexnetncnn: Vulkan GPU - resnet50ncnn: Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3ncnn: Vulkan GPU - yolov4-tinyncnn: Vulkan GPU - squeezenet_ssdncnn: Vulkan GPU - regnety_400mncnn: Vulkan GPU - vision_transformerncnn: Vulkan GPU - FastestDetllama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048abc17.196.096.284.736.0310.982.8212.2933.187.55.722.2317.1921.0313.1526.21150.417.8117.066.085.895.595.939.953.1312.7938.917.365.7922.5817.0620.8813.2825.72151.367.436.6215.7215.5315.0715.7615.5615.1129.3265.8261.8257.1117.385.475.85.075.5610.242.7612.0533.157.325.7720.0417.3821.2513.3624.49150.057.5317.456.185.955.716.1610.572.7412.4837.777.465.7922.6717.4521.1313.1725.83151.917.626.6115.8615.4515.036.9215.7515.5715.129.3665.661.9657.22176.035.465.755.75102.7312.7933.027.545.6718.051720.4611.9626.64150.277.5416.756.195.755.015.9510.872.712.3838.857.355.7922.5916.7520.8313.1524.97149.347.356.6415.7415.4215.066.9415.7515.5415.0928.7765.5861.7657.4OpenBenchmarking.org

NCNN

Target: CPU - Model: mobilenet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: mobilenetcab4812162017.0017.1917.38MIN: 12.69 / MAX: 26.74MIN: 13.1 / MAX: 23.64MIN: 12.67 / MAX: 23.331. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU-v2-v2 - Model: mobilenet-v2bca2468105.476.036.09MIN: 4.85 / MAX: 9.73MIN: 4.88 / MAX: 8.96MIN: 4.84 / MAX: 11.241. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU-v3-v3 - Model: mobilenet-v3cba2468105.465.806.28MIN: 4.91 / MAX: 9.06MIN: 4.86 / MAX: 10.21MIN: 4.94 / MAX: 8.771. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: shufflenet-v2

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: shufflenet-v2abc1.29382.58763.88145.17526.4694.735.075.75MIN: 4.33 / MAX: 10.8MIN: 4.29 / MAX: 8.26MIN: 4.43 / MAX: 10.321. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: mnasnet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: mnasnetbca2468105.565.756.03MIN: 4.67 / MAX: 8.73MIN: 4.63 / MAX: 8.23MIN: 4.77 / MAX: 9.741. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: efficientnet-b0

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: efficientnet-b0cba369121510.0010.2410.98MIN: 8.21 / MAX: 17.24MIN: 8.11 / MAX: 16.53MIN: 8.44 / MAX: 16.441. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: blazeface

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: blazefacecba0.63451.2691.90352.5383.17252.732.762.82MIN: 2.36 / MAX: 5.39MIN: 2.36 / MAX: 7.25MIN: 2.32 / MAX: 3.911. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: googlenet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: googlenetbac369121512.0512.2912.79MIN: 10.89 / MAX: 19.08MIN: 10.75 / MAX: 17.53MIN: 10.77 / MAX: 17.91. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vgg16

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: vgg16cba81624324033.0233.1533.18MIN: 31.26 / MAX: 37.28MIN: 31.13 / MAX: 37.11MIN: 31.47 / MAX: 37.161. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet18

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: resnet18bac2468107.327.507.54MIN: 6.63 / MAX: 11.23MIN: 6.69 / MAX: 11.06MIN: 6.56 / MAX: 11.931. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: alexnet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: alexnetcab1.29832.59663.89495.19326.49155.675.705.77MIN: 5.34 / MAX: 8.71MIN: 5.4 / MAX: 7.54MIN: 5.44 / MAX: 8.591. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet50

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: resnet50cba51015202518.0520.0422.23MIN: 16.22 / MAX: 24.29MIN: 16.75 / MAX: 27.08MIN: 17.37 / MAX: 28.931. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3cab4812162017.0017.1917.38MIN: 12.69 / MAX: 26.74MIN: 13.1 / MAX: 23.64MIN: 12.67 / MAX: 23.331. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: yolov4-tiny

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: yolov4-tinycab51015202520.4621.0321.25MIN: 17.01 / MAX: 28.28MIN: 17 / MAX: 29.43MIN: 17.16 / MAX: 30.771. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: squeezenet_ssd

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: squeezenet_ssdcab369121511.9613.1513.36MIN: 9.51 / MAX: 16.82MIN: 10.89 / MAX: 21.09MIN: 10.53 / MAX: 19.041. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: regnety_400m

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: regnety_400mbac61218243024.4926.2126.64MIN: 22.54 / MAX: 34.78MIN: 22.85 / MAX: 46.01MIN: 22.92 / MAX: 42.291. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vision_transformer

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: vision_transformerbca306090120150150.05150.27150.41MIN: 138.47 / MAX: 159.78MIN: 137.06 / MAX: 172.91MIN: 141.11 / MAX: 159.591. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: FastestDet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: CPU - Model: FastestDetbca2468107.537.547.81MIN: 6.14 / MAX: 11.34MIN: 6.05 / MAX: 10.7MIN: 6.05 / MAX: 11.951. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mobilenet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: mobilenetcab4812162016.7517.0617.45MIN: 12.86 / MAX: 26.54MIN: 12.71 / MAX: 26.89MIN: 12.83 / MAX: 26.171. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2abc2468106.086.186.19MIN: 4.89 / MAX: 10.69MIN: 4.86 / MAX: 10.78MIN: 4.91 / MAX: 9.031. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3cab1.33882.67764.01645.35526.6945.755.895.95MIN: 4.88 / MAX: 9.83MIN: 4.95 / MAX: 8.92MIN: 4.94 / MAX: 9.591. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: shufflenet-v2cab1.28482.56963.85445.13926.4245.015.595.71MIN: 4.29 / MAX: 9.21MIN: 4.42 / MAX: 9.86MIN: 4.42 / MAX: 9.881. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: mnasnetacb2468105.935.956.16MIN: 4.67 / MAX: 10.57MIN: 4.69 / MAX: 9.39MIN: 5.22 / MAX: 10.381. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: efficientnet-b0abc36912159.9510.5710.87MIN: 8.18 / MAX: 15.88MIN: 8.27 / MAX: 16.52MIN: 8.13 / MAX: 16.831. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: blazefacecba0.70431.40862.11292.81723.52152.702.743.13MIN: 2.31 / MAX: 4.54MIN: 2.35 / MAX: 3.86MIN: 2.34 / MAX: 5.151. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: googlenetcba369121512.3812.4812.79MIN: 10.67 / MAX: 17.64MIN: 11.06 / MAX: 18.7MIN: 11.27 / MAX: 20.731. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: vgg16bca91827364537.7738.8538.91MIN: 31.74 / MAX: 43.77MIN: 31.84 / MAX: 45.95MIN: 31.55 / MAX: 46.091. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: resnet18cab2468107.357.367.46MIN: 6.65 / MAX: 10.98MIN: 6.62 / MAX: 11.47MIN: 6.62 / MAX: 10.151. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: alexnetabc1.30282.60563.90845.21126.5145.795.795.79MIN: 5.45 / MAX: 7.28MIN: 5.42 / MAX: 8.41MIN: 5.44 / MAX: 8.791. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: resnet50acb51015202522.5822.5922.67MIN: 19.86 / MAX: 32.21MIN: 19.98 / MAX: 34.26MIN: 19.94 / MAX: 30.591. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3cab4812162016.7517.0617.45MIN: 12.86 / MAX: 26.54MIN: 12.71 / MAX: 26.89MIN: 12.83 / MAX: 26.171. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: yolov4-tinycab51015202520.8320.8821.13MIN: 17.17 / MAX: 29.18MIN: 17.35 / MAX: 30.42MIN: 17.32 / MAX: 28.771. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: squeezenet_ssdcba369121513.1513.1713.28MIN: 10.49 / MAX: 19.1MIN: 10.36 / MAX: 23.56MIN: 10.53 / MAX: 17.941. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: regnety_400mcab61218243024.9725.7225.83MIN: 22.76 / MAX: 36.26MIN: 23.02 / MAX: 40.14MIN: 22.88 / MAX: 39.181. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vision_transformer

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: vision_transformercab306090120150149.34151.36151.91MIN: 139.53 / MAX: 160.51MIN: 139.94 / MAX: 160.92MIN: 138.66 / MAX: 160.451. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: FastestDet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20241226Target: Vulkan GPU - Model: FastestDetcab2468107.357.437.62MIN: 6.33 / MAX: 12.18MIN: 6.25 / MAX: 12.15MIN: 6.15 / MAX: 12.111. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128cab2468106.646.626.611. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512bca4812162015.8615.7415.721. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024abc4812162015.5315.4515.421. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048acb4812162015.0715.0615.031. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128cb2468106.946.921. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512acb4812162015.7615.7515.751. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024bac4812162015.5715.5615.541. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048abc4812162015.1115.1015.091. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128bac71421283529.3629.3228.771. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512abc153045607565.8265.6065.581. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024bac142842567061.9661.8261.761. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048cba132639526557.4057.2257.111. (CXX) g++ options: -O3


Phoronix Test Suite v10.8.5