ncnn llama

Intel Core Ultra 7 256V testing with an ASUS Zenbook S 14 UX5406SA_UX5406SA UX5406SA v1.0 (UX5406SA.300 BIOS) and ASUS Intel LNL 7GB on Ubuntu 24.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2412296-NE-NCNNLLAMA15&sor.

System configuration (runs a, b, c, d)

Processor: Intel Core Ultra 7 256V @ 4.70GHz (8 Cores)
Motherboard: ASUS Zenbook S 14 UX5406SA_UX5406SA UX5406SA v1.0 (UX5406SA.300 BIOS)
Chipset: Intel Device a87f
Memory: 8 x 2GB LPDDR5-8533MT/s Samsung
Disk: 1024GB Western Digital WD PC SN560 SDDPNQE-1T00-1102
Graphics: ASUS Intel LNL 7GB
Audio: Intel Lunar Lake-M HD Audio
Network: Intel Device a840
OS: Ubuntu 24.10
Kernel: 6.12.0-rc6-phx-drm-next (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 25.0~git2411250600.45c523~oibaf~o (git-45c5231 2024-11-25 oracular-oibaf-pp
OpenCL: OpenCL 3.0
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 2880x1800

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate powersave (EPP: performance)
- Platform Profile: performance
- CPU Microcode: 0x114
- Thermald 2.5.8
- ACPI Profile: performance

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- reg_file_data_sampling: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

Result overview

ncnn results are in ms (fewer is better); llama-cpp results are in tokens per second (more is better).

     a       b       c       d  Test
 14.14   13.59   14.40   14.29  ncnn: CPU - mobilenet
  7.01    5.00    4.90    4.98  ncnn: CPU-v2-v2 - mobilenet-v2
  6.13    4.50    4.43    4.53  ncnn: CPU-v3-v3 - mobilenet-v3
  4.82    3.65    3.64    3.65  ncnn: CPU - shufflenet-v2
  6.33    4.48    4.49    4.52  ncnn: CPU - mnasnet
 11.16    7.99    7.84    7.95  ncnn: CPU - efficientnet-b0
  2.37    1.84    1.88    1.94  ncnn: CPU - blazeface
 16.62   11.21   11.26   11.36  ncnn: CPU - googlenet
 48.72   44.40   44.85   44.32  ncnn: CPU - vgg16
 11.36    7.51    7.59    7.62  ncnn: CPU - resnet18
  8.39    5.86    6.12    6.06  ncnn: CPU - alexnet
 30.49   18.96   19.25   19.20  ncnn: CPU - resnet50
 14.14   13.59   14.40   14.29  ncnn: CPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3
 17.70   17.62   17.54   17.93  ncnn: CPU - yolov4-tiny
 15.17    9.55    9.66    9.63  ncnn: CPU - squeezenet_ssd
 21.56   16.51   16.90   16.34  ncnn: CPU - regnety_400m
199.02  199.84  197.49  199.30  ncnn: CPU - vision_transformer
  4.93    4.69    4.68    4.94  ncnn: CPU - FastestDet
 13.99   14.35   13.88   13.63  ncnn: Vulkan GPU - mobilenet
  4.75    4.87    5.00    4.96  ncnn: Vulkan GPU-v2-v2 - mobilenet-v2
  4.51    4.39    4.53    4.49  ncnn: Vulkan GPU-v3-v3 - mobilenet-v3
  3.65    3.66    3.65    3.66  ncnn: Vulkan GPU - shufflenet-v2
  4.47    4.37    4.40    4.42  ncnn: Vulkan GPU - mnasnet
  7.92    7.99    7.87    7.98  ncnn: Vulkan GPU - efficientnet-b0
  1.92    1.92    1.92    1.91  ncnn: Vulkan GPU - blazeface
 11.25   11.28   11.27   11.25  ncnn: Vulkan GPU - googlenet
 44.95   44.61   44.65   45.04  ncnn: Vulkan GPU - vgg16
  7.56    7.65    7.55    7.56  ncnn: Vulkan GPU - resnet18
  5.91    6.17    5.91    5.94  ncnn: Vulkan GPU - alexnet
 19.10   19.26   19.06   19.06  ncnn: Vulkan GPU - resnet50
 13.99   14.35   13.88   13.63  ncnn: Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3
 17.64   17.74   17.51   17.84  ncnn: Vulkan GPU - yolov4-tiny
  9.57    9.77    9.66    9.70  ncnn: Vulkan GPU - squeezenet_ssd
 16.59   16.73   16.39   16.25  ncnn: Vulkan GPU - regnety_400m
196.62  200.12  202.37  198.73  ncnn: Vulkan GPU - vision_transformer
  4.82    4.95    4.81    4.63  ncnn: Vulkan GPU - FastestDet
  8.87    8.85    8.85    8.83  llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128
 28.77   28.61   28.43   28.51  llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512
 28.09   27.93   27.78   26.87  llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024
 27.16   27.09   27.15   27.01  llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048
  9.24    9.22    9.20    9.21  llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128
 28.70   28.30   28.54   28.50  llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512
 28.00   27.90   26.25   27.81  llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024
 27.11   27.22   27.19   27.26  llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048
 38.35   38.35   38.43   37.64  llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128
 61.73   61.34   60.85   62.31  llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512
 60.83   61.49   54.08   54.39  llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024
 58.73   61.40   56.79   57.28  llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048
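The spread between runs can be quantified with a short script. The following is only an illustrative sketch and not part of the Phoronix Test Suite output: the helper name slowdown_vs_best and the choice of three CPU tests are assumptions made here, while the numbers are the run a, b, c, d results reported for those tests in this file.

```python
def slowdown_vs_best(results, lower_is_better=True):
    """Return each run's percentage gap to the best run (hypothetical helper)."""
    best = min(results.values()) if lower_is_better else max(results.values())
    gaps = {}
    for run, value in results.items():
        # For latency (ms) a larger value is worse; for throughput a smaller one is.
        ratio = value / best if lower_is_better else best / value
        gaps[run] = round((ratio - 1.0) * 100.0, 1)
    return gaps


# NCNN CPU results in ms (fewer is better), copied from this result file.
cpu_ms = {
    "resnet50": {"a": 30.49, "b": 18.96, "c": 19.25, "d": 19.20},
    "googlenet": {"a": 16.62, "b": 11.21, "c": 11.26, "d": 11.36},
    "vision_transformer": {"a": 199.02, "b": 199.84, "c": 197.49, "d": 199.30},
}

for model, runs in cpu_ms.items():
    print(model, slowdown_vs_best(runs))
```

For the Llama.cpp results, which are tokens per second, pass lower_is_better=False so that the highest value is treated as the best run.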

NCNN

Target: CPU - Model: mobilenet

NCNN 20241226, ms, Fewer Is Better
b: 13.59 (MIN: 12.35 / MAX: 16.17)
a: 14.14 (MIN: 12.81 / MAX: 19.39)
d: 14.29 (MIN: 13.43 / MAX: 15.38)
c: 14.40 (MIN: 13.42 / MAX: 16.32)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20241226, ms, Fewer Is Better
c: 4.90 (MIN: 4.22 / MAX: 7.14)
d: 4.98 (MIN: 4.61 / MAX: 7.09)
b: 5.00 (MIN: 4.79 / MAX: 6.66)
a: 7.01 (MIN: 6.26 / MAX: 8.27)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20241226, ms, Fewer Is Better
c: 4.43 (MIN: 4.15 / MAX: 6.44)
b: 4.50 (MIN: 4.3 / MAX: 7.07)
d: 4.53 (MIN: 4.32 / MAX: 6.54)
a: 6.13 (MIN: 5.69 / MAX: 8.86)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20241226, ms, Fewer Is Better
c: 3.64 (MIN: 3.49 / MAX: 4.97)
b: 3.65 (MIN: 3.46 / MAX: 5.16)
d: 3.65 (MIN: 3.6 / MAX: 5.03)
a: 4.82 (MIN: 4.59 / MAX: 5.02)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20241226, ms, Fewer Is Better
b: 4.48 (MIN: 4.18 / MAX: 4.99)
c: 4.49 (MIN: 4.38 / MAX: 5.08)
d: 4.52 (MIN: 4.29 / MAX: 7.54)
a: 6.33 (MIN: 5.92 / MAX: 7.4)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20241226, ms, Fewer Is Better
c: 7.84 (MIN: 7.34 / MAX: 9.39)
d: 7.95 (MIN: 7.4 / MAX: 8.36)
b: 7.99 (MIN: 7.54 / MAX: 9.91)
a: 11.16 (MIN: 10.59 / MAX: 11.76)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: blazeface

NCNN 20241226, ms, Fewer Is Better
b: 1.84 (MIN: 1.74 / MAX: 1.91)
c: 1.88 (MIN: 1.82 / MAX: 1.99)
d: 1.94 (MIN: 1.9 / MAX: 1.99)
a: 2.37 (MIN: 2.23 / MAX: 2.49)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: googlenet

NCNN 20241226, ms, Fewer Is Better
b: 11.21 (MIN: 10.76 / MAX: 11.55)
c: 11.26 (MIN: 10.66 / MAX: 13.28)
d: 11.36 (MIN: 11.11 / MAX: 11.76)
a: 16.62 (MIN: 15.75 / MAX: 17.72)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vgg16

NCNN 20241226, ms, Fewer Is Better
d: 44.32 (MIN: 41.83 / MAX: 45.85)
b: 44.40 (MIN: 41.87 / MAX: 45.98)
c: 44.85 (MIN: 42.51 / MAX: 46.34)
a: 48.72 (MIN: 47.5 / MAX: 50.51)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet18

NCNN 20241226, ms, Fewer Is Better
b: 7.51 (MIN: 7.26 / MAX: 7.95)
c: 7.59 (MIN: 7.4 / MAX: 7.96)
d: 7.62 (MIN: 7.39 / MAX: 8.42)
a: 11.36 (MIN: 10.53 / MAX: 12.29)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: alexnet

NCNN 20241226, ms, Fewer Is Better
b: 5.86 (MIN: 5.71 / MAX: 6.23)
d: 6.06 (MIN: 5.91 / MAX: 7.6)
c: 6.12 (MIN: 5.95 / MAX: 6.36)
a: 8.39 (MIN: 7.52 / MAX: 9.64)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet50

NCNN 20241226, ms, Fewer Is Better
b: 18.96 (MIN: 18.53 / MAX: 19.8)
d: 19.20 (MIN: 18.8 / MAX: 19.92)
c: 19.25 (MIN: 18.8 / MAX: 21.77)
a: 30.49 (MIN: 28.71 / MAX: 34.17)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3

NCNN 20241226, ms, Fewer Is Better
b: 13.59 (MIN: 12.35 / MAX: 16.17)
a: 14.14 (MIN: 12.81 / MAX: 19.39)
d: 14.29 (MIN: 13.43 / MAX: 15.38)
c: 14.40 (MIN: 13.42 / MAX: 16.32)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20241226, ms, Fewer Is Better
c: 17.54 (MIN: 16.19 / MAX: 18.43)
b: 17.62 (MIN: 16.86 / MAX: 18.72)
a: 17.70 (MIN: 16.34 / MAX: 19.92)
d: 17.93 (MIN: 16.96 / MAX: 19.34)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: squeezenet_ssd

NCNN 20241226, ms, Fewer Is Better
b: 9.55 (MIN: 9.04 / MAX: 10.3)
d: 9.63 (MIN: 9.13 / MAX: 10.02)
c: 9.66 (MIN: 9.09 / MAX: 10.11)
a: 15.17 (MIN: 13.87 / MAX: 20.8)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: regnety_400m

NCNN 20241226, ms, Fewer Is Better
d: 16.34 (MIN: 15.56 / MAX: 21.35)
b: 16.51 (MIN: 15.62 / MAX: 18.22)
c: 16.90 (MIN: 15.72 / MAX: 18)
a: 21.56 (MIN: 20.87 / MAX: 28.32)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vision_transformer

NCNN 20241226, ms, Fewer Is Better
c: 197.49 (MIN: 193.52 / MAX: 203.23)
a: 199.02 (MIN: 194.44 / MAX: 203.76)
d: 199.30 (MIN: 195.1 / MAX: 203.24)
b: 199.84 (MIN: 194.4 / MAX: 204.25)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: FastestDet

NCNN 20241226, ms, Fewer Is Better
c: 4.68 (MIN: 4.45 / MAX: 5.02)
b: 4.69 (MIN: 4.36 / MAX: 4.91)
a: 4.93 (MIN: 4.85 / MAX: 5.13)
d: 4.94 (MIN: 4.53 / MAX: 5.07)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20241226, ms, Fewer Is Better
d: 13.63 (MIN: 12.41 / MAX: 15.26)
c: 13.88 (MIN: 12.41 / MAX: 15.7)
a: 13.99 (MIN: 12.4 / MAX: 16.02)
b: 14.35 (MIN: 13.46 / MAX: 16.3)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

NCNN 20241226, ms, Fewer Is Better
a: 4.75 (MIN: 4.23 / MAX: 5.52)
b: 4.87 (MIN: 4.22 / MAX: 7.49)
d: 4.96 (MIN: 4.56 / MAX: 7.55)
c: 5.00 (MIN: 4.26 / MAX: 7.25)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

NCNN 20241226, ms, Fewer Is Better
b: 4.39 (MIN: 4.16 / MAX: 6.09)
d: 4.49 (MIN: 4.21 / MAX: 5.78)
a: 4.51 (MIN: 4.25 / MAX: 4.9)
c: 4.53 (MIN: 4.3 / MAX: 7.77)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20241226, ms, Fewer Is Better
a: 3.65 (MIN: 3.6 / MAX: 3.72)
c: 3.65 (MIN: 3.48 / MAX: 5.38)
b: 3.66 (MIN: 3.58 / MAX: 6.33)
d: 3.66 (MIN: 3.59 / MAX: 5.53)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20241226, ms, Fewer Is Better
b: 4.37 (MIN: 4.19 / MAX: 4.99)
c: 4.40 (MIN: 4.22 / MAX: 4.98)
d: 4.42 (MIN: 4.23 / MAX: 4.94)
a: 4.47 (MIN: 4.22 / MAX: 5.04)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20241226, ms, Fewer Is Better
c: 7.87 (MIN: 7.36 / MAX: 9.34)
a: 7.92 (MIN: 7.56 / MAX: 8.16)
d: 7.98 (MIN: 7.68 / MAX: 8.29)
b: 7.99 (MIN: 7.88 / MAX: 8.15)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20241226, ms, Fewer Is Better
d: 1.91 (MIN: 1.84 / MAX: 1.97)
a: 1.92 (MIN: 1.87 / MAX: 1.98)
b: 1.92 (MIN: 1.83 / MAX: 2.03)
c: 1.92 (MIN: 1.84 / MAX: 1.98)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20241226, ms, Fewer Is Better
a: 11.25 (MIN: 10.92 / MAX: 12.23)
d: 11.25 (MIN: 10.97 / MAX: 12.35)
c: 11.27 (MIN: 10.72 / MAX: 14.11)
b: 11.28 (MIN: 10.66 / MAX: 11.92)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20241226, ms, Fewer Is Better
b: 44.61 (MIN: 42.3 / MAX: 46.22)
c: 44.65 (MIN: 41.13 / MAX: 46.45)
a: 44.95 (MIN: 42.32 / MAX: 46.93)
d: 45.04 (MIN: 43.03 / MAX: 46.43)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20241226, ms, Fewer Is Better
c: 7.55 (MIN: 7.3 / MAX: 9.6)
a: 7.56 (MIN: 7.29 / MAX: 7.92)
d: 7.56 (MIN: 7.28 / MAX: 7.94)
b: 7.65 (MIN: 7.42 / MAX: 7.91)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20241226, ms, Fewer Is Better
a: 5.91 (MIN: 5.73 / MAX: 6.22)
c: 5.91 (MIN: 5.74 / MAX: 6.27)
d: 5.94 (MIN: 5.75 / MAX: 6.23)
b: 6.17 (MIN: 5.98 / MAX: 6.59)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20241226, ms, Fewer Is Better
c: 19.06 (MIN: 18.63 / MAX: 19.69)
d: 19.06 (MIN: 18.66 / MAX: 19.88)
a: 19.10 (MIN: 18.54 / MAX: 21.48)
b: 19.26 (MIN: 18.87 / MAX: 20.2)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3

NCNN 20241226, ms, Fewer Is Better
d: 13.63 (MIN: 12.41 / MAX: 15.26)
c: 13.88 (MIN: 12.41 / MAX: 15.7)
a: 13.99 (MIN: 12.4 / MAX: 16.02)
b: 14.35 (MIN: 13.46 / MAX: 16.3)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20241226, ms, Fewer Is Better
c: 17.51 (MIN: 16.35 / MAX: 19.44)
a: 17.64 (MIN: 16.96 / MAX: 18.52)
b: 17.74 (MIN: 16.9 / MAX: 19.03)
d: 17.84 (MIN: 16.99 / MAX: 19.52)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20241226, ms, Fewer Is Better
a: 9.57 (MIN: 9.13 / MAX: 11.39)
c: 9.66 (MIN: 8.92 / MAX: 10.07)
d: 9.70 (MIN: 9.2 / MAX: 10.59)
b: 9.77 (MIN: 9.06 / MAX: 10.17)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20241226, ms, Fewer Is Better
d: 16.25 (MIN: 15.65 / MAX: 17.82)
c: 16.39 (MIN: 15.67 / MAX: 19.65)
a: 16.59 (MIN: 15.66 / MAX: 19.68)
b: 16.73 (MIN: 15.69 / MAX: 21)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vision_transformer

NCNN 20241226, ms, Fewer Is Better
a: 196.62 (MIN: 192.65 / MAX: 200.94)
d: 198.73 (MIN: 194.73 / MAX: 203.13)
b: 200.12 (MIN: 194.52 / MAX: 204.7)
c: 202.37 (MIN: 195.84 / MAX: 206.71)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: FastestDet

NCNN 20241226, ms, Fewer Is Better
d: 4.63 (MIN: 4.33 / MAX: 4.89)
c: 4.81 (MIN: 4.69 / MAX: 5.02)
a: 4.82 (MIN: 4.71 / MAX: 5.03)
b: 4.95 (MIN: 4.86 / MAX: 5.31)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128

Llama.cpp b4397, Tokens Per Second, More Is Better
a: 8.87
c: 8.85
b: 8.85
d: 8.83
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4397, Tokens Per Second, More Is Better
a: 28.77
b: 28.61
d: 28.51
c: 28.43
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4397, Tokens Per Second, More Is Better
a: 28.09
b: 27.93
c: 27.78
d: 26.87
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4397, Tokens Per Second, More Is Better
a: 27.16
c: 27.15
b: 27.09
d: 27.01
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128

Llama.cpp b4397, Tokens Per Second, More Is Better
a: 9.24
b: 9.22
d: 9.21
c: 9.20
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4397, Tokens Per Second, More Is Better
a: 28.70
c: 28.54
d: 28.50
b: 28.30
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4397, Tokens Per Second, More Is Better
a: 28.00
b: 27.90
d: 27.81
c: 26.25
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4397, Tokens Per Second, More Is Better
d: 27.26
b: 27.22
c: 27.19
a: 27.11
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128

Llama.cpp b4397, Tokens Per Second, More Is Better
c: 38.43
b: 38.35
a: 38.35
d: 37.64
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4397, Tokens Per Second, More Is Better
d: 62.31
a: 61.73
b: 61.34
c: 60.85
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4397, Tokens Per Second, More Is Better
b: 61.49
a: 60.83
d: 54.39
c: 54.08
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4397, Tokens Per Second, More Is Better
b: 61.40
a: 58.73
d: 57.28
c: 56.79
(CXX) g++ options: -O3
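A tokens-per-second figure can be turned into an approximate wall-clock time by dividing the token count by the rate. The sketch below is an illustration only: it assumes the number in each test name (128, 512, 1024, 2048) is the token count of that test, as in llama.cpp's llama-bench naming, and it ignores model load time. The helper name seconds_for is hypothetical; the rates are run a results for Llama-3.1-Tulu-3-8B-Q8_0 from this file.

```python
def seconds_for(tokens, tokens_per_second):
    """Approximate time in seconds to process `tokens` at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


# (test, token count) -> tokens per second, run a, Llama-3.1-Tulu-3-8B-Q8_0.
run_a = {
    ("Prompt Processing", 512): 28.77,
    ("Prompt Processing", 2048): 27.16,
    ("Text Generation", 128): 8.87,
}

for (test, tokens), rate in run_a.items():
    print(f"{test} {tokens}: about {seconds_for(tokens, rate):.1f} s")
```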


Phoronix Test Suite v10.8.5