ncnn llama

Intel Core Ultra 7 256V testing with an ASUS Zenbook S 14 UX5406SA_UX5406SA UX5406SA v1.0 (UX5406SA.300 BIOS) and ASUS Intel LNL 7GB on Ubuntu 24.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2412296-NE-NCNNLLAMA15.

System configuration (runs a, b, c, d):

Processor: Intel Core Ultra 7 256V @ 4.70GHz (8 Cores)
Motherboard: ASUS Zenbook S 14 UX5406SA_UX5406SA UX5406SA v1.0 (UX5406SA.300 BIOS)
Chipset: Intel Device a87f
Memory: 8 x 2GB LPDDR5-8533MT/s Samsung
Disk: 1024GB Western Digital WD PC SN560 SDDPNQE-1T00-1102
Graphics: ASUS Intel LNL 7GB
Audio: Intel Lunar Lake-M HD Audio
Network: Intel Device a840
OS: Ubuntu 24.10
Kernel: 6.12.0-rc6-phx-drm-next (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 25.0~git2411250600.45c523~oibaf~o (git-45c5231 2024-11-25 oracular-oibaf-pp
OpenCL: OpenCL 3.0
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 2880x1800

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate powersave (EPP: performance)
- Platform Profile: performance
- CPU Microcode: 0x114
- Thermald 2.5.8
- ACPI Profile: performance

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- reg_file_data_sampling: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

Result overview (ncnn in ms, fewer is better; llama-cpp in tokens per second, more is better):

Test | a | b | c | d
ncnn: CPU - mobilenet | 14.14 | 13.59 | 14.40 | 14.29
ncnn: CPU-v2-v2 - mobilenet-v2 | 7.01 | 5.00 | 4.90 | 4.98
ncnn: CPU-v3-v3 - mobilenet-v3 | 6.13 | 4.50 | 4.43 | 4.53
ncnn: CPU - shufflenet-v2 | 4.82 | 3.65 | 3.64 | 3.65
ncnn: CPU - mnasnet | 6.33 | 4.48 | 4.49 | 4.52
ncnn: CPU - efficientnet-b0 | 11.16 | 7.99 | 7.84 | 7.95
ncnn: CPU - blazeface | 2.37 | 1.84 | 1.88 | 1.94
ncnn: CPU - googlenet | 16.62 | 11.21 | 11.26 | 11.36
ncnn: CPU - vgg16 | 48.72 | 44.40 | 44.85 | 44.32
ncnn: CPU - resnet18 | 11.36 | 7.51 | 7.59 | 7.62
ncnn: CPU - alexnet | 8.39 | 5.86 | 6.12 | 6.06
ncnn: CPU - resnet50 | 30.49 | 18.96 | 19.25 | 19.20
ncnn: CPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3 | 14.14 | 13.59 | 14.40 | 14.29
ncnn: CPU - yolov4-tiny | 17.70 | 17.62 | 17.54 | 17.93
ncnn: CPU - squeezenet_ssd | 15.17 | 9.55 | 9.66 | 9.63
ncnn: CPU - regnety_400m | 21.56 | 16.51 | 16.90 | 16.34
ncnn: CPU - vision_transformer | 199.02 | 199.84 | 197.49 | 199.30
ncnn: CPU - FastestDet | 4.93 | 4.69 | 4.68 | 4.94
ncnn: Vulkan GPU - mobilenet | 13.99 | 14.35 | 13.88 | 13.63
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 | 4.75 | 4.87 | 5.00 | 4.96
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 | 4.51 | 4.39 | 4.53 | 4.49
ncnn: Vulkan GPU - shufflenet-v2 | 3.65 | 3.66 | 3.65 | 3.66
ncnn: Vulkan GPU - mnasnet | 4.47 | 4.37 | 4.40 | 4.42
ncnn: Vulkan GPU - efficientnet-b0 | 7.92 | 7.99 | 7.87 | 7.98
ncnn: Vulkan GPU - blazeface | 1.92 | 1.92 | 1.92 | 1.91
ncnn: Vulkan GPU - googlenet | 11.25 | 11.28 | 11.27 | 11.25
ncnn: Vulkan GPU - vgg16 | 44.95 | 44.61 | 44.65 | 45.04
ncnn: Vulkan GPU - resnet18 | 7.56 | 7.65 | 7.55 | 7.56
ncnn: Vulkan GPU - alexnet | 5.91 | 6.17 | 5.91 | 5.94
ncnn: Vulkan GPU - resnet50 | 19.10 | 19.26 | 19.06 | 19.06
ncnn: Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3 | 13.99 | 14.35 | 13.88 | 13.63
ncnn: Vulkan GPU - yolov4-tiny | 17.64 | 17.74 | 17.51 | 17.84
ncnn: Vulkan GPU - squeezenet_ssd | 9.57 | 9.77 | 9.66 | 9.70
ncnn: Vulkan GPU - regnety_400m | 16.59 | 16.73 | 16.39 | 16.25
ncnn: Vulkan GPU - vision_transformer | 196.62 | 200.12 | 202.37 | 198.73
ncnn: Vulkan GPU - FastestDet | 4.82 | 4.95 | 4.81 | 4.63
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128 | 8.87 | 8.85 | 8.85 | 8.83
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512 | 28.77 | 28.61 | 28.43 | 28.51
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 | 28.09 | 27.93 | 27.78 | 26.87
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048 | 27.16 | 27.09 | 27.15 | 27.01
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 | 9.24 | 9.22 | 9.20 | 9.21
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512 | 28.70 | 28.30 | 28.54 | 28.50
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 | 28.00 | 27.90 | 26.25 | 27.81
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 | 27.11 | 27.22 | 27.19 | 27.26
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 | 38.35 | 38.35 | 38.43 | 37.64
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512 | 61.73 | 61.34 | 60.85 | 62.31
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024 | 60.83 | 61.49 | 54.08 | 54.39
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048 | 58.73 | 61.40 | 56.79 | 57.28

NCNN

Target: CPU - Model: mobilenet

NCNN 20241226 (ms, Fewer Is Better)
a: 14.14 (MIN: 12.81 / MAX: 19.39)
b: 13.59 (MIN: 12.35 / MAX: 16.17)
c: 14.40 (MIN: 13.42 / MAX: 16.32)
d: 14.29 (MIN: 13.43 / MAX: 15.38)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20241226 (ms, Fewer Is Better)
a: 7.01 (MIN: 6.26 / MAX: 8.27)
b: 5.00 (MIN: 4.79 / MAX: 6.66)
c: 4.90 (MIN: 4.22 / MAX: 7.14)
d: 4.98 (MIN: 4.61 / MAX: 7.09)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20241226 (ms, Fewer Is Better)
a: 6.13 (MIN: 5.69 / MAX: 8.86)
b: 4.50 (MIN: 4.3 / MAX: 7.07)
c: 4.43 (MIN: 4.15 / MAX: 6.44)
d: 4.53 (MIN: 4.32 / MAX: 6.54)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20241226 (ms, Fewer Is Better)
a: 4.82 (MIN: 4.59 / MAX: 5.02)
b: 3.65 (MIN: 3.46 / MAX: 5.16)
c: 3.64 (MIN: 3.49 / MAX: 4.97)
d: 3.65 (MIN: 3.6 / MAX: 5.03)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20241226 (ms, Fewer Is Better)
a: 6.33 (MIN: 5.92 / MAX: 7.4)
b: 4.48 (MIN: 4.18 / MAX: 4.99)
c: 4.49 (MIN: 4.38 / MAX: 5.08)
d: 4.52 (MIN: 4.29 / MAX: 7.54)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20241226 (ms, Fewer Is Better)
a: 11.16 (MIN: 10.59 / MAX: 11.76)
b: 7.99 (MIN: 7.54 / MAX: 9.91)
c: 7.84 (MIN: 7.34 / MAX: 9.39)
d: 7.95 (MIN: 7.4 / MAX: 8.36)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: blazeface

NCNN 20241226 (ms, Fewer Is Better)
a: 2.37 (MIN: 2.23 / MAX: 2.49)
b: 1.84 (MIN: 1.74 / MAX: 1.91)
c: 1.88 (MIN: 1.82 / MAX: 1.99)
d: 1.94 (MIN: 1.9 / MAX: 1.99)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: googlenet

NCNN 20241226 (ms, Fewer Is Better)
a: 16.62 (MIN: 15.75 / MAX: 17.72)
b: 11.21 (MIN: 10.76 / MAX: 11.55)
c: 11.26 (MIN: 10.66 / MAX: 13.28)
d: 11.36 (MIN: 11.11 / MAX: 11.76)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vgg16

NCNN 20241226 (ms, Fewer Is Better)
a: 48.72 (MIN: 47.5 / MAX: 50.51)
b: 44.40 (MIN: 41.87 / MAX: 45.98)
c: 44.85 (MIN: 42.51 / MAX: 46.34)
d: 44.32 (MIN: 41.83 / MAX: 45.85)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet18

NCNN 20241226 (ms, Fewer Is Better)
a: 11.36 (MIN: 10.53 / MAX: 12.29)
b: 7.51 (MIN: 7.26 / MAX: 7.95)
c: 7.59 (MIN: 7.4 / MAX: 7.96)
d: 7.62 (MIN: 7.39 / MAX: 8.42)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: alexnet

NCNN 20241226 (ms, Fewer Is Better)
a: 8.39 (MIN: 7.52 / MAX: 9.64)
b: 5.86 (MIN: 5.71 / MAX: 6.23)
c: 6.12 (MIN: 5.95 / MAX: 6.36)
d: 6.06 (MIN: 5.91 / MAX: 7.6)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet50

NCNN 20241226 (ms, Fewer Is Better)
a: 30.49 (MIN: 28.71 / MAX: 34.17)
b: 18.96 (MIN: 18.53 / MAX: 19.8)
c: 19.25 (MIN: 18.8 / MAX: 21.77)
d: 19.20 (MIN: 18.8 / MAX: 19.92)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3

NCNN 20241226 (ms, Fewer Is Better)
a: 14.14 (MIN: 12.81 / MAX: 19.39)
b: 13.59 (MIN: 12.35 / MAX: 16.17)
c: 14.40 (MIN: 13.42 / MAX: 16.32)
d: 14.29 (MIN: 13.43 / MAX: 15.38)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20241226 (ms, Fewer Is Better)
a: 17.70 (MIN: 16.34 / MAX: 19.92)
b: 17.62 (MIN: 16.86 / MAX: 18.72)
c: 17.54 (MIN: 16.19 / MAX: 18.43)
d: 17.93 (MIN: 16.96 / MAX: 19.34)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: squeezenet_ssd

NCNN 20241226 (ms, Fewer Is Better)
a: 15.17 (MIN: 13.87 / MAX: 20.8)
b: 9.55 (MIN: 9.04 / MAX: 10.3)
c: 9.66 (MIN: 9.09 / MAX: 10.11)
d: 9.63 (MIN: 9.13 / MAX: 10.02)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: regnety_400m

NCNN 20241226 (ms, Fewer Is Better)
a: 21.56 (MIN: 20.87 / MAX: 28.32)
b: 16.51 (MIN: 15.62 / MAX: 18.22)
c: 16.90 (MIN: 15.72 / MAX: 18)
d: 16.34 (MIN: 15.56 / MAX: 21.35)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vision_transformer

NCNN 20241226 (ms, Fewer Is Better)
a: 199.02 (MIN: 194.44 / MAX: 203.76)
b: 199.84 (MIN: 194.4 / MAX: 204.25)
c: 197.49 (MIN: 193.52 / MAX: 203.23)
d: 199.30 (MIN: 195.1 / MAX: 203.24)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: FastestDet

NCNN 20241226 (ms, Fewer Is Better)
a: 4.93 (MIN: 4.85 / MAX: 5.13)
b: 4.69 (MIN: 4.36 / MAX: 4.91)
c: 4.68 (MIN: 4.45 / MAX: 5.02)
d: 4.94 (MIN: 4.53 / MAX: 5.07)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
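
As a reading aid for the CPU results above, the relative difference between runs can be worked out directly from the reported averages. The sketch below is illustrative only: the three models are an arbitrary sample, the numbers are copied from this page, and `percent_change` and `changes_vs_run` are hypothetical helpers, not part of the Phoronix Test Suite or NCNN.

```python
# Average CPU-target NCNN latencies in ms (fewer is better), copied from this page.
# Only three models are sampled here to keep the sketch short.
cpu_ms = {
    "mobilenet-v2": {"a": 7.01, "b": 5.00, "c": 4.90, "d": 4.98},
    "resnet50": {"a": 30.49, "b": 18.96, "c": 19.25, "d": 19.20},
    "vision_transformer": {"a": 199.02, "b": 199.84, "c": 197.49, "d": 199.30},
}


def percent_change(baseline: float, value: float) -> float:
    """Signed change of `value` relative to `baseline`, in percent.

    For a latency metric, a negative result means `value` is faster.
    """
    return (value - baseline) / baseline * 100.0


def changes_vs_run(results: dict, baseline_run: str = "a") -> dict:
    """Percent change of every other run against `baseline_run`, per model."""
    out = {}
    for model, runs in results.items():
        base = runs[baseline_run]
        out[model] = {
            run: percent_change(base, ms)
            for run, ms in runs.items()
            if run != baseline_run
        }
    return out


if __name__ == "__main__":
    for model, deltas in changes_vs_run(cpu_ms).items():
        line = ", ".join(f"{run}: {delta:+.1f}%" for run, delta in deltas.items())
        print(f"{model}: {line}")
```

Using run a as the baseline is an arbitrary choice for this sketch; any of the four runs could be passed as `baseline_run`.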

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20241226 (ms, Fewer Is Better)
a: 13.99 (MIN: 12.4 / MAX: 16.02)
b: 14.35 (MIN: 13.46 / MAX: 16.3)
c: 13.88 (MIN: 12.41 / MAX: 15.7)
d: 13.63 (MIN: 12.41 / MAX: 15.26)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

NCNN 20241226 (ms, Fewer Is Better)
a: 4.75 (MIN: 4.23 / MAX: 5.52)
b: 4.87 (MIN: 4.22 / MAX: 7.49)
c: 5.00 (MIN: 4.26 / MAX: 7.25)
d: 4.96 (MIN: 4.56 / MAX: 7.55)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

NCNN 20241226 (ms, Fewer Is Better)
a: 4.51 (MIN: 4.25 / MAX: 4.9)
b: 4.39 (MIN: 4.16 / MAX: 6.09)
c: 4.53 (MIN: 4.3 / MAX: 7.77)
d: 4.49 (MIN: 4.21 / MAX: 5.78)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20241226 (ms, Fewer Is Better)
a: 3.65 (MIN: 3.6 / MAX: 3.72)
b: 3.66 (MIN: 3.58 / MAX: 6.33)
c: 3.65 (MIN: 3.48 / MAX: 5.38)
d: 3.66 (MIN: 3.59 / MAX: 5.53)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20241226 (ms, Fewer Is Better)
a: 4.47 (MIN: 4.22 / MAX: 5.04)
b: 4.37 (MIN: 4.19 / MAX: 4.99)
c: 4.40 (MIN: 4.22 / MAX: 4.98)
d: 4.42 (MIN: 4.23 / MAX: 4.94)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20241226 (ms, Fewer Is Better)
a: 7.92 (MIN: 7.56 / MAX: 8.16)
b: 7.99 (MIN: 7.88 / MAX: 8.15)
c: 7.87 (MIN: 7.36 / MAX: 9.34)
d: 7.98 (MIN: 7.68 / MAX: 8.29)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20241226 (ms, Fewer Is Better)
a: 1.92 (MIN: 1.87 / MAX: 1.98)
b: 1.92 (MIN: 1.83 / MAX: 2.03)
c: 1.92 (MIN: 1.84 / MAX: 1.98)
d: 1.91 (MIN: 1.84 / MAX: 1.97)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20241226 (ms, Fewer Is Better)
a: 11.25 (MIN: 10.92 / MAX: 12.23)
b: 11.28 (MIN: 10.66 / MAX: 11.92)
c: 11.27 (MIN: 10.72 / MAX: 14.11)
d: 11.25 (MIN: 10.97 / MAX: 12.35)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20241226 (ms, Fewer Is Better)
a: 44.95 (MIN: 42.32 / MAX: 46.93)
b: 44.61 (MIN: 42.3 / MAX: 46.22)
c: 44.65 (MIN: 41.13 / MAX: 46.45)
d: 45.04 (MIN: 43.03 / MAX: 46.43)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20241226 (ms, Fewer Is Better)
a: 7.56 (MIN: 7.29 / MAX: 7.92)
b: 7.65 (MIN: 7.42 / MAX: 7.91)
c: 7.55 (MIN: 7.3 / MAX: 9.6)
d: 7.56 (MIN: 7.28 / MAX: 7.94)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20241226 (ms, Fewer Is Better)
a: 5.91 (MIN: 5.73 / MAX: 6.22)
b: 6.17 (MIN: 5.98 / MAX: 6.59)
c: 5.91 (MIN: 5.74 / MAX: 6.27)
d: 5.94 (MIN: 5.75 / MAX: 6.23)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20241226 (ms, Fewer Is Better)
a: 19.10 (MIN: 18.54 / MAX: 21.48)
b: 19.26 (MIN: 18.87 / MAX: 20.2)
c: 19.06 (MIN: 18.63 / MAX: 19.69)
d: 19.06 (MIN: 18.66 / MAX: 19.88)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3

NCNN 20241226 (ms, Fewer Is Better)
a: 13.99 (MIN: 12.4 / MAX: 16.02)
b: 14.35 (MIN: 13.46 / MAX: 16.3)
c: 13.88 (MIN: 12.41 / MAX: 15.7)
d: 13.63 (MIN: 12.41 / MAX: 15.26)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20241226 (ms, Fewer Is Better)
a: 17.64 (MIN: 16.96 / MAX: 18.52)
b: 17.74 (MIN: 16.9 / MAX: 19.03)
c: 17.51 (MIN: 16.35 / MAX: 19.44)
d: 17.84 (MIN: 16.99 / MAX: 19.52)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20241226 (ms, Fewer Is Better)
a: 9.57 (MIN: 9.13 / MAX: 11.39)
b: 9.77 (MIN: 9.06 / MAX: 10.17)
c: 9.66 (MIN: 8.92 / MAX: 10.07)
d: 9.70 (MIN: 9.2 / MAX: 10.59)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20241226 (ms, Fewer Is Better)
a: 16.59 (MIN: 15.66 / MAX: 19.68)
b: 16.73 (MIN: 15.69 / MAX: 21)
c: 16.39 (MIN: 15.67 / MAX: 19.65)
d: 16.25 (MIN: 15.65 / MAX: 17.82)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vision_transformer

NCNN 20241226 (ms, Fewer Is Better)
a: 196.62 (MIN: 192.65 / MAX: 200.94)
b: 200.12 (MIN: 194.52 / MAX: 204.7)
c: 202.37 (MIN: 195.84 / MAX: 206.71)
d: 198.73 (MIN: 194.73 / MAX: 203.13)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: FastestDet

NCNN 20241226 (ms, Fewer Is Better)
a: 4.82 (MIN: 4.71 / MAX: 5.03)
b: 4.95 (MIN: 4.86 / MAX: 5.31)
c: 4.81 (MIN: 4.69 / MAX: 5.02)
d: 4.63 (MIN: 4.33 / MAX: 4.89)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128

Llama.cpp b4397 (Tokens Per Second, More Is Better)
a: 8.87
b: 8.85
c: 8.85
d: 8.83
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4397 (Tokens Per Second, More Is Better)
a: 28.77
b: 28.61
c: 28.43
d: 28.51
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4397 (Tokens Per Second, More Is Better)
a: 28.09
b: 27.93
c: 27.78
d: 26.87
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4397 (Tokens Per Second, More Is Better)
a: 27.16
b: 27.09
c: 27.15
d: 27.01
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128

Llama.cpp b4397 (Tokens Per Second, More Is Better)
a: 9.24
b: 9.22
c: 9.20
d: 9.21
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4397 (Tokens Per Second, More Is Better)
a: 28.70
b: 28.30
c: 28.54
d: 28.50
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4397 (Tokens Per Second, More Is Better)
a: 28.00
b: 27.90
c: 26.25
d: 27.81
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4397 (Tokens Per Second, More Is Better)
a: 27.11
b: 27.22
c: 27.19
d: 27.26
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128

Llama.cpp b4397 (Tokens Per Second, More Is Better)
a: 38.35
b: 38.35
c: 38.43
d: 37.64
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4397 (Tokens Per Second, More Is Better)
a: 61.73
b: 61.34
c: 60.85
d: 62.31
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4397 (Tokens Per Second, More Is Better)
a: 60.83
b: 61.49
c: 54.08
d: 54.39
(CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4397 (Tokens Per Second, More Is Better)
a: 58.73
b: 61.40
c: 56.79
d: 57.28
(CXX) g++ options: -O3
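
The Llama.cpp figures are throughputs, so it can help to convert them into approximate wall-clock time per test. The sketch below does that for run a of Llama-3.1-Tulu-3-8B-Q8_0. It assumes the reported tokens-per-second value is a steady average over the whole test, which this page does not state; `seconds_for` is a hypothetical helper and not part of llama.cpp or the Phoronix Test Suite.

```python
# Run "a" throughput in tokens per second for Llama-3.1-Tulu-3-8B-Q8_0,
# copied from the results on this page. Keys are (test name, token count).
run_a_tulu = {
    ("Text Generation", 128): 8.87,
    ("Prompt Processing", 512): 28.77,
    ("Prompt Processing", 1024): 28.09,
    ("Prompt Processing", 2048): 27.16,
}


def seconds_for(tokens: int, tokens_per_second: float) -> float:
    """Approximate elapsed time, assuming a constant average rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


if __name__ == "__main__":
    for (test, tokens), rate in run_a_tulu.items():
        elapsed = seconds_for(tokens, rate)
        print(f"{test} {tokens}: about {elapsed:.1f} s at {rate} tokens/s")
```

Because the prompt-processing rate stays close to 27 to 29 tokens per second across the three prompt sizes for this model, the estimated time grows roughly in proportion to prompt length.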


Phoronix Test Suite v10.8.5