ncnn llama

Intel Core Ultra 7 155H testing with a MTL Swift SFG14-72T Coral_MTH (V1.01 BIOS) and Intel Arc MTL 8GB on Ubuntu 24.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2412304-NE-NCNNLLAMA37
Run Management

Result Identifier    Date                Test Duration
a                    December 30 2024    2 Hours, 9 Minutes
b                    December 30 2024    2 Hours, 8 Minutes
c                    December 30 2024    2 Hours, 9 Minutes



Ncnn Llama Benchmarks

Processor: Intel Core Ultra 7 155H @ 4.80GHz (16 Cores / 22 Threads)
Motherboard: MTL Swift SFG14-72T Coral_MTH (V1.01 BIOS)
Chipset: Intel Device 7e7f
Memory: 8 x 2GB LPDDR5-6400MT/s Micron MT62F1G32D2DS-026
Disk: 1024GB Micron_2550_MTFDKBA1T0TGE
Graphics: Intel Arc MTL 8GB
Audio: Intel Meteor Lake-P HD Audio
Network: Intel Meteor Lake PCH CNVi WiFi
OS: Ubuntu 24.10
Kernel: 6.11.0-rc6-phx (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 25.0~git2411250600.45c523~oibaf~o (git-45c5231 2024-11-25 oracular-oibaf-pp
OpenCL: OpenCL 3.0
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 3840x1200

System Logs

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x1e - Thermald 2.5.8
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: BHI_DIS_S + srbds: Not affected + tsx_async_abort: Not affected

Result Overview (chart): runs a, b, and c compared across all NCNN and Llama.cpp tests, on a scale from 100% to 106%.

ncnn llama

a         b         c         Test
42.43     39.9      40.4      ncnn: CPU - regnety_400m
12.83     12.36     12.35     ncnn: CPU - efficientnet-b0
7.01      6.89      6.8       ncnn: Vulkan GPU - shufflenet-v2
7.89      7.67      7.67      ncnn: Vulkan GPU - mnasnet
8.03      7.95      7.86      ncnn: Vulkan GPU-v3-v3 - mobilenet-v3
8.87      9.02      8.83      ncnn: Vulkan GPU - resnet18
13.25     13.14     12.99     ncnn: Vulkan GPU - efficientnet-b0
22.91     23.25     23.33     llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128
4.49      4.41      4.42      ncnn: CPU - blazeface
20.11     20.02     19.76     ncnn: CPU - yolov4-tiny
148.17    150.67    149.74    ncnn: CPU - vision_transformer
12.96     12.81     13.02     ncnn: CPU - squeezenet_ssd
6.39      6.29      6.39      ncnn: CPU - alexnet
7.74      7.62      7.74      ncnn: Vulkan GPU-v2-v2 - mobilenet-v2
19.93     20.15     19.86     llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048
7.78      7.71      7.82      ncnn: CPU-v3-v3 - mobilenet-v3
14.47     14.28     14.4      ncnn: CPU - googlenet
16.57     16.69     16.79     ncnn: Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3
16.57     16.69     16.79     ncnn: Vulkan GPU - mobilenet
20.52     20.79     20.62     llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512
88.09     88.31     89.19     llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512
8.97      8.98      8.87      ncnn: CPU - resnet18
7.3       7.39      7.34      llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128
6.44      6.43      6.37      ncnn: Vulkan GPU - alexnet
21.42     21.21     21.37     ncnn: CPU - resnet50
19.85     19.88     19.7      llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048
16.75     16.6      16.69     ncnn: CPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3
16.75     16.6      16.69     ncnn: CPU - mobilenet
36.88     36.78     36.55     ncnn: CPU - vgg16
7.51      7.57      7.53      llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128
9.27      9.23      9.3       ncnn: Vulkan GPU - FastestDet
75.56     76.13     75.56     llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048
6.74      6.77      6.79      ncnn: CPU - shufflenet-v2
36.88     36.63     36.89     ncnn: Vulkan GPU - vgg16
21.4      21.29     21.44     ncnn: Vulkan GPU - resnet50
20.03     20.02     19.89     ncnn: Vulkan GPU - yolov4-tiny
7.44      7.39      7.43      ncnn: CPU - mnasnet
4.49      4.46      4.48      ncnn: Vulkan GPU - blazeface
20.48     20.61     20.58     llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024
9.15      9.2       9.18      ncnn: CPU - FastestDet
7.68      7.69      7.72      ncnn: CPU-v2-v2 - mobilenet-v2
20.88     20.87     20.78     llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512
147.62    147.55    148.25    ncnn: Vulkan GPU - vision_transformer
41.53     41.56     41.71     ncnn: Vulkan GPU - regnety_400m
13.1      13.08     13.05     ncnn: Vulkan GPU - squeezenet_ssd
14.45     14.48     14.44     ncnn: Vulkan GPU - googlenet
20.35     20.39     20.38     llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024
82.04     82.17     82.01     llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024
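The 100% to 106% range shown in the result overview can be checked by hand from the per-run values. The following is a minimal Python sketch and is not part of the Phoronix Test Suite: the function name `spread_percent` and the use of max/min as the spread measure are assumptions made for illustration, and the three sample rows are copied from the result table above.

```python
# Sample rows copied from the result table: (test, a, b, c).
# Only three rows are included here to keep the sketch short.
RESULTS = [
    ("ncnn: CPU - regnety_400m", 42.43, 39.9, 40.4),
    ("ncnn: CPU - efficientnet-b0", 12.83, 12.36, 12.35),
    ("llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128",
     22.91, 23.25, 23.33),
]


def spread_percent(values):
    """Return how far the largest value sits above the smallest, in percent.

    The measure is symmetric, so it works for both "fewer is better" (ms)
    and "more is better" (tokens per second) results.
    """
    lowest, highest = min(values), max(values)
    return (highest / lowest - 1.0) * 100.0


if __name__ == "__main__":
    for name, a, b, c in RESULTS:
        print(f"{name}: {spread_percent((a, b, c)):.2f}% spread across runs a, b, c")
```

A spread of a few percent between three runs on the same machine is best read as run-to-run variation rather than as a difference between configurations.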

NCNN

NCNN 20241226, Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
a: 42.43 (MIN: 36.44 / MAX: 54.34), c: 40.40 (MIN: 35.22 / MAX: 44.82), b: 39.90 (MIN: 34.56 / MAX: 43.09)

NCNN 20241226, Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
a: 12.83 (MIN: 11.4 / MAX: 16.56), b: 12.36 (MIN: 10.91 / MAX: 17.1), c: 12.35 (MIN: 10.5 / MAX: 17.24)

NCNN 20241226, Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
a: 7.01 (MIN: 5.71 / MAX: 10.26), b: 6.89 (MIN: 5.74 / MAX: 10.47), c: 6.80 (MIN: 5.75 / MAX: 9.76)

NCNN 20241226, Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
a: 7.89 (MIN: 6.21 / MAX: 12.29), c: 7.67 (MIN: 6.39 / MAX: 11.69), b: 7.67 (MIN: 6.59 / MAX: 11.5)

NCNN 20241226, Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
a: 8.03 (MIN: 6.85 / MAX: 12.15), b: 7.95 (MIN: 6.62 / MAX: 11.98), c: 7.86 (MIN: 6.8 / MAX: 11.68)

NCNN 20241226, Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
b: 9.02 (MIN: 8.25 / MAX: 13.6), a: 8.87 (MIN: 8.26 / MAX: 12.35), c: 8.83 (MIN: 8.12 / MAX: 12.05)

NCNN 20241226, Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
a: 13.25 (MIN: 11.37 / MAX: 16.4), b: 13.14 (MIN: 11.23 / MAX: 17.39), c: 12.99 (MIN: 11.41 / MAX: 15.94)

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Llama.cpp

Llama.cpp b4397, Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 (Tokens Per Second, More Is Better)
a: 22.91, b: 23.25, c: 23.33

1. (CXX) g++ options: -O3

NCNN

NCNN 20241226, Target: CPU - Model: blazeface (ms, Fewer Is Better)
a: 4.49 (MIN: 4.22 / MAX: 4.83), c: 4.42 (MIN: 3.64 / MAX: 5.89), b: 4.41 (MIN: 3.55 / MAX: 6.6)

NCNN 20241226, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
a: 20.11 (MIN: 17.71 / MAX: 27.12), b: 20.02 (MIN: 17.01 / MAX: 26.57), c: 19.76 (MIN: 17.47 / MAX: 26.61)

NCNN 20241226, Target: CPU - Model: vision_transformer (ms, Fewer Is Better)
b: 150.67 (MIN: 115.09 / MAX: 182.83), c: 149.74 (MIN: 115.38 / MAX: 192.66), a: 148.17 (MIN: 115.2 / MAX: 184.69)

NCNN 20241226, Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
c: 13.02 (MIN: 11.97 / MAX: 17.19), a: 12.96 (MIN: 11.92 / MAX: 16.61), b: 12.81 (MIN: 11.75 / MAX: 15.98)

NCNN 20241226, Target: CPU - Model: alexnet (ms, Fewer Is Better)
c: 6.39 (MIN: 5.77 / MAX: 8.36), a: 6.39 (MIN: 5.46 / MAX: 9.11), b: 6.29 (MIN: 5.57 / MAX: 8.75)

NCNN 20241226, Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
c: 7.74 (MIN: 6.46 / MAX: 11.78), a: 7.74 (MIN: 6.54 / MAX: 9.96), b: 7.62 (MIN: 6.4 / MAX: 11.61)

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Llama.cpp

Llama.cpp b4397, Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better)
c: 19.86, a: 19.93, b: 20.15

1. (CXX) g++ options: -O3

NCNN

NCNN 20241226, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
c: 7.82 (MIN: 6.59 / MAX: 10.2), a: 7.78 (MIN: 6.43 / MAX: 10.34), b: 7.71 (MIN: 6.41 / MAX: 10.93)

NCNN 20241226, Target: CPU - Model: googlenet (ms, Fewer Is Better)
a: 14.47 (MIN: 13.56 / MAX: 19.51), c: 14.40 (MIN: 13.45 / MAX: 20.13), b: 14.28 (MIN: 13.29 / MAX: 19.25)

NCNN 20241226, Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3 (ms, Fewer Is Better)
c: 16.79 (MIN: 14.77 / MAX: 21.67), b: 16.69 (MIN: 14.89 / MAX: 19.78), a: 16.57 (MIN: 14.41 / MAX: 20.36)

NCNN 20241226, Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
c: 16.79 (MIN: 14.77 / MAX: 21.67), b: 16.69 (MIN: 14.89 / MAX: 19.78), a: 16.57 (MIN: 14.41 / MAX: 20.36)

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Llama.cpp

Llama.cpp b4397, Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, More Is Better)
a: 20.52, c: 20.62, b: 20.79

Llama.cpp b4397, Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, More Is Better)
a: 88.09, b: 88.31, c: 89.19

1. (CXX) g++ options: -O3

NCNN

NCNN 20241226, Target: CPU - Model: resnet18 (ms, Fewer Is Better)
b: 8.98 (MIN: 8.19 / MAX: 12.72), a: 8.97 (MIN: 8.17 / MAX: 12.1), c: 8.87 (MIN: 8.21 / MAX: 12.63)

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Llama.cpp

Llama.cpp b4397, Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 (Tokens Per Second, More Is Better)
a: 7.30, c: 7.34, b: 7.39

1. (CXX) g++ options: -O3

NCNN

NCNN 20241226, Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
a: 6.44 (MIN: 5.72 / MAX: 8.52), b: 6.43 (MIN: 5.74 / MAX: 8.68), c: 6.37 (MIN: 5.51 / MAX: 8.64)

NCNN 20241226, Target: CPU - Model: resnet50 (ms, Fewer Is Better)
a: 21.42 (MIN: 20.05 / MAX: 26.84), c: 21.37 (MIN: 20.01 / MAX: 25.98), b: 21.21 (MIN: 19.75 / MAX: 25.95)

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Llama.cpp

Llama.cpp b4397, Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better)
c: 19.70, a: 19.85, b: 19.88

1. (CXX) g++ options: -O3

NCNN

NCNN 20241226, Target: CPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3 (ms, Fewer Is Better)
a: 16.75 (MIN: 14.93 / MAX: 20.88), c: 16.69 (MIN: 14.67 / MAX: 22.07), b: 16.60 (MIN: 14.51 / MAX: 21.62)

NCNN 20241226, Target: CPU - Model: mobilenet (ms, Fewer Is Better)
a: 16.75 (MIN: 14.93 / MAX: 20.88), c: 16.69 (MIN: 14.67 / MAX: 22.07), b: 16.60 (MIN: 14.51 / MAX: 21.62)

NCNN 20241226, Target: CPU - Model: vgg16 (ms, Fewer Is Better)
a: 36.88 (MIN: 33.2 / MAX: 42.02), b: 36.78 (MIN: 32.94 / MAX: 42.53), c: 36.55 (MIN: 33.17 / MAX: 41.66)

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Llama.cpp

Llama.cpp b4397, Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 (Tokens Per Second, More Is Better)
a: 7.51, c: 7.53, b: 7.57

1. (CXX) g++ options: -O3

NCNN

NCNN 20241226, Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better)
c: 9.30 (MIN: 8.55 / MAX: 10.04), a: 9.27 (MIN: 7.32 / MAX: 11.69), b: 9.23 (MIN: 7.72 / MAX: 11.53)

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Llama.cpp

Llama.cpp b4397, Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second, More Is Better)
a: 75.56, c: 75.56, b: 76.13

1. (CXX) g++ options: -O3

NCNN

NCNN 20241226, Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
c: 6.79 (MIN: 6.05 / MAX: 9.41), b: 6.77 (MIN: 6.39 / MAX: 7.29), a: 6.74 (MIN: 6.38 / MAX: 7.47)

NCNN 20241226, Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
c: 36.89 (MIN: 33.19 / MAX: 43.26), a: 36.88 (MIN: 33.67 / MAX: 42.46), b: 36.63 (MIN: 33.11 / MAX: 43.2)

NCNN 20241226, Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
c: 21.44 (MIN: 20.23 / MAX: 25.81), a: 21.40 (MIN: 20.34 / MAX: 26.47), b: 21.29 (MIN: 19.81 / MAX: 26.44)

NCNN 20241226, Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
a: 20.03 (MIN: 17.73 / MAX: 25.71), b: 20.02 (MIN: 17.52 / MAX: 25.48), c: 19.89 (MIN: 17.8 / MAX: 26.15)

NCNN 20241226, Target: CPU - Model: mnasnet (ms, Fewer Is Better)
a: 7.44 (MIN: 6.33 / MAX: 8.18), c: 7.43 (MIN: 6.46 / MAX: 10.63), b: 7.39 (MIN: 6.52 / MAX: 10.84)

NCNN 20241226, Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
a: 4.49 (MIN: 3.6 / MAX: 6.53), c: 4.48 (MIN: 4.29 / MAX: 4.88), b: 4.46 (MIN: 4.21 / MAX: 5)

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Llama.cpp

Llama.cpp b4397, Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better)
a: 20.48, c: 20.58, b: 20.61

1. (CXX) g++ options: -O3

NCNN

NCNN 20241226, Target: CPU - Model: FastestDet (ms, Fewer Is Better)
b: 9.20 (MIN: 7.55 / MAX: 11.72), c: 9.18 (MIN: 7.59 / MAX: 9.96), a: 9.15 (MIN: 7.81 / MAX: 10.38)

NCNN 20241226, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
c: 7.72 (MIN: 6.42 / MAX: 9.33), b: 7.69 (MIN: 6.45 / MAX: 10.36), a: 7.68 (MIN: 6.41 / MAX: 11.21)

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Llama.cpp

Llama.cpp b4397, Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, More Is Better)
c: 20.78, b: 20.87, a: 20.88

1. (CXX) g++ options: -O3

NCNN

NCNN 20241226, Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better)
c: 148.25 (MIN: 117.01 / MAX: 183.44), a: 147.62 (MIN: 116.91 / MAX: 177.85), b: 147.55 (MIN: 116.1 / MAX: 182.72)

NCNN 20241226, Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
c: 41.71 (MIN: 35.35 / MAX: 44.46), b: 41.56 (MIN: 35.3 / MAX: 44.58), a: 41.53 (MIN: 34.82 / MAX: 45.86)

NCNN 20241226, Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
a: 13.10 (MIN: 12.01 / MAX: 17.22), b: 13.08 (MIN: 11.88 / MAX: 16.69), c: 13.05 (MIN: 11.97 / MAX: 16.69)

NCNN 20241226, Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
b: 14.48 (MIN: 13.55 / MAX: 18.16), a: 14.45 (MIN: 13.51 / MAX: 20.97), c: 14.44 (MIN: 13.47 / MAX: 17.85)

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Llama.cpp

Llama.cpp b4397, Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better)
a: 20.35, c: 20.38, b: 20.39

Llama.cpp b4397, Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 (Tokens Per Second, More Is Better)
c: 82.01, a: 82.04, b: 82.17

1. (CXX) g++ options: -O3
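
The Llama.cpp tokens-per-second figures are easier to relate to interactive use once converted to time. The following is a minimal Python sketch, assuming throughput stays constant over the whole prompt or generation, which the benchmark itself does not guarantee. The helper names are hypothetical, and the sample values are run a's Llama-3.1-Tulu-3-8B-Q8_0 results from this file.

```python
def seconds_for_tokens(token_count, tokens_per_second):
    """Estimate wall time for a number of tokens at a fixed throughput."""
    return token_count / tokens_per_second


def ms_per_token(tokens_per_second):
    """Convert a throughput in tokens per second to milliseconds per token."""
    return 1000.0 / tokens_per_second


if __name__ == "__main__":
    # Run a, Llama-3.1-Tulu-3-8B-Q8_0, CPU BLAS backend (values from this result file).
    prompt_processing_2048 = 19.93  # tokens per second
    text_generation_128 = 7.30      # tokens per second

    prompt_seconds = seconds_for_tokens(2048, prompt_processing_2048)
    print(f"2048-token prompt: about {prompt_seconds:.1f} s")
    print(f"generation: about {ms_per_token(text_generation_128):.0f} ms per token")
```

Under that constant-throughput assumption, ingesting a 2048-token prompt on this CPU takes well over a minute, while generation produces a token roughly every 137 ms.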