dddda

AMD Ryzen AI 9 365 testing with a ASUS Zenbook S 16 UM5606WA_UM5606WA UM5606WA v1.0 (UM5606WA.308 BIOS) and AMD Radeon 512MB on Ubuntu 24.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2412110-NE-DDDDA678639
Jump To Table - Results

View

Do Not Show Noisy Results
Do Not Show Results With Incomplete Data
Do Not Show Results With Little Change/Spread
List Notable Results
Show Result Confidence Charts
Allow Limiting Results To Certain Suite(s)

Statistics

Show Overall Harmonic Mean(s)
Show Overall Geometric Mean
Show Wins / Losses Counts (Pie Chart)
Normalize Results
Remove Outliers Before Calculating Averages

Graph Settings

Force Line Graphs Where Applicable
Convert To Scalar Where Applicable
Prefer Vertical Bar Graphs

Multi-Way Comparison

Condense Multi-Option Tests Into Single Result Graphs

Table

Show Detailed System Result Table

Run Management

Highlight
Result
Toggle/Hide
Result
Result
Identifier
Performance Per
Dollar
Date
Run
  Test
  Duration
a
December 11
  8 Hours, 8 Minutes
b
December 11
  1 Hour, 55 Minutes
c
December 11
  1 Hour, 54 Minutes
Invert Behavior (Only Show Selected Data)
  3 Hours, 59 Minutes

Only show results where is faster than
Only show results matching title/arguments (delimit multiple options with a comma):
Do not show results matching title/arguments (delimit multiple options with a comma):


ddddaOpenBenchmarking.orgPhoronix Test SuiteAMD Ryzen AI 9 365 @ 4.31GHz (10 Cores / 20 Threads)ASUS Zenbook S 16 UM5606WA_UM5606WA UM5606WA v1.0 (UM5606WA.308 BIOS)AMD Device 15074 x 6GB LPDDR5-7500MT/s Micron MT62F1536M32D4DS-0261024GB MTFDKBA1T0QFM-1BD1AABGBAMD Radeon 512MBAMD Rembrandt Radeon HD AudioMEDIATEK Device 7925Ubuntu 24.106.12.0-rc7-phx-eraps (x86_64)GNOME Shell 47.0X Server + Wayland4.6 Mesa 24.2.3-1ubuntu1 (LLVM 19.1.0 DRM 3.59)GCC 14.2.0ext42880x1800ProcessorMotherboardChipsetMemoryDiskGraphicsAudioNetworkOSKernelDesktopDisplay ServerOpenGLCompilerFile-SystemScreen ResolutionDddda BenchmarksSystem Logs- amdgpu.dcdebugmask=0x600 - Transparent Huge Pages: madvise- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: amd-pstate-epp powersave (Boost: Enabled EPP: balance_performance) - Platform Profile: balanced - CPU Microcode: 0xb204011 - ACPI Profile: balanced - Python 3.12.7- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; PBRSB-eIBRS: Not affected; BHI: Not affected; ERAPS hardware RSB flush + srbds: Not affected + tsx_async_abort: Not affected

abcResult OverviewPhoronix Test Suite100%102%105%107%110%OpenVINO GenAIOpenVINOLlamafileLlama.cpp

ddddaopenvino: Face Detection FP16 - CPUopenvino: Person Detection FP16 - CPUopenvino: Person Detection FP32 - CPUopenvino: Vehicle Detection FP16 - CPUopenvino: Face Detection FP16-INT8 - CPUopenvino: Face Detection Retail FP16 - CPUopenvino: Road Segmentation ADAS FP16 - CPUopenvino: Vehicle Detection FP16-INT8 - CPUopenvino: Weld Porosity Detection FP16 - CPUopenvino: Face Detection Retail FP16-INT8 - CPUopenvino: Road Segmentation ADAS FP16-INT8 - CPUopenvino: Machine Translation EN To DE FP16 - CPUopenvino: Weld Porosity Detection FP16-INT8 - CPUopenvino: Person Vehicle Bike Detection FP16 - CPUopenvino: Noise Suppression Poconet-Like FP16 - CPUopenvino: Handwritten English Recognition FP16 - CPUopenvino: Person Re-Identification Retail FP16 - CPUopenvino: Age Gender Recognition Retail 0013 FP16 - CPUopenvino: Handwritten English Recognition FP16-INT8 - CPUopenvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPUllama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048llamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 16llamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 128llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 256llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 512llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 16llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 1024llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 2048llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 128llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 16llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 256llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 512llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 128llamafile: wizardcoder-python-34b-v1.0.Q6_K - Text Generation 16llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 1024llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 2048openvino-genai: Gemma-7b-int4-ov - CPUopenvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPUopenvino-genai: Falcon-7b-instruct-int4-ov - CPUopenvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPUopenvino: Face Detection FP16 - CPUopenvino: Person Detection FP16 - CPUopenvino: Person Detection FP32 - CPUopenvino: Vehicle Detection FP16 - CPUopenvino: Face Detection FP16-INT8 - CPUopenvino: Face Detection Retail FP16 - CPUopenvino: Road Segmentation ADAS FP16 - CPUopenvino: Vehicle Detection FP16-INT8 - CPUopenvino: Weld Porosity Detection FP16 - CPUopenvino: Face Detection Retail FP16-INT8 - CPUopenvino: Road Segmentation ADAS FP16-INT8 - CPUopenvino: Machine Translation EN To DE FP16 - CPUopenvino: Weld Porosity Detection FP16-INT8 - CPUopenvino: Person Vehicle Bike Detection FP16 - CPUopenvino: Noise Suppression Poconet-Like FP16 - CPUopenvino: Handwritten English Recognition FP16 - CPUopenvino: Person Re-Identification Retail FP16 - CPUopenvino: Age Gender Recognition Retail 0013 FP16 - CPUopenvino: Handwritten English Recognition FP16-INT8 - CPUopenvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPUopenvino-genai: Gemma-7b-int4-ov - CPU - Time To First Tokenopenvino-genai: Gemma-7b-int4-ov - CPU - Time Per Output Tokenopenvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time To First Tokenopenvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time Per Output Tokenopenvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time To First Tokenopenvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time Per Output Tokenopenvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time To First Tokenopenvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time Per Output Tokenabc4.6143.9544.07283.788.54950.04126.02467.17446.411587.70170.0548.99886.77437.20607.91245.49540.9512337.97254.3917776.3010.2336.8335.8933.2410.8136.8335.9333.6461.76158.11147.47126.0626.5126.244096819234.27163843276833.9813.954096819213.690.0816384327689.7131.4314.2519.83865.4690.9690.6814.09467.214.1931.728.5422.376.2723.4981.6211.259.1316.3040.667.370.7739.270.53204.63102.9934.2631.8114670.1793.750.424.5943.643.86314.258.891032.88136.7507.16481.791712.41175.7454.49953.24481.89664.33266.88557.712742.09282.9318962.5310.2840.136.134.1910.8340.7436.4134.4760.43157.49142.57153.7122.9126.444096819229.99163843276833.511.994096819213.850.0716384327688.930.3612.0118.58867.8191.6691.1412.74493.8529.227.8720.715.8222.7273.3510.468.2814.8837.47.150.7535.290.49215.32112.3835.9732.93181.6383.28100.0253.824.5243.7343.31312.328.03939.9133.18496.31458.731676.72182.3953.46920.13474.85661.16252.75561.1612864.63278.118885.0810.3138.4536.1933.6510.8338.935.6633.8161.36158.18147.12137.2522.7226.314096819230.14163843276833.6512.174096819213.890.0716384327689.4431.3814.0618.22882.6991.492.2812.78497.64.2329.998.0421.765.9421.8974.7710.838.414.9639.517.110.7435.90.5204.71105.9334.1131.87148.9471.15103.2854.9OpenBenchmarking.org

OpenVINO

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Face Detection FP16 - Device: CPUabc1.03732.07463.11194.14925.1865SE +/- 0.02, N = 34.614.594.521. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Person Detection FP16 - Device: CPUacb1020304050SE +/- 0.04, N = 343.9543.7343.601. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Person Detection FP32 - Device: CPUabc1020304050SE +/- 0.12, N = 344.0743.8643.311. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Vehicle Detection FP16 - Device: CPUbca70140210280350SE +/- 3.12, N = 15314.25312.32283.781. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Face Detection FP16-INT8 - Device: CPUbac246810SE +/- 0.09, N = 38.898.548.031. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Face Detection Retail FP16 - Device: CPUbac2004006008001000SE +/- 5.37, N = 31032.88950.04939.901. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Road Segmentation ADAS FP16 - Device: CPUbca306090120150SE +/- 0.86, N = 15136.70133.18126.021. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Vehicle Detection FP16-INT8 - Device: CPUbca110220330440550SE +/- 3.54, N = 3507.16496.31467.171. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Weld Porosity Detection FP16 - Device: CPUbca100200300400500SE +/- 6.37, N = 3481.79458.73446.411. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Face Detection Retail FP16-INT8 - Device: CPUbca400800120016002000SE +/- 5.90, N = 31712.411676.721587.701. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Road Segmentation ADAS FP16-INT8 - Device: CPUcba4080120160200SE +/- 1.58, N = 7182.39175.74170.051. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Machine Translation EN To DE FP16 - Device: CPUbca1224364860SE +/- 0.34, N = 1254.4953.4648.991. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Weld Porosity Detection FP16-INT8 - Device: CPUbca2004006008001000SE +/- 10.32, N = 3953.24920.13886.771. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Person Vehicle Bike Detection FP16 - Device: CPUbca100200300400500SE +/- 2.99, N = 15481.89474.85437.201. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Noise Suppression Poconet-Like FP16 - Device: CPUbca140280420560700SE +/- 4.56, N = 15664.33661.16607.911. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Handwritten English Recognition FP16 - Device: CPUbca60120180240300SE +/- 0.26, N = 3266.88252.75245.491. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Person Re-Identification Retail FP16 - Device: CPUcba120240360480600SE +/- 4.09, N = 3561.16557.70540.951. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Age Gender Recognition Retail 0013 FP16 - Device: CPUcba3K6K9K12K15KSE +/- 70.01, N = 312864.6312742.0912337.971. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Handwritten English Recognition FP16-INT8 - Device: CPUbca60120180240300SE +/- 2.11, N = 9282.93278.10254.391. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPUbca4K8K12K16K20KSE +/- 74.62, N = 318962.5318885.0817776.301. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

Llama.cpp

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128cba3691215SE +/- 0.06, N = 310.3110.2810.231. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512bca918273645SE +/- 0.29, N = 940.1038.4536.831. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024cba816243240SE +/- 0.50, N = 336.1936.1035.891. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048bca816243240SE +/- 0.06, N = 334.1933.6533.241. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128cba3691215SE +/- 0.04, N = 310.8310.8310.811. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512bca918273645SE +/- 0.29, N = 1040.7438.9036.831. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024bac816243240SE +/- 0.38, N = 336.4135.9335.661. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048bca816243240SE +/- 0.15, N = 334.4733.8133.641. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128acb1428425670SE +/- 0.13, N = 361.7661.3660.431. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512cab306090120150SE +/- 1.19, N = 3158.18158.11157.491. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024acb306090120150SE +/- 0.34, N = 3147.47147.12142.571. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048bca306090120150SE +/- 1.74, N = 15153.71137.25126.061. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Llamafile

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 16abc612182430SE +/- 0.28, N = 1226.5122.9122.72

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 128bca612182430SE +/- 0.22, N = 326.4426.3126.24

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 256cba9001800270036004500SE +/- 0.00, N = 3409640964096

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 512cba2K4K6K8K10KSE +/- 0.00, N = 3819281928192

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 16acb816243240SE +/- 0.44, N = 1534.2730.1429.99

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 1024cba4K8K12K16K20KSE +/- 0.00, N = 3163841638416384

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 2048cba7K14K21K28K35KSE +/- 0.00, N = 3327683276832768

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 128acb816243240SE +/- 0.12, N = 333.9833.6533.50

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 16acb48121620SE +/- 0.15, N = 1313.9512.1711.99

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 256cba9001800270036004500SE +/- 0.00, N = 3409640964096

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 512cba2K4K6K8K10KSE +/- 0.00, N = 3819281928192

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 128cba48121620SE +/- 0.14, N = 313.8913.8513.69

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 16acb0.0180.0360.0540.0720.09SE +/- 0.00, N = 30.080.070.07

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 1024cba4K8K12K16K20KSE +/- 0.00, N = 3163841638416384

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 2048cba7K14K21K28K35KSE +/- 0.00, N = 3327683276832768

OpenVINO GenAI

OpenBenchmarking.orgtokens/s, More Is BetterOpenVINO GenAI 2024.5Model: Gemma-7b-int4-ov - Device: CPUacb36912159.719.448.90

OpenBenchmarking.orgtokens/s, More Is BetterOpenVINO GenAI 2024.5Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPUacb71421283531.4331.3830.36

OpenBenchmarking.orgtokens/s, More Is BetterOpenVINO GenAI 2024.5Model: Falcon-7b-instruct-int4-ov - Device: CPUacb4812162014.2514.0612.01

OpenBenchmarking.orgtokens/s, More Is BetterOpenVINO GenAI 2024.5Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPUabc51015202519.8318.5818.22

OpenVINO

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Face Detection FP16 - Device: CPUabc2004006008001000SE +/- 4.05, N = 3865.46867.81882.69MIN: 681.88 / MAX: 935.79MIN: 432.97 / MAX: 929.79MIN: 811.64 / MAX: 914.611. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Person Detection FP16 - Device: CPUacb20406080100SE +/- 0.10, N = 390.9691.4091.66MIN: 45.37 / MAX: 129.41MIN: 73 / MAX: 111.22MIN: 70.22 / MAX: 107.211. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Person Detection FP32 - Device: CPUabc20406080100SE +/- 0.24, N = 390.6891.1492.28MIN: 42.08 / MAX: 127.52MIN: 74.95 / MAX: 112.27MIN: 71.08 / MAX: 120.331. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Vehicle Detection FP16 - Device: CPUbca48121620SE +/- 0.15, N = 1512.7012.7814.09MIN: 6.35 / MAX: 22.55MIN: 9.47 / MAX: 20.55MIN: 6.67 / MAX: 44.251. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Face Detection FP16-INT8 - Device: CPUbac110220330440550SE +/- 4.94, N = 3449.00467.21497.60MIN: 400.92 / MAX: 475.03MIN: 389.62 / MAX: 499.44MIN: 415.79 / MAX: 524.41. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Face Detection Retail FP16 - Device: CPUbac0.95181.90362.85543.80724.759SE +/- 0.02, N = 33.854.194.23MIN: 2.03 / MAX: 29.88MIN: 2.09 / MAX: 31MIN: 2.29 / MAX: 29.371. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Road Segmentation ADAS FP16 - Device: CPUbca714212835SE +/- 0.21, N = 1529.2229.9931.72MIN: 22.58 / MAX: 35.18MIN: 14.48 / MAX: 36.5MIN: 14.27 / MAX: 66.491. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Vehicle Detection FP16-INT8 - Device: CPUbca246810SE +/- 0.06, N = 37.878.048.54MIN: 4.58 / MAX: 16.38MIN: 4.77 / MAX: 36.82MIN: 4.72 / MAX: 37.11. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Weld Porosity Detection FP16 - Device: CPUbca510152025SE +/- 0.32, N = 320.7121.7622.37MIN: 9.11 / MAX: 43.48MIN: 9.51 / MAX: 51.32MIN: 9.45 / MAX: 55.551. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Face Detection Retail FP16-INT8 - Device: CPUbca246810SE +/- 0.02, N = 35.825.946.27MIN: 2.57 / MAX: 33.51MIN: 2.83 / MAX: 34.92MIN: 2.63 / MAX: 36.961. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Road Segmentation ADAS FP16-INT8 - Device: CPUcba612182430SE +/- 0.22, N = 721.8922.7223.49MIN: 16.13 / MAX: 43.35MIN: 17.07 / MAX: 30.25MIN: 12 / MAX: 52.691. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Machine Translation EN To DE FP16 - Device: CPUbca20406080100SE +/- 0.57, N = 1273.3574.7781.62MIN: 56.95 / MAX: 89.78MIN: 50.72 / MAX: 91.82MIN: 51.47 / MAX: 126.221. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Weld Porosity Detection FP16-INT8 - Device: CPUbca3691215SE +/- 0.13, N = 310.4610.8311.25MIN: 4.65 / MAX: 39.08MIN: 5.05 / MAX: 41.09MIN: 4.28 / MAX: 41.131. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Person Vehicle Bike Detection FP16 - Device: CPUbca3691215SE +/- 0.06, N = 158.288.409.13MIN: 5.61 / MAX: 29.51MIN: 5.46 / MAX: 31.1MIN: 4.67 / MAX: 38.311. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Noise Suppression Poconet-Like FP16 - Device: CPUbca48121620SE +/- 0.12, N = 1514.8814.9616.30MIN: 9.08 / MAX: 23.51MIN: 9.32 / MAX: 24.49MIN: 8.05 / MAX: 48.271. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Handwritten English Recognition FP16 - Device: CPUbca918273645SE +/- 0.04, N = 337.4039.5140.66MIN: 23.82 / MAX: 72.21MIN: 24.79 / MAX: 75.37MIN: 19.98 / MAX: 76.171. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Person Re-Identification Retail FP16 - Device: CPUcba246810SE +/- 0.06, N = 37.117.157.37MIN: 4.3 / MAX: 30.31MIN: 4.45 / MAX: 34.36MIN: 3.58 / MAX: 36.291. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Age Gender Recognition Retail 0013 FP16 - Device: CPUcba0.17330.34660.51990.69320.8665SE +/- 0.01, N = 30.740.750.77MIN: 0.29 / MAX: 21.6MIN: 0.28 / MAX: 25.7MIN: 0.28 / MAX: 28.441. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Handwritten English Recognition FP16-INT8 - Device: CPUbca918273645SE +/- 0.33, N = 935.2935.9039.27MIN: 23.48 / MAX: 69.74MIN: 22.3 / MAX: 71.26MIN: 19.02 / MAX: 801. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPUbca0.11930.23860.35790.47720.5965SE +/- 0.00, N = 30.490.500.53MIN: 0.21 / MAX: 7.16MIN: 0.2 / MAX: 24.4MIN: 0.2 / MAX: 27.591. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

71 Results Shown

OpenVINO:
  Face Detection FP16 - CPU
  Person Detection FP16 - CPU
  Person Detection FP32 - CPU
  Vehicle Detection FP16 - CPU
  Face Detection FP16-INT8 - CPU
  Face Detection Retail FP16 - CPU
  Road Segmentation ADAS FP16 - CPU
  Vehicle Detection FP16-INT8 - CPU
  Weld Porosity Detection FP16 - CPU
  Face Detection Retail FP16-INT8 - CPU
  Road Segmentation ADAS FP16-INT8 - CPU
  Machine Translation EN To DE FP16 - CPU
  Weld Porosity Detection FP16-INT8 - CPU
  Person Vehicle Bike Detection FP16 - CPU
  Noise Suppression Poconet-Like FP16 - CPU
  Handwritten English Recognition FP16 - CPU
  Person Re-Identification Retail FP16 - CPU
  Age Gender Recognition Retail 0013 FP16 - CPU
  Handwritten English Recognition FP16-INT8 - CPU
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU
Llama.cpp:
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048
Llamafile:
  Llama-3.2-3B-Instruct.Q6_K - Text Generation 16
  Llama-3.2-3B-Instruct.Q6_K - Text Generation 128
  Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 256
  Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 512
  TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 16
  Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 1024
  Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 2048
  TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 128
  mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 16
  TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 256
  TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 512
  mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 128
  wizardcoder-python-34b-v1.0.Q6_K - Text Generation 16
  TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 1024
  TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 2048
OpenVINO GenAI:
  Gemma-7b-int4-ov - CPU
  TinyLlama-1.1B-Chat-v1.0 - CPU
  Falcon-7b-instruct-int4-ov - CPU
  Phi-3-mini-128k-instruct-int4-ov - CPU
OpenVINO:
  Face Detection FP16 - CPU
  Person Detection FP16 - CPU
  Person Detection FP32 - CPU
  Vehicle Detection FP16 - CPU
  Face Detection FP16-INT8 - CPU
  Face Detection Retail FP16 - CPU
  Road Segmentation ADAS FP16 - CPU
  Vehicle Detection FP16-INT8 - CPU
  Weld Porosity Detection FP16 - CPU
  Face Detection Retail FP16-INT8 - CPU
  Road Segmentation ADAS FP16-INT8 - CPU
  Machine Translation EN To DE FP16 - CPU
  Weld Porosity Detection FP16-INT8 - CPU
  Person Vehicle Bike Detection FP16 - CPU
  Noise Suppression Poconet-Like FP16 - CPU
  Handwritten English Recognition FP16 - CPU
  Person Re-Identification Retail FP16 - CPU
  Age Gender Recognition Retail 0013 FP16 - CPU
  Handwritten English Recognition FP16-INT8 - CPU
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU