dddda

AMD Ryzen AI 9 365 testing with a ASUS Zenbook S 16 UM5606WA_UM5606WA UM5606WA v1.0 (UM5606WA.308 BIOS) and AMD Radeon 512MB on Ubuntu 24.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2412110-NE-DDDDA678639
Jump To Table - Results

View

Do Not Show Noisy Results
Do Not Show Results With Incomplete Data
Do Not Show Results With Little Change/Spread
List Notable Results
Show Result Confidence Charts
Allow Limiting Results To Certain Suite(s)

Statistics

Show Overall Harmonic Mean(s)
Show Overall Geometric Mean
Show Wins / Losses Counts (Pie Chart)
Normalize Results
Remove Outliers Before Calculating Averages

Graph Settings

Force Line Graphs Where Applicable
Convert To Scalar Where Applicable
Prefer Vertical Bar Graphs

Multi-Way Comparison

Condense Multi-Option Tests Into Single Result Graphs

Table

Show Detailed System Result Table

Run Management

Highlight
Result
Toggle/Hide
Result
Result
Identifier
Performance Per
Dollar
Date
Run
  Test
  Duration
a
December 11
  8 Hours, 8 Minutes
b
December 11
  1 Hour, 55 Minutes
c
December 11
  1 Hour, 54 Minutes
Invert Behavior (Only Show Selected Data)
  3 Hours, 59 Minutes

Only show results where is faster than
Only show results matching title/arguments (delimit multiple options with a comma):
Do not show results matching title/arguments (delimit multiple options with a comma):


ddddaOpenBenchmarking.orgPhoronix Test SuiteAMD Ryzen AI 9 365 @ 4.31GHz (10 Cores / 20 Threads)ASUS Zenbook S 16 UM5606WA_UM5606WA UM5606WA v1.0 (UM5606WA.308 BIOS)AMD Device 15074 x 6GB LPDDR5-7500MT/s Micron MT62F1536M32D4DS-0261024GB MTFDKBA1T0QFM-1BD1AABGBAMD Radeon 512MBAMD Rembrandt Radeon HD AudioMEDIATEK Device 7925Ubuntu 24.106.12.0-rc7-phx-eraps (x86_64)GNOME Shell 47.0X Server + Wayland4.6 Mesa 24.2.3-1ubuntu1 (LLVM 19.1.0 DRM 3.59)GCC 14.2.0ext42880x1800ProcessorMotherboardChipsetMemoryDiskGraphicsAudioNetworkOSKernelDesktopDisplay ServerOpenGLCompilerFile-SystemScreen ResolutionDddda BenchmarksSystem Logs- amdgpu.dcdebugmask=0x600 - Transparent Huge Pages: madvise- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: amd-pstate-epp powersave (Boost: Enabled EPP: balance_performance) - Platform Profile: balanced - CPU Microcode: 0xb204011 - ACPI Profile: balanced - Python 3.12.7- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; PBRSB-eIBRS: Not affected; BHI: Not affected; ERAPS hardware RSB flush + srbds: Not affected + tsx_async_abort: Not affected

abcResult OverviewPhoronix Test Suite100%102%105%107%110%OpenVINO GenAIOpenVINOLlamafileLlama.cpp

ddddallama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048openvino-genai: Falcon-7b-instruct-int4-ov - CPUllamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 16llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 16llamafile: wizardcoder-python-34b-v1.0.Q6_K - Text Generation 16llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 16openvino: Handwritten English Recognition FP16-INT8 - CPUopenvino: Machine Translation EN To DE FP16 - CPUopenvino: Machine Translation EN To DE FP16 - CPUopenvino: Handwritten English Recognition FP16-INT8 - CPUopenvino: Vehicle Detection FP16 - CPUopenvino: Face Detection FP16-INT8 - CPUopenvino: Vehicle Detection FP16 - CPUopenvino: Face Detection FP16-INT8 - CPUllama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512openvino: Person Vehicle Bike Detection FP16 - CPUopenvino: Person Vehicle Bike Detection FP16 - CPUopenvino: Face Detection Retail FP16 - CPUopenvino: Face Detection Retail FP16 - CPUopenvino: Noise Suppression Poconet-Like FP16 - CPUopenvino: Noise Suppression Poconet-Like FP16 - CPUopenvino-genai: Gemma-7b-int4-ov - CPUllama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPUopenvino: Handwritten English Recognition FP16 - CPUopenvino: Handwritten English Recognition FP16 - CPUopenvino: Vehicle Detection FP16-INT8 - CPUopenvino: Road Segmentation ADAS FP16 - CPUopenvino: Vehicle Detection FP16-INT8 - CPUopenvino: Road Segmentation ADAS FP16 - CPUopenvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPUopenvino: Weld Porosity Detection FP16 - CPUopenvino: Weld Porosity Detection FP16 - CPUopenvino: Face Detection Retail FP16-INT8 - CPUopenvino: Face Detection Retail FP16-INT8 - CPUopenvino: Weld Porosity Detection FP16-INT8 - CPUopenvino: Weld Porosity Detection FP16-INT8 - CPUopenvino: Road Segmentation ADAS FP16-INT8 - CPUopenvino: Road Segmentation ADAS FP16-INT8 - CPUopenvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPUopenvino: Age Gender Recognition Retail 0013 FP16 - CPUopenvino: Age Gender Recognition Retail 0013 FP16 - CPUopenvino: Person Re-Identification Retail FP16 - CPUopenvino: Person Re-Identification Retail FP16 - CPUopenvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPUllama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024openvino: Face Detection FP16 - CPUopenvino: Face Detection FP16 - CPUopenvino: Person Detection FP32 - CPUopenvino: Person Detection FP32 - CPUllamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 128llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 128llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024openvino: Person Detection FP16 - CPUllama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128openvino: Person Detection FP16 - CPUllamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 128llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 2048llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 1024llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 512llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 256llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 2048llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 1024llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 512llamafile: Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 256openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time Per Output Tokenopenvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time To First Tokenopenvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time Per Output Tokenopenvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time To First Tokenopenvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time Per Output Tokenopenvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time To First Tokenopenvino-genai: Gemma-7b-int4-ov - CPU - Time Per Output Tokenopenvino-genai: Gemma-7b-int4-ov - CPU - Time To First Tokenabc126.0614.2526.5113.950.0834.2739.2781.6248.99254.3914.09467.21283.788.5436.839.13437.20950.044.1916.30607.919.7136.8319.8340.66245.49467.1731.728.54126.020.5322.37446.411587.706.2711.25886.7723.49170.0517776.3012337.970.77540.957.3731.43147.4733.2433.6461.7635.934.61865.4690.6844.0713.6933.9835.8943.9510.2390.9626.24158.1110.8132768163848192409632768163848192409650.4293.770.1714631.8134.26102.99204.63153.7112.0122.9111.990.0729.9935.2973.3554.49282.9312.7449314.258.8940.748.28481.891032.883.8514.88664.338.940.118.5837.4266.88507.1629.227.87136.70.4920.71481.791712.415.8210.46953.2422.72175.7418962.5312742.090.75557.77.1530.36142.5734.1934.4760.4336.414.59867.8191.1443.8613.8533.536.143.610.2891.6626.44157.4910.8332768163848192409632768163848192409653.82100.0283.28181.6332.9335.97112.38215.32137.2514.0622.7212.170.0730.1435.974.7753.46278.112.78497.6312.328.0338.98.4474.85939.94.2314.96661.169.4438.4518.2239.51252.75496.3129.998.04133.180.521.76458.731676.725.9410.83920.1321.89182.3918885.0812864.630.74561.167.1131.38147.1233.6533.8161.3635.664.52882.6992.2843.3113.8933.6536.1943.7310.3191.426.31158.1810.8332768163848192409632768163848192409654.9103.2871.15148.9431.8734.11105.93204.71OpenBenchmarking.org

Llama.cpp

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048acb306090120150SE +/- 1.74, N = 15126.06137.25153.711. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenVINO GenAI

OpenBenchmarking.orgtokens/s, More Is BetterOpenVINO GenAI 2024.5Model: Falcon-7b-instruct-int4-ov - Device: CPUbca4812162012.0114.0614.25

Llamafile

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 16cba612182430SE +/- 0.28, N = 1222.7222.9126.51

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 16bca48121620SE +/- 0.15, N = 1311.9912.1713.95

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 16bca0.0180.0360.0540.0720.09SE +/- 0.00, N = 30.070.070.08

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 16bca816243240SE +/- 0.44, N = 1529.9930.1434.27

OpenVINO

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Handwritten English Recognition FP16-INT8 - Device: CPUacb918273645SE +/- 0.33, N = 939.2735.9035.29MIN: 19.02 / MAX: 80MIN: 22.3 / MAX: 71.26MIN: 23.48 / MAX: 69.741. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Machine Translation EN To DE FP16 - Device: CPUacb20406080100SE +/- 0.57, N = 1281.6274.7773.35MIN: 51.47 / MAX: 126.22MIN: 50.72 / MAX: 91.82MIN: 56.95 / MAX: 89.781. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Machine Translation EN To DE FP16 - Device: CPUacb1224364860SE +/- 0.34, N = 1248.9953.4654.491. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Handwritten English Recognition FP16-INT8 - Device: CPUacb60120180240300SE +/- 2.11, N = 9254.39278.10282.931. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Vehicle Detection FP16 - Device: CPUacb48121620SE +/- 0.15, N = 1514.0912.7812.70MIN: 6.67 / MAX: 44.25MIN: 9.47 / MAX: 20.55MIN: 6.35 / MAX: 22.551. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Face Detection FP16-INT8 - Device: CPUcab110220330440550SE +/- 4.94, N = 3497.60467.21449.00MIN: 415.79 / MAX: 524.4MIN: 389.62 / MAX: 499.44MIN: 400.92 / MAX: 475.031. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Vehicle Detection FP16 - Device: CPUacb70140210280350SE +/- 3.12, N = 15283.78312.32314.251. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Face Detection FP16-INT8 - Device: CPUcab246810SE +/- 0.09, N = 38.038.548.891. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

Llama.cpp

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512acb918273645SE +/- 0.29, N = 1036.8338.9040.741. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenVINO

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Person Vehicle Bike Detection FP16 - Device: CPUacb3691215SE +/- 0.06, N = 159.138.408.28MIN: 4.67 / MAX: 38.31MIN: 5.46 / MAX: 31.1MIN: 5.61 / MAX: 29.511. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Person Vehicle Bike Detection FP16 - Device: CPUacb100200300400500SE +/- 2.99, N = 15437.20474.85481.891. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Face Detection Retail FP16 - Device: CPUcab2004006008001000SE +/- 5.37, N = 3939.90950.041032.881. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Face Detection Retail FP16 - Device: CPUcab0.95181.90362.85543.80724.759SE +/- 0.02, N = 34.234.193.85MIN: 2.29 / MAX: 29.37MIN: 2.09 / MAX: 31MIN: 2.03 / MAX: 29.881. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Noise Suppression Poconet-Like FP16 - Device: CPUacb48121620SE +/- 0.12, N = 1516.3014.9614.88MIN: 8.05 / MAX: 48.27MIN: 9.32 / MAX: 24.49MIN: 9.08 / MAX: 23.511. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Noise Suppression Poconet-Like FP16 - Device: CPUacb140280420560700SE +/- 4.56, N = 15607.91661.16664.331. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenVINO GenAI

OpenBenchmarking.orgtokens/s, More Is BetterOpenVINO GenAI 2024.5Model: Gemma-7b-int4-ov - Device: CPUbca36912158.909.449.71

Llama.cpp

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512acb918273645SE +/- 0.29, N = 936.8338.4540.101. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenVINO GenAI

OpenBenchmarking.orgtokens/s, More Is BetterOpenVINO GenAI 2024.5Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPUcba51015202518.2218.5819.83

OpenVINO

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Handwritten English Recognition FP16 - Device: CPUacb918273645SE +/- 0.04, N = 340.6639.5137.40MIN: 19.98 / MAX: 76.17MIN: 24.79 / MAX: 75.37MIN: 23.82 / MAX: 72.211. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Handwritten English Recognition FP16 - Device: CPUacb60120180240300SE +/- 0.26, N = 3245.49252.75266.881. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Vehicle Detection FP16-INT8 - Device: CPUacb110220330440550SE +/- 3.54, N = 3467.17496.31507.161. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Road Segmentation ADAS FP16 - Device: CPUacb714212835SE +/- 0.21, N = 1531.7229.9929.22MIN: 14.27 / MAX: 66.49MIN: 14.48 / MAX: 36.5MIN: 22.58 / MAX: 35.181. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Vehicle Detection FP16-INT8 - Device: CPUacb246810SE +/- 0.06, N = 38.548.047.87MIN: 4.72 / MAX: 37.1MIN: 4.77 / MAX: 36.82MIN: 4.58 / MAX: 16.381. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Road Segmentation ADAS FP16 - Device: CPUacb306090120150SE +/- 0.86, N = 15126.02133.18136.701. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPUacb0.11930.23860.35790.47720.5965SE +/- 0.00, N = 30.530.500.49MIN: 0.2 / MAX: 27.59MIN: 0.2 / MAX: 24.4MIN: 0.21 / MAX: 7.161. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Weld Porosity Detection FP16 - Device: CPUacb510152025SE +/- 0.32, N = 322.3721.7620.71MIN: 9.45 / MAX: 55.55MIN: 9.51 / MAX: 51.32MIN: 9.11 / MAX: 43.481. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Weld Porosity Detection FP16 - Device: CPUacb100200300400500SE +/- 6.37, N = 3446.41458.73481.791. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Face Detection Retail FP16-INT8 - Device: CPUacb400800120016002000SE +/- 5.90, N = 31587.701676.721712.411. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Face Detection Retail FP16-INT8 - Device: CPUacb246810SE +/- 0.02, N = 36.275.945.82MIN: 2.63 / MAX: 36.96MIN: 2.83 / MAX: 34.92MIN: 2.57 / MAX: 33.511. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Weld Porosity Detection FP16-INT8 - Device: CPUacb3691215SE +/- 0.13, N = 311.2510.8310.46MIN: 4.28 / MAX: 41.13MIN: 5.05 / MAX: 41.09MIN: 4.65 / MAX: 39.081. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Weld Porosity Detection FP16-INT8 - Device: CPUacb2004006008001000SE +/- 10.32, N = 3886.77920.13953.241. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Road Segmentation ADAS FP16-INT8 - Device: CPUabc612182430SE +/- 0.22, N = 723.4922.7221.89MIN: 12 / MAX: 52.69MIN: 17.07 / MAX: 30.25MIN: 16.13 / MAX: 43.351. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Road Segmentation ADAS FP16-INT8 - Device: CPUabc4080120160200SE +/- 1.58, N = 7170.05175.74182.391. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPUacb4K8K12K16K20KSE +/- 74.62, N = 317776.3018885.0818962.531. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Age Gender Recognition Retail 0013 FP16 - Device: CPUabc3K6K9K12K15KSE +/- 70.01, N = 312337.9712742.0912864.631. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Age Gender Recognition Retail 0013 FP16 - Device: CPUabc0.17330.34660.51990.69320.8665SE +/- 0.01, N = 30.770.750.74MIN: 0.28 / MAX: 28.44MIN: 0.28 / MAX: 25.7MIN: 0.29 / MAX: 21.61. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Person Re-Identification Retail FP16 - Device: CPUabc120240360480600SE +/- 4.09, N = 3540.95557.70561.161. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Person Re-Identification Retail FP16 - Device: CPUabc246810SE +/- 0.06, N = 37.377.157.11MIN: 3.58 / MAX: 36.29MIN: 4.45 / MAX: 34.36MIN: 4.3 / MAX: 30.311. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenVINO GenAI

OpenBenchmarking.orgtokens/s, More Is BetterOpenVINO GenAI 2024.5Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPUbca71421283530.3631.3831.43

Llama.cpp

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024bca306090120150SE +/- 0.34, N = 3142.57147.12147.471. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048acb816243240SE +/- 0.06, N = 333.2433.6534.191. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048acb816243240SE +/- 0.15, N = 333.6433.8134.471. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128bca1428425670SE +/- 0.13, N = 360.4361.3661.761. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024cab816243240SE +/- 0.38, N = 335.6635.9336.411. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenVINO

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Face Detection FP16 - Device: CPUcba1.03732.07463.11194.14925.1865SE +/- 0.02, N = 34.524.594.611. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Face Detection FP16 - Device: CPUcba2004006008001000SE +/- 4.05, N = 3882.69867.81865.46MIN: 811.64 / MAX: 914.61MIN: 432.97 / MAX: 929.79MIN: 681.88 / MAX: 935.791. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Person Detection FP32 - Device: CPUcba20406080100SE +/- 0.24, N = 392.2891.1490.68MIN: 71.08 / MAX: 120.33MIN: 74.95 / MAX: 112.27MIN: 42.08 / MAX: 127.521. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Person Detection FP32 - Device: CPUcba1020304050SE +/- 0.12, N = 343.3143.8644.071. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

Llamafile

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 128abc48121620SE +/- 0.14, N = 313.6913.8513.89

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 128bca816243240SE +/- 0.12, N = 333.5033.6533.98

Llama.cpp

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024abc816243240SE +/- 0.50, N = 335.8936.1036.191. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenVINO

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2024.5Model: Person Detection FP16 - Device: CPUbca1020304050SE +/- 0.04, N = 343.6043.7343.951. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

Llama.cpp

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128abc3691215SE +/- 0.06, N = 310.2310.2810.311. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenVINO

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2024.5Model: Person Detection FP16 - Device: CPUbca20406080100SE +/- 0.10, N = 391.6691.4090.96MIN: 70.22 / MAX: 107.21MIN: 73 / MAX: 111.22MIN: 45.37 / MAX: 129.411. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

Llamafile

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 128acb612182430SE +/- 0.22, N = 326.2426.3126.44

Llama.cpp

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512bac306090120150SE +/- 1.19, N = 3157.49158.11158.181. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128abc3691215SE +/- 0.04, N = 310.8110.8310.831. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

Llamafile

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 2048abc7K14K21K28K35KSE +/- 0.00, N = 3327683276832768

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 1024abc4K8K12K16K20KSE +/- 0.00, N = 3163841638416384

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 512abc2K4K6K8K10KSE +/- 0.00, N = 3819281928192

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Prompt Processing 256abc9001800270036004500SE +/- 0.00, N = 3409640964096

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 2048abc7K14K21K28K35KSE +/- 0.00, N = 3327683276832768

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 1024abc4K8K12K16K20KSE +/- 0.00, N = 3163841638416384

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 512abc2K4K6K8K10KSE +/- 0.00, N = 3819281928192

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.8.16Model: Llama-3.2-3B-Instruct.Q6_K - Test: Prompt Processing 256abc9001800270036004500SE +/- 0.00, N = 3409640964096

71 Results Shown

Llama.cpp
OpenVINO GenAI
Llamafile:
  Llama-3.2-3B-Instruct.Q6_K - Text Generation 16
  mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 16
  wizardcoder-python-34b-v1.0.Q6_K - Text Generation 16
  TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 16
OpenVINO:
  Handwritten English Recognition FP16-INT8 - CPU
  Machine Translation EN To DE FP16 - CPU
  Machine Translation EN To DE FP16 - CPU
  Handwritten English Recognition FP16-INT8 - CPU
  Vehicle Detection FP16 - CPU
  Face Detection FP16-INT8 - CPU
  Vehicle Detection FP16 - CPU
  Face Detection FP16-INT8 - CPU
Llama.cpp
OpenVINO:
  Person Vehicle Bike Detection FP16 - CPU:
    ms
    FPS
  Face Detection Retail FP16 - CPU:
    FPS
    ms
  Noise Suppression Poconet-Like FP16 - CPU:
    ms
    FPS
OpenVINO GenAI
Llama.cpp
OpenVINO GenAI
OpenVINO:
  Handwritten English Recognition FP16 - CPU:
    ms
    FPS
  Vehicle Detection FP16-INT8 - CPU:
    FPS
  Road Segmentation ADAS FP16 - CPU:
    ms
  Vehicle Detection FP16-INT8 - CPU:
    ms
  Road Segmentation ADAS FP16 - CPU:
    FPS
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU:
    ms
  Weld Porosity Detection FP16 - CPU:
    ms
    FPS
  Face Detection Retail FP16-INT8 - CPU:
    FPS
    ms
  Weld Porosity Detection FP16-INT8 - CPU:
    ms
    FPS
  Road Segmentation ADAS FP16-INT8 - CPU:
    ms
    FPS
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU:
    FPS
  Age Gender Recognition Retail 0013 FP16 - CPU:
    FPS
    ms
  Person Re-Identification Retail FP16 - CPU:
    FPS
    ms
OpenVINO GenAI
Llama.cpp:
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024
OpenVINO:
  Face Detection FP16 - CPU:
    FPS
    ms
  Person Detection FP32 - CPU:
    ms
    FPS
Llamafile:
  mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 128
  TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 128
Llama.cpp
OpenVINO
Llama.cpp
OpenVINO
Llamafile
Llama.cpp:
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128
Llamafile:
  TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 2048
  TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 1024
  TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 512
  TinyLlama-1.1B-Chat-v1.0.BF16 - Prompt Processing 256
  Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 2048
  Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 1024
  Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 512
  Llama-3.2-3B-Instruct.Q6_K - Prompt Processing 256