ampere arm tests Tests for a future article. ARMv8 Neoverse-N1 testing with a System76 Thelio Astra (3.02 BIOS) and NVIDIA RTX A400/PCIe 4GB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2411241-PTS-AMPEREAR12&grr&sor .
ampere arm tests Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server Display Driver OpenGL Compiler File-System Screen Resolution a b ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores) System76 Thelio Astra (3.02 BIOS) Ampere Computing LLC Altra PCI Root Complex A 8 x 32GB DDR4-3200MT/s Micron 18ASF4G72PDZ-3G2F1 1024GB KINGSTON SKC3000S1024G NVIDIA RTX A400/PCIe 4GB NVIDIA Device 2291 DELL P2415Q 2 x Intel X550 + Intel I210 Ubuntu 24.04 6.8.0-48-generic-64k (aarch64) GNOME Shell 46.0 X Server NVIDIA 550.120 4.6.0 GCC 13.2.0 ext4 3840x2160 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-dIwDw0/gcc-13-13.2.0/debian/tmp-nvptx/usr --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto --without-cuda-driver -v Processor Details - a: Scaling Governor: cppc_cpufreq schedutil (Boost: Disabled) - b: Scaling Governor: cppc_cpufreq performance (Boost: Disabled) Java Details - OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.04) Python Details - Python 3.12.3 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
ampere arm tests openvino-genai: Gemma-7b-int4-ov - CPU - Time Per Output Token openvino-genai: Gemma-7b-int4-ov - CPU - Time To First Token openvino-genai: Gemma-7b-int4-ov - CPU openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time Per Output Token openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time To First Token openvino-genai: Falcon-7b-instruct-int4-ov - CPU renaissance: Akka Unbalanced Cobwebbed Tree openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time Per Output Token openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time To First Token openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 renaissance: ALS Movie Lens renaissance: In-Memory Database Shootout renaissance: Savina Reactors.IO renaissance: Apache Spark PageRank openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU renaissance: Gaussian Mixture Model openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU renaissance: Finagle HTTP Requests renaissance: Apache Spark Bayes renaissance: Rand Forest openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time Per Output Token openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time To First Token openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU renaissance: Scala Dotty openvino: Person Detection FP32 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU renaissance: Genetic Algorithm Using Jenetics + Futures openvino: Machine Translation EN To DE FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 primesieve: 1e13 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512 primesieve: 1e12 a b 168.98 201.31 5.92 101.2 121.27 9.88 43100.5 104.37 131.88 9.58 104.84 105.45 21263.0 14065.8 11584.0 3933.2 4788.61 6.53 6255.5 4567.63 6.89 6084.3 331.4 905.4 1021.92 31.09 55.13 82.43 18.14 1360.1 938.5 33.99 937.32 34.03 2583.2 106.86 299.07 350.23 91.23 107.33 297.98 368.95 86.55 81.42 392.46 138.6 230.52 61.26 521.97 178.63 179.01 208.51 153.15 180.27 177.16 63.11 506.71 60.9 524.99 26.12 1224.24 17.28 1850.02 101.01 17.3 1848.45 103.45 261.2 15.38 41.048 19.04 96.79 96.79 239.94 36.58 199.65 2.575 160.47 184.88 6.23 101.42 122.07 9.86 44671.6 103.51 131.64 9.66 104.02 105.16 20800.9 14026.9 11761.5 3959.6 4793.12 6.53 6162.5 4571.1 6.89 5918.8 330.7 907.6 1021.49 31.1 47.83 68.65 20.91 1318.9 937.77 34.01 938.79 33.97 2587.8 106.21 300.92 347.78 91.86 108.65 294.35 367.3 86.94 81.32 392.98 137.56 232.2 60.83 525.64 175.83 181.87 208.37 153.23 180.37 177.13 63.17 506.24 58.17 549.61 27.69 1154.89 16.23 1970.51 103.43 17.59 1818.46 103.7 260.03 16.24 42.813 19 94.36 96.9 229.86 38.61 194.76 2.598 OpenBenchmarking.org
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token b a 40 80 120 160 200 160.47 168.98
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token b a 40 80 120 160 200 184.88 201.31
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU b a 2 4 6 8 10 6.23 5.92
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token a b 20 40 60 80 100 101.20 101.42
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token a b 30 60 90 120 150 121.27 122.07
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU a b 3 6 9 12 15 9.88 9.86
Renaissance Test: Akka Unbalanced Cobwebbed Tree OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Akka Unbalanced Cobwebbed Tree a b 10K 20K 30K 40K 50K 43100.5 44671.6 MAX: 43839.69 MIN: 44225.66 / MAX: 45576.96
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token b a 20 40 60 80 100 103.51 104.37
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token b a 30 60 90 120 150 131.64 131.88
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU b a 3 6 9 12 15 9.66 9.58
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 a b 20 40 60 80 100 104.84 104.02 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 a b 20 40 60 80 100 105.45 105.16 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Renaissance Test: ALS Movie Lens OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: ALS Movie Lens b a 5K 10K 15K 20K 25K 20800.9 21263.0 MIN: 20144.95 / MAX: 20800.92 MIN: 20319.56 / MAX: 21299.7
Renaissance Test: In-Memory Database Shootout OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: In-Memory Database Shootout b a 3K 6K 9K 12K 15K 14026.9 14065.8 MIN: 13770.63 / MAX: 14791.41 MIN: 13793.23 / MAX: 15145.96
Renaissance Test: Savina Reactors.IO OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Savina Reactors.IO a b 3K 6K 9K 12K 15K 11584.0 11761.5 MIN: 10895.07 / MAX: 11896.07 MIN: 11188.89 / MAX: 13121.97
Renaissance Test: Apache Spark PageRank OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Apache Spark PageRank a b 800 1600 2400 3200 4000 3933.2 3959.6 MIN: 3619.83 / MAX: 3933.23 MIN: 3632.35 / MAX: 3959.61
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection FP16-INT8 - Device: CPU a b 1000 2000 3000 4000 5000 4788.61 4793.12 MIN: 3508.77 / MAX: 13349.18 MIN: 3477.12 / MAX: 13307.71 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection FP16-INT8 - Device: CPU b a 2 4 6 8 10 6.53 6.53 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
Renaissance Test: Gaussian Mixture Model OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Gaussian Mixture Model b a 1300 2600 3900 5200 6500 6162.5 6255.5 MIN: 6162.48 / MAX: 6944.07 MIN: 6255.49 / MAX: 6996.44
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection FP16 - Device: CPU a b 1000 2000 3000 4000 5000 4567.63 4571.10 MIN: 2840.63 / MAX: 12274.43 MIN: 2793.73 / MAX: 12373.74 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection FP16 - Device: CPU b a 2 4 6 8 10 6.89 6.89 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
Renaissance Test: Finagle HTTP Requests OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Finagle HTTP Requests b a 1300 2600 3900 5200 6500 5918.8 6084.3 MIN: 5597 / MAX: 6068.47 MIN: 5660.68 / MAX: 6387.6
Renaissance Test: Apache Spark Bayes OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Apache Spark Bayes b a 70 140 210 280 350 330.7 331.4 MIN: 299.76 / MAX: 409.61 MIN: 300.99 / MAX: 409.98
Renaissance Test: Random Forest OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Random Forest a b 200 400 600 800 1000 905.4 907.6 MIN: 756.48 / MAX: 982.42 MIN: 753.44 / MAX: 984.59
OpenVINO Model: Road Segmentation ADAS FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU b a 200 400 600 800 1000 1021.49 1021.92 MIN: 705.35 / MAX: 1115.9 MIN: 713.38 / MAX: 1114.63 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Road Segmentation ADAS FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU b a 7 14 21 28 35 31.10 31.09 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time Per Output Token b a 12 24 36 48 60 47.83 55.13
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time To First Token b a 20 40 60 80 100 68.65 82.43
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU b a 5 10 15 20 25 20.91 18.14
Renaissance Test: Scala Dotty OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Scala Dotty b a 300 600 900 1200 1500 1318.9 1360.1 MIN: 1098.06 / MAX: 2074.06 MIN: 1126.97 / MAX: 2195.04
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Detection FP32 - Device: CPU b a 200 400 600 800 1000 937.77 938.50 MIN: 531.07 / MAX: 1163.54 MIN: 455.19 / MAX: 1165.35 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Detection FP32 - Device: CPU b a 8 16 24 32 40 34.01 33.99 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Detection FP16 - Device: CPU a b 200 400 600 800 1000 937.32 938.79 MIN: 489.42 / MAX: 1170.93 MIN: 526.21 / MAX: 1168.25 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Detection FP16 - Device: CPU a b 8 16 24 32 40 34.03 33.97 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
Renaissance Test: Genetic Algorithm Using Jenetics + Futures OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Genetic Algorithm Using Jenetics + Futures a b 600 1200 1800 2400 3000 2583.2 2587.8 MIN: 2024.35 / MAX: 2583.21 MIN: 1997.15 / MAX: 2587.82
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Machine Translation EN To DE FP16 - Device: CPU b a 20 40 60 80 100 106.21 106.86 MIN: 96.3 / MAX: 152.74 MIN: 97.37 / MAX: 159.08 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Machine Translation EN To DE FP16 - Device: CPU b a 70 140 210 280 350 300.92 299.07 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Noise Suppression Poconet-Like FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Noise Suppression Poconet-Like FP16 - Device: CPU b a 80 160 240 320 400 347.78 350.23 MIN: 97.47 / MAX: 710.93 MIN: 100.1 / MAX: 702.59 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Noise Suppression Poconet-Like FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Noise Suppression Poconet-Like FP16 - Device: CPU b a 20 40 60 80 100 91.86 91.23 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Vehicle Bike Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Vehicle Bike Detection FP16 - Device: CPU a b 20 40 60 80 100 107.33 108.65 MIN: 24.19 / MAX: 176.55 MIN: 29.12 / MAX: 173.29 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Vehicle Bike Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Vehicle Bike Detection FP16 - Device: CPU a b 60 120 180 240 300 297.98 294.35 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16-INT8 - Device: CPU b a 80 160 240 320 400 367.30 368.95 MIN: 258.19 / MAX: 427.48 MIN: 242.56 / MAX: 426.71 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16-INT8 - Device: CPU b a 20 40 60 80 100 86.94 86.55 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Face Detection Retail FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16-INT8 - Device: CPU b a 20 40 60 80 100 81.32 81.42 MIN: 69.6 / MAX: 110.26 MIN: 68.58 / MAX: 111.7 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Face Detection Retail FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16-INT8 - Device: CPU b a 90 180 270 360 450 392.98 392.46 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16 - Device: CPU b a 30 60 90 120 150 137.56 138.60 MIN: 71.32 / MAX: 276.49 MIN: 68.04 / MAX: 255.32 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16 - Device: CPU b a 50 100 150 200 250 232.20 230.52 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16-INT8 - Device: CPU b a 14 28 42 56 70 60.83 61.26 MIN: 27.04 / MAX: 1145.55 MIN: 26.59 / MAX: 1137.75 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16-INT8 - Device: CPU b a 110 220 330 440 550 525.64 521.97 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Re-Identification Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Re-Identification Retail FP16 - Device: CPU b a 40 80 120 160 200 175.83 178.63 MIN: 16.88 / MAX: 235.43 MIN: 18.26 / MAX: 231.24 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Re-Identification Retail FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Re-Identification Retail FP16 - Device: CPU b a 40 80 120 160 200 181.87 179.01 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16-INT8 - Device: CPU b a 50 100 150 200 250 208.37 208.51 MIN: 202.58 / MAX: 292.61 MIN: 203.39 / MAX: 292.57 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16-INT8 - Device: CPU b a 30 60 90 120 150 153.23 153.15 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16 - Device: CPU a b 40 80 120 160 200 180.27 180.37 MIN: 175.48 / MAX: 263.73 MIN: 176.16 / MAX: 264.3 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16 - Device: CPU a b 40 80 120 160 200 177.16 177.13 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16 - Device: CPU a b 14 28 42 56 70 63.11 63.17 MIN: 8.46 / MAX: 1254.27 MIN: 8.62 / MAX: 1257.71 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16 - Device: CPU a b 110 220 330 440 550 506.71 506.24 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16 - Device: CPU b a 14 28 42 56 70 58.17 60.90 MIN: 15.88 / MAX: 118.67 MIN: 16.22 / MAX: 130.67 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16 - Device: CPU b a 120 240 360 480 600 549.61 524.99 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16 - Device: CPU a b 7 14 21 28 35 26.12 27.69 MIN: 6.29 / MAX: 82.72 MIN: 6.19 / MAX: 78.84 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16 - Device: CPU a b 300 600 900 1200 1500 1224.24 1154.89 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU b a 4 8 12 16 20 16.23 17.28 MIN: 7.16 / MAX: 294.42 MIN: 7.99 / MAX: 296.13 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU b a 400 800 1200 1600 2000 1970.51 1850.02 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 b a 20 40 60 80 100 103.43 101.01 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
OpenVINO Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU a b 4 8 12 16 20 17.30 17.59 MIN: 0.97 / MAX: 293.08 MIN: 1.17 / MAX: 295.88 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU a b 400 800 1200 1600 2000 1848.45 1818.46 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 b a 20 40 60 80 100 103.70 103.45 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 a b 60 120 180 240 300 261.20 260.03 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 b a 4 8 12 16 20 16.24 15.38 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Primesieve Length: 1e13 OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.6 Length: 1e13 a b 10 20 30 40 50 41.05 42.81 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 a b 5 10 15 20 25 19.04 19.00 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 a b 20 40 60 80 100 96.79 94.36 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 b a 20 40 60 80 100 96.90 96.79 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 a b 50 100 150 200 250 239.94 229.86 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 b a 9 18 27 36 45 38.61 36.58 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 a b 40 80 120 160 200 199.65 194.76 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Primesieve Length: 1e12 OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.6 Length: 1e12 a b 0.5846 1.1692 1.7538 2.3384 2.923 2.575 2.598 1. (CXX) g++ options: -O3
Phoronix Test Suite v10.8.5