ampere arm tests Tests for a future article. ARMv8 Neoverse-N1 testing with a System76 Thelio Astra (3.02 BIOS) and NVIDIA RTX A400/PCIe 4GB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2411241-PTS-AMPEREAR12&gru&sor .
ampere arm tests Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server Display Driver OpenGL Compiler File-System Screen Resolution a b ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores) System76 Thelio Astra (3.02 BIOS) Ampere Computing LLC Altra PCI Root Complex A 8 x 32GB DDR4-3200MT/s Micron 18ASF4G72PDZ-3G2F1 1024GB KINGSTON SKC3000S1024G NVIDIA RTX A400/PCIe 4GB NVIDIA Device 2291 DELL P2415Q 2 x Intel X550 + Intel I210 Ubuntu 24.04 6.8.0-48-generic-64k (aarch64) GNOME Shell 46.0 X Server NVIDIA 550.120 4.6.0 GCC 13.2.0 ext4 3840x2160 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-dIwDw0/gcc-13-13.2.0/debian/tmp-nvptx/usr --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto --without-cuda-driver -v Processor Details - a: Scaling Governor: cppc_cpufreq schedutil (Boost: Disabled) - b: Scaling Governor: cppc_cpufreq performance (Boost: Disabled) Java Details - OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.04) Python Details - Python 3.12.3 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
ampere arm tests openvino: Face Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP32 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048 openvino-genai: Gemma-7b-int4-ov - CPU openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU openvino-genai: Falcon-7b-instruct-int4-ov - CPU openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU renaissance: Scala Dotty renaissance: Rand Forest renaissance: ALS Movie Lens renaissance: Apache Spark Bayes renaissance: Savina Reactors.IO renaissance: Apache Spark PageRank renaissance: Finagle HTTP Requests renaissance: Gaussian Mixture Model renaissance: In-Memory Database Shootout renaissance: Akka Unbalanced Cobwebbed Tree renaissance: Genetic Algorithm Using Jenetics + Futures openvino: Face Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP32 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino-genai: Gemma-7b-int4-ov - CPU - Time To First Token openvino-genai: Gemma-7b-int4-ov - CPU - Time Per Output Token openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time To First Token openvino-genai: TinyLlama-1.1B-Chat-v1.0 - CPU - Time Per Output Token openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time To First Token openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time Per Output Token openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time To First Token openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time Per Output Token primesieve: 1e12 primesieve: 1e13 a b 6.89 34.03 33.99 524.99 6.53 1224.24 230.52 86.55 506.71 392.46 31.09 299.07 521.97 297.98 91.23 177.16 179.01 1848.45 153.15 1850.02 19.04 96.79 103.45 104.84 15.38 96.79 101.01 105.45 36.58 199.65 239.94 261.2 5.92 18.14 9.88 9.58 1360.1 905.4 21263.0 331.4 11584.0 3933.2 6084.3 6255.5 14065.8 43100.5 2583.2 4567.63 937.32 938.5 60.9 4788.61 26.12 138.6 368.95 63.11 81.42 1021.92 106.86 61.26 107.33 350.23 180.27 178.63 17.3 208.51 17.28 201.31 168.98 82.43 55.13 121.27 101.2 131.88 104.37 2.575 41.048 6.89 33.97 34.01 549.61 6.53 1154.89 232.2 86.94 506.24 392.98 31.1 300.92 525.64 294.35 91.86 177.13 181.87 1818.46 153.23 1970.51 19 94.36 103.7 104.02 16.24 96.9 103.43 105.16 38.61 194.76 229.86 260.03 6.23 20.91 9.86 9.66 1318.9 907.6 20800.9 330.7 11761.5 3959.6 5918.8 6162.5 14026.9 44671.6 2587.8 4571.1 938.79 937.77 58.17 4793.12 27.69 137.56 367.3 63.17 81.32 1021.49 106.21 60.83 108.65 347.78 180.37 175.83 17.59 208.37 16.23 184.88 160.47 68.65 47.83 122.07 101.42 131.64 103.51 2.598 42.813 OpenBenchmarking.org
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection FP16 - Device: CPU b a 2 4 6 8 10 6.89 6.89 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Detection FP16 - Device: CPU a b 8 16 24 32 40 34.03 33.97 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Detection FP32 - Device: CPU b a 8 16 24 32 40 34.01 33.99 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16 - Device: CPU b a 120 240 360 480 600 549.61 524.99 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection FP16-INT8 - Device: CPU b a 2 4 6 8 10 6.53 6.53 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16 - Device: CPU a b 300 600 900 1200 1500 1224.24 1154.89 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16 - Device: CPU b a 50 100 150 200 250 232.20 230.52 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16-INT8 - Device: CPU b a 20 40 60 80 100 86.94 86.55 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16 - Device: CPU a b 110 220 330 440 550 506.71 506.24 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Face Detection Retail FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16-INT8 - Device: CPU b a 90 180 270 360 450 392.98 392.46 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Road Segmentation ADAS FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU b a 7 14 21 28 35 31.10 31.09 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Machine Translation EN To DE FP16 - Device: CPU b a 70 140 210 280 350 300.92 299.07 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16-INT8 - Device: CPU b a 110 220 330 440 550 525.64 521.97 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Vehicle Bike Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Vehicle Bike Detection FP16 - Device: CPU a b 60 120 180 240 300 297.98 294.35 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Noise Suppression Poconet-Like FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Noise Suppression Poconet-Like FP16 - Device: CPU b a 20 40 60 80 100 91.86 91.23 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16 - Device: CPU a b 40 80 120 160 200 177.16 177.13 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Re-Identification Retail FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Person Re-Identification Retail FP16 - Device: CPU b a 40 80 120 160 200 181.87 179.01 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU a b 400 800 1200 1600 2000 1848.45 1818.46 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16-INT8 - Device: CPU b a 30 60 90 120 150 153.23 153.15 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU b a 400 800 1200 1600 2000 1970.51 1850.02 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 a b 5 10 15 20 25 19.04 19.00 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 a b 20 40 60 80 100 96.79 94.36 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 b a 20 40 60 80 100 103.70 103.45 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 a b 20 40 60 80 100 104.84 104.02 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 b a 4 8 12 16 20 16.24 15.38 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 b a 20 40 60 80 100 96.90 96.79 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 b a 20 40 60 80 100 103.43 101.01 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 a b 20 40 60 80 100 105.45 105.16 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 b a 9 18 27 36 45 38.61 36.58 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 a b 40 80 120 160 200 199.65 194.76 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 a b 50 100 150 200 250 239.94 229.86 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4154 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 a b 60 120 180 240 300 261.20 260.03 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -fopenmp -lopenblas
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU b a 2 4 6 8 10 6.23 5.92
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU b a 5 10 15 20 25 20.91 18.14
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU a b 3 6 9 12 15 9.88 9.86
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU OpenBenchmarking.org tokens/s, More Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU b a 3 6 9 12 15 9.66 9.58
Renaissance Test: Scala Dotty OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Scala Dotty b a 300 600 900 1200 1500 1318.9 1360.1 MIN: 1098.06 / MAX: 2074.06 MIN: 1126.97 / MAX: 2195.04
Renaissance Test: Random Forest OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Random Forest a b 200 400 600 800 1000 905.4 907.6 MIN: 756.48 / MAX: 982.42 MIN: 753.44 / MAX: 984.59
Renaissance Test: ALS Movie Lens OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: ALS Movie Lens b a 5K 10K 15K 20K 25K 20800.9 21263.0 MIN: 20144.95 / MAX: 20800.92 MIN: 20319.56 / MAX: 21299.7
Renaissance Test: Apache Spark Bayes OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Apache Spark Bayes b a 70 140 210 280 350 330.7 331.4 MIN: 299.76 / MAX: 409.61 MIN: 300.99 / MAX: 409.98
Renaissance Test: Savina Reactors.IO OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Savina Reactors.IO a b 3K 6K 9K 12K 15K 11584.0 11761.5 MIN: 10895.07 / MAX: 11896.07 MIN: 11188.89 / MAX: 13121.97
Renaissance Test: Apache Spark PageRank OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Apache Spark PageRank a b 800 1600 2400 3200 4000 3933.2 3959.6 MIN: 3619.83 / MAX: 3933.23 MIN: 3632.35 / MAX: 3959.61
Renaissance Test: Finagle HTTP Requests OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Finagle HTTP Requests b a 1300 2600 3900 5200 6500 5918.8 6084.3 MIN: 5597 / MAX: 6068.47 MIN: 5660.68 / MAX: 6387.6
Renaissance Test: Gaussian Mixture Model OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Gaussian Mixture Model b a 1300 2600 3900 5200 6500 6162.5 6255.5 MIN: 6162.48 / MAX: 6944.07 MIN: 6255.49 / MAX: 6996.44
Renaissance Test: In-Memory Database Shootout OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: In-Memory Database Shootout b a 3K 6K 9K 12K 15K 14026.9 14065.8 MIN: 13770.63 / MAX: 14791.41 MIN: 13793.23 / MAX: 15145.96
Renaissance Test: Akka Unbalanced Cobwebbed Tree OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Akka Unbalanced Cobwebbed Tree a b 10K 20K 30K 40K 50K 43100.5 44671.6 MAX: 43839.69 MIN: 44225.66 / MAX: 45576.96
Renaissance Test: Genetic Algorithm Using Jenetics + Futures OpenBenchmarking.org ms, Fewer Is Better Renaissance 0.16 Test: Genetic Algorithm Using Jenetics + Futures a b 600 1200 1800 2400 3000 2583.2 2587.8 MIN: 2024.35 / MAX: 2583.21 MIN: 1997.15 / MAX: 2587.82
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection FP16 - Device: CPU a b 1000 2000 3000 4000 5000 4567.63 4571.10 MIN: 2840.63 / MAX: 12274.43 MIN: 2793.73 / MAX: 12373.74 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Detection FP16 - Device: CPU a b 200 400 600 800 1000 937.32 938.79 MIN: 489.42 / MAX: 1170.93 MIN: 526.21 / MAX: 1168.25 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Detection FP32 - Device: CPU b a 200 400 600 800 1000 937.77 938.50 MIN: 531.07 / MAX: 1163.54 MIN: 455.19 / MAX: 1165.35 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16 - Device: CPU b a 14 28 42 56 70 58.17 60.90 MIN: 15.88 / MAX: 118.67 MIN: 16.22 / MAX: 130.67 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection FP16-INT8 - Device: CPU a b 1000 2000 3000 4000 5000 4788.61 4793.12 MIN: 3508.77 / MAX: 13349.18 MIN: 3477.12 / MAX: 13307.71 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16 - Device: CPU a b 7 14 21 28 35 26.12 27.69 MIN: 6.29 / MAX: 82.72 MIN: 6.19 / MAX: 78.84 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16 - Device: CPU b a 30 60 90 120 150 137.56 138.60 MIN: 71.32 / MAX: 276.49 MIN: 68.04 / MAX: 255.32 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Vehicle Detection FP16-INT8 - Device: CPU b a 80 160 240 320 400 367.30 368.95 MIN: 258.19 / MAX: 427.48 MIN: 242.56 / MAX: 426.71 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16 - Device: CPU a b 14 28 42 56 70 63.11 63.17 MIN: 8.46 / MAX: 1254.27 MIN: 8.62 / MAX: 1257.71 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Face Detection Retail FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Face Detection Retail FP16-INT8 - Device: CPU b a 20 40 60 80 100 81.32 81.42 MIN: 69.6 / MAX: 110.26 MIN: 68.58 / MAX: 111.7 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Road Segmentation ADAS FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU b a 200 400 600 800 1000 1021.49 1021.92 MIN: 705.35 / MAX: 1115.9 MIN: 713.38 / MAX: 1114.63 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Machine Translation EN To DE FP16 - Device: CPU b a 20 40 60 80 100 106.21 106.86 MIN: 96.3 / MAX: 152.74 MIN: 97.37 / MAX: 159.08 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Weld Porosity Detection FP16-INT8 - Device: CPU b a 14 28 42 56 70 60.83 61.26 MIN: 27.04 / MAX: 1145.55 MIN: 26.59 / MAX: 1137.75 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Vehicle Bike Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Vehicle Bike Detection FP16 - Device: CPU a b 20 40 60 80 100 107.33 108.65 MIN: 24.19 / MAX: 176.55 MIN: 29.12 / MAX: 173.29 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Noise Suppression Poconet-Like FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Noise Suppression Poconet-Like FP16 - Device: CPU b a 80 160 240 320 400 347.78 350.23 MIN: 97.47 / MAX: 710.93 MIN: 100.1 / MAX: 702.59 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16 - Device: CPU a b 40 80 120 160 200 180.27 180.37 MIN: 175.48 / MAX: 263.73 MIN: 176.16 / MAX: 264.3 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Person Re-Identification Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Person Re-Identification Retail FP16 - Device: CPU b a 40 80 120 160 200 175.83 178.63 MIN: 16.88 / MAX: 235.43 MIN: 18.26 / MAX: 231.24 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU a b 4 8 12 16 20 17.30 17.59 MIN: 0.97 / MAX: 293.08 MIN: 1.17 / MAX: 295.88 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Handwritten English Recognition FP16-INT8 - Device: CPU b a 50 100 150 200 250 208.37 208.51 MIN: 202.58 / MAX: 292.61 MIN: 203.39 / MAX: 292.57 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.5 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU b a 4 8 12 16 20 16.23 17.28 MIN: 7.16 / MAX: 294.42 MIN: 7.99 / MAX: 296.13 1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -std=c++14 -fPIC -fvisibility=hidden -MD -MT -MF
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token b a 40 80 120 160 200 184.88 201.31
OpenVINO GenAI Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token b a 40 80 120 160 200 160.47 168.98
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time To First Token b a 20 40 60 80 100 68.65 82.43
OpenVINO GenAI Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: TinyLlama-1.1B-Chat-v1.0 - Device: CPU - Time Per Output Token b a 12 24 36 48 60 47.83 55.13
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token a b 30 60 90 120 150 121.27 122.07
OpenVINO GenAI Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token a b 20 40 60 80 100 101.20 101.42
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token b a 30 60 90 120 150 131.64 131.88
OpenVINO GenAI Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token OpenBenchmarking.org ms, Fewer Is Better OpenVINO GenAI 2024.5 Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token b a 20 40 60 80 100 103.51 104.37
Primesieve Length: 1e12 OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.6 Length: 1e12 a b 0.5846 1.1692 1.7538 2.3384 2.923 2.575 2.598 1. (CXX) g++ options: -O3
Primesieve Length: 1e13 OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.6 Length: 1e13 a b 10 20 30 40 50 41.05 42.81 1. (CXX) g++ options: -O3
Phoronix Test Suite v10.8.5