x3d sun: Benchmarks for a future article. AMD Ryzen 7 9800X3D 8-Core testing with an ASRock X870E Taichi (3.12.AS02 BIOS) and NVIDIA GeForce RTX 4070 12GB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2411257-PTS-X3DSUN4547&sor&gru .
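A result like this can typically be reproduced or compared against locally by handing the OpenBenchmarking.org result ID from the URL above to the Phoronix Test Suite. The sketch below simply wraps that invocation; treat the exact subcommand behavior as an assumption rather than something documented in this export.

import subprocess

# OpenBenchmarking.org result ID taken from the export URL above.
RESULT_ID = "2411257-PTS-X3DSUN4547"

# Assumption: `phoronix-test-suite benchmark <result-id>` fetches the
# referenced result and runs the same test profiles locally for comparison.
subprocess.run(["phoronix-test-suite", "benchmark", RESULT_ID], check=True)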
System Details (identical for runs a and b)
Processor: AMD Ryzen 7 9800X3D 8-Core @ 6.22GHz (8 Cores / 16 Threads)
Motherboard: ASRock X870E Taichi (3.12.AS02 BIOS)
Chipset: AMD Device 14d8
Memory: 2 x 16GB DDR5-6000MT/s F5-6000J2836G16G
Disk: Western Digital WD_BLACK SN850X 2000GB
Graphics: NVIDIA GeForce RTX 4070 12GB
Audio: NVIDIA AD104 HD Audio
Monitor: DELL U2723QE
Network: Realtek Device 8126 + MEDIATEK Device 0717
OS: Ubuntu 24.04
Kernel: 6.8.0-49-generic (x86_64)
Desktop: GNOME Shell 46.0
Display Server: X Server 1.21.1.11
Display Driver: NVIDIA 565.57.01
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.7.33
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details: nouveau.modeset=0 - Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: amd-pstate-epp performance (EPP: performance) - CPU Microcode: 0xb404023
Java Details: OpenJDK Runtime Environment (build 21.0.5+11-Ubuntu-1ubuntu124.04)
Python Details: Python 3.12.3
Security Details: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
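The scaling governor and mitigation status reported in the Processor Details and Security Details above come from standard Linux sysfs files. A minimal sketch for checking the same values on a comparable Linux system; the sysfs paths are standard kernel interfaces, not something specific to this result.

from pathlib import Path

# Scaling governor, matching the Processor Details entry above
# (amd-pstate-epp "performance" on this system).
governor = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor").read_text().strip()
print(f"scaling governor: {governor}")

# Per-vulnerability mitigation status, matching the Security Details entry.
for vuln in sorted(Path("/sys/devices/system/cpu/vulnerabilities").iterdir()):
    print(f"{vuln.name}: {vuln.read_text().strip()}")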
Result Overview: side-by-side summary table of every OpenVINO, Llama.cpp, OpenVINO GenAI, Renaissance, Primesieve, and Blender result for runs a and b; the same figures are listed per test below.
OpenVINO 2024.5 - Device: CPU - FPS, more is better. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs
Model: Face Detection FP16: a: 13.55, b: 13.56
Model: Person Detection FP16: a: 125.14, b: 134.63
Model: Person Detection FP32: a: 125.85, b: 127.03
Model: Vehicle Detection FP16: a: 1074.93, b: 1053.24
Model: Face Detection FP16-INT8: a: 25.60, b: 25.51
Model: Face Detection Retail FP16: a: 2720.37, b: 2649.11
Model: Road Segmentation ADAS FP16: a: 399.65, b: 380.54
Model: Vehicle Detection FP16-INT8: a: 1432.67, b: 1429.01
Model: Weld Porosity Detection FP16: a: 1301.51, b: 1303.10
Model: Face Detection Retail FP16-INT8: a: 3828.93, b: 3868.70
Model: Road Segmentation ADAS FP16-INT8: a: 468.10, b: 458.47
Model: Machine Translation EN To DE FP16: a: 125.44, b: 124.85
Model: Weld Porosity Detection FP16-INT8: a: 2441.60, b: 2427.32
Model: Person Vehicle Bike Detection FP16: a: 1275.14, b: 1273.96
Model: Noise Suppression Poconet-Like FP16: a: 1228.77, b: 1220.46
Model: Handwritten English Recognition FP16: a: 559.03, b: 550.55
Model: Person Re-Identification Retail FP16: a: 1612.55, b: 1608.60
Model: Age Gender Recognition Retail 0013 FP16: a: 30648.41, b: 30751.32
Model: Handwritten English Recognition FP16-INT8: a: 565.22, b: 564.93
Model: Age Gender Recognition Retail 0013 FP16-INT8: a: 40630.00, b: 40555.42
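These throughput figures are produced by the Phoronix Test Suite's OpenVINO profile. As a rough illustration of how a CPU FPS number of this kind can be measured with the OpenVINO Python API, here is a minimal sketch; the model filename is a placeholder, and this simple synchronous loop will not match the tuned pipeline behind the official results.

import time
import numpy as np
import openvino as ov

core = ov.Core()
# Placeholder IR model path; the test profile downloads specific models.
model = core.read_model("face-detection-model.xml")
compiled = core.compile_model(model, "CPU", {"PERFORMANCE_HINT": "THROUGHPUT"})

request = compiled.create_infer_request()
port = compiled.input(0)
shape = [int(d) for d in port.shape]              # assumes a static input shape
data = np.random.rand(*shape).astype(np.float32)  # dummy input tensor

# Simple synchronous timing loop; real throughput runs use async requests.
n = 200
start = time.perf_counter()
for _ in range(n):
    request.infer({port: data})
elapsed = time.perf_counter() - start
print(f"~{n / elapsed:.1f} FPS")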
Llama.cpp b4154 - Backend: CPU BLAS - Tokens Per Second, more is better. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128: a: 7.27, b: 7.28
Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512: a: 108.23, b: 109.30
Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024: a: 104.57, b: 104.56
Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048: a: 96.53, b: 96.74
Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128: a: 7.68, b: 7.67
Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512: a: 108.99, b: 108.38
Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024: a: 104.62, b: 103.69
Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048: a: 96.13, b: 95.20
Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128: a: 60.37, b: 60.15
Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512: a: 320.35, b: 294.73
Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024: a: 268.90, b: 268.47
Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048: a: 242.14, b: 255.78
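The Llama.cpp results separate prompt processing (batched prefill of 512/1024/2048 tokens) from text generation (token-by-token decode of 128 tokens). As a rough sketch of measuring the generation side at home, here is a short example using the llama-cpp-python bindings rather than the llama.cpp build the test suite compiles itself; the GGUF path and thread count are placeholders.

import time
from llama_cpp import Llama

# Placeholder GGUF path; the test profile downloads its own Q8_0 models.
llm = Llama(model_path="Llama-3.1-Tulu-3-8B-Q8_0.gguf", n_threads=16, verbose=False)

prompt = "Explain the difference between prompt processing and text generation."
start = time.perf_counter()
out = llm(prompt, max_tokens=128)
elapsed = time.perf_counter() - start

generated = out["usage"]["completion_tokens"]
# Note: elapsed here includes the prefill of the prompt as well as decoding.
print(f"~{generated / elapsed:.2f} tokens/s")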
OpenVINO GenAI 2024.5 - Device: CPU - tokens/s, more is better
Model: Gemma-7b-int4-ov: a: 10.76, b: 10.79
Model: TinyLlama-1.1B-Chat-v1.0: a: 26.42, b: 26.39
Model: Falcon-7b-instruct-int4-ov: a: 14.17, b: 14.19
Model: Phi-3-mini-128k-instruct-int4-ov: a: 21.63, b: 21.61
Renaissance 0.16 - ms, fewer is better
Test: Scala Dotty: a: 332.3 (min 279.66 / max 459.23), b: 337.5 (min 285.27 / max 447.97)
Test: Random Forest: a: 313.6 (min 286.88 / max 359.83), b: 320.3 (min 287.69 / max 368.17)
Test: ALS Movie Lens: a: 5675.8 (min 5581.56 / max 6205.97), b: 5503.2 (min 5503.17 / max 6137.99)
Test: Apache Spark Bayes: a: 464.0 (min 423.57 / max 464.02), b: 461.9 (min 421.63 / max 470.36)
Test: Savina Reactors.IO: a: 3235.7 (max 3969.9), b: 3274.5 (min 3274.47 / max 4491.18)
Test: Apache Spark PageRank: a: 1731.2 (min 1337.87 / max 1731.24), b: 1768.0 (min 1347.59 / max 1768.01)
Test: Finagle HTTP Requests: a: 946.6 (min 922.09 / max 948.56), b: 1036.1 (min 913.25 / max 1036.14)
Test: Gaussian Mixture Model: a: 2754.2 (min 2079.76 / max 2754.23), b: 2785.8 (min 2099.87 / max 2785.82)
Test: In-Memory Database Shootout: a: 1774.9 (min 1710.54 / max 1868.66), b: 1719.0 (min 1638.13 / max 1767.56)
Test: Akka Unbalanced Cobwebbed Tree: a: 2997.6 (min 2997.57 / max 3595.56), b: 2999.4 (max 3557.95)
Test: Genetic Algorithm Using Jenetics + Futures: a: 629.9 (min 621.08 / max 634.04), b: 651.5 (min 640.26)
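The Renaissance figures are per-repetition wall-clock times on the OpenJDK 21 runtime listed in the system details. Below is a minimal sketch of running one of the same workloads directly from the suite's harness jar; the jar filename follows the usual naming of the 0.16 release and is an assumption here.

import subprocess

# Assumed filename for the Renaissance 0.16 GPL bundle.
JAR = "renaissance-gpl-0.16.0.jar"

# Runs the Scala Dotty benchmark with the harness's default repetition count
# and prints per-repetition timings comparable to the ms figures above.
subprocess.run(["java", "-jar", JAR, "scala-dotty"], check=True)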
OpenVINO 2024.5 - Device: CPU - latency in ms, fewer is better. (CXX) g++ options as listed for the OpenVINO throughput results above
Model: Face Detection FP16: a: 295.13 (min 283.45 / max 303.93), b: 294.80 (min 284.42 / max 305.81)
Model: Person Detection FP16: a: 31.93 (min 25.26 / max 45.33), b: 29.67 (min 25.34 / max 46.19)
Model: Person Detection FP32: a: 31.75 (min 26.2 / max 44.37), b: 31.45 (min 22.07 / max 46.26)
Model: Vehicle Detection FP16: a: 3.70 (min 2.12 / max 8.14), b: 3.77 (min 2.13 / max 7.07)
Model: Face Detection FP16-INT8: a: 156.19 (min 149.55 / max 159.9), b: 156.63 (min 148.3 / max 162.5)
Model: Face Detection Retail FP16: a: 1.42 (min 0.79 / max 5.25), b: 1.44 (min 0.79 / max 5.45)
Model: Road Segmentation ADAS FP16: a: 9.98 (min 6.76 / max 15.17), b: 10.48 (min 5.96 / max 13.86)
Model: Vehicle Detection FP16-INT8: a: 2.77 (min 1.66 / max 5.05), b: 2.78 (min 1.65 / max 6.66)
Model: Weld Porosity Detection FP16: a: 6.04 (min 3.27 / max 13.47), b: 6.04 (min 3.43 / max 11.68)
Model: Face Detection Retail FP16-INT8: a: 2.02 (min 1.13 / max 4.45), b: 2.01 (min 1.12 / max 5)
Model: Road Segmentation ADAS FP16-INT8: a: 8.52 (min 5.15 / max 17.8), b: 8.70 (min 5.2 / max 14.14)
Model: Machine Translation EN To DE FP16: a: 31.85 (min 16.85 / max 42.5), b: 32.00 (min 15.15 / max 41.09)
Model: Weld Porosity Detection FP16-INT8: a: 3.19 (min 1.67 / max 6.29), b: 3.20 (min 1.67 / max 9.26)
Model: Person Vehicle Bike Detection FP16: a: 3.09 (min 2.19 / max 7.29), b: 3.10 (min 2.07 / max 8.09)
Model: Noise Suppression Poconet-Like FP16: a: 6.45 (min 4.75 / max 8.7), b: 6.49 (min 4.49 / max 18.72)
Model: Handwritten English Recognition FP16: a: 14.23 (min 10.48 / max 23.62), b: 14.42 (min 9.72 / max 24.33)
Model: Person Re-Identification Retail FP16: a: 2.44 (min 1.46 / max 6.21), b: 2.44 (min 1.46 / max 4.98)
Model: Age Gender Recognition Retail 0013 FP16: a: 0.24 (min 0.14 / max 3.21), b: 0.24 (min 0.14 / max 3.74)
Model: Handwritten English Recognition FP16-INT8: a: 14.05 (min 11.05 / max 27.25), b: 14.06 (min 9.58 / max 24.22)
Model: Age Gender Recognition Retail 0013 FP16-INT8: a: 0.18 (min 0.1 / max 4.15), b: 0.18 (min 0.11 / max 4.01)
OpenVINO GenAI 2024.5 - Device: CPU - latency in ms, fewer is better
Model: Gemma-7b-int4-ov - Time To First Token: a: 107.55, b: 105.09
Model: Gemma-7b-int4-ov - Time Per Output Token: a: 92.93, b: 92.68
Model: TinyLlama-1.1B-Chat-v1.0 - Time To First Token: a: 39.61, b: 39.66
Model: TinyLlama-1.1B-Chat-v1.0 - Time Per Output Token: a: 37.85, b: 37.89
Model: Falcon-7b-instruct-int4-ov - Time To First Token: a: 80.92, b: 80.51
Model: Falcon-7b-instruct-int4-ov - Time Per Output Token: a: 70.58, b: 70.49
Model: Phi-3-mini-128k-instruct-int4-ov - Time To First Token: a: 51.76, b: 51.79
Model: Phi-3-mini-128k-instruct-int4-ov - Time Per Output Token: a: 46.24, b: 46.27
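Time To First Token covers the prompt-processing (prefill) phase, while Time Per Output Token covers steady-state decoding, so the end-to-end latency for an N-token reply can be roughly estimated as TTFT + (N - 1) x TPOT. A small worked example using the Gemma-7b figures from run a above; the reply length is a hypothetical value chosen for illustration.

# Gemma-7b-int4-ov on CPU, run "a" (milliseconds, from the table above).
ttft_ms = 107.55   # Time To First Token
tpot_ms = 92.93    # Time Per Output Token

n_tokens = 128     # hypothetical reply length

# Rough end-to-end estimate: prefill once, then decode the remaining tokens.
total_ms = ttft_ms + (n_tokens - 1) * tpot_ms
tokens_per_s = n_tokens / (total_ms / 1000)
print(f"~{total_ms / 1000:.1f} s for a {n_tokens}-token reply (~{tokens_per_s:.2f} tokens/s)")

The resulting estimate of roughly 10.75 tokens/s lines up with the 10.76 tokens/s throughput reported for the same model and run in the OpenVINO GenAI section above.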
Primesieve 12.6 - Seconds, fewer is better. (CXX) g++ options: -O3
Length: 1e12: a: 10.85, b: 10.84
Length: 1e13: a: 135.90, b: 135.58
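Primesieve times how long it takes to sieve all primes below 10^12 and 10^13 on the CPU. For a rough comparison outside the test suite, the sketch below uses the primesieve Python bindings; that the primesieve pip package is installed, and that its timing is comparable to the compiled benchmark, are assumptions.

import time
import primesieve  # assumption: python-primesieve bindings are installed

limit = 10**12
start = time.perf_counter()
count = primesieve.count_primes(limit)  # counts primes up to `limit`
elapsed = time.perf_counter() - start
print(f"{count} primes below 1e12 in {elapsed:.2f} s")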
Blender 4.3 - Seconds, fewer is better
Blend File: BMW27 - Compute: CPU-Only: a: 84.28, b: 84.85
Blend File: BMW27 - Compute: NVIDIA OptiX: a: 6.25, b: 6.36
Blend File: Junkshop - Compute: CPU-Only: a: 109.31, b: 110.24
Blend File: Junkshop - Compute: NVIDIA OptiX: a: 12.66, b: 12.66
Blend File: Classroom - Compute: CPU-Only: a: 238.72, b: 240.88
Blend File: Classroom - Compute: NVIDIA OptiX: a: 15.67, b: 15.52
Blend File: Fishy Cat - Compute: CPU-Only: a: 118.76, b: 119.39
Blend File: Fishy Cat - Compute: NVIDIA OptiX: a: 11.51, b: 11.55
Blend File: Barbershop - Compute: CPU-Only: a: 822.15, b: 824.33
Blend File: Barbershop - Compute: NVIDIA OptiX: a: 62.28, b: 62.35
Blend File: Pabellon Barcelona - Compute: CPU-Only: a: 267.62, b: 269.06
Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX: a: 16.88, b: 16.85
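The Blender 4.3 numbers are single-frame Cycles renders of the standard demo scenes, run once on the CPU and once on the RTX 4070 via OptiX. A minimal sketch of launching the same kind of headless render from Python; the .blend path is a placeholder, while the --cycles-device passthrough is standard Blender CLI usage.

import subprocess

BLEND_FILE = "bmw27.blend"  # placeholder; the test profile fetches the official demo files

# Headless Cycles render of frame 1 on the GPU via OptiX; drop everything
# after "--" to render on the CPU instead.
subprocess.run(
    ["blender", "-b", BLEND_FILE, "-E", "CYCLES", "-f", "1",
     "--", "--cycles-device", "OPTIX"],
    check=True,
)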
Phoronix Test Suite v10.8.5