AMD EPYC Turin AI/ML Tuning Guide

AMD EPYC 9655P benchmarked following AMD's EPYC 9005 BIOS and workload tuning guide for AI/ML workloads: https://www.amd.com/content/dam/amd/en/documents/epyc-technical-docs/tuning-guides/58467_amd-epyc-9005-tg-bios-and-workload.pdf

Benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2411286-NE-AMDEPYCTU24&sor&grs
System configuration (common to both the Stock and AI/ML Tuning Recommendations runs):

    Processor: AMD EPYC 9655P 96-Core @ 2.60GHz (96 Cores / 192 Threads)
    Motherboard: Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS)
    Chipset: AMD 1Ah
    Memory: 12 x 64GB DDR5-6000MT/s Micron MTC40F2046S1RC64BDY QSFF
    Disk: 3201GB Micron_7450_MTFDKCB3T2TFS
    Graphics: ASPEED
    Network: 2 x Broadcom NetXtreme BCM5720 PCIe
    OS: Ubuntu 24.10
    Kernel: 6.12.0-rc7-linux-pm-next-phx (x86_64)
    Desktop: GNOME Shell 47.0
    Display Server: X Server
    Compiler: GCC 14.2.0
    File-System: ext4
    Screen Resolution: 1024x768

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: acpi-cpufreq performance (Boost: Enabled); CPU Microcode: 0xb002116

Python Details: Python 3.12.7

Security Details:
    gather_data_sampling: Not affected
    itlb_multihit: Not affected
    l1tf: Not affected
    mds: Not affected
    meltdown: Not affected
    mmio_stale_data: Not affected
    reg_file_data_sampling: Not affected
    retbleed: Not affected
    spec_rstack_overflow: Not affected
    spec_store_bypass: Mitigation of SSB disabled via prctl
    spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
    spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
    srbds: Not affected
    tsx_async_abort: Not affected
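The Kernel Details above report Transparent Huge Pages set to madvise. The kernel exposes this setting through sysfs with every available mode listed and the active one in brackets (e.g. "always [madvise] never"); a minimal sketch of parsing that format, with a hypothetical helper name:

```python
# Hypothetical helper: extract the active Transparent Huge Pages mode from
# the sysfs format, where the active mode is the bracketed token.
def active_thp_mode(sysfs_text: str) -> str:
    """Return the bracketed (active) mode from the THP 'enabled' file."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no active mode found")

# On a live system one would read the contents of:
#   /sys/kernel/mm/transparent_hugepage/enabled
print(active_thp_mode("always [madvise] never"))  # -> madvise
```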
Result summary (Stock vs. AI/ML Tuning Recommendations, sorted by greatest relative difference; details for each test follow below):

Test | Stock | AI/ML Tuning Recommendations
openvino: Weld Porosity Detection FP16-INT8 - CPU [ms] | 6.72 | 6.09
openvino: Weld Porosity Detection FP16-INT8 - CPU [FPS] | 14035.76 | 15437.79
openvino: Person Vehicle Bike Detection FP16 - CPU [ms] | 7.09 | 6.66
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512 [tokens/s] | 72.4 | 77.05
openvino: Person Vehicle Bike Detection FP16 - CPU [FPS] | 6720.98 | 7144.84
openvino: Face Detection Retail FP16-INT8 - CPU [FPS] | 23587.02 | 25050.47
openvino: Face Detection Retail FP16-INT8 - CPU [ms] | 3.94 | 3.71
onednn: Deconvolution Batch shapes_3d - CPU [ms] | 0.718484 | 0.677050
onednn: Convolution Batch Shapes Auto - CPU [ms] | 0.341475 | 0.321833
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048 [tokens/s] | 144.53 | 152.92
onednn: IP Shapes 1D - CPU [ms] | 0.535874 | 0.507355
pytorch: CPU - 512 - ResNet-152 [batches/s] | 20.60 | 21.71
onednn: Recurrent Neural Network Inference - CPU [ms] | 276.379 | 262.517
openvino: Vehicle Detection FP16-INT8 - CPU [ms] | 5.77 | 5.49
openvino: Vehicle Detection FP16-INT8 - CPU [FPS] | 8270.99 | 8691.00
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 [tokens/s] | 96.99 | 101.77
pytorch: CPU - 256 - ResNet-152 [batches/s] | 20.78 | 21.79
onednn: Recurrent Neural Network Training - CPU [ms] | 425.718 | 406.453
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512 [tokens/s] | 72.99 | 76.41
openvino: Age Gender Recognition Retail 0013 FP16 - CPU [ms] | 0.45 | 0.43
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 [tokens/s] | 97.19 | 101.71
openvino: Road Segmentation ADAS FP16-INT8 - CPU [ms] | 18.99 | 18.16
onednn: IP Shapes 3D - CPU [ms] | 0.265564 | 0.254123
openvino: Road Segmentation ADAS FP16-INT8 - CPU [FPS] | 2517.89 | 2630.42
openvino: Age Gender Recognition Retail 0013 FP16 - CPU [FPS] | 140497.34 | 146329.80
pytorch: CPU - 512 - ResNet-50 [batches/s] | 51.48 | 53.34
openvino: Machine Translation EN To DE FP16 - CPU [FPS] | 852.83 | 881.22
openvino: Machine Translation EN To DE FP16 - CPU [ms] | 56.24 | 54.43
onnx: ResNet50 v1-12-int8 - CPU - Standard [inferences/s] | 276.076 | 285.104
whisper-cpp: ggml-small.en - 2016 State of the Union [s] | 221.62918 | 214.62792
whisperfile: Small [s] | 90.67175 | 88.03640
pytorch: CPU - 256 - ResNet-50 [batches/s] | 51.61 | 53.13
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 [tokens/s] | 92.82 | 95.42
openvino: Person Re-Identification Retail FP16 - CPU [ms] | 4.53 | 4.41
xnnpack: FP16MobileNetV1 [us] | 4634 | 4513
openvino: Person Re-Identification Retail FP16 - CPU [FPS] | 10525.86 | 10800.12
openvino: Handwritten English Recognition FP16-INT8 - CPU [FPS] | 3559.87 | 3649.63
openvino: Handwritten English Recognition FP16-INT8 - CPU [ms] | 26.94 | 26.28
litert: Mobilenet Float [us] | 4438.52 | 4335.67
xnnpack: QS8MobileNetV2 [us] | 10042 | 9813
openvino: Noise Suppression Poconet-Like FP16 - CPU [ms] | 13.68 | 13.38
openvino: Noise Suppression Poconet-Like FP16 - CPU [FPS] | 6747.48 | 6891.64
openvino: Person Detection FP16 - CPU [FPS] | 716.20 | 730.73
openvino: Person Detection FP16 - CPU [ms] | 66.89 | 65.56
tensorflow: CPU - 512 - ResNet-50 [images/s] | 231.18 | 235.35
whisperfile: Medium [s] | 200.48292 | 197.24095
litert: SqueezeNet [us] | 7035.74 | 6926.88
xnnpack: FP32MobileNetV2 [us] | 9203 | 9062
xnnpack: FP32MobileNetV1 [us] | 4539 | 4471
tensorflow: CPU - 256 - ResNet-50 [images/s] | 204.33 | 207.38
openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU [tokens/s] | 55.63 | 56.46
whisperfile: Tiny [s] | 31.88560 | 31.43573
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 [tokens/s] | 48.07 | 48.73
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128 [tokens/s] | 45.84 | 46.45
whisper-cpp: ggml-medium.en - 2016 State of the Union [s] | 454.28794 | 449.70262
xnnpack: FP32MobileNetV3Small [us] | 10488 | 10387
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 [tokens/s] | 149.00 | 150.29
onednn: Deconvolution Batch shapes_1d - CPU [ms] | 6.70897 | 6.65482
xnnpack: FP16MobileNetV2 [us] | 9092 | 9028
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048 [tokens/s] | 306.51 | 308.32
openvino-genai: Gemma-7b-int4-ov - CPU [tokens/s] | 37.84 | 38.00
numpy: Benchmark [score] | 885.50 | 887.75
openvino-genai: Falcon-7b-instruct-int4-ov - CPU [tokens/s] | 51.01 | 51.12
litert: Inception V4 [us] | 43898.9 | 43824.9
xnnpack: FP16MobileNetV3Small [us] | 10966 | 10323
onnx: ResNet101_DUC_HDC-12 - CPU - Standard [inference time, ms] | 165.094 | 158.528
onnx: ResNet101_DUC_HDC-12 - CPU - Standard [inferences/s] | 6.09788 | 6.35626
onnx: ResNet50 v1-12-int8 - CPU - Standard [inference time, ms] | 3.62341 | 3.50697
litert: NASNet Mobile [us] | 733737 | 689396
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512 [tokens/s] | 154.60 | 155.73
openvino-genai: Gemma-7b-int4-ov - CPU - Time Per Output Token [ms] | 26.43 | 26.31
openvino-genai: Gemma-7b-int4-ov - CPU - Time To First Token [ms] | 36.13 | 35.43
openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time Per Output Token [ms] | 19.60 | 19.56
openvino-genai: Falcon-7b-instruct-int4-ov - CPU - Time To First Token [ms] | 29.07 | 30.70
openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time Per Output Token [ms] | 17.98 | 17.71
openvino-genai: Phi-3-mini-128k-instruct-int4-ov - CPU - Time To First Token [ms] | 24.17 | 25.66
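The per-test deltas above can be condensed into a single geometric-mean figure. A minimal sketch using four result pairs taken from the table (lower-is-better latencies are inverted so that a ratio above 1.0 always favors the tuned configuration):

```python
import math

# (test, stock, tuned, higher_is_better) -- values copied from the summary table.
samples = [
    ("OpenVINO Weld Porosity FP16-INT8 (FPS)", 14035.76, 15437.79, True),
    ("Llama.cpp Mistral-7B PP512 (tokens/s)",  72.40,    77.05,    True),
    ("oneDNN RNN Inference (ms)",              276.379,  262.517,  False),
    ("PyTorch ResNet-50 bs512 (batches/s)",    51.48,    53.34,    True),
]

# Invert lower-is-better results so every ratio reads "tuned speedup over stock".
ratios = [
    (tuned / stock) if higher else (stock / tuned)
    for _, stock, tuned, higher in samples
]
geomean = math.exp(sum(map(math.log, ratios)) / len(ratios))
print(f"geomean speedup over {len(ratios)} tests: {geomean:.4f}")
```

This is only illustrative over four hand-picked rows, not the full 76-test set.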
OpenVINO 2024.5 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 6.09 (SE +/- 0.00, N = 3; MIN: 2.21 / MAX: 23.86)
    Stock: 6.72 (SE +/- 0.00, N = 3; MIN: 2.22 / MAX: 22.18)
    1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl -lstdc++fs

OpenVINO 2024.5 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, more is better)
    AI/ML Tuning Recommendations: 15437.79 (SE +/- 14.35, N = 3)
    Stock: 14035.76 (SE +/- 9.10, N = 3)

OpenVINO 2024.5 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 6.66 (SE +/- 0.01, N = 3; MIN: 3.65 / MAX: 22.09)
    Stock: 7.09 (SE +/- 0.01, N = 3; MIN: 4.15 / MAX: 20.16)

Llama.cpp b4154 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, more is better)
    AI/ML Tuning Recommendations: 77.05 (SE +/- 0.77, N = 5)
    Stock: 72.40 (SE +/- 0.83, N = 3)
    1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenVINO 2024.5 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, more is better)
    AI/ML Tuning Recommendations: 7144.84 (SE +/- 8.11, N = 3)
    Stock: 6720.98 (SE +/- 13.33, N = 3)

OpenVINO 2024.5 - Model: Face Detection Retail FP16-INT8 - Device: CPU (FPS, more is better)
    AI/ML Tuning Recommendations: 25050.47 (SE +/- 17.84, N = 3)
    Stock: 23587.02 (SE +/- 17.66, N = 3)

OpenVINO 2024.5 - Model: Face Detection Retail FP16-INT8 - Device: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 3.71 (SE +/- 0.00, N = 3; MIN: 1.71 / MAX: 19.33)
    Stock: 3.94 (SE +/- 0.01, N = 3; MIN: 1.76 / MAX: 17.65)
oneDNN 3.6 - Harness: Deconvolution Batch shapes_3d - Engine: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 0.677050 (SE +/- 0.000482, N = 9; MIN: 0.58)
    Stock: 0.718484 (SE +/- 0.001205, N = 9; MIN: 0.62)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN 3.6 - Harness: Convolution Batch Shapes Auto - Engine: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 0.321833 (SE +/- 0.001024, N = 7; MIN: 0.31)
    Stock: 0.341475 (SE +/- 0.000295, N = 7; MIN: 0.32)

Llama.cpp b4154 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second, more is better)
    AI/ML Tuning Recommendations: 152.92 (SE +/- 1.38, N = 3)
    Stock: 144.53 (SE +/- 0.97, N = 3)

oneDNN 3.6 - Harness: IP Shapes 1D - Engine: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 0.507355 (SE +/- 0.001097, N = 4; MIN: 0.46)
    Stock: 0.535874 (SE +/- 0.001151, N = 4; MIN: 0.49)

PyTorch 2.2.1 - Device: CPU - Batch Size: 512 - Model: ResNet-152 (batches/sec, more is better)
    AI/ML Tuning Recommendations: 21.71 (SE +/- 0.18, N = 3; MIN: 20.13 / MAX: 22.23)
    Stock: 20.60 (SE +/- 0.10, N = 3; MIN: 19.3 / MAX: 21.02)

oneDNN 3.6 - Harness: Recurrent Neural Network Inference - Engine: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 262.52 (SE +/- 0.27, N = 3; MIN: 257.81)
    Stock: 276.38 (SE +/- 0.68, N = 3; MIN: 269.7)

OpenVINO 2024.5 - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 5.49 (SE +/- 0.01, N = 3; MIN: 2.47 / MAX: 19.56)
    Stock: 5.77 (SE +/- 0.00, N = 3; MIN: 2.28 / MAX: 21.36)

OpenVINO 2024.5 - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, more is better)
    AI/ML Tuning Recommendations: 8691.00 (SE +/- 7.22, N = 3)
    Stock: 8270.99 (SE +/- 3.96, N = 3)
Llama.cpp b4154 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 (Tokens Per Second, more is better)
    AI/ML Tuning Recommendations: 101.77 (SE +/- 1.16, N = 3)
    Stock: 96.99 (SE +/- 0.34, N = 3)

PyTorch 2.2.1 - Device: CPU - Batch Size: 256 - Model: ResNet-152 (batches/sec, more is better)
    AI/ML Tuning Recommendations: 21.79 (SE +/- 0.27, N = 3; MIN: 20.38 / MAX: 22.54)
    Stock: 20.78 (SE +/- 0.04, N = 3; MIN: 19.72 / MAX: 21.04)

oneDNN 3.6 - Harness: Recurrent Neural Network Training - Engine: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 406.45 (SE +/- 0.37, N = 3; MIN: 400.13)
    Stock: 425.72 (SE +/- 0.52, N = 3; MIN: 419.47)

Llama.cpp b4154 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, more is better)
    AI/ML Tuning Recommendations: 76.41 (SE +/- 0.99, N = 3)
    Stock: 72.99 (SE +/- 0.98, N = 3)

OpenVINO 2024.5 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 0.43 (SE +/- 0.00, N = 3; MIN: 0.15 / MAX: 23.94)
    Stock: 0.45 (SE +/- 0.01, N = 3; MIN: 0.16 / MAX: 25.14)

Llama.cpp b4154 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 (Tokens Per Second, more is better)
    AI/ML Tuning Recommendations: 101.71 (SE +/- 0.67, N = 15)
    Stock: 97.19 (SE +/- 1.13, N = 4)

OpenVINO 2024.5 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 18.16 (SE +/- 0.02, N = 3; MIN: 7.68 / MAX: 40.38)
    Stock: 18.99 (SE +/- 0.06, N = 3; MIN: 9.19 / MAX: 39)

oneDNN 3.6 - Harness: IP Shapes 3D - Engine: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 0.254123 (SE +/- 0.000491, N = 5; MIN: 0.24)
    Stock: 0.265564 (SE +/- 0.000944, N = 5; MIN: 0.24)
OpenVINO 2024.5 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (FPS, more is better)
    AI/ML Tuning Recommendations: 2630.42 (SE +/- 2.42, N = 3)
    Stock: 2517.89 (SE +/- 7.45, N = 3)

OpenVINO 2024.5 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, more is better)
    AI/ML Tuning Recommendations: 146329.80 (SE +/- 528.15, N = 3)
    Stock: 140497.34 (SE +/- 313.81, N = 3)

PyTorch 2.2.1 - Device: CPU - Batch Size: 512 - Model: ResNet-50 (batches/sec, more is better)
    AI/ML Tuning Recommendations: 53.34 (SE +/- 0.26, N = 3; MIN: 49 / MAX: 54.59)
    Stock: 51.48 (SE +/- 0.15, N = 3; MIN: 46.04 / MAX: 52.41)

OpenVINO 2024.5 - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, more is better)
    AI/ML Tuning Recommendations: 881.22 (SE +/- 0.40, N = 3)
    Stock: 852.83 (SE +/- 0.22, N = 3)

OpenVINO 2024.5 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 54.43 (SE +/- 0.03, N = 3; MIN: 28.33 / MAX: 92.29)
    Stock: 56.24 (SE +/- 0.02, N = 3; MIN: 29.3 / MAX: 94.69)

ONNX Runtime 1.19 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inferences Per Second, more is better)
    AI/ML Tuning Recommendations: 285.10 (SE +/- 1.46, N = 3)
    Stock: 276.08 (SE +/- 2.53, N = 7)
    1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Whisper.cpp 1.6.2 - Model: ggml-small.en - Input: 2016 State of the Union (Seconds, fewer is better)
    AI/ML Tuning Recommendations: 214.63 (SE +/- 0.68, N = 3)
    Stock: 221.63 (SE +/- 2.33, N = 3)
    1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni

Whisperfile 20Aug24 - Model Size: Small (Seconds, fewer is better)
    AI/ML Tuning Recommendations: 88.04 (SE +/- 0.57, N = 3)
    Stock: 90.67 (SE +/- 0.71, N = 3)
PyTorch 2.2.1 - Device: CPU - Batch Size: 256 - Model: ResNet-50 (batches/sec, more is better)
    AI/ML Tuning Recommendations: 53.13 (SE +/- 0.28, N = 3; MIN: 46.57 / MAX: 54.35)
    Stock: 51.61 (SE +/- 0.15, N = 3; MIN: 45.56 / MAX: 52.57)

Llama.cpp b4154 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 (Tokens Per Second, more is better)
    AI/ML Tuning Recommendations: 95.42 (SE +/- 0.43, N = 6)
    Stock: 92.82 (SE +/- 0.49, N = 6)

OpenVINO 2024.5 - Model: Person Re-Identification Retail FP16 - Device: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 4.41 (SE +/- 0.00, N = 3; MIN: 2.45 / MAX: 17.46)
    Stock: 4.53 (SE +/- 0.01, N = 3; MIN: 1.95 / MAX: 23.94)

XNNPACK b7b048 - Model: FP16MobileNetV1 (us, fewer is better)
    AI/ML Tuning Recommendations: 4513 (SE +/- 10.73, N = 3)
    Stock: 4634 (SE +/- 15.14, N = 3)
    1. (CXX) g++ options: -O3 -lrt -lm

OpenVINO 2024.5 - Model: Person Re-Identification Retail FP16 - Device: CPU (FPS, more is better)
    AI/ML Tuning Recommendations: 10800.12 (SE +/- 6.08, N = 3)
    Stock: 10525.86 (SE +/- 12.77, N = 3)

OpenVINO 2024.5 - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (FPS, more is better)
    AI/ML Tuning Recommendations: 3649.63 (SE +/- 3.35, N = 3)
    Stock: 3559.87 (SE +/- 2.96, N = 3)

OpenVINO 2024.5 - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 26.28 (SE +/- 0.02, N = 3; MIN: 15.86 / MAX: 40.89)
    Stock: 26.94 (SE +/- 0.02, N = 3; MIN: 15.65 / MAX: 45.15)

LiteRT 2024-10-15 - Model: Mobilenet Float (Microseconds, fewer is better)
    AI/ML Tuning Recommendations: 4335.67 (SE +/- 7.66, N = 3)
    Stock: 4438.52 (SE +/- 11.59, N = 3)
XNNPACK b7b048 - Model: QS8MobileNetV2 (us, fewer is better)
    AI/ML Tuning Recommendations: 9813 (SE +/- 87.21, N = 3)
    Stock: 10042 (SE +/- 154.48, N = 3)

OpenVINO 2024.5 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 13.38 (SE +/- 0.02, N = 3; MIN: 6.98 / MAX: 34.62)
    Stock: 13.68 (SE +/- 0.02, N = 3; MIN: 7.09 / MAX: 36.01)

OpenVINO 2024.5 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (FPS, more is better)
    AI/ML Tuning Recommendations: 6891.64 (SE +/- 9.90, N = 3)
    Stock: 6747.48 (SE +/- 11.12, N = 3)

OpenVINO 2024.5 - Model: Person Detection FP16 - Device: CPU (FPS, more is better)
    AI/ML Tuning Recommendations: 730.73 (SE +/- 0.74, N = 3)
    Stock: 716.20 (SE +/- 0.66, N = 3)

OpenVINO 2024.5 - Model: Person Detection FP16 - Device: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 65.56 (SE +/- 0.07, N = 3; MIN: 32.6 / MAX: 131.97)
    Stock: 66.89 (SE +/- 0.06, N = 3; MIN: 34.58 / MAX: 130)

TensorFlow 2.16.1 - Device: CPU - Batch Size: 512 - Model: ResNet-50 (images/sec, more is better)
    AI/ML Tuning Recommendations: 235.35 (SE +/- 0.20, N = 3)
    Stock: 231.18 (SE +/- 0.28, N = 3)

Whisperfile 20Aug24 - Model Size: Medium (Seconds, fewer is better)
    AI/ML Tuning Recommendations: 197.24 (SE +/- 0.71, N = 3)
    Stock: 200.48 (SE +/- 0.88, N = 3)

LiteRT 2024-10-15 - Model: SqueezeNet (Microseconds, fewer is better)
    AI/ML Tuning Recommendations: 6926.88 (SE +/- 31.55, N = 3)
    Stock: 7035.74 (SE +/- 31.31, N = 3)
XNNPACK b7b048 - Model: FP32MobileNetV2 (us, fewer is better)
    AI/ML Tuning Recommendations: 9062 (SE +/- 25.31, N = 3)
    Stock: 9203 (SE +/- 32.13, N = 3)

XNNPACK b7b048 - Model: FP32MobileNetV1 (us, fewer is better)
    AI/ML Tuning Recommendations: 4471 (SE +/- 54.67, N = 3)
    Stock: 4539 (SE +/- 42.62, N = 3)

TensorFlow 2.16.1 - Device: CPU - Batch Size: 256 - Model: ResNet-50 (images/sec, more is better)
    AI/ML Tuning Recommendations: 207.38 (SE +/- 0.72, N = 3)
    Stock: 204.33 (SE +/- 0.57, N = 3)

OpenVINO GenAI 2024.5 - Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU (tokens/s, more is better)
    AI/ML Tuning Recommendations: 56.46 (SE +/- 0.14, N = 4)
    Stock: 55.63 (SE +/- 0.16, N = 4)

Whisperfile 20Aug24 - Model Size: Tiny (Seconds, fewer is better)
    AI/ML Tuning Recommendations: 31.44 (SE +/- 0.25, N = 3)
    Stock: 31.89 (SE +/- 0.26, N = 3)

Llama.cpp b4154 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 (Tokens Per Second, more is better)
    AI/ML Tuning Recommendations: 48.73 (SE +/- 0.05, N = 4)
    Stock: 48.07 (SE +/- 0.07, N = 4)

Llama.cpp b4154 - Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 (Tokens Per Second, more is better)
    AI/ML Tuning Recommendations: 46.45 (SE +/- 0.09, N = 4)
    Stock: 45.84 (SE +/- 0.05, N = 4)

Whisper.cpp 1.6.2 - Model: ggml-medium.en - Input: 2016 State of the Union (Seconds, fewer is better)
    AI/ML Tuning Recommendations: 449.70 (SE +/- 1.24, N = 3)
    Stock: 454.29 (SE +/- 1.45, N = 3)
XNNPACK b7b048 - Model: FP32MobileNetV3Small (us, fewer is better)
    AI/ML Tuning Recommendations: 10387 (SE +/- 28.22, N = 3)
    Stock: 10488 (SE +/- 11.35, N = 3)

Llama.cpp b4154 - Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second, more is better)
    AI/ML Tuning Recommendations: 150.29 (SE +/- 1.36, N = 15)
    Stock: 149.00 (SE +/- 2.06, N = 3)

oneDNN 3.6 - Harness: Deconvolution Batch shapes_1d - Engine: CPU (ms, fewer is better)
    AI/ML Tuning Recommendations: 6.65482 (SE +/- 0.01789, N = 3; MIN: 3.91)
    Stock: 6.70897 (SE +/- 0.03430, N = 3; MIN: 6.07)

XNNPACK b7b048 - Model: FP16MobileNetV2 (us, fewer is better)
    AI/ML Tuning Recommendations: 9028 (SE +/- 94.57, N = 3)
    Stock: 9092 (SE +/- 89.20, N = 3)

Llama.cpp b4154 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 (Tokens Per Second, more is better)
    AI/ML Tuning Recommendations: 308.32 (SE +/- 2.23, N = 15)
    Stock: 306.51 (SE +/- 2.62, N = 3)

OpenVINO GenAI 2024.5 - Model: Gemma-7b-int4-ov - Device: CPU (tokens/s, more is better)
    AI/ML Tuning Recommendations: 38.00 (SE +/- 0.09, N = 3)
    Stock: 37.84 (SE +/- 0.13, N = 3)

Numpy Benchmark (Score, more is better)
    AI/ML Tuning Recommendations: 887.75 (SE +/- 0.72, N = 3)
    Stock: 885.50 (SE +/- 1.94, N = 3)

OpenVINO GenAI 2024.5 - Model: Falcon-7b-instruct-int4-ov - Device: CPU (tokens/s, more is better)
    AI/ML Tuning Recommendations: 51.12 (SE +/- 0.19, N = 3)
    Stock: 51.01 (SE +/- 0.09, N = 3)

LiteRT 2024-10-15 - Model: Inception V4 (Microseconds, fewer is better)
    AI/ML Tuning Recommendations: 43824.9 (SE +/- 159.42, N = 3)
    Stock: 43898.9 (SE +/- 47.06, N = 3)
XNNPACK System Power Consumption Monitor Min Avg Max Stock 99.6 377.9 403.8 AI/ML Tuning Recommendations 100.4 424.2 458.8 OpenBenchmarking.org Watts, Fewer Is Better XNNPACK b7b048 System Power Consumption Monitor 120 240 360 480 600
XNNPACK CPU Power Consumption Monitor Min Avg Max Stock 1.9 241.3 263.5 AI/ML Tuning Recommendations 0.0 279.8 305.1 OpenBenchmarking.org Watts, Fewer Is Better XNNPACK b7b048 CPU Power Consumption Monitor 80 160 240 320 400
XNNPACK b7b048 - Model: FP16MobileNetV3Small (us, fewer is better)
  AI/ML Tuning Recommendations: 10323 (SE +/- 21.33, N = 3)
  Stock: 10966 (SE +/- 410.82, N = 3)
  1. (CXX) g++ options: -O3 -lrt -lm
ONNX Runtime 1.19 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.9/429.9/477.0; AI/ML Tuning Recommendations 98.5/460.1/509.9
ONNX Runtime 1.19 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 3.0/248.9/279.9; AI/ML Tuning Recommendations 77.0/270.8/300.9
ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard (Inference Time Cost (ms), fewer is better)
  AI/ML Tuning Recommendations: 158.53 (SE +/- 3.81, N = 15)
  Stock: 165.09 (SE +/- 3.66, N = 15)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard (Inferences Per Second, more is better)
  AI/ML Tuning Recommendations: 6.35626 (SE +/- 0.14383, N = 15)
  Stock: 6.09788 (SE +/- 0.13191, N = 15)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.19 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.4/328.4/350.0; AI/ML Tuning Recommendations 99.3/375.5/395.9
ONNX Runtime 1.19 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 1.5/203.6/221.2; AI/ML Tuning Recommendations 82.2/237.3/256.8
ONNX Runtime 1.19 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inference Time Cost (ms), fewer is better)
  AI/ML Tuning Recommendations: 3.50697 (SE +/- 0.01794, N = 3)
  Stock: 3.62341 (SE +/- 0.03395, N = 7)
  1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Numpy Benchmark - System Power Consumption Monitor (Watts, min/avg/max): Stock 97.9/171.2/178.1; AI/ML Tuning Recommendations 98.9/176.8/188.2
Numpy Benchmark - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 3.0/78.8/85.7; AI/ML Tuning Recommendations 47.9/80.7/85.3
oneDNN 3.6 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.9/329.4/466.9; AI/ML Tuning Recommendations 98.9/361.1/498.3
oneDNN 3.6 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 0.1/197.3/289.9; AI/ML Tuning Recommendations 127.7/222.2/267.0
oneDNN 3.6 - System Power Consumption Monitor (Watts, min/avg/max): Stock 97.3/337.3/448.7; AI/ML Tuning Recommendations 97.8/365.3/470.9
oneDNN 3.6 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 1.1/203.6/264.7; AI/ML Tuning Recommendations 103.9/226.6/275.0
oneDNN 3.6 - System Power Consumption Monitor (Watts, min/avg/max): Stock 97.5/274.5/366.1; AI/ML Tuning Recommendations 97.2/303.0/419.6
oneDNN 3.6 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 0.3/155.2/226.2; AI/ML Tuning Recommendations 85.8/188.4/252.5
oneDNN 3.6 - System Power Consumption Monitor (Watts, min/avg/max): Stock 97.4/255.0/348.1; AI/ML Tuning Recommendations 97.5/268.5/401.0
oneDNN 3.6 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 3.8/146.3/204.5; AI/ML Tuning Recommendations 72.3/169.6/230.8
oneDNN 3.6 - System Power Consumption Monitor (Watts, min/avg/max): AI/ML Tuning Recommendations 97.1/186.6/377.1; Stock 97.3/220.2/359.1
oneDNN 3.6 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 3.3/107.2/148.5; AI/ML Tuning Recommendations 103.2/116.5/135.9
oneDNN 3.6 - System Power Consumption Monitor (Watts, min/avg/max): Stock 96.9/310.1/409.7; AI/ML Tuning Recommendations 98.1/331.7/435.9
oneDNN 3.6 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 2.2/174.0/251.2; AI/ML Tuning Recommendations 100.5/202.2/263.1
oneDNN 3.6 - System Power Consumption Monitor (Watts, min/avg/max): AI/ML Tuning Recommendations 98.8/244.9/408.8; Stock 97.9/288.6/451.9
oneDNN 3.6 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 1.4/147.2/241.7; AI/ML Tuning Recommendations 79.0/177.2/273.9
PyTorch 2.2.1 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.2/427.3/460.6; AI/ML Tuning Recommendations 98.8/480.2/527.5
PyTorch 2.2.1 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 52.1/269.5/294.9; AI/ML Tuning Recommendations 58.3/313.0/344.4
PyTorch 2.2.1 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.8/395.8/456.8; AI/ML Tuning Recommendations 99.0/441.7/516.7
PyTorch 2.2.1 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 1.9/237.5/291.1; AI/ML Tuning Recommendations 53.6/285.5/336.3
PyTorch 2.2.1 - System Power Consumption Monitor (Watts, min/avg/max): Stock 96.8/426.1/462.5; AI/ML Tuning Recommendations 95.7/486.4/526.2
PyTorch 2.2.1 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 0.7/264.7/294.6; AI/ML Tuning Recommendations 52.9/311.9/342.9
PyTorch 2.2.1 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.0/393.2/455.2; AI/ML Tuning Recommendations 98.4/440.0/515.9
PyTorch 2.2.1 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 2.9/234.6/290.6; AI/ML Tuning Recommendations 55.1/281.1/335.8
LiteRT 2024-10-15 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.4/400.2/424.0; AI/ML Tuning Recommendations 98.1/454.4/495.8
LiteRT 2024-10-15 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 1.4/243.5/272.8; AI/ML Tuning Recommendations 101.0/303.8/326.5
LiteRT 2024-10-15 - System Power Consumption Monitor (Watts, min/avg/max): Stock 99.2/373.3/408.2; AI/ML Tuning Recommendations 99.6/442.1/460.5
LiteRT 2024-10-15 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 3.6/237.5/264.4; AI/ML Tuning Recommendations 87.0/279.8/303.9
LiteRT 2024-10-15 - System Power Consumption Monitor (Watts, min/avg/max): Stock 97.1/339.2/364.8; AI/ML Tuning Recommendations 97.7/390.9/414.6
LiteRT 2024-10-15 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 3.7/221.3/238.5; AI/ML Tuning Recommendations 99.2/261.1/279.6
LiteRT 2024-10-15 - Model: NASNet Mobile (Microseconds, fewer is better)
  AI/ML Tuning Recommendations: 689396 (SE +/- 22050.37, N = 12)
  Stock: 733737 (SE +/- 17324.48, N = 15)
LiteRT 2024-10-15 - System Power Consumption Monitor (Watts, min/avg/max): Stock 101.7/395.1/407.3; AI/ML Tuning Recommendations 100.3/420.8/461.0
LiteRT 2024-10-15 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 1.1/236.1/263.3; AI/ML Tuning Recommendations 124.4/282.6/304.2
TensorFlow 2.16.1 - System Power Consumption Monitor (Watts, min/avg/max): Stock 100.5/472.2/518.6; AI/ML Tuning Recommendations 101.6/487.8/535.3
TensorFlow 2.16.1 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 1.3/254.8/268.0; AI/ML Tuning Recommendations 53.7/264.7/276.0
TensorFlow 2.16.1 - System Power Consumption Monitor (Watts, min/avg/max): Stock 99.1/442.9/475.7; AI/ML Tuning Recommendations 98.6/460.6/494.4
TensorFlow 2.16.1 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 4.1/240.6/255.8; AI/ML Tuning Recommendations 53.0/253.1/264.6
Whisper.cpp 1.6.2 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.2/389.5/469.4; AI/ML Tuning Recommendations 98.7/440.8/504.3
Whisper.cpp 1.6.2 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 3.4/244.1/261.6; AI/ML Tuning Recommendations 46.7/283.0/299.0
Whisper.cpp 1.6.2 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.5/357.9/406.6; AI/ML Tuning Recommendations 98.5/391.4/464.3
Whisper.cpp 1.6.2 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 0.8/220.1/235.8; AI/ML Tuning Recommendations 57.9/253.6/273.4
Whisperfile 20Aug24 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.2/340.8/387.8; AI/ML Tuning Recommendations 98.4/361.7/409.6
Whisperfile 20Aug24 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 3.1/188.6/206.7; AI/ML Tuning Recommendations 52.4/204.4/221.7
Whisperfile 20Aug24 - System Power Consumption Monitor (Watts, min/avg/max): Stock 99.1/327.7/371.6; AI/ML Tuning Recommendations 98.0/351.0/405.4
Whisperfile 20Aug24 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 5.5/180.6/208.1; AI/ML Tuning Recommendations 46.8/200.1/231.1
Whisperfile 20Aug24 - System Power Consumption Monitor (Watts, min/avg/max): Stock 99.3/256.2/339.0; AI/ML Tuning Recommendations 98.6/285.0/376.0
Whisperfile 20Aug24 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 2.7/137.0/193.3; AI/ML Tuning Recommendations 0.4/155.0/217.0
Llama.cpp b4154 - System Power Consumption Monitor (Watts, min/avg/max): Stock 99.2/415.9/506.6; AI/ML Tuning Recommendations 98.9/463.4/547.0
Llama.cpp b4154 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 2.7/248.2/291.5; AI/ML Tuning Recommendations 86.4/288.4/323.2
Llama.cpp b4154 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.9/375.5/446.8; AI/ML Tuning Recommendations 98.9/444.6/524.3
Llama.cpp b4154 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 3.1/232.5/266.0; AI/ML Tuning Recommendations 92.1/285.4/337.6
Llama.cpp b4154 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.9/372.0/414.5; AI/ML Tuning Recommendations 100.3/442.0/512.9
Llama.cpp b4154 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 9.9/224.4/269.8; AI/ML Tuning Recommendations 94.7/289.4/335.4
Llama.cpp b4154 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98/428/552; AI/ML Tuning Recommendations 99/462/644
Llama.cpp b4154 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 6.0/238.9/331.9; AI/ML Tuning Recommendations 56.8/293.0/399.7
Llama.cpp b4154 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.1/411.1/499.8; AI/ML Tuning Recommendations 98.2/467.8/533.6
Llama.cpp b4154 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 12.7/254.5/302.2; AI/ML Tuning Recommendations 95.6/307.1/351.1
Llama.cpp b4154 - System Power Consumption Monitor (Watts, min/avg/max): Stock 97.9/371.5/450.0; AI/ML Tuning Recommendations 97.3/433.6/515.3
Llama.cpp b4154 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 14.5/244.4/300.1; AI/ML Tuning Recommendations 97.3/292.2/349.7
Llama.cpp b4154 - Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 (Tokens Per Second, more is better)
  AI/ML Tuning Recommendations: 155.73 (SE +/- 3.23, N = 12)
  Stock: 154.60 (SE +/- 2.60, N = 12)
  1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas
Llama.cpp b4154 - System Power Consumption Monitor (Watts, min/avg/max): Stock 96.7/256.1/410.9; AI/ML Tuning Recommendations 99.9/323.5/464.8
Llama.cpp b4154 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 11.3/165.3/269.9; AI/ML Tuning Recommendations 86.4/203.0/305.1
Llama.cpp b4154 - System Power Consumption Monitor (Watts, min/avg/max): Stock 97.9/410.1/502.2; AI/ML Tuning Recommendations 100.3/454.2/543.8
Llama.cpp b4154 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 11.3/247.5/289.5; AI/ML Tuning Recommendations 114.5/287.8/315.6
Llama.cpp b4154 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.5/382.8/438.4; AI/ML Tuning Recommendations 98.4/450.9/513.6
Llama.cpp b4154 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 17.5/231.5/260.6; AI/ML Tuning Recommendations 123.4/289.4/326.4
Llama.cpp b4154 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.4/365.8/443.9; AI/ML Tuning Recommendations 100.4/456.3/525.6
Llama.cpp b4154 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 7.3/221.9/263.5; AI/ML Tuning Recommendations 126.7/294.6/332.7
Llama.cpp b4154 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98/390/558; AI/ML Tuning Recommendations 99/536/650
Llama.cpp b4154 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 0.6/236.1/334.7; AI/ML Tuning Recommendations 57.2/321.1/401.3
OpenVINO GenAI 2024.5 - System Power Consumption Monitor (Watts, min/avg/max): Stock 99.9/420.1/486.7; AI/ML Tuning Recommendations 97.9/483.3/551.8
OpenVINO GenAI 2024.5 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 14.0/235.6/293.0; AI/ML Tuning Recommendations 46.1/291.1/341.4
OpenVINO GenAI 2024.5 - Model: Gemma-7b-int4-ov - Device: CPU - Time Per Output Token (ms, fewer is better)
  AI/ML Tuning Recommendations: 26.31 (SE +/- 0.06, N = 3)
  Stock: 26.43 (SE +/- 0.09, N = 3)
OpenVINO GenAI 2024.5 - Model: Gemma-7b-int4-ov - Device: CPU - Time To First Token (ms, fewer is better)
  AI/ML Tuning Recommendations: 35.43 (SE +/- 0.08, N = 3)
  Stock: 36.13 (SE +/- 0.20, N = 3)
OpenVINO GenAI 2024.5 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.2/426.7/509.8; AI/ML Tuning Recommendations 98.1/490.1/585.6
OpenVINO GenAI 2024.5 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 6.1/239.0/309.8; AI/ML Tuning Recommendations 52.7/293.6/364.5
OpenVINO GenAI 2024.5 - Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time Per Output Token (ms, fewer is better)
  AI/ML Tuning Recommendations: 19.56 (SE +/- 0.07, N = 3)
  Stock: 19.60 (SE +/- 0.04, N = 3)
OpenVINO GenAI 2024.5 - Model: Falcon-7b-instruct-int4-ov - Device: CPU - Time To First Token (ms, fewer is better)
  Stock: 29.07 (SE +/- 0.03, N = 3)
  AI/ML Tuning Recommendations: 30.70 (SE +/- 0.18, N = 3)
OpenVINO GenAI 2024.5 - System Power Consumption Monitor (Watts, min/avg/max): Stock 99.1/331.8/453.9; AI/ML Tuning Recommendations 98.1/380.9/543.2
OpenVINO GenAI 2024.5 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 0.8/190.8/280.5; AI/ML Tuning Recommendations 69.4/248.5/348.6
OpenVINO GenAI 2024.5 - Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time Per Output Token (ms, fewer is better)
  AI/ML Tuning Recommendations: 17.71 (SE +/- 0.04, N = 4)
  Stock: 17.98 (SE +/- 0.05, N = 4)
OpenVINO GenAI 2024.5 - Model: Phi-3-mini-128k-instruct-int4-ov - Device: CPU - Time To First Token (ms, fewer is better)
  Stock: 24.17 (SE +/- 0.14, N = 4)
  AI/ML Tuning Recommendations: 25.66 (SE +/- 0.07, N = 4)
OpenVINO 2024.5 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.8/523.1/566.9; AI/ML Tuning Recommendations 99.0/545.2/588.7
OpenVINO 2024.5 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 26.6/283.4/308.8; AI/ML Tuning Recommendations 106.9/299.0/322.1
OpenVINO 2024.5 - System Power Consumption Monitor (Watts, min/avg/max): Stock 100.6/506.3/535.9; AI/ML Tuning Recommendations 100.0/523.2/551.8
OpenVINO 2024.5 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 0.2/301.5/333.3; AI/ML Tuning Recommendations 81.7/320.5/341.9
OpenVINO 2024.5 - System Power Consumption Monitor (Watts, min/avg/max): Stock 98.2/485.5/529.0; AI/ML Tuning Recommendations 101.1/512.1/561.7
OpenVINO 2024.5 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 5.3/278.3/306.1; AI/ML Tuning Recommendations 81.9/300.1/323.0
OpenVINO 2024.5 - System Power Consumption Monitor (Watts, min/avg/max): Stock 99.3/549.0/578.2; AI/ML Tuning Recommendations 100.2/560.5/599.6
OpenVINO 2024.5 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 6.9/317.3/351.6; AI/ML Tuning Recommendations 117.3/335.4/361.8
OpenVINO 2024.5 - System Power Consumption Monitor (Watts, min/avg/max): Stock 101.1/472.8/504.1; AI/ML Tuning Recommendations 101.2/511.1/536.6
OpenVINO 2024.5 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 13.5/282.0/312.1; AI/ML Tuning Recommendations 90.1/309.0/331.5
OpenVINO 2024.5 - System Power Consumption Monitor (Watts, min/avg/max): Stock 100/556/612; AI/ML Tuning Recommendations 99/584/636
OpenVINO 2024.5 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 16.6/313.1/351.4; AI/ML Tuning Recommendations 79.4/334.8/365.0
OpenVINO 2024.5 - System Power Consumption Monitor (Watts, min/avg/max): Stock 99.3/441.8/469.2; AI/ML Tuning Recommendations 98.7/461.9/496.8
OpenVINO 2024.5 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 32.7/255.4/281.9; AI/ML Tuning Recommendations 79.1/276.2/299.8
OpenVINO 2024.5 - System Power Consumption Monitor (Watts, min/avg/max): Stock 100.0/485.3/521.8; AI/ML Tuning Recommendations 99.2/521.4/553.7
OpenVINO 2024.5 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 49.3/290.5/319.5; AI/ML Tuning Recommendations 132.1/314.2/336.5
OpenVINO 2024.5 - System Power Consumption Monitor (Watts, min/avg/max): Stock 102.3/440.1/472.4; AI/ML Tuning Recommendations 105.1/501.8/525.4
OpenVINO 2024.5 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 83.1/271.0/295.4; AI/ML Tuning Recommendations 94.6/304.8/328.6
OpenVINO 2024.5 - System Power Consumption Monitor (Watts, min/avg/max): Stock 100/594/647; AI/ML Tuning Recommendations 100/615/672
OpenVINO 2024.5 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 62.9/324.7/356.2; AI/ML Tuning Recommendations 78.6/339.7/368.6
OpenVINO 2024.5 - System Power Consumption Monitor (Watts, min/avg/max): Stock 99.1/436.2/470.9; AI/ML Tuning Recommendations 100.1/457.0/491.2
OpenVINO 2024.5 - CPU Power Consumption Monitor (Watts, min/avg/max): Stock 1.5/262.5/291.6; AI/ML Tuning Recommendations 3.7/275.9/306.1
Phoronix Test Suite v10.8.5