ml-benchmark2: AMD Ryzen 9 7950X3D 16-Core testing with an ASUS PRIME X670E-PRO WIFI (1813 BIOS) and an MSI NVIDIA GeForce RTX 3060 12GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2312269-NE-MLBENCHMA88
System details for run ml-benchmark2-12-25-23:
- Processor: AMD Ryzen 9 7950X3D 16-Core @ 4.20GHz (16 Cores / 32 Threads)
- Motherboard: ASUS PRIME X670E-PRO WIFI (1813 BIOS)
- Chipset: AMD Device 14d8
- Memory: 128GB
- Disk: 4001GB CT4000P3PSSD8 + 1024GB SPCC M.2 PCIe SSD
- Graphics: MSI NVIDIA GeForce RTX 3060 12GB
- Audio: NVIDIA Device 228e
- Monitor: LC27T55
- Network: Realtek RTL8125 2.5GbE + MEDIATEK Device 0608
- OS: Ubuntu 22.04
- Kernel: 6.2.0-39-generic (x86_64)
- Desktop: GNOME Shell 42.9
- Display Server: X Server 1.21.1.4
- Display Driver: NVIDIA 535.129.03
- OpenGL: 4.6.0
- OpenCL: OpenCL 3.0 CUDA 12.2.147
- Vulkan: 1.3.242
- Compiler: GCC 11.4.0 + CUDA 12.3
- File-System: ext4
- Screen Resolution: 1920x1080

OpenBenchmarking.org notes:
- Transparent Huge Pages: madvise
- Compiler configuration: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0xa601206
- GLAMOR
- BAR1 / Visible vRAM Size: 16384 MiB; vBIOS Version: 94.06.2f.00.98; GPU Compute Cores: 3584
- Python 3.10.12
- Security mitigations: gather_data_sampling: Not affected; itlb_multihit: Not affected; l1tf: Not affected; mds: Not affected; meltdown: Not affected; mmio_stale_data: Not affected; retbleed: Not affected; spec_rstack_overflow: Mitigation of safe RET; spec_store_bypass: Mitigation of SSB disabled via prctl; spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization; spectre_v2: Mitigation of Retpolines, IBPB: conditional, IBRS_FW, STIBP: always-on, RSB filling, PBRSB-eIBRS: Not affected; srbds: Not affected; tsx_async_abort: Not affected
Results summary for ml-benchmark2-12-25-23 (detailed results with error bars follow below; tests listed with two values report throughput followed by latency):
lczero: BLAS = 175
onednn: IP Shapes 1D - f32 - CPU = 2.13148
onednn: IP Shapes 3D - f32 - CPU = 3.81408
onednn: IP Shapes 1D - u8s8f32 - CPU = 0.542833
onednn: IP Shapes 3D - u8s8f32 - CPU = 0.325432
onednn: IP Shapes 1D - bf16bf16bf16 - CPU = 0.772163
onednn: IP Shapes 3D - bf16bf16bf16 - CPU = 1.49285
onednn: Convolution Batch Shapes Auto - f32 - CPU = 5.27943
onednn: Deconvolution Batch shapes_1d - f32 - CPU = 3.43419
onednn: Deconvolution Batch shapes_3d - f32 - CPU = 2.88833
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU = 5.08690
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU = 0.505580
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU = 0.731894
onednn: Recurrent Neural Network Training - f32 - CPU = 1358.07
onednn: Recurrent Neural Network Inference - f32 - CPU = 707.837
onednn: Recurrent Neural Network Training - u8s8f32 - CPU = 1367.75
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU = 1.49610
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU = 2.60894
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU = 1.61432
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU = 704.857
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU = 1343.53
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU = 702.315
numpy: 721.25
deepspeech: CPU = 52.49225
rbenchmark: 0.1133
rnnoise: 14.603
tensorflow-lite: SqueezeNet = 1798.07
tensorflow-lite: Inception V4 = 22388.9
tensorflow-lite: NASNet Mobile = 10670.2
tensorflow-lite: Mobilenet Float = 1293.76
tensorflow-lite: Mobilenet Quant = 1985.59
tensorflow-lite: Inception ResNet V2 = 22832.8
pytorch: CPU - 1 - ResNet-50 = 67.87
pytorch: CPU - 1 - ResNet-152 = 26.53
pytorch: CPU - 16 - ResNet-50 = 45.48
pytorch: CPU - 32 - ResNet-50 = 45.71
pytorch: CPU - 64 - ResNet-50 = 45.17
pytorch: CPU - 16 - ResNet-152 = 17.88
pytorch: CPU - 256 - ResNet-50 = 45.77
pytorch: CPU - 32 - ResNet-152 = 17.78
pytorch: CPU - 512 - ResNet-50 = 44.93
pytorch: CPU - 64 - ResNet-152 = 17.98
pytorch: CPU - 256 - ResNet-152 = 17.71
pytorch: CPU - 512 - ResNet-152 = 17.89
pytorch: CPU - 1 - Efficientnet_v2_l = 13.70
pytorch: CPU - 16 - Efficientnet_v2_l = 10.29
pytorch: CPU - 32 - Efficientnet_v2_l = 10.45
pytorch: CPU - 64 - Efficientnet_v2_l = 10.43
pytorch: CPU - 256 - Efficientnet_v2_l = 10.47
pytorch: CPU - 512 - Efficientnet_v2_l = 10.42
tensorflow: CPU - 16 - VGG-16 = 16.48
tensorflow: CPU - 32 - VGG-16 = 17.71
tensorflow: CPU - 64 - VGG-16 = 18.48
tensorflow: CPU - 16 - AlexNet = 140.39
tensorflow: CPU - 256 - VGG-16 = 19.00
tensorflow: CPU - 32 - AlexNet = 215.96
tensorflow: CPU - 512 - VGG-16 = 19.13
tensorflow: CPU - 64 - AlexNet = 297.39
tensorflow: CPU - 256 - AlexNet = 386.97
tensorflow: CPU - 512 - AlexNet = 400.77
tensorflow: CPU - 16 - GoogLeNet = 127.79
tensorflow: CPU - 16 - ResNet-50 = 35.90
tensorflow: CPU - 32 - GoogLeNet = 127.29
tensorflow: CPU - 32 - ResNet-50 = 36.10
tensorflow: CPU - 64 - GoogLeNet = 120.40
tensorflow: CPU - 64 - ResNet-50 = 35.45
tensorflow: CPU - 256 - GoogLeNet = 112.12
tensorflow: CPU - 256 - ResNet-50 = 34.24
tensorflow: CPU - 512 - GoogLeNet = 110.88
tensorflow: CPU - 512 - ResNet-50 = 34.09
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream = 22.0683; 361.4936
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream = 17.5530; 56.9632
deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream = 934.0846; 8.5518
deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream = 270.6650; 3.6930
deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream = 270.1520; 29.5978
deepsparse: ResNet-50, Baseline - Synchronous Single-Stream = 192.0351; 5.2053
deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream = 2614.2430; 3.0495
deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream = 1172.1452; 0.8513
deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream = 120.3088; 66.4627
deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream = 94.1453; 10.6189
deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream = 27.6321; 289.5002
deepsparse: BERT-Large, NLP Question Answering - Synchronous Single-Stream = 17.5992; 56.8125
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream = 270.2553; 29.5869
deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream = 192.1292; 5.2028
deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream = 128.0944; 62.4189
deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream = 95.2152; 10.5009
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream = 187.2701; 42.7042
deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream = 109.7687; 9.1055
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream = 34.9310; 228.4604
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream = 26.4575; 37.7800
deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream = 428.2324; 18.6647
deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream = 100.1107; 9.9862
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream = 22.3806; 356.9725
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream = 17.6016; 56.8061
spacy: en_core_web_lg = 19452
spacy: en_core_web_trf = 2354
caffe: AlexNet - CPU - 100 = 24893
caffe: AlexNet - CPU - 200 = 49227
caffe: AlexNet - CPU - 1000 = 243445
caffe: GoogleNet - CPU - 100 = 63606
caffe: GoogleNet - CPU - 200 = 124421
caffe: GoogleNet - CPU - 1000 = 626915
mnn: nasnet = 11.960
mnn: mobilenetV3 = 1.818
mnn: squeezenetv1.1 = 2.770
mnn: resnet-v2-50 = 11.360
mnn: SqueezeNetV1.0 = 4.402
mnn: MobileNetV2_224 = 3.723
mnn: mobilenet-v1-1.0 = 2.640
mnn: inception-v3 = 22.381
ncnn: CPU - mobilenet = 10.64
ncnn: CPU-v2-v2 - mobilenet-v2 = 4.52
ncnn: CPU-v3-v3 - mobilenet-v3 = 4.25
ncnn: CPU - shufflenet-v2 = 4.42
ncnn: CPU - mnasnet = 3.86
ncnn: CPU - efficientnet-b0 = 5.30
ncnn: CPU - blazeface = 1.70
ncnn: CPU - googlenet = 10.38
ncnn: CPU - vgg16 = 35.80
ncnn: CPU - resnet18 = 6.70
ncnn: CPU - alexnet = 5.09
ncnn: CPU - resnet50 = 13.00
ncnn: CPU - yolov4-tiny = 17.12
ncnn: CPU - squeezenet_ssd = 8.75
ncnn: CPU - regnety_400m = 11.06
ncnn: CPU - vision_transformer = 42.19
ncnn: CPU - FastestDet = 5.08
ncnn: Vulkan GPU - mobilenet = 10.67
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 = 4.29
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 = 4.40
ncnn: Vulkan GPU - shufflenet-v2 = 4.36
ncnn: Vulkan GPU - mnasnet = 3.97
ncnn: Vulkan GPU - efficientnet-b0 = 5.56
ncnn: Vulkan GPU - blazeface = 1.60
ncnn: Vulkan GPU - googlenet = 10.69
ncnn: Vulkan GPU - vgg16 = 35.09
ncnn: Vulkan GPU - resnet18 = 6.64
ncnn: Vulkan GPU - alexnet = 5.20
ncnn: Vulkan GPU - resnet50 = 13.39
ncnn: Vulkan GPU - yolov4-tiny = 17.22
ncnn: Vulkan GPU - squeezenet_ssd = 8.79
ncnn: Vulkan GPU - regnety_400m = 11.29
ncnn: Vulkan GPU - vision_transformer = 42.15
ncnn: Vulkan GPU - FastestDet = 4.76
tnn: CPU - DenseNet = 2134.430
tnn: CPU - MobileNet v2 = 187.479
tnn: CPU - SqueezeNet v2 = 42.143
tnn: CPU - SqueezeNet v1.1 = 179.657
openvino: Face Detection FP16 - CPU = 13.70; 581.79
openvino: Person Detection FP16 - CPU = 87.80; 91.06
openvino: Person Detection FP32 - CPU = 87.33; 91.56
openvino: Vehicle Detection FP16 - CPU = 994.76; 8.04
openvino: Face Detection FP16-INT8 - CPU = 26.49; 301.34
openvino: Face Detection Retail FP16 - CPU = 3485.66; 2.29
openvino: Road Segmentation ADAS FP16 - CPU = 441.60; 18.10
openvino: Vehicle Detection FP16-INT8 - CPU = 1667.89; 4.79
openvino: Weld Porosity Detection FP16 - CPU = 1386.49; 11.53
openvino: Face Detection Retail FP16-INT8 - CPU = 4876.38; 3.28
openvino: Road Segmentation ADAS FP16-INT8 - CPU = 532.88; 15.00
openvino: Machine Translation EN To DE FP16 - CPU = 121.71; 65.69
openvino: Weld Porosity Detection FP16-INT8 - CPU = 2688.84; 5.95
openvino: Person Vehicle Bike Detection FP16 - CPU = 1579.58; 5.06
openvino: Handwritten English Recognition FP16 - CPU = 736.26; 21.72
openvino: Age Gender Recognition Retail 0013 FP16 - CPU = 41501.98; 0.38
openvino: Handwritten English Recognition FP16-INT8 - CPU = 602.34; 26.55
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU = 58961.44; 0.26
numenta-nab: KNN CAD = 100.046
numenta-nab: Relative Entropy = 8.048
numenta-nab: Windowed Gaussian = 4.713
numenta-nab: Earthgecko Skyline = 51.837
numenta-nab: Bayesian Changepoint = 12.749
numenta-nab: Contextual Anomaly Detector OSE = 25.412
ai-benchmark: Device Inference Score = 2932
ai-benchmark: Device Training Score = 3592
ai-benchmark: Device AI Score = 6524
mlpack: scikit_ica = 29.93
mlpack: scikit_qda = 31.30
mlpack: scikit_svm = 14.12
mlpack: scikit_linearridgeregression = 1.08
scikit-learn: Tree = 37.432
scikit-learn: Lasso = 237.287
scikit-learn: Sparsify = 71.653
scikit-learn: Plot Ward = 40.579
scikit-learn: Plot Neighbors = 123.612
scikit-learn: Text Vectorizers = 41.823
scikit-learn: Plot Hierarchical = 138.998
scikit-learn: Feature Expansions = 90.526
scikit-learn: Isotonic / Logistic = 1123.364
scikit-learn: Plot Incremental PCA = 49.606
scikit-learn: Isotonic / Pathological = 3114.638
scikit-learn: Covertype Dataset Benchmark = 270.959
scikit-learn: Isotonic / Perturbed Logarithm = 1384.028
scikit-learn: 20 Newsgroups / Logistic Regression = 28.219
scikit-learn: Sparse Rand Projections / 100 Iterations = 401.306
whisper-cpp: ggml-base.en - 2016 State of the Union = 96.97771
whisper-cpp: ggml-small.en - 2016 State of the Union = 289.73641
whisper-cpp: ggml-medium.en - 2016 State of the Union = 807.17623
opencv: DNN - Deep Neural Network = 31437
LeelaChessZero 0.30 - Backend: BLAS
Nodes Per Second (more is better): 175 (SE +/- 2.31, N = 3)
Compiler flags: (CXX) g++ -flto -pthread
oneDNN 3.3 (ms, fewer is better):
- Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU: 2.13148 (SE +/- 0.02341, N = 3; MIN: 1.7)
- Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU: 3.81408 (SE +/- 0.05089, N = 15; MIN: 3.26)
- Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU: 0.542833 (SE +/- 0.014251, N = 15; MIN: 0.37)
- Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU: 0.325432 (SE +/- 0.008133, N = 12; MIN: 0.23)
- Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU: 0.772163 (SE +/- 0.008040, N = 3; MIN: 0.64)
- Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU: 1.49285 (SE +/- 0.01642, N = 3; MIN: 1.23)
- Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU: 5.27943 (SE +/- 0.04391, N = 3; MIN: 4.76)
- Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU: 3.43419 (SE +/- 0.03067, N = 15; MIN: 2.56)
- Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU: 2.88833 (SE +/- 0.03209, N = 15; MIN: 2.52)
- Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU: 5.08690 (SE +/- 0.06167, N = 4; MIN: 4.34)
- Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU: 0.505580 (SE +/- 0.007201, N = 3; MIN: 0.43)
- Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU: 0.731894 (SE +/- 0.007344, N = 15; MIN: 0.62)
- Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU: 1358.07 (SE +/- 16.16, N = 4; MIN: 1187.15)
- Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU: 707.84 (SE +/- 10.07, N = 3; MIN: 607.92)
- Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU: 1367.75 (SE +/- 3.41, N = 3; MIN: 1186.82)
- Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU: 1.49610 (SE +/- 0.01461, N = 15; MIN: 1.2)
- Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU: 2.60894 (SE +/- 0.02456, N = 3; MIN: 2.23)
- Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU: 1.61432 (SE +/- 0.01893, N = 3; MIN: 1.43)
- Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU: 704.86 (SE +/- 4.11, N = 3; MIN: 611.57)
- Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU: 1343.53 (SE +/- 12.14, N = 3; MIN: 1191.31)
- Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU: 702.32 (SE +/- 6.22, N = 3; MIN: 612.77)
Compiler flags: (CXX) g++ -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Numpy Benchmark
Score (more is better): 721.25 (SE +/- 3.15, N = 3)
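The Numpy Benchmark score aggregates timings across many NumPy kernels. For a feel of what is being timed, here is a minimal sketch in the same spirit; the matmul kernel and sizes are illustrative assumptions, not the benchmark's actual workload.

```python
import time
import numpy as np

# Illustrative kernel only: the real Numpy Benchmark times many
# operations and folds them into a composite score.
a = np.random.rand(2048, 2048)
b = np.random.rand(2048, 2048)

start = time.perf_counter()
for _ in range(10):
    a @ b
elapsed = time.perf_counter() - start
print(f"matmul: {elapsed / 10 * 1000:.1f} ms per iteration")
```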
DeepSpeech 0.6 - Acceleration: CPU
Seconds (fewer is better): 52.49 (SE +/- 0.73, N = 3)

R Benchmark
Seconds (fewer is better): 0.1133 (SE +/- 0.0003, N = 3)
Note: R scripting front-end version 4.1.2 (2021-11-01)

RNNoise 2020-06-28
Seconds (fewer is better): 14.60 (SE +/- 0.04, N = 3)
Compiler flags: (CC) gcc -O2 -pedantic -fvisibility=hidden
TensorFlow Lite 2022-05-18 (Microseconds, fewer is better; N = 3 throughout):
- Model: SqueezeNet: 1798.07 (SE +/- 7.51)
- Model: Inception V4: 22388.9 (SE +/- 52.61)
- Model: NASNet Mobile: 10670.2 (SE +/- 51.78)
- Model: Mobilenet Float: 1293.76 (SE +/- 2.11)
- Model: Mobilenet Quant: 1985.59 (SE +/- 11.21)
- Model: Inception ResNet V2: 22832.8 (SE +/- 111.48)
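These TensorFlow Lite results are average per-inference times in microseconds. A minimal sketch of the same kind of measurement with the TF Lite Python interpreter follows; the model path is a hypothetical placeholder, and the PTS test itself drives TFLite's native benchmark harness rather than a Python loop.

```python
import time
import numpy as np
import tensorflow as tf

# Placeholder path; the PTS test supplies its own .tflite models.
interpreter = tf.lite.Interpreter(model_path="squeezenet.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]

# Random float input; quantized models would need the matching dtype.
data = np.random.rand(*inp["shape"]).astype(np.float32)
interpreter.set_tensor(inp["index"], data)

start = time.perf_counter()
for _ in range(100):
    interpreter.invoke()
print(f"{(time.perf_counter() - start) / 100 * 1e6:.0f} us per inference")
```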
PyTorch 2.1, Device: CPU (batches/sec, more is better; N = 3 unless noted):
- Batch Size: 1 - Model: ResNet-50: 67.87 (SE +/- 0.22; MIN: 53.01 / MAX: 70.77)
- Batch Size: 1 - Model: ResNet-152: 26.53 (SE +/- 0.25; MIN: 20.93 / MAX: 27.56)
- Batch Size: 16 - Model: ResNet-50: 45.48 (SE +/- 0.34; MIN: 33.75 / MAX: 47.46)
- Batch Size: 32 - Model: ResNet-50: 45.71 (SE +/- 0.19; MIN: 38.99 / MAX: 47.61)
- Batch Size: 64 - Model: ResNet-50: 45.17 (SE +/- 0.42; MIN: 35.7 / MAX: 47.3)
- Batch Size: 16 - Model: ResNet-152: 17.88 (SE +/- 0.24; MIN: 12.93 / MAX: 18.62)
- Batch Size: 256 - Model: ResNet-50: 45.77 (SE +/- 0.52; MIN: 35.6 / MAX: 47.88)
- Batch Size: 32 - Model: ResNet-152: 17.78 (SE +/- 0.08; MIN: 13.52 / MAX: 18.49)
- Batch Size: 512 - Model: ResNet-50: 44.93 (SE +/- 0.01; MIN: 34.06 / MAX: 46.17)
- Batch Size: 64 - Model: ResNet-152: 17.98 (SE +/- 0.07; MIN: 13.29 / MAX: 18.54)
- Batch Size: 256 - Model: ResNet-152: 17.71 (SE +/- 0.13; MIN: 14.07 / MAX: 18.34)
- Batch Size: 512 - Model: ResNet-152: 17.89 (SE +/- 0.01; MIN: 13.88 / MAX: 18.4)
- Batch Size: 1 - Model: Efficientnet_v2_l: 13.70 (SE +/- 0.08; MIN: 10.96 / MAX: 14.13)
- Batch Size: 16 - Model: Efficientnet_v2_l: 10.29 (SE +/- 0.07; MIN: 8.22 / MAX: 11.02)
- Batch Size: 32 - Model: Efficientnet_v2_l: 10.45 (SE +/- 0.11; MIN: 8.22 / MAX: 11.22)
- Batch Size: 64 - Model: Efficientnet_v2_l: 10.43 (SE +/- 0.14; MIN: 8.26 / MAX: 11.26)
- Batch Size: 256 - Model: Efficientnet_v2_l: 10.47 (SE +/- 0.11, N = 4; MIN: 8.19 / MAX: 11.25)
- Batch Size: 512 - Model: Efficientnet_v2_l: 10.42 (SE +/- 0.04; MIN: 8.29 / MAX: 11.15)
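The PyTorch figures are batches per second for each batch-size/model pair. A minimal sketch of that style of measurement, assuming torchvision is installed and using ResNet-50 at batch size 16; this approximates rather than reproduces the pts/pytorch harness.

```python
import time
import torch
import torchvision.models as models

model = models.resnet50(weights=None).eval()  # random weights are fine for timing
batch = torch.randn(16, 3, 224, 224)          # batch size 16, ImageNet-shaped input

with torch.no_grad():
    for _ in range(5):                        # warm-up iterations
        model(batch)
    iters = 20
    start = time.perf_counter()
    for _ in range(iters):
        model(batch)
    elapsed = time.perf_counter() - start

print(f"{iters / elapsed:.2f} batches/sec")
```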
TensorFlow 2.12, Device: CPU (images/sec, more is better; N = 3 throughout):
- Batch Size: 16 - Model: VGG-16: 16.48 (SE +/- 0.00)
- Batch Size: 32 - Model: VGG-16: 17.71 (SE +/- 0.00)
- Batch Size: 64 - Model: VGG-16: 18.48 (SE +/- 0.01)
- Batch Size: 16 - Model: AlexNet: 140.39 (SE +/- 0.22)
- Batch Size: 256 - Model: VGG-16: 19.00 (SE +/- 0.01)
- Batch Size: 32 - Model: AlexNet: 215.96 (SE +/- 0.12)
- Batch Size: 512 - Model: VGG-16: 19.13 (SE +/- 0.01)
- Batch Size: 64 - Model: AlexNet: 297.39 (SE +/- 0.16)
- Batch Size: 256 - Model: AlexNet: 386.97 (SE +/- 0.26)
- Batch Size: 512 - Model: AlexNet: 400.77 (SE +/- 0.13)
- Batch Size: 16 - Model: GoogLeNet: 127.79 (SE +/- 0.16)
- Batch Size: 16 - Model: ResNet-50: 35.90 (SE +/- 0.04)
- Batch Size: 32 - Model: GoogLeNet: 127.29 (SE +/- 0.10)
- Batch Size: 32 - Model: ResNet-50: 36.10 (SE +/- 0.02)
- Batch Size: 64 - Model: GoogLeNet: 120.40 (SE +/- 0.09)
- Batch Size: 64 - Model: ResNet-50: 35.45 (SE +/- 0.01)
- Batch Size: 256 - Model: GoogLeNet: 112.12 (SE +/- 0.03)
- Batch Size: 256 - Model: ResNet-50: 34.24 (SE +/- 0.00)
- Batch Size: 512 - Model: GoogLeNet: 110.88 (SE +/- 0.03)
- Batch Size: 512 - Model: ResNet-50: 34.09 (SE +/- 0.00)
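TensorFlow results are images per second for each batch size and model. A rough sketch of an equivalent measurement via tf.keras is below; the actual test uses TensorFlow's official benchmark scripts, so treat this as an approximation only.

```python
import time
import numpy as np
import tensorflow as tf

# Random weights and random input are enough for a throughput estimate.
model = tf.keras.applications.ResNet50(weights=None)
batch = np.random.rand(16, 224, 224, 3).astype(np.float32)

model.predict(batch, verbose=0)               # warm-up / graph build
iters = 10
start = time.perf_counter()
for _ in range(iters):
    model.predict(batch, verbose=0)
elapsed = time.perf_counter() - start
print(f"{iters * batch.shape[0] / elapsed:.1f} images/sec")
```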
Neural Magic DeepSparse 1.6 (throughput in items/sec, more is better; latency in ms/batch, fewer is better; N = 3 throughout):
- NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream: 22.07 items/sec (SE +/- 0.03); 361.49 ms/batch (SE +/- 0.63)
- NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream: 17.55 items/sec (SE +/- 0.01); 56.96 ms/batch (SE +/- 0.03)
- NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream: 934.08 items/sec (SE +/- 0.06); 8.5518 ms/batch (SE +/- 0.0005)
- NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream: 270.67 items/sec (SE +/- 0.17); 3.6930 ms/batch (SE +/- 0.0023)
- ResNet-50, Baseline - Asynchronous Multi-Stream: 270.15 items/sec (SE +/- 0.07); 29.60 ms/batch (SE +/- 0.01)
- ResNet-50, Baseline - Synchronous Single-Stream: 192.04 items/sec (SE +/- 0.33); 5.2053 ms/batch (SE +/- 0.0090)
- ResNet-50, Sparse INT8 - Asynchronous Multi-Stream: 2614.24 items/sec (SE +/- 6.37); 3.0495 ms/batch (SE +/- 0.0077)
- ResNet-50, Sparse INT8 - Synchronous Single-Stream: 1172.15 items/sec (SE +/- 2.24); 0.8513 ms/batch (SE +/- 0.0017)
- CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream: 120.31 items/sec (SE +/- 0.30); 66.46 ms/batch (SE +/- 0.17)
- CV Detection, YOLOv5s COCO - Synchronous Single-Stream: 94.15 items/sec (SE +/- 0.13); 10.62 ms/batch (SE +/- 0.02)
- BERT-Large, NLP Question Answering - Asynchronous Multi-Stream: 27.63 items/sec (SE +/- 0.02); 289.50 ms/batch (SE +/- 0.25)
- BERT-Large, NLP Question Answering - Synchronous Single-Stream: 17.60 items/sec (SE +/- 0.01); 56.81 ms/batch (SE +/- 0.04)
- CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream: 270.26 items/sec (SE +/- 0.30); 29.59 ms/batch (SE +/- 0.03)
- CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream: 192.13 items/sec (SE +/- 0.36); 5.2028 ms/batch (SE +/- 0.0097)
- CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream: 128.09 items/sec (SE +/- 0.13); 62.42 ms/batch (SE +/- 0.07)
- CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream: 95.22 items/sec (SE +/- 0.07); 10.50 ms/batch (SE +/- 0.01)
- NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream: 187.27 items/sec (SE +/- 0.27); 42.70 ms/batch (SE +/- 0.06)
- NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream: 109.77 items/sec (SE +/- 0.13); 9.1055 ms/batch (SE +/- 0.0104)
- CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream: 34.93 items/sec (SE +/- 0.16); 228.46 ms/batch (SE +/- 1.03)
- CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream: 26.46 items/sec (SE +/- 0.02); 37.78 ms/batch (SE +/- 0.02)
- BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream: 428.23 items/sec (SE +/- 0.62); 18.66 ms/batch (SE +/- 0.03)
- BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream: 100.11 items/sec (SE +/- 0.08); 9.9862 ms/batch (SE +/- 0.0082)
- NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream: 22.38 items/sec (SE +/- 0.05); 356.97 ms/batch (SE +/- 0.71)
- NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream: 17.60 items/sec (SE +/- 0.03); 56.81 ms/batch (SE +/- 0.08)
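Each DeepSparse test reports a throughput/latency pair from its multi-stream (asynchronous) or single-stream (synchronous) scheduler. A minimal single-stream sketch with the DeepSparse engine API follows; the ONNX path is a hypothetical placeholder (the PTS test pulls its models from Neural Magic's SparseZoo).

```python
import time
import numpy as np
from deepsparse import compile_model

# Placeholder ONNX file standing in for a SparseZoo model.
engine = compile_model("resnet50.onnx", batch_size=1)
inputs = [np.random.rand(1, 3, 224, 224).astype(np.float32)]

engine(inputs)                        # warm-up run
iters = 100
start = time.perf_counter()
for _ in range(iters):
    engine(inputs)                    # equivalent to engine.run(inputs)
elapsed = time.perf_counter() - start
print(f"{iters / elapsed:.1f} items/sec, {elapsed / iters * 1000:.2f} ms/item")
```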
spaCy 3.4.1 (tokens/sec, more is better):
- Model: en_core_web_lg: 19452 (SE +/- 113.01, N = 3)
- Model: en_core_web_trf: 2354 (SE +/- 24.52, N = 3)
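spaCy throughput is tokens processed per second through the full pipeline. A minimal sketch, assuming the en_core_web_lg model is installed and substituting a toy corpus for the benchmark's real input text:

```python
import time
import spacy

nlp = spacy.load("en_core_web_lg")
texts = ["The quick brown fox jumps over the lazy dog."] * 1000

start = time.perf_counter()
tokens = sum(len(doc) for doc in nlp.pipe(texts))
elapsed = time.perf_counter() - start
print(f"{tokens / elapsed:.0f} tokens/sec")
```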
Caffe 2020-02-13, Acceleration: CPU (Milli-Seconds, fewer is better; N = 3 throughout):
- Model: AlexNet - Iterations: 100: 24893 (SE +/- 159.49)
- Model: AlexNet - Iterations: 200: 49227 (SE +/- 122.11)
- Model: AlexNet - Iterations: 1000: 243445 (SE +/- 481.92)
- Model: GoogleNet - Iterations: 100: 63606 (SE +/- 328.59)
- Model: GoogleNet - Iterations: 200: 124421 (SE +/- 239.03)
- Model: GoogleNet - Iterations: 1000: 626915 (SE +/- 1421.72)
Compiler flags: (CXX) g++ -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb
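Caffe results are total milliseconds for the stated number of iterations as reported by Caffe's built-in timing (the PTS test drives the native `caffe time` tool). For orientation only, a rough pycaffe sketch of forward-pass timing is below; the prototxt path is a hypothetical placeholder, and pycaffe must be built and importable.

```python
import time
import caffe

caffe.set_mode_cpu()
# Placeholder model definition; the PTS test supplies AlexNet/GoogleNet prototxts.
net = caffe.Net("alexnet_deploy.prototxt", caffe.TEST)

iters = 100
start = time.perf_counter()
for _ in range(iters):
    net.forward()
elapsed = time.perf_counter() - start
print(f"{elapsed * 1000:.0f} ms total for {iters} forward iterations")
```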
Mobile Neural Network 2.1 (ms, fewer is better; N = 3 throughout):
- Model: nasnet: 11.96 (SE +/- 0.11; MIN: 10.36 / MAX: 117.89)
- Model: mobilenetV3: 1.818 (SE +/- 0.014; MIN: 1.54 / MAX: 31.92)
- Model: squeezenetv1.1: 2.770 (SE +/- 0.098; MIN: 2.36 / MAX: 51.44)
- Model: resnet-v2-50: 11.36 (SE +/- 0.21; MIN: 10.13 / MAX: 74.03)
- Model: SqueezeNetV1.0: 4.402 (SE +/- 0.049; MIN: 3.93 / MAX: 80.86)
- Model: MobileNetV2_224: 3.723 (SE +/- 0.086; MIN: 3.23 / MAX: 82.07)
- Model: mobilenet-v1-1.0: 2.640 (SE +/- 0.053; MIN: 2.3 / MAX: 44.01)
- Model: inception-v3: 22.38 (SE +/- 0.32; MIN: 19.4 / MAX: 90.14)
Compiler flags: (CXX) g++ -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN 20230517, Target: CPU (ms, fewer is better) - ml-benchmark2-12-25-23

  mobilenet                  10.64  (SE +/- 0.22, N = 15; MIN 8.78 / MAX 504.79)
  mobilenet-v2 (CPU-v2-v2)    4.52  (SE +/- 0.15, N = 15; MIN 3.49 / MAX 453.27)
  mobilenet-v3 (CPU-v3-v3)    4.25  (SE +/- 0.14, N = 15; MIN 3.54 / MAX 358.68)
  shufflenet-v2               4.42  (SE +/- 0.15, N = 15; MIN 3.7 / MAX 373.85)
  mnasnet                     3.86  (SE +/- 0.13, N = 15; MIN 3.06 / MAX 334.53)
  efficientnet-b0             5.30  (SE +/- 0.13, N = 15; MIN 4.07 / MAX 351.52)
  blazeface                   1.70  (SE +/- 0.09, N = 15; MIN 1.16 / MAX 246.39)
  googlenet                  10.38  (SE +/- 0.15, N = 15; MIN 8.27 / MAX 421.12)
  vgg16                      35.80  (SE +/- 0.47, N = 15; MIN 28.59 / MAX 399.5)
  resnet18                    6.70  (SE +/- 0.17, N = 15; MIN 5.33 / MAX 359.17)
  alexnet                     5.09  (SE +/- 0.15, N = 15; MIN 4.18 / MAX 317.53)
  resnet50                   13.00  (SE +/- 0.15, N = 15; MIN 10.61 / MAX 388.18)
  yolov4-tiny                17.12  (SE +/- 0.14, N = 15; MIN 13.93 / MAX 387.87)
  squeezenet_ssd              8.75  (SE +/- 0.14, N = 15; MIN 6.94 / MAX 418.75)
  regnety_400m               11.06  (SE +/- 0.14, N = 15; MIN 8.41 / MAX 389.46)
  vision_transformer         42.19  (SE +/- 0.29, N = 15; MIN 35.18 / MAX 582.74)
  FastestDet                  5.08  (SE +/- 0.22, N = 15; MIN 3 / MAX 221.01)

  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
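The large MAX values relative to the means (e.g. mobilenet at a 10.64 ms mean but a 504.79 ms max) are typical first-iteration effects: allocation, weight loading, and clock ramp-up. A hedged sketch of the warm-up-then-measure loop such a latency benchmark performs; run_inference is an assumed placeholder standing in for one NCNN forward pass:

    import time

    def bench(run_inference, warmup=8, loops=100):
        for _ in range(warmup):            # absorb first-run allocation costs
            run_inference()
        times_ms = []
        for _ in range(loops):
            t0 = time.perf_counter()
            run_inference()
            times_ms.append((time.perf_counter() - t0) * 1000.0)
        return min(times_ms), sum(times_ms) / len(times_ms), max(times_ms)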
NCNN 20230517, Target: Vulkan GPU (ms, fewer is better) - ml-benchmark2-12-25-23

  mobilenet                         10.67  (SE +/- 0.26, N = 12; MIN 8.56 / MAX 353.75)
  mobilenet-v2 (Vulkan GPU-v2-v2)    4.29  (SE +/- 0.13, N = 12; MIN 3.31 / MAX 320.74)
  mobilenet-v3 (Vulkan GPU-v3-v3)    4.40  (SE +/- 0.18, N = 12; MIN 3.22 / MAX 352.45)
  shufflenet-v2                      4.36  (SE +/- 0.17, N = 12; MIN 3.67 / MAX 375.41)
  mnasnet                            3.97  (SE +/- 0.14, N = 12; MIN 3.13 / MAX 314.65)
  efficientnet-b0                    5.56  (SE +/- 0.21, N = 12; MIN 4.32 / MAX 601.76)
  blazeface                          1.60  (SE +/- 0.03, N = 12; MIN 1.29 / MAX 9.03)
  googlenet                         10.69  (SE +/- 0.23, N = 12; MIN 8.16 / MAX 460.92)
  vgg16                             35.09  (SE +/- 0.46, N = 12; MIN 28.16 / MAX 385.01)
  resnet18                           6.64  (SE +/- 0.19, N = 12; MIN 5.3 / MAX 394.05)
  alexnet                            5.20  (SE +/- 0.21, N = 12; MIN 4.2 / MAX 274.71)
  resnet50                          13.39  (SE +/- 0.20, N = 12; MIN 10.9 / MAX 369.34)
  yolov4-tiny                       17.22  (SE +/- 0.18, N = 12; MIN 13.77 / MAX 381.41)
  squeezenet_ssd                     8.79  (SE +/- 0.17, N = 12; MIN 6.89 / MAX 385.84)
  regnety_400m                      11.29  (SE +/- 0.16, N = 12; MIN 8.75 / MAX 379.29)
  vision_transformer                42.15  (SE +/- 0.31, N = 12; MIN 34.59 / MAX 476.34)
  FastestDet                         4.76  (SE +/- 0.28, N = 12; MIN 2.77 / MAX 309.73)

  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
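The Vulkan GPU figures land within a few percent of the CPU figures, so these small models see little benefit from the GPU path on this system. A quick cross-check of the deltas, with values copied from the two tables above:

    cpu    = {"mobilenet": 10.64, "vgg16": 35.80, "resnet50": 13.00, "FastestDet": 5.08}
    vulkan = {"mobilenet": 10.67, "vgg16": 35.09, "resnet50": 13.39, "FastestDet": 4.76}
    for name in cpu:
        delta = (vulkan[name] - cpu[name]) / cpu[name] * 100.0
        print(f"{name}: {delta:+.1f}% vs CPU")   # e.g. mobilenet: +0.3% vs CPU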
TNN 0.3, Target: CPU (ms, fewer is better) - ml-benchmark2-12-25-23

  DenseNet          2134.43  (SE +/- 8.59, N = 3;  MIN 2002.11 / MAX 2290.65)
  MobileNet v2       187.48  (SE +/- 1.57, N = 15; MIN 176.2 / MAX 252.15)
  SqueezeNet v2       42.14  (SE +/- 0.64, N = 15; MIN 38.56 / MAX 49.14)
  SqueezeNet v1.1    179.66  (SE +/- 2.19, N = 15; MIN 169.37 / MAX 193.32)

  1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
OpenVINO 2023.2.dev, Device: CPU - ml-benchmark2-12-25-23
Throughput in FPS (more is better), mean latency in ms (fewer is better); all N = 3.

  Face Detection FP16:                             13.70 FPS (SE +/- 0.01);   581.79 ms (SE +/- 0.48; MIN 290.44 / MAX 695.5)
  Person Detection FP16:                           87.80 FPS (SE +/- 0.59);    91.06 ms (SE +/- 0.62; MIN 33.94 / MAX 197.35)
  Person Detection FP32:                           87.33 FPS (SE +/- 0.18);    91.56 ms (SE +/- 0.18; MIN 31.31 / MAX 170.1)
  Vehicle Detection FP16:                         994.76 FPS (SE +/- 3.28);     8.04 ms (SE +/- 0.03; MIN 3.93 / MAX 100.58)
  Face Detection FP16-INT8:                        26.49 FPS (SE +/- 0.02);   301.34 ms (SE +/- 0.11; MIN 180.53 / MAX 466.91)
  Face Detection Retail FP16:                    3485.66 FPS (SE +/- 2.28);     2.29 ms (SE +/- 0.00; MIN 1.38 / MAX 55.46)
  Road Segmentation ADAS FP16:                    441.60 FPS (SE +/- 1.13);    18.10 ms (SE +/- 0.04; MIN 10.53 / MAX 93.07)
  Vehicle Detection FP16-INT8:                   1667.89 FPS (SE +/- 2.61);     4.79 ms (SE +/- 0.01; MIN 2.93 / MAX 107.95)
  Weld Porosity Detection FP16:                  1386.49 FPS (SE +/- 0.66);    11.53 ms (SE +/- 0.01; MIN 5.99 / MAX 113.69)
  Face Detection Retail FP16-INT8:               4876.38 FPS (SE +/- 3.56);     3.28 ms (SE +/- 0.00; MIN 2.08 / MAX 88.57)
  Road Segmentation ADAS FP16-INT8:               532.88 FPS (SE +/- 1.61);    15.00 ms (SE +/- 0.05; MIN 8.89 / MAX 105.47)
  Machine Translation EN To DE FP16:              121.71 FPS (SE +/- 0.37);    65.69 ms (SE +/- 0.20; MIN 28.67 / MAX 156.27)
  Weld Porosity Detection FP16-INT8:             2688.84 FPS (SE +/- 1.97);     5.95 ms (SE +/- 0.00; MIN 3.2 / MAX 99.8)
  Person Vehicle Bike Detection FP16:            1579.58 FPS (SE +/- 3.61);     5.06 ms (SE +/- 0.01; MIN 3.44 / MAX 59.53)
  Handwritten English Recognition FP16:           736.26 FPS (SE +/- 3.04);    21.72 ms (SE +/- 0.09; MIN 14.79 / MAX 129.44)
  Age Gender Recognition Retail 0013 FP16:      41501.98 FPS (SE +/- 18.73);    0.38 ms (SE +/- 0.00; MIN 0.21 / MAX 73.29)
  Handwritten English Recognition FP16-INT8:      602.34 FPS (SE +/- 1.20);    26.55 ms (SE +/- 0.05; MIN 18.75 / MAX 163.99)
  Age Gender Recognition Retail 0013 FP16-INT8: 58961.44 FPS (SE +/- 14.39);    0.26 ms (SE +/- 0.00; MIN 0.16 / MAX 134.43)

  1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie
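Since each OpenVINO entry reports both throughput and mean latency, Little's law (requests in flight = throughput x latency) recovers the apparent number of concurrent inference streams, which comes out near 8 for some models and near 16 for others on this 16-core / 32-thread CPU. A small cross-check with values copied from the table above:

    # model: (FPS, mean latency in ms), copied from the OpenVINO results above
    results = {
        "Face Detection FP16":          (13.70, 581.79),
        "Vehicle Detection FP16":       (994.76, 8.04),
        "Weld Porosity Detection FP16": (1386.49, 11.53),
        "Age Gender Recognition FP16":  (41501.98, 0.38),
    }
    for name, (fps, ms) in results.items():
        # throughput (1/s) x latency (s) ~= concurrent requests in flight
        print(f"{name}: ~{fps * ms / 1000.0:.1f} requests in flight")

This prints roughly 8.0, 8.0, 16.0, and 15.8, consistent with the benchmark running a fixed pool of parallel inference requests per model.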
Numenta Anomaly Benchmark 1.1 (seconds, fewer is better) - ml-benchmark2-12-25-23

  Detector: KNN CAD                           100.05   (SE +/- 0.47,  N = 3)
  Detector: Relative Entropy                    8.048  (SE +/- 0.034, N = 3)
  Detector: Windowed Gaussian                   4.713  (SE +/- 0.049, N = 15)
  Detector: Earthgecko Skyline                 51.84   (SE +/- 0.56,  N = 3)
  Detector: Bayesian Changepoint               12.75   (SE +/- 0.14,  N = 4)
  Detector: Contextual Anomaly Detector OSE    25.41   (SE +/- 0.27,  N = 4)
AI Benchmark Alpha 0.1.2 (score, more is better) - ml-benchmark2-12-25-23

  Device Inference Score   2932
  Device Training Score    3592
  Device AI Score          6524
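The composite Device AI Score appears to be simply the sum of the two component scores: 2932 + 3592 = 6524.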
Mlpack Benchmark (seconds, fewer is better) - ml-benchmark2-12-25-23

  Benchmark: scikit_ica                      29.93  (SE +/- 0.04, N = 3)
  Benchmark: scikit_qda                      31.30  (SE +/- 0.13, N = 3)
  Benchmark: scikit_svm                      14.12  (SE +/- 0.18, N = 3)
  Benchmark: scikit_linearridgeregression     1.08  (SE +/- 0.01, N = 15)
Scikit-Learn 1.2.2 (seconds, fewer is better) - ml-benchmark2-12-25-23

  Tree                                          37.43  (SE +/- 0.42,  N = 3)
  Lasso                                        237.29  (SE +/- 1.50,  N = 3)
  Sparsify                                      71.65  (SE +/- 1.01,  N = 3)
  Plot Ward                                     40.58  (SE +/- 0.09,  N = 3)
  Plot Neighbors                               123.61  (SE +/- 1.25,  N = 12)
  Text Vectorizers                              41.82  (SE +/- 0.09,  N = 3)
  Plot Hierarchical                            139.00  (SE +/- 0.48,  N = 3)
  Feature Expansions                            90.53  (SE +/- 0.70,  N = 3)
  Isotonic / Logistic                         1123.36  (SE +/- 4.14,  N = 3)
  Plot Incremental PCA                          49.61  (SE +/- 0.66,  N = 15)
  Isotonic / Pathological                     3114.64  (SE +/- 12.71, N = 3)
  Covertype Dataset Benchmark                  270.96  (SE +/- 2.26,  N = 3)
  Isotonic / Perturbed Logarithm              1384.03  (SE +/- 6.24,  N = 3)
  20 Newsgroups / Logistic Regression           28.22  (SE +/- 0.07,  N = 3)
  Sparse Random Projections / 100 Iterations   401.31  (SE +/- 2.81,  N = 3)

  1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc
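For context on what one of these timings covers, a minimal sketch of the shape of the "20 Newsgroups / Logistic Regression" workload: wall-clock time to fit a classifier on the vectorized corpus. This is an assumed approximation, not the harness the suite actually runs:

    import time
    from sklearn.datasets import fetch_20newsgroups_vectorized
    from sklearn.linear_model import LogisticRegression

    # Load the pre-vectorized 20 Newsgroups corpus and time a single fit.
    X, y = fetch_20newsgroups_vectorized(return_X_y=True)
    t0 = time.perf_counter()
    LogisticRegression(max_iter=1000).fit(X, y)
    print(f"fit time: {time.perf_counter() - t0:.2f} s")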
Whisper.cpp 1.4, Input: 2016 State of the Union (seconds, fewer is better) - ml-benchmark2-12-25-23

  Model: ggml-base.en       96.98  (SE +/- 1.08, N = 15)
  Model: ggml-small.en     289.74  (SE +/- 3.07, N = 9)
  Model: ggml-medium.en    807.18  (SE +/- 9.64, N = 3)

  1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread
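Transcription time rises steeply with model size on this CPU; normalizing the three results above to the base.en run makes the scaling explicit:

    times_s = {"ggml-base.en": 96.98, "ggml-small.en": 289.74, "ggml-medium.en": 807.18}
    base = times_s["ggml-base.en"]
    for model, t in times_s.items():
        print(f"{model}: {t / base:.1f}x base.en")  # 1.0x, 3.0x, 8.3x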
OpenCV 4.7, Test: DNN - Deep Neural Network (ms, fewer is better) - ml-benchmark2-12-25-23

  31437  (SE +/- 829.30, N = 15)

  1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
Phoronix Test Suite v10.8.5