ML

Intel Core i7-12700H testing with an Intel NUC12SNKi72 (SNADL357.0055.2022.0923.1555 BIOS) and Intel Arctm A770M DG2 16GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2305318-NE-ML327025586 .
ML: system details (Intel Arctm A770M DG2)

Processor: Intel Core i7-12700H @ 4.60GHz (14 Cores / 20 Threads)
Motherboard: Intel NUC12SNKi72 (SNADL357.0055.2022.0923.1555 BIOS)
Chipset: Intel Alder Lake PCH
Memory: 16GB
Disk: 1024GB SAMSUNG MZVL21T0HCLR-00A00
Graphics: Intel Arctm A770M DG2 16GB (1400MHz)
Audio: Realtek ALC274
Monitor: S27H85x
Network: Intel I225-LM + Intel Alder Lake-P PCH CNVi WiFi
OS: Ubuntu 22.04
Kernel: 5.19.0-42-generic (x86_64)
Desktop: GNOME Shell 42.5
Display Server: X Server 1.21.1.3 + Wayland
OpenGL: 4.6 Mesa 23.1.0-devel (git-722bcd7973)
OpenCL: OpenCL 3.0 + OpenCL 3.0
Vulkan: 1.3.238
Compiler: GCC 11.3.0
File-System: ext4
Screen Resolution: 2560x1440

System notes:
Transparent Huge Pages: madvise
Compiler configuration: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Scaling Governor: intel_pstate powersave (EPP: performance)
CPU Microcode: 0x429
Thermald 2.4.9
Python 3.10.6

Security mitigations:
itlb_multihit: Not affected
l1tf: Not affected
mds: Not affected
meltdown: Not affected
mmio_stale_data: Not affected
retbleed: Not affected
spec_store_bypass: Mitigation of SSB disabled via prctl
spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence
srbds: Not affected
tsx_async_abort: Not affected
ML: results overview

Test | Intel Arctm A770M DG2
shoc: OpenCL - S3D | 74.6828
shoc: OpenCL - Triad | 11.0036
shoc: OpenCL - FFT SP | 1253.70
shoc: OpenCL - MD5 Hash | 18.261
shoc: OpenCL - Reduction | 275.028
shoc: OpenCL - GEMM SGEMM_N | 2186.71
shoc: OpenCL - Max SP Flops | 1439586
shoc: OpenCL - Bus Speed Download | 11.7577
shoc: OpenCL - Bus Speed Readback | 11.3369
shoc: OpenCL - Texture Read Bandwidth | 1124.02
lczero: BLAS | 1132
onednn: IP Shapes 1D - f32 - CPU | 3.94803
onednn: IP Shapes 3D - f32 - CPU | 11.3010
onednn: IP Shapes 1D - u8s8f32 - CPU | 1.40659
onednn: IP Shapes 3D - u8s8f32 - CPU | 2.48025
onednn: Convolution Batch Shapes Auto - f32 - CPU | 15.8673
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 9.12010
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 7.64277
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 15.6202
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 1.89282
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 3.21314
onednn: Recurrent Neural Network Training - f32 - CPU | 4363.75
onednn: Recurrent Neural Network Inference - f32 - CPU | 2199.96
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 4377.60
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 2190.97
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 4387.13
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 2198.98
numpy: | 568.41
deepspeech: CPU | 65.91483
rbenchmark: | 0.1057
rnnoise: | 14.773
tensorflow-lite: SqueezeNet | 2642.68
tensorflow-lite: Inception V4 | 37029.6
tensorflow-lite: NASNet Mobile | 114426.0
tensorflow-lite: Mobilenet Float | 1916.03
tensorflow-lite: Mobilenet Quant | 3248.53
tensorflow-lite: Inception ResNet V2 | 129929.4
tensorflow: CPU - 16 - VGG-16 | 5.36
tensorflow: CPU - 32 - VGG-16 | 5.53
tensorflow: CPU - 64 - VGG-16 | 5.54
tensorflow: CPU - 16 - AlexNet | 76.22
tensorflow: CPU - 32 - AlexNet | 100.35
tensorflow: CPU - 64 - AlexNet | 118.53
tensorflow: CPU - 256 - AlexNet | 144.49
tensorflow: CPU - 512 - AlexNet | 154.08
tensorflow: CPU - 16 - GoogLeNet | 55.09
tensorflow: CPU - 16 - ResNet-50 | 16.27
tensorflow: CPU - 32 - GoogLeNet | 54.91
tensorflow: CPU - 32 - ResNet-50 | 16.75
tensorflow: CPU - 64 - GoogLeNet | 56.82
tensorflow: CPU - 64 - ResNet-50 | 16.99
tensorflow: CPU - 256 - GoogLeNet | 58.43
tensorflow: CPU - 512 - GoogLeNet | 59.45
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream | 7.7657
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream | 895.0268
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream | 7.3377
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream | 136.2778
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream | 94.9321
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream | 73.6501
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream | 61.0694
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream | 16.3677
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream | 29.2966
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream | 238.1160
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream | 22.4977
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream | 44.4413
deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream | 48.1926
deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream | 144.9694
deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream | 39.5236
deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream | 25.2934
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream | 104.1457
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream | 67.1531
deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream | 70.1283
deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream | 14.2549
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream | 66.0015
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream | 105.8618
deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream | 52.1130
deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream | 19.1851
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream | 10.1681
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream | 684.0037
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream | 9.6843
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream | 103.2463
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream | 32.7644
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream | 212.5889
deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream | 26.1122
deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream | 38.2926
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream | 7.7507
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream | 897.7116
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream | 7.3345
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream | 136.3372
spacy: en_core_web_lg | 16538
spacy: en_core_web_trf | 1232
caffe: AlexNet - CPU - 100 | 33092
caffe: AlexNet - CPU - 200 | 67449
caffe: AlexNet - CPU - 1000 | 341794
caffe: GoogleNet - CPU - 100 | 93045
caffe: GoogleNet - CPU - 200 | 187086
caffe: GoogleNet - CPU - 1000 | 940482
mnn: nasnet | 10.927
mnn: mobilenetV3 | 1.508
mnn: squeezenetv1.1 | 3.703
mnn: resnet-v2-50 | 28.036
mnn: SqueezeNetV1.0 | 5.823
mnn: MobileNetV2_224 | 3.181
mnn: mobilenet-v1-1.0 | 3.817
mnn: inception-v3 | 30.911
ncnn: CPU - mobilenet | 11.85
ncnn: CPU-v2-v2 - mobilenet-v2 | 3.27
ncnn: CPU-v3-v3 - mobilenet-v3 | 2.79
ncnn: CPU - shufflenet-v2 | 2.94
ncnn: CPU - mnasnet | 3.15
ncnn: CPU - efficientnet-b0 | 5.77
ncnn: CPU - blazeface | 1.14
ncnn: CPU - googlenet | 9.32
ncnn: CPU - vgg16 | 40.44
ncnn: CPU - resnet18 | 8.66
ncnn: CPU - alexnet | 6.69
ncnn: CPU - resnet50 | 15.57
ncnn: CPU - yolov4-tiny | 18.37
ncnn: CPU - squeezenet_ssd | 12.46
ncnn: CPU - regnety_400m | 9.41
ncnn: CPU - vision_transformer | 203.13
ncnn: CPU - FastestDet | 4.08
ncnn: Vulkan GPU - mobilenet | 21.28
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 | 5.49
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 | 7.71
ncnn: Vulkan GPU - shufflenet-v2 | 3.64
ncnn: Vulkan GPU - mnasnet | 4.98
ncnn: Vulkan GPU - efficientnet-b0 | 17.10
ncnn: Vulkan GPU - blazeface | 1.59
ncnn: Vulkan GPU - googlenet | 20.59
ncnn: Vulkan GPU - vgg16 | 16.79
ncnn: Vulkan GPU - resnet18 | 17.54
ncnn: Vulkan GPU - alexnet | 3.22
ncnn: Vulkan GPU - resnet50 | 22.53
ncnn: Vulkan GPU - yolov4-tiny | 24.44
ncnn: Vulkan GPU - squeezenet_ssd | 21.15
ncnn: Vulkan GPU - regnety_400m | 10.44
ncnn: Vulkan GPU - vision_transformer | 1487.89
ncnn: Vulkan GPU - FastestDet | 20.43
tnn: CPU - DenseNet | 1992.465
tnn: CPU - MobileNet v2 | 191.614
tnn: CPU - SqueezeNet v2 | 43.227
tnn: CPU - SqueezeNet v1.1 | 149.227
openvino: Face Detection FP16 - CPU | 2.51
openvino: Face Detection FP16 - CPU | 2364.50
openvino: Person Detection FP16 - CPU | 1.57
openvino: Person Detection FP16 - CPU | 3708.47
openvino: Person Detection FP32 - CPU | 1.56
openvino: Person Detection FP32 - CPU | 3747.64
openvino: Vehicle Detection FP16 - CPU | 174.86
openvino: Vehicle Detection FP16 - CPU | 34.24
openvino: Face Detection FP16-INT8 - CPU | 8.96
openvino: Face Detection FP16-INT8 - CPU | 662.97
openvino: Vehicle Detection FP16-INT8 - CPU | 463.39
openvino: Vehicle Detection FP16-INT8 - CPU | 12.90
openvino: Weld Porosity Detection FP16 - CPU | 264.48
openvino: Weld Porosity Detection FP16 - CPU | 57.49
openvino: Machine Translation EN To DE FP16 - CPU | 28.49
openvino: Machine Translation EN To DE FP16 - CPU | 210.29
openvino: Weld Porosity Detection FP16-INT8 - CPU | 929.62
openvino: Weld Porosity Detection FP16-INT8 - CPU | 21.50
openvino: Person Vehicle Bike Detection FP16 - CPU | 371.68
openvino: Person Vehicle Bike Detection FP16 - CPU | 16.10
openvino: Age Gender Recognition Retail 0013 FP16 - CPU | 8722.19
openvino: Age Gender Recognition Retail 0013 FP16 - CPU | 2.29
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU | 9443.09
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU | 2.11
numenta-nab: KNN CAD | 189.838
numenta-nab: Relative Entropy | 14.040
numenta-nab: Windowed Gaussian | 8.107
numenta-nab: Earthgecko Skyline | 91.994
numenta-nab: Bayesian Changepoint | 22.230
numenta-nab: Contextual Anomaly Detector OSE | 33.234
ai-benchmark: Device Inference Score | 1089
ai-benchmark: Device Training Score | 1631
ai-benchmark: Device AI Score | 2720
mlpack: scikit_ica | 39.71
mlpack: scikit_qda | 48.44
mlpack: scikit_svm | 12.54
mlpack: scikit_linearridgeregression | 2.09
opencv: DNN - Deep Neural Network | 33936
SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL. System: Intel Arctm A770M DG2. All results: More Is Better.

Benchmark | Unit | Result | SE +/- | N
S3D | GFLOPS | 74.68 | 0.03 | 3
Triad | GB/s | 11.00 | 0.00 | 3
FFT SP | GFLOPS | 1253.70 | 0.50 | 3
MD5 Hash | GHash/s | 18.26 | 0.03 | 3
Reduction | GB/s | 275.03 | 0.23 | 3
GEMM SGEMM_N | GFLOPS | 2186.71 | 19.21 | 8
Max SP Flops | GFLOPS | 1439586 | 72882.34 | 13
Bus Speed Download | GB/s | 11.76 | 0.00 | 3
Bus Speed Readback | GB/s | 11.34 | 0.00 | 3
Texture Read Bandwidth | GB/s | 1124.02 | 3.54 | 3

1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
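Each result is reported as a mean with a standard error (SE) over N runs. One way to compare run-to-run noise across benchmarks with different units is the relative standard error, SE divided by the mean. The following is a minimal Python sketch using the SHOC figures above. The function name and the 2% cutoff are illustrative assumptions and are not part of the Phoronix Test Suite.

```python
# Mean, SE, and N copied from the SHOC OpenCL results above.
shoc_results = {
    "S3D": (74.68, 0.03, 3),
    "Triad": (11.00, 0.00, 3),
    "FFT SP": (1253.70, 0.50, 3),
    "MD5 Hash": (18.26, 0.03, 3),
    "Reduction": (275.03, 0.23, 3),
    "GEMM SGEMM_N": (2186.71, 19.21, 8),
    "Max SP Flops": (1439586, 72882.34, 13),
    "Bus Speed Download": (11.76, 0.00, 3),
    "Bus Speed Readback": (11.34, 0.00, 3),
    "Texture Read Bandwidth": (1124.02, 3.54, 3),
}

# Hypothetical cutoff chosen only for illustration.
NOISE_THRESHOLD_PERCENT = 2.0


def relative_se_percent(mean, se):
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


# Collect benchmarks whose relative SE exceeds the cutoff.
noisy = {}
for name, (mean, se, n) in shoc_results.items():
    pct = relative_se_percent(mean, se)
    if pct > NOISE_THRESHOLD_PERCENT:
        noisy[name] = round(pct, 2)

for name, pct in noisy.items():
    print(f"{name}: relative SE {pct}% over N = {shoc_results[name][2]}")
```

With these inputs only Max SP Flops crosses the cutoff, which is also the benchmark with the highest run count (N = 13).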
LeelaChessZero 0.28
Backend: BLAS. System: Intel Arctm A770M DG2. Nodes Per Second, More Is Better.
Result: 1132 (SE +/- 13.39, N = 4)
1. (CXX) g++ options: -flto -pthread
oneDNN 3.1
Engine: CPU. System: Intel Arctm A770M DG2. All results in ms, Fewer Is Better.

Harness | Data Type | Result | SE +/- | N | MIN
IP Shapes 1D | f32 | 3.94803 | 0.00177 | 3 | 3.52
IP Shapes 3D | f32 | 11.30 | 0.00 | 3 | 11.22
IP Shapes 1D | u8s8f32 | 1.40659 | 0.00916 | 3 | 1.32
IP Shapes 3D | u8s8f32 | 2.48025 | 0.00815 | 3 | 2.35
Convolution Batch Shapes Auto | f32 | 15.87 | 0.01 | 3 | 15.53
Deconvolution Batch shapes_1d | f32 | 9.12010 | 0.01542 | 3 | 4.82
Deconvolution Batch shapes_3d | f32 | 7.64277 | 0.01110 | 3 | 7.24
Convolution Batch Shapes Auto | u8s8f32 | 15.62 | 0.01 | 3 | 15.1
Deconvolution Batch shapes_1d | u8s8f32 | 1.89282 | 0.00073 | 3 | 1.72
Deconvolution Batch shapes_3d | u8s8f32 | 3.21314 | 0.00195 | 3 | 3.01
Recurrent Neural Network Training | f32 | 4363.75 | 32.87 | 3 | 4268.91
Recurrent Neural Network Inference | f32 | 2199.96 | 2.50 | 3 | 2149.91
Recurrent Neural Network Training | u8s8f32 | 4377.60 | 2.77 | 3 | 4272.06
Recurrent Neural Network Inference | u8s8f32 | 2190.97 | 8.39 | 3 | 2149.25
Recurrent Neural Network Training | bf16bf16bf16 | 4387.13 | 13.68 | 3 | 4265.26
Recurrent Neural Network Inference | bf16bf16bf16 | 2198.98 | 2.31 | 3 | 2149.35

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Numpy Benchmark
System: Intel Arctm A770M DG2. Score, More Is Better.
Result: 568.41 (SE +/- 1.10, N = 3)

DeepSpeech 0.6
Acceleration: CPU. System: Intel Arctm A770M DG2. Seconds, Fewer Is Better.
Result: 65.91 (SE +/- 0.11, N = 3)

R Benchmark
System: Intel Arctm A770M DG2. Seconds, Fewer Is Better.
Result: 0.1057 (SE +/- 0.0010, N = 15)
1. R scripting front-end version 4.1.2 (2021-11-01)

RNNoise 2020-06-28
System: Intel Arctm A770M DG2. Seconds, Fewer Is Better.
Result: 14.77 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
TensorFlow Lite 2022-05-18
System: Intel Arctm A770M DG2. All results in Microseconds, Fewer Is Better.

Model | Result | SE +/- | N
SqueezeNet | 2642.68 | 24.06 | 7
Inception V4 | 37029.6 | 250.09 | 3
NASNet Mobile | 114426.0 | 5238.95 | 12
Mobilenet Float | 1916.03 | 22.41 | 4
Mobilenet Quant | 3248.53 | 34.95 | 5
Inception ResNet V2 | 129929.4 | 13225.33 | 15
TensorFlow 2.12
Device: CPU. System: Intel Arctm A770M DG2. All results in images/sec, More Is Better.

Batch Size | Model | Result | SE +/- | N
16 | VGG-16 | 5.36 | 0.01 | 3
32 | VGG-16 | 5.53 | 0.01 | 3
64 | VGG-16 | 5.54 | 0.04 | 3
16 | AlexNet | 76.22 | 0.09 | 3
32 | AlexNet | 100.35 | 0.03 | 3
64 | AlexNet | 118.53 | 0.07 | 3
256 | AlexNet | 144.49 | 0.16 | 3
512 | AlexNet | 154.08 | 0.28 | 3
16 | GoogLeNet | 55.09 | 0.05 | 3
16 | ResNet-50 | 16.27 | 0.00 | 3
32 | GoogLeNet | 54.91 | 0.05 | 3
32 | ResNet-50 | 16.75 | 0.00 | 3
64 | GoogLeNet | 56.82 | 0.13 | 3
64 | ResNet-50 | 16.99 | 0.01 | 3
256 | GoogLeNet | 58.43 | 0.05 | 3
512 | GoogLeNet | 59.45 | 0.02 | 3
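The TensorFlow results cover the same models at several batch sizes, so throughput scaling can be read directly off the numbers. The sketch below normalizes AlexNet and GoogLeNet images/sec to their batch-size-16 result. The helper name is hypothetical and the arithmetic is only an illustration of how to compare the rows, not an analysis performed by the test suite.

```python
# images/sec per batch size, copied from the TensorFlow 2.12 CPU results above.
alexnet = {16: 76.22, 32: 100.35, 64: 118.53, 256: 144.49, 512: 154.08}
googlenet = {16: 55.09, 32: 54.91, 64: 56.82, 256: 58.43, 512: 59.45}


def scaling_vs_smallest_batch(results):
    """Return throughput at each batch size relative to the smallest batch size."""
    base = results[min(results)]
    return {batch: rate / base for batch, rate in sorted(results.items())}


alexnet_scaling = scaling_vs_smallest_batch(alexnet)
googlenet_scaling = scaling_vs_smallest_batch(googlenet)

for name, scaling in (("AlexNet", alexnet_scaling), ("GoogLeNet", googlenet_scaling)):
    for batch, factor in scaling.items():
        print(f"{name} batch {batch}: {factor:.2f}x of batch 16")
```

On this system AlexNet roughly doubles its throughput between batch sizes 16 and 512, while GoogLeNet gains under 10% over the same range.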
Neural Magic DeepSparse 1.3.2
System: Intel Arctm A770M DG2. items/sec: More Is Better. ms/batch: Fewer Is Better.

Model | Scenario | Unit | Result | SE +/- | N
NLP Document Classification, oBERT base uncased on IMDB | Asynchronous Multi-Stream | items/sec | 7.7657 | 0.0110 | 3
NLP Document Classification, oBERT base uncased on IMDB | Asynchronous Multi-Stream | ms/batch | 895.03 | 1.39 | 3
NLP Document Classification, oBERT base uncased on IMDB | Synchronous Single-Stream | items/sec | 7.3377 | 0.0059 | 3
NLP Document Classification, oBERT base uncased on IMDB | Synchronous Single-Stream | ms/batch | 136.28 | 0.11 | 3
NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased | Asynchronous Multi-Stream | items/sec | 94.93 | 0.50 | 3
NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased | Asynchronous Multi-Stream | ms/batch | 73.65 | 0.39 | 3
NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased | Synchronous Single-Stream | items/sec | 61.07 | 0.33 | 3
NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased | Synchronous Single-Stream | ms/batch | 16.37 | 0.09 | 3
NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 | Asynchronous Multi-Stream | items/sec | 29.30 | 0.17 | 3
NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 | Asynchronous Multi-Stream | ms/batch | 238.12 | 1.34 | 3
NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 | Synchronous Single-Stream | items/sec | 22.50 | 0.07 | 3
NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 | Synchronous Single-Stream | ms/batch | 44.44 | 0.15 | 3
CV Detection, YOLOv5s COCO | Asynchronous Multi-Stream | items/sec | 48.19 | 0.26 | 3
CV Detection, YOLOv5s COCO | Asynchronous Multi-Stream | ms/batch | 144.97 | 0.80 | 3
CV Detection, YOLOv5s COCO | Synchronous Single-Stream | items/sec | 39.52 | 0.10 | 3
CV Detection, YOLOv5s COCO | Synchronous Single-Stream | ms/batch | 25.29 | 0.06 | 3
CV Classification, ResNet-50 ImageNet | Asynchronous Multi-Stream | items/sec | 104.15 | 0.38 | 3
CV Classification, ResNet-50 ImageNet | Asynchronous Multi-Stream | ms/batch | 67.15 | 0.27 | 3
CV Classification, ResNet-50 ImageNet | Synchronous Single-Stream | items/sec | 70.13 | 0.31 | 3
CV Classification, ResNet-50 ImageNet | Synchronous Single-Stream | ms/batch | 14.25 | 0.06 | 3
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream Intel Arctm A770M DG2 15 30 45 60 75 SE +/- 0.40, N = 3 66.00
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream Intel Arctm A770M DG2 20 40 60 80 100 SE +/- 0.62, N = 3 105.86
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream Intel Arctm A770M DG2 12 24 36 48 60 SE +/- 0.13, N = 3 52.11
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream Intel Arctm A770M DG2 5 10 15 20 25 SE +/- 0.05, N = 3 19.19
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream Intel Arctm A770M DG2 3 6 9 12 15 SE +/- 0.05, N = 3 10.17
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream Intel Arctm A770M DG2 150 300 450 600 750 SE +/- 4.26, N = 3 684.00
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream Intel Arctm A770M DG2 3 6 9 12 15 SE +/- 0.0145, N = 3 9.6843
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream Intel Arctm A770M DG2 20 40 60 80 100 SE +/- 0.15, N = 3 103.25
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream Intel Arctm A770M DG2 8 16 24 32 40 SE +/- 0.09, N = 3 32.76
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream Intel Arctm A770M DG2 50 100 150 200 250 SE +/- 0.61, N = 3 212.59
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream Intel Arctm A770M DG2 6 12 18 24 30 SE +/- 0.06, N = 3 26.11
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream Intel Arctm A770M DG2 9 18 27 36 45 SE +/- 0.09, N = 3 38.29
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream Intel Arctm A770M DG2 2 4 6 8 10 SE +/- 0.0141, N = 3 7.7507
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream Intel Arctm A770M DG2 200 400 600 800 1000 SE +/- 1.03, N = 3 897.71
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream Intel Arctm A770M DG2 2 4 6 8 10 SE +/- 0.0106, N = 3 7.3345
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream Intel Arctm A770M DG2 30 60 90 120 150 SE +/- 0.20, N = 3 136.34
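The DeepSparse items/sec and ms/batch figures above appear to be two views of the same runs. The sketch below is a rough consistency check, not part of the exported result: it assumes one item per batch (the export does not state the batch size) and uses Little's law to estimate how many requests were in flight in the Asynchronous Multi-Stream scenario. The function and variable names are illustrative only.

```python
# Figures copied from the DeepSparse 1.3.2 results above.
# Each tuple: (model, items_per_sec, ms_per_batch)
SYNC_SINGLE_STREAM = [
    ("BERT base uncased SQuaD 12layer Pruned90", 22.50, 44.44),
    ("YOLOv5s COCO", 39.52, 25.29),
    ("ResNet-50 ImageNet", 70.13, 14.25),
    ("DistilBERT mnli", 52.11, 19.19),
]
ASYNC_MULTI_STREAM = [
    ("BERT base uncased SQuaD 12layer Pruned90", 29.30, 238.12),
    ("YOLOv5s COCO", 48.19, 144.97),
    ("ResNet-50 ImageNet", 104.15, 67.15),
    ("DistilBERT mnli", 66.00, 105.86),
]


def throughput_from_latency(ms_per_batch):
    """Items/sec for one stream, ASSUMING one item per batch."""
    return 1000.0 / ms_per_batch


def implied_in_flight(items_per_sec, ms_per_batch):
    """Little's law: requests in flight = arrival rate * time in system."""
    return items_per_sec * ms_per_batch / 1000.0


if __name__ == "__main__":
    for model, ips, ms in SYNC_SINGLE_STREAM:
        print(f"sync  {model}: reported {ips:.2f}, 1000/latency {throughput_from_latency(ms):.2f}")
    for model, ips, ms in ASYNC_MULTI_STREAM:
        print(f"async {model}: implied in-flight requests {implied_in_flight(ips, ms):.2f}")
```

Under that assumption, the single-stream throughput matches 1000 / latency closely, and the multi-stream rows all imply roughly seven concurrent requests. The export itself does not state a stream count, so that figure is an inference.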
spaCy 3.4.1 - Model: en_core_web_lg - tokens/sec, More Is Better - Intel Arctm A770M DG2: 16538 (SE +/- 15.06, N = 3)
spaCy 3.4.1 - Model: en_core_web_trf - tokens/sec, More Is Better - Intel Arctm A770M DG2: 1232 (SE +/- 2.91, N = 3)
Caffe 2020-02-13 - Model: AlexNet - Acceleration: CPU - Iterations: 100 - Milli-Seconds, Fewer Is Better - Intel Arctm A770M DG2: 33092 (SE +/- 153.54, N = 3)
Caffe 2020-02-13 - Model: AlexNet - Acceleration: CPU - Iterations: 200 - Milli-Seconds, Fewer Is Better - Intel Arctm A770M DG2: 67449 (SE +/- 236.68, N = 3)
Caffe 2020-02-13 - Model: AlexNet - Acceleration: CPU - Iterations: 1000 - Milli-Seconds, Fewer Is Better - Intel Arctm A770M DG2: 341794 (SE +/- 950.21, N = 3)
Caffe 2020-02-13 - Model: GoogleNet - Acceleration: CPU - Iterations: 100 - Milli-Seconds, Fewer Is Better - Intel Arctm A770M DG2: 93045 (SE +/- 268.60, N = 3)
Caffe 2020-02-13 - Model: GoogleNet - Acceleration: CPU - Iterations: 200 - Milli-Seconds, Fewer Is Better - Intel Arctm A770M DG2: 187086 (SE +/- 638.93, N = 3)
Caffe 2020-02-13 - Model: GoogleNet - Acceleration: CPU - Iterations: 1000 - Milli-Seconds, Fewer Is Better - Intel Arctm A770M DG2: 940482 (SE +/- 1731.20, N = 3)
Caffe 2020-02-13 - (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
Mobile Neural Network 2.1 - Model: nasnet - ms, Fewer Is Better - Intel Arctm A770M DG2: 10.93 (SE +/- 0.10, N = 12; MIN: 9.86 / MAX: 47.6)
Mobile Neural Network 2.1 - Model: mobilenetV3 - ms, Fewer Is Better - Intel Arctm A770M DG2: 1.508 (SE +/- 0.046, N = 12; MIN: 1.3 / MAX: 8.13)
Mobile Neural Network 2.1 - Model: squeezenetv1.1 - ms, Fewer Is Better - Intel Arctm A770M DG2: 3.703 (SE +/- 0.142, N = 12; MIN: 2.82 / MAX: 43.09)
Mobile Neural Network 2.1 - Model: resnet-v2-50 - ms, Fewer Is Better - Intel Arctm A770M DG2: 28.04 (SE +/- 0.98, N = 12; MIN: 23.35 / MAX: 56.65)
Mobile Neural Network 2.1 - Model: SqueezeNetV1.0 - ms, Fewer Is Better - Intel Arctm A770M DG2: 5.823 (SE +/- 0.112, N = 12; MIN: 4.91 / MAX: 40.29)
Mobile Neural Network 2.1 - Model: MobileNetV2_224 - ms, Fewer Is Better - Intel Arctm A770M DG2: 3.181 (SE +/- 0.033, N = 12; MIN: 2.57 / MAX: 12.01)
Mobile Neural Network 2.1 - Model: mobilenet-v1-1.0 - ms, Fewer Is Better - Intel Arctm A770M DG2: 3.817 (SE +/- 0.252, N = 12; MIN: 3.14 / MAX: 28.69)
Mobile Neural Network 2.1 - Model: inception-v3 - ms, Fewer Is Better - Intel Arctm A770M DG2: 30.91 (SE +/- 0.62, N = 12; MIN: 28.06 / MAX: 69.93)
Mobile Neural Network 2.1 - (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN 20220729 - Target: CPU - Model: mobilenet - ms, Fewer Is Better - Intel Arctm A770M DG2: 11.85 (SE +/- 0.28, N = 12; MIN: 10.3 / MAX: 20.88)
NCNN 20220729 - Target: CPU-v2-v2 - Model: mobilenet-v2 - ms, Fewer Is Better - Intel Arctm A770M DG2: 3.27 (SE +/- 0.07, N = 12; MIN: 2.95 / MAX: 5.37)
NCNN 20220729 - Target: CPU-v3-v3 - Model: mobilenet-v3 - ms, Fewer Is Better - Intel Arctm A770M DG2: 2.79 (SE +/- 0.05, N = 12; MIN: 2.48 / MAX: 7.62)
NCNN 20220729 - Target: CPU - Model: shufflenet-v2 - ms, Fewer Is Better - Intel Arctm A770M DG2: 2.94 (SE +/- 0.06, N = 12; MIN: 2.71 / MAX: 8.78)
NCNN 20220729 - Target: CPU - Model: mnasnet - ms, Fewer Is Better - Intel Arctm A770M DG2: 3.15 (SE +/- 0.10, N = 12; MIN: 2.58 / MAX: 8.74)
NCNN 20220729 - Target: CPU - Model: efficientnet-b0 - ms, Fewer Is Better - Intel Arctm A770M DG2: 5.77 (SE +/- 0.10, N = 12; MIN: 5.12 / MAX: 15.13)
NCNN 20220729 - Target: CPU - Model: blazeface - ms, Fewer Is Better - Intel Arctm A770M DG2: 1.14 (SE +/- 0.02, N = 12; MIN: 0.96 / MAX: 3.6)
NCNN 20220729 - Target: CPU - Model: googlenet - ms, Fewer Is Better - Intel Arctm A770M DG2: 9.32 (SE +/- 0.07, N = 12; MIN: 8.79 / MAX: 17.88)
NCNN 20220729 - Target: CPU - Model: vgg16 - ms, Fewer Is Better - Intel Arctm A770M DG2: 40.44 (SE +/- 0.42, N = 12; MIN: 37.76 / MAX: 424.77)
NCNN 20220729 - Target: CPU - Model: resnet18 - ms, Fewer Is Better - Intel Arctm A770M DG2: 8.66 (SE +/- 0.07, N = 12; MIN: 8.44 / MAX: 153.69)
NCNN 20220729 - Target: CPU - Model: alexnet - ms, Fewer Is Better - Intel Arctm A770M DG2: 6.69 (SE +/- 0.02, N = 12; MIN: 6.49 / MAX: 11.55)
NCNN 20220729 - Target: CPU - Model: resnet50 - ms, Fewer Is Better - Intel Arctm A770M DG2: 15.57 (SE +/- 0.05, N = 12; MIN: 15.14 / MAX: 24.06)
NCNN 20220729 - Target: CPU - Model: yolov4-tiny - ms, Fewer Is Better - Intel Arctm A770M DG2: 18.37 (SE +/- 0.24, N = 12; MIN: 17.29 / MAX: 327.85)
NCNN 20220729 - Target: CPU - Model: squeezenet_ssd - ms, Fewer Is Better - Intel Arctm A770M DG2: 12.46 (SE +/- 0.08, N = 12; MIN: 11.96 / MAX: 20.52)
NCNN 20220729 - Target: CPU - Model: regnety_400m - ms, Fewer Is Better - Intel Arctm A770M DG2: 9.41 (SE +/- 0.24, N = 12; MIN: 8.33 / MAX: 16.81)
NCNN 20220729 - Target: CPU - Model: vision_transformer - ms, Fewer Is Better - Intel Arctm A770M DG2: 203.13 (SE +/- 1.74, N = 12; MIN: 181.8 / MAX: 966.84)
NCNN 20220729 - Target: CPU - Model: FastestDet - ms, Fewer Is Better - Intel Arctm A770M DG2: 4.08 (SE +/- 0.20, N = 12; MIN: 3.32 / MAX: 27.33)
NCNN 20220729 - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729 - Target: Vulkan GPU - Model: mobilenet - ms, Fewer Is Better - Intel Arctm A770M DG2: 21.28 (SE +/- 0.60, N = 9; MIN: 15.6 / MAX: 68.78)
NCNN 20220729 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 - ms, Fewer Is Better - Intel Arctm A770M DG2: 5.49 (SE +/- 1.25, N = 9; MIN: 3.29 / MAX: 38.71)
NCNN 20220729 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 - ms, Fewer Is Better - Intel Arctm A770M DG2: 7.71 (SE +/- 1.71, N = 9; MIN: 3.54 / MAX: 35.62)
NCNN 20220729 - Target: Vulkan GPU - Model: shufflenet-v2 - ms, Fewer Is Better - Intel Arctm A770M DG2: 3.64 (SE +/- 0.25, N = 9; MIN: 2.93 / MAX: 22.75)
NCNN 20220729 - Target: Vulkan GPU - Model: mnasnet - ms, Fewer Is Better - Intel Arctm A770M DG2: 4.98 (SE +/- 0.70, N = 9; MIN: 3.4 / MAX: 39.12)
NCNN 20220729 - Target: Vulkan GPU - Model: efficientnet-b0 - ms, Fewer Is Better - Intel Arctm A770M DG2: 17.10 (SE +/- 1.05, N = 9; MIN: 7.31 / MAX: 38.92)
NCNN 20220729 - Target: Vulkan GPU - Model: blazeface - ms, Fewer Is Better - Intel Arctm A770M DG2: 1.59 (SE +/- 0.10, N = 9; MIN: 1.31 / MAX: 5.68)
NCNN 20220729 - Target: Vulkan GPU - Model: googlenet - ms, Fewer Is Better - Intel Arctm A770M DG2: 20.59 (SE +/- 0.30, N = 9; MIN: 9.29 / MAX: 40.28)
NCNN 20220729 - Target: Vulkan GPU - Model: vgg16 - ms, Fewer Is Better - Intel Arctm A770M DG2: 16.79 (SE +/- 0.32, N = 9; MIN: 14.77 / MAX: 45.51)
NCNN 20220729 - Target: Vulkan GPU - Model: resnet18 - ms, Fewer Is Better - Intel Arctm A770M DG2: 17.54 (SE +/- 0.51, N = 9; MIN: 6.07 / MAX: 37.81)
NCNN 20220729 - Target: Vulkan GPU - Model: alexnet - ms, Fewer Is Better - Intel Arctm A770M DG2: 3.22 (SE +/- 0.68, N = 9; MIN: 2.08 / MAX: 35.28)
NCNN 20220729 - Target: Vulkan GPU - Model: resnet50 - ms, Fewer Is Better - Intel Arctm A770M DG2: 22.53 (SE +/- 0.39, N = 8; MIN: 10.36 / MAX: 39.05)
NCNN 20220729 - Target: Vulkan GPU - Model: yolov4-tiny - ms, Fewer Is Better - Intel Arctm A770M DG2: 24.44 (SE +/- 0.71, N = 9; MIN: 20.18 / MAX: 81.89)
NCNN 20220729 - Target: Vulkan GPU - Model: squeezenet_ssd - ms, Fewer Is Better - Intel Arctm A770M DG2: 21.15 (SE +/- 0.37, N = 9; MIN: 12.63 / MAX: 83.25)
NCNN 20220729 - Target: Vulkan GPU - Model: regnety_400m - ms, Fewer Is Better - Intel Arctm A770M DG2: 10.44 (SE +/- 1.98, N = 9; MIN: 4.82 / MAX: 36.59)
NCNN 20220729 - Target: Vulkan GPU - Model: vision_transformer - ms, Fewer Is Better - Intel Arctm A770M DG2: 1487.89 (SE +/- 4.49, N = 9; MIN: 695.27 / MAX: 2143.43)
NCNN 20220729 - Target: Vulkan GPU - Model: FastestDet - ms, Fewer Is Better - Intel Arctm A770M DG2: 20.43 (SE +/- 1.11, N = 9; MIN: 4.71 / MAX: 37.22)
NCNN 20220729 - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
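Since NCNN 20220729 was run against both the CPU and the Vulkan GPU target with the same model list, the two sets of average latencies can be put side by side. The sketch below is one illustrative way to do that, with the averages copied from the results above. The dictionary and function names are hypothetical, and the ratio ignores the fairly large standard errors on several Vulkan runs, so small differences should not be over-read.

```python
# Average latency in ms per model: (CPU target, Vulkan GPU target),
# copied from the NCNN 20220729 results above.
NCNN_MS = {
    "mobilenet": (11.85, 21.28),
    "mobilenet-v2": (3.27, 5.49),
    "mobilenet-v3": (2.79, 7.71),
    "shufflenet-v2": (2.94, 3.64),
    "mnasnet": (3.15, 4.98),
    "efficientnet-b0": (5.77, 17.10),
    "blazeface": (1.14, 1.59),
    "googlenet": (9.32, 20.59),
    "vgg16": (40.44, 16.79),
    "resnet18": (8.66, 17.54),
    "alexnet": (6.69, 3.22),
    "resnet50": (15.57, 22.53),
    "yolov4-tiny": (18.37, 24.44),
    "squeezenet_ssd": (12.46, 21.15),
    "regnety_400m": (9.41, 10.44),
    "vision_transformer": (203.13, 1487.89),
    "FastestDet": (4.08, 20.43),
}


def cpu_over_vulkan(cpu_ms, vulkan_ms):
    """Ratio > 1 means the Vulkan target had the lower average latency."""
    return cpu_ms / vulkan_ms


# Models whose Vulkan average is lower than their CPU average.
faster_on_vulkan = sorted(
    model for model, (cpu_ms, vulkan_ms) in NCNN_MS.items() if cpu_ms > vulkan_ms
)

if __name__ == "__main__":
    for model, (cpu_ms, vulkan_ms) in NCNN_MS.items():
        print(f"{model}: CPU/Vulkan = {cpu_over_vulkan(cpu_ms, vulkan_ms):.2f}")
    print("Lower average latency on Vulkan:", faster_on_vulkan)
```

On these averages, only vgg16 and alexnet come out ahead on the Vulkan target. Every other model had a lower average latency on the CPU target.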
TNN 0.3 - Target: CPU - Model: DenseNet - ms, Fewer Is Better - Intel Arctm A770M DG2: 1992.47 (SE +/- 1.57, N = 3; MIN: 1913.75 / MAX: 2125.39)
TNN 0.3 - Target: CPU - Model: MobileNet v2 - ms, Fewer Is Better - Intel Arctm A770M DG2: 191.61 (SE +/- 0.39, N = 3; MIN: 184.04 / MAX: 217.57)
TNN 0.3 - Target: CPU - Model: SqueezeNet v2 - ms, Fewer Is Better - Intel Arctm A770M DG2: 43.23 (SE +/- 0.03, N = 3; MIN: 42.91 / MAX: 43.66)
TNN 0.3 - Target: CPU - Model: SqueezeNet v1.1 - ms, Fewer Is Better - Intel Arctm A770M DG2: 149.23 (SE +/- 0.10, N = 3; MIN: 148.73 / MAX: 150.29)
TNN 0.3 - (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.3 Model: Face Detection FP16 - Device: CPU Intel Arctm A770M DG2 0.5648 1.1296 1.6944 2.2592 2.824 SE +/- 0.03, N = 4 2.51 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2022.3 Model: Face Detection FP16 - Device: CPU Intel Arctm A770M DG2 500 1000 1500 2000 2500 SE +/- 17.33, N = 4 2364.50 MIN: 1171.79 / MAX: 2888.35 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.3 Model: Person Detection FP16 - Device: CPU Intel Arctm A770M DG2 0.3533 0.7066 1.0599 1.4132 1.7665 SE +/- 0.01, N = 3 1.57 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2022.3 Model: Person Detection FP16 - Device: CPU Intel Arctm A770M DG2 800 1600 2400 3200 4000 SE +/- 27.05, N = 3 3708.47 MIN: 2911.35 / MAX: 4667.02 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.3 Model: Person Detection FP32 - Device: CPU Intel Arctm A770M DG2 0.351 0.702 1.053 1.404 1.755 SE +/- 0.02, N = 3 1.56 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2022.3 Model: Person Detection FP32 - Device: CPU Intel Arctm A770M DG2 800 1600 2400 3200 4000 SE +/- 46.22, N = 3 3747.64 MIN: 3027.95 / MAX: 4624.13 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.3 Model: Vehicle Detection FP16 - Device: CPU Intel Arctm A770M DG2 40 80 120 160 200 SE +/- 0.43, N = 3 174.86 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2022.3 Model: Vehicle Detection FP16 - Device: CPU Intel Arctm A770M DG2 8 16 24 32 40 SE +/- 0.09, N = 3 34.24 MIN: 13 / MAX: 46.82 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.3 Model: Face Detection FP16-INT8 - Device: CPU Intel Arctm A770M DG2 3 6 9 12 15 SE +/- 0.08, N = 3 8.96 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2022.3 Model: Face Detection FP16-INT8 - Device: CPU Intel Arctm A770M DG2 140 280 420 560 700 SE +/- 5.52, N = 3 662.97 MIN: 284.46 / MAX: 1378.68 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.3 Model: Vehicle Detection FP16-INT8 - Device: CPU Intel Arctm A770M DG2 100 200 300 400 500 SE +/- 4.89, N = 3 463.39 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2022.3 Model: Vehicle Detection FP16-INT8 - Device: CPU Intel Arctm A770M DG2 3 6 9 12 15 SE +/- 0.14, N = 3 12.90 MIN: 7.03 / MAX: 28.09 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.3 Model: Weld Porosity Detection FP16 - Device: CPU Intel Arctm A770M DG2 60 120 180 240 300 SE +/- 3.36, N = 3 264.48 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2022.3 Model: Weld Porosity Detection FP16 - Device: CPU Intel Arctm A770M DG2 13 26 39 52 65 SE +/- 17.14, N = 3 57.49 MIN: 17.08 / MAX: 102.32 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2022.3 Model: Machine Translation EN To DE FP16 - Device: CPU Intel Arctm A770M DG2 7 14 21 28 35 SE +/- 0.16, N = 3 28.49 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better) - Intel Arctm A770M DG2: 210.29 (SE +/- 1.20, N = 3; MIN: 165.03 / MAX: 293.37)
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better) - Intel Arctm A770M DG2: 929.62 (SE +/- 13.36, N = 3)
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better) - Intel Arctm A770M DG2: 21.50 (SE +/- 0.30, N = 3; MIN: 9.43 / MAX: 54.11)
OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better) - Intel Arctm A770M DG2: 371.68 (SE +/- 2.76, N = 11)
OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better) - Intel Arctm A770M DG2: 16.10 (SE +/- 0.12, N = 11; MIN: 8.47 / MAX: 29.36)
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better) - Intel Arctm A770M DG2: 8722.19 (SE +/- 98.26, N = 4)
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better) - Intel Arctm A770M DG2: 2.29 (SE +/- 0.03, N = 4; MIN: 1.01 / MAX: 14.75)
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better) - Intel Arctm A770M DG2: 9443.09 (SE +/- 101.33, N = 5)
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better) - Intel Arctm A770M DG2: 2.11 (SE +/- 0.02, N = 5; MIN: 0.98 / MAX: 13.73)
OpenVINO 2022.3 - (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared
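The paired OpenVINO throughput (FPS) and average latency (ms) figures can be cross-checked with Little's law, which relates the average number of requests in flight to throughput multiplied by average latency. The sketch below is illustrative only: the function name `requests_in_flight` is hypothetical, and interpreting the product as the number of parallel inference requests is an assumption, not something the result file states.

```python
# Throughput (FPS) and average latency (ms) copied from the OpenVINO CPU results.
RESULTS = {
    "Weld Porosity Detection FP16-INT8": (929.62, 21.50),
    "Person Vehicle Bike Detection FP16": (371.68, 16.10),
    "Age Gender Recognition Retail 0013 FP16": (8722.19, 2.29),
    "Age Gender Recognition Retail 0013 FP16-INT8": (9443.09, 2.11),
}


def requests_in_flight(fps: float, latency_ms: float) -> float:
    """Little's law: mean concurrency = throughput * mean latency in seconds."""
    return fps * latency_ms / 1000.0


if __name__ == "__main__":
    for model, (fps, latency_ms) in RESULTS.items():
        print(f"{model}: {requests_in_flight(fps, latency_ms):.2f}")
```

Three of the four products land close to 20 and one close to 6. The value 20 coincides with the 20 hardware threads of the Core i7-12700H, although whether the benchmark actually sized its request queue that way is a guess.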
Numenta Anomaly Benchmark 1.1 - Detector: KNN CAD (Seconds, Fewer Is Better) - Intel Arctm A770M DG2: 189.84 (SE +/- 1.52, N = 3)
Numenta Anomaly Benchmark 1.1 - Detector: Relative Entropy (Seconds, Fewer Is Better) - Intel Arctm A770M DG2: 14.04 (SE +/- 0.15, N = 3)
Numenta Anomaly Benchmark 1.1 - Detector: Windowed Gaussian (Seconds, Fewer Is Better) - Intel Arctm A770M DG2: 8.107 (SE +/- 0.113, N = 3)
Numenta Anomaly Benchmark 1.1 - Detector: Earthgecko Skyline (Seconds, Fewer Is Better) - Intel Arctm A770M DG2: 91.99 (SE +/- 0.95, N = 15)
Numenta Anomaly Benchmark 1.1 - Detector: Bayesian Changepoint (Seconds, Fewer Is Better) - Intel Arctm A770M DG2: 22.23 (SE +/- 0.34, N = 15)
Numenta Anomaly Benchmark 1.1 - Detector: Contextual Anomaly Detector OSE (Seconds, Fewer Is Better) - Intel Arctm A770M DG2: 33.23 (SE +/- 0.12, N = 3)
AI Benchmark Alpha 0.1.2 - Device Inference Score (Score, More Is Better) - Intel Arctm A770M DG2: 1089
AI Benchmark Alpha 0.1.2 - Device Training Score (Score, More Is Better) - Intel Arctm A770M DG2: 1631
AI Benchmark Alpha 0.1.2 - Device AI Score (Score, More Is Better) - Intel Arctm A770M DG2: 2720
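The three AI Benchmark Alpha scores are numerically consistent with the Device AI Score being the sum of the inference and training scores. The short check below demonstrates that arithmetic; the helper name `combined_score` is hypothetical, and the additive relationship is inferred from these reported numbers rather than stated in the result file.

```python
# Scores copied from the AI Benchmark Alpha 0.1.2 results.
INFERENCE_SCORE = 1089
TRAINING_SCORE = 1631
REPORTED_AI_SCORE = 2720


def combined_score(inference: int, training: int) -> int:
    """Assumed relationship: overall score = inference score + training score."""
    return inference + training


if __name__ == "__main__":
    total = combined_score(INFERENCE_SCORE, TRAINING_SCORE)
    print(f"inference + training = {total}, reported AI score = {REPORTED_AI_SCORE}")
```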
Mlpack Benchmark - Benchmark: scikit_ica (Seconds, Fewer Is Better) - Intel Arctm A770M DG2: 39.71 (SE +/- 0.05, N = 3)
Mlpack Benchmark - Benchmark: scikit_qda (Seconds, Fewer Is Better) - Intel Arctm A770M DG2: 48.44 (SE +/- 0.65, N = 15)
Mlpack Benchmark - Benchmark: scikit_svm (Seconds, Fewer Is Better) - Intel Arctm A770M DG2: 12.54 (SE +/- 0.03, N = 3)
Mlpack Benchmark - Benchmark: scikit_linearridgeregression (Seconds, Fewer Is Better) - Intel Arctm A770M DG2: 2.09 (SE +/- 0.02, N = 4)
OpenCV 4.7 - Test: DNN - Deep Neural Network (ms, Fewer Is Better) - Intel Arctm A770M DG2: 33936 (SE +/- 1700.61, N = 15)
OpenCV 4.7 - (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
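To compare run-to-run noise across results reported in different units, each "SE +/-" figure can be expressed as a percentage of its mean, and an approximate sample standard deviation can be recovered from SE and N. This is a minimal sketch that assumes "SE" is the standard error of the mean (standard deviation divided by the square root of N); the function names are hypothetical and only a handful of results are included.

```python
import math

# (mean, standard error, number of runs) copied from selected results.
SAMPLES = {
    "OpenCV DNN (ms)": (33936.0, 1700.61, 15),
    "Mlpack scikit_qda (s)": (48.44, 0.65, 15),
    "Mlpack scikit_ica (s)": (39.71, 0.05, 3),
    "NAB Earthgecko Skyline (s)": (91.99, 0.95, 15),
}


def relative_se_percent(mean: float, se: float) -> float:
    """Standard error as a percentage of the mean."""
    return 100.0 * se / mean


def estimated_std_dev(se: float, n: int) -> float:
    """Invert SE = SD / sqrt(N) to estimate the sample standard deviation."""
    return se * math.sqrt(n)


if __name__ == "__main__":
    # Print the noisiest results first.
    ranked = sorted(SAMPLES.items(), key=lambda kv: relative_se_percent(kv[1][0], kv[1][1]), reverse=True)
    for name, (mean, se, n) in ranked:
        print(f"{name}: SE {relative_se_percent(mean, se):.2f}% of mean, "
              f"SD ~ {estimated_std_dev(se, n):.2f} over N = {n}")
```

Under that assumption the OpenCV DNN result carries a standard error of roughly 5% of its mean, far higher than the other entries listed, so it is best read as an approximate figure.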
Phoronix Test Suite v10.8.5