h510-g6405-1: Intel Pentium Gold G6405 testing with an ASRock H510M-HDV/M.2 SE (P1.60 BIOS) and Intel UHD 610 CML GT1 3GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2311104-HERT-H510G6457&grs.
h510-g6405-1 System Details

    Processor: Intel Pentium Gold G6405 @ 4.10GHz (2 Cores / 4 Threads)
    Motherboard: ASRock H510M-HDV/M.2 SE (P1.60 BIOS)
    Chipset: Intel Comet Lake PCH
    Memory: 3584MB
    Disk: 1000GB Western Digital WDS100T2B0A
    Graphics: Intel UHD 610 CML GT1 3GB (1050MHz)
    Audio: Realtek ALC897
    Monitor: G185BGEL01
    Network: Realtek RTL8111/8168/8411
    OS: Ubuntu 20.04
    Kernel: 5.15.0-88-generic (x86_64)
    Desktop: GNOME Shell 3.36.9
    Display Server: X Server 1.20.13
    OpenGL: 4.6 Mesa 21.2.6
    Vulkan: 1.2.182
    Compiler: GCC 9.4.0
    File-System: ext4
    Screen Resolution: 1368x768

Notes:
    - Transparent Huge Pages: madvise
    - Compiler configuration: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-9QDOt0/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
    - Scaling Governor: intel_pstate powersave (EPP: balance_performance)
    - CPU Microcode: 0xf8
    - Thermald 1.9.1
    - Python 3.8.10
    - Security: gather_data_sampling: Not affected; itlb_multihit: KVM: Mitigation of VMX disabled; l1tf: Not affected; mds: Not affected; meltdown: Not affected; mmio_stale_data: Mitigation of Clear buffers, SMT vulnerable; retbleed: Mitigation of Enhanced IBRS; spec_rstack_overflow: Not affected; spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp; spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization; spectre_v2: Mitigation of Enhanced IBRS, IBPB: conditional, RSB filling, PBRSB-eIBRS: SW sequence; srbds: Mitigation of Microcode; tsx_async_abort: Not affected
Benchmarks in this result file, all run on the Intel UHD 610 CML GT1 / Pentium Gold G6405 system described above (per-test results with standard errors follow): Whisper.cpp 1.4, Scikit-Learn 1.2.2, Mlpack Benchmark, ONNX Runtime 1.14, Numenta Anomaly Benchmark 1.1, OpenVINO 2023.2.dev, PlaidML, TNN 0.3, NCNN 20230517, Mobile Neural Network 2.1, Caffe 2020-02-13, TensorFlow Lite 2022-05-18, RNNoise 2020-06-28, R Benchmark, Numpy Benchmark, oneDNN 3.3, and LCZero (BLAS backend).
Whisper.cpp 1.4 - Seconds, fewer is better (g++ options: -O3 -std=c++11 -fPIC -pthread)
    Model: ggml-medium.en - Input: 2016 State of the Union: 39940.56 (SE +/- 132.56, N = 3)
    Model: ggml-small.en - Input: 2016 State of the Union: 11358.50 (SE +/- 62.26, N = 3)
    Model: ggml-base.en - Input: 2016 State of the Union: 3116.24 (SE +/- 0.81, N = 3)
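Each Whisper.cpp result is the wall-clock time to transcribe the roughly hour-long 2016 State of the Union recording. A minimal sketch of one such run, assuming a locally built whisper.cpp 1.4 main binary plus downloaded model and audio files (paths are placeholders, and the PTS profile automates the equivalent):

    import subprocess, time

    MODEL = "models/ggml-base.en.bin"        # assumed path to a downloaded model
    AUDIO = "2016-state-of-the-union.wav"    # assumed 16 kHz mono WAV input

    start = time.perf_counter()
    subprocess.run(["./main", "-m", MODEL, "-f", AUDIO], check=True)
    print(f"Elapsed: {time.perf_counter() - start:.2f} seconds")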
Scikit-Learn 1.2.2 - Seconds, fewer is better; N = 3 unless noted (gfortran options: -O0)
    Benchmark: Sparse Random Projections / 100 Iterations: 3027.19 (SE +/- 2.34)
    Benchmark: Kernel PCA Solvers / Time vs. N Components: 324.22 (SE +/- 4.78, N = 9)
    Benchmark: Kernel PCA Solvers / Time vs. N Samples: 456.28 (SE +/- 0.35)
    Benchmark: Hist Gradient Boosting Categorical Only: 27.02 (SE +/- 0.01)
    Benchmark: Plot Polynomial Kernel Approximation: 295.98 (SE +/- 0.33)
    Benchmark: 20 Newsgroups / Logistic Regression: 60.84 (SE +/- 0.06)
    Benchmark: Hist Gradient Boosting Higgs Boson: 185.74 (SE +/- 0.49)
    Benchmark: Plot Singular Value Decomposition: 352.77 (SE +/- 0.33)
    Benchmark: Hist Gradient Boosting Threading: 449.44 (SE +/- 1.33)
    Benchmark: Covertype Dataset Benchmark: 581.60 (SE +/- 0.92)
    Benchmark: Sample Without Replacement: 147.24 (SE +/- 1.32)
    Benchmark: Hist Gradient Boosting: 194.73 (SE +/- 0.61)
    Benchmark: Plot Incremental PCA: 82.86 (SE +/- 0.09)
    Benchmark: TSNE MNIST Dataset: 796.41 (SE +/- 0.17)
    Benchmark: LocalOutlierFactor: 448.57 (SE +/- 2.17)
    Benchmark: Feature Expansions: 212.15 (SE +/- 0.25)
    Benchmark: Plot OMP vs. LARS: 249.52 (SE +/- 0.07)
    Benchmark: Plot Hierarchical: 273.30 (SE +/- 0.38)
    Benchmark: Text Vectorizers: 79.73 (SE +/- 0.04)
    Benchmark: Plot Lasso Path: 465.91 (SE +/- 0.95)
    Benchmark: SGD Regression: 219.60 (SE +/- 0.43)
    Benchmark: Plot Neighbors: 262.92 (SE +/- 0.89)
    Benchmark: MNIST Dataset: 86.74 (SE +/- 0.15)
    Benchmark: Plot Ward: 93.20 (SE +/- 0.03)
    Benchmark: Sparsify: 104.00 (SE +/- 0.01)
    Benchmark: Lasso: 1060.63 (SE +/- 0.24)
    Benchmark: Tree: 47.10 (SE +/- 0.17)
    Benchmark: SAGA: 1055.94 (SE +/- 1.27)
    Benchmark: GLM: 1113.27 (SE +/- 4.22)
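Every figure in this file is reported as a mean with a standard error over N repeated runs. A minimal sketch of that measurement pattern, using a sparse random projection as a stand-in workload with illustrative sizes (not the profile's actual parameters):

    import time, statistics
    from scipy.sparse import random as sparse_random
    from sklearn.random_projection import SparseRandomProjection

    def bench(fn, n=3):
        # Time n runs, then report mean and standard error of the mean.
        times = []
        for _ in range(n):
            t0 = time.perf_counter()
            fn()
            times.append(time.perf_counter() - t0)
        return statistics.mean(times), statistics.stdev(times) / n ** 0.5

    X = sparse_random(1000, 5000, density=0.01, random_state=0)  # illustrative data
    proj = SparseRandomProjection(n_components=300, random_state=0)
    mean, se = bench(lambda: proj.fit_transform(X))
    print(f"{mean:.2f} s (SE +/- {se:.2f}, N = 3)")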
Mlpack Benchmark - Seconds, fewer is better
    Benchmark: scikit_svm: 27.15 (SE +/- 0.03, N = 3)
    Benchmark: scikit_ica: 114.79 (SE +/- 0.48, N = 3)
ONNX Runtime 1.14 - Inferences Per Second, more is better; N = 3 unless noted (g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread)
    Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard: 1.36375 (SE +/- 0.00596)
    Model: super-resolution-10 - Device: CPU - Executor: Standard: 9.33438 (SE +/- 0.00561)
    Model: super-resolution-10 - Device: CPU - Executor: Parallel: 8.29467 (SE +/- 0.01346)
    Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard: 10.71 (SE +/- 0.01)
    Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel: 9.03460 (SE +/- 0.09629)
    Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel: 2.23139 (SE +/- 0.00128)
    Model: fcn-resnet101-11 - Device: CPU - Executor: Standard: 0.108787 (SE +/- 0.000029)
    Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel: 0.0548757 (SE +/- 0.0001368)
    Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel: 31.99 (SE +/- 0.49, N = 12)
    Model: GPT-2 - Device: CPU - Executor: Standard: 30.01 (SE +/- 0.19)
    Model: GPT-2 - Device: CPU - Executor: Parallel: 28.76 (SE +/- 0.03)
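In these ONNX Runtime results, Executor: Standard corresponds to the runtime's sequential execution mode and Executor: Parallel to its parallel mode. A minimal sketch of selecting between them, with an assumed model file name:

    import onnxruntime as ort

    opts = ort.SessionOptions()
    # "Standard" = ORT_SEQUENTIAL, "Parallel" = ORT_PARALLEL
    opts.execution_mode = ort.ExecutionMode.ORT_PARALLEL
    sess = ort.InferenceSession("super-resolution-10.onnx",  # assumed model file
                                sess_options=opts,
                                providers=["CPUExecutionProvider"])
    print([inp.name for inp in sess.get_inputs()])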
Numenta Anomaly Benchmark 1.1 - Seconds, fewer is better; N = 3
    Detector: Contextual Anomaly Detector OSE: 137.72 (SE +/- 0.88)
    Detector: Bayesian Changepoint: 164.50 (SE +/- 0.63)
    Detector: Earthgecko Skyline: 528.35 (SE +/- 2.87)
    Detector: Windowed Gaussian: 33.81 (SE +/- 0.37)
    Detector: Relative Entropy: 66.19 (SE +/- 0.23)
    Detector: KNN CAD: 704.21 (SE +/- 5.84)
OpenVINO 2023.2.dev - Device: CPU; per model: latency in ms (fewer is better) and throughput in FPS (more is better); N = 3 unless noted (g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl)
    Model: Age Gender Recognition Retail 0013 FP16-INT8: 1.49 ms (SE +/- 0.01, MIN 0.93 / MAX 15.96); 1342.06 FPS (SE +/- 5.84)
    Model: Handwritten English Recognition FP16-INT8: 123.65 ms (SE +/- 0.90, MIN 114.82 / MAX 175.12); 16.17 FPS (SE +/- 0.12)
    Model: Age Gender Recognition Retail 0013 FP16: 3.80 ms (SE +/- 0.01, MIN 3.04 / MAX 19.93); 525.03 FPS (SE +/- 0.95)
    Model: Handwritten English Recognition FP16: 155.38 ms (SE +/- 0.31, MIN 131.53 / MAX 168.03); 12.87 FPS (SE +/- 0.03)
    Model: Weld Porosity Detection FP16-INT8: 37.49 ms (SE +/- 0.00, MIN 36.71 / MAX 55.03); 53.33 FPS (SE +/- 0.00)
    Model: Machine Translation EN To DE FP16: 804.02 ms (SE +/- 10.67, MIN 772.04 / MAX 830.67); 2.49 FPS (SE +/- 0.03)
    Model: Road Segmentation ADAS FP16-INT8: 119.36 ms (SE +/- 1.02, MIN 114.3 / MAX 132.68); 16.76 FPS (SE +/- 0.14)
    Model: Face Detection Retail FP16-INT8: 18.67 ms (SE +/- 0.01, MIN 10.24 / MAX 40.58); 107.05 FPS (SE +/- 0.03)
    Model: Weld Porosity Detection FP16: 117.47 ms (SE +/- 0.02, MIN 95.54 / MAX 131.93); 17.02 FPS (SE +/- 0.00)
    Model: Vehicle Detection FP16-INT8: 59.24 ms (SE +/- 0.05, MIN 56.57 / MAX 73.79); 33.75 FPS (SE +/- 0.03)
    Model: Road Segmentation ADAS FP16: 236.97 ms (SE +/- 0.30, MIN 121.52 / MAX 369.29); 8.44 FPS (SE +/- 0.01)
    Model: Face Detection Retail FP16: 38.71 ms (SE +/- 0.42, MIN 19.5 / MAX 55.07); 51.65 FPS (SE +/- 0.56)
    Model: Face Detection FP16-INT8: 3927.81 ms (SE +/- 41.90, N = 15, MIN 3722.59 / MAX 4255.2); 0.51 FPS (SE +/- 0.01, N = 15)
    Model: Vehicle Detection FP16: 127.43 ms (SE +/- 0.05, MIN 122.64 / MAX 143.55); 15.69 FPS (SE +/- 0.01)
    Model: Person Detection FP32: 932.30 ms (SE +/- 9.39, MIN 905.44 / MAX 968.53); 2.15 FPS (SE +/- 0.02)
    Model: Person Detection FP16: 957.97 ms (SE +/- 1.01, MIN 908.75 / MAX 969.76); 2.09 FPS (SE +/- 0.00)
    Model: Face Detection FP16: 11527.95 ms (SE +/- 1.87, MIN 11510.06 / MAX 11568.08); 0.17 FPS (SE +/- 0.00)
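Each OpenVINO row pairs mean latency with throughput for a model compiled against the CPU plugin. A minimal sketch of a single inference with the OpenVINO Python runtime, where the IR file name is a placeholder assumption:

    import numpy as np
    from openvino.runtime import Core

    core = Core()
    model = core.read_model("face-detection-retail-0004.xml")  # assumed IR file
    compiled = core.compile_model(model, "CPU")                # CPU plugin, as above
    request = compiled.create_infer_request()

    inp = compiled.input(0)
    dummy = np.zeros(list(inp.shape), dtype=np.float32)  # dummy input frame
    request.infer({inp: dummy})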
PlaidML - FPS, more is better; FP16: No - Mode: Inference - Device: CPU
    Network: ResNet 50: 2.42 (SE +/- 0.00, N = 3)
    Network: VGG16: 1.62 (SE +/- 0.01, N = 3)
TNN 0.3 - Target: CPU; ms, fewer is better; N = 3 (g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl)
    Model: SqueezeNet v1.1: 327.08 (SE +/- 0.04, MIN 326.93 / MAX 327.82)
    Model: SqueezeNet v2: 72.92 (SE +/- 0.01, MIN 72.83 / MAX 73.21)
    Model: MobileNet v2: 365.27 (SE +/- 0.06, MIN 364.24 / MAX 368.82)
    Model: DenseNet: 5253.38 (SE +/- 3.59, MIN 5205.46 / MAX 5295.82)
NCNN 20230517 - ms, fewer is better; N = 3 (g++ options: -O3 -rdynamic -lgomp -lpthread -pthread)
    Target: Vulkan GPU
        FastestDet: 10.09 (SE +/- 0.00, MIN 10.04 / MAX 11.61)
        vision_transformer: 892.53 (SE +/- 0.30, MIN 890.76 / MAX 915.85)
        regnety_400m: 23.04 (SE +/- 0.50, MIN 22.46 / MAX 45.18)
        squeezenet_ssd: 37.98 (SE +/- 0.04, MIN 37.77 / MAX 38.58)
        yolov4-tiny: 91.87 (SE +/- 0.06, MIN 91.54 / MAX 92.98)
        resnet50: 122.76 (SE +/- 0.08, MIN 122.41 / MAX 134.28)
        alexnet: 37.16 (SE +/- 0.02, MIN 37.05 / MAX 37.78)
        resnet18: 45.11 (SE +/- 0.02, MIN 44.98 / MAX 45.77)
        vgg16: 265.05 (SE +/- 0.38, MIN 263.9 / MAX 275.49)
        blazeface: 2.5 (SE +/- 0.07, MIN 2.33 / MAX 48.58)
        efficientnet-b0: 28.55 (SE +/- 0.03, MIN 28.42 / MAX 37.71)
        shufflenet-v2: 8.40 (SE +/- 0.00, MIN 8.36 / MAX 9.01)
        mobilenet-v3 (Target: Vulkan GPU-v3-v3): 14.79 (SE +/- 0.02, MIN 14.67 / MAX 26.04)
        mobilenet-v2 (Target: Vulkan GPU-v2-v2): 20.35 (SE +/- 0.02, MIN 20.21 / MAX 22.68)
        mobilenet: 74.28 (SE +/- 0.06, MIN 74.07 / MAX 81.77)
    Target: CPU
        FastestDet: 10.07 (SE +/- 0.02, MIN 10 / MAX 10.26)
        vision_transformer: 893.07 (SE +/- 0.47, MIN 890.73 / MAX 966.79)
        regnety_400m: 22.51 (SE +/- 0.01, MIN 22.44 / MAX 23.67)
        squeezenet_ssd: 37.95 (SE +/- 0.02, MIN 37.76 / MAX 45.13)
        yolov4-tiny: 92.22 (SE +/- 0.03, MIN 91.94 / MAX 92.84)
        resnet50: 122.56 (SE +/- 0.06, MIN 122.27 / MAX 129.92)
        alexnet: 37.21 (SE +/- 0.00, MIN 37.03 / MAX 48.06)
        resnet18: 45.14 (SE +/- 0.03, MIN 44.95 / MAX 47.17)
        vgg16: 264.68 (SE +/- 0.19, MIN 263.84 / MAX 276.96)
        googlenet: 53.87 (SE +/- 0.01, MIN 53.75 / MAX 54.51)
        blazeface: 2.4 (SE +/- 0.00, MIN 2.33 / MAX 2.48)
        efficientnet-b0: 28.50 (SE +/- 0.02, MIN 28.38 / MAX 28.88)
        mnasnet: 17.06 (SE +/- 0.02, MIN 16.92 / MAX 28.58)
        shufflenet-v2: 8.38 (SE +/- 0.01, MIN 8.33 / MAX 10.87)
        mobilenet-v3 (Target: CPU-v3-v3): 14.78 (SE +/- 0.02, MIN 14.69 / MAX 16.92)
        mobilenet-v2 (Target: CPU-v2-v2): 20.36 (SE +/- 0.03, MIN 20.2 / MAX 30.3)
        mobilenet: 74.20 (SE +/- 0.01, MIN 73.99 / MAX 86.15)
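The Vulkan GPU and CPU timings above are nearly identical, which suggests the Vulkan compute path offers no real speedup on this UHD 610 part. In NCNN the target is a single option on the network; a minimal sketch with the Python binding, using hypothetical param/bin file names:

    import ncnn

    net = ncnn.Net()
    # True selects the "Vulkan GPU" target benchmarked above; False the plain "CPU" one
    net.opt.use_vulkan_compute = True
    net.load_param("mobilenet.param")  # assumed model files
    net.load_model("mobilenet.bin")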
Mobile Neural Network 2.1 - ms, fewer is better; N = 3 (g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl)
    Model: inception-v3: 154.33 (SE +/- 0.24, MIN 153.49 / MAX 202.91)
    Model: mobilenet-v1-1.0: 20.03 (SE +/- 0.02, MIN 19.9 / MAX 63.44)
    Model: MobileNetV2_224: 12.96 (SE +/- 0.03, MIN 12.84 / MAX 26.57)
    Model: SqueezeNetV1.0: 22.00 (SE +/- 0.02, MIN 21.89 / MAX 23.87)
    Model: resnet-v2-50: 115.58 (SE +/- 0.34, MIN 114.94 / MAX 158.65)
    Model: squeezenetv1.1: 11.87 (SE +/- 0.01, MIN 11.8 / MAX 13.78)
    Model: mobilenetV3: 3.743 (SE +/- 0.017, MIN 3.69 / MAX 6.53)
    Model: nasnet: 29.08 (SE +/- 0.04, MIN 28.92 / MAX 42.65)
Caffe 2020-02-13 - Acceleration: CPU; Milli-Seconds, fewer is better; N = 3 (g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas)
    Model: GoogleNet - Iterations: 1000: 2089050 (SE +/- 548.57)
    Model: GoogleNet - Iterations: 200: 417906 (SE +/- 156.10)
    Model: GoogleNet - Iterations: 100: 208988 (SE +/- 123.85)
    Model: AlexNet - Iterations: 1000: 967170 (SE +/- 207.18)
    Model: AlexNet - Iterations: 200: 193587 (SE +/- 86.43)
    Model: AlexNet - Iterations: 100: 96808 (SE +/- 154.03)
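The Caffe figures are total network timing over a fixed iteration count, which Caffe itself exposes through its time subcommand. A minimal sketch driving it from Python, with an assumed deploy prototxt path:

    import subprocess

    # 'caffe time' profiles a network for a fixed number of iterations;
    # the prototxt path here is an assumption.
    subprocess.run(["caffe", "time",
                    "--model=deploy_googlenet.prototxt",
                    "--iterations=100"], check=True)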
TensorFlow Lite 2022-05-18 - Microseconds, fewer is better; N = 3
    Model: Inception ResNet V2: 386539 (SE +/- 125.07)
    Model: Mobilenet Quant: 560906 (SE +/- 20.11)
    Model: Mobilenet Float: 21944.0 (SE +/- 8.27)
    Model: NASNet Mobile: 40066.8 (SE +/- 22.46)
    Model: Inception V4: 415157 (SE +/- 136.01)
    Model: SqueezeNet: 30349.5 (SE +/- 33.23)
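A minimal sketch of a single timed TensorFlow Lite CPU inference, roughly what each microsecond figure above averages over many invocations; the model path is an assumption:

    import time
    import numpy as np
    import tensorflow as tf

    interp = tf.lite.Interpreter(model_path="mobilenet_v1_1.0_224.tflite")  # assumed
    interp.allocate_tensors()
    inp = interp.get_input_details()[0]
    interp.set_tensor(inp["index"], np.zeros(inp["shape"], dtype=inp["dtype"]))

    t0 = time.perf_counter()
    interp.invoke()
    print(f"{(time.perf_counter() - t0) * 1e6:.0f} microseconds")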
RNNoise 2020-06-28 - Seconds, fewer is better: 26.35 (SE +/- 0.08, N = 3) (gcc options: -O2 -pedantic -fvisibility=hidden)
R Benchmark - Seconds, fewer is better: 0.3482 (SE +/- 0.0003, N = 3) (R scripting front-end version 3.6.3 (2020-02-29))
Numpy Benchmark - Score, more is better: 294.94 (SE +/- 0.23, N = 3)
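The Numpy Benchmark score aggregates many small array kernels. A representative timed kernel, as an illustrative matmul rather than one of the suite's exact tests:

    import time
    import numpy as np

    a = np.random.rand(1024, 1024)
    t0 = time.perf_counter()
    np.dot(a, a)  # dense matrix multiply, a typical NumPy hot path
    print(f"matmul: {time.perf_counter() - t0:.3f} s")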
oneDNN 3.3 (ms, Fewer Is Better; all results: 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl):
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU - Intel UHD 610 CML GT1: 20613.0 (SE +/- 7.60, N = 3, MIN: 20591.8)
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU - Intel UHD 610 CML GT1: 41031.8 (SE +/- 13.51, N = 3, MIN: 40996.4)
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU - Intel UHD 610 CML GT1: 20604.1 (SE +/- 8.89, N = 3, MIN: 20576.6)
Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU - Intel UHD 610 CML GT1: 41046.8 (SE +/- 13.11, N = 3, MIN: 41006.5)
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU - Intel UHD 610 CML GT1: 20608.4 (SE +/- 9.98, N = 3, MIN: 20581.9)
Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU - Intel UHD 610 CML GT1: 41032.5 (SE +/- 5.07, N = 3, MIN: 41006.6)
Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU - Intel UHD 610 CML GT1: 26.17 (SE +/- 0.02, N = 3, MIN: 25.84)
Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU - Intel UHD 610 CML GT1: 19.16 (SE +/- 0.01, N = 3, MIN: 19.06)
Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU - Intel UHD 610 CML GT1: 49.27 (SE +/- 0.13, N = 3, MIN: 47.33)
Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU - Intel UHD 610 CML GT1: 90.10 (SE +/- 0.11, N = 3, MIN: 87.57)
Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU - Intel UHD 610 CML GT1: 76.01 (SE +/- 0.42, N = 3, MIN: 73.69)
Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU - Intel UHD 610 CML GT1: 58.72 (SE +/- 0.09, N = 3, MIN: 58.45)
Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU - Intel UHD 610 CML GT1: 5.82157 (SE +/- 0.00596, N = 3, MIN: 5.61)
Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU - Intel UHD 610 CML GT1: 11.93 (SE +/- 0.03, N = 3, MIN: 11.84)
Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU - Intel UHD 610 CML GT1: 37.72 (SE +/- 0.19, N = 3, MIN: 36.95)
Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU - Intel UHD 610 CML GT1: 36.82 (SE +/- 0.06, N = 3, MIN: 36.41)
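The oneDNN data-type triplets name the source/weights/destination precisions: u8s8f32 takes uint8 activations and int8 weights, accumulates in 32-bit integers, and emits float32, while bf16bf16bf16 keeps every tensor in bfloat16. The RNN bf16 times above land within noise of f32 (20613.0 vs 20608.4 ms inference), which is unsurprising on this Comet Lake Pentium: it has no native bfloat16 instructions, so the bf16 path cannot beat float32. A minimal numpy sketch of the u8s8f32 dtype contract (an illustration only, not oneDNN's kernel; the scale is a made-up quantization parameter):

    import numpy as np

    act = np.random.randint(0, 256, size=(4, 8), dtype=np.uint8)     # u8 activations
    wts = np.random.randint(-128, 128, size=(8, 3), dtype=np.int8)   # s8 weights
    acc = act.astype(np.int32) @ wts.astype(np.int32)                # s32 accumulation
    out = acc.astype(np.float32) * 0.02                              # f32 output, hypothetical scale
    print(out.dtype, out.shape)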
LeelaChessZero 0.28 - Backend: BLAS - Nodes Per Second, More Is Better - Intel UHD 610 CML GT1: 141 (SE +/- 1.55, N = 4) - 1. (CXX) g++ options: -flto -pthread
Scikit-Learn 1.2.2 - Benchmark: Hist Gradient Boosting Adult - Seconds, Fewer Is Better - Intel UHD 610 CML GT1: 108.15 (SE +/- 6.84, N = 15) - 1. (F9X) gfortran options: -O0
ONNX Runtime 1.14 (all results: 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread):
Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 733.30 (SE +/- 3.22, N = 3)
Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 1022.70 (SE +/- 38.86, N = 12)
Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel - Inferences Per Second, More Is Better - Intel UHD 610 CML GT1: 0.993387 (SE +/- 0.037356, N = 12)
Model: super-resolution-10 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 107.13 (SE +/- 0.06, N = 3)
Model: super-resolution-10 - Device: CPU - Executor: Parallel - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 120.56 (SE +/- 0.20, N = 3)
Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 93.34 (SE +/- 0.09, N = 3)
Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 110.71 (SE +/- 1.19, N = 3)
Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 430.82 (SE +/- 30.78, N = 12)
Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard - Inferences Per Second, More Is Better - Intel UHD 610 CML GT1: 2.39975 (SE +/- 0.10001, N = 12)
Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 448.15 (SE +/- 0.26, N = 3)
Model: fcn-resnet101-11 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 9192.22 (SE +/- 2.41, N = 3)
Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 18223.2 (SE +/- 45.50, N = 3)
Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 28.35 (SE +/- 1.75, N = 15)
Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard - Inferences Per Second, More Is Better - Intel UHD 610 CML GT1: 36.58 (SE +/- 1.51, N = 15)
Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 31.35 (SE +/- 0.56, N = 12)
Model: bertsquad-12 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 835.68 (SE +/- 18.17, N = 12)
Model: bertsquad-12 - Device: CPU - Executor: Standard - Inferences Per Second, More Is Better - Intel UHD 610 CML GT1: 1.20176 (SE +/- 0.02147, N = 12)
Model: bertsquad-12 - Device: CPU - Executor: Parallel - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 953.42 (SE +/- 32.34, N = 15)
Model: bertsquad-12 - Device: CPU - Executor: Parallel - Inferences Per Second, More Is Better - Intel UHD 610 CML GT1: 1.062983 (SE +/- 0.029803, N = 15)
Model: GPT-2 - Device: CPU - Executor: Standard - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 33.32 (SE +/- 0.21, N = 3)
Model: GPT-2 - Device: CPU - Executor: Parallel - Inference Time Cost (ms), Fewer Is Better - Intel UHD 610 CML GT1: 34.77 (SE +/- 0.04, N = 3)
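Two patterns in the ONNX Runtime numbers are worth calling out. First, the Parallel executor loses to Standard on every model (e.g. 1022.70 vs 733.30 ms for Faster R-CNN), which is plausible on a 2-core/4-thread CPU where inter-op parallelism mostly adds scheduling overhead. Second, the throughput rows are close to, but not exactly, the reciprocal of the latency rows (1000 / 1022.70 ≈ 0.978 vs the reported 0.993387 inferences/s), since a mean of reciprocals is not the reciprocal of a mean. Assuming the Standard/Parallel split maps to ONNX Runtime's sequential and parallel execution modes, the toggle looks like this in the Python API (model path and dummy input are illustrative assumptions):

    import numpy as np
    import onnxruntime as ort

    opts = ort.SessionOptions()
    # ORT_SEQUENTIAL would correspond to "Standard", ORT_PARALLEL to "Parallel".
    opts.execution_mode = ort.ExecutionMode.ORT_PARALLEL
    sess = ort.InferenceSession("resnet50-v1-12-int8.onnx",  # hypothetical path
                                sess_options=opts,
                                providers=["CPUExecutionProvider"])
    inp = sess.get_inputs()[0]
    # substitute 1 for any symbolic/dynamic dimensions in the declared shape
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    outputs = sess.run(None, {inp.name: np.zeros(shape, dtype=np.float32)})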
NCNN 20230517 (ms, Fewer Is Better; all results: 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread):
Target: Vulkan GPU - Model: googlenet - Intel UHD 610 CML GT1: 56.04 (SE +/- 2.12, N = 3, MIN: 53.72 / MAX: 1402.43)
Target: Vulkan GPU - Model: mnasnet - Intel UHD 610 CML GT1: 27.55 (SE +/- 10.53, N = 3, MIN: 16.93 / MAX: 1277.45)
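The NCNN MAX values (1402.43 and 1277.45 ms against means of 56.04 and 27.55 ms) point to a large first-iteration spike, as is typical when Vulkan pipelines are compiled on the first inference. A sketch of driving the Vulkan GPU target, assuming the pip-installable ncnn Python binding and hypothetical model file and blob names:

    import numpy as np
    import ncnn  # Python binding for Tencent's ncnn

    net = ncnn.Net()
    net.opt.use_vulkan_compute = True      # select the Vulkan GPU target
    net.load_param("googlenet.param")      # hypothetical model files
    net.load_model("googlenet.bin")
    ex = net.create_extractor()
    ex.input("data", ncnn.Mat(np.zeros((3, 224, 224), dtype=np.float32)))
    ret, out = ex.extract("prob")          # blob names are assumptions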
Phoronix Test Suite v10.8.5