h510-g6405-1: Intel Pentium Gold G6405 testing with an ASRock H510M-HDV/M.2 SE (P1.60 BIOS) and Intel UHD 610 CML GT1 3GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2311104-HERT-H510G6457&grr.
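For readers who want to rerun this comparison locally, the Phoronix Test Suite can benchmark against a public OpenBenchmarking.org result ID. A minimal sketch, assuming the phoronix-test-suite CLI is installed and on PATH; the result ID is taken from the URL above:

import subprocess

# Run the same test selection and compare against this public result.
# Interactive prompts from phoronix-test-suite may still appear.
subprocess.run(
    ["phoronix-test-suite", "benchmark", "2311104-HERT-H510G6457"],
    check=True,
)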
h510-g6405-1 - System Information

Intel UHD 610 CML GT1:
Processor: Intel Pentium Gold G6405 @ 4.10GHz (2 Cores / 4 Threads)
Motherboard: ASRock H510M-HDV/M.2 SE (P1.60 BIOS)
Chipset: Intel Comet Lake PCH
Memory: 3584MB
Disk: 1000GB Western Digital WDS100T2B0A
Graphics: Intel UHD 610 CML GT1 3GB (1050MHz)
Audio: Realtek ALC897
Monitor: G185BGEL01
Network: Realtek RTL8111/8168/8411
OS: Ubuntu 20.04
Kernel: 5.15.0-88-generic (x86_64)
Desktop: GNOME Shell 3.36.9
Display Server: X Server 1.20.13
OpenGL: 4.6 Mesa 21.2.6
Vulkan: 1.2.182
Compiler: GCC 9.4.0
File-System: ext4
Screen Resolution: 1368x768

OpenBenchmarking.org notes:
- Transparent Huge Pages: madvise
- GCC configure options: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-9QDOt0/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: intel_pstate powersave (EPP: balance_performance)
- CPU Microcode: 0xf8
- Thermald 1.9.1
- Python 3.8.10
- Security: gather_data_sampling: Not affected + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Mitigation of Microcode + tsx_async_abort: Not affected
h510-g6405-1 - Result Overview (test identifiers, followed by the Intel UHD 610 CML GT1 results in the same order)

Tests: whisper-cpp: ggml-medium.en - 2016 State of the Union whisper-cpp: ggml-small.en - 2016 State of the Union scikit-learn: Sparse Rand Projections / 100 Iterations whisper-cpp: ggml-base.en - 2016 State of the Union caffe: GoogleNet - CPU - 1000 scikit-learn: GLM scikit-learn: Lasso scikit-learn: SAGA plaidml: No - Inference - VGG16 - CPU scikit-learn: Kernel PCA Solvers / Time vs. N Components scikit-learn: TSNE MNIST Dataset caffe: AlexNet - CPU - 1000 plaidml: No - Inference - ResNet 50 - CPU mnn: inception-v3 mnn: mobilenet-v1-1.0 mnn: MobileNetV2_224 mnn: SqueezeNetV1.0 mnn: resnet-v2-50 mnn: squeezenetv1.1 mnn: mobilenetV3 mnn: nasnet scikit-learn: Covertype Dataset Benchmark numenta-nab: KNN CAD scikit-learn: Plot Lasso Path scikit-learn: Kernel PCA Solvers / Time vs. N Samples scikit-learn: Hist Gradient Boosting Threading scikit-learn: LocalOutlierFactor scikit-learn: Hist Gradient Boosting Adult ncnn: Vulkan GPU - FastestDet ncnn: Vulkan GPU - vision_transformer ncnn: Vulkan GPU - regnety_400m ncnn: Vulkan GPU - squeezenet_ssd ncnn: Vulkan GPU - yolov4-tiny ncnn: Vulkan GPU - resnet50 ncnn: Vulkan GPU - alexnet ncnn: Vulkan GPU - resnet18 ncnn: Vulkan GPU - vgg16 ncnn: Vulkan GPU - googlenet ncnn: Vulkan GPU - blazeface ncnn: Vulkan GPU - efficientnet-b0 ncnn: Vulkan GPU - mnasnet ncnn: Vulkan GPU - shufflenet-v2 ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU - mobilenet ncnn: CPU - FastestDet ncnn: CPU - vision_transformer ncnn: CPU - regnety_400m ncnn: CPU - squeezenet_ssd ncnn: CPU - yolov4-tiny ncnn: CPU - resnet50 ncnn: CPU - alexnet ncnn: CPU - resnet18 ncnn: CPU - vgg16 ncnn: CPU - googlenet ncnn: CPU - blazeface ncnn: CPU - efficientnet-b0 ncnn: CPU - mnasnet ncnn: CPU - shufflenet-v2 ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: CPU-v2-v2 - mobilenet-v2 ncnn: CPU - mobilenet numenta-nab: Earthgecko Skyline lczero: BLAS scikit-learn: Plot Singular Value Decomposition onnx: bertsquad-12 - CPU - Parallel onnx: bertsquad-12 - CPU - Parallel onnx: CaffeNet 12-int8 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Standard caffe: GoogleNet - CPU - 200 scikit-learn: Plot Polynomial Kernel Approximation onnx: bertsquad-12 - CPU - Standard onnx: bertsquad-12 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel tnn: CPU - DenseNet onnx: ArcFace ResNet-100 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Standard scikit-learn: Plot Hierarchical onnx: CaffeNet 12-int8 - CPU - Parallel onnx: CaffeNet 12-int8 - CPU - Parallel scikit-learn: Plot Neighbors openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU scikit-learn: Hist Gradient Boosting Higgs Boson scikit-learn: Plot OMP vs. LARS scikit-learn: SGD Regression onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Training - f32 - CPU scikit-learn: Feature Expansions scikit-learn: Hist Gradient Boosting caffe: GoogleNet - CPU - 100 numpy: scikit-learn: Sample Without Replacement caffe: AlexNet - CPU - 200 onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU numenta-nab: Bayesian Changepoint scikit-learn: Sparsify numenta-nab: Contextual Anomaly Detector OSE scikit-learn: Plot Ward mlpack: scikit_ica scikit-learn: MNIST Dataset scikit-learn: Plot Incremental PCA scikit-learn: Text Vectorizers onnx: fcn-resnet101-11 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Standard caffe: AlexNet - CPU - 100 onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Parallel onnx: ArcFace ResNet-100 - CPU - Parallel onnx: GPT-2 - CPU - Parallel onnx: GPT-2 - CPU - Parallel onnx: GPT-2 - CPU - Standard onnx: GPT-2 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: super-resolution-10 - CPU - Parallel onnx: super-resolution-10 - CPU - Parallel onnx: super-resolution-10 - CPU - Standard onnx: super-resolution-10 - CPU - Standard openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU scikit-learn: 20 Newsgroups / Logistic Regression numenta-nab: Relative Entropy openvino: Machine Translation EN To DE FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU scikit-learn: Tree tensorflow-lite: Mobilenet Quant openvino: Person Detection FP32 - CPU openvino: Person Detection FP32 - CPU tensorflow-lite: Inception V4 tensorflow-lite: Inception ResNet V2 openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU tensorflow-lite: NASNet Mobile openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU tensorflow-lite: SqueezeNet tensorflow-lite: Mobilenet Float openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU scikit-learn: Hist Gradient Boosting Categorical Only rbenchmark: numenta-nab: Windowed Gaussian mlpack: scikit_svm rnnoise: tnn: CPU - MobileNet v2 tnn: CPU - SqueezeNet v1.1 onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU tnn: CPU - SqueezeNet v2 onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: IP Shapes 1D - bf16bf16bf16 - CPU

Intel UHD 610 CML GT1 results (in test order): 39940.555 11358.504 3027.193 3116.24208 2089050 1113.273 1060.627 1055.940 1.62 324.216 796.405 967170 2.42 154.326 20.034 12.963 22.004 115.582 11.868 3.743 29.075 581.596 704.214 465.908 456.279 449.435 448.570 108.147 10.09 892.53 23.04 37.98 91.87 122.76 37.16 45.11 265.05 56.04 2.5 28.55 27.55 8.40 14.79 20.35 74.28 10.07 893.07 22.51 37.95 92.22 122.56 37.21 45.14 264.68 53.87 2.4 28.50 17.06 8.38 14.78 20.36 74.20 528.345 141 352.774 953.418 1.062983 28.3464 36.5771 417906 295.976 835.681 1.20176 1022.703 0.993387 5253.381 430.820 2.39975 273.303 31.3494 31.9931 262.915 3927.81 0.51 185.743 249.520 219.599 41046.8 41031.8 41032.5 212.153 194.726 208988 294.94 147.242 193587 20613.0 20604.1 20608.4 164.503 103.999 137.717 93.201 114.79 86.743 82.858 79.734 18223.2 0.0548757 9192.22 0.108787 96808 733.298 1.36375 448.150 2.23139 34.7677 28.7583 33.3211 30.0089 110.709 9.03460 93.3403 10.7133 120.557 8.29467 107.129 9.33438 11527.95 0.17 60.843 66.187 804.02 2.49 47.100 560906 932.30 2.15 415157 386539 957.97 2.09 119.36 16.76 236.97 8.44 155.38 12.87 40066.8 123.65 16.17 59.24 33.75 30349.5 21944.0 127.43 15.69 117.47 17.02 18.67 107.05 37.49 53.33 38.71 51.65 1.49 1342.06 3.80 525.03 27.024 0.3482 33.807 27.15 26.349 365.268 327.082 76.0143 19.1615 36.8187 11.9260 37.7239 5.82157 58.7189 49.2715 72.920 90.0958 26.1720
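The overview pairs each test identifier with the value at the same position in the results list. A minimal sketch of that pairing; the two lists here are abridged stand-ins for the full lists above:

tests = [
    "whisper-cpp: ggml-medium.en - 2016 State of the Union",
    "whisper-cpp: ggml-small.en - 2016 State of the Union",
    "scikit-learn: Sparse Rand Projections / 100 Iterations",
]
values = [39940.555, 11358.504, 3027.193]

# Identifiers and values correspond by position.
for name, value in zip(tests, values):
    print(f"{name}: {value}")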
Whisper.cpp 1.4 - Model: ggml-medium.en - Input: 2016 State of the Union
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 39940.56 (SE +/- 132.56, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread
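Each result in this report is the mean over N runs, and "SE +/-" is the standard error of that mean. A minimal sketch of how the two numbers relate, using hypothetical per-run times (the export publishes only the mean and SE, not the individual runs):

import math
import statistics

# Hypothetical per-run times in seconds; not values from the export.
runs = [39750.0, 40000.0, 40071.7]

mean = statistics.mean(runs)                         # the reported result
se = statistics.stdev(runs) / math.sqrt(len(runs))   # the reported "SE +/-"
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(runs)})")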
Whisper.cpp 1.4 - Model: ggml-small.en - Input: 2016 State of the Union
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 11358.50 (SE +/- 62.26, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread

Scikit-Learn 1.2.2 - Benchmark: Sparse Random Projections / 100 Iterations
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 3027.19 (SE +/- 2.34, N = 3)
1. (F9X) gfortran options: -O0

Whisper.cpp 1.4 - Model: ggml-base.en - Input: 2016 State of the Union
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 3116.24 (SE +/- 0.81, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread

Caffe 2020-02-13 - Model: GoogleNet - Acceleration: CPU - Iterations: 1000
OpenBenchmarking.org Milli-Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 2089050 (SE +/- 548.57, N = 3)
1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Scikit-Learn 1.2.2 - Benchmark: GLM
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 1113.27 (SE +/- 4.22, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn 1.2.2 - Benchmark: Lasso
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 1060.63 (SE +/- 0.24, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn 1.2.2 - Benchmark: SAGA
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 1055.94 (SE +/- 1.27, N = 3)
1. (F9X) gfortran options: -O0

PlaidML - FP16: No - Mode: Inference - Network: VGG16 - Device: CPU
OpenBenchmarking.org FPS, More Is Better
Intel UHD 610 CML GT1: 1.62 (SE +/- 0.01, N = 3)

Scikit-Learn 1.2.2 - Benchmark: Kernel PCA Solvers / Time vs. N Components
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 324.22 (SE +/- 4.78, N = 9)
1. (F9X) gfortran options: -O0

Scikit-Learn 1.2.2 - Benchmark: TSNE MNIST Dataset
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 796.41 (SE +/- 0.17, N = 3)
1. (F9X) gfortran options: -O0

Caffe 2020-02-13 - Model: AlexNet - Acceleration: CPU - Iterations: 1000
OpenBenchmarking.org Milli-Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 967170 (SE +/- 207.18, N = 3)
1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

PlaidML - FP16: No - Mode: Inference - Network: ResNet 50 - Device: CPU
OpenBenchmarking.org FPS, More Is Better
Intel UHD 610 CML GT1: 2.42 (SE +/- 0.00, N = 3)
Mobile Neural Network 2.1 - Model: inception-v3
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 154.33 (SE +/- 0.24, N = 3) | MIN: 153.49 / MAX: 202.91
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 2.1 - Model: mobilenet-v1-1.0
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 20.03 (SE +/- 0.02, N = 3) | MIN: 19.9 / MAX: 63.44
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 2.1 - Model: MobileNetV2_224
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 12.96 (SE +/- 0.03, N = 3) | MIN: 12.84 / MAX: 26.57
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 2.1 - Model: SqueezeNetV1.0
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 22.00 (SE +/- 0.02, N = 3) | MIN: 21.89 / MAX: 23.87
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 2.1 - Model: resnet-v2-50
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 115.58 (SE +/- 0.34, N = 3) | MIN: 114.94 / MAX: 158.65
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 2.1 - Model: squeezenetv1.1
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 11.87 (SE +/- 0.01, N = 3) | MIN: 11.8 / MAX: 13.78
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 2.1 - Model: mobilenetV3
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 3.743 (SE +/- 0.017, N = 3) | MIN: 3.69 / MAX: 6.53
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 2.1 - Model: nasnet
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 29.08 (SE +/- 0.04, N = 3) | MIN: 28.92 / MAX: 42.65
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Scikit-Learn 1.2.2 - Benchmark: Covertype Dataset Benchmark
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 581.60 (SE +/- 0.92, N = 3)
1. (F9X) gfortran options: -O0

Numenta Anomaly Benchmark 1.1 - Detector: KNN CAD
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 704.21 (SE +/- 5.84, N = 3)

Scikit-Learn 1.2.2 - Benchmark: Plot Lasso Path
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 465.91 (SE +/- 0.95, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn 1.2.2 - Benchmark: Kernel PCA Solvers / Time vs. N Samples
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 456.28 (SE +/- 0.35, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn 1.2.2 - Benchmark: Hist Gradient Boosting Threading
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 449.44 (SE +/- 1.33, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn 1.2.2 - Benchmark: LocalOutlierFactor
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 448.57 (SE +/- 2.17, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn 1.2.2 - Benchmark: Hist Gradient Boosting Adult
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 108.15 (SE +/- 6.84, N = 15)
1. (F9X) gfortran options: -O0
NCNN 20230517 - Target: Vulkan GPU - Model: FastestDet
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 10.09 (SE +/- 0.00, N = 3) | MIN: 10.04 / MAX: 11.61
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU - Model: vision_transformer
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 892.53 (SE +/- 0.30, N = 3) | MIN: 890.76 / MAX: 915.85
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU - Model: regnety_400m
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 23.04 (SE +/- 0.50, N = 3) | MIN: 22.46 / MAX: 45.18
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU - Model: squeezenet_ssd
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 37.98 (SE +/- 0.04, N = 3) | MIN: 37.77 / MAX: 38.58
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU - Model: yolov4-tiny
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 91.87 (SE +/- 0.06, N = 3) | MIN: 91.54 / MAX: 92.98
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU - Model: resnet50
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 122.76 (SE +/- 0.08, N = 3) | MIN: 122.41 / MAX: 134.28
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU - Model: alexnet
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 37.16 (SE +/- 0.02, N = 3) | MIN: 37.05 / MAX: 37.78
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU - Model: resnet18
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 45.11 (SE +/- 0.02, N = 3) | MIN: 44.98 / MAX: 45.77
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU - Model: vgg16
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 265.05 (SE +/- 0.38, N = 3) | MIN: 263.9 / MAX: 275.49
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU - Model: googlenet
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 56.04 (SE +/- 2.12, N = 3) | MIN: 53.72 / MAX: 1402.43
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU - Model: blazeface
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 2.5 (SE +/- 0.07, N = 3) | MIN: 2.33 / MAX: 48.58
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU - Model: efficientnet-b0
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 28.55 (SE +/- 0.03, N = 3) | MIN: 28.42 / MAX: 37.71
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU - Model: mnasnet
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 27.55 (SE +/- 10.53, N = 3) | MIN: 16.93 / MAX: 1277.45
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU - Model: shufflenet-v2
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 8.40 (SE +/- 0.00, N = 3) | MIN: 8.36 / MAX: 9.01
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 14.79 (SE +/- 0.02, N = 3) | MIN: 14.67 / MAX: 26.04
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 20.35 (SE +/- 0.02, N = 3) | MIN: 20.21 / MAX: 22.68
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: Vulkan GPU - Model: mobilenet
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 74.28 (SE +/- 0.06, N = 3) | MIN: 74.07 / MAX: 81.77
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20230517 - Target: CPU - Model: FastestDet
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 10.07 (SE +/- 0.02, N = 3) | MIN: 10 / MAX: 10.26
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU - Model: vision_transformer
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 893.07 (SE +/- 0.47, N = 3) | MIN: 890.73 / MAX: 966.79
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU - Model: regnety_400m
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 22.51 (SE +/- 0.01, N = 3) | MIN: 22.44 / MAX: 23.67
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU - Model: squeezenet_ssd
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 37.95 (SE +/- 0.02, N = 3) | MIN: 37.76 / MAX: 45.13
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU - Model: yolov4-tiny
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 92.22 (SE +/- 0.03, N = 3) | MIN: 91.94 / MAX: 92.84
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU - Model: resnet50
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 122.56 (SE +/- 0.06, N = 3) | MIN: 122.27 / MAX: 129.92
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU - Model: alexnet
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 37.21 (SE +/- 0.00, N = 3) | MIN: 37.03 / MAX: 48.06
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU - Model: resnet18
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 45.14 (SE +/- 0.03, N = 3) | MIN: 44.95 / MAX: 47.17
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU - Model: vgg16
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 264.68 (SE +/- 0.19, N = 3) | MIN: 263.84 / MAX: 276.96
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU - Model: googlenet
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 53.87 (SE +/- 0.01, N = 3) | MIN: 53.75 / MAX: 54.51
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU - Model: blazeface
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 2.4 (SE +/- 0.00, N = 3) | MIN: 2.33 / MAX: 2.48
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU - Model: efficientnet-b0
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 28.50 (SE +/- 0.02, N = 3) | MIN: 28.38 / MAX: 28.88
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU - Model: mnasnet
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 17.06 (SE +/- 0.02, N = 3) | MIN: 16.92 / MAX: 28.58
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU - Model: shufflenet-v2
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 8.38 (SE +/- 0.01, N = 3) | MIN: 8.33 / MAX: 10.87
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU-v3-v3 - Model: mobilenet-v3
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 14.78 (SE +/- 0.02, N = 3) | MIN: 14.69 / MAX: 16.92
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU-v2-v2 - Model: mobilenet-v2
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 20.36 (SE +/- 0.03, N = 3) | MIN: 20.2 / MAX: 30.3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN 20230517 - Target: CPU - Model: mobilenet
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 74.20 (SE +/- 0.01, N = 3) | MIN: 73.99 / MAX: 86.15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
Numenta Anomaly Benchmark 1.1 - Detector: Earthgecko Skyline
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 528.35 (SE +/- 2.87, N = 3)

LeelaChessZero 0.28 - Backend: BLAS
OpenBenchmarking.org Nodes Per Second, More Is Better
Intel UHD 610 CML GT1: 141 (SE +/- 1.55, N = 4)
1. (CXX) g++ options: -flto -pthread

Scikit-Learn 1.2.2 - Benchmark: Plot Singular Value Decomposition
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 352.77 (SE +/- 0.33, N = 3)
1. (F9X) gfortran options: -O0
ONNX Runtime 1.14 - Model: bertsquad-12 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 953.42 (SE +/- 32.34, N = 15)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: bertsquad-12 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 1.062983 (SE +/- 0.029803, N = 15)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
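Each ONNX Runtime configuration is reported twice: once as inference time cost (ms, lower is better) and once as inferences per second (higher is better). The two are roughly reciprocal; they do not invert exactly because each metric is averaged over its runs independently. A quick cross-check against the bertsquad-12 Parallel pair above:

# Throughput is approximately the reciprocal of latency.
latency_ms = 953.418
print(1000.0 / latency_ms)  # ~1.049 inferences/s vs. the reported 1.062983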
ONNX Runtime 1.14 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 28.35 (SE +/- 1.75, N = 15)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 36.58 (SE +/- 1.51, N = 15)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

Caffe 2020-02-13 - Model: GoogleNet - Acceleration: CPU - Iterations: 200
OpenBenchmarking.org Milli-Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 417906 (SE +/- 156.10, N = 3)
1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Scikit-Learn 1.2.2 - Benchmark: Plot Polynomial Kernel Approximation
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 295.98 (SE +/- 0.33, N = 3)
1. (F9X) gfortran options: -O0

ONNX Runtime 1.14 - Model: bertsquad-12 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 835.68 (SE +/- 18.17, N = 12)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: bertsquad-12 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 1.20176 (SE +/- 0.02147, N = 12)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 1022.70 (SE +/- 38.86, N = 12)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 0.993387 (SE +/- 0.037356, N = 12)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

TNN 0.3 - Target: CPU - Model: DenseNet
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 5253.38 (SE +/- 3.59, N = 3) | MIN: 5205.46 / MAX: 5295.82
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

ONNX Runtime 1.14 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 430.82 (SE +/- 30.78, N = 12)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 2.39975 (SE +/- 0.10001, N = 12)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

Scikit-Learn 1.2.2 - Benchmark: Plot Hierarchical
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 273.30 (SE +/- 0.38, N = 3)
1. (F9X) gfortran options: -O0

ONNX Runtime 1.14 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 31.35 (SE +/- 0.56, N = 12)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 31.99 (SE +/- 0.49, N = 12)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

Scikit-Learn 1.2.2 - Benchmark: Plot Neighbors
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 262.92 (SE +/- 0.89, N = 3)
1. (F9X) gfortran options: -O0

OpenVINO 2023.2.dev - Model: Face Detection FP16-INT8 - Device: CPU
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 3927.81 (SE +/- 41.90, N = 15) | MIN: 3722.59 / MAX: 4255.2
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO 2023.2.dev - Model: Face Detection FP16-INT8 - Device: CPU
OpenBenchmarking.org FPS, More Is Better
Intel UHD 610 CML GT1: 0.51 (SE +/- 0.01, N = 15)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
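The OpenVINO pairs behave the same way, except that FPS can exceed the reciprocal of latency when several inference requests run concurrently. The stream count is not stated in this export; as a purely illustrative assumption, two concurrent requests on this 2-core / 4-thread CPU would reproduce the reported 0.51 FPS from the 3927.81 ms latency above:

# Hypothetical cross-check: with n concurrent inference requests,
# FPS ~= n * 1000 / latency_ms. n = 2 is an assumption, not a value
# taken from the result file.
latency_ms = 3927.81
n_requests = 2
print(n_requests * 1000.0 / latency_ms)  # ~0.51 FPS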
Scikit-Learn 1.2.2 - Benchmark: Hist Gradient Boosting Higgs Boson
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 185.74 (SE +/- 0.49, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn 1.2.2 - Benchmark: Plot OMP vs. LARS
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 249.52 (SE +/- 0.07, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn 1.2.2 - Benchmark: SGD Regression
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 219.60 (SE +/- 0.43, N = 3)
1. (F9X) gfortran options: -O0

oneDNN 3.3 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 41046.8 (SE +/- 13.11, N = 3) | MIN: 41006.5
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 3.3 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 41031.8 (SE +/- 13.51, N = 3) | MIN: 40996.4
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 3.3 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 41032.5 (SE +/- 5.07, N = 3) | MIN: 41006.6
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Scikit-Learn 1.2.2 - Benchmark: Feature Expansions
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 212.15 (SE +/- 0.25, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn 1.2.2 - Benchmark: Hist Gradient Boosting
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 194.73 (SE +/- 0.61, N = 3)
1. (F9X) gfortran options: -O0

Caffe 2020-02-13 - Model: GoogleNet - Acceleration: CPU - Iterations: 100
OpenBenchmarking.org Milli-Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 208988 (SE +/- 123.85, N = 3)
1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
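The Caffe figures are totals for the whole iteration count, so the three GoogleNet results (Iterations: 100, 200, and 1000, reported earlier and above) should show a roughly constant per-iteration cost, which the published values bear out:

# Per-iteration time from the three Caffe GoogleNet totals in this report.
totals_ms = {100: 208988, 200: 417906, 1000: 2089050}
for iters, total in sorted(totals_ms.items()):
    print(iters, round(total / iters, 1))  # ~2090 ms per iteration each time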
Numpy Benchmark
OpenBenchmarking.org Score, More Is Better
Intel UHD 610 CML GT1: 294.94 (SE +/- 0.23, N = 3)

Scikit-Learn 1.2.2 - Benchmark: Sample Without Replacement
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 147.24 (SE +/- 1.32, N = 3)
1. (F9X) gfortran options: -O0

Caffe 2020-02-13 - Model: AlexNet - Acceleration: CPU - Iterations: 200
OpenBenchmarking.org Milli-Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 193587 (SE +/- 86.43, N = 3)
1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

oneDNN 3.3 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 20613.0 (SE +/- 7.60, N = 3) | MIN: 20591.8
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 3.3 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 20604.1 (SE +/- 8.89, N = 3) | MIN: 20576.6
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 3.3 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
OpenBenchmarking.org ms, Fewer Is Better
Intel UHD 610 CML GT1: 20608.4 (SE +/- 9.98, N = 3) | MIN: 20581.9
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Numenta Anomaly Benchmark 1.1 - Detector: Bayesian Changepoint
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 164.50 (SE +/- 0.63, N = 3)

Scikit-Learn 1.2.2 - Benchmark: Sparsify
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 104.00 (SE +/- 0.01, N = 3)
1. (F9X) gfortran options: -O0

Numenta Anomaly Benchmark 1.1 - Detector: Contextual Anomaly Detector OSE
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 137.72 (SE +/- 0.88, N = 3)

Scikit-Learn 1.2.2 - Benchmark: Plot Ward
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 93.20 (SE +/- 0.03, N = 3)
1. (F9X) gfortran options: -O0

Mlpack Benchmark - Benchmark: scikit_ica
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 114.79 (SE +/- 0.48, N = 3)

Scikit-Learn 1.2.2 - Benchmark: MNIST Dataset
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 86.74 (SE +/- 0.15, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn 1.2.2 - Benchmark: Plot Incremental PCA
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 82.86 (SE +/- 0.09, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn 1.2.2 - Benchmark: Text Vectorizers
OpenBenchmarking.org Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 79.73 (SE +/- 0.04, N = 3)
1. (F9X) gfortran options: -O0

ONNX Runtime 1.14 - Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 18223.2 (SE +/- 45.50, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 0.0548757 (SE +/- 0.0001368, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 9192.22 (SE +/- 2.41, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 0.108787 (SE +/- 0.000029, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

Caffe 2020-02-13 - Model: AlexNet - Acceleration: CPU - Iterations: 100
OpenBenchmarking.org Milli-Seconds, Fewer Is Better
Intel UHD 610 CML GT1: 96808 (SE +/- 154.03, N = 3)
1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

ONNX Runtime 1.14 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 733.30 (SE +/- 3.22, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 1.36375 (SE +/- 0.00596, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 448.15 (SE +/- 0.26, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 2.23139 (SE +/- 0.00128, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: GPT-2 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 34.77 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: GPT-2 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 28.76 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: GPT-2 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 33.32 (SE +/- 0.21, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: GPT-2 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 30.01 (SE +/- 0.19, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 110.71 (SE +/- 1.19, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 9.03460 (SE +/- 0.09629, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 93.34 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 10.71 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: super-resolution-10 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 120.56 (SE +/- 0.20, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: super-resolution-10 - Device: CPU - Executor: Parallel
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 8.29467 (SE +/- 0.01346, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: super-resolution-10 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better
Intel UHD 610 CML GT1: 107.13 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread

ONNX Runtime 1.14 - Model: super-resolution-10 - Device: CPU - Executor: Standard
OpenBenchmarking.org Inferences Per Second, More Is Better
Intel UHD 610 CML GT1: 9.33438 (SE +/- 0.00561, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.2.dev Model: Face Detection FP16 - Device: CPU Intel UHD 610 CML GT1 2K 4K 6K 8K 10K SE +/- 1.87, N = 3 11527.95 MIN: 11510.06 / MAX: 11568.08 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.2.dev Model: Face Detection FP16 - Device: CPU Intel UHD 610 CML GT1 0.0383 0.0766 0.1149 0.1532 0.1915 SE +/- 0.00, N = 3 0.17 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Scikit-Learn Benchmark: 20 Newsgroups / Logistic Regression OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: 20 Newsgroups / Logistic Regression Intel UHD 610 CML GT1 14 28 42 56 70 SE +/- 0.06, N = 3 60.84 1. (F9X) gfortran options: -O0
Numenta Anomaly Benchmark Detector: Relative Entropy OpenBenchmarking.org Seconds, Fewer Is Better Numenta Anomaly Benchmark 1.1 Detector: Relative Entropy Intel UHD 610 CML GT1 15 30 45 60 75 SE +/- 0.23, N = 3 66.19
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.2.dev Model: Machine Translation EN To DE FP16 - Device: CPU Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 10.67, N = 3 804.02 MIN: 772.04 / MAX: 830.67 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.2.dev Model: Machine Translation EN To DE FP16 - Device: CPU Intel UHD 610 CML GT1 0.5603 1.1206 1.6809 2.2412 2.8015 SE +/- 0.03, N = 3 2.49 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Scikit-Learn Benchmark: Tree OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Tree Intel UHD 610 CML GT1 11 22 33 44 55 SE +/- 0.17, N = 3 47.10 1. (F9X) gfortran options: -O0
TensorFlow Lite Model: Mobilenet Quant OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2022-05-18 Model: Mobilenet Quant Intel UHD 610 CML GT1 120K 240K 360K 480K 600K SE +/- 20.11, N = 3 560906
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.2.dev Model: Person Detection FP32 - Device: CPU Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 9.39, N = 3 932.30 MIN: 905.44 / MAX: 968.53 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.2.dev Model: Person Detection FP32 - Device: CPU Intel UHD 610 CML GT1 0.4838 0.9676 1.4514 1.9352 2.419 SE +/- 0.02, N = 3 2.15 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
TensorFlow Lite Model: Inception V4 OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2022-05-18 Model: Inception V4 Intel UHD 610 CML GT1 90K 180K 270K 360K 450K SE +/- 136.01, N = 3 415157
TensorFlow Lite Model: Inception ResNet V2 OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2022-05-18 Model: Inception ResNet V2 Intel UHD 610 CML GT1 80K 160K 240K 320K 400K SE +/- 125.07, N = 3 386539
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.2.dev Model: Person Detection FP16 - Device: CPU Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 1.01, N = 3 957.97 MIN: 908.75 / MAX: 969.76 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.2.dev Model: Person Detection FP16 - Device: CPU Intel UHD 610 CML GT1 0.4703 0.9406 1.4109 1.8812 2.3515 SE +/- 0.00, N = 3 2.09 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2023.2.dev - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 119.36 (SE +/- 1.02, N = 3; MIN: 114.3 / MAX: 132.68)
OpenVINO 2023.2.dev - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (FPS, More Is Better): Intel UHD 610 CML GT1: 16.76 (SE +/- 0.14, N = 3)
OpenVINO 2023.2.dev - Model: Road Segmentation ADAS FP16 - Device: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 236.97 (SE +/- 0.30, N = 3; MIN: 121.52 / MAX: 369.29)
OpenVINO 2023.2.dev - Model: Road Segmentation ADAS FP16 - Device: CPU (FPS, More Is Better): Intel UHD 610 CML GT1: 8.44 (SE +/- 0.01, N = 3)
OpenVINO 2023.2.dev - Model: Handwritten English Recognition FP16 - Device: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 155.38 (SE +/- 0.31, N = 3; MIN: 131.53 / MAX: 168.03)
OpenVINO 2023.2.dev - Model: Handwritten English Recognition FP16 - Device: CPU (FPS, More Is Better): Intel UHD 610 CML GT1: 12.87 (SE +/- 0.03, N = 3)
TensorFlow Lite 2022-05-18 - Model: NASNet Mobile (Microseconds, Fewer Is Better): Intel UHD 610 CML GT1: 40066.8 (SE +/- 22.46, N = 3)
OpenVINO 2023.2.dev - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 123.65 (SE +/- 0.90, N = 3; MIN: 114.82 / MAX: 175.12)
OpenVINO 2023.2.dev - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (FPS, More Is Better): Intel UHD 610 CML GT1: 16.17 (SE +/- 0.12, N = 3)
OpenVINO 2023.2.dev - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 59.24 (SE +/- 0.05, N = 3; MIN: 56.57 / MAX: 73.79)
OpenVINO 2023.2.dev - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better): Intel UHD 610 CML GT1: 33.75 (SE +/- 0.03, N = 3)
TensorFlow Lite 2022-05-18 - Model: SqueezeNet (Microseconds, Fewer Is Better): Intel UHD 610 CML GT1: 30349.5 (SE +/- 33.23, N = 3)
TensorFlow Lite 2022-05-18 - Model: Mobilenet Float (Microseconds, Fewer Is Better): Intel UHD 610 CML GT1: 21944.0 (SE +/- 8.27, N = 3)
OpenVINO 2023.2.dev - Model: Vehicle Detection FP16 - Device: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 127.43 (SE +/- 0.05, N = 3; MIN: 122.64 / MAX: 143.55)
OpenVINO 2023.2.dev - Model: Vehicle Detection FP16 - Device: CPU (FPS, More Is Better): Intel UHD 610 CML GT1: 15.69 (SE +/- 0.01, N = 3)
OpenVINO 2023.2.dev - Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 117.47 (SE +/- 0.02, N = 3; MIN: 95.54 / MAX: 131.93)
OpenVINO 2023.2.dev - Model: Weld Porosity Detection FP16 - Device: CPU (FPS, More Is Better): Intel UHD 610 CML GT1: 17.02 (SE +/- 0.00, N = 3)
OpenVINO 2023.2.dev - Model: Face Detection Retail FP16-INT8 - Device: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 18.67 (SE +/- 0.01, N = 3; MIN: 10.24 / MAX: 40.58)
OpenVINO 2023.2.dev - Model: Face Detection Retail FP16-INT8 - Device: CPU (FPS, More Is Better): Intel UHD 610 CML GT1: 107.05 (SE +/- 0.03, N = 3)
OpenVINO 2023.2.dev - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 37.49 (SE +/- 0.00, N = 3; MIN: 36.71 / MAX: 55.03)
OpenVINO 2023.2.dev - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better): Intel UHD 610 CML GT1: 53.33 (SE +/- 0.00, N = 3)
OpenVINO 2023.2.dev - Model: Face Detection Retail FP16 - Device: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 38.71 (SE +/- 0.42, N = 3; MIN: 19.5 / MAX: 55.07)
OpenVINO 2023.2.dev - Model: Face Detection Retail FP16 - Device: CPU (FPS, More Is Better): Intel UHD 610 CML GT1: 51.65 (SE +/- 0.56, N = 3)
OpenVINO 2023.2.dev - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 1.49 (SE +/- 0.01, N = 3; MIN: 0.93 / MAX: 15.96)
OpenVINO 2023.2.dev - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better): Intel UHD 610 CML GT1: 1342.06 (SE +/- 5.84, N = 3)
OpenVINO 2023.2.dev - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 3.80 (SE +/- 0.01, N = 3; MIN: 3.04 / MAX: 19.93)
OpenVINO 2023.2.dev - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better): Intel UHD 610 CML GT1: 525.03 (SE +/- 0.95, N = 3)
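Comparing the two Age Gender Recognition entries shows how much INT8 quantization helps on this CPU: roughly 2.6x the FP16 throughput. A one-line check against the reported values:

    # Throughput ratio of the FP16-INT8 run vs. the plain FP16 run above.
    fp16_fps, int8_fps = 525.03, 1342.06
    print(f"INT8 speedup over FP16: {int8_fps / fp16_fps:.2f}x")  # ~2.56x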
Scikit-Learn 1.2.2 - Benchmark: Hist Gradient Boosting Categorical Only (Seconds, Fewer Is Better): Intel UHD 610 CML GT1: 27.02 (SE +/- 0.01, N = 3). 1. (F9X) gfortran options: -O0
R Benchmark (Seconds, Fewer Is Better): Intel UHD 610 CML GT1: 0.3482 (SE +/- 0.0003, N = 3). 1. R scripting front-end version 3.6.3 (2020-02-29)
Numenta Anomaly Benchmark 1.1 - Detector: Windowed Gaussian (Seconds, Fewer Is Better): Intel UHD 610 CML GT1: 33.81 (SE +/- 0.37, N = 3)
Mlpack Benchmark - Benchmark: scikit_svm (Seconds, Fewer Is Better): Intel UHD 610 CML GT1: 27.15 (SE +/- 0.03, N = 3)
RNNoise 2020-06-28 (Seconds, Fewer Is Better): Intel UHD 610 CML GT1: 26.35 (SE +/- 0.08, N = 3). 1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
TNN 0.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better): Intel UHD 610 CML GT1: 365.27 (SE +/- 0.06, N = 3; MIN: 364.24 / MAX: 368.82)
TNN 0.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better): Intel UHD 610 CML GT1: 327.08 (SE +/- 0.04, N = 3; MIN: 326.93 / MAX: 327.82)
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl (common to all TNN results in this section)
oneDNN 3.3 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 76.01 (SE +/- 0.42, N = 3; MIN: 73.69)
oneDNN 3.3 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 19.16 (SE +/- 0.01, N = 3; MIN: 19.06)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl (common to all oneDNN results in this section)
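The same quantization effect appears in oneDNN: the u8s8f32 deconvolution runs nearly 4x faster than f32. Since these are times (lower is better), the speedup is the inverse ratio of the two results above:

    # Lower is better (ms), so speedup = f32_time / int8_time.
    f32_ms, int8_ms = 76.01, 19.16
    print(f"u8s8f32 speedup over f32: {f32_ms / int8_ms:.2f}x")  # ~3.97x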
oneDNN 3.3 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 36.82 (SE +/- 0.06, N = 3; MIN: 36.41)
oneDNN 3.3 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 11.93 (SE +/- 0.03, N = 3; MIN: 11.84)
oneDNN 3.3 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 37.72 (SE +/- 0.19, N = 3; MIN: 36.95)
oneDNN 3.3 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 5.82157 (SE +/- 0.00596, N = 3; MIN: 5.61)
oneDNN 3.3 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 58.72 (SE +/- 0.09, N = 3; MIN: 58.45)
oneDNN 3.3 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 49.27 (SE +/- 0.13, N = 3; MIN: 47.33)
TNN 0.3 - Target: CPU - Model: SqueezeNet v2 (ms, Fewer Is Better): Intel UHD 610 CML GT1: 72.92 (SE +/- 0.01, N = 3; MIN: 72.83 / MAX: 73.21)
oneDNN 3.3 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 90.10 (SE +/- 0.11, N = 3; MIN: 87.57)
oneDNN 3.3 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): Intel UHD 610 CML GT1: 26.17 (SE +/- 0.02, N = 3; MIN: 25.84)
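Any of the suites above can be re-run locally through the Phoronix Test Suite, which installs and executes tests by profile name. A minimal sketch via Python's subprocess; the "pts/onednn" profile name is an assumption here, so substitute whichever test you want to reproduce:

    import subprocess

    # Install and run one of the benchmarks from this result file.
    # "pts/onednn" is assumed; adjust to the desired test profile.
    subprocess.run(["phoronix-test-suite", "benchmark", "pts/onednn"], check=True)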
Phoronix Test Suite v10.8.5