TestRunNewCMake Intel Pentium Gold G6400 testing with a ASRock H510M-HDV/M.2 SE (P1.60 BIOS) and Intel UHD 610 CML GT1 3GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2311037-HERT-H510G6431&grs .
TestRunNewCMake Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server OpenGL Vulkan Compiler File-System Screen Resolution Intel UHD 610 CML GT1 Intel Pentium Gold G6400 @ 4.00GHz (2 Cores / 4 Threads) ASRock H510M-HDV/M.2 SE (P1.60 BIOS) Intel Comet Lake PCH 3584MB 1000GB Western Digital WDS100T2B0A Intel UHD 610 CML GT1 3GB (1050MHz) Realtek ALC897 G185BGEL01 Realtek RTL8111/8168/8411 Ubuntu 20.04 5.15.0-86-generic (x86_64) GNOME Shell 3.36.9 X Server 1.20.13 4.6 Mesa 21.2.6 1.2.182 GCC 9.4.0 ext4 1368x768 OpenBenchmarking.org - Transparent Huge Pages: madvise - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-9QDOt0/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0xf8 - Thermald 1.9.1 - Python 3.8.10 - gather_data_sampling: Not affected + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Mitigation of Microcode + tsx_async_abort: Not affected
TestRunNewCMake whisper-cpp: ggml-medium.en - 2016 State of the Union whisper-cpp: ggml-small.en - 2016 State of the Union whisper-cpp: ggml-base.en - 2016 State of the Union scikit-learn: Sparse Rand Projections / 100 Iterations scikit-learn: Kernel PCA Solvers / Time vs. N Components scikit-learn: Kernel PCA Solvers / Time vs. N Samples scikit-learn: Hist Gradient Boosting Categorical Only scikit-learn: Plot Polynomial Kernel Approximation scikit-learn: 20 Newsgroups / Logistic Regression scikit-learn: Hist Gradient Boosting Higgs Boson scikit-learn: Plot Singular Value Decomposition scikit-learn: Hist Gradient Boosting Threading scikit-learn: Hist Gradient Boosting Adult scikit-learn: Covertype Dataset Benchmark scikit-learn: Sample Without Replacement scikit-learn: Hist Gradient Boosting scikit-learn: Plot Incremental PCA scikit-learn: TSNE MNIST Dataset scikit-learn: LocalOutlierFactor scikit-learn: Feature Expansions scikit-learn: Plot OMP vs. LARS scikit-learn: Plot Hierarchical scikit-learn: Text Vectorizers scikit-learn: Plot Lasso Path scikit-learn: SGD Regression scikit-learn: Plot Neighbors scikit-learn: MNIST Dataset scikit-learn: Plot Ward scikit-learn: Sparsify scikit-learn: Lasso scikit-learn: Tree scikit-learn: SAGA scikit-learn: GLM mlpack: scikit_svm mlpack: scikit_ica onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard onnx: super-resolution-10 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ArcFace ResNet-100 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Parallel onnx: bertsquad-12 - CPU - Standard onnx: bertsquad-12 - CPU - Parallel onnx: GPT-2 - CPU - Standard onnx: GPT-2 - CPU - Parallel numenta-nab: Contextual Anomaly Detector OSE numenta-nab: Bayesian Changepoint numenta-nab: Earthgecko Skyline numenta-nab: Windowed Gaussian numenta-nab: Relative Entropy numenta-nab: KNN CAD openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU plaidml: No - Inference - ResNet 50 - CPU plaidml: No - Inference - VGG16 - CPU tnn: CPU - SqueezeNet v1.1 tnn: CPU - SqueezeNet v2 tnn: CPU - MobileNet v2 tnn: CPU - DenseNet ncnn: Vulkan GPU - FastestDet ncnn: Vulkan GPU - vision_transformer ncnn: Vulkan GPU - regnety_400m ncnn: Vulkan GPU - squeezenet_ssd ncnn: Vulkan GPU - yolov4-tiny ncnn: Vulkan GPU - resnet50 ncnn: Vulkan GPU - alexnet ncnn: Vulkan GPU - resnet18 ncnn: Vulkan GPU - vgg16 ncnn: Vulkan GPU - googlenet ncnn: Vulkan GPU - blazeface ncnn: Vulkan GPU - efficientnet-b0 ncnn: Vulkan GPU - mnasnet ncnn: Vulkan GPU - shufflenet-v2 ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU - mobilenet ncnn: CPU - FastestDet ncnn: CPU - vision_transformer ncnn: CPU - regnety_400m ncnn: CPU - squeezenet_ssd ncnn: CPU - yolov4-tiny ncnn: CPU - resnet50 ncnn: CPU - alexnet ncnn: CPU - resnet18 ncnn: CPU - vgg16 ncnn: CPU - googlenet ncnn: CPU - blazeface ncnn: CPU - efficientnet-b0 ncnn: CPU - mnasnet ncnn: CPU - shufflenet-v2 ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: CPU-v2-v2 - mobilenet-v2 ncnn: CPU - mobilenet mnn: inception-v3 mnn: mobilenet-v1-1.0 mnn: MobileNetV2_224 mnn: SqueezeNetV1.0 mnn: resnet-v2-50 mnn: squeezenetv1.1 mnn: mobilenetV3 mnn: nasnet caffe: GoogleNet - CPU - 1000 caffe: GoogleNet - CPU - 200 caffe: GoogleNet - CPU - 100 caffe: AlexNet - CPU - 1000 caffe: AlexNet - CPU - 200 caffe: AlexNet - CPU - 100 tensorflow-lite: Inception ResNet V2 tensorflow-lite: Mobilenet Quant tensorflow-lite: Mobilenet Float tensorflow-lite: NASNet Mobile tensorflow-lite: Inception V4 tensorflow-lite: SqueezeNet rnnoise: rbenchmark: numpy: onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 1D - f32 - CPU lczero: BLAS onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel onnx: super-resolution-10 - CPU - Standard onnx: super-resolution-10 - CPU - Parallel onnx: super-resolution-10 - CPU - Parallel onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ArcFace ResNet-100 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Parallel onnx: CaffeNet 12-int8 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Parallel onnx: bertsquad-12 - CPU - Standard onnx: bertsquad-12 - CPU - Parallel onnx: GPT-2 - CPU - Standard onnx: GPT-2 - CPU - Parallel onednn: IP Shapes 1D - bf16bf16bf16 - CPU Intel UHD 610 CML GT1 41069.323 11708.174 3200.85150 3104.849 327.831 463.431 27.723 301.898 61.600 194.069 357.661 445.698 104.542 584.048 146.723 199.375 85.502 807.651 456.448 211.252 253.193 278.834 81.385 476.230 228.715 267.817 87.879 94.532 107.140 1084.138 48.190 1078.486 1129.126 27.75 118.31 1.33205 8.91625 10.3522 8.89770 2.43458 2.16783 0.106003 38.8853 31.9536 1.17517 1.09291 29.4526 27.8791 139.353 168.694 534.684 34.114 68.527 726.697 1.53 1300.47 125.63 15.91 3.96 504.02 158.68 12.60 38.50 51.93 810.07 2.47 119.47 16.74 18.84 106.11 120.87 16.54 59.63 33.53 238.38 8.39 38.92 51.36 3863.39 0.52 128.29 15.58 941.81 2.12 926.63 2.15 11809.47 0.17 2.40 1.59 335.258 75.163 374.720 5394.682 10.32 922.08 23.61 38.59 93.72 125.65 37.88 46.11 269.82 55.09 2.45 29.19 17.45 8.63 15.15 20.77 75.75 10.31 922.27 23.14 38.64 93.85 125.87 37.89 46.04 268.68 54.88 2.46 29.21 17.42 8.65 15.09 20.69 75.80 158.585 20.613 13.323 22.634 119.022 12.234 3.861 30.010 2142790 429063 214399 986440 197396 98849 397245 576680 22665.5 41500.3 426984 31464.6 27.004 0.3572 290.97 21213.8 42196.3 21214.8 42203.1 21191.3 42198.7 26.9529 19.7928 48.7788 92.4204 78.2875 60.2532 5.90947 12.2778 38.0161 37.7613 137 750.726 1090.131 0.933197 112.527 135.211 7.64505 96.6196 112.391 410.746 461.300 9433.67 13223.7 0.0793033 25.7300 31.2940 851.918 914.994 33.9478 35.9756 OpenBenchmarking.org
Whisper.cpp Model: ggml-medium.en - Input: 2016 State of the Union OpenBenchmarking.org Seconds, Fewer Is Better Whisper.cpp 1.4 Model: ggml-medium.en - Input: 2016 State of the Union Intel UHD 610 CML GT1 9K 18K 27K 36K 45K SE +/- 369.73, N = 3 41069.32 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread
Whisper.cpp Model: ggml-small.en - Input: 2016 State of the Union OpenBenchmarking.org Seconds, Fewer Is Better Whisper.cpp 1.4 Model: ggml-small.en - Input: 2016 State of the Union Intel UHD 610 CML GT1 3K 6K 9K 12K 15K SE +/- 11.17, N = 3 11708.17 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread
Whisper.cpp Model: ggml-base.en - Input: 2016 State of the Union OpenBenchmarking.org Seconds, Fewer Is Better Whisper.cpp 1.4 Model: ggml-base.en - Input: 2016 State of the Union Intel UHD 610 CML GT1 700 1400 2100 2800 3500 SE +/- 0.85, N = 3 3200.85 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread
Scikit-Learn Benchmark: Sparse Random Projections / 100 Iterations OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Sparse Random Projections / 100 Iterations Intel UHD 610 CML GT1 700 1400 2100 2800 3500 SE +/- 2.11, N = 3 3104.85 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Kernel PCA Solvers / Time vs. N Components OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Kernel PCA Solvers / Time vs. N Components Intel UHD 610 CML GT1 70 140 210 280 350 SE +/- 4.18, N = 9 327.83 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Kernel PCA Solvers / Time vs. N Samples OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Kernel PCA Solvers / Time vs. N Samples Intel UHD 610 CML GT1 100 200 300 400 500 SE +/- 0.23, N = 3 463.43 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Hist Gradient Boosting Categorical Only OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Hist Gradient Boosting Categorical Only Intel UHD 610 CML GT1 7 14 21 28 35 SE +/- 0.05, N = 3 27.72 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Plot Polynomial Kernel Approximation OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Plot Polynomial Kernel Approximation Intel UHD 610 CML GT1 70 140 210 280 350 SE +/- 0.91, N = 3 301.90 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: 20 Newsgroups / Logistic Regression OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: 20 Newsgroups / Logistic Regression Intel UHD 610 CML GT1 14 28 42 56 70 SE +/- 0.03, N = 3 61.60 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Hist Gradient Boosting Higgs Boson OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Hist Gradient Boosting Higgs Boson Intel UHD 610 CML GT1 40 80 120 160 200 SE +/- 1.57, N = 3 194.07 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Plot Singular Value Decomposition OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Plot Singular Value Decomposition Intel UHD 610 CML GT1 80 160 240 320 400 SE +/- 0.72, N = 3 357.66 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Hist Gradient Boosting Threading OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Hist Gradient Boosting Threading Intel UHD 610 CML GT1 100 200 300 400 500 SE +/- 3.29, N = 3 445.70 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Hist Gradient Boosting Adult OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Hist Gradient Boosting Adult Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 0.18, N = 3 104.54 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Covertype Dataset Benchmark OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Covertype Dataset Benchmark Intel UHD 610 CML GT1 130 260 390 520 650 SE +/- 1.24, N = 3 584.05 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Sample Without Replacement OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Sample Without Replacement Intel UHD 610 CML GT1 30 60 90 120 150 SE +/- 1.02, N = 3 146.72 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Hist Gradient Boosting OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Hist Gradient Boosting Intel UHD 610 CML GT1 40 80 120 160 200 SE +/- 0.41, N = 3 199.38 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Plot Incremental PCA OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Plot Incremental PCA Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 0.30, N = 3 85.50 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: TSNE MNIST Dataset OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: TSNE MNIST Dataset Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 1.86, N = 3 807.65 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: LocalOutlierFactor OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: LocalOutlierFactor Intel UHD 610 CML GT1 100 200 300 400 500 SE +/- 0.41, N = 3 456.45 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Feature Expansions OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Feature Expansions Intel UHD 610 CML GT1 50 100 150 200 250 SE +/- 1.41, N = 3 211.25 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Plot OMP vs. LARS OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Plot OMP vs. LARS Intel UHD 610 CML GT1 60 120 180 240 300 SE +/- 0.09, N = 3 253.19 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Plot Hierarchical OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Plot Hierarchical Intel UHD 610 CML GT1 60 120 180 240 300 SE +/- 0.14, N = 3 278.83 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Text Vectorizers OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Text Vectorizers Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 0.28, N = 3 81.39 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Plot Lasso Path OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Plot Lasso Path Intel UHD 610 CML GT1 100 200 300 400 500 SE +/- 0.25, N = 3 476.23 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: SGD Regression OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: SGD Regression Intel UHD 610 CML GT1 50 100 150 200 250 SE +/- 0.63, N = 3 228.72 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Plot Neighbors OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Plot Neighbors Intel UHD 610 CML GT1 60 120 180 240 300 SE +/- 1.07, N = 3 267.82 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: MNIST Dataset OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: MNIST Dataset Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 0.09, N = 3 87.88 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Plot Ward OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Plot Ward Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 0.06, N = 3 94.53 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Sparsify OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Sparsify Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 0.34, N = 3 107.14 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Lasso OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Lasso Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 0.75, N = 3 1084.14 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: Tree OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: Tree Intel UHD 610 CML GT1 11 22 33 44 55 SE +/- 0.48, N = 15 48.19 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: SAGA OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: SAGA Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 0.44, N = 3 1078.49 1. (F9X) gfortran options: -O0
Scikit-Learn Benchmark: GLM OpenBenchmarking.org Seconds, Fewer Is Better Scikit-Learn 1.2.2 Benchmark: GLM Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 2.76, N = 3 1129.13 1. (F9X) gfortran options: -O0
Mlpack Benchmark Benchmark: scikit_svm OpenBenchmarking.org Seconds, Fewer Is Better Mlpack Benchmark Benchmark: scikit_svm Intel UHD 610 CML GT1 7 14 21 28 35 SE +/- 0.02, N = 3 27.75
Mlpack Benchmark Benchmark: scikit_ica OpenBenchmarking.org Seconds, Fewer Is Better Mlpack Benchmark Benchmark: scikit_ica Intel UHD 610 CML GT1 30 60 90 120 150 SE +/- 0.17, N = 3 118.31
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 0.2997 0.5994 0.8991 1.1988 1.4985 SE +/- 0.00245, N = 3 1.33205 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: super-resolution-10 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 2 4 6 8 10 SE +/- 0.14224, N = 12 8.91625 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 3 6 9 12 15 SE +/- 0.11, N = 3 10.35 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 2 4 6 8 10 SE +/- 0.03794, N = 3 8.89770 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 0.5478 1.0956 1.6434 2.1912 2.739 SE +/- 0.00046, N = 3 2.43458 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 0.4878 0.9756 1.4634 1.9512 2.439 SE +/- 0.00760, N = 3 2.16783 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: fcn-resnet101-11 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 0.0239 0.0478 0.0717 0.0956 0.1195 SE +/- 0.000003, N = 3 0.106003 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 9 18 27 36 45 SE +/- 0.29, N = 11 38.89 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 7 14 21 28 35 SE +/- 0.09, N = 3 31.95 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: bertsquad-12 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 0.2644 0.5288 0.7932 1.0576 1.322 SE +/- 0.01029, N = 15 1.17517 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: bertsquad-12 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 0.2459 0.4918 0.7377 0.9836 1.2295 SE +/- 0.00204, N = 3 1.09291 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: GPT-2 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 7 14 21 28 35 SE +/- 0.06, N = 3 29.45 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: GPT-2 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 7 14 21 28 35 SE +/- 0.43, N = 12 27.88 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
Numenta Anomaly Benchmark Detector: Contextual Anomaly Detector OSE OpenBenchmarking.org Seconds, Fewer Is Better Numenta Anomaly Benchmark 1.1 Detector: Contextual Anomaly Detector OSE Intel UHD 610 CML GT1 30 60 90 120 150 SE +/- 0.92, N = 3 139.35
Numenta Anomaly Benchmark Detector: Bayesian Changepoint OpenBenchmarking.org Seconds, Fewer Is Better Numenta Anomaly Benchmark 1.1 Detector: Bayesian Changepoint Intel UHD 610 CML GT1 40 80 120 160 200 SE +/- 0.26, N = 3 168.69
Numenta Anomaly Benchmark Detector: Earthgecko Skyline OpenBenchmarking.org Seconds, Fewer Is Better Numenta Anomaly Benchmark 1.1 Detector: Earthgecko Skyline Intel UHD 610 CML GT1 120 240 360 480 600 SE +/- 2.51, N = 3 534.68
Numenta Anomaly Benchmark Detector: Windowed Gaussian OpenBenchmarking.org Seconds, Fewer Is Better Numenta Anomaly Benchmark 1.1 Detector: Windowed Gaussian Intel UHD 610 CML GT1 8 16 24 32 40 SE +/- 0.03, N = 3 34.11
Numenta Anomaly Benchmark Detector: Relative Entropy OpenBenchmarking.org Seconds, Fewer Is Better Numenta Anomaly Benchmark 1.1 Detector: Relative Entropy Intel UHD 610 CML GT1 15 30 45 60 75 SE +/- 0.34, N = 3 68.53
Numenta Anomaly Benchmark Detector: KNN CAD OpenBenchmarking.org Seconds, Fewer Is Better Numenta Anomaly Benchmark 1.1 Detector: KNN CAD Intel UHD 610 CML GT1 160 320 480 640 800 SE +/- 0.28, N = 3 726.70
OpenVINO Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU Intel UHD 610 CML GT1 0.3443 0.6886 1.0329 1.3772 1.7215 SE +/- 0.00, N = 3 1.53 MIN: 0.88 / MAX: 18.36 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU Intel UHD 610 CML GT1 300 600 900 1200 1500 SE +/- 1.99, N = 3 1300.47 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Handwritten English Recognition FP16-INT8 - Device: CPU Intel UHD 610 CML GT1 30 60 90 120 150 SE +/- 0.32, N = 3 125.63 MIN: 81.99 / MAX: 143.87 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Handwritten English Recognition FP16-INT8 - Device: CPU Intel UHD 610 CML GT1 4 8 12 16 20 SE +/- 0.04, N = 3 15.91 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU Intel UHD 610 CML GT1 0.891 1.782 2.673 3.564 4.455 SE +/- 0.01, N = 3 3.96 MIN: 2.43 / MAX: 19.95 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU Intel UHD 610 CML GT1 110 220 330 440 550 SE +/- 0.61, N = 3 504.02 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Handwritten English Recognition FP16 - Device: CPU Intel UHD 610 CML GT1 40 80 120 160 200 SE +/- 0.21, N = 3 158.68 MIN: 99.47 / MAX: 177.31 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Handwritten English Recognition FP16 - Device: CPU Intel UHD 610 CML GT1 3 6 9 12 15 SE +/- 0.02, N = 3 12.60 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Weld Porosity Detection FP16-INT8 - Device: CPU Intel UHD 610 CML GT1 9 18 27 36 45 SE +/- 0.01, N = 3 38.50 MIN: 21.89 / MAX: 52.21 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Weld Porosity Detection FP16-INT8 - Device: CPU Intel UHD 610 CML GT1 12 24 36 48 60 SE +/- 0.01, N = 3 51.93 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Machine Translation EN To DE FP16 - Device: CPU Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 6.01, N = 3 810.07 MIN: 656.58 / MAX: 849.91 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Machine Translation EN To DE FP16 - Device: CPU Intel UHD 610 CML GT1 0.5558 1.1116 1.6674 2.2232 2.779 SE +/- 0.02, N = 3 2.47 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Road Segmentation ADAS FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU Intel UHD 610 CML GT1 30 60 90 120 150 SE +/- 0.03, N = 3 119.47 MIN: 90.68 / MAX: 145.72 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Road Segmentation ADAS FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU Intel UHD 610 CML GT1 4 8 12 16 20 SE +/- 0.01, N = 3 16.74 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Face Detection Retail FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Face Detection Retail FP16-INT8 - Device: CPU Intel UHD 610 CML GT1 5 10 15 20 25 SE +/- 0.02, N = 3 18.84 MIN: 11.22 / MAX: 33.21 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Face Detection Retail FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Face Detection Retail FP16-INT8 - Device: CPU Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 0.10, N = 3 106.11 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Weld Porosity Detection FP16 - Device: CPU Intel UHD 610 CML GT1 30 60 90 120 150 SE +/- 0.01, N = 3 120.87 MIN: 100.98 / MAX: 135.07 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Weld Porosity Detection FP16 - Device: CPU Intel UHD 610 CML GT1 4 8 12 16 20 SE +/- 0.00, N = 3 16.54 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Vehicle Detection FP16-INT8 - Device: CPU Intel UHD 610 CML GT1 13 26 39 52 65 SE +/- 0.04, N = 3 59.63 MIN: 33.23 / MAX: 73.95 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Vehicle Detection FP16-INT8 - Device: CPU Intel UHD 610 CML GT1 8 16 24 32 40 SE +/- 0.02, N = 3 33.53 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Road Segmentation ADAS FP16 - Device: CPU Intel UHD 610 CML GT1 50 100 150 200 250 SE +/- 0.25, N = 3 238.38 MIN: 230.11 / MAX: 256.02 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Road Segmentation ADAS FP16 - Device: CPU Intel UHD 610 CML GT1 2 4 6 8 10 SE +/- 0.01, N = 3 8.39 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Face Detection Retail FP16 - Device: CPU Intel UHD 610 CML GT1 9 18 27 36 45 SE +/- 0.04, N = 3 38.92 MIN: 22.17 / MAX: 55.68 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Face Detection Retail FP16 - Device: CPU Intel UHD 610 CML GT1 12 24 36 48 60 SE +/- 0.06, N = 3 51.36 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Face Detection FP16-INT8 - Device: CPU Intel UHD 610 CML GT1 800 1600 2400 3200 4000 SE +/- 2.32, N = 3 3863.39 MIN: 3664.15 / MAX: 4006.24 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Face Detection FP16-INT8 - Device: CPU Intel UHD 610 CML GT1 0.117 0.234 0.351 0.468 0.585 SE +/- 0.00, N = 3 0.52 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Vehicle Detection FP16 - Device: CPU Intel UHD 610 CML GT1 30 60 90 120 150 SE +/- 0.04, N = 3 128.29 MIN: 74.44 / MAX: 146.59 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Vehicle Detection FP16 - Device: CPU Intel UHD 610 CML GT1 4 8 12 16 20 SE +/- 0.01, N = 3 15.58 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Person Detection FP32 - Device: CPU Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 8.41, N = 3 941.81 MIN: 825.86 / MAX: 990.85 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Person Detection FP32 - Device: CPU Intel UHD 610 CML GT1 0.477 0.954 1.431 1.908 2.385 SE +/- 0.02, N = 3 2.12 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Person Detection FP16 - Device: CPU Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 3.54, N = 3 926.63 MIN: 767.41 / MAX: 991.25 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Person Detection FP16 - Device: CPU Intel UHD 610 CML GT1 0.4838 0.9676 1.4514 1.9352 2.419 SE +/- 0.01, N = 3 2.15 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2023.1 Model: Face Detection FP16 - Device: CPU Intel UHD 610 CML GT1 3K 6K 9K 12K 15K SE +/- 0.71, N = 3 11809.47 MIN: 11624.8 / MAX: 11889.26 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2023.1 Model: Face Detection FP16 - Device: CPU Intel UHD 610 CML GT1 0.0383 0.0766 0.1149 0.1532 0.1915 SE +/- 0.00, N = 3 0.17 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -pthread
PlaidML FP16: No - Mode: Inference - Network: ResNet 50 - Device: CPU OpenBenchmarking.org FPS, More Is Better PlaidML FP16: No - Mode: Inference - Network: ResNet 50 - Device: CPU Intel UHD 610 CML GT1 0.54 1.08 1.62 2.16 2.7 SE +/- 0.00, N = 3 2.40
PlaidML FP16: No - Mode: Inference - Network: VGG16 - Device: CPU OpenBenchmarking.org FPS, More Is Better PlaidML FP16: No - Mode: Inference - Network: VGG16 - Device: CPU Intel UHD 610 CML GT1 0.3578 0.7156 1.0734 1.4312 1.789 SE +/- 0.02, N = 3 1.59
TNN Target: CPU - Model: SqueezeNet v1.1 OpenBenchmarking.org ms, Fewer Is Better TNN 0.3 Target: CPU - Model: SqueezeNet v1.1 Intel UHD 610 CML GT1 70 140 210 280 350 SE +/- 0.04, N = 3 335.26 MIN: 334.98 / MAX: 336.55 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
TNN Target: CPU - Model: SqueezeNet v2 OpenBenchmarking.org ms, Fewer Is Better TNN 0.3 Target: CPU - Model: SqueezeNet v2 Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 0.28, N = 3 75.16 MIN: 74.69 / MAX: 79.56 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
TNN Target: CPU - Model: MobileNet v2 OpenBenchmarking.org ms, Fewer Is Better TNN 0.3 Target: CPU - Model: MobileNet v2 Intel UHD 610 CML GT1 80 160 240 320 400 SE +/- 0.14, N = 3 374.72 MIN: 373.05 / MAX: 386.67 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
TNN Target: CPU - Model: DenseNet OpenBenchmarking.org ms, Fewer Is Better TNN 0.3 Target: CPU - Model: DenseNet Intel UHD 610 CML GT1 1200 2400 3600 4800 6000 SE +/- 2.51, N = 3 5394.68 MIN: 5350.15 / MAX: 5444.47 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
NCNN Target: Vulkan GPU - Model: FastestDet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: FastestDet Intel UHD 610 CML GT1 3 6 9 12 15 SE +/- 0.02, N = 3 10.32 MIN: 10.16 / MAX: 16.45 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU - Model: vision_transformer OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: vision_transformer Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 0.40, N = 3 922.08 MIN: 908.45 / MAX: 979.04 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU - Model: regnety_400m OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: regnety_400m Intel UHD 610 CML GT1 6 12 18 24 30 SE +/- 0.46, N = 3 23.61 MIN: 22.92 / MAX: 30.57 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU - Model: squeezenet_ssd OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: squeezenet_ssd Intel UHD 610 CML GT1 9 18 27 36 45 SE +/- 0.02, N = 3 38.59 MIN: 38.16 / MAX: 46.82 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU - Model: yolov4-tiny OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: yolov4-tiny Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 0.03, N = 3 93.72 MIN: 92.89 / MAX: 104.34 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU - Model: resnet50 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: resnet50 Intel UHD 610 CML GT1 30 60 90 120 150 SE +/- 0.05, N = 3 125.65 MIN: 124.59 / MAX: 136.36 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU - Model: alexnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: alexnet Intel UHD 610 CML GT1 9 18 27 36 45 SE +/- 0.04, N = 3 37.88 MIN: 37.48 / MAX: 49.23 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU - Model: resnet18 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: resnet18 Intel UHD 610 CML GT1 10 20 30 40 50 SE +/- 0.02, N = 3 46.11 MIN: 45.65 / MAX: 55.32 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU - Model: vgg16 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: vgg16 Intel UHD 610 CML GT1 60 120 180 240 300 SE +/- 0.26, N = 3 269.82 MIN: 266.99 / MAX: 280.51 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU - Model: googlenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: googlenet Intel UHD 610 CML GT1 12 24 36 48 60 SE +/- 0.03, N = 3 55.09 MIN: 54.5 / MAX: 66.08 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU - Model: blazeface OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: blazeface Intel UHD 610 CML GT1 0.5513 1.1026 1.6539 2.2052 2.7565 SE +/- 0.02, N = 3 2.45 MIN: 2.36 / MAX: 8.3 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU - Model: efficientnet-b0 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: efficientnet-b0 Intel UHD 610 CML GT1 7 14 21 28 35 SE +/- 0.01, N = 3 29.19 MIN: 28.9 / MAX: 38.9 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU - Model: mnasnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: mnasnet Intel UHD 610 CML GT1 4 8 12 16 20 SE +/- 0.02, N = 3 17.45 MIN: 17.25 / MAX: 23.55 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU - Model: shufflenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: shufflenet-v2 Intel UHD 610 CML GT1 2 4 6 8 10 SE +/- 0.04, N = 3 8.63 MIN: 8.51 / MAX: 14.66 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 Intel UHD 610 CML GT1 4 8 12 16 20 SE +/- 0.05, N = 3 15.15 MIN: 14.93 / MAX: 21.19 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 Intel UHD 610 CML GT1 5 10 15 20 25 SE +/- 0.03, N = 3 20.77 MIN: 20.49 / MAX: 26.72 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: Vulkan GPU - Model: mobilenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: Vulkan GPU - Model: mobilenet Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 0.04, N = 3 75.75 MIN: 75.1 / MAX: 86.99 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: FastestDet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: FastestDet Intel UHD 610 CML GT1 3 6 9 12 15 SE +/- 0.00, N = 3 10.31 MIN: 10.17 / MAX: 16.27 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: vision_transformer OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: vision_transformer Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 0.37, N = 3 922.27 MIN: 909.35 / MAX: 1072.61 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: regnety_400m OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: regnety_400m Intel UHD 610 CML GT1 6 12 18 24 30 SE +/- 0.01, N = 3 23.14 MIN: 22.84 / MAX: 34.82 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: squeezenet_ssd OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: squeezenet_ssd Intel UHD 610 CML GT1 9 18 27 36 45 SE +/- 0.04, N = 3 38.64 MIN: 38.15 / MAX: 50 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: yolov4-tiny OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: yolov4-tiny Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 0.29, N = 3 93.85 MIN: 92.79 / MAX: 111.06 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: resnet50 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: resnet50 Intel UHD 610 CML GT1 30 60 90 120 150 SE +/- 0.11, N = 3 125.87 MIN: 124.85 / MAX: 136.01 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: alexnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: alexnet Intel UHD 610 CML GT1 9 18 27 36 45 SE +/- 0.01, N = 3 37.89 MIN: 37.43 / MAX: 49.15 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: resnet18 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: resnet18 Intel UHD 610 CML GT1 10 20 30 40 50 SE +/- 0.16, N = 3 46.04 MIN: 45.38 / MAX: 54.29 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: vgg16 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: vgg16 Intel UHD 610 CML GT1 60 120 180 240 300 SE +/- 1.39, N = 3 268.68 MIN: 263.85 / MAX: 281.8 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: googlenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: googlenet Intel UHD 610 CML GT1 12 24 36 48 60 SE +/- 0.17, N = 3 54.88 MIN: 54.1 / MAX: 63.71 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: blazeface OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: blazeface Intel UHD 610 CML GT1 0.5535 1.107 1.6605 2.214 2.7675 SE +/- 0.02, N = 3 2.46 MIN: 2.37 / MAX: 8.36 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: efficientnet-b0 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: efficientnet-b0 Intel UHD 610 CML GT1 7 14 21 28 35 SE +/- 0.06, N = 3 29.21 MIN: 28.82 / MAX: 40.6 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: mnasnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: mnasnet Intel UHD 610 CML GT1 4 8 12 16 20 SE +/- 0.03, N = 3 17.42 MIN: 17.17 / MAX: 29.04 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: shufflenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: shufflenet-v2 Intel UHD 610 CML GT1 2 4 6 8 10 SE +/- 0.01, N = 3 8.65 MIN: 8.53 / MAX: 14.53 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU-v3-v3 - Model: mobilenet-v3 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU-v3-v3 - Model: mobilenet-v3 Intel UHD 610 CML GT1 4 8 12 16 20 SE +/- 0.01, N = 3 15.09 MIN: 14.9 / MAX: 21.12 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU-v2-v2 - Model: mobilenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU-v2-v2 - Model: mobilenet-v2 Intel UHD 610 CML GT1 5 10 15 20 25 SE +/- 0.04, N = 3 20.69 MIN: 20.42 / MAX: 26.89 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN Target: CPU - Model: mobilenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20230517 Target: CPU - Model: mobilenet Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 0.02, N = 3 75.80 MIN: 75.19 / MAX: 86.56 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
Mobile Neural Network Model: inception-v3 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2.1 Model: inception-v3 Intel UHD 610 CML GT1 40 80 120 160 200 SE +/- 0.13, N = 3 158.59 MIN: 156.93 / MAX: 210.33 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: mobilenet-v1-1.0 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2.1 Model: mobilenet-v1-1.0 Intel UHD 610 CML GT1 5 10 15 20 25 SE +/- 0.03, N = 3 20.61 MIN: 20.32 / MAX: 34.59 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: MobileNetV2_224 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2.1 Model: MobileNetV2_224 Intel UHD 610 CML GT1 3 6 9 12 15 SE +/- 0.01, N = 3 13.32 MIN: 13.14 / MAX: 27.14 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: SqueezeNetV1.0 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2.1 Model: SqueezeNetV1.0 Intel UHD 610 CML GT1 5 10 15 20 25 SE +/- 0.01, N = 3 22.63 MIN: 22.38 / MAX: 36.29 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: resnet-v2-50 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2.1 Model: resnet-v2-50 Intel UHD 610 CML GT1 30 60 90 120 150 SE +/- 0.09, N = 3 119.02 MIN: 117.79 / MAX: 154.34 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: squeezenetv1.1 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2.1 Model: squeezenetv1.1 Intel UHD 610 CML GT1 3 6 9 12 15 SE +/- 0.02, N = 3 12.23 MIN: 12.06 / MAX: 26.18 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: mobilenetV3 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2.1 Model: mobilenetV3 Intel UHD 610 CML GT1 0.8687 1.7374 2.6061 3.4748 4.3435 SE +/- 0.004, N = 3 3.861 MIN: 3.78 / MAX: 17.86 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: nasnet OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2.1 Model: nasnet Intel UHD 610 CML GT1 7 14 21 28 35 SE +/- 0.07, N = 3 30.01 MIN: 29.6 / MAX: 49.62 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Caffe Model: GoogleNet - Acceleration: CPU - Iterations: 1000 OpenBenchmarking.org Milli-Seconds, Fewer Is Better Caffe 2020-02-13 Model: GoogleNet - Acceleration: CPU - Iterations: 1000 Intel UHD 610 CML GT1 500K 1000K 1500K 2000K 2500K SE +/- 68.07, N = 3 2142790 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
Caffe Model: GoogleNet - Acceleration: CPU - Iterations: 200 OpenBenchmarking.org Milli-Seconds, Fewer Is Better Caffe 2020-02-13 Model: GoogleNet - Acceleration: CPU - Iterations: 200 Intel UHD 610 CML GT1 90K 180K 270K 360K 450K SE +/- 257.16, N = 3 429063 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
Caffe Model: GoogleNet - Acceleration: CPU - Iterations: 100 OpenBenchmarking.org Milli-Seconds, Fewer Is Better Caffe 2020-02-13 Model: GoogleNet - Acceleration: CPU - Iterations: 100 Intel UHD 610 CML GT1 50K 100K 150K 200K 250K SE +/- 48.25, N = 3 214399 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
Caffe Model: AlexNet - Acceleration: CPU - Iterations: 1000 OpenBenchmarking.org Milli-Seconds, Fewer Is Better Caffe 2020-02-13 Model: AlexNet - Acceleration: CPU - Iterations: 1000 Intel UHD 610 CML GT1 200K 400K 600K 800K 1000K SE +/- 69.96, N = 3 986440 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
Caffe Model: AlexNet - Acceleration: CPU - Iterations: 200 OpenBenchmarking.org Milli-Seconds, Fewer Is Better Caffe 2020-02-13 Model: AlexNet - Acceleration: CPU - Iterations: 200 Intel UHD 610 CML GT1 40K 80K 120K 160K 200K SE +/- 312.59, N = 3 197396 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
Caffe Model: AlexNet - Acceleration: CPU - Iterations: 100 OpenBenchmarking.org Milli-Seconds, Fewer Is Better Caffe 2020-02-13 Model: AlexNet - Acceleration: CPU - Iterations: 100 Intel UHD 610 CML GT1 20K 40K 60K 80K 100K SE +/- 87.05, N = 3 98849 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
TensorFlow Lite Model: Inception ResNet V2 OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2022-05-18 Model: Inception ResNet V2 Intel UHD 610 CML GT1 90K 180K 270K 360K 450K SE +/- 72.96, N = 3 397245
TensorFlow Lite Model: Mobilenet Quant OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2022-05-18 Model: Mobilenet Quant Intel UHD 610 CML GT1 120K 240K 360K 480K 600K SE +/- 75.97, N = 3 576680
TensorFlow Lite Model: Mobilenet Float OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2022-05-18 Model: Mobilenet Float Intel UHD 610 CML GT1 5K 10K 15K 20K 25K SE +/- 36.79, N = 3 22665.5
TensorFlow Lite Model: NASNet Mobile OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2022-05-18 Model: NASNet Mobile Intel UHD 610 CML GT1 9K 18K 27K 36K 45K SE +/- 77.05, N = 3 41500.3
TensorFlow Lite Model: Inception V4 OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2022-05-18 Model: Inception V4 Intel UHD 610 CML GT1 90K 180K 270K 360K 450K SE +/- 295.20, N = 3 426984
TensorFlow Lite Model: SqueezeNet OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2022-05-18 Model: SqueezeNet Intel UHD 610 CML GT1 7K 14K 21K 28K 35K SE +/- 65.07, N = 3 31464.6
RNNoise OpenBenchmarking.org Seconds, Fewer Is Better RNNoise 2020-06-28 Intel UHD 610 CML GT1 6 12 18 24 30 SE +/- 0.08, N = 3 27.00 1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
R Benchmark OpenBenchmarking.org Seconds, Fewer Is Better R Benchmark Intel UHD 610 CML GT1 0.0804 0.1608 0.2412 0.3216 0.402 SE +/- 0.0004, N = 3 0.3572 1. R scripting front-end version 3.6.3 (2020-02-29)
Numpy Benchmark OpenBenchmarking.org Score, More Is Better Numpy Benchmark Intel UHD 610 CML GT1 60 120 180 240 300 SE +/- 0.11, N = 3 290.97
oneDNN Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU Intel UHD 610 CML GT1 5K 10K 15K 20K 25K SE +/- 2.25, N = 3 21213.8 MIN: 21162.9 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU Intel UHD 610 CML GT1 9K 18K 27K 36K 45K SE +/- 4.88, N = 3 42196.3 MIN: 42124.3 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU Intel UHD 610 CML GT1 5K 10K 15K 20K 25K SE +/- 2.99, N = 3 21214.8 MIN: 21156.9 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU Intel UHD 610 CML GT1 9K 18K 27K 36K 45K SE +/- 4.97, N = 3 42203.1 MIN: 42105 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU Intel UHD 610 CML GT1 5K 10K 15K 20K 25K SE +/- 10.14, N = 3 21191.3 MIN: 21127.4 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU Intel UHD 610 CML GT1 9K 18K 27K 36K 45K SE +/- 4.60, N = 3 42198.7 MIN: 42136.1 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU Intel UHD 610 CML GT1 6 12 18 24 30 SE +/- 0.03, N = 3 26.95 MIN: 26.53 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU Intel UHD 610 CML GT1 5 10 15 20 25 SE +/- 0.05, N = 3 19.79 MIN: 19.52 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU Intel UHD 610 CML GT1 11 22 33 44 55 SE +/- 0.20, N = 3 48.78 MIN: 47.03 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 0.12, N = 3 92.42 MIN: 89.59 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 0.25, N = 3 78.29 MIN: 75.65 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU Intel UHD 610 CML GT1 13 26 39 52 65 SE +/- 0.07, N = 3 60.25 MIN: 59.72 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU Intel UHD 610 CML GT1 1.3296 2.6592 3.9888 5.3184 6.648 SE +/- 0.01192, N = 3 5.90947 MIN: 5.66 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU Intel UHD 610 CML GT1 3 6 9 12 15 SE +/- 0.02, N = 3 12.28 MIN: 12.13 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU Intel UHD 610 CML GT1 9 18 27 36 45 SE +/- 0.09, N = 3 38.02 MIN: 37.16 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.3 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU Intel UHD 610 CML GT1 9 18 27 36 45 SE +/- 0.01, N = 3 37.76 MIN: 37.31 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
LeelaChessZero Backend: BLAS OpenBenchmarking.org Nodes Per Second, More Is Better LeelaChessZero 0.28 Backend: BLAS Intel UHD 610 CML GT1 30 60 90 120 150 SE +/- 1.53, N = 3 137 1. (CXX) g++ options: -flto -pthread
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 160 320 480 640 800 SE +/- 1.38, N = 3 750.73 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 36.89, N = 15 1090.13 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 0.21 0.42 0.63 0.84 1.05 SE +/- 0.033556, N = 15 0.933197 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: super-resolution-10 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 30 60 90 120 150 SE +/- 2.13, N = 12 112.53 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: super-resolution-10 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 30 60 90 120 150 SE +/- 7.92, N = 15 135.21 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: super-resolution-10 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 2 4 6 8 10 SE +/- 0.30411, N = 15 7.64505 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 20 40 60 80 100 SE +/- 1.08, N = 3 96.62 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 30 60 90 120 150 SE +/- 0.48, N = 3 112.39 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 90 180 270 360 450 SE +/- 0.08, N = 3 410.75 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 100 200 300 400 500 SE +/- 1.62, N = 3 461.30 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: fcn-resnet101-11 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 2K 4K 6K 8K 10K SE +/- 0.27, N = 3 9433.67 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 3K 6K 9K 12K 15K SE +/- 831.48, N = 15 13223.7 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 0.0178 0.0356 0.0534 0.0712 0.089 SE +/- 0.0042026, N = 15 0.0793033 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 6 12 18 24 30 SE +/- 0.20, N = 11 25.73 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 7 14 21 28 35 SE +/- 0.09, N = 3 31.29 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: bertsquad-12 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 8.02, N = 15 851.92 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: bertsquad-12 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 200 400 600 800 1000 SE +/- 1.71, N = 3 914.99 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: GPT-2 - Device: CPU - Executor: Standard Intel UHD 610 CML GT1 8 16 24 32 40 SE +/- 0.07, N = 3 33.95 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: GPT-2 - Device: CPU - Executor: Parallel Intel UHD 610 CML GT1 8 16 24 32 40 SE +/- 0.66, N = 12 35.98 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -lpthread -pthread
Phoronix Test Suite v10.8.5