ml-benchmark2

AMD Ryzen 9 7950X3D 16-Core testing with an ASUS PRIME X670E-PRO WIFI (1813 BIOS) motherboard and an MSI NVIDIA GeForce RTX 3060 12GB graphics card on Ubuntu 22.04, via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2312269-NE-MLBENCHMA88&grs.
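
This run can be repeated locally for comparison. A minimal sketch, assuming the Phoronix Test Suite is installed on the target machine and that its benchmark command accepts the public result ID from the URL above (PTS then fetches that result file and benchmarks the local system against it):

# Hedged sketch: re-run this OpenBenchmarking.org comparison locally.
# Assumes `phoronix-test-suite` is installed and on PATH.
import subprocess

RESULT_ID = "2312269-NE-MLBENCHMA88"  # result ID from the export URL above
subprocess.run(["phoronix-test-suite", "benchmark", RESULT_ID], check=True)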

ml-benchmark2-12-25-23 system details:

Processor: AMD Ryzen 9 7950X3D 16-Core @ 4.20GHz (16 Cores / 32 Threads)
Motherboard: ASUS PRIME X670E-PRO WIFI (1813 BIOS)
Chipset: AMD Device 14d8
Memory: 128GB
Disk: 4001GB CT4000P3PSSD8 + 1024GB SPCC M.2 PCIe SSD
Graphics: MSI NVIDIA GeForce RTX 3060 12GB
Audio: NVIDIA Device 228e
Monitor: LC27T55
Network: Realtek RTL8125 2.5GbE + MEDIATEK Device 0608
OS: Ubuntu 22.04
Kernel: 6.2.0-39-generic (x86_64)
Desktop: GNOME Shell 42.9
Display Server: X Server 1.21.1.4
Display Driver: NVIDIA 535.129.03
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.2.147
Vulkan: 1.3.242
Compiler: GCC 11.4.0 + CUDA 12.3
File-System: ext4
Screen Resolution: 1920x1080

Notes:
- Transparent Huge Pages: madvise
- Compiler configuration: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0xa601206
- GLAMOR
- BAR1 / Visible vRAM Size: 16384 MiB
- vBIOS Version: 94.06.2f.00.98
- GPU Compute Cores: 3584
- Python 3.10.12
- Security mitigations: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Tests in this result file: Whisper.cpp, Scikit-Learn, Mlpack Benchmark, AI Benchmark Alpha, Numenta Anomaly Benchmark, OpenVINO, TNN, NCNN, Mobile Neural Network, Caffe, spaCy, Neural Magic DeepSparse, TensorFlow, PyTorch, TensorFlow Lite, RNNoise, R Benchmark, DeepSpeech, Numpy, oneDNN, LeelaChessZero, OpenCV, and SHOC. Individual results follow below.

Whisper.cpp

Model: ggml-medium.en - Input: 2016 State of the Union

Whisper.cpp 1.4. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 807.18 (SE +/- 9.64, N = 3). 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread

Whisper.cpp

Model: ggml-small.en - Input: 2016 State of the Union

Whisper.cpp 1.4. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 289.74 (SE +/- 3.07, N = 9). 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread

Whisper.cpp

Model: ggml-base.en - Input: 2016 State of the Union

Whisper.cpp 1.4. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 96.98 (SE +/- 1.08, N = 15). 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread
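
Across the three Whisper.cpp runs, transcription time tracks model size almost directly; a small sketch deriving the relative costs from the mean times reported above:

# Relative Whisper.cpp transcription times for the same input (values copied
# from the results above; seconds, fewer is better).
times = {"medium.en": 807.18, "small.en": 289.74, "base.en": 96.98}
base = times["base.en"]
for model, seconds in times.items():
    print(f"{model}: {seconds:8.2f} s ({seconds / base:.1f}x base.en)")
# medium.en takes ~8.3x as long as base.en; small.en ~3.0x.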

Scikit-Learn

Benchmark: Sparse Random Projections / 100 Iterations

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 401.31 (SE +/- 2.81, N = 3). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Scikit-Learn

Benchmark: 20 Newsgroups / Logistic Regression

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 28.22 (SE +/- 0.07, N = 3). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Scikit-Learn

Benchmark: Isotonic / Perturbed Logarithm

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 1384.03 (SE +/- 6.24, N = 3). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Scikit-Learn

Benchmark: Covertype Dataset Benchmark

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 270.96 (SE +/- 2.26, N = 3). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Scikit-Learn

Benchmark: Isotonic / Pathological

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 3114.64 (SE +/- 12.71, N = 3). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Scikit-Learn

Benchmark: Plot Incremental PCA

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 49.61 (SE +/- 0.66, N = 15). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Scikit-Learn

Benchmark: Isotonic / Logistic

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 1123.36 (SE +/- 4.14, N = 3). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Scikit-Learn

Benchmark: Feature Expansions

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 90.53 (SE +/- 0.70, N = 3). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Scikit-Learn

Benchmark: Plot Hierarchical

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 139.00 (SE +/- 0.48, N = 3). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Scikit-Learn

Benchmark: Text Vectorizers

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 41.82 (SE +/- 0.09, N = 3). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Scikit-Learn

Benchmark: Plot Neighbors

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 123.61 (SE +/- 1.25, N = 12). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Scikit-Learn

Benchmark: Plot Ward

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 40.58 (SE +/- 0.09, N = 3). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Scikit-Learn

Benchmark: Sparsify

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 71.65 (SE +/- 1.01, N = 3). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Scikit-Learn

Benchmark: Lasso

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 237.29 (SE +/- 1.50, N = 3). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Scikit-Learn

Benchmark: Tree

Scikit-Learn 1.2.2. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 37.43 (SE +/- 0.42, N = 3). 1. (F9X) gfortran options: -O3 -fopenmp -fno-tree-vectorize -lm -lpthread -lgfortran -lc

Mlpack Benchmark

Benchmark: scikit_linearridgeregression

Mlpack Benchmark. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 1.08 (SE +/- 0.01, N = 15)

Mlpack Benchmark

Benchmark: scikit_svm

Mlpack Benchmark. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 14.12 (SE +/- 0.18, N = 3)

Mlpack Benchmark

Benchmark: scikit_qda

Mlpack Benchmark. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 31.30 (SE +/- 0.13, N = 3)

Mlpack Benchmark

Benchmark: scikit_ica

Mlpack Benchmark. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 29.93 (SE +/- 0.04, N = 3)

AI Benchmark Alpha

Device AI Score

AI Benchmark Alpha 0.1.2. Score, More Is Better. ml-benchmark2-12-25-23: 6524

AI Benchmark Alpha

Device Training Score

AI Benchmark Alpha 0.1.2. Score, More Is Better. ml-benchmark2-12-25-23: 3592

AI Benchmark Alpha

Device Inference Score

AI Benchmark Alpha 0.1.2. Score, More Is Better. ml-benchmark2-12-25-23: 2932
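
In this result the composite Device AI Score is exactly the sum of the training and inference scores; a trivial check:

# Composite score = training score + inference score (holds for this run).
assert 3592 + 2932 == 6524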

Numenta Anomaly Benchmark

Detector: Contextual Anomaly Detector OSE

Numenta Anomaly Benchmark 1.1. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 25.41 (SE +/- 0.27, N = 4)

Numenta Anomaly Benchmark

Detector: Bayesian Changepoint

Numenta Anomaly Benchmark 1.1. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 12.75 (SE +/- 0.14, N = 4)

Numenta Anomaly Benchmark

Detector: Earthgecko Skyline

Numenta Anomaly Benchmark 1.1. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 51.84 (SE +/- 0.56, N = 3)

Numenta Anomaly Benchmark

Detector: Windowed Gaussian

Numenta Anomaly Benchmark 1.1. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 4.713 (SE +/- 0.049, N = 15)

Numenta Anomaly Benchmark

Detector: Relative Entropy

Numenta Anomaly Benchmark 1.1. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 8.048 (SE +/- 0.034, N = 3)

Numenta Anomaly Benchmark

Detector: KNN CAD

Numenta Anomaly Benchmark 1.1. Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 100.05 (SE +/- 0.47, N = 3)

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 0.26 (SE +/- 0.00, N = 3; MIN: 0.16 / MAX: 134.43). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 58961.44 (SE +/- 14.39, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Handwritten English Recognition FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 26.55 (SE +/- 0.05, N = 3; MIN: 18.75 / MAX: 163.99). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Handwritten English Recognition FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 602.34 (SE +/- 1.20, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie
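
OpenVINO reports every model twice, as average per-inference latency (ms) and as aggregate throughput (FPS). Multiplying the two hints at how many inference requests ran concurrently; a rough sanity check on the pair above (an inference from the numbers, not something the report states):

# latency (s) * throughput (1/s) ~= number of in-flight inference requests.
latency_ms = 26.55   # Handwritten English Recognition FP16-INT8 on CPU
fps = 602.34         # throughput reported for the same model
print(latency_ms / 1000 * fps)  # ~16.0, suggesting ~16 concurrent requests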

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 0.38 (SE +/- 0.00, N = 3; MIN: 0.21 / MAX: 73.29). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 41501.98 (SE +/- 18.73, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Handwritten English Recognition FP16 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 21.72 (SE +/- 0.09, N = 3; MIN: 14.79 / MAX: 129.44). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Handwritten English Recognition FP16 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 736.26 (SE +/- 3.04, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 5.06 (SE +/- 0.01, N = 3; MIN: 3.44 / MAX: 59.53). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 1579.58 (SE +/- 3.61, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 5.95 (SE +/- 0.00, N = 3; MIN: 3.2 / MAX: 99.8). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 2688.84 (SE +/- 1.97, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 65.69 (SE +/- 0.20, N = 3; MIN: 28.67 / MAX: 156.27). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 121.71 (SE +/- 0.37, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Road Segmentation ADAS FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 15.00 (SE +/- 0.05, N = 3; MIN: 8.89 / MAX: 105.47). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Road Segmentation ADAS FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 532.88 (SE +/- 1.61, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection Retail FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 3.28 (SE +/- 0.00, N = 3; MIN: 2.08 / MAX: 88.57). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection Retail FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 4876.38 (SE +/- 3.56, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 11.53 (SE +/- 0.01, N = 3; MIN: 5.99 / MAX: 113.69). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 1386.49 (SE +/- 0.66, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 4.79 (SE +/- 0.01, N = 3; MIN: 2.93 / MAX: 107.95). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 1667.89 (SE +/- 2.61, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Road Segmentation ADAS FP16 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 18.10 (SE +/- 0.04, N = 3; MIN: 10.53 / MAX: 93.07). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Road Segmentation ADAS FP16 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 441.60 (SE +/- 1.13, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection Retail FP16 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 2.29 (SE +/- 0.00, N = 3; MIN: 1.38 / MAX: 55.46). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection Retail FP16 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 3485.66 (SE +/- 2.28, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 301.34 (SE +/- 0.11, N = 3; MIN: 180.53 / MAX: 466.91). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 26.49 (SE +/- 0.02, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 8.04 (SE +/- 0.03, N = 3; MIN: 3.93 / MAX: 100.58). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 994.76 (SE +/- 3.28, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Person Detection FP32 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 91.56 (SE +/- 0.18, N = 3; MIN: 31.31 / MAX: 170.1). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Person Detection FP32 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 87.33 (SE +/- 0.18, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 91.06 (SE +/- 0.62, N = 3; MIN: 33.94 / MAX: 197.35). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 87.80 (SE +/- 0.59, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenVINO 2023.2.dev. ms, Fewer Is Better. ml-benchmark2-12-25-23: 581.79 (SE +/- 0.48, N = 3; MIN: 290.44 / MAX: 695.5). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenVINO 2023.2.dev. FPS, More Is Better. ml-benchmark2-12-25-23: 13.70 (SE +/- 0.01, N = 3). 1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie

TNN

Target: CPU - Model: SqueezeNet v1.1

TNN 0.3. ms, Fewer Is Better. ml-benchmark2-12-25-23: 179.66 (SE +/- 2.19, N = 15; MIN: 169.37 / MAX: 193.32). 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

TNN

Target: CPU - Model: SqueezeNet v2

TNN 0.3. ms, Fewer Is Better. ml-benchmark2-12-25-23: 42.14 (SE +/- 0.64, N = 15; MIN: 38.56 / MAX: 49.14). 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

TNN

Target: CPU - Model: MobileNet v2

TNN 0.3. ms, Fewer Is Better. ml-benchmark2-12-25-23: 187.48 (SE +/- 1.57, N = 15; MIN: 176.2 / MAX: 252.15). 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

TNN

Target: CPU - Model: DenseNet

TNN 0.3. ms, Fewer Is Better. ml-benchmark2-12-25-23: 2134.43 (SE +/- 8.59, N = 3; MIN: 2002.11 / MAX: 2290.65). 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

NCNN

Target: Vulkan GPU - Model: vision_transformer

NCNN 20230517. ms, Fewer Is Better. ml-benchmark2-12-25-23: 42.15 (SE +/- 0.31, N = 12; MIN: 34.59 / MAX: 476.34). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20230517. ms, Fewer Is Better. ml-benchmark2-12-25-23: 11.29 (SE +/- 0.16, N = 12; MIN: 8.75 / MAX: 379.29). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20230517. ms, Fewer Is Better. ml-benchmark2-12-25-23: 17.22 (SE +/- 0.18, N = 12; MIN: 13.77 / MAX: 381.41). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20230517. ms, Fewer Is Better. ml-benchmark2-12-25-23: 13.39 (SE +/- 0.20, N = 12; MIN: 10.9 / MAX: 369.34). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20230517. ms, Fewer Is Better. ml-benchmark2-12-25-23: 35.09 (SE +/- 0.46, N = 12; MIN: 28.16 / MAX: 385.01). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vision_transformer

NCNN 20230517. ms, Fewer Is Better. ml-benchmark2-12-25-23: 42.19 (SE +/- 0.29, N = 15; MIN: 35.18 / MAX: 582.74). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: regnety_400m

NCNN 20230517. ms, Fewer Is Better. ml-benchmark2-12-25-23: 11.06 (SE +/- 0.14, N = 15; MIN: 8.41 / MAX: 389.46). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20230517. ms, Fewer Is Better. ml-benchmark2-12-25-23: 17.12 (SE +/- 0.14, N = 15; MIN: 13.93 / MAX: 387.87). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet50

NCNN 20230517. ms, Fewer Is Better. ml-benchmark2-12-25-23: 13.00 (SE +/- 0.15, N = 15; MIN: 10.61 / MAX: 388.18). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vgg16

NCNN 20230517. ms, Fewer Is Better. ml-benchmark2-12-25-23: 35.80 (SE +/- 0.47, N = 15; MIN: 28.59 / MAX: 399.5). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: googlenet

NCNN 20230517. ms, Fewer Is Better. ml-benchmark2-12-25-23: 10.38 (SE +/- 0.15, N = 15; MIN: 8.27 / MAX: 421.12). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Mobile Neural Network

Model: inception-v3

Mobile Neural Network 2.1. ms, Fewer Is Better. ml-benchmark2-12-25-23: 22.38 (SE +/- 0.32, N = 3; MIN: 19.4 / MAX: 90.14). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: mobilenet-v1-1.0

Mobile Neural Network 2.1. ms, Fewer Is Better. ml-benchmark2-12-25-23: 2.640 (SE +/- 0.053, N = 3; MIN: 2.3 / MAX: 44.01). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: MobileNetV2_224

Mobile Neural Network 2.1. ms, Fewer Is Better. ml-benchmark2-12-25-23: 3.723 (SE +/- 0.086, N = 3; MIN: 3.23 / MAX: 82.07). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: SqueezeNetV1.0

Mobile Neural Network 2.1. ms, Fewer Is Better. ml-benchmark2-12-25-23: 4.402 (SE +/- 0.049, N = 3; MIN: 3.93 / MAX: 80.86). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: resnet-v2-50

Mobile Neural Network 2.1. ms, Fewer Is Better. ml-benchmark2-12-25-23: 11.36 (SE +/- 0.21, N = 3; MIN: 10.13 / MAX: 74.03). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: mobilenetV3

Mobile Neural Network 2.1. ms, Fewer Is Better. ml-benchmark2-12-25-23: 1.818 (SE +/- 0.014, N = 3; MIN: 1.54 / MAX: 31.92). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: nasnet

Mobile Neural Network 2.1. ms, Fewer Is Better. ml-benchmark2-12-25-23: 11.96 (SE +/- 0.11, N = 3; MIN: 10.36 / MAX: 117.89). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Caffe

Model: GoogleNet - Acceleration: CPU - Iterations: 1000

Caffe 2020-02-13. Milli-Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 626915 (SE +/- 1421.72, N = 3). 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb

Caffe

Model: GoogleNet - Acceleration: CPU - Iterations: 200

Caffe 2020-02-13. Milli-Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 124421 (SE +/- 239.03, N = 3). 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb

Caffe

Model: GoogleNet - Acceleration: CPU - Iterations: 100

Caffe 2020-02-13. Milli-Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 63606 (SE +/- 328.59, N = 3). 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb

Caffe

Model: AlexNet - Acceleration: CPU - Iterations: 1000

Caffe 2020-02-13. Milli-Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 243445 (SE +/- 481.92, N = 3). 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb

Caffe

Model: AlexNet - Acceleration: CPU - Iterations: 200

Caffe 2020-02-13. Milli-Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 49227 (SE +/- 122.11, N = 3). 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb

Caffe

Model: AlexNet - Acceleration: CPU - Iterations: 100

Caffe 2020-02-13. Milli-Seconds, Fewer Is Better. ml-benchmark2-12-25-23: 24893 (SE +/- 159.49, N = 3). 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb
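
Caffe reports total wall time for the full iteration count, so dividing by the iteration count gives a per-iteration cost and shows the near-linear scaling across the 100/200/1000-iteration runs; a quick sketch using the totals above:

# Per-iteration cost from the reported totals (total milliseconds / iterations).
totals_ms = {
    ("GoogleNet", 1000): 626915,
    ("GoogleNet", 200): 124421,
    ("GoogleNet", 100): 63606,
    ("AlexNet", 1000): 243445,
    ("AlexNet", 200): 49227,
    ("AlexNet", 100): 24893,
}
for (model, iterations), total in totals_ms.items():
    print(f"{model} x{iterations}: {total / iterations:.1f} ms/iteration")
# GoogleNet stays around 622-636 ms/iteration, AlexNet around 243-249.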

spaCy

Model: en_core_web_trf

spaCy 3.4.1. tokens/sec, More Is Better. ml-benchmark2-12-25-23: 2354 (SE +/- 24.52, N = 3)

spaCy

Model: en_core_web_lg

spaCy 3.4.1. tokens/sec, More Is Better. ml-benchmark2-12-25-23: 19452 (SE +/- 113.01, N = 3)

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 56.81 (SE +/- 0.08, N = 3)

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 17.60 (SE +/- 0.03, N = 3)
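
DeepSparse reports each workload both as ms/batch and as items/sec. In the synchronous single-stream scenario the two are reciprocals of each other (batch size 1 assumed here), which makes a handy consistency check:

# Single-stream consistency check: items/sec ~= 1000 / (ms/batch).
ms_per_batch = 56.81  # NLP Token Classification, synchronous single-stream
print(1000 / ms_per_batch)  # ~17.60 items/sec, matching the throughput above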

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 356.97 (SE +/- 0.71, N = 3)

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 22.38 (SE +/- 0.05, N = 3)

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 9.9862 (SE +/- 0.0082, N = 3)

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 100.11 (SE +/- 0.08, N = 3)

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 18.66 (SE +/- 0.03, N = 3)

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 428.23 (SE +/- 0.62, N = 3)

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 37.78 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 26.46 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 228.46 (SE +/- 1.03, N = 3)

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 34.93 (SE +/- 0.16, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 9.1055 (SE +/- 0.0104, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 109.77 (SE +/- 0.13, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 42.70 (SE +/- 0.06, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 187.27 (SE +/- 0.27, N = 3)

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 10.50 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 95.22 (SE +/- 0.07, N = 3)

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 62.42 (SE +/- 0.07, N = 3)

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 128.09 (SE +/- 0.13, N = 3)

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 5.2028 (SE +/- 0.0097, N = 3)

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 192.13 (SE +/- 0.36, N = 3)

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 29.59 (SE +/- 0.03, N = 3)

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 270.26 (SE +/- 0.30, N = 3)

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 56.81 (SE +/- 0.04, N = 3)

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 17.60 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 289.50 (SE +/- 0.25, N = 3)

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 27.63 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 10.62 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 94.15 (SE +/- 0.13, N = 3)

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 66.46 (SE +/- 0.17, N = 3)

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 120.31 (SE +/- 0.30, N = 3)

Neural Magic DeepSparse

Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 0.8513 (SE +/- 0.0017, N = 3)

Neural Magic DeepSparse

Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 1172.15 (SE +/- 2.24, N = 3)

Neural Magic DeepSparse

Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 3.0495 (SE +/- 0.0077, N = 3)

Neural Magic DeepSparse

Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 2614.24 (SE +/- 6.37, N = 3)

Neural Magic DeepSparse

Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. ms/batch, Fewer Is Better. ml-benchmark2-12-25-23: 5.2053 (SE +/- 0.0090, N = 3)

Neural Magic DeepSparse

Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6. items/sec, More Is Better. ml-benchmark2-12-25-23: 192.04 (SE +/- 0.33, N = 3)

Neural Magic DeepSparse

Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6, ml-benchmark2-12-25-23: 29.60 ms/batch (fewer is better; SE +/- 0.01, N = 3)

Neural Magic DeepSparse

Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6, ml-benchmark2-12-25-23: 270.15 items/sec (more is better; SE +/- 0.07, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6, ml-benchmark2-12-25-23: 3.6930 ms/batch (fewer is better; SE +/- 0.0023, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6, ml-benchmark2-12-25-23: 270.67 items/sec (more is better; SE +/- 0.17, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6, ml-benchmark2-12-25-23: 8.5518 ms/batch (fewer is better; SE +/- 0.0005, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6, ml-benchmark2-12-25-23: 934.08 items/sec (more is better; SE +/- 0.06, N = 3)

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6, ml-benchmark2-12-25-23: 56.96 ms/batch (fewer is better; SE +/- 0.03, N = 3)

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream

Neural Magic DeepSparse 1.6, ml-benchmark2-12-25-23: 17.55 items/sec (more is better; SE +/- 0.01, N = 3)

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6, ml-benchmark2-12-25-23: 361.49 ms/batch (fewer is better; SE +/- 0.63, N = 3)

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.6, ml-benchmark2-12-25-23: 22.07 items/sec (more is better; SE +/- 0.03, N = 3)
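
For context on how such a number can be reproduced outside the Phoronix harness, here is a minimal, hedged sketch of a DeepSparse timing loop in Python. It uses the engine's documented compile_model/run API; the "model.onnx" path and the input shape are placeholder assumptions, not details taken from this result.

import time

import numpy as np
from deepsparse import compile_model

BATCH = 1
engine = compile_model("model.onnx", batch_size=BATCH)  # placeholder model path
inputs = [np.random.rand(BATCH, 3, 224, 224).astype(np.float32)]  # assumed shape

engine.run(inputs)  # warm-up run
start = time.perf_counter()
iters = 100
for _ in range(iters):
    engine.run(inputs)
elapsed = time.perf_counter() - start
print(f"{elapsed / iters * 1000:.2f} ms/batch, {iters * BATCH / elapsed:.1f} items/sec")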

TensorFlow

Device: CPU - Batch Size: 512 - Model: ResNet-50

TensorFlow 2.12, ml-benchmark2-12-25-23: 34.09 images/sec (more is better; SE +/- 0.00, N = 3)

TensorFlow

Device: CPU - Batch Size: 512 - Model: GoogLeNet

TensorFlow 2.12, ml-benchmark2-12-25-23: 110.88 images/sec (more is better; SE +/- 0.03, N = 3)

TensorFlow

Device: CPU - Batch Size: 256 - Model: ResNet-50

TensorFlow 2.12, ml-benchmark2-12-25-23: 34.24 images/sec (more is better; SE +/- 0.00, N = 3)

TensorFlow

Device: CPU - Batch Size: 256 - Model: GoogLeNet

TensorFlow 2.12, ml-benchmark2-12-25-23: 112.12 images/sec (more is better; SE +/- 0.03, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: ResNet-50

TensorFlow 2.12, ml-benchmark2-12-25-23: 35.45 images/sec (more is better; SE +/- 0.01, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: GoogLeNet

TensorFlow 2.12, ml-benchmark2-12-25-23: 120.40 images/sec (more is better; SE +/- 0.09, N = 3)

TensorFlow

Device: CPU - Batch Size: 32 - Model: ResNet-50

TensorFlow 2.12, ml-benchmark2-12-25-23: 36.10 images/sec (more is better; SE +/- 0.02, N = 3)

TensorFlow

Device: CPU - Batch Size: 32 - Model: GoogLeNet

TensorFlow 2.12, ml-benchmark2-12-25-23: 127.29 images/sec (more is better; SE +/- 0.10, N = 3)

TensorFlow

Device: CPU - Batch Size: 16 - Model: ResNet-50

TensorFlow 2.12, ml-benchmark2-12-25-23: 35.90 images/sec (more is better; SE +/- 0.04, N = 3)

TensorFlow

Device: CPU - Batch Size: 16 - Model: GoogLeNet

TensorFlow 2.12, ml-benchmark2-12-25-23: 127.79 images/sec (more is better; SE +/- 0.16, N = 3)

TensorFlow

Device: CPU - Batch Size: 512 - Model: AlexNet

TensorFlow 2.12, ml-benchmark2-12-25-23: 400.77 images/sec (more is better; SE +/- 0.13, N = 3)

TensorFlow

Device: CPU - Batch Size: 256 - Model: AlexNet

TensorFlow 2.12, ml-benchmark2-12-25-23: 386.97 images/sec (more is better; SE +/- 0.26, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: AlexNet

TensorFlow 2.12, ml-benchmark2-12-25-23: 297.39 images/sec (more is better; SE +/- 0.16, N = 3)

TensorFlow

Device: CPU - Batch Size: 512 - Model: VGG-16

TensorFlow 2.12, ml-benchmark2-12-25-23: 19.13 images/sec (more is better; SE +/- 0.01, N = 3)

TensorFlow

Device: CPU - Batch Size: 32 - Model: AlexNet

TensorFlow 2.12, ml-benchmark2-12-25-23: 215.96 images/sec (more is better; SE +/- 0.12, N = 3)

TensorFlow

Device: CPU - Batch Size: 256 - Model: VGG-16

TensorFlow 2.12, ml-benchmark2-12-25-23: 19.00 images/sec (more is better; SE +/- 0.01, N = 3)

TensorFlow

Device: CPU - Batch Size: 16 - Model: AlexNet

TensorFlow 2.12, ml-benchmark2-12-25-23: 140.39 images/sec (more is better; SE +/- 0.22, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: VGG-16

TensorFlow 2.12, ml-benchmark2-12-25-23: 18.48 images/sec (more is better; SE +/- 0.01, N = 3)

TensorFlow

Device: CPU - Batch Size: 32 - Model: VGG-16

TensorFlow 2.12, ml-benchmark2-12-25-23: 17.71 images/sec (more is better; SE +/- 0.00, N = 3)

TensorFlow

Device: CPU - Batch Size: 16 - Model: VGG-16

TensorFlow 2.12, ml-benchmark2-12-25-23: 16.48 images/sec (more is better; SE +/- 0.00, N = 3)
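
The TensorFlow figures above are sustained inference throughput at a fixed batch size. A minimal sketch of that style of measurement, assuming only the tensorflow pip package (random weights and synthetic data, so it is illustrative rather than the exact Phoronix harness):

import time

import numpy as np
import tensorflow as tf

BATCH = 32  # the results above sweep batch sizes 16 through 512
model = tf.keras.applications.ResNet50(weights=None)  # untrained ResNet-50
images = np.random.rand(BATCH, 224, 224, 3).astype(np.float32)

model(images, training=False)  # warm-up, builds the graph
start = time.perf_counter()
iters = 20
for _ in range(iters):
    model(images, training=False)
elapsed = time.perf_counter() - start
print(f"{iters * BATCH / elapsed:.2f} images/sec at batch {BATCH}")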

PyTorch

Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l

PyTorch 2.1, ml-benchmark2-12-25-23: 10.42 batches/sec (more is better; SE +/- 0.04, N = 3; MIN: 8.29 / MAX: 11.15)

PyTorch

Device: CPU - Batch Size: 256 - Model: Efficientnet_v2_l

PyTorch 2.1, ml-benchmark2-12-25-23: 10.47 batches/sec (more is better; SE +/- 0.11, N = 4; MIN: 8.19 / MAX: 11.25)

PyTorch

Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l

PyTorch 2.1, ml-benchmark2-12-25-23: 10.43 batches/sec (more is better; SE +/- 0.14, N = 3; MIN: 8.26 / MAX: 11.26)

PyTorch

Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l

PyTorch 2.1, ml-benchmark2-12-25-23: 10.45 batches/sec (more is better; SE +/- 0.11, N = 3; MIN: 8.22 / MAX: 11.22)

PyTorch

Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l

PyTorch 2.1, ml-benchmark2-12-25-23: 10.29 batches/sec (more is better; SE +/- 0.07, N = 3; MIN: 8.22 / MAX: 11.02)

PyTorch

Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l

PyTorch 2.1, ml-benchmark2-12-25-23: 13.70 batches/sec (more is better; SE +/- 0.08, N = 3; MIN: 10.96 / MAX: 14.13)

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-152

PyTorch 2.1, ml-benchmark2-12-25-23: 17.89 batches/sec (more is better; SE +/- 0.01, N = 3; MIN: 13.88 / MAX: 18.4)

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-152

PyTorch 2.1, ml-benchmark2-12-25-23: 17.71 batches/sec (more is better; SE +/- 0.13, N = 3; MIN: 14.07 / MAX: 18.34)

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-152

PyTorch 2.1, ml-benchmark2-12-25-23: 17.98 batches/sec (more is better; SE +/- 0.07, N = 3; MIN: 13.29 / MAX: 18.54)

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-50

PyTorch 2.1, ml-benchmark2-12-25-23: 44.93 batches/sec (more is better; SE +/- 0.01, N = 3; MIN: 34.06 / MAX: 46.17)

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-152

PyTorch 2.1, ml-benchmark2-12-25-23: 17.78 batches/sec (more is better; SE +/- 0.08, N = 3; MIN: 13.52 / MAX: 18.49)

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-50

PyTorch 2.1, ml-benchmark2-12-25-23: 45.77 batches/sec (more is better; SE +/- 0.52, N = 3; MIN: 35.6 / MAX: 47.88)

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-152

PyTorch 2.1, ml-benchmark2-12-25-23: 17.88 batches/sec (more is better; SE +/- 0.24, N = 3; MIN: 12.93 / MAX: 18.62)

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-50

PyTorch 2.1, ml-benchmark2-12-25-23: 45.17 batches/sec (more is better; SE +/- 0.42, N = 3; MIN: 35.7 / MAX: 47.3)

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-50

PyTorch 2.1, ml-benchmark2-12-25-23: 45.71 batches/sec (more is better; SE +/- 0.19, N = 3; MIN: 38.99 / MAX: 47.61)

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-50

PyTorch 2.1, ml-benchmark2-12-25-23: 45.48 batches/sec (more is better; SE +/- 0.34, N = 3; MIN: 33.75 / MAX: 47.46)

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-152

PyTorch 2.1, ml-benchmark2-12-25-23: 26.53 batches/sec (more is better; SE +/- 0.25, N = 3; MIN: 20.93 / MAX: 27.56)

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-50

PyTorch 2.1, ml-benchmark2-12-25-23: 67.87 batches/sec (more is better; SE +/- 0.22, N = 3; MIN: 53.01 / MAX: 70.77)
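
The PyTorch scores are inference batches per second. A comparable minimal sketch with torchvision; the synthetic data and untrained weights are assumptions, since the real test uses its own harness:

import time

import torch
import torchvision

BATCH = 32
model = torchvision.models.resnet50().eval()  # CPU inference, untrained weights
x = torch.randn(BATCH, 3, 224, 224)

with torch.no_grad():
    model(x)  # warm-up
    start = time.perf_counter()
    iters = 20
    for _ in range(iters):
        model(x)
    elapsed = time.perf_counter() - start
print(f"{iters / elapsed:.2f} batches/sec at batch {BATCH}")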

TensorFlow Lite

Model: Inception ResNet V2

TensorFlow Lite 2022-05-18, ml-benchmark2-12-25-23: 22832.8 microseconds (fewer is better; SE +/- 111.48, N = 3)

TensorFlow Lite

Model: Mobilenet Quant

TensorFlow Lite 2022-05-18, ml-benchmark2-12-25-23: 1985.59 microseconds (fewer is better; SE +/- 11.21, N = 3)

TensorFlow Lite

Model: Mobilenet Float

TensorFlow Lite 2022-05-18, ml-benchmark2-12-25-23: 1293.76 microseconds (fewer is better; SE +/- 2.11, N = 3)

TensorFlow Lite

Model: NASNet Mobile

TensorFlow Lite 2022-05-18, ml-benchmark2-12-25-23: 10670.2 microseconds (fewer is better; SE +/- 51.78, N = 3)

TensorFlow Lite

Model: Inception V4

TensorFlow Lite 2022-05-18, ml-benchmark2-12-25-23: 22388.9 microseconds (fewer is better; SE +/- 52.61, N = 3)

TensorFlow Lite

Model: SqueezeNet

TensorFlow Lite 2022-05-18, ml-benchmark2-12-25-23: 1798.07 microseconds (fewer is better; SE +/- 7.51, N = 3)
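
The TensorFlow Lite results are per-inference latency in microseconds. A minimal sketch using the stock tf.lite.Interpreter; "mobilenet.tflite" is a placeholder path standing in for any float model:

import time

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="mobilenet.tflite")  # placeholder
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
interpreter.set_tensor(inp["index"], np.random.rand(*inp["shape"]).astype(np.float32))

interpreter.invoke()  # warm-up
start = time.perf_counter()
iters = 50
for _ in range(iters):
    interpreter.invoke()
elapsed = time.perf_counter() - start
print(f"{elapsed / iters * 1e6:.1f} microseconds per inference")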

RNNoise

RNNoise 2020-06-28, ml-benchmark2-12-25-23: 14.60 seconds (fewer is better; SE +/- 0.04, N = 3). 1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden

R Benchmark

R Benchmark, ml-benchmark2-12-25-23: 0.1133 seconds (fewer is better; SE +/- 0.0003, N = 3). 1. R scripting front-end version 4.1.2 (2021-11-01)

DeepSpeech

Acceleration: CPU

DeepSpeech 0.6, ml-benchmark2-12-25-23: 52.49 seconds (fewer is better; SE +/- 0.73, N = 3)
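
The DeepSpeech figure is wall-clock seconds to transcribe a fixed audio clip on the CPU. A hedged sketch of the 0.6-era Python API follows; the Model() signature changed between releases (0.6 took a beam-width argument), and both file paths are placeholders:

import time
import wave

import numpy as np
from deepspeech import Model

ds = Model("deepspeech-0.6.0-models.pbmm", 500)  # model path, beam width (0.6 API)
with wave.open("audio_16k_mono.wav", "rb") as w:  # 16-bit, 16 kHz mono expected
    audio = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

start = time.perf_counter()
text = ds.stt(audio)
print(f"{time.perf_counter() - start:.2f} s: {text}")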

Numpy Benchmark

Numpy Benchmark, ml-benchmark2-12-25-23: 721.25 (score; more is better; SE +/- 3.15, N = 3)
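
The Numpy score aggregates timings of many array kernels into a single figure. One representative kernel, timed the same basic way:

import time

import numpy as np

a = np.random.rand(2048, 2048)
b = np.random.rand(2048, 2048)

a @ b  # warm-up
start = time.perf_counter()
for _ in range(5):
    a @ b
print(f"{(time.perf_counter() - start) / 5 * 1000:.1f} ms per 2048x2048 matmul")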

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 702.32 ms (fewer is better; SE +/- 6.22, N = 3; MIN: 612.77). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 1343.53 ms (fewer is better; SE +/- 12.14, N = 3; MIN: 1191.31). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 704.86 ms (fewer is better; SE +/- 4.11, N = 3; MIN: 611.57). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 1.61432 ms (fewer is better; SE +/- 0.01893, N = 3; MIN: 1.43). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 2.60894 ms (fewer is better; SE +/- 0.02456, N = 3; MIN: 2.23). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 1.49610 ms (fewer is better; SE +/- 0.01461, N = 15; MIN: 1.2). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 1367.75 ms (fewer is better; SE +/- 3.41, N = 3; MIN: 1186.82). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 707.84 ms (fewer is better; SE +/- 10.07, N = 3; MIN: 607.92). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 1358.07 ms (fewer is better; SE +/- 16.16, N = 4; MIN: 1187.15). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 0.731894 ms (fewer is better; SE +/- 0.007344, N = 15; MIN: 0.62). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 0.505580 ms (fewer is better; SE +/- 0.007201, N = 3; MIN: 0.43). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 5.08690 ms (fewer is better; SE +/- 0.06167, N = 4; MIN: 4.34). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 2.88833 ms (fewer is better; SE +/- 0.03209, N = 15; MIN: 2.52). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 3.43419 ms (fewer is better; SE +/- 0.03067, N = 15; MIN: 2.56). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 5.27943 ms (fewer is better; SE +/- 0.04391, N = 3; MIN: 4.76). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 1.49285 ms (fewer is better; SE +/- 0.01642, N = 3; MIN: 1.23). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 0.772163 ms (fewer is better; SE +/- 0.008040, N = 3; MIN: 0.64). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 3.81408 ms (fewer is better; SE +/- 0.05089, N = 15; MIN: 3.26). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 2.13148 ms (fewer is better; SE +/- 0.02341, N = 3; MIN: 1.7). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
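
These oneDNN timings come from its benchdnn harness driving the named primitives (convolution, deconvolution, inner product, RNN) at the listed data types. A hedged sketch of invoking it from Python; the driver flag, performance mode, and the ResNet-style problem descriptor are assumptions that may need adjusting for a given benchdnn build:

import subprocess

out = subprocess.run(
    ["benchdnn", "--conv", "--mode=P", "mb32ic3ih224oc64oh112kh7sh2ph3"],
    capture_output=True, text=True, check=True,
)
print(out.stdout)  # performance lines report min and avg times in ms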

LeelaChessZero

Backend: BLAS

LeelaChessZero 0.30, ml-benchmark2-12-25-23: 175 Nodes Per Second (more is better; SE +/- 2.31, N = 3). 1. (CXX) g++ options: -flto -pthread
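
lc0 ships a built-in benchmark mode that searches from a fixed position and reports nodes per second. A hedged sketch of driving it from Python; the exact flags vary between lc0 releases, so treat the command line as an assumption:

import subprocess

out = subprocess.run(
    ["lc0", "benchmark", "--backend=blas"],  # BLAS backend, as in this result
    capture_output=True, text=True, check=True,
)
print(out.stdout)  # summary includes total nodes and nodes per second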

OpenCV

Test: DNN - Deep Neural Network

OpenCV 4.7, ml-benchmark2-12-25-23: 31437 ms (fewer is better; SE +/- 829.30, N = 15). 1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
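
The OpenCV DNN test times forward passes through its dnn module. A minimal sketch with a placeholder ONNX model (cv2.dnn also loads Caffe, TensorFlow, and Darknet formats):

import time

import cv2
import numpy as np

net = cv2.dnn.readNetFromONNX("model.onnx")  # placeholder model path
blob = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed input shape
net.setInput(blob)

net.forward()  # warm-up
start = time.perf_counter()
for _ in range(20):
    net.forward()
print(f"{(time.perf_counter() - start) / 20 * 1000:.2f} ms per forward pass")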

NCNN

Target: Vulkan GPU - Model: FastestDet

NCNN 20230517, ml-benchmark2-12-25-23: 4.76 ms (fewer is better; SE +/- 0.28, N = 12; MIN: 2.77 / MAX: 309.73). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20230517, ml-benchmark2-12-25-23: 8.79 ms (fewer is better; SE +/- 0.17, N = 12; MIN: 6.89 / MAX: 385.84). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20230517, ml-benchmark2-12-25-23: 5.20 ms (fewer is better; SE +/- 0.21, N = 12; MIN: 4.2 / MAX: 274.71). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20230517, ml-benchmark2-12-25-23: 6.64 ms (fewer is better; SE +/- 0.19, N = 12; MIN: 5.3 / MAX: 394.05). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20230517, ml-benchmark2-12-25-23: 10.69 ms (fewer is better; SE +/- 0.23, N = 12; MIN: 8.16 / MAX: 460.92). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20230517, ml-benchmark2-12-25-23: 1.60 ms (fewer is better; SE +/- 0.03, N = 12; MIN: 1.29 / MAX: 9.03). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20230517, ml-benchmark2-12-25-23: 5.56 ms (fewer is better; SE +/- 0.21, N = 12; MIN: 4.32 / MAX: 601.76). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20230517, ml-benchmark2-12-25-23: 3.97 ms (fewer is better; SE +/- 0.14, N = 12; MIN: 3.13 / MAX: 314.65). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20230517, ml-benchmark2-12-25-23: 4.36 ms (fewer is better; SE +/- 0.17, N = 12; MIN: 3.67 / MAX: 375.41). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

NCNN 20230517, ml-benchmark2-12-25-23: 4.40 ms (fewer is better; SE +/- 0.18, N = 12; MIN: 3.22 / MAX: 352.45). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

NCNN 20230517, ml-benchmark2-12-25-23: 4.29 ms (fewer is better; SE +/- 0.13, N = 12; MIN: 3.31 / MAX: 320.74). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20230517, ml-benchmark2-12-25-23: 10.67 ms (fewer is better; SE +/- 0.26, N = 12; MIN: 8.56 / MAX: 353.75). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: FastestDet

NCNN 20230517, ml-benchmark2-12-25-23: 5.08 ms (fewer is better; SE +/- 0.22, N = 15; MIN: 3 / MAX: 221.01). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: squeezenet_ssd

NCNN 20230517, ml-benchmark2-12-25-23: 8.75 ms (fewer is better; SE +/- 0.14, N = 15; MIN: 6.94 / MAX: 418.75). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: alexnet

NCNN 20230517, ml-benchmark2-12-25-23: 5.09 ms (fewer is better; SE +/- 0.15, N = 15; MIN: 4.18 / MAX: 317.53). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet18

NCNN 20230517, ml-benchmark2-12-25-23: 6.70 ms (fewer is better; SE +/- 0.17, N = 15; MIN: 5.33 / MAX: 359.17). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: blazeface

NCNN 20230517, ml-benchmark2-12-25-23: 1.70 ms (fewer is better; SE +/- 0.09, N = 15; MIN: 1.16 / MAX: 246.39). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20230517, ml-benchmark2-12-25-23: 5.30 ms (fewer is better; SE +/- 0.13, N = 15; MIN: 4.07 / MAX: 351.52). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20230517, ml-benchmark2-12-25-23: 3.86 ms (fewer is better; SE +/- 0.13, N = 15; MIN: 3.06 / MAX: 334.53). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20230517, ml-benchmark2-12-25-23: 4.42 ms (fewer is better; SE +/- 0.15, N = 15; MIN: 3.7 / MAX: 373.85). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20230517, ml-benchmark2-12-25-23: 4.25 ms (fewer is better; SE +/- 0.14, N = 15; MIN: 3.54 / MAX: 358.68). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20230517, ml-benchmark2-12-25-23: 4.52 ms (fewer is better; SE +/- 0.15, N = 15; MIN: 3.49 / MAX: 453.27). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: mobilenet

NCNN 20230517, ml-benchmark2-12-25-23: 10.64 ms (fewer is better; SE +/- 0.22, N = 15; MIN: 8.78 / MAX: 504.79). 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
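
The NCNN results come from ncnn's bundled benchncnn tool, which loops each bundled model and reports per-model times. A hedged sketch of calling it from Python; the positional arguments (loop count, threads, powersave, GPU device) follow common benchncnn builds and are an assumption:

import subprocess

# 8 loops, 16 threads, powersave off, GPU device -1 (CPU only)
out = subprocess.run(["benchncnn", "8", "16", "0", "-1"],
                     capture_output=True, text=True, check=True)
print(out.stdout)  # per-model min/max/avg times in ms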

Mobile Neural Network

Model: squeezenetv1.1

Mobile Neural Network 2.1, ml-benchmark2-12-25-23: 2.770 ms (fewer is better; SE +/- 0.098, N = 3; MIN: 2.36 / MAX: 51.44). 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
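
A hedged sketch of timing one Mobile Neural Network (MNN) inference from its Python package; the Interpreter/createSession/runSession calls follow MNN's own Python examples, and "squeezenet.mnn" is a placeholder path:

import time

import MNN

interpreter = MNN.Interpreter("squeezenet.mnn")  # placeholder converted model
session = interpreter.createSession()

interpreter.runSession(session)  # warm-up
start = time.perf_counter()
for _ in range(50):
    interpreter.runSession(session)
print(f"{(time.perf_counter() - start) / 50 * 1000:.3f} ms per inference")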

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 0.325432 ms (fewer is better; SE +/- 0.008133, N = 12; MIN: 0.23). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.3, ml-benchmark2-12-25-23: 0.542833 ms (fewer is better; SE +/- 0.014251, N = 15; MIN: 0.37). 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl


Phoronix Test Suite v10.8.4