m1test

AMD Ryzen 5 7600X 6-Core testing with an ASUS TUF GAMING B650-PLUS WIFI (0823 BIOS) and Gigabyte NVIDIA GeForce RTX 4060 Ti 16GB on Ubuntu 22.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2307293-NE-M1TEST40932&grr
System details for firstmachinetest:

Processor: AMD Ryzen 5 7600X 6-Core @ 4.70GHz (6 Cores / 12 Threads)
Motherboard: ASUS TUF GAMING B650-PLUS WIFI (0823 BIOS)
Chipset: AMD Device 14d8
Memory: 32GB
Disk: Western Digital WD_BLACK SN850X 2000GB
Graphics: Gigabyte NVIDIA GeForce RTX 4060 Ti 16GB
Audio: NVIDIA Device 22bd
Monitor: DELL P2720DC
Network: Realtek RTL8125 2.5GbE + Realtek Device b852
OS: Ubuntu 22.04
Kernel: 5.19.0-50-generic (x86_64)
Desktop: GNOME Shell 42.9
Display Server: X Server 1.21.1.4
Display Driver: NVIDIA 535.86.05
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.2.128
Vulkan: 1.3.224
Compiler: GCC 11.3.0
File-System: ext4
Screen Resolution: 2560x1440

OpenBenchmarking.org notes:
- Transparent Huge Pages: madvise
- GCC configure options: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0xa601203
- BAR1 / Visible vRAM Size: 16384 MiB
- vBIOS Version: 95.06.25.00.ac
- GPU Compute Cores: 4352
- Python 3.10.6
- Security: itlb_multihit: Not affected; l1tf: Not affected; mds: Not affected; meltdown: Not affected; mmio_stale_data: Not affected; retbleed: Not affected; spec_store_bypass: Mitigation of SSB disabled via prctl; spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization; spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected; srbds: Not affected; tsx_async_abort: Not affected
Result overview for firstmachinetest (one line per test; units match the detailed results below; each DeepSparse configuration is reported twice, once as ms/batch and once as items/sec):

tensorflow: CPU - 256 - VGG-16: 9.09
shoc: OpenCL - Max SP Flops: 23395.7
tensorflow: CPU - 256 - ResNet-50: 25.12
tensorflow: CPU - 64 - VGG-16: 9.05
tensorflow: CPU - 512 - GoogLeNet: 74.51
mnn: inception-v3: 16.040
mnn: mobilenet-v1-1.0: 2.797
mnn: MobileNetV2_224: 1.689
mnn: SqueezeNetV1.0: 2.477
mnn: resnet-v2-50: 11.920
mnn: squeezenetv1.1: 1.388
mnn: mobilenetV3: 0.759
mnn: nasnet: 5.894
tensorflow: CPU - 32 - VGG-16: 8.87
tensorflow: CPU - 256 - GoogLeNet: 74.32
lczero: BLAS: 1383
ai-benchmark: Device AI Score: 4305
ai-benchmark: Device Training Score: 2496
ai-benchmark: Device Inference Score: 1809
tensorflow: CPU - 512 - AlexNet: 193.91
tensorflow: CPU - 64 - ResNet-50: 25.28
ncnn: Vulkan GPU - FastestDet: 3.06
ncnn: Vulkan GPU - vision_transformer: 719.18
ncnn: Vulkan GPU - regnety_400m: 4.46
ncnn: Vulkan GPU - squeezenet_ssd: 13.11
ncnn: Vulkan GPU - yolov4-tiny: 19.79
ncnn: Vulkan GPU - resnet50: 20.12
ncnn: Vulkan GPU - alexnet: 18.29
ncnn: Vulkan GPU - resnet18: 8.74
ncnn: Vulkan GPU - vgg16: 68.87
ncnn: Vulkan GPU - googlenet: 9.07
ncnn: Vulkan GPU - blazeface: 1.05
ncnn: Vulkan GPU - efficientnet-b0: 7.92
ncnn: Vulkan GPU - mnasnet: 3.41
ncnn: Vulkan GPU - shufflenet-v2: 2.54
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3: 3.83
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2: 3.33
ncnn: Vulkan GPU - mobilenet: 10.70
tensorflow: CPU - 16 - VGG-16: 8.45
tensorflow-lite: Mobilenet Quant: 2942.86
numenta-nab: KNN CAD: 169.651
tnn: CPU - DenseNet: 2181.402
tensorflow: CPU - 256 - AlexNet: 189.83
tensorflow: CPU - 32 - ResNet-50: 25.51
numpy: 807.43
tensorflow: CPU - 64 - GoogLeNet: 74.56
numenta-nab: Bayesian Changepoint: 18.988
numenta-nab: Earthgecko Skyline: 92.489
deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream (ms/batch): 266.6888
deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream (items/sec): 11.2486
onednn: Recurrent Neural Network Training - f32 - CPU: 2835.96
onednn: Recurrent Neural Network Training - u8s8f32 - CPU: 2834.98
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU: 2834.58
onednn: Recurrent Neural Network Inference - f32 - CPU: 1417.35
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU: 1415.38
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU: 1414.69
tensorflow: CPU - 16 - ResNet-50: 25.69
deepsparse: BERT-Large, NLP Question Answering - Synchronous Single-Stream (ms/batch): 105.1478
deepsparse: BERT-Large, NLP Question Answering - Synchronous Single-Stream (items/sec): 9.5099
ncnn: CPU - FastestDet: 1.86
ncnn: CPU - vision_transformer: 113.91
ncnn: CPU - regnety_400m: 4.58
ncnn: CPU - squeezenet_ssd: 10.37
ncnn: CPU - yolov4-tiny: 13.21
ncnn: CPU - resnet50: 11.87
ncnn: CPU - alexnet: 4.69
ncnn: CPU - resnet18: 6.28
ncnn: CPU - vgg16: 29.96
ncnn: CPU - googlenet: 5.98
ncnn: CPU - blazeface: 0.5
ncnn: CPU - efficientnet-b0: 2.69
ncnn: CPU - mnasnet: 1.56
ncnn: CPU - shufflenet-v2: 1.52
ncnn: CPU-v3-v3 - mobilenet-v3: 1.48
ncnn: CPU-v2-v2 - mobilenet-v2: 1.93
ncnn: CPU - mobilenet: 7.25
numenta-nab: Relative Entropy: 12.833
mlpack: scikit_qda: 33.09
tensorflow-lite: Inception V4: 29141.4
tensorflow-lite: Inception ResNet V2: 26415.7
tensorflow-lite: NASNet Mobile: 5368.28
tensorflow-lite: Mobilenet Float: 1348.48
tensorflow-lite: SqueezeNet: 1920.44
spacy: en_core_web_trf: 1657
spacy: en_core_web_lg: 19413
tensorflow: CPU - 32 - GoogLeNet: 75.55
numenta-nab: Windowed Gaussian: 9.667
deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream (ms/batch): 16.2371
deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream (items/sec): 184.5994
deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream (ms/batch): 8.4159
deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream (items/sec): 118.6612
deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream (ms/batch): 7.6609
deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream (items/sec): 390.9266
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream (ms/batch): 77.7156
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream (items/sec): 38.5969
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream (ms/batch): 335.1708
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream (items/sec): 8.9503
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream (ms/batch): 335.2462
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream (items/sec): 8.9483
deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream (ms/batch): 3.5599
deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream (items/sec): 280.5608
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream (ms/batch): 183.8336
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream (items/sec): 16.3173
tensorflow: CPU - 64 - AlexNet: 167.60
deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream (ms/batch): 28.6962
deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream (items/sec): 34.8419
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream (ms/batch): 61.9364
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream (items/sec): 16.1422
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream (ms/batch): 111.6776
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream (items/sec): 8.9540
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream (ms/batch): 111.8071
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream (items/sec): 8.9435
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream (ms/batch): 63.2478
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream (items/sec): 47.4217
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream (ms/batch): 17.9465
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream (items/sec): 167.0395
deepspeech: CPU: 45.23452
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream (ms/batch): 22.1914
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream (items/sec): 45.0475
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream (ms/batch): 7.9561
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream (items/sec): 125.5946
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream (ms/batch): 39.1426
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream (items/sec): 76.6235
deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream (ms/batch): 14.5927
deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream (items/sec): 68.5040
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream (ms/batch): 26.7995
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream (items/sec): 111.8935
deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream (ms/batch): 59.9713
deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream (items/sec): 50.0100
deepsparse: ResNet-50, Baseline - Synchronous Single-Stream (ms/batch): 11.8910
deepsparse: ResNet-50, Baseline - Synchronous Single-Stream (items/sec): 84.0533
deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream (ms/batch): 59.5696
deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream (items/sec): 50.3374
deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream (ms/batch): 11.8952
deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream (items/sec): 84.0227
deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream (ms/batch): 26.8229
deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream (items/sec): 111.7964
deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream (ms/batch): 22.0448
deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream (items/sec): 45.3506
deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream (ms/batch): 2.3753
deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream (items/sec): 1259.3116
deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream (ms/batch): 22.3672
deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream (items/sec): 44.691
deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream (ms/batch): 1.0581
deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream (items/sec): 943.2536
numenta-nab: Contextual Anomaly Detector OSE: 30.817
mlpack: scikit_ica: 27.50
mlpack: scikit_linearridgeregression: 1.62
tensorflow: CPU - 32 - AlexNet: 142.72
tensorflow: CPU - 16 - GoogLeNet: 77.31
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU: 8.81665
onednn: Deconvolution Batch shapes_1d - f32 - CPU: 5.27941
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU: 1.00945
onednn: IP Shapes 3D - bf16bf16bf16 - CPU: 2.99807
tensorflow: CPU - 16 - AlexNet: 107.99
mlpack: scikit_svm: 14.60
onednn: IP Shapes 1D - bf16bf16bf16 - CPU: 1.53228
onednn: IP Shapes 1D - f32 - CPU: 3.94237
onednn: IP Shapes 1D - u8s8f32 - CPU: 0.731267
rnnoise: 13.411
opencv: DNN - Deep Neural Network: 12915
tnn: CPU - MobileNet v2: 182.512
tnn: CPU - SqueezeNet v1.1: 177.530
onednn: IP Shapes 3D - f32 - CPU: 4.97774
onednn: IP Shapes 3D - u8s8f32 - CPU: 1.11340
shoc: OpenCL - Texture Read Bandwidth: 2697.61
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU: 8.33924
onednn: Convolution Batch Shapes Auto - f32 - CPU: 8.57521
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU: 3.73203
tnn: CPU - SqueezeNet v2: 41.947
shoc: OpenCL - Bus Speed Readback: 13.1906
onednn: Deconvolution Batch shapes_3d - f32 - CPU: 5.24738
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU: 3.12200
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU: 1.30084
shoc: OpenCL - GEMM SGEMM_N: 6415.70
shoc: OpenCL - S3D: 167.736
shoc: OpenCL - Reduction: 262.417
shoc: OpenCL - FFT SP: 726.046
shoc: OpenCL - Triad: 12.9202
shoc: OpenCL - Bus Speed Download: 13.3754
shoc: OpenCL - MD5 Hash: 25.9256
rbenchmark: (no value present in the exported summary)
TensorFlow 2.12 - Device: CPU - Batch Size: 256 - Model: VGG-16
images/sec, More Is Better
firstmachinetest: 9.09 (SE +/- 0.00, N = 3)
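The TensorFlow figures in this result come from timed batched inference. A minimal Python sketch of the idea (not the Phoronix Test Suite harness; the random weights and run count here are arbitrary choices):

import time
import numpy as np
import tensorflow as tf

BATCH = 256  # matches "Batch Size: 256" above

# Random weights are fine for throughput timing; no dataset is needed.
model = tf.keras.applications.VGG16(weights=None)
images = np.random.rand(BATCH, 224, 224, 3).astype("float32")

model.predict(images, verbose=0)  # warm-up (graph build, memory allocation)
runs = 5
start = time.perf_counter()
for _ in range(runs):
    model.predict(images, verbose=0)
elapsed = time.perf_counter() - start
print(f"{runs * BATCH / elapsed:.2f} images/sec")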
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Max SP Flops
GFLOPS, More Is Better
firstmachinetest: 23395.7 (SE +/- 1.82, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

TensorFlow 2.12 - Device: CPU - Batch Size: 256 - Model: ResNet-50
images/sec, More Is Better
firstmachinetest: 25.12 (SE +/- 0.01, N = 3)

TensorFlow 2.12 - Device: CPU - Batch Size: 64 - Model: VGG-16
images/sec, More Is Better
firstmachinetest: 9.05 (SE +/- 0.00, N = 3)

TensorFlow 2.12 - Device: CPU - Batch Size: 512 - Model: GoogLeNet
images/sec, More Is Better
firstmachinetest: 74.51 (SE +/- 0.03, N = 3)

Mobile Neural Network 2.1 - Model: inception-v3
ms, Fewer Is Better
firstmachinetest: 16.04 (SE +/- 0.25, N = 15; MIN: 14.1 / MAX: 37.41)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 2.1 - Model: mobilenet-v1-1.0
ms, Fewer Is Better
firstmachinetest: 2.797 (SE +/- 0.002, N = 15; MIN: 2.76 / MAX: 9.64)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 2.1 - Model: MobileNetV2_224
ms, Fewer Is Better
firstmachinetest: 1.689 (SE +/- 0.012, N = 15; MIN: 1.59 / MAX: 9.2)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 2.1 - Model: SqueezeNetV1.0
ms, Fewer Is Better
firstmachinetest: 2.477 (SE +/- 0.023, N = 15; MIN: 2.35 / MAX: 10.29)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 2.1 - Model: resnet-v2-50
ms, Fewer Is Better
firstmachinetest: 11.92 (SE +/- 0.06, N = 15; MIN: 11.53 / MAX: 38.34)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 2.1 - Model: squeezenetv1.1
ms, Fewer Is Better
firstmachinetest: 1.388 (SE +/- 0.011, N = 15; MIN: 1.33 / MAX: 10.34)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 2.1 - Model: mobilenetV3
ms, Fewer Is Better
firstmachinetest: 0.759 (SE +/- 0.005, N = 15; MIN: 0.72 / MAX: 8.33)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 2.1 - Model: nasnet
ms, Fewer Is Better
firstmachinetest: 5.894 (SE +/- 0.113, N = 15; MIN: 5.21 / MAX: 12.96)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

TensorFlow 2.12 - Device: CPU - Batch Size: 32 - Model: VGG-16
images/sec, More Is Better
firstmachinetest: 8.87 (SE +/- 0.01, N = 3)

TensorFlow 2.12 - Device: CPU - Batch Size: 256 - Model: GoogLeNet
images/sec, More Is Better
firstmachinetest: 74.32 (SE +/- 0.09, N = 3)

LeelaChessZero 0.28 - Backend: BLAS
Nodes Per Second, More Is Better
firstmachinetest: 1383 (SE +/- 14.53, N = 3)
1. (CXX) g++ options: -flto -pthread

AI Benchmark Alpha 0.1.2 - Device AI Score
Score, More Is Better
firstmachinetest: 4305

AI Benchmark Alpha 0.1.2 - Device Training Score
Score, More Is Better
firstmachinetest: 2496

AI Benchmark Alpha 0.1.2 - Device Inference Score
Score, More Is Better
firstmachinetest: 1809
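AI Benchmark Alpha is driven through its pip package (ai-benchmark). A minimal sketch of how a run producing these three scores is started; the package prints the Device Inference, Training, and AI scores when the run finishes, and exact output handling is version-dependent:

from ai_benchmark import AIBenchmark

# Runs the full suite of inference and training tests on the default
# TensorFlow device and prints the three aggregate scores at the end.
benchmark = AIBenchmark()
results = benchmark.run()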
TensorFlow 2.12 - Device: CPU - Batch Size: 512 - Model: AlexNet
images/sec, More Is Better
firstmachinetest: 193.91 (SE +/- 0.03, N = 3)

TensorFlow 2.12 - Device: CPU - Batch Size: 64 - Model: ResNet-50
images/sec, More Is Better
firstmachinetest: 25.28 (SE +/- 0.01, N = 3)

NCNN 20220729 - Target: Vulkan GPU - Model: FastestDet
ms, Fewer Is Better
firstmachinetest: 3.06 (SE +/- 0.06, N = 3; MIN: 2.88 / MAX: 3.34)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU - Model: vision_transformer
ms, Fewer Is Better
firstmachinetest: 719.18 (SE +/- 2.11, N = 3; MIN: 692.24 / MAX: 747.42)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU - Model: regnety_400m
ms, Fewer Is Better
firstmachinetest: 4.46 (SE +/- 0.05, N = 3; MIN: 4.28 / MAX: 4.81)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU - Model: squeezenet_ssd
ms, Fewer Is Better
firstmachinetest: 13.11 (SE +/- 0.03, N = 3; MIN: 12.54 / MAX: 14.35)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU - Model: yolov4-tiny
ms, Fewer Is Better
firstmachinetest: 19.79 (SE +/- 0.27, N = 3; MIN: 18.75 / MAX: 25.36)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU - Model: resnet50
ms, Fewer Is Better
firstmachinetest: 20.12 (SE +/- 0.01, N = 3; MIN: 19.86 / MAX: 20.55)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU - Model: alexnet
ms, Fewer Is Better
firstmachinetest: 18.29 (SE +/- 0.01, N = 3; MIN: 17.98 / MAX: 18.73)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU - Model: resnet18
ms, Fewer Is Better
firstmachinetest: 8.74 (SE +/- 0.02, N = 3; MIN: 8.11 / MAX: 9.13)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU - Model: vgg16
ms, Fewer Is Better
firstmachinetest: 68.87 (SE +/- 0.02, N = 3; MIN: 68.54 / MAX: 69.35)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU - Model: googlenet
ms, Fewer Is Better
firstmachinetest: 9.07 (SE +/- 0.02, N = 3; MIN: 8.48 / MAX: 9.47)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU - Model: blazeface
ms, Fewer Is Better
firstmachinetest: 1.05 (SE +/- 0.01, N = 3; MIN: 1.02 / MAX: 1.22)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU - Model: efficientnet-b0
ms, Fewer Is Better
firstmachinetest: 7.92 (SE +/- 0.04, N = 3; MIN: 7.26 / MAX: 8.31)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU - Model: mnasnet
ms, Fewer Is Better
firstmachinetest: 3.41 (SE +/- 0.05, N = 3; MIN: 3.29 / MAX: 3.91)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU - Model: shufflenet-v2
ms, Fewer Is Better
firstmachinetest: 2.54 (SE +/- 0.07, N = 3; MIN: 2.39 / MAX: 2.92)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
ms, Fewer Is Better
firstmachinetest: 3.83 (SE +/- 0.01, N = 3; MIN: 3.72 / MAX: 4)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
ms, Fewer Is Better
firstmachinetest: 3.33 (SE +/- 0.00, N = 3; MIN: 3.25 / MAX: 3.72)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: Vulkan GPU - Model: mobilenet
ms, Fewer Is Better
firstmachinetest: 10.70 (SE +/- 0.01, N = 3; MIN: 9.96 / MAX: 14.76)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

TensorFlow 2.12 - Device: CPU - Batch Size: 16 - Model: VGG-16
images/sec, More Is Better
firstmachinetest: 8.45 (SE +/- 0.01, N = 3)

TensorFlow Lite 2022-05-18 - Model: Mobilenet Quant
Microseconds, Fewer Is Better
firstmachinetest: 2942.86 (SE +/- 23.86, N = 9)
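A hedged sketch of timing a quantized .tflite model with the TensorFlow Lite interpreter; the model path below is a placeholder (assuming a uint8-quantized model), and this Python loop only stands in for the test profile's own benchmark binary:

import time
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="mobilenet_quant.tflite")  # hypothetical file
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
data = np.random.randint(0, 256, size=inp["shape"], dtype=np.uint8)

interpreter.set_tensor(inp["index"], data)
interpreter.invoke()  # warm-up
runs = 100
start = time.perf_counter()
for _ in range(runs):
    interpreter.set_tensor(inp["index"], data)
    interpreter.invoke()
print(f"{(time.perf_counter() - start) / runs * 1e6:.0f} microseconds per inference")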
Numenta Anomaly Benchmark 1.1 - Detector: KNN CAD
Seconds, Fewer Is Better
firstmachinetest: 169.65 (SE +/- 1.13, N = 3)

TNN 0.3 - Target: CPU - Model: DenseNet
ms, Fewer Is Better
firstmachinetest: 2181.40 (SE +/- 5.49, N = 3; MIN: 2152.21 / MAX: 2244.7)
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

TensorFlow 2.12 - Device: CPU - Batch Size: 256 - Model: AlexNet
images/sec, More Is Better
firstmachinetest: 189.83 (SE +/- 0.06, N = 3)

TensorFlow 2.12 - Device: CPU - Batch Size: 32 - Model: ResNet-50
images/sec, More Is Better
firstmachinetest: 25.51 (SE +/- 0.01, N = 3)

Numpy Benchmark
Score, More Is Better
firstmachinetest: 807.43 (SE +/- 0.92, N = 3)
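The Numpy Benchmark score aggregates timings of many array kernels into a single figure. A tiny illustration of the underlying idea (these kernels and sizes are arbitrary, not the benchmark's actual workload):

import time
import numpy as np

a = np.random.rand(2048, 2048)

# Time a few representative kernels; the real benchmark folds many such
# operations into one score.
for name, fn in [
    ("matmul", lambda: a @ a),
    ("svd (512x512)", lambda: np.linalg.svd(a[:512, :512])),
    ("fft2", lambda: np.fft.fft2(a)),
]:
    start = time.perf_counter()
    fn()
    print(f"{name}: {time.perf_counter() - start:.3f} s")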
TensorFlow 2.12 - Device: CPU - Batch Size: 64 - Model: GoogLeNet
images/sec, More Is Better
firstmachinetest: 74.56 (SE +/- 0.02, N = 3)

Numenta Anomaly Benchmark 1.1 - Detector: Bayesian Changepoint
Seconds, Fewer Is Better
firstmachinetest: 18.99 (SE +/- 0.17, N = 15)

Numenta Anomaly Benchmark 1.1 - Detector: Earthgecko Skyline
Seconds, Fewer Is Better
firstmachinetest: 92.49 (SE +/- 0.55, N = 3)

Neural Magic DeepSparse 1.5 - Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 266.69 (SE +/- 0.14, N = 3)

Neural Magic DeepSparse 1.5 - Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 11.25 (SE +/- 0.01, N = 3)
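The DeepSparse results throughout this report are produced by Neural Magic's own benchmarking harness. For orientation, a rough single-stream sketch using the DeepSparse Python API; the function names follow Neural Magic's published examples and should be treated as assumptions, and "model.onnx" is a placeholder for the models named above:

import time
from deepsparse import compile_model
from deepsparse.utils import generate_random_inputs

batch_size = 1
engine = compile_model("model.onnx", batch_size=batch_size)  # placeholder path
inputs = generate_random_inputs("model.onnx", batch_size)

engine.run(inputs)  # warm-up
runs = 50
start = time.perf_counter()
for _ in range(runs):
    engine.run(inputs)
elapsed = time.perf_counter() - start
print(f"{elapsed / runs * 1000:.2f} ms/batch, {runs * batch_size / elapsed:.2f} items/sec")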
oneDNN 3.1 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
firstmachinetest: 2835.96 (SE +/- 1.36, N = 3; MIN: 2825.04)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.1 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
firstmachinetest: 2834.98 (SE +/- 0.85, N = 3; MIN: 2823.75)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.1 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
firstmachinetest: 2834.58 (SE +/- 0.41, N = 3; MIN: 2824.05)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.1 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
firstmachinetest: 1417.35 (SE +/- 2.36, N = 3; MIN: 1405.44)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.1 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
firstmachinetest: 1415.38 (SE +/- 0.23, N = 3; MIN: 1406.07)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.1 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
firstmachinetest: 1414.69 (SE +/- 0.08, N = 3; MIN: 1406.12)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

TensorFlow 2.12 - Device: CPU - Batch Size: 16 - Model: ResNet-50
images/sec, More Is Better
firstmachinetest: 25.69 (SE +/- 0.00, N = 3)

Neural Magic DeepSparse 1.5 - Model: BERT-Large, NLP Question Answering - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 105.15 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse 1.5 - Model: BERT-Large, NLP Question Answering - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 9.5099 (SE +/- 0.0022, N = 3)

NCNN 20220729 - Target: CPU - Model: FastestDet
ms, Fewer Is Better
firstmachinetest: 1.86 (SE +/- 0.07, N = 3; MIN: 1.76 / MAX: 4.68)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU - Model: vision_transformer
ms, Fewer Is Better
firstmachinetest: 113.91 (SE +/- 1.07, N = 3; MIN: 112.28 / MAX: 580.42)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU - Model: regnety_400m
ms, Fewer Is Better
firstmachinetest: 4.58 (SE +/- 0.01, N = 3; MIN: 4.53 / MAX: 7.64)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU - Model: squeezenet_ssd
ms, Fewer Is Better
firstmachinetest: 10.37 (SE +/- 0.01, N = 3; MIN: 10.14 / MAX: 15.44)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU - Model: yolov4-tiny
ms, Fewer Is Better
firstmachinetest: 13.21 (SE +/- 0.02, N = 3; MIN: 12.7 / MAX: 16.45)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU - Model: resnet50
ms, Fewer Is Better
firstmachinetest: 11.87 (SE +/- 0.04, N = 3; MIN: 11.71 / MAX: 14.91)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU - Model: alexnet
ms, Fewer Is Better
firstmachinetest: 4.69 (SE +/- 0.01, N = 3; MIN: 4.63 / MAX: 7.79)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU - Model: resnet18
ms, Fewer Is Better
firstmachinetest: 6.28 (SE +/- 0.12, N = 3; MIN: 6.08 / MAX: 8.5)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU - Model: vgg16
ms, Fewer Is Better
firstmachinetest: 29.96 (SE +/- 0.01, N = 3; MIN: 29.64 / MAX: 36.55)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU - Model: googlenet
ms, Fewer Is Better
firstmachinetest: 5.98 (SE +/- 0.02, N = 3; MIN: 5.87 / MAX: 9.14)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU - Model: blazeface
ms, Fewer Is Better
firstmachinetest: 0.5 (SE +/- 0.00, N = 3; MIN: 0.49 / MAX: 0.66)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU - Model: efficientnet-b0
ms, Fewer Is Better
firstmachinetest: 2.69 (SE +/- 0.00, N = 3; MIN: 2.65 / MAX: 3.07)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU - Model: mnasnet
ms, Fewer Is Better
firstmachinetest: 1.56 (SE +/- 0.00, N = 3; MIN: 1.54 / MAX: 1.87)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU - Model: shufflenet-v2
ms, Fewer Is Better
firstmachinetest: 1.52 (SE +/- 0.00, N = 3; MIN: 1.48 / MAX: 1.73)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU-v3-v3 - Model: mobilenet-v3
ms, Fewer Is Better
firstmachinetest: 1.48 (SE +/- 0.00, N = 3; MIN: 1.46 / MAX: 1.71)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU-v2-v2 - Model: mobilenet-v2
ms, Fewer Is Better
firstmachinetest: 1.93 (SE +/- 0.00, N = 3; MIN: 1.87 / MAX: 2.2)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729 - Target: CPU - Model: mobilenet
ms, Fewer Is Better
firstmachinetest: 7.25 (SE +/- 0.05, N = 3; MIN: 7.07 / MAX: 9.05)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
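The NCNN results above come from the project's C++ benchmark tool. For illustration only, a sketch of the same kind of measurement through ncnn's Python binding (pip install ncnn); the model files and blob names are hypothetical, and the binding's API details are assumptions:

import time
import numpy as np
import ncnn

net = ncnn.Net()
net.load_param("mobilenet.param")  # hypothetical model files
net.load_model("mobilenet.bin")

# Random CHW float input sized for a 224x224 model.
mat = ncnn.Mat(np.random.rand(3, 224, 224).astype(np.float32))
runs = 100
start = time.perf_counter()
for _ in range(runs):
    ex = net.create_extractor()
    ex.input("data", mat)            # input blob name depends on the model
    ret, out = ex.extract("output")  # output blob name depends on the model
print(f"{(time.perf_counter() - start) / runs * 1000:.2f} ms per inference")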
Numenta Anomaly Benchmark 1.1 - Detector: Relative Entropy
Seconds, Fewer Is Better
firstmachinetest: 12.83 (SE +/- 0.12, N = 15)

Mlpack Benchmark - Benchmark: scikit_qda
Seconds, Fewer Is Better
firstmachinetest: 33.09 (SE +/- 0.13, N = 3)
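The mlpack scikit_* entries time scikit-learn estimators. A hedged mini-version of the QDA case on synthetic stand-in data (the benchmark scripts use their own datasets and sizes):

import time
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

X = np.random.rand(100_000, 50)
y = np.random.randint(0, 2, size=100_000)

start = time.perf_counter()
QuadraticDiscriminantAnalysis().fit(X, y).predict(X)
print(f"fit + predict: {time.perf_counter() - start:.2f} s")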
TensorFlow Lite 2022-05-18 - Model: Inception V4
Microseconds, Fewer Is Better
firstmachinetest: 29141.4 (SE +/- 8.55, N = 3)

TensorFlow Lite 2022-05-18 - Model: Inception ResNet V2
Microseconds, Fewer Is Better
firstmachinetest: 26415.7 (SE +/- 21.93, N = 3)

TensorFlow Lite 2022-05-18 - Model: NASNet Mobile
Microseconds, Fewer Is Better
firstmachinetest: 5368.28 (SE +/- 4.80, N = 3)

TensorFlow Lite 2022-05-18 - Model: Mobilenet Float
Microseconds, Fewer Is Better
firstmachinetest: 1348.48 (SE +/- 1.14, N = 3)

TensorFlow Lite 2022-05-18 - Model: SqueezeNet
Microseconds, Fewer Is Better
firstmachinetest: 1920.44 (SE +/- 1.99, N = 3)

spaCy 3.4.1 - Model: en_core_web_trf
tokens/sec, More Is Better
firstmachinetest: 1657 (SE +/- 0.33, N = 3)

spaCy 3.4.1 - Model: en_core_web_lg
tokens/sec, More Is Better
firstmachinetest: 19413 (SE +/- 42.55, N = 3)
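A hedged sketch of a tokens/sec measurement with spaCy; the PTS test uses its own corpus, so the repeated sentence below is only illustrative, and the model must be installed first (python -m spacy download en_core_web_lg):

import time
import spacy

nlp = spacy.load("en_core_web_lg")
texts = ["The quick brown fox jumps over the lazy dog."] * 2000

start = time.perf_counter()
n_tokens = sum(len(doc) for doc in nlp.pipe(texts))
print(f"{n_tokens / (time.perf_counter() - start):.0f} tokens/sec")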
TensorFlow 2.12 - Device: CPU - Batch Size: 32 - Model: GoogLeNet
images/sec, More Is Better
firstmachinetest: 75.55 (SE +/- 0.04, N = 3)

Numenta Anomaly Benchmark 1.1 - Detector: Windowed Gaussian
Seconds, Fewer Is Better
firstmachinetest: 9.667 (SE +/- 0.092, N = 15)

Neural Magic DeepSparse 1.5 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 16.24 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse 1.5 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 184.60 (SE +/- 0.24, N = 3)

Neural Magic DeepSparse 1.5 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 8.4159 (SE +/- 0.0077, N = 3)

Neural Magic DeepSparse 1.5 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 118.66 (SE +/- 0.11, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 7.6609 (SE +/- 0.0094, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 390.93 (SE +/- 0.49, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 77.72 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 38.60 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 335.17 (SE +/- 0.22, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 8.9503 (SE +/- 0.0058, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 335.25 (SE +/- 0.07, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 8.9483 (SE +/- 0.0017, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 3.5599 (SE +/- 0.0117, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 280.56 (SE +/- 0.93, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 183.83 (SE +/- 0.16, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 16.32 (SE +/- 0.01, N = 3)

TensorFlow 2.12 - Device: CPU - Batch Size: 64 - Model: AlexNet
images/sec, More Is Better
firstmachinetest: 167.60 (SE +/- 0.18, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 28.70 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 34.84 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 61.94 (SE +/- 0.04, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 16.14 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 111.68 (SE +/- 0.05, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 8.9540 (SE +/- 0.0039, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 111.81 (SE +/- 0.08, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 8.9435 (SE +/- 0.0065, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 63.25 (SE +/- 0.06, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 47.42 (SE +/- 0.05, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 17.95 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 167.04 (SE +/- 0.17, N = 3)

DeepSpeech 0.6 - Acceleration: CPU
Seconds, Fewer Is Better
firstmachinetest: 45.23 (SE +/- 0.03, N = 3)
Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 22.19 (SE +/- 0.04, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 45.05 (SE +/- 0.09, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 7.9561 (SE +/- 0.0159, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 125.59 (SE +/- 0.25, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 39.14 (SE +/- 0.04, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 76.62 (SE +/- 0.07, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 14.59 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 68.50 (SE +/- 0.06, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 26.80 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 111.89 (SE +/- 0.03, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 59.97 (SE +/- 0.35, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 50.01 (SE +/- 0.29, N = 3)

Neural Magic DeepSparse 1.5 - Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 11.89 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse 1.5 - Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 84.05 (SE +/- 0.16, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 59.57 (SE +/- 0.23, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 50.34 (SE +/- 0.19, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 11.90 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 84.02 (SE +/- 0.08, N = 3)

Neural Magic DeepSparse 1.5 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 26.82 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse 1.5 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 111.80 (SE +/- 0.08, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 22.04 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 45.35 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse 1.5 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better
firstmachinetest: 2.3753 (SE +/- 0.0036, N = 3)

Neural Magic DeepSparse 1.5 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
firstmachinetest: 1259.31 (SE +/- 1.91, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 22.37 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse 1.5 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 44.69 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse 1.5 - Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better
firstmachinetest: 1.0581 (SE +/- 0.0021, N = 3)

Neural Magic DeepSparse 1.5 - Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream
items/sec, More Is Better
firstmachinetest: 943.25 (SE +/- 1.92, N = 3)
Numenta Anomaly Benchmark Detector: Contextual Anomaly Detector OSE OpenBenchmarking.org Seconds, Fewer Is Better Numenta Anomaly Benchmark 1.1 Detector: Contextual Anomaly Detector OSE firstmachinetest 7 14 21 28 35 SE +/- 0.18, N = 3 30.82
Mlpack Benchmark Benchmark: scikit_ica OpenBenchmarking.org Seconds, Fewer Is Better Mlpack Benchmark Benchmark: scikit_ica firstmachinetest 6 12 18 24 30 SE +/- 0.07, N = 3 27.50
Mlpack Benchmark Benchmark: scikit_linearridgeregression OpenBenchmarking.org Seconds, Fewer Is Better Mlpack Benchmark Benchmark: scikit_linearridgeregression firstmachinetest 0.3645 0.729 1.0935 1.458 1.8225 SE +/- 0.01, N = 3 1.62
TensorFlow Device: CPU - Batch Size: 32 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 32 - Model: AlexNet firstmachinetest 30 60 90 120 150 SE +/- 0.15, N = 3 142.72
TensorFlow Device: CPU - Batch Size: 16 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 16 - Model: GoogLeNet firstmachinetest 20 40 60 80 100 SE +/- 0.05, N = 3 77.31
oneDNN 3.1 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better) - firstmachinetest: 8.81665 (SE +/- 0.00619, N = 3; MIN: 8.58); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.1 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, fewer is better) - firstmachinetest: 5.27941 (SE +/- 0.00118, N = 3; MIN: 5.09); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.1 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better) - firstmachinetest: 1.00945 (SE +/- 0.00052, N = 3; MIN: 0.98); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
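oneDNN's data-type triplets name the source, weights, and destination types, so u8s8f32 is unsigned 8-bit activations, signed 8-bit weights, and float32 output; that is why the u8s8f32 rows run roughly five times faster than f32 here. A small numpy illustration of the convention (the quantization scales are made up):

    # u8s8f32: uint8 source x int8 weights, int32 accumulation,
    # rescaled to float32 on output. Scales below are hypothetical.
    import numpy as np

    rng = np.random.default_rng(0)
    src = rng.integers(0, 256, (64, 128), dtype=np.uint8)    # u8 activations
    wei = rng.integers(-128, 128, (128, 32), dtype=np.int8)  # s8 weights

    acc = src.astype(np.int32) @ wei.astype(np.int32)  # exact int32 accumulate
    dst = acc.astype(np.float32) * (0.02 * 0.01)       # dequantize to f32
    print(dst.dtype, dst.shape)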
oneDNN 3.1 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better) - firstmachinetest: 2.99807 (SE +/- 0.02962, N = 6; MIN: 2.81); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
TensorFlow 2.12 - Device: CPU - Batch Size: 16 - Model: AlexNet (images/sec, more is better) - firstmachinetest: 107.99 (SE +/- 0.06, N = 3)
Mlpack Benchmark - Benchmark: scikit_svm (Seconds, fewer is better) - firstmachinetest: 14.60 (SE +/- 0.00, N = 3)
oneDNN 3.1 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better) - firstmachinetest: 1.53228 (SE +/- 0.00143, N = 3; MIN: 1.46); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.1 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, fewer is better) - firstmachinetest: 3.94237 (SE +/- 0.00577, N = 3; MIN: 3.67); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.1 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better) - firstmachinetest: 0.731267 (SE +/- 0.000672, N = 3; MIN: 0.7); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
RNNoise 2020-06-28 (Seconds, fewer is better) - firstmachinetest: 13.41 (SE +/- 0.04, N = 3); (CC) gcc options: -O2 -pedantic -fvisibility=hidden
OpenCV 4.7 - Test: DNN - Deep Neural Network (ms, fewer is better) - firstmachinetest: 12915 (SE +/- 66.50, N = 3); (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
TNN 0.3 - Target: CPU - Model: MobileNet v2 (ms, fewer is better) - firstmachinetest: 182.51 (SE +/- 0.38, N = 3; MIN: 179.91 / MAX: 188.17); (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
TNN 0.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, fewer is better) - firstmachinetest: 177.53 (SE +/- 0.09, N = 3; MIN: 177.25 / MAX: 177.92); (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
oneDNN 3.1 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, fewer is better) - firstmachinetest: 4.97774 (SE +/- 0.00196, N = 3; MIN: 4.89); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.1 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better) - firstmachinetest: 1.11340 (SE +/- 0.00745, N = 3; MIN: 1.04); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, more is better) - firstmachinetest: 2697.61 (SE +/- 1.89, N = 3); (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
oneDNN 3.1 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better) - firstmachinetest: 8.33924 (SE +/- 0.04575, N = 3; MIN: 8.17); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.1 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, fewer is better) - firstmachinetest: 8.57521 (SE +/- 0.00397, N = 3; MIN: 8.44); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.1 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better) - firstmachinetest: 3.73203 (SE +/- 0.00482, N = 3; MIN: 3.65); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
TNN 0.3 - Target: CPU - Model: SqueezeNet v2 (ms, fewer is better) - firstmachinetest: 41.95 (SE +/- 0.42, N = 5; MIN: 40.18 / MAX: 43.01); (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
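The SE figures read as the standard error of the mean over the N runs. For the SqueezeNet v2 result above (N = 5, mean 41.95, SE 0.42) the arithmetic can be reproduced with run times invented to match the reported mean, MIN, and MAX; dividing by N rather than N - 1 is a guess at the suite's formula:

    # Standard error of the mean: std(runs) / sqrt(N).
    import numpy as np

    runs = np.array([40.18, 42.10, 42.40, 43.01, 42.06])  # hypothetical (ms)
    se = runs.std(ddof=0) / np.sqrt(len(runs))  # ddof=0 is a guess at the formula
    print(f"mean = {runs.mean():.2f} ms, SE = {se:.2f}")  # mean = 41.95, SE = 0.42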
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, more is better) - firstmachinetest: 13.19 (SE +/- 0.00, N = 3); (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
oneDNN 3.1 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, fewer is better) - firstmachinetest: 5.24738 (SE +/- 0.00166, N = 3; MIN: 5.02); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.1 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better) - firstmachinetest: 3.12200 (SE +/- 0.00508, N = 3; MIN: 2.93); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.1 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better) - firstmachinetest: 1.30084 (SE +/- 0.00296, N = 3; MIN: 1.23); (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS, more is better) - firstmachinetest: 6415.70 (SE +/- 40.91, N = 3); (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: S3D (GFLOPS, more is better) - firstmachinetest: 167.74 (SE +/- 0.11, N = 3); (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Reduction (GB/s, more is better) - firstmachinetest: 262.42 (SE +/- 0.01, N = 3); (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: FFT SP (GFLOPS, more is better) - firstmachinetest: 726.05 (SE +/- 0.21, N = 3); (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Triad (GB/s, more is better) - firstmachinetest: 12.92 (SE +/- 0.02, N = 3); (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Download (GB/s, more is better) - firstmachinetest: 13.38 (SE +/- 0.00, N = 3); (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
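The ~13 GB/s bus speed figures in both directions line up with the RTX 4060 Ti's PCIe 4.0 x8 link. A back-of-the-envelope check against the theoretical ceiling (16 GT/s per lane, 128b/130b encoding):

    # PCIe 4.0 x8 theoretical ceiling vs. the measured SHOC bus speeds.
    lanes, gts = 8, 16.0  # the RTX 4060 Ti exposes a PCIe 4.0 x8 link
    ceiling = lanes * gts * (128 / 130) / 8  # ~15.75 GB/s after line coding
    for name, gbs in [("download", 13.38), ("readback", 13.19)]:
        print(f"{name}: {gbs} GB/s = {gbs / ceiling:.0%} of {ceiling:.2f} GB/s")

Both directions land around 85% of the theoretical link rate, which is typical once protocol overhead is accounted for.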
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: MD5 Hash (GHash/s, more is better) - firstmachinetest: 25.93 (SE +/- 0.00, N = 3); (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
Phoronix Test Suite v10.8.5