M6i.8xlarge Benchmarks [2407019-NE-M6I8XLARG24]

147 Results Shown

Whisper.cpp
OpenCV
Mlpack Benchmark
OpenCV
Whisper.cpp
ONNX Runtime:
GPT-2 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
ArcFace ResNet-100 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
super-resolution-10 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
OpenCV:
Features 2D
Graph API
ONNX Runtime:
fcn-resnet101-11 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
OpenCV
ONNX Runtime:
Faster R-CNN R-50-FPN-int8 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
OpenCV:
DNN - Deep Neural Network
Object Detection
Video
Whisper.cpp
Mlpack Benchmark
oneDNN:
Deconvolution Batch shapes_1d - CPU
Recurrent Neural Network Training - CPU
Recurrent Neural Network Inference - CPU
OpenVINO:
Face Detection FP16 - CPU:
ms
FPS
Face Detection FP16-INT8 - CPU:
ms
FPS
ONNX Runtime:
Faster R-CNN R-50-FPN-int8 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
GPT-2 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
fcn-resnet101-11 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
OpenVINO:
Person Detection FP16 - CPU:
ms
FPS
Person Detection FP32 - CPU:
ms
FPS
ONNX Runtime:
bertsquad-12 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
bertsquad-12 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
yolov4 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
yolov4 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
T5 Encoder - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
OpenVINO:
Machine Translation EN To DE FP16 - CPU:
ms
FPS
ONNX Runtime:
T5 Encoder - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
ArcFace ResNet-100 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
OpenVINO:
Road Segmentation ADAS FP16-INT8 - CPU:
ms
FPS
Noise Suppression Poconet-Like FP16 - CPU:
ms
FPS
Person Vehicle Bike Detection FP16 - CPU:
ms
FPS
Person Re-Identification Retail FP16 - CPU:
ms
FPS
Road Segmentation ADAS FP16 - CPU:
ms
FPS
Handwritten English Recognition FP16-INT8 - CPU:
ms
FPS
Handwritten English Recognition FP16 - CPU:
ms
FPS
Vehicle Detection FP16-INT8 - CPU:
ms
FPS
Face Detection Retail FP16-INT8 - CPU:
ms
FPS
Vehicle Detection FP16 - CPU:
ms
FPS
Age Gender Recognition Retail 0013 FP16-INT8 - CPU:
ms
FPS
Weld Porosity Detection FP16 - CPU:
ms
FPS
Face Detection Retail FP16 - CPU:
ms
FPS
Weld Porosity Detection FP16-INT8 - CPU:
ms
FPS
ONNX Runtime:
CaffeNet 12-int8 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
OpenVINO:
Age Gender Recognition Retail 0013 FP16 - CPU:
ms
FPS
ONNX Runtime:
CaffeNet 12-int8 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
ResNet50 v1-12-int8 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
ResNet50 v1-12-int8 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
super-resolution-10 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
Neural Magic DeepSparse:
Llama2 Chat 7b Quantized - Asynchronous Multi-Stream:
ms/batch
items/sec
ResNet-50, Sparse INT8 - Asynchronous Multi-Stream:
ms/batch
items/sec
Mlpack Benchmark
Neural Magic DeepSparse:
NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream:
ms/batch
items/sec
Llama2 Chat 7b Quantized - Synchronous Single-Stream:
ms/batch
items/sec
BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream:
ms/batch
items/sec
NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream:
ms/batch
items/sec
BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream:
ms/batch
items/sec
NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream:
ms/batch
items/sec
NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream:
ms/batch
items/sec
NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream:
ms/batch
items/sec
NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream:
ms/batch
items/sec
CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream:
ms/batch
items/sec
CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream:
ms/batch
items/sec
NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream:
ms/batch
items/sec
Llama.cpp
Neural Magic DeepSparse:
NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream:
ms/batch
items/sec
ResNet-50, Baseline - Synchronous Single-Stream:
ms/batch
items/sec
CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream:
ms/batch
items/sec
CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream:
ms/batch
items/sec
CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream:
ms/batch
items/sec
ResNet-50, Baseline - Asynchronous Multi-Stream:
ms/batch
items/sec
CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream:
ms/batch
items/sec
ResNet-50, Sparse INT8 - Synchronous Single-Stream:
ms/batch
items/sec
Mlpack Benchmark
oneDNN:
IP Shapes 1D - CPU
IP Shapes 3D - CPU
Convolution Batch Shapes Auto - CPU
Deconvolution Batch shapes_3d - CPU

m6i.8xlarge

Processor: Intel Xeon Platinum 8375C (16 Cores / 32 Threads)
Motherboard: Amazon EC2 m6i.8xlarge (1.0 BIOS)
Chipset: Intel 440FX 82441FX PMC
Memory: 1 x 128 GB DDR4-3200MT/s
Disk: 537GB Amazon Elastic Block Store
Graphics: EFI VGA
Network: Amazon Elastic

OS: Ubuntu 22.04
Kernel: 6.5.0-1017-aws (x86_64)
Vulkan: 1.3.255
Compiler: GCC 11.4.0
File-System: ext4
Screen Resolution: 800x600
System Layer: amazon

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: CPU Microcode: 0xd0003d1
Security Notes: gather_data_sampling: Unknown: Dependent on hypervisor status + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT Host state unknown + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
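
The Security Notes line above packs one status per CPU vulnerability into a single string, with entries joined by " + " and each entry written as "name: status". A minimal sketch of splitting such a line into a lookup table follows. The function name parse_security_notes is hypothetical, and the parsing rules (entries never contain " + ", the first ": " separates name from status) are assumptions inferred from this one line, not a documented format.

```python
def parse_security_notes(line: str) -> dict[str, str]:
    """Split a 'Security Notes' line into {vulnerability_name: status}.

    Assumes entries are joined by ' + ' and that the first ': ' in each
    entry separates the name from its status (assumption, see lead-in).
    """
    prefix = "Security Notes:"
    if line.startswith(prefix):
        line = line[len(prefix):]

    notes = {}
    for entry in line.split(" + "):
        # partition() splits on the first ': ' only, so statuses that
        # contain their own colons (e.g. spectre_v2) stay intact.
        name, _, status = entry.strip().partition(": ")
        notes[name] = status
    return notes


# A shortened sample; the entries are copied from the line above.
sample = (
    "Security Notes: gather_data_sampling: Unknown: Dependent on hypervisor status"
    " + itlb_multihit: Not affected"
    " + spec_store_bypass: Mitigation of SSB disabled via prctl"
    " + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional"
    " RSB filling PBRSB-eIBRS: SW sequence"
)

parsed = parse_security_notes(sample)

# List the entries for which a mitigation is reported as active.
mitigated = [name for name, status in parsed.items()
             if status.startswith("Mitigation")]
print(mitigated)
```

Having the statuses in a dictionary makes it straightforward to compare mitigation state between two result files before comparing their numbers, since active mitigations can affect CPU benchmark results.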

Testing initiated at 1 July 2024 09:30 by user root.
