m6i.8xlarge Amazon EC2 testing on Ubuntu 22.04 via the Phoronix Test Suite.

m6i.8xlarge:
  Processor:         Intel Xeon Platinum 8375C (16 Cores / 32 Threads)
  Motherboard:       Amazon EC2 m6i.8xlarge (1.0 BIOS)
  Chipset:           Intel 440FX 82441FX PMC
  Memory:            1 x 128 GB DDR4-3200MT/s
  Disk:              322GB Amazon Elastic Block Store
  Graphics:          EFI VGA
  Network:           Amazon Elastic
  OS:                Ubuntu 22.04
  Kernel:            6.5.0-1017-aws (x86_64)
  Vulkan:            1.3.255
  Compiler:          GCC 11.4.0
  File-System:       ext4
  Screen Resolution: 800x600
  System Layer:      amazon
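A run like this should be reproducible with the Phoronix Test Suite directly. A minimal sketch; the test-profile names are assumptions based on the current OpenBenchmarking.org listings, not taken from this report:

```python
import subprocess

# Run each suite in batch (non-interactive) mode. The pts/* profile
# names below are assumed, not confirmed by this report.
for profile in ("pts/onnx", "pts/whisper-cpp", "pts/llama-cpp", "pts/openvino"):
    subprocess.run(["phoronix-test-suite", "batch-benchmark", profile],
                   check=True)
```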
ONNX Runtime 1.17 - Device: CPU
Inferences Per Second > Higher Is Better

  Model                        Executor: Parallel   Executor: Standard
  GPT-2                                    128.29               152.48
  yolov4                                    11.43                13.52
  T5 Encoder                               193.01               219.93
  bertsquad-12                              15.07                18.45
  CaffeNet 12-int8                         557.47               662.60
  fcn-resnet101-11                         1.83424              2.99347
  ArcFace ResNet-100                        35.19                35.39
  ResNet50 v1-12-int8                      254.41               279.86
  super-resolution-10                       97.76               139.17
  Faster R-CNN R-50-FPN-int8               4.47456              4.86425
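The Parallel and Standard executors correspond to ONNX Runtime's two execution modes. A minimal sketch of how a session would be configured for each; "gpt2.onnx" and the input feed are placeholders, since real input names and shapes depend on the exported model:

```python
import time
import numpy as np
import onnxruntime as ort

def make_session(model_path: str, parallel: bool) -> ort.InferenceSession:
    opts = ort.SessionOptions()
    # "Executor: Parallel" maps to ORT_PARALLEL; "Executor: Standard"
    # is the default sequential execution mode.
    opts.execution_mode = (ort.ExecutionMode.ORT_PARALLEL if parallel
                           else ort.ExecutionMode.ORT_SEQUENTIAL)
    return ort.InferenceSession(model_path, sess_options=opts,
                                providers=["CPUExecutionProvider"])

# Placeholder model and feed: feed the first input a dummy int64 batch.
sess = make_session("gpt2.onnx", parallel=True)
feeds = {sess.get_inputs()[0].name: np.ones((1, 8), dtype=np.int64)}
start = time.perf_counter()
for _ in range(100):
    sess.run(None, feeds)
print("inferences/sec:", 100 / (time.perf_counter() - start))
```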
Whisper.cpp 1.6.2 - Input: 2016 State of the Union
Seconds < Lower Is Better

  Model              Seconds
  ggml-base.en        134.98
  ggml-small.en       350.88
  ggml-medium.en      994.88

Llama.cpp b3067
Tokens Per Second > Higher Is Better

  Model                                Tokens Per Second
  Meta-Llama-3-8B-Instruct-Q8_0.gguf               11.73
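Both of these are command-line tools, so driving them from a script is a matter of invoking their binaries. A minimal sketch; the paths are placeholders, each call assumes it runs from its own project checkout, and the binary name (main in both projects at these versions) differs in later builds:

```python
import subprocess

# whisper.cpp: -m selects the ggml model, -f the 16 kHz WAV input.
subprocess.run(["./main", "-m", "models/ggml-base.en.bin",
                "-f", "sotu_2016.wav"], check=True)

# llama.cpp: -m selects the GGUF model, -p the prompt, -n the number
# of tokens to generate.
subprocess.run(["./main", "-m", "Meta-Llama-3-8B-Instruct-Q8_0.gguf",
                "-p", "Hello", "-n", "128"], check=True)
```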
OpenVINO 2024.0 - Device: CPU
FPS > Higher Is Better; ms < Lower Is Better

  Model                                              FPS        ms
  Face Detection FP16                               6.54   1213.60
  Person Detection FP16                            49.66    160.92
  Person Detection FP32                            49.30    162.06
  Vehicle Detection FP16                          333.25     23.98
  Face Detection FP16-INT8                         26.02    306.99
  Face Detection Retail FP16                     1237.05      6.44
  Road Segmentation ADAS FP16                     156.88     50.96
  Vehicle Detection FP16-INT8                    1141.04      6.98
  Weld Porosity Detection FP16                    673.40     23.70
  Face Detection Retail FP16-INT8                3237.98      4.93
  Road Segmentation ADAS FP16-INT8                330.18     24.20
  Machine Translation EN To DE FP16                82.76     96.60
  Weld Porosity Detection FP16-INT8              2485.68      6.39
  Person Vehicle Bike Detection FP16              556.15     14.36
  Noise Suppression Poconet-Like FP16             755.57     21.04
  Handwritten English Recognition FP16            272.32     58.71
  Person Re-Identification Retail FP16            960.77     16.62
  Age Gender Recognition Retail 0013 FP16       17107.68      0.93
  Handwritten English Recognition FP16-INT8       314.78     50.80
  Age Gender Recognition Retail 0013 FP16-INT8  36577.56      0.42
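Note that FPS x ms / 1000 does not equal 1 here: for Face Detection FP16, 6.54 x 1213.60 / 1000 is roughly 7.9, which suggests the harness keeps several asynchronous infer requests in flight and reports their average latency. A minimal sketch of a throughput-hinted CPU pipeline; "model.xml" is a placeholder for any of the IR models above, and a static input shape is assumed:

```python
import numpy as np
import openvino as ov

core = ov.Core()
# Placeholder IR file; the FP16-INT8 rows above use quantized
# variants of the same networks.
model = core.read_model("model.xml")
compiled = core.compile_model(model, "CPU",
                              {"PERFORMANCE_HINT": "THROUGHPUT"})

# One synchronous request for simplicity; the benchmark itself keeps
# multiple asynchronous requests in flight to reach the FPS shown above.
request = compiled.create_infer_request()
inp = compiled.input(0)
dummy = np.zeros(inp.shape, dtype=np.float32)
result = request.infer({inp: dummy})
```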
ONNX Runtime 1.17 - Device: CPU
Inference Time Cost (ms) < Lower Is Better

  Model                        Executor: Parallel   Executor: Standard
  GPT-2                                   7.78959              6.60049
  yolov4                                    87.52                75.99
  T5 Encoder                              5.18002              4.61890
  bertsquad-12                              66.37                55.88
  CaffeNet 12-int8                        1.79240              1.54130
  fcn-resnet101-11                         546.06               357.87
  ArcFace ResNet-100                        28.42                28.25
  ResNet50 v1-12-int8                     3.92952              3.58213
  super-resolution-10                       10.23              7.56041
  Faster R-CNN R-50-FPN-int8               223.50               205.58
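These inference-time-cost figures are essentially the reciprocals of the inferences-per-second results earlier in the report. A quick check on the GPT-2 Standard-executor row:

```python
# GPT-2, Standard executor: 6.60049 ms per inference implies about
# 1000 / 6.60049 = 151.5 inferences/sec, close to the reported 152.48.
print(1000 / 6.60049)
```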