m7g.8xlarge amazon testing on Ubuntu 22.04 via the Phoronix Test Suite.

m7g.8xlarge:
  Processor:    ARMv8 Neoverse-V1 (32 Cores)
  Motherboard:  Amazon EC2 m7g.8xlarge (1.0 BIOS)
  Chipset:      Amazon Device 0200
  Memory:       128GB
  Disk:         537GB Amazon Elastic Block Store
  Network:      Amazon Elastic
  OS:           Ubuntu 22.04
  Kernel:       6.5.0-1017-aws (aarch64)
  Vulkan:       1.3.255
  Compiler:     GCC 11.4.0
  File-System:  ext4
  System Layer: amazon

Llama.cpp b3067 (Tokens Per Second; higher is better)
  Model: Meta-Llama-3-8B-Instruct-Q8_0.gguf .......... 22.47

Whisper.cpp 1.6.2 - Input: 2016 State of the Union (Seconds; lower is better)
  Model: ggml-base.en ................................  81.61
  Model: ggml-small.en ............................... 179.73
  Model: ggml-medium.en .............................. 439.61

ONNX Runtime 1.17 - Device: CPU
  Model                        Executor   Inferences Per Second   Inference Time Cost (ms)
                                          (higher is better)      (lower is better)
  GPT-2                        Parallel        144.42                   6.91744
  GPT-2                        Standard        216.38                   4.61260
  yolov4                       Parallel          4.59152               217.82
  yolov4                       Standard          8.85209               112.96
  T5 Encoder                   Parallel        229.63                   4.35358
  T5 Encoder                   Standard        352.80                   2.83110
  bertsquad-12                 Parallel          8.97560               111.42
  bertsquad-12                 Standard         18.44                   54.23
  CaffeNet 12-int8             Parallel        385.46                   2.59272
  CaffeNet 12-int8             Standard        929.21                   1.07450
  fcn-resnet101-11             Parallel          1.13658               879.88
  fcn-resnet101-11             Standard          1.42024               704.10
  ArcFace ResNet-100           Parallel         11.34                   88.16
  ArcFace ResNet-100           Standard         18.15                   55.09
  ResNet50 v1-12-int8          Parallel        182.88                   5.46672
  ResNet50 v1-12-int8          Standard        254.50                   3.92809
  super-resolution-10          Parallel         78.22                   12.78
  super-resolution-10          Standard         79.08                   12.64
  Faster R-CNN R-50-FPN-int8   Parallel          5.76448               173.47
  Faster R-CNN R-50-FPN-int8   Standard          6.26020               159.76
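
The ONNX Runtime results above compare a "Parallel" and a "Standard" executor; in ONNX Runtime's Python API these correspond to ExecutionMode.ORT_PARALLEL and ExecutionMode.ORT_SEQUENTIAL. The sketch below is a minimal, illustrative way to measure inferences per second and per-inference latency for both modes on the CPU provider. It is not the Phoronix Test Suite harness itself; the model path, run count, and input dtypes are placeholder assumptions.

# Minimal sketch (not the PTS onnx test profile): time ONNX Runtime on CPU with
# the parallel vs. sequential ("Standard") executor. "model.onnx" is a
# placeholder path; shapes/dtypes are simplified and may need adjusting for a
# specific model such as GPT-2 or yolov4.
import time
import numpy as np
import onnxruntime as ort

MODEL_PATH = "model.onnx"   # placeholder

def measure(execution_mode, runs=100):
    opts = ort.SessionOptions()
    opts.execution_mode = execution_mode
    sess = ort.InferenceSession(MODEL_PATH, sess_options=opts,
                                providers=["CPUExecutionProvider"])
    # Build dummy inputs; symbolic/unknown dimensions are pinned to 1.
    feeds = {}
    for inp in sess.get_inputs():
        shape = [d if isinstance(d, int) else 1 for d in inp.shape]
        feeds[inp.name] = np.zeros(shape, dtype=np.float32)  # adjust dtype per model
    sess.run(None, feeds)                                    # warm-up run
    start = time.perf_counter()
    for _ in range(runs):
        sess.run(None, feeds)
    elapsed = time.perf_counter() - start
    return runs / elapsed, 1000.0 * elapsed / runs           # inferences/s, ms per inference

for mode in (ort.ExecutionMode.ORT_PARALLEL, ort.ExecutionMode.ORT_SEQUENTIAL):
    ips, ms = measure(mode)
    print(f"{mode}: {ips:.2f} inferences/s, {ms:.3f} ms/inference")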