m7g.8xlarge amazon testing on Ubuntu 22.04 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2407019-NE-M7G8XLARG55 m7g.8xlarge Processor: ARMv8 Neoverse-V1 (32 Cores), Motherboard: Amazon EC2 m7g.8xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 128GB, Disk: 537GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Ubuntu 22.04, Kernel: 6.5.0-1017-aws (aarch64), Vulkan: 1.3.255, Compiler: GCC 11.4.0, File-System: ext4, System Layer: amazon
Kernel Notes: Transparent Huge Pages: madviseCompiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -vSecurity Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
m7g.8xlarge OpenBenchmarking.org Phoronix Test Suite ARMv8 Neoverse-V1 (32 Cores) Amazon EC2 m7g.8xlarge (1.0 BIOS) Amazon Device 0200 128GB 537GB Amazon Elastic Block Store Amazon Elastic Ubuntu 22.04 6.5.0-1017-aws (aarch64) 1.3.255 GCC 11.4.0 ext4 amazon Processor Motherboard Chipset Memory Disk Network OS Kernel Vulkan Compiler File-System System Layer M7g.8xlarge Benchmarks System Logs - Transparent Huge Pages: madvise - --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
m7g.8xlarge deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: Llama2 Chat 7b Quantized - Synchronous Single-Stream deepsparse: Llama2 Chat 7b Quantized - Synchronous Single-Stream deepsparse: Llama2 Chat 7b Quantized - Asynchronous Multi-Stream deepsparse: Llama2 Chat 7b Quantized - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Synchronous Single-Stream deepsparse: ResNet-50, Baseline - Synchronous Single-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream mlpack: scikit_linearridgeregression mlpack: scikit_svm mlpack: scikit_qda mlpack: scikit_ica onednn: Recurrent Neural Network Inference - CPU onednn: Recurrent Neural Network Training - CPU onednn: Deconvolution Batch shapes_3d - CPU onednn: Deconvolution Batch shapes_1d - CPU onednn: Convolution Batch Shapes Auto - CPU onednn: IP Shapes 1D - CPU opencv: Object Detection opencv: Image Processing opencv: Features 2D opencv: Stitching opencv: Video opencv: Core openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel onnx: super-resolution-10 - CPU - Standard onnx: super-resolution-10 - CPU - Parallel onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ArcFace ResNet-100 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Parallel onnx: CaffeNet 12-int8 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Parallel onnx: bertsquad-12 - CPU - Standard onnx: bertsquad-12 - CPU - Parallel onnx: T5 Encoder - CPU - Standard onnx: T5 Encoder - CPU - Parallel onnx: yolov4 - CPU - Standard onnx: yolov4 - CPU - Parallel onnx: GPT-2 - CPU - Standard onnx: GPT-2 - CPU - Parallel whisper-cpp: ggml-medium.en - 2016 State of the Union whisper-cpp: ggml-small.en - 2016 State of the Union whisper-cpp: ggml-base.en - 2016 State of the Union llama-cpp: Meta-Llama-3-8B-Instruct-Q8_0.gguf onednn: IP Shapes 3D - CPU opencv: DNN - Deep Neural Network onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel onnx: super-resolution-10 - CPU - Standard onnx: super-resolution-10 - CPU - Parallel onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ArcFace ResNet-100 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Parallel onnx: CaffeNet 12-int8 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Parallel onnx: bertsquad-12 - CPU - Standard onnx: bertsquad-12 - CPU - Parallel onnx: T5 Encoder - CPU - Standard onnx: T5 Encoder - CPU - Parallel onnx: yolov4 - CPU - Standard onnx: yolov4 - CPU - Parallel onnx: GPT-2 - CPU - Standard onnx: GPT-2 - CPU - Parallel m7g.8xlarge 96.8102 10.3280 1498.9193 10.3225 15.8631 62.9757 91.6242 173.9385 80.7296 12.3836 1105.3839 14.3201 13.5301 73.8416 164.7921 96.5354 17.7329 56.3498 239.3726 66.4214 9.1805 108.7471 107.5853 148.0910 59.5688 16.7775 4083.7658 3.6747 2.4132 412.7734 16.1855 984.1460 9.1766 108.7917 107.4825 148.1704 4.9453 201.7181 39.4253 404.7362 96.8327 10.3256 1500.4961 10.3938 1.70 16.87 20.48 33.72 3943.27 7798.29 13.7726 64.4003 10.5836 6.50449 27261 105292 54743 284480 22760 96964 2.05 3886.78 201.65 39.62 2.09 3808.13 22.98 347.81 185.84 43.01 113.16 70.67 25.39 314.92 24.07 332.14 170.48 46.87 492.91 16.21 48.17 166.02 14.70 543.78 154.86 51.63 70.31 113.70 8.84 903.29 3042.17 2.58 26.78 298.31 376.80 21.20 377.31 21.17 2151.54 3.68 6.26020 5.76448 79.0814 78.2163 254.500 182.878 18.1520 11.3429 1.42024 1.13658 929.208 385.457 18.4386 8.97560 352.803 229.629 8.85209 4.59152 216.383 144.421 439.61170 179.73014 81.60950 22.47 4.52619 23459 159.755 173.474 12.6429 12.7839 3.92809 5.46672 55.0870 88.1604 704.104 879.876 1.07450 2.59272 54.2290 111.418 2.83110 4.35358 112.963 217.818 4.61260 6.91744 OpenBenchmarking.org
OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream m7g.8xlarge 1.1127 2.2254 3.3381 4.4508 5.5635 SE +/- 0.0052, N = 3 4.9453
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Recurrent Neural Network Inference - Engine: CPU m7g.8xlarge 800 1600 2400 3200 4000 SE +/- 12.89, N = 3 3943.27 MIN: 3910.4 1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Recurrent Neural Network Training - Engine: CPU m7g.8xlarge 2K 4K 6K 8K 10K SE +/- 80.29, N = 3 7798.29 MIN: 7620.6 1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Deconvolution Batch shapes_3d - Engine: CPU m7g.8xlarge 4 8 12 16 20 SE +/- 0.01, N = 3 13.77 MIN: 13.69 1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Deconvolution Batch shapes_1d - Engine: CPU m7g.8xlarge 14 28 42 56 70 SE +/- 0.08, N = 3 64.40 MIN: 64.09 1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Convolution Batch Shapes Auto - Engine: CPU m7g.8xlarge 3 6 9 12 15 SE +/- 0.02, N = 3 10.58 MIN: 10.45 1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: IP Shapes 1D - Engine: CPU m7g.8xlarge 2 4 6 8 10 SE +/- 0.00772, N = 3 6.50449 MIN: 6.42 1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl
OpenCV This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Object Detection m7g.8xlarge 6K 12K 18K 24K 30K SE +/- 75.10, N = 3 27261 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Image Processing m7g.8xlarge 20K 40K 60K 80K 100K SE +/- 282.17, N = 3 105292 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Features 2D m7g.8xlarge 12K 24K 36K 48K 60K SE +/- 267.56, N = 3 54743 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Stitching m7g.8xlarge 60K 120K 180K 240K 300K SE +/- 284.51, N = 3 284480 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Video m7g.8xlarge 5K 10K 15K 20K 25K SE +/- 92.56, N = 3 22760 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Core m7g.8xlarge 20K 40K 60K 80K 100K SE +/- 663.91, N = 3 96964 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
OpenVINO This is a test of the Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU m7g.8xlarge 0.4613 0.9226 1.3839 1.8452 2.3065 SE +/- 0.00, N = 3 2.05 MIN: 1.28 / MAX: 23.68 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU m7g.8xlarge 800 1600 2400 3200 4000 SE +/- 2.45, N = 3 3886.78 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16-INT8 - Device: CPU m7g.8xlarge 40 80 120 160 200 SE +/- 0.38, N = 3 201.65 MIN: 199.37 / MAX: 225.02 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16-INT8 - Device: CPU m7g.8xlarge 9 18 27 36 45 SE +/- 0.07, N = 3 39.62 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU m7g.8xlarge 0.4703 0.9406 1.4109 1.8812 2.3515 SE +/- 0.00, N = 3 2.09 MIN: 1 / MAX: 26.97 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU m7g.8xlarge 800 1600 2400 3200 4000 SE +/- 1.81, N = 3 3808.13 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Re-Identification Retail FP16 - Device: CPU m7g.8xlarge 6 12 18 24 30 SE +/- 0.07, N = 3 22.98 MIN: 16.5 / MAX: 41.02 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Re-Identification Retail FP16 - Device: CPU m7g.8xlarge 80 160 240 320 400 SE +/- 1.11, N = 3 347.81 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16 - Device: CPU m7g.8xlarge 40 80 120 160 200 SE +/- 0.75, N = 3 185.84 MIN: 183.08 / MAX: 211.71 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16 - Device: CPU m7g.8xlarge 10 20 30 40 50 SE +/- 0.18, N = 3 43.01 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Noise Suppression Poconet-Like FP16 - Device: CPU m7g.8xlarge 30 60 90 120 150 SE +/- 0.04, N = 3 113.16 MIN: 110.94 / MAX: 150.98 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Noise Suppression Poconet-Like FP16 - Device: CPU m7g.8xlarge 16 32 48 64 80 SE +/- 0.02, N = 3 70.67 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Vehicle Bike Detection FP16 - Device: CPU m7g.8xlarge 6 12 18 24 30 SE +/- 0.32, N = 3 25.39 MIN: 22.51 / MAX: 39.78 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Vehicle Bike Detection FP16 - Device: CPU m7g.8xlarge 70 140 210 280 350 SE +/- 3.87, N = 3 314.92 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Weld Porosity Detection FP16-INT8 - Device: CPU m7g.8xlarge 6 12 18 24 30 SE +/- 0.12, N = 3 24.07 MIN: 22.31 / MAX: 225.53 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Weld Porosity Detection FP16-INT8 - Device: CPU m7g.8xlarge 70 140 210 280 350 SE +/- 1.64, N = 3 332.14 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Machine Translation EN To DE FP16 - Device: CPU m7g.8xlarge 40 80 120 160 200 SE +/- 0.08, N = 3 170.48 MIN: 154.07 / MAX: 334.2 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Machine Translation EN To DE FP16 - Device: CPU m7g.8xlarge 11 22 33 44 55 SE +/- 0.02, N = 3 46.87 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU m7g.8xlarge 110 220 330 440 550 SE +/- 0.48, N = 3 492.91 MIN: 489.58 / MAX: 526.43 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU m7g.8xlarge 4 8 12 16 20 SE +/- 0.01, N = 3 16.21 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16-INT8 - Device: CPU m7g.8xlarge 11 22 33 44 55 SE +/- 0.04, N = 3 48.17 MIN: 46.88 / MAX: 54.74 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16-INT8 - Device: CPU m7g.8xlarge 40 80 120 160 200 SE +/- 0.13, N = 3 166.02 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Weld Porosity Detection FP16 - Device: CPU m7g.8xlarge 4 8 12 16 20 SE +/- 0.02, N = 3 14.70 MIN: 11.59 / MAX: 168.15 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Weld Porosity Detection FP16 - Device: CPU m7g.8xlarge 120 240 360 480 600 SE +/- 0.86, N = 3 543.78 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16-INT8 - Device: CPU m7g.8xlarge 30 60 90 120 150 SE +/- 0.32, N = 3 154.86 MIN: 152.55 / MAX: 178.82 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16-INT8 - Device: CPU m7g.8xlarge 12 24 36 48 60 SE +/- 0.11, N = 3 51.63 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16 - Device: CPU m7g.8xlarge 16 32 48 64 80 SE +/- 0.09, N = 3 70.31 MIN: 53.97 / MAX: 122.98 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16 - Device: CPU m7g.8xlarge 30 60 90 120 150 SE +/- 0.14, N = 3 113.70 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16 - Device: CPU m7g.8xlarge 2 4 6 8 10 SE +/- 0.00, N = 3 8.84 MIN: 7.91 / MAX: 16.48 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16 - Device: CPU m7g.8xlarge 200 400 600 800 1000 SE +/- 0.28, N = 3 903.29 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection FP16-INT8 - Device: CPU m7g.8xlarge 700 1400 2100 2800 3500 SE +/- 1.31, N = 3 3042.17 MIN: 2748.48 / MAX: 4882.42 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Face Detection FP16-INT8 - Device: CPU m7g.8xlarge 0.5805 1.161 1.7415 2.322 2.9025 SE +/- 0.00, N = 3 2.58 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16 - Device: CPU m7g.8xlarge 6 12 18 24 30 SE +/- 0.04, N = 3 26.78 MIN: 23.25 / MAX: 51.98 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16 - Device: CPU m7g.8xlarge 60 120 180 240 300 SE +/- 0.45, N = 3 298.31 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Detection FP32 - Device: CPU m7g.8xlarge 80 160 240 320 400 SE +/- 0.30, N = 3 376.80 MIN: 207.53 / MAX: 510.75 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Detection FP32 - Device: CPU m7g.8xlarge 5 10 15 20 25 SE +/- 0.02, N = 3 21.20 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Detection FP16 - Device: CPU m7g.8xlarge 80 160 240 320 400 SE +/- 0.20, N = 3 377.31 MIN: 235.41 / MAX: 510.12 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Detection FP16 - Device: CPU m7g.8xlarge 5 10 15 20 25 SE +/- 0.01, N = 3 21.17 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection FP16 - Device: CPU m7g.8xlarge 500 1000 1500 2000 2500 SE +/- 2.66, N = 3 2151.54 MIN: 1728.64 / MAX: 4274.63 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Face Detection FP16 - Device: CPU m7g.8xlarge 0.828 1.656 2.484 3.312 4.14 SE +/- 0.01, N = 3 3.68 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
ONNX Runtime ONNX Runtime is developed by Microsoft and partners as a open-source, cross-platform, high performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Model Zoo. Learn more via the OpenBenchmarking.org test page.
Whisper.cpp OpenBenchmarking.org Seconds, Fewer Is Better Whisper.cpp 1.6.2 Model: ggml-medium.en - Input: 2016 State of the Union m7g.8xlarge 100 200 300 400 500 SE +/- 5.49, N = 9 439.61 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -mcpu=native
OpenBenchmarking.org Seconds, Fewer Is Better Whisper.cpp 1.6.2 Model: ggml-small.en - Input: 2016 State of the Union m7g.8xlarge 40 80 120 160 200 SE +/- 2.02, N = 4 179.73 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -mcpu=native
OpenBenchmarking.org Seconds, Fewer Is Better Whisper.cpp 1.6.2 Model: ggml-base.en - Input: 2016 State of the Union m7g.8xlarge 20 40 60 80 100 SE +/- 0.69, N = 15 81.61 1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -mcpu=native
Llama.cpp OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b3067 Model: Meta-Llama-3-8B-Instruct-Q8_0.gguf m7g.8xlarge 5 10 15 20 25 SE +/- 0.14, N = 3 22.47 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -mcpu=native -lopenblas
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: IP Shapes 3D - Engine: CPU m7g.8xlarge 1.0184 2.0368 3.0552 4.0736 5.092 SE +/- 0.12704, N = 15 4.52619 MIN: 4.13 1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl
OpenCV This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: DNN - Deep Neural Network m7g.8xlarge 5K 10K 15K 20K 25K SE +/- 427.59, N = 15 23459 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
Test: Graph API
m7g.8xlarge: The test quit with a non-zero exit status. E: AbsExact error: G-API output and reference output matrixes are not bitexact equal.
m7g.8xlarge Processor: ARMv8 Neoverse-V1 (32 Cores), Motherboard: Amazon EC2 m7g.8xlarge (1.0 BIOS), Chipset: Amazon Device 0200, Memory: 128GB, Disk: 537GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Ubuntu 22.04, Kernel: 6.5.0-1017-aws (aarch64), Vulkan: 1.3.255, Compiler: GCC 11.4.0, File-System: ext4, System Layer: amazon
Kernel Notes: Transparent Huge Pages: madviseCompiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -vSecurity Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
Testing initiated at 1 July 2024 09:32 by user root.