Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2402103-NE-2402012NE79
nlp-benchmarks
AWS EC2 Amazon Linux 2023 Benchmarking
c6i.2xlarge:
Processor: Intel Xeon Platinum 8375C (4 Cores / 8 Threads), Motherboard: Amazon EC2 c6i.2xlarge (1.0 BIOS), Chipset: Intel 440FX 82441FX PMC, Memory: 1 x 16GB DDR4-3200MT/s, Disk: 215GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Amazon Linux 2023, Kernel: 6.1.61-85.141.amzn2023.x86_64 (x86_64), Compiler: GCC 11.4.1 20230605, File-System: xfs, System Layer: amazon
m7i-flex.2xlarge:
Processor: Intel Xeon Platinum 8488C (4 Cores / 8 Threads), Motherboard: Amazon EC2 m7i-flex.2xlarge (1.0 BIOS), Chipset: Intel 440FX 82441FX PMC, Memory: 1 x 32GB 4800MT/s, Disk: 215GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Amazon Linux 2023, Kernel: 6.1.72-96.166.amzn2023.x86_64 (x86_64), Compiler: GCC 11.4.1 20230605, File-System: xfs, System Layer: amazon
c7a.2xlarge:
Processor: AMD EPYC 9R14 (8 Cores), Motherboard: Amazon EC2 c7a.2xlarge (1.0 BIOS), Chipset: Intel 440FX 82441FX PMC, Memory: 1 x 16GB 4800MT/s, Disk: 215GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Amazon Linux 2023, Kernel: 6.1.72-96.166.amzn2023.x86_64 (x86_64), Compiler: GCC 11.4.1 20230605, File-System: xfs, System Layer: amazon
r7a.xlarge:
Processor: AMD EPYC 9R14 (4 Cores), Motherboard: Amazon EC2 r7a.xlarge (1.0 BIOS), Chipset: Intel 440FX 82441FX PMC, Memory: 1 x 32GB 4800MT/s, Disk: 215GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Amazon Linux 2023, Kernel: 6.1.72-96.166.amzn2023.x86_64 (x86_64), Compiler: GCC 11.4.1 20230605, File-System: xfs, System Layer: amazon
m7i.2xlarge:
Processor: Intel Xeon Platinum 8488C (4 Cores / 8 Threads), Motherboard: Amazon EC2 m7i.2xlarge (1.0 BIOS), Chipset: Intel 440FX 82441FX PMC, Memory: 1 x 32GB 4800MT/s, Disk: 215GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Amazon Linux 2023, Kernel: 6.1.72-96.166.amzn2023.x86_64 (x86_64), Compiler: GCC 11.4.1 20230605, File-System: xfs, System Layer: amazon
r7iz.xlarge:
Processor: Intel Xeon Gold 6455B (2 Cores / 4 Threads), Motherboard: Amazon EC2 r7iz.xlarge (1.0 BIOS), Chipset: Intel 440FX 82441FX PMC, Memory: 1 x 32GB 4800MT/s, Disk: 215GB Amazon Elastic Block Store, Network: Amazon Elastic
OS: Amazon Linux 2023, Kernel: 6.1.72-96.166.amzn2023.x86_64 (x86_64), Compiler: GCC 11.4.1 20230605, File-System: xfs, System Layer: amazon
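The system descriptions above are flat "Key: value" lists separated by commas. To compare instances programmatically, a minimal sketch like the following could split one such line into a dictionary. The function name parse_spec_line is hypothetical and not part of the Phoronix Test Suite, and the sketch assumes that no value contains the sequence ", " itself, which holds for the lines in this file.

```python
def parse_spec_line(line):
    """Split a 'Key: value, Key: value' description line into a dict.

    Hypothetical helper for illustration; assumes values never contain ', '.
    """
    fields = {}
    for part in line.split(", "):
        key, sep, value = part.partition(": ")
        if sep:  # skip fragments that have no 'Key: ' prefix
            fields[key.strip()] = value.strip()
    return fields


# Excerpt of the c6i.2xlarge hardware line from this result file.
hardware = parse_spec_line(
    "Processor: Intel Xeon Platinum 8375C (4 Cores / 8 Threads), "
    "Motherboard: Amazon EC2 c6i.2xlarge (1.0 BIOS), "
    "Memory: 1 x 16GB DDR4-3200MT/s"
)

for key, value in hardware.items():
    print(f"{key}: {value}")
```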
PyTorch 2.1
Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
c6i.2xlarge ...... 4.06 |=========================
m7i-flex.2xlarge . 5.38 |=================================
c7a.2xlarge ...... 8.75 |======================================================
r7a.xlarge ....... 5.66 |===================================
m7i.2xlarge ...... 5.26 |================================
r7iz.xlarge ...... 4.63 |=============================
PyTorch 2.1
Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
c6i.2xlarge ...... 4.04 |=========================
m7i-flex.2xlarge . 5.47 |==================================
c7a.2xlarge ...... 8.71 |======================================================
r7a.xlarge ....... 5.67 |===================================
m7i.2xlarge ...... 5.28 |=================================
r7iz.xlarge ...... 4.63 |=============================
PyTorch 2.1
Device: CPU - Batch Size: 32 - Model: ResNet-50
batches/sec > Higher Is Better
c6i.2xlarge ...... 15.81 |===========================
m7i-flex.2xlarge . 18.34 |===============================
c7a.2xlarge ...... 31.29 |=====================================================
r7a.xlarge ....... 19.32 |=================================
m7i.2xlarge ...... 19.43 |=================================
r7iz.xlarge ...... 15.30 |==========================
PyTorch 2.1
Device: CPU - Batch Size: 16 - Model: ResNet-152
batches/sec > Higher Is Better
c6i.2xlarge ...... 6.36 |==========================
m7i-flex.2xlarge . 7.36 |==============================
c7a.2xlarge ...... 12.95 |=====================================================
r7a.xlarge ....... 7.68 |===============================
m7i.2xlarge ...... 7.65 |===============================
r7iz.xlarge ...... 6.37 |==========================
PyTorch 2.1
Device: CPU - Batch Size: 32 - Model: ResNet-152
batches/sec > Higher Is Better
c6i.2xlarge ...... 6.38 |==========================
m7i-flex.2xlarge . 7.55 |===============================
c7a.2xlarge ...... 13.03 |=====================================================
r7a.xlarge ....... 7.68 |===============================
m7i.2xlarge ...... 7.66 |===============================
r7iz.xlarge ...... 6.39 |==========================
PyTorch 2.1
Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
c6i.2xlarge ...... 7.99 |====================================
m7i-flex.2xlarge . 8.99 |=========================================
c7a.2xlarge ...... 11.74 |=====================================================
r7a.xlarge ....... 8.44 |======================================
m7i.2xlarge ...... 8.84 |========================================
r7iz.xlarge ...... 8.17 |=====================================
Numpy Benchmark
Score > Higher Is Better
c6i.2xlarge ...... 374.99 |=================================
m7i-flex.2xlarge . 438.25 |======================================
c7a.2xlarge ...... 590.10 |====================================================
r7a.xlarge ....... 595.01 |====================================================
m7i.2xlarge ...... 452.50 |========================================
r7iz.xlarge ...... 554.28 |================================================
PyTorch 2.1
Device: CPU - Batch Size: 16 - Model: ResNet-50
batches/sec > Higher Is Better
c6i.2xlarge ...... 15.96 |===========================
m7i-flex.2xlarge . 18.84 |===============================
c7a.2xlarge ...... 31.89 |=====================================================
r7a.xlarge ....... 19.35 |================================
m7i.2xlarge ...... 19.62 |=================================
r7iz.xlarge ...... 15.92 |==========================
PyTorch 2.1
Device: CPU - Batch Size: 1 - Model: ResNet-152
batches/sec > Higher Is Better
c6i.2xlarge ...... 10.57 |============================
m7i-flex.2xlarge . 12.10 |================================
c7a.2xlarge ...... 20.00 |=====================================================
r7a.xlarge ....... 13.19 |===================================
m7i.2xlarge ...... 12.48 |=================================
r7iz.xlarge ...... 11.07 |=============================
oneDNN 3.3
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms < Lower Is Better
c6i.2xlarge ...... 2496.84 |==========================================
m7i-flex.2xlarge . 2382.46 |========================================
c7a.2xlarge ...... 1482.35 |=========================
r7a.xlarge ....... 2856.84 |================================================
m7i.2xlarge ...... 2320.94 |=======================================
r7iz.xlarge ...... 3062.69 |===================================================
oneDNN 3.3
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
c6i.2xlarge ...... 2492.29 |=========================================
m7i-flex.2xlarge . 2389.64 |========================================
c7a.2xlarge ...... 1480.93 |=========================
r7a.xlarge ....... 2857.77 |================================================
m7i.2xlarge ...... 2310.90 |======================================
r7iz.xlarge ...... 3064.81 |===================================================
oneDNN 3.3
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
c6i.2xlarge ...... 2501.50 |==========================================
m7i-flex.2xlarge . 2318.31 |=======================================
c7a.2xlarge ...... 1478.51 |=========================
r7a.xlarge ....... 2862.86 |================================================
m7i.2xlarge ...... 2303.44 |======================================
r7iz.xlarge ...... 3062.84 |===================================================
OpenVINO 2023.2.dev
Model: Face Detection FP16 - Device: CPU
ms < Lower Is Better
c6i.2xlarge ...... 2252.04 |===================================================
m7i-flex.2xlarge . 474.46 |===========
c7a.2xlarge ...... 774.23 |==================
r7a.xlarge ....... 764.30 |=================
m7i.2xlarge ...... 511.54 |============
r7iz.xlarge ...... 341.08 |========
OpenVINO 2023.2.dev
Model: Face Detection FP16 - Device: CPU
FPS > Higher Is Better
c6i.2xlarge ...... 1.77 |===========
m7i-flex.2xlarge . 8.43 |======================================================
c7a.2xlarge ...... 5.16 |=================================
r7a.xlarge ....... 2.62 |=================
m7i.2xlarge ...... 7.80 |==================================================
r7iz.xlarge ...... 5.86 |======================================
OpenVINO 2023.2.dev
Model: Face Detection FP16-INT8 - Device: CPU
ms < Lower Is Better
c6i.2xlarge ...... 610.59 |====================================================
m7i-flex.2xlarge . 241.21 |=====================
c7a.2xlarge ...... 409.78 |===================================
r7a.xlarge ....... 408.02 |===================================
m7i.2xlarge ...... 275.65 |=======================
r7iz.xlarge ...... 186.64 |================
OpenVINO 2023.2.dev
Model: Face Detection FP16-INT8 - Device: CPU
FPS > Higher Is Better
c6i.2xlarge ...... 6.53 |=====================
m7i-flex.2xlarge . 16.57 |=====================================================
c7a.2xlarge ...... 9.76 |===============================
r7a.xlarge ....... 4.90 |================
m7i.2xlarge ...... 14.51 |==============================================
r7iz.xlarge ...... 10.72 |==================================
OpenVINO 2023.2.dev
Model: Machine Translation EN To DE FP16 - Device: CPU
ms < Lower Is Better
c6i.2xlarge ...... 179.53 |====================================================
m7i-flex.2xlarge . 74.20 |=====================
c7a.2xlarge ...... 73.34 |=====================
r7a.xlarge ....... 64.98 |===================
m7i.2xlarge ...... 80.94 |=======================
r7iz.xlarge ...... 58.03 |=================
OpenVINO 2023.2.dev
Model: Machine Translation EN To DE FP16 - Device: CPU
FPS > Higher Is Better
c6i.2xlarge ...... 22.26 |======================
m7i-flex.2xlarge . 53.87 |====================================================
c7a.2xlarge ...... 54.51 |=====================================================
r7a.xlarge ....... 30.77 |==============================
m7i.2xlarge ...... 49.39 |================================================
r7iz.xlarge ...... 34.45 |=================================
PyTorch 2.1
Device: CPU - Batch Size: 1 - Model: ResNet-50
batches/sec > Higher Is Better
c6i.2xlarge ...... 26.78 |============================
m7i-flex.2xlarge . 29.89 |================================
c7a.2xlarge ...... 50.16 |=====================================================
r7a.xlarge ....... 33.39 |===================================
m7i.2xlarge ...... 31.36 |=================================
r7iz.xlarge ...... 27.43 |=============================
oneDNN 3.3
Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
c6i.2xlarge ...... 2.038360 |===============================================
m7i-flex.2xlarge . 1.016327 |=======================
c7a.2xlarge ...... 1.094290 |=========================
r7a.xlarge ....... 2.180290 |==================================================
m7i.2xlarge ...... 1.030110 |========================
r7iz.xlarge ...... 1.427360 |=================================
oneDNN 3.3
Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
ms < Lower Is Better
c6i.2xlarge ...... 12.66180 |==================================================
m7i-flex.2xlarge . 8.10007 |================================
c7a.2xlarge ...... 5.01437 |====================
r7a.xlarge ....... 9.85441 |=======================================
m7i.2xlarge ...... 8.61718 |==================================
r7iz.xlarge ...... 11.19730 |============================================
oneDNN 3.3
Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
c6i.2xlarge ...... 46.69860 |==================================================
m7i-flex.2xlarge . 1.76350 |==
c7a.2xlarge ...... 6.90200 |=======
r7a.xlarge ....... 13.73180 |===============
m7i.2xlarge ...... 1.79360 |==
r7iz.xlarge ...... 2.37705 |===
PyBench 2018-02-16
Total For Average Test Times
Milliseconds < Lower Is Better
c6i.2xlarge ...... 1000 |======================================================
m7i-flex.2xlarge . 736 |========================================
c7a.2xlarge ...... 887 |================================================
r7a.xlarge ....... 887 |================================================
m7i.2xlarge ...... 815 |============================================
r7iz.xlarge ...... 691 |=====================================
oneDNN 3.3
Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms < Lower Is Better
c6i.2xlarge ...... 5.93669 |=====================================
m7i-flex.2xlarge . 5.84261 |====================================
c7a.2xlarge ...... 7.35101 |==============================================
r7a.xlarge ....... 8.22691 |===================================================
m7i.2xlarge ...... 5.88019 |====================================
r7iz.xlarge ...... 7.66745 |================================================
oneDNN 3.3
Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
c6i.2xlarge ...... 6.24391 |===============================
m7i-flex.2xlarge . 7.19462 |===================================
c7a.2xlarge ...... 7.74619 |======================================
r7a.xlarge ....... 7.38732 |====================================
m7i.2xlarge ...... 6.61974 |================================
r7iz.xlarge ...... 10.19330 |==================================================
oneDNN 3.3
Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
c6i.2xlarge ...... 33.19200 |==================================================
m7i-flex.2xlarge . 3.68993 |======
c7a.2xlarge ...... 3.28245 |=====
r7a.xlarge ....... 5.40784 |========
m7i.2xlarge ...... 3.37610 |=====
r7iz.xlarge ...... 4.99403 |========
oneDNN 3.3
Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
ms < Lower Is Better
c6i.2xlarge ...... 1.772860 |===================================
m7i-flex.2xlarge . 1.003269 |====================
c7a.2xlarge ...... 1.264050 |=========================
r7a.xlarge ....... 2.529820 |==================================================
m7i.2xlarge ...... 1.066020 |=====================
r7iz.xlarge ...... 1.457260 |=============================
oneDNN 3.3
Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
c6i.2xlarge ...... 34.71660 |==================================================
m7i-flex.2xlarge . 3.11075 |====
c7a.2xlarge ...... 3.12339 |====
r7a.xlarge ....... 6.25907 |=========
m7i.2xlarge ...... 3.20792 |=====
r7iz.xlarge ...... 4.09311 |======
oneDNN 3.3
Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
ms < Lower Is Better
c6i.2xlarge ...... 8.09803 |====================================
m7i-flex.2xlarge . 7.94878 |===================================
c7a.2xlarge ...... 5.10330 |=======================
r7a.xlarge ....... 10.16790 |=============================================
m7i.2xlarge ...... 8.34462 |=====================================
r7iz.xlarge ...... 11.24930 |==================================================
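The result blocks above all share one row layout: system name, dot padding, value, then a bar. As a rough sketch of how such a block could be turned into relative numbers, the following parses the rows and expresses each system as a speedup over a chosen baseline. The names parse_rows and speedup_over are hypothetical helpers, not part of the Phoronix Test Suite, and the regular expression is an assumption based only on the rows shown in this file. Whether lower or higher is better has to be supplied by the caller from the block's header line.

```python
import re

# Assumed row layout: "<system> <dots> <value> |<bar>", as seen in this file.
ROW = re.compile(r"^(\S+)\s+\.+\s+([0-9]+(?:\.[0-9]+)?)\s+\|=*$")


def parse_rows(text):
    """Return {system: value} for every result row found in text."""
    results = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:
            results[match.group(1)] = float(match.group(2))
    return results


def speedup_over(results, baseline, lower_is_better):
    """Express each result as a ratio to baseline; above 1.0 means faster."""
    base = results[baseline]
    return {
        name: (base / value if lower_is_better else value / base)
        for name, value in results.items()
    }


# oneDNN 3.3, Deconvolution Batch shapes_3d, f32 (ms, lower is better).
SAMPLE = """\
c6i.2xlarge ...... 8.09803 |====================================
m7i-flex.2xlarge . 7.94878 |===================================
c7a.2xlarge ...... 5.10330 |=======================
r7a.xlarge ....... 10.16790 |=============================================
m7i.2xlarge ...... 8.34462 |=====================================
r7iz.xlarge ...... 11.24930 |==================================================
"""

results = parse_rows(SAMPLE)
speedups = speedup_over(results, "c6i.2xlarge", lower_is_better=True)

for name, ratio in speedups.items():
    print(f"{name:18s} {ratio:.2f}x")
```

For "Higher Is Better" blocks such as the PyTorch results, pass lower_is_better=False so that the ratio is value over baseline instead.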