oneDNN 3.0 Intel Sapphire Rapids AMX

2 x Intel Xeon Platinum 8490H oneDNN Intel AMX Sapphire Rapids benchmarks by Michael Larabel for a future article.

AVX512_CORE_AMX, AVX512_CORE_FP16, AVX512_CORE_BF16, AVX512_CORE_VNNI, AVX512_CORE (identical system for all five runs):

  Processor: 2 x Intel Xeon Platinum 8490H @ 3.50GHz (120 Cores / 240 Threads), Motherboard: Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS), Chipset: Intel Device 1bce, Memory: 16 x 64 GB 4800MT/s Samsung M321R8GA0BB0-CQKEG, Disk: 2 x 1920GB SAMSUNG MZWLJ1T9HBJR-00007 + 3841GB Micron_9300_MTFDHAL3T8TDP + 960GB INTEL SSDSC2KG96, Graphics: ASPEED, Monitor: VGA HDMI, Network: 4 x Intel E810-C for QSFP + 2 x Intel X710 for 10GBASE-T

  OS: Ubuntu 22.04, Kernel: 6.1.4-060104-generic (x86_64), Desktop: GNOME Shell 42.2, Display Server: X Server 1.21.1.3, Vulkan: 1.2.204, Compiler: GCC 11.3.0 + Clang 14.0.0-1ubuntu1, File-System: ext4, Screen Resolution: 1920x1080


oneDNN 3.0
Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AVX512_CORE_AMX .. 0.580646 |=========
AVX512_CORE_FP16 . 2.170730 |===================================
AVX512_CORE_BF16 . 2.171090 |===================================
AVX512_CORE_VNNI . 3.067210 |==================================================
AVX512_CORE ...... 3.056120 |==================================================

oneDNN 3.0
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
AVX512_CORE_AMX .. MIN: 1900  AVG: 2766  MAX: 3501
AVX512_CORE_FP16 . MIN: 1900  AVG: 2844  MAX: 3506
AVX512_CORE_BF16 . MIN: 1900  AVG: 2791  MAX: 3502
AVX512_CORE_VNNI . MIN: 1900  AVG: 2917  MAX: 3502
AVX512_CORE ...... MIN: 1900  AVG: 2858  MAX: 3501

oneDNN 3.0
CPU Power Consumption Monitor
Watts < Lower Is Better
AVX512_CORE_AMX .. MIN: 194  AVG: 534  MAX: 637
AVX512_CORE_FP16 . MIN: 194  AVG: 586  MAX: 721
AVX512_CORE_BF16 . MIN: 199  AVG: 591  MAX: 722
AVX512_CORE_VNNI . MIN: 201  AVG: 596  MAX: 744
AVX512_CORE ...... MIN: 197  AVG: 596  MAX: 744

oneDNN 3.0
CPU Temperature Monitor
Celsius < Lower Is Better
AVX512_CORE_AMX .. MIN: 29.0  AVG: 41.4  MAX: 47.0
AVX512_CORE_FP16 . MIN: 33.0  AVG: 46.0  MAX: 52.0
AVX512_CORE_BF16 . MIN: 28.0  AVG: 43.1  MAX: 51.0
AVX512_CORE_VNNI . MIN: 29.0  AVG: 43.7  MAX: 51.0
AVX512_CORE ...... MIN: 30.0  AVG: 44.9  MAX: 52.0


oneDNN 3.0
Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AVX512_CORE_AMX .. 0.445779 |=========
AVX512_CORE_FP16 . 1.045110 |======================
AVX512_CORE_BF16 . 1.048360 |======================
AVX512_CORE_VNNI . 2.378920 |==================================================
AVX512_CORE ...... 2.383610 |==================================================

oneDNN 3.0
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
AVX512_CORE_AMX .. MIN: 1900  AVG: 2918  MAX: 3508
AVX512_CORE_FP16 . MIN: 1900  AVG: 2755  MAX: 3504
AVX512_CORE_BF16 . MIN: 1900  AVG: 3110  MAX: 3502
AVX512_CORE_VNNI . MIN: 1900  AVG: 2952  MAX: 3506
AVX512_CORE ...... MIN: 1900  AVG: 2867  MAX: 3510

oneDNN 3.0
CPU Power Consumption Monitor
Watts < Lower Is Better
AVX512_CORE_AMX .. MIN: 194  AVG: 440  MAX: 756
AVX512_CORE_FP16 . MIN: 196  AVG: 416  MAX: 702
AVX512_CORE_BF16 . MIN: 195  AVG: 425  MAX: 705
AVX512_CORE_VNNI . MIN: 198  AVG: 428  MAX: 722
AVX512_CORE ...... MIN: 200  AVG: 430  MAX: 721

oneDNN 3.0
CPU Temperature Monitor
Celsius < Lower Is Better
AVX512_CORE_AMX .. MIN: 35.0  AVG: 40.2  MAX: 45.0
AVX512_CORE_FP16 . MIN: 36.0  AVG: 40.9  MAX: 46.0
AVX512_CORE_BF16 . MIN: 35.0  AVG: 40.2  MAX: 45.0
AVX512_CORE_VNNI . MIN: 35.0  AVG: 39.8  MAX: 46.0
AVX512_CORE ...... MIN: 35.0  AVG: 40.7  MAX: 47.0


oneDNN 3.0
Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AVX512_CORE_AMX .. 5.73717 |===========================================
AVX512_CORE_FP16 . 6.48366 |================================================
AVX512_CORE_BF16 . 6.58841 |=================================================
AVX512_CORE_VNNI . 6.82433 |===================================================
AVX512_CORE ...... 6.64816 |==================================================

oneDNN 3.0
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
AVX512_CORE_AMX .. MIN: 1900  AVG: 2985  MAX: 3512
AVX512_CORE_FP16 . MIN: 1900  AVG: 2924  MAX: 3510
AVX512_CORE_BF16 . MIN: 1900  AVG: 2933  MAX: 3508
AVX512_CORE_VNNI . MIN: 1900  AVG: 2940  MAX: 3515
AVX512_CORE ...... MIN: 1900  AVG: 2926  MAX: 3513

oneDNN 3.0
CPU Power Consumption Monitor
Watts < Lower Is Better
AVX512_CORE_AMX .. MIN: 106  AVG: 493  MAX: 604
AVX512_CORE_FP16 . MIN: 195  AVG: 508  MAX: 657
AVX512_CORE_BF16 . MIN: 197  AVG: 510  MAX: 652
AVX512_CORE_VNNI . MIN: 193  AVG: 534  MAX: 705
AVX512_CORE ...... MIN: 200  AVG: 536  MAX: 710

oneDNN 3.0
CPU Temperature Monitor
Celsius < Lower Is Better
AVX512_CORE_AMX .. MIN: 34.0  AVG: 43.4  MAX: 47.0
AVX512_CORE_FP16 . MIN: 34.0  AVG: 44.5  MAX: 48.0
AVX512_CORE_BF16 . MIN: 34.0  AVG: 44.2  MAX: 47.0
AVX512_CORE_VNNI . MIN: 34.0  AVG: 45.1  MAX: 50.0
AVX512_CORE ...... MIN: 34.0  AVG: 45.2  MAX: 49.0


oneDNN 3.0
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AVX512_CORE_AMX .. 0.329961  |=
AVX512_CORE_FP16 . 0.383785  |=
AVX512_CORE_BF16 . 0.386636  |=
AVX512_CORE_VNNI . 16.575100 |=================================================
AVX512_CORE ...... 16.458900 |=================================================

oneDNN 3.0
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
AVX512_CORE_AMX .. MIN: 1900  AVG: 2877  MAX: 5340
AVX512_CORE_FP16 . MIN: 1900  AVG: 2877  MAX: 3502
AVX512_CORE_BF16 . MIN: 1900  AVG: 2766  MAX: 3509
AVX512_CORE_VNNI . MIN: 1900  AVG: 3014  MAX: 3512
AVX512_CORE ...... MIN: 1900  AVG: 2959  MAX: 3508

oneDNN 3.0
CPU Power Consumption Monitor
Watts < Lower Is Better
AVX512_CORE_AMX .. MIN: 197  AVG: 485  MAX: 593
AVX512_CORE_FP16 . MIN: 199  AVG: 512  MAX: 647
AVX512_CORE_BF16 . MIN: 196  AVG: 507  MAX: 647
AVX512_CORE_VNNI . MIN: 199  AVG: 486  MAX: 601
AVX512_CORE ...... MIN: 197  AVG: 487  MAX: 609

oneDNN 3.0
CPU Temperature Monitor
Celsius < Lower Is Better
AVX512_CORE_AMX .. MIN: 35.0  AVG: 42.5  MAX: 46.0
AVX512_CORE_FP16 . MIN: 36.0  AVG: 44.2  MAX: 48.0
AVX512_CORE_BF16 . MIN: 35.0  AVG: 43.7  MAX: 48.0
AVX512_CORE_VNNI . MIN: 36.0  AVG: 42.7  MAX: 49.0
AVX512_CORE ...... MIN: 35.0  AVG: 42.8  MAX: 46.0


oneDNN 3.0
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
AVX512_CORE_AMX .. 850.35 |===================================================
AVX512_CORE_FP16 . 875.28 |====================================================
AVX512_CORE_BF16 . 864.46 |===================================================
AVX512_CORE_VNNI . 862.42 |===================================================
AVX512_CORE ...... 856.90 |===================================================

oneDNN 3.0
CPU Peak Freq (Highest CPU Core Frequency) Monitor
Megahertz > Higher Is Better
AVX512_CORE_AMX .. MIN: 1900  AVG: 2927  MAX: 3518
AVX512_CORE_FP16 . MIN: 1900  AVG: 2923  MAX: 3516
AVX512_CORE_BF16 . MIN: 1900  AVG: 2923  MAX: 3515
AVX512_CORE_VNNI . MIN: 1900  AVG: 2924  MAX: 3514
AVX512_CORE ...... MIN: 1900  AVG: 2926  MAX: 3515

oneDNN 3.0
CPU Power Consumption Monitor
Watts < Lower Is Better
AVX512_CORE_AMX .. MIN: 196  AVG: 576  MAX: 732
AVX512_CORE_FP16 . MIN: 194  AVG: 576  MAX: 732
AVX512_CORE_BF16 . MIN: 197  AVG: 577  MAX: 733
AVX512_CORE_VNNI . MIN: 197  AVG: 576  MAX: 733
AVX512_CORE ...... MIN: 201  AVG: 576  MAX: 734

oneDNN 3.0
CPU Temperature Monitor
Celsius < Lower Is Better
AVX512_CORE_AMX .. MIN: 37.0  AVG: 48.9  MAX: 53.0
AVX512_CORE_FP16 . MIN: 37.0  AVG: 49.6  MAX: 55.0
AVX512_CORE_BF16 . MIN: 38.0  AVG: 49.7  MAX: 54.0
AVX512_CORE_VNNI . MIN: 35.0  AVG: 48.7  MAX: 53.0
AVX512_CORE ...... MIN: 34.0  AVG: 48.6  MAX: 53.0


CPU Peak Freq (Highest CPU Core Frequency) Monitor
Phoronix Test Suite System Monitoring
Megahertz
AVX512_CORE_AMX .. MIN: 1900  AVG: 2940  MAX: 5340
AVX512_CORE_FP16 . MIN: 1900  AVG: 2939  MAX: 5424
AVX512_CORE_BF16 . MIN: 1900  AVG: 2942  MAX: 3518
AVX512_CORE_VNNI . MIN: 1900  AVG: 2942  MAX: 3515
AVX512_CORE ...... MIN: 1900  AVG: 2934  MAX: 3515

CPU Power Consumption Monitor
Phoronix Test Suite System Monitoring
Watts
AVX512_CORE_AMX .. MIN: 106  AVG: 565  MAX: 756
AVX512_CORE_FP16 . MIN: 194  AVG: 571  MAX: 756
AVX512_CORE_BF16 . MIN: 105  AVG: 568  MAX: 756
AVX512_CORE_VNNI . MIN: 105  AVG: 550  MAX: 744
AVX512_CORE ...... MIN: 195  AVG: 545  MAX: 744

CPU Temperature Monitor
Phoronix Test Suite System Monitoring
Celsius
AVX512_CORE_AMX .. MIN: 29.0  AVG: 48.9  MAX: 54.0
AVX512_CORE_FP16 . MIN: 33.0  AVG: 49.7  MAX: 56.0
AVX512_CORE_BF16 . MIN: 28.0  AVG: 49.0  MAX: 54.0
AVX512_CORE_VNNI . MIN: 29.0  AVG: 47.1  MAX: 53.0
AVX512_CORE ...... MIN: 30.0  AVG: 46.7  MAX: 53.0
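
The ms results above are easier to compare as ratios against a baseline run. The following is a minimal Python sketch of that arithmetic, not part of the Phoronix Test Suite or oneDNN: the names RUNS, RESULTS_MS and speedup_vs are illustrative, and the numbers are simply transcribed from the five harness results listed above.

```python
# Run labels in the order they appear in each result block above.
RUNS = [
    "AVX512_CORE_AMX",
    "AVX512_CORE_FP16",
    "AVX512_CORE_BF16",
    "AVX512_CORE_VNNI",
    "AVX512_CORE",
]

# Time in milliseconds (lower is better), transcribed from the results above.
# All harnesses use data type bf16bf16bf16 on the CPU engine.
RESULTS_MS = {
    "Deconvolution Batch shapes_1d": [0.580646, 2.170730, 2.171090, 3.067210, 3.056120],
    "Deconvolution Batch shapes_3d": [0.445779, 1.045110, 1.048360, 2.378920, 2.383610],
    "IP Shapes 1D": [5.73717, 6.48366, 6.58841, 6.82433, 6.64816],
    "Matrix Multiply Batch Shapes Transformer": [0.329961, 0.383785, 0.386636, 16.575100, 16.458900],
    "Recurrent Neural Network Inference": [850.35, 875.28, 864.46, 862.42, 856.90],
}


def speedup_vs(baseline: str, harness: str) -> dict[str, float]:
    """Return baseline_time / run_time for every run of one harness.

    Values above 1.0 mean the run was faster than the baseline run.
    """
    times = dict(zip(RUNS, RESULTS_MS[harness]))
    base = times[baseline]
    return {run: base / t for run, t in times.items()}


if __name__ == "__main__":
    for harness in RESULTS_MS:
        print(harness)
        for run, ratio in speedup_vs("AVX512_CORE", harness).items():
            print(f"  {run:<17} {ratio:6.2f}x")
```

With AVX512_CORE as the baseline, the AVX512_CORE_AMX run comes out roughly 5.3x faster on Deconvolution Batch shapes_1d and roughly 50x faster on Matrix Multiply Batch Shapes Transformer, while the Recurrent Neural Network Inference results differ by about 1%.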