oneDNN 3.0 Intel Sapphire Rapids AMX

2 x Intel Xeon Platinum 8490H oneDNN Intel AMX Sapphire Rapids benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2301158-NE-ONEDNN30I77&sro&gru.

Result identifiers: AVX512_CORE_AMX, AVX512_CORE_FP16, AVX512_CORE_BF16, AVX512_CORE_VNNI, AVX512_CORE

Processor: 2 x Intel Xeon Platinum 8490H @ 3.50GHz (120 Cores / 240 Threads)
Motherboard: Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS)
Chipset: Intel Device 1bce
Memory: 16 x 64 GB 4800MT/s Samsung M321R8GA0BB0-CQKEG
Disk: 2 x 1920GB SAMSUNG MZWLJ1T9HBJR-00007 + 3841GB Micron_9300_MTFDHAL3T8TDP + 960GB INTEL SSDSC2KG96
Graphics: ASPEED
Monitor: VGA HDMI
Network: 4 x Intel E810-C for QSFP + 2 x Intel X710 for 10GBASE-T
OS: Ubuntu 22.04
Kernel: 6.1.4-060104-generic (x86_64)
Desktop: GNOME Shell 42.2
Display Server: X Server 1.21.1.3
Vulkan: 1.2.204
Compiler: GCC 11.3.0 + Clang 14.0.0-1ubuntu1
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate performance (EPP: performance)
- CPU Microcode: 0x2b0000c0

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence
- srbds: Not affected
- tsx_async_abort: Not affected

oneDNN 3.0 Intel Sapphire Rapids AMX, onednn results (Data Type: bf16bf16bf16, Engine: CPU)

Test                                      AVX512_CORE_AMX   AVX512_CORE_FP16   AVX512_CORE_BF16   AVX512_CORE_VNNI   AVX512_CORE
Deconvolution Batch shapes_1d             0.580646          2.17073            2.17109            3.06721            3.05612
Deconvolution Batch shapes_3d             0.445779          1.04511            1.04836            2.37892            2.38361
IP Shapes 1D                              5.73717           6.48366            6.58841            6.82433            6.64816
Matrix Multiply Batch Shapes Transformer  0.329961          0.383785           0.386636           16.5751            16.4589
Recurrent Neural Network Inference        850.346           875.284            864.459            862.421            856.896
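To make the overview easier to read, the following minimal sketch computes how much faster the AVX512_CORE_AMX run is than the AVX512_CORE run for each test. The numbers are copied from the overview above; the names `results_ms`, `speedup`, and `speedups` are illustrative and not part of the Phoronix Test Suite or oneDNN. Since the results are times where fewer is better, the ratio is baseline time divided by candidate time.

```python
# Average times copied from the results overview (fewer is better).
# Each entry maps a test name to (AVX512_CORE, AVX512_CORE_AMX).
results_ms = {
    "Deconvolution Batch shapes_1d": (3.05612, 0.580646),
    "Deconvolution Batch shapes_3d": (2.38361, 0.445779),
    "IP Shapes 1D": (6.64816, 5.73717),
    "Matrix Multiply Batch Shapes Transformer": (16.4589, 0.329961),
    "Recurrent Neural Network Inference": (856.896, 850.346),
}


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """Return how many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# Speedup of the AVX512_CORE_AMX run over the AVX512_CORE run, per test.
speedups = {
    name: speedup(core_ms, amx_ms)
    for name, (core_ms, amx_ms) in results_ms.items()
}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

A ratio near 1 means the two runs performed about the same on that test; larger ratios indicate a larger advantage for the AVX512_CORE_AMX run.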

CPU Temperature Monitor

Phoronix Test Suite System Monitoring

Celsius

                   Min   Avg     Max
AVX512_CORE        30    46.72   53
AVX512_CORE_AMX    29    48.95   54
AVX512_CORE_BF16   28    49      54
AVX512_CORE_FP16   33    49.66   56
AVX512_CORE_VNNI   29    47.07   53

CPU Peak Freq (Highest CPU Core Frequency) Monitor

Phoronix Test Suite System Monitoring

Megahertz

                   Min    Avg       Max
AVX512_CORE        1900   2933.61   3515
AVX512_CORE_AMX    1900   2940.37   5340
AVX512_CORE_BF16   1900   2941.89   3518
AVX512_CORE_FP16   1900   2938.87   5424
AVX512_CORE_VNNI   1900   2941.72   3515

CPU Power Consumption Monitor

Phoronix Test Suite System Monitoring

Watts

                   Min      Avg      Max
AVX512_CORE        195.36   545.16   744.31
AVX512_CORE_AMX    106.17   565.15   756.46
AVX512_CORE_BF16   105.24   567.97   756.24
AVX512_CORE_FP16   193.64   570.71   756.46
AVX512_CORE_VNNI   104.8    550.33   744.38

oneDNN

CPU Peak Freq (Highest CPU Core Frequency) Monitor

Megahertz, More Is Better (oneDNN 3.0)

                   Min    Avg    Max
AVX512_CORE        1900   2858   3501
AVX512_CORE_AMX    1900   2766   3501
AVX512_CORE_BF16   1900   2791   3502
AVX512_CORE_FP16   1900   2844   3506
AVX512_CORE_VNNI   1900   2917   3502

oneDNN

CPU Peak Freq (Highest CPU Core Frequency) Monitor

Megahertz, More Is Better (oneDNN 3.0)

                   Min    Avg    Max
AVX512_CORE        1900   2867   3510
AVX512_CORE_AMX    1900   2918   3508
AVX512_CORE_BF16   1900   3110   3502
AVX512_CORE_FP16   1900   2755   3504
AVX512_CORE_VNNI   1900   2952   3506

oneDNN

CPU Peak Freq (Highest CPU Core Frequency) Monitor

Megahertz, More Is Better (oneDNN 3.0)

                   Min    Avg    Max
AVX512_CORE        1900   2926   3513
AVX512_CORE_AMX    1900   2985   3512
AVX512_CORE_BF16   1900   2933   3508
AVX512_CORE_FP16   1900   2924   3510
AVX512_CORE_VNNI   1900   2940   3515

oneDNN

CPU Peak Freq (Highest CPU Core Frequency) Monitor

Megahertz, More Is Better (oneDNN 3.0)

                   Min    Avg    Max
AVX512_CORE        1900   2959   3508
AVX512_CORE_AMX    1900   2877   5340
AVX512_CORE_BF16   1900   2766   3509
AVX512_CORE_FP16   1900   2877   3502
AVX512_CORE_VNNI   1900   3014   3512

oneDNN

CPU Peak Freq (Highest CPU Core Frequency) Monitor

Megahertz, More Is Better (oneDNN 3.0)

                   Min    Avg       Max
AVX512_CORE        1900   2925.95   3515
AVX512_CORE_AMX    1900   2927.45   3518
AVX512_CORE_BF16   1900   2923.11   3515
AVX512_CORE_FP16   1900   2922.69   3516
AVX512_CORE_VNNI   1900   2923.89   3514

oneDNN

CPU Temperature Monitor

Celsius, Fewer Is Better (oneDNN 3.0)

                   Min    Avg    Max
AVX512_CORE        30.0   44.9   52.0
AVX512_CORE_AMX    29.0   41.4   47.0
AVX512_CORE_BF16   28.0   43.1   51.0
AVX512_CORE_FP16   33.0   46.0   52.0
AVX512_CORE_VNNI   29.0   43.7   51.0

oneDNN

CPU Temperature Monitor

Celsius, Fewer Is Better (oneDNN 3.0)

                   Min    Avg    Max
AVX512_CORE        35.0   40.7   47.0
AVX512_CORE_AMX    35.0   40.2   45.0
AVX512_CORE_BF16   35.0   40.2   45.0
AVX512_CORE_FP16   36.0   40.9   46.0
AVX512_CORE_VNNI   35.0   39.8   46.0

oneDNN

CPU Temperature Monitor

Celsius, Fewer Is Better (oneDNN 3.0)

                   Min    Avg    Max
AVX512_CORE        34.0   45.2   49.0
AVX512_CORE_AMX    34.0   43.4   47.0
AVX512_CORE_BF16   34.0   44.2   47.0
AVX512_CORE_FP16   34.0   44.5   48.0
AVX512_CORE_VNNI   34.0   45.1   50.0

oneDNN

CPU Temperature Monitor

Celsius, Fewer Is Better (oneDNN 3.0)

                   Min    Avg    Max
AVX512_CORE        35.0   42.8   46.0
AVX512_CORE_AMX    35.0   42.5   46.0
AVX512_CORE_BF16   35.0   43.7   48.0
AVX512_CORE_FP16   36.0   44.2   48.0
AVX512_CORE_VNNI   36.0   42.7   49.0

oneDNN

CPU Temperature Monitor

Celsius, Fewer Is Better (oneDNN 3.0)

                   Min   Avg     Max
AVX512_CORE        34    48.63   53
AVX512_CORE_AMX    37    48.92   53
AVX512_CORE_BF16   38    49.73   54
AVX512_CORE_FP16   37    49.6    55
AVX512_CORE_VNNI   35    48.74   53

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 3.0)

                   Result     SE +/-     N   MIN
AVX512_CORE        3.056120   0.025853   3   2.84
AVX512_CORE_AMX    0.580646   0.006207   4   0.47
AVX512_CORE_BF16   2.171090   0.008260   3   2
AVX512_CORE_FP16   2.170730   0.004682   3   2.01
AVX512_CORE_VNNI   3.067210   0.014891   3   2.85

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 3.0)

                   Result     SE +/-     N   MIN
AVX512_CORE        2.383610   0.008552   9   2.29
AVX512_CORE_AMX    0.445779   0.001544   9   0.38
AVX512_CORE_BF16   1.048360   0.005254   9   1.01
AVX512_CORE_FP16   1.045110   0.003634   9   1.01
AVX512_CORE_VNNI   2.378920   0.007894   9   2.3

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 3.0)

                   Result    SE +/-    N    MIN
AVX512_CORE        6.64816   0.26087   15   3.81
AVX512_CORE_AMX    5.73717   0.11275   15   3.75
AVX512_CORE_BF16   6.58841   0.33773   15   3.66
AVX512_CORE_FP16   6.48366   0.27236   15   3.65
AVX512_CORE_VNNI   6.82433   0.31537   15   4.26

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 3.0)

                   Result      SE +/-     N    MIN
AVX512_CORE        16.458900   0.370395   15   12.81
AVX512_CORE_AMX    0.329961    0.002541   4    (not reported)
AVX512_CORE_BF16   0.386636    0.003559   4    (not reported)
AVX512_CORE_FP16   0.383785    0.001848   4    (not reported)
AVX512_CORE_VNNI   16.575100   0.135276   4    15.34

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 3.0)

                   Result   SE +/-   N    MIN
AVX512_CORE        856.90   10.59    15   741.53
AVX512_CORE_AMX    850.35   9.48     4    797.61
AVX512_CORE_BF16   864.46   10.05    15   758.74
AVX512_CORE_FP16   875.28   12.20    12   777.12
AVX512_CORE_VNNI   862.42   13.04    15   767.35

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
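Each result is reported with a standard error (SE +/-), which helps judge whether two runs really differ. The sketch below applies a rough heuristic, not a formal significance test: two results are treated as distinguishable only if the gap between their averages exceeds the sum of their standard errors. The function name `differs_beyond_noise` is hypothetical, and the inputs are the AVX512_CORE and AVX512_CORE_AMX figures from the Recurrent Neural Network Inference and Deconvolution Batch shapes_1d results.

```python
def differs_beyond_noise(mean_a: float, se_a: float,
                         mean_b: float, se_b: float) -> bool:
    """Rough check: is the gap between two averages larger than the sum
    of their standard errors? This is a heuristic, not a statistical test."""
    return abs(mean_a - mean_b) > se_a + se_b


# Recurrent Neural Network Inference: AVX512_CORE vs AVX512_CORE_AMX (ms).
rnn_differs = differs_beyond_noise(856.90, 10.59, 850.35, 9.48)

# Deconvolution Batch shapes_1d: AVX512_CORE vs AVX512_CORE_AMX (ms).
deconv_differs = differs_beyond_noise(3.056120, 0.025853, 0.580646, 0.006207)

print("RNN inference differs beyond noise:", rnn_differs)
print("Deconvolution shapes_1d differs beyond noise:", deconv_differs)
```

Under this heuristic the Recurrent Neural Network Inference gap (about 6.5 ms) is smaller than the combined standard errors (about 20 ms), while the Deconvolution Batch shapes_1d gap is far larger than its combined standard errors.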

oneDNN

CPU Power Consumption Monitor

Watts, Fewer Is Better (oneDNN 3.0)

                   Min   Avg   Max
AVX512_CORE        197   596   744
AVX512_CORE_AMX    194   534   637
AVX512_CORE_BF16   199   591   722
AVX512_CORE_FP16   194   586   721
AVX512_CORE_VNNI   201   596   744

oneDNN

CPU Power Consumption Monitor

Watts, Fewer Is Better (oneDNN 3.0)

                   Min   Avg   Max
AVX512_CORE        200   430   721
AVX512_CORE_AMX    194   440   756
AVX512_CORE_BF16   195   425   705
AVX512_CORE_FP16   196   416   702
AVX512_CORE_VNNI   198   428   722

oneDNN

CPU Power Consumption Monitor

Watts, Fewer Is Better (oneDNN 3.0)

                   Min   Avg   Max
AVX512_CORE        200   536   710
AVX512_CORE_AMX    106   493   604
AVX512_CORE_BF16   197   510   652
AVX512_CORE_FP16   195   508   657
AVX512_CORE_VNNI   193   534   705

oneDNN

CPU Power Consumption Monitor

Watts, Fewer Is Better (oneDNN 3.0)

                   Min   Avg   Max
AVX512_CORE        197   487   609
AVX512_CORE_AMX    197   485   593
AVX512_CORE_BF16   196   507   647
AVX512_CORE_FP16   199   512   647
AVX512_CORE_VNNI   199   486   601

oneDNN

CPU Power Consumption Monitor

Watts, Fewer Is Better (oneDNN 3.0)

                   Min      Avg      Max
AVX512_CORE        200.74   576.14   733.65
AVX512_CORE_AMX    196.01   575.78   731.68
AVX512_CORE_BF16   196.71   576.5    733.04
AVX512_CORE_FP16   193.88   575.76   731.88
AVX512_CORE_VNNI   197.35   576.1    732.79


Phoronix Test Suite v10.8.4