onednn tgl

Intel Core i5-1145G7 testing with a LENOVO 20XW004AUS (N32ET71W 1.47 BIOS) and Intel Xe TGL GT2 3GB on Ubuntu 20.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2203301-NE-ONEDNNTGL27.

System details (identical for runs A, B and C):

  Processor:          Intel Core i5-1145G7 @ 4.40GHz (4 Cores / 8 Threads)
  Motherboard:        LENOVO 20XW004AUS (N32ET71W 1.47 BIOS)
  Chipset:            Intel Device a0ef
  Memory:             16GB
  Disk:               1024GB SAMSUNG MZVLB1T0HBLR-000H1
  Graphics:           Intel Xe TGL GT2 3GB (1300MHz)
  Audio:              Realtek ALC287
  Network:            Intel Device a0f0
  OS:                 Ubuntu 20.04
  Kernel:             5.14.0-1027-oem (x86_64)
  Desktop:            GNOME Shell 3.36.9
  Display Server:     X Server 1.20.13
  OpenGL:             4.6 Mesa 21.2.6
  Vulkan:             1.2.182
  Compiler:           GCC 9.4.0
  File-System:        ext4
  Screen Resolution:  1920x1200

Kernel Details
  - Transparent Huge Pages: madvise

Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-yTrUTS/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  - Scaling Governor: intel_pstate powersave (EPP: balance_performance)
  - Platform Profile: balanced
  - CPU Microcode: 0x88
  - ACPI Profile: balanced

Security Details
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
  - srbds: Not affected
  - tsx_async_abort: Not affected

Result overview (oneDNN, Engine: CPU; all values in ms, fewer is better):

  Test                                                        A         B         C
  IP Shapes 1D - f32                                          8.46208   8.95669   9.51734
  IP Shapes 3D - f32                                          5.40966   5.40629   5.45216
  IP Shapes 1D - u8s8f32                                      1.95301   2.08277   2.16909
  IP Shapes 3D - u8s8f32                                      2.11018   2.28842   2.29786
  IP Shapes 1D - bf16bf16bf16                                 19.8379   19.8353   19.8015
  IP Shapes 3D - bf16bf16bf16                                 6.11673   6.31136   6.35160
  Convolution Batch Shapes Auto - f32                         9.41308   9.99736   10.51531
  Deconvolution Batch shapes_1d - f32                         17.6982   18.3013   18.2109
  Deconvolution Batch shapes_3d - f32                         10.1712   9.9306    9.82487
  Convolution Batch Shapes Auto - u8s8f32                     7.00896   7.61626   7.63750
  Deconvolution Batch shapes_1d - u8s8f32                     3.00906   3.03781   3.07591
  Deconvolution Batch shapes_3d - u8s8f32                     2.41032   2.35766   2.34011
  Recurrent Neural Network Training - f32                     9343.97   9342.43   9341.05
  Recurrent Neural Network Inference - f32                    4767.33   4768.46   4759.63
  Recurrent Neural Network Training - u8s8f32                 9344.99   9347.91   9346.82
  Convolution Batch Shapes Auto - bf16bf16bf16                38.0359   37.9739   37.9609
  Deconvolution Batch shapes_1d - bf16bf16bf16                63.5556   63.9284   64.4608
  Deconvolution Batch shapes_3d - bf16bf16bf16                38.2692   38.3114   38.3988
  Recurrent Neural Network Inference - u8s8f32                4762.61   4762.94   4763.90
  Matrix Multiply Batch Shapes Transformer - f32              3.97598   3.97577   3.98095
  Recurrent Neural Network Training - bf16bf16bf16            9342.56   9340.03   9341.19
  Recurrent Neural Network Inference - bf16bf16bf16           4760.20   4762.01   4764.79
  Matrix Multiply Batch Shapes Transformer - u8s8f32          1.70174   1.70897   1.71077
  Matrix Multiply Batch Shapes Transformer - bf16bf16bf16     12.27494  12.19882  12.20369
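The per-run means in the overview can be compared with a small script. The following is a minimal sketch, not part of the Phoronix Test Suite: the `results` dictionary and the `spread_percent` helper are hypothetical names introduced here, and only two of the 24 tests are copied in as examples. It reports how far the slowest run is from the fastest, as a percentage of the fastest.

```python
# Mean times in ms for runs A, B, C, copied from the result overview.
# Only two tests are included here for illustration.
results = {
    "IP Shapes 1D - f32": (8.46208, 8.95669, 9.51734),
    "Recurrent Neural Network Training - f32": (9343.97, 9342.43, 9341.05),
}


def spread_percent(values):
    """Gap between slowest and fastest run, as a percentage of the fastest."""
    fastest, slowest = min(values), max(values)
    return (slowest - fastest) / fastest * 100.0


for name, values in results.items():
    print(f"{name}: {spread_percent(values):.2f}% spread across A/B/C")
```

A large spread on the short-running harnesses should be read alongside the reported standard error and sample count for each run, since those tests were repeated more often before the suite settled on a mean.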

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 8.46208  (SE +/- 0.12800, N = 12, MIN: 6.9)
  B: 8.95669  (SE +/- 0.18359, N = 12, MIN: 6.78)
  C: 9.51734  (SE +/- 0.22405, N = 12, MIN: 6.54)

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 5.40966  (SE +/- 0.01720, N = 3, MIN: 5.31)
  B: 5.40629  (SE +/- 0.02326, N = 3, MIN: 5.3)
  C: 5.45216  (SE +/- 0.05103, N = 3, MIN: 5.3)

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 1.95301  (SE +/- 0.02279, N = 15, MIN: 1.53)
  B: 2.08277  (SE +/- 0.03381, N = 14, MIN: 1.42)
  C: 2.16909  (SE +/- 0.03939, N = 13, MIN: 1.44)

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 2.11018  (SE +/- 0.01978, N = 3, MIN: 2.05)
  B: 2.28842  (SE +/- 0.02070, N = 15, MIN: 2.05)
  C: 2.29786  (SE +/- 0.02260, N = 14, MIN: 2.05)

oneDNN

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 19.84  (SE +/- 0.18, N = 7, MIN: 18.31)
  B: 19.84  (SE +/- 0.18, N = 7, MIN: 18.36)
  C: 19.80  (SE +/- 0.18, N = 7, MIN: 18.39)

oneDNN

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 6.11673  (SE +/- 0.04671, N = 15, MIN: 4.83)
  B: 6.31136  (SE +/- 0.06459, N = 15, MIN: 4.78)
  C: 6.35160  (SE +/- 0.08757, N = 12, MIN: 4.8)

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 9.41308   (SE +/- 0.13110, N = 15, MIN: 7.82)
  B: 9.99736   (SE +/- 0.20633, N = 15, MIN: 7.8)
  C: 10.51531  (SE +/- 0.26927, N = 15, MIN: 7.8)

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 17.70  (SE +/- 0.31, N = 12, MIN: 12.6)
  B: 18.30  (SE +/- 0.37, N = 15, MIN: 12.66)
  C: 18.21  (SE +/- 0.43, N = 12, MIN: 12.85)

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 10.17120  (SE +/- 0.07037, N = 3, MIN: 9.72)
  B: 9.93060   (SE +/- 0.05241, N = 3, MIN: 9.72)
  C: 9.82487   (SE +/- 0.01380, N = 3, MIN: 9.72)

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 7.00896  (SE +/- 0.03577, N = 3, MIN: 6.9)
  B: 7.61626  (SE +/- 0.09598, N = 12, MIN: 6.9)
  C: 7.63750  (SE +/- 0.07577, N = 15, MIN: 6.9)

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 3.00906  (SE +/- 0.06611, N = 12, MIN: 1.98)
  B: 3.03781  (SE +/- 0.06747, N = 12, MIN: 1.99)
  C: 3.07591  (SE +/- 0.07382, N = 12, MIN: 1.97)

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 2.41032  (SE +/- 0.01651, N = 3, MIN: 2.26)
  B: 2.35766  (SE +/- 0.01602, N = 3, MIN: 2.26)
  C: 2.34011  (SE +/- 0.01606, N = 3, MIN: 2.26)

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 9343.97  (SE +/- 15.58, N = 3, MIN: 9275.54)
  B: 9342.43  (SE +/- 13.80, N = 3, MIN: 9281.9)
  C: 9341.05  (SE +/- 11.27, N = 3, MIN: 9283.35)

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 4767.33  (SE +/- 10.55, N = 3, MIN: 4703.91)
  B: 4768.46  (SE +/- 7.12, N = 3, MIN: 4712.04)
  C: 4759.63  (SE +/- 5.52, N = 3, MIN: 4706.4)

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 9344.99  (SE +/- 14.60, N = 3, MIN: 9276.29)
  B: 9347.91  (SE +/- 10.74, N = 3, MIN: 9286.78)
  C: 9346.82  (SE +/- 14.63, N = 3, MIN: 9277.56)

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 38.04  (SE +/- 0.08, N = 3, MIN: 37.87)
  B: 37.97  (SE +/- 0.03, N = 3, MIN: 37.88)
  C: 37.96  (SE +/- 0.02, N = 3, MIN: 37.88)

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 63.56  (SE +/- 1.13, N = 12, MIN: 48.57)
  B: 63.93  (SE +/- 1.15, N = 12, MIN: 48.09)
  C: 64.46  (SE +/- 1.22, N = 12, MIN: 48.75)

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 38.27  (SE +/- 0.00, N = 3, MIN: 38.12)
  B: 38.31  (SE +/- 0.02, N = 3, MIN: 38.14)
  C: 38.40  (SE +/- 0.06, N = 3, MIN: 38.12)

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 4762.61  (SE +/- 9.66, N = 3, MIN: 4701.82)
  B: 4762.94  (SE +/- 4.91, N = 3, MIN: 4712.41)
  C: 4763.90  (SE +/- 8.66, N = 3, MIN: 4704.67)

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 3.97598  (SE +/- 0.10299, N = 12, MIN: 2.51)
  B: 3.97577  (SE +/- 0.08596, N = 12, MIN: 2.5)
  C: 3.98095  (SE +/- 0.08379, N = 12, MIN: 2.5)

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 9342.56  (SE +/- 7.36, N = 3, MIN: 9290.71)
  B: 9340.03  (SE +/- 13.01, N = 3, MIN: 9275.58)
  C: 9341.19  (SE +/- 16.10, N = 3, MIN: 9269.09)

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 4760.20  (SE +/- 4.70, N = 3, MIN: 4708.81)
  B: 4762.01  (SE +/- 5.86, N = 3, MIN: 4707.88)
  C: 4764.79  (SE +/- 9.20, N = 3, MIN: 4707.42)

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 1.70174  (SE +/- 0.03407, N = 12, MIN: 1.2)
  B: 1.70897  (SE +/- 0.03087, N = 12, MIN: 1.19)
  C: 1.71077  (SE +/- 0.02902, N = 12, MIN: 1.2)

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.6 (ms, Fewer Is Better)

  A: 12.27  (SE +/- 0.38, N = 12, MIN: 7.73)
  B: 12.20  (SE +/- 0.35, N = 12, MIN: 7.72)
  C: 12.20  (SE +/- 0.35, N = 12, MIN: 7.72)


Phoronix Test Suite v10.8.4