oneDNN 2.0 Intel Tiger Lake

Intel Core i7-1165G7 testing with a Dell 0GG9PT (1.0.3 BIOS) and Intel UHD 3GB on Ubuntu 20.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2012203-PTS-ONEDNN2092&sro&rro.

System Details (identical for Run 1, Run 2, and Run 3):

  Processor:          Intel Core i7-1165G7 @ 4.70GHz (4 Cores / 8 Threads)
  Motherboard:        Dell 0GG9PT (1.0.3 BIOS)
  Chipset:            Intel Tiger Lake-LP
  Memory:             16GB
  Disk:               Kioxia KBG40ZNS256G NVMe 256GB
  Graphics:           Intel UHD 3GB (1300MHz)
  Audio:              Realtek ALC289
  Network:            Intel Wi-Fi 6 AX201
  OS:                 Ubuntu 20.10
  Kernel:             5.10.0-051000rc7daily20201213-generic (x86_64) 20201212
  Desktop:            GNOME Shell 3.38.1
  Display Server:     X Server 1.20.9
  Display Driver:     modesetting 1.20.9
  OpenGL:             4.6 Mesa 20.2.1
  Vulkan:             1.2.145
  Compiler:           GCC 10.2.0
  File-System:        ext4
  Screen Resolution:  1920x1200

Compiler Details:
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details:
  Scaling Governor: intel_pstate powersave
  CPU Microcode: 0x60
  Thermald 2.3

Security Details:
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected

Results Summary (all values in ms; fewer is better):

  Harness - Data Type - Engine                                      Run 1      Run 2      Run 3
  IP Shapes 1D - f32 - CPU                                          12.20      12.23      12.29
  IP Shapes 3D - f32 - CPU                                          6.43606    6.44613    7.17560
  IP Shapes 1D - u8s8f32 - CPU                                      2.48937    2.49932    2.70714
  IP Shapes 3D - u8s8f32 - CPU                                      2.84008    2.85308    3.11496
  IP Shapes 1D - bf16bf16bf16 - CPU                                 25.56      25.76      26.35
  IP Shapes 3D - bf16bf16bf16 - CPU                                 7.92055    8.29222    8.63478
  Convolution Batch Shapes Auto - f32 - CPU                         13.50      14.96      14.11
  Deconvolution Batch shapes_1d - f32 - CPU                         14.10      14.94      14.92
  Deconvolution Batch shapes_3d - f32 - CPU                         11.35      12.12      12.15
  Convolution Batch Shapes Auto - u8s8f32 - CPU                     9.64780    10.09398   10.07588
  Deconvolution Batch shapes_1d - u8s8f32 - CPU                     3.00801    3.16932    3.17051
  Deconvolution Batch shapes_3d - u8s8f32 - CPU                     2.74299    2.88129    2.89264
  Recurrent Neural Network Training - f32 - CPU                     8928.89    9301.85    9350.26
  Recurrent Neural Network Inference - f32 - CPU                    4570.32    4793.37    4799.77
  Recurrent Neural Network Training - u8s8f32 - CPU                 9072.26    9309.37    9315.57
  Convolution Batch Shapes Auto - bf16bf16bf16 - CPU                50.89      51.97      52.12
  Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU                59.94      61.73      61.15
  Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU                37.75      42.82      42.79
  Recurrent Neural Network Inference - u8s8f32 - CPU                4537.86    4799.98    4809.37
  Matrix Multiply Batch Shapes Transformer - f32 - CPU              4.96366    5.35102    5.35007
  Recurrent Neural Network Training - bf16bf16bf16 - CPU            8891.89    9302.70    9299.60
  Recurrent Neural Network Inference - bf16bf16bf16 - CPU           4532.64    4807.71    4816.40
  Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU          2.74771    2.99320    3.00553
  Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU     12.06      12.66      12.71
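Each averaged result in the detailed sections below carries an "SE +/- x, N = y" annotation. Assuming SE here is the standard error of the mean over the N timed iterations (the usual Phoronix Test Suite convention), it can be sketched as follows; the per-iteration timings in this example are made up for illustration and are not taken from this result file.

```python
import math

def standard_error(samples):
    """Standard error of the mean: sample std dev / sqrt(N)."""
    n = len(samples)
    mean = sum(samples) / n
    # Bessel-corrected sample variance (divide by n - 1)
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return math.sqrt(var) / math.sqrt(n)

# Hypothetical per-iteration timings (ms) for one harness
times_ms = [12.1, 12.4, 11.9, 12.3, 12.2]
mean_ms = sum(times_ms) / len(times_ms)
se_ms = standard_error(times_ms)
print(f"{mean_ms:.2f} ms (SE +/- {se_ms:.2f}, N = {len(times_ms)})")
```

A small SE relative to the mean (as in most results below) indicates the iterations were tightly clustered around the reported average.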

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 12.20  (SE +/- 0.46, N = 12, MIN: 6.66)
  Run 2: 12.23  (SE +/- 0.49, N = 12, MIN: 6.4)
  Run 3: 12.29  (SE +/- 0.47, N = 12, MIN: 5.95)
Compiler flags (identical for all results below): (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 6.43606  (SE +/- 0.05431, N = 14, MIN: 5.61)
  Run 2: 6.44613  (SE +/- 0.06535, N = 13, MIN: 5.6)
  Run 3: 7.17560  (SE +/- 0.06630, N = 7, MIN: 5.53)

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 2.48937  (SE +/- 0.03167, N = 12, MIN: 1.52)
  Run 2: 2.49932  (SE +/- 0.03239, N = 12, MIN: 1.52)
  Run 3: 2.70714  (SE +/- 0.02378, N = 12, MIN: 1.52)

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 2.84008  (SE +/- 0.04400, N = 12, MIN: 2.14)
  Run 2: 2.85308  (SE +/- 0.05466, N = 12, MIN: 2.14)
  Run 3: 3.11496  (SE +/- 0.04263, N = 12, MIN: 2.11)

oneDNN

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 25.56  (SE +/- 0.58, N = 12, MIN: 18.57)
  Run 2: 25.76  (SE +/- 0.55, N = 12, MIN: 18.62)
  Run 3: 26.35  (SE +/- 0.50, N = 12, MIN: 18.7)

oneDNN

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 7.92055  (SE +/- 0.11029, N = 12, MIN: 4.88)
  Run 2: 8.29222  (SE +/- 0.29805, N = 12, MIN: 4.94)
  Run 3: 8.63478  (SE +/- 0.13054, N = 12, MIN: 4.72)

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 13.50  (SE +/- 0.19, N = 12, MIN: 8.19)
  Run 2: 14.96  (SE +/- 0.69, N = 15, MIN: 8.31)
  Run 3: 14.11  (SE +/- 0.16, N = 12, MIN: 8.28)

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 14.10  (SE +/- 0.15, N = 4, MIN: 12.17)
  Run 2: 14.94  (SE +/- 0.07, N = 3, MIN: 12.23)
  Run 3: 14.92  (SE +/- 0.08, N = 3, MIN: 12.23)

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 11.35  (SE +/- 0.16, N = 14, MIN: 9.25)
  Run 2: 12.12  (SE +/- 0.17, N = 15, MIN: 9.29)
  Run 3: 12.15  (SE +/- 0.18, N = 14, MIN: 9.29)

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 9.64780   (SE +/- 0.08613, N = 13, MIN: 7.94)
  Run 2: 10.09398  (SE +/- 0.09631, N = 12, MIN: 7.96)
  Run 3: 10.07588  (SE +/- 0.08167, N = 12, MIN: 7.97)

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 3.00801  (SE +/- 0.02778, N = 3, MIN: 2.49)
  Run 2: 3.16932  (SE +/- 0.02553, N = 3, MIN: 2.53)
  Run 3: 3.17051  (SE +/- 0.02450, N = 3, MIN: 2.6)

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 2.74299  (SE +/- 0.03844, N = 15, MIN: 2.2)
  Run 2: 2.88129  (SE +/- 0.04127, N = 15, MIN: 2.2)
  Run 3: 2.89264  (SE +/- 0.04298, N = 15, MIN: 2.2)

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 8928.89  (SE +/- 7.25, N = 3, MIN: 8884.62)
  Run 2: 9301.85  (SE +/- 5.14, N = 3, MIN: 9240.8)
  Run 3: 9350.26  (SE +/- 24.45, N = 3, MIN: 9262.28)

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 4570.32  (SE +/- 2.42, N = 3, MIN: 4526.96)
  Run 2: 4793.37  (SE +/- 12.84, N = 3, MIN: 4728.25)
  Run 3: 4799.77  (SE +/- 7.04, N = 3, MIN: 4734.03)

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 9072.26  (SE +/- 89.00, N = 3, MIN: 8900.41)
  Run 2: 9309.37  (SE +/- 10.72, N = 3, MIN: 9197.72)
  Run 3: 9315.57  (SE +/- 4.14, N = 3, MIN: 9244.67)

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 50.89  (SE +/- 0.98, N = 12, MIN: 36.95)
  Run 2: 51.97  (SE +/- 0.79, N = 12, MIN: 37.03)
  Run 3: 52.12  (SE +/- 0.83, N = 12, MIN: 36.97)

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 59.94  (SE +/- 0.67, N = 4, MIN: 50.55)
  Run 2: 61.73  (SE +/- 0.62, N = 3, MIN: 51.66)
  Run 3: 61.15  (SE +/- 0.82, N = 3, MIN: 48.32)

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 37.75  (SE +/- 0.36, N = 3, MIN: 37.26)
  Run 2: 42.82  (SE +/- 0.54, N = 15, MIN: 37.09)
  Run 3: 42.79  (SE +/- 0.55, N = 15, MIN: 37.07)

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 4537.86  (SE +/- 1.60, N = 3, MIN: 4499.03)
  Run 2: 4799.98  (SE +/- 14.63, N = 3, MIN: 4723.57)
  Run 3: 4809.37  (SE +/- 4.81, N = 3, MIN: 4750.47)

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 4.96366  (SE +/- 0.00981, N = 3, MIN: 3.84)
  Run 2: 5.35102  (SE +/- 0.02405, N = 3, MIN: 3.85)
  Run 3: 5.35007  (SE +/- 0.02155, N = 3, MIN: 3.86)

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 8891.89  (SE +/- 18.40, N = 3, MIN: 8839.03)
  Run 2: 9302.70  (SE +/- 6.77, N = 3, MIN: 9224.29)
  Run 3: 9299.60  (SE +/- 8.81, N = 3, MIN: 9240.31)

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 4532.64  (SE +/- 3.28, N = 3, MIN: 4496.04)
  Run 2: 4807.71  (SE +/- 7.95, N = 3, MIN: 4750.99)
  Run 3: 4816.40  (SE +/- 21.46, N = 3, MIN: 4739.61)

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 2.74771  (SE +/- 0.00533, N = 3, MIN: 2.05)
  Run 2: 2.99320  (SE +/- 0.01448, N = 3, MIN: 2.06)
  Run 3: 3.00553  (SE +/- 0.01768, N = 3, MIN: 2.08)

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better:
  Run 1: 12.06  (SE +/- 0.01, N = 3, MIN: 11.05)
  Run 2: 12.66  (SE +/- 0.04, N = 3, MIN: 11.29)
  Run 3: 12.71  (SE +/- 0.04, N = 3, MIN: 11.25)
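Across these harnesses Run 1 consistently posted the lowest average time. Since lower is better, the run-to-run gap can be expressed as a percentage slowdown against Run 1; a minimal sketch, using the Matrix Multiply bf16bf16bf16 averages from this result (the helper name `pct_slower` is just illustrative):

```python
def pct_slower(baseline_ms, other_ms):
    """Percent increase in elapsed time versus a baseline (lower is better)."""
    return (other_ms - baseline_ms) / baseline_ms * 100.0

# Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU (ms)
run1, run2, run3 = 12.06, 12.66, 12.71
print(f"Run 2 vs Run 1: +{pct_slower(run1, run2):.1f}%")
print(f"Run 3 vs Run 1: +{pct_slower(run1, run3):.1f}%")
```

For this harness the later runs land roughly 5% behind Run 1, in line with the pattern seen in most of the other results.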


Phoronix Test Suite v10.8.5