Intel Core i5-1145G7 testing with a LENOVO 20XW004AUS (N32ET71W 1.47 BIOS) and Intel Xe TGL GT2 3GB on Ubuntu 20.04 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2203301-NE-ONEDNNTGL27
HTML result view exported from: https://openbenchmarking.org/result/2203301-NE-ONEDNNTGL27&gru&rdt&rro .
onednn tgl: system details (runs A, B and C share the same configuration)

  Processor: Intel Core i5-1145G7 @ 4.40GHz (4 Cores / 8 Threads)
  Motherboard: LENOVO 20XW004AUS (N32ET71W 1.47 BIOS)
  Chipset: Intel Device a0ef
  Memory: 16GB
  Disk: 1024GB SAMSUNG MZVLB1T0HBLR-000H1
  Graphics: Intel Xe TGL GT2 3GB (1300MHz)
  Audio: Realtek ALC287
  Network: Intel Device a0f0
  OS: Ubuntu 20.04
  Kernel: 5.14.0-1027-oem (x86_64)
  Desktop: GNOME Shell 3.36.9
  Display Server: X Server 1.20.13
  OpenGL: 4.6 Mesa 21.2.6
  Vulkan: 1.2.182
  Compiler: GCC 9.4.0
  File-System: ext4
  Screen Resolution: 1920x1200

Kernel Details
  - Transparent Huge Pages: madvise

Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-yTrUTS/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  - Scaling Governor: intel_pstate powersave (EPP: balance_performance)
  - Platform Profile: balanced
  - CPU Microcode: 0x88
  - ACPI Profile: balanced

Security Details
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
  - srbds: Not affected
  - tsx_async_abort: Not affected
onednn tgl: oneDNN result summary (ms, fewer is better)

Test                                                           A         B         C
IP Shapes 1D - f32 - CPU                                       8.46208   8.95669   9.51734
IP Shapes 3D - f32 - CPU                                       5.40966   5.40629   5.45216
IP Shapes 1D - u8s8f32 - CPU                                   1.95301   2.08277   2.16909
IP Shapes 3D - u8s8f32 - CPU                                   2.11018   2.28842   2.29786
IP Shapes 1D - bf16bf16bf16 - CPU                              19.8379   19.8353   19.8015
IP Shapes 3D - bf16bf16bf16 - CPU                              6.11673   6.31136   6.35160
Convolution Batch Shapes Auto - f32 - CPU                      9.41308   9.99736   10.51531
Deconvolution Batch shapes_1d - f32 - CPU                      17.6982   18.3013   18.2109
Deconvolution Batch shapes_3d - f32 - CPU                      10.1712   9.9306    9.82487
Convolution Batch Shapes Auto - u8s8f32 - CPU                  7.00896   7.61626   7.63750
Deconvolution Batch shapes_1d - u8s8f32 - CPU                  3.00906   3.03781   3.07591
Deconvolution Batch shapes_3d - u8s8f32 - CPU                  2.41032   2.35766   2.34011
Recurrent Neural Network Training - f32 - CPU                  9343.97   9342.43   9341.05
Recurrent Neural Network Inference - f32 - CPU                 4767.33   4768.46   4759.63
Recurrent Neural Network Training - u8s8f32 - CPU              9344.99   9347.91   9346.82
Convolution Batch Shapes Auto - bf16bf16bf16 - CPU             38.0359   37.9739   37.9609
Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU             63.5556   63.9284   64.4608
Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU             38.2692   38.3114   38.3988
Recurrent Neural Network Inference - u8s8f32 - CPU             4762.61   4762.94   4763.90
Matrix Multiply Batch Shapes Transformer - f32 - CPU           3.97598   3.97577   3.98095
Recurrent Neural Network Training - bf16bf16bf16 - CPU         9342.56   9340.03   9341.19
Recurrent Neural Network Inference - bf16bf16bf16 - CPU        4760.20   4762.01   4764.79
Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU       1.70174   1.70897   1.71077
Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU  12.27494  12.19882  12.20369
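One common way to condense a summary like this into a single figure per run is a geometric mean across all tests. The sketch below is illustrative only and is not part of the exported result file: the values are copied from the summary above, and the names RESULTS, geometric_mean and run_geomeans are hypothetical helpers introduced for this example.

```python
from math import exp, log

# (A, B, C) results in ms, copied from the summary above. Fewer is better.
RESULTS = {
    "IP Shapes 1D - f32": (8.46208, 8.95669, 9.51734),
    "IP Shapes 3D - f32": (5.40966, 5.40629, 5.45216),
    "IP Shapes 1D - u8s8f32": (1.95301, 2.08277, 2.16909),
    "IP Shapes 3D - u8s8f32": (2.11018, 2.28842, 2.29786),
    "IP Shapes 1D - bf16bf16bf16": (19.8379, 19.8353, 19.8015),
    "IP Shapes 3D - bf16bf16bf16": (6.11673, 6.31136, 6.35160),
    "Convolution Batch Shapes Auto - f32": (9.41308, 9.99736, 10.51531),
    "Deconvolution Batch shapes_1d - f32": (17.6982, 18.3013, 18.2109),
    "Deconvolution Batch shapes_3d - f32": (10.1712, 9.9306, 9.82487),
    "Convolution Batch Shapes Auto - u8s8f32": (7.00896, 7.61626, 7.63750),
    "Deconvolution Batch shapes_1d - u8s8f32": (3.00906, 3.03781, 3.07591),
    "Deconvolution Batch shapes_3d - u8s8f32": (2.41032, 2.35766, 2.34011),
    "RNN Training - f32": (9343.97, 9342.43, 9341.05),
    "RNN Inference - f32": (4767.33, 4768.46, 4759.63),
    "RNN Training - u8s8f32": (9344.99, 9347.91, 9346.82),
    "Convolution Batch Shapes Auto - bf16bf16bf16": (38.0359, 37.9739, 37.9609),
    "Deconvolution Batch shapes_1d - bf16bf16bf16": (63.5556, 63.9284, 64.4608),
    "Deconvolution Batch shapes_3d - bf16bf16bf16": (38.2692, 38.3114, 38.3988),
    "RNN Inference - u8s8f32": (4762.61, 4762.94, 4763.90),
    "Matrix Multiply Transformer - f32": (3.97598, 3.97577, 3.98095),
    "RNN Training - bf16bf16bf16": (9342.56, 9340.03, 9341.19),
    "RNN Inference - bf16bf16bf16": (4760.20, 4762.01, 4764.79),
    "Matrix Multiply Transformer - u8s8f32": (1.70174, 1.70897, 1.71077),
    "Matrix Multiply Transformer - bf16bf16bf16": (12.27494, 12.19882, 12.20369),
}


def geometric_mean(values):
    """Geometric mean of positive numbers, computed through logarithms."""
    values = list(values)
    return exp(sum(log(v) for v in values) / len(values))


def run_geomeans(results):
    """Return {run label: geometric mean over all tests} for runs A, B, C."""
    labels = ("A", "B", "C")
    return {
        label: geometric_mean(row[i] for row in results.values())
        for i, label in enumerate(labels)
    }


if __name__ == "__main__":
    gm = run_geomeans(RESULTS)
    for label, value in gm.items():
        # Ratio above 1.0 means the run was slower than run A overall.
        print(f"{label}: {value:.3f} ms ({value / gm['A']:.3f}x of A)")
```

Because the geometric mean works on ratios, the multi-second RNN harnesses do not drown out the millisecond-scale tests the way an arithmetic mean would.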
oneDNN 2.6 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  C: 9.51734 (SE +/- 0.22405, N = 12; MIN: 6.54)
  B: 8.95669 (SE +/- 0.18359, N = 12; MIN: 6.78)
  A: 8.46208 (SE +/- 0.12800, N = 12; MIN: 6.9)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl (the same options are reported for every oneDNN test in this file)
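The SE figures give a rough way to judge whether two runs really differ or are within noise. The sketch below divides the gap between two means by their combined standard error. It assumes the runs are independent and that the reported SE is a standard error of the mean; neither assumption is stated in the result file, and difference_in_se_units is a hypothetical helper name.

```python
from math import sqrt


def difference_in_se_units(mean_x, se_x, mean_y, se_y):
    """Return |mean_x - mean_y| divided by the combined standard error.

    Assumes independent runs, so the two standard errors add in quadrature.
    """
    combined_se = sqrt(se_x ** 2 + se_y ** 2)
    return abs(mean_x - mean_y) / combined_se


# Run A vs. run C for "IP Shapes 1D - f32" (values from the result above).
ratio = difference_in_se_units(8.46208, 0.12800, 9.51734, 0.22405)
print(f"A vs C differ by {ratio:.1f} combined standard errors")
```

A ratio well above 2 suggests the gap is unlikely to be run-to-run noise alone; a ratio below 1 means the two runs are indistinguishable at this sample size.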
oneDNN 2.6 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  C: 5.45216 (SE +/- 0.05103, N = 3; MIN: 5.3)
  B: 5.40629 (SE +/- 0.02326, N = 3; MIN: 5.3)
  A: 5.40966 (SE +/- 0.01720, N = 3; MIN: 5.31)
oneDNN 2.6 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  C: 2.16909 (SE +/- 0.03939, N = 13; MIN: 1.44)
  B: 2.08277 (SE +/- 0.03381, N = 14; MIN: 1.42)
  A: 1.95301 (SE +/- 0.02279, N = 15; MIN: 1.53)
oneDNN 2.6 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  C: 2.29786 (SE +/- 0.02260, N = 14; MIN: 2.05)
  B: 2.28842 (SE +/- 0.02070, N = 15; MIN: 2.05)
  A: 2.11018 (SE +/- 0.01978, N = 3; MIN: 2.05)
oneDNN 2.6 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  C: 19.80 (SE +/- 0.18, N = 7; MIN: 18.39)
  B: 19.84 (SE +/- 0.18, N = 7; MIN: 18.36)
  A: 19.84 (SE +/- 0.18, N = 7; MIN: 18.31)
oneDNN 2.6 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  C: 6.35160 (SE +/- 0.08757, N = 12; MIN: 4.8)
  B: 6.31136 (SE +/- 0.06459, N = 15; MIN: 4.78)
  A: 6.11673 (SE +/- 0.04671, N = 15; MIN: 4.83)
oneDNN 2.6 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  C: 10.51531 (SE +/- 0.26927, N = 15; MIN: 7.8)
  B: 9.99736 (SE +/- 0.20633, N = 15; MIN: 7.8)
  A: 9.41308 (SE +/- 0.13110, N = 15; MIN: 7.82)
oneDNN 2.6 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  C: 18.21 (SE +/- 0.43, N = 12; MIN: 12.85)
  B: 18.30 (SE +/- 0.37, N = 15; MIN: 12.66)
  A: 17.70 (SE +/- 0.31, N = 12; MIN: 12.6)
oneDNN 2.6 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  C: 9.82487 (SE +/- 0.01380, N = 3; MIN: 9.72)
  B: 9.93060 (SE +/- 0.05241, N = 3; MIN: 9.72)
  A: 10.17120 (SE +/- 0.07037, N = 3; MIN: 9.72)
oneDNN 2.6 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  C: 7.63750 (SE +/- 0.07577, N = 15; MIN: 6.9)
  B: 7.61626 (SE +/- 0.09598, N = 12; MIN: 6.9)
  A: 7.00896 (SE +/- 0.03577, N = 3; MIN: 6.9)
oneDNN 2.6 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  C: 3.07591 (SE +/- 0.07382, N = 12; MIN: 1.97)
  B: 3.03781 (SE +/- 0.06747, N = 12; MIN: 1.99)
  A: 3.00906 (SE +/- 0.06611, N = 12; MIN: 1.98)
oneDNN 2.6 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  C: 2.34011 (SE +/- 0.01606, N = 3; MIN: 2.26)
  B: 2.35766 (SE +/- 0.01602, N = 3; MIN: 2.26)
  A: 2.41032 (SE +/- 0.01651, N = 3; MIN: 2.26)
oneDNN 2.6 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  C: 9341.05 (SE +/- 11.27, N = 3; MIN: 9283.35)
  B: 9342.43 (SE +/- 13.80, N = 3; MIN: 9281.9)
  A: 9343.97 (SE +/- 15.58, N = 3; MIN: 9275.54)
oneDNN 2.6 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  C: 4759.63 (SE +/- 5.52, N = 3; MIN: 4706.4)
  B: 4768.46 (SE +/- 7.12, N = 3; MIN: 4712.04)
  A: 4767.33 (SE +/- 10.55, N = 3; MIN: 4703.91)
oneDNN 2.6 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  C: 9346.82 (SE +/- 14.63, N = 3; MIN: 9277.56)
  B: 9347.91 (SE +/- 10.74, N = 3; MIN: 9286.78)
  A: 9344.99 (SE +/- 14.60, N = 3; MIN: 9276.29)
oneDNN 2.6 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  C: 37.96 (SE +/- 0.02, N = 3; MIN: 37.88)
  B: 37.97 (SE +/- 0.03, N = 3; MIN: 37.88)
  A: 38.04 (SE +/- 0.08, N = 3; MIN: 37.87)
oneDNN 2.6 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  C: 64.46 (SE +/- 1.22, N = 12; MIN: 48.75)
  B: 63.93 (SE +/- 1.15, N = 12; MIN: 48.09)
  A: 63.56 (SE +/- 1.13, N = 12; MIN: 48.57)
oneDNN 2.6 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  C: 38.40 (SE +/- 0.06, N = 3; MIN: 38.12)
  B: 38.31 (SE +/- 0.02, N = 3; MIN: 38.14)
  A: 38.27 (SE +/- 0.00, N = 3; MIN: 38.12)
oneDNN 2.6 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  C: 4763.90 (SE +/- 8.66, N = 3; MIN: 4704.67)
  B: 4762.94 (SE +/- 4.91, N = 3; MIN: 4712.41)
  A: 4762.61 (SE +/- 9.66, N = 3; MIN: 4701.82)
oneDNN 2.6 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  C: 3.98095 (SE +/- 0.08379, N = 12; MIN: 2.5)
  B: 3.97577 (SE +/- 0.08596, N = 12; MIN: 2.5)
  A: 3.97598 (SE +/- 0.10299, N = 12; MIN: 2.51)
oneDNN 2.6 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  C: 9341.19 (SE +/- 16.10, N = 3; MIN: 9269.09)
  B: 9340.03 (SE +/- 13.01, N = 3; MIN: 9275.58)
  A: 9342.56 (SE +/- 7.36, N = 3; MIN: 9290.71)
oneDNN 2.6 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  C: 4764.79 (SE +/- 9.20, N = 3; MIN: 4707.42)
  B: 4762.01 (SE +/- 5.86, N = 3; MIN: 4707.88)
  A: 4760.20 (SE +/- 4.70, N = 3; MIN: 4708.81)
oneDNN 2.6 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  C: 1.71077 (SE +/- 0.02902, N = 12; MIN: 1.2)
  B: 1.70897 (SE +/- 0.03087, N = 12; MIN: 1.19)
  A: 1.70174 (SE +/- 0.03407, N = 12; MIN: 1.2)
oneDNN 2.6 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  C: 12.20 (SE +/- 0.35, N = 12; MIN: 7.72)
  B: 12.20 (SE +/- 0.35, N = 12; MIN: 7.72)
  A: 12.27 (SE +/- 0.38, N = 12; MIN: 7.73)
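The sample count N varies between tests (3 for the RNN harnesses, 12 or more for many of the short ones), so SE values are not directly comparable as a measure of run-to-run noise. A minimal sketch for converting them into a standard deviation and a coefficient of variation follows. It assumes the usual relation SE = SD / sqrt(N), which the result file does not spell out, and spread_from_se is a hypothetical helper name.

```python
from math import sqrt


def spread_from_se(mean, se, n):
    """Return (standard deviation, coefficient of variation in percent).

    Assumes SE = SD / sqrt(N), the usual standard error of the mean.
    """
    sd = se * sqrt(n)
    return sd, 100.0 * sd / mean


# Run C of "Matrix Multiply Batch Shapes Transformer - bf16bf16bf16".
sd, cv = spread_from_se(12.20, 0.35, 12)
print(f"matmul bf16: SD ~ {sd:.2f} ms, CV ~ {cv:.1f}%")

# Run C of "Recurrent Neural Network Training - bf16bf16bf16".
rnn_sd, rnn_cv = spread_from_se(9341.19, 16.10, 3)
print(f"RNN training bf16: SD ~ {rnn_sd:.2f} ms, CV ~ {rnn_cv:.2f}%")
```

Under that assumption, the short matrix-multiply test is far noisier relative to its mean than the multi-second RNN test, even though its absolute SE is much smaller.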
Phoronix Test Suite v10.8.4