onednn tgl

Intel Core i7-1165G7 testing with a Dell 0GG9PT (3.3.0 BIOS) and Intel Xe TGL GT2 3GB on Ubuntu 21.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2209291-NE-ONEDNNTGL65&grr&rdt.

onednn tgl system details (runs A, B, C)

Processor: Intel Core i7-1165G7 @ 4.70GHz (4 Cores / 8 Threads)
Motherboard: Dell 0GG9PT (3.3.0 BIOS)
Chipset: Intel Tiger Lake-LP
Memory: 16GB
Disk: Kioxia KBG40ZNS256G NVMe 256GB
Graphics: Intel Xe TGL GT2 3GB (1300MHz)
Audio: Realtek ALC289
Network: Intel Wi-Fi 6 AX201
OS: Ubuntu 21.10
Kernel: 5.13.0-52-generic (x86_64)
Desktop: GNOME Shell 40.5
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 21.2.2
Vulkan: 1.2.182
Compiler: GCC 11.2.0
File-System: ext4
Screen Resolution: 1920x1200

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate powersave (EPP: balance_performance)
- CPU Microcode: 0xa4
- Thermald 2.4.6

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

onednn tgl result overview

Test                                                                    A         B         C
aom-av1: Speed 4 Two-Pass - Bosphorus 4K                                2.47      2.43      2.39
aom-av1: Speed 0 Two-Pass - Bosphorus 4K                                0.08      0.08      0.07
aom-av1: Speed 6 Two-Pass - Bosphorus 4K                                4.62      4.55      4.41
aom-av1: Speed 4 Two-Pass - Bosphorus 1080p                             6.44      6.86      6.24
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU          8010.02   7977.08   8053.63
onednn: Recurrent Neural Network Training - f32 - CPU                   7830.31   7816.97   8169.46
onednn: Recurrent Neural Network Training - u8s8f32 - CPU               7826.49   7909.08   7976.03
aom-av1: Speed 0 Two-Pass - Bosphorus 1080p                             0.24      0.24      0.23
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU         4026.31   4063.97   4011.37
onednn: Recurrent Neural Network Inference - f32 - CPU                  4037.24   3898.97   4147.09
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU              3957.91   3946.04   3966.96
aom-av1: Speed 6 Realtime - Bosphorus 4K                                13.43     16.88     14.1
aom-av1: Speed 6 Two-Pass - Bosphorus 1080p                             18.18     17.17     17.08
y-cruncher: 500M                                                        29.319    29.431    29.66
aom-av1: Speed 8 Realtime - Bosphorus 4K                                23.72     23.29     22.7
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU              53.2398   49.1725   52.9816
onednn: Deconvolution Batch shapes_1d - f32 - CPU                       12.9748   12.3703   12.5519
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU                   2.09604   2.08509   2.09736
aom-av1: Speed 6 Realtime - Bosphorus 1080p                             37.31     35        36.91
onednn: IP Shapes 1D - f32 - CPU                                        7.29842   7.06261   7.25014
onednn: IP Shapes 1D - bf16bf16bf16 - CPU                               18.2858   18.2094   18.1855
onednn: IP Shapes 1D - u8s8f32 - CPU                                    1.79351   1.67785   1.87285
aom-av1: Speed 9 Realtime - Bosphorus 4K                                44.25     43.02     41.88
aom-av1: Speed 10 Realtime - Bosphorus 4K                               45.49     44.9      44.63
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU   9.79123   8.46986   9.91517
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU            3.24691   3.24844   3.25077
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU        1.41697   1.41543   1.41289
onednn: IP Shapes 3D - f32 - CPU                                        6.74427   6.50468   6.69702
onednn: IP Shapes 3D - bf16bf16bf16 - CPU                               6.24654   6.12187   6.30613
onednn: IP Shapes 3D - u8s8f32 - CPU                                    2.25589   2.69712   2.67313
aom-av1: Speed 8 Realtime - Bosphorus 1080p                             96.48     94.42     91.09
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU              51.9687   47.2464   51.4079
onednn: Convolution Batch Shapes Auto - f32 - CPU                       8.52755   8.56756   8.56771
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU                   7.65511   7.73506   7.76604
aom-av1: Speed 9 Realtime - Bosphorus 1080p                             127.97    125.45    124.81
aom-av1: Speed 10 Realtime - Bosphorus 1080p                            135.51    131.73    154.46
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU              37.5958   37.5643   37.5864
onednn: Deconvolution Batch shapes_3d - f32 - CPU                       10.1886   10.2335   10.302
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU                   2.44793   2.44343   2.45115

AOM AV1

Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 4K

AOM AV1 3.5 (Frames Per Second, More Is Better)
A: 2.47
B: 2.43
C: 2.39
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

AOM AV1

Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 4K

AOM AV1 3.5 (Frames Per Second, More Is Better)
A: 0.08
B: 0.08
C: 0.07
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

AOM AV1

Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 4K

AOM AV1 3.5 (Frames Per Second, More Is Better)
A: 4.62
B: 4.55
C: 4.41
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

AOM AV1

Encoder Mode: Speed 4 Two-Pass - Input: Bosphorus 1080p

AOM AV1 3.5 (Frames Per Second, More Is Better)
A: 6.44
B: 6.86
C: 6.24
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 8010.02 (MIN: 7961.23)
B: 7977.08 (MIN: 7923.87)
C: 8053.63 (MIN: 7993.12)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 7830.31 (MIN: 7777.83)
B: 7816.97 (MIN: 7766.77)
C: 8169.46 (MIN: 8117.31)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 7826.49 (MIN: 7770.28)
B: 7909.08 (MIN: 7869.76)
C: 7976.03 (MIN: 7923.87)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

AOM AV1

Encoder Mode: Speed 0 Two-Pass - Input: Bosphorus 1080p

AOM AV1 3.5 (Frames Per Second, More Is Better)
A: 0.24
B: 0.24
C: 0.23
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 4026.31 (MIN: 3987.43)
B: 4063.97 (MIN: 4025.97)
C: 4011.37 (MIN: 3976.82)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 4037.24 (MIN: 3995.19)
B: 3898.97 (MIN: 3842.11)
C: 4147.09 (MIN: 4110.26)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 3957.91 (MIN: 3916.97)
B: 3946.04 (MIN: 3904.48)
C: 3966.96 (MIN: 3920.52)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

AOM AV1

Encoder Mode: Speed 6 Realtime - Input: Bosphorus 4K

AOM AV1 3.5 (Frames Per Second, More Is Better)
A: 13.43
B: 16.88
C: 14.10
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

AOM AV1

Encoder Mode: Speed 6 Two-Pass - Input: Bosphorus 1080p

AOM AV1 3.5 (Frames Per Second, More Is Better)
A: 18.18
B: 17.17
C: 17.08
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

Y-Cruncher

Pi Digits To Calculate: 500M

Y-Cruncher 0.7.10.9513 (Seconds, Fewer Is Better)
A: 29.32
B: 29.43
C: 29.66

AOM AV1

Encoder Mode: Speed 8 Realtime - Input: Bosphorus 4K

AOM AV1 3.5 (Frames Per Second, More Is Better)
A: 23.72
B: 23.29
C: 22.70
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 53.24 (MIN: 50.57)
B: 49.17 (MIN: 46.81)
C: 52.98 (MIN: 50.64)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 12.97 (MIN: 11.88)
B: 12.37 (MIN: 11.39)
C: 12.55 (MIN: 11.53)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 2.09604 (MIN: 1.86)
B: 2.08509 (MIN: 1.89)
C: 2.09736 (MIN: 1.89)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

AOM AV1

Encoder Mode: Speed 6 Realtime - Input: Bosphorus 1080p

AOM AV1 3.5 (Frames Per Second, More Is Better)
A: 37.31
B: 35.00
C: 36.91
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 7.29842 (MIN: 6.56)
B: 7.06261 (MIN: 6.03)
C: 7.25014 (MIN: 6.56)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 18.29 (MIN: 17.81)
B: 18.21 (MIN: 17.79)
C: 18.19 (MIN: 17.75)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 1.79351 (MIN: 1.47)
B: 1.67785 (MIN: 1.5)
C: 1.87285 (MIN: 1.52)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

AOM AV1

Encoder Mode: Speed 9 Realtime - Input: Bosphorus 4K

AOM AV1 3.5 (Frames Per Second, More Is Better)
A: 44.25
B: 43.02
C: 41.88
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

AOM AV1

Encoder Mode: Speed 10 Realtime - Input: Bosphorus 4K

AOM AV1 3.5 (Frames Per Second, More Is Better)
A: 45.49
B: 44.90
C: 44.63
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 9.79123 (MIN: 9.31)
B: 8.46986 (MIN: 7.59)
C: 9.91517 (MIN: 9.37)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 3.24691 (MIN: 3.15)
B: 3.24844 (MIN: 3.17)
C: 3.25077 (MIN: 3.16)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 1.41697 (MIN: 1.37)
B: 1.41543 (MIN: 1.35)
C: 1.41289 (MIN: 1.37)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 6.74427 (MIN: 6.49)
B: 6.50468 (MIN: 6.24)
C: 6.69702 (MIN: 6.28)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 6.24654 (MIN: 5.51)
B: 6.12187 (MIN: 5.5)
C: 6.30613 (MIN: 5.68)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 2.25589 (MIN: 2.16)
B: 2.69712 (MIN: 2.56)
C: 2.67313 (MIN: 2.56)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

AOM AV1

Encoder Mode: Speed 8 Realtime - Input: Bosphorus 1080p

AOM AV1 3.5 (Frames Per Second, More Is Better)
A: 96.48
B: 94.42
C: 91.09
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 51.97 (MIN: 49.37)
B: 47.25 (MIN: 44.33)
C: 51.41 (MIN: 48.02)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 8.52755 (MIN: 8.24)
B: 8.56756 (MIN: 8.38)
C: 8.56771 (MIN: 8.3)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 7.65511 (MIN: 7.54)
B: 7.73506 (MIN: 7.61)
C: 7.76604 (MIN: 7.64)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

AOM AV1

Encoder Mode: Speed 9 Realtime - Input: Bosphorus 1080p

AOM AV1 3.5 (Frames Per Second, More Is Better)
A: 127.97
B: 125.45
C: 124.81
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

AOM AV1

Encoder Mode: Speed 10 Realtime - Input: Bosphorus 1080p

AOM AV1 3.5 (Frames Per Second, More Is Better)
A: 135.51
B: 131.73
C: 154.46
(CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 37.60 (MIN: 37.44)
B: 37.56 (MIN: 37.42)
C: 37.59 (MIN: 37.42)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 10.19 (MIN: 9.53)
B: 10.23 (MIN: 9.78)
C: 10.30 (MIN: 9.8)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.7 (ms, Fewer Is Better)
A: 2.44793 (MIN: 2.2)
B: 2.44343 (MIN: 2.26)
C: 2.45115 (MIN: 2.26)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl


Phoronix Test Suite v10.8.5