oneDNN 3.0 Raptor Lake

Intel Core i9-13900K testing with an ASUS PRIME Z790-P WIFI (0602 BIOS) and EVGA NVIDIA GeForce RTX 3060 12GB on Ubuntu 22.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2212209-PTS-ONEDNN3016.

Test systems a, b, c (identical configuration):

  Processor: Intel Core i9-13900K @ 4.00GHz (24 Cores / 32 Threads)
  Motherboard: ASUS PRIME Z790-P WIFI (0602 BIOS)
  Chipset: Intel Device 7a27
  Memory: 32GB
  Disk: 1000GB Western Digital WDS100T1X0E-00AFY0
  Graphics: EVGA NVIDIA GeForce RTX 3060 12GB
  Audio: Realtek ALC897
  Monitor: ASUS VP28U
  Network: Realtek RTL8125 2.5GbE + Intel Device 7a70
  OS: Ubuntu 22.10
  Kernel: 5.19.0-26-generic (x86_64)
  Desktop: GNOME Shell 43.1
  Display Server: X Server 1.21.1.4
  Display Driver: NVIDIA 525.60.11
  OpenGL: 4.6.0
  OpenCL: OpenCL 3.0 CUDA 12.0.89
  Vulkan: 1.3.224
  Compiler: GCC 12.2.0
  File-System: ext4
  Screen Resolution: 2560x1600

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate powersave (EPP: balance_performance)
- CPU Microcode: 0x10e
- Thermald 2.5.1

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence
- srbds: Not affected
- tsx_async_abort: Not affected

Results summary (oneDNN 3.0, all values in ms, fewer is better):

  IP Shapes 1D - f32 - CPU: a 1.71588, b 1.88491, c 1.86077
  IP Shapes 3D - f32 - CPU: a 4.15008, b 3.93573, c 3.84879
  IP Shapes 1D - u8s8f32 - CPU: a 0.786709, b 0.863089, c 0.891727
  IP Shapes 3D - u8s8f32 - CPU: a 0.629303, b 0.621966, c 0.599391
  Convolution Batch Shapes Auto - f32 - CPU: a 5.76384, b 5.75816, c 5.75127
  Deconvolution Batch shapes_1d - f32 - CPU: a 7.47730, b 7.36674, c 7.86792
  Deconvolution Batch shapes_3d - f32 - CPU: a 3.42736, b 3.42391, c 3.42519
  Convolution Batch Shapes Auto - u8s8f32 - CPU: a 5.91006, b 5.89829, c 5.86806
  Deconvolution Batch shapes_1d - u8s8f32 - CPU: a 0.955404, b 0.972727, c 0.937470
  Deconvolution Batch shapes_3d - u8s8f32 - CPU: a 1.44911, b 1.44938, c 1.44885
  Recurrent Neural Network Training - f32 - CPU: a 2119.68, b 2122.24, c 2089.67
  Recurrent Neural Network Inference - f32 - CPU: a 1087.51, b 1073.62, c 1104.46
  Recurrent Neural Network Training - u8s8f32 - CPU: a 2126.29, b 2143.27, c 2097.67
  Recurrent Neural Network Inference - u8s8f32 - CPU: a 1098.42, b 1099.64, c 1095.20
  Matrix Multiply Batch Shapes Transformer - f32 - CPU: a 1.269326, b 1.179855, c 1.382935
  Recurrent Neural Network Training - bf16bf16bf16 - CPU: a 2117.94, b 2148.96, c 2113.69
  Recurrent Neural Network Inference - bf16bf16bf16 - CPU: a 1084.75, b 1095.58, c 1083.23
  Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU: a 0.783081, b 0.786177, c 0.759611
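
As a convenience for reading the summary, the short Python sketch below recomputes two derived figures from the values above: which run was fastest on each test and how large the spread between the fastest and slowest run is, plus a geometric mean per run. This is post-processing added for this write-up, not something contained in the original OpenBenchmarking.org export; the numbers in the dictionary are copied verbatim from the table.

from math import prod

# Results copied from the summary table above (oneDNN 3.0, milliseconds, lower is better).
# Keys are "Harness - Data Type"; every entry used the CPU engine.
results = {
    "IP Shapes 1D - f32": (1.71588, 1.88491, 1.86077),
    "IP Shapes 3D - f32": (4.15008, 3.93573, 3.84879),
    "IP Shapes 1D - u8s8f32": (0.786709, 0.863089, 0.891727),
    "IP Shapes 3D - u8s8f32": (0.629303, 0.621966, 0.599391),
    "Convolution Batch Shapes Auto - f32": (5.76384, 5.75816, 5.75127),
    "Deconvolution Batch shapes_1d - f32": (7.47730, 7.36674, 7.86792),
    "Deconvolution Batch shapes_3d - f32": (3.42736, 3.42391, 3.42519),
    "Convolution Batch Shapes Auto - u8s8f32": (5.91006, 5.89829, 5.86806),
    "Deconvolution Batch shapes_1d - u8s8f32": (0.955404, 0.972727, 0.937470),
    "Deconvolution Batch shapes_3d - u8s8f32": (1.44911, 1.44938, 1.44885),
    "Recurrent Neural Network Training - f32": (2119.68, 2122.24, 2089.67),
    "Recurrent Neural Network Inference - f32": (1087.51, 1073.62, 1104.46),
    "Recurrent Neural Network Training - u8s8f32": (2126.29, 2143.27, 2097.67),
    "Recurrent Neural Network Inference - u8s8f32": (1098.42, 1099.64, 1095.20),
    "Matrix Multiply Batch Shapes Transformer - f32": (1.269326, 1.179855, 1.382935),
    "Recurrent Neural Network Training - bf16bf16bf16": (2117.94, 2148.96, 2113.69),
    "Recurrent Neural Network Inference - bf16bf16bf16": (1084.75, 1095.58, 1083.23),
    "Matrix Multiply Batch Shapes Transformer - u8s8f32": (0.783081, 0.786177, 0.759611),
}

runs = ("a", "b", "c")

# Per-test spread: how far the slowest run is from the fastest, in percent.
for name, vals in results.items():
    spread = (max(vals) - min(vals)) / min(vals) * 100
    fastest = runs[vals.index(min(vals))]
    print(f"{name:52s} fastest: {fastest}  spread: {spread:5.2f}%")

# Geometric mean of each run across all 18 tests (a single lower-is-better figure per run).
for i, run in enumerate(runs):
    gmean = prod(vals[i] for vals in results.values()) ** (1 / len(results))
    print(f"run {run}: geometric mean {gmean:.2f} ms")

On this data the convolution, deconvolution shapes_3d, and RNN workloads agree to within a few percent across the three runs, while the largest run-to-run spreads show up in the short-running IP Shapes and f32 Matrix Multiply Batch Shapes Transformer cases.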

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 1.71588 (SE +/- 0.01122, N = 3, MIN: 1.57)
  b: 1.88491 (SE +/- 0.01809, N = 15, MIN: 1.57)
  c: 1.86077 (SE +/- 0.02132, N = 15, MIN: 1.57)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
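
Each result above is the mean of N timed runs, reported together with its standard error (SE) and the minimum observed time (MIN). Assuming the conventional definition of the standard error of the mean, SE = s / sqrt(N) with s the sample standard deviation (how the Phoronix Test Suite computes its SE internally is an assumption here, not stated in this export), the reported SE can be turned back into an approximate run-to-run spread. The snippet below does that for the three IP Shapes 1D f32 results as a rough illustration.

from math import sqrt

# (mean ms, standard error, number of runs) for IP Shapes 1D - f32 - CPU, copied from above.
samples = {
    "a": (1.71588, 0.01122, 3),
    "b": (1.88491, 0.01809, 15),
    "c": (1.86077, 0.02132, 15),
}

for run, (mean, se, n) in samples.items():
    # If SE = s / sqrt(n), the implied sample standard deviation is s = SE * sqrt(n).
    stddev = se * sqrt(n)
    # A crude ~95% confidence interval on the mean (normal approximation).
    lo, hi = mean - 1.96 * se, mean + 1.96 * se
    print(f"run {run}: mean {mean:.5f} ms, implied stddev {stddev:.5f} ms, "
          f"~95% CI [{lo:.5f}, {hi:.5f}] ms")

On these numbers the intervals for b and c overlap while a sits clearly below both, which matches the ordering in the summary table.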

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 4.15008 (SE +/- 0.02209, N = 3, MIN: 4.08)
  b: 3.93573 (SE +/- 0.00290, N = 3, MIN: 3.88)
  c: 3.84879 (SE +/- 0.00182, N = 3, MIN: 3.8)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 0.786709 (SE +/- 0.034022, N = 15, MIN: 0.65)
  b: 0.863089 (SE +/- 0.039890, N = 15, MIN: 0.65)
  c: 0.891727 (SE +/- 0.052959, N = 15, MIN: 0.65)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 0.629303 (SE +/- 0.001158, N = 3, MIN: 0.61)
  b: 0.621966 (SE +/- 0.008037, N = 14, MIN: 0.57)
  c: 0.599391 (SE +/- 0.000382, N = 3, MIN: 0.58)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 5.76384 (SE +/- 0.00332, N = 3, MIN: 5.54)
  b: 5.75816 (SE +/- 0.00221, N = 3, MIN: 5.53)
  c: 5.75127 (SE +/- 0.00220, N = 3, MIN: 5.53)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 7.47730 (SE +/- 0.11831, N = 15, MIN: 2.84)
  b: 7.36674 (SE +/- 0.09423, N = 15, MIN: 2.98)
  c: 7.86792 (SE +/- 0.15590, N = 15, MIN: 2.72)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 3.42736 (SE +/- 0.00441, N = 3, MIN: 3.38)
  b: 3.42391 (SE +/- 0.00149, N = 3, MIN: 3.39)
  c: 3.42519 (SE +/- 0.00291, N = 3, MIN: 3.38)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 5.91006 (SE +/- 0.00461, N = 3, MIN: 5.6)
  b: 5.89829 (SE +/- 0.00611, N = 3, MIN: 5.63)
  c: 5.86806 (SE +/- 0.00679, N = 3, MIN: 5.65)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 0.955404 (SE +/- 0.007682, N = 15, MIN: 0.86)
  b: 0.972727 (SE +/- 0.011174, N = 3, MIN: 0.86)
  c: 0.937470 (SE +/- 0.006580, N = 3, MIN: 0.86)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 1.44911 (SE +/- 0.00031, N = 3, MIN: 1.44)
  b: 1.44938 (SE +/- 0.00127, N = 3, MIN: 1.43)
  c: 1.44885 (SE +/- 0.00014, N = 3, MIN: 1.44)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 2119.68 (SE +/- 24.68, N = 3, MIN: 1981.97)
  b: 2122.24 (SE +/- 24.57, N = 3, MIN: 1978.3)
  c: 2089.67 (SE +/- 1.48, N = 3, MIN: 1979.87)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 1087.51 (SE +/- 8.11, N = 11, MIN: 1012.76)
  b: 1073.62 (SE +/- 4.14, N = 3, MIN: 1013.98)
  c: 1104.46 (SE +/- 15.65, N = 3, MIN: 1015.48)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 2126.29 (SE +/- 21.82, N = 5, MIN: 1980.46)
  b: 2143.27 (SE +/- 15.78, N = 11, MIN: 1977.9)
  c: 2097.67 (SE +/- 21.19, N = 6, MIN: 1978.89)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 1098.42 (SE +/- 11.90, N = 5, MIN: 1013.68)
  b: 1099.64 (SE +/- 10.37, N = 7, MIN: 1013.55)
  c: 1095.20 (SE +/- 9.64, N = 15, MIN: 1014.02)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 1.269326 (SE +/- 0.079866, N = 15, MIN: 0.74)
  b: 1.179855 (SE +/- 0.089715, N = 12, MIN: 0.73)
  c: 1.382935 (SE +/- 0.081255, N = 15, MIN: 0.73)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 2117.94 (SE +/- 26.18, N = 3, MIN: 1979.58)
  b: 2148.96 (SE +/- 25.31, N = 4, MIN: 1978.19)
  c: 2113.69 (SE +/- 21.10, N = 3, MIN: 1980.31)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 1084.75 (SE +/- 12.11, N = 4, MIN: 1014.51)
  b: 1095.58 (SE +/- 8.18, N = 15, MIN: 1014.02)
  c: 1083.23 (SE +/- 13.87, N = 3, MIN: 1015.17)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0 - ms, fewer is better
  a: 0.783081 (SE +/- 0.020828, N = 15, MIN: 0.53)
  b: 0.786177 (SE +/- 0.058336, N = 12, MIN: 0.53)
  c: 0.759611 (SE +/- 0.017020, N = 15, MIN: 0.55)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl


Phoronix Test Suite v10.8.4