oidn onednn

Intel Core i9-13900K testing with a ASUS PRIME Z790-P WIFI (0812 BIOS) and AMD Radeon RX 7900 XTX 24GB on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2310123-PTS-OIDNONED01&grw&sor.

oidn onednnProcessorMotherboardChipsetMemoryDiskGraphicsAudioMonitorOSKernelDesktopDisplay ServerOpenGLCompilerFile-SystemScreen ResolutionabcIntel Core i9-13900K @ 5.50GHz (24 Cores / 32 Threads)ASUS PRIME Z790-P WIFI (0812 BIOS)Intel Device 7a2732GBWestern Digital WD_BLACK SN850X 1000GBAMD Radeon RX 7900 XTX 24GB (2304/1249MHz)Realtek ALC897ASUS VP28UUbuntu 23.106.5.0-7-generic (x86_64)GNOME Shell 45.0X Server + Wayland4.6 Mesa 23.2.1-1ubuntu3 (LLVM 15.0.7 DRM 3.54)GCC 13.2.0ext43840x2160OpenBenchmarking.orgKernel Details- Transparent Huge Pages: madviseCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x119 - Thermald 2.5.4 Security Details- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected

oidn onednnonednn: IP Shapes 1D - f32 - CPUonednn: IP Shapes 3D - f32 - CPUonednn: IP Shapes 1D - u8s8f32 - CPUonednn: IP Shapes 3D - u8s8f32 - CPUonednn: Convolution Batch Shapes Auto - f32 - CPUonednn: Deconvolution Batch shapes_1d - f32 - CPUonednn: Deconvolution Batch shapes_3d - f32 - CPUonednn: Convolution Batch Shapes Auto - u8s8f32 - CPUonednn: Deconvolution Batch shapes_1d - u8s8f32 - CPUonednn: Deconvolution Batch shapes_3d - u8s8f32 - CPUonednn: Recurrent Neural Network Training - f32 - CPUonednn: Recurrent Neural Network Inference - f32 - CPUonednn: Recurrent Neural Network Training - u8s8f32 - CPUonednn: Recurrent Neural Network Inference - u8s8f32 - CPUonednn: Recurrent Neural Network Training - bf16bf16bf16 - CPUonednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPUoidn: RT.hdr_alb_nrm.3840x2160 - CPU-Onlyoidn: RT.ldr_alb_nrm.3840x2160 - CPU-Onlyoidn: RTLightmap.hdr.4096x4096 - CPU-Onlyabc1.834944.126160.6937360.6957926.001046.758114.044365.924990.9929531.593802109.871091.632139.931092.272138.321107.240.670.670.332.013074.179640.7226800.7039045.956606.832823.976245.963970.9922691.580942123.001094.822117.341079.342125.441105.660.670.670.331.976254.141790.6876030.6985156.242317.120423.993145.867111.0072151.565562155.381083.282164.011088.582154.151079.330.670.670.33OpenBenchmarking.org

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 1D - Data Type: f32 - Engine: CPUacb0.45290.90581.35871.81162.2645SE +/- 0.00124, N = 3SE +/- 0.00479, N = 3SE +/- 0.01347, N = 31.834941.976252.01307MIN: 1.58MIN: 1.58MIN: 1.581. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 3D - Data Type: f32 - Engine: CPUacb0.94041.88082.82123.76164.702SE +/- 0.01877, N = 3SE +/- 0.01054, N = 3SE +/- 0.01855, N = 34.126164.141794.17964MIN: 3.85MIN: 3.9MIN: 3.891. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPUcab0.16260.32520.48780.65040.813SE +/- 0.000543, N = 3SE +/- 0.003785, N = 3SE +/- 0.004468, N = 30.6876030.6937360.722680MIN: 0.62MIN: 0.62MIN: 0.621. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPUacb0.15840.31680.47520.63360.792SE +/- 0.003205, N = 3SE +/- 0.003928, N = 3SE +/- 0.003187, N = 30.6957920.6985150.703904MIN: 0.58MIN: 0.59MIN: 0.61. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPUbac246810SE +/- 0.03828, N = 3SE +/- 0.07544, N = 3SE +/- 0.02742, N = 35.956606.001046.24231MIN: 5.67MIN: 5.53MIN: 5.461. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPUabc246810SE +/- 0.08251, N = 4SE +/- 0.08585, N = 15SE +/- 0.13608, N = 156.758116.832827.12042MIN: 2.64MIN: 2.59MIN: 2.611. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPUbca0.911.822.733.644.55SE +/- 0.08784, N = 15SE +/- 0.11227, N = 15SE +/- 0.11784, N = 153.976243.993144.04436MIN: 3.37MIN: 3.38MIN: 3.391. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPUcab1.34192.68384.02575.36766.7095SE +/- 0.03630, N = 3SE +/- 0.02525, N = 3SE +/- 0.02944, N = 35.867115.924995.96397MIN: 4.86MIN: 4.78MIN: 4.871. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPUbac0.22660.45320.67980.90641.133SE +/- 0.009400, N = 15SE +/- 0.007985, N = 15SE +/- 0.006142, N = 30.9922690.9929531.007215MIN: 0.86MIN: 0.86MIN: 0.861. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPUcba0.35860.71721.07581.43441.793SE +/- 0.01738, N = 15SE +/- 0.02186, N = 15SE +/- 0.03076, N = 151.565561.580941.59380MIN: 1.42MIN: 1.42MIN: 1.421. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPUabc5001000150020002500SE +/- 25.97, N = 3SE +/- 24.29, N = 3SE +/- 10.68, N = 32109.872123.002155.38MIN: 1942.02MIN: 1953.79MIN: 1933.951. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPUcab2004006008001000SE +/- 8.50, N = 10SE +/- 10.88, N = 3SE +/- 13.48, N = 41083.281091.631094.82MIN: 977.31MIN: 981.79MIN: 981.461. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPUbac5001000150020002500SE +/- 19.76, N = 3SE +/- 16.32, N = 3SE +/- 4.65, N = 32117.342139.932164.01MIN: 1947.46MIN: 1934.49MIN: 1943.271. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPUbca2004006008001000SE +/- 12.74, N = 4SE +/- 13.94, N = 3SE +/- 11.13, N = 61079.341088.581092.27MIN: 979.13MIN: 982.99MIN: 980.941. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPUbac5001000150020002500SE +/- 20.94, N = 3SE +/- 21.78, N = 3SE +/- 7.75, N = 32125.442138.322154.15MIN: 1948.17MIN: 1963.87MIN: 1942.21. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.3Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPUcba2004006008001000SE +/- 14.03, N = 3SE +/- 2.39, N = 3SE +/- 5.86, N = 31079.331105.661107.24MIN: 981MIN: 984.41MIN: 982.651. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

Intel Open Image Denoise

Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.1Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Onlycba0.15080.30160.45240.60320.754SE +/- 0.00, N = 3SE +/- 0.01, N = 3SE +/- 0.00, N = 30.670.670.67

Intel Open Image Denoise

Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.1Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Onlycba0.15080.30160.45240.60320.754SE +/- 0.00, N = 3SE +/- 0.00, N = 3SE +/- 0.00, N = 30.670.670.67

Intel Open Image Denoise

Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.1Run: RTLightmap.hdr.4096x4096 - Device: CPU-Onlycba0.07430.14860.22290.29720.3715SE +/- 0.00, N = 3SE +/- 0.00, N = 3SE +/- 0.00, N = 30.330.330.33


Phoronix Test Suite v10.8.5