oidn onednn

Intel Core i9-13900K testing with an ASUS PRIME Z790-P WIFI (0812 BIOS) and AMD Radeon RX 7900 XTX 24GB on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2310123-PTS-OIDNONED01&rdt&grw.

oidn onednn

Runs: a, b, c

Processor: Intel Core i9-13900K @ 5.50GHz (24 Cores / 32 Threads)
Motherboard: ASUS PRIME Z790-P WIFI (0812 BIOS)
Chipset: Intel Device 7a27
Memory: 32GB
Disk: Western Digital WD_BLACK SN850X 1000GB
Graphics: AMD Radeon RX 7900 XTX 24GB (2304/1249MHz)
Audio: Realtek ALC897
Monitor: ASUS VP28U
OS: Ubuntu 23.10
Kernel: 6.5.0-7-generic (x86_64)
Desktop: GNOME Shell 45.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 23.2.1-1ubuntu3 (LLVM 15.0.7 DRM 3.54)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate powersave (EPP: balance_performance)
- CPU Microcode: 0x119
- Thermald 2.5.4

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence
- srbds: Not affected
- tsx_async_abort: Not affected

oidn onednn: result overview

onednn results are in ms (fewer is better); oidn results are in images / sec (more is better).

Test                                                             a          b          c
onednn: IP Shapes 1D - f32 - CPU                                 1.83494    2.01307    1.97625
onednn: IP Shapes 3D - f32 - CPU                                 4.12616    4.17964    4.14179
onednn: IP Shapes 1D - u8s8f32 - CPU                             0.693736   0.722680   0.687603
onednn: IP Shapes 3D - u8s8f32 - CPU                             0.695792   0.703904   0.698515
onednn: Convolution Batch Shapes Auto - f32 - CPU                6.00104    5.95660    6.24231
onednn: Deconvolution Batch shapes_1d - f32 - CPU                6.75811    6.83282    7.12042
onednn: Deconvolution Batch shapes_3d - f32 - CPU                4.04436    3.97624    3.99314
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU            5.92499    5.96397    5.86711
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU            0.992953   0.992269   1.007215
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU            1.59380    1.58094    1.56556
onednn: Recurrent Neural Network Training - f32 - CPU            2109.87    2123.00    2155.38
onednn: Recurrent Neural Network Inference - f32 - CPU           1091.63    1094.82    1083.28
onednn: Recurrent Neural Network Training - u8s8f32 - CPU        2139.93    2117.34    2164.01
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU       1092.27    1079.34    1088.58
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU   2138.32    2125.44    2154.15
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU  1107.24    1105.66    1079.33
oidn: RT.hdr_alb_nrm.3840x2160 - CPU-Only                        0.67       0.67       0.67
oidn: RT.ldr_alb_nrm.3840x2160 - CPU-Only                        0.67       0.67       0.67
oidn: RTLightmap.hdr.4096x4096 - CPU-Only                        0.33       0.33       0.33
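As a rough illustration of how close the three runs are, the gap between the fastest and slowest mean can be expressed as a percentage of the fastest. The sketch below is not part of the Phoronix Test Suite output: the helper name spread_percent and the choice of rows are illustrative assumptions, and the values are copied from the result overview for runs a, b and c.

```python
# Mean results (ms) for runs a, b, c, copied from the result overview.
# Only a few rows are included here for illustration.
results = {
    "IP Shapes 1D - f32": (1.83494, 2.01307, 1.97625),
    "Deconvolution Batch shapes_1d - f32": (6.75811, 6.83282, 7.12042),
    "Recurrent Neural Network Training - f32": (2109.87, 2123.00, 2155.38),
}


def spread_percent(values):
    """Return (max - min) / min as a percentage (hypothetical helper)."""
    lowest, highest = min(values), max(values)
    return (highest - lowest) / lowest * 100.0


for name, values in results.items():
    print(f"{name}: {spread_percent(values):.2f}% spread across a/b/c")
```

A spread of a few percent on identical hardware is best read alongside the per-test SE and N values before treating any run as faster.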

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 1.83494 (SE +/- 0.00124, N = 3, MIN: 1.58)
b: 2.01307 (SE +/- 0.01347, N = 3, MIN: 1.58)
c: 1.97625 (SE +/- 0.00479, N = 3, MIN: 1.58)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 4.12616 (SE +/- 0.01877, N = 3, MIN: 3.85)
b: 4.17964 (SE +/- 0.01855, N = 3, MIN: 3.89)
c: 4.14179 (SE +/- 0.01054, N = 3, MIN: 3.9)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 0.693736 (SE +/- 0.003785, N = 3, MIN: 0.62)
b: 0.722680 (SE +/- 0.004468, N = 3, MIN: 0.62)
c: 0.687603 (SE +/- 0.000543, N = 3, MIN: 0.62)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 0.695792 (SE +/- 0.003205, N = 3, MIN: 0.58)
b: 0.703904 (SE +/- 0.003187, N = 3, MIN: 0.6)
c: 0.698515 (SE +/- 0.003928, N = 3, MIN: 0.59)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 6.00104 (SE +/- 0.07544, N = 3, MIN: 5.53)
b: 5.95660 (SE +/- 0.03828, N = 3, MIN: 5.67)
c: 6.24231 (SE +/- 0.02742, N = 3, MIN: 5.46)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 6.75811 (SE +/- 0.08251, N = 4, MIN: 2.64)
b: 6.83282 (SE +/- 0.08585, N = 15, MIN: 2.59)
c: 7.12042 (SE +/- 0.13608, N = 15, MIN: 2.61)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
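The sample counts for this test differ between runs (N = 4 for a, N = 15 for b and c), so the SE figures are not directly comparable. Assuming the reported SE is the conventional standard error of the mean (sample standard deviation divided by the square root of N), which this export does not state, the run-to-run standard deviation can be estimated as follows. The helper name implied_stddev is hypothetical.

```python
import math


def implied_stddev(standard_error, n):
    """Estimate the sample standard deviation from SE and N.

    Assumes SE = stddev / sqrt(N); this definition is an assumption,
    not something stated in the result export.
    """
    return standard_error * math.sqrt(n)


# (SE, N) pairs for Deconvolution Batch shapes_1d - f32, runs a, b, c.
runs = {"a": (0.08251, 4), "b": (0.08585, 15), "c": (0.13608, 15)}

for label, (se, n) in runs.items():
    print(f"{label}: implied stddev ~ {implied_stddev(se, n):.3f} ms (N = {n})")
```

Under that assumption, a similar SE at a larger N corresponds to a larger underlying variation between individual samples.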

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 4.04436 (SE +/- 0.11784, N = 15, MIN: 3.39)
b: 3.97624 (SE +/- 0.08784, N = 15, MIN: 3.37)
c: 3.99314 (SE +/- 0.11227, N = 15, MIN: 3.38)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 5.92499 (SE +/- 0.02525, N = 3, MIN: 4.78)
b: 5.96397 (SE +/- 0.02944, N = 3, MIN: 4.87)
c: 5.86711 (SE +/- 0.03630, N = 3, MIN: 4.86)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 0.992953 (SE +/- 0.007985, N = 15, MIN: 0.86)
b: 0.992269 (SE +/- 0.009400, N = 15, MIN: 0.86)
c: 1.007215 (SE +/- 0.006142, N = 3, MIN: 0.86)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 1.59380 (SE +/- 0.03076, N = 15, MIN: 1.42)
b: 1.58094 (SE +/- 0.02186, N = 15, MIN: 1.42)
c: 1.56556 (SE +/- 0.01738, N = 15, MIN: 1.42)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 2109.87 (SE +/- 25.97, N = 3, MIN: 1942.02)
b: 2123.00 (SE +/- 24.29, N = 3, MIN: 1953.79)
c: 2155.38 (SE +/- 10.68, N = 3, MIN: 1933.95)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 1091.63 (SE +/- 10.88, N = 3, MIN: 981.79)
b: 1094.82 (SE +/- 13.48, N = 4, MIN: 981.46)
c: 1083.28 (SE +/- 8.50, N = 10, MIN: 977.31)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 2139.93 (SE +/- 16.32, N = 3, MIN: 1934.49)
b: 2117.34 (SE +/- 19.76, N = 3, MIN: 1947.46)
c: 2164.01 (SE +/- 4.65, N = 3, MIN: 1943.27)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 1092.27 (SE +/- 11.13, N = 6, MIN: 980.94)
b: 1079.34 (SE +/- 12.74, N = 4, MIN: 979.13)
c: 1088.58 (SE +/- 13.94, N = 3, MIN: 982.99)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 2138.32 (SE +/- 21.78, N = 3, MIN: 1963.87)
b: 2125.44 (SE +/- 20.94, N = 3, MIN: 1948.17)
c: 2154.15 (SE +/- 7.75, N = 3, MIN: 1942.2)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.3 (ms, Fewer Is Better)
a: 1107.24 (SE +/- 5.86, N = 3, MIN: 982.65)
b: 1105.66 (SE +/- 2.39, N = 3, MIN: 984.41)
c: 1079.33 (SE +/- 14.03, N = 3, MIN: 981)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

Intel Open Image Denoise

Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only

Intel Open Image Denoise 2.1 (Images / Sec, More Is Better)
a: 0.67 (SE +/- 0.00, N = 3)
b: 0.67 (SE +/- 0.01, N = 3)
c: 0.67 (SE +/- 0.00, N = 3)

Intel Open Image Denoise

Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only

Intel Open Image Denoise 2.1 (Images / Sec, More Is Better)
a: 0.67 (SE +/- 0.00, N = 3)
b: 0.67 (SE +/- 0.00, N = 3)
c: 0.67 (SE +/- 0.00, N = 3)

Intel Open Image Denoise

Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only

Intel Open Image Denoise 2.1 (Images / Sec, More Is Better)
a: 0.33 (SE +/- 0.00, N = 3)
b: 0.33 (SE +/- 0.00, N = 3)
c: 0.33 (SE +/- 0.00, N = 3)


Phoronix Test Suite v10.8.4