tr onednn 3.1

AMD Ryzen Threadripper 3990X 64-Core testing with a Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 23.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2303314-PTS-TRONEDNN36&rdt&grs.

System details (runs a, b, c, d)

Processor: AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads)
Motherboard: Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS)
Chipset: AMD Starship/Matisse
Memory: 128GB
Disk: Samsung SSD 970 EVO Plus 500GB
Graphics: AMD Radeon RX 5700 8GB (1750/875MHz)
Audio: AMD Navi 10 HDMI Audio
Monitor: DELL P2415Q
Network: Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 23.04
Kernel: 6.2.0-18-generic (x86_64)
Desktop: GNOME Shell 44.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 22.3.6 (LLVM 15.0.7 DRM 3.49) and 4.6 Mesa 23.0.1 (LLVM 15.0.7 DRM 3.49); the flattened export does not preserve which runs each string applies to
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0x8301055

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

Result overview (onednn, all results in ms, fewer is better)

Test                                                      a         b         c         d
IP Shapes 3D - f32 - CPU                                  6.47745   6.37674   8.33405   8.41506
Convolution Batch Shapes Auto - f32 - CPU                 0.919370  0.923333  1.02783   0.967448
Deconvolution Batch shapes_1d - f32 - CPU                 10.48951  10.9027   10.2384   9.81509
Deconvolution Batch shapes_1d - u8s8f32 - CPU             1.78620   1.76202   1.82592   1.74741
Convolution Batch Shapes Auto - u8s8f32 - CPU             6.48493   6.54503   6.6289    6.66997
Recurrent Neural Network Inference - bf16bf16bf16 - CPU   844.978   864.209   858.755   841.923
Deconvolution Batch shapes_3d - u8s8f32 - CPU             0.989134  0.97877   0.964088  0.978819
Recurrent Neural Network Inference - f32 - CPU            859.098   844.462   857.907   839.176
Deconvolution Batch shapes_3d - f32 - CPU                 2.08690   2.07202   2.10876   2.08485
Recurrent Neural Network Inference - u8s8f32 - CPU        856.722   858.404   850.4     862.304
Recurrent Neural Network Training - bf16bf16bf16 - CPU    4042.59   4007.14   4027.35   4010.35
Recurrent Neural Network Training - u8s8f32 - CPU         4024.21   4014.98   4008.56   4001.77
Recurrent Neural Network Training - f32 - CPU             4011.41   3998.12   4018.1    4017.57
IP Shapes 3D - u8s8f32 - CPU                              3.48600   1.09122   1.13827   1.14615
IP Shapes 1D - u8s8f32 - CPU                              11.59889  3.60412   2.60751   2.33334
IP Shapes 1D - f32 - CPU                                  3.69076   2.43237   2.42598   1.57043
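
The IP Shapes rows show the largest differences between runs. As a rough illustration of how the spread can be quantified, the sketch below takes four rows copied from the results above and reports the fastest run, the slowest run, and the slowest/fastest ratio. The helper name `spread` and the dictionary layout are illustrative assumptions, not part of the Phoronix Test Suite.

```python
# Results in ms (fewer is better), copied from the IP Shapes rows of this export.
results = {
    "IP Shapes 3D - f32": {"a": 6.47745, "b": 6.37674, "c": 8.33405, "d": 8.41506},
    "IP Shapes 3D - u8s8f32": {"a": 3.48600, "b": 1.09122, "c": 1.13827, "d": 1.14615},
    "IP Shapes 1D - u8s8f32": {"a": 11.59889, "b": 3.60412, "c": 2.60751, "d": 2.33334},
    "IP Shapes 1D - f32": {"a": 3.69076, "b": 2.43237, "c": 2.42598, "d": 1.57043},
}


def spread(runs):
    """Return (fastest_run, slowest_run, slowest/fastest ratio) for one test.

    Hypothetical helper: lower time is better, so the fastest run is the minimum.
    """
    fastest = min(runs, key=runs.get)
    slowest = max(runs, key=runs.get)
    return fastest, slowest, runs[slowest] / runs[fastest]


for name, runs in results.items():
    fastest, slowest, ratio = spread(runs)
    print(f"{name}: fastest={fastest} slowest={slowest} ratio={ratio:.2f}x")
```

A ratio close to 1.0 means the four runs agree; a large ratio suggests that at least one run is an outlier, and it is worth checking against the SE reported for that test.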

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    6.47745    5.78
b    6.37674    6.23
c    8.33405    8.22
d    8.41506    8.3
SE +/- 0.01580, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    0.919370   0.85
b    0.923333   0.86
c    1.027830   0.94
d    0.967448   0.89
SE +/- 0.002441, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    10.48951   8.19
b    10.90270   8.81
c    10.23840   8.5
d    9.81509    8.37
SE +/- 0.09699, N = 7
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    1.78620    1.47
b    1.76202    1.54
c    1.82592    1.5
d    1.74741    1.49
SE +/- 0.01827, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    6.48493    6.38
b    6.54503    6.43
c    6.62890    6.52
d    6.66997    6.56
SE +/- 0.00405, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    844.98     824.01
b    864.21     847.67
c    858.76     842.97
d    841.92     825.42
SE +/- 3.68, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    0.989134   0.93
b    0.978770   0.92
c    0.964088   0.9
d    0.978819   0.92
SE +/- 0.001428, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    859.10     831.54
b    844.46     828.46
c    857.91     840.05
d    839.18     821.06
SE +/- 6.16, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    2.08690    2.03
b    2.07202    2.02
c    2.10876    2.03
d    2.08485    2.03
SE +/- 0.01000, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    856.72     825.43
b    858.40     842.21
c    850.40     834.24
d    862.30     844.3
SE +/- 8.46, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    4042.59    4009.52
b    4007.14    3981.44
c    4027.35    4005.47
d    4010.35    3987.69
SE +/- 7.92, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    4024.21    3990.89
b    4014.98    3992.86
c    4008.56    3987.06
d    4001.77    3979.09
SE +/- 9.26, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    4011.41    3977.24
b    3998.12    3974.85
c    4018.10    3995.65
d    4017.57    3992.22
SE +/- 6.72, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    3.48600    0.97
b    1.09122    0.99
c    1.13827    1.05
d    1.14615    1.04
SE +/- 0.52175, N = 15
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    11.59889   1.68
b    3.60412    2.35
c    2.60751    2.14
d    2.33334    1.99
SE +/- 1.62613, N = 12
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 3.1 (ms, fewer is better)
Run  Result     MIN
a    3.69076    2.44
b    2.43237    1.98
c    2.42598    2
d    1.57043    1.39
SE +/- 0.06973, N = 12
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread


Phoronix Test Suite v10.8.5