onednn 2.7 zen4

AMD Ryzen 7 7700X 8-Core testing with an ASUS ROG CROSSHAIR X670E HERO (0604 BIOS) and GFX1036 512MB on Ubuntu 22.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2209286-PTS-ONEDNN2732&grr.

onednn 2.7 zen4

Runs: A, B, C

Processor: AMD Ryzen 7 7700X 8-Core @ 5.57GHz (8 Cores / 16 Threads)
Motherboard: ASUS ROG CROSSHAIR X670E HERO (0604 BIOS)
Chipset: AMD Device 14d8
Memory: 32GB
Disk: 2000GB Samsung SSD 980 PRO 2TB
Graphics: GFX1036 512MB (2200/3000MHz)
Audio: AMD Device 1640
Monitor: ASUS MG28U
Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
OS: Ubuntu 22.04
Kernel: 6.0.0-060000rc1daily20220820-generic (x86_64)
Desktop: GNOME Shell 42.2
Display Server: X Server 1.21.1.3 + Wayland
OpenGL: 4.6 Mesa 22.3.0-devel (git-4685385 2022-08-23 jammy-oibaf-ppa) (LLVM 14.0.6 DRM 3.48)
Vulkan: 1.3.224
Compiler: GCC 12.0.1 20220319
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-OcsLtf/gcc-12-12-20220319/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-OcsLtf/gcc-12-12-20220319/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: amd-pstate schedutil (Boost: Enabled)
- CPU Microcode: 0xa601203

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

onednn 2.7 zen4

Result overview for runs A, B, C (onednn results in ms, y-cruncher results in seconds; fewer is better)

onednn: Recurrent Neural Network Training - f32 - CPU: A = 2099.39, B = 2080.20, C = 2080.07
onednn: Deconvolution Batch shapes_1d - f32 - CPU: A = 6.12956, B = 6.27580, C = 5.71816
onednn: Recurrent Neural Network Training - u8s8f32 - CPU: A = 2082.16, B = 2083.93, C = 2082.78
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU: A = 2082.27, B = 2082.78, C = 2084.71
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU: A = 1034.29, B = 1034.25, C = 1036.03
onednn: Recurrent Neural Network Inference - f32 - CPU: A = 1033.73, B = 1034.46, C = 1034.46
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU: A = 1033.60, B = 1034.44, C = 1035.48
y-cruncher: 1B: A = 26.771, B = 26.782, C = 26.742
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU: A = 7.42575, B = 7.43741, C = 7.43242
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU: A = 0.740658, B = 0.740963, C = 0.740801
onednn: IP Shapes 1D - f32 - CPU: A = 2.91480, B = 2.92132, C = 2.91229
onednn: IP Shapes 1D - u8s8f32 - CPU: A = 0.547774, B = 0.548203, C = 0.548382
onednn: IP Shapes 1D - bf16bf16bf16 - CPU: A = 1.14277, B = 1.14493, C = 1.14479
y-cruncher: 500M: A = 12.230, B = 12.252, C = 12.238
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU: A = 0.401082, B = 0.404040, C = 0.402066
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU: A = 1.22275, B = 1.22197, C = 1.22231
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU: A = 0.239226, B = 0.239225, C = 0.239466
onednn: IP Shapes 3D - f32 - CPU: A = 4.21459, B = 4.21692, C = 4.21774
onednn: IP Shapes 3D - bf16bf16bf16 - CPU: A = 2.35347, B = 2.38621, C = 2.37762
onednn: IP Shapes 3D - u8s8f32 - CPU: A = 0.922113, B = 0.921455, C = 0.922358
onednn: Convolution Batch Shapes Auto - f32 - CPU: A = 7.38483, B = 7.39387, C = 7.39096
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU: A = 7.19897, B = 7.17681, C = 7.18198
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU: A = 3.17118, B = 3.17100, C = 3.17066
onednn: Deconvolution Batch shapes_3d - f32 - CPU: A = 3.69919, B = 3.70507, C = 3.70630
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU: A = 2.28387, B = 2.28673, C = 2.28671
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU: A = 0.909473, B = 0.904431, C = 0.901712
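
To put the run-to-run differences in the overview into perspective, the following is a minimal Python sketch that computes, for a few of the tests, the gap between the slowest and fastest run as a percentage of the fastest. The means are copied from the overview; the `results` dictionary, its key names, and the `spread_pct` helper are illustrative choices, not part of the Phoronix Test Suite.

```python
# Mean results (fewer is better) copied from the result overview.
# The dictionary layout and key names are illustrative assumptions.
results = {
    "Recurrent Neural Network Training - f32 (ms)": {"A": 2099.39, "B": 2080.20, "C": 2080.07},
    "Deconvolution Batch shapes_1d - f32 (ms)": {"A": 6.12956, "B": 6.27580, "C": 5.71816},
    "y-cruncher 1B (s)": {"A": 26.771, "B": 26.782, "C": 26.742},
}


def spread_pct(runs):
    """Gap between the slowest and fastest run, as a percentage of the fastest."""
    fastest = min(runs.values())
    slowest = max(runs.values())
    return (slowest - fastest) / fastest * 100.0


for name, runs in results.items():
    best = min(runs, key=runs.get)  # run label with the lowest mean
    print(f"{name}: fastest run {best}, spread {spread_pct(runs):.2f}%")
```

A spread well under one percent, as for most tests here, is small enough that it should be read alongside the reported standard errors before calling any run faster.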

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 2099.39 (SE +/- 17.73, N = 8, MIN: 2071.66)
B: 2080.20 (SE +/- 0.98, N = 3, MIN: 2073.24)
C: 2080.07 (SE +/- 1.17, N = 3, MIN: 2072.31)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 6.12956 (SE +/- 0.21200, N = 12, MIN: 4.08)
B: 6.27580 (SE +/- 0.18186, N = 15, MIN: 4.07)
C: 5.71816 (SE +/- 0.15579, N = 15, MIN: 4.05)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
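
This test shows the largest gap between runs (A at 6.12956 ms, C at 5.71816 ms) and also the largest relative standard errors. One rough way to judge such a gap is to express it in units of the combined standard error of the two runs. The Python sketch below does that under two assumptions that the export does not state: that the SE values are standard errors of the mean, and that the runs are independent. The function name `diff_in_se_units` is hypothetical.

```python
import math


def diff_in_se_units(mean_x, se_x, mean_y, se_y):
    """Absolute difference of two means divided by their combined standard error.

    Assumes independent runs, so the standard errors add in quadrature.
    """
    combined_se = math.hypot(se_x, se_y)  # sqrt(se_x**2 + se_y**2)
    return abs(mean_x - mean_y) / combined_se


# Run A vs. run C, Deconvolution Batch shapes_1d - f32 (values from this result).
ratio = diff_in_se_units(6.12956, 0.21200, 5.71816, 0.15579)
print(f"A vs. C differ by {ratio:.2f} combined standard errors")
```

As a heuristic only, a ratio below roughly 2 suggests the difference is within what run-to-run noise could produce; it is not a formal significance test.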

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 2082.16 (SE +/- 0.50, N = 3, MIN: 2075.96)
B: 2083.93 (SE +/- 2.25, N = 3, MIN: 2076.52)
C: 2082.78 (SE +/- 0.93, N = 3, MIN: 2077.2)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 2082.27 (SE +/- 0.23, N = 3, MIN: 2076.08)
B: 2082.78 (SE +/- 2.45, N = 3, MIN: 2074.99)
C: 2084.71 (SE +/- 2.51, N = 3, MIN: 2076.54)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 1034.29 (SE +/- 2.49, N = 3, MIN: 1028.37)
B: 1034.25 (SE +/- 0.54, N = 3, MIN: 1030.99)
C: 1036.03 (SE +/- 1.29, N = 3, MIN: 1030.99)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 1033.73 (SE +/- 0.47, N = 3, MIN: 1029.46)
B: 1034.46 (SE +/- 1.11, N = 3, MIN: 1030.15)
C: 1034.46 (SE +/- 0.98, N = 3, MIN: 1029.66)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 1033.60 (SE +/- 0.76, N = 3, MIN: 1029.06)
B: 1034.44 (SE +/- 0.68, N = 3, MIN: 1030.27)
C: 1035.48 (SE +/- 1.58, N = 3, MIN: 1029.4)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

Y-Cruncher

Pi Digits To Calculate: 1B

Seconds, Fewer Is Better (Y-Cruncher 0.7.10.9513)
A: 26.77 (SE +/- 0.05, N = 3)
B: 26.78 (SE +/- 0.01, N = 3)
C: 26.74 (SE +/- 0.04, N = 3)

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 7.42575 (SE +/- 0.00113, N = 3, MIN: 7.25)
B: 7.43741 (SE +/- 0.00809, N = 3, MIN: 7.22)
C: 7.43242 (SE +/- 0.00246, N = 3, MIN: 7.24)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 0.740658 (SE +/- 0.000378, N = 3, MIN: 0.72)
B: 0.740963 (SE +/- 0.000493, N = 3, MIN: 0.72)
C: 0.740801 (SE +/- 0.000122, N = 3, MIN: 0.72)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 2.91480 (SE +/- 0.00673, N = 3, MIN: 2.75)
B: 2.92132 (SE +/- 0.01135, N = 3, MIN: 2.76)
C: 2.91229 (SE +/- 0.00516, N = 3, MIN: 2.75)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 0.547774 (SE +/- 0.000119, N = 3, MIN: 0.54)
B: 0.548203 (SE +/- 0.000295, N = 3, MIN: 0.54)
C: 0.548382 (SE +/- 0.000374, N = 3, MIN: 0.54)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 1.14277 (SE +/- 0.00072, N = 3, MIN: 1.11)
B: 1.14493 (SE +/- 0.00038, N = 3, MIN: 1.11)
C: 1.14479 (SE +/- 0.00052, N = 3, MIN: 1.11)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

Y-Cruncher

Pi Digits To Calculate: 500M

Seconds, Fewer Is Better (Y-Cruncher 0.7.10.9513)
A: 12.23 (SE +/- 0.04, N = 3)
B: 12.25 (SE +/- 0.02, N = 3)
C: 12.24 (SE +/- 0.01, N = 3)

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 0.401082 (SE +/- 0.000332, N = 3, MIN: 0.39)
B: 0.404040 (SE +/- 0.002438, N = 3, MIN: 0.39)
C: 0.402066 (SE +/- 0.000763, N = 3, MIN: 0.39)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 1.22275 (SE +/- 0.00155, N = 3, MIN: 1.2)
B: 1.22197 (SE +/- 0.00159, N = 3, MIN: 1.19)
C: 1.22231 (SE +/- 0.00145, N = 3, MIN: 1.19)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 0.239226 (SE +/- 0.000324, N = 3, MIN: 0.23)
B: 0.239225 (SE +/- 0.000154, N = 3, MIN: 0.23)
C: 0.239466 (SE +/- 0.000127, N = 3, MIN: 0.23)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 4.21459 (SE +/- 0.00131, N = 3, MIN: 4.18)
B: 4.21692 (SE +/- 0.00162, N = 3, MIN: 4.17)
C: 4.21774 (SE +/- 0.00284, N = 3, MIN: 4.17)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 2.35347 (SE +/- 0.00446, N = 3, MIN: 2.3)
B: 2.38621 (SE +/- 0.02506, N = 3, MIN: 2.3)
C: 2.37762 (SE +/- 0.02001, N = 3, MIN: 2.3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 0.922113 (SE +/- 0.000796, N = 3, MIN: 0.86)
B: 0.921455 (SE +/- 0.001577, N = 3, MIN: 0.86)
C: 0.922358 (SE +/- 0.000899, N = 3, MIN: 0.86)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 7.38483 (SE +/- 0.00173, N = 3, MIN: 7.29)
B: 7.39387 (SE +/- 0.00443, N = 3, MIN: 7.29)
C: 7.39096 (SE +/- 0.00434, N = 3, MIN: 7.28)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 7.19897 (SE +/- 0.02083, N = 3, MIN: 7.09)
B: 7.17681 (SE +/- 0.02459, N = 3, MIN: 7.08)
C: 7.18198 (SE +/- 0.02240, N = 3, MIN: 7.08)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 3.17118 (SE +/- 0.00152, N = 3, MIN: 3.12)
B: 3.17100 (SE +/- 0.00041, N = 3, MIN: 3.12)
C: 3.17066 (SE +/- 0.00095, N = 3, MIN: 3.12)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 3.69919 (SE +/- 0.00164, N = 3, MIN: 3.54)
B: 3.70507 (SE +/- 0.00278, N = 3, MIN: 3.57)
C: 3.70630 (SE +/- 0.00107, N = 3, MIN: 3.55)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 2.28387 (SE +/- 0.00123, N = 3, MIN: 2.18)
B: 2.28673 (SE +/- 0.00217, N = 3, MIN: 2.19)
C: 2.28671 (SE +/- 0.00199, N = 3, MIN: 2.18)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better (oneDNN 2.7)
A: 0.909473 (SE +/- 0.000668, N = 3, MIN: 0.87)
B: 0.904431 (SE +/- 0.000950, N = 3, MIN: 0.86)
C: 0.901712 (SE +/- 0.002094, N = 3, MIN: 0.86)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl


Phoronix Test Suite v10.8.4