avx512 onednn 3.0 ryzen 9 7950x

AMD Ryzen 9 7950X 16-Core testing with an ASUS ROG CROSSHAIR X670E HERO (0805 BIOS) and AMD Radeon RX 7900 XTX 24GB on Ubuntu 22.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2212204-PTS-AVX512ON17&grw.

Result identifiers: abccd

Processor: AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG CROSSHAIR X670E HERO (0805 BIOS)
Chipset: AMD Device 14d8
Memory: 32GB
Disk: Western Digital WD_BLACK SN850X 1000GB + 2000GB
Graphics: AMD Radeon RX 7900 XTX 24GB (3220/1249MHz)
Audio: AMD Device ab30
Monitor: ASUS MG28U
Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
OS: Ubuntu 22.04
Kernel: 5.15.0-56-generic (x86_64)
Desktop: GNOME Shell 42.5
Display Server: X Server 1.21.1.3 + Wayland
OpenGL: 4.6 Mesa 22.3.0-devel (LLVM 15.0.3 DRM 3.49)
OpenCL: OpenCL 2.1 AMD-APP (3513.0)
Compiler: GCC 11.3.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: amd-pstate schedutil (Boost: Enabled)
- CPU Microcode: 0xa601203

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

Result overview (identifiers: abccd; values per test listed in that order)

onednn: IP Shapes 1D - f32 - CPU                                            1.71973    1.71247    1.72827    1.74523
onednn: IP Shapes 3D - f32 - CPU                                            3.24945    3.23468    3.27078    3.23747
onednn: IP Shapes 1D - u8s8f32 - CPU                                        0.458029   0.357162   0.357673   0.368789
onednn: IP Shapes 3D - u8s8f32 - CPU                                        0.342068   0.329774   0.368081   0.376915
onednn: IP Shapes 1D - bf16bf16bf16 - CPU                                   0.702572   0.768543   0.631368   0.718084
onednn: IP Shapes 3D - bf16bf16bf16 - CPU                                   1.54703    1.57145    1.57079    1.54588
onednn: Convolution Batch Shapes Auto - f32 - CPU                           5.60386    5.60673    5.61189    5.60781
onednn: Deconvolution Batch shapes_1d - f32 - CPU                           4.36135    3.76311    2.56838    4.77297
onednn: Deconvolution Batch shapes_3d - f32 - CPU                           2.36413    2.55866    2.35925    2.35522
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU                       5.20379    5.19629    5.22347    5.27147
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU                       0.421798   0.420827   0.420617   0.420822
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU                       0.587832   0.586989   0.584807   0.588675
onednn: Recurrent Neural Network Training - f32 - CPU                       1135.50    1129.74    1135.82    1134.55
onednn: Recurrent Neural Network Inference - f32 - CPU                      580.995    583.066    582.247    581.666
onednn: Recurrent Neural Network Training - u8s8f32 - CPU                   1136.91    1133.9     1133.54    1134.22
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU                  1.69724    1.69655    1.69867    1.69653
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU                  3.34345    3.37554    3.3673     3.36723
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU                  1.41319    1.40158    1.40127    1.39831
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU                  580.580    582.957    579.434    583.752
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU                0.436887   0.439701   0.438763   0.439201
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU              1135.35    1135.16    1132.81    1134.7
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU             582.378    576.964    569.845    583.767
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU            0.129773   0.130403   0.129929   0.129658
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU       0.209725   0.212361   0.211933   0.210342
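
One way to read the overview is to look at how far apart the four results for a test are. The following is a minimal Python sketch, not part of the Phoronix Test Suite output: the dictionary name `results_ms`, the helper `relative_spread`, and the choice of four tests are illustrative assumptions, while the numbers are copied from the overview (in ms, fewer is better).

```python
# Values in ms, copied from the result overview; fewer is better.
# The four entries per test follow the order used in the export.
results_ms = {
    "IP Shapes 1D - f32": [1.71973, 1.71247, 1.72827, 1.74523],
    "Convolution Batch Shapes Auto - f32": [5.60386, 5.60673, 5.61189, 5.60781],
    "Deconvolution Batch shapes_1d - f32": [4.36135, 3.76311, 2.56838, 4.77297],
    "Recurrent Neural Network Inference - bf16bf16bf16": [582.378, 576.964, 569.845, 583.767],
}


def relative_spread(values):
    """Return (max - min) / min: the gap between worst and best result as a fraction."""
    best = min(values)
    return (max(values) - best) / best


# Hypothetical summary: spread per test, largest first.
spreads = {name: relative_spread(vals) for name, vals in results_ms.items()}

for name, spread in sorted(spreads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {spread:.2%}")
```

With these numbers, the Deconvolution Batch shapes_1d f32 results differ by roughly 86% between best and worst, while the Convolution Batch Shapes Auto f32 results differ by well under 1%, so the former is better treated as noisy than as a real difference between results.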

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 1.71973, 1.71247, 1.72827, 1.74523
MIN: 1.53, 1.52, 1.53, 1.54
SE +/- 0.00347, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 3.24945, 3.23468, 3.27078, 3.23747
MIN: 3.18, 3.18, 3.21, 3.18
SE +/- 0.01389, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 0.458029, 0.357162, 0.357673, 0.368789
MIN: 0.34, 0.34, 0.34, 0.35
SE +/- 0.022723, N = 12
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 0.342068, 0.329774, 0.368081, 0.376915
MIN: 0.29, 0.3, 0.32, 0.34
SE +/- 0.003911, N = 15
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 0.702572, 0.768543, 0.631368, 0.718084
MIN: 0.58, 0.67, 0.58, 0.59
SE +/- 0.021497, N = 15
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 1.54703, 1.57145, 1.57079, 1.54588
MIN: 1.42, 1.45, 1.44, 1.47
SE +/- 0.01845, N = 4
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 5.60386, 5.60673, 5.61189, 5.60781
MIN: 5.51, 5.51, 5.52, 5.52
SE +/- 0.00957, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 4.36135, 3.76311, 2.56838, 4.77297
MIN: 2.42, 2.45, 2.39, 2.45
SE +/- 0.31165, N = 12
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 2.36413, 2.55866, 2.35925, 2.35522
MIN: 2.28, 2.29, 2.28, 2.29
SE +/- 0.00179, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 5.20379, 5.19629, 5.22347, 5.27147
MIN: 5.11, 5.13, 5.13, 5.14
SE +/- 0.00451, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 0.421798, 0.420827, 0.420617, 0.420822
MIN: 0.4, 0.4, 0.4, 0.4
SE +/- 0.000955, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 0.587832, 0.586989, 0.584807, 0.588675
MIN: 0.57, 0.57, 0.56, 0.57
SE +/- 0.000795, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 1135.50, 1129.74, 1135.82, 1134.55
MIN: 1125.5, 1125.41, 1130.28, 1129.16
SE +/- 2.11, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 581.00, 583.07, 582.25, 581.67
MIN: 572.88, 576.88, 576.12, 576.15
SE +/- 1.72, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 1136.91, 1133.90, 1133.54, 1134.22
MIN: 1128.7, 1128.93, 1127.45, 1128.64
SE +/- 1.06, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 1.69724, 1.69655, 1.69867, 1.69653
MIN: 1.65, 1.65, 1.65, 1.65
SE +/- 0.00009, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 3.34345, 3.37554, 3.36730, 3.36723
MIN: 3.18, 3.27, 3.26, 3.26
SE +/- 0.02067, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 1.41319, 1.40158, 1.40127, 1.39831
MIN: 1.35, 1.36, 1.36, 1.36
SE +/- 0.00885, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 580.58, 582.96, 579.43, 583.75
MIN: 571.58, 577.42, 573.74, 577.54
SE +/- 1.94, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 0.436887, 0.439701, 0.438763, 0.439201
MIN: 0.42, 0.42, 0.42, 0.42
SE +/- 0.001916, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 1135.35, 1135.16, 1132.81, 1134.70
MIN: 1128.23, 1129.54, 1127.61, 1129.33
SE +/- 0.68, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 582.38, 576.96, 569.85, 583.77
MIN: 574.3, 571.43, 566.12, 578.1
SE +/- 1.08, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 0.129773, 0.130403, 0.129929, 0.129658
MIN: 0.12, 0.12, 0.12, 0.12
SE +/- 0.000417, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
Results (abccd, in order): 0.209725, 0.212361, 0.211933, 0.210342
MIN: 0.2, 0.2, 0.2, 0.2
SE +/- 0.001018, N = 3
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl


Phoronix Test Suite v10.8.5