AVX-512 oneDNN 3.0 Ryzen 9 7950X

AMD Ryzen 9 7950X 16-Core testing with an ASUS ROG CROSSHAIR X670E HERO (0805 BIOS) and AMD Radeon RX 7900 XTX 24GB on Ubuntu 22.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2212204-PTS-AVX512ON17&sor&grt.

Test configuration (identical for result identifiers a, b, cc, and d):

  Processor:         AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads)
  Motherboard:       ASUS ROG CROSSHAIR X670E HERO (0805 BIOS)
  Chipset:           AMD Device 14d8
  Memory:            32GB
  Disk:              Western Digital WD_BLACK SN850X 1000GB + 2000GB
  Graphics:          AMD Radeon RX 7900 XTX 24GB (3220/1249MHz)
  Audio:             AMD Device ab30
  Monitor:           ASUS MG28U
  Network:           Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
  OS:                Ubuntu 22.04
  Kernel:            5.15.0-56-generic (x86_64)
  Desktop:           GNOME Shell 42.5
  Display Server:    X Server 1.21.1.3 + Wayland
  OpenGL:            4.6 Mesa 22.3.0-devel (LLVM 15.0.3 DRM 3.49)
  OpenCL:            OpenCL 2.1 AMD-APP (3513.0)
  Compiler:          GCC 11.3.0
  File-System:       ext4
  Screen Resolution: 3840x2160

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: amd-pstate schedutil (Boost: Enabled) - CPU Microcode: 0xa601203

Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
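Whether the AVX-512 angle of this comparison actually comes into play depends on the CPU advertising the relevant feature bits and on oneDNN's runtime dispatcher selecting AVX-512 kernels on it. The sketch below is an illustrative check of both, not part of the exported result; it assumes GCC's __builtin_cpu_supports() feature strings and oneDNN's dnnl::get_effective_cpu_isa() dispatcher query, and the header path and -ldnnl link flag are the usual install defaults rather than anything recorded in this result.

    // Illustrative sketch (not from the exported results): check host AVX-512
    // flags and the ISA oneDNN's CPU dispatcher settles on.
    // Build with something like: g++ -O3 check_isa.cpp -ldnnl
    #include <cstdio>
    #include "oneapi/dnnl/dnnl.hpp"

    int main() {
        // GCC CPUID builtins; the accepted feature-name strings depend on the GCC version.
        std::printf("avx512f:  %d\n", __builtin_cpu_supports("avx512f"));
        std::printf("avx512vl: %d\n", __builtin_cpu_supports("avx512vl"));

        // Ask oneDNN which instruction set its JIT kernels will target at runtime.
        const dnnl::cpu_isa isa = dnnl::get_effective_cpu_isa();
        std::printf("oneDNN picked avx512_core_vnni: %s\n",
                    isa == dnnl::cpu_isa::avx512_core_vnni ? "yes" : "no");
        std::printf("oneDNN picked avx512_core_bf16: %s\n",
                    isa == dnnl::cpu_isa::avx512_core_bf16 ? "yes" : "no");
        return 0;
    }

If needed, the same benchmark binaries can be rerun with the ONEDNN_MAX_CPU_ISA environment variable (for example ONEDNN_MAX_CPU_ISA=AVX2), which oneDNN documents as part of its CPU dispatcher controls, to cap the dispatcher and isolate the AVX-512 contribution.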

Results overview (oneDNN 3.0; all values in ms, fewer is better):

  IP Shapes 1D - f32 - CPU:  a: 1.71973 / b: 1.71247 / cc: 1.72827 / d: 1.74523
  IP Shapes 3D - f32 - CPU:  a: 3.24945 / b: 3.23468 / cc: 3.27078 / d: 3.23747
  IP Shapes 1D - u8s8f32 - CPU:  a: 0.458029 / b: 0.357162 / cc: 0.357673 / d: 0.368789
  IP Shapes 3D - u8s8f32 - CPU:  a: 0.342068 / b: 0.329774 / cc: 0.368081 / d: 0.376915
  IP Shapes 1D - bf16bf16bf16 - CPU:  a: 0.702572 / b: 0.768543 / cc: 0.631368 / d: 0.718084
  IP Shapes 3D - bf16bf16bf16 - CPU:  a: 1.54703 / b: 1.57145 / cc: 1.57079 / d: 1.54588
  Convolution Batch Shapes Auto - f32 - CPU:  a: 5.60386 / b: 5.60673 / cc: 5.61189 / d: 5.60781
  Deconvolution Batch shapes_1d - f32 - CPU:  a: 4.36135 / b: 3.76311 / cc: 2.56838 / d: 4.77297
  Deconvolution Batch shapes_3d - f32 - CPU:  a: 2.36413 / b: 2.55866 / cc: 2.35925 / d: 2.35522
  Convolution Batch Shapes Auto - u8s8f32 - CPU:  a: 5.20379 / b: 5.19629 / cc: 5.22347 / d: 5.27147
  Deconvolution Batch shapes_1d - u8s8f32 - CPU:  a: 0.421798 / b: 0.420827 / cc: 0.420617 / d: 0.420822
  Deconvolution Batch shapes_3d - u8s8f32 - CPU:  a: 0.587832 / b: 0.586989 / cc: 0.584807 / d: 0.588675
  Recurrent Neural Network Training - f32 - CPU:  a: 1135.50 / b: 1129.74 / cc: 1135.82 / d: 1134.55
  Recurrent Neural Network Inference - f32 - CPU:  a: 580.995 / b: 583.066 / cc: 582.247 / d: 581.666
  Recurrent Neural Network Training - u8s8f32 - CPU:  a: 1136.91 / b: 1133.90 / cc: 1133.54 / d: 1134.22
  Convolution Batch Shapes Auto - bf16bf16bf16 - CPU:  a: 1.69724 / b: 1.69655 / cc: 1.69867 / d: 1.69653
  Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU:  a: 3.34345 / b: 3.37554 / cc: 3.36730 / d: 3.36723
  Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU:  a: 1.41319 / b: 1.40158 / cc: 1.40127 / d: 1.39831
  Recurrent Neural Network Inference - u8s8f32 - CPU:  a: 580.580 / b: 582.957 / cc: 579.434 / d: 583.752
  Matrix Multiply Batch Shapes Transformer - f32 - CPU:  a: 0.436887 / b: 0.439701 / cc: 0.438763 / d: 0.439201
  Recurrent Neural Network Training - bf16bf16bf16 - CPU:  a: 1135.35 / b: 1135.16 / cc: 1132.81 / d: 1134.70
  Recurrent Neural Network Inference - bf16bf16bf16 - CPU:  a: 582.378 / b: 576.964 / cc: 569.845 / d: 583.767
  Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU:  a: 0.129773 / b: 0.130403 / cc: 0.129929 / d: 0.129658
  Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU:  a: 0.209725 / b: 0.212361 / cc: 0.211933 / d: 0.210342
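The data-type labels in the harness names read roughly as source/weights/destination: f32 is plain single precision, u8s8f32 is the quantized inference path (u8 source, s8 weights, f32 output), and bf16bf16bf16 keeps source, weights, and output in bfloat16. As a rough illustration of the kind of primitive a run such as "Matrix Multiply Batch Shapes Transformer - bf16bf16bf16" exercises, the sketch below creates and executes one bf16 matmul through the oneDNN 3.0 C++ API; the shapes are arbitrary stand-ins rather than the harness's actual batch shapes, and the header path and link flag are common defaults, not something taken from this result.

    // Illustrative sketch (not the PTS harness itself): a single bf16 matmul via
    // the oneDNN 3.0 C++ API. Build with something like: g++ -O3 mm.cpp -ldnnl
    #include "oneapi/dnnl/dnnl.hpp"

    int main() {
        using namespace dnnl;
        engine eng(engine::kind::cpu, 0);
        stream strm(eng);

        const memory::dim M = 128, K = 768, N = 768;  // made-up, transformer-ish sizes

        // bf16 source, bf16 weights, bf16 destination; plain row-major ("ab") layouts.
        memory::desc src_md({M, K}, memory::data_type::bf16, memory::format_tag::ab);
        memory::desc wei_md({K, N}, memory::data_type::bf16, memory::format_tag::ab);
        memory::desc dst_md({M, N}, memory::data_type::bf16, memory::format_tag::ab);

        // oneDNN 3.0 style: the primitive descriptor is created on the engine directly.
        matmul::primitive_desc pd(eng, src_md, wei_md, dst_md);
        matmul mm(pd);

        memory src(src_md, eng), wei(wei_md, eng), dst(dst_md, eng);

        // Execute once; running with ONEDNN_VERBOSE=1 in the environment prints
        // which JIT ISA (e.g. avx512_core_bf16) the kernel was dispatched to.
        mm.execute(strm, {{DNNL_ARG_SRC, src},
                          {DNNL_ARG_WEIGHTS, wei},
                          {DNNL_ARG_DST, dst}});
        strm.wait();
        return 0;
    }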

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.00347, N = 3
  b:  1.71247  (MIN: 1.52)
  a:  1.71973  (MIN: 1.53)
  cc: 1.72827  (MIN: 1.53)
  d:  1.74523  (MIN: 1.54)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.01389, N = 3
  b:  3.23468  (MIN: 3.18)
  d:  3.23747  (MIN: 3.18)
  a:  3.24945  (MIN: 3.18)
  cc: 3.27078  (MIN: 3.21)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.022723, N = 12
  b:  0.357162  (MIN: 0.34)
  cc: 0.357673  (MIN: 0.34)
  d:  0.368789  (MIN: 0.35)
  a:  0.458029  (MIN: 0.34)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.003911, N = 15
  b:  0.329774  (MIN: 0.3)
  a:  0.342068  (MIN: 0.29)
  cc: 0.368081  (MIN: 0.32)
  d:  0.376915  (MIN: 0.34)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
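The u8s8f32 results here and in the later convolution, deconvolution, and matrix-multiply sections exercise oneDNN's int8 path. Below is a minimal sketch of that type combination; it uses the matmul primitive for brevity rather than the inner-product primitive behind the IP Shapes harnesses, and all names and shapes are illustrative assumptions.

    // Illustrative sketch of the u8s8f32 combination: unsigned 8-bit source,
    // signed 8-bit weights, f32 destination. Not part of the exported results.
    #include "oneapi/dnnl/dnnl.hpp"

    int main() {
        using namespace dnnl;
        engine eng(engine::kind::cpu, 0);
        stream strm(eng);

        const memory::dim M = 128, K = 512, N = 512;  // arbitrary example shape

        memory::desc src_md({M, K}, memory::data_type::u8,  memory::format_tag::ab);
        memory::desc wei_md({K, N}, memory::data_type::s8,  memory::format_tag::ab);
        memory::desc dst_md({M, N}, memory::data_type::f32, memory::format_tag::ab);

        matmul::primitive_desc pd(eng, src_md, wei_md, dst_md);
        matmul mm(pd);

        memory src(src_md, eng), wei(wei_md, eng), dst(dst_md, eng);
        mm.execute(strm, {{DNNL_ARG_SRC, src},
                          {DNNL_ARG_WEIGHTS, wei},
                          {DNNL_ARG_DST, dst}});
        strm.wait();
        return 0;
    }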

oneDNN

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.021497, N = 15
  cc: 0.631368  (MIN: 0.58)
  a:  0.702572  (MIN: 0.58)
  d:  0.718084  (MIN: 0.59)
  b:  0.768543  (MIN: 0.67)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.01845, N = 4
  d:  1.54588  (MIN: 1.47)
  a:  1.54703  (MIN: 1.42)
  cc: 1.57079  (MIN: 1.44)
  b:  1.57145  (MIN: 1.45)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.00957, N = 3
  a:  5.60386  (MIN: 5.51)
  b:  5.60673  (MIN: 5.51)
  d:  5.60781  (MIN: 5.52)
  cc: 5.61189  (MIN: 5.52)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.31165, N = 12
  cc: 2.56838  (MIN: 2.39)
  b:  3.76311  (MIN: 2.45)
  a:  4.36135  (MIN: 2.42)
  d:  4.77297  (MIN: 2.45)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.00179, N = 3
  d:  2.35522  (MIN: 2.29)
  cc: 2.35925  (MIN: 2.28)
  a:  2.36413  (MIN: 2.28)
  b:  2.55866  (MIN: 2.29)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.00451, N = 3
  b:  5.19629  (MIN: 5.13)
  a:  5.20379  (MIN: 5.11)
  cc: 5.22347  (MIN: 5.13)
  d:  5.27147  (MIN: 5.14)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.000955, N = 3
  cc: 0.420617  (MIN: 0.4)
  d:  0.420822  (MIN: 0.4)
  b:  0.420827  (MIN: 0.4)
  a:  0.421798  (MIN: 0.4)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.000795, N = 3
  cc: 0.584807  (MIN: 0.56)
  b:  0.586989  (MIN: 0.57)
  a:  0.587832  (MIN: 0.57)
  d:  0.588675  (MIN: 0.57)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 2.11, N = 3
  b:  1129.74  (MIN: 1125.41)
  d:  1134.55  (MIN: 1129.16)
  a:  1135.50  (MIN: 1125.5)
  cc: 1135.82  (MIN: 1130.28)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 1.72, N = 3
  a:  581.00  (MIN: 572.88)
  d:  581.67  (MIN: 576.15)
  cc: 582.25  (MIN: 576.12)
  b:  583.07  (MIN: 576.88)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 1.06, N = 3
  cc: 1133.54  (MIN: 1127.45)
  b:  1133.90  (MIN: 1128.93)
  d:  1134.22  (MIN: 1128.64)
  a:  1136.91  (MIN: 1128.7)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.00009, N = 3
  d:  1.69653  (MIN: 1.65)
  b:  1.69655  (MIN: 1.65)
  a:  1.69724  (MIN: 1.65)
  cc: 1.69867  (MIN: 1.65)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.02067, N = 3
  a:  3.34345  (MIN: 3.18)
  d:  3.36723  (MIN: 3.26)
  cc: 3.36730  (MIN: 3.26)
  b:  3.37554  (MIN: 3.27)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.00885, N = 3
  d:  1.39831  (MIN: 1.36)
  cc: 1.40127  (MIN: 1.36)
  b:  1.40158  (MIN: 1.36)
  a:  1.41319  (MIN: 1.35)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 1.94, N = 3
  cc: 579.43  (MIN: 573.74)
  a:  580.58  (MIN: 571.58)
  b:  582.96  (MIN: 577.42)
  d:  583.75  (MIN: 577.54)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.001916, N = 3
  a:  0.436887  (MIN: 0.42)
  cc: 0.438763  (MIN: 0.42)
  d:  0.439201  (MIN: 0.42)
  b:  0.439701  (MIN: 0.42)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.68, N = 3
  cc: 1132.81  (MIN: 1127.61)
  d:  1134.70  (MIN: 1129.33)
  b:  1135.16  (MIN: 1129.54)
  a:  1135.35  (MIN: 1128.23)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 1.08, N = 3
  cc: 569.85  (MIN: 566.12)
  b:  576.96  (MIN: 571.43)
  a:  582.38  (MIN: 574.3)
  d:  583.77  (MIN: 578.1)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.000417, N = 3
  d:  0.129658  (MIN: 0.12)
  a:  0.129773  (MIN: 0.12)
  cc: 0.129929  (MIN: 0.12)
  b:  0.130403  (MIN: 0.12)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better - oneDNN 3.0 - SE +/- 0.001018, N = 3
  a:  0.209725  (MIN: 0.2)
  d:  0.210342  (MIN: 0.2)
  cc: 0.211933  (MIN: 0.2)
  b:  0.212361  (MIN: 0.2)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl


Phoronix Test Suite v10.8.5