3900XT oneDNN 2.0

AMD Ryzen 9 3900XT 12-Core testing with a MSI MEG X570 GODLIKE (MS-7C34) v1.0 (1.B3 BIOS) and AMD Radeon RX 56/64 8GB on Ubuntu 20.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2012114-PTS-3900XTON61.

3900XT oneDNN 2.0ProcessorMotherboardChipsetMemoryDiskGraphicsAudioMonitorNetworkOSKernelDesktopDisplay ServerDisplay DriverOpenGLVulkanCompilerFile-SystemScreen Resolution123AMD Ryzen 9 3900XT 12-Core @ 3.80GHz (12 Cores / 24 Threads)MSI MEG X570 GODLIKE (MS-7C34) v1.0 (1.B3 BIOS)AMD Starship/Matisse16GB500GB Seagate FireCuda 520 SSD ZP500GM30002AMD Radeon RX 56/64 8GB (1630/945MHz)AMD Vega 10 HDMI AudioASUS MG28URealtek Device 2600 + Realtek Device 3000 + Intel Wi-Fi 6 AX200Ubuntu 20.105.8.0-31-generic (x86_64)GNOME Shell 3.38.1X Server 1.20.9amdgpu 19.1.04.6 Mesa 20.2.1 (LLVM 11.0.0)1.2.131GCC 10.2.0ext43840x2160OpenBenchmarking.orgCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x8701021Security Details- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

3900XT oneDNN 2.0onednn: IP Shapes 1D - f32 - CPUonednn: IP Shapes 3D - f32 - CPUonednn: IP Shapes 1D - u8s8f32 - CPUonednn: IP Shapes 3D - u8s8f32 - CPUonednn: Convolution Batch Shapes Auto - f32 - CPUonednn: Deconvolution Batch shapes_1d - f32 - CPUonednn: Deconvolution Batch shapes_3d - f32 - CPUonednn: Convolution Batch Shapes Auto - u8s8f32 - CPUonednn: Deconvolution Batch shapes_1d - u8s8f32 - CPUonednn: Deconvolution Batch shapes_3d - u8s8f32 - CPUonednn: Recurrent Neural Network Training - f32 - CPUonednn: Recurrent Neural Network Inference - f32 - CPUonednn: Recurrent Neural Network Training - u8s8f32 - CPUonednn: Recurrent Neural Network Inference - u8s8f32 - CPUonednn: Matrix Multiply Batch Shapes Transformer - f32 - CPUonednn: Recurrent Neural Network Training - bf16bf16bf16 - CPUonednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPUonednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU1234.9175610.92021.979430.91753322.70793.604125.3812925.45724.363893.549124173.772525.884159.192581.130.9555694234.302546.382.260334.9320910.48412.011420.91307722.77063.807445.2959825.14494.337633.584464164.132488.284256.152502.210.9344564205.892552.932.193075.0800510.323571.997870.90422722.84053.690575.4279025.26084.402623.639214198.122498.374099.212542.010.9245574135.542564.792.24168OpenBenchmarking.org

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU1231.1432.2863.4294.5725.715SE +/- 0.05292, N = 3SE +/- 0.05679, N = 3SE +/- 0.09243, N = 154.917564.932095.08005MIN: 4.54MIN: 4.52MIN: 4.521. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU1233691215SE +/- 0.12, N = 3SE +/- 0.12, N = 3SE +/- 0.08, N = 1410.9210.4810.32MIN: 10.59MIN: 9.99MIN: 9.631. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU1230.45260.90521.35781.81042.263SE +/- 0.02675, N = 15SE +/- 0.02633, N = 15SE +/- 0.02721, N = 151.979432.011421.99787MIN: 1.84MIN: 1.83MIN: 1.841. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU1230.20640.41280.61920.82561.032SE +/- 0.015200, N = 3SE +/- 0.011052, N = 3SE +/- 0.013741, N = 120.9175330.9130770.904227MIN: 0.81MIN: 0.81MIN: 0.751. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU123510152025SE +/- 0.22, N = 3SE +/- 0.25, N = 3SE +/- 0.30, N = 422.7122.7722.84MIN: 21MIN: 21.58MIN: 21.641. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU1230.85671.71342.57013.42684.2835SE +/- 0.00833, N = 3SE +/- 0.03432, N = 3SE +/- 0.03303, N = 113.604123.807443.69057MIN: 3.45MIN: 3.47MIN: 3.471. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU1231.22132.44263.66394.88526.1065SE +/- 0.03493, N = 3SE +/- 0.01863, N = 3SE +/- 0.08314, N = 35.381295.295985.42790MIN: 5.14MIN: 5.16MIN: 5.171. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU123612182430SE +/- 0.43, N = 3SE +/- 0.26, N = 3SE +/- 0.33, N = 325.4625.1425.26MIN: 23.47MIN: 23.6MIN: 23.331. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU1230.99061.98122.97183.96244.953SE +/- 0.06495, N = 3SE +/- 0.04011, N = 3SE +/- 0.07560, N = 34.363894.337634.40262MIN: 4.07MIN: 4.08MIN: 4.081. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU1230.81881.63762.45643.27524.094SE +/- 0.02596, N = 3SE +/- 0.04801, N = 3SE +/- 0.07019, N = 123.549123.584463.63921MIN: 3.37MIN: 3.38MIN: 3.371. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU1239001800270036004500SE +/- 42.56, N = 3SE +/- 26.99, N = 3SE +/- 55.67, N = 34173.774164.134198.12MIN: 3999.11MIN: 3992.56MIN: 3922.681. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU1235001000150020002500SE +/- 28.90, N = 6SE +/- 18.11, N = 3SE +/- 21.94, N = 32525.882488.282498.37MIN: 2401.16MIN: 2399.27MIN: 2341.21. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU1239001800270036004500SE +/- 15.37, N = 3SE +/- 36.36, N = 3SE +/- 32.65, N = 34159.194256.154099.21MIN: 4016.61MIN: 4032.62MIN: 3912.691. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU1236001200180024003000SE +/- 31.75, N = 5SE +/- 8.38, N = 3SE +/- 27.05, N = 32581.132502.212542.01MIN: 2405.38MIN: 2403.23MIN: 2400.051. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU1230.2150.430.6450.861.075SE +/- 0.015503, N = 15SE +/- 0.009665, N = 15SE +/- 0.008671, N = 100.9555690.9344560.924557MIN: 0.85MIN: 0.85MIN: 0.861. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU1239001800270036004500SE +/- 24.48, N = 3SE +/- 24.69, N = 3SE +/- 30.67, N = 34234.304205.894135.54MIN: 4027.21MIN: 4019.84MIN: 3917.591. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU1236001200180024003000SE +/- 20.57, N = 15SE +/- 31.82, N = 4SE +/- 25.22, N = 32546.382552.932564.79MIN: 2395.99MIN: 2395.41MIN: 2392.971. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU1230.50861.01721.52582.03442.543SE +/- 0.02004, N = 15SE +/- 0.00579, N = 3SE +/- 0.02373, N = 82.260332.193072.24168MIN: 2.1MIN: 2.12MIN: 2.11. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread


Phoronix Test Suite v10.8.4