5900HX oneDNN 2.6

AMD Ryzen 9 5900HX testing with an ASUS G513QY v1.0 (G513QY.311 BIOS) and ASUS AMD Cezanne 512MB on Ubuntu 21.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2203307-PTS-5900HXON51.
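For anyone wanting to rerun this comparison locally, the Phoronix Test Suite can install and execute the oneDNN test profile directly: assuming a current PTS install with the pts/onednn profile available, running "phoronix-test-suite benchmark onednn" should execute the same tests, and "phoronix-test-suite benchmark 2203307-PTS-5900HXON51" should fetch this result file and run side-by-side against it. Treat these invocations as a rough guide rather than a verified recipe for this exact result.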

5900HX oneDNN 2.6 - System Configuration (identical for results A, B, and C):

Processor: AMD Ryzen 9 5900HX @ 3.30GHz (8 Cores / 16 Threads)
Motherboard: ASUS G513QY v1.0 (G513QY.311 BIOS)
Chipset: AMD Renoir/Cezanne
Memory: 16GB
Disk: 512GB SAMSUNG MZVLQ512HBLU-00B00
Graphics: ASUS AMD Cezanne 512MB (2500/1000MHz)
Audio: AMD Navi 21 HDMI Audio
Monitor: LQ156M1JW25
Network: Realtek RTL8111/8168/8411 + MEDIATEK Device 7961
OS: Ubuntu 21.10
Kernel: 5.17.0-051700-generic (x86_64)
Desktop: GNOME Shell 40.5
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 22.0.0-devel (git-9cb9101 2022-01-08 impish-oibaf-ppa) (LLVM 13.0.0 DRM 3.44)
Vulkan: 1.2.199
Compiler: GCC 11.2.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - Platform Profile: balanced - CPU Microcode: 0xa50000c - ACPI Profile: balanced

Security Details:
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected

5900HX oneDNN 2.6 - Result Overview (all results in ms; fewer is better):

onednn: IP Shapes 1D - f32 - CPU: A = 3.77392, B = 3.83164, C = 3.79418
onednn: IP Shapes 3D - f32 - CPU: A = 15.6560, B = 15.1114, C = 14.3342
onednn: IP Shapes 1D - u8s8f32 - CPU: A = 1.29851, B = 1.29782, C = 1.30146
onednn: IP Shapes 3D - u8s8f32 - CPU: A = 3.55750, B = 3.54426, C = 3.38402
onednn: Convolution Batch Shapes Auto - f32 - CPU: A = 31.3307, B = 31.2879, C = 31.2254
onednn: Deconvolution Batch shapes_1d - f32 - CPU: A = 8.43701, B = 8.56397, C = 8.64910
onednn: Deconvolution Batch shapes_3d - f32 - CPU: A = 6.52829, B = 6.41569, C = 6.41738
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU: A = 32.9312, B = 32.7219, C = 32.5036
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU: A = 1.88996, B = 1.88868, C = 1.88893
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU: A = 2.83644, B = 2.87484, C = 2.90026
onednn: Recurrent Neural Network Training - f32 - CPU: A = 3953.61, B = 3928.54, C = 3939.75
onednn: Recurrent Neural Network Inference - f32 - CPU: A = 2841.99, B = 2802.08, C = 2823.98
onednn: Recurrent Neural Network Training - u8s8f32 - CPU: A = 3973.67, B = 3941.94, C = 3941.25
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU: A = 2815.85, B = 2819.76, C = 2801.28
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU: A = 6.32732, B = 6.30537, C = 6.26398
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU: A = 3940.55, B = 3926.07, C = 3935.10
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU: A = 2800.24, B = 2816.45, C = 2801.18
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU: A = 2.42556, B = 2.37036, C = 2.36683
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU: (no result recorded)

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
A: 3.77392 (SE +/- 0.02746, N = 3, MIN: 3.51)
B: 3.83164 (SE +/- 0.00208, N = 3, MIN: 3.58)
C: 3.79418 (SE +/- 0.01719, N = 3, MIN: 3.53)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
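The IP Shapes harnesses exercise oneDNN's inner-product (fully connected) primitive. As a rough illustration of what one f32 CPU measurement involves, the following minimal sketch uses the oneDNN 2.x C++ API to build and time a single inner-product primitive; the tensor sizes, iteration count, and timing loop are illustrative assumptions and do not reproduce the actual benchdnn problem set or methodology behind this test profile.

#include <algorithm>
#include <chrono>
#include <iostream>
#include <unordered_map>
#include <vector>
#include "dnnl.hpp"

using namespace dnnl;

int main() {
    engine eng(engine::kind::cpu, 0);   // CPU engine, matching the "Engine: CPU" runs above
    stream strm(eng);

    // Hypothetical problem size; the real "IP Shapes 1D" inputs ship with oneDNN's benchdnn.
    const memory::dim N = 128, IC = 1024, OC = 1024;
    memory::desc src_md({N, IC}, memory::data_type::f32, memory::format_tag::nc);
    memory::desc wei_md({OC, IC}, memory::data_type::f32, memory::format_tag::oi);
    memory::desc dst_md({N, OC}, memory::data_type::f32, memory::format_tag::nc);

    // oneDNN 2.x flow: operation descriptor -> primitive descriptor -> primitive
    inner_product_forward::desc ip_d(prop_kind::forward_inference, src_md, wei_md, dst_md);
    inner_product_forward::primitive_desc ip_pd(ip_d, eng);
    inner_product_forward ip(ip_pd);

    memory src_mem(src_md, eng), wei_mem(wei_md, eng), dst_mem(dst_md, eng);
    std::vector<float> src(N * IC, 1.0f), wei(OC * IC, 0.5f);
    std::copy(src.begin(), src.end(), static_cast<float *>(src_mem.get_data_handle()));
    std::copy(wei.begin(), wei.end(), static_cast<float *>(wei_mem.get_data_handle()));

    std::unordered_map<int, memory> args = {{DNNL_ARG_SRC, src_mem},
                                            {DNNL_ARG_WEIGHTS, wei_mem},
                                            {DNNL_ARG_DST, dst_mem}};

    ip.execute(strm, args);   // warm-up run
    strm.wait();

    const int iters = 100;
    auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i) ip.execute(strm, args);
    strm.wait();
    auto t1 = std::chrono::steady_clock::now();

    std::cout << "inner_product f32: "
              << std::chrono::duration<double, std::milli>(t1 - t0).count() / iters
              << " ms/iter\n";
    return 0;
}

Built against a oneDNN 2.6 install, something like "g++ -O3 -std=c++11 ip_time.cpp -ldnnl" should suffice; the -march=native and -fopenmp flags shown in the footnotes above are what the Phoronix-built library itself uses.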

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
A: 15.66 (SE +/- 0.02, N = 3, MIN: 15.44)
B: 15.11 (SE +/- 0.03, N = 3, MIN: 14.83)
C: 14.33 (SE +/- 0.07, N = 3, MIN: 13.92)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
A: 1.29851 (SE +/- 0.01208, N = 3, MIN: 1.25)
B: 1.29782 (SE +/- 0.01165, N = 3, MIN: 1.25)
C: 1.30146 (SE +/- 0.00984, N = 3, MIN: 1.25)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
A: 3.55750 (SE +/- 0.01706, N = 3, MIN: 3.45)
B: 3.54426 (SE +/- 0.00555, N = 3, MIN: 3.47)
C: 3.38402 (SE +/- 0.00505, N = 3, MIN: 3.31)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
A: 31.33 (SE +/- 0.04, N = 3, MIN: 30.93)
B: 31.29 (SE +/- 0.01, N = 3, MIN: 30.93)
C: 31.23 (SE +/- 0.01, N = 3, MIN: 30.75)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
A: 8.43701 (SE +/- 0.10958, N = 3, MIN: 5.52)
B: 8.56397 (SE +/- 0.08416, N = 15, MIN: 5.54)
C: 8.64910 (SE +/- 0.17417, N = 12, MIN: 5.51)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
A: 6.52829 (SE +/- 0.06187, N = 6, MIN: 6.23)
B: 6.41569 (SE +/- 0.00452, N = 3, MIN: 6.25)
C: 6.41738 (SE +/- 0.03351, N = 3, MIN: 6.25)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
A: 32.93 (SE +/- 0.07, N = 3, MIN: 32.45)
B: 32.72 (SE +/- 0.10, N = 3, MIN: 32.02)
C: 32.50 (SE +/- 0.07, N = 3, MIN: 31.51)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
A: 1.88996 (SE +/- 0.00698, N = 3, MIN: 1.8)
B: 1.88868 (SE +/- 0.00594, N = 3, MIN: 1.8)
C: 1.88893 (SE +/- 0.00266, N = 3, MIN: 1.79)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
A: 2.83644 (SE +/- 0.02338, N = 3, MIN: 2.61)
B: 2.87484 (SE +/- 0.03458, N = 15, MIN: 2.56)
C: 2.90026 (SE +/- 0.05737, N = 15, MIN: 2.56)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
A: 3953.61 (SE +/- 4.49, N = 3, MIN: 3920.42)
B: 3928.54 (SE +/- 3.70, N = 3, MIN: 3905.11)
C: 3939.75 (SE +/- 3.09, N = 3, MIN: 3911.82)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
A: 2841.99 (SE +/- 29.06, N = 15, MIN: 2763.48)
B: 2802.08 (SE +/- 8.65, N = 3, MIN: 2773.01)
C: 2823.98 (SE +/- 24.07, N = 8, MIN: 2764.44)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
A: 3973.67 (SE +/- 32.58, N = 3, MIN: 3914.02)
B: 3941.94 (SE +/- 1.63, N = 3, MIN: 3916.45)
C: 3941.25 (SE +/- 5.18, N = 3, MIN: 3909.14)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
A: 2815.85 (SE +/- 8.10, N = 3, MIN: 2783.37)
B: 2819.76 (SE +/- 6.18, N = 3, MIN: 2796.92)
C: 2801.28 (SE +/- 5.92, N = 3, MIN: 2777.49)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
A: 6.32732 (SE +/- 0.00610, N = 3, MIN: 6.14)
B: 6.30537 (SE +/- 0.00210, N = 3, MIN: 6.13)
C: 6.26398 (SE +/- 0.00464, N = 3, MIN: 6.09)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better
A: 3940.55 (SE +/- 5.81, N = 3, MIN: 3903.19)
B: 3926.07 (SE +/- 5.66, N = 3, MIN: 3889.83)
C: 3935.10 (SE +/- 1.40, N = 3, MIN: 3909.64)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better
A: 2800.24 (SE +/- 7.66, N = 3, MIN: 2777.51)
B: 2816.45 (SE +/- 7.15, N = 3, MIN: 2798.14)
C: 2801.18 (SE +/- 0.88, N = 3, MIN: 2785)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
A: 2.42556 (SE +/- 0.04682, N = 14, MIN: 2.26)
B: 2.37036 (SE +/- 0.00327, N = 3, MIN: 2.28)
C: 2.36683 (SE +/- 0.00822, N = 3, MIN: 2.26)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
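For the u8s8f32 results in this report, the name follows oneDNN's benchdnn configuration convention: unsigned 8-bit sources, signed 8-bit weights, and f32 destinations. As a minimal sketch of setting up such a quantized matrix multiply with the oneDNN 2.x C++ API, the following may help; the shapes are assumptions for illustration only and do not correspond to the Transformer shapes benchdnn actually runs here.

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <vector>
#include "dnnl.hpp"

using namespace dnnl;

int main() {
    engine eng(engine::kind::cpu, 0);
    stream strm(eng);

    // Illustrative GEMM-like shape; the benchmark's real Transformer shapes live in benchdnn's inputs.
    const memory::dim M = 64, K = 512, N = 512;
    memory::desc src_md({M, K}, memory::data_type::u8, memory::format_tag::ab);
    memory::desc wei_md({K, N}, memory::data_type::s8, memory::format_tag::ab);
    memory::desc dst_md({M, N}, memory::data_type::f32, memory::format_tag::ab);

    // oneDNN 2.x flow: matmul descriptor -> primitive descriptor -> primitive
    matmul::desc mm_d(src_md, wei_md, dst_md);
    matmul::primitive_desc mm_pd(mm_d, eng);
    matmul mm(mm_pd);

    memory src_mem(src_md, eng), wei_mem(wei_md, eng), dst_mem(dst_md, eng);
    std::vector<uint8_t> src(M * K, 2);
    std::vector<int8_t> wei(K * N, 1);
    std::copy(src.begin(), src.end(), static_cast<uint8_t *>(src_mem.get_data_handle()));
    std::copy(wei.begin(), wei.end(), static_cast<int8_t *>(wei_mem.get_data_handle()));

    mm.execute(strm, {{DNNL_ARG_SRC, src_mem},
                      {DNNL_ARG_WEIGHTS, wei_mem},
                      {DNNL_ARG_DST, dst_mem}});
    strm.wait();

    // Sanity check: every output element should equal 2 * 1 summed over K, i.e. 1024.
    std::cout << static_cast<float *>(dst_mem.get_data_handle())[0] << "\n";
    return 0;
}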


Phoronix Test Suite v10.8.4