5900HX oneDNN 2.6

AMD Ryzen 9 5900HX testing with an ASUS G513QY v1.0 (G513QY.311 BIOS) and ASUS AMD Cezanne 512MB on Ubuntu 21.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2203307-PTS-5900HXON51&grw&sor.

5900HX oneDNN 2.6 - System Configuration (identical for runs A, B, and C)

  Processor:          AMD Ryzen 9 5900HX @ 3.30GHz (8 Cores / 16 Threads)
  Motherboard:        ASUS G513QY v1.0 (G513QY.311 BIOS)
  Chipset:            AMD Renoir/Cezanne
  Memory:             16GB
  Disk:               512GB SAMSUNG MZVLQ512HBLU-00B00
  Graphics:           ASUS AMD Cezanne 512MB (2500/1000MHz)
  Audio:              AMD Navi 21 HDMI Audio
  Monitor:            LQ156M1JW25
  Network:            Realtek RTL8111/8168/8411 + MEDIATEK Device 7961
  OS:                 Ubuntu 21.10
  Kernel:             5.17.0-051700-generic (x86_64)
  Desktop:            GNOME Shell 40.5
  Display Server:     X Server + Wayland
  OpenGL:             4.6 Mesa 22.0.0-devel (git-9cb9101 2022-01-08 impish-oibaf-ppa) (LLVM 13.0.0 DRM 3.44)
  Vulkan:             1.2.199
  Compiler:           GCC 11.2.0
  File-System:        ext4
  Screen Resolution:  1920x1080

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - Platform Profile: balanced - CPU Microcode: 0xa50000c - ACPI Profile: balanced

Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected

5900HX oneDNN 2.6 - Result Summary (onednn; all results in ms, fewer is better)

Harness - Data Type - Engine                                            A          B          C
IP Shapes 1D - f32 - CPU                                          3.77392    3.83164    3.79418
IP Shapes 3D - f32 - CPU                                          15.6560    15.1114    14.3342
IP Shapes 1D - u8s8f32 - CPU                                      1.29851    1.29782    1.30146
IP Shapes 3D - u8s8f32 - CPU                                      3.55750    3.54426    3.38402
Convolution Batch Shapes Auto - f32 - CPU                         31.3307    31.2879    31.2254
Deconvolution Batch shapes_1d - f32 - CPU                         8.43701    8.56397    8.64910
Deconvolution Batch shapes_3d - f32 - CPU                         6.52829    6.41569    6.41738
Convolution Batch Shapes Auto - u8s8f32 - CPU                     32.9312    32.7219    32.5036
Deconvolution Batch shapes_1d - u8s8f32 - CPU                     1.88996    1.88868    1.88893
Deconvolution Batch shapes_3d - u8s8f32 - CPU                     2.83644    2.87484    2.90026
Recurrent Neural Network Training - f32 - CPU                     3953.61    3928.54    3939.75
Recurrent Neural Network Inference - f32 - CPU                    2841.99    2802.08    2823.98
Recurrent Neural Network Training - u8s8f32 - CPU                 3973.67    3941.94    3941.25
Recurrent Neural Network Inference - u8s8f32 - CPU                2815.85    2819.76    2801.28
Matrix Multiply Batch Shapes Transformer - f32 - CPU              6.32732    6.30537    6.26398
Recurrent Neural Network Training - bf16bf16bf16 - CPU            3940.55    3926.07    3935.10
Recurrent Neural Network Inference - bf16bf16bf16 - CPU           2800.24    2816.45    2801.18
Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU          2.42556    2.37036    2.36683
Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU           -          -          -
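Because runs A, B, and C use an identical hardware and software configuration, the differences between them largely reflect run-to-run variance rather than any configuration change. The short Python sketch below shows one way such a spread could be computed from the summary table above; the numeric values are copied from this result file, while the dictionary layout and variable names are only illustrative and are not part of the Phoronix Test Suite or oneDNN.

# Run-to-run spread across the three identical configurations (A, B, C).
# Each tuple holds the per-harness results in ms (fewer is better), copied
# from the summary table above for a few representative harnesses.
results = {
    "IP Shapes 1D - f32":                                  (3.77392, 3.83164, 3.79418),
    "IP Shapes 3D - f32":                                  (15.6560, 15.1114, 14.3342),
    "Deconvolution Batch shapes_1d - f32":                 (8.43701, 8.56397, 8.64910),
    "Recurrent Neural Network Training - f32":             (3953.61, 3928.54, 3939.75),
    "Recurrent Neural Network Inference - f32":            (2841.99, 2802.08, 2823.98),
    "Matrix Multiply Batch Shapes Transformer - u8s8f32":  (2.42556, 2.37036, 2.36683),
}

for name, runs in results.items():
    best, worst = min(runs), max(runs)
    # Relative spread between the fastest and slowest of the three runs.
    spread_pct = (worst - best) / best * 100.0
    print(f"{name:55s} best={best:10.5f} ms  spread={spread_pct:5.2f}%")

With the values above, the spread works out to a few percent at most for the majority of harnesses, with IP Shapes 3D - f32 showing the largest gap (roughly 9% between the fastest and slowest run).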

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  A: 3.77392  (SE +/- 0.02746, N = 3)  MIN: 3.51
  C: 3.79418  (SE +/- 0.01719, N = 3)  MIN: 3.53
  B: 3.83164  (SE +/- 0.00208, N = 3)  MIN: 3.58
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  C: 14.33  (SE +/- 0.07, N = 3)  MIN: 13.92
  B: 15.11  (SE +/- 0.03, N = 3)  MIN: 14.83
  A: 15.66  (SE +/- 0.02, N = 3)  MIN: 15.44
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  B: 1.29782  (SE +/- 0.01165, N = 3)  MIN: 1.25
  A: 1.29851  (SE +/- 0.01208, N = 3)  MIN: 1.25
  C: 1.30146  (SE +/- 0.00984, N = 3)  MIN: 1.25
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  C: 3.38402  (SE +/- 0.00505, N = 3)  MIN: 3.31
  B: 3.54426  (SE +/- 0.00555, N = 3)  MIN: 3.47
  A: 3.55750  (SE +/- 0.01706, N = 3)  MIN: 3.45
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  C: 31.23  (SE +/- 0.01, N = 3)  MIN: 30.75
  B: 31.29  (SE +/- 0.01, N = 3)  MIN: 30.93
  A: 31.33  (SE +/- 0.04, N = 3)  MIN: 30.93
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  A: 8.43701  (SE +/- 0.10958, N = 3)   MIN: 5.52
  B: 8.56397  (SE +/- 0.08416, N = 15)  MIN: 5.54
  C: 8.64910  (SE +/- 0.17417, N = 12)  MIN: 5.51
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  B: 6.41569  (SE +/- 0.00452, N = 3)  MIN: 6.25
  C: 6.41738  (SE +/- 0.03351, N = 3)  MIN: 6.25
  A: 6.52829  (SE +/- 0.06187, N = 6)  MIN: 6.23
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  C: 32.50  (SE +/- 0.07, N = 3)  MIN: 31.51
  B: 32.72  (SE +/- 0.10, N = 3)  MIN: 32.02
  A: 32.93  (SE +/- 0.07, N = 3)  MIN: 32.45
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  B: 1.88868  (SE +/- 0.00594, N = 3)  MIN: 1.8
  C: 1.88893  (SE +/- 0.00266, N = 3)  MIN: 1.79
  A: 1.88996  (SE +/- 0.00698, N = 3)  MIN: 1.8
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  A: 2.83644  (SE +/- 0.02338, N = 3)   MIN: 2.61
  B: 2.87484  (SE +/- 0.03458, N = 15)  MIN: 2.56
  C: 2.90026  (SE +/- 0.05737, N = 15)  MIN: 2.56
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  B: 3928.54  (SE +/- 3.70, N = 3)  MIN: 3905.11
  C: 3939.75  (SE +/- 3.09, N = 3)  MIN: 3911.82
  A: 3953.61  (SE +/- 4.49, N = 3)  MIN: 3920.42
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  B: 2802.08  (SE +/- 8.65, N = 3)    MIN: 2773.01
  C: 2823.98  (SE +/- 24.07, N = 8)   MIN: 2764.44
  A: 2841.99  (SE +/- 29.06, N = 15)  MIN: 2763.48
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  C: 3941.25  (SE +/- 5.18, N = 3)   MIN: 3909.14
  B: 3941.94  (SE +/- 1.63, N = 3)   MIN: 3916.45
  A: 3973.67  (SE +/- 32.58, N = 3)  MIN: 3914.02
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  C: 2801.28  (SE +/- 5.92, N = 3)  MIN: 2777.49
  A: 2815.85  (SE +/- 8.10, N = 3)  MIN: 2783.37
  B: 2819.76  (SE +/- 6.18, N = 3)  MIN: 2796.92
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  C: 6.26398  (SE +/- 0.00464, N = 3)  MIN: 6.09
  B: 6.30537  (SE +/- 0.00210, N = 3)  MIN: 6.13
  A: 6.32732  (SE +/- 0.00610, N = 3)  MIN: 6.14
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  B: 3926.07  (SE +/- 5.66, N = 3)  MIN: 3889.83
  C: 3935.10  (SE +/- 1.40, N = 3)  MIN: 3909.64
  A: 3940.55  (SE +/- 5.81, N = 3)  MIN: 3903.19
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  A: 2800.24  (SE +/- 7.66, N = 3)  MIN: 2777.51
  C: 2801.18  (SE +/- 0.88, N = 3)  MIN: 2785
  B: 2816.45  (SE +/- 7.15, N = 3)  MIN: 2798.14
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.6 - ms, Fewer Is Better (OpenBenchmarking.org)
  C: 2.36683  (SE +/- 0.00822, N = 3)   MIN: 2.26
  B: 2.37036  (SE +/- 0.00327, N = 3)   MIN: 2.28
  A: 2.42556  (SE +/- 0.04682, N = 14)  MIN: 2.26
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread


Phoronix Test Suite v10.8.4