tr onednn 3.1 AMD Ryzen Threadripper 3960X 24-Core testing with a MSI Creator TRX40 (MS-7C59) v1.0 (1.12N1 BIOS) and Gigabyte AMD Radeon RX 5500/5500M / Pro 5500M on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2304018-NE-TRONEDNN379&grr&sro .
tr onednn 3.1 Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server Vulkan Compiler File-System Screen Resolution 1 2 3 AMD Ryzen Threadripper 3960X 24-Core @ 3.80GHz (24 Cores / 48 Threads) MSI Creator TRX40 (MS-7C59) v1.0 (1.12N1 BIOS) AMD Starship/Matisse 32GB 1000GB Sabrent Rocket 4.0 1TB Gigabyte AMD Radeon RX 5500/5500M / Pro 5500M (1900/875MHz) AMD Navi 10 HDMI Audio VA2431 Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Wi-Fi 6 AX200 Ubuntu 22.04 5.19.0-051900rc7-generic (x86_64) GNOME Shell 42.2 X Server 1.3.204 GCC 11.2.0 ext4 1920x1080 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x8301025 Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
tr onednn 3.1 onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: IP Shapes 1D - bf16bf16bf16 - CPU 1 2 3 2413.52 2420.28 2424.99 1101.92 1072.81 1086.06 6.04628 1.64442 1.48086 1.27364 5.86072 0.620631 9.17873 8.88991 2.0504 2.56022 2444.83 2437.88 2443.4 1092.44 1087.3 1096.19 5.94507 1.65789 1.52009 1.28856 6.02685 0.655153 9.3551 9.13355 2.03651 2.56663 2418.79 2452.18 2433.47 1087.8 1092.87 1095.7 6.06764 1.6362 1.4232 1.23854 6.01761 0.649985 9.31373 9.0513 2.05621 2.57774 OpenBenchmarking.org
oneDNN Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU 1 2 3 500 1000 1500 2000 2500 2413.52 2444.83 2418.79 MIN: 2406.94 MIN: 2438.42 MIN: 2411.51 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU 1 2 3 500 1000 1500 2000 2500 2420.28 2437.88 2452.18 MIN: 2414.27 MIN: 2430.87 MIN: 2446.15 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU 1 2 3 500 1000 1500 2000 2500 2424.99 2443.40 2433.47 MIN: 2416.37 MIN: 2437.92 MIN: 2427.52 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU 1 2 3 200 400 600 800 1000 1101.92 1092.44 1087.80 MIN: 1095.18 MIN: 1088.86 MIN: 1082.36 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU 1 2 3 200 400 600 800 1000 1072.81 1087.30 1092.87 MIN: 1067.5 MIN: 1083 MIN: 1087.75 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU 1 2 3 200 400 600 800 1000 1086.06 1096.19 1095.70 MIN: 1080.98 MIN: 1090.34 MIN: 1091.05 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU 1 2 3 2 4 6 8 10 6.04628 5.94507 6.06764 MIN: 5.41 MIN: 3.64 MIN: 3.83 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU 1 2 3 0.373 0.746 1.119 1.492 1.865 1.64442 1.65789 1.63620 MIN: 1.58 MIN: 1.58 MIN: 1.57 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU 1 2 3 0.342 0.684 1.026 1.368 1.71 1.48086 1.52009 1.42320 MIN: 1.36 MIN: 1.39 MIN: 1.32 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU 1 2 3 0.2899 0.5798 0.8697 1.1596 1.4495 1.27364 1.28856 1.23854 MIN: 1.18 MIN: 1.19 MIN: 1.17 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU 1 2 3 2 4 6 8 10 5.86072 6.02685 6.01761 MIN: 5.81 MIN: 5.97 MIN: 5.96 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU 1 2 3 0.1474 0.2948 0.4422 0.5896 0.737 0.620631 0.655153 0.649985 MIN: 0.57 MIN: 0.6 MIN: 0.59 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU 1 2 3 3 6 9 12 15 9.17873 9.35510 9.31373 MIN: 9.1 MIN: 9.26 MIN: 9.24 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU 1 2 3 3 6 9 12 15 8.88991 9.13355 9.05130 MIN: 8.76 MIN: 9 MIN: 8.94 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU 1 2 3 0.4626 0.9252 1.3878 1.8504 2.313 2.05040 2.03651 2.05621 MIN: 1.98 MIN: 1.96 MIN: 1.96 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU 1 2 3 0.58 1.16 1.74 2.32 2.9 2.56022 2.56663 2.57774 MIN: 2.5 MIN: 2.51 MIN: 2.51 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Phoronix Test Suite v10.8.5