onednn 3.6 tr

AMD Ryzen Threadripper 3990X 64-Core testing with a Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS) and AMD Radeon RX 5700 8GB on Pop 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2410154-PTS-ONEDNN3605&sor&grw .
onednn 3.6 tr: system details (runs a, b, c)

Processor: AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads)
Motherboard: Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS)
Chipset: AMD Starship/Matisse
Memory: 4 x 32GB DDR4-3000MT/s CMK64GX4M2D3000C16
Disk: Samsung SSD 970 EVO Plus 500GB
Graphics: AMD Radeon RX 5700 8GB
Audio: AMD Navi 10 HDMI Audio
Monitor: DELL P2415Q
Network: Intel I211 + Intel Wi-Fi 6 AX200
OS: Pop 22.04
Kernel: 6.8.0-76060800daily20240311-generic (x86_64)
Desktop: GNOME Shell 42.5
Display Server: X Server 1.21.1.4
OpenGL: 4.6 Mesa 24.0.3-1pop1~1711635559~22.04~7a9f319 (LLVM 15.0.7 DRM 3.57)
Vulkan: 1.3.274
Compiler: GCC 11.4.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details - Transparent Huge Pages: madvise

Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x830107a

Security Details
  gather_data_sampling: Not affected
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  mmio_stale_data: Not affected
  retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection
  spec_rstack_overflow: Mitigation of Safe RET
  spec_store_bypass: Mitigation of SSB disabled via prctl
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
  srbds: Not affected
  tsx_async_abort: Not affected
onednn 3.6 tr: result summary

Test                                                a          b          c
onednn: IP Shapes 1D - CPU                          1.26628    1.36448    1.30583
onednn: IP Shapes 3D - CPU                          7.74579    15.1285    14.9667
onednn: Convolution Batch Shapes Auto - CPU         0.949458   1.04383    1.04534
onednn: Deconvolution Batch shapes_1d - CPU         20.5780    20.6903    20.7648
onednn: Deconvolution Batch shapes_3d - CPU         2.16434    2.16765    2.16533
onednn: Recurrent Neural Network Training - CPU     1338.92    1346.96    1346.06
onednn: Recurrent Neural Network Inference - CPU    768.828    770.242    772.891
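The summary figures above are easier to compare as percent differences against one run. The following is a minimal sketch, assuming run a is taken as the baseline; the helper name `percent_vs_baseline` is hypothetical and not part of the Phoronix Test Suite, and the numbers are copied from the summary.

```python
# Times from the result summary (lower is better), keyed by test and run.
results = {
    "IP Shapes 1D": {"a": 1.26628, "b": 1.36448, "c": 1.30583},
    "IP Shapes 3D": {"a": 7.74579, "b": 15.1285, "c": 14.9667},
    "Convolution Batch Shapes Auto": {"a": 0.949458, "b": 1.04383, "c": 1.04534},
    "Deconvolution Batch shapes_1d": {"a": 20.5780, "b": 20.6903, "c": 20.7648},
    "Deconvolution Batch shapes_3d": {"a": 2.16434, "b": 2.16765, "c": 2.16533},
    "Recurrent Neural Network Training": {"a": 1338.92, "b": 1346.96, "c": 1346.06},
    "Recurrent Neural Network Inference": {"a": 768.828, "b": 770.242, "c": 772.891},
}


def percent_vs_baseline(times, baseline="a"):
    """Return each run's time as a percent change relative to the baseline run."""
    base = times[baseline]
    return {run: (t - base) / base * 100.0 for run, t in times.items()}


for test, times in results.items():
    deltas = percent_vs_baseline(times)
    row = "  ".join(f"{run}: {delta:+7.2f}%" for run, delta in deltas.items())
    print(f"{test:36s} {row}")
```

A positive percentage means the run took longer than run a on that test. Choosing a as the baseline is an illustrative assumption; any of the three runs could serve.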
oneDNN 3.6
Harness: IP Shapes 1D - Engine: CPU
ms, Fewer Is Better

Run   Result (ms)   SE +/-     N    MIN
a     1.26628       0.03691    15   1.01
c     1.30583       0.03052    15   1.03
b     1.36448       0.02561    15   1.04

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN 3.6
Harness: IP Shapes 3D - Engine: CPU
ms, Fewer Is Better

Run   Result (ms)   SE +/-     N   MIN
a     7.74579       0.00760    3   7.63
c     14.96670      0.05077    3   14.69
b     15.12850      0.04414    3   14.89

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
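Each result is reported with a standard error (SE), so one rough way to judge whether two runs really differ is to express their gap in units of the combined SE. The sketch below does this for the IP Shapes 3D figures. It assumes the reported values are means with independent errors; with N = 3 it is only a crude indicator, not a formal significance test, and the helper `gap_in_standard_errors` is a hypothetical name.

```python
import math

# (result in ms, standard error) for IP Shapes 3D, copied from the figures above.
ip_shapes_3d = {
    "a": (7.74579, 0.00760),
    "b": (15.12850, 0.04414),
    "c": (14.96670, 0.05077),
}


def gap_in_standard_errors(x, y):
    """Absolute difference of two results divided by their combined standard error."""
    (value_x, se_x), (value_y, se_y) = x, y
    combined_se = math.sqrt(se_x ** 2 + se_y ** 2)
    return abs(value_x - value_y) / combined_se


for first, second in [("a", "b"), ("a", "c"), ("b", "c")]:
    ratio = gap_in_standard_errors(ip_shapes_3d[first], ip_shapes_3d[second])
    print(f"{first} vs {second}: {ratio:.1f} combined standard errors apart")
```

Under these assumptions, the gap between run a and the other two runs is far larger than the reported run-to-run noise, while b and c sit only a few combined standard errors apart.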
oneDNN 3.6
Harness: Convolution Batch Shapes Auto - Engine: CPU
ms, Fewer Is Better

Run   Result (ms)   SE +/-      N   MIN
a     0.949458      0.002358    3   0.88
b     1.043830      0.008423    3   0.93
c     1.045340      0.009860    3   0.95

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN 3.6
Harness: Deconvolution Batch shapes_1d - Engine: CPU
ms, Fewer Is Better

Run   Result (ms)   SE +/-   N   MIN
a     20.58         0.09     3   14.89
b     20.69         0.09     3   15.94
c     20.76         0.04     3   19

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN 3.6
Harness: Deconvolution Batch shapes_3d - Engine: CPU
ms, Fewer Is Better

Run   Result (ms)   SE +/-     N   MIN
a     2.16434       0.00886    3   2.09
c     2.16533       0.00650    3   2.09
b     2.16765       0.00898    3   2.09

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN 3.6
Harness: Recurrent Neural Network Training - Engine: CPU
ms, Fewer Is Better

Run   Result (ms)   SE +/-   N   MIN
a     1338.92       1.84     3   1303.09
c     1346.06       0.47     3   1313.42
b     1346.96       1.42     3   1309.58

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN 3.6
Harness: Recurrent Neural Network Inference - Engine: CPU
ms, Fewer Is Better

Run   Result (ms)   SE +/-   N   MIN
a     768.83        1.77     3   729.92
b     770.24        0.74     3   735.93
c     772.89        0.52     3   732.57

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
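Because the seven harnesses span very different time scales (about 1 ms up to more than 1300 ms), a geometric mean is one common way to condense each run into a single figure without letting the slowest tests dominate. This export does not report such a figure; the sketch below is an illustrative calculation on the results listed above, not output from the Phoronix Test Suite.

```python
from statistics import geometric_mean

# Results in ms per run, in the order the seven harnesses are listed above:
# IP Shapes 1D, IP Shapes 3D, Convolution Batch Shapes Auto,
# Deconvolution Batch shapes_1d, Deconvolution Batch shapes_3d,
# Recurrent Neural Network Training, Recurrent Neural Network Inference.
runs = {
    "a": [1.26628, 7.74579, 0.949458, 20.5780, 2.16434, 1338.92, 768.828],
    "b": [1.36448, 15.1285, 1.04383, 20.6903, 2.16765, 1346.96, 770.242],
    "c": [1.30583, 14.9667, 1.04534, 20.7648, 2.16533, 1346.06, 772.891],
}

# One geometric mean per run; lower is better, as with the individual results.
geomeans = {run: geometric_mean(times) for run, times in runs.items()}

for run, value in geomeans.items():
    relative = value / geomeans["a"]
    print(f"{run}: geometric mean {value:.3f} ms ({relative:.3f}x run a)")
```

Since run a has the lowest time on every harness, its geometric mean is necessarily the lowest of the three; most of the gap comes from the IP Shapes 3D result.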
Phoronix Test Suite v10.8.5