3990X sysbench onednn

AMD Ryzen Threadripper 3990X 64-Core testing with a System76 Thelio Major (F4c Z5 BIOS) and AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB on Pop 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2103132-PTS-3990XSYS94&grr&sor .
3990X sysbench onednn
Result identifiers: 1, 2, 3, 4

Processor: AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads)
Motherboard: System76 Thelio Major (F4c Z5 BIOS)
Chipset: AMD Starship/Matisse
Memory: 126GB
Disk: Samsung SSD 970 EVO Plus 500GB
Graphics: AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB (1750/875MHz)
Audio: AMD Navi 10 HDMI Audio
Monitor: DELL P2415Q
Network: Intel I211 + Intel Wi-Fi 6 AX200
OS: Pop 20.10
Kernel: 5.8.0-7630-generic (x86_64)
Desktop: GNOME Shell 3.38.2
Display Server: X Server 1.20.8
OpenGL: 4.6 Mesa 21.1.0-devel (git-96d7555 2021-01-22 groovy-oibaf-ppa) (LLVM 11.0.1)
Vulkan: 1.2.145
Compiler: GCC 10.2.0 + Clang 11.0.1-1~oibaf~g
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- snd_usb_audio.ignore_ctl_error=1
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled)
- CPU Microcode: 0x8301025

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected
3990X sysbench onednn
Results overview. sysbench: CPU is in events per second and sysbench: RAM / Memory is in MiB/sec (more is better); all onednn results are in ms (fewer is better).

Test                                                                      1          2          3          4
sysbench: CPU                                                     124312.60  123807.73  123768.96  123660.73
onednn: Recurrent Neural Network Training - u8s8f32 - CPU           3780.64    3750.45    3760.17    3721.68
onednn: Recurrent Neural Network Training - f32 - CPU               3741.53    3770.99    3747.70    3730.17
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU      3750.35    3759.67    3747.38    3733.74
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU     742.788    744.787    743.581    746.702
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU          743.197    747.633    745.817    747.814
onednn: Recurrent Neural Network Inference - f32 - CPU              743.972    744.471    746.450    744.352
onednn: Deconvolution Batch shapes_1d - f32 - CPU                   7.56650    7.62499    7.59192    7.70634
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU              0.890976   0.896707   0.895418   0.889403
onednn: IP Shapes 1D - f32 - CPU                                    1.23508    1.25724    1.24953    1.23832
onednn: IP Shapes 1D - u8s8f32 - CPU                                1.47243    1.48945    1.50143    1.46291
sysbench: RAM / Memory                                              7515.51    7520.15    7511.92    7484.70
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU       0.411915   0.415420   0.411918   0.417122
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU   0.667705   0.668714   0.667192   0.670735
onednn: IP Shapes 3D - f32 - CPU                                    5.37221    8.44270    8.57665    8.11946
onednn: IP Shapes 3D - u8s8f32 - CPU                               1.002151    1.02689    1.03252    1.03548
onednn: Convolution Batch Shapes Auto - f32 - CPU                  0.946125    1.02549    1.08129    1.08280
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU               6.52783    6.63625    6.58848    6.65608
onednn: Deconvolution Batch shapes_3d - f32 - CPU                   2.02131    2.03585    2.04302    2.02647
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU              0.947941   0.954759   0.953820   0.954016
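As a reading aid for the overview results above, the following Python sketch computes the run-to-run spread and the best run for a handful of tests. The values are copied from the overview; the helper names `spread_percent` and `best_run` are illustrative only and are not part of the Phoronix Test Suite.

```python
# Per-run values (runs 1-4) copied from the results overview.
# The boolean marks "more is better" tests; the others are ms, fewer is better.
results = {
    "sysbench: CPU": ([124312.60, 123807.73, 123768.96, 123660.73], True),
    "sysbench: RAM / Memory": ([7515.51, 7520.15, 7511.92, 7484.70], True),
    "onednn: Recurrent Neural Network Training - f32 - CPU": (
        [3741.53, 3770.99, 3747.70, 3730.17], False),
    "onednn: IP Shapes 3D - f32 - CPU": (
        [5.37221, 8.44270, 8.57665, 8.11946], False),
    "onednn: Convolution Batch Shapes Auto - f32 - CPU": (
        [0.946125, 1.02549, 1.08129, 1.08280], False),
}


def spread_percent(values):
    """Return (max - min) / min * 100 across the runs of one test."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0


def best_run(values, higher_is_better):
    """Return the 1-based run identifier with the best result."""
    pick = max if higher_is_better else min
    return values.index(pick(values)) + 1


for name, (values, higher) in results.items():
    print(f"{name}: spread {spread_percent(values):.1f}%, "
          f"best run {best_run(values, higher)}")
```

Most tests in this result file differ by about 1% or less between runs; a sketch like this makes the few larger gaps, such as IP Shapes 3D f32 where run 1 is far ahead of the other three, easy to spot.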
Sysbench 1.0.20 Test: CPU
Events Per Second, More Is Better
  Run 1: 124312.60 (SE +/- 312.36, N = 3)
  Run 2: 123807.73 (SE +/- 228.83, N = 3)
  Run 3: 123768.96 (SE +/- 281.49, N = 3)
  Run 4: 123660.73 (SE +/- 209.17, N = 3)
1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
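The "SE +/-" figures can be put in context with a little arithmetic. The sketch below assumes that SE denotes the standard error of the mean over the N samples (an assumption; the export does not define it) and back-computes the implied sample standard deviation and the relative error for Sysbench CPU run 1. The function names are hypothetical.

```python
import math


def implied_stddev(se, n):
    """Sample standard deviation implied by a standard error of the mean.

    Assumes SE = stddev / sqrt(N), which is not confirmed by the export.
    """
    return se * math.sqrt(n)


def relative_se_percent(se, mean):
    """Standard error expressed as a percentage of the reported mean."""
    return se / mean * 100.0


# Sysbench CPU, run 1: 124312.60 events per second, SE +/- 312.36, N = 3
mean, se, n = 124312.60, 312.36, 3
print(f"implied stddev: {implied_stddev(se, n):.2f}")
print(f"relative SE:    {relative_se_percent(se, mean):.3f}%")
```

Under that assumption, a relative error well below the gap between two runs suggests the gap is not just sampling noise, while gaps of similar size to the error should be read with caution.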
oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run 4: 3721.68 (SE +/- 13.27, N = 3; MIN: 3682)
  Run 2: 3750.45 (SE +/- 8.77, N = 3; MIN: 3724.77)
  Run 3: 3760.17 (SE +/- 6.98, N = 3; MIN: 3731.98)
  Run 1: 3780.64 (SE +/- 10.06, N = 3; MIN: 3744.41)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run 4: 3730.17 (SE +/- 10.20, N = 3; MIN: 3690.55)
  Run 1: 3741.53 (SE +/- 3.42, N = 3; MIN: 3720.38)
  Run 3: 3747.70 (SE +/- 5.93, N = 3; MIN: 3723.93)
  Run 2: 3770.99 (SE +/- 3.71, N = 3; MIN: 3740.38)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  Run 4: 3733.74 (SE +/- 13.03, N = 3; MIN: 3691.91)
  Run 3: 3747.38 (SE +/- 2.18, N = 3; MIN: 3726.51)
  Run 1: 3750.35 (SE +/- 3.01, N = 3; MIN: 3727.41)
  Run 2: 3759.67 (SE +/- 4.29, N = 3; MIN: 3726.65)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  Run 1: 742.79 (SE +/- 0.53, N = 3; MIN: 734.92)
  Run 3: 743.58 (SE +/- 1.75, N = 3; MIN: 733.02)
  Run 2: 744.79 (SE +/- 0.96, N = 3; MIN: 736.3)
  Run 4: 746.70 (SE +/- 0.55, N = 3; MIN: 738.99)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run 1: 743.20 (SE +/- 1.10, N = 3; MIN: 734.28)
  Run 3: 745.82 (SE +/- 0.40, N = 3; MIN: 737.38)
  Run 2: 747.63 (SE +/- 1.00, N = 3; MIN: 738.57)
  Run 4: 747.81 (SE +/- 1.95, N = 3; MIN: 737.34)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run 1: 743.97 (SE +/- 2.22, N = 3; MIN: 733.98)
  Run 4: 744.35 (SE +/- 1.34, N = 3; MIN: 734.63)
  Run 2: 744.47 (SE +/- 0.50, N = 3; MIN: 735.64)
  Run 3: 746.45 (SE +/- 1.44, N = 3; MIN: 736.4)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run 1: 7.56650 (SE +/- 0.02931, N = 3; MIN: 6.16)
  Run 3: 7.59192 (SE +/- 0.00967, N = 3; MIN: 6.56)
  Run 2: 7.62499 (SE +/- 0.02617, N = 3; MIN: 6.17)
  Run 4: 7.70634 (SE +/- 0.01670, N = 3; MIN: 6.61)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run 4: 0.889403 (SE +/- 0.000457, N = 3; MIN: 0.84)
  Run 1: 0.890976 (SE +/- 0.001294, N = 3; MIN: 0.84)
  Run 3: 0.895418 (SE +/- 0.002330, N = 3; MIN: 0.84)
  Run 2: 0.896707 (SE +/- 0.002296, N = 3; MIN: 0.84)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run 1: 1.23508 (SE +/- 0.00292, N = 3; MIN: 1.19)
  Run 4: 1.23832 (SE +/- 0.00828, N = 3; MIN: 1.19)
  Run 3: 1.24953 (SE +/- 0.00826, N = 3; MIN: 1.2)
  Run 2: 1.25724 (SE +/- 0.00760, N = 3; MIN: 1.2)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run 4: 1.46291 (SE +/- 0.00627, N = 3; MIN: 1.24)
  Run 1: 1.47243 (SE +/- 0.00777, N = 3; MIN: 1.26)
  Run 2: 1.48945 (SE +/- 0.00450, N = 3; MIN: 1.28)
  Run 3: 1.50143 (SE +/- 0.00467, N = 3; MIN: 1.28)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Sysbench 1.0.20 Test: RAM / Memory
MiB/sec, More Is Better
  Run 2: 7520.15 (SE +/- 9.49, N = 3)
  Run 1: 7515.51 (SE +/- 11.15, N = 3)
  Run 3: 7511.92 (SE +/- 12.72, N = 3)
  Run 4: 7484.70 (SE +/- 8.95, N = 3)
1. (CC) gcc options: -pthread -O2 -funroll-loops -rdynamic -ldl -laio -lm
oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run 1: 0.411915 (SE +/- 0.000765, N = 3; MIN: 0.4)
  Run 3: 0.411918 (SE +/- 0.002080, N = 3; MIN: 0.4)
  Run 2: 0.415420 (SE +/- 0.000908, N = 3; MIN: 0.4)
  Run 4: 0.417122 (SE +/- 0.005514, N = 3; MIN: 0.4)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run 3: 0.667192 (SE +/- 0.000661, N = 3; MIN: 0.64)
  Run 1: 0.667705 (SE +/- 0.001823, N = 3; MIN: 0.64)
  Run 2: 0.668714 (SE +/- 0.000941, N = 3; MIN: 0.64)
  Run 4: 0.670735 (SE +/- 0.001611, N = 3; MIN: 0.64)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run 1: 5.37221 (SE +/- 0.01315, N = 3; MIN: 5.17)
  Run 4: 8.11946 (SE +/- 0.02779, N = 3; MIN: 7.82)
  Run 2: 8.44270 (SE +/- 0.11123, N = 3; MIN: 8.09)
  Run 3: 8.57665 (SE +/- 0.01501, N = 3; MIN: 8.33)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run 1: 1.002151 (SE +/- 0.001528, N = 3; MIN: 0.95)
  Run 2: 1.026890 (SE +/- 0.002666, N = 3; MIN: 0.98)
  Run 3: 1.032520 (SE +/- 0.002913, N = 3; MIN: 0.98)
  Run 4: 1.035480 (SE +/- 0.002595, N = 3; MIN: 0.98)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run 1: 0.946125 (SE +/- 0.006847, N = 3; MIN: 0.85)
  Run 2: 1.025490 (SE +/- 0.016517, N = 3; MIN: 0.92)
  Run 3: 1.081290 (SE +/- 0.002000, N = 3; MIN: 0.98)
  Run 4: 1.082800 (SE +/- 0.009734, N = 3; MIN: 0.98)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run 1: 6.52783 (SE +/- 0.01403, N = 3; MIN: 6.36)
  Run 3: 6.58848 (SE +/- 0.00793, N = 3; MIN: 6.47)
  Run 2: 6.63625 (SE +/- 0.02161, N = 3; MIN: 6.5)
  Run 4: 6.65608 (SE +/- 0.01940, N = 3; MIN: 6.51)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run 1: 2.02131 (SE +/- 0.00410, N = 3; MIN: 1.93)
  Run 4: 2.02647 (SE +/- 0.00241, N = 3; MIN: 1.96)
  Run 2: 2.03585 (SE +/- 0.00592, N = 3; MIN: 1.96)
  Run 3: 2.04302 (SE +/- 0.00417, N = 3; MIN: 1.97)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run 1: 0.947941 (SE +/- 0.002129, N = 3; MIN: 0.9)
  Run 3: 0.953820 (SE +/- 0.000086, N = 3; MIN: 0.89)
  Run 4: 0.954016 (SE +/- 0.000529, N = 3; MIN: 0.9)
  Run 2: 0.954759 (SE +/- 0.001253, N = 3; MIN: 0.9)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Phoronix Test Suite v10.8.5