TR 3970X oneDNN + clash

AMD Ryzen Threadripper 3970X 32-Core testing with an ASUS ROG ZENITH II EXTREME (1201 BIOS) and AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2012102-PTS-TR3970XO71.
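For readers who want to compare their own hardware against this result, the Phoronix Test Suite can benchmark a local system directly against a public OpenBenchmarking.org result ID. A minimal sketch, assuming a Debian/Ubuntu system where the `phoronix-test-suite` package is available:

```shell
# Install the Phoronix Test Suite, then run the same test selection as this
# result and merge the local numbers into the comparison.
sudo apt-get install phoronix-test-suite
phoronix-test-suite benchmark 2012102-PTS-TR3970XO71
```

The result ID is the one in the URL above; the suite will prompt to download and build the referenced tests (libplacebo, oneDNN, the Clash compile) before running.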
TR 3970X oneDNN + clash: System Details (identical for runs 1, 2, and 3)

  Processor:         AMD Ryzen Threadripper 3970X 32-Core @ 3.70GHz (32 Cores / 64 Threads)
  Motherboard:       ASUS ROG ZENITH II EXTREME (1201 BIOS)
  Chipset:           AMD Starship/Matisse
  Memory:            64GB
  Disk:              Samsung SSD 980 PRO 500GB
  Graphics:          AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB (1750/875MHz)
  Audio:             AMD Navi 10 HDMI Audio
  Monitor:           ASUS VP28U
  Network:           Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Wi-Fi 6 AX200
  OS:                Ubuntu 20.10
  Kernel:            5.8.0-29-generic (x86_64)
  Desktop:           GNOME Shell 3.38.1
  Display Server:    X Server 1.20.9
  Display Driver:    amdgpu 19.1.0
  OpenGL:            4.6 Mesa 20.2.1 (LLVM 11.0.0)
  Vulkan:            1.2.131
  Compiler:          GCC 10.2.0
  File-System:       ext4
  Screen Resolution: 3840x2160

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled); CPU Microcode: 0x8301039

Graphics Details: GLAMOR

Security Details:
  itlb_multihit:     Not affected
  l1tf:              Not affected
  mds:               Not affected
  meltdown:          Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1:        Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2:        Mitigation of Full AMD retpoline, IBPB: conditional, STIBP: conditional, RSB filling
  srbds:             Not affected
  tsx_async_abort:   Not affected
TR 3970X oneDNN + clash: Result Summary (Run 1 / Run 2 / Run 3)

Libplacebo 2.72.2 (FPS, more is better):
  deband_heavy:      351.71 / 350.84 / 355.35
  polar_nocompute:   700.10 / 698.31 / 700.09
  hdr_peakdetect:    3262.67 / 3261.05 / 3262.45
  av1_grain_lap:     602.60 / 600.12 / 601.56

oneDNN 2.0 (ms, fewer is better):
  IP Shapes 1D - f32 - CPU:                                   1.20958 / 1.21820 / 1.20900
  IP Shapes 3D - f32 - CPU:                                   4.38972 / 4.89684 / 5.01124
  IP Shapes 1D - u8s8f32 - CPU:                               0.936915 / 0.935449 / 0.937624
  IP Shapes 3D - u8s8f32 - CPU:                               0.844063 / 0.841168 / 0.832783
  Convolution Batch Shapes Auto - f32 - CPU:                  5.80922 / 6.14756 / 6.23653
  Deconvolution Batch shapes_1d - f32 - CPU:                  1.60368 / 1.61715 / 1.60088
  Deconvolution Batch shapes_3d - f32 - CPU:                  2.72913 / 2.70732 / 2.70346
  Convolution Batch Shapes Auto - u8s8f32 - CPU:              6.20119 / 6.74754 / 6.79948
  Deconvolution Batch shapes_1d - u8s8f32 - CPU:              1.76063 / 1.74974 / 1.75098
  Deconvolution Batch shapes_3d - u8s8f32 - CPU:              1.56172 / 1.56151 / 1.56305
  Recurrent Neural Network Training - f32 - CPU:              3980.36 / 3967.73 / 3977.39
  Recurrent Neural Network Inference - f32 - CPU:             929.528 / 930.293 / 930.499
  Recurrent Neural Network Training - u8s8f32 - CPU:          3974.30 / 3975.07 / 3980.60
  Recurrent Neural Network Inference - u8s8f32 - CPU:         930.358 / 932.468 / 931.090
  Matrix Multiply Batch Shapes Transformer - f32 - CPU:       0.403642 / 0.404541 / 0.408374
  Recurrent Neural Network Training - bf16bf16bf16 - CPU:     3977.18 / 3961.17 / 3980.54
  Recurrent Neural Network Inference - bf16bf16bf16 - CPU:    931.502 / 926.133 / 933.869
  Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU:   0.927523 / 0.927820 / 0.927802

Timed Clash Compilation (Seconds, fewer is better):
  Time To Compile:   386.329 / 387.231 / 387.161
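Every figure in the detailed results below is an average of N = 3 internal trials with a reported standard error (SE = sample standard deviation divided by the square root of N). As a quick sketch of that statistic, and of how one might gauge run-to-run spread from the table above, the following uses the three deband_heavy run averages. Note the SE values printed in this report are computed by the Phoronix Test Suite within each run; the cross-run SE below is only an illustrative variance check, not a reproduction of those figures.

```python
import statistics
from math import sqrt

def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(samples) / sqrt(len(samples))

# The three deband_heavy run averages (FPS) from the summary table above.
runs = [351.71, 350.84, 355.35]
print(f"cross-run SE: {standard_error(runs):.2f} FPS")
```

With run-to-run spread this small relative to the averages (well under 1%), the three runs are effectively interchangeable, which is why the per-test graphs below look flat across runs.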
Libplacebo 2.72.2 (FPS, more is better; each value is the mean of N = 3 trials, Run 1 / Run 2 / Run 3):

  deband_heavy:     351.71 (SE +/- 0.96)  / 350.84 (SE +/- 0.97)  / 355.35 (SE +/- 1.73)
  polar_nocompute:  700.10 (SE +/- 0.72)  / 698.31 (SE +/- 0.18)  / 700.09 (SE +/- 0.27)
  hdr_peakdetect:   3262.67 (SE +/- 4.26) / 3261.05 (SE +/- 7.04) / 3262.45 (SE +/- 4.17)
  av1_grain_lap:    602.60 (SE +/- 0.80)  / 600.12 (SE +/- 1.15)  / 601.56 (SE +/- 2.50)

1. (CXX) g++ options: -lm -lglslang -lHLSL -lOGLCompiler -lOSDependent -lSPIRV -lSPVRemapper -lSPIRV-Tools -lSPIRV-Tools-opt -lpthread -pthread -pipe -std=c++11 -fvisibility=hidden -fPIC -MD -MQ -MF
oneDNN 2.0 (ms, fewer is better; each value is the mean of N = 3 trials, listed as Run 1 / Run 2 / Run 3 with its standard error and observed minimum):

  Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
    1.20958 (SE +/- 0.00243, MIN 1.12) / 1.21820 (SE +/- 0.00595, MIN 1.12) / 1.20900 (SE +/- 0.00066, MIN 1.12)
  Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
    4.38972 (SE +/- 0.01419, MIN 4.07) / 4.89684 (SE +/- 0.00695, MIN 4.59) / 5.01124 (SE +/- 0.02226, MIN 4.65)
  Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
    0.936915 (SE +/- 0.001417, MIN 0.88) / 0.935449 (SE +/- 0.001947, MIN 0.88) / 0.937624 (SE +/- 0.003296, MIN 0.88)
  Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
    0.844063 (SE +/- 0.005764, MIN 0.77) / 0.841168 (SE +/- 0.004003, MIN 0.74) / 0.832783 (SE +/- 0.001176, MIN 0.78)
  Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
    5.80922 (SE +/- 0.02478, MIN 5.28) / 6.14756 (SE +/- 0.00676, MIN 5.62) / 6.23653 (SE +/- 0.01146, MIN 5.67)
  Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
    1.60368 (SE +/- 0.00520, MIN 1.5) / 1.61715 (SE +/- 0.00360, MIN 1.52) / 1.60088 (SE +/- 0.00047, MIN 1.5)
  Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
    2.72913 (SE +/- 0.02503, MIN 2.64) / 2.70732 (SE +/- 0.00672, MIN 2.62) / 2.70346 (SE +/- 0.01091, MIN 2.62)
  Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
    6.20119 (SE +/- 0.00964, MIN 5.78) / 6.74754 (SE +/- 0.02495, MIN 6.31) / 6.79948 (SE +/- 0.02984, MIN 6.3)
  Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
    1.76063 (SE +/- 0.00478, MIN 1.63) / 1.74974 (SE +/- 0.00328, MIN 1.63) / 1.75098 (SE +/- 0.00191, MIN 1.64)
  Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
    1.56172 (SE +/- 0.00671, MIN 1.48) / 1.56151 (SE +/- 0.00409, MIN 1.46) / 1.56305 (SE +/- 0.00545, MIN 1.46)
  Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
    3980.36 (SE +/- 8.10, MIN 3931.06) / 3967.73 (SE +/- 6.45, MIN 3911.52) / 3977.39 (SE +/- 6.52, MIN 3927.02)
  Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
    929.53 (SE +/- 3.92, MIN 905.59) / 930.29 (SE +/- 2.09, MIN 911.4) / 930.50 (SE +/- 1.05, MIN 912.72)
  Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
    3974.30 (SE +/- 7.47, MIN 3929.38) / 3975.07 (SE +/- 6.86, MIN 3934.22) / 3980.60 (SE +/- 10.73, MIN 3928.08)
  Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
    930.36 (SE +/- 3.06, MIN 905.23) / 932.47 (SE +/- 2.57, MIN 908.01) / 931.09 (SE +/- 0.55, MIN 913.62)
  Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
    0.403642 (SE +/- 0.000208, MIN 0.38) / 0.404541 (SE +/- 0.001111, MIN 0.38) / 0.408374 (SE +/- 0.004350, MIN 0.38)
  Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
    3977.18 (SE +/- 9.31, MIN 3934.22) / 3961.17 (SE +/- 12.17, MIN 3906.73) / 3980.54 (SE +/- 3.22, MIN 3936.7)
  Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
    931.50 (SE +/- 1.31, MIN 913.56) / 926.13 (SE +/- 2.47, MIN 905.47) / 933.87 (SE +/- 3.41, MIN 908.49)
  Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
    0.927523 (SE +/- 0.001498, MIN 0.85) / 0.927820 (SE +/- 0.002641, MIN 0.85) / 0.927802 (SE +/- 0.002172, MIN 0.84)

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Timed Clash Compilation (Seconds, fewer is better; mean of N = 3 trials, Run 1 / Run 2 / Run 3):

  Time To Compile:  386.33 (SE +/- 0.41) / 387.23 (SE +/- 1.63) / 387.16 (SE +/- 0.36)
Phoronix Test Suite v10.8.4