TR 3970X oneDNN + clash

AMD Ryzen Threadripper 3970X 32-Core testing with a ASUS ROG ZENITH II EXTREME (1201 BIOS) and AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB on Ubuntu 20.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2012102-PTS-TR3970XO71.

TR 3970X oneDNN + clashProcessorMotherboardChipsetMemoryDiskGraphicsAudioMonitorNetworkOSKernelDesktopDisplay ServerDisplay DriverOpenGLVulkanCompilerFile-SystemScreen Resolution123AMD Ryzen Threadripper 3970X 32-Core @ 3.70GHz (32 Cores / 64 Threads)ASUS ROG ZENITH II EXTREME (1201 BIOS)AMD Starship/Matisse64GBSamsung SSD 980 PRO 500GBAMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB (1750/875MHz)AMD Navi 10 HDMI AudioASUS VP28UAquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Wi-Fi 6 AX200Ubuntu 20.105.8.0-29-generic (x86_64)GNOME Shell 3.38.1X Server 1.20.9amdgpu 19.1.04.6 Mesa 20.2.1 (LLVM 11.0.0)1.2.131GCC 10.2.0ext43840x2160OpenBenchmarking.orgCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x8301039Graphics Details- GLAMORSecurity Details- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

TR 3970X oneDNN + clashlibplacebo: deband_heavylibplacebo: polar_nocomputelibplacebo: hdr_peakdetectlibplacebo: av1_grain_laponednn: IP Shapes 1D - f32 - CPUonednn: IP Shapes 3D - f32 - CPUonednn: IP Shapes 1D - u8s8f32 - CPUonednn: IP Shapes 3D - u8s8f32 - CPUonednn: Convolution Batch Shapes Auto - f32 - CPUonednn: Deconvolution Batch shapes_1d - f32 - CPUonednn: Deconvolution Batch shapes_3d - f32 - CPUonednn: Convolution Batch Shapes Auto - u8s8f32 - CPUonednn: Deconvolution Batch shapes_1d - u8s8f32 - CPUonednn: Deconvolution Batch shapes_3d - u8s8f32 - CPUonednn: Recurrent Neural Network Training - f32 - CPUonednn: Recurrent Neural Network Inference - f32 - CPUonednn: Recurrent Neural Network Training - u8s8f32 - CPUonednn: Recurrent Neural Network Inference - u8s8f32 - CPUonednn: Matrix Multiply Batch Shapes Transformer - f32 - CPUonednn: Recurrent Neural Network Training - bf16bf16bf16 - CPUonednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPUonednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPUbuild-clash: Time To Compile123351.71700.13262.67602.601.209584.389720.9369150.8440635.809221.603682.729136.201191.760631.561723980.36929.5283974.30930.3580.4036423977.18931.5020.927523386.329350.84698.313261.05600.121.218204.896840.9354490.8411686.147561.617152.707326.747541.749741.561513967.73930.2933975.07932.4680.4045413961.17926.1330.927820387.231355.35700.093262.45601.561.209005.011240.9376240.8327836.236531.600882.703466.799481.750981.563053977.39930.4993980.60931.0900.4083743980.54933.8690.927802387.161OpenBenchmarking.org

Libplacebo

Test: deband_heavy

OpenBenchmarking.orgFPS, More Is BetterLibplacebo 2.72.2Test: deband_heavy12380160240320400SE +/- 0.96, N = 3SE +/- 0.97, N = 3SE +/- 1.73, N = 3351.71350.84355.351. (CXX) g++ options: -lm -lglslang -lHLSL -lOGLCompiler -lOSDependent -lSPIRV -lSPVRemapper -lSPIRV-Tools -lSPIRV-Tools-opt -lpthread -pthread -pipe -std=c++11 -fvisibility=hidden -fPIC -MD -MQ -MF

Libplacebo

Test: polar_nocompute

OpenBenchmarking.orgFPS, More Is BetterLibplacebo 2.72.2Test: polar_nocompute123150300450600750SE +/- 0.72, N = 3SE +/- 0.18, N = 3SE +/- 0.27, N = 3700.10698.31700.091. (CXX) g++ options: -lm -lglslang -lHLSL -lOGLCompiler -lOSDependent -lSPIRV -lSPVRemapper -lSPIRV-Tools -lSPIRV-Tools-opt -lpthread -pthread -pipe -std=c++11 -fvisibility=hidden -fPIC -MD -MQ -MF

Libplacebo

Test: hdr_peakdetect

OpenBenchmarking.orgFPS, More Is BetterLibplacebo 2.72.2Test: hdr_peakdetect1237001400210028003500SE +/- 4.26, N = 3SE +/- 7.04, N = 3SE +/- 4.17, N = 33262.673261.053262.451. (CXX) g++ options: -lm -lglslang -lHLSL -lOGLCompiler -lOSDependent -lSPIRV -lSPVRemapper -lSPIRV-Tools -lSPIRV-Tools-opt -lpthread -pthread -pipe -std=c++11 -fvisibility=hidden -fPIC -MD -MQ -MF

Libplacebo

Test: av1_grain_lap

OpenBenchmarking.orgFPS, More Is BetterLibplacebo 2.72.2Test: av1_grain_lap123130260390520650SE +/- 0.80, N = 3SE +/- 1.15, N = 3SE +/- 2.50, N = 3602.60600.12601.561. (CXX) g++ options: -lm -lglslang -lHLSL -lOGLCompiler -lOSDependent -lSPIRV -lSPVRemapper -lSPIRV-Tools -lSPIRV-Tools-opt -lpthread -pthread -pipe -std=c++11 -fvisibility=hidden -fPIC -MD -MQ -MF

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU1230.27410.54820.82231.09641.3705SE +/- 0.00243, N = 3SE +/- 0.00595, N = 3SE +/- 0.00066, N = 31.209581.218201.20900MIN: 1.12MIN: 1.12MIN: 1.121. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU1231.12752.2553.38254.515.6375SE +/- 0.01419, N = 3SE +/- 0.00695, N = 3SE +/- 0.02226, N = 34.389724.896845.01124MIN: 4.07MIN: 4.59MIN: 4.651. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU1230.2110.4220.6330.8441.055SE +/- 0.001417, N = 3SE +/- 0.001947, N = 3SE +/- 0.003296, N = 30.9369150.9354490.937624MIN: 0.88MIN: 0.88MIN: 0.881. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU1230.18990.37980.56970.75960.9495SE +/- 0.005764, N = 3SE +/- 0.004003, N = 3SE +/- 0.001176, N = 30.8440630.8411680.832783MIN: 0.77MIN: 0.74MIN: 0.781. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU123246810SE +/- 0.02478, N = 3SE +/- 0.00676, N = 3SE +/- 0.01146, N = 35.809226.147566.23653MIN: 5.28MIN: 5.62MIN: 5.671. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU1230.36390.72781.09171.45561.8195SE +/- 0.00520, N = 3SE +/- 0.00360, N = 3SE +/- 0.00047, N = 31.603681.617151.60088MIN: 1.5MIN: 1.52MIN: 1.51. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU1230.61411.22821.84232.45643.0705SE +/- 0.02503, N = 3SE +/- 0.00672, N = 3SE +/- 0.01091, N = 32.729132.707322.70346MIN: 2.64MIN: 2.62MIN: 2.621. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU123246810SE +/- 0.00964, N = 3SE +/- 0.02495, N = 3SE +/- 0.02984, N = 36.201196.747546.79948MIN: 5.78MIN: 6.31MIN: 6.31. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU1230.39610.79221.18831.58441.9805SE +/- 0.00478, N = 3SE +/- 0.00328, N = 3SE +/- 0.00191, N = 31.760631.749741.75098MIN: 1.63MIN: 1.63MIN: 1.641. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU1230.35170.70341.05511.40681.7585SE +/- 0.00671, N = 3SE +/- 0.00409, N = 3SE +/- 0.00545, N = 31.561721.561511.56305MIN: 1.48MIN: 1.46MIN: 1.461. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU1239001800270036004500SE +/- 8.10, N = 3SE +/- 6.45, N = 3SE +/- 6.52, N = 33980.363967.733977.39MIN: 3931.06MIN: 3911.52MIN: 3927.021. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU1232004006008001000SE +/- 3.92, N = 3SE +/- 2.09, N = 3SE +/- 1.05, N = 3929.53930.29930.50MIN: 905.59MIN: 911.4MIN: 912.721. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU1239001800270036004500SE +/- 7.47, N = 3SE +/- 6.86, N = 3SE +/- 10.73, N = 33974.303975.073980.60MIN: 3929.38MIN: 3934.22MIN: 3928.081. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU1232004006008001000SE +/- 3.06, N = 3SE +/- 2.57, N = 3SE +/- 0.55, N = 3930.36932.47931.09MIN: 905.23MIN: 908.01MIN: 913.621. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU1230.09190.18380.27570.36760.4595SE +/- 0.000208, N = 3SE +/- 0.001111, N = 3SE +/- 0.004350, N = 30.4036420.4045410.408374MIN: 0.38MIN: 0.38MIN: 0.381. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU1239001800270036004500SE +/- 9.31, N = 3SE +/- 12.17, N = 3SE +/- 3.22, N = 33977.183961.173980.54MIN: 3934.22MIN: 3906.73MIN: 3936.71. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU1232004006008001000SE +/- 1.31, N = 3SE +/- 2.47, N = 3SE +/- 3.41, N = 3931.50926.13933.87MIN: 913.56MIN: 905.47MIN: 908.491. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.0Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU1230.20880.41760.62640.83521.044SE +/- 0.001498, N = 3SE +/- 0.002641, N = 3SE +/- 0.002172, N = 30.9275230.9278200.927802MIN: 0.85MIN: 0.85MIN: 0.841. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Timed Clash Compilation

Time To Compile

OpenBenchmarking.orgSeconds, Fewer Is BetterTimed Clash CompilationTime To Compile12380160240320400SE +/- 0.41, N = 3SE +/- 1.63, N = 3SE +/- 0.36, N = 3386.33387.23387.16


Phoronix Test Suite v10.8.4