Xeon E3 oneDNN 2.0

Intel Xeon E3-1280 v5 testing with an MSI Z170A SLI PLUS (MS-7998) v1.0 (2.A0 BIOS) and ASUS AMD Radeon HD 7850 / R7 265 R9 270 1024SP on Ubuntu 20.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2012094-HA-XEONE3ONE76.
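
This comparison can be re-run locally against the public result file; below is a minimal sketch, assuming the Phoronix Test Suite is installed and on the PATH (the result ID comes from the URL above).

    # Minimal sketch: fetch the public OpenBenchmarking.org result and re-run
    # the same oneDNN test selection locally so the new numbers can be
    # compared against this result. Assumes phoronix-test-suite is installed.
    import subprocess

    RESULT_ID = "2012094-HA-XEONE3ONE76"  # result ID from the URL above

    subprocess.run(["phoronix-test-suite", "benchmark", RESULT_ID], check=True)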

Xeon E3 oneDNN 2.0 - System Details (identical for configurations 1, 2, and 3)

Processor: Intel Xeon E3-1280 v5 @ 4.00GHz (4 Cores / 8 Threads)
Motherboard: MSI Z170A SLI PLUS (MS-7998) v1.0 (2.A0 BIOS)
Chipset: Intel Xeon E3-1200 v5/E3-1500
Memory: 32GB
Disk: 256GB TOSHIBA RD400
Graphics: ASUS AMD Radeon HD 7850 / R7 265 R9 270 1024SP
Audio: Realtek ALC1150
Monitor: VA2431
Network: Intel I219-V
OS: Ubuntu 20.04
Kernel: 5.9.0-050900rc2daily20200826-generic (x86_64) 20200825
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
OpenGL: 4.5 Mesa 20.0.8 (LLVM 10.0.0)
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xe2 - Thermald 1.9.1

Security Details: itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable + mds: Mitigation of Clear buffers; SMT vulnerable + meltdown: Mitigation of PTI + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Mitigation of Microcode + tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable
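
The Compiler Details above are GCC's configure-time options. On the system under test they can be checked against the installed compiler itself; a minimal sketch follows, assuming the listed GCC 9.3.0 is the default gcc on the PATH.

    # Print GCC's version banner, which includes the "Configured with:" line
    # that should match the Compiler Details recorded above.
    import subprocess

    out = subprocess.run(["gcc", "-v"], capture_output=True, text=True)
    print(out.stderr)  # gcc -v writes its configuration banner to stderr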

Xeon E3 oneDNN 2.0 - Results Overview (onednn; all values in ms, fewer is better)

Harness - Data Type - Engine                                    1          2          3
IP Shapes 1D - f32 - CPU                                        8.11023    8.13349    8.10394
IP Shapes 3D - f32 - CPU                                        12.2717    12.2502    12.2610
IP Shapes 1D - u8s8f32 - CPU                                    3.66378    3.66363    3.66143
IP Shapes 3D - u8s8f32 - CPU                                    3.23290    3.23315    3.22697
Convolution Batch Shapes Auto - f32 - CPU                       20.9101    20.8975    20.9165
Deconvolution Batch shapes_1d - f32 - CPU                       10.5999    10.5924    10.5889
Deconvolution Batch shapes_3d - f32 - CPU                       14.3985    14.3965    14.4297
Convolution Batch Shapes Auto - u8s8f32 - CPU                   20.6333    20.5155    20.4948
Deconvolution Batch shapes_1d - u8s8f32 - CPU                   11.3253    11.0758    11.3119
Deconvolution Batch shapes_3d - u8s8f32 - CPU                   7.43783    7.44242    7.43292
Recurrent Neural Network Training - f32 - CPU                   7402.18    7401.84    7402.39
Recurrent Neural Network Inference - f32 - CPU                  3950.40    3954.92    3949.85
Recurrent Neural Network Training - u8s8f32 - CPU               7403.96    7400.59    7406.71
Recurrent Neural Network Inference - u8s8f32 - CPU              3952.60    3953.55    3949.87
Matrix Multiply Batch Shapes Transformer - f32 - CPU            5.38679    5.38858    5.37225
Recurrent Neural Network Training - bf16bf16bf16 - CPU          7397.77    7401.56    7397.07
Recurrent Neural Network Inference - bf16bf16bf16 - CPU         3951.82    3954.41    3949.09
Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU        6.90131    6.90507    6.90071

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
1: 8.11023 (SE +/- 0.01358, N = 3; MIN: 7.93)
2: 8.13349 (SE +/- 0.00686, N = 3; MIN: 7.97)
3: 8.10394 (SE +/- 0.01643, N = 3; MIN: 7.95)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
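
The "SE +/- ..., N = 3" figures in these result blocks are the standard error of the mean over each configuration's three trial runs; the raw per-trial samples are not part of this export, so the sketch below uses hypothetical sample values purely to illustrate how such a figure is derived.

    # Minimal sketch: mean and standard error of the mean over N = 3 trials.
    # The sample timings below are placeholders, not data from this result.
    from math import sqrt
    from statistics import mean, stdev

    samples_ms = [8.11, 8.13, 8.09]  # hypothetical per-trial timings (ms)

    avg = mean(samples_ms)
    se = stdev(samples_ms) / sqrt(len(samples_ms))  # standard error of the mean

    print(f"{avg:.5f} (SE +/- {se:.5f}, N = {len(samples_ms)})")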

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
1: 12.27 (SE +/- 0.02, N = 3; MIN: 12.09)
2: 12.25 (SE +/- 0.02, N = 3; MIN: 12.07)
3: 12.26 (SE +/- 0.02, N = 3; MIN: 12.05)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
1: 3.66378 (SE +/- 0.00042, N = 3; MIN: 3.63)
2: 3.66363 (SE +/- 0.00342, N = 3; MIN: 3.63)
3: 3.66143 (SE +/- 0.00315, N = 3; MIN: 3.63)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
1: 3.23290 (SE +/- 0.00893, N = 3; MIN: 3.16)
2: 3.23315 (SE +/- 0.00755, N = 3; MIN: 3.16)
3: 3.22697 (SE +/- 0.00705, N = 3; MIN: 3.16)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
1: 20.91 (SE +/- 0.01, N = 3; MIN: 20.84)
2: 20.90 (SE +/- 0.01, N = 3; MIN: 20.83)
3: 20.92 (SE +/- 0.03, N = 3; MIN: 20.82)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
1: 10.60 (SE +/- 0.02, N = 3; MIN: 10.5)
2: 10.59 (SE +/- 0.01, N = 3; MIN: 10.5)
3: 10.59 (SE +/- 0.01, N = 3; MIN: 10.5)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
1: 14.40 (SE +/- 0.03, N = 3; MIN: 14.2)
2: 14.40 (SE +/- 0.01, N = 3; MIN: 14.23)
3: 14.43 (SE +/- 0.01, N = 3; MIN: 14.25)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
1: 20.63 (SE +/- 0.08, N = 3; MIN: 20.4)
2: 20.52 (SE +/- 0.03, N = 3; MIN: 20.38)
3: 20.49 (SE +/- 0.01, N = 3; MIN: 20.39)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
1: 11.33 (SE +/- 0.09, N = 3; MIN: 11.07)
2: 11.08 (SE +/- 0.01, N = 3; MIN: 10.98)
3: 11.31 (SE +/- 0.12, N = 3; MIN: 10.99)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
1: 7.43783 (SE +/- 0.00700, N = 3; MIN: 7.39)
2: 7.44242 (SE +/- 0.00268, N = 3; MIN: 7.4)
3: 7.43292 (SE +/- 0.01275, N = 3; MIN: 7.37)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
1: 7402.18 (SE +/- 1.79, N = 3; MIN: 7383.43)
2: 7401.84 (SE +/- 2.29, N = 3; MIN: 7388.05)
3: 7402.39 (SE +/- 2.88, N = 3; MIN: 7389.3)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
1: 3950.40 (SE +/- 0.48, N = 3; MIN: 3942.05)
2: 3954.92 (SE +/- 2.81, N = 3; MIN: 3944.12)
3: 3949.85 (SE +/- 1.24, N = 3; MIN: 3940.26)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
1: 7403.96 (SE +/- 1.28, N = 3; MIN: 7391.29)
2: 7400.59 (SE +/- 2.41, N = 3; MIN: 7384.55)
3: 7406.71 (SE +/- 7.65, N = 3; MIN: 7379.88)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
1: 3952.60 (SE +/- 1.74, N = 3; MIN: 3942.3)
2: 3953.55 (SE +/- 1.94, N = 3; MIN: 3943.31)
3: 3949.87 (SE +/- 0.92, N = 3; MIN: 3944.88)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

ms, Fewer Is Better
1: 5.38679 (SE +/- 0.00577, N = 3; MIN: 5.31)
2: 5.38858 (SE +/- 0.00794, N = 3; MIN: 5.32)
3: 5.37225 (SE +/- 0.01548, N = 3; MIN: 5.29)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better
1: 7397.77 (SE +/- 2.95, N = 3; MIN: 7386.84)
2: 7401.56 (SE +/- 2.59, N = 3; MIN: 7390.11)
3: 7397.07 (SE +/- 5.25, N = 3; MIN: 7383.82)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

ms, Fewer Is Better
1: 3951.82 (SE +/- 3.51, N = 3; MIN: 3942.06)
2: 3954.41 (SE +/- 2.44, N = 3; MIN: 3941.1)
3: 3949.09 (SE +/- 1.54, N = 3; MIN: 3941.43)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

ms, Fewer Is Better
1: 6.90131 (SE +/- 0.00304, N = 3; MIN: 6.85)
2: 6.90507 (SE +/- 0.00574, N = 3; MIN: 6.85)
3: 6.90071 (SE +/- 0.00762, N = 3; MIN: 6.86)
(CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread


Phoronix Test Suite v10.8.4