Xeon E3 oneDNN 2.0

Intel Xeon E3-1280 v5 testing with an MSI Z170A SLI PLUS (MS-7998) v1.0 (2.A0 BIOS) and ASUS AMD Radeon HD 7850 / R7 265 R9 270 1024SP on Ubuntu 20.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2012094-HA-XEONE3ONE76
Run  Date              Test Duration
1    December 09 2020  35 Minutes
2    December 09 2020  35 Minutes
3    December 09 2020  35 Minutes


Xeon E3 oneDNN 2.0 (system details, identical for runs 1, 2, and 3)

Processor:          Intel Xeon E3-1280 v5 @ 4.00GHz (4 Cores / 8 Threads)
Motherboard:        MSI Z170A SLI PLUS (MS-7998) v1.0 (2.A0 BIOS)
Chipset:            Intel Xeon E3-1200 v5/E3-1500
Memory:             32GB
Disk:               256GB TOSHIBA RD400
Graphics:           ASUS AMD Radeon HD 7850 / R7 265 R9 270 1024SP
Audio:              Realtek ALC1150
Monitor:            VA2431
Network:            Intel I219-V
OS:                 Ubuntu 20.04
Kernel:             5.9.0-050900rc2daily20200826-generic (x86_64) 20200825
Desktop:            GNOME Shell 3.36.4
Display Server:     X Server 1.20.8
Display Driver:     modesetting 1.20.8
OpenGL:             4.5 Mesa 20.0.8 (LLVM 10.0.0)
Compiler:           GCC 9.3.0
File-System:        ext4
Screen Resolution:  1920x1080

Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  Scaling Governor: intel_pstate powersave
  CPU Microcode: 0xe2
  Thermald 1.9.1

Security Details
  itlb_multihit: KVM: Mitigation of VMX disabled
  l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
  mds: Mitigation of Clear buffers; SMT vulnerable
  meltdown: Mitigation of PTI
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
  srbds: Mitigation of Microcode
  tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable

[Chart: Result Overview (Phoronix Test Suite). Relative results of runs 1, 2, and 3 across the 18 oneDNN tests; axis range 100% to 102%.]

Xeon E3 oneDNN 2.0 (results in ms, fewer is better; source: OpenBenchmarking.org)

Test                                                           Run 1     Run 2     Run 3
onednn: IP Shapes 1D - f32 - CPU                               8.11023   8.13349   8.10394
onednn: IP Shapes 3D - f32 - CPU                               12.2717   12.2502   12.2610
onednn: IP Shapes 1D - u8s8f32 - CPU                           3.66378   3.66363   3.66143
onednn: IP Shapes 3D - u8s8f32 - CPU                           3.23290   3.23315   3.22697
onednn: Convolution Batch Shapes Auto - f32 - CPU              20.9101   20.8975   20.9165
onednn: Deconvolution Batch shapes_1d - f32 - CPU              10.5999   10.5924   10.5889
onednn: Deconvolution Batch shapes_3d - f32 - CPU              14.3985   14.3965   14.4297
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU          20.6333   20.5155   20.4948
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU          11.3253   11.0758   11.3119
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU          7.43783   7.44242   7.43292
onednn: Recurrent Neural Network Training - f32 - CPU          7402.18   7401.84   7402.39
onednn: Recurrent Neural Network Inference - f32 - CPU         3950.40   3954.92   3949.85
onednn: Recurrent Neural Network Training - u8s8f32 - CPU      7403.96   7400.59   7406.71
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU     3952.60   3953.55   3949.87
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU   5.38679   5.38858   5.37225
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU 7397.77   7401.56   7397.07
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU 3951.82  3954.41   3949.09
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU 6.90131 6.90507   6.90071
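The three runs can be compared numerically from the table values alone. The following is a minimal sketch, not the Phoronix Test Suite's own calculation: it takes the per-test average times (ms) from the table, normalizes each run to run 1, and reports the geometric mean of those time ratios, plus the test with the largest run-to-run spread. The names RESULTS, geometric_mean, relative_to_baseline, and largest_spread are hypothetical helpers introduced for illustration.

```python
import math

# Average time in ms for (run 1, run 2, run 3), copied from the results table.
RESULTS = {
    "IP Shapes 1D - f32": (8.11023, 8.13349, 8.10394),
    "IP Shapes 3D - f32": (12.2717, 12.2502, 12.2610),
    "IP Shapes 1D - u8s8f32": (3.66378, 3.66363, 3.66143),
    "IP Shapes 3D - u8s8f32": (3.23290, 3.23315, 3.22697),
    "Convolution Batch Shapes Auto - f32": (20.9101, 20.8975, 20.9165),
    "Deconvolution Batch shapes_1d - f32": (10.5999, 10.5924, 10.5889),
    "Deconvolution Batch shapes_3d - f32": (14.3985, 14.3965, 14.4297),
    "Convolution Batch Shapes Auto - u8s8f32": (20.6333, 20.5155, 20.4948),
    "Deconvolution Batch shapes_1d - u8s8f32": (11.3253, 11.0758, 11.3119),
    "Deconvolution Batch shapes_3d - u8s8f32": (7.43783, 7.44242, 7.43292),
    "Recurrent Neural Network Training - f32": (7402.18, 7401.84, 7402.39),
    "Recurrent Neural Network Inference - f32": (3950.40, 3954.92, 3949.85),
    "Recurrent Neural Network Training - u8s8f32": (7403.96, 7400.59, 7406.71),
    "Recurrent Neural Network Inference - u8s8f32": (3952.60, 3953.55, 3949.87),
    "Matrix Multiply Batch Shapes Transformer - f32": (5.38679, 5.38858, 5.37225),
    "Recurrent Neural Network Training - bf16bf16bf16": (7397.77, 7401.56, 7397.07),
    "Recurrent Neural Network Inference - bf16bf16bf16": (3951.82, 3954.41, 3949.09),
    "Matrix Multiply Batch Shapes Transformer - u8s8f32": (6.90131, 6.90507, 6.90071),
}


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def relative_to_baseline(results, baseline=0):
    """Per-run geometric mean of (run time / baseline run time) over all tests.

    Values below 1.0 mean the run was faster than the baseline on balance,
    because the underlying metric is time (fewer is better).
    """
    n_runs = len(next(iter(results.values())))
    return [
        geometric_mean(times[run] / times[baseline] for times in results.values())
        for run in range(n_runs)
    ]


def largest_spread(results):
    """Name of the test whose slowest/fastest run ratio is largest."""
    return max(results, key=lambda name: max(results[name]) / min(results[name]))


if __name__ == "__main__":
    for run, ratio in enumerate(relative_to_baseline(RESULTS), start=1):
        print(f"run {run}: geometric mean time ratio vs run 1 = {ratio:.4f}")
    print("largest run-to-run spread:", largest_spread(RESULTS))
```

Because all three runs use the same hardware and software, the ratios are expected to sit very close to 1.0; the sketch is mainly useful for spotting which test varies most between runs.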

oneDNN

oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result   SE +/-   N  MIN   Min / Avg / Max
3    8.10394  0.01643  3  7.95  8.07 / 8.1 / 8.13
2    8.13349  0.00686  3  7.97  8.12 / 8.13 / 8.15
1    8.11023  0.01358  3  7.93  8.08 / 8.11 / 8.13
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result  SE +/-  N  MIN    Min / Avg / Max
3    12.26   0.02    3  12.05  12.23 / 12.26 / 12.29
2    12.25   0.02    3  12.07  12.21 / 12.25 / 12.28
1    12.27   0.02    3  12.09  12.23 / 12.27 / 12.3
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result   SE +/-   N  MIN   Min / Avg / Max
3    3.66143  0.00315  3  3.63  3.66 / 3.66 / 3.67
2    3.66363  0.00342  3  3.63  3.66 / 3.66 / 3.67
1    3.66378  0.00042  3  3.63  3.66 / 3.66 / 3.66
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result   SE +/-   N  MIN   Min / Avg / Max
3    3.22697  0.00705  3  3.16  3.21 / 3.23 / 3.24
2    3.23315  0.00755  3  3.16  3.22 / 3.23 / 3.24
1    3.23290  0.00893  3  3.16  3.22 / 3.23 / 3.25
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result  SE +/-  N  MIN    Min / Avg / Max
3    20.92   0.03    3  20.82  20.88 / 20.92 / 20.98
2    20.90   0.01    3  20.83  20.88 / 20.9 / 20.92
1    20.91   0.01    3  20.84  20.89 / 20.91 / 20.93
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result  SE +/-  N  MIN   Min / Avg / Max
3    10.59   0.01    3  10.5  10.58 / 10.59 / 10.6
2    10.59   0.01    3  10.5  10.58 / 10.59 / 10.61
1    10.60   0.02    3  10.5  10.57 / 10.6 / 10.63
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result  SE +/-  N  MIN    Min / Avg / Max
3    14.43   0.01    3  14.25  14.41 / 14.43 / 14.45
2    14.40   0.01    3  14.23  14.39 / 14.4 / 14.41
1    14.40   0.03    3  14.2   14.36 / 14.4 / 14.45
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result  SE +/-  N  MIN    Min / Avg / Max
3    20.49   0.01    3  20.39  20.48 / 20.49 / 20.52
2    20.52   0.03    3  20.38  20.49 / 20.52 / 20.57
1    20.63   0.08    3  20.4   20.48 / 20.63 / 20.76
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result  SE +/-  N  MIN    Min / Avg / Max
3    11.31   0.12    3  10.99  11.08 / 11.31 / 11.44
2    11.08   0.01    3  10.98  11.06 / 11.08 / 11.09
1    11.33   0.09    3  11.07  11.15 / 11.33 / 11.42
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result   SE +/-   N  MIN   Min / Avg / Max
3    7.43292  0.01275  3  7.37  7.41 / 7.43 / 7.45
2    7.44242  0.00268  3  7.4   7.44 / 7.44 / 7.45
1    7.43783  0.00700  3  7.39  7.43 / 7.44 / 7.45
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result   SE +/-  N  MIN      Min / Avg / Max
3    7402.39  2.88    3  7389.3   7396.64 / 7402.39 / 7405.61
2    7401.84  2.29    3  7388.05  7397.84 / 7401.84 / 7405.78
1    7402.18  1.79    3  7383.43  7398.75 / 7402.18 / 7404.81
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result   SE +/-  N  MIN      Min / Avg / Max
3    3949.85  1.24    3  3940.26  3948.61 / 3949.85 / 3952.34
2    3954.92  2.81    3  3944.12  3949.47 / 3954.92 / 3958.82
1    3950.40  0.48    3  3942.05  3949.56 / 3950.4 / 3951.22
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result   SE +/-  N  MIN      Min / Avg / Max
3    7406.71  7.65    3  7379.88  7394.67 / 7406.71 / 7420.9
2    7400.59  2.41    3  7384.55  7397.17 / 7400.59 / 7405.23
1    7403.96  1.28    3  7391.29  7401.54 / 7403.96 / 7405.9
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result   SE +/-  N  MIN      Min / Avg / Max
3    3949.87  0.92    3  3944.88  3948.24 / 3949.87 / 3951.41
2    3953.55  1.94    3  3943.31  3949.69 / 3953.55 / 3955.86
1    3952.60  1.74    3  3942.3   3949.62 / 3952.6 / 3955.64
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result   SE +/-   N  MIN   Min / Avg / Max
3    5.37225  0.01548  3  5.29  5.34 / 5.37 / 5.39
2    5.38858  0.00794  3  5.32  5.37 / 5.39 / 5.4
1    5.38679  0.00577  3  5.31  5.38 / 5.39 / 5.39
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result   SE +/-  N  MIN      Min / Avg / Max
3    7397.07  5.25    3  7383.82  7389.72 / 7397.07 / 7407.24
2    7401.56  2.59    3  7390.11  7397.49 / 7401.56 / 7406.37
1    7397.77  2.95    3  7386.84  7391.95 / 7397.77 / 7401.53
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result   SE +/-  N  MIN      Min / Avg / Max
3    3949.09  1.54    3  3941.43  3947.13 / 3949.09 / 3952.13
2    3954.41  2.44    3  3941.1   3949.66 / 3954.41 / 3957.78
1    3951.82  3.51    3  3942.06  3946.25 / 3951.82 / 3958.3
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better (OpenBenchmarking.org)
Run  Result   SE +/-   N  MIN   Min / Avg / Max
3    6.90071  0.00762  3  6.86  6.89 / 6.9 / 6.91
2    6.90507  0.00574  3  6.85  6.89 / 6.91 / 6.91
1    6.90131  0.00304  3  6.85  6.9 / 6.9 / 6.91
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread