Xeon E3 oneDNN 2.0

Intel Xeon E3-1280 v5 testing with an MSI Z170A SLI PLUS (MS-7998) v1.0 (2.A0 BIOS) and ASUS AMD Radeon HD 7850 / R7 265 R9 270 1024SP on Ubuntu 20.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2012094-HA-XEONE3ONE76

Result Identifier   Date Run           Test Duration
1                   December 09 2020   35 Minutes
2                   December 09 2020   35 Minutes
3                   December 09 2020   35 Minutes



Xeon E3 oneDNN 2.0 - System Details (result identifiers 1, 2, 3)

Processor: Intel Xeon E3-1280 v5 @ 4.00GHz (4 Cores / 8 Threads)
Motherboard: MSI Z170A SLI PLUS (MS-7998) v1.0 (2.A0 BIOS)
Chipset: Intel Xeon E3-1200 v5/E3-1500
Memory: 32GB
Disk: 256GB TOSHIBA RD400
Graphics: ASUS AMD Radeon HD 7850 / R7 265 R9 270 1024SP
Audio: Realtek ALC1150
Monitor: VA2431
Network: Intel I219-V
OS: Ubuntu 20.04
Kernel: 5.9.0-050900rc2daily20200826-generic (x86_64) 20200825
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
OpenGL: 4.5 Mesa 20.0.8 (LLVM 10.0.0)
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080

Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  Scaling Governor: intel_pstate powersave
  CPU Microcode: 0xe2
  Thermald 1.9.1

Security Details
  itlb_multihit: KVM: Mitigation of VMX disabled
  l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
  mds: Mitigation of Clear buffers; SMT vulnerable
  meltdown: Mitigation of PTI
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
  srbds: Mitigation of Microcode
  tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable

Result Overview (Phoronix Test Suite): chart of the relative performance of results 1, 2 and 3 across the 18 oneDNN tests; axis range 100% to 102%.

Xeon E3 oneDNN 2.0 - Results Table

onednn test (ms, fewer is better)                         Run 1     Run 2     Run 3
IP Shapes 1D - f32 - CPU                                  8.11023   8.13349   8.10394
IP Shapes 3D - f32 - CPU                                  12.2717   12.2502   12.2610
IP Shapes 1D - u8s8f32 - CPU                              3.66378   3.66363   3.66143
IP Shapes 3D - u8s8f32 - CPU                              3.23290   3.23315   3.22697
Convolution Batch Shapes Auto - f32 - CPU                 20.9101   20.8975   20.9165
Deconvolution Batch shapes_1d - f32 - CPU                 10.5999   10.5924   10.5889
Deconvolution Batch shapes_3d - f32 - CPU                 14.3985   14.3965   14.4297
Convolution Batch Shapes Auto - u8s8f32 - CPU             20.6333   20.5155   20.4948
Deconvolution Batch shapes_1d - u8s8f32 - CPU             11.3253   11.0758   11.3119
Deconvolution Batch shapes_3d - u8s8f32 - CPU             7.43783   7.44242   7.43292
Recurrent Neural Network Training - f32 - CPU             7402.18   7401.84   7402.39
Recurrent Neural Network Inference - f32 - CPU            3950.40   3954.92   3949.85
Recurrent Neural Network Training - u8s8f32 - CPU         7403.96   7400.59   7406.71
Recurrent Neural Network Inference - u8s8f32 - CPU        3952.60   3953.55   3949.87
Matrix Multiply Batch Shapes Transformer - f32 - CPU      5.38679   5.38858   5.37225
Recurrent Neural Network Training - bf16bf16bf16 - CPU    7397.77   7401.56   7397.07
Recurrent Neural Network Inference - bf16bf16bf16 - CPU   3951.82   3954.41   3949.09
Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU  6.90131   6.90507   6.90071
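
The three runs can also be compared numerically. The following is a minimal Python sketch, not part of the Phoronix Test Suite, that copies the per-run averages listed above and computes two illustrative figures: the spread between the fastest and slowest run of each test, and the geometric mean of each run's times relative to run 1. The names `results`, `spread_percent` and `geometric_mean` are made up for this example.

```python
from math import prod

# Average time in ms for runs 1, 2 and 3, copied from the results listed above.
results = {
    "IP Shapes 1D - f32": (8.11023, 8.13349, 8.10394),
    "IP Shapes 3D - f32": (12.2717, 12.2502, 12.2610),
    "IP Shapes 1D - u8s8f32": (3.66378, 3.66363, 3.66143),
    "IP Shapes 3D - u8s8f32": (3.23290, 3.23315, 3.22697),
    "Convolution Batch Shapes Auto - f32": (20.9101, 20.8975, 20.9165),
    "Deconvolution Batch shapes_1d - f32": (10.5999, 10.5924, 10.5889),
    "Deconvolution Batch shapes_3d - f32": (14.3985, 14.3965, 14.4297),
    "Convolution Batch Shapes Auto - u8s8f32": (20.6333, 20.5155, 20.4948),
    "Deconvolution Batch shapes_1d - u8s8f32": (11.3253, 11.0758, 11.3119),
    "Deconvolution Batch shapes_3d - u8s8f32": (7.43783, 7.44242, 7.43292),
    "Recurrent Neural Network Training - f32": (7402.18, 7401.84, 7402.39),
    "Recurrent Neural Network Inference - f32": (3950.40, 3954.92, 3949.85),
    "Recurrent Neural Network Training - u8s8f32": (7403.96, 7400.59, 7406.71),
    "Recurrent Neural Network Inference - u8s8f32": (3952.60, 3953.55, 3949.87),
    "Matrix Multiply Batch Shapes Transformer - f32": (5.38679, 5.38858, 5.37225),
    "Recurrent Neural Network Training - bf16bf16bf16": (7397.77, 7401.56, 7397.07),
    "Recurrent Neural Network Inference - bf16bf16bf16": (3951.82, 3954.41, 3949.09),
    "Matrix Multiply Batch Shapes Transformer - u8s8f32": (6.90131, 6.90507, 6.90071),
}


def spread_percent(values):
    """Return (slowest - fastest) / fastest as a percentage."""
    fastest, slowest = min(values), max(values)
    return (slowest - fastest) / fastest * 100.0


def geometric_mean(values):
    """Geometric mean of a sequence of positive numbers."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


# Run-to-run spread for every test, and the test with the widest spread.
spreads = {name: spread_percent(times) for name, times in results.items()}
widest = max(spreads, key=spreads.get)

# Geometric mean of each run's time relative to run 1 (below 1.0 means faster).
relative_to_run1 = [
    geometric_mean(times[i] / times[0] for times in results.values())
    for i in range(3)
]

for name, pct in sorted(spreads.items(), key=lambda item: -item[1]):
    print(f"{name:52s} {pct:5.2f}%")
print("widest spread:", widest)
print("geometric mean vs run 1:", [round(r, 4) for r in relative_to_run1])
```

With these numbers the widest spread belongs to Deconvolution Batch shapes_1d - u8s8f32, at about 2.25%, which is consistent with the 100% to 102% range of the result overview.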

oneDNN

oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 8.11023; SE +/- 0.01358, N = 3; MIN: 7.93; Min: 8.08 / Avg: 8.11 / Max: 8.13
  Run 2: 8.13349; SE +/- 0.00686, N = 3; MIN: 7.97; Min: 8.12 / Avg: 8.13 / Max: 8.15
  Run 3: 8.10394; SE +/- 0.01643, N = 3; MIN: 7.95; Min: 8.07 / Avg: 8.1 / Max: 8.13
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 12.27; SE +/- 0.02, N = 3; MIN: 12.09; Min: 12.23 / Avg: 12.27 / Max: 12.3
  Run 2: 12.25; SE +/- 0.02, N = 3; MIN: 12.07; Min: 12.21 / Avg: 12.25 / Max: 12.28
  Run 3: 12.26; SE +/- 0.02, N = 3; MIN: 12.05; Min: 12.23 / Avg: 12.26 / Max: 12.29

oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 3.66378; SE +/- 0.00042, N = 3; MIN: 3.63; Min: 3.66 / Avg: 3.66 / Max: 3.66
  Run 2: 3.66363; SE +/- 0.00342, N = 3; MIN: 3.63; Min: 3.66 / Avg: 3.66 / Max: 3.67
  Run 3: 3.66143; SE +/- 0.00315, N = 3; MIN: 3.63; Min: 3.66 / Avg: 3.66 / Max: 3.67

oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 3.23290; SE +/- 0.00893, N = 3; MIN: 3.16; Min: 3.22 / Avg: 3.23 / Max: 3.25
  Run 2: 3.23315; SE +/- 0.00755, N = 3; MIN: 3.16; Min: 3.22 / Avg: 3.23 / Max: 3.24
  Run 3: 3.22697; SE +/- 0.00705, N = 3; MIN: 3.16; Min: 3.21 / Avg: 3.23 / Max: 3.24

oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 20.91; SE +/- 0.01, N = 3; MIN: 20.84; Min: 20.89 / Avg: 20.91 / Max: 20.93
  Run 2: 20.90; SE +/- 0.01, N = 3; MIN: 20.83; Min: 20.88 / Avg: 20.9 / Max: 20.92
  Run 3: 20.92; SE +/- 0.03, N = 3; MIN: 20.82; Min: 20.88 / Avg: 20.92 / Max: 20.98

oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 10.60; SE +/- 0.02, N = 3; MIN: 10.5; Min: 10.57 / Avg: 10.6 / Max: 10.63
  Run 2: 10.59; SE +/- 0.01, N = 3; MIN: 10.5; Min: 10.58 / Avg: 10.59 / Max: 10.61
  Run 3: 10.59; SE +/- 0.01, N = 3; MIN: 10.5; Min: 10.58 / Avg: 10.59 / Max: 10.6

oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 14.40; SE +/- 0.03, N = 3; MIN: 14.2; Min: 14.36 / Avg: 14.4 / Max: 14.45
  Run 2: 14.40; SE +/- 0.01, N = 3; MIN: 14.23; Min: 14.39 / Avg: 14.4 / Max: 14.41
  Run 3: 14.43; SE +/- 0.01, N = 3; MIN: 14.25; Min: 14.41 / Avg: 14.43 / Max: 14.45

oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 20.63; SE +/- 0.08, N = 3; MIN: 20.4; Min: 20.48 / Avg: 20.63 / Max: 20.76
  Run 2: 20.52; SE +/- 0.03, N = 3; MIN: 20.38; Min: 20.49 / Avg: 20.52 / Max: 20.57
  Run 3: 20.49; SE +/- 0.01, N = 3; MIN: 20.39; Min: 20.48 / Avg: 20.49 / Max: 20.52

oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 11.33; SE +/- 0.09, N = 3; MIN: 11.07; Min: 11.15 / Avg: 11.33 / Max: 11.42
  Run 2: 11.08; SE +/- 0.01, N = 3; MIN: 10.98; Min: 11.06 / Avg: 11.08 / Max: 11.09
  Run 3: 11.31; SE +/- 0.12, N = 3; MIN: 10.99; Min: 11.08 / Avg: 11.31 / Max: 11.44
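
The "SE +/- ..., N = 3" figures can be sanity-checked against the Min / Avg / Max values. The sketch below, in Python, takes run 1 of the Deconvolution Batch shapes_1d u8s8f32 test (Min 11.15, Avg 11.3253 from the results table, Max 11.42, SE +/- 0.09). It assumes that SE is the standard error of the mean computed from the sample standard deviation, which this page does not state, and it reconstructs the middle of the three samples from the rounded minimum and maximum, so the result is only approximate.

```python
from math import sqrt
from statistics import stdev

# Run 1 of "Deconvolution Batch shapes_1d - u8s8f32", as reported on this page.
n = 3
reported_min = 11.15     # rounded to two decimals in the source
reported_max = 11.42     # rounded to two decimals in the source
reported_avg = 11.3253   # average from the results table
reported_se = 0.09

# With N = 3 the middle sample follows from the mean:
# avg = (min + middle + max) / 3. Approximate because min and max are rounded.
middle = n * reported_avg - reported_min - reported_max
samples = [reported_min, middle, reported_max]

# Assumption: standard error of the mean = sample standard deviation / sqrt(N).
standard_error = stdev(samples) / sqrt(n)

print("reconstructed samples:", [round(s, 4) for s in samples])
print(f"standard error: {standard_error:.4f} (reported: {reported_se})")
```

Under these assumptions the computed value comes out close to the reported 0.09, which suggests, but does not prove, that this is how the figure is derived.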

oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 7.43783; SE +/- 0.00700, N = 3; MIN: 7.39; Min: 7.43 / Avg: 7.44 / Max: 7.45
  Run 2: 7.44242; SE +/- 0.00268, N = 3; MIN: 7.4; Min: 7.44 / Avg: 7.44 / Max: 7.45
  Run 3: 7.43292; SE +/- 0.01275, N = 3; MIN: 7.37; Min: 7.41 / Avg: 7.43 / Max: 7.45

oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 7402.18; SE +/- 1.79, N = 3; MIN: 7383.43; Min: 7398.75 / Avg: 7402.18 / Max: 7404.81
  Run 2: 7401.84; SE +/- 2.29, N = 3; MIN: 7388.05; Min: 7397.84 / Avg: 7401.84 / Max: 7405.78
  Run 3: 7402.39; SE +/- 2.88, N = 3; MIN: 7389.3; Min: 7396.64 / Avg: 7402.39 / Max: 7405.61

oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 3950.40; SE +/- 0.48, N = 3; MIN: 3942.05; Min: 3949.56 / Avg: 3950.4 / Max: 3951.22
  Run 2: 3954.92; SE +/- 2.81, N = 3; MIN: 3944.12; Min: 3949.47 / Avg: 3954.92 / Max: 3958.82
  Run 3: 3949.85; SE +/- 1.24, N = 3; MIN: 3940.26; Min: 3948.61 / Avg: 3949.85 / Max: 3952.34

oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 7403.96; SE +/- 1.28, N = 3; MIN: 7391.29; Min: 7401.54 / Avg: 7403.96 / Max: 7405.9
  Run 2: 7400.59; SE +/- 2.41, N = 3; MIN: 7384.55; Min: 7397.17 / Avg: 7400.59 / Max: 7405.23
  Run 3: 7406.71; SE +/- 7.65, N = 3; MIN: 7379.88; Min: 7394.67 / Avg: 7406.71 / Max: 7420.9

oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 3952.60; SE +/- 1.74, N = 3; MIN: 3942.3; Min: 3949.62 / Avg: 3952.6 / Max: 3955.64
  Run 2: 3953.55; SE +/- 1.94, N = 3; MIN: 3943.31; Min: 3949.69 / Avg: 3953.55 / Max: 3955.86
  Run 3: 3949.87; SE +/- 0.92, N = 3; MIN: 3944.88; Min: 3948.24 / Avg: 3949.87 / Max: 3951.41

oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 5.38679; SE +/- 0.00577, N = 3; MIN: 5.31; Min: 5.38 / Avg: 5.39 / Max: 5.39
  Run 2: 5.38858; SE +/- 0.00794, N = 3; MIN: 5.32; Min: 5.37 / Avg: 5.39 / Max: 5.4
  Run 3: 5.37225; SE +/- 0.01548, N = 3; MIN: 5.29; Min: 5.34 / Avg: 5.37 / Max: 5.39

oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 7397.77; SE +/- 2.95, N = 3; MIN: 7386.84; Min: 7391.95 / Avg: 7397.77 / Max: 7401.53
  Run 2: 7401.56; SE +/- 2.59, N = 3; MIN: 7390.11; Min: 7397.49 / Avg: 7401.56 / Max: 7406.37
  Run 3: 7397.07; SE +/- 5.25, N = 3; MIN: 7383.82; Min: 7389.72 / Avg: 7397.07 / Max: 7407.24

oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 3951.82; SE +/- 3.51, N = 3; MIN: 3942.06; Min: 3946.25 / Avg: 3951.82 / Max: 3958.3
  Run 2: 3954.41; SE +/- 2.44, N = 3; MIN: 3941.1; Min: 3949.66 / Avg: 3954.41 / Max: 3957.78
  Run 3: 3949.09; SE +/- 1.54, N = 3; MIN: 3941.43; Min: 3947.13 / Avg: 3949.09 / Max: 3952.13

oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 6.90131; SE +/- 0.00304, N = 3; MIN: 6.85; Min: 6.9 / Avg: 6.9 / Max: 6.91
  Run 2: 6.90507; SE +/- 0.00574, N = 3; MIN: 6.85; Min: 6.89 / Avg: 6.91 / Max: 6.91
  Run 3: 6.90071; SE +/- 0.00762, N = 3; MIN: 6.86; Min: 6.89 / Avg: 6.9 / Max: 6.91