Xeon E onednn
Intel Xeon E-2288G testing with a Compulab SBC-ATCFL v1.2 (ATOP3.PRD.0.29.2 BIOS) and NVIDIA Quadro RTX 4000 8GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2103135-FI-XEONEONED91.
System Details (identical for runs 1, 2, and 3)

Processor:         Intel Xeon E-2288G @ 5.00GHz (8 Cores / 16 Threads)
Motherboard:       Compulab SBC-ATCFL v1.2 (ATOP3.PRD.0.29.2 BIOS)
Chipset:           Intel Cannon Lake PCH
Memory:            2 x 32 GB DDR4-2667MT/s Samsung M378A4G43MB1-CTD
Disk:              Samsung SSD 970 EVO Plus 250GB
Graphics:          NVIDIA Quadro RTX 4000 8GB
Audio:             Intel Cannon Lake PCH cAVS
Network:           Intel I219-LM + Intel I210
OS:                Ubuntu 20.10
Kernel:            5.8.0-41-generic (x86_64)
Desktop:           GNOME Shell 3.38.2
Display Server:    X Server 1.20.9
Display Driver:    modesetting 1.20.9
OpenCL:            OpenCL 1.2 CUDA 11.2.109
Vulkan:            1.2.155
Compiler:          GCC 10.2.0
File-System:       ext4
Screen Resolution: 1920x1080

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xde

Security Details:
  itlb_multihit:     KVM: Mitigation of VMX disabled
  l1tf:              Not affected
  mds:               Not affected
  meltdown:          Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1:        Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2:        Mitigation of Enhanced IBRS IBPB: conditional RSB filling
  srbds:             Mitigation of TSX disabled
  tsx_async_abort:   Mitigation of TSX disabled
Result Summary
oneDNN 2.1.2; all results in ms (fewer is better); columns 1-3 are the three benchmark runs.

Harness - Data Type - Engine                             |       1 |       2 |       3
IP Shapes 1D - f32 - CPU                                 | 4.27720 | 4.39480 | 4.34701
IP Shapes 3D - f32 - CPU                                 | 10.8991 | 10.9373 | 10.8040
IP Shapes 1D - u8s8f32 - CPU                             | 1.87497 | 1.96286 | 1.93683
IP Shapes 3D - u8s8f32 - CPU                             | 2.43886 | 2.40681 | 2.44700
Convolution Batch Shapes Auto - f32 - CPU                | 17.9347 | 17.8492 | 18.0086
Deconvolution Batch shapes_1d - f32 - CPU                | 8.53953 | 8.35664 | 8.76603
Deconvolution Batch shapes_3d - f32 - CPU                | 6.95774 | 6.92709 | 6.98955
Convolution Batch Shapes Auto - u8s8f32 - CPU            | 17.2639 | 17.2105 | 17.1121
Deconvolution Batch shapes_1d - u8s8f32 - CPU            | 2.48126 | 2.48146 | 2.50934
Deconvolution Batch shapes_3d - u8s8f32 - CPU            | 3.89009 | 3.91546 | 3.92078
Recurrent Neural Network Training - f32 - CPU            | 4179.63 | 4185.77 | 4182.36
Recurrent Neural Network Inference - f32 - CPU           | 2308.21 | 2314.85 | 2325.80
Recurrent Neural Network Training - u8s8f32 - CPU        | 4194.52 | 4186.18 | 4189.69
Recurrent Neural Network Inference - u8s8f32 - CPU       | 2301.83 | 2312.66 | 2313.22
Matrix Multiply Batch Shapes Transformer - f32 - CPU     | 4.22402 | 4.23861 | 4.23955
Recurrent Neural Network Training - bf16bf16bf16 - CPU   | 4192.73 | 4188.73 | 4224.76
Recurrent Neural Network Inference - bf16bf16bf16 - CPU  | 2304.72 | 2319.55 | 2318.11
Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 3.85844 | 3.88706 | 3.87967
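The three runs are tightly clustered across the board. A quick sanity check is the run-to-run spread (gap between the slowest and fastest run as a percentage of the fastest); the sketch below computes it for a few rows copied from the summary table. The `spread_pct` helper is illustrative, not part of the Phoronix Test Suite.

```python
# Run-to-run spread for a few of the oneDNN results above.
# Values are copied verbatim from the summary table (ms, fewer is better).
results = {
    "IP Shapes 1D - f32": (4.27720, 4.39480, 4.34701),
    "Deconvolution Batch shapes_1d - f32": (8.53953, 8.35664, 8.76603),
    "Recurrent Neural Network Training - f32": (4179.63, 4185.77, 4182.36),
}

def spread_pct(runs):
    """Percent gap between the slowest and fastest run (lower = more repeatable)."""
    return (max(runs) - min(runs)) / min(runs) * 100.0

for name, runs in results.items():
    print(f"{name}: {spread_pct(runs):.2f}% spread")
```

On these numbers the RNN workloads repeat to well under one percent, while the small IP Shapes and deconvolution kernels vary a few percent between runs, consistent with their larger per-run standard errors below.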
oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 4.27720 (SE +/- 0.03226, N = 15, MIN: 3.59)
  2: 4.39480 (SE +/- 0.04781, N = 15, MIN: 3.59)
  3: 4.34701 (SE +/- 0.03852, N = 15, MIN: 3.6)
All tests below were built with identical flags - (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 2.1.2 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 10.8991 (SE +/- 0.02, N = 3, MIN: 9.99)
  2: 10.9373 (SE +/- 0.03, N = 3, MIN: 9.98)
  3: 10.8040 (SE +/- 0.07, N = 3, MIN: 9.97)
oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 1.87497 (SE +/- 0.02166, N = 6, MIN: 1.55)
  2: 1.96286 (SE +/- 0.00219, N = 3, MIN: 1.53)
  3: 1.93683 (SE +/- 0.01631, N = 15, MIN: 1.53)
oneDNN 2.1.2 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 2.43886 (SE +/- 0.02338, N = 9, MIN: 2.2)
  2: 2.40681 (SE +/- 0.01956, N = 3, MIN: 2.2)
  3: 2.44700 (SE +/- 0.02297, N = 3, MIN: 2.22)
oneDNN 2.1.2 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 17.9347 (SE +/- 0.12, N = 3, MIN: 17.08)
  2: 17.8492 (SE +/- 0.09, N = 3, MIN: 17.12)
  3: 18.0086 (SE +/- 0.07, N = 3, MIN: 17.16)
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 8.53953 (SE +/- 0.11858, N = 15, MIN: 4.45)
  2: 8.35664 (SE +/- 0.00541, N = 3, MIN: 5.06)
  3: 8.76603 (SE +/- 0.15702, N = 12, MIN: 4.84)
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 6.95774 (SE +/- 0.01715, N = 3, MIN: 6.58)
  2: 6.92709 (SE +/- 0.02867, N = 3, MIN: 6.55)
  3: 6.98955 (SE +/- 0.10068, N = 4, MIN: 6.57)
oneDNN 2.1.2 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 17.2639 (SE +/- 0.02, N = 3, MIN: 16.39)
  2: 17.2105 (SE +/- 0.06, N = 3, MIN: 16.35)
  3: 17.1121 (SE +/- 0.01, N = 3, MIN: 16.37)
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 2.48126 (SE +/- 0.00606, N = 3, MIN: 2.21)
  2: 2.48146 (SE +/- 0.00925, N = 3, MIN: 2.22)
  3: 2.50934 (SE +/- 0.01499, N = 3, MIN: 2.08)
oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 3.89009 (SE +/- 0.02275, N = 3, MIN: 3.66)
  2: 3.91546 (SE +/- 0.02302, N = 3, MIN: 3.68)
  3: 3.92078 (SE +/- 0.00661, N = 3, MIN: 3.69)
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 4179.63 (SE +/- 19.54, N = 3, MIN: 4110.36)
  2: 4185.77 (SE +/- 5.52, N = 3, MIN: 4137.54)
  3: 4182.36 (SE +/- 8.62, N = 3, MIN: 4131.32)
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 2308.21 (SE +/- 2.57, N = 3, MIN: 2273.4)
  2: 2314.85 (SE +/- 13.89, N = 3, MIN: 2265.89)
  3: 2325.80 (SE +/- 15.48, N = 3, MIN: 2264.93)
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 4194.52 (SE +/- 25.41, N = 3, MIN: 4124.44)
  2: 4186.18 (SE +/- 3.89, N = 3, MIN: 4137.73)
  3: 4189.69 (SE +/- 4.37, N = 3, MIN: 4128.17)
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 2301.83 (SE +/- 5.51, N = 3, MIN: 2256.72)
  2: 2312.66 (SE +/- 4.70, N = 3, MIN: 2275.31)
  3: 2313.22 (SE +/- 3.66, N = 3, MIN: 2276.22)
oneDNN 2.1.2 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, fewer is better)
  1: 4.22402 (SE +/- 0.01514, N = 3, MIN: 3.84)
  2: 4.23861 (SE +/- 0.00373, N = 3, MIN: 3.86)
  3: 4.23955 (SE +/- 0.03388, N = 3, MIN: 3.84)
oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  1: 4192.73 (SE +/- 15.97, N = 3, MIN: 4116.8)
  2: 4188.73 (SE +/- 7.85, N = 3, MIN: 4141.92)
  3: 4224.76 (SE +/- 23.95, N = 3, MIN: 4158.66)
oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  1: 2304.72 (SE +/- 0.71, N = 3, MIN: 2267.17)
  2: 2319.55 (SE +/- 11.13, N = 3, MIN: 2266.99)
  3: 2318.11 (SE +/- 4.06, N = 3, MIN: 2271)
oneDNN 2.1.2 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  1: 3.85844 (SE +/- 0.00671, N = 3, MIN: 3.24)
  2: 3.88706 (SE +/- 0.01710, N = 3, MIN: 3.17)
  3: 3.87967 (SE +/- 0.00607, N = 3, MIN: 3.24)
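The "SE +/-" figure attached to each result is a standard error over that run's N trials. The export does not state the exact estimator the Phoronix Test Suite uses, so the sketch below shows the textbook formula (sample standard deviation divided by the square root of N); the trial timings in the example are hypothetical, since per-trial values are not part of this export.

```python
import math

def standard_error(samples):
    """Standard error of the mean: sample stddev (n-1 denominator) / sqrt(n)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return math.sqrt(variance / n)

# Hypothetical per-trial timings (ms) for a single benchmark run; the real
# trial-level data behind the "SE +/-, N =" figures is not in this export.
trials = [4.31, 4.25, 4.40, 4.22, 4.35]
mean = sum(trials) / len(trials)
print(f"mean = {mean:.5f} ms, SE +/- {standard_error(trials):.5f}, N = {len(trials)}")
```

A small SE relative to the mean (as in most blocks above) indicates the trials within a run were consistent; runs with elevated SE, such as Deconvolution Batch shapes_1d, also show the widest run-to-run gaps.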
Phoronix Test Suite v10.8.4