oneDNN Xeon 2 x Intel Xeon Gold 5220R testing with a TYAN S7106 (V2.01.B40 BIOS) and ASPEED on Ubuntu 20.04 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2006273-NE-ONEDNNXEO05 2 x Intel Xeon Gold 5220R Processor: 2 x Intel Xeon Gold 5220R @ 3.90GHz (36 Cores / 72 Threads), Motherboard: TYAN S7106 (V2.01.B40 BIOS), Chipset: Intel Sky Lake-E DMI3 Registers, Memory: 94GB, Disk: 500GB Samsung SSD 860, Graphics: ASPEED, Network: 2 x Intel I210 + 2 x QLogic cLOM8214 1/10GbE
OS: Ubuntu 20.04, Kernel: 5.8.0-rc2-phx-fgkaslr (x86_64) 20200624, Desktop: GNOME Shell 3.36.1, Display Server: X Server 1.20.8, Display Driver: modesetting 1.20.8, Compiler: GCC 9.3.0, File-System: ext4, Screen Resolution: 1024x768
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: intel_pstate powersave - CPU Microcode: 0x5002f01Security Notes: itlb_multihit: KVM: Mitigation of Split huge pages + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Mitigation of TSX disabled
oneDNN Xeon OpenBenchmarking.org Phoronix Test Suite 2 x Intel Xeon Gold 5220R @ 3.90GHz (36 Cores / 72 Threads) TYAN S7106 (V2.01.B40 BIOS) Intel Sky Lake-E DMI3 Registers 94GB 500GB Samsung SSD 860 ASPEED 2 x Intel I210 + 2 x QLogic cLOM8214 1/10GbE Ubuntu 20.04 5.8.0-rc2-phx-fgkaslr (x86_64) 20200624 GNOME Shell 3.36.1 X Server 1.20.8 modesetting 1.20.8 GCC 9.3.0 ext4 1024x768 Processor Motherboard Chipset Memory Disk Graphics Network OS Kernel Desktop Display Server Display Driver Compiler File-System Screen Resolution OneDNN Xeon Benchmarks System Logs - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: intel_pstate powersave - CPU Microcode: 0x5002f01 - itlb_multihit: KVM: Mitigation of Split huge pages + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Mitigation of TSX disabled
oneDNN Xeon onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: Deconvolution Batch deconv_3d - bf16bf16bf16 - CPU onednn: Deconvolution Batch deconv_1d - bf16bf16bf16 - CPU onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Deconvolution Batch deconv_3d - u8s8f32 - CPU onednn: Deconvolution Batch deconv_1d - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Deconvolution Batch deconv_3d - f32 - CPU onednn: Deconvolution Batch deconv_1d - f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: IP Batch All - bf16bf16bf16 - CPU onednn: IP Batch 1D - bf16bf16bf16 - CPU onednn: IP Batch All - u8s8f32 - CPU onednn: IP Batch 1D - u8s8f32 - CPU onednn: IP Batch All - f32 - CPU onednn: IP Batch 1D - f32 - CPU 2 x Intel Xeon Gold 5220R 1.45097 0.293232 0.521635 9.49159 7.39127 6.41087 79.0327 223.824 0.688451 0.548081 7.05818 2.70383 1.96226 7.46084 51.1468 5.68213 6.79487 1.85258 27.3550 1.74752 OpenBenchmarking.org
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the oneAPI initiative. Learn more via the OpenBenchmarking.org test page.
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU 2 x Intel Xeon Gold 5220R 0.3265 0.653 0.9795 1.306 1.6325 SE +/- 0.00165, N = 3 1.45097 MIN: 1.41 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU 2 x Intel Xeon Gold 5220R 0.066 0.132 0.198 0.264 0.33 SE +/- 0.000943, N = 3 0.293232 MIN: 0.27 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU 2 x Intel Xeon Gold 5220R 0.1174 0.2348 0.3522 0.4696 0.587 SE +/- 0.000931, N = 3 0.521635 MIN: 0.5 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Deconvolution Batch deconv_3d - Data Type: bf16bf16bf16 - Engine: CPU 2 x Intel Xeon Gold 5220R 3 6 9 12 15 SE +/- 0.02911, N = 3 9.49159 MIN: 9.34 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Deconvolution Batch deconv_1d - Data Type: bf16bf16bf16 - Engine: CPU 2 x Intel Xeon Gold 5220R 2 4 6 8 10 SE +/- 0.00572, N = 3 7.39127 MIN: 7.23 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU 2 x Intel Xeon Gold 5220R 2 4 6 8 10 SE +/- 0.00374, N = 3 6.41087 MIN: 6.3 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU 2 x Intel Xeon Gold 5220R 20 40 60 80 100 SE +/- 0.96, N = 5 79.03 MIN: 75.2 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU 2 x Intel Xeon Gold 5220R 50 100 150 200 250 SE +/- 3.77, N = 3 223.82 MIN: 211.45 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32 - Engine: CPU 2 x Intel Xeon Gold 5220R 0.1549 0.3098 0.4647 0.6196 0.7745 SE +/- 0.000899, N = 3 0.688451 MIN: 0.67 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Deconvolution Batch deconv_1d - Data Type: u8s8f32 - Engine: CPU 2 x Intel Xeon Gold 5220R 0.1233 0.2466 0.3699 0.4932 0.6165 SE +/- 0.001966, N = 3 0.548081 MIN: 0.52 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU 2 x Intel Xeon Gold 5220R 2 4 6 8 10 SE +/- 0.00472, N = 3 7.05818 MIN: 6.97 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU 2 x Intel Xeon Gold 5220R 0.6084 1.2168 1.8252 2.4336 3.042 SE +/- 0.00240, N = 3 2.70383 MIN: 2.67 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Deconvolution Batch deconv_1d - Data Type: f32 - Engine: CPU 2 x Intel Xeon Gold 5220R 0.4415 0.883 1.3245 1.766 2.2075 SE +/- 0.00210, N = 3 1.96226 MIN: 1.91 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU 2 x Intel Xeon Gold 5220R 2 4 6 8 10 SE +/- 0.00219, N = 3 7.46084 MIN: 7.38 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: IP Batch All - Data Type: bf16bf16bf16 - Engine: CPU 2 x Intel Xeon Gold 5220R 12 24 36 48 60 SE +/- 0.05, N = 3 51.15 MIN: 50.12 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: IP Batch 1D - Data Type: bf16bf16bf16 - Engine: CPU 2 x Intel Xeon Gold 5220R 1.2785 2.557 3.8355 5.114 6.3925 SE +/- 0.00465, N = 3 5.68213 MIN: 5.52 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: IP Batch All - Data Type: u8s8f32 - Engine: CPU 2 x Intel Xeon Gold 5220R 2 4 6 8 10 SE +/- 0.00378, N = 3 6.79487 MIN: 6.61 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: IP Batch 1D - Data Type: u8s8f32 - Engine: CPU 2 x Intel Xeon Gold 5220R 0.4168 0.8336 1.2504 1.6672 2.084 SE +/- 0.00731, N = 3 1.85258 MIN: 1.74 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: IP Batch All - Data Type: f32 - Engine: CPU 2 x Intel Xeon Gold 5220R 6 12 18 24 30 SE +/- 0.04, N = 3 27.36 MIN: 26.64 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: IP Batch 1D - Data Type: f32 - Engine: CPU 2 x Intel Xeon Gold 5220R 0.3932 0.7864 1.1796 1.5728 1.966 SE +/- 0.00439, N = 3 1.74752 MIN: 1.66 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
2 x Intel Xeon Gold 5220R Processor: 2 x Intel Xeon Gold 5220R @ 3.90GHz (36 Cores / 72 Threads), Motherboard: TYAN S7106 (V2.01.B40 BIOS), Chipset: Intel Sky Lake-E DMI3 Registers, Memory: 94GB, Disk: 500GB Samsung SSD 860, Graphics: ASPEED, Network: 2 x Intel I210 + 2 x QLogic cLOM8214 1/10GbE
OS: Ubuntu 20.04, Kernel: 5.8.0-rc2-phx-fgkaslr (x86_64) 20200624, Desktop: GNOME Shell 3.36.1, Display Server: X Server 1.20.8, Display Driver: modesetting 1.20.8, Compiler: GCC 9.3.0, File-System: ext4, Screen Resolution: 1024x768
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -vProcessor Notes: Scaling Governor: intel_pstate powersave - CPU Microcode: 0x5002f01Security Notes: itlb_multihit: KVM: Mitigation of Split huge pages + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Mitigation of TSX disabled
Testing initiated at 27 June 2020 14:20 by user phoronix.