DNNL 9900K: Intel Core i9-9900K testing with an ASUS PRIME Z390-A (1302 BIOS) and an MSI AMD Radeon RX 470/480/570/570X/580/580X 8GB on Ubuntu 19.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/1910063-PTS-DNNL990013&grw
DNNL 9900K System Details (Core i9 9900K):

  Processor: Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads)
  Motherboard: ASUS PRIME Z390-A (1302 BIOS)
  Chipset: Intel Cannon Lake PCH
  Memory: 16384MB
  Disk: Samsung SSD 970 EVO 250GB + 2000GB SABRENT
  Graphics: MSI AMD Radeon RX 470/480/570/570X/580/580X 8GB (1366/2000MHz)
  Audio: Realtek ALC1220
  Monitor: Acer B286HK
  Network: Intel I219-V
  OS: Ubuntu 19.04
  Kernel: 5.4.0-999-generic (x86_64) 20191004
  Desktop: GNOME Shell 3.32.2
  Display Server: X Server 1.20.4
  Display Driver: modesetting 1.20.4
  OpenGL: 4.5 Mesa 19.3.0-devel (git-396b410 2019-10-05 disco-oibaf-ppa) (LLVM 9.0.0)
  Compiler: GCC 8.3.0
  File-System: ext4
  Screen Resolution: 3840x2160

Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v

Processor Notes: Scaling Governor: intel_pstate performance

Security Notes: l1tf: Not affected + mds: Mitigation of Clear buffers; SMT vulnerable + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
DNNL 9900K Results Summary (Core i9 9900K, all values in ms):

  mkl-dnn: Convolution Batch conv_alexnet - u8s8f32: 3682.60
  mkl-dnn: Deconvolution Batch deconv_3d - u8s8f32: 9810.19
  mkl-dnn: Recurrent Neural Network Training - f32: 274.51
  mkl-dnn: Deconvolution Batch deconv_all - f32: 3227.73
  mkl-dnn: Deconvolution Batch deconv_1d - u8s8f32: 6041.64
  mkl-dnn: IP Batch All - f32: 34.29
  mkl-dnn: Convolution Batch conv_all - u8s8f32: 47603.10
  mkl-dnn: Deconvolution Batch deconv_1d - f32: 5.86
  mkl-dnn: Deconvolution Batch deconv_3d - f32: 6.64
  mkl-dnn: Convolution Batch conv_all - f32: 2948.49
  mkl-dnn: Convolution Batch conv_3d - u8s8f32: 17489.70
  mkl-dnn: Convolution Batch conv_3d - f32: 23.83
  mkl-dnn: IP Batch All - u8s8f32: 247.61
  mkl-dnn: IP Batch 1D - u8s8f32: 44.10
  mkl-dnn: Convolution Batch conv_googlenet_v3 - f32: 165.97
  mkl-dnn: Convolution Batch conv_alexnet - f32: 374.81
  mkl-dnn: Convolution Batch conv_googlenet_v3 - u8s8f32: 5700.46
  mkl-dnn: IP Batch 1D - f32: 4.75
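Each value above is a time reported by the DNNL benchmark harness for the named batch and data type. As a rough, hedged illustration of the kind of measurement involved, the sketch below creates and times a single f32 convolution primitive with the DNNL 1.1 C++ API; the layer shape (an AlexNet-like conv1), the iteration count, and the -ldnnl link flag are illustrative assumptions, not the test profile's actual configuration.

    // conv_time.cpp -- hedged sketch: time one f32 convolution with the DNNL 1.1 C++ API.
    // Build (assumption, mirroring the reported flags):
    //   g++ -O3 -march=native -std=c++11 -fopenmp conv_time.cpp -ldnnl -o conv_time
    #include <chrono>
    #include <cstring>
    #include <iostream>
    #include <unordered_map>
    #include "dnnl.hpp"

    int main() {
        using namespace dnnl;
        engine eng(engine::kind::cpu, 0);
        stream s(eng);

        // Assumed problem: batch 16, 3 -> 64 channels, 224x224 input, 11x11 kernel, stride 4, pad 2.
        memory::dims src_dims = {16, 3, 224, 224};
        memory::dims wei_dims = {64, 3, 11, 11};
        memory::dims dst_dims = {16, 64, 55, 55};
        memory::dims strides = {4, 4}, padding = {2, 2};

        auto src_md = memory::desc(src_dims, memory::data_type::f32, memory::format_tag::nchw);
        auto wei_md = memory::desc(wei_dims, memory::data_type::f32, memory::format_tag::oihw);
        auto dst_md = memory::desc(dst_dims, memory::data_type::f32, memory::format_tag::nchw);

        memory src_mem(src_md, eng), wei_mem(wei_md, eng), dst_mem(dst_md, eng);
        // Zero-fill inputs so timing is not skewed by whatever happens to be in uninitialized memory.
        std::memset(src_mem.get_data_handle(), 0, src_md.get_size());
        std::memset(wei_mem.get_data_handle(), 0, wei_md.get_size());

        auto conv_d = convolution_forward::desc(prop_kind::forward_inference,
                algorithm::convolution_direct, src_md, wei_md, dst_md,
                strides, padding, padding);
        auto conv_pd = convolution_forward::primitive_desc(conv_d, eng);
        auto conv = convolution_forward(conv_pd);

        // One warm-up run, then time a fixed number of iterations and report milliseconds
        // (the same "ms, fewer is better" convention used in the results above).
        std::unordered_map<int, memory> args = {
                {DNNL_ARG_SRC, src_mem}, {DNNL_ARG_WEIGHTS, wei_mem}, {DNNL_ARG_DST, dst_mem}};
        conv.execute(s, args);
        s.wait();

        const int iters = 20;  // illustrative iteration count
        auto t0 = std::chrono::steady_clock::now();
        for (int i = 0; i < iters; ++i)
            conv.execute(s, args);
        s.wait();
        auto t1 = std::chrono::steady_clock::now();

        std::cout << "avg ms per convolution: "
                  << std::chrono::duration<double, std::milli>(t1 - t0).count() / iters << "\n";
        return 0;
    }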
MKL-DNN DNNL 1.1 Individual Results (OpenBenchmarking.org; ms, fewer is better)

  Harness: Convolution Batch conv_alexnet - Data Type: u8s8f32
    Core i9 9900K: 3682.60 (SE +/- 8.47, N = 3, MIN: 3635.25)
  Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32
    Core i9 9900K: 9810.19 (SE +/- 6.28, N = 3, MIN: 9768.31)
  Harness: Recurrent Neural Network Training - Data Type: f32
    Core i9 9900K: 274.51 (SE +/- 1.06, N = 3, MIN: 260.67)
  Harness: Deconvolution Batch deconv_all - Data Type: f32
    Core i9 9900K: 3227.73 (SE +/- 1.10, N = 3, MIN: 3112.23)
  Harness: Deconvolution Batch deconv_1d - Data Type: u8s8f32
    Core i9 9900K: 6041.64 (SE +/- 9.29, N = 3, MIN: 6005.37)
  Harness: IP Batch All - Data Type: f32
    Core i9 9900K: 34.29 (SE +/- 0.05, N = 3, MIN: 32.78)
  Harness: Convolution Batch conv_all - Data Type: u8s8f32
    Core i9 9900K: 47603.10 (SE +/- 41.55, N = 3, MIN: 46847)
  Harness: Deconvolution Batch deconv_1d - Data Type: f32
    Core i9 9900K: 5.86 (SE +/- 0.02, N = 3, MIN: 4.83)
  Harness: Deconvolution Batch deconv_3d - Data Type: f32
    Core i9 9900K: 6.64 (SE +/- 0.00, N = 3, MIN: 5.76)
  Harness: Convolution Batch conv_all - Data Type: f32
    Core i9 9900K: 2948.49 (SE +/- 5.77, N = 3, MIN: 2822.24)
  Harness: Convolution Batch conv_3d - Data Type: u8s8f32
    Core i9 9900K: 17489.70 (SE +/- 230.12, N = 3, MIN: 17150.3)
  Harness: Convolution Batch conv_3d - Data Type: f32
    Core i9 9900K: 23.83 (SE +/- 0.05, N = 3, MIN: 21.56)
  Harness: IP Batch All - Data Type: u8s8f32
    Core i9 9900K: 247.61 (SE +/- 0.52, N = 3, MIN: 239.85)
  Harness: IP Batch 1D - Data Type: u8s8f32
    Core i9 9900K: 44.10 (SE +/- 0.09, N = 3, MIN: 40.1)
  Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32
    Core i9 9900K: 165.97 (SE +/- 0.20, N = 3, MIN: 147.35)
  Harness: Convolution Batch conv_alexnet - Data Type: f32
    Core i9 9900K: 374.81 (SE +/- 0.38, N = 3, MIN: 361.53)
  Harness: Convolution Batch conv_googlenet_v3 - Data Type: u8s8f32
    Core i9 9900K: 5700.46 (SE +/- 2371.79, N = 12, MIN: 2136.05)
  Harness: IP Batch 1D - Data Type: f32
    Core i9 9900K: 4.75 (SE +/- 0.01, N = 3, MIN: 3.93)

1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl (common to all results above)
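The u8s8f32 label on several harnesses follows DNNL's convention of an unsigned 8-bit source, signed 8-bit weights, and an f32 destination. As a hedged sketch of how the f32 example above could be adapted to that configuration, only the lines that change are shown; the nhwc/hwio layouts and the output scale of 0.5 are illustrative assumptions, not the harness's actual settings.

    // Hedged delta against the f32 sketch above: u8 source, s8 weights, f32 destination.
    auto src_md = memory::desc(src_dims, memory::data_type::u8,  memory::format_tag::nhwc);
    auto wei_md = memory::desc(wei_dims, memory::data_type::s8,  memory::format_tag::hwio);
    auto dst_md = memory::desc(dst_dims, memory::data_type::f32, memory::format_tag::nhwc);

    // An output scale converts the int32 accumulator to the f32 destination;
    // mask 0 applies one common scale, and 0.5f is a placeholder value.
    primitive_attr attr;
    attr.set_output_scales(0, {0.5f});

    auto conv_d = convolution_forward::desc(prop_kind::forward_inference,
            algorithm::convolution_direct, src_md, wei_md, dst_md,
            strides, padding, padding);
    auto conv_pd = convolution_forward::primitive_desc(conv_d, attr, eng);

In a complete program the source and weight buffers would also have to be quantized to the 8-bit ranges before execution; the attribute only rescales the accumulated result.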
Phoronix Test Suite v10.8.5