MKL-DNN DNNL AMD EPYC

2 x AMD EPYC 7601 32-Core testing with a Dell 02MJ3T (1.2.5 BIOS) and llvmpipe 504GB graphics on Ubuntu 19.10 via the Phoronix Test Suite, compared against 2 x AMD EPYC 7742 and 2 x Intel Xeon Platinum 8280 configurations.

HTML result view exported from: https://openbenchmarking.org/result/1910046-AS-1910044AS29.

System Configurations

Common to all three systems:
  Monitor: VE228
  OS: Ubuntu 19.10
  Kernel: 5.3.0-13-generic (x86_64)
  Desktop: GNOME Shell 3.34.0
  Display Server: X Server 1.20.5
  Display Driver: modesetting 1.20.5
  Compiler: GCC 9.2.1 20190909
  File-System: ext4

EPYC 7742 2P:
  Processor: 2 x AMD EPYC 7742 64-Core @ 2.25GHz (128 Cores / 256 Threads)
  Motherboard: AMD DAYTONA_X (RDY1001C BIOS)
  Chipset: AMD Starship/Matisse
  Memory: 516096MB
  Disk: 280GB INTEL SSDPED1D280GA + 256GB Micron_1100_MTFD
  Graphics: llvmpipe 504GB
  Network: 2 x Mellanox MT27710
  OpenGL: 3.3 Mesa 19.2.0 (LLVM 9.0 128 bits)
  Screen Resolution: 1920x1080

Xeon Platinum 8280 2P:
  Processor: 2 x Intel Xeon Platinum 8280 @ 4.00GHz (56 Cores / 112 Threads)
  Motherboard: GIGABYTE MD61-SC2-00 v01000100 (T15 BIOS)
  Chipset: Intel Sky Lake-E DMI3 Registers
  Memory: 386048MB
  Disk: 280GB INTEL SSDPED1D280GA
  Graphics: llvmpipe 377GB
  Network: 2 x Intel X722 for 1GbE + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
  OpenGL: 3.3 Mesa 19.2.0 (LLVM 9.0 256 bits)
  Screen Resolution: 1920x1080

EPYC 7601 2P:
  Processor: 2 x AMD EPYC 7601 32-Core (64 Cores / 128 Threads)
  Motherboard: Dell 02MJ3T (1.2.5 BIOS)
  Chipset: AMD 17h
  Memory: 516096MB
  Disk: 280GB INTEL SSDPED1D280GA + 12 x 500GB Samsung SSD 860 + 120GB SSDSCKJB120G7R
  Graphics: llvmpipe 504GB
  Network: 2 x Broadcom BCM57416 NetXtreme-E Dual-Media 10G RDMA + 2 x Broadcom NetXtreme BCM5720 2-port PCIe
  OpenGL: 3.3 Mesa 19.2.0 (LLVM 9.0 128 bits)
  Screen Resolution: 1600x1200

Compiler Details (all systems):
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details:
  EPYC 7742 2P: Scaling Governor: acpi-cpufreq ondemand
  Xeon Platinum 8280 2P: Scaling Governor: intel_pstate powersave

Security Details:
  EPYC 7742 2P: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
  Xeon Platinum 8280 2P: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
  EPYC 7601 2P: l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling

Result Overview - MKL-DNN DNNL 1.1 (all results in ms; fewer is better)

  Harness - Data Type                              EPYC 7742 2P   Xeon Platinum 8280 2P   EPYC 7601 2P
  IP Batch 1D - f32                                       2.11              1.42                 4.90
  IP Batch All - f32                                     11.10              9.83                47.83
  IP Batch 1D - u8s8f32                                  29.21              2.18                40.53
  IP Batch All - u8s8f32                                176.32              2.90               270.54
  Convolution Batch conv_3d - f32                         4.32              4.70                17.27
  Convolution Batch conv_all - f32                      403.50            494.94              2040.28
  Convolution Batch conv_3d - u8s8f32                  1391.48           2989.04              2386.59
  Deconvolution Batch deconv_1d - f32                     2.89              1.24                 4.72
  Deconvolution Batch deconv_3d - f32                     3.25              1.15                 9.37
  Convolution Batch conv_alexnet - f32                   47.43             49.79               192.49
  Convolution Batch conv_all - u8s8f32                 9100.13           1521.72             16846.88
  Deconvolution Batch deconv_all - f32                  837.06            877.84              3470.77
  Deconvolution Batch deconv_1d - u8s8f32               517.95              0.44               900.06
  Deconvolution Batch deconv_3d - u8s8f32               846.19           1881.87              1526.11
  Recurrent Neural Network Training - f32               814.70            223.03               709.12
  Convolution Batch conv_alexnet - u8s8f32              540.52             19.51              1179.54
  Convolution Batch conv_googlenet_v3 - f32              29.38             23.08               116.50
  Convolution Batch conv_googlenet_v3 - u8s8f32        1580.49              7.68              1673.36

MKL-DNN DNNL

Harness: IP Batch 1D - Data Type: f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          2.11 (SE +/- 0.00, N = 3; MIN: 1.88)
  Xeon Platinum 8280 2P: 1.42 (SE +/- 0.01, N = 3; MIN: 1.32)
  EPYC 7601 2P:          4.90 (SE +/- 0.03, N = 3; MIN: 4.24)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
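For context, the "IP Batch" harnesses time DNNL's inner-product (fully connected) primitive. Below is a minimal sketch of driving that primitive through the DNNL 1.1 C++ API, assuming libdnnl is installed; the batch and channel sizes are illustrative placeholders rather than the shapes the benchmark itself runs, and error handling is omitted. The u8s8f32 variants of these harnesses exercise the same primitives with an unsigned 8-bit source, signed 8-bit weights, and an f32 destination instead of f32 throughout.

// Minimal DNNL 1.1 inner-product (fully connected) forward pass in f32.
// Sizes are placeholders; build with the g++ flags shown above plus -ldnnl.
#include <dnnl.hpp>
#include <vector>

int main() {
    using namespace dnnl;
    engine eng(engine::kind::cpu, 0);
    stream strm(eng);

    const memory::dim N = 32, IC = 1024, OC = 256;  // illustrative shapes
    memory::desc src_md({N, IC}, memory::data_type::f32, memory::format_tag::nc);
    memory::desc wei_md({OC, IC}, memory::data_type::f32, memory::format_tag::oi);
    memory::desc bia_md({OC}, memory::data_type::f32, memory::format_tag::x);
    memory::desc dst_md({N, OC}, memory::data_type::f32, memory::format_tag::nc);

    // DNNL 1.x pattern: operation desc -> primitive desc -> primitive.
    auto ip_pd = inner_product_forward::primitive_desc(
        inner_product_forward::desc(prop_kind::forward_inference,
                                    src_md, wei_md, bia_md, dst_md), eng);
    auto ip = inner_product_forward(ip_pd);

    // User-owned buffers wrapped as DNNL memory objects (zero-filled here).
    std::vector<float> src(N * IC), wei(OC * IC), bia(OC), dst(N * OC);
    memory src_m(src_md, eng, src.data()), wei_m(wei_md, eng, wei.data());
    memory bia_m(bia_md, eng, bia.data()), dst_m(dst_md, eng, dst.data());

    ip.execute(strm, {{DNNL_ARG_SRC, src_m}, {DNNL_ARG_WEIGHTS, wei_m},
                      {DNNL_ARG_BIAS, bia_m}, {DNNL_ARG_DST, dst_m}});
    strm.wait();
    return 0;
}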

MKL-DNN DNNL

Harness: IP Batch All - Data Type: f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          11.10 (SE +/- 0.01, N = 3; MIN: 10.49)
  Xeon Platinum 8280 2P: 9.83 (SE +/- 0.03, N = 3; MIN: 9.47)
  EPYC 7601 2P:          47.83 (SE +/- 0.07, N = 3; MIN: 46.02)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: IP Batch 1D - Data Type: u8s8f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          29.21 (SE +/- 0.13, N = 3; MIN: 26.87)
  Xeon Platinum 8280 2P: 2.18 (SE +/- 0.01, N = 3)
  EPYC 7601 2P:          40.53 (SE +/- 0.03, N = 3; MIN: 39.01)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: IP Batch All - Data Type: u8s8f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          176.32 (SE +/- 0.47, N = 3; MIN: 170.32)
  Xeon Platinum 8280 2P: 2.90 (SE +/- 0.03, N = 3)
  EPYC 7601 2P:          270.54 (SE +/- 0.50, N = 3; MIN: 263.9)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Convolution Batch conv_3d - Data Type: f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          4.32 (SE +/- 0.02, N = 3; MIN: 3.54)
  Xeon Platinum 8280 2P: 4.70 (SE +/- 0.03, N = 3; MIN: 4.44)
  EPYC 7601 2P:          17.27 (SE +/- 0.75, N = 15; MIN: 10.57)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Convolution Batch conv_all - Data Type: f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          403.50 (SE +/- 4.94, N = 3; MIN: 372.26)
  Xeon Platinum 8280 2P: 494.94 (SE +/- 0.36, N = 3; MIN: 488.77)
  EPYC 7601 2P:          2040.28 (SE +/- 10.16, N = 3; MIN: 1969.21)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Convolution Batch conv_3d - Data Type: u8s8f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          1391.48 (SE +/- 9.19, N = 3; MIN: 1356.81)
  Xeon Platinum 8280 2P: 2989.04 (SE +/- 3.75, N = 3; MIN: 2929.55)
  EPYC 7601 2P:          2386.59 (SE +/- 15.68, N = 3; MIN: 2340.46)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Deconvolution Batch deconv_1d - Data Type: f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          2.89 (SE +/- 0.01, N = 3; MIN: 2.61)
  Xeon Platinum 8280 2P: 1.24 (SE +/- 0.01, N = 3; MIN: 1.17)
  EPYC 7601 2P:          4.72 (SE +/- 0.10, N = 15; MIN: 3.86)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Deconvolution Batch deconv_3d - Data Type: f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          3.25 (SE +/- 0.08, N = 15; MIN: 2.46)
  Xeon Platinum 8280 2P: 1.15 (SE +/- 0.00, N = 4)
  EPYC 7601 2P:          9.37 (SE +/- 0.06, N = 3; MIN: 9.06)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Convolution Batch conv_alexnet - Data Type: f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          47.43 (SE +/- 0.40, N = 15; MIN: 42.56)
  Xeon Platinum 8280 2P: 49.79 (SE +/- 0.28, N = 3; MIN: 48.58)
  EPYC 7601 2P:          192.49 (SE +/- 0.33, N = 3; MIN: 183.91)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl
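As a companion to the inner-product sketch earlier, the convolution harnesses (conv_alexnet, conv_googlenet_v3, conv_all, conv_3d) time DNNL's convolution primitive over predefined layer sets. The sketch below uses the DNNL 1.1 C++ API with an AlexNet-conv1-like shape; it fixes plain nchw/oihw memory formats for brevity, whereas the benchmark lets the library pick blocked layouts, so treat it as an illustration of the API rather than of the measured code path.

// Minimal DNNL 1.1 forward convolution in f32 with an AlexNet-conv1-like shape.
// Plain nchw/oihw formats keep the sketch short; no reorders or timing loop.
#include <dnnl.hpp>
#include <vector>

int main() {
    using namespace dnnl;
    engine eng(engine::kind::cpu, 0);
    stream strm(eng);

    // 1 x 3 x 227 x 227 input, 96 filters of 3 x 11 x 11, stride 4 -> 1 x 96 x 55 x 55.
    memory::dims src_dims = {1, 3, 227, 227};
    memory::dims wei_dims = {96, 3, 11, 11};
    memory::dims bia_dims = {96};
    memory::dims dst_dims = {1, 96, 55, 55};
    memory::dims strides = {4, 4}, padding = {0, 0};

    memory::desc src_md(src_dims, memory::data_type::f32, memory::format_tag::nchw);
    memory::desc wei_md(wei_dims, memory::data_type::f32, memory::format_tag::oihw);
    memory::desc bia_md(bia_dims, memory::data_type::f32, memory::format_tag::x);
    memory::desc dst_md(dst_dims, memory::data_type::f32, memory::format_tag::nchw);

    auto conv_pd = convolution_forward::primitive_desc(
        convolution_forward::desc(prop_kind::forward_inference,
                                  algorithm::convolution_direct,
                                  src_md, wei_md, bia_md, dst_md,
                                  strides, padding, padding), eng);
    auto conv = convolution_forward(conv_pd);

    std::vector<float> src(1 * 3 * 227 * 227), wei(96 * 3 * 11 * 11),
                       bia(96), dst(1 * 96 * 55 * 55);
    memory src_m(src_md, eng, src.data()), wei_m(wei_md, eng, wei.data());
    memory bia_m(bia_md, eng, bia.data()), dst_m(dst_md, eng, dst.data());

    conv.execute(strm, {{DNNL_ARG_SRC, src_m}, {DNNL_ARG_WEIGHTS, wei_m},
                        {DNNL_ARG_BIAS, bia_m}, {DNNL_ARG_DST, dst_m}});
    strm.wait();
    return 0;
}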

MKL-DNN DNNL

Harness: Convolution Batch conv_all - Data Type: u8s8f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          9100.13 (SE +/- 13.71, N = 3; MIN: 8785.76)
  Xeon Platinum 8280 2P: 1521.72 (SE +/- 1.30, N = 3)
  EPYC 7601 2P:          16846.88 (SE +/- 198.89, N = 5; MIN: 15888.1)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Deconvolution Batch deconv_all - Data Type: f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          837.06 (SE +/- 10.57, N = 3; MIN: 769.65)
  Xeon Platinum 8280 2P: 877.84 (SE +/- 0.45, N = 3; MIN: 871.06)
  EPYC 7601 2P:          3470.77 (SE +/- 101.40, N = 9; MIN: 3032.19)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Deconvolution Batch deconv_1d - Data Type: u8s8f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          517.95 (SE +/- 1.54, N = 3; MIN: 505.29)
  Xeon Platinum 8280 2P: 0.44 (SE +/- 0.00, N = 3)
  EPYC 7601 2P:          900.06 (SE +/- 1.72, N = 3; MIN: 888.12)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          846.19 (SE +/- 0.69, N = 3; MIN: 840.22)
  Xeon Platinum 8280 2P: 1881.87 (SE +/- 0.80, N = 3; MIN: 1867.97)
  EPYC 7601 2P:          1526.11 (SE +/- 1.78, N = 3; MIN: 1505.07)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Recurrent Neural Network Training - Data Type: f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          814.70 (SE +/- 6.96, N = 12; MIN: 752.4)
  Xeon Platinum 8280 2P: 223.03 (SE +/- 2.99, N = 4; MIN: 207.46)
  EPYC 7601 2P:          709.12 (SE +/- 6.47, N = 15; MIN: 637.59)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Convolution Batch conv_alexnet - Data Type: u8s8f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          540.52 (SE +/- 0.96, N = 3; MIN: 525.23)
  Xeon Platinum 8280 2P: 19.51 (SE +/- 0.17, N = 3)
  EPYC 7601 2P:          1179.54 (SE +/- 3.94, N = 3; MIN: 1154.96)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          29.38 (SE +/- 0.26, N = 3; MIN: 26.98)
  Xeon Platinum 8280 2P: 23.08 (SE +/- 0.10, N = 3; MIN: 21.92)
  EPYC 7601 2P:          116.50 (SE +/- 0.24, N = 3; MIN: 101.76)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl

MKL-DNN DNNL

Harness: Convolution Batch conv_googlenet_v3 - Data Type: u8s8f32

MKL-DNN DNNL 1.1 (ms, fewer is better)
  EPYC 7742 2P:          1580.49 (SE +/- 1.21, N = 3; MIN: 1487.89)
  Xeon Platinum 8280 2P: 7.68 (SE +/- 0.03, N = 3)
  EPYC 7601 2P:          1673.36 (SE +/- 51.26, N = 9; MIN: 1293.52)
  (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -lpthread -ldl


Phoronix Test Suite v10.8.4