MKL-DNN GCC 9 Cascadelake Compiler Tuning
2 x Intel Xeon Platinum 8280 testing with a GIGABYTE MD61-SC2-00 v01000100 (T15 BIOS) and ASPEED Family graphics on Ubuntu 18.04 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 1904200-HV-MKLDNNGCC43

-O3 -march=skylake-avx512
Environment Notes: CXXFLAGS="-O3 -march=skylake-avx512" CFLAGS="-O3 -march=skylake-avx512"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: intel_pstate powersave
Security Notes: __user pointer sanitization + Enhanced IBRS IBPB: conditional RSB filling + SSB disabled via prctl and seccomp

-O3 -march=cascadelake
Processor: 2 x Intel Xeon Platinum 8280 @ 4.00GHz (56 Cores / 112 Threads), Motherboard: GIGABYTE MD61-SC2-00 v01000100 (T15 BIOS), Chipset: Intel Sky Lake-E DMI3 Registers, Memory: 386048MB, Disk: Samsung SSD 970 PRO 512GB, Graphics: ASPEED Family, Monitor: VE228, Network: 2 x Intel X722 for 1GbE + 2 x QLogic FastLinQ QL41000 10/25/40/50GbE
OS: Ubuntu 18.04, Kernel: 5.1.0-999-generic (x86_64) 20190416, Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.20.1, Display Driver: modesetting 1.20.1, Compiler: GCC 9.0.1 20190414, File-System: ext4, Screen Resolution: 1920x1080
Environment Notes: CXXFLAGS="-O3 -march=cascadelake" CFLAGS="-O3 -march=cascadelake"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: intel_pstate powersave
Security Notes: __user pointer sanitization + Enhanced IBRS IBPB: conditional RSB filling + SSB disabled via prctl and seccomp
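
The two configurations differ only in the CFLAGS/CXXFLAGS values recorded in the Environment Notes. As a rough sketch of how such a run could be scripted, the Python helper below builds the command line and environment for one configuration. It assumes that the Phoronix Test Suite picks up CFLAGS and CXXFLAGS exported in the environment when it builds the test, which the Environment Notes suggest but this result file does not state outright. The names CONFIGS and build_run are hypothetical and used only for illustration.

```python
import os

# Compiler flag sets taken from the Environment Notes of this result file.
CONFIGS = {
    "skylake-avx512": "-O3 -march=skylake-avx512",
    "cascadelake": "-O3 -march=cascadelake",
}

# Result identifier from the comparison command in this result file.
RESULT_ID = "1904200-HV-MKLDNNGCC43"


def build_run(config, base_env=None):
    """Return (command, env) for one compiler configuration.

    Assumption: exporting CFLAGS/CXXFLAGS is enough for the test build
    to use these flags. Nothing is executed here.
    """
    flags = CONFIGS[config]
    env = dict(os.environ if base_env is None else base_env)
    env["CFLAGS"] = flags
    env["CXXFLAGS"] = flags
    command = ["phoronix-test-suite", "benchmark", RESULT_ID]
    return command, env


if __name__ == "__main__":
    for name in CONFIGS:
        command, env = build_run(name)
        print(name, "->", " ".join(command), "with CFLAGS =", env["CFLAGS"])
```

The returned pair can be handed to subprocess.run(command, env=env, check=True) on a machine with the Phoronix Test Suite installed. Keeping the builder separate from execution makes it possible to inspect the flags before a long benchmark run starts.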

MKL-DNN
This is a test of Intel MKL-DNN, the Intel Math Kernel Library for Deep Neural Networks, using its built-in benchdnn functionality. The result is the total perf time reported. Learn more via the OpenBenchmarking.org test page.

Results
OpenBenchmarking.org, MKL-DNN 2019-04-16. All values are in ms, fewer is better. For each harness, values are listed in the order of the graph legend: -O3 -march=skylake-avx512 first, then -O3 -march=cascadelake.
(CXX) g++ options: -O3 -std=c++11 -march=native -mtune=native -fPIC -fopenmp -pie -lmklml_intel -ldl

Harness: IP Batch 1D - Data Type: f32
  -O3 -march=skylake-avx512: 17.55 (SE +/- 0.06, N = 3, MIN: 16.94)
  -O3 -march=cascadelake: 17.44 (SE +/- 0.03, N = 3, MIN: 16.92)
Harness: IP Batch All - Data Type: f32
  -O3 -march=skylake-avx512: 249 (SE +/- 0.17, N = 3, MIN: 245.97)
  -O3 -march=cascadelake: 249 (SE +/- 0.10, N = 3, MIN: 245.95)
Harness: IP Batch 1D - Data Type: u8s8u8s32
  -O3 -march=skylake-avx512: 4.59 (SE +/- 0.00, N = 3, MIN: 4.22)
  -O3 -march=cascadelake: 4.58 (SE +/- 0.00, N = 3, MIN: 4.19)
Harness: IP Batch All - Data Type: u8s8u8s32
  -O3 -march=skylake-avx512: 72.49 (SE +/- 0.13, N = 3, MIN: 70.54)
  -O3 -march=cascadelake: 72.38 (SE +/- 0.02, N = 3, MIN: 70.76)
Harness: Convolution Batch conv_3d - Data Type: f32
  -O3 -march=skylake-avx512: 85.67 (SE +/- 0.20, N = 3, MIN: 83.5)
  -O3 -march=cascadelake: 85.40 (SE +/- 0.22, N = 3, MIN: 83.33)
Harness: Convolution Batch conv_all - Data Type: f32
  -O3 -march=skylake-avx512: 13218 (SE +/- 5.34, N = 3, MIN: 13095.4)
  -O3 -march=cascadelake: 13215 (SE +/- 8.35, N = 3, MIN: 13089.5)
Harness: Deconvolution Batch deconv_1d - Data Type: f32
  -O3 -march=skylake-avx512: 23.77 (SE +/- 0.38, N = 3, MIN: 22.94)
  -O3 -march=cascadelake: 23.53 (SE +/- 0.27, N = 6, MIN: 22.83)
Harness: Deconvolution Batch deconv_3d - Data Type: f32
  -O3 -march=skylake-avx512: 27.41 (SE +/- 0.06, N = 3, MIN: 26.68)
  -O3 -march=cascadelake: 27.77 (SE +/- 0.08, N = 3, MIN: 27)
Harness: Convolution Batch conv_alexnet - Data Type: f32
  -O3 -march=skylake-avx512: 1756 (SE +/- 0.82, N = 3, MIN: 1740.28)
  -O3 -march=cascadelake: 1753 (SE +/- 1.65, N = 3, MIN: 1737.11)
Harness: Deconvolution Batch deconv_all - Data Type: f32
  -O3 -march=skylake-avx512: 10271 (SE +/- 4.71, N = 3, MIN: 10175.5)
  -O3 -march=cascadelake: 10257 (SE +/- 2.13, N = 3, MIN: 10166.6)
Harness: Convolution Batch conv_3d - Data Type: u8s8u8s32
  -O3 -march=skylake-avx512: 125838 (SE +/- 1214.69, N = 3, MIN: 124475)
  -O3 -march=cascadelake: 128646 (SE +/- 1664.64, N = 9, MIN: 124405)
Harness: Convolution Batch conv_all - Data Type: u8s8u8s32
  -O3 -march=skylake-avx512: 59581 (SE +/- 321.79, N = 3, MIN: 59174)
  -O3 -march=cascadelake: 59454 (SE +/- 265.15, N = 3, MIN: 59115)
Harness: Convolution Batch conv_googlenet_v3 - Data Type: f32
  -O3 -march=skylake-avx512: 727 (SE +/- 0.66, N = 3, MIN: 717.67)
  -O3 -march=cascadelake: 727 (SE +/- 0.39, N = 3, MIN: 717.66)
Harness: Deconvolution Batch deconv_1d - Data Type: u8s8u8s32
  -O3 -march=skylake-avx512: 5.40 (SE +/- 0.01, N = 3, MIN: 5.32)
  -O3 -march=cascadelake: 5.41 (SE +/- 0.01, N = 3, MIN: 5.32)
Harness: Deconvolution Batch deconv_3d - Data Type: u8s8u8s32
  -O3 -march=skylake-avx512: 79735 (SE +/- 76.36, N = 3, MIN: 79643)
  -O3 -march=cascadelake: 79948 (SE +/- 139.78, N = 3, MIN: 79663.4)
Harness: Convolution Batch conv_alexnet - Data Type: u8s8u8s32
  -O3 -march=skylake-avx512: 452 (SE +/- 0.13, N = 3, MIN: 444.46)
  -O3 -march=cascadelake: 453 (SE +/- 0.33, N = 3, MIN: 444.07)
Harness: Deconvolution Batch deconv_all - Data Type: u8s8u8s32
  -O3 -march=skylake-avx512: 126456 (SE +/- 186.57, N = 3, MIN: 126023)
  -O3 -march=cascadelake: 126386 (SE +/- 134.27, N = 3, MIN: 126069)
Harness: Convolution Batch conv_googlenet_v3 - Data Type: u8s8u8s32
  -O3 -march=skylake-avx512: 177 (SE +/- 0.11, N = 3, MIN: 175.69)
  -O3 -march=cascadelake: 177 (SE +/- 0.10, N = 3, MIN: 175.83)
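
Most of the differences between the two configurations are small compared with the reported standard errors. The Python sketch below shows one way to check this for a handful of the harnesses: it computes the percent change from skylake-avx512 to cascadelake and expresses the difference in units of the combined standard error. It assumes that in each graph the first value and first SE belong to -O3 -march=skylake-avx512 and the second to -O3 -march=cascadelake. The threshold of 2 combined standard errors is an arbitrary rule of thumb chosen for illustration, not part of the Phoronix Test Suite, and with N = 3 runs it is only a coarse indicator.

```python
import math

# (mean_ms, standard_error_ms) per configuration, copied from the results above.
# Only four harnesses are included for illustration.
RESULTS = {
    "IP Batch 1D / f32": {
        "skylake-avx512": (17.55, 0.06),
        "cascadelake": (17.44, 0.03),
    },
    "Convolution Batch conv_all / f32": {
        "skylake-avx512": (13218, 5.34),
        "cascadelake": (13215, 8.35),
    },
    "Deconvolution Batch deconv_3d / f32": {
        "skylake-avx512": (27.41, 0.06),
        "cascadelake": (27.77, 0.08),
    },
    "Convolution Batch conv_3d / u8s8u8s32": {
        "skylake-avx512": (125838, 1214.69),
        "cascadelake": (128646, 1664.64),
    },
}


def compare(baseline, candidate):
    """Return (percent_change, difference in combined-SE units)."""
    base_mean, base_se = baseline
    cand_mean, cand_se = candidate
    diff = cand_mean - base_mean
    percent = 100.0 * diff / base_mean
    # Standard errors of independent means add in quadrature.
    combined_se = math.hypot(base_se, cand_se)
    return percent, diff / combined_se


def summarize(results, threshold=2.0):
    """Return (name, percent, score, flagged) for each harness."""
    rows = []
    for name, cfg in results.items():
        percent, score = compare(cfg["skylake-avx512"], cfg["cascadelake"])
        rows.append((name, percent, score, abs(score) > threshold))
    return rows


if __name__ == "__main__":
    for name, percent, score, flagged in summarize(RESULTS):
        note = "beyond threshold" if flagged else "within threshold"
        print(f"{name}: {percent:+.2f}% ({score:+.2f} combined SE, {note})")
```

Since the unit is ms and fewer is better, a positive percent change means the cascadelake build was slower in that harness. Of the four harnesses included here, only Deconvolution Batch deconv_3d (f32) exceeds the threshold under this heuristic.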

-O3 -march=skylake-avx512: Testing initiated at 19 April 2019 11:26 by user phoronix.
-O3 -march=cascadelake: Testing initiated at 19 April 2019 16:20 by user phoronix.