1280 nn

Intel Xeon E3-1280 v5 testing with an MSI Z170A SLI PLUS (MS-7998) v1.0 (2.A0 BIOS) and ASUS AMD Radeon HD 7850 / R7 265 R9 270 1024SP on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2106187-IB-1280NN86046 .
System configuration (runs 1, 2, 3)

Processor: Intel Xeon E3-1280 v5 @ 4.00GHz (4 Cores / 8 Threads)
Motherboard: MSI Z170A SLI PLUS (MS-7998) v1.0 (2.A0 BIOS)
Chipset: Intel Xeon E3-1200 v5/E3-1500
Memory: 32GB
Disk: 256GB TOSHIBA RD400
Graphics: ASUS AMD Radeon HD 7850 / R7 265 R9 270 1024SP
Audio: Realtek ALC1150
Monitor: VA2431
Network: Intel I219-V
OS: Ubuntu 20.04
Kernel: 5.9.0-050900rc2daily20200826-generic (x86_64) 20200825
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.9
OpenGL: 4.5 Mesa 20.0.8 (LLVM 10.0.0)
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details:
  Transparent Huge Pages: madvise

Compiler Details:
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details:
  Scaling Governor: intel_pstate powersave
  CPU Microcode: 0xe2
  Thermald 1.9.1

Security Details:
  itlb_multihit: KVM: Mitigation of VMX disabled
  l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
  mds: Mitigation of Clear buffers; SMT vulnerable
  meltdown: Mitigation of PTI
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
  srbds: Mitigation of Microcode
  tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable
Result summary (ms, fewer is better)

Test                               1          2          3
mnn: mobilenetV3                   2.623      2.627      2.631
mnn: squeezenetv1.1                4.558      4.607      4.610
mnn: resnet-v2-50                  44.087     44.286     44.274
mnn: SqueezeNetV1.0                6.715      6.735      6.743
mnn: MobileNetV2_224               4.602      4.644      4.651
mnn: mobilenet-v1-1.0              4.503      4.508      4.523
mnn: inception-v3                  51.138     51.606     51.765
ncnn: CPU - mobilenet              25.96      25.74      25.87
ncnn: CPU-v2-v2 - mobilenet-v2     6.82       6.82       6.89
ncnn: CPU-v3-v3 - mobilenet-v3     5.74       5.74       5.79
ncnn: CPU - shufflenet-v2          4.85       4.85       4.86
ncnn: CPU - mnasnet                5.65       5.64       5.68
ncnn: CPU - efficientnet-b0        9.41       9.43       9.49
ncnn: CPU - blazeface              1.91       1.91       1.96
ncnn: CPU - googlenet              19.96      19.95      20.07
ncnn: CPU - vgg16                  91.02      91.15      91.45
ncnn: CPU - resnet18               21.98      22.03      22.02
ncnn: CPU - alexnet                19.85      19.68      19.85
ncnn: CPU - resnet50               40.64      40.60      40.64
ncnn: CPU - yolov4-tiny            37.62      37.54      37.70
ncnn: CPU - squeezenet_ssd         30.28      30.12      30.21
ncnn: CPU - regnety_400m           11.82      11.85      11.84
tnn: CPU - DenseNet                4517.977   4523.532   4527.506
tnn: CPU - MobileNet v2            391.269    389.951    389.911
tnn: CPU - SqueezeNet v2           80.914     80.888     80.974
tnn: CPU - SqueezeNet v1.1         341.868    341.414    340.964
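The three runs share one system configuration, so a quick way to read the summary is to check how far apart the run means are for each test. The following is a minimal Python sketch of that arithmetic, not something the Phoronix Test Suite produces. The helper name `spread_percent` is hypothetical, and the three sample rows are copied by hand from the summary values for runs 1, 2, and 3.

```python
# Mean latencies in ms (lower is better) for runs 1, 2, 3,
# copied by hand from the result summary. Only three tests are
# included here to keep the sketch short.
results = {
    "mnn: mobilenetV3": (2.623, 2.627, 2.631),
    "ncnn: CPU - blazeface": (1.91, 1.91, 1.96),
    "tnn: CPU - DenseNet": (4517.977, 4523.532, 4527.506),
}


def spread_percent(values):
    """Return (max - min) / min * 100 for a sequence of positive timings.

    Hypothetical helper for illustration; it measures the gap between
    the fastest and slowest run relative to the fastest one.
    """
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0


for name, values in results.items():
    print(f"{name}: {spread_percent(values):.2f}% spread across runs")
```

A spread that is small compared with the reported standard error suggests the runs are effectively equivalent for that test; this sketch only computes the spread and does not perform a significance test.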
Mobile Neural Network 1.2 (ms, Fewer Is Better)

Model: mobilenetV3
  1: 2.623, SE +/- 0.012, N = 3, MIN: 2.5 / MAX: 4.03
  2: 2.627, SE +/- 0.010, N = 3, MIN: 2.58 / MAX: 7.02
  3: 2.631, SE +/- 0.015, N = 3, MIN: 2.58 / MAX: 4.02

Model: squeezenetv1.1
  1: 4.558, SE +/- 0.020, N = 3, MIN: 4.47 / MAX: 7.45
  2: 4.607, SE +/- 0.022, N = 3, MIN: 4.53 / MAX: 26.9
  3: 4.610, SE +/- 0.021, N = 3, MIN: 4.54 / MAX: 5.72

Model: resnet-v2-50
  1: 44.09, SE +/- 0.06, N = 3, MIN: 43.62 / MAX: 68.3
  2: 44.29, SE +/- 0.27, N = 3, MIN: 43.58 / MAX: 65.39
  3: 44.27, SE +/- 0.11, N = 3, MIN: 43.74 / MAX: 66.75

Model: SqueezeNetV1.0
  1: 6.715, SE +/- 0.028, N = 3, MIN: 6.62 / MAX: 24.05
  2: 6.735, SE +/- 0.030, N = 3, MIN: 6.64 / MAX: 9.63
  3: 6.743, SE +/- 0.033, N = 3, MIN: 6.63 / MAX: 11.15

Model: MobileNetV2_224
  1: 4.602, SE +/- 0.038, N = 3, MIN: 4.47 / MAX: 7.33
  2: 4.644, SE +/- 0.003, N = 3, MIN: 4.58 / MAX: 7.38
  3: 4.651, SE +/- 0.020, N = 3, MIN: 4.56 / MAX: 14.74

Model: mobilenet-v1-1.0
  1: 4.503, SE +/- 0.015, N = 3, MIN: 4.46 / MAX: 7.27
  2: 4.508, SE +/- 0.010, N = 3, MIN: 4.47 / MAX: 10.02
  3: 4.523, SE +/- 0.007, N = 3, MIN: 4.49 / MAX: 8.9

Model: inception-v3
  1: 51.14, SE +/- 0.34, N = 3, MIN: 50.55 / MAX: 72.62
  2: 51.61, SE +/- 0.25, N = 3, MIN: 51.05 / MAX: 74.56
  3: 51.77, SE +/- 0.30, N = 3, MIN: 51.05 / MAX: 74.01

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN 20210525 (ms, Fewer Is Better)

Target: CPU - Model: mobilenet
  1: 25.96, SE +/- 0.20, N = 3, MIN: 25.63 / MAX: 36.2
  2: 25.74, SE +/- 0.02, N = 3, MIN: 25.62 / MAX: 28.44
  3: 25.87, SE +/- 0.11, N = 3, MIN: 25.6 / MAX: 29.21

Target: CPU-v2-v2 - Model: mobilenet-v2
  1: 6.82, SE +/- 0.01, N = 3, MIN: 6.73 / MAX: 9.66
  2: 6.82, SE +/- 0.00, N = 3, MIN: 6.74 / MAX: 8.53
  3: 6.89, SE +/- 0.06, N = 3, MIN: 6.73 / MAX: 9.5

Target: CPU-v3-v3 - Model: mobilenet-v3
  1: 5.74, SE +/- 0.00, N = 3, MIN: 5.68 / MAX: 7.33
  2: 5.74, SE +/- 0.00, N = 3, MIN: 5.67 / MAX: 7.22
  3: 5.79, SE +/- 0.05, N = 3, MIN: 5.67 / MAX: 17.62

Target: CPU - Model: shufflenet-v2
  1: 4.85, SE +/- 0.01, N = 3, MIN: 4.8 / MAX: 6.43
  2: 4.85, SE +/- 0.00, N = 3, MIN: 4.81 / MAX: 6.29
  3: 4.86, SE +/- 0.01, N = 3, MIN: 4.8 / MAX: 6.37

Target: CPU - Model: mnasnet
  1: 5.65, SE +/- 0.01, N = 3, MIN: 5.58 / MAX: 7.37
  2: 5.64, SE +/- 0.00, N = 3, MIN: 5.59 / MAX: 7.14
  3: 5.68, SE +/- 0.03, N = 3, MIN: 5.58 / MAX: 7.37

Target: CPU - Model: efficientnet-b0
  1: 9.41, SE +/- 0.01, N = 3, MIN: 9.35 / MAX: 11.25
  2: 9.43, SE +/- 0.02, N = 3, MIN: 9.36 / MAX: 20.24
  3: 9.49, SE +/- 0.08, N = 3, MIN: 9.36 / MAX: 9.8

Target: CPU - Model: blazeface
  1: 1.91, SE +/- 0.00, N = 3, MIN: 1.88 / MAX: 2.07
  2: 1.91, SE +/- 0.00, N = 3, MIN: 1.88 / MAX: 2.07
  3: 1.96, SE +/- 0.06, N = 3, MIN: 1.88 / MAX: 2.22

Target: CPU - Model: googlenet
  1: 19.96, SE +/- 0.03, N = 3, MIN: 19.87 / MAX: 33.03
  2: 19.95, SE +/- 0.01, N = 3, MIN: 19.86 / MAX: 22.71
  3: 20.07, SE +/- 0.13, N = 3, MIN: 19.86 / MAX: 21.48

Target: CPU - Model: vgg16
  1: 91.02, SE +/- 0.09, N = 3, MIN: 90.68 / MAX: 94
  2: 91.15, SE +/- 0.03, N = 3, MIN: 90.86 / MAX: 103.83
  3: 91.45, SE +/- 0.31, N = 3, MIN: 90.83 / MAX: 158.8

Target: CPU - Model: resnet18
  1: 21.98, SE +/- 0.01, N = 3, MIN: 21.89 / MAX: 22.64
  2: 22.03, SE +/- 0.02, N = 3, MIN: 21.92 / MAX: 22.81
  3: 22.02, SE +/- 0.01, N = 3, MIN: 21.92 / MAX: 23.68

Target: CPU - Model: alexnet
  1: 19.85, SE +/- 0.15, N = 3, MIN: 19.6 / MAX: 32.37
  2: 19.68, SE +/- 0.01, N = 3, MIN: 19.61 / MAX: 20.82
  3: 19.85, SE +/- 0.14, N = 3, MIN: 19.59 / MAX: 30.77

Target: CPU - Model: resnet50
  1: 40.64, SE +/- 0.04, N = 3, MIN: 40.48 / MAX: 65.35
  2: 40.60, SE +/- 0.01, N = 3, MIN: 40.48 / MAX: 44.38
  3: 40.64, SE +/- 0.02, N = 3, MIN: 40.46 / MAX: 53.89

Target: CPU - Model: yolov4-tiny
  1: 37.62, SE +/- 0.15, N = 3, MIN: 37.32 / MAX: 50.74
  2: 37.54, SE +/- 0.03, N = 3, MIN: 37.34 / MAX: 39.51
  3: 37.70, SE +/- 0.12, N = 3, MIN: 37.42 / MAX: 46.13

Target: CPU - Model: squeezenet_ssd
  1: 30.28, SE +/- 0.13, N = 3, MIN: 29.93 / MAX: 33.92
  2: 30.12, SE +/- 0.01, N = 3, MIN: 29.94 / MAX: 32.08
  3: 30.21, SE +/- 0.09, N = 3, MIN: 29.93 / MAX: 43.99

Target: CPU - Model: regnety_400m
  1: 11.82, SE +/- 0.01, N = 3, MIN: 11.78 / MAX: 12.01
  2: 11.85, SE +/- 0.01, N = 3, MIN: 11.79 / MAX: 12.72
  3: 11.84, SE +/- 0.01, N = 3, MIN: 11.79 / MAX: 11.98

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
TNN 0.3 (ms, Fewer Is Better)

Target: CPU - Model: DenseNet
  1: 4517.98, SE +/- 4.13, N = 3, MIN: 4498.13 / MAX: 4539.84
  2: 4523.53, SE +/- 7.80, N = 3, MIN: 4491.56 / MAX: 4557.66
  3: 4527.51, SE +/- 13.48, N = 3, MIN: 4494.26 / MAX: 4573.95

Target: CPU - Model: MobileNet v2
  1: 391.27, SE +/- 0.30, N = 3, MIN: 390.1 / MAX: 402.52
  2: 389.95, SE +/- 0.09, N = 3, MIN: 388.73 / MAX: 395.45
  3: 389.91, SE +/- 0.14, N = 3, MIN: 388.91 / MAX: 392.91

Target: CPU - Model: SqueezeNet v2
  1: 80.91, SE +/- 0.06, N = 3, MIN: 80.39 / MAX: 81.96
  2: 80.89, SE +/- 0.06, N = 3, MIN: 80.35 / MAX: 81.76
  3: 80.97, SE +/- 0.07, N = 3, MIN: 80.42 / MAX: 82.46

Target: CPU - Model: SqueezeNet v1.1
  1: 341.87, SE +/- 0.48, N = 3, MIN: 340.07 / MAX: 354.64
  2: 341.41, SE +/- 0.06, N = 3, MIN: 340.29 / MAX: 342.56
  3: 340.96, SE +/- 0.60, N = 3, MIN: 338.98 / MAX: 347.76

1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
Phoronix Test Suite v10.8.5