mnn ncnn xeon

2 x Intel Xeon Platinum 8380 testing with an Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED on CentOS Stream 9 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2208133-NE-MNNNCNNXE63&gru&sor .
mnn ncnn xeon: system details

Runs: A, B, C, D, E

  Processor:          2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads)
  Motherboard:        Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS)
  Chipset:            Intel Device 0998
  Memory:             512GB
  Disk:               7682GB INTEL SSDPF2KX076TZ
  Graphics:           ASPEED
  Monitor:            VE228
  Network:            2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
  OS:                 CentOS Stream 9
  Kernel:             5.14.0-142.el9.x86_64 (x86_64)
  Desktop:            GNOME Shell 40.10
  Display Server:     X Server
  Compiler:           GCC 11.3.1 20220421
  File-System:        xfs
  Screen Resolution:  1920x1080

Kernel Details
  - Transparent Huge Pages: always

Compiler Details
  - --build=x86_64-redhat-linux --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-host-bind-now --enable-host-pie --enable-initfini-array --enable-languages=c,c++,fortran,lto --enable-link-serialization=1 --enable-multilib --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=x86-64 --with-arch_64=x86-64-v2 --with-build-config=bootstrap-lto --with-gcc-major-version-only --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver --without-isl

Processor Details
  - Scaling Governor: intel_pstate powersave (EPP: balance_performance)
  - CPU Microcode: 0xd000363

Security Details
  - SELinux
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable
  - spec_store_bypass: Mitigation of SSB disabled via prctl
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
  - srbds: Not affected
  - tsx_async_abort: Not affected
mnn ncnn xeon: results overview (all values in ms, fewer is better)

  Test                             A        B        C        D        E
  mnn: mobilenetV3                 1.807    1.837    1.823    1.865    1.865
  mnn: squeezenetv1.1              2.317    2.395    2.425    2.447    2.520
  mnn: resnet-v2-50                8.782    9.081    8.766    8.886    9.199
  mnn: SqueezeNetV1.0              4.121    4.213    4.295    4.219    4.321
  mnn: MobileNetV2_224             3.415    3.151    3.217    3.229    3.136
  mnn: mobilenet-v1-1.0            2.228    2.198    2.167    2.211    2.217
  mnn: inception-v3                20.690   20.812   20.853   20.829   21.346
  ncnn: CPU - mobilenet            21.91    21.68    22.35    22.85    21.85
  ncnn: CPU-v2-v2 - mobilenet-v2   12.99    12.81    12.79    12.81    13.31
  ncnn: CPU-v3-v3 - mobilenet-v3   12.03    12.21    12.25    12.12    12.21
  ncnn: CPU - shufflenet-v2        13.47    13.72    13.47    13.45    13.63
  ncnn: CPU - mnasnet              11.92    12.05    11.91    11.74    11.79
  ncnn: CPU - efficientnet-b0      16.89    16.67    17.75    16.54    16.50
  ncnn: CPU - blazeface            7.15     7.34     7.28     7.06     7.23
  ncnn: CPU - googlenet            23.74    22.53    23.11    23.52    22.64
  ncnn: CPU - vgg16                30.91    29.05    29.63    31.20    28.68
  ncnn: CPU - resnet18             13.37    13.18    13.51    13.87    13.20
  ncnn: CPU - alexnet              8.89     8.51     8.53     9.12     8.62
  ncnn: CPU - resnet50             24.75    24.49    25.14    25.27    24.34
  ncnn: CPU - yolov4-tiny          27.80    27.25    27.28    27.38    26.74
  ncnn: CPU - squeezenet_ssd       26.34    26.36    27.17    26.66    25.96
  ncnn: CPU - regnety_400m         57.14    57.76    57.92    56.90    58.68
  ncnn: CPU - vision_transformer   151.78   152.63   152.79   155.31   155.90
  ncnn: CPU - FastestDet           15.33    14.98    15.56    14.92    14.93
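The per-run averages listed above can be compared by expressing the gap between the fastest and slowest run as a percentage of the fastest. The following is a minimal Python sketch of that arithmetic for three of the tests; the helper name `spread_percent` and the choice of tests are illustrative assumptions and are not part of the Phoronix Test Suite or of this export.

```python
# Average latencies in ms (fewer is better), copied from the overview
# for three of the 24 tests. Run labels A-E are those used in the export.
results = {
    "mnn: mobilenetV3": {"A": 1.807, "B": 1.837, "C": 1.823, "D": 1.865, "E": 1.865},
    "ncnn: CPU - vgg16": {"A": 30.91, "B": 29.05, "C": 29.63, "D": 31.20, "E": 28.68},
    "ncnn: CPU - vision_transformer": {
        "A": 151.78, "B": 152.63, "C": 152.79, "D": 155.31, "E": 155.90,
    },
}


def spread_percent(runs):
    """Hypothetical helper: return (fastest run, slowest run, gap in percent).

    The gap is (slowest - fastest) / fastest * 100. When two runs tie,
    the one listed first in the dict is reported.
    """
    fastest = min(runs, key=runs.get)
    slowest = max(runs, key=runs.get)
    gap = (runs[slowest] - runs[fastest]) / runs[fastest] * 100.0
    return fastest, slowest, gap


for test, runs in results.items():
    fastest, slowest, gap = spread_percent(runs)
    print(f"{test}: fastest {fastest}, slowest {slowest}, spread {gap:.1f}%")
```

A spread of this kind is only a rough indicator: it should be read against the SE figures reported for each test, since most runs here have N = 3.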
Mobile Neural Network 2.0, Model: mobilenetV3 (ms, Fewer Is Better)

  Run   Result   SE +/-   N    MIN    MAX
  A     1.807    0.026    3    1.72   4.25
  C     1.823    0.019    14   1.67   2.18
  B     1.837    0.023    3    1.78   1.98
  D     1.865    0.018    15   1.77   4.16
  E     1.865    0.024    3    1.79   2.12

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2.0, Model: squeezenetv1.1 (ms, Fewer Is Better)

  Run   Result   SE +/-   N    MIN    MAX
  A     2.317    0.101    3    2.17   3.61
  B     2.395    0.110    3    2.17   4.44
  C     2.425    0.054    14   2.11   3.4
  D     2.447    0.027    15   2.3    3.94
  E     2.520    0.079    3    2.34   6.7

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2.0, Model: resnet-v2-50 (ms, Fewer Is Better)

  Run   Result   SE +/-   N    MIN    MAX
  C     8.766    0.057    14   8.09   22.74
  A     8.782    0.149    3    8.31   21.48
  D     8.886    0.044    15   8.15   22.2
  B     9.081    0.142    3    8.4    24.9
  E     9.199    0.027    3    8.97   9.91

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2.0, Model: SqueezeNetV1.0 (ms, Fewer Is Better)

  Run   Result   SE +/-   N    MIN    MAX
  A     4.121    0.164    3    3.72   13.26
  B     4.213    0.097    3    3.63   8.6
  D     4.219    0.039    15   3.69   15.2
  C     4.295    0.058    14   3.71   11.95
  E     4.321    0.116    3    3.82   9.63

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2.0, Model: MobileNetV2_224 (ms, Fewer Is Better)

  Run   Result   SE +/-   N    MIN    MAX
  E     3.136    0.105    3    2.79   8.34
  B     3.151    0.076    3    2.57   8.86
  C     3.217    0.049    14   2.68   10.02
  D     3.229    0.047    15   2.53   9.07
  A     3.415    0.143    3    2.58   12.21

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2.0, Model: mobilenet-v1-1.0 (ms, Fewer Is Better)

  Run   Result   SE +/-   N    MIN    MAX
  C     2.167    0.028    13   1.83   2.37
  B     2.198    0.007    3    2.15   2.45
  D     2.211    0.013    15   2.07   5.55
  E     2.217    0.021    3    2.16   2.33
  A     2.228    0.021    3    2.17   2.38

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2.0, Model: inception-v3 (ms, Fewer Is Better)

  Run   Result   SE +/-   N    MIN     MAX
  A     20.69    0.44     3    18.2    41.46
  B     20.81    0.18     3    19.89   32.91
  D     20.83    0.09     15   19.44   41.2
  C     20.85    0.12     14   19.34   46.76
  E     21.35    0.10     3    18.43   39.41

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN 20220729, Target: CPU - Model: mobilenet (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN     MAX
  B     21.68    0.02     3   21.29   45.51
  E     21.85    0.16     3   21.29   94.26
  A     21.91    0.29     3   21.02   61.03
  C     22.35    0.16     3   21.7    47.14
  D     22.85    0.10     3   21.59   246.4

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN     MAX
  C     12.79    0.05     3   12.42   18.33
  B     12.81    0.04     3   12.31   37.43
  D     12.81    0.09     3   12.34   89.44
  A     12.99    0.14     3   12.1    146.34
  E     13.31    0.30     3   12.14   236.38

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN     MAX
  A     12.03    0.14     3   11.55   35.77
  D     12.12    0.03     3   11.84   34.59
  B     12.21    0.20     3   11.64   151.41
  E     12.21    0.16     3   11.67   35.66
  C     12.25    0.06     3   11.87   36.44

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN     MAX
  D     13.45    0.10     3   12.63   79.26
  A     13.47    0.10     3   12.89   36.76
  C     13.47    0.16     3   12.86   37.29
  E     13.63    0.26     3   12.59   105.4
  B     13.72    0.31     3   13.1    82.58

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: mnasnet (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN     MAX
  D     11.74    0.13     3   11.27   17.51
  E     11.79    0.08     3   11.19   63.29
  C     11.91    0.10     3   11.32   35.73
  A     11.92    0.10     3   11.57   36.77
  B     12.05    0.16     3   11.5    157.89

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN     MAX
  E     16.50    0.30     3   15.68   40.87
  D     16.54    0.29     3   15.61   86.38
  B     16.67    0.14     3   15.93   60.85
  A     16.89    0.14     3   16.1    73.41
  C     17.75    0.73     3   15.75   590.87

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: blazeface (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN    MAX
  D     7.06     0.12     3   6.68   8.27
  A     7.15     0.08     3   6.83   10.24
  E     7.23     0.05     3   6.99   10.09
  C     7.28     0.11     3   6.88   9.96
  B     7.34     0.20     3   6.93   121.89

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: googlenet (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN     MAX
  B     22.53    0.21     3   21.73   94.83
  E     22.64    0.29     3   21.47   240.52
  C     23.11    0.48     3   21.75   82.56
  D     23.52    0.45     3   22.04   299.28
  A     23.74    0.56     3   21.23   403.67

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: vgg16 (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN     MAX
  E     28.68    0.04     3   26.99   114.83
  B     29.05    0.07     3   27.32   149.49
  C     29.63    0.23     3   27.23   201.62
  A     30.91    1.59     3   26.06   119.07
  D     31.20    0.60     3   28.77   277.95

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: resnet18 (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN     MAX
  B     13.18    0.11     3   12.67   65.17
  E     13.20    0.12     3   12.71   82.75
  A     13.37    0.24     3   12.55   19.07
  C     13.51    0.15     3   12.95   158.24
  D     13.87    0.61     3   12.86   18.89

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: alexnet (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN    MAX
  B     8.51     0.16     3   8.08   100.34
  C     8.53     0.08     3   8.13   49.08
  E     8.62     0.37     3   7.91   231.7
  A     8.89     0.36     3   7.96   119.99
  D     9.12     0.39     3   8.11   130.97

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: resnet50 (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN     MAX
  E     24.34    0.11     3   23.27   121.62
  B     24.49    0.30     3   23.44   111.14
  A     24.75    0.66     3   23.11   123.68
  C     25.14    0.19     3   23.76   106.76
  D     25.27    0.11     3   23.98   190.68

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN     MAX
  E     26.74    0.19     3   25.94   113.75
  B     27.25    0.25     3   25.94   352.26
  C     27.28    0.03     3   26.44   51.63
  D     27.38    0.26     3   25.63   358.69
  A     27.80    0.43     3   25.29   281.03

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN     MAX
  E     25.96    0.13     3   25.3    49.75
  A     26.34    0.14     3   24.99   249.3
  B     26.36    0.14     3   25.39   197.69
  D     26.66    0.27     3   24.96   198.39
  C     27.17    0.43     3   25.76   481.76

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: regnety_400m (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN     MAX
  D     56.90    1.22     3   53.28   385.57
  A     57.14    1.60     3   53.37   430.33
  B     57.76    0.73     3   55.37   447.81
  C     57.92    0.98     3   55.21   203.4
  E     58.68    0.57     3   54.21   808.52

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: vision_transformer (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN      MAX
  A     151.78   1.55     3   145.5    656.75
  B     152.63   0.18     3   147.33   369.79
  C     152.79   0.64     3   145.98   797.33
  D     155.31   0.39     3   145.23   1014.26
  E     155.90   0.33     3   146.82   812.77

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20220729, Target: CPU - Model: FastestDet (ms, Fewer Is Better)

  Run   Result   SE +/-   N   MIN     MAX
  D     14.92    0.22     3   14.41   20.23
  E     14.93    0.09     3   14.47   37.58
  B     14.98    0.20     3   14.48   39.53
  A     15.33    0.14     3   14.81   39.13
  C     15.56    0.45     3   14.29   72.49

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Phoronix Test Suite v10.8.5