mnn ncnn zen 1 epyc

AMD EPYC 7551 32-Core testing with a GIGABYTE MZ31-AR0-00 v01010101 (F10 BIOS) and ASPEED on Debian 11 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2208156-NE-MNNNCNNZE48.

System details (runs A, B, C)

Processor: AMD EPYC 7551 32-Core @ 2.00GHz (32 Cores / 64 Threads)
Motherboard: GIGABYTE MZ31-AR0-00 v01010101 (F10 BIOS)
Chipset: AMD 17h
Memory: 8 x 4 GB DDR4-2133MT/s 9ASF51272PZ-2G6E1
Disk: Samsung SSD 960 EVO 500GB
Graphics: ASPEED
Network: Realtek RTL8111/8168/8411 + 2 x Broadcom NetXtreme II BCM57810 10
OS: Debian 11
Kernel: 5.10.0-9-amd64 (x86_64)
Compiler: GCC 10.2.1 20210110
File-System: ext4
Screen Resolution: 1024x768

Kernel Details
- Transparent Huge Pages: always

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-mutex --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-Km9U7s/gcc-10-10.2.1/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-Km9U7s/gcc-10-10.2.1/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0x8001227

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

Result overview (all values in ms, fewer is better)

Test                              A        B        C
mnn: mobilenetV3                  5.281    5.464    6.070
mnn: squeezenetv1.1               9.157    9.009    9.516
mnn: resnet-v2-50                 52.380   54.152   57.826
mnn: SqueezeNetV1.0               14.090   13.983   14.492
mnn: MobileNetV2_224              10.048   8.595    9.898
mnn: mobilenet-v1-1.0             8.341    8.119    8.783
mnn: inception-v3                 59.589   57.847   59.955
ncnn: CPU - mobilenet             45.00    41.52    45.88
ncnn: CPU-v2-v2 - mobilenet-v2    24.04    24.63    27.05
ncnn: CPU-v3-v3 - mobilenet-v3    21.60    23.51    22.35
ncnn: CPU - shufflenet-v2         29.16    26.61    27.07
ncnn: CPU - mnasnet               22.37    22.07    26.29
ncnn: CPU - efficientnet-b0       33.19    34.65    31.80
ncnn: CPU - blazeface             12.28    14.47    11.38
ncnn: CPU - googlenet             59.41    53.82    53.24
ncnn: CPU - vgg16                 94.47    98.99    92.70
ncnn: CPU - resnet18              45.42    45.65    40.39
ncnn: CPU - alexnet               41.36    40.94    42.47
ncnn: CPU - resnet50              71.34    77.61    65.41
ncnn: CPU - yolov4-tiny           63.28    55.48    60.06
ncnn: CPU - squeezenet_ssd        60.88    56.80    58.65
ncnn: CPU - regnety_400m          103.41   96.94    100.75
ncnn: CPU - vision_transformer    353.68   352.36   350.58
ncnn: CPU - FastestDet            32.28    32.75    30.45

Mobile Neural Network

Model: mobilenetV3

Mobile Neural Network 2.0 - Model: mobilenetV3 - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     5.281    0.074    5   5       10.66
B     5.464    0.230    9   4.57    11.22
C     6.070    0.285    9   4.96    9.8

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
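Each result reports a mean and a standard error (SE) per run, so one rough way to read the differences between runs A, B and C is to express each gap in units of the combined standard error. The sketch below does this for the mobilenetV3 figures above. The function names are hypothetical, the runs are assumed to be independent, and this check is not something the Phoronix Test Suite itself reports.

```python
import math

# (mean in ms, standard error) per run, copied from the mobilenetV3 result.
RUNS = {"A": (5.281, 0.074), "B": (5.464, 0.230), "C": (6.070, 0.285)}


def gap_in_standard_errors(first, second):
    """Return |difference of means| divided by the combined standard error.

    Assumes the two runs are independent, so their standard errors
    combine as the square root of the sum of squares.
    """
    mean_1, se_1 = first
    mean_2, se_2 = second
    combined_se = math.hypot(se_1, se_2)
    return abs(mean_1 - mean_2) / combined_se


def compare_runs(runs):
    """Compute the gap for every unordered pair of runs."""
    names = sorted(runs)
    return {
        (a, b): gap_in_standard_errors(runs[a], runs[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


if __name__ == "__main__":
    for (a, b), gap in compare_runs(RUNS).items():
        print(f"{a} vs {b}: {gap:.2f} combined standard errors apart")
```

As a rule of thumb (an assumption, not a formal test), a gap well below about 2 combined standard errors is hard to distinguish from run-to-run noise, which matters here because all three runs use the same hardware and software.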

Mobile Neural Network

Model: squeezenetv1.1

Mobile Neural Network 2.0 - Model: squeezenetv1.1 - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     9.157    0.160    5   8.54    14.69
B     9.009    0.283    9   7.68    15.21
C     9.516    0.364    9   8.27    17.73

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: resnet-v2-50

Mobile Neural Network 2.0 - Model: resnet-v2-50 - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     52.38    1.12     5   47.96   141.5
B     54.15    2.76     9   47.92   164.33
C     57.83    2.46     9   50.22   132.71

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: SqueezeNetV1.0

Mobile Neural Network 2.0 - Model: SqueezeNetV1.0 - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     14.09    0.32     5   12.99   16.82
B     13.98    0.36     9   12.79   21.34
C     14.49    0.35     9   12.97   21.04

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: MobileNetV2_224

Mobile Neural Network 2.0 - Model: MobileNetV2_224 - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     10.048   0.209    5   8.59    32.81
B     8.595    0.066    9   7.48    39.13
C     9.898    0.196    9   8.56    34.58

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: mobilenet-v1-1.0

Mobile Neural Network 2.0 - Model: mobilenet-v1-1.0 - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     8.341    0.583    5   7.04    16.49
B     8.119    0.478    9   7.14    17.24
C     8.783    0.610    9   6.3     16.9

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: inception-v3

Mobile Neural Network 2.0 - Model: inception-v3 - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     59.59    2.14     5   54.7    74.6
B     57.85    3.36     9   51.06   261.45
C     59.96    2.98     9   51.78   271.46

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

NCNN

Target: CPU - Model: mobilenet

NCNN 20220729 - Target: CPU - Model: mobilenet - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     45.00    2.07     9   35.78   540.85
B     41.52    1.16     9   35.31   563.93
C     45.88    3.76     9   35.78   543.71

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20220729 - Target: CPU-v2-v2 - Model: mobilenet-v2 - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     24.04    1.12     9   19.32   466.34
B     24.63    2.10     9   18.22   475.35
C     27.05    2.11     9   18.97   481.63

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20220729 - Target: CPU-v3-v3 - Model: mobilenet-v3 - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     21.60    1.04     9   18.42   485.33
B     23.51    1.97     9   17.83   487.74
C     22.35    1.30     9   18.39   509.43

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20220729 - Target: CPU - Model: shufflenet-v2 - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     29.16    4.25     9   21.67   560.88
B     26.61    1.64     9   21.24   559.88
C     27.07    1.75     9   21.66   552.29

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20220729 - Target: CPU - Model: mnasnet - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     22.37    1.65     9   17.86   463.92
B     22.07    1.74     9   17      472.21
C     26.29    2.10     9   17.4    477.82

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20220729 - Target: CPU - Model: efficientnet-b0 - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     33.19    2.26     9   24.9    696.74
B     34.65    2.89     9   24.08   682.94
C     31.80    1.79     9   24.43   693.07

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: blazeface

NCNN 20220729 - Target: CPU - Model: blazeface - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     12.28    0.71     9   9.88    292.47
B     14.47    2.43     9   9.52    295.9
C     11.38    0.13     9   9.98    289.17

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: googlenet

NCNN 20220729 - Target: CPU - Model: googlenet - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     59.41    4.24     9   40.06   739.56
B     53.82    4.18     9   39.13   737.69
C     53.24    3.42     9   40.33   740.64

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: vgg16

NCNN 20220729 - Target: CPU - Model: vgg16 - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     94.47    4.80     9   60.89   237.15
B     98.99    7.53     9   56.24   223.87
C     92.70    6.48     9   61.84   235.51

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: resnet18

NCNN 20220729 - Target: CPU - Model: resnet18 - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     45.42    4.60     9   28.71   297.65
B     45.65    5.31     9   25.28   296.12
C     40.39    3.06     9   28.07   300.7

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: alexnet

NCNN 20220729 - Target: CPU - Model: alexnet - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     41.36    4.07     9   22.74   146.56
B     40.94    5.19     9   18.3    145.6
C     42.47    6.22     9   20.35   145.06

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: resnet50

NCNN 20220729 - Target: CPU - Model: resnet50 - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     71.34    4.54     9   49.22   673.07
B     77.61    4.17     9   43.89   663.09
C     65.41    4.24     9   47.44   680.35

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20220729 - Target: CPU - Model: yolov4-tiny - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     63.28    3.13     9   47.49   295.76
B     55.48    1.09     9   47.61   284.14
C     60.06    2.83     9   47.24   293.12

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: squeezenet_ssd

NCNN 20220729 - Target: CPU - Model: squeezenet_ssd - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     60.88    3.79     9   42.5    643.01
B     56.80    3.24     9   41.76   641.24
C     58.65    2.64     8   42.73   650.92

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: regnety_400m

NCNN 20220729 - Target: CPU - Model: regnety_400m - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     103.41   6.20     9   81.96   3182
B     96.94    5.95     9   81.39   3202.54
C     100.75   4.65     9   82.33   2992.23

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: vision_transformer

NCNN 20220729 - Target: CPU - Model: vision_transformer - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN      MAX
A     353.68   2.70     9   271.26   973.13
B     352.36   2.37     9   270.26   969.01
C     350.58   1.57     9   285.44   1166.88

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: FastestDet

NCNN 20220729 - Target: CPU - Model: FastestDet - ms, Fewer Is Better

Run   Result   SE +/-   N   MIN     MAX
A     32.28    1.74     9   26.06   617.85
B     32.75    2.48     9   25.68   619.82
C     30.45    1.25     9   25.8    616.56

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread


Phoronix Test Suite v10.8.4