mnn ncnn zen 1 epyc

AMD EPYC 7551 32-Core testing with a GIGABYTE MZ31-AR0-00 v01010101 (F10 BIOS) and ASPEED on Debian 11 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2208156-NE-MNNNCNNZE48&sor.

mnn ncnn zen 1 epyc: system details (identical for runs A, B, and C)

  Processor:          AMD EPYC 7551 32-Core @ 2.00GHz (32 Cores / 64 Threads)
  Motherboard:        GIGABYTE MZ31-AR0-00 v01010101 (F10 BIOS)
  Chipset:            AMD 17h
  Memory:             8 x 4 GB DDR4-2133MT/s 9ASF51272PZ-2G6E1
  Disk:               Samsung SSD 960 EVO 500GB
  Graphics:           ASPEED
  Network:            Realtek RTL8111/8168/8411 + 2 x Broadcom NetXtreme II BCM57810 10
  OS:                 Debian 11
  Kernel:             5.10.0-9-amd64 (x86_64)
  Compiler:           GCC 10.2.1 20210110
  File-System:        ext4
  Screen Resolution:  1024x768

Kernel Details
  - Transparent Huge Pages: always

Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-mutex --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-Km9U7s/gcc-10-10.2.1/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-Km9U7s/gcc-10-10.2.1/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
  - CPU Microcode: 0x8001227

Security Details
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling
  - srbds: Not affected
  - tsx_async_abort: Not affected

mnn ncnn zen 1 epyc: result summary (ms, fewer is better)

  Test                                  A         B         C
  mnn: mobilenetV3                      5.281     5.464     6.070
  mnn: squeezenetv1.1                   9.157     9.009     9.516
  mnn: resnet-v2-50                     52.380    54.152    57.826
  mnn: SqueezeNetV1.0                   14.090    13.983    14.492
  mnn: MobileNetV2_224                  10.048    8.595     9.898
  mnn: mobilenet-v1-1.0                 8.341     8.119     8.783
  mnn: inception-v3                     59.589    57.847    59.955
  ncnn: CPU - mobilenet                 45.00     41.52     45.88
  ncnn: CPU-v2-v2 - mobilenet-v2        24.04     24.63     27.05
  ncnn: CPU-v3-v3 - mobilenet-v3        21.60     23.51     22.35
  ncnn: CPU - shufflenet-v2             29.16     26.61     27.07
  ncnn: CPU - mnasnet                   22.37     22.07     26.29
  ncnn: CPU - efficientnet-b0           33.19     34.65     31.80
  ncnn: CPU - blazeface                 12.28     14.47     11.38
  ncnn: CPU - googlenet                 59.41     53.82     53.24
  ncnn: CPU - vgg16                     94.47     98.99     92.70
  ncnn: CPU - resnet18                  45.42     45.65     40.39
  ncnn: CPU - alexnet                   41.36     40.94     42.47
  ncnn: CPU - resnet50                  71.34     77.61     65.41
  ncnn: CPU - yolov4-tiny               63.28     55.48     60.06
  ncnn: CPU - squeezenet_ssd            60.88     56.80     58.65
  ncnn: CPU - regnety_400m              103.41    96.94     100.75
  ncnn: CPU - vision_transformer        353.68    352.36    350.58
  ncnn: CPU - FastestDet                32.28     32.75     30.45
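
The summary figures above span very different scales (about 5 ms to about 350 ms), so a plain average would be dominated by the slowest model. One common way to condense them into a single number per run is a geometric mean. The sketch below is illustrative only: the function names are made up for this example, the values are copied from the summary in the same test order, and nothing here is part of the Phoronix Test Suite itself.

```python
from math import exp, log

# Per-run results in ms, copied from the result summary (same test order:
# 7 MNN models followed by 17 NCNN models).
RESULTS = {
    "A": [5.281, 9.157, 52.380, 14.090, 10.048, 8.341, 59.589, 45.00,
          24.04, 21.60, 29.16, 22.37, 33.19, 12.28, 59.41, 94.47,
          45.42, 41.36, 71.34, 63.28, 60.88, 103.41, 353.68, 32.28],
    "B": [5.464, 9.009, 54.152, 13.983, 8.595, 8.119, 57.847, 41.52,
          24.63, 23.51, 26.61, 22.07, 34.65, 14.47, 53.82, 98.99,
          45.65, 40.94, 77.61, 55.48, 56.80, 96.94, 352.36, 32.75],
    "C": [6.070, 9.516, 57.826, 14.492, 9.898, 8.783, 59.955, 45.88,
          27.05, 22.35, 27.07, 26.29, 31.80, 11.38, 53.24, 92.70,
          40.39, 42.47, 65.41, 60.06, 58.65, 100.75, 350.58, 30.45],
}


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return exp(sum(log(v) for v in values) / len(values))


def summarize(results):
    """Return {run label: geometric mean in ms} for every run."""
    return {run: geometric_mean(vals) for run, vals in results.items()}


if __name__ == "__main__":
    # Print runs from lowest (best) to highest geometric mean.
    for run, gm in sorted(summarize(RESULTS).items(), key=lambda kv: kv[1]):
        print(f"{run}: {gm:.2f} ms")
```

Because A, B, and C were recorded on the same hardware and software configuration, any gap between their geometric means is best read as run-to-run variation rather than a difference between systems.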

Mobile Neural Network

Model: mobilenetV3

Mobile Neural Network 2.0, Model: mobilenetV3 (ms, fewer is better)

  Run   Result   SE          N   MIN     MAX
  A     5.281    +/- 0.074   5   5       10.66
  B     5.464    +/- 0.230   9   4.57    11.22
  C     6.070    +/- 0.285   9   4.96    9.8

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
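
Each result above comes with a standard error (SE), which gives a rough way to judge whether two runs really differ. The sketch below divides the gap between two means by their combined standard error, using the mobilenetV3 figures. It is a back-of-the-envelope heuristic that assumes the runs are independent, not a formal significance test, and the function name is hypothetical.

```python
from math import sqrt

# (mean in ms, standard error) for Model: mobilenetV3, copied from the
# figures above.
MOBILENET_V3 = {"A": (5.281, 0.074), "B": (5.464, 0.230), "C": (6.070, 0.285)}


def gap_in_standard_errors(first, second):
    """Absolute difference of two means over their combined standard error."""
    (mean_1, se_1), (mean_2, se_2) = first, second
    return abs(mean_1 - mean_2) / sqrt(se_1 ** 2 + se_2 ** 2)


if __name__ == "__main__":
    runs = sorted(MOBILENET_V3)
    for i, left in enumerate(runs):
        for right in runs[i + 1:]:
            gap = gap_in_standard_errors(MOBILENET_V3[left], MOBILENET_V3[right])
            print(f"{left} vs {right}: {gap:.2f} combined standard errors")
```

A ratio well below 1 suggests the two runs are indistinguishable at this sample size, while a ratio above roughly 2 hints at a difference worth a closer look.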

Mobile Neural Network

Model: squeezenetv1.1

Mobile Neural Network 2.0, Model: squeezenetv1.1 (ms, fewer is better)

  Run   Result   SE          N   MIN     MAX
  B     9.009    +/- 0.283   9   7.68    15.21
  A     9.157    +/- 0.160   5   8.54    14.69
  C     9.516    +/- 0.364   9   8.27    17.73

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: resnet-v2-50

Mobile Neural Network 2.0, Model: resnet-v2-50 (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  A     52.38    +/- 1.12   5   47.96   141.5
  B     54.15    +/- 2.76   9   47.92   164.33
  C     57.83    +/- 2.46   9   50.22   132.71

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: SqueezeNetV1.0

Mobile Neural Network 2.0, Model: SqueezeNetV1.0 (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  B     13.98    +/- 0.36   9   12.79   21.34
  A     14.09    +/- 0.32   5   12.99   16.82
  C     14.49    +/- 0.35   9   12.97   21.04

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: MobileNetV2_224

Mobile Neural Network 2.0, Model: MobileNetV2_224 (ms, fewer is better)

  Run   Result   SE          N   MIN     MAX
  B     8.595    +/- 0.066   9   7.48    39.13
  C     9.898    +/- 0.196   9   8.56    34.58
  A     10.048   +/- 0.209   5   8.59    32.81

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: mobilenet-v1-1.0

Mobile Neural Network 2.0, Model: mobilenet-v1-1.0 (ms, fewer is better)

  Run   Result   SE          N   MIN     MAX
  B     8.119    +/- 0.478   9   7.14    17.24
  A     8.341    +/- 0.583   5   7.04    16.49
  C     8.783    +/- 0.610   9   6.3     16.9

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network

Model: inception-v3

Mobile Neural Network 2.0, Model: inception-v3 (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  B     57.85    +/- 3.36   9   51.06   261.45
  A     59.59    +/- 2.14   5   54.7    74.6
  C     59.96    +/- 2.98   9   51.78   271.46

1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

NCNN

Target: CPU - Model: mobilenet

NCNN 20220729, Target: CPU - Model: mobilenet (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  B     41.52    +/- 1.16   9   35.31   563.93
  A     45.00    +/- 2.07   9   35.78   540.85
  C     45.88    +/- 3.76   9   35.78   543.71

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20220729, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  A     24.04    +/- 1.12   9   19.32   466.34
  B     24.63    +/- 2.10   9   18.22   475.35
  C     27.05    +/- 2.11   9   18.97   481.63

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20220729, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  A     21.60    +/- 1.04   9   18.42   485.33
  C     22.35    +/- 1.30   9   18.39   509.43
  B     23.51    +/- 1.97   9   17.83   487.74

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20220729, Target: CPU - Model: shufflenet-v2 (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  B     26.61    +/- 1.64   9   21.24   559.88
  C     27.07    +/- 1.75   9   21.66   552.29
  A     29.16    +/- 4.25   9   21.67   560.88

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20220729, Target: CPU - Model: mnasnet (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  B     22.07    +/- 1.74   9   17      472.21
  A     22.37    +/- 1.65   9   17.86   463.92
  C     26.29    +/- 2.10   9   17.4    477.82

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20220729, Target: CPU - Model: efficientnet-b0 (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  C     31.80    +/- 1.79   9   24.43   693.07
  A     33.19    +/- 2.26   9   24.9    696.74
  B     34.65    +/- 2.89   9   24.08   682.94

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: blazeface

NCNN 20220729, Target: CPU - Model: blazeface (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  C     11.38    +/- 0.13   9   9.98    289.17
  A     12.28    +/- 0.71   9   9.88    292.47
  B     14.47    +/- 2.43   9   9.52    295.9

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: googlenet

NCNN 20220729, Target: CPU - Model: googlenet (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  C     53.24    +/- 3.42   9   40.33   740.64
  B     53.82    +/- 4.18   9   39.13   737.69
  A     59.41    +/- 4.24   9   40.06   739.56

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: vgg16

NCNN 20220729, Target: CPU - Model: vgg16 (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  C     92.70    +/- 6.48   9   61.84   235.51
  A     94.47    +/- 4.80   9   60.89   237.15
  B     98.99    +/- 7.53   9   56.24   223.87

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: resnet18

NCNN 20220729, Target: CPU - Model: resnet18 (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  C     40.39    +/- 3.06   9   28.07   300.7
  A     45.42    +/- 4.60   9   28.71   297.65
  B     45.65    +/- 5.31   9   25.28   296.12

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: alexnet

NCNN 20220729, Target: CPU - Model: alexnet (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  B     40.94    +/- 5.19   9   18.3    145.6
  A     41.36    +/- 4.07   9   22.74   146.56
  C     42.47    +/- 6.22   9   20.35   145.06

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: resnet50

NCNN 20220729, Target: CPU - Model: resnet50 (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  C     65.41    +/- 4.24   9   47.44   680.35
  A     71.34    +/- 4.54   9   49.22   673.07
  B     77.61    +/- 4.17   9   43.89   663.09

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20220729, Target: CPU - Model: yolov4-tiny (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  B     55.48    +/- 1.09   9   47.61   284.14
  C     60.06    +/- 2.83   9   47.24   293.12
  A     63.28    +/- 3.13   9   47.49   295.76

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: squeezenet_ssd

NCNN 20220729, Target: CPU - Model: squeezenet_ssd (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  B     56.80    +/- 3.24   9   41.76   641.24
  C     58.65    +/- 2.64   8   42.73   650.92
  A     60.88    +/- 3.79   9   42.5    643.01

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: regnety_400m

NCNN 20220729, Target: CPU - Model: regnety_400m (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  B     96.94    +/- 5.95   9   81.39   3202.54
  C     100.75   +/- 4.65   9   82.33   2992.23
  A     103.41   +/- 6.20   9   81.96   3182

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: vision_transformer

NCNN 20220729, Target: CPU - Model: vision_transformer (ms, fewer is better)

  Run   Result   SE         N   MIN      MAX
  C     350.58   +/- 1.57   9   285.44   1166.88
  B     352.36   +/- 2.37   9   270.26   969.01
  A     353.68   +/- 2.70   9   271.26   973.13

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: CPU - Model: FastestDet

NCNN 20220729, Target: CPU - Model: FastestDet (ms, fewer is better)

  Run   Result   SE         N   MIN     MAX
  C     30.45    +/- 1.25   9   25.8    616.56
  A     32.28    +/- 1.74   9   26.06   617.85
  B     32.75    +/- 2.48   9   25.68   619.82

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread


Phoronix Test Suite v10.8.4