TNN NCNN
AMD Ryzen 9 3950X 16-Core testing with an ASUS ROG CROSSHAIR VIII HERO (WI-FI) (1302 BIOS) and Sapphire AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 6GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2009258-PTS-TNNNCNN783
TNN NCNN: system details
Runs: 3950X + Navi, 2, 3
Processor: AMD Ryzen 9 3950X 16-Core @ 3.50GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (1302 BIOS)
Chipset: AMD Starship/Matisse
Memory: 16GB
Disk: 2000GB Corsair Force MP600 + 2000GB
Graphics: Sapphire AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 6GB (1780/875MHz)
Audio: AMD Navi 10 HDMI Audio
Monitor: DELL P2415Q
Network: Realtek RTL8125 2.5GbE + Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 20.04
Kernel: 5.9.0-050900rc6daily20200925-generic (x86_64) 20200924
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
OpenGL: 4.6 Mesa 20.3.0-devel (git-3173367 2020-09-25 focal-oibaf-ppa) (LLVM 10.0.1)
OpenCL: OpenCL 2.0 AMD-APP (3182.0)
Vulkan: 1.2.145
Compiler: GCC 9.3.0 + CUDA 11.0
File-System: ext4
Screen Resolution: 3840x2160
Compiler Details:
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details:
  Scaling Governor: acpi-cpufreq performance
  CPU Microcode: 0x8701013
Security Details:
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
TNN NCNN: result overview (all values in ms, fewer is better)
Test                                    3950X + Navi         2         3
ncnn: CPU - squeezenet                         15.96     15.85     15.90
ncnn: CPU - mobilenet                          17.57     17.18     17.16
ncnn: CPU-v2-v2 - mobilenet-v2                  5.92      5.83      5.86
ncnn: CPU-v3-v3 - mobilenet-v3                  5.15      5.17      5.16
ncnn: CPU - shufflenet-v2                       5.10      5.10      5.09
ncnn: CPU - mnasnet                             5.21      5.18      5.19
ncnn: CPU - efficientnet-b0                     7.05      7.20      7.01
ncnn: CPU - blazeface                           2.01      2.01      2.01
ncnn: CPU - googlenet                          19.22     19.33     18.90
ncnn: CPU - vgg16                              70.29     70.01     70.29
ncnn: CPU - resnet18                           17.69     17.60     17.43
ncnn: CPU - alexnet                            16.58     16.63     16.50
ncnn: CPU - resnet50                           29.70     29.36     29.53
ncnn: CPU - yolov4-tiny                        29.26     28.94     28.82
ncnn: Vulkan GPU - squeezenet                   3.58      3.58      3.60
ncnn: Vulkan GPU - mobilenet                    6.58      6.58      6.59
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2           2.14      2.14      2.14
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3           3.10      3.10      3.09
ncnn: Vulkan GPU - shufflenet-v2                1.84      1.86      1.83
ncnn: Vulkan GPU - mnasnet                      2.23      2.23      2.23
ncnn: Vulkan GPU - efficientnet-b0              6.83      6.87      6.80
ncnn: Vulkan GPU - blazeface                    0.76      0.76      0.76
ncnn: Vulkan GPU - googlenet                    4.25      4.25      4.26
ncnn: Vulkan GPU - vgg16                       14.97     14.97     14.98
ncnn: Vulkan GPU - resnet18                     1.73      1.72      1.73
ncnn: Vulkan GPU - alexnet                      5.23      5.21      5.19
ncnn: Vulkan GPU - resnet50                     4.95      4.95      4.96
ncnn: Vulkan GPU - yolov4-tiny                  8.65      8.55      8.55
tnn: CPU - MobileNet v2                      244.905   240.615   240.195
tnn: CPU - SqueezeNet v1.1                   236.996   225.247   238.144
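The overview lists every NCNN model once for the CPU target and once for the Vulkan GPU target, so the two can be compared directly. A minimal Python sketch of that arithmetic, using four models copied from the overview; the names `cpu_ms`, `vulkan_ms`, `mean` and `speedup` are illustrative and are not part of the Phoronix Test Suite output:

```python
# Per-run NCNN averages in ms, copied from the result overview
# (order: run "3950X + Navi", run "2", run "3").
# The dictionary and function names are hypothetical helpers for this sketch.
cpu_ms = {
    "squeezenet": [15.96, 15.85, 15.90],
    "vgg16": [70.29, 70.01, 70.29],
    "resnet50": [29.70, 29.36, 29.53],
    "efficientnet-b0": [7.05, 7.20, 7.01],
}
vulkan_ms = {
    "squeezenet": [3.58, 3.58, 3.60],
    "vgg16": [14.97, 14.97, 14.98],
    "resnet50": [4.95, 4.95, 4.96],
    "efficientnet-b0": [6.83, 6.87, 6.80],
}


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def speedup(model):
    """Mean CPU latency divided by mean Vulkan GPU latency.

    A value above 1 means the Vulkan GPU target was faster for that model.
    """
    return mean(cpu_ms[model]) / mean(vulkan_ms[model])


if __name__ == "__main__":
    for model in cpu_ms:
        print(f"{model}: {speedup(model):.2f}x")
```

With these figures the ratio comes out at roughly 4.4x for squeezenet and close to 6x for resnet50, while efficientnet-b0 is nearly identical on both targets (about 1.04x).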
NCNN 20200916 - Target: CPU - Model: squeezenet (ms, Fewer Is Better)
  3950X + Navi: 15.96 (SE +/- 0.14, N = 3; MIN: 15.48 / MAX: 17.29)
  2: 15.85 (SE +/- 0.04, N = 3; MIN: 15.47 / MAX: 16.5)
  3: 15.90 (SE +/- 0.04, N = 3; MIN: 15.48 / MAX: 16.35)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: mobilenet (ms, Fewer Is Better)
  3950X + Navi: 17.57 (SE +/- 0.34, N = 3; MIN: 16.92 / MAX: 18.71)
  2: 17.18 (SE +/- 0.21, N = 3; MIN: 16.64 / MAX: 18.06)
  3: 17.16 (SE +/- 0.10, N = 3; MIN: 16.72 / MAX: 17.56)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  3950X + Navi: 5.92 (SE +/- 0.07, N = 3; MIN: 5.69 / MAX: 29.4)
  2: 5.83 (SE +/- 0.01, N = 3; MIN: 5.69 / MAX: 7.35)
  3: 5.86 (SE +/- 0.02, N = 3; MIN: 5.7 / MAX: 7.43)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  3950X + Navi: 5.15 (SE +/- 0.01, N = 3; MIN: 5.04 / MAX: 16.03)
  2: 5.17 (SE +/- 0.03, N = 3; MIN: 5.05 / MAX: 6.68)
  3: 5.16 (SE +/- 0.01, N = 3; MIN: 5.04 / MAX: 6.76)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  3950X + Navi: 5.10 (SE +/- 0.00, N = 3; MIN: 5.04 / MAX: 6.5)
  2: 5.10 (SE +/- 0.02, N = 3; MIN: 5.02 / MAX: 6.45)
  3: 5.09 (SE +/- 0.01, N = 3; MIN: 5.01 / MAX: 6.95)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  3950X + Navi: 5.21 (SE +/- 0.01, N = 3; MIN: 5.1 / MAX: 7.28)
  2: 5.18 (SE +/- 0.00, N = 3; MIN: 5.07 / MAX: 7.05)
  3: 5.19 (SE +/- 0.01, N = 3; MIN: 5.1 / MAX: 6.68)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  3950X + Navi: 7.05 (SE +/- 0.03, N = 3; MIN: 6.94 / MAX: 7.27)
  2: 7.20 (SE +/- 0.16, N = 3; MIN: 6.89 / MAX: 32.31)
  3: 7.01 (SE +/- 0.01, N = 3; MIN: 6.92 / MAX: 7.51)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: blazeface (ms, Fewer Is Better)
  3950X + Navi: 2.01 (SE +/- 0.00, N = 3; MIN: 1.97 / MAX: 2.14)
  2: 2.01 (SE +/- 0.01, N = 3; MIN: 1.97 / MAX: 2.11)
  3: 2.01 (SE +/- 0.00, N = 3; MIN: 1.98 / MAX: 2.29)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: googlenet (ms, Fewer Is Better)
  3950X + Navi: 19.22 (SE +/- 0.23, N = 3; MIN: 18.31 / MAX: 56.39)
  2: 19.33 (SE +/- 0.35, N = 3; MIN: 18.21 / MAX: 23.94)
  3: 18.90 (SE +/- 0.01, N = 3; MIN: 18.17 / MAX: 19.69)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  3950X + Navi: 70.29 (SE +/- 0.11, N = 3; MIN: 68.71 / MAX: 72.85)
  2: 70.01 (SE +/- 0.09, N = 3; MIN: 68.34 / MAX: 79.57)
  3: 70.29 (SE +/- 0.06, N = 3; MIN: 68.56 / MAX: 118.37)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  3950X + Navi: 17.69 (SE +/- 0.17, N = 3; MIN: 17.36 / MAX: 18.26)
  2: 17.60 (SE +/- 0.16, N = 3; MIN: 17.3 / MAX: 18.73)
  3: 17.43 (SE +/- 0.01, N = 3; MIN: 17.29 / MAX: 17.97)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
  3950X + Navi: 16.58 (SE +/- 0.06, N = 3; MIN: 16.38 / MAX: 18.55)
  2: 16.63 (SE +/- 0.09, N = 3; MIN: 16.38 / MAX: 35.76)
  3: 16.50 (SE +/- 0.01, N = 3; MIN: 16.37 / MAX: 17.15)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  3950X + Navi: 29.70 (SE +/- 0.16, N = 3; MIN: 29.28 / MAX: 71.36)
  2: 29.36 (SE +/- 0.04, N = 3; MIN: 29.03 / MAX: 30.44)
  3: 29.53 (SE +/- 0.12, N = 3; MIN: 29.11 / MAX: 67.88)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  3950X + Navi: 29.26 (SE +/- 0.18, N = 3; MIN: 28.61 / MAX: 39.41)
  2: 28.94 (SE +/- 0.16, N = 3; MIN: 28.45 / MAX: 30.87)
  3: 28.82 (SE +/- 0.08, N = 3; MIN: 28.43 / MAX: 30.08)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: squeezenet (ms, Fewer Is Better)
  3950X + Navi: 3.58 (SE +/- 0.01, N = 3; MIN: 3.47 / MAX: 3.8)
  2: 3.58 (SE +/- 0.00, N = 3; MIN: 3.47 / MAX: 3.76)
  3: 3.60 (SE +/- 0.01, N = 3; MIN: 3.48 / MAX: 3.82)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
  3950X + Navi: 6.58 (SE +/- 0.01, N = 3; MIN: 6.53 / MAX: 7.07)
  2: 6.58 (SE +/- 0.01, N = 3; MIN: 6.54 / MAX: 6.79)
  3: 6.59 (SE +/- 0.02, N = 3; MIN: 6.53 / MAX: 12.11)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  3950X + Navi: 2.14 (SE +/- 0.00, N = 3; MIN: 2.09 / MAX: 2.55)
  2: 2.14 (SE +/- 0.00, N = 3; MIN: 2.09 / MAX: 3.7)
  3: 2.14 (SE +/- 0.01, N = 3; MIN: 2.09 / MAX: 3.03)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  3950X + Navi: 3.10 (SE +/- 0.00, N = 3; MIN: 3.05 / MAX: 4.87)
  2: 3.10 (SE +/- 0.00, N = 3; MIN: 3.05 / MAX: 3.51)
  3: 3.09 (SE +/- 0.00, N = 3; MIN: 3.05 / MAX: 3.9)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  3950X + Navi: 1.84 (SE +/- 0.00, N = 3; MIN: 1.82 / MAX: 3.18)
  2: 1.86 (SE +/- 0.03, N = 3; MIN: 1.82 / MAX: 11.33)
  3: 1.83 (SE +/- 0.00, N = 3; MIN: 1.82 / MAX: 1.99)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
  3950X + Navi: 2.23 (SE +/- 0.00, N = 3; MIN: 2.19 / MAX: 2.47)
  2: 2.23 (SE +/- 0.00, N = 3; MIN: 2.19 / MAX: 2.46)
  3: 2.23 (SE +/- 0.00, N = 3; MIN: 2.19 / MAX: 2.7)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  3950X + Navi: 6.83 (SE +/- 0.01, N = 3; MIN: 6.71 / MAX: 15.49)
  2: 6.87 (SE +/- 0.04, N = 3; MIN: 6.71 / MAX: 17.28)
  3: 6.80 (SE +/- 0.01, N = 3; MIN: 6.7 / MAX: 10.76)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
  3950X + Navi: 0.76 (SE +/- 0.00, N = 3; MIN: 0.75 / MAX: 0.99)
  2: 0.76 (SE +/- 0.00, N = 3; MIN: 0.75 / MAX: 0.96)
  3: 0.76 (SE +/- 0.01, N = 3; MIN: 0.75 / MAX: 0.97)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
  3950X + Navi: 4.25 (SE +/- 0.00, N = 3; MIN: 4.22 / MAX: 4.91)
  2: 4.25 (SE +/- 0.00, N = 3; MIN: 4.22 / MAX: 5.32)
  3: 4.26 (SE +/- 0.01, N = 3; MIN: 4.23 / MAX: 9.69)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
  3950X + Navi: 14.97 (SE +/- 0.00, N = 3; MIN: 14.41 / MAX: 15.48)
  2: 14.97 (SE +/- 0.00, N = 3; MIN: 14.46 / MAX: 15.66)
  3: 14.98 (SE +/- 0.02, N = 3; MIN: 14.38 / MAX: 19.49)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
  3950X + Navi: 1.73 (SE +/- 0.01, N = 3; MIN: 1.7 / MAX: 2.77)
  2: 1.72 (SE +/- 0.00, N = 3; MIN: 1.7 / MAX: 2.25)
  3: 1.73 (SE +/- 0.00, N = 3; MIN: 1.71 / MAX: 2.05)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
  3950X + Navi: 5.23 (SE +/- 0.04, N = 3; MIN: 4.89 / MAX: 16.85)
  2: 5.21 (SE +/- 0.05, N = 3; MIN: 4.86 / MAX: 20.01)
  3: 5.19 (SE +/- 0.02, N = 3; MIN: 4.9 / MAX: 9.27)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
  3950X + Navi: 4.95 (SE +/- 0.01, N = 3; MIN: 4.88 / MAX: 8.9)
  2: 4.95 (SE +/- 0.01, N = 3; MIN: 4.9 / MAX: 8.86)
  3: 4.96 (SE +/- 0.00, N = 3; MIN: 4.89 / MAX: 10.05)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
  3950X + Navi: 8.65 (SE +/- 0.09, N = 3; MIN: 8.5 / MAX: 46.41)
  2: 8.55 (SE +/- 0.01, N = 3; MIN: 8.46 / MAX: 13.54)
  3: 8.55 (SE +/- 0.01, N = 3; MIN: 8.5 / MAX: 8.75)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
TNN 0.2.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better)
  3950X + Navi: 244.91 (SE +/- 2.51, N = 3; MIN: 240.06 / MAX: 267.36)
  2: 240.62 (SE +/- 0.38, N = 3; MIN: 239.1 / MAX: 274.3)
  3: 240.20 (SE +/- 0.31, N = 3; MIN: 238.85 / MAX: 245.71)
  1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
TNN 0.2.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better)
  3950X + Navi: 237.00 (SE +/- 0.72, N = 3; MIN: 234.3 / MAX: 240)
  2: 225.25 (SE +/- 0.72, N = 3; MIN: 223.02 / MAX: 233.9)
  3: 238.14 (SE +/- 0.09, N = 3; MIN: 235.38 / MAX: 238.84)
  1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
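The two TNN results differ more between runs than most of the NCNN results do. One simple way to quantify that is the relative spread between the fastest and slowest run. A minimal Python sketch, assuming the per-run averages reported above as input; `tnn_ms` and `spread_percent` are hypothetical names chosen for this illustration:

```python
# Per-run TNN averages in ms, copied from the two TNN results
# (order: run "3950X + Navi", run "2", run "3").
# tnn_ms and spread_percent are illustrative names, not PTS output.
tnn_ms = {
    "MobileNet v2": [244.91, 240.62, 240.20],
    "SqueezeNet v1.1": [237.00, 225.25, 238.14],
}


def spread_percent(values):
    """Gap between the slowest and fastest run, as a percentage of the fastest."""
    fastest = min(values)
    slowest = max(values)
    return (slowest - fastest) / fastest * 100.0


if __name__ == "__main__":
    for model, runs in tnn_ms.items():
        print(f"{model}: {spread_percent(runs):.1f}% spread across runs")
```

On these numbers MobileNet v2 varies by about 2.0% across the three runs and SqueezeNet v1.1 by about 5.7%, with run 2 accounting for the lowest SqueezeNet v1.1 figure.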
Phoronix Test Suite v10.8.4