nn
AMD Ryzen 9 5900X 12-Core testing with an ASUS ROG CROSSHAIR VIII HERO (3501 BIOS) and AMD Radeon RX 6800/6800 XT / 6900 16GB on Ubuntu 21.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2106189-PTS-NN67322042&sor&grs .
System details (runs 1, 1a, 2, 3)
  Processor: AMD Ryzen 9 5900X 12-Core @ 3.70GHz (12 Cores / 24 Threads)
  Motherboard: ASUS ROG CROSSHAIR VIII HERO (3501 BIOS)
  Chipset: AMD Starship/Matisse
  Memory: 16GB
  Disk: 1000GB Sabrent Rocket 4.0 Plus + 2000GB
  Graphics: AMD Radeon RX 6800/6800 XT / 6900 16GB (2475/1000MHz)
  Audio: AMD Navi 21 HDMI Audio
  Monitor: ASUS VP28U
  Network: Realtek RTL8125 2.5GbE + Intel I211
  OS: Ubuntu 21.04
  Kernel: 5.13.0-051300rc6daily20210617-generic (x86_64) 20210616
  Desktop: GNOME Shell 3.38.4
  Display Server: X Server 1.20.11 + Wayland
  OpenGL: 4.6 Mesa 21.2.0-devel (git-849ab4e 2021-06-17 hirsute-oibaf-ppa) (LLVM 12.0.0)
  Vulkan: 1.2.180
  Compiler: GCC 10.3.0 + CUDA 11.3
  File-System: ext4
  Screen Resolution: 3840x2160
Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-mutex --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa201009
Security Details
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
| Test | 1 | 1a | 2 | 3 |
|---|---|---|---|---|
| ncnn: CPU - blazeface | | 1.71 | 1.79 | 1.70 |
| ncnn: CPU - mnasnet | | 3.67 | 3.82 | 3.79 |
| ncnn: CPU - efficientnet-b0 | | 5.10 | 5.28 | 5.13 |
| mnn: resnet-v2-50 | 27.944 | | 27.403 | 27.010 |
| mnn: squeezenetv1.1 | 3.743 | | 3.736 | 3.619 |
| mnn: mobilenetV3 | 2.129 | | 2.116 | 2.059 |
| ncnn: CPU - shufflenet-v2 | | 4.00 | 4.10 | 4.05 |
| ncnn: CPU - yolov4-tiny | | 21.48 | 22.00 | 21.91 |
| ncnn: CPU-v3-v3 - mobilenet-v3 | | 3.91 | 4.00 | 3.96 |
| mnn: inception-v3 | 26.103 | | 26.169 | 25.616 |
| mnn: MobileNetV2_224 | 3.277 | | 3.246 | 3.211 |
| mnn: SqueezeNetV1.0 | 5.057 | | 5.121 | 5.023 |
| ncnn: CPU - regnety_400m | | 9.42 | 9.50 | 9.34 |
| ncnn: CPU - vgg16 | | 55.07 | 55.26 | 55.98 |
| ncnn: CPU - resnet18 | | 13.74 | 13.52 | 13.61 |
| ncnn: CPU-v2-v2 - mobilenet-v2 | | 4.10 | 4.16 | 4.13 |
| mnn: mobilenet-v1-1.0 | 4.384 | | 4.352 | 4.325 |
| tnn: CPU - MobileNet v2 | 226.928 | | 226.523 | 224.051 |
| ncnn: CPU - alexnet | | 11.07 | 11.20 | 11.13 |
| ncnn: CPU - resnet50 | | 22.80 | 22.55 | 22.69 |
| ncnn: CPU - googlenet | | 12.38 | 12.45 | 12.51 |
| tnn: CPU - SqueezeNet v1.1 | 212.476 | | 212.738 | 211.086 |
| tnn: CPU - SqueezeNet v2 | 50.930 | | 51.113 | 50.718 |
| ncnn: CPU - mobilenet | | 12.14 | 12.12 | 12.05 |
| ncnn: CPU - squeezenet_ssd | | 14.80 | 14.78 | 14.75 |
| tnn: CPU - DenseNet | 2510.762 | | 2504.442 | 2505.204 |
NCNN 20210525 - Target: CPU - Model: blazeface (ms, Fewer Is Better)
  Run 3: 1.70 (SE +/- 0.00, N = 3; MIN: 1.6 / MAX: 9.42)
  Run 1a: 1.71 (SE +/- 0.00, N = 3; MIN: 1.62 / MAX: 8.08)
  Run 2: 1.79 (SE +/- 0.04, N = 3; MIN: 1.65 / MAX: 9.61)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  Run 1a: 3.67 (SE +/- 0.02, N = 3; MIN: 3.5 / MAX: 11.58)
  Run 3: 3.79 (SE +/- 0.09, N = 3; MIN: 3.51 / MAX: 12.14)
  Run 2: 3.82 (SE +/- 0.06, N = 3; MIN: 3.57 / MAX: 11.88)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  Run 1a: 5.10 (SE +/- 0.04, N = 3; MIN: 4.83 / MAX: 13.32)
  Run 3: 5.13 (SE +/- 0.04, N = 3; MIN: 4.8 / MAX: 38.8)
  Run 2: 5.28 (SE +/- 0.09, N = 3; MIN: 4.9 / MAX: 13.73)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
Mobile Neural Network 1.2 - Model: resnet-v2-50 (ms, Fewer Is Better)
  Run 3: 27.01 (SE +/- 0.04, N = 7; MIN: 25.86 / MAX: 50.68)
  Run 2: 27.40 (SE +/- 0.15, N = 3; MIN: 26.03 / MAX: 50.79)
  Run 1: 27.94 (SE +/- 0.51, N = 3; MIN: 25.6 / MAX: 40.37)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.2 - Model: squeezenetv1.1 (ms, Fewer Is Better)
  Run 3: 3.619 (SE +/- 0.058, N = 7; MIN: 3.25 / MAX: 12.54)
  Run 2: 3.736 (SE +/- 0.012, N = 3; MIN: 3.56 / MAX: 11.48)
  Run 1: 3.743 (SE +/- 0.068, N = 3; MIN: 3.36 / MAX: 12.79)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.2 - Model: mobilenetV3 (ms, Fewer Is Better)
  Run 3: 2.059 (SE +/- 0.019, N = 7; MIN: 1.88 / MAX: 10.15)
  Run 2: 2.116 (SE +/- 0.009, N = 3; MIN: 2.01 / MAX: 10.13)
  Run 1: 2.129 (SE +/- 0.027, N = 3; MIN: 1.91 / MAX: 11.33)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN 20210525 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  Run 1a: 4.00 (SE +/- 0.04, N = 3; MIN: 3.74 / MAX: 11.71)
  Run 3: 4.05 (SE +/- 0.01, N = 3; MIN: 3.86 / MAX: 11.94)
  Run 2: 4.10 (SE +/- 0.03, N = 3; MIN: 3.89 / MAX: 11.86)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  Run 1a: 21.48 (SE +/- 0.28, N = 3; MIN: 19.65 / MAX: 48.26)
  Run 3: 21.91 (SE +/- 0.11, N = 3; MIN: 19.58 / MAX: 57.14)
  Run 2: 22.00 (SE +/- 0.18, N = 3; MIN: 20.67 / MAX: 30.95)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  Run 1a: 3.91 (SE +/- 0.03, N = 3; MIN: 3.73 / MAX: 11.8)
  Run 3: 3.96 (SE +/- 0.02, N = 3; MIN: 3.75 / MAX: 12.14)
  Run 2: 4.00 (SE +/- 0.04, N = 3; MIN: 3.76 / MAX: 12.05)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
Mobile Neural Network 1.2 - Model: inception-v3 (ms, Fewer Is Better)
  Run 3: 25.62 (SE +/- 0.13, N = 7; MIN: 24.49 / MAX: 42.77)
  Run 1: 26.10 (SE +/- 0.23, N = 3; MIN: 24.2 / MAX: 62.16)
  Run 2: 26.17 (SE +/- 0.24, N = 3; MIN: 24.88 / MAX: 34.4)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.2 - Model: MobileNetV2_224 (ms, Fewer Is Better)
  Run 3: 3.211 (SE +/- 0.032, N = 7; MIN: 2.97 / MAX: 11)
  Run 2: 3.246 (SE +/- 0.041, N = 3; MIN: 3.07 / MAX: 10.44)
  Run 1: 3.277 (SE +/- 0.035, N = 3; MIN: 3.07 / MAX: 11.36)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.2 - Model: SqueezeNetV1.0 (ms, Fewer Is Better)
  Run 3: 5.023 (SE +/- 0.049, N = 7; MIN: 4.59 / MAX: 14.33)
  Run 1: 5.057 (SE +/- 0.054, N = 3; MIN: 4.64 / MAX: 33.19)
  Run 2: 5.121 (SE +/- 0.050, N = 3; MIN: 4.85 / MAX: 13.16)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN 20210525 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
  Run 3: 9.34 (SE +/- 0.07, N = 3; MIN: 8.86 / MAX: 17.28)
  Run 1a: 9.42 (SE +/- 0.08, N = 3; MIN: 9.01 / MAX: 26.23)
  Run 2: 9.50 (SE +/- 0.09, N = 3; MIN: 9.03 / MAX: 17.41)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  Run 1a: 55.07 (SE +/- 0.20, N = 3; MIN: 52.29 / MAX: 67.12)
  Run 2: 55.26 (SE +/- 0.21, N = 3; MIN: 52.4 / MAX: 85.07)
  Run 3: 55.98 (SE +/- 0.90, N = 3; MIN: 51.31 / MAX: 862.27)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  Run 2: 13.52 (SE +/- 0.04, N = 3; MIN: 12.87 / MAX: 22.24)
  Run 3: 13.61 (SE +/- 0.16, N = 3; MIN: 12.65 / MAX: 21.63)
  Run 1a: 13.74 (SE +/- 0.16, N = 3; MIN: 12.93 / MAX: 22.43)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  Run 1a: 4.10 (SE +/- 0.02, N = 3; MIN: 3.86 / MAX: 12.56)
  Run 3: 4.13 (SE +/- 0.02, N = 3; MIN: 3.88 / MAX: 12.67)
  Run 2: 4.16 (SE +/- 0.06, N = 3; MIN: 3.87 / MAX: 12.37)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
Mobile Neural Network 1.2 - Model: mobilenet-v1-1.0 (ms, Fewer Is Better)
  Run 3: 4.325 (SE +/- 0.022, N = 7; MIN: 4.09 / MAX: 11.49)
  Run 2: 4.352 (SE +/- 0.051, N = 3; MIN: 4.08 / MAX: 11.71)
  Run 1: 4.384 (SE +/- 0.070, N = 3; MIN: 4.07 / MAX: 13.17)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
TNN 0.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better)
  Run 3: 224.05 (SE +/- 1.38, N = 3; MIN: 218.93 / MAX: 236.84)
  Run 2: 226.52 (SE +/- 2.57, N = 3; MIN: 218.62 / MAX: 239.35)
  Run 1: 226.93 (SE +/- 0.40, N = 3; MIN: 222.06 / MAX: 239.49)
  1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
NCNN 20210525 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
  Run 1a: 11.07 (SE +/- 0.04, N = 3; MIN: 10.16 / MAX: 19.67)
  Run 3: 11.13 (SE +/- 0.08, N = 3; MIN: 10.19 / MAX: 19.44)
  Run 2: 11.20 (SE +/- 0.06, N = 3; MIN: 10.37 / MAX: 19.33)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  Run 2: 22.55 (SE +/- 0.28, N = 3; MIN: 20.93 / MAX: 31.12)
  Run 3: 22.69 (SE +/- 0.18, N = 3; MIN: 21.25 / MAX: 31.38)
  Run 1a: 22.80 (SE +/- 0.36, N = 3; MIN: 20.6 / MAX: 104.7)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: googlenet (ms, Fewer Is Better)
  Run 1a: 12.38 (SE +/- 0.06, N = 3; MIN: 11.66 / MAX: 20.8)
  Run 2: 12.45 (SE +/- 0.01, N = 3; MIN: 11.62 / MAX: 30.43)
  Run 3: 12.51 (SE +/- 0.17, N = 3; MIN: 11.61 / MAX: 32.56)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
TNN 0.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better)
  Run 3: 211.09 (SE +/- 0.10, N = 3; MIN: 210.04 / MAX: 211.62)
  Run 1: 212.48 (SE +/- 0.82, N = 3; MIN: 210.54 / MAX: 219.75)
  Run 2: 212.74 (SE +/- 3.04, N = 3; MIN: 206.57 / MAX: 219.72)
  1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
TNN 0.3 - Target: CPU - Model: SqueezeNet v2 (ms, Fewer Is Better)
  Run 3: 50.72 (SE +/- 0.25, N = 3; MIN: 50.05 / MAX: 51.13)
  Run 1: 50.93 (SE +/- 0.28, N = 3; MIN: 50.31 / MAX: 51.63)
  Run 2: 51.11 (SE +/- 0.11, N = 3; MIN: 50.74 / MAX: 51.46)
  1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
NCNN 20210525 - Target: CPU - Model: mobilenet (ms, Fewer Is Better)
  Run 3: 12.05 (SE +/- 0.02, N = 3; MIN: 11.3 / MAX: 29.68)
  Run 2: 12.12 (SE +/- 0.08, N = 3; MIN: 11.38 / MAX: 34.52)
  Run 1a: 12.14 (SE +/- 0.09, N = 3; MIN: 11.26 / MAX: 38.92)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  Run 3: 14.75 (SE +/- 0.04, N = 3; MIN: 13.81 / MAX: 23.13)
  Run 2: 14.78 (SE +/- 0.04, N = 3; MIN: 13.64 / MAX: 45.78)
  Run 1a: 14.80 (SE +/- 0.06, N = 3; MIN: 13.52 / MAX: 64.82)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
TNN 0.3 - Target: CPU - Model: DenseNet (ms, Fewer Is Better)
  Run 2: 2504.44 (SE +/- 3.49, N = 3; MIN: 2433.14 / MAX: 2583.96)
  Run 3: 2505.20 (SE +/- 1.51, N = 3; MIN: 2430 / MAX: 2579.39)
  Run 1: 2510.76 (SE +/- 3.60, N = 3; MIN: 2446.86 / MAX: 2649.04)
  1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
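Most run-to-run differences in these results are small relative to the reported standard errors. The following is a minimal Python sketch, not part of the Phoronix Test Suite, of one rough way to judge whether a gap between two runs is larger than the measurement noise. It uses the NCNN blazeface figures from this result file; the helper names `percent_slower` and `gap_in_standard_errors` are hypothetical, and combining the two standard errors in quadrature is an assumption of this sketch, not something the export itself does. With N = 3 per run this is only a heuristic.

```python
import math

# (mean ms, standard error) per run, copied from the NCNN blazeface result.
# The reported SE values are rounded to two decimals, so 0.00 means "< 0.005".
blazeface = {"3": (1.70, 0.00), "1a": (1.71, 0.00), "2": (1.79, 0.04)}


def percent_slower(baseline_ms, other_ms):
    """Percentage by which other_ms exceeds baseline_ms."""
    return (other_ms - baseline_ms) / baseline_ms * 100.0


def gap_in_standard_errors(a, b):
    """Distance between two (mean, se) results in units of their combined SE."""
    (mean_a, se_a), (mean_b, se_b) = a, b
    combined = math.hypot(se_a, se_b)  # sqrt(se_a**2 + se_b**2)
    if combined == 0.0:
        # Both SEs rounded to zero: the ratio is undefined, so report
        # infinity unless the means are identical.
        return 0.0 if mean_a == mean_b else math.inf
    return abs(mean_a - mean_b) / combined


# Lower latency is better, so the fastest run has the smallest mean.
fastest = min(blazeface, key=lambda run: blazeface[run][0])
for run, (mean_ms, se) in blazeface.items():
    if run == fastest:
        continue
    slower = percent_slower(blazeface[fastest][0], mean_ms)
    gap = gap_in_standard_errors(blazeface[fastest], (mean_ms, se))
    print(f"run {run} vs run {fastest}: {slower:.1f}% slower, gap = {gap:.2f} combined SE")
```

A gap of only one or two combined standard errors suggests the runs are hard to tell apart at this sample size. The same two helpers can be applied to any other test in this file by swapping in its mean and SE values.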
Phoronix Test Suite v10.8.5