nn

AMD Ryzen 9 5900X 12-Core testing with an ASUS ROG CROSSHAIR VIII HERO (3501 BIOS) and AMD Radeon RX 6800/6800 XT / 6900 16GB on Ubuntu 21.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2106189-PTS-NN67322042&gru&sor .
System configuration (runs 1, 1a, 2, 3)

Processor: AMD Ryzen 9 5900X 12-Core @ 3.70GHz (12 Cores / 24 Threads)
Motherboard: ASUS ROG CROSSHAIR VIII HERO (3501 BIOS)
Chipset: AMD Starship/Matisse
Memory: 16GB
Disk: 1000GB Sabrent Rocket 4.0 Plus + 2000GB
Graphics: AMD Radeon RX 6800/6800 XT / 6900 16GB (2475/1000MHz)
Audio: AMD Navi 21 HDMI Audio
Monitor: ASUS VP28U
Network: Realtek RTL8125 2.5GbE + Intel I211
OS: Ubuntu 21.04
Kernel: 5.13.0-051300rc6daily20210617-generic (x86_64) 20210616
Desktop: GNOME Shell 3.38.4
Display Server: X Server 1.20.11 + Wayland
OpenGL: 4.6 Mesa 21.2.0-devel (git-849ab4e 2021-06-17 hirsute-oibaf-ppa) (LLVM 12.0.0)
Vulkan: 1.2.180
Compiler: GCC 10.3.0 + CUDA 11.3
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details - Transparent Huge Pages: madvise

Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-mutex --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa201009

Security Details
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
Result overview (ms, fewer is better; "-" means no result recorded for that run)

Test                                1          1a       2          3
mnn: mobilenetV3                    2.129      -        2.116      2.059
mnn: squeezenetv1.1                 3.743      -        3.736      3.619
mnn: resnet-v2-50                   27.944     -        27.403     27.010
mnn: SqueezeNetV1.0                 5.057      -        5.121      5.023
mnn: MobileNetV2_224                3.277      -        3.246      3.211
mnn: mobilenet-v1-1.0               4.384      -        4.352      4.325
mnn: inception-v3                   26.103     -        26.169     25.616
tnn: CPU - DenseNet                 2510.762   -        2504.442   2505.204
tnn: CPU - MobileNet v2             226.928    -        226.523    224.051
tnn: CPU - SqueezeNet v2            50.930     -        51.113     50.718
tnn: CPU - SqueezeNet v1.1          212.476    -        212.738    211.086
ncnn: CPU - mobilenet               -          12.14    12.12      12.05
ncnn: CPU-v2-v2 - mobilenet-v2      -          4.10     4.16       4.13
ncnn: CPU-v3-v3 - mobilenet-v3      -          3.91     4.00       3.96
ncnn: CPU - shufflenet-v2           -          4.00     4.10       4.05
ncnn: CPU - mnasnet                 -          3.67     3.82       3.79
ncnn: CPU - efficientnet-b0         -          5.10     5.28       5.13
ncnn: CPU - blazeface               -          1.71     1.79       1.70
ncnn: CPU - googlenet               -          12.38    12.45      12.51
ncnn: CPU - vgg16                   -          55.07    55.26      55.98
ncnn: CPU - resnet18                -          13.74    13.52      13.61
ncnn: CPU - alexnet                 -          11.07    11.20      11.13
ncnn: CPU - resnet50                -          22.80    22.55      22.69
ncnn: CPU - yolov4-tiny             -          21.48    22.00      21.91
ncnn: CPU - squeezenet_ssd          -          14.80    14.78      14.75
ncnn: CPU - regnety_400m            -          9.42     9.50       9.34
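To make the run-to-run differences in the overview easier to read, the following is a minimal Python sketch that computes, for a few of the tests above, which run was fastest and how far apart the slowest and fastest runs are in percent. The helper names `spread_percent` and `fastest_run` are hypothetical and not part of the Phoronix Test Suite; the values are copied from the overview, with run 1a supplying the NCNN figure.

```python
# Latencies in ms copied from the result overview (fewer is better).
# Only three tests are included here for illustration.
results = {
    "mnn: mobilenetV3": {"1": 2.129, "2": 2.116, "3": 2.059},
    "tnn: CPU - DenseNet": {"1": 2510.762, "2": 2504.442, "3": 2505.204},
    "ncnn: CPU - blazeface": {"1a": 1.71, "2": 1.79, "3": 1.70},
}


def fastest_run(runs):
    """Return the identifier of the run with the lowest latency."""
    return min(runs, key=runs.get)


def spread_percent(runs):
    """Gap between slowest and fastest run, as a percentage of the fastest."""
    best = min(runs.values())
    worst = max(runs.values())
    return (worst - best) / best * 100.0


for test, runs in results.items():
    print(f"{test}: fastest run {fastest_run(runs)}, "
          f"spread {spread_percent(runs):.2f}%")
```

A small spread on its own does not say whether the runs really differ; the SE and N figures reported with each test are needed for that.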
Mobile Neural Network 1.2 (ms, fewer is better)
(CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Model: mobilenetV3
Run   Result   SE +/-   N   MIN     MAX
3     2.059    0.019    7   1.88    10.15
2     2.116    0.009    3   2.01    10.13
1     2.129    0.027    3   1.91    11.33

Model: squeezenetv1.1
Run   Result   SE +/-   N   MIN     MAX
3     3.619    0.058    7   3.25    12.54
2     3.736    0.012    3   3.56    11.48
1     3.743    0.068    3   3.36    12.79

Model: resnet-v2-50
Run   Result   SE +/-   N   MIN     MAX
3     27.01    0.04     7   25.86   50.68
2     27.40    0.15     3   26.03   50.79
1     27.94    0.51     3   25.6    40.37

Model: SqueezeNetV1.0
Run   Result   SE +/-   N   MIN     MAX
3     5.023    0.049    7   4.59    14.33
1     5.057    0.054    3   4.64    33.19
2     5.121    0.050    3   4.85    13.16

Model: MobileNetV2_224
Run   Result   SE +/-   N   MIN     MAX
3     3.211    0.032    7   2.97    11
2     3.246    0.041    3   3.07    10.44
1     3.277    0.035    3   3.07    11.36

Model: mobilenet-v1-1.0
Run   Result   SE +/-   N   MIN     MAX
3     4.325    0.022    7   4.09    11.49
2     4.352    0.051    3   4.08    11.71
1     4.384    0.070    3   4.07    13.17

Model: inception-v3
Run   Result   SE +/-   N   MIN     MAX
3     25.62    0.13     7   24.49   42.77
1     26.10    0.23     3   24.2    62.16
2     26.17    0.24     3   24.88   34.4
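The Mobile Neural Network results report "SE +/-" together with a sample count N, and run 3 uses N = 7 while runs 1 and 2 use N = 3. Assuming "SE" here is the usual standard error of the mean, SE = s / sqrt(N), the sample standard deviation s can be recovered as SE * sqrt(N), which puts the runs on a common footing despite the different sample counts. The sketch below does this for the mobilenetV3 figures; `stddev_from_se` is a hypothetical helper name, and the interpretation of SE is an assumption rather than something the export states.

```python
import math


def stddev_from_se(se, n):
    """Recover the sample standard deviation from a standard error.

    Assumes se = s / sqrt(n), i.e. the standard error of the mean.
    """
    return se * math.sqrt(n)


# (SE, N) pairs for Model: mobilenetV3, keyed by run identifier.
mobilenet_v3 = {"3": (0.019, 7), "2": (0.009, 3), "1": (0.027, 3)}

for run, (se, n) in mobilenet_v3.items():
    print(f"run {run}: SE {se} with N = {n} -> s ~ {stddev_from_se(se, n):.3f} ms")
```

Because the reported SE values are rounded, the recovered deviations are approximate.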
TNN 0.3, Target: CPU (ms, fewer is better)
(CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

Model: DenseNet
Run   Result    SE +/-   N   MIN       MAX
2     2504.44   3.49     3   2433.14   2583.96
3     2505.20   1.51     3   2430      2579.39
1     2510.76   3.60     3   2446.86   2649.04

Model: MobileNet v2
Run   Result    SE +/-   N   MIN       MAX
3     224.05    1.38     3   218.93    236.84
2     226.52    2.57     3   218.62    239.35
1     226.93    0.40     3   222.06    239.49

Model: SqueezeNet v2
Run   Result    SE +/-   N   MIN       MAX
3     50.72     0.25     3   50.05     51.13
1     50.93     0.28     3   50.31     51.63
2     51.11     0.11     3   50.74     51.46

Model: SqueezeNet v1.1
Run   Result    SE +/-   N   MIN       MAX
3     211.09    0.10     3   210.04    211.62
1     212.48    0.82     3   210.54    219.75
2     212.74    3.04     3   206.57    219.72
NCNN 20210525 (ms, fewer is better)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

Target: CPU - Model: mobilenet
Run   Result   SE +/-   N   MIN     MAX
3     12.05    0.02     3   11.3    29.68
2     12.12    0.08     3   11.38   34.52
1a    12.14    0.09     3   11.26   38.92

Target: CPU-v2-v2 - Model: mobilenet-v2
Run   Result   SE +/-   N   MIN     MAX
1a    4.10     0.02     3   3.86    12.56
3     4.13     0.02     3   3.88    12.67
2     4.16     0.06     3   3.87    12.37

Target: CPU-v3-v3 - Model: mobilenet-v3
Run   Result   SE +/-   N   MIN     MAX
1a    3.91     0.03     3   3.73    11.8
3     3.96     0.02     3   3.75    12.14
2     4.00     0.04     3   3.76    12.05

Target: CPU - Model: shufflenet-v2
Run   Result   SE +/-   N   MIN     MAX
1a    4.00     0.04     3   3.74    11.71
3     4.05     0.01     3   3.86    11.94
2     4.10     0.03     3   3.89    11.86

Target: CPU - Model: mnasnet
Run   Result   SE +/-   N   MIN     MAX
1a    3.67     0.02     3   3.5     11.58
3     3.79     0.09     3   3.51    12.14
2     3.82     0.06     3   3.57    11.88

Target: CPU - Model: efficientnet-b0
Run   Result   SE +/-   N   MIN     MAX
1a    5.10     0.04     3   4.83    13.32
3     5.13     0.04     3   4.8     38.8
2     5.28     0.09     3   4.9     13.73

Target: CPU - Model: blazeface
Run   Result   SE +/-   N   MIN     MAX
3     1.70     0.00     3   1.6     9.42
1a    1.71     0.00     3   1.62    8.08
2     1.79     0.04     3   1.65    9.61

Target: CPU - Model: googlenet
Run   Result   SE +/-   N   MIN     MAX
1a    12.38    0.06     3   11.66   20.8
2     12.45    0.01     3   11.62   30.43
3     12.51    0.17     3   11.61   32.56

Target: CPU - Model: vgg16
Run   Result   SE +/-   N   MIN     MAX
1a    55.07    0.20     3   52.29   67.12
2     55.26    0.21     3   52.4    85.07
3     55.98    0.90     3   51.31   862.27

Target: CPU - Model: resnet18
Run   Result   SE +/-   N   MIN     MAX
2     13.52    0.04     3   12.87   22.24
3     13.61    0.16     3   12.65   21.63
1a    13.74    0.16     3   12.93   22.43

Target: CPU - Model: alexnet
Run   Result   SE +/-   N   MIN     MAX
1a    11.07    0.04     3   10.16   19.67
3     11.13    0.08     3   10.19   19.44
2     11.20    0.06     3   10.37   19.33

Target: CPU - Model: resnet50
Run   Result   SE +/-   N   MIN     MAX
2     22.55    0.28     3   20.93   31.12
3     22.69    0.18     3   21.25   31.38
1a    22.80    0.36     3   20.6    104.7

Target: CPU - Model: yolov4-tiny
Run   Result   SE +/-   N   MIN     MAX
1a    21.48    0.28     3   19.65   48.26
3     21.91    0.11     3   19.58   57.14
2     22.00    0.18     3   20.67   30.95

Target: CPU - Model: squeezenet_ssd
Run   Result   SE +/-   N   MIN     MAX
3     14.75    0.04     3   13.81   23.13
2     14.78    0.04     3   13.64   45.78
1a    14.80    0.06     3   13.52   64.82

Target: CPU - Model: regnety_400m
Run   Result   SE +/-   N   MIN     MAX
3     9.34     0.07     3   8.86    17.28
1a    9.42     0.08     3   9.01    26.23
2     9.50     0.09     3   9.03    17.41
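Most of the gaps between runs in these results are small compared with the reported "SE +/-" values. One rough way to judge a gap is to express it in units of the combined standard error, sqrt(SE_a^2 + SE_b^2). The Python sketch below does that for two pairs of figures taken from the results; `difference_in_se_units` is a hypothetical helper, and with N = 3 per run this is only a heuristic, not a formal significance test.

```python
import math


def difference_in_se_units(a, se_a, b, se_b):
    """Absolute difference of two means divided by their combined standard error.

    The combined standard error is sqrt(se_a**2 + se_b**2). Reported SE values
    are rounded, so a value printed as 0.00 can make the combined error zero;
    that case is handled explicitly instead of dividing by zero.
    """
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    if combined == 0:
        return 0.0 if a == b else math.inf
    return abs(a - b) / combined


# NCNN vgg16, run 1a (55.07 +/- 0.20) vs run 3 (55.98 +/- 0.90)
vgg16 = difference_in_se_units(55.07, 0.20, 55.98, 0.90)

# Mobile Neural Network mobilenetV3, run 3 (2.059 +/- 0.019) vs run 1 (2.129 +/- 0.027)
mobilenet_v3 = difference_in_se_units(2.059, 0.019, 2.129, 0.027)

print(f"vgg16, 1a vs 3: {vgg16:.2f} combined SEs apart")
print(f"mobilenetV3, 3 vs 1: {mobilenet_v3:.2f} combined SEs apart")
```

A ratio below about 1 suggests the two runs are indistinguishable at this sample size, while larger ratios hint at a real difference that more samples could confirm.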
Phoronix Test Suite v10.8.5