nn
AMD Ryzen 9 5900X 12-Core testing with an ASUS ROG CROSSHAIR VIII HERO (3501 BIOS) and AMD Radeon RX 6800/6800 XT / 6900 16GB on Ubuntu 21.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2106189-PTS-NN67322042&grr&rdt .
nn (identical for 1, 1a, 2, 3)

Processor: AMD Ryzen 9 5900X 12-Core @ 3.70GHz (12 Cores / 24 Threads)
Motherboard: ASUS ROG CROSSHAIR VIII HERO (3501 BIOS)
Chipset: AMD Starship/Matisse
Memory: 16GB
Disk: 1000GB Sabrent Rocket 4.0 Plus + 2000GB
Graphics: AMD Radeon RX 6800/6800 XT / 6900 16GB (2475/1000MHz)
Audio: AMD Navi 21 HDMI Audio
Monitor: ASUS VP28U
Network: Realtek RTL8125 2.5GbE + Intel I211
OS: Ubuntu 21.04
Kernel: 5.13.0-051300rc6daily20210617-generic (x86_64) 20210616
Desktop: GNOME Shell 3.38.4
Display Server: X Server 1.20.11 + Wayland
OpenGL: 4.6 Mesa 21.2.0-devel (git-849ab4e 2021-06-17 hirsute-oibaf-ppa) (LLVM 12.0.0)
Vulkan: 1.2.180
Compiler: GCC 10.3.0 + CUDA 11.3
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-mutex --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa201009
Security Details -
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
nn

Test                                   1          1a       2          3
tnn: CPU - DenseNet                    2510.762   -        2504.442   2505.204
mnn: inception-v3                      26.103     -        26.169     25.616
mnn: mobilenet-v1-1.0                  4.384      -        4.352      4.325
mnn: MobileNetV2_224                   3.277      -        3.246      3.211
mnn: SqueezeNetV1.0                    5.057      -        5.121      5.023
mnn: resnet-v2-50                      27.944     -        27.403     27.010
mnn: squeezenetv1.1                    3.743      -        3.736      3.619
mnn: mobilenetV3                       2.129      -        2.116      2.059
ncnn: CPU - regnety_400m               -          9.42     9.50       9.34
ncnn: CPU - squeezenet_ssd             -          14.80    14.78      14.75
ncnn: CPU - yolov4-tiny                -          21.48    22.00      21.91
ncnn: CPU - resnet50                   -          22.80    22.55      22.69
ncnn: CPU - alexnet                    -          11.07    11.20      11.13
ncnn: CPU - resnet18                   -          13.74    13.52      13.61
ncnn: CPU - vgg16                      -          55.07    55.26      55.98
ncnn: CPU - googlenet                  -          12.38    12.45      12.51
ncnn: CPU - blazeface                  -          1.71     1.79       1.70
ncnn: CPU - efficientnet-b0            -          5.10     5.28       5.13
ncnn: CPU - mnasnet                    -          3.67     3.82       3.79
ncnn: CPU - shufflenet-v2              -          4.00     4.10       4.05
ncnn: CPU-v3-v3 - mobilenet-v3         -          3.91     4.00       3.96
ncnn: CPU-v2-v2 - mobilenet-v2         -          4.10     4.16       4.13
ncnn: CPU - mobilenet                  -          12.14    12.12      12.05
tnn: CPU - MobileNet v2                226.928    -        226.523    224.051
tnn: CPU - SqueezeNet v1.1             212.476    -        212.738    211.086
tnn: CPU - SqueezeNet v2               50.930     -        51.113     50.718
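Runs 2 and 3 are the only two that cover all 26 tests, so they can be summarized with a single figure. The sketch below is one possible way to do that: it takes the per-test averages from the overview above and computes the geometric mean of the run 3 / run 2 time ratios. The helper name `geometric_mean_ratio` is hypothetical and not part of the Phoronix Test Suite, and treating every test with equal weight is an assumption of this example.

```python
from math import exp, log

# Per-test averages in ms (fewer is better), in overview order, for runs "2" and "3".
run_2 = [2504.442, 26.169, 4.352, 3.246, 5.121, 27.403, 3.736, 2.116,
         9.50, 14.78, 22.00, 22.55, 11.20, 13.52, 55.26, 12.45, 1.79,
         5.28, 3.82, 4.10, 4.00, 4.16, 12.12, 226.523, 212.738, 51.113]
run_3 = [2505.204, 25.616, 4.325, 3.211, 5.023, 27.010, 3.619, 2.059,
         9.34, 14.75, 21.91, 22.69, 11.13, 13.61, 55.98, 12.51, 1.70,
         5.13, 3.79, 4.05, 3.96, 4.13, 12.05, 224.051, 211.086, 50.718]


def geometric_mean_ratio(candidate, baseline):
    """Geometric mean of candidate/baseline ratios (hypothetical helper).

    A value below 1.0 means the candidate run took less time on average.
    """
    if len(candidate) != len(baseline):
        raise ValueError("both runs must cover the same tests")
    logs = [log(c / b) for c, b in zip(candidate, baseline)]
    return exp(sum(logs) / len(logs))


ratio = geometric_mean_ratio(run_3, run_2)
print(f"run 3 / run 2 geometric mean time ratio: {ratio:.4f}")
```

A geometric mean is used here rather than an arithmetic mean so that the 2500 ms DenseNet result does not dominate the 2 ms models.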
TNN 0.3 - Target: CPU - Model: DenseNet (ms, Fewer Is Better)
  1: 2510.76 (SE +/- 3.60, N = 3; MIN: 2446.86 / MAX: 2649.04)
  2: 2504.44 (SE +/- 3.49, N = 3; MIN: 2433.14 / MAX: 2583.96)
  3: 2505.20 (SE +/- 1.51, N = 3; MIN: 2430 / MAX: 2579.39)
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
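Each result is reported as an average with "SE +/- x, N = n". Assuming SE here is the usual standard error of the mean (sample standard deviation divided by the square root of N), which this export does not state explicitly, the run-to-run spread can be recovered as in this minimal sketch. The function name `stddev_from_se` is hypothetical.

```python
from math import sqrt


def stddev_from_se(se, n):
    """Recover the sample standard deviation, assuming SE = sd / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return se * sqrt(n)


# TNN DenseNet, run 1: SE +/- 3.60, N = 3
sd_densenet_run1 = stddev_from_se(3.60, 3)
print(f"implied standard deviation: {sd_densenet_run1:.2f} ms")
```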
Mobile Neural Network 1.2 - Model: inception-v3 (ms, Fewer Is Better)
  1: 26.10 (SE +/- 0.23, N = 3; MIN: 24.2 / MAX: 62.16)
  2: 26.17 (SE +/- 0.24, N = 3; MIN: 24.88 / MAX: 34.4)
  3: 25.62 (SE +/- 0.13, N = 7; MIN: 24.49 / MAX: 42.77)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.2 - Model: mobilenet-v1-1.0 (ms, Fewer Is Better)
  1: 4.384 (SE +/- 0.070, N = 3; MIN: 4.07 / MAX: 13.17)
  2: 4.352 (SE +/- 0.051, N = 3; MIN: 4.08 / MAX: 11.71)
  3: 4.325 (SE +/- 0.022, N = 7; MIN: 4.09 / MAX: 11.49)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.2 - Model: MobileNetV2_224 (ms, Fewer Is Better)
  1: 3.277 (SE +/- 0.035, N = 3; MIN: 3.07 / MAX: 11.36)
  2: 3.246 (SE +/- 0.041, N = 3; MIN: 3.07 / MAX: 10.44)
  3: 3.211 (SE +/- 0.032, N = 7; MIN: 2.97 / MAX: 11)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.2 - Model: SqueezeNetV1.0 (ms, Fewer Is Better)
  1: 5.057 (SE +/- 0.054, N = 3; MIN: 4.64 / MAX: 33.19)
  2: 5.121 (SE +/- 0.050, N = 3; MIN: 4.85 / MAX: 13.16)
  3: 5.023 (SE +/- 0.049, N = 7; MIN: 4.59 / MAX: 14.33)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.2 - Model: resnet-v2-50 (ms, Fewer Is Better)
  1: 27.94 (SE +/- 0.51, N = 3; MIN: 25.6 / MAX: 40.37)
  2: 27.40 (SE +/- 0.15, N = 3; MIN: 26.03 / MAX: 50.79)
  3: 27.01 (SE +/- 0.04, N = 7; MIN: 25.86 / MAX: 50.68)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.2 - Model: squeezenetv1.1 (ms, Fewer Is Better)
  1: 3.743 (SE +/- 0.068, N = 3; MIN: 3.36 / MAX: 12.79)
  2: 3.736 (SE +/- 0.012, N = 3; MIN: 3.56 / MAX: 11.48)
  3: 3.619 (SE +/- 0.058, N = 7; MIN: 3.25 / MAX: 12.54)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 1.2 - Model: mobilenetV3 (ms, Fewer Is Better)
  1: 2.129 (SE +/- 0.027, N = 3; MIN: 1.91 / MAX: 11.33)
  2: 2.116 (SE +/- 0.009, N = 3; MIN: 2.01 / MAX: 10.13)
  3: 2.059 (SE +/- 0.019, N = 7; MIN: 1.88 / MAX: 10.15)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN 20210525 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
  1a: 9.42 (SE +/- 0.08, N = 3; MIN: 9.01 / MAX: 26.23)
  2: 9.50 (SE +/- 0.09, N = 3; MIN: 9.03 / MAX: 17.41)
  3: 9.34 (SE +/- 0.07, N = 3; MIN: 8.86 / MAX: 17.28)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  1a: 14.80 (SE +/- 0.06, N = 3; MIN: 13.52 / MAX: 64.82)
  2: 14.78 (SE +/- 0.04, N = 3; MIN: 13.64 / MAX: 45.78)
  3: 14.75 (SE +/- 0.04, N = 3; MIN: 13.81 / MAX: 23.13)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  1a: 21.48 (SE +/- 0.28, N = 3; MIN: 19.65 / MAX: 48.26)
  2: 22.00 (SE +/- 0.18, N = 3; MIN: 20.67 / MAX: 30.95)
  3: 21.91 (SE +/- 0.11, N = 3; MIN: 19.58 / MAX: 57.14)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  1a: 22.80 (SE +/- 0.36, N = 3; MIN: 20.6 / MAX: 104.7)
  2: 22.55 (SE +/- 0.28, N = 3; MIN: 20.93 / MAX: 31.12)
  3: 22.69 (SE +/- 0.18, N = 3; MIN: 21.25 / MAX: 31.38)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
  1a: 11.07 (SE +/- 0.04, N = 3; MIN: 10.16 / MAX: 19.67)
  2: 11.20 (SE +/- 0.06, N = 3; MIN: 10.37 / MAX: 19.33)
  3: 11.13 (SE +/- 0.08, N = 3; MIN: 10.19 / MAX: 19.44)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  1a: 13.74 (SE +/- 0.16, N = 3; MIN: 12.93 / MAX: 22.43)
  2: 13.52 (SE +/- 0.04, N = 3; MIN: 12.87 / MAX: 22.24)
  3: 13.61 (SE +/- 0.16, N = 3; MIN: 12.65 / MAX: 21.63)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  1a: 55.07 (SE +/- 0.20, N = 3; MIN: 52.29 / MAX: 67.12)
  2: 55.26 (SE +/- 0.21, N = 3; MIN: 52.4 / MAX: 85.07)
  3: 55.98 (SE +/- 0.90, N = 3; MIN: 51.31 / MAX: 862.27)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
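The vgg16 result for run 3 has a much larger standard error (0.90) and a maximum of 862.27 ms, so it is worth checking whether its higher average stands out from the noise at all. The sketch below expresses the gap between two averages in units of their combined standard error. The function name `gap_in_combined_se` is hypothetical, combining the errors in quadrature assumes the two runs are independent, and with N = 3 this is only a rough heuristic, not a formal significance test.

```python
from math import sqrt


def gap_in_combined_se(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two averages divided by their combined SE.

    Combined SE is sqrt(se_a**2 + se_b**2), assuming independent runs.
    """
    combined = sqrt(se_a ** 2 + se_b ** 2)
    if combined == 0:
        raise ValueError("combined standard error is zero")
    return abs(mean_a - mean_b) / combined


# NCNN vgg16: run 1a (55.07, SE 0.20) against run 3 (55.98, SE 0.90)
vgg16_gap = gap_in_combined_se(55.07, 0.20, 55.98, 0.90)
print(f"vgg16 1a vs 3: {vgg16_gap:.2f} combined standard errors apart")
```

A gap of about one combined standard error or less is usually read as indistinguishable from run-to-run noise.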
NCNN 20210525 - Target: CPU - Model: googlenet (ms, Fewer Is Better)
  1a: 12.38 (SE +/- 0.06, N = 3; MIN: 11.66 / MAX: 20.8)
  2: 12.45 (SE +/- 0.01, N = 3; MIN: 11.62 / MAX: 30.43)
  3: 12.51 (SE +/- 0.17, N = 3; MIN: 11.61 / MAX: 32.56)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: blazeface (ms, Fewer Is Better)
  1a: 1.71 (SE +/- 0.00, N = 3; MIN: 1.62 / MAX: 8.08)
  2: 1.79 (SE +/- 0.04, N = 3; MIN: 1.65 / MAX: 9.61)
  3: 1.70 (SE +/- 0.00, N = 3; MIN: 1.6 / MAX: 9.42)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  1a: 5.10 (SE +/- 0.04, N = 3; MIN: 4.83 / MAX: 13.32)
  2: 5.28 (SE +/- 0.09, N = 3; MIN: 4.9 / MAX: 13.73)
  3: 5.13 (SE +/- 0.04, N = 3; MIN: 4.8 / MAX: 38.8)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  1a: 3.67 (SE +/- 0.02, N = 3; MIN: 3.5 / MAX: 11.58)
  2: 3.82 (SE +/- 0.06, N = 3; MIN: 3.57 / MAX: 11.88)
  3: 3.79 (SE +/- 0.09, N = 3; MIN: 3.51 / MAX: 12.14)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  1a: 4.00 (SE +/- 0.04, N = 3; MIN: 3.74 / MAX: 11.71)
  2: 4.10 (SE +/- 0.03, N = 3; MIN: 3.89 / MAX: 11.86)
  3: 4.05 (SE +/- 0.01, N = 3; MIN: 3.86 / MAX: 11.94)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  1a: 3.91 (SE +/- 0.03, N = 3; MIN: 3.73 / MAX: 11.8)
  2: 4.00 (SE +/- 0.04, N = 3; MIN: 3.76 / MAX: 12.05)
  3: 3.96 (SE +/- 0.02, N = 3; MIN: 3.75 / MAX: 12.14)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  1a: 4.10 (SE +/- 0.02, N = 3; MIN: 3.86 / MAX: 12.56)
  2: 4.16 (SE +/- 0.06, N = 3; MIN: 3.87 / MAX: 12.37)
  3: 4.13 (SE +/- 0.02, N = 3; MIN: 3.88 / MAX: 12.67)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
NCNN 20210525 - Target: CPU - Model: mobilenet (ms, Fewer Is Better)
  1a: 12.14 (SE +/- 0.09, N = 3; MIN: 11.26 / MAX: 38.92)
  2: 12.12 (SE +/- 0.08, N = 3; MIN: 11.38 / MAX: 34.52)
  3: 12.05 (SE +/- 0.02, N = 3; MIN: 11.3 / MAX: 29.68)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
TNN 0.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better)
  1: 226.93 (SE +/- 0.40, N = 3; MIN: 222.06 / MAX: 239.49)
  2: 226.52 (SE +/- 2.57, N = 3; MIN: 218.62 / MAX: 239.35)
  3: 224.05 (SE +/- 1.38, N = 3; MIN: 218.93 / MAX: 236.84)
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
TNN 0.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better)
  1: 212.48 (SE +/- 0.82, N = 3; MIN: 210.54 / MAX: 219.75)
  2: 212.74 (SE +/- 3.04, N = 3; MIN: 206.57 / MAX: 219.72)
  3: 211.09 (SE +/- 0.10, N = 3; MIN: 210.04 / MAX: 211.62)
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
TNN 0.3 - Target: CPU - Model: SqueezeNet v2 (ms, Fewer Is Better)
  1: 50.93 (SE +/- 0.28, N = 3; MIN: 50.31 / MAX: 51.63)
  2: 51.11 (SE +/- 0.11, N = 3; MIN: 50.74 / MAX: 51.46)
  3: 50.72 (SE +/- 0.25, N = 3; MIN: 50.05 / MAX: 51.13)
1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
Phoronix Test Suite v10.8.5