NCNN

NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms.

To run this test with the Phoronix Test Suite, the basic command is: phoronix-test-suite benchmark ncnn.
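For scripted use, the command above can be wrapped in a small helper. The following is only a minimal sketch: the function names are hypothetical, and it assumes `phoronix-test-suite` is installed and on the PATH. It uses no command-line options beyond the `benchmark ncnn` form quoted above.

```python
import shutil
import subprocess


def build_benchmark_command(test_profile="ncnn"):
    """Return the argument list for the basic benchmark command."""
    return ["phoronix-test-suite", "benchmark", test_profile]


def run_benchmark(test_profile="ncnn"):
    """Run the benchmark and return the process exit code.

    Raises FileNotFoundError if phoronix-test-suite is not on the PATH.
    """
    cmd = build_benchmark_command(test_profile)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("phoronix-test-suite was not found on the PATH")
    # The suite may prompt for options, so stdin/stdout are left attached.
    return subprocess.run(cmd, check=False).returncode
```

Calling `run_benchmark()` starts the same run as typing the command in a terminal.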

Project Site

github.com

Test Created

18 September 2020

Last Updated

18 December 2020

Test Maintainer

Michael Larabel 

Test Type

System

Average Install Time

53 Seconds

Average Run Time

20 Minutes, 48 Seconds

Test Dependencies

CMake + C/C++ Compiler Toolchain + Vulkan

Accolades

5k+ Downloads

[Chart: NCNN Popularity Statistics (pts/ncnn), 2020.09 to 2021.05, showing public result uploads, reported installs*, and test completions* as event counts.]
* Data based on those opting to upload their test results to OpenBenchmarking.org and users enabling the opt-in anonymous statistics reporting while running benchmarks from an Internet-connected platform.
Data current as of Sat, 08 May 2021 18:55:56 GMT.
Target Option Popularity: CPU 55.5%, Vulkan GPU 44.5%
Model Option Popularity: mnasnet 6.7%, efficientnet-b0 6.7%, vgg16 6.7%, yolov4-tiny 6.7%, squeezenet_ssd 6.6%, resnet18 6.7%, regnety_400m 6.7%, shufflenet-v2 6.7%, resnet50 6.7%, mobilenet-v2 6.7%, alexnet 6.7%, blazeface 6.5%, googlenet 6.7%, mobilenet 6.7%, mobilenet-v3 6.6%

Revision History

pts/ncnn-1.1.0   Fri, 18 Dec 2020 08:06:41 GMT
Update against new upstream NCNN 20201218.

pts/ncnn-1.0.3   Fri, 25 Sep 2020 06:36:39 GMT
Drop int8 tests per https://github.com/phoronix-test-suite/test-profiles/pull/167

pts/ncnn-1.0.2   Thu, 24 Sep 2020 12:52:47 GMT
Expose Vulkan GPU support.

pts/ncnn-1.0.1   Fri, 18 Sep 2020 12:28:10 GMT
Increase the run count.

pts/ncnn-1.0.0   Fri, 18 Sep 2020 11:58:15 GMT
Initial commit of Tencent NCNN.

Suites Using This Test

Machine Learning

HPC - High Performance Computing

NVIDIA GPU Compute

Vulkan Compute


Performance Metrics

Analyze Test Configuration:

NCNN 20201218

Target: CPU - Model: resnet50

OpenBenchmarking.org metrics for this test profile configuration based on 492 public results since 18 December 2020 with the latest data as of 3 May 2021.

Below is an overview of the generalized performance for components where there is sufficient statistically significant data based upon user-uploaded results. Keep in mind that, particularly in the Linux/open-source space, OS configurations can differ vastly, so this overview is intended only as general guidance on performance expectations.

Component    Percentile Rank    # Matching Public Results    ms (Average)
-            100th              3                            20 +/- 1
-            89th               13                           24 +/- 2
Mid-Tier     75th               -                            > 25
Median       50th               -                            37
-            47th               10                           41 +/- 4
-            33rd               15                           49 +/- 3
-            33rd               11                           50 +/- 7
-            28th               11                           53 +/- 7
Low-Tier     25th               -                            > 55
-            24th               5                            58 +/- 4
-            19th               3                            63 +/- 1
-            14th               3                            73 +/- 1
-            12th               3                            92 +/- 10
-            10th               4                            135 +/- 2
-            9th                6                            171 +/- 2
-            7th                4                            184 +/- 1
-            4th                3                            267 +/- 2
-            2nd                6                            426 +/- 25
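The percentile ranks above place a result relative to all public results for this configuration, where a lower time in ms is better. The page does not state the exact formula OpenBenchmarking.org uses, so the following is only a sketch under one plausible definition: the share of public results that are equal to or slower than the given result. The function name is hypothetical.

```python
def percentile_rank(value_ms, public_results_ms):
    """Percentile rank of a latency result where lower is better.

    Assumed definition: percentage of public results that are
    equal to or slower (higher ms) than value_ms.
    """
    if not public_results_ms:
        raise ValueError("at least one public result is required")
    equal_or_slower = sum(1 for r in public_results_ms if r >= value_ms)
    return 100.0 * equal_or_slower / len(public_results_ms)
```

Under this definition the fastest result in the data set ranks at the 100th percentile, and a result slower than every public result ranks at 0.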
[Chart: Distribution Of Public Results - Target: CPU - Model: resnet50. 492 results range from 18 to 758 ms.]

Based on OpenBenchmarking.org data, the selected test / test configuration (NCNN 20201218 - Target: CPU - Model: resnet50) has an average run-time of 10 minutes. By default this test profile is set to run at least 3 times but may increase if the standard deviation exceeds pre-defined defaults or other calculations deem additional runs necessary for greater statistical accuracy of the result.
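The run-count behavior described above can be sketched as a loop that keeps adding runs while the relative standard deviation stays above a limit. This is an illustration, not the Phoronix Test Suite's actual logic: only the minimum of 3 runs comes from the text, while `max_runs`, `threshold_pct`, and both function names are assumptions chosen for the example.

```python
import statistics


def relative_std_dev_pct(samples):
    """Sample standard deviation as a percentage of the mean."""
    mean = statistics.mean(samples)
    if mean == 0:
        return 0.0
    return statistics.stdev(samples) / mean * 100.0


def run_until_stable(run_once, min_runs=3, max_runs=10, threshold_pct=3.5):
    """Collect at least min_runs results, adding runs while deviation is high.

    max_runs and threshold_pct are illustrative values, not the suite's defaults.
    """
    if min_runs < 2:
        raise ValueError("min_runs must be at least 2 to compute a deviation")
    results = [run_once() for _ in range(min_runs)]
    while len(results) < max_runs and relative_std_dev_pct(results) > threshold_pct:
        results.append(run_once())
    return results
```

Here `run_once` is any callable that performs one benchmark run and returns its result in ms. The same `relative_std_dev_pct` helper expresses the kind of between-run deviation percentage reported for this test.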

[Chart: Time Required To Complete Benchmark, in minutes - Target: CPU - Model: resnet50. Run-Time: Min: 3 / Avg: 9.37 / Max: 46]

Based on public OpenBenchmarking.org results, the selected test / test configuration has an average standard deviation of 1.2%.

[Chart: Average Deviation Between Runs, in percent (fewer is better) - Target: CPU - Model: resnet50. Deviation: Min: 0 / Avg: 1.21 / Max: 15]

Does It Scale Well With Increasing Cores?

Yes. Based on the automated analysis of the collected public benchmark data, this test / test configuration generally scales well with increasing CPU core counts. The data is based on publicly available results for this test / test configuration, separated by vendor, with each result divided by the reference CPU clock speed, grouped by matching physical CPU core count, and normalized against the smallest core count tested from each vendor, for each CPU having a sufficient number of test samples and statistically significant data.
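The normalization steps described above can be sketched as follows. This is a rough illustration rather than the analytics engine's actual code. It assumes each record is a hypothetical `(vendor, cores, clock_ghz, latency_ms)` tuple, and because the metric is a latency where lower is better, it assumes the result is first inverted to a throughput-like score before dividing by clock speed. Filtering for sample count and statistical significance is omitted.

```python
from collections import defaultdict
from statistics import mean


def core_scaling(results):
    """Relative per-clock scaling by core count, per vendor.

    results: iterable of (vendor, cores, clock_ghz, latency_ms) tuples.
    Returns {vendor: {cores: scaling relative to the smallest core count}}.
    """
    grouped = defaultdict(list)
    for vendor, cores, clock_ghz, latency_ms in results:
        # Assumption: invert latency so that higher means faster,
        # then divide by the reference clock speed.
        grouped[(vendor, cores)].append((1.0 / latency_ms) / clock_ghz)

    # Average the per-clock scores for each vendor and core count.
    by_vendor = defaultdict(dict)
    for (vendor, cores), scores in grouped.items():
        by_vendor[vendor][cores] = mean(scores)

    # Normalize against the smallest core count tested for each vendor.
    scaling = {}
    for vendor, per_core in by_vendor.items():
        base = per_core[min(per_core)]
        scaling[vendor] = {c: s / base for c, s in sorted(per_core.items())}
    return scaling
```

With this definition, the smallest core count for each vendor always has a scaling of 1.0, and larger values at higher core counts indicate better per-clock scaling.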

[Chart: NCNN CPU Core Scaling - Target: CPU - Model: resnet50. Relative core scaling to base for AMD and Intel at 4, 6, 8, 12, 16, and 32 cores.]

Notable Instruction Set Usage

Notable instruction set extensions supported by this test, based on an automatic analysis by the Phoronix Test Suite / OpenBenchmarking.org analytics engine.

Instruction Set: SSE2 (SSE2)
Support: Used by default on supported hardware.
Instructions Detected: MOVDQA MOVD PSHUFD PSHUFLW MOVDQU SUBSD MOVAPD MINSD MAXSD ADDSD CVTSI2SD DIVSD PUNPCKLQDQ CVTSS2SD MULSD MULPD PSRLDQ CVTSD2SS CVTDQ2PS CVTTPS2DQ PMULUDQ

Instruction Set: Advanced Vector Extensions (AVX)
Support: Used by default on supported hardware. Found on Intel processors since Sandy Bridge (2011). Found on AMD processors since Bulldozer (2011).
Instructions Detected: VZEROUPPER VINSERTF128 VEXTRACTF128 VBROADCASTSS VPERM2F128 VMASKMOVPS VPERMILPS VBROADCASTSD

Instruction Set: Advanced Vector Extensions 2 (AVX2)
Support: Used by default on supported hardware. Found on Intel processors since Haswell (2013). Found on AMD processors since Excavator (2016).
Instructions Detected: VINSERTI128 VPERM2I128 VEXTRACTI128 VPERMD VPERMQ VPBROADCASTD VPBROADCASTB VGATHERDPS VPBROADCASTW VPGATHERQQ VPSRLVQ VPBROADCASTQ

Instruction Set: FMA (FMA)
Support: Used by default on supported hardware. Found on Intel processors since Haswell (2013). Found on AMD processors since Bulldozer (2011).
Instructions Detected: VFMADD132PS VFMADD132SS VFMADD231SS VFMADD213PS VFMADD231PS VFNMADD231PS VFNMADD132PS VFMSUB231PS VFMSUB132PS VFMADD213SS VFMSUB213PS VFMSUB132SS VFMADD132SD VFNMADD213SS VFNMADD132SS VFNMADD231SS VFMADD132PD VFNMADD213PS

The test / benchmark does honor compiler flag changes.
Last automated analysis: 30 January 2021

This test profile binary relies on the shared libraries libgomp.so.1, libpthread.so.0, libm.so.6, libmvec.so.1, libc.so.6, libdl.so.2.

Recent Test Results

5 Systems - 59 Benchmark Results

2 x Intel Xeon Platinum 8380 - Intel M50CYP2SB2U - Intel Device 0998

Ubuntu 20.04 - 5.11.0-051100-generic - GNOME Shell 3.36.4

10 Systems - 454 Benchmark Results

Intel Core i5-11600K - ASUS ROG MAXIMUS XIII HERO - Intel Device 43ef

Ubuntu 21.04 - 5.12.0-051200rc3daily20210315-generic - GNOME Shell 3.38.3

3 Systems - 95 Benchmark Results

2 x AMD EPYC 7V13 64-Core - Microsoft Virtual Machine - 442GB

CentOS Linux 8 - 4.18.0-147.8.1.el8_1.x86_64 - GCC 8.3.1 20190507

2 Systems - 86 Benchmark Results

2 x AMD EPYC 7V13 64-Core - Microsoft Virtual Machine - 442GB

CentOS Linux 8 - 4.18.0-147.8.1.el8_1.x86_64 - GCC 8.3.1 20190507

12 Systems - 453 Benchmark Results

AMD Ryzen 5 5600X 6-Core - ASUS ROG CROSSHAIR VIII HERO - AMD Starship

Ubuntu 21.04 - 5.12.0-051200rc3daily20210315-generic - GNOME Shell 3.38.3

9 Systems - 442 Benchmark Results

AMD Ryzen 7 5800X 8-Core - ASUS ROG CROSSHAIR VIII HERO - AMD Starship

Ubuntu 21.04 - 5.12.0-051200rc3daily20210315-generic - GNOME Shell 3.38.3

1 System - 92 Benchmark Results

AMD Ryzen 5 5600X 6-Core - ASRock X570 Phantom Gaming-ITX/TB3 - AMD Device 1480

Ubuntu 18.04 - 5.4.0-70-generic - GNOME Shell 3.28.4

1 System - 323 Benchmark Results

Intel Core i9-11900K - ASUS ROG MAXIMUS XIII HERO - Intel Tiger Lake-H

Ubuntu 21.04 - 5.12.0-051200rc3daily20210315-generic - GNOME Shell 3.38.3

8 Systems - 439 Benchmark Results

Intel Core i5-11600K - ASUS ROG MAXIMUS XIII HERO - Intel Device 43ef

Ubuntu 21.04 - 5.12.0-051200rc3daily20210315-generic - GNOME Shell 3.38.3

1 System - 30 Benchmark Results

AMD Ryzen 5 3600 6-Core - MSI B550M PRO-VDH WIFI - AMD Starship

Linuxmint 20.1 - 5.11.0-051100-generic - MATE 1.24.0

1 System - 147 Benchmark Results

AMD Ryzen 9 5950X 16-Core - ASUS ROG CROSSHAIR VIII HERO - AMD Starship

Ubuntu 20.10 - 5.11.6-051106-generic - GNOME Shell 3.38.2

1 System - 30 Benchmark Results

ARMv8 rev 0 - Jetson-AGX - 32GB

Ubuntu 18.04 - 4.9.140-tegra - GNOME Shell 3.28.4

Most Popular Test Results

2 Systems - 151 Benchmark Results

Intel Xeon E-2278GEL - Logic Supply RXM-181 - Intel Cannon Lake PCH

Ubuntu 20.10 - 5.8.0-41-generic - GNOME Shell 3.38.2

6 Systems - 143 Benchmark Results

AMD Ryzen 9 5900X 12-Core - ASUS ROG CROSSHAIR VIII HERO - AMD Starship

Ubuntu 20.10 - 5.10.0-7.1-liquorix-amd64 - GNOME Shell 3.38.1

8 Systems - 439 Benchmark Results

Intel Core i9-11900K - ASUS ROG MAXIMUS XIII HERO - Intel Tiger Lake-H

Ubuntu 21.04 - 5.12.0-051200rc3daily20210315-generic - GNOME Shell 3.38.3

3 Systems - 108 Benchmark Results

Intel Core i7-10700T - Logic Supply RXM-181 - Intel Comet Lake PCH

Ubuntu 20.10 - 5.8.0-43-generic - GNOME Shell 3.38.2

3 Systems - 191 Benchmark Results

AMD Ryzen 3 2200G - ASUS PRIME B350M-E - AMD Raven

Ubuntu 20.10 - 5.8.0-38-generic - GNOME Shell 3.38.1

4 Systems - 80 Benchmark Results

Intel Core i7-9750H - Notebook P95_96_97Ex Rx - Intel Cannon Lake PCH

Ubuntu 20.04 - 5.7.0-999-generic - GNOME Shell 3.36.4

4 Systems - 91 Benchmark Results

AMD Ryzen 5 5600X 6-Core - ASUS TUF GAMING B550M-PLUS - AMD Starship

Ubuntu 20.10 - 5.10.4-051004-generic - GNOME Shell 3.38.1

4 Systems - 110 Benchmark Results

Intel Core i7-1165G7 - Dell 0GG9PT - Intel Tiger Lake-LP

Fedora 33 - 5.8.15-301.fc33.x86_64 - GNOME Shell 3.38.2
