TensorFlow Lite

This test profile benchmarks TensorFlow Lite, the TensorFlow implementation aimed at machine learning on mobile, IoT, edge, and similar devices. Linux support is currently limited to running on CPUs. The test profile measures the average inference time, reported in microseconds.
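As a rough illustration of what "average inference time" means, the sketch below times repeated calls to an arbitrary inference callable and reports the mean in microseconds. The function name, the `runs` and `warmup` parameters, and their defaults are assumptions made for this example; it does not reproduce how the test profile itself drives TensorFlow Lite.

```python
import time
from typing import Callable


def average_inference_time_us(
    infer: Callable[[], object],
    runs: int = 50,      # assumption: illustrative number of timed calls
    warmup: int = 5,     # assumption: untimed calls to settle caches
) -> float:
    """Return the mean wall-clock time of one infer() call in microseconds."""
    if runs <= 0:
        raise ValueError("runs must be positive")

    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        infer()

    # Time the whole batch, then divide by the number of calls.
    start = time.perf_counter()
    for _ in range(runs):
        infer()
    elapsed_seconds = time.perf_counter() - start

    return elapsed_seconds / runs * 1_000_000
```

Any callable that performs one inference can be passed as `infer`, for example a closure that invokes a loaded model on a fixed input.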

To run this test with the Phoronix Test Suite, the basic command is: phoronix-test-suite benchmark tensorflow-lite.
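For scripted use, the basic command above could be wrapped as follows. This is a minimal sketch: the helper names are hypothetical, only the command shown in the text is used, and the benchmark may still prompt interactively when run this way.

```python
import shutil
import subprocess
from typing import List, Optional

PTS_BINARY = "phoronix-test-suite"  # command name taken from the text above
TEST_PROFILE = "tensorflow-lite"    # test profile name taken from the text above


def build_benchmark_command(profile: str = TEST_PROFILE) -> List[str]:
    """Return the argument list for the basic benchmark command."""
    return [PTS_BINARY, "benchmark", profile]


def run_benchmark(profile: str = TEST_PROFILE) -> Optional[int]:
    """Run the benchmark if the Phoronix Test Suite is found on PATH.

    Returns the process exit code, or None when the binary is not installed.
    """
    if shutil.which(PTS_BINARY) is None:
        return None
    completed = subprocess.run(build_benchmark_command(profile), check=False)
    return completed.returncode
```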

Project Site

tensorflow.org

Source Repository

github.com

Test Created

23 August 2020

Last Updated

19 May 2022

Test Maintainer

Michael Larabel 

Test Type

System

Average Install Time

9 Seconds

Average Run Time

4 Minutes, 53 Seconds

Accolades

10k+ Downloads

Supported Platforms


[Chart: TensorFlow Lite Popularity Statistics (pts/tensorflow-lite), events per period from 2020.08 to 2024.04. Series: Public Result Uploads *, Reported Installs **, Reported Test Completions **, Test Profile Page Views ***. Source: OpenBenchmarking.org]
* Uploading of benchmark result data to OpenBenchmarking.org is always optional (opt-in) via the Phoronix Test Suite for users wishing to share their results publicly.
** Data based on those opting to upload their test results to OpenBenchmarking.org and users enabling the opt-in anonymous statistics reporting while running benchmarks from an Internet-connected platform.
*** Test profile page view reporting began March 2021.
Data updated weekly as of 13 April 2024.
Model Option Popularity (OpenBenchmarking.org)

Mobilenet Float: 19.7%
Inception V4: 17.3%
SqueezeNet: 17.1%
Mobilenet Quant: 16.9%
Inception ResNet V2: 15.4%
NASNet Mobile: 13.6%

Revision History

pts/tensorflow-lite-1.1.0   Thu, 19 May 2022 09:57:39 GMT
Update against latest upstream nightly.

pts/tensorflow-lite-1.0.0   Sun, 23 Aug 2020 14:13:10 GMT
TensorFlow Lite initial commit.

Suites Using This Test

Machine Learning

HPC - High Performance Computing


Performance Metrics

Analyze Test Configuration:

TensorFlow Lite 2020-08-23

Model: SqueezeNet

OpenBenchmarking.org metrics for this test profile configuration based on 1,394 public results since 23 August 2020 with the latest data as of 3 October 2023.

Below is an overview of generalized performance for components with enough user-uploaded results to be statistically significant. Keep in mind that OS configurations can differ widely, particularly in the Linux/open-source space, so this overview is intended only as general guidance on performance expectations.

Per-component results (component names not listed):

Percentile Rank   # Compatible Public Results   Microseconds (Average)
100th   17   44713 +/- 1129
99th   3   45465 +/- 898
97th   27   48357 +/- 1181
96th   3   48988 +/- 322
95th   5   50229 +/- 504
93rd   14   56523 +/- 3777
93rd   8   56779 +/- 483
92nd   3   58062 +/- 240
90th   3   60924 +/- 6
88th   4   61794 +/- 1621
88th   6   61957 +/- 490
88th   19   62085 +/- 1106
88th   12   62507 +/- 2217
86th   5   63789 +/- 5734
86th   23   64999 +/- 5562
85th   8   66289 +/- 981
84th   17   67052 +/- 6792
82nd   8   69557 +/- 933
81st   6   71115 +/- 381
80th   19   71827 +/- 6163
80th   6   73947 +/- 231
79th   10   75434 +/- 368
77th   5   76536 +/- 1585
77th   6   80145 +/- 951
77th   4   80221 +/- 637
76th   8   80568 +/- 2735
76th   8   81165 +/- 1146
Mid-Tier (75th): > 81779
75th   10   82184 +/- 214
74th   3   84972 +/- 150
74th   3   85039 +/- 91
72nd   15   88229 +/- 1229
72nd   16   88734 +/- 607
71st   7   90168 +/- 400
70th   4   92173 +/- 9701
70th   10   93571 +/- 200
67th   53   94975 +/- 790
65th   4   100056 +/- 407
64th   6   101229 +/- 797
64th   3   101444 +/- 365
64th   5   102345 +/- 2877
64th   11   102429 +/- 1591
63rd   6   105051 +/- 3
62nd   16   107009 +/- 359
61st   5   108274 +/- 9
61st   3   108699 +/- 51
57th   10   123957 +/- 229
57th   42   123974 +/- 1022
56th   23   126357 +/- 3410
54th   8   128967 +/- 270
52nd   9   138378 +/- 637
52nd   6   140837 +/- 16426
51st   9   141406 +/- 17973
51st   10   142148 +/- 900
Median (50th): 150945
49th   15   163373 +/- 753
46th   29   168059 +/- 1034
46th   18   168363 +/- 4107
46th   5   168710 +/- 187
44th   6   175803 +/- 2187
43rd   7   180874 +/- 6382
43rd   3   182811 +/- 1058
43rd   15   183065 +/- 2446
42nd   8   186539 +/- 118
40th   3   191120 +/- 69
40th   8   191242 +/- 764
39th   15   205132 +/- 4018
37th   8   221107 +/- 56
36th   7   222020 +/- 10916
36th   14   223628 +/- 7553
36th   23   224302 +/- 3599
34th   7   227736 +/- 1252
33rd   4   232749 +/- 3491
32nd   10   242801 +/- 327
30th   4   253193 +/- 2302
30th   3   253299 +/- 1247
30th   11   255131 +/- 8449
30th   3   256574 +/- 154
30th   4   256843 +/- 23206
30th   8   256919 +/- 2919
29th   3   261785 +/- 3766
27th   3   285440 +/- 131
27th   3   285508 +/- 74
27th   15   293223 +/- 8067
Low-Tier (25th): > 296361
25th   3   299067 +/- 1359
25th   7   304630 +/- 7885
24th   5   313628 +/- 82
24th   18   317756 +/- 14239
23rd   5   320863 +/- 109
23rd   12   324622 +/- 2189
22nd   6   336429 +/- 45501
21st   14   352167 +/- 4064
20th   11   365139 +/- 784
19th   4   371607 +/- 2319
19th   6   377905 +/- 3286
19th   7   386834 +/- 441
18th   3   391769 +/- 100
18th   11   392284 +/- 24211
17th   3   399316 +/- 18
17th   4   402912 +/- 6915
17th   4   404755 +/- 11650
16th   3   418396 +/- 968
15th   3   448585 +/- 65
15th   4   450030 +/- 571
15th   4   450832 +/- 10749
15th   5   461470 +/- 20507
14th   3   473781 +/- 94
14th   3   479641 +/- 565
14th   3   487746 +/- 2310
13th   4   497443 +/- 86
13th   3   499788 +/- 82
13th   4   512579 +/- 293
12th   3   521462 +/- 2743
12th   3   535918 +/- 572
12th   4   541598 +/- 885
11th   3   542016 +/- 244
10th   46   551547 +/- 28931
8th   3   577579 +/- 442
8th   4   578321 +/- 5617
8th   3   580987 +/- 378
6th   15   645887 +/- 22844
6th   4   650641 +/- 108
6th   3   663458 +/- 95
5th   5   681259 +/- 50401
5th   3   684355 +/- 7490
5th   9   691242 +/- 57555
4th   6   721440 +/- 137
3rd   3   858599 +/- 18908
3rd   4   915523 +/- 113
3rd   3   994248 +/- 7234
3rd   3   1147490 +/- 201
3rd   4   1216949 +/- 330
2nd   5   1317480 +/- 1263
2nd   3   1440024 +/- 431
1st   3   1490682 +/- 522
1st   4   1628580 +/- 390
[Chart: Distribution Of Public Results - Model: SqueezeNet. 1394 results range from 42486 to 2363010 microseconds. Source: OpenBenchmarking.org]

Based on OpenBenchmarking.org data, the selected test configuration (TensorFlow Lite 2020-08-23 - Model: SqueezeNet) has an average run-time of 4 minutes. By default this test profile runs at least 3 times; the run count may increase if the standard deviation exceeds pre-defined defaults or if other calculations deem additional runs necessary for greater statistical accuracy of the result.
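The run-count behavior described above can be sketched as a loop that keeps sampling while the run-to-run deviation stays high. Only the minimum of 3 runs comes from the text; the `MAX_RUNS` cap, the `MAX_DEVIATION_PCT` threshold, the use of the sample standard deviation, and all function names are assumptions for illustration, not the Phoronix Test Suite's actual defaults or logic.

```python
import statistics
from typing import Callable, List

MIN_RUNS = 3              # stated in the text: at least 3 runs
MAX_RUNS = 15             # assumption: illustrative upper bound
MAX_DEVIATION_PCT = 3.5   # assumption: illustrative threshold, in percent


def relative_deviation_pct(samples: List[float]) -> float:
    """Sample standard deviation expressed as a percentage of the mean."""
    if len(samples) < 2:
        return 0.0
    mean = statistics.mean(samples)
    if mean == 0:
        return 0.0
    return statistics.stdev(samples) / mean * 100.0


def run_until_stable(run_once: Callable[[], float]) -> List[float]:
    """Run at least MIN_RUNS times, adding runs while deviation is too high."""
    samples = [run_once() for _ in range(MIN_RUNS)]
    while (
        len(samples) < MAX_RUNS
        and relative_deviation_pct(samples) > MAX_DEVIATION_PCT
    ):
        samples.append(run_once())
    return samples
```

Here `run_once` stands for a single benchmark run that returns its result, such as an average inference time in microseconds.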

[Chart: Time Required To Complete Benchmark - Model: SqueezeNet. Run-time in minutes: Min: 3 / Avg: 3.24 / Max: 12. Source: OpenBenchmarking.org]

Based on public OpenBenchmarking.org results, the selected test / test configuration has an average standard deviation of 0.4%.

[Chart: Average Deviation Between Runs - Model: SqueezeNet. Percent, fewer is better: Min: 0 / Avg: 0.37 / Max: 6. Source: OpenBenchmarking.org]

Does It Scale Well With Increasing Cores?

Yes. Based on automated analysis of the collected public benchmark data, this test configuration generally scales well with increasing CPU core counts. The analysis uses publicly available results for this configuration, separated by vendor, with each result divided by the reference CPU clock speed, grouped by matching physical CPU core count, and normalized against the smallest core count tested from each vendor. Only CPUs with a sufficient number of test samples and statistically significant data are included.
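One way to read the normalization steps above is sketched below. The `Result` record, its field names, and the function name are hypothetical. Because the metric here is a time (lower is better), the sketch first converts each result to a rate (1 / microseconds) before dividing by clock speed; that conversion is an assumption, and the actual OpenBenchmarking.org analysis may differ.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, NamedTuple, Tuple


class Result(NamedTuple):
    vendor: str          # e.g. "Intel" or "AMD"
    cores: int           # physical CPU core count
    clock_ghz: float     # reference CPU clock speed
    microseconds: float  # average inference time (lower is better)


def relative_core_scaling(results: List[Result]) -> Dict[str, Dict[int, float]]:
    """Return per-vendor scaling factors relative to the smallest core count."""
    # Step 1: convert time to a rate and divide by the reference clock speed.
    grouped: Dict[Tuple[str, int], List[float]] = defaultdict(list)
    for r in results:
        per_clock_rate = (1.0 / r.microseconds) / r.clock_ghz
        grouped[(r.vendor, r.cores)].append(per_clock_rate)

    # Step 2: average within each (vendor, core count) group.
    by_vendor: Dict[str, Dict[int, float]] = defaultdict(dict)
    for (vendor, cores), rates in grouped.items():
        by_vendor[vendor][cores] = mean(rates)

    # Step 3: normalize against the smallest core count for each vendor.
    scaling: Dict[str, Dict[int, float]] = {}
    for vendor, per_core in by_vendor.items():
        base = per_core[min(per_core)]
        scaling[vendor] = {c: per_core[c] / base for c in sorted(per_core)}
    return scaling
```

A value of 2.0 at a given core count would mean twice the clock-adjusted throughput of that vendor's smallest tested core count.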

[Chart: TensorFlow Lite CPU Core Scaling - Model: SqueezeNet. Relative core scaling to base for Intel and AMD at core counts 2, 4, 6, 8, 12, 16, 24, 32, 48, and 64. Source: OpenBenchmarking.org]

Notable Instruction Set Usage

Notable instruction set extensions supported by this test, based on an automatic analysis by the Phoronix Test Suite / OpenBenchmarking.org analytics engine.

Instruction Set: SSE2 (SSE2)
Support: Used by default on supported hardware.
Instructions Detected: MOVDQA PMULUDQ PSHUFD PSRLDQ MOVD CVTSI2SD ADDSD MULSD SUBSD DIVSD MOVAPD CVTSS2SD CVTTSD2SI SQRTSD UCOMISD XORPD CVTSD2SS CVTTPS2DQ CVTDQ2PS PADDQ MOVDQU PUNPCKLQDQ UNPCKLPD CVTDQ2PD MULPD CVTPD2PS ANDPD MAXSD PSUBQ CVTPS2PD MINSD MOVUPD UNPCKHPD ADDPD PUNPCKHQDQ CVTPS2DQ

Instruction Set: Advanced Vector Extensions (AVX)
Support: Used by default on supported hardware. Found on Intel processors since Sandy Bridge (2011). Found on AMD processors since Bulldozer (2011).
Instructions Detected: VBROADCASTF128 VZEROUPPER VMASKMOVPS VEXTRACTF128 VBROADCASTSS VINSERTF128 VPERMILPS

Instruction Set: Advanced Vector Extensions 512 (AVX512)
Support: Used by default on supported hardware.
Instructions Detected: (ZMM REGISTER USE)

Instruction Set: FMA (FMA)
Support: Used by default on supported hardware. Found on Intel processors since Haswell (2013). Found on AMD processors since Bulldozer (2011).
Instructions Detected: VFMADD132PS VFMADD231PS VFMADD213PS
Last automated analysis: 18 January 2022

This test profile binary relies on the shared libraries libm.so.6, libpthread.so.0, libdl.so.2, librt.so.1, libc.so.6.

Tested CPU Architectures

This benchmark has been successfully tested on the architectures listed below. These are the CPU architectures for which successful OpenBenchmarking.org result uploads occurred, which helps determine whether a given test is compatible with various alternative CPU architectures.

CPU Architecture   Kernel Identifier   Verified On
Intel / AMD x86 64-bit   x86_64   (Many Processors)