Caffe

This is a benchmark of the Caffe deep learning framework. It currently supports the AlexNet and GoogleNet models and execution on both CPUs and NVIDIA GPUs.

To run this test with the Phoronix Test Suite, the basic command is: phoronix-test-suite benchmark caffe.

Test Created

14 November 2015

Last Updated

26 September 2020

Test Maintainer

Michael Larabel 

Test Type

System

Average Install Time

36 Seconds

Average Run Time

2 Minutes, 1 Second

Test Dependencies

C/C++ Compiler Toolchain + CMake + Python + BLAS (Basic Linear Algebra Sub-Routine) + C++ Boost + Linear Algebra Pack + Snappy Compression + GFlags + OpenCV + HDF5

Accolades

150k+ Downloads

Supported Platforms


[Chart: Caffe AlexNet Popularity Statistics (pts/caffe), events over time from 2015.11 to 2022.08. Series: Public Result Uploads *, Reported Installs **, Reported Test Completions **, Test Profile Page Views ***.]
* Uploading of benchmark result data to OpenBenchmarking.org is always optional (opt-in) via the Phoronix Test Suite for users wishing to share their results publicly.
** Data based on those opting to upload their test results to OpenBenchmarking.org and users enabling the opt-in anonymous statistics reporting while running benchmarks from an Internet-connected platform.
*** Test profile page view reporting began March 2021.
Data current as of 24 September 2022.
Model Option Popularity: AlexNet 51.7%, GoogleNet 48.3%
Acceleration Option Popularity: CPU 85.0%, NVIDIA CUDA 15.0%
Iterations Option Popularity: 200: 46.2%, 100: 43.0%, 1000: 10.8%

Revision History

pts/caffe-1.5.0   Sat, 26 Sep 2020 21:35:45 GMT
Overhaul Caffe test profile with latest Git snapshot, switch to CMake build system, clean up test options, etc.

pts/caffe-1.4.0   Sat, 29 Dec 2018 11:15:41 GMT
Update Caffe to latest Git snapshot to hopefully work around build problems on newer distros.

pts/caffe-1.3.3   Sun, 01 Apr 2018 18:50:19 GMT
Basic fix for OpenCV 3.4.

pts/caffe-1.3.2   Wed, 04 Jan 2017 11:07:36 GMT
Fix for OpenCV 3.2.

pts/caffe-1.3.1   Wed, 28 Dec 2016 20:36:42 GMT
Show the title string "Caffe" rather than "Caffe AlexNet", as recent test profile versions support more than just AlexNet.

pts/caffe-1.3.0   Wed, 28 Dec 2016 20:34:27 GMT
Update to latest Git snapshot to fix OpenCV compatibility.

pts/caffe-1.2.0   Mon, 15 Aug 2016 16:11:16 GMT
Add GoogleNet support; decrease the CPU-only iteration count.

pts/caffe-1.1.1   Sun, 12 Jun 2016 18:32:44 GMT
Add OpenCV and OpenBLAS support.

pts/caffe-1.1.0   Sat, 11 Jun 2016 19:32:55 GMT
Update

pts/caffe-1.0.0   Sat, 14 Nov 2015 15:29:45 GMT
Initial commit of the Caffe deep learning framework test profile, with this benchmark using the AlexNet model.

Suites Using This Test

Machine Learning

HPC - High Performance Computing

NVIDIA GPU Compute


Performance Metrics

Analyze Test Configuration:

Caffe 2020-02-13

Model: AlexNet - Acceleration: CPU - Iterations: 200

OpenBenchmarking.org metrics for this test profile configuration based on 668 public results since 26 September 2020 with the latest data as of 9 September 2022.

Below is an overview of the generalized performance for components where there is sufficient statistically significant data based upon user-uploaded results. Keep in mind that, particularly in the Linux/open-source space, OS configurations can differ vastly; this overview is intended to offer only general guidance as to performance expectations.

Component  Percentile Rank  # Compatible Public Results  Milliseconds (Average)
-          98th             5                            61026 +/- 486
-          97th             3                            62602 +/- 247
-          96th             3                            64228 +/- 734
-          96th             7                            64307 +/- 2674
-          96th             3                            64710 +/- 67
-          94th             4                            67800 +/- 662
-          92nd             3                            69547 +/- 56
-          92nd             8                            70426 +/- 1728
-          89th             8                            73630 +/- 669
-          88th             3                            74271 +/- 70
-          86th             25                           75833 +/- 5920
-          86th             3                            75950 +/- 410
-          86th             17                           76466 +/- 3220
-          83rd             5                            79773 +/- 399
-          83rd             6                            79908 +/- 403
-          81st             3                            80617 +/- 1285
-          79th             7                            81625 +/- 11296
-          79th             3                            81752 +/- 84
-          76th             4                            88529 +/- 1791
Mid-Tier   75th             -                            > 88639
-          75th             7                            89252 +/- 3692
-          75th             3                            89443 +/- 1333
-          74th             7                            89567 +/- 74
-          74th             15                           90389 +/- 8876
-          73rd             3                            93847 +/- 217
-          72nd             3                            96087 +/- 128
-          70th             7                            99795 +/- 4962
-          69th             3                            101069 +/- 677
-          67th             4                            103831 +/- 1771
-          67th             6                            104489 +/- 1237
-          67th             3                            105259 +/- 2634
-          64th             6                            106754 +/- 3969
-          63rd             8                            107223 +/- 4085
-          62nd             3                            107769 +/- 480
-          62nd             6                            108091 +/- 813
-          61st             5                            108205 +/- 787
-          61st             3                            108482 +/- 201
-          57th             6                            112603 +/- 604
-          56th             4                            114228 +/- 1992
-          56th             3                            114458 +/- 524
-          56th             3                            114831 +/- 794
-          55th             6                            116644 +/- 8097
-          54th             3                            116913 +/- 876
-          54th             3                            117435 +/- 102
-          53rd             3                            117632 +/- 405
-          52nd             3                            118247 +/- 227
-          52nd             3                            119073 +/- 2776
-          51st             4                            119687 +/- 404
-          51st             3                            120083 +/- 702
Median     50th             -                            120100
-          50th             3                            120153 +/- 202
-          50th             6                            120202 +/- 1993
-          49th             3                            120562 +/- 155
-          48th             3                            121645 +/- 146
-          47th             3                            121796 +/- 124
-          46th             3                            122885 +/- 956
-          46th             6                            123132 +/- 2195
-          44th             10                           125387 +/- 2866
-          44th             10                           126334 +/- 2828
-          44th             7                            126960 +/- 8050
-          42nd             3                            128586 +/- 802
-          40th             7                            129971 +/- 285
-          39th             6                            130377 +/- 1800
-          39th             3                            130901 +/- 185
-          37th             3                            133822 +/- 48
-          36th             3                            134387 +/- 694
-          35th             3                            137302 +/- 134
-          33rd             14                           139688 +/- 1120
-          32nd             13                           142703 +/- 4586
-          32nd             3                            142973 +/- 424
-          31st             3                            143622 +/- 167
-          28th             10                           149988 +/- 3124
Low-Tier   25th             -                            > 150736
-          25th             9                            151039 +/- 1095
-          24th             10                           151623 +/- 299
-          23rd             10                           152268 +/- 2920
-          23rd             10                           152465 +/- 20877
-          22nd             10                           153144 +/- 3125
-          18th             8                            157338 +/- 3321
-          17th             3                            158865 +/- 409
-          17th             10                           159527 +/- 19353
-          17th             8                            160293 +/- 3276
-          17th             7                            161676 +/- 9590
-          17th             10                           162384 +/- 12424
-          15th             10                           166472 +/- 3205
-          8th              10                           177498 +/- 5017
-          7th              3                            193058 +/- 6247
-          7th              3                            203286 +/- 3004
-          6th              4                            204085 +/- 1020
-          5th              4                            209788 +/- 545
-          4th              4                            214117 +/- 1527
-          3rd              3                            224315 +/- 647
-          2nd              3                            244159 +/- 8571
-          2nd              3                            248715 +/- 4600
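
As a rough illustration of how a percentile rank can be read for a time-based result, the sketch below ranks one average time against a set of public results, treating lower times as better. The ranking rule (the share of results that are slower), the function name, and the sample values are assumptions for illustration; the page does not state how OpenBenchmarking.org computes its ranks.

```python
from bisect import bisect_right


def percentile_rank(result_ms, public_results_ms):
    """Return the percentage of public results that are slower than result_ms.

    Lower times are better, so a result faster than most uploads gets a
    high rank. This ranking rule is an assumption for illustration only.
    """
    if not public_results_ms:
        raise ValueError("need at least one public result")
    ordered = sorted(public_results_ms)
    # Number of results that are less than or equal to result_ms.
    not_slower = bisect_right(ordered, result_ms)
    slower = len(ordered) - not_slower
    return 100.0 * slower / len(ordered)


# Hypothetical average times in milliseconds (not real uploads).
sample = [61000, 75000, 90000, 120000, 150000, 180000, 210000, 250000]
print(percentile_rank(90000, sample))  # 5 of 8 results are slower -> 62.5
```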
[Chart: Distribution Of Public Results - Model: AlexNet - Acceleration: CPU - Iterations: 200. 668 results range from 38405 to 1296100 milliseconds.]

Based on OpenBenchmarking.org data, the selected test / test configuration (Caffe 2020-02-13 - Model: AlexNet - Acceleration: CPU - Iterations: 200) has an average run-time of 8 minutes. By default this test profile is set to run at least 3 times, but the run count may increase if the standard deviation exceeds predefined limits or other calculations deem additional runs necessary for greater statistical accuracy of the result.
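
The run-count behavior described above can be sketched roughly as follows. Only the minimum of 3 runs comes from this page; the deviation threshold, the maximum run count, and the function names are hypothetical, and the Phoronix Test Suite's actual logic may differ.

```python
import statistics


def relative_std_dev_percent(samples):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * statistics.stdev(samples) / statistics.mean(samples)


def run_until_stable(run_once, min_runs=3, max_runs=6, threshold_percent=3.5):
    """Run at least min_runs times, adding runs while results are too noisy.

    max_runs and threshold_percent are hypothetical values for illustration.
    """
    results = [run_once() for _ in range(min_runs)]
    while (len(results) < max_runs
           and relative_std_dev_percent(results) > threshold_percent):
        results.append(run_once())
    return results


# Hypothetical timings in milliseconds; the third run is an outlier,
# so extra runs are requested until the hypothetical maximum is reached.
timings = iter([120000.0, 121000.0, 150000.0, 120500.0, 120200.0, 119900.0])
results = run_until_stable(lambda: next(timings))
print(len(results), "runs")
```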

[Chart: Time Required To Complete Benchmark (minutes) - Model: AlexNet - Acceleration: CPU - Iterations: 200. Run-Time: Min: 2 / Avg: 7.25 / Max: 27]

Based on public OpenBenchmarking.org results, the selected test / test configuration has an average standard deviation of 0.3%.

[Chart: Average Deviation Between Runs (percent, fewer is better) - Model: AlexNet - Acceleration: CPU - Iterations: 200. Deviation: Min: 0 / Avg: 0.27 / Max: 4]

Does It Scale Well With Increasing Cores?

No, based on the automated analysis of the collected public benchmark data, this test with these settings does not generally scale well with increasing CPU core counts. The data is based on publicly available results for this test and these settings, separated by vendor, with each result divided by the reference CPU clock speed, grouped by matching physical CPU core count, and normalized against the smallest core count tested from each vendor, for each CPU having a sufficient number of test samples and statistically significant data.
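
A minimal sketch of the normalization steps described above, assuming each sample carries a vendor, physical core count, reference clock speed, and result time. The function name, the input layout, the sample values, and the choice to report base / value (so that numbers above 1.0 mean faster than the base) are assumptions for illustration, not the analytics engine's actual code.

```python
from collections import defaultdict
from statistics import mean


def relative_core_scaling(samples):
    """samples: iterable of (vendor, physical_cores, clock_ghz, result_ms).

    Returns {vendor: {physical_cores: scaling relative to smallest core count}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for vendor, cores, clock_ghz, result_ms in samples:
        # Divide each result by the reference clock speed, then group
        # by vendor and matching physical core count.
        grouped[vendor][cores].append(result_ms / clock_ghz)

    scaling = {}
    for vendor, by_cores in grouped.items():
        averaged = {cores: mean(values) for cores, values in by_cores.items()}
        base = averaged[min(averaged)]  # smallest core count for this vendor
        # Assumption: lower time is better, so base / value > 1.0 means the
        # larger core count is faster than the vendor's smallest one.
        scaling[vendor] = {cores: base / value
                           for cores, value in sorted(averaged.items())}
    return scaling


# Hypothetical samples (not real uploads).
data = [
    ("Intel", 4, 4.0, 160000.0),
    ("Intel", 8, 4.0, 128000.0),
    ("AMD", 6, 4.0, 120000.0),
    ("AMD", 12, 4.0, 100000.0),
]
print(relative_core_scaling(data))
```

With these made-up numbers, doubling the core count yields only a 1.2x to 1.25x gain, which is the kind of pattern that would be read as not scaling well.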

[Chart: Caffe AlexNet CPU Core Scaling - Model: AlexNet - Acceleration: CPU - Iterations: 200. Relative core scaling to base for Intel and AMD across core counts from 4 to 128.]

Notable Instruction Set Usage

Notable instruction set extensions supported by this test, based on an automatic analysis by the Phoronix Test Suite / OpenBenchmarking.org analytics engine.

Instruction Set: SSE2 (SSE2)
Support: Used by default on supported hardware.
Instructions Detected: PUNPCKLQDQ MOVDQA MOVDQU CVTSS2SD MOVD ADDSD DIVSD CVTTSD2SI MOVUPD CVTPS2PD CVTPD2PS CVTSD2SS PSHUFD XORPD SHUFPD SUBSD MULSD CVTSI2SD MOVAPD UCOMISD UNPCKLPD CVTDQ2PS COMISD CVTDQ2PD SQRTSD ANDPD ANDNPD CMPNLESD ORPD DIVPD MULPD MINSD MINPD MAXPD MAXSD CMPLTPD ADDPD CMPLTSD MOVHPD SUBPD MOVLPD UNPCKHPD PMULUDQ PSRLDQ

Instruction Set: Advanced Vector Extensions (AVX)
Support: Requires passing a supported compiler/build flag (verified with targets: sandybridge, skylake, tigerlake, cascadelake, sapphirerapids, alderlake, znver2, znver3). Found on Intel processors since Sandy Bridge (2011). Found on AMD processors since Bulldozer (2011).
Instructions Detected: VZEROUPPER VINSERTF128 VEXTRACTF128 VPERM2F128 VPERMILPS VPERMILPD VBROADCASTSS VBROADCASTSD VMASKMOVPS

Instruction Set: Advanced Vector Extensions 2 (AVX2)
Support: Requires passing a supported compiler/build flag (verified with targets: skylake, tigerlake, cascadelake, sapphirerapids, alderlake, znver2, znver3). Found on Intel processors since Haswell (2013). Found on AMD processors since Excavator (2016).
Instructions Detected: VPERM2I128 VPERMD VPERMPD VPBROADCASTQ VPBROADCASTD VPERMQ VGATHERQPS VEXTRACTI128 VPMASKMOVD VINSERTI128 VPGATHERDD VPBROADCASTW

Instruction Set: FMA (FMA)
Support: Requires passing a supported compiler/build flag (verified with targets: skylake, tigerlake, cascadelake, sapphirerapids, alderlake, znver2, znver3). Found on Intel processors since Haswell (2013). Found on AMD processors since Bulldozer (2011).
Instructions Detected: VFMADD132SS VFMADD132SD VFMSUB213PS VFMSUB132SS VFMSUB213PD VFMSUB132SD VFNMADD213SD VFNMADD213SS VFMADD231SS VFNMADD231SS VFMADD213SS VFNMADD132SS VFMADD231SD VFNMADD132SD VFMADD213SD VFMADD132PS VFMADD132PD VFNMADD132PD VFNMADD213PD VFNMADD132PS VFNMADD213PS VFMSUB231SD VFNMADD231SD VFMADD231PD

Instruction Set: Advanced Vector Extensions 512 (AVX512)
Support: Requires passing a supported compiler/build flag (verified with targets: cascadelake, sapphirerapids).
Instructions Detected: (ZMM REGISTER USE)

The test / benchmark does honor compiler flag changes.
Last automated analysis: 17 January 2022
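
To check whether a given machine advertises the instruction set extensions listed above, one option is to inspect the CPU flags the Linux kernel reports. The sketch below assumes an x86 Linux system, where `/proc/cpuinfo` contains a `flags` line; the function name and the chosen flag names (`avx512f` standing in for AVX-512 generally) are illustrative choices, and other architectures expose this information differently.

```python
def detected_extensions(cpuinfo_text,
                        wanted=("sse2", "avx", "avx2", "fma", "avx512f")):
    """Map each wanted flag name to True/False based on /proc/cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # The first "flags" line is enough; it repeats per logical CPU.
            flags.update(value.split())
            break
    return {name: name in flags for name in wanted}


# Hypothetical, shortened cpuinfo excerpt for demonstration.
example = "processor\t: 0\nflags\t\t: fpu sse sse2 avx avx2 fma\n"
print(detected_extensions(example))
```

On a real system the text could be read with `open("/proc/cpuinfo").read()` and passed to the same function.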

This test profile binary relies on the shared libraries libcaffe.so.1.0.0, libglog.so.0, libgflags.so.2.2, libprotobuf.so.23, libc.so.6, libm.so.6, liblmdb.so.0, libopenblas.so.0, libunwind.so.8, libpthread.so.0, libz.so.1, libcrypto.so.3, libcurl.so.4, libsz.so.2, libgfortran.so.5, liblzma.so.5, libnghttp2.so.14, libidn2.so.0, librtmp.so.1, libssh.so.4, libpsl.so.5, libssl.so.3, libldap-2.5.so.0, liblber-2.5.so.0, libzstd.so.1, libbrotlidec.so.1, libaec.so.0, libquadmath.so.0, libunistring.so.2, libgnutls.so.30, libhogweed.so.6, libnettle.so.8, libgmp.so.10, libkrb5.so.3, libk5crypto.so.3, libkrb5support.so.0, libsasl2.so.2, libbrotlicommon.so.1, libp11-kit.so.0, libtasn1.so.6, libkeyutils.so.1, libresolv.so.2, libffi.so.8.

Tested CPU Architectures

This benchmark has been successfully tested on the architectures listed below. These are the CPU architectures for which successful OpenBenchmarking.org result uploads occurred, which helps determine whether a given test is compatible with various alternative CPU architectures.

CPU Architecture            Kernel Identifier  Verified On
Intel / AMD x86 64-bit      x86_64             (Many Processors)
IBM POWER (PowerPC) 64-bit  ppc64le            POWER9 44-Core
ARMv8 64-bit                aarch64            ARMv8 Cortex-A72 6-Core, ARMv8 Neoverse-V1, Ampere Altra ARMv8 Neoverse-N1 160-Core
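
As a small convenience, the kernel identifiers listed above can be compared against what the local machine reports. This sketch assumes that Python's `platform.machine()` returns the same string as the kernel identifier on Linux; the mapping and function name are illustrative only.

```python
import platform

# Kernel identifiers from the list above, mapped to their architecture names.
VERIFIED_ARCHITECTURES = {
    "x86_64": "Intel / AMD x86 64-bit",
    "ppc64le": "IBM POWER (PowerPC) 64-bit",
    "aarch64": "ARMv8 64-bit",
}


def verified_architecture(machine=None):
    """Return the architecture name for a kernel identifier, or None."""
    if machine is None:
        machine = platform.machine()
    return VERIFIED_ARCHITECTURES.get(machine)


name = verified_architecture()
print(name if name else "architecture not in the verified list")
```

A missing entry only means no public result upload was recorded for that architecture, not that the test cannot run there.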

Recent Test Results

13 Systems - 228 Benchmark Results

AMD Ryzen 9 7950X 16-Core - ASUS ROG CROSSHAIR X670E HERO - AMD Device 14d8

Fedora Linux 36 - 6.0.0-0.rc5.20220914git3245cb65fd91.39.vanilla.1.fc36.x86_64 - GNOME Shell 42.4

1 System - 50 Benchmark Results

Intel Core i7-3770 - QEMU Standard PC - Intel 82G33

Ubuntu 22.04 - 5.15.0-47-generic - NVIDIA

1 System - 366 Benchmark Results

2 x AMD EPYC 7713 64-Core - AMD DAYTONA_X - AMD Starship

Ubuntu 22.10 - 6.0.0-060000rc3daily20220904-generic - GNOME Shell

1 System - 1 Benchmark Result

Intel Core i5-10400 - 8GB - 4 x 275GB Virtual Disk

Ubuntu 20.04 - 5.10.102.1-microsoft-standard-WSL2 - X Server + Wayland

14 Systems - 229 Benchmark Results

AMD Ryzen 5 5600X 6-Core - ASUS ROG CROSSHAIR VIII HERO - AMD Starship

Ubuntu 20.04 - 5.9.0-050900-generic - GNOME Shell 3.36.4

1 System - 131 Benchmark Results

Intel Core i9-12900K - 16GB - 4 x 275GB Virtual Disk

Ubuntu 20.04 - 5.10.16.3-microsoft-standard-WSL2 - Wayland

1 System - 123 Benchmark Results

Intel Core i9-12900K - 16GB - 4 x 275GB Virtual Disk

Ubuntu 20.04 - 5.10.16.3-microsoft-standard-WSL2 - Wayland

1 System - 126 Benchmark Results

AMD Ryzen 5 5600X 6-Core - MSI MAG B550M MORTAR - AMD Starship

Linuxmint 20.3 - 5.15.0-41-generic - Cinnamon 5.2.7

1 System - 122 Benchmark Results

Intel Xeon Gold 5318N - Supermicro X12SPM-LN4F v2.00 - Intel Device 0998

Rocky Linux 8.6 - 4.18.0-372.16.1.el8_6.0.1.x86_64 - GNOME Shell 3.32.2

1 System - 122 Benchmark Results

2 x AMD EPYC 7F72 24-Core - GIGABYTE MZ62-HD4-00 v01000100 - AMD Starship

Rocky Linux 8.6 - 4.18.0-372.13.1.el8_6.x86_64 - GNOME Shell 3.32.2

1 System - 134 Benchmark Results

2 x AMD EPYC 7F72 24-Core - GIGABYTE MZ62-HD4-00 v01000100 - AMD Starship

Rocky Linux 8.6 - 4.18.0-372.16.1.el8_6.x86_64 - GNOME Shell 3.32.2

1 System - 89 Benchmark Results

2 x Intel Xeon Silver 4210 - Dell 0804P1 - Intel Sky Lake-E DMI3 Registers

Ubuntu 22.04 - 5.15.0-41-generic - X Server

1 System - 352 Benchmark Results

Intel Core i5-10400F - LENOVO 3717 - Intel Comet Lake-S 6c

ManjaroLinux 21.3.2 - 5.18.7-1-MANJARO - KDE Plasma 5.24.5

2 Systems - 121 Benchmark Results

ARMv8 Neoverse-V1 - Amazon EC2 c7g.8xlarge - Amazon Device 0200

Ubuntu 22.04 - 5.15.0-1004-aws - GCC 12.0.0 20220117

3 Systems - 19 Benchmark Results

AMD Ryzen 7 5800X3D 8-Core - ASRock X570 Pro4 - AMD Starship

Ubuntu 22.04 - 5.18.0-051800rc2daily20220411-generic - GNOME Shell 42.0

Most Popular Test Results

2 Systems - 535 Benchmark Results

Intel Core i7-1065G7 - Dell 06CDVY - Intel Device 34ef

Ubuntu 20.04 - 5.9.0-050900rc7daily20201003-generic - GNOME Shell 3.36.4

11 Systems - 217 Benchmark Results

AMD Ryzen 7 3800XT 8-Core - ASUS ROG CROSSHAIR VIII HERO - AMD Starship

Ubuntu 20.04 - 5.9.0-050900-generic - GNOME Shell 3.36.4

3 Systems - 174 Benchmark Results

Intel Core i9-10900K - Gigabyte Z490 AORUS MASTER - Intel Comet Lake PCH

Fedora 32 - 5.8.11-200.fc32.x86_64 - GNOME Shell 3.36.6

12 Systems - 229 Benchmark Results

Intel Core i9-10900K - Gigabyte Z490 AORUS MASTER - Intel Comet Lake PCH

Ubuntu 20.04 - 5.9.0-050900-generic - GNOME Shell 3.36.4

26 Systems - 438 Benchmark Results

AMD EPYC 7532 32-Core - ASRockRack EPYCD8 - AMD Starship

Ubuntu 20.04 - 5.11.0-051100rc6daily20210201-generic - GNOME Shell 3.36.4

2 Systems - 403 Benchmark Results

Intel Core i9-10900K - Gigabyte Z490 AORUS MASTER - Intel Comet Lake PCH

Ubuntu 20.04 - 5.4.0-48-generic - GNOME Shell 3.36.4

4 Systems - 513 Benchmark Results

Intel Core i7-5775C - CompuLab v1.0 - Intel Broadwell-U DMI

Ubuntu 20.10 - 5.8.0-26-generic - GNOME Shell 3.38.1

3 Systems - 46 Benchmark Results

AMD Ryzen Threadripper 3960X 24-Core - MSI Creator TRX40 - AMD Starship

Ubuntu 20.04 - 5.9.0-rc5-14sep-patch - GNOME Shell 3.36.4

11 Systems - 229 Benchmark Results

AMD Ryzen 5 5600X 6-Core - ASUS ROG CROSSHAIR VIII HERO - AMD Starship

Ubuntu 20.04 - 5.9.0-050900-generic - GNOME Shell 3.36.4

3 Systems - 406 Benchmark Results

AMD Ryzen 9 3900XT 12-Core - MSI MEG X570 GODLIKE - AMD Starship

Ubuntu 20.10 - 5.8.0-20-generic - GNOME Shell 3.38.0

3 Systems - 131 Benchmark Results

Intel Core i5-10600K - ASUS PRIME Z490M-PLUS - Intel Comet Lake PCH

Ubuntu 20.04 - 5.8.14-050814-generic - GNOME Shell 3.36.4

3 Systems - 202 Benchmark Results

Intel Core i7-7700K - MSI Z270-A PRO - Intel Xeon E3-1200 v6

Ubuntu 20.04 - 5.4.0-28-generic - GNOME Shell 3.36.1

5 Systems - 259 Benchmark Results

2 x Intel Xeon Platinum 8280 - GIGABYTE MD61-SC2-00 v01000100 - Intel Sky Lake-E DMI3 Registers

Ubuntu 20.04 - 5.11.0-051100rc6daily20210201-generic - GNOME Shell 3.36.4

3 Systems - 32 Benchmark Results

AMD Ryzen 9 3900X 12-Core - ASUS TUF GAMING X570-PLUS - AMD Starship

Ubuntu 20.04 - 5.9.0-050900rc6daily20200922-generic - GNOME Shell 3.36.4

3 Systems - 40 Benchmark Results

2 x Intel Xeon Gold 5220R - TYAN S7106 - Intel Sky Lake-E DMI3 Registers

Ubuntu 20.04 - 5.9.0-050900rc6-generic - GNOME Shell 3.36.4
