CUDA Caffe NVIDIA Comparison

CUDA 8.0 + cuDNN Caffe deep learning benchmarks with many different GPUs. Tests by Michael Larabel.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1701311-TA-1611066TA90

Run Management

Result Identifier   Date Run
GTX 680             November 05 2016
GTX 760             November 05 2016
GTX 780 Ti          November 05 2016
GTX 950             November 04 2016
GTX 960             November 04 2016
GTX 970             November 04 2016
GTX 980             November 04 2016
GTX 980 Ti          November 04 2016
GTX 1050            November 04 2016
GTX 1050 Ti         November 05 2016
GTX 1060            November 04 2016
GTX 1070            November 04 2016
GTX 1080            November 04 2016
ubu_caffe           January 31 2017


CUDA Caffe NVIDIA Comparison: System Details

GTX 680 through GTX 1080 (identical apart from the graphics card):

Processor:          Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores)
Motherboard:        MSI C236A WORKSTATION (MS-7998) v1.0
Chipset:            Intel Sky Lake
Memory:             16384MB
Disk:               256GB INTEL SSDPEKKW256G7
Audio:              Realtek ALC1150
Network:            Intel Connection
OS:                 Ubuntu 16.04
Kernel:             4.8.4-040804-generic (x86_64)
Desktop:            Unity 7.4.0
Display Server:     X Server 1.18.4
Display Driver:     NVIDIA 375.10
OpenGL:             4.5.0
Vulkan:             1.0.8
Compiler:           GCC 5.4.0 20160609 + LLVM 3.8.0 + CUDA 8.0
File-System:        ext4
Screen Resolution:  3840x2160

Graphics, per result:

GTX 680       NVIDIA GeForce GTX 680 2048MB (1006/3004MHz)
GTX 760       NVIDIA GeForce GTX 760 2048MB (980/3004MHz)
GTX 780 Ti    NVIDIA GeForce GTX 780 Ti 3072MB (875/3500MHz)
GTX 950       eVGA NVIDIA GeForce GTX 950 2048MB (1201/3304MHz)
GTX 960       eVGA NVIDIA GeForce GTX 960 2048MB (1277/3505MHz)
GTX 970       eVGA NVIDIA GeForce GTX 970 4096MB (1163/3505MHz)
GTX 980       NVIDIA GeForce GTX 980 4096MB (135/324MHz)
GTX 980 Ti    NVIDIA GeForce GTX 980 Ti 6144MB (999/3505MHz)
GTX 1050      Zotac NVIDIA GeForce GTX 1050 2048MB (1316/3504MHz)
GTX 1050 Ti   eVGA NVIDIA GeForce GTX 1050 Ti 4096MB (1341/3504MHz)
GTX 1060      NVIDIA GeForce GTX 1060 6GB 6144MB (1506/4006MHz)
GTX 1070      NVIDIA GeForce GTX 1070 8192MB (1505/4006MHz)
GTX 1080      NVIDIA GeForce GTX 1080 8192MB (1615/5005MHz)

ubu_caffe (only the fields listed for this result):

Processor:          2 x Intel 0000 @ 3.00GHz (48 Cores)
Motherboard:        Supermicro X10DRi-LN4+ v1.01
Chipset:            Intel Xeon E7 v4/Xeon
Memory:             8 x 8192 MB DDR4-2133MHz Samsung
Disk:               1000GB My Passport 0820
Graphics or Audio:  NVIDIA Device 10f0 (the column is ambiguous in the flattened table)
Network:            Intel I350 Gigabit Connection
Kernel:             4.4.0-59-generic (x86_64)
Compiler:           GCC 5.4.0 20160609 + CUDA 8.0
Screen Resolution:  1024x768

Compiler Details

--build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-tune=generic -v

Processor Details

GTX 680 through GTX 1080: Scaling Governor: intel_pstate performance
ubu_caffe: Scaling Governor: intel_pstate powersave

CUDA Caffe NVIDIA Comparison: Result Overview (Milli-Seconds, Fewer Is Better)

Result        caffe: CUDA AlexNet   caffe: CUDA Googlenet
GTX 680       54520.37              138342
GTX 760       66711.23              164643
GTX 780 Ti    28177.90              65152.00
GTX 950       30783.13              69528.20
GTX 960       27512.80              60318.23
GTX 970       16987.30              40193.77
GTX 980       15013.43              36217.23
GTX 980 Ti    11652.17              31349.83
GTX 1050      30970.00              70347.70
GTX 1050 Ti   27452.53              61541.57
GTX 1060      16184.70              37604.73
GTX 1070      11438.00              27661.90
GTX 1080      9630.36               24019.97
ubu_caffe     12543.37              39115.97
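The overview reports raw times, so relative standing is easier to read as a ratio against one baseline card. The following is a minimal sketch of that arithmetic using the AlexNet averages above; the function name `speedup_over` and the choice of the GTX 680 as baseline are illustrative assumptions, not part of the Phoronix Test Suite.

```python
# Average AlexNet times in milliseconds (fewer is better), copied from the overview above.
alexnet_ms = {
    "GTX 680": 54520.37,
    "GTX 760": 66711.23,
    "GTX 780 Ti": 28177.90,
    "GTX 950": 30783.13,
    "GTX 960": 27512.80,
    "GTX 970": 16987.30,
    "GTX 980": 15013.43,
    "GTX 980 Ti": 11652.17,
    "GTX 1050": 30970.00,
    "GTX 1050 Ti": 27452.53,
    "GTX 1060": 16184.70,
    "GTX 1070": 11438.00,
    "GTX 1080": 9630.36,
    "ubu_caffe": 12543.37,
}


def speedup_over(baseline, times):
    """Hypothetical helper: baseline time divided by each entry's time.

    Because lower times are better, a value above 1.0 means the entry
    finished faster than the baseline.
    """
    base = times[baseline]
    return {name: base / t for name, t in times.items()}


if __name__ == "__main__":
    ratios = speedup_over("GTX 680", alexnet_ms)
    # Print fastest first.
    for name, factor in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:12s} {factor:5.2f}x")
```

The same helper works unchanged on the Googlenet column if its times are placed in a second dictionary.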

Caffe AlexNet

Caffe AlexNet 2016-06-11, Build: CUDA AlexNet (Milli-Seconds, Fewer Is Better)

Result        Avg        SE +/-   N   Min        Max
GTX 680       54520.37   123.58   3   54305.7    54733.8
GTX 760       66711.23   18.99    3   66681.4    66746.5
GTX 780 Ti    28177.90   16.03    3   28154.8    28208.7
GTX 950       30783.13   31.65    3   30734.8    30842.7
GTX 960       27512.80   24.09    3   27479.5    27559.6
GTX 970       16987.30   6.55     3   16976      16998.7
GTX 980       15013.43   8.54     3   15004.3    15030.5
GTX 980 Ti    11652.17   5.86     3   11645.1    11663.8
GTX 1050      30970.00   22.56    3   30938.7    31013.8
GTX 1050 Ti   27452.53   8.51     3   27435.6    27462.5
GTX 1060      16184.70   5.77     3   16173.4    16192.4
GTX 1070      11438.00   17.68    3   11403      11459.9
GTX 1080      9630.36    8.72     3   9613.01    9640.47
ubu_caffe     12543.37   60.10    3   12423.2    12606.1

1. (CXX) g++ options: -pthread -fPIC -O2 -lcaffe -lglog -lgflags -lprotobuf -lboost_system -lboost_filesystem -lm -lhdf5_hl -lhdf5 -lleveldb -lsnappy -llmdb -lopencv_core -lopencv_highgui -lopencv_imgproc -lboost_thread -lstdc++ -lcblas -latlas
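Each result above is an average of N = 3 runs and is reported with a minimum, a maximum, and a standard error. With exactly three runs, the middle sample follows from the mean, so the standard error can be cross-checked from the summary values alone. The sketch below assumes the reported SE is the sample standard deviation divided by the square root of N; that assumption appears consistent with the figures here, but it is not confirmed by the source, and the function name is hypothetical.

```python
import math
import statistics


def standard_error_from_summary(minimum, average, maximum):
    """Hypothetical helper: rebuild three samples from min/avg/max and return their SE.

    Assumes exactly three runs, so the middle sample is 3 * avg - min - max,
    and assumes SE = sample standard deviation / sqrt(N).
    """
    middle = 3 * average - minimum - maximum
    samples = [minimum, middle, maximum]
    return statistics.stdev(samples) / math.sqrt(len(samples))


if __name__ == "__main__":
    # GTX 680, CUDA AlexNet: Min 54305.7 / Avg 54520.37 / Max 54733.8 (reported SE +/- 123.58)
    print(round(standard_error_from_summary(54305.7, 54520.37, 54733.8), 2))
    # GTX 1080, CUDA AlexNet: Min 9613.01 / Avg 9630.36 / Max 9640.47 (reported SE +/- 8.72)
    print(round(standard_error_from_summary(9613.01, 9630.36, 9640.47), 2))
```

Because the published averages are rounded to two decimals, the recomputed values can differ from the reported ones in the last digit.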

Caffe AlexNet 2016-06-11, GPU Temperature Monitor during CUDA AlexNet (Celsius, Fewer Is Better)

Result        Min   Avg     Max
GTX 680       52    68.04   75
GTX 760       48    72.38   81
GTX 780 Ti    45    61.67   72
GTX 950       46    57.46   67
GTX 960       43    53.55   63
GTX 970       41    45.86   50
GTX 980       45    51.43   58
GTX 980 Ti    46    52.67   60
GTX 1050      37    42      46
GTX 1050 Ti   39    45.31   50
GTX 1060      36    41.29   46
GTX 1070      39    43      48
GTX 1080      38    42.6    47

Caffe AlexNet 2016-06-11, System Power Consumption Monitor during CUDA AlexNet (Watts, Fewer Is Better)

Result        Min     Avg      Max
GTX 680       182.5   190.05   195
GTX 760       172.9   190.92   199.3
GTX 780 Ti    196.1   256.14   266.6
GTX 950       124.9   129.66   133.1
GTX 960       79.6    129.82   136.7
GTX 970       90.8    162.51   175
GTX 980       160.8   186.13   192
GTX 980 Ti    122     210.6    234.5
GTX 1050      93.6    98.84    99.6
GTX 1050 Ti   59.3    94.01    97.4
GTX 1060      133     134.46   136.2
GTX 1070      81.2    149.37   170.9
GTX 1080      186.1   188.6    192.6
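Run time and power draw can be combined into a rough energy figure per AlexNet run: average watts multiplied by seconds gives joules. The sketch below does that with the AlexNet averages from this page. It is only an estimate under stated assumptions: the power figure is whole-system consumption rather than GPU-only, its average covers the entire monitored test rather than a single run, and the helper name `energy_joules` is hypothetical.

```python
# (average AlexNet time in ms, average system power in watts), copied from the results above.
alexnet_runs = {
    "GTX 680": (54520.37, 190.05),
    "GTX 760": (66711.23, 190.92),
    "GTX 780 Ti": (28177.90, 256.14),
    "GTX 950": (30783.13, 129.66),
    "GTX 960": (27512.80, 129.82),
    "GTX 970": (16987.30, 162.51),
    "GTX 980": (15013.43, 186.13),
    "GTX 980 Ti": (11652.17, 210.6),
    "GTX 1050": (30970.00, 98.84),
    "GTX 1050 Ti": (27452.53, 94.01),
    "GTX 1060": (16184.70, 134.46),
    "GTX 1070": (11438.00, 149.37),
    "GTX 1080": (9630.36, 188.6),
}


def energy_joules(duration_ms, avg_watts):
    """Hypothetical helper: estimated energy for one run, in joules (watts * seconds)."""
    return avg_watts * duration_ms / 1000.0


# Estimated system energy per averaged run, keyed by result identifier.
energy_per_run = {
    name: energy_joules(ms, watts) for name, (ms, watts) in alexnet_runs.items()
}

if __name__ == "__main__":
    # Print lowest estimated energy first.
    for name, joules in sorted(energy_per_run.items(), key=lambda kv: kv[1]):
        print(f"{name:12s} {joules:9.1f} J")
```

A lower value means less estimated system energy per run; it rewards cards that are both fast and frugal, which neither the time table nor the power table shows on its own.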

Caffe AlexNet 2016-06-11, Build: CUDA Googlenet (Milli-Seconds, Fewer Is Better)

Result        Avg         SE +/-   N   Min        Max
GTX 680       138341.67   26.87    3   138294     138387
GTX 760       164642.67   99.04    3   164500     164833
GTX 780 Ti    65152.00    56.68    3   65039.7    65221.5
GTX 950       69528.20    39.05    3   69485.9    69606.2
GTX 960       60318.23    23.55    3   60271.3    60345.2
GTX 970       40193.77    28.41    3   40154.6    40249
GTX 980       36217.23    79.51    3   36128.6    36375.9
GTX 980 Ti    31349.83    85.73    3   31186.1    31475.8
GTX 1050      70347.70    20.52    3   70309.5    70379.8
GTX 1050 Ti   61541.57    59.49    3   61424.6    61618.9
GTX 1060      37604.73    55.94    3   37496.2    37682.5
GTX 1070      27661.90    47.22    3   27570.9    27729.3
GTX 1080      24019.97    36.77    3   23967.2    24090.7
ubu_caffe     39115.97    141.04   3   38862.5    39349.9

The bar-graph view lists the GTX 680 and GTX 760 averages as 138342.00 and 164643.00.

1. (CXX) g++ options: -pthread -fPIC -O2 -lcaffe -lglog -lgflags -lprotobuf -lboost_system -lboost_filesystem -lm -lhdf5_hl -lhdf5 -lleveldb -lsnappy -llmdb -lopencv_core -lopencv_highgui -lopencv_imgproc -lboost_thread -lstdc++ -lcblas -latlas

Caffe AlexNet 2016-06-11, GPU Temperature Monitor during CUDA Googlenet (Celsius, Fewer Is Better)

Result        Min   Avg     Max
GTX 680       69    76.94   79
GTX 760       63    78.92   81
GTX 780 Ti    59    75.4    81
GTX 950       64    68.12   70
GTX 960       51    68.64   71
GTX 970       46    55.05   59
GTX 980       57    68.25   75
GTX 980 Ti    59    72.07   79
GTX 1050      43    49.79   52
GTX 1050 Ti   48    55.69   59
GTX 1060      37    51.89   58
GTX 1070      42    57.69   65
GTX 1080      43    56.27   64

Caffe AlexNet 2016-06-11, System Power Consumption Monitor during CUDA Googlenet (Watts, Fewer Is Better)

Result        Min     Avg      Max
GTX 680       51.1    200.58   204.5
GTX 760       49.9    204.65   210.5
GTX 780 Ti    57.5    263.36   284.5
GTX 950       42.3    137.23   140.5
GTX 960       113.7   150.46   153.2
GTX 970       44.2    181.07   191.1
GTX 980       45.9    197.86   211.6
GTX 980 Ti    48.6    225.77   242.9
GTX 1050      36.5    101.99   105.9
GTX 1050 Ti   36.1    95.11    99.3
GTX 1060      100.9   144.81   149.3
GTX 1070      40.1    164.62   185.9
GTX 1080      41      182.39   202.8

GPU Temperature Monitor

Phoronix Test Suite System Monitoring, GPU Temperature Monitor (Celsius)

Result        Min   Avg     Max
GTX 680       40    73.59   79
GTX 760       38    76.14   81
GTX 780 Ti    37    70.44   81
GTX 950       33    63.23   70
GTX 960       31    62.53   71
GTX 970       30    50.28   59
GTX 980       36    61.1    75
GTX 980 Ti    38    64.35   79
GTX 1050      28    46.2    52
GTX 1050 Ti   30    51.15   59
GTX 1060      30    47.83   58
GTX 1070      32    52.08   65
GTX 1080      31    50.95   64

System Power Consumption Monitor

Phoronix Test Suite System Monitoring, System Power Consumption Monitor (Watts)

Result        Min    Avg      Max
GTX 680       51.1   193.35   204.5
GTX 760       49.9   196.96   210.5
GTX 780 Ti    48.5   241.86   284.5
GTX 950       42.3   127.39   140.5
GTX 960       43.4   136.19   153.2
GTX 970       44.2   161.86   191.1
GTX 980       45.9   177.58   211.6
GTX 980 Ti    48.6   194.02   242.9
GTX 1050      36.5   96.75    105.9
GTX 1050 Ti   36.1   90.28    99.3
GTX 1060      38.4   126.82   149.3
GTX 1070      40.1   146.82   185.9
GTX 1080      41     165.2    202.8