CUDA Caffe NVIDIA Comparison

CUDA 8.0 + cuDNN Caffe deep learning benchmarks with many different GPUs. Tests by Michael Larabel.

HTML result view exported from: https://openbenchmarking.org/result/1701311-TA-1611066TA90.

System Details

GTX 680 through GTX 1080 (shared configuration; only the graphics card differs):

  Processor:          Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores)
  Motherboard:        MSI C236A WORKSTATION (MS-7998) v1.0
  Chipset:            Intel Sky Lake
  Memory:             16384MB
  Disk:               256GB INTEL SSDPEKKW256G7
  Audio:              Realtek ALC1150
  Network:            Intel Connection
  OS:                 Ubuntu 16.04
  Kernel:             4.8.4-040804-generic (x86_64)
  Desktop:            Unity 7.4.0
  Display Server:     X Server 1.18.4
  Display Driver:     NVIDIA 375.10
  OpenGL:             4.5.0
  Vulkan:             1.0.8
  Compiler:           GCC 5.4.0 20160609 + LLVM 3.8.0 + CUDA 8.0
  File-System:        ext4
  Screen Resolution:  3840x2160

Graphics, per system:

  GTX 680:      NVIDIA GeForce GTX 680 2048MB (1006/3004MHz)
  GTX 760:      NVIDIA GeForce GTX 760 2048MB (980/3004MHz)
  GTX 780 Ti:   NVIDIA GeForce GTX 780 Ti 3072MB (875/3500MHz)
  GTX 950:      eVGA NVIDIA GeForce GTX 950 2048MB (1201/3304MHz)
  GTX 960:      eVGA NVIDIA GeForce GTX 960 2048MB (1277/3505MHz)
  GTX 970:      eVGA NVIDIA GeForce GTX 970 4096MB (1163/3505MHz)
  GTX 980:      NVIDIA GeForce GTX 980 4096MB (135/324MHz)
  GTX 980 Ti:   NVIDIA GeForce GTX 980 Ti 6144MB (999/3505MHz)
  GTX 1050:     Zotac NVIDIA GeForce GTX 1050 2048MB (1316/3504MHz)
  GTX 1050 Ti:  eVGA NVIDIA GeForce GTX 1050 Ti 4096MB (1341/3504MHz)
  GTX 1060:     NVIDIA GeForce GTX 1060 6GB 6144MB (1506/4006MHz)
  GTX 1070:     NVIDIA GeForce GTX 1070 8192MB (1505/4006MHz)
  GTX 1080:     NVIDIA GeForce GTX 1080 8192MB (1615/5005MHz)

ubu_caffe (values listed for this system in the export):

  Processor:          2 x Intel 0000 @ 3.00GHz (48 Cores)
  Motherboard:        Supermicro X10DRi-LN4+ v1.01
  Chipset:            Intel Xeon E7 v4/Xeon
  Memory:             8 x 8192 MB DDR4-2133MHz Samsung
  Disk:               1000GB My Passport 0820
  Graphics:           NVIDIA Device 10f0
  Network:            Intel I350 Gigabit Connection
  Kernel:             4.4.0-59-generic (x86_64)
  Compiler:           GCC 5.4.0 20160609 + CUDA 8.0
  Screen Resolution:  1024x768

Compiler Details

  --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-tune=generic -v

Processor Details

  GTX 680, GTX 760, GTX 780 Ti, GTX 950, GTX 960, GTX 970, GTX 980, GTX 980 Ti, GTX 1050, GTX 1050 Ti, GTX 1060, GTX 1070, GTX 1080: Scaling Governor: intel_pstate performance
  ubu_caffe: Scaling Governor: intel_pstate powersave

CUDA Caffe NVIDIA Comparison: Result Overview (Milli-Seconds, Fewer Is Better)

  System        caffe: CUDA AlexNet   caffe: CUDA Googlenet
  GTX 680                  54520.37               138342
  GTX 760                  66711.23               164643
  GTX 780 Ti               28177.90                65152.00
  GTX 950                  30783.13                69528.20
  GTX 960                  27512.80                60318.23
  GTX 970                  16987.30                40193.77
  GTX 980                  15013.43                36217.23
  GTX 980 Ti               11652.17                31349.83
  GTX 1050                 30970.00                70347.70
  GTX 1050 Ti              27452.53                61541.57
  GTX 1060                 16184.70                37604.73
  GTX 1070                 11438.00                27661.90
  GTX 1080                  9630.36                24019.97
  ubu_caffe                12543.37                39115.97
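Because the results are times where fewer is better, a relative speedup is the baseline time divided by each system's time. The following is a minimal Python sketch of that arithmetic using the numbers from the overview; the choice of GTX 680 as baseline and the names `results_ms` and `speedup_vs` are illustrative assumptions, not part of the original result file.

```python
# Times in milliseconds (AlexNet, Googlenet), copied from the result overview.
results_ms = {
    "GTX 680": (54520.37, 138342.00),
    "GTX 760": (66711.23, 164643.00),
    "GTX 780 Ti": (28177.90, 65152.00),
    "GTX 950": (30783.13, 69528.20),
    "GTX 960": (27512.80, 60318.23),
    "GTX 970": (16987.30, 40193.77),
    "GTX 980": (15013.43, 36217.23),
    "GTX 980 Ti": (11652.17, 31349.83),
    "GTX 1050": (30970.00, 70347.70),
    "GTX 1050 Ti": (27452.53, 61541.57),
    "GTX 1060": (16184.70, 37604.73),
    "GTX 1070": (11438.00, 27661.90),
    "GTX 1080": (9630.36, 24019.97),
    "ubu_caffe": (12543.37, 39115.97),
}


def speedup_vs(baseline, results):
    """Return {system: (alexnet_speedup, googlenet_speedup)} relative to baseline.

    Lower time is better, so speedup = baseline_time / system_time.
    """
    base = results[baseline]
    return {
        name: tuple(b / t for b, t in zip(base, times))
        for name, times in results.items()
    }


# Hypothetical choice of baseline: the oldest card in the comparison.
speedups = speedup_vs("GTX 680", results_ms)

# Print systems from fastest to slowest on AlexNet.
for name, (alex, goog) in sorted(
    speedups.items(), key=lambda kv: kv[1][0], reverse=True
):
    print(f"{name:12s} AlexNet {alex:5.2f}x   Googlenet {goog:5.2f}x")
```

A value above 1 means the system finished faster than the baseline; a value below 1 means it was slower.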

Caffe AlexNet

Build: CUDA AlexNet

OpenBenchmarking.org: Caffe AlexNet 2016-06-11, Build: CUDA AlexNet. Milli-Seconds, Fewer Is Better.

  System        Result (ms)   SE +/-    N
  GTX 680          54520.37   123.58    3
  GTX 760          66711.23    18.99    3
  GTX 780 Ti       28177.90    16.03    3
  GTX 950          30783.13    31.65    3
  GTX 960          27512.80    24.09    3
  GTX 970          16987.30     6.55    3
  GTX 980          15013.43     8.54    3
  GTX 980 Ti       11652.17     5.86    3
  GTX 1050         30970.00    22.56    3
  GTX 1050 Ti      27452.53     8.51    3
  GTX 1060         16184.70     5.77    3
  GTX 1070         11438.00    17.68    3
  GTX 1080          9630.36     8.72    3
  ubu_caffe        12543.37    60.10    3

1. (CXX) g++ options: -pthread -fPIC -O2 -lcaffe -lglog -lgflags -lprotobuf -lboost_system -lboost_filesystem -lm -lhdf5_hl -lhdf5 -lleveldb -lsnappy -llmdb -lopencv_core -lopencv_highgui -lopencv_imgproc -lboost_thread -lstdc++ -lcblas -latlas

Caffe AlexNet

GPU Temperature Monitor

OpenBenchmarking.org: Caffe AlexNet 2016-06-11, GPU Temperature Monitor. Celsius, Fewer Is Better.

  System        Min    Avg     Max
  GTX 680       52     68.04   75
  GTX 760       48     72.38   81
  GTX 780 Ti    45     61.67   72
  GTX 950       46     57.46   67
  GTX 960       43     53.55   63
  GTX 970       41     45.86   50
  GTX 980       45     51.43   58
  GTX 980 Ti    46     52.67   60
  GTX 1050      37     42      46
  GTX 1050 Ti   39     45.31   50
  GTX 1060      36     41.29   46
  GTX 1070      39     43      48
  GTX 1080      38     42.6    47

Caffe AlexNet

System Power Consumption Monitor

OpenBenchmarking.org: Caffe AlexNet 2016-06-11, System Power Consumption Monitor. Watts, Fewer Is Better.

  System        Min     Avg      Max
  GTX 680       182.5   190.05   195
  GTX 760       172.9   190.92   199.3
  GTX 780 Ti    196.1   256.14   266.6
  GTX 950       124.9   129.66   133.1
  GTX 960       79.6    129.82   136.7
  GTX 970       90.8    162.51   175
  GTX 980       160.8   186.13   192
  GTX 980 Ti    122     210.6    234.5
  GTX 1050      93.6    98.84    99.6
  GTX 1050 Ti   59.3    94.01    97.4
  GTX 1060      133     134.46   136.2
  GTX 1070      81.2    149.37   170.9
  GTX 1080      186.1   188.6    192.6
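Time and power can be combined into a rough energy figure per AlexNet run: average system power (W) multiplied by run time (s) gives joules. The Python sketch below does this for the thirteen GeForce systems. It rests on assumptions not stated in the result file: that the reported average power is representative of the interval the reported time covers, and that the monitor's samples are not skewed by idle periods between runs. The names `alexnet` and `energy_kj` are illustrative only.

```python
# (AlexNet time in ms, average system power in W during the AlexNet test),
# copied from the AlexNet result and power monitor tables.
alexnet = {
    "GTX 680": (54520.37, 190.05),
    "GTX 760": (66711.23, 190.92),
    "GTX 780 Ti": (28177.90, 256.14),
    "GTX 950": (30783.13, 129.66),
    "GTX 960": (27512.80, 129.82),
    "GTX 970": (16987.30, 162.51),
    "GTX 980": (15013.43, 186.13),
    "GTX 980 Ti": (11652.17, 210.60),
    "GTX 1050": (30970.00, 98.84),
    "GTX 1050 Ti": (27452.53, 94.01),
    "GTX 1060": (16184.70, 134.46),
    "GTX 1070": (11438.00, 149.37),
    "GTX 1080": (9630.36, 188.60),
}


def energy_kj(time_ms, avg_watts):
    """Rough whole-system energy in kilojoules: W * s = J, then J / 1000."""
    seconds = time_ms / 1000.0
    return avg_watts * seconds / 1000.0


# Assumption: average power applies uniformly over the reported run time.
energy = {name: energy_kj(t, w) for name, (t, w) in alexnet.items()}

# Print systems from lowest to highest estimated energy per run.
for name, kj in sorted(energy.items(), key=lambda kv: kv[1]):
    print(f"{name:12s} {kj:6.2f} kJ")
```

These figures are whole-system estimates (CPU, board, and GPU together), so they are best read as a relative comparison between cards on the same host rather than as GPU-only energy use.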

Caffe AlexNet

Build: CUDA Googlenet

OpenBenchmarking.org: Caffe AlexNet 2016-06-11, Build: CUDA Googlenet. Milli-Seconds, Fewer Is Better.

  System        Result (ms)   SE +/-    N
  GTX 680         138342.00    26.87    3
  GTX 760         164643.00    99.04    3
  GTX 780 Ti       65152.00    56.68    3
  GTX 950          69528.20    39.05    3
  GTX 960          60318.23    23.55    3
  GTX 970          40193.77    28.41    3
  GTX 980          36217.23    79.51    3
  GTX 980 Ti       31349.83    85.73    3
  GTX 1050         70347.70    20.52    3
  GTX 1050 Ti      61541.57    59.49    3
  GTX 1060         37604.73    55.94    3
  GTX 1070         27661.90    47.22    3
  GTX 1080         24019.97    36.77    3
  ubu_caffe        39115.97   141.04    3

1. (CXX) g++ options: -pthread -fPIC -O2 -lcaffe -lglog -lgflags -lprotobuf -lboost_system -lboost_filesystem -lm -lhdf5_hl -lhdf5 -lleveldb -lsnappy -llmdb -lopencv_core -lopencv_highgui -lopencv_imgproc -lboost_thread -lstdc++ -lcblas -latlas

Caffe AlexNet

GPU Temperature Monitor

OpenBenchmarking.org: Caffe AlexNet 2016-06-11, GPU Temperature Monitor (CUDA Googlenet run). Celsius, Fewer Is Better.

  System        Min    Avg     Max
  GTX 680       69     76.94   79
  GTX 760       63     78.92   81
  GTX 780 Ti    59     75.4    81
  GTX 950       64     68.12   70
  GTX 960       51     68.64   71
  GTX 970       46     55.05   59
  GTX 980       57     68.25   75
  GTX 980 Ti    59     72.07   79
  GTX 1050      43     49.79   52
  GTX 1050 Ti   48     55.69   59
  GTX 1060      37     51.89   58
  GTX 1070      42     57.69   65
  GTX 1080      43     56.27   64

Caffe AlexNet

System Power Consumption Monitor

OpenBenchmarking.org: Caffe AlexNet 2016-06-11, System Power Consumption Monitor (CUDA Googlenet run). Watts, Fewer Is Better.

  System        Min     Avg      Max
  GTX 680       51.1    200.58   204.5
  GTX 760       49.9    204.65   210.5
  GTX 780 Ti    57.5    263.36   284.5
  GTX 950       42.3    137.23   140.5
  GTX 960       113.7   150.46   153.2
  GTX 970       44.2    181.07   191.1
  GTX 980       45.9    197.86   211.6
  GTX 980 Ti    48.6    225.77   242.9
  GTX 1050      36.5    101.99   105.9
  GTX 1050 Ti   36.1    95.11    99.3
  GTX 1060      100.9   144.81   149.3
  GTX 1070      40.1    164.62   185.9
  GTX 1080      41      182.39   202.8

GPU Temperature Monitor

Phoronix Test Suite System Monitoring

OpenBenchmarking.org: GPU Temperature Monitor, Phoronix Test Suite System Monitoring. Celsius.

  System        Min    Avg     Max
  GTX 680       40     73.59   79
  GTX 760       38     76.14   81
  GTX 780 Ti    37     70.44   81
  GTX 950       33     63.23   70
  GTX 960       31     62.53   71
  GTX 970       30     50.28   59
  GTX 980       36     61.1    75
  GTX 980 Ti    38     64.35   79
  GTX 1050      28     46.2    52
  GTX 1050 Ti   30     51.15   59
  GTX 1060      30     47.83   58
  GTX 1070      32     52.08   65
  GTX 1080      31     50.95   64

System Power Consumption Monitor

Phoronix Test Suite System Monitoring

OpenBenchmarking.org: System Power Consumption Monitor, Phoronix Test Suite System Monitoring. Watts.

  System        Min     Avg      Max
  GTX 680       51.1    193.35   204.5
  GTX 760       49.9    196.96   210.5
  GTX 780 Ti    48.5    241.86   284.5
  GTX 950       42.3    127.39   140.5
  GTX 960       43.4    136.19   153.2
  GTX 970       44.2    161.86   191.1
  GTX 980       45.9    177.58   211.6
  GTX 980 Ti    48.6    194.02   242.9
  GTX 1050      36.5    96.75    105.9
  GTX 1050 Ti   36.1    90.28    99.3
  GTX 1060      38.4    126.82   149.3
  GTX 1070      40.1    146.82   185.9
  GTX 1080      41      165.2    202.8


Phoronix Test Suite v10.8.4