NVIDIA GeForce GTX 1080 CUDA Linux Compute GPGPU Testing

NVIDIA GeForce GTX 1080 CUDA benchmarking including deep learning on Pascal. Benchmarks by Michael Larabel.

GeForce GTX 960:

  Processor: Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores), Motherboard: MSI C236A WORKSTATION (MS-7998) v1.0, Chipset: Intel Sky Lake, Memory: 16384MB, Disk: Samsung SSD 950 PRO 256GB, Graphics: eVGA NVIDIA GeForce GTX 960 2043MB (1277/3505MHz), Audio: Realtek ALC1150, Network: Intel Connection

  OS: Ubuntu 16.04, Kernel: 4.4.0-22-generic (x86_64), Desktop: Unity 7.4.0, Display Driver: NVIDIA 367.18, OpenGL: 4.5.0, Vulkan: 1.0.8, Compiler: GCC 5.3.1 20160413 + CUDA 8.0, File-System: ext4, Screen Resolution: 3840x2160

GeForce GTX 970:

  Processor: Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores), Motherboard: MSI C236A WORKSTATION (MS-7998) v1.0, Chipset: Intel Sky Lake, Memory: 16384MB, Disk: Samsung SSD 950 PRO 256GB, Graphics: eVGA NVIDIA GeForce GTX 970 4091MB (1163/3505MHz), Audio: Realtek ALC1150, Network: Intel Connection

  OS: Ubuntu 16.04, Kernel: 4.4.0-22-generic (x86_64), Desktop: Unity 7.4.0, Display Driver: NVIDIA 367.18, OpenGL: 4.5.0, Vulkan: 1.0.8, Compiler: GCC 5.3.1 20160413 + CUDA 8.0, File-System: ext4, Screen Resolution: 3840x2160

GeForce GTX 980:

  Processor: Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores), Motherboard: MSI C236A WORKSTATION (MS-7998) v1.0, Chipset: Intel Sky Lake, Memory: 16384MB, Disk: Samsung SSD 950 PRO 256GB, Graphics: NVIDIA GeForce GTX 980 4091MB (1126/3505MHz), Audio: Realtek ALC1150, Network: Intel Connection

  OS: Ubuntu 16.04, Kernel: 4.4.0-22-generic (x86_64), Desktop: Unity 7.4.0, Display Driver: NVIDIA 367.18, OpenGL: 4.5.0, Vulkan: 1.0.8, Compiler: GCC 5.3.1 20160413 + CUDA 8.0, File-System: ext4, Screen Resolution: 3840x2160

GeForce GTX 980 Ti:

  Processor: Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores), Motherboard: MSI C236A WORKSTATION (MS-7998) v1.0, Chipset: Intel Sky Lake, Memory: 16384MB, Disk: Samsung SSD 950 PRO 256GB, Graphics: NVIDIA GeForce GTX 980 Ti 6139MB (999/3505MHz), Audio: Realtek ALC1150, Network: Intel Connection

  OS: Ubuntu 16.04, Kernel: 4.4.0-22-generic (x86_64), Desktop: Unity 7.4.0, Display Driver: NVIDIA 367.18, OpenGL: 4.5.0, Vulkan: 1.0.8, Compiler: GCC 5.3.1 20160413 + CUDA 8.0, File-System: ext4, Screen Resolution: 3840x2160

GeForce GTX TITAN X:

  Processor: Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores), Motherboard: MSI C236A WORKSTATION (MS-7998) v1.0, Chipset: Intel Sky Lake, Memory: 16384MB, Disk: Samsung SSD 950 PRO 256GB, Graphics: NVIDIA GeForce GTX TITAN X 12283MB (1001/3505MHz), Audio: Realtek ALC1150, Network: Intel Connection

  OS: Ubuntu 16.04, Kernel: 4.4.0-22-generic (x86_64), Desktop: Unity 7.4.0, Display Driver: NVIDIA 367.18, OpenGL: 4.5.0, Vulkan: 1.0.8, Compiler: GCC 5.3.1 20160413 + CUDA 8.0, File-System: ext4, Screen Resolution: 3840x2160

GeForce GTX 1080:

  Processor: Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores), Motherboard: MSI C236A WORKSTATION (MS-7998) v1.0, Chipset: Intel Sky Lake, Memory: 16384MB, Disk: Samsung SSD 950 PRO 256GB, Graphics: GeForce GTX 1080 8187MB (909/5005MHz), Audio: Realtek ALC1150, Network: Intel Connection

  OS: Ubuntu 16.04, Kernel: 4.4.0-22-generic (x86_64), Desktop: Unity 7.4.0, Display Driver: NVIDIA 367.18, OpenGL: 4.5.0, Vulkan: 1.0.8, Compiler: GCC 5.3.1 20160413 + CUDA 8.0, File-System: ext4, Screen Resolution: 3840x2160

GeForce 1070 on x4 slot:

  Processor: Intel Core i5-2500K @ 3.70GHz (4 Cores), Motherboard: ASUS P8H67-M EVO, Chipset: Intel 2nd Generation Core Family DRAM, Memory: 32768MB, Disk: 400GB Seagate ST3400832AS + 500GB Western Digital WD5000AAKX-2 + 1000GB Samsung SSD 850, Audio: Realtek ALC892, Network: Realtek RTL8111/8168/8411

  OS: Ubuntu 16.04, Kernel: 4.4.0-38-generic (x86_64), Desktop: Unity 7.4.0, Compiler: GCC 4.9.3 + Clang 3.8.0-2ubuntu4 + CUDA 7.5, File-System: ext4, Screen Resolution: 1680x1028

GeForce 1070 on x4 slot mk2:

  Processor: Intel Core i5-2500K @ 3.70GHz (4 Cores), Motherboard: ASUS P8H67-M EVO, Chipset: Intel 2nd Generation Core Family DRAM, Memory: 32768MB, Disk: 400GB Seagate ST3400832AS + 500GB Western Digital WD5000AAKX-2 + 1000GB Samsung SSD 850, Audio: Realtek ALC892, Network: Realtek RTL8111/8168/8411

  OS: Ubuntu 16.04, Kernel: 4.4.0-38-generic (x86_64), Desktop: Unity 7.4.0, Compiler: GCC 4.9.3 + Clang 3.8.0-2ubuntu4 + CUDA 7.5, File-System: ext4

Caffe AlexNet 2016-06-11
Build: CUDA
Milli-Seconds < Lower Is Better
GeForce GTX 960 ..... 28134.07 |===============================================
GeForce GTX 970 ..... 23567.70 |=======================================
GeForce GTX 980 ..... 15504.53 |==========================
GeForce GTX 980 Ti .. 12011.27 |====================
GeForce GTX TITAN X . 11397.13 |===================
GeForce GTX 1080 .... 8959.77  |===============

Caffe AlexNet 2016-06-11
Build: CPU Only
Milli-Seconds < Lower Is Better
Xeon E3-1280 v5 - CPU Only . 1787207 |=========================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: FFT SP
GFLOPS > Higher Is Better
GeForce GTX 960 ............. 189.14 |=================
GeForce GTX 970 ............. 265.17 |========================
GeForce GTX 980 ............. 292.78 |==========================
GeForce GTX 980 Ti .......... 302.76 |===========================
GeForce GTX TITAN X ......... 322.57 |=============================
GeForce GTX 1080 ............ 461.28 |=========================================
GeForce 1070 on x4 slot ..... 333.69 |==============================
GeForce 1070 on x4 slot mk2 . 338.40 |==============================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: MD5 Hash
GHash/s > Higher Is Better
GeForce GTX 960 ............. 3.88  |==============
GeForce GTX 970 ............. 5.47  |===================
GeForce GTX 980 ............. 6.53  |=======================
GeForce GTX 980 Ti .......... 7.81  |===========================
GeForce GTX TITAN X ......... 8.43  |==============================
GeForce GTX 1080 ............ 11.98 |==========================================
GeForce 1070 on x4 slot ..... 8.44  |==============================
GeForce 1070 on x4 slot mk2 . 8.43  |==============================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Max SP Flops
GFLOPS > Higher Is Better
GeForce GTX 960 ............. 2944.94 |=============
GeForce GTX 970 ............. 4316.43 |==================
GeForce GTX 980 ............. 4999.85 |=====================
GeForce GTX 980 Ti .......... 6144.29 |==========================
GeForce GTX TITAN X ......... 6886.69 |=============================
GeForce GTX 1080 ............ 9397.41 |========================================
GeForce 1070 on x4 slot mk2 . 7072.28 |==============================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Texture Read Bandwidth
GB/s > Higher Is Better
GeForce GTX 960 ............. 381.05 |==============================
GeForce GTX 970 ............. 351.32 |===========================
GeForce GTX 980 ............. 332.16 |==========================
GeForce GTX 980 Ti .......... 348.36 |===========================
GeForce GTX TITAN X ......... 352.05 |===========================
GeForce GTX 1080 ............ 528.41 |=========================================
GeForce 1070 on x4 slot mk2 . 494.04 |======================================

CUDA Mini-Nbody 2015-11-10
Test: Original
Seconds < Lower Is Better
GeForce GTX 960 ............. 82.29 |==========================================
GeForce GTX 970 ............. 52.04 |===========================
GeForce GTX 980 ............. 46.51 |========================
GeForce GTX 980 Ti .......... 35.35 |==================
GeForce GTX TITAN X ......... 33.09 |=================
GeForce GTX 1080 ............ 30.51 |================
GeForce 1070 on x4 slot mk2 . 40.33 |=====================

CUDA Mini-Nbody 2015-11-10
Test: Cache Blocking
Seconds < Lower Is Better
GeForce GTX 960 ............. 36.30 |==========================================
GeForce GTX 970 ............. 26.75 |===============================
GeForce GTX 980 ............. 24.91 |=============================
GeForce GTX 980 Ti .......... 19.69 |=======================
GeForce GTX TITAN X ......... 18.67 |======================
GeForce GTX 1080 ............ 14.02 |================
GeForce 1070 on x4 slot mk2 . 18.32 |=====================

CUDA Mini-Nbody 2015-11-10
Test: Loop Unrolling
Seconds < Lower Is Better
GeForce GTX 960 ............. 35.71 |==========================================
GeForce GTX 970 ............. 26.38 |===============================
GeForce GTX 980 ............. 24.63 |=============================
GeForce GTX 980 Ti .......... 19.64 |=======================
GeForce GTX TITAN X ......... 18.71 |======================
GeForce GTX 1080 ............ 14.52 |=================
GeForce 1070 on x4 slot mk2 . 19.18 |=======================

CUDA Mini-Nbody 2015-11-10
Test: SOA Data Layout
Seconds < Lower Is Better
GeForce GTX 960 ............. 81.27 |==========================================
GeForce GTX 970 ............. 57.09 |==============================
GeForce GTX 980 ............. 51.02 |==========================
GeForce GTX 980 Ti .......... 42.04 |======================
GeForce GTX TITAN X ......... 38.69 |====================
GeForce GTX 1080 ............ 28.58 |===============
GeForce 1070 on x4 slot mk2 . 38.41 |====================

CUDA Mini-Nbody 2015-11-10
Test: Flush Denormals To Zero
Seconds < Lower Is Better
GeForce GTX 960 ............. 81.19 |==========================================
GeForce GTX 970 ............. 57.20 |==============================
GeForce GTX 980 ............. 50.44 |==========================
GeForce GTX 980 Ti .......... 42.10 |======================
GeForce GTX TITAN X ......... 38.52 |====================
GeForce GTX 1080 ............ 28.58 |===============
GeForce 1070 on x4 slot mk2 . 38.51 |====================
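
The results mix "Lower Is Better" and "Higher Is Better" scales, so comparing cards across tests takes a little arithmetic. The following is a minimal Python sketch of one way to normalize two of the result sets against a baseline card. The function name `relative_to` and the dictionary names are hypothetical helpers introduced here for illustration; the numbers are copied from the Caffe AlexNet (CUDA) and SHOC Max SP Flops results in this listing.

```python
# Sketch only: normalizes benchmark results against a baseline card.
# All names here are illustrative assumptions; the values come from the listing.

caffe_alexnet_cuda_ms = {  # Milli-Seconds, lower is better
    "GeForce GTX 960": 28134.07,
    "GeForce GTX 970": 23567.70,
    "GeForce GTX 980": 15504.53,
    "GeForce GTX 980 Ti": 12011.27,
    "GeForce GTX TITAN X": 11397.13,
    "GeForce GTX 1080": 8959.77,
}

shoc_max_sp_gflops = {  # GFLOPS, higher is better
    "GeForce GTX 960": 2944.94,
    "GeForce GTX 970": 4316.43,
    "GeForce GTX 980": 4999.85,
    "GeForce GTX 980 Ti": 6144.29,
    "GeForce GTX TITAN X": 6886.69,
    "GeForce GTX 1080": 9397.41,
    "GeForce 1070 on x4 slot mk2": 7072.28,
}


def relative_to(results, baseline, lower_is_better):
    """Return each card's performance relative to `baseline`.

    A value above 1.0 means the card is faster than the baseline,
    whichever direction the underlying scale runs.
    """
    base = results[baseline]
    if lower_is_better:
        # Times: a smaller value is faster, so divide baseline by card.
        return {name: base / value for name, value in results.items()}
    # Throughput: a larger value is faster, so divide card by baseline.
    return {name: value / base for name, value in results.items()}


if __name__ == "__main__":
    baseline = "GeForce GTX TITAN X"
    for title, data, lower in (
        ("Caffe AlexNet (CUDA)", caffe_alexnet_cuda_ms, True),
        ("SHOC Max SP Flops", shoc_max_sp_gflops, False),
    ):
        print(title)
        for name, ratio in relative_to(data, baseline, lower).items():
            print(f"  {name}: {ratio:.2f}x vs {baseline}")
```

Run against the TITAN X as baseline, this puts the GTX 1080 at roughly 1.27x in Caffe AlexNet and roughly 1.36x in Max SP Flops. Note that the two GeForce 1070 entries were recorded on a different host system with CUDA 7.5, so ratios involving them are not like-for-like.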