NVIDIA GeForce GTX 1080 CUDA Linux Compute GPGPU Testing

NVIDIA GeForce GTX 1080 CUDA benchmarking including deep learning on Pascal. Benchmarks by Michael Larabel.

HTML result view exported from: https://openbenchmarking.org/result/1606116-HA-CUDATESTI01.

Processor: Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores)
Motherboard: MSI C236A WORKSTATION (MS-7998) v1.0
Chipset: Intel Sky Lake
Memory: 16384MB
Disk: Samsung SSD 950 PRO 256GB
Audio: Realtek ALC1150
Network: Intel Connection
OS: Ubuntu 16.04
Kernel: 4.4.0-22-generic (x86_64)
Desktop: Unity 7.4.0
Display Driver: NVIDIA 367.18
OpenGL: 4.5.0
Vulkan: 1.0.8
Compiler: GCC 5.3.1 20160413 + CUDA 8.0
File-System: ext4
Screen Resolution: 3840x2160

Graphics:
- GeForce GTX 960: eVGA NVIDIA GeForce GTX 960 2043MB (1277/3505MHz)
- GeForce GTX 970: eVGA NVIDIA GeForce GTX 970 4091MB (1163/3505MHz)
- GeForce GTX 980: NVIDIA GeForce GTX 980 4091MB (1126/3505MHz)
- GeForce GTX 980 Ti: NVIDIA GeForce GTX 980 Ti 6139MB (999/3505MHz)
- GeForce GTX TITAN X: NVIDIA GeForce GTX TITAN X 12283MB (1001/3505MHz)
- GeForce GTX 1080: GeForce GTX 1080 8187MB (909/5005MHz)

Compiler Details
- --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-tune=generic -v

Processor Details
- Scaling Governor: intel_pstate performance

OpenCL Details / System Details
- GeForce GTX 960: GPU Compute Cores: 1024
- GeForce GTX 970: GPU Compute Cores: 1664
- GeForce GTX 980: GPU Compute Cores: 2048
- GeForce GTX 980 Ti: GPU Compute Cores: 2816
- GeForce GTX TITAN X: GPU Compute Cores: 3072
- GeForce GTX 1080: GPU Compute Cores: 2560

Results Overview

Test                                            GTX 960      GTX 970      GTX 980   GTX 980 Ti  GTX TITAN X     GTX 1080
caffe: CUDA                                    28134.07     23567.70     15504.53     12011.27     11397.13      8959.77
shoc: CUDA - FFT SP                              189.14       265.17       292.78       302.76       322.57       461.28
shoc: CUDA - MD5 Hash                              3.88         5.47         6.53         7.81         8.43        11.98
shoc: CUDA - Max SP Flops                       2944.94      4316.43      4999.85      6144.29      6886.69      9397.41
shoc: CUDA - Texture Read Bandwidth              381.05       351.32       332.16       348.36       352.05       528.41
cuda-mini-nbody: Original                         82.29        52.04        46.51        35.35        33.09        30.51
cuda-mini-nbody: Cache Blocking                   36.30        26.75        24.91        19.69        18.67        14.02
cuda-mini-nbody: Loop Unrolling                   35.71        26.38        24.63        19.64        18.71        14.52
cuda-mini-nbody: SOA Data Layout                  81.27        57.09        51.02        42.04        38.69        28.58
cuda-mini-nbody: Flush Denormals To Zero          81.19        57.20        50.44        42.10        38.52        28.58
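Because some tests report times (fewer is better) and others report throughput (more is better), the raw numbers are not directly comparable across rows. The following is a minimal sketch, not part of the original result file, of one way to normalize them into a "times faster" factor for the GeForce GTX 1080 relative to the GeForce GTX TITAN X. The values are copied from the overview; the names `RESULTS`, `speedup`, and `speedups` are illustrative assumptions.

```python
# Illustrative sketch: values copied from the results overview.
# Each entry is (GTX TITAN X result, GTX 1080 result, higher_is_better).
RESULTS = {
    "caffe: CUDA (ms)": (11397.13, 8959.77, False),
    "shoc: FFT SP (GFLOPS)": (322.57, 461.28, True),
    "shoc: MD5 Hash (GHash/s)": (8.43, 11.98, True),
    "shoc: Max SP Flops (GFLOPS)": (6886.69, 9397.41, True),
    "shoc: Texture Read Bandwidth (GB/s)": (352.05, 528.41, True),
    "cuda-mini-nbody: Original (s)": (33.09, 30.51, False),
    "cuda-mini-nbody: Cache Blocking (s)": (18.67, 14.02, False),
    "cuda-mini-nbody: Loop Unrolling (s)": (18.71, 14.52, False),
    "cuda-mini-nbody: SOA Data Layout (s)": (38.69, 28.58, False),
    "cuda-mini-nbody: Flush Denormals To Zero (s)": (38.52, 28.58, False),
}


def speedup(baseline, candidate, higher_is_better):
    """Return how many times faster the candidate is than the baseline."""
    if higher_is_better:
        return candidate / baseline
    # For time-based results a smaller value is faster, so invert the ratio.
    return baseline / candidate


def speedups(results):
    """Map each test name to the candidate's speedup over the baseline."""
    return {name: speedup(*values) for name, values in results.items()}


if __name__ == "__main__":
    for name, factor in speedups(RESULTS).items():
        print(f"{name}: {factor:.2f}x")
```

A factor above 1.0 means the GTX 1080 was faster in that test, regardless of whether the underlying unit is a time or a rate.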

Caffe AlexNet

Build: CUDA

Milli-Seconds, Fewer Is Better (Caffe AlexNet 2016-06-11)
GeForce GTX 960: 28134.07 (SE +/- 2.72, N = 3)
GeForce GTX 970: 23567.70 (SE +/- 1758.76, N = 6)
GeForce GTX 980: 15504.53 (SE +/- 17.87, N = 3)
GeForce GTX 980 Ti: 12011.27 (SE +/- 7.42, N = 3)
GeForce GTX TITAN X: 11397.13 (SE +/- 26.29, N = 3)
GeForce GTX 1080: 8959.77 (SE +/- 3.43, N = 3)
1. (CXX) g++ options: -pthread -fPIC -O2 -lcaffe -lglog -lgflags -lprotobuf -lboost_system -lboost_filesystem -lm -lhdf5_hl -lhdf5 -lleveldb -lsnappy -llmdb -lopencv_core -lopencv_highgui -lopencv_imgproc -lboost_thread -lstdc++ -lcblas -latlas
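The standard errors reported for the Caffe AlexNet CUDA runs differ widely in scale. As a rough sketch (not part of the original result file), the relative standard error, that is the SE as a percentage of the mean, can be used to flag results with high run-to-run variation. The 1% threshold and the names `CAFFE_CUDA`, `relative_standard_error`, and `noisy_results` are assumptions chosen for illustration.

```python
# Illustrative sketch: (mean in ms, standard error, run count) per card,
# copied from the Caffe AlexNet "Build: CUDA" results.
CAFFE_CUDA = {
    "GeForce GTX 960": (28134.07, 2.72, 3),
    "GeForce GTX 970": (23567.70, 1758.76, 6),
    "GeForce GTX 980": (15504.53, 17.87, 3),
    "GeForce GTX 980 Ti": (12011.27, 7.42, 3),
    "GeForce GTX TITAN X": (11397.13, 26.29, 3),
    "GeForce GTX 1080": (8959.77, 3.43, 3),
}


def relative_standard_error(mean, se):
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


def noisy_results(results, threshold_pct=1.0):
    """List cards whose relative standard error exceeds threshold_pct.

    The 1% default is an arbitrary assumption, not a Phoronix Test Suite rule.
    """
    return [
        name
        for name, (mean, se, _runs) in results.items()
        if relative_standard_error(mean, se) > threshold_pct
    ]


if __name__ == "__main__":
    for name, (mean, se, runs) in CAFFE_CUDA.items():
        rse = relative_standard_error(mean, se)
        print(f"{name}: {rse:.2f}% (N = {runs})")
    print("Above threshold:", noisy_results(CAFFE_CUDA))
```

With this threshold only the GeForce GTX 970 result is flagged, which is also the only entry with six runs instead of three.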

Caffe AlexNet

Build: CPU Only

Milli-Seconds, Fewer Is Better (Caffe AlexNet 2016-06-11)
Xeon E3-1280 v5 - CPU Only: 1787207 (SE +/- 4001.26, N = 3)
1. (CXX) g++ options: -pthread -fPIC -O2 -lcaffe -lglog -lgflags -lprotobuf -lboost_system -lboost_filesystem -lm -lhdf5_hl -lhdf5 -lleveldb -lsnappy -llmdb -lopencv_core -lopencv_highgui -lopencv_imgproc -lboost_thread -lstdc++ -lcblas -latlas

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: FFT SP

GFLOPS, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
GeForce GTX 960: 189.14 (SE +/- 1.12, N = 3)
GeForce GTX 970: 265.17 (SE +/- 0.05, N = 3)
GeForce GTX 980: 292.78 (SE +/- 0.60, N = 3)
GeForce GTX 980 Ti: 302.76 (SE +/- 4.36, N = 5)
GeForce GTX TITAN X: 322.57 (SE +/- 0.29, N = 3)
GeForce GTX 1080: 461.28 (SE +/- 2.81, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: MD5 Hash

GHash/s, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
GeForce GTX 960: 3.88 (SE +/- 0.00, N = 3)
GeForce GTX 970: 5.47 (SE +/- 0.00, N = 3)
GeForce GTX 980: 6.53 (SE +/- 0.01, N = 3)
GeForce GTX 980 Ti: 7.81 (SE +/- 0.00, N = 3)
GeForce GTX TITAN X: 8.43 (SE +/- 0.00, N = 3)
GeForce GTX 1080: 11.98 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Max SP Flops

GFLOPS, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
GeForce GTX 960: 2944.94 (SE +/- 7.67, N = 3)
GeForce GTX 970: 4316.43 (SE +/- 1.66, N = 3)
GeForce GTX 980: 4999.85 (SE +/- 11.01, N = 3)
GeForce GTX 980 Ti: 6144.29 (SE +/- 21.31, N = 3)
GeForce GTX TITAN X: 6886.69 (SE +/- 41.66, N = 3)
GeForce GTX 1080: 9397.41 (SE +/- 88.40, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Texture Read Bandwidth

GB/s, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
GeForce GTX 960: 381.05 (SE +/- 0.15, N = 3)
GeForce GTX 970: 351.32 (SE +/- 0.03, N = 3)
GeForce GTX 980: 332.16 (SE +/- 0.47, N = 3)
GeForce GTX 980 Ti: 348.36 (SE +/- 0.24, N = 3)
GeForce GTX TITAN X: 352.05 (SE +/- 1.11, N = 3)
GeForce GTX 1080: 528.41 (SE +/- 1.22, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

CUDA Mini-Nbody

Test: Original

Seconds, Fewer Is Better (CUDA Mini-Nbody 2015-11-10)
GeForce GTX 960: 82.29 (SE +/- 0.27, N = 3)
GeForce GTX 970: 52.04 (SE +/- 0.13, N = 3)
GeForce GTX 980: 46.51 (SE +/- 0.15, N = 3)
GeForce GTX 980 Ti: 35.35 (SE +/- 0.21, N = 3)
GeForce GTX TITAN X: 33.09 (SE +/- 0.18, N = 3)
GeForce GTX 1080: 30.51 (SE +/- 0.08, N = 3)

CUDA Mini-Nbody

Test: Cache Blocking

Seconds, Fewer Is Better (CUDA Mini-Nbody 2015-11-10)
GeForce GTX 960: 36.30 (SE +/- 0.01, N = 3)
GeForce GTX 970: 26.75 (SE +/- 0.00, N = 3)
GeForce GTX 980: 24.91 (SE +/- 0.16, N = 3)
GeForce GTX 980 Ti: 19.69 (SE +/- 0.30, N = 3)
GeForce GTX TITAN X: 18.67 (SE +/- 0.20, N = 3)
GeForce GTX 1080: 14.02 (SE +/- 0.01, N = 3)

CUDA Mini-Nbody

Test: Loop Unrolling

Seconds, Fewer Is Better (CUDA Mini-Nbody 2015-11-10)
GeForce GTX 960: 35.71 (SE +/- 0.03, N = 3)
GeForce GTX 970: 26.38 (SE +/- 0.02, N = 3)
GeForce GTX 980: 24.63 (SE +/- 0.20, N = 3)
GeForce GTX 980 Ti: 19.64 (SE +/- 0.16, N = 3)
GeForce GTX TITAN X: 18.71 (SE +/- 0.18, N = 3)
GeForce GTX 1080: 14.52 (SE +/- 0.02, N = 3)

CUDA Mini-Nbody

Test: SOA Data Layout

Seconds, Fewer Is Better (CUDA Mini-Nbody 2015-11-10)
GeForce GTX 960: 81.27 (SE +/- 0.10, N = 3)
GeForce GTX 970: 57.09 (SE +/- 0.01, N = 3)
GeForce GTX 980: 51.02 (SE +/- 0.13, N = 3)
GeForce GTX 980 Ti: 42.04 (SE +/- 0.12, N = 3)
GeForce GTX TITAN X: 38.69 (SE +/- 0.06, N = 3)
GeForce GTX 1080: 28.58 (SE +/- 0.05, N = 3)

CUDA Mini-Nbody

Test: Flush Denormals To Zero

Seconds, Fewer Is Better (CUDA Mini-Nbody 2015-11-10)
GeForce GTX 960: 81.19 (SE +/- 0.07, N = 3)
GeForce GTX 970: 57.20 (SE +/- 0.09, N = 3)
GeForce GTX 980: 50.44 (SE +/- 0.22, N = 3)
GeForce GTX 980 Ti: 42.10 (SE +/- 0.05, N = 3)
GeForce GTX TITAN X: 38.52 (SE +/- 0.10, N = 3)
GeForce GTX 1080: 28.58 (SE +/- 0.06, N = 3)


Phoronix Test Suite v10.8.4