OpenCL CUDA NVIDIA GPGPU Linux Tests

All Maxwell and several Kepler graphics cards tested on the NVIDIA Linux driver. Benchmarks by Michael Larabel for a future article on Phoronix.com, providing various GPGPU benchmarks for reference purposes.

HTML result view exported from: https://openbenchmarking.org/result/1610201-LO-1511113PT87&sro&gru.

GeForce GTX 680, GeForce GTX 750, GeForce GTX 760, GeForce GTX 780 Ti, GeForce GTX 950, GeForce GTX 960, GeForce GTX 970, GeForce GTX 980, GeForce GTX 980 Ti, GeForce GTX TITAN X:
  Processor: Intel Core i5-6600K @ 3.50GHz (4 Cores)
  Motherboard: MSI Z170A GAMING PRO (MS-7984) v1.0
  Chipset: Intel Device 191f
  Memory: 16384MB
  Disk: 256GB TS256GSSD370S
  Audio: Intel Device a170
  Network: Intel Device 15b8
  OS: Ubuntu 14.04
  Kernel: 3.19.0-33-generic (x86_64)
  Desktop: Unity 7.2.5
  Display Server: X Server 1.17.1
  Display Driver: NVIDIA 352.39
  OpenGL: 4.3.0
  Compiler: GCC 4.8.4 + Clang 3.4-1ubuntu3 + CUDA 7.5
  File-System: ext4
  Screen Resolution: 3840x2160
  Graphics, per run:
    GeForce GTX 680: NVIDIA GeForce GTX 680 2048MB (1006/3004MHz)
    GeForce GTX 750: eVGA NVIDIA GeForce GTX 750 1024MB (1019/2505MHz)
    GeForce GTX 760: NVIDIA GeForce GTX 760 2048MB (980/3004MHz)
    GeForce GTX 780 Ti: NVIDIA GeForce GTX 780 Ti 3072MB (875/3500MHz)
    GeForce GTX 950: eVGA NVIDIA GeForce GTX 950 2048MB (135/405MHz)
    GeForce GTX 960: eVGA NVIDIA GeForce GTX 960 2048MB (1277/3505MHz)
    GeForce GTX 970: eVGA NVIDIA GeForce GTX 970 4096MB (1163/3505MHz)
    GeForce GTX 980: NVIDIA GeForce GTX 980 4096MB (1126/3505MHz)
    GeForce GTX 980 Ti: NVIDIA GeForce GTX 980 Ti 6144MB (999/3505MHz)
    GeForce GTX TITAN X: NVIDIA GeForce GTX TITAN X 12288MB (1001/3505MHz)

GetForce GTX1080, GTX1080_18Oct, GTX1080_Dip_18Oct, GTX1080Tests18Oct, GTX1080_Dipka:
  Processor: 2 x Intel Xeon E5-2620 v4 @ 3.00GHz (32 Cores)
  Motherboard: GIGABYTE MG50-G21-XX v01234567
  Chipset: Intel Xeon E7 v4/Xeon
  Memory: 8 x 8192 MB DDR4-2133MHz Samsung
  Disk: 2 x 120GB SAMSUNG MZ7KM120
  Graphics: Intel HD 5500; NVIDIA GeForce GTX 1080 (differs between runs)
  Audio: NVIDIA ID 83
  Network: Intel I350 Gigabit Connection
  OS: SUSE LINUX 12.1
  Kernel: 3.12.62-60.62-default (x86_64)
  Display Server: X Server 1.15.2
  OpenGL: 1.4 (2.1 Mesa 10.5.4); 1.4 (2.1 Mesa 8.0.5) (differs between runs)
  Vulkan: 1.0.8
  Compiler: GCC 4.8.5 + CUDA 8.0
  File-System: btrfs
  Screen Resolution: 1366x768; 1090x614 (differs between runs)

Compiler Details
- GeForce GTX 680, GeForce GTX 750, GeForce GTX 760, GeForce GTX 780 Ti, GeForce GTX 950, GeForce GTX 960, GeForce GTX 970, GeForce GTX 980, GeForce GTX 980 Ti, GeForce GTX TITAN X: --build=x86_64-linux-gnu --disable-browser-plugin --disable-libmudflap --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-multilib-list=m32,m64,mx32 --with-tune=generic -v
- GetForce GTX1080, GTX1080_18Oct, GTX1080_Dip_18Oct, GTX1080Tests18Oct, GTX1080_Dipka: --build=x86_64-suse-linux --disable-libgcj --disable-libmudflap --disable-libssp --disable-libstdcxx-pch --disable-plugin --enable-__cxa_atexit --enable-checking=release --enable-languages=c,c++,objc,fortran,obj-c++,java,ada --enable-libstdcxx-allocator=new --enable-linux-futex --enable-ssp --enable-version-specific-runtime-libs --host=x86_64-suse-linux --mandir=/usr/share/man --with-arch-32=i586 --with-slibdir=/lib64 --with-tune=generic --without-system-libunwind

Processor Details
- GeForce GTX 680, GeForce GTX 750, GeForce GTX 760, GeForce GTX 780 Ti, GeForce GTX 950, GeForce GTX 960, GeForce GTX 970, GeForce GTX 980, GeForce GTX 980 Ti, GeForce GTX TITAN X: Scaling Governor: acpi-cpufreq performance
- GetForce GTX1080, GTX1080_18Oct, GTX1080_Dip_18Oct, GTX1080Tests18Oct, GTX1080_Dipka: Scaling Governor: intel_pstate powersave

OpenCL Details
- GeForce GTX 680: GPU Compute Cores: 1536
- GeForce GTX 750: GPU Compute Cores: 512
- GeForce GTX 760: GPU Compute Cores: 1152
- GeForce GTX 780 Ti: GPU Compute Cores: 2880
- GeForce GTX 950: GPU Compute Cores: 768
- GeForce GTX 960: GPU Compute Cores: 1024
- GeForce GTX 970: GPU Compute Cores: 1664
- GeForce GTX 980: GPU Compute Cores: 2048
- GeForce GTX 980 Ti: GPU Compute Cores: 2816
- GeForce GTX TITAN X: GPU Compute Cores: 3072

Environment Details
- GetForce GTX1080, GTX1080_18Oct, GTX1080_Dip_18Oct, GTX1080Tests18Oct, GTX1080_Dipka: LIBGL_DEBUG=quiet

Tests in this result file:
  shoc: CUDA - Texture Read Bandwidth
  shoc: OpenCL - Texture Read Bandwidth
  shoc: CUDA - FFT SP
  shoc: OpenCL - FFT SP
  shoc: CUDA - MD5 Hash
  shoc: OpenCL - MD5 Hash
  askap: Gridding
  askap: Degridding
  juliagpu: GPU
  mandelbulbgpu: GPU
  luxmark: GPU - Hotel
  luxmark: GPU - Microphone
  luxmark: GPU - Luxball HDR
  cuda-mini-nbody: Original
  cuda-mini-nbody: Cache Blocking
  cuda-mini-nbody: Loop Unrolling
  cuda-mini-nbody: SOA Data Layout
  cuda-mini-nbody: Flush Denormals To Zero

Systems: GeForce GTX 680, GeForce GTX 750, GeForce GTX 760, GeForce GTX 780 Ti, GeForce GTX 950, GeForce GTX 960, GeForce GTX 970, GeForce GTX 980, GeForce GTX 980 Ti, GeForce GTX TITAN X, GetForce GTX1080, GTX1080_18Oct, GTX1080_Dip_18Oct, GTX1080Tests18Oct, GTX1080_Dipka

Not every system ran every test. The per-system values are listed with each test in the sections that follow.

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2015-11-10. GB/s, More Is Better.

System                 GB/s     SE +/-   N
GTX1080Tests18Oct      525.70   0.75     3
GTX1080_18Oct          524.29   1.02     3
GTX1080_Dip_18Oct      523.32   1.03     3
GTX1080_Dipka          526.06   1.32     3
GeForce GTX 750        158.42   0.42     3
GeForce GTX 950        326.23   0.85     3
GeForce GTX 960        351.31   0.14     3
GeForce GTX 970        325.16   0.28     3
GeForce GTX 980        336.48   1.15     3
GeForce GTX 980 Ti     348.92   1.22     3
GeForce GTX TITAN X    356.52   0.12     3
GetForce GTX1080       529.61   1.15     3

1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft
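Each result in this file carries an "SE +/- x, N = n" annotation, the standard error over n runs. As a minimal Python sketch, assuming the usual definition (sample standard deviation divided by the square root of N), the figure could be reproduced from raw run values as below. Whether the Phoronix Test Suite uses exactly this estimator is an assumption, and the sample readings are hypothetical because the individual run values are not part of this export.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical GB/s readings from three runs (not taken from this result file).
runs = [524.0, 525.5, 527.6]
print(
    f"mean = {statistics.mean(runs):.2f} GB/s, "
    f"SE +/- {standard_error(runs):.2f}, N = {len(runs)}"
)
```

A small SE relative to the mean indicates that the runs agreed closely with one another.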

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2015-11-10. GB/s, More Is Better.

System                 GB/s     SE +/-   N
GTX1080Tests18Oct      504.48   1.15     3
GTX1080_Dip_18Oct      502.38   2.52     3
GTX1080_Dipka          503.78   1.98     3
GeForce GTX 680        242.16   1.02     3
GeForce GTX 750        121.14   0.23     3
GeForce GTX 760        170.26   0.28     3
GeForce GTX 780 Ti     286.62   0.02     3
GeForce GTX 950        239.19   0.73     3
GeForce GTX 960        269.98   0.56     3
GeForce GTX 970        283.36   0.06     3
GeForce GTX 980        332.60   0.20     3
GeForce GTX 980 Ti     345.55   0.21     3
GeForce GTX TITAN X    354.09   1.56     3
GetForce GTX1080       503.55   3.23     3

1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2015-11-10. GFLOPS, More Is Better.

System                 GFLOPS   SE +/-   N
GTX1080Tests18Oct      375.23   6.04     3
GTX1080_18Oct          368.59   6.07     4
GTX1080_Dip_18Oct      376.70   3.35     3
GTX1080_Dipka          380.36   4.96     3
GeForce GTX 750        113.64   0.69     3
GeForce GTX 950        172.28   0.47     3
GeForce GTX 960        212.43   1.49     3
GeForce GTX 970        263.14   2.44     3
GeForce GTX 980        289.63   3.09     3
GeForce GTX 980 Ti     311.46   0.32     3
GeForce GTX TITAN X    324.09   1.19     3
GetForce GTX1080       371.85   6.33     3

1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2015-11-10. GFLOPS, More Is Better.

System                 GFLOPS   SE +/-   N
GTX1080Tests18Oct      265.53   2.86     3
GTX1080_18Oct          264.35   1.98     3
GTX1080_Dip_18Oct      268.48   2.61     3
GTX1080_Dipka          266.46   3.70     3
GeForce GTX 680        74.97    0.87     3
GeForce GTX 750        54.69    0.08     3
GeForce GTX 760        78.44    0.31     3
GeForce GTX 780 Ti     126.71   0.19     3
GeForce GTX 950        63.22    0.08     3
GeForce GTX 960        62.78    1.20     3
GeForce GTX 970        117.23   0.52     3
GeForce GTX 980        140.12   1.30     3
GeForce GTX 980 Ti     170.36   0.65     3
GeForce GTX TITAN X    173.89   0.19     3
GetForce GTX1080       258.03   3.85     3

1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft
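Since the FFT SP test is reported for both the CUDA and the OpenCL target, the two results can be put side by side per card. The following is a minimal Python sketch using four of the GFLOPS pairs reported above; the function name is illustrative, and the ratio only restates the reported numbers without explaining the gap.

```python
# (CUDA GFLOPS, OpenCL GFLOPS) copied from the SHOC FFT SP results above.
fft_sp = {
    "GeForce GTX 750": (113.64, 54.69),
    "GeForce GTX 980": (289.63, 140.12),
    "GeForce GTX TITAN X": (324.09, 173.89),
    "GetForce GTX1080": (371.85, 258.03),
}


def cuda_over_opencl(results):
    """Return {system: CUDA GFLOPS / OpenCL GFLOPS}, rounded to two places."""
    return {
        name: round(cuda / opencl, 2)
        for name, (cuda, opencl) in results.items()
    }


ratios = cuda_over_opencl(fft_sp)
for name, ratio in ratios.items():
    print(f"{name}: CUDA result is {ratio:.2f}x the OpenCL result")
```

The same function works for any other card that appears in both FFT SP sections; cards tested with only one target (for example the GeForce GTX 680) have no ratio.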

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2015-11-10. GHash/s, More Is Better.

System                 GHash/s  SE +/-   N
GTX1080Tests18Oct      11.84    0.01     3
GTX1080_18Oct          11.85    0.02     3
GTX1080_Dip_18Oct      11.83    0.00     3
GTX1080_Dipka          11.88    0.00     3
GeForce GTX 750        1.08     0.00     3
GeForce GTX 950        2.36     0.00     3
GeForce GTX 960        3.38     0.00     3
GeForce GTX 970        4.79     0.00     3
GeForce GTX 980        5.70     0.00     3
GeForce GTX 980 Ti     6.81     0.00     3
GeForce GTX TITAN X    7.42     0.00     3
GetForce GTX1080       11.87    0.02     3

1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2015-11-10. GHash/s, More Is Better.

System                 GHash/s  SE +/-   N
GTX1080Tests18Oct      11.77    0.01     3
GTX1080_18Oct          11.77    0.00     3
GTX1080_Dip_18Oct      11.75    0.00     3
GTX1080_Dipka          11.78    0.01     3
GeForce GTX 680        1.91     0.00     3
GeForce GTX 750        1.07     0.00     3
GeForce GTX 760        1.40     0.00     3
GeForce GTX 780 Ti     3.78     0.00     3
GeForce GTX 950        2.34     0.00     3
GeForce GTX 960        3.36     0.00     3
GeForce GTX 970        4.77     0.00     3
GeForce GTX 980        5.68     0.00     3
GeForce GTX 980 Ti     6.79     0.00     3
GeForce GTX TITAN X    7.41     0.00     3
GetForce GTX1080       11.80    0.01     3

1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

ASKAP tConvolveCuda

Processing: Gridding

ASKAP tConvolveCuda 2015-11-10. Million Grid Points Per Second, More Is Better.

System                 Million Grid Points/s   SE +/-   N
GTX1080Tests18Oct      8068.36                 0.00     3
GTX1080_Dip_18Oct      7989.26                 79.10    3
GTX1080_Dipka          8068.36                 0.00     3
GeForce GTX 950        3399.14                 14.40    3
GeForce GTX 960        3144.85                 12.43    3
GeForce GTX 970        5325.12                 0.00     3
GeForce GTX 980        6051.27                 0.00     3
GeForce GTX 980 Ti     8320.50                 0.00     3
GeForce GTX TITAN X    8458.77                 130.14   4
GetForce GTX1080       7989.26                 79.10    3

1. (CXX) g++ options: -fPIC -O3 -m64 -lcudadevrt -lcudart_static -lrt -lpthread -ldl

ASKAP tConvolveCuda

Processing: Degridding

ASKAP tConvolveCuda 2015-11-10. Million Grid Points Per Second, More Is Better.

System                 Million Grid Points/s   SE +/-   N
GTX1080Tests18Oct      14013.50                0.00     3
GTX1080_Dip_18Oct      14273.00                259.50   3
GTX1080_Dipka          14532.50                259.50   3
GeForce GTX 950        5706.07                 41.05    3
GeForce GTX 960        5290.32                 34.80    3
GeForce GTX 970        9509.14                 0.00     3
GeForce GTX 980        11094.00                0.00     3
GeForce GTX 980 Ti     17380.60                369.80   3
GeForce GTX TITAN X    17380.60                369.80   3
GetForce GTX1080       14013.50                0.00     3

1. (CXX) g++ options: -fPIC -O3 -m64 -lcudadevrt -lcudart_static -lrt -lpthread -ldl

JuliaGPU

OpenCL Device: GPU

JuliaGPU 1.2pts1. Samples/sec, More Is Better.

System                 Samples/sec     SE +/-      N
GTX1080_Dipka          474849.50       2656.73     3
GeForce GTX 680        48074789.03     59682.63    3
GeForce GTX 750        36136874.00     22546.70    3
GeForce GTX 760        38310650.50     14125.16    3
GeForce GTX 780 Ti     78839770.13     293396.06   3
GeForce GTX 950        64913682.63     58084.93    3
GeForce GTX 960        80042041.73     157475.07   3
GeForce GTX 970        104144917.23    84325.23    3
GeForce GTX 980        113830604.27    218639.12   3
GeForce GTX 980 Ti     127978049.53    473156.02   3
GeForce GTX TITAN X    136037921.43    318277.32   3

1. (CC) gcc options: -O3 -march=native -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL -lm

MandelbulbGPU

OpenCL Device: GPU

MandelbulbGPU 1.0pts1. Samples/sec, More Is Better.

System                 Samples/sec     SE +/-      N
GTX1080_Dipka          347349.07       3765.68     3
GeForce GTX 680        31636512.97     36731.70    3
GeForce GTX 750        20060275.53     9818.73     3
GeForce GTX 760        25392138.50     28089.31    3
GeForce GTX 780 Ti     47400001.90     48150.35    3
GeForce GTX 950        37156070.87     29855.85    3
GeForce GTX 960        44953399.47     75512.83    3
GeForce GTX 970        58811317.17     91420.68    3
GeForce GTX 980        63616558.77     140370.89   3
GeForce GTX 980 Ti     71656708.83     168304.91   3
GeForce GTX TITAN X    75614774.13     166919.37   3

1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

LuxMark

OpenCL Device: GPU - Scene: Hotel

LuxMark 3.0. Score, More Is Better.

System                 Score    SE +/-   N
GTX1080_Dipka          24364    8.25     3
GeForce GTX 680        577      2.00     3
GeForce GTX 760        463      0.33     3
GeForce GTX 780 Ti     992      0.00     3
GeForce GTX 950        769      0.00     3
GeForce GTX 960        897      0.67     3
GeForce GTX 970        1346     0.00     3
GeForce GTX 980        1492     1.20     3
GeForce GTX 980 Ti     1855     0.33     3
GeForce GTX TITAN X    1906     0.33     3

LuxMark

OpenCL Device: GPU - Scene: Microphone

LuxMark 3.0. Score, More Is Better.

System                 Score    SE +/-   N
GTX1080_Dipka          52838    2.60     3
GeForce GTX 680        2127     3.06     3
GeForce GTX 760        1941     0.67     3
GeForce GTX 780 Ti     4302     12.00    3
GeForce GTX 950        2423     4.26     3
GeForce GTX 960        2460     1.15     3
GeForce GTX 970        4458     7.64     3
GeForce GTX 980        4776     0.67     3
GeForce GTX 980 Ti     6268     18.50    3
GeForce GTX TITAN X    6360     3.00     3

LuxMark

OpenCL Device: GPU - Scene: Luxball HDR

LuxMark 3.0. Score, More Is Better.

System                 Score    SE +/-   N
GTX1080_Dipka          102107   48.95    3
GeForce GTX 680        4554     12.17    3
GeForce GTX 750        3491     11.67    3
GeForce GTX 760        4253     1.45     3
GeForce GTX 780 Ti     9639     35.97    3
GeForce GTX 950        5313     16.67    3
GeForce GTX 960        5474     0.88     3
GeForce GTX 970        9737     24.85    3
GeForce GTX 980        10713    1.20     3
GeForce GTX 980 Ti     13802    44.35    3
GeForce GTX TITAN X    14081    4.70     3

CUDA Mini-Nbody

Test: Original

CUDA Mini-Nbody 2015-11-10. Seconds, Fewer Is Better.

System                 Seconds  SE +/-   N
GTX1080Tests18Oct      45.71    0.16     3
GTX1080_Dip_18Oct      45.91    0.21     3
GTX1080_Dipka          44.65    0.09     3
GeForce GTX 750        180.66   0.05     3
GeForce GTX 780 Ti     61.03    0.50     3
GeForce GTX 950        105.30   0.21     3
GeForce GTX 960        82.01    0.43     3
GeForce GTX 970        54.32    0.13     3
GeForce GTX 980        45.38    0.10     3
GeForce GTX 980 Ti     34.58    0.57     3
GeForce GTX TITAN X    32.37    0.35     3
GetForce GTX1080       46.19    0.20     3

CUDA Mini-Nbody

Test: Cache Blocking

CUDA Mini-Nbody 2015-11-10. Seconds, Fewer Is Better.

System                 Seconds  SE +/-   N
GTX1080Tests18Oct      26.92    0.05     3
GTX1080_Dip_18Oct      27.50    0.12     3
GTX1080_Dipka          26.39    0.02     3
GeForce GTX 750        98.19    0.00     3
GeForce GTX 780 Ti     29.99    0.27     3
GeForce GTX 950        49.89    0.02     3
GeForce GTX 960        37.08    0.01     3
GeForce GTX 970        28.53    0.01     3
GeForce GTX 980        25.13    0.06     3
GeForce GTX 980 Ti     19.77    0.21     3
GeForce GTX TITAN X    18.65    0.10     3
GetForce GTX1080       27.37    0.16     3

CUDA Mini-Nbody

Test: Loop Unrolling

CUDA Mini-Nbody 2015-11-10. Seconds, Fewer Is Better.

System                 Seconds  SE +/-   N
GTX1080Tests18Oct      27.84    0.04     3
GTX1080_Dip_18Oct      28.11    0.07     3
GTX1080_Dipka          27.57    0.12     3
GeForce GTX 750        89.34    0.04     3
GeForce GTX 780 Ti     27.05    0.05     3
GeForce GTX 950        47.54    0.03     3
GeForce GTX 960        35.35    0.03     3
GeForce GTX 970        26.42    0.02     3
GeForce GTX 980        23.88    0.21     3
GeForce GTX 980 Ti     18.46    0.15     3
GeForce GTX TITAN X    17.59    0.25     3
GetForce GTX1080       28.08    0.02     3

CUDA Mini-Nbody

Test: SOA Data Layout

CUDA Mini-Nbody 2015-11-10. Seconds, Fewer Is Better.

System                 Seconds  SE +/-   N
GTX1080Tests18Oct      41.27    0.08     3
GTX1080_Dip_18Oct      41.77    0.25     3
GTX1080_Dipka          40.78    0.08     3
GeForce GTX 750        199.95   0.04     3
GeForce GTX 780 Ti     54.39    0.16     3
GeForce GTX 950        108.50   0.02     3
GeForce GTX 960        79.97    0.08     3
GeForce GTX 970        55.87    0.05     3
GeForce GTX 980        50.15    0.21     3
GeForce GTX 980 Ti     40.94    0.11     3
GeForce GTX TITAN X    37.43    0.20     3
GetForce GTX1080       41.48    0.03     3

CUDA Mini-Nbody

Test: Flush Denormals To Zero

CUDA Mini-Nbody 2015-11-10. Seconds, Fewer Is Better.

System                 Seconds  SE +/-   N
GTX1080Tests18Oct      41.90    0.20     3
GTX1080_Dip_18Oct      41.58    0.09     3
GTX1080_Dipka          40.92    0.22     3
GeForce GTX 750        199.83   0.03     3
GeForce GTX 780 Ti     53.26    0.10     3
GeForce GTX 950        108.48   0.02     3
GeForce GTX 960        79.84    0.01     3
GeForce GTX 970        55.80    0.07     3
GeForce GTX 980        49.53    0.18     3
GeForce GTX 980 Ti     40.85    0.10     3
GeForce GTX TITAN X    37.37    0.08     3
GetForce GTX1080       41.47    0.02     3
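To compare the five CUDA Mini-Nbody tests on a single card, the reported times can be expressed as ratios against the Original test. The following is a minimal Python sketch using the GeForce GTX TITAN X times reported in the sections above; the function name and the choice of card are illustrative, and the ratios say nothing about why a given test takes more or fewer seconds.

```python
# Seconds reported for GeForce GTX TITAN X in the CUDA Mini-Nbody sections above.
titan_x_seconds = {
    "Original": 32.37,
    "Cache Blocking": 18.65,
    "Loop Unrolling": 17.59,
    "SOA Data Layout": 37.43,
    "Flush Denormals To Zero": 37.37,
}


def speedup_vs_original(seconds, baseline="Original"):
    """Return {test: baseline time / test time}, rounded to two places.

    A value above 1.0 means the test finished in fewer seconds than the baseline.
    """
    base = seconds[baseline]
    return {test: round(base / t, 2) for test, t in seconds.items()}


speedups = speedup_vs_original(titan_x_seconds)
for test, factor in speedups.items():
    print(f"{test}: {factor:.2f}x")
```

Because the metric is "Fewer Is Better", the baseline time goes in the numerator so that larger ratios correspond to better results.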


Phoronix Test Suite v10.8.4