OpenCL CUDA NVIDIA GPGPU Linux Tests

Benchmarks were run on the Eurocom Q6 for performance evaluation.

HTML result view exported from: https://openbenchmarking.org/result/1806259-AR-1511113PT86&rdt.

GeForce GTX 950, 980 Ti, 970, 980, 960, TITAN X, 780 Ti, 680, 750 and 760 runs:

Processor:          Intel Core i5-6600K @ 3.50GHz (4 Cores)
Motherboard:        MSI Z170A GAMING PRO (MS-7984) v1.0
Chipset:            Intel Device 191f
Memory:             16384MB
Disk:               256GB TS256GSSD370S
Audio:              Intel Device a170
Network:            Intel Device 15b8
OS:                 Ubuntu 14.04
Kernel:             3.19.0-33-generic (x86_64)
Desktop:            Unity 7.2.5
Display Server:     X Server 1.17.1
Display Driver:     NVIDIA 352.39
OpenGL:             4.3.0
Compiler:           GCC 4.8.4 + Clang 3.4-1ubuntu3 + CUDA 7.5
File-System:        ext4
Screen Resolution:  3840x2160

Graphics, per run:

GeForce GTX 950                     eVGA NVIDIA GeForce GTX 950 2048MB (135/405MHz)
GeForce GTX 980 Ti                  NVIDIA GeForce GTX 980 Ti 6144MB (999/3505MHz)
GeForce GTX 970                     eVGA NVIDIA GeForce GTX 970 4096MB (1163/3505MHz)
GeForce GTX 980                     NVIDIA GeForce GTX 980 4096MB (1126/3505MHz)
GeForce GTX 960                     eVGA NVIDIA GeForce GTX 960 2048MB (1277/3505MHz)
GeForce GTX TITAN X                 NVIDIA GeForce GTX TITAN X 12288MB (1001/3505MHz)
GeForce GTX 780 Ti                  NVIDIA GeForce GTX 780 Ti 3072MB (875/3500MHz)
GeForce GTX 680                     NVIDIA GeForce GTX 680 2048MB (1006/3004MHz)
GeForce GTX 750                     eVGA NVIDIA GeForce GTX 750 1024MB (1019/2505MHz)
GeForce GTX 760                     NVIDIA GeForce GTX 760 2048MB (980/3004MHz)

GeForce GTX 1070 with Max-Q design run:

Processor:          Intel Core i7-8750H @ 4.10GHz (6 Cores / 12 Threads)
Motherboard:        Eurocom Q6 (7.005 BIOS)
Chipset:            Intel Cannon Lake PCH Shared SRAM
Memory:             32768MB
Disk:               2050GB Crucial_CT2050MX + 1000GB Samsung SSD 960 EVO 1TB
Graphics:           NVIDIA GeForce GTX 1070 with Max-Q Design 8192MB (1101/4006MHz)
Audio:              Realtek ALC1220
Network:            Realtek RTL8111/8168/8411 + Intel Wireless-AC 9260
OS:                 Ubuntu 18.04
Kernel:             4.17.2 (x86_64)
Desktop:            GNOME Shell 3.28.1
Display Server:     X Server 1.19.6
Display Driver:     NVIDIA 396.24.02
OpenGL:             4.6.0
OpenCL:             OpenCL 1.2 CUDA 9.2.127 + OpenCL 2.1
Compiler:           GCC 7.3.0 + Clang 4.0.1-10 + LLVM 4.0.1 + CUDA 9.2
Screen Resolution:  1920x1080

Compiler Details

- GeForce GTX 950, 980 Ti, 970, 980, 960, TITAN X, 780 Ti, 680, 750, 760 (identical for all ten runs): --build=x86_64-linux-gnu --disable-browser-plugin --disable-libmudflap --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-multilib-list=m32,m64,mx32 --with-tune=generic -v
- GeForce GTX 1070 with Max-Q design: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-as=/usr/bin/x86_64-linux-gnu-as --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-ld=/usr/bin/x86_64-linux-gnu-ld --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v

Processor Details

- GeForce GTX 950, 980 Ti, 970, 980, 960, TITAN X, 780 Ti, 680, 750, 760: Scaling Governor: acpi-cpufreq performance
- GeForce GTX 1070 with Max-Q design: Scaling Governor: intel_pstate performance

OpenCL Details

- GeForce GTX 950: GPU Compute Cores: 768
- GeForce GTX 980 Ti: GPU Compute Cores: 2816
- GeForce GTX 970: GPU Compute Cores: 1664
- GeForce GTX 980: GPU Compute Cores: 2048
- GeForce GTX 960: GPU Compute Cores: 1024
- GeForce GTX TITAN X: GPU Compute Cores: 3072
- GeForce GTX 780 Ti: GPU Compute Cores: 2880
- GeForce GTX 680: GPU Compute Cores: 1536
- GeForce GTX 750: GPU Compute Cores: 512
- GeForce GTX 760: GPU Compute Cores: 1152
- GeForce GTX 1070 with Max-Q design: GPU Compute Cores: 2048

System Details

- GeForce GTX 950 through GeForce GTX 760: GPU Compute Cores, same values as listed under OpenCL Details

Kernel Details

- GeForce GTX 1070 with Max-Q design: drm.debug=0xe

Security Details

- GeForce GTX 1070 with Max-Q design: KPTI + __user pointer sanitization + Full generic retpoline IBPB IBRS_FW Protection

Results overview (a dash marks a test with no result for that card)

shoc and askap. FFT SP in GFLOPS, MD5 Hash in GHash/s, Texture Read Bandwidth in GB/s, Gridding and Degridding in Million Grid Points Per Second; more is better.

GPU                                 CUDA FFT SP  CUDA MD5  OpenCL FFT SP  OpenCL MD5  CUDA Texture  OpenCL Texture  Gridding  Degridding
GeForce GTX 950                     172.28       2.36      63.22          2.34        326.23        239.19          3399.14   5706.07
GeForce GTX 980 Ti                  311.46       6.81      170.36         6.79        348.92        345.55          8320.50   17380.60
GeForce GTX 970                     263.14       4.79      117.23         4.77        325.16        283.36          5325.12   9509.14
GeForce GTX 980                     289.63       5.70      140.12         5.68        336.48        332.60          6051.27   11094
GeForce GTX 960                     212.43       3.38      62.78          3.36        351.31        269.98          3144.85   5290.32
GeForce GTX TITAN X                 324.09       7.42      173.89         7.41        356.52        354.09          8458.77   17380.60
GeForce GTX 780 Ti                  -            -         126.71         3.78        -             286.62          -         -
GeForce GTX 680                     -            -         74.97          1.91        -             242.16          -         -
GeForce GTX 750                     113.64       1.08      54.69          1.07        158.42        121.14          -         -
GeForce GTX 760                     -            -         78.44          1.40        -             170.26          -         -
GeForce GTX 1070 with Max-Q design  306.71       6.56      234.91         6.51        430.42        428.55          7681.89   13312.80

cuda-mini-nbody. Seconds; fewer is better.

GPU                                 Original  Cache Blocking  Loop Unrolling  SOA Data Layout  Flush Denormals To Zero
GeForce GTX 950                     105.30    49.89           47.54           108.50           108.48
GeForce GTX 980 Ti                  34.58     19.77           18.46           40.94            40.85
GeForce GTX 970                     54.32     28.53           26.42           55.87            55.80
GeForce GTX 980                     45.38     25.13           23.88           50.15            49.53
GeForce GTX 960                     82.01     37.08           35.35           79.97            79.84
GeForce GTX TITAN X                 32.37     18.65           17.59           37.43            37.37
GeForce GTX 780 Ti                  61.03     29.99           27.05           54.39            53.26
GeForce GTX 680                     -         -               -               -                -
GeForce GTX 750                     180.66    98.19           89.34           199.95           199.83
GeForce GTX 760                     -         -               -               -                -
GeForce GTX 1070 with Max-Q design  45.72     21.94           21.96           42.42            41.96

juliagpu, mandelbulbgpu and luxmark. Samples/sec for juliagpu and mandelbulbgpu, score for luxmark; more is better.

GPU                                 juliagpu       mandelbulbgpu  Hotel  Microphone  Luxball HDR
GeForce GTX 950                     64913682.63    37156070.87    769    2423        5313
GeForce GTX 980 Ti                  127978049.53   71656708.83    1855   6268        13802
GeForce GTX 970                     104144917.23   58811317.17    1346   4458        9737
GeForce GTX 980                     113830604.27   63616558.77    1492   4776        10713
GeForce GTX 960                     80042041.73    44953399.47    897    2460        5474
GeForce GTX TITAN X                 136037921.43   75614774.13    1906   6360        14081
GeForce GTX 780 Ti                  78839770.13    47400001.90    992    4302        9639
GeForce GTX 680                     48074789.03    31636512.97    577    2127        4554
GeForce GTX 750                     36136874.00    20060275.53    -      -           3491
GeForce GTX 760                     38310650.50    25392138.50    463    1941        4253
GeForce GTX 1070 with Max-Q design  131929862.80   74340055.80    -      -           -
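Because the overview mixes "more is better" and "fewer is better" metrics, comparing cards is easier after normalizing each result against a baseline card. The following Python sketch is one possible way to do that. The helper name `relative_to` and the choice of the GeForce GTX 950 as baseline are assumptions made for illustration only; the numbers are copied from the overview, with only four cards shown for brevity.

```python
# Results copied from the overview above (subset of cards).
fft_sp_cuda_gflops = {  # SHOC CUDA FFT SP, GFLOPS, more is better
    "GeForce GTX 950": 172.28,
    "GeForce GTX 980 Ti": 311.46,
    "GeForce GTX TITAN X": 324.09,
    "GeForce GTX 1070 with Max-Q design": 306.71,
}
nbody_original_seconds = {  # CUDA Mini-Nbody "Original", seconds, fewer is better
    "GeForce GTX 950": 105.30,
    "GeForce GTX 980 Ti": 34.58,
    "GeForce GTX TITAN X": 32.37,
    "GeForce GTX 1070 with Max-Q design": 45.72,
}


def relative_to(results, baseline, higher_is_better=True):
    """Express every result as a multiple of the baseline card's performance.

    Hypothetical helper, not part of the Phoronix Test Suite.
    """
    base = results[baseline]
    if higher_is_better:
        return {gpu: value / base for gpu, value in results.items()}
    # For time-based results a smaller number is faster, so invert the ratio.
    return {gpu: base / value for gpu, value in results.items()}


if __name__ == "__main__":
    for gpu, ratio in relative_to(fft_sp_cuda_gflops, "GeForce GTX 950").items():
        print(f"FFT SP (CUDA)    {gpu}: {ratio:.2f}x")
    for gpu, ratio in relative_to(
        nbody_original_seconds, "GeForce GTX 950", higher_is_better=False
    ).items():
        print(f"N-body Original  {gpu}: {ratio:.2f}x")
```

Inverting the ratio for time-based tests keeps "larger means faster" consistent across both kinds of metric.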

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2015-11-10. GFLOPS, more is better.

GPU                                 Result    SE      N
GeForce GTX 950                     172.28    0.47    3
GeForce GTX 980 Ti                  311.46    0.32    3
GeForce GTX 970                     263.14    2.44    3
GeForce GTX 980                     289.63    3.09    3
GeForce GTX 960                     212.43    1.49    3
GeForce GTX TITAN X                 324.09    1.19    3
GeForce GTX 750                     113.64    0.69    3
GeForce GTX 1070 with Max-Q design  306.71    0.65    3

(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft. Also listed: -std=c++14.
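Each result in this report is given with "SE +/- x, N = n", that is, a standard error over n runs. As a rough illustration, and assuming the usual definition (sample standard deviation divided by the square root of the number of runs), the figure could be reproduced as in the Python sketch below. The three per-run values are hypothetical, since the export only gives the averaged result and its SE, and the function name is likewise an assumption.

```python
import math
import statistics


def mean_and_standard_error(samples):
    """Return (mean, standard error) for a list of per-run results.

    Assumes SE = sample standard deviation / sqrt(N); needs at least two runs.
    """
    n = len(samples)
    mean = statistics.mean(samples)
    standard_error = statistics.stdev(samples) / math.sqrt(n)
    return mean, standard_error


if __name__ == "__main__":
    # Hypothetical GFLOPS readings from three runs; not taken from the export.
    hypothetical_runs = [171.5, 172.3, 173.1]
    mean, se = mean_and_standard_error(hypothetical_runs)
    print(f"{mean:.2f} GFLOPS, SE +/- {se:.2f}, N = {len(hypothetical_runs)}")
```

A small SE relative to the result, as in most rows of this report, suggests the runs were closely repeatable.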

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2015-11-10. GHash/s, more is better.

GPU                                 Result    SE      N
GeForce GTX 950                     2.36      0.00    3
GeForce GTX 980 Ti                  6.81      0.00    3
GeForce GTX 970                     4.79      0.00    3
GeForce GTX 980                     5.70      0.00    3
GeForce GTX 960                     3.38      0.00    3
GeForce GTX TITAN X                 7.42      0.00    3
GeForce GTX 750                     1.08      0.00    3
GeForce GTX 1070 with Max-Q design  6.56      0.00    3

(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft. Also listed: -std=c++14.

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2015-11-10. GFLOPS, more is better.

GPU                                 Result    SE      N
GeForce GTX 950                     63.22     0.08    3
GeForce GTX 980 Ti                  170.36    0.65    3
GeForce GTX 970                     117.23    0.52    3
GeForce GTX 980                     140.12    1.30    3
GeForce GTX 960                     62.78     1.20    3
GeForce GTX TITAN X                 173.89    0.19    3
GeForce GTX 780 Ti                  126.71    0.19    3
GeForce GTX 680                     74.97     0.87    3
GeForce GTX 750                     54.69     0.08    3
GeForce GTX 760                     78.44     0.31    3
GeForce GTX 1070 with Max-Q design  234.91    0.95    3

(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft. Also listed: -std=c++14.

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2015-11-10. GHash/s, more is better.

GPU                                 Result    SE      N
GeForce GTX 950                     2.34      0.00    3
GeForce GTX 980 Ti                  6.79      0.00    3
GeForce GTX 970                     4.77      0.00    3
GeForce GTX 980                     5.68      0.00    3
GeForce GTX 960                     3.36      0.00    3
GeForce GTX TITAN X                 7.41      0.00    3
GeForce GTX 780 Ti                  3.78      0.00    3
GeForce GTX 680                     1.91      0.00    3
GeForce GTX 750                     1.07      0.00    3
GeForce GTX 760                     1.40      0.00    3
GeForce GTX 1070 with Max-Q design  6.51      0.00    3

(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft. Also listed: -std=c++14.

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2015-11-10. GB/s, more is better.

GPU                                 Result    SE      N
GeForce GTX 950                     326.23    0.85    3
GeForce GTX 980 Ti                  348.92    1.22    3
GeForce GTX 970                     325.16    0.28    3
GeForce GTX 980                     336.48    1.15    3
GeForce GTX 960                     351.31    0.14    3
GeForce GTX TITAN X                 356.52    0.12    3
GeForce GTX 750                     158.42    0.42    3
GeForce GTX 1070 with Max-Q design  430.42    3.41    3

(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft. Also listed: -std=c++14.

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2015-11-10. GB/s, more is better.

GPU                                 Result    SE      N
GeForce GTX 950                     239.19    0.73    3
GeForce GTX 980 Ti                  345.55    0.21    3
GeForce GTX 970                     283.36    0.06    3
GeForce GTX 980                     332.60    0.20    3
GeForce GTX 960                     269.98    0.56    3
GeForce GTX TITAN X                 354.09    1.56    3
GeForce GTX 780 Ti                  286.62    0.02    3
GeForce GTX 680                     242.16    1.02    3
GeForce GTX 750                     121.14    0.23    3
GeForce GTX 760                     170.26    0.28    3
GeForce GTX 1070 with Max-Q design  428.55    1.10    3

(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft. Also listed: -std=c++14.

ASKAP tConvolveCuda

Processing: Gridding

ASKAP tConvolveCuda 2015-11-10. Million Grid Points Per Second, more is better.

GPU                                 Result      SE        N
GeForce GTX 950                     3399.14     14.40     3
GeForce GTX 980 Ti                  8320.50     0.00      3
GeForce GTX 970                     5325.12     0.00      3
GeForce GTX 980                     6051.27     0.00      3
GeForce GTX 960                     3144.85     12.43     3
GeForce GTX TITAN X                 8458.77     130.14    4
GeForce GTX 1070 with Max-Q design  7681.89     74.58     3

(CXX) g++ options: -fPIC -O3 -m64 -lcudadevrt -lcudart_static -lrt -lpthread -ldl. Also listed: -std=c++14.

ASKAP tConvolveCuda

Processing: Degridding

ASKAP tConvolveCuda 2015-11-10. Million Grid Points Per Second, more is better.

GPU                                 Result      SE        N
GeForce GTX 950                     5706.07     41.05     3
GeForce GTX 980 Ti                  17380.60    369.80    3
GeForce GTX 970                     9509.14     0.00      3
GeForce GTX 980                     11094.00    0.00      3
GeForce GTX 960                     5290.32     34.80     3
GeForce GTX TITAN X                 17380.60    369.80    3
GeForce GTX 1070 with Max-Q design  13312.80    0.00      3

(CXX) g++ options: -fPIC -O3 -m64 -lcudadevrt -lcudart_static -lrt -lpthread -ldl. Also listed: -std=c++14.

CUDA Mini-Nbody

Test: Original

CUDA Mini-Nbody 2015-11-10. Seconds, fewer is better.

GPU                                 Result    SE      N
GeForce GTX 950                     105.30    0.21    3
GeForce GTX 980 Ti                  34.58     0.57    3
GeForce GTX 970                     54.32     0.13    3
GeForce GTX 980                     45.38     0.10    3
GeForce GTX 960                     82.01     0.43    3
GeForce GTX TITAN X                 32.37     0.35    3
GeForce GTX 780 Ti                  61.03     0.50    3
GeForce GTX 750                     180.66    0.05    3
GeForce GTX 1070 with Max-Q design  45.72     0.15    3

CUDA Mini-Nbody

Test: Cache Blocking

CUDA Mini-Nbody 2015-11-10. Seconds, fewer is better.

GPU                                 Result    SE      N
GeForce GTX 950                     49.89     0.02    3
GeForce GTX 980 Ti                  19.77     0.21    3
GeForce GTX 970                     28.53     0.01    3
GeForce GTX 980                     25.13     0.06    3
GeForce GTX 960                     37.08     0.01    3
GeForce GTX TITAN X                 18.65     0.10    3
GeForce GTX 780 Ti                  29.99     0.27    3
GeForce GTX 750                     98.19     0.00    3
GeForce GTX 1070 with Max-Q design  21.94     0.02    3

CUDA Mini-Nbody

Test: Loop Unrolling

CUDA Mini-Nbody 2015-11-10. Seconds, fewer is better.

GPU                                 Result    SE      N
GeForce GTX 950                     47.54     0.03    3
GeForce GTX 980 Ti                  18.46     0.15    3
GeForce GTX 970                     26.42     0.02    3
GeForce GTX 980                     23.88     0.21    3
GeForce GTX 960                     35.35     0.03    3
GeForce GTX TITAN X                 17.59     0.25    3
GeForce GTX 780 Ti                  27.05     0.05    3
GeForce GTX 750                     89.34     0.04    3
GeForce GTX 1070 with Max-Q design  21.96     0.03    3

CUDA Mini-Nbody

Test: SOA Data Layout

CUDA Mini-Nbody 2015-11-10. Seconds, fewer is better.

GPU                                 Result    SE      N
GeForce GTX 950                     108.50    0.02    3
GeForce GTX 980 Ti                  40.94     0.11    3
GeForce GTX 970                     55.87     0.05    3
GeForce GTX 980                     50.15     0.21    3
GeForce GTX 960                     79.97     0.08    3
GeForce GTX TITAN X                 37.43     0.20    3
GeForce GTX 780 Ti                  54.39     0.16    3
GeForce GTX 750                     199.95    0.04    3
GeForce GTX 1070 with Max-Q design  42.42     0.04    3

CUDA Mini-Nbody

Test: Flush Denormals To Zero

CUDA Mini-Nbody 2015-11-10. Seconds, fewer is better.

GPU                                 Result    SE      N
GeForce GTX 950                     108.48    0.02    3
GeForce GTX 980 Ti                  40.85     0.10    3
GeForce GTX 970                     55.80     0.07    3
GeForce GTX 980                     49.53     0.18    3
GeForce GTX 960                     79.84     0.01    3
GeForce GTX TITAN X                 37.37     0.08    3
GeForce GTX 780 Ti                  53.26     0.10    3
GeForce GTX 750                     199.83    0.03    3
GeForce GTX 1070 with Max-Q design  41.96     0.07    3

JuliaGPU

OpenCL Device: GPU

JuliaGPU 1.2pts1. Samples/sec, more is better.

GPU                                 Result          SE           N
GeForce GTX 950                     64913682.63     58084.93     3
GeForce GTX 980 Ti                  127978049.53    473156.02    3
GeForce GTX 970                     104144917.23    84325.23     3
GeForce GTX 980                     113830604.27    218639.12    3
GeForce GTX 960                     80042041.73     157475.07    3
GeForce GTX TITAN X                 136037921.43    318277.32    3
GeForce GTX 780 Ti                  78839770.13     293396.06    3
GeForce GTX 680                     48074789.03     59682.63     3
GeForce GTX 750                     36136874.00     22546.70     3
GeForce GTX 760                     38310650.50     14125.16     3
GeForce GTX 1070 with Max-Q design  131929862.80    177133.34    3

(CC) gcc options: -O3 -march=native -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL -lm

MandelbulbGPU

OpenCL Device: GPU

MandelbulbGPU 1.0pts1. Samples/sec, more is better.

GPU                                 Result          SE           N
GeForce GTX 950                     37156070.87     29855.85     3
GeForce GTX 980 Ti                  71656708.83     168304.91    3
GeForce GTX 970                     58811317.17     91420.68     3
GeForce GTX 980                     63616558.77     140370.89    3
GeForce GTX 960                     44953399.47     75512.83     3
GeForce GTX TITAN X                 75614774.13     166919.37    3
GeForce GTX 780 Ti                  47400001.90     48150.35     3
GeForce GTX 680                     31636512.97     36731.70     3
GeForce GTX 750                     20060275.53     9818.73      3
GeForce GTX 760                     25392138.50     28089.31     3
GeForce GTX 1070 with Max-Q design  74340055.80     107143.67    3

(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

LuxMark

OpenCL Device: GPU - Scene: Hotel

LuxMark 3.0. Score, more is better.

GPU                                 Result    SE      N
GeForce GTX 950                     769       0.00    3
GeForce GTX 980 Ti                  1855      0.33    3
GeForce GTX 970                     1346      0.00    3
GeForce GTX 980                     1492      1.20    3
GeForce GTX 960                     897       0.67    3
GeForce GTX TITAN X                 1906      0.33    3
GeForce GTX 780 Ti                  992       0.00    3
GeForce GTX 680                     577       2.00    3
GeForce GTX 760                     463       0.33    3

LuxMark

OpenCL Device: GPU - Scene: Microphone

LuxMark 3.0. Score, more is better.

GPU                                 Result    SE       N
GeForce GTX 950                     2423      4.26     3
GeForce GTX 980 Ti                  6268      18.50    3
GeForce GTX 970                     4458      7.64     3
GeForce GTX 980                     4776      0.67     3
GeForce GTX 960                     2460      1.15     3
GeForce GTX TITAN X                 6360      3.00     3
GeForce GTX 780 Ti                  4302      12.00    3
GeForce GTX 680                     2127      3.06     3
GeForce GTX 760                     1941      0.67     3

LuxMark

OpenCL Device: GPU - Scene: Luxball HDR

LuxMark 3.0. Score, more is better.

GPU                                 Result    SE       N
GeForce GTX 950                     5313      16.67    3
GeForce GTX 980 Ti                  13802     44.35    3
GeForce GTX 970                     9737      24.85    3
GeForce GTX 980                     10713     1.20     3
GeForce GTX 960                     5474      0.88     3
GeForce GTX TITAN X                 14081     4.70     3
GeForce GTX 780 Ti                  9639      35.97    3
GeForce GTX 680                     4554      12.17    3
GeForce GTX 750                     3491      11.67    3
GeForce GTX 760                     4253      1.45     3


Phoronix Test Suite v10.8.4