CUDA vs. OpenCL NVIDIA Pascal GPU Computing

Tests by Michael Larabel.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1606129-HA-CUDAVSOPE49

Result identifier: GeForce GTX 1080
Date: June 12 2016
 

System Details (OpenBenchmarking.org, Phoronix Test Suite):
  Processor: Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores)
  Motherboard: MSI C236A WORKSTATION (MS-7998) v1.0
  Chipset: Intel Sky Lake
  Memory: 16384MB
  Disk: Samsung SSD 950 PRO 256GB
  Graphics: Device 8187MB (1603/5005MHz)
  Audio: Realtek ALC1150
  Network: Intel Connection
  OS: Ubuntu 16.04
  Kernel: 4.4.0-22-generic (x86_64)
  Desktop: Unity 7.4.0
  Display Driver: NVIDIA 367.18
  OpenGL: 4.5.0
  Vulkan: 1.0.8
  Compiler: GCC 5.3.1 20160413 + CUDA 8.0
  File-System: ext4
  Screen Resolution: 3840x2160

System Logs:
  - --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-tune=generic -v
  - Scaling Governor: intel_pstate powersave
  - GPU Compute Cores: 2560

Results Overview: GeForce GTX 1080 (OpenBenchmarking.org)

  Test                           Unit      CUDA      OpenCL
  shoc: Triad                    GB/s      14.81     11.93
  shoc: FFT SP                   GFLOPS    462.20    324.56
  shoc: MD5 Hash                 GHash/s   11.97     11.84
  shoc: Max SP Flops             GFLOPS    9366.80   9322.01
  shoc: Bus Speed Download       GB/s      12.51     12.53
  shoc: Bus Speed Readback       GB/s      13.22     13.22
  shoc: Texture Read Bandwidth   GB/s      525.64    518.08
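To make the gap between the two APIs easier to read, the averages above can be turned into relative differences. The following is a minimal Python sketch, not part of the original result file; the `results` dictionary and the helper name `cuda_advantage_percent` are illustrative assumptions, and the numbers are copied from the overview figures above. All seven tests are "more is better", so a positive value means CUDA was faster.

```python
# Averages copied from the results overview: (CUDA, OpenCL).
# The dictionary layout and function name are illustrative, not from the source.
results = {
    "Triad (GB/s)": (14.81, 11.93),
    "FFT SP (GFLOPS)": (462.20, 324.56),
    "MD5 Hash (GHash/s)": (11.97, 11.84),
    "Max SP Flops (GFLOPS)": (9366.80, 9322.01),
    "Bus Speed Download (GB/s)": (12.51, 12.53),
    "Bus Speed Readback (GB/s)": (13.22, 13.22),
    "Texture Read Bandwidth (GB/s)": (525.64, 518.08),
}


def cuda_advantage_percent(cuda, opencl):
    """Percent by which the CUDA result exceeds the OpenCL result."""
    return (cuda - opencl) / opencl * 100.0


for name, (cuda, opencl) in results.items():
    print(f"{name:32s} {cuda_advantage_percent(cuda, opencl):+6.1f}%")
```

With these inputs, Triad and FFT SP are the only tests where the difference is large (roughly +24% and +42% in favor of CUDA); the remaining five are within about 1.5% of each other.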

SHOC Scalable HeterOgeneous Computing

Benchmark: Triad (GB/s, More Is Better)
Test profile: SHOC Scalable HeterOgeneous Computing 2015-11-10
  CUDA:   14.81 (SE +/- 0.05, N = 3; Min: 14.72 / Avg: 14.81 / Max: 14.89)
  OpenCL: 11.93 (SE +/- 0.01, N = 3; Min: 11.92 / Avg: 11.93 / Max: 11.95)
  1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

Benchmark: FFT SP (GFLOPS, More Is Better)
  CUDA:   462.20 (SE +/- 2.48, N = 3; Min: 457.9 / Avg: 462.2 / Max: 466.49)
  OpenCL: 324.56 (SE +/- 3.01, N = 3; Min: 318.89 / Avg: 324.56 / Max: 329.14)

Benchmark: MD5 Hash (GHash/s, More Is Better)
  CUDA:   11.97 (SE +/- 0.01, N = 3; Min: 11.97 / Avg: 11.97 / Max: 11.98)
  OpenCL: 11.84 (SE +/- 0.00, N = 3; Min: 11.84 / Avg: 11.84 / Max: 11.85)

Benchmark: Max SP Flops (GFLOPS, More Is Better)
  CUDA:   9366.80 (SE +/- 74.92, N = 3; Min: 9289.18 / Avg: 9366.8 / Max: 9516.61)
  OpenCL: 9322.01 (SE +/- 2.31, N = 3; Min: 9317.39 / Avg: 9322.01 / Max: 9324.32)
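The "SE +/-" figures can be sanity-checked from the Min / Avg / Max values, since each result has N = 3 runs. The sketch below is an illustration under two assumptions that the result file does not state: that SE means the sample standard deviation (n - 1 denominator) divided by sqrt(N), and that the unreported middle run can be recovered as 3 * Avg - Min - Max. The function names are hypothetical. Because the reported average is rounded, the recovered run is only approximate.

```python
import math


def middle_run(minimum, average, maximum):
    """Recover the third of three runs from their min, average, and max."""
    return 3 * average - minimum - maximum


def standard_error(samples):
    """Sample standard deviation (n - 1 denominator) divided by sqrt(n)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return math.sqrt(variance / n)


# Max SP Flops, CUDA: Min 9289.18 / Avg 9366.8 / Max 9516.61 (from the result above).
cuda_runs = [9289.18, middle_run(9289.18, 9366.8, 9516.61), 9516.61]
print(f"SE = {standard_error(cuda_runs):.2f}")
```

For the CUDA Max SP Flops run this evaluates to about 74.92, which agrees with the reported SE. That agreement is consistent with the assumed definition but does not confirm how the Phoronix Test Suite computes the value internally.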

Benchmark: Bus Speed Download (GB/s, More Is Better)
  CUDA:   12.51 (SE +/- 0.01, N = 3; Min: 12.5 / Avg: 12.51 / Max: 12.52)
  OpenCL: 12.53 (SE +/- 0.00, N = 3; Min: 12.52 / Avg: 12.53 / Max: 12.53)

Benchmark: Bus Speed Readback (GB/s, More Is Better)
  CUDA:   13.22 (SE +/- 0.00, N = 3; Min: 13.21 / Avg: 13.22 / Max: 13.22)
  OpenCL: 13.22 (SE +/- 0.00, N = 3; Min: 13.22 / Avg: 13.22 / Max: 13.22)

Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
  CUDA:   525.64 (SE +/- 1.00, N = 3; Min: 524.62 / Avg: 525.64 / Max: 527.64)
  OpenCL: 518.08 (SE +/- 1.01, N = 3; Min: 516.24 / Avg: 518.08 / Max: 519.72)

7 Results Shown

SHOC Scalable HeterOgeneous Computing:
  Triad
  FFT SP
  MD5 Hash
  Max SP Flops
  Bus Speed Download
  Bus Speed Readback
  Texture Read Bandwidth