OpenCL CUDA GPGPU GeForce GTX 1070

Intel Xeon E3-1280 v5 testing with a MSI C236A WORKSTATION (MS-7998) v1.0 and Device 8187MB on Ubuntu 16.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/1606134-HA-OPENCLCUD74.

OpenCL CUDA GPGPU GeForce GTX 1070ProcessorMotherboardChipsetMemoryDiskGraphicsAudioNetworkOSKernelDesktopDisplay DriverOpenGLVulkanCompilerFile-SystemScreen ResolutionNVIDIA GeForce GTX 1070Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores)MSI C236A WORKSTATION (MS-7998) v1.0Intel Sky Lake16384MBSamsung SSD 950 PRO 256GBDevice 8187MB (1505/4006MHz)Realtek ALC1150Intel ConnectionUbuntu 16.044.4.0-22-generic (x86_64)Unity 7.4.0NVIDIA 367.184.5.01.0.8GCC 5.3.1 20160413 + CUDA 8.0ext43840x2160OpenBenchmarking.org- --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-tune=generic -v - Scaling Governor: intel_pstate powersave- GPU Compute Cores: 1920- GPU Compute Cores: 1920.

OpenCL CUDA GPGPU GeForce GTX 1070financebench: Monte-Carlo OpenCLfinancebench: Black-Scholes OpenCLmixbench: Integermixbench: Double Precisionmixbench: Single Precisionshoc: CUDA - Triadshoc: CUDA - FFT SPshoc: OpenCL - Triadshoc: CUDA - MD5 Hashshoc: OpenCL - FFT SPshoc: OpenCL - MD5 Hashshoc: CUDA - Max SP Flopsshoc: OpenCL - Max SP Flopsshoc: CUDA - Bus Speed Downloadshoc: CUDA - Bus Speed Readbackshoc: OpenCL - Bus Speed Downloadshoc: OpenCL - Bus Speed Readbackshoc: CUDA - Texture Read Bandwidthshoc: OpenCL - Texture Read Bandwidthcuda-mini-nbody: Originalcuda-mini-nbody: Cache Blockingcuda-mini-nbody: Loop Unrollingcuda-mini-nbody: SOA Data Layoutcuda-mini-nbody: Flush Denormals To Zerocaffe: CUDAjuliagpu: GPUsmallpt-gpu: GPU - Complexsmallpt-gpu: GPU - Cornellsmallpt-gpu: GPU - Caustic3luxmark: GPU - Hotelluxmark: GPU - Microphoneluxmark: GPU - Luxball HDRNVIDIA GeForce GTX 1070597.229.572016.00227.456585.7514.76368.5611.738.45289.518.377096.497064.7112.4713.2112.4713.22503.81446.4639.0418.0118.6637.3837.1011704.03142650486.331465855842146585596014658560863028730315716OpenBenchmarking.org

FinanceBench

Benchmark: Monte-Carlo OpenCL

OpenBenchmarking.orgms, Fewer Is BetterFinanceBench 2016-06-06Benchmark: Monte-Carlo OpenCLNVIDIA GeForce GTX 1070130260390520650SE +/- 0.25, N = 3597.221. (CXX) g++ options: -O3 -lOpenCL

FinanceBench

Benchmark: Black-Scholes OpenCL

OpenBenchmarking.orgms, Fewer Is BetterFinanceBench 2016-06-06Benchmark: Black-Scholes OpenCLNVIDIA GeForce GTX 10703691215SE +/- 0.01, N = 39.571. (CXX) g++ options: -O3 -lOpenCL

Mixbench

Benchmark: Integer

OpenBenchmarking.orgGIOPS, More Is BetterMixbench 2016-06-06Benchmark: IntegerNVIDIA GeForce GTX 1070400800120016002000SE +/- 26.07, N = 32016.001. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Benchmark: Double Precision

OpenBenchmarking.orgGFLOPS, More Is BetterMixbench 2016-06-06Benchmark: Double PrecisionNVIDIA GeForce GTX 107050100150200250SE +/- 0.28, N = 3227.451. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Benchmark: Single Precision

OpenBenchmarking.orgGFLOPS, More Is BetterMixbench 2016-06-06Benchmark: Single PrecisionNVIDIA GeForce GTX 107014002800420056007000SE +/- 2.45, N = 36585.751. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Triad

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: CUDA - Benchmark: TriadNVIDIA GeForce GTX 107048121620SE +/- 0.01, N = 314.761. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: FFT SP

OpenBenchmarking.orgGFLOPS, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: CUDA - Benchmark: FFT SPNVIDIA GeForce GTX 107080160240320400SE +/- 1.19, N = 3368.561. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: OpenCL - Benchmark: TriadNVIDIA GeForce GTX 10703691215SE +/- 0.01, N = 311.731. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: MD5 Hash

OpenBenchmarking.orgGHash/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: CUDA - Benchmark: MD5 HashNVIDIA GeForce GTX 1070246810SE +/- 0.00, N = 38.451. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

OpenBenchmarking.orgGFLOPS, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: OpenCL - Benchmark: FFT SPNVIDIA GeForce GTX 107060120180240300SE +/- 0.55, N = 3289.511. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

OpenBenchmarking.orgGHash/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: OpenCL - Benchmark: MD5 HashNVIDIA GeForce GTX 1070246810SE +/- 0.00, N = 38.371. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Max SP Flops

OpenBenchmarking.orgGFLOPS, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: CUDA - Benchmark: Max SP FlopsNVIDIA GeForce GTX 107015003000450060007500SE +/- 52.99, N = 37096.491. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

OpenBenchmarking.orgGFLOPS, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: OpenCL - Benchmark: Max SP FlopsNVIDIA GeForce GTX 107015003000450060007500SE +/- 1.82, N = 37064.711. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Bus Speed Download

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: CUDA - Benchmark: Bus Speed DownloadNVIDIA GeForce GTX 10703691215SE +/- 0.00, N = 312.471. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Bus Speed Readback

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: CUDA - Benchmark: Bus Speed ReadbackNVIDIA GeForce GTX 10703691215SE +/- 0.00, N = 313.211. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: OpenCL - Benchmark: Bus Speed DownloadNVIDIA GeForce GTX 10703691215SE +/- 0.01, N = 312.471. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: OpenCL - Benchmark: Bus Speed ReadbackNVIDIA GeForce GTX 10703691215SE +/- 0.00, N = 313.221. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Texture Read Bandwidth

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: CUDA - Benchmark: Texture Read BandwidthNVIDIA GeForce GTX 1070110220330440550SE +/- 0.81, N = 3503.811. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: OpenCL - Benchmark: Texture Read BandwidthNVIDIA GeForce GTX 1070100200300400500SE +/- 0.42, N = 3446.461. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

CUDA Mini-Nbody

Test: Original

OpenBenchmarking.orgSeconds, Fewer Is BetterCUDA Mini-Nbody 2015-11-10Test: OriginalNVIDIA GeForce GTX 1070918273645SE +/- 0.05, N = 339.04

CUDA Mini-Nbody

Test: Cache Blocking

OpenBenchmarking.orgSeconds, Fewer Is BetterCUDA Mini-Nbody 2015-11-10Test: Cache BlockingNVIDIA GeForce GTX 107048121620SE +/- 0.01, N = 318.01

CUDA Mini-Nbody

Test: Loop Unrolling

OpenBenchmarking.orgSeconds, Fewer Is BetterCUDA Mini-Nbody 2015-11-10Test: Loop UnrollingNVIDIA GeForce GTX 1070510152025SE +/- 0.00, N = 318.66

CUDA Mini-Nbody

Test: SOA Data Layout

OpenBenchmarking.orgSeconds, Fewer Is BetterCUDA Mini-Nbody 2015-11-10Test: SOA Data LayoutNVIDIA GeForce GTX 1070918273645SE +/- 0.03, N = 337.38

CUDA Mini-Nbody

Test: Flush Denormals To Zero

OpenBenchmarking.orgSeconds, Fewer Is BetterCUDA Mini-Nbody 2015-11-10Test: Flush Denormals To ZeroNVIDIA GeForce GTX 1070918273645SE +/- 0.05, N = 337.10

Caffe AlexNet

Build: CUDA

OpenBenchmarking.orgMilli-Seconds, Fewer Is BetterCaffe AlexNet 2016-06-11Build: CUDANVIDIA GeForce GTX 10703K6K9K12K15KSE +/- 7.21, N = 311704.031. (CXX) g++ options: -pthread -fPIC -O2 -lcaffe -lglog -lgflags -lprotobuf -lboost_system -lboost_filesystem -lm -lhdf5_hl -lhdf5 -lleveldb -lsnappy -llmdb -lopencv_core -lopencv_highgui -lopencv_imgproc -lboost_thread -lstdc++ -lcblas -latlas

JuliaGPU

OpenCL Device: GPU

OpenBenchmarking.orgSamples/sec, More Is BetterJuliaGPU 1.2pts1OpenCL Device: GPUNVIDIA GeForce GTX 107030M60M90M120M150MSE +/- 166070.45, N = 3142650486.331. (CC) gcc options: -O3 -march=native -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL -lm

SmallPT GPU

OpenCL Device: GPU - Scene: Complex

OpenBenchmarking.orgSamples/sec, More Is BetterSmallPT GPU 1.6pts1OpenCL Device: GPU - Scene: ComplexNVIDIA GeForce GTX 1070300M600M900M1200M1500MSE +/- 19.34, N = 314658558421. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SmallPT GPU

OpenCL Device: GPU - Scene: Cornell

OpenBenchmarking.orgSamples/sec, More Is BetterSmallPT GPU 1.6pts1OpenCL Device: GPU - Scene: CornellNVIDIA GeForce GTX 1070300M600M900M1200M1500MSE +/- 21.94, N = 314658559601. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SmallPT GPU

OpenCL Device: GPU - Scene: Caustic3

OpenBenchmarking.orgSamples/sec, More Is BetterSmallPT GPU 1.6pts1OpenCL Device: GPU - Scene: Caustic3NVIDIA GeForce GTX 1070300M600M900M1200M1500MSE +/- 23.09, N = 314658560861. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

LuxMark

OpenCL Device: GPU - Scene: Hotel

OpenBenchmarking.orgScore, More Is BetterLuxMark 3.0OpenCL Device: GPU - Scene: HotelNVIDIA GeForce GTX 10706001200180024003000SE +/- 1.20, N = 33028

LuxMark

OpenCL Device: GPU - Scene: Microphone

OpenBenchmarking.orgScore, More Is BetterLuxMark 3.0OpenCL Device: GPU - Scene: MicrophoneNVIDIA GeForce GTX 107016003200480064008000SE +/- 41.33, N = 37303

LuxMark

OpenCL Device: GPU - Scene: Luxball HDR

OpenBenchmarking.orgScore, More Is BetterLuxMark 3.0OpenCL Device: GPU - Scene: Luxball HDRNVIDIA GeForce GTX 10703K6K9K12K15KSE +/- 3.00, N = 315716


Phoronix Test Suite v10.8.4