cuda-testing

Intel Xeon E3-1280 v5 testing with a MSI C236A WORKSTATION (MS-7998) v1.0 and eVGA NVIDIA GeForce GTX 960 2043MB on Ubuntu 16.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1606119-PTS-CUDATEST14

Result identifiers and run dates:
- GeForce GTX 1080: June 11 2016
- GeForce GTX 980: June 11 2016
- GeForce GTX 960: June 11 2016


HTML result view exported from: https://openbenchmarking.org/result/1606119-PTS-CUDATEST14&grw&rdt.

System details (shared by all three runs except for Graphics):

Processor: Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores)
Motherboard: MSI C236A WORKSTATION (MS-7998) v1.0
Chipset: Intel Sky Lake
Memory: 16384MB
Disk: Samsung SSD 950 PRO 256GB
Graphics (GeForce GTX 1080 run): GeForce GTX 1080 8187MB (909/5005MHz)
Graphics (GeForce GTX 980 run): NVIDIA GeForce GTX 980 4091MB (1126/3505MHz)
Graphics (GeForce GTX 960 run): eVGA NVIDIA GeForce GTX 960 2043MB (1277/3505MHz)
Audio: Realtek ALC1150
Network: Intel Connection
OS: Ubuntu 16.04
Kernel: 4.4.0-22-generic (x86_64)
Desktop: Unity 7.4.0
Display Driver: NVIDIA 367.18
OpenGL: 4.5.0
Vulkan: 1.0.8
Compiler: GCC 5.3.1 20160413 + CUDA 8.0
File-System: ext4
Screen Resolution: 3840x2160

Compiler Details
- --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-tune=generic -v

Processor Details
- GeForce GTX 1080: Scaling Governor: intel_pstate powersave
- GeForce GTX 980: Scaling Governor: intel_pstate performance
- GeForce GTX 960: Scaling Governor: intel_pstate powersave

OpenCL Details
- GeForce GTX 1080: GPU Compute Cores: 2560
- GeForce GTX 980: GPU Compute Cores: 2048
- GeForce GTX 960: GPU Compute Cores: 1024

System Details
- GeForce GTX 1080: GPU Compute Cores: 2560.
- GeForce GTX 980: GPU Compute Cores: 2048.
- GeForce GTX 960: GPU Compute Cores: 1024.

Results overview (units are given in the per-test sections that follow):

Test                                        GTX 1080   GTX 980   GTX 960    CPU Only
cuda-mini-nbody: SOA Data Layout               28.58     51.02     81.27
cuda-mini-nbody: Flush Denormals To Zero       28.58     50.44     81.19
cuda-mini-nbody: Cache Blocking                14.02     24.91     36.30
cuda-mini-nbody: Loop Unrolling                14.52     24.63     35.71
caffe: CUDA                                  8959.77  15504.53  28134.07     1787207
cuda-mini-nbody: Original                      30.51     46.51     82.29
shoc: CUDA - Triad                             14.86     14.74     14.36
shoc: CUDA - FFT SP                           461.28    292.78    189.14
shoc: CUDA - MD5 Hash                          11.98      6.53      3.88
shoc: CUDA - Max SP Flops                    9397.41   4999.85   2944.94
shoc: CUDA - Bus Speed Download                12.53     12.53     12.53
shoc: CUDA - Bus Speed Readback                13.22     13.22     13.21
shoc: CUDA - Texture Read Bandwidth           528.41    332.16    381.05

CPU Only = Xeon E3-1280 v5 - CPU Only (recorded for the Caffe test only).
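One way to condense the overview into a single figure per card is a geometric mean of per-test speedups relative to the GeForce GTX 960. The sketch below is only an illustration: the helper names (`speedup`, `geometric_mean`) are made up for this example, the numbers are copied from the results above, and the direction of each test (fewer or more is better) is taken from the per-test sections. It is not necessarily how the Phoronix Test Suite computes its own geometric means.

```python
from math import prod

# (test, higher_is_better, GTX 1080, GTX 980, GTX 960); values copied from the results
RESULTS = [
    ("cuda-mini-nbody: SOA Data Layout",         False, 28.58,   51.02,    81.27),
    ("cuda-mini-nbody: Flush Denormals To Zero", False, 28.58,   50.44,    81.19),
    ("cuda-mini-nbody: Cache Blocking",          False, 14.02,   24.91,    36.30),
    ("cuda-mini-nbody: Loop Unrolling",          False, 14.52,   24.63,    35.71),
    ("caffe: CUDA",                              False, 8959.77, 15504.53, 28134.07),
    ("cuda-mini-nbody: Original",                False, 30.51,   46.51,    82.29),
    ("shoc: CUDA - Triad",                       True,  14.86,   14.74,    14.36),
    ("shoc: CUDA - FFT SP",                      True,  461.28,  292.78,   189.14),
    ("shoc: CUDA - MD5 Hash",                    True,  11.98,   6.53,     3.88),
    ("shoc: CUDA - Max SP Flops",                True,  9397.41, 4999.85,  2944.94),
    ("shoc: CUDA - Bus Speed Download",          True,  12.53,   12.53,    12.53),
    ("shoc: CUDA - Bus Speed Readback",          True,  13.22,   13.22,    13.21),
    ("shoc: CUDA - Texture Read Bandwidth",      True,  528.41,  332.16,   381.05),
]


def speedup(value, baseline, higher_is_better):
    """Return how many times faster `value` is than `baseline`."""
    # For time-like metrics a smaller number is faster, so the ratio is inverted.
    return value / baseline if higher_is_better else baseline / value


def geometric_mean(values):
    """Geometric mean of a non-empty list of positive numbers."""
    return prod(values) ** (1.0 / len(values))


# Per-test speedups over the GTX 960, then one geometric mean per card.
gm_1080 = geometric_mean([speedup(r[2], r[4], r[1]) for r in RESULTS])
gm_980 = geometric_mean([speedup(r[3], r[4], r[1]) for r in RESULTS])

print(f"GeForce GTX 1080 vs GTX 960: {gm_1080:.2f}x (geometric mean)")
print(f"GeForce GTX 980  vs GTX 960: {gm_980:.2f}x (geometric mean)")
```

The geometric mean is used rather than the arithmetic mean so that a single test with a very large ratio does not dominate the summary. Note that the two bus speed tests are essentially identical on all three cards and pull the mean toward 1.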

CUDA Mini-Nbody

Test: SOA Data Layout

CUDA Mini-Nbody 2015-11-10, Seconds (fewer is better)
- GeForce GTX 1080: 28.58 (SE +/- 0.05, N = 3)
- GeForce GTX 980: 51.02 (SE +/- 0.13, N = 3)
- GeForce GTX 960: 81.27 (SE +/- 0.10, N = 3)

CUDA Mini-Nbody

Test: Flush Denormals To Zero

CUDA Mini-Nbody 2015-11-10, Seconds (fewer is better)
- GeForce GTX 1080: 28.58 (SE +/- 0.06, N = 3)
- GeForce GTX 980: 50.44 (SE +/- 0.22, N = 3)
- GeForce GTX 960: 81.19 (SE +/- 0.07, N = 3)

CUDA Mini-Nbody

Test: Cache Blocking

CUDA Mini-Nbody 2015-11-10, Seconds (fewer is better)
- GeForce GTX 1080: 14.02 (SE +/- 0.01, N = 3)
- GeForce GTX 980: 24.91 (SE +/- 0.16, N = 3)
- GeForce GTX 960: 36.30 (SE +/- 0.01, N = 3)

CUDA Mini-Nbody

Test: Loop Unrolling

CUDA Mini-Nbody 2015-11-10, Seconds (fewer is better)
- GeForce GTX 1080: 14.52 (SE +/- 0.02, N = 3)
- GeForce GTX 980: 24.63 (SE +/- 0.20, N = 3)
- GeForce GTX 960: 35.71 (SE +/- 0.03, N = 3)

Caffe AlexNet

Build: CUDA

Caffe AlexNet 2016-06-11, Milli-Seconds (fewer is better)
- Xeon E3-1280 v5 - CPU Only: 1787207.00 (SE +/- 4001.26, N = 3)
- GeForce GTX 1080: 8959.77 (SE +/- 3.43, N = 3)
- GeForce GTX 980: 15504.53 (SE +/- 17.87, N = 3)
- GeForce GTX 960: 28134.07 (SE +/- 2.72, N = 3)
1. (CXX) g++ options: -pthread -fPIC -O2 -lcaffe -lglog -lgflags -lprotobuf -lboost_system -lboost_filesystem -lm -lhdf5_hl -lhdf5 -lleveldb -lsnappy -llmdb -lopencv_core -lopencv_highgui -lopencv_imgproc -lboost_thread -lstdc++ -lcblas -latlas
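Because the Caffe AlexNet test is the only one with a CPU-only run, the gap between CPU and GPU can be expressed as a simple ratio of run times. The following is a minimal sketch using the Caffe figures reported in this result file; the function name `speedup_over_cpu` is chosen here for illustration only.

```python
# Caffe AlexNet run times in milliseconds, copied from this result file.
CPU_ONLY_MS = 1787207.00
GPU_MS = {
    "GeForce GTX 1080": 8959.77,
    "GeForce GTX 980": 15504.53,
    "GeForce GTX 960": 28134.07,
}


def speedup_over_cpu(gpu_ms, cpu_ms=CPU_ONLY_MS):
    """Ratio of CPU-only time to GPU time (fewer milliseconds is better)."""
    return cpu_ms / gpu_ms


for name, ms in GPU_MS.items():
    print(f"{name}: {speedup_over_cpu(ms):.1f}x faster than the CPU-only run")
```

The ratio only compares the recorded wall times; it says nothing about why the CPU path is slow (the CPU build links against ATLAS, per the g++ options listed for this test).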

CUDA Mini-Nbody

Test: Original

CUDA Mini-Nbody 2015-11-10, Seconds (fewer is better)
- GeForce GTX 1080: 30.51 (SE +/- 0.08, N = 3)
- GeForce GTX 980: 46.51 (SE +/- 0.15, N = 3)
- GeForce GTX 960: 82.29 (SE +/- 0.27, N = 3)

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2015-11-10, GB/s (more is better)
- GeForce GTX 1080: 14.86 (SE +/- 0.04, N = 3)
- GeForce GTX 980: 14.74 (SE +/- 0.01, N = 3)
- GeForce GTX 960: 14.36 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2015-11-10, GFLOPS (more is better)
- GeForce GTX 1080: 461.28 (SE +/- 2.81, N = 3)
- GeForce GTX 980: 292.78 (SE +/- 0.60, N = 3)
- GeForce GTX 960: 189.14 (SE +/- 1.12, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2015-11-10, GHash/s (more is better)
- GeForce GTX 1080: 11.98 (SE +/- 0.01, N = 3)
- GeForce GTX 980: 6.53 (SE +/- 0.01, N = 3)
- GeForce GTX 960: 3.88 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2015-11-10, GFLOPS (more is better)
- GeForce GTX 1080: 9397.41 (SE +/- 88.40, N = 3)
- GeForce GTX 980: 4999.85 (SE +/- 11.01, N = 3)
- GeForce GTX 960: 2944.94 (SE +/- 7.67, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft
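The Max SP Flops figures can be put on a per-core basis by dividing by the "GPU Compute Cores" counts that this result file reports for each card. The sketch below does only that arithmetic; the name `gflops_per_core` is illustrative, and the result depends on the clock speed each card actually sustained during the run, which this file does not record.

```python
# Max SP Flops results (GFLOPS) and GPU compute core counts, both from this result file.
MAX_SP_GFLOPS = {
    "GeForce GTX 1080": 9397.41,
    "GeForce GTX 980": 4999.85,
    "GeForce GTX 960": 2944.94,
}
COMPUTE_CORES = {
    "GeForce GTX 1080": 2560,
    "GeForce GTX 980": 2048,
    "GeForce GTX 960": 1024,
}


def gflops_per_core(gflops, cores):
    """Measured single-precision GFLOPS divided by the number of compute cores."""
    return gflops / cores


per_core = {
    name: gflops_per_core(MAX_SP_GFLOPS[name], COMPUTE_CORES[name])
    for name in MAX_SP_GFLOPS
}

for name, value in per_core.items():
    print(f"{name}: {value:.2f} GFLOPS per compute core")
```

A per-core figure separates the effect of core count from the effect of clock speed and architecture, which is why the GTX 960 can land ahead of the GTX 980 on this metric despite its lower total.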

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2015-11-10, GB/s (more is better)
- GeForce GTX 1080: 12.53 (SE +/- 0.00, N = 3)
- GeForce GTX 980: 12.53 (SE +/- 0.00, N = 3)
- GeForce GTX 960: 12.53 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2015-11-10, GB/s (more is better)
- GeForce GTX 1080: 13.22 (SE +/- 0.00, N = 3)
- GeForce GTX 980: 13.22 (SE +/- 0.00, N = 3)
- GeForce GTX 960: 13.21 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2015-11-10, GB/s (more is better)
- GeForce GTX 1080: 528.41 (SE +/- 1.22, N = 3)
- GeForce GTX 980: 332.16 (SE +/- 0.47, N = 3)
- GeForce GTX 960: 381.05 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft


Phoronix Test Suite v10.8.4