CUDA 2016 NVIDIA Linux Ubuntu

NVIDIA CUDA Linux 2016 compute benchmarks by Michael Larabel for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:

    phoronix-test-suite benchmark 1704030-RI-1612261TA50

Result Identifier            Date Run
GeForce GTX 650              December 26 2016
GeForce GTX 680              December 26 2016
GeForce GTX 750              December 26 2016
GeForce GTX 760              December 26 2016
GeForce GTX 780 Ti           December 25 2016
GeForce GTX 950              December 25 2016
GeForce GTX 960              December 25 2016
GeForce GTX 970              December 25 2016
GeForce GTX 980              December 25 2016
GeForce GTX 980 Ti           December 26 2016
GeForce GTX 1050             December 25 2016
GeForce GTX 1050 Ti          December 25 2016
GeForce GTX 1060             December 26 2016
GeForce GTX 1070             December 25 2016
GeForce GTX 1080             December 25 2016
NVIDIA GeForce GTX 560 Ti    April 03 2017
GeForce GTX 560 Ti           April 03 2017

CUDA 2016 NVIDIA Linux Ubuntu: System Information

GeForce GTX 650 through GeForce GTX 1080 runs:
  Processor:          Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores)
  Motherboard:        MSI C236A WORKSTATION (MS-7998) v1.0
  Chipset:            Intel Sky Lake
  Memory:             16384MB
  Disk:               256GB TOSHIBA-RD400
  Audio:              Realtek ALC1150
  Network:            Intel Connection
  OS:                 Ubuntu 16.04
  Kernel:             4.4.0-57-generic (x86_64)
  Desktop:            Unity 7.4.0
  Display Server:     X Server 1.18.4
  Display Driver:     NVIDIA 375.26
  OpenGL:             4.5.0
  Vulkan:             1.0.24
  Compiler:           GCC 5.4.0 20160609 + CUDA 8.0
  File-System:        ext4
  Screen Resolution:  3840x2160

Values listed for the NVIDIA GeForce GTX 560 Ti and GeForce GTX 560 Ti runs (other fields are not repeated in the source):
  Processor:          Intel Core i5-2500K @ 3.70GHz (4 Cores)
  Motherboard:        Gigabyte Z68XP-UD3
  Chipset:            Intel 2nd Generation Core Family DRAM
  Disk:               256GB Samsung SSD 840 + 3001GB Seagate ST3000DM001-1ER1
  Graphics:           eVGA NVIDIA GeForce GTX 560 Ti 1024MB (850/2052MHz)
  Audio:              Realtek ALC889
  Network:            Realtek RTL8111/8168/8411
  Kernel:             4.8.0-45-generic (x86_64)
  Desktop:            KDE Frameworks 5
  Display Driver:     NVIDIA 375.39
  Screen Resolution:  3840x1200

Graphics and GPU compute cores (OpenCL Details), per result:

Result                       Graphics                                                 GPU Compute Cores
GeForce GTX 650              MSI NVIDIA GeForce GTX 650 1024MB (1084/2500MHz)         384
GeForce GTX 680              NVIDIA GeForce GTX 680 2048MB (1006/3004MHz)             1536
GeForce GTX 750              eVGA NVIDIA GeForce GTX 750 1024MB (1019/2505MHz)        512
GeForce GTX 760              NVIDIA GeForce GTX 760 2048MB (1124/3004MHz)             1152
GeForce GTX 780 Ti           NVIDIA GeForce GTX 780 Ti 3072MB (875/3500MHz)           2880
GeForce GTX 950              eVGA NVIDIA GeForce GTX 950 2048MB (1202/3304MHz)        768
GeForce GTX 960              eVGA NVIDIA GeForce GTX 960 2048MB (1277/3505MHz)        1024
GeForce GTX 970              eVGA NVIDIA GeForce GTX 970 4096MB (1163/3505MHz)        1664
GeForce GTX 980              NVIDIA GeForce GTX 980 4096MB (1126/3505MHz)             2048
GeForce GTX 980 Ti           NVIDIA GeForce GTX 980 Ti 6144MB (999/3505MHz)           2816
GeForce GTX 1050             Zotac NVIDIA GeForce GTX 1050 2048MB (1681/3504MHz)      640
GeForce GTX 1050 Ti          eVGA NVIDIA GeForce GTX 1050 Ti 4096MB (1341/3504MHz)    768
GeForce GTX 1060             NVIDIA GeForce GTX 1060 6GB 6144MB (557/4006MHz)         1280
GeForce GTX 1070             NVIDIA GeForce GTX 1070 8192MB (1069/4006MHz)            1920
GeForce GTX 1080             NVIDIA GeForce GTX 1080 8192MB (1538/5005MHz)            2560
NVIDIA GeForce GTX 560 Ti    eVGA NVIDIA GeForce GTX 560 Ti 1024MB (850/2052MHz)      384
GeForce GTX 560 Ti           eVGA NVIDIA GeForce GTX 560 Ti 1024MB (850/2052MHz)      384

Compiler Details:
  --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror
  --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo
  --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++
  --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch
  --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared
  --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64
  --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new
  --with-multilib-list=m32,m64,mx32 --with-tune=generic -v

Processor Details:
  Scaling Governor: intel_pstate powersave

CUDA 2016 NVIDIA Linux Ubuntu: Result Overview

Columns: shoc CUDA (FFT SP, MD5 Hash, Max SP Flops, Texture Read Bandwidth); askap (Gridding, Degridding); cuda-mini-nbody (Original); caffe (CUDA AlexNet, CUDA Googlenet). A "-" marks a test with no result for that card.

Result                     FFT SP  MD5 Hash  Max SP Flops  Texture Read  Gridding  Degridding  Mini-Nbody  AlexNet   Googlenet
GeForce GTX 650            -       -         -             -             -         -           -           -         -
GeForce GTX 680            -       -         -             -             -         -           -           52573.77  133624
GeForce GTX 750            116.13  1.28      1160.99       160.58        -         -           182.08      -         -
GeForce GTX 760            -       -         -             -             -         -           -           66411.83  164009
GeForce GTX 780 Ti         -       -         -             -             -         -           61.44       28595.10  65105.93
GeForce GTX 950            178.56  2.69      2210.77       364.33        3399.14   5625.68     104.15      30595.47  68771.30
GeForce GTX 960            194.10  3.83      2941.88       379.43        3132.42   5325.12     82.70       27360.73  59805.47
GeForce GTX 970            266.73  5.44      4320.28       349.90        5255.51   9399.84     52.51       17005.67  40125.47
GeForce GTX 980            292.36  6.47      5002.78       335.27        6006.45   10798.13    46.60       14977.20  35955.47
GeForce GTX 980 Ti         308.49  7.73      6145.80       349.09        8320.50   17010.80    36.06       11722.10  31440.40
GeForce GTX 1050           171.27  2.49      2109.11       433.21        3698.00   5873.92     115.24      30845.03  69616.53
GeForce GTX 1050 Ti        199.71  3.03      2688.23       453.15        3715.36   5961.62     101.85      26985.90  60253.57
GeForce GTX 1060           304.62  5.64      4765.98       503.11        5625.68   9861.33     58.42       16266.23  37468.77
GeForce GTX 1070           377.27  8.40      7096.44       501.08        7607.31   13312.80    39.70       11451.90  27658.17
GeForce GTX 1080           462.60  11.90     9385.11       526.21        8236.45   14273.00    33.06       9738.65   24039.57
NVIDIA GeForce GTX 560 Ti  -       -         -             -             3582.06   4249.72     -           -         -
GeForce GTX 560 Ti         -       -         -             -             3550.50   4204.28     -           -         -

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: FFT SP
GFLOPS, More Is Better (N = 3 for every result)

Result                     Avg        SE +/-   Min        Max
GeForce GTX 750            116.13     0.20     115.75     116.4
GeForce GTX 950            178.56     0.50     177.55     179.11
GeForce GTX 960            194.10     1.44     192.2      196.93
GeForce GTX 970            266.73     1.20     265.17     269.09
GeForce GTX 980            292.36     0.74     291.5      293.84
GeForce GTX 980 Ti         308.49     0.35     307.84     309.02
GeForce GTX 1050           171.27     0.73     169.97     172.5
GeForce GTX 1050 Ti        199.71     1.12     198.41     201.95
GeForce GTX 1060           304.62     0.78     303.3      306.01
GeForce GTX 1070           377.27     2.67     373.02     382.21
GeForce GTX 1080           462.60     1.33     460.3      464.9

1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft
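
The SE +/- figures and the Min / Avg / Max figures reported for each result can be cross-checked against each other. The sketch below assumes that the reported SE is the sample standard deviation divided by the square root of N; the source does not state the formula, so treat that as an assumption. With N = 3, the one run that is not the minimum or maximum follows from the average. The helper names `third_sample` and `standard_error` are illustrative only and are not part of the Phoronix Test Suite.

```python
import math
import statistics


def third_sample(minimum, average, maximum):
    """Recover the remaining run of three from the min, mean, and max."""
    return 3 * average - minimum - maximum


def standard_error(samples):
    """Sample standard deviation (n - 1 denominator) divided by sqrt(n).

    Assumption: this is the formula behind the reported "SE +/-" values.
    """
    return statistics.stdev(samples) / math.sqrt(len(samples))


# GeForce GTX 750, SHOC FFT SP: Min 115.75 / Avg 116.13 / Max 116.4, N = 3
lo, avg, hi = 115.75, 116.13, 116.4
runs = [lo, third_sample(lo, avg, hi), hi]
se = standard_error(runs)

print("reconstructed runs:", [round(r, 2) for r in runs])
print(f"standard error: {se:.2f}")
```

For this card the result rounds to 0.20, which agrees with the reported SE +/- 0.20. Because the published average is itself rounded to two decimals, small differences in the last digit are possible for other cards.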

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: MD5 Hash
GHash/s, More Is Better (N = 3 for every result)

Result                     Avg        SE +/-   Min        Max
GeForce GTX 750            1.28       0.00     1.28       1.28
GeForce GTX 950            2.69       0.00     2.68       2.69
GeForce GTX 960            3.83       0.00     3.83       3.84
GeForce GTX 970            5.44       0.00     5.44       5.44
GeForce GTX 980            6.47       0.00     6.47       6.47
GeForce GTX 980 Ti         7.73       0.00     7.72       7.73
GeForce GTX 1050           2.49       0.00     2.48       2.49
GeForce GTX 1050 Ti        3.03       0.00     3.03       3.03
GeForce GTX 1060           5.64       0.00     5.64       5.64
GeForce GTX 1070           8.40       0.00     8.39       8.4
GeForce GTX 1080           11.90      0.01     11.88      11.91

1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Max SP Flops
GFLOPS, More Is Better (N = 3 for every result)

Result                     Avg        SE +/-   Min        Max
GeForce GTX 750            1160.99    0.08     1160.9     1161.15
GeForce GTX 950            2210.77    6.61     2204.08    2224
GeForce GTX 960            2941.88    8.34     2933.45    2958.57
GeForce GTX 970            4320.28    3.47     4316.73    4327.23
GeForce GTX 980            5002.78    9.75     4989.69    5021.84
GeForce GTX 980 Ti         6145.80    20.77    6122.8     6187.26
GeForce GTX 1050           2109.11    0.30     2108.52    2109.41
GeForce GTX 1050 Ti        2688.23    5.43     2682.55    2699.08
GeForce GTX 1060           4765.98    21.65    4743.88    4809.29
GeForce GTX 1070           7096.44    50.45    7045.56    7197.34
GeForce GTX 1080           9385.11    64.36    9318.07    9513.8

1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Texture Read Bandwidth
GB/s, More Is Better (N = 3 for every result)

Result                     Avg        SE +/-   Min        Max
GeForce GTX 750            160.58     0.46     160.08     161.51
GeForce GTX 950            364.33     0.34     363.76     364.93
GeForce GTX 960            379.43     0.09     379.35     379.61
GeForce GTX 970            349.90     0.05     349.84     349.99
GeForce GTX 980            335.27     0.52     334.68     336.31
GeForce GTX 980 Ti         349.09     0.22     348.65     349.32
GeForce GTX 1050           433.21     1.01     431.22     434.48
GeForce GTX 1050 Ti        453.15     1.17     451.15     455.21
GeForce GTX 1060           503.11     0.09     502.94     503.26
GeForce GTX 1070           501.08     1.65     499.35     504.38
GeForce GTX 1080           526.21     1.23     524.01     528.26

1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

ASKAP tConvolveCuda

This is a CUDA benchmark of ATNF's ASKAP Benchmark, currently using the tConvolveCuda sub-test. Learn more via the OpenBenchmarking.org test page.

ASKAP tConvolveCuda 2015-11-10
Processing: Gridding
Million Grid Points Per Second, More Is Better (N = 3 for every result)

Result                     Avg        SE +/-   Min        Max
GeForce GTX 950            3399.14    14.40    3370.33    3413.54
GeForce GTX 960            3132.42    0.00     3132.42    3132.42
GeForce GTX 970            5255.51    34.80    5220.71    5325.12
GeForce GTX 980            6006.45    44.82    5916.8     6051.27
GeForce GTX 980 Ti         8320.50    0.00     8320.5     8320.5
GeForce GTX 1050           3698.00    0.00     3698       3698
GeForce GTX 1050 Ti        3715.36    17.36    3698       3750.08
GeForce GTX 1060           5625.68    39.34    5547       5665.02
GeForce GTX 1070           7607.31    0.00     7607.31    7607.31
GeForce GTX 1080           8236.45    84.05    8068.36    8320.5
NVIDIA GeForce GTX 560 Ti  3582.06    15.99    3550.08    3598.05
GeForce GTX 560 Ti         3550.50    27.33    3503.37    3598.05

1. (CXX) g++ options: -fPIC -O3 -m64 -lcudadevrt -lcudart_static -lrt -lpthread -ldl

ASKAP tConvolveCuda 2015-11-10
Processing: Degridding
Million Grid Points Per Second, More Is Better (N = 3 for every result)

Result                     Avg        SE +/-   Min        Max
GeForce GTX 950            5625.68    39.34    5547       5665.02
GeForce GTX 960            5325.12    0.00     5325.12    5325.12
GeForce GTX 970            9399.84    109.30   9181.24    9509.14
GeForce GTX 980            10798.13   147.93   10650.2    11094
GeForce GTX 980 Ti         17010.80   369.80   16641      17750.4
GeForce GTX 1050           5873.92    42.88    5788.17    5916.8
GeForce GTX 1050 Ti        5961.62    44.82    5916.8     6051.27
GeForce GTX 1060           9861.33    0.00     9861.33    9861.33
GeForce GTX 1070           13312.80   0.00     13312.8    13312.8
GeForce GTX 1080           14273.00   259.50   14013.5    14792
NVIDIA GeForce GTX 560 Ti  4249.72    44.73    4160.25    4294.45
GeForce GTX 560 Ti         4204.28    22.01    4160.25    4226.29

1. (CXX) g++ options: -fPIC -O3 -m64 -lcudadevrt -lcudart_static -lrt -lpthread -ldl

CUDA Mini-Nbody

CUDA Mini-Nbody 2015-11-10
Test: Original
Seconds, Fewer Is Better (N = 3 for every result)

Result                     Avg        SE +/-   Min        Max
GeForce GTX 750            182.08     0.15     181.92     182.37
GeForce GTX 780 Ti         61.44      0.16     61.23      61.75
GeForce GTX 950            104.15     0.40     103.72     104.95
GeForce GTX 960            82.70      0.31     82.38      83.32
GeForce GTX 970            52.51      0.14     52.26      52.73
GeForce GTX 980            46.60      0.23     46.14      46.83
GeForce GTX 980 Ti         36.06      0.48     35.28      36.94
GeForce GTX 1050           115.24     0.09     115.11     115.42
GeForce GTX 1050 Ti        101.85     0.14     101.69     102.13
GeForce GTX 1060           58.42      0.11     58.24      58.61
GeForce GTX 1070           39.70      0.05     39.65      39.8
GeForce GTX 1080           33.06      0.09     32.91      33.21

Caffe AlexNet

Caffe AlexNet 2016-06-11
Build: CUDA AlexNet
Milli-Seconds, Fewer Is Better (N = 3 for every result)

Result                     Avg        SE +/-   Min        Max
GeForce GTX 680            52573.77   76.71    52477.7    52725.4
GeForce GTX 760            66411.83   12.08    66396.6    66435.7
GeForce GTX 780 Ti         28595.10   3.70     28588.1    28600.7
GeForce GTX 950            30595.47   34.72    30526.1    30633
GeForce GTX 960            27360.73   5.41     27351.8    27370.5
GeForce GTX 970            17005.67   8.15     16995.1    17021.7
GeForce GTX 980            14977.20   2.22     14972.8    14979.9
GeForce GTX 980 Ti         11722.10   12.80    11708.9    11747.7
GeForce GTX 1050           30845.03   2.73     30842.2    30850.5
GeForce GTX 1050 Ti        26985.90   10.92    26964.5    27000.4
GeForce GTX 1060           16266.23   7.27     16251.7    16273.8
GeForce GTX 1070           11451.90   2.23     11447.5    11454.7
GeForce GTX 1080           9738.65    11.97    9714.87    9753.02

1. (CXX) g++ options: -pthread -fPIC -O2 -lcaffe -lglog -lgflags -lprotobuf -lboost_system -lboost_filesystem -lm -lhdf5_hl -lhdf5 -lleveldb -lsnappy -llmdb -lopencv_core -lopencv_highgui -lopencv_imgproc -lboost_thread -lstdc++ -lcblas -latlas

Caffe AlexNet 2016-06-11
Build: CUDA Googlenet
Milli-Seconds, Fewer Is Better (N = 3 for every result)

Result                     Avg        SE +/-   Min        Max
GeForce GTX 680            133624.00  31.51    133591     133687
GeForce GTX 760            164009.33  93.86    163912     164197
GeForce GTX 780 Ti         65105.93   20.21    65073.9    65143.3
GeForce GTX 950            68771.30   9.66     68757      68789.7
GeForce GTX 960            59805.47   13.56    59781.3    59828.2
GeForce GTX 970            40125.47   15.33    40103.3    40154.9
GeForce GTX 980            35955.47   77.16    35860      36108.2
GeForce GTX 980 Ti         31440.40   95.74    31294.4    31620.7
GeForce GTX 1050           69616.53   7.85     69601.6    69628.2
GeForce GTX 1050 Ti        60253.57   33.10    60188      60294.3
GeForce GTX 1060           37468.77   4.75     37462.6    37478.1
GeForce GTX 1070           27658.17   66.90    27527.8    27749.4
GeForce GTX 1080           24039.57   5.57     24028.7    24047.1

1. (CXX) g++ options: -pthread -fPIC -O2 -lcaffe -lglog -lgflags -lprotobuf -lboost_system -lboost_filesystem -lm -lhdf5_hl -lhdf5 -lleveldb -lsnappy -llmdb -lopencv_core -lopencv_highgui -lopencv_imgproc -lboost_thread -lstdc++ -lcblas -latlas

9 Results Shown

SHOC Scalable HeterOgeneous Computing:
  CUDA - FFT SP
  CUDA - MD5 Hash
  CUDA - Max SP Flops
  CUDA - Texture Read Bandwidth
ASKAP tConvolveCuda:
  Gridding
  Degridding
CUDA Mini-Nbody
Caffe AlexNet:
  CUDA AlexNet
  CUDA Googlenet