cuda

2 x Intel Xeon E5-2630 v3 testing with a Supermicro X10DRG-H v1.02 and ASPEED ASPEED Family on Ubuntu 14.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1602024-HA-CUDA0016144

Run Management

Result Identifier                                      Date Run
cuda-mini                                              January 29 2016
mini-nbody                                             February 02 2016
0202                                                   February 02 2016
cuda-test                                              February 02 2016
2 x Intel Xeon E5-2630 v3 - ASPEED ASPEED Family -     February 02 2016

cuda: System Details

Runs: cuda-mini, mini-nbody, 0202, cuda-test, 2 x Intel Xeon E5-2630 v3 - ASPEED ASPEED Family -

Processor: 2 x Intel Xeon E5-2630 v3 @ 2.40GHz (32 Cores)
Motherboard: Supermicro X10DRG-H v1.02
Chipset: Intel Haswell-E DMI2
Memory: 64512MB
Disk: 500GB HGST HTS545050A7
Graphics: LLVMpipe
Monitor: SyncMaster
Network: Intel I350 Gigabit Connection
OS: Ubuntu 14.04
Kernel: 3.13.0-32-generic (x86_64)
Desktop: Unity 7.2.2
Display Driver: modesetting 0.8.1
OpenGL: 2.1 Mesa 10.1.3 Gallium 0.4
Compiler: GCC 4.8.4 + CUDA 7.5
File-System: ext4
Screen Resolution: 1024x768
Display Server: X Server 1.15.1

Values that differ in later runs (the flattened table does not show which run each belongs to):
Compiler: GCC 4.8.4 + CUDA 7.0
Graphics: ASPEED ASPEED Family
Screen Resolution: 1280x1024

Compiler Details: --build=x86_64-linux-gnu --disable-browser-plugin --disable-libmudflap --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-multilib-list=m32,m64,mx32 --with-tune=generic -v
Processor Details: Scaling Governor: acpi-cpufreq ondemand

cuda: Result Overview

Test                                       cuda-mini  mini-nbody  0202   cuda-test
shoc: CUDA - Triad                         12.89      -           -      15.73
shoc: CUDA - FFT SP                        285.46     -           -      291.41
shoc: CUDA - MD5 Hash                      7.77       -           -      7.54
shoc: CUDA - Max SP Flops                  6557.12    -           -      6561.66
shoc: CUDA - Bus Speed Download            10.30      -           -      8.70
shoc: CUDA - Bus Speed Readback            12.69      -           -      12.57
shoc: CUDA - Texture Read Bandwidth        335.13     -           -      335.04
askap: Gridding                            7006.74    -           -      7132.99
askap: Degridding                          14532.50   -           -      14013.50
cuda-mini-nbody: Original                  58.34      59.55       59.11  59.10
cuda-mini-nbody: Cache Blocking            44.74      -           -      44.90
cuda-mini-nbody: Loop Unrolling            44.45      -           -      44.91
cuda-mini-nbody: SOA Data Layout           65.28      -           -      65.45
cuda-mini-nbody: Flush Denormals To Zero   65.59      -           -      65.50
shoc: OpenCL - Triad                       -          -           -      11.84
shoc: OpenCL - FFT SP                      -          -           -      174.73
shoc: OpenCL - MD5 Hash                    -          -           -      7.73
shoc: OpenCL - Max SP Flops                -          -           -      6505.95
shoc: OpenCL - Bus Speed Download          -          -           -      12.46
shoc: OpenCL - Bus Speed Readback          -          -           -      13.07
shoc: OpenCL - Texture Read Bandwidth      -          -           -      331.70

The run "2 x Intel Xeon E5-2630 v3 - ASPEED ASPEED Family -" has no results in this table.
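One way to read the overview as run-to-run differences is to express each cuda-test result as a percent change from cuda-mini. The following is a minimal Python sketch, not part of the Phoronix Test Suite; the function name `percent_change` and the dictionary layout are assumptions made for illustration, and the numbers are copied from the results in this file.

```python
# Hypothetical helper (not a Phoronix Test Suite API): percent change of a
# candidate result relative to a baseline result.
def percent_change(baseline, candidate):
    return (candidate - baseline) / baseline * 100.0

# (cuda-mini, cuda-test) averages copied from this result file.
results = {
    "shoc: CUDA - Triad (GB/s)": (12.89, 15.73),
    "shoc: CUDA - Bus Speed Download (GB/s)": (10.30, 8.70),
    "cuda-mini-nbody: Original (Seconds)": (58.34, 59.10),
}

changes = {name: percent_change(a, b) for name, (a, b) in results.items()}

for name, change in changes.items():
    print(f"{name}: {change:+.2f}%")
```

A positive change is an improvement for "More Is Better" tests such as Triad, and a regression for "Fewer Is Better" tests such as the Mini-Nbody timings.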

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Triad
GB/s, More Is Better
cuda-mini: 12.89 (SE +/- 1.80, N = 6; Min: 6.96 / Avg: 12.89 / Max: 15.81)
cuda-test: 15.73 (SE +/- 0.02, N = 3; Min: 15.7 / Avg: 15.73 / Max: 15.77)
1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: FFT SP
GFLOPS, More Is Better
cuda-mini: 285.46 (SE +/- 1.86, N = 3; Min: 282.21 / Avg: 285.46 / Max: 288.66)
cuda-test: 291.41 (SE +/- 5.52, N = 6; Min: 275.19 / Avg: 291.41 / Max: 313.55)

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: MD5 Hash
GHash/s, More Is Better
cuda-mini: 7.77 (SE +/- 0.12, N = 3; Min: 7.52 / Avg: 7.77 / Max: 7.9)
cuda-test: 7.54 (SE +/- 0.13, N = 4; Min: 7.17 / Avg: 7.54 / Max: 7.72)

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Max SP Flops
GFLOPS, More Is Better
cuda-mini: 6557.12 (SE +/- 3.17, N = 3; Min: 6553.38 / Avg: 6557.12 / Max: 6563.42)
cuda-test: 6561.66 (SE +/- 0.15, N = 3; Min: 6561.44 / Avg: 6561.66 / Max: 6561.94)

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Bus Speed Download
GB/s, More Is Better
cuda-mini: 10.30 (SE +/- 0.97, N = 6; Min: 7.8 / Avg: 10.3 / Max: 12.46)
cuda-test: 8.70 (SE +/- 0.78, N = 6; Min: 7.67 / Avg: 8.7 / Max: 12.46)

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Bus Speed Readback
GB/s, More Is Better
cuda-mini: 12.69 (SE +/- 0.33, N = 6; Min: 11.57 / Avg: 12.69 / Max: 13.21)
cuda-test: 12.57 (SE +/- 0.31, N = 6; Min: 11.6 / Avg: 12.57 / Max: 13.21)

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Texture Read Bandwidth
GB/s, More Is Better
cuda-mini: 335.13 (SE +/- 0.40, N = 3; Min: 334.37 / Avg: 335.13 / Max: 335.74)
cuda-test: 335.04 (SE +/- 0.27, N = 3; Min: 334.61 / Avg: 335.04 / Max: 335.55)

ASKAP tConvolveCuda

This is a CUDA benchmark of ATNF's ASKAP Benchmark, currently using the tConvolveCuda sub-test. Learn more via the OpenBenchmarking.org test page.

ASKAP tConvolveCuda 2015-11-10
Processing: Gridding
Million Grid Points Per Second, More Is Better
cuda-mini: 7006.74 (SE +/- 0.00, N = 3; Min: 7006.74 / Avg: 7006.74 / Max: 7006.74)
cuda-test: 7132.99 (SE +/- 63.12, N = 3; Min: 7006.74 / Avg: 7132.99 / Max: 7196.11)
1. (CXX) g++ options: -fPIC -O3 -m64 -lcudadevrt -lcudart_static -lrt -lpthread -ldl

ASKAP tConvolveCuda 2015-11-10
Processing: Degridding
Million Grid Points Per Second, More Is Better
cuda-mini: 14532.50 (SE +/- 259.50, N = 3; Min: 14013.5 / Avg: 14532.5 / Max: 14792)
cuda-test: 14013.50 (SE +/- 0.00, N = 3; Min: 14013.5 / Avg: 14013.5 / Max: 14013.5)

CUDA Mini-Nbody

CUDA Mini-Nbody 2015-11-10
Test: Original
Seconds, Fewer Is Better
cuda-mini: 58.34 (SE +/- 0.21, N = 3; Min: 57.97 / Avg: 58.34 / Max: 58.7)
mini-nbody: 59.55 (SE +/- 0.94, N = 6; Min: 58.1 / Avg: 59.55 / Max: 64.18)
0202: 59.11 (SE +/- 0.10, N = 3; Min: 58.93 / Avg: 59.11 / Max: 59.26)
cuda-test: 59.10 (SE +/- 0.25, N = 3; Min: 58.66 / Avg: 59.1 / Max: 59.54)
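The "SE +/-" figures can be converted back into a per-run standard deviation when judging how noisy each run was. This sketch assumes the reported standard error is the sample standard deviation divided by the square root of N, which is the usual definition but is not stated in this result file; `stddev_from_se` is a hypothetical helper name, and the inputs are the SE and N values reported for the Original test.

```python
import math

# Hypothetical helper: recover the sample standard deviation from a reported
# standard error, assuming SE = s / sqrt(N).
def stddev_from_se(standard_error, n):
    return standard_error * math.sqrt(n)

# (SE, N) pairs copied from the "Test: Original" results in this file.
original_runs = {
    "cuda-mini": (0.21, 3),
    "mini-nbody": (0.94, 6),
    "0202": (0.10, 3),
    "cuda-test": (0.25, 3),
}

spread = {run: stddev_from_se(se, n) for run, (se, n) in original_runs.items()}

for run, s in spread.items():
    print(f"{run}: estimated standard deviation {s:.2f} s")
```

Under that assumption the mini-nbody run shows by far the largest spread, which is consistent with its 64.18-second maximum.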

CUDA Mini-Nbody 2015-11-10
Test: Cache Blocking
Seconds, Fewer Is Better
cuda-mini: 44.74 (SE +/- 0.06, N = 3; Min: 44.67 / Avg: 44.74 / Max: 44.85)
cuda-test: 44.90 (SE +/- 0.17, N = 3; Min: 44.6 / Avg: 44.9 / Max: 45.17)

CUDA Mini-Nbody 2015-11-10
Test: Loop Unrolling
Seconds, Fewer Is Better
cuda-mini: 44.45 (SE +/- 0.42, N = 3; Min: 43.89 / Avg: 44.45 / Max: 45.27)
cuda-test: 44.91 (SE +/- 0.14, N = 3; Min: 44.67 / Avg: 44.91 / Max: 45.15)

CUDA Mini-Nbody 2015-11-10
Test: SOA Data Layout
Seconds, Fewer Is Better
cuda-mini: 65.28 (SE +/- 0.11, N = 3; Min: 65.16 / Avg: 65.28 / Max: 65.5)
cuda-test: 65.45 (SE +/- 0.08, N = 3; Min: 65.31 / Avg: 65.45 / Max: 65.58)

CUDA Mini-Nbody 2015-11-10
Test: Flush Denormals To Zero
Seconds, Fewer Is Better
cuda-mini: 65.59 (SE +/- 0.22, N = 3; Min: 65.16 / Avg: 65.59 / Max: 65.88)
cuda-test: 65.50 (SE +/- 0.20, N = 3; Min: 65.13 / Avg: 65.5 / Max: 65.83)

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Triad
GB/s, More Is Better
cuda-test: 11.84 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: FFT SP
GFLOPS, More Is Better
cuda-test: 174.73 (SE +/- 1.40, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: MD5 Hash
GHash/s, More Is Better
cuda-test: 7.73 (SE +/- 0.12, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Max SP Flops
GFLOPS, More Is Better
cuda-test: 6505.95 (SE +/- 6.06, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Bus Speed Download
GB/s, More Is Better
cuda-test: 12.46 (SE +/- 0.00, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Bus Speed Readback
GB/s, More Is Better
cuda-test: 13.07 (SE +/- 0.14, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Texture Read Bandwidth
GB/s, More Is Better
cuda-test: 331.70 (SE +/- 0.26, N = 3)