cuda

2 x Intel Xeon E5-2630 v3 testing with a Supermicro X10DRG-H v1.02 and ASPEED ASPEED Family on Ubuntu 14.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/1602024-HA-CUDA0016144&grw.

Runs: cuda-mini, mini-nbody0202, cuda-test, 2 x Intel Xeon E5-2630 v3 - ASPEED ASPEED Family -

Processor: 2 x Intel Xeon E5-2630 v3 @ 2.40GHz (32 Cores)
Motherboard: Supermicro X10DRG-H v1.02
Chipset: Intel Haswell-E DMI2
Memory: 64512MB
Disk: 500GB HGST HTS545050A7
Graphics: LLVMpipe
Monitor: SyncMaster
Network: Intel I350 Gigabit Connection
OS: Ubuntu 14.04
Kernel: 3.13.0-32-generic (x86_64)
Desktop: Unity 7.2.2
Display Driver: modesetting 0.8.1
OpenGL: 2.1 Mesa 10.1.3 Gallium 0.4
Compiler: GCC 4.8.4 + CUDA 7.5
File-System: ext4
Screen Resolution: 1024x768
Display Server: X Server 1.15.1

Also reported for some of the runs: Compiler: GCC 4.8.4 + CUDA 7.0; Graphics: ASPEED ASPEED Family; Screen Resolution: 1280x1024.

Compiler Details: --build=x86_64-linux-gnu --disable-browser-plugin --disable-libmudflap --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-multilib-list=m32,m64,mx32 --with-tune=generic -v

Processor Details: Scaling Governor: acpi-cpufreq ondemand

Results overview

Test                                       cuda-mini   cuda-test
cuda-mini-nbody: Cache Blocking            44.74       44.90
shoc: CUDA - Triad                         12.89       15.73
shoc: CUDA - FFT SP                        285.46      291.41
cuda-mini-nbody: Loop Unrolling            44.45       44.91
shoc: CUDA - MD5 Hash                      7.77        7.54
cuda-mini-nbody: SOA Data Layout           65.28       65.45
cuda-mini-nbody: Flush Denormals To Zero   65.59       65.50
shoc: CUDA - Max SP Flops                  6557.12     6561.66
shoc: CUDA - Bus Speed Download            10.30       8.70
cuda-mini-nbody: Original                  58.34       59.10
shoc: CUDA - Bus Speed Readback            12.69       12.57
shoc: CUDA - Texture Read Bandwidth        335.13      335.04
shoc: OpenCL - Triad                       -           11.84
shoc: OpenCL - FFT SP                      -           174.73
shoc: OpenCL - MD5 Hash                    -           7.73
shoc: OpenCL - Max SP Flops                -           6505.95
shoc: OpenCL - Bus Speed Download          -           12.46
shoc: OpenCL - Bus Speed Readback          -           13.07
shoc: OpenCL - Texture Read Bandwidth      -           331.70
askap: Gridding                            7006.74     7132.99
askap: Degridding                          14532.50    14013.50

cuda-mini-nbody: Original was also recorded for mini-nbody0202 (59.55) and for 2 x Intel Xeon E5-2630 v3 - ASPEED ASPEED Family - (59.11).

CUDA Mini-Nbody

Test: Cache Blocking

CUDA Mini-Nbody 2015-11-10 (Seconds, Fewer Is Better)
cuda-mini: 44.74 (SE +/- 0.06, N = 3)
cuda-test: 44.90 (SE +/- 0.17, N = 3)

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GB/s, More Is Better)
cuda-mini: 12.89 (SE +/- 1.80, N = 6)
cuda-test: 15.73 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GFLOPS, More Is Better)
cuda-mini: 285.46 (SE +/- 1.86, N = 3)
cuda-test: 291.41 (SE +/- 5.52, N = 6)
(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

CUDA Mini-Nbody

Test: Loop Unrolling

CUDA Mini-Nbody 2015-11-10 (Seconds, Fewer Is Better)
cuda-mini: 44.45 (SE +/- 0.42, N = 3)
cuda-test: 44.91 (SE +/- 0.14, N = 3)

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GHash/s, More Is Better)
cuda-mini: 7.77 (SE +/- 0.12, N = 3)
cuda-test: 7.54 (SE +/- 0.13, N = 4)
(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

CUDA Mini-Nbody

Test: SOA Data Layout

CUDA Mini-Nbody 2015-11-10 (Seconds, Fewer Is Better)
cuda-mini: 65.28 (SE +/- 0.11, N = 3)
cuda-test: 65.45 (SE +/- 0.08, N = 3)

CUDA Mini-Nbody

Test: Flush Denormals To Zero

CUDA Mini-Nbody 2015-11-10 (Seconds, Fewer Is Better)
cuda-mini: 65.59 (SE +/- 0.22, N = 3)
cuda-test: 65.50 (SE +/- 0.20, N = 3)

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GFLOPS, More Is Better)
cuda-mini: 6557.12 (SE +/- 3.17, N = 3)
cuda-test: 6561.66 (SE +/- 0.15, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GB/s, More Is Better)
cuda-mini: 10.30 (SE +/- 0.97, N = 6)
cuda-test: 8.70 (SE +/- 0.78, N = 6)
(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

CUDA Mini-Nbody

Test: Original

CUDA Mini-Nbody 2015-11-10 (Seconds, Fewer Is Better)
cuda-mini: 58.34 (SE +/- 0.21, N = 3)
mini-nbody0202: 59.55 (SE +/- 0.94, N = 6)
2 x Intel Xeon E5-2630 v3 - ASPEED ASPEED Family -: 59.11 (SE +/- 0.10, N = 3)
cuda-test: 59.10 (SE +/- 0.25, N = 3)

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GB/s, More Is Better)
cuda-mini: 12.69 (SE +/- 0.33, N = 6)
cuda-test: 12.57 (SE +/- 0.31, N = 6)
(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: CUDA - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GB/s, More Is Better)
cuda-mini: 335.13 (SE +/- 0.40, N = 3)
cuda-test: 335.04 (SE +/- 0.27, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GB/s, More Is Better)
cuda-test: 11.84 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GFLOPS, More Is Better)
cuda-test: 174.73 (SE +/- 1.40, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GHash/s, More Is Better)
cuda-test: 7.73 (SE +/- 0.12, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GFLOPS, More Is Better)
cuda-test: 6505.95 (SE +/- 6.06, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GB/s, More Is Better)
cuda-test: 12.46 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GB/s, More Is Better)
cuda-test: 13.07 (SE +/- 0.14, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GB/s, More Is Better)
cuda-test: 331.70 (SE +/- 0.26, N = 3)
(CXX) g++ options: -O2 -lSHOCCommon -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

ASKAP tConvolveCuda

Processing: Gridding

ASKAP tConvolveCuda 2015-11-10 (Million Grid Points Per Second, More Is Better)
cuda-mini: 7006.74 (SE +/- 0.00, N = 3)
cuda-test: 7132.99 (SE +/- 63.12, N = 3)
(CXX) g++ options: -fPIC -O3 -m64 -lcudadevrt -lcudart_static -lrt -lpthread -ldl

ASKAP tConvolveCuda

Processing: Degridding

ASKAP tConvolveCuda 2015-11-10 (Million Grid Points Per Second, More Is Better)
cuda-mini: 14532.50 (SE +/- 259.50, N = 3)
cuda-test: 14013.50 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fPIC -O3 -m64 -lcudadevrt -lcudart_static -lrt -lpthread -ldl


Phoronix Test Suite v10.8.4