cuda

2 x Intel Xeon E5-2630 v3 testing with a Supermicro X10DRG-H v1.02 and ASPEED ASPEED Family on Ubuntu 14.04 via the Phoronix Test Suite.

cuda-mini:

  Processor: 2 x Intel Xeon E5-2630 v3 @ 2.40GHz (32 Cores), Motherboard: Supermicro X10DRG-H v1.02, Chipset: Intel Haswell-E DMI2, Memory: 64512MB, Disk: 500GB HGST HTS545050A7, Graphics: LLVMpipe, Monitor: SyncMaster, Network: Intel I350 Gigabit Connection

  OS: Ubuntu 14.04, Kernel: 3.13.0-32-generic (x86_64), Desktop: Unity 7.2.2, Display Driver: modesetting 0.8.1, OpenGL: 2.1 Mesa 10.1.3 Gallium 0.4, Compiler: GCC 4.8.4 + CUDA 7.5, File-System: ext4, Screen Resolution: 1024x768

mini-nbody:

  Processor: 2 x Intel Xeon E5-2630 v3 @ 2.40GHz (32 Cores), Motherboard: Supermicro X10DRG-H v1.02, Chipset: Intel Haswell-E DMI2, Memory: 64512MB, Disk: 500GB HGST HTS545050A7, Graphics: LLVMpipe, Monitor: SyncMaster, Network: Intel I350 Gigabit Connection

  OS: Ubuntu 14.04, Kernel: 3.13.0-32-generic (x86_64), Desktop: Unity 7.2.2, Display Server: X Server 1.15.1, Display Driver: modesetting 0.8.1, OpenGL: 2.1 Mesa 10.1.3 Gallium 0.4, Compiler: GCC 4.8.4 + CUDA 7.0, File-System: ext4, Screen Resolution: 1024x768

0202:

  Processor: 2 x Intel Xeon E5-2630 v3 @ 2.40GHz (32 Cores), Motherboard: Supermicro X10DRG-H v1.02, Chipset: Intel Haswell-E DMI2, Memory: 64512MB, Disk: 500GB HGST HTS545050A7, Graphics: LLVMpipe, Monitor: SyncMaster, Network: Intel I350 Gigabit Connection

  OS: Ubuntu 14.04, Kernel: 3.13.0-32-generic (x86_64), Desktop: Unity 7.2.2, Display Server: X Server 1.15.1, Display Driver: modesetting 0.8.1, OpenGL: 2.1 Mesa 10.1.3 Gallium 0.4, Compiler: GCC 4.8.4 + CUDA 7.0, File-System: ext4, Screen Resolution: 1024x768

cuda-test:

  Processor: 2 x Intel Xeon E5-2630 v3 @ 2.40GHz (32 Cores), Motherboard: Supermicro X10DRG-H v1.02, Chipset: Intel Haswell-E DMI2, Memory: 64512MB, Disk: 500GB HGST HTS545050A7, Graphics: LLVMpipe, Monitor: SyncMaster, Network: Intel I350 Gigabit Connection

  OS: Ubuntu 14.04, Kernel: 3.13.0-32-generic (x86_64), Desktop: Unity 7.2.2, Display Server: X Server 1.15.1, Display Driver: modesetting 0.8.1, OpenGL: 2.1 Mesa 10.1.3 Gallium 0.4, Compiler: GCC 4.8.4 + CUDA 7.0, File-System: ext4, Screen Resolution: 1024x768

2 x Intel Xeon E5-2630 v3 - ASPEED ASPEED Family -:

  Processor: 2 x Intel Xeon E5-2630 v3 @ 2.40GHz (32 Cores), Motherboard: Supermicro X10DRG-H v1.02, Chipset: Intel Haswell-E DMI2, Memory: 64512MB, Disk: 500GB HGST HTS545050A7, Graphics: ASPEED ASPEED Family, Monitor: SyncMaster, Network: Intel I350 Gigabit Connection

  OS: Ubuntu 14.04, Kernel: 3.13.0-32-generic (x86_64), Desktop: Unity 7.2.2, Display Server: X Server 1.15.1, Display Driver: modesetting 0.8.1, Compiler: GCC 4.8.4 + CUDA 7.0, File-System: ext4, Screen Resolution: 1280x1024

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Triad
GB/s > Higher Is Better
cuda-mini . 12.89 |=================================================
cuda-test . 15.73 |============================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: FFT SP
GFLOPS > Higher Is Better
cuda-mini . 285.46 |==========================================================
cuda-test . 291.41 |===========================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: MD5 Hash
GHash/s > Higher Is Better
cuda-mini . 7.77 |=============================================================
cuda-test . 7.54 |===========================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Max SP Flops
GFLOPS > Higher Is Better
cuda-mini . 6557.12 |==========================================================
cuda-test . 6561.66 |==========================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Bus Speed Download
GB/s > Higher Is Better
cuda-mini . 10.30 |============================================================
cuda-test . 8.70 |===================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Bus Speed Readback
GB/s > Higher Is Better
cuda-mini . 12.69 |============================================================
cuda-test . 12.57 |===========================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Texture Read Bandwidth
GB/s > Higher Is Better
cuda-mini . 335.13 |===========================================================
cuda-test . 335.04 |===========================================================

ASKAP tConvolveCuda 2015-11-10
Processing: Gridding
Million Grid Points Per Second > Higher Is Better
cuda-mini . 7006.74 |=========================================================
cuda-test . 7132.99 |==========================================================

ASKAP tConvolveCuda 2015-11-10
Processing: Degridding
Million Grid Points Per Second > Higher Is Better
cuda-mini . 14532.50 |=========================================================
cuda-test . 14013.50 |=======================================================

CUDA Mini-Nbody 2015-11-10
Test: Original
Seconds < Lower Is Better
cuda-mini .. 58.34 |==========================================================
mini-nbody . 59.55 |===========================================================
0202 ....... 59.11 |===========================================================
cuda-test .. 59.10 |===========================================================

CUDA Mini-Nbody 2015-11-10
Test: Cache Blocking
Seconds < Lower Is Better
cuda-mini . 44.74 |============================================================
cuda-test . 44.90 |============================================================

CUDA Mini-Nbody 2015-11-10
Test: Loop Unrolling
Seconds < Lower Is Better
cuda-mini . 44.45 |===========================================================
cuda-test . 44.91 |============================================================

CUDA Mini-Nbody 2015-11-10
Test: SOA Data Layout
Seconds < Lower Is Better
cuda-mini . 65.28 |============================================================
cuda-test . 65.45 |============================================================

CUDA Mini-Nbody 2015-11-10
Test: Flush Denormals To Zero
Seconds < Lower Is Better
cuda-mini . 65.59 |============================================================
cuda-test . 65.50 |============================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Triad
GB/s > Higher Is Better
cuda-test . 11.84 |============================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: FFT SP
GFLOPS > Higher Is Better
cuda-test . 174.73 |===========================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: MD5 Hash
GHash/s > Higher Is Better
cuda-test . 7.73 |=============================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Max SP Flops
GFLOPS > Higher Is Better
cuda-test . 6505.95 |==========================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Bus Speed Download
GB/s > Higher Is Better
cuda-test . 12.46 |============================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Bus Speed Readback
GB/s > Higher Is Better
cuda-test . 13.07 |============================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Texture Read Bandwidth
GB/s > Higher Is Better
cuda-test . 331.70 |===========================================================