CUDA NVIDIA Tegra X1 GPGPU Linux Tests

Benchmarks by Michael Larabel for a future article on Phoronix.com, just delivering various GPGPU benchmarks for reference purposes.

Jetson TX1:

  Processor: Cortex A57 rev 1 @ 1.91GHz (4 Cores), Motherboard: jetson_tx1, Memory: 4096MB, Disk: 16GB 016G32 + 16GB SL16G, Graphics: NVIDIA TEGRA

  OS: Ubuntu 14.04, Kernel: 3.10.67-g3a5c467 (aarch64), Desktop: Unity 7.2.2, Display Server: X Server 1.15.1, Display Driver: NVIDIA 1.0.0, Compiler: GCC 4.8.4 + CUDA 7.0, File-System: ext4, Screen Resolution: 3840x2160

Desktop:

  Processor: Intel Core i7-7700K @ 4.20GHz (8 Cores), Motherboard: ASRock Z270 Extreme4, Chipset: Intel Device 591f, Memory: 32768MB, Disk: 525GB Crucial_CT525MX3 + 3001GB TOSHIBA DT01ACA3, Graphics: NVIDIA GeForce GTX 1080 8192MB (101/405MHz), Audio: Realtek Generic, Network: Intel Connection

  OS: Ubuntu 16.04, Kernel: 4.4.0-66-generic (x86_64), Desktop: Unity 7.4.0, Display Server: X Server 1.18.4, Display Driver: NVIDIA 375.26, OpenGL: 4.5.0, Vulkan: 1.0.24, Compiler: GCC 5.4.0 20160609 + CUDA 8.0, File-System: ext4, Screen Resolution: 3840x1080

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: FFT SP
GFLOPS > Higher Is Better
Jetson TX1 . 3.92   |
Desktop .... 493.46 |==========================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: MD5 Hash
GHash/s > Higher Is Better
Jetson TX1 . 0.62  |===
Desktop .... 12.80 |===========================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Texture Read Bandwidth
GB/s > Higher Is Better
Jetson TX1 . 46.62  |=====
Desktop .... 551.68 |==========================================================

ASKAP tConvolveCuda 2015-11-10
Processing: Gridding
Million Grid Points Per Second > Higher Is Better
Jetson TX1 . 262.83  |==
Desktop .... 8588.90 |=========================================================

ASKAP tConvolveCuda 2015-11-10
Processing: Degridding
Million Grid Points Per Second > Higher Is Better
Jetson TX1 . 649.05   |==
Desktop .... 15372.07 |========================================================

CUDA Mini-Nbody 2015-11-10
Test: Original
Seconds < Lower Is Better
Jetson TX1 . 513.47 |==========================================================
Desktop .... 30.47  |===

CUDA Mini-Nbody 2015-11-10
Test: Cache Blocking
Seconds < Lower Is Better
Jetson TX1 . 277.48 |==========================================================
Desktop .... 12.96  |===

CUDA Mini-Nbody 2015-11-10
Test: Loop Unrolling
Seconds < Lower Is Better
Jetson TX1 . 236.40 |==========================================================
Desktop .... 13.37  |===

CUDA Mini-Nbody 2015-11-10
Test: SOA Data Layout
Seconds < Lower Is Better
Jetson TX1 . 529.59 |==========================================================
Desktop .... 26.68  |===

CUDA Mini-Nbody 2015-11-10
Test: Flush Denormals To Zero
Seconds < Lower Is Better
Jetson TX1 . 538.07 |==========================================================
Desktop .... 26.72  |===
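
Because the results mix "Higher Is Better" and "Lower Is Better" metrics, the bar graphs are not directly comparable across tests. The following is a minimal Python sketch, not part of the original result file, that normalizes each result into a single "Desktop is N times faster than Jetson TX1" ratio. The numbers are copied from the results in this file; the names `RESULTS`, `desktop_speedup`, and `summarize` are hypothetical and chosen only for illustration.

```python
# Values copied from the results in this file:
# test name -> (Jetson TX1, Desktop, higher_is_better)
RESULTS = {
    "SHOC FFT SP (GFLOPS)": (3.92, 493.46, True),
    "SHOC MD5 Hash (GHash/s)": (0.62, 12.80, True),
    "SHOC Texture Read Bandwidth (GB/s)": (46.62, 551.68, True),
    "tConvolveCuda Gridding (Mpoints/s)": (262.83, 8588.90, True),
    "tConvolveCuda Degridding (Mpoints/s)": (649.05, 15372.07, True),
    "Mini-Nbody Original (s)": (513.47, 30.47, False),
    "Mini-Nbody Cache Blocking (s)": (277.48, 12.96, False),
    "Mini-Nbody Loop Unrolling (s)": (236.40, 13.37, False),
    "Mini-Nbody SOA Data Layout (s)": (529.59, 26.68, False),
    "Mini-Nbody Flush Denormals To Zero (s)": (538.07, 26.72, False),
}


def desktop_speedup(tx1, desktop, higher_is_better):
    """Return how many times faster the Desktop is than the Jetson TX1.

    For throughput metrics (higher is better) this is desktop / tx1;
    for run times (lower is better) it is tx1 / desktop.
    """
    return desktop / tx1 if higher_is_better else tx1 / desktop


def summarize(results):
    """Map each test name to its Desktop-over-TX1 speedup ratio."""
    return {name: desktop_speedup(*values) for name, values in results.items()}


if __name__ == "__main__":
    for name, ratio in summarize(RESULTS).items():
        print(f"{name:40s} {ratio:8.1f}x")
```

Note that the two systems also differ in CPU, CUDA version (7.0 vs. 8.0), and driver, so these ratios describe the complete systems as tested rather than the GPUs in isolation.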