CUDA NVIDIA Tegra X1 GPGPU Linux Tests

Benchmarks by Michael Larabel for a future article on Phoronix.com just delivering various GPGPU benchmarks for reference purposes.

Jetson TX1:

    Processor: Cortex A57 rev 1 @ 1.91GHz (4 Cores), Motherboard: jetson_tx1, Memory: 4096MB, Disk: 16GB 016G32 + 16GB SL16G, Graphics: NVIDIA TEGRA

    OS: Ubuntu 14.04, Kernel: 3.10.67-g3a5c467 (aarch64), Desktop: Unity 7.2.2, Display Server: X Server 1.15.1, Display Driver: NVIDIA 1.0.0, Compiler: GCC 4.8.4 + CUDA 7.0, File-System: ext4, Screen Resolution: 3840x2160

Jetson TX2 Hogh-P:

    Processor: ARMv8 rev 3 @ 2.04GHz (6 Cores), Motherboard: quill, Memory: 8192MB, Disk: 31GB 032G34, Graphics: NVIDIA TEGRA

    OS: Ubuntu 16.04, Kernel: 4.4.38-tegra (aarch64), Desktop: Unity 7.4.5, Display Server: X Server 1.18.4, Display Driver: NVIDIA 1.0.0, Compiler: GCC 5.4.0 20160609 + CUDA 9.0, File-System: ext4, Screen Resolution: 1366x768

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: FFT SP
GFLOPS > Higher Is Better
Jetson TX1 ........ 3.92 |=========================
Jetson TX2 Hogh-P . 8.24 |=====================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: MD5 Hash
GHash/s > Higher Is Better
Jetson TX1 ........ 0.62 |==================================
Jetson TX2 Hogh-P . 0.98 |=====================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: CUDA - Benchmark: Texture Read Bandwidth
GB/s > Higher Is Better
Jetson TX1 ........ 46.62 |===============================
Jetson TX2 Hogh-P . 79.04 |====================================================

ASKAP tConvolveCuda 2015-11-10
Processing: Gridding
Million Grid Points Per Second > Higher Is Better
Jetson TX1 ........ 263 |================
Jetson TX2 Hogh-P . 905 |======================================================

ASKAP tConvolveCuda 2015-11-10
Processing: Degridding
Million Grid Points Per Second > Higher Is Better
Jetson TX1 ........ 649 |=======================
Jetson TX2 Hogh-P . 1513 |=====================================================

CUDA Mini-Nbody 2015-11-10
Test: Original
Seconds < Lower Is Better
Jetson TX1 ........ 513 |===================================
Jetson TX2 Hogh-P . 781 |======================================================

CUDA Mini-Nbody 2015-11-10
Test: Cache Blocking
Seconds < Lower Is Better
Jetson TX1 ........ 277 |======================================================
Jetson TX2 Hogh-P . 176 |==================================

CUDA Mini-Nbody 2015-11-10
Test: Loop Unrolling
Seconds < Lower Is Better
Jetson TX1 ........ 236 |======================================================
Jetson TX2 Hogh-P . 200 |==============================================

CUDA Mini-Nbody 2015-11-10
Test: SOA Data Layout
Seconds < Lower Is Better
Jetson TX1 ........ 530 |===================================================
Jetson TX2 Hogh-P . 557 |======================================================

CUDA Mini-Nbody 2015-11-10
Test: Flush Denormals To Zero
Seconds < Lower Is Better
Jetson TX1 ........ 538 |====================================================
Jetson TX2 Hogh-P . 554 |======================================================
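
For reference, the relative standing of the two boards can be worked out from the figures above. The following is a minimal Python sketch, not part of the original results: the names `RESULTS`, `tx2_advantage`, and `summarize` are hypothetical, the numbers are copied from the charts, and the "Higher Is Better" / "Lower Is Better" labels are taken at face value as printed.

```python
# Figures copied from the result charts: (test, direction, Jetson TX1, Jetson TX2 Hogh-P).
# "higher" / "lower" mirrors the "Higher Is Better" / "Lower Is Better" labels as printed.
RESULTS = [
    ("SHOC FFT SP (GFLOPS)", "higher", 3.92, 8.24),
    ("SHOC MD5 Hash (GHash/s)", "higher", 0.62, 0.98),
    ("SHOC Texture Read Bandwidth (GB/s)", "higher", 46.62, 79.04),
    ("ASKAP Gridding (M grid points/s)", "higher", 263, 905),
    ("ASKAP Degridding (M grid points/s)", "higher", 649, 1513),
    ("Mini-Nbody Original (s)", "lower", 513, 781),
    ("Mini-Nbody Cache Blocking (s)", "lower", 277, 176),
    ("Mini-Nbody Loop Unrolling (s)", "lower", 236, 200),
    ("Mini-Nbody SOA Data Layout (s)", "lower", 530, 557),
    ("Mini-Nbody Flush Denormals To Zero (s)", "lower", 538, 554),
]


def tx2_advantage(direction, tx1, tx2):
    """Return the TX2 result relative to the TX1 (values above 1 favor the TX2)."""
    if direction == "higher":
        return tx2 / tx1  # larger value is better
    return tx1 / tx2      # smaller value is better, so invert the ratio


def summarize(results=RESULTS):
    """Return (test name, rounded TX2 advantage) pairs for every result."""
    return [(name, round(tx2_advantage(d, tx1, tx2), 2))
            for name, d, tx1, tx2 in results]


if __name__ == "__main__":
    for name, ratio in summarize():
        print(f"{name:40s} {ratio:5.2f}x")
```

A ratio below 1 in this sketch means the Jetson TX1 result is the better one under the label printed for that test.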