Radeon ROCm 2.0 OpenCL Compute Versus NVIDIA Linux

ROCm 2.0 Linux GPGPU/compute benchmarks for a future article on Phoronix.com by Michael Larabel.

GTX 980 Ti:

    Processor: Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads), Motherboard: ASUS PRIME Z390-A (0602 BIOS), Chipset: Intel Cannon Lake PCH Shared SRAM, Memory: 16384MB, Disk: 2000GB SABRENT + Samsung SSD 970 EVO 250GB, Graphics: NVIDIA GeForce GTX 980 Ti 6GB (999/3505MHz), Audio: Realtek ALC1220, Monitor: Acer B286HK, Network: Intel Connection

    OS: Ubuntu 18.04, Kernel: 4.19.5-041905-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.19.6, Display Driver: NVIDIA 415.23, OpenGL: 4.6.0, OpenCL: OpenCL 1.2 CUDA 10.0.132, Vulkan: 1.1.84, Compiler: GCC 7.3.0 + LLVM 6.0.0 + CUDA 10.0, File-System: ext4, Screen Resolution: 3840x2160

GTX TITAN X GM200:

    Processor: Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads), Motherboard: ASUS PRIME Z390-A (0602 BIOS), Chipset: Intel Cannon Lake PCH Shared SRAM, Memory: 16384MB, Disk: 2000GB SABRENT + Samsung SSD 970 EVO 250GB, Graphics: NVIDIA GeForce GTX TITAN X 12GB (1001/3505MHz), Audio: Realtek ALC1220, Monitor: Acer B286HK, Network: Intel Connection

    OS: Ubuntu 18.04, Kernel: 4.19.5-041905-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.19.6, Display Driver: NVIDIA 415.23, OpenGL: 4.6.0, OpenCL: OpenCL 1.2 CUDA 10.0.132, Vulkan: 1.1.84, Compiler: GCC 7.3.0 + LLVM 6.0.0 + CUDA 10.0, File-System: ext4, Screen Resolution: 3840x2160

GTX 1060:

    Processor: Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads), Motherboard: ASUS PRIME Z390-A (0602 BIOS), Chipset: Intel Cannon Lake PCH Shared SRAM, Memory: 16384MB, Disk: 2000GB SABRENT + Samsung SSD 970 EVO 250GB, Graphics: NVIDIA GeForce GTX 1060 6GB (1506/4006MHz), Audio: Realtek ALC1220, Monitor: Acer B286HK, Network: Intel Connection

    OS: Ubuntu 18.04, Kernel: 4.19.5-041905-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.19.6, Display Driver: NVIDIA 415.23, OpenGL: 4.6.0, OpenCL: OpenCL 1.2 CUDA 10.0.132, Vulkan: 1.1.84, Compiler: GCC 7.3.0 + LLVM 6.0.0 + CUDA 10.0, File-System: ext4, Screen Resolution: 3840x2160

GTX 1070:

    Processor: Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads), Motherboard: ASUS PRIME Z390-A (0602 BIOS), Chipset: Intel Cannon Lake PCH Shared SRAM, Memory: 16384MB, Disk: 2000GB SABRENT + Samsung SSD 970 EVO 250GB, Graphics: NVIDIA GeForce GTX 1070 8GB (1506/4006MHz), Audio: Realtek ALC1220, Monitor: Acer B286HK, Network: Intel Connection

    OS: Ubuntu 18.04, Kernel: 4.19.5-041905-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.19.6, Display Driver: NVIDIA 415.23, OpenGL: 4.6.0, OpenCL: OpenCL 1.2 CUDA 10.0.132, Vulkan: 1.1.84, Compiler: GCC 7.3.0 + LLVM 6.0.0 + CUDA 10.0, File-System: ext4, Screen Resolution: 3840x2160

GTX 1080:

    Processor: Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads), Motherboard: ASUS PRIME Z390-A (0602 BIOS), Chipset: Intel Cannon Lake PCH Shared SRAM, Memory: 16384MB, Disk: 2000GB SABRENT + Samsung SSD 970 EVO 250GB, Graphics: NVIDIA GeForce GTX 1080 8GB (1607/5005MHz), Audio: Realtek ALC1220, Monitor: Acer B286HK, Network: Intel Connection

    OS: Ubuntu 18.04, Kernel: 4.19.5-041905-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.19.6, Display Driver: NVIDIA 415.23, OpenGL: 4.6.0, OpenCL: OpenCL 1.2 CUDA 10.0.132, Vulkan: 1.1.84, Compiler: GCC 7.3.0 + LLVM 6.0.0 + CUDA 10.0, File-System: ext4, Screen Resolution: 3840x2160

GTX 1080 Ti:

    Processor: Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads), Motherboard: ASUS PRIME Z390-A (0602 BIOS), Chipset: Intel Cannon Lake PCH Shared SRAM, Memory: 16384MB, Disk: 2000GB SABRENT + Samsung SSD 970 EVO 250GB, Graphics: NVIDIA GeForce GTX 1080 Ti 11GB (1480/5508MHz), Audio: Realtek ALC1220, Monitor: Acer B286HK, Network: Intel Connection

    OS: Ubuntu 18.04, Kernel: 4.19.5-041905-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.19.6, Display Driver: NVIDIA 415.23, OpenGL: 4.6.0, OpenCL: OpenCL 1.2 CUDA 10.0.132, Vulkan: 1.1.84, Compiler: GCC 7.3.0 + LLVM 6.0.0 + CUDA 10.0, File-System: ext4, Screen Resolution: 3840x2160

RTX 2080:

    Processor: Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads), Motherboard: ASUS PRIME Z390-A (0602 BIOS), Chipset: Intel Cannon Lake PCH Shared SRAM, Memory: 16384MB, Disk: 2000GB SABRENT + Samsung SSD 970 EVO 250GB, Graphics: Zotac NVIDIA GeForce RTX 2080 8GB (1515/7000MHz), Audio: Realtek ALC1220, Monitor: Acer B286HK, Network: Intel Connection

    OS: Ubuntu 18.04, Kernel: 4.19.5-041905-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.19.6, Display Driver: NVIDIA 415.23, OpenGL: 4.6.0, OpenCL: OpenCL 1.2 CUDA 10.0.132, Vulkan: 1.1.84, Compiler: GCC 7.3.0 + LLVM 6.0.0 + CUDA 10.0, File-System: ext4, Screen Resolution: 3840x2160

RTX 2080 Ti:

    Processor: Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads), Motherboard: ASUS PRIME Z390-A (0602 BIOS), Chipset: Intel Cannon Lake PCH Shared SRAM, Memory: 16384MB, Disk: 2000GB SABRENT + Samsung SSD 970 EVO 250GB, Graphics: NVIDIA GeForce RTX 2080 Ti 11GB (1350/7000MHz), Audio: Realtek ALC1220, Monitor: Acer B286HK, Network: Intel Connection

    OS: Ubuntu 18.04, Kernel: 4.19.5-041905-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.19.6, Display Driver: NVIDIA 415.23, OpenGL: 4.6.0, OpenCL: OpenCL 1.2 CUDA 10.0.132, Vulkan: 1.1.84, Compiler: GCC 7.3.0 + LLVM 6.0.0 + CUDA 10.0, File-System: ext4, Screen Resolution: 3840x2160

TITAN RTX:

    Processor: Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads), Motherboard: ASUS PRIME Z390-A (0602 BIOS), Chipset: Intel Cannon Lake PCH Shared SRAM, Memory: 16384MB, Disk: 2000GB SABRENT + Samsung SSD 970 EVO 250GB, Graphics: NVIDIA TITAN RTX 24GB (1350/7000MHz), Audio: Realtek ALC1220, Monitor: Acer B286HK, Network: Intel Connection

    OS: Ubuntu 18.04, Kernel: 4.19.5-041905-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.19.6, Display Driver: NVIDIA 415.23, OpenGL: 4.6.0, OpenCL: OpenCL 1.2 CUDA 10.0.132, Vulkan: 1.1.84, Compiler: GCC 7.3.0 + LLVM 6.0.0 + CUDA 10.0, File-System: ext4, Screen Resolution: 3840x2160

RX 580:

    Processor: Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads), Motherboard: ASUS PRIME Z390-A (0602 BIOS), Chipset: Intel Cannon Lake PCH Shared SRAM, Memory: 16384MB, Disk: 2000GB SABRENT + Samsung SSD 970 EVO 250GB, Graphics: MSI AMD Radeon RX 470/480 8GB (1366/2000MHz), Audio: Realtek ALC1220, Monitor: Acer B286HK, Network: Intel Connection

    OS: Ubuntu 18.04, Kernel: 4.19.5-041905-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.19.6, OpenGL: 4.5 Mesa 19.0.0-devel (git-17218a0406) (LLVM 8.0.0), OpenCL: OpenCL 2.1 AMD-APP (2783.0), Vulkan: 1.1.90, Compiler: GCC 7.3.0 + LLVM 6.0.0 + CUDA 10.0, File-System: ext4, Screen Resolution: 3840x2160

RX Vega 56:

    Processor: Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads), Motherboard: ASUS PRIME Z390-A (0602 BIOS), Chipset: Intel Cannon Lake PCH Shared SRAM, Memory: 16384MB, Disk: 2000GB SABRENT + Samsung SSD 970 EVO 250GB, Graphics: AMD Radeon RX Vega 8GB (1590/800MHz), Audio: Realtek ALC1220, Monitor: Acer B286HK, Network: Intel Connection

    OS: Ubuntu 18.04, Kernel: 4.15.0-43-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.19.6, OpenGL: 4.5 Mesa 19.0.0-devel (git-17218a0406) (LLVM 8.0.0), OpenCL: OpenCL 2.1 AMD-APP (2783.0), Vulkan: 1.1.90, Compiler: GCC 7.3.0 + LLVM 6.0.0 + CUDA 10.0, File-System: ext4, Screen Resolution: 3840x2160

RX Vega 64:

    Processor: Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads), Motherboard: ASUS PRIME Z390-A (0602 BIOS), Chipset: Intel Cannon Lake PCH Shared SRAM, Memory: 16384MB, Disk: 2000GB SABRENT + Samsung SSD 970 EVO 250GB, Graphics: AMD Radeon RX Vega 8GB (1630/945MHz), Audio: Realtek ALC1220, Monitor: Acer B286HK, Network: Intel Connection

    OS: Ubuntu 18.04, Kernel: 4.15.0-43-generic (x86_64), Desktop: GNOME Shell 3.28.3, Display Server: X Server 1.19.6, OpenGL: 4.5 Mesa 19.0.0-devel (git-17218a0406) (LLVM 8.0.0), OpenCL: OpenCL 2.1 AMD-APP (2783.0), Vulkan: 1.1.90, Compiler: GCC 7.3.0 + LLVM 6.0.0 + CUDA 10.0, File-System: ext4, Screen Resolution: 3840x2160


Darktable 2.4.2
Test: Boat - Acceleration: OpenCL
Seconds < Lower Is Better
GTX 980 Ti ........ 3.27  |============
GTX TITAN X GM200 . 3.11  |============
GTX 1060 .......... 3.77  |==============
GTX 1070 .......... 2.87  |===========
GTX 1080 .......... 2.72  |==========
GTX 1080 Ti ....... 2.27  |=========
RTX 2080 .......... 1.88  |=======
RTX 2080 Ti ....... 1.62  |======
TITAN RTX ......... 1.61  |======
RX 580 ............ 13.60 |====================================================
RX Vega 56 ........ 13.64 |====================================================
RX Vega 64 ........ 3.58  |==============

Darktable 2.4.2
Test: Server Room - Acceleration: OpenCL
Seconds < Lower Is Better
GTX 980 Ti ........ 1.52 |===================
GTX TITAN X GM200 . 1.36 |=================
GTX 1060 .......... 1.99 |=========================
GTX 1070 .......... 1.10 |==============
GTX 1080 .......... 1.12 |==============
GTX 1080 Ti ....... 1.02 |=============
RTX 2080 .......... 0.80 |==========
RTX 2080 Ti ....... 0.73 |=========
TITAN RTX ......... 0.73 |=========
RX 580 ............ 4.19 |=====================================================
RX Vega 56 ........ 4.21 |=====================================================
RX Vega 64 ........ 1.76 |======================

Parboil 2.5
Test: OpenCL TPACF
Seconds < Lower Is Better
GTX 980 Ti ........ 1.39 |==========================================
GTX TITAN X GM200 . 1.36 |=========================================
GTX 1060 .......... 1.39 |==========================================
GTX 1070 .......... 1.08 |=================================
GTX 1080 .......... 1.06 |================================
GTX 1080 Ti ....... 0.77 |=======================
RTX 2080 .......... 0.88 |===========================
RTX 2080 Ti ....... 0.64 |===================
TITAN RTX ......... 0.63 |===================
RX 580 ............ 1.75 |=====================================================
RX Vega 56 ........ 1.41 |===========================================
RX Vega 64 ........ 1.36 |=========================================

Rodinia 2.4
Test: OpenCL Particle Filter
Seconds < Lower Is Better
GTX 980 Ti ........ 10.22 |============================================
GTX TITAN X GM200 . 9.03  |=======================================
GTX 1060 .......... 12.07 |====================================================
GTX 1070 .......... 8.30  |====================================
GTX 1080 .......... 6.54  |============================
GTX 1080 Ti ....... 4.97  |=====================
RTX 2080 .......... 6.17  |===========================
RTX 2080 Ti ....... 4.45  |===================
TITAN RTX ......... 4.26  |==================

Mixbench 2016-06-06
Benchmark: Single Precision
GFLOPS > Higher Is Better
GTX 980 Ti ........ 5860  |==================
GTX TITAN X GM200 . 6557  |====================
GTX 1060 .......... 4346  |=============
GTX 1070 .......... 6410  |===================
GTX 1080 .......... 8570  |==========================
GTX 1080 Ti ....... 11605 |===================================
RTX 2080 .......... 11029 |=================================
RTX 2080 Ti ....... 16175 |=================================================
TITAN RTX ......... 17324 |====================================================
RX 580 ............ 5915  |==================
RX Vega 56 ........ 10519 |================================
RX Vega 64 ........ 12458 |=====================================

clpeak
OpenCL Test: Global Memory Bandwidth
GBPS > Higher Is Better
GTX 980 Ti ........ 264 |===========================
GTX TITAN X GM200 . 263 |===========================
GTX 1060 .......... 147 |===============
GTX 1070 .......... 196 |====================
GTX 1080 .......... 222 |=======================
GTX 1080 Ti ....... 329 |==================================
RTX 2080 .......... 368 |======================================
RTX 2080 Ti ....... 505 |====================================================
TITAN RTX ......... 528 |======================================================
RX 580 ............ 205 |=====================
RX Vega 56 ........ 317 |================================
RX Vega 64 ........ 362 |=====================================

clpeak
OpenCL Test: Integer Compute INT
GIOPS > Higher Is Better
GTX 980 Ti ........ 1616  |=====
GTX TITAN X GM200 . 1783  |======
GTX 1060 .......... 1285  |====
GTX 1070 .......... 1684  |======
GTX 1080 .......... 2450  |========
GTX 1080 Ti ....... 3366  |===========
RTX 2080 .......... 10199 |==================================
RTX 2080 Ti ....... 14840 |==================================================
TITAN RTX ......... 15398 |====================================================
RX 580 ............ 1252  |====
RX Vega 56 ........ 2006  |=======
RX Vega 64 ........ 2494  |========

FAHBench 2.3.2
Ns Per Day > Higher Is Better
GTX 980 Ti ........ 114 |=====================
GTX TITAN X GM200 . 121 |======================
GTX 1060 .......... 103 |===================
GTX 1070 .......... 140 |=========================
GTX 1080 .......... 155 |============================
GTX 1080 Ti ....... 198 |====================================
RTX 2080 .......... 237 |===========================================
RTX 2080 Ti ....... 294 |=====================================================
TITAN RTX ......... 297 |======================================================

LuxMark 3.1
OpenCL Device: GPU - Scene: Luxball HDR
Score > Higher Is Better
GTX 980 Ti ........ 16910 |===================
GTX TITAN X GM200 . 17299 |===================
GTX 1070 .......... 17288 |===================
GTX 1080 .......... 13823 |===============
GTX 1080 Ti ....... 21562 |========================
RTX 2080 .......... 29641 |=================================
RTX 2080 Ti ....... 42693 |================================================
TITAN RTX ......... 45932 |====================================================

GTX 980 Ti ........ 16778 |===================
GTX TITAN X GM200 . 17288 |===================
GTX 1060 .......... 12238 |==============
GTX 1070 .......... 17289 |===================
GTX 1080 .......... 13833 |================
GTX 1080 Ti ....... 21685 |========================
RTX 2080 .......... 29650 |=================================
RTX 2080 Ti ....... 42314 |===============================================
TITAN RTX ......... 46377 |====================================================
RX 580 ............ 15270 |=================
RX Vega 56 ........ 30649 |==================================
RX Vega 64 ........ 32545 |====================================

LuxMark 3.1
OpenCL Device: GPU - Scene: Microphone
Score > Higher Is Better
GTX 980 Ti ........ 11105 |===================
GTX TITAN X GM200 . 10929 |===================
GTX 1060 .......... 6963  |============
GTX 1070 .......... 9980  |=================
GTX 1080 .......... 8732  |===============
GTX 1080 Ti ....... 13732 |=======================
RTX 2080 .......... 19855 |==================================
RTX 2080 Ti ....... 28476 |=================================================
TITAN RTX ......... 30528 |====================================================

LuxMark 3.1
OpenCL Device: GPU - Scene: Hotel
Score > Higher Is Better
GTX 980 Ti ........ 3879 |=====================
GTX TITAN X GM200 . 4135 |======================
GTX 1070 .......... 3877 |=====================
GTX 1080 .......... 3823 |====================
GTX 1080 Ti ....... 5581 |==============================
RTX 2080 .......... 6589 |===================================
RTX 2080 Ti ....... 9191 |=================================================
TITAN RTX ......... 9884 |=====================================================

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: FFT SP
GFLOPS > Higher Is Better
GTX 980 Ti ........ 728  |=========================
GTX TITAN X GM200 . 726  |=========================
GTX 1060 .......... 302  |==========
GTX 1070 .......... 452  |===============
GTX 1080 .......... 575  |====================
GTX 1080 Ti ....... 972  |=================================
RTX 2080 .......... 1083 |=====================================
RTX 2080 Ti ....... 1443 |=================================================
TITAN RTX ......... 1548 |=====================================================
RX 580 ............ 548  |===================
RX Vega 56 ........ 932  |================================
RX Vega 64 ........ 1074 |=====================================

cl-mem 2017-01-13
Benchmark: Copy
GB/s > Higher Is Better
GTX 980 Ti ........ 217 |========================
GTX TITAN X GM200 . 218 |========================
GTX 1060 .......... 139 |================
GTX 1070 .......... 187 |=====================
GTX 1080 .......... 209 |=======================
GTX 1080 Ti ....... 317 |===================================
RTX 2080 .......... 328 |=====================================
RTX 2080 Ti ....... 454 |===================================================
TITAN RTX ......... 484 |======================================================
RX 580 ............ 184 |=====================
RX Vega 56 ........ 203 |=======================
RX Vega 64 ........ 222 |=========================
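
For readers who want to compare cards numerically rather than by bar length, the following is a minimal Python sketch, not part of the original results. It assumes each result row has the shape "label, dot leader, value, bar" as in the charts here, with one row per line. The names `parse_rows` and `relative_to` are hypothetical helpers, and the choice of RX Vega 64 as the baseline is arbitrary. The sample rows are copied from the Darktable Boat results.

```python
import re

# Assumed row shape: "<label> <dot leader> <value> |<bar of '='>".
ROW = re.compile(r"^(?P<label>.+?)\s\.+\s+(?P<value>\d+(?:\.\d+)?)\s+\|=*$")


def parse_rows(text):
    """Return {label: value} for every line that looks like a result row."""
    results = {}
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match:  # header lines (test name, units) simply do not match
            results[match.group("label")] = float(match.group("value"))
    return results


def relative_to(results, baseline, lower_is_better):
    """Express each result as a speedup over the baseline (>1.0 is faster)."""
    base = results[baseline]
    if lower_is_better:
        return {label: base / value for label, value in results.items()}
    return {label: value / base for label, value in results.items()}


# A few rows copied from the Darktable "Boat" chart (seconds, lower is better).
SAMPLE = """\
Darktable 2.4.2
Test: Boat - Acceleration: OpenCL
Seconds < Lower Is Better
GTX 1080 Ti ....... 2.27 |=========
TITAN RTX ......... 1.61 |======
RX 580 ............ 13.60 |====================================================
RX Vega 64 ........ 3.58 |==============
"""

if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    speedups = relative_to(rows, "RX Vega 64", lower_is_better=True)
    for label, speedup in speedups.items():
        print(f"{label:<12} {speedup:.2f}x")
```

For the throughput charts (GFLOPS, GB/s, Score), pass `lower_is_better=False` so that a larger value maps to a larger speedup.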