NVIDIA vs. OpenCL ROCm Linux vs. AMDGPU-PRO Benchmarks

ROCm 1.4 benchmarks on Ubuntu 16.04 compared to AMDGPU-PRO, now with NVIDIA comparison points. OpenCL benchmarks by Michael Larabel for a future article on Phoronix.com.
HTML result view exported from: https://openbenchmarking.org/result/1701190-KH-1701193RI82&grr&rdt .
System Details

Radeon R9 Fury - ROCm (first configuration; the configurations after it list only the fields that differ):
  Processor: Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores)
  Motherboard: MSI C236A WORKSTATION (MS-7998) v1.0
  Chipset: Intel Sky Lake
  Memory: 16384MB
  Disk: 256GB TOSHIBA-RD400
  Graphics: Sapphire AMD Radeon R9 FURY / NANO 3968MB
  Audio: Realtek ALC1150
  Monitor: Acer B286HK
  Network: Intel Connection
  OS: Ubuntu 16.04
  Kernel: 4.6.0-kfd-compute-rocm-rel-1.4-16 (x86_64)
  Desktop: Unity 7.4.0
  Display Server: X Server 1.18.3
  Display Driver: modesetting 1.18.3
  OpenGL: 4.1 Mesa 11.2.0 Gallium 0.4
  OpenCL: OpenCL 2.0 AMD-APP (2300.5)
  Compiler: GCC 5.4.0 20160609 + Clang 4.0 + LLVM 4.0.0
  File-System: ext4
  Screen Resolution: 3840x2160

Radeon RX 480 - ROCm, Radeon RX 460 - ROCm:
  Graphics: LLVMpipe
  OpenGL: 3.3 Mesa 11.2.0 Gallium 0.4

Radeon RX 460 - AMDGPU-PRO:
  Graphics: AMD Radeon RX 460 2048MB
  Kernel: 4.4.0-59-generic (x86_64)
  Display Driver: amdgpu 1.1.99
  OpenGL: 4.5.13462
  OpenCL: OpenCL 2.0 AMD-APP (2236.5)
  Compiler: GCC 5.4.0 20160609

Radeon RX 480 - AMDGPU-PRO:
  Graphics: AMD Radeon RX 480 8192MB

Radeon R9 Fury - AMDGPU-PRO:
  Graphics: Sapphire AMD Radeon R9 Fury 4096MB

GeForce GTX 1060:
  Graphics: NVIDIA GeForce GTX 1060 6GB 6144MB (418/4006MHz)
  Display Driver: NVIDIA 375.26
  OpenGL: 4.5.0
  OpenCL: OpenCL 1.2 CUDA 8.0.0
  Vulkan: 1.0.24

GeForce GTX 1070:
  Graphics: NVIDIA GeForce GTX 1070 8192MB (1504/4006MHz)

GeForce GTX 1080:
  Graphics: NVIDIA GeForce GTX 1080 8192MB (109/5005MHz)

GeForce GTX 1050:
  Graphics: Zotac NVIDIA GeForce GTX 1050 2048MB (1075/3504MHz)

GeForce GTX 1050 Ti:
  Graphics: eVGA NVIDIA GeForce GTX 1050 Ti 4096MB (1341/3504MHz)

Compiler Details: --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-tune=generic -v

Processor Details: Scaling Governor: intel_pstate powersave

Graphics Details: Radeon R9 Fury - ROCm, Radeon RX 460 - AMDGPU-PRO, Radeon RX 480 - AMDGPU-PRO, Radeon R9 Fury - AMDGPU-PRO: GLAMOR

Environment Details: Radeon RX 480 - ROCm, Radeon RX 460 - ROCm: LIBGL_ALWAYS_SOFTWARE=1

OpenCL Details (GPU Compute Cores):
  GeForce GTX 1060: 1280
  GeForce GTX 1070: 1920
  GeForce GTX 1080: 2560
  GeForce GTX 1050: 640
  GeForce GTX 1050 Ti: 768
Result Overview

Tests, in column order:
   1. luxmark: GPU - Luxball HDR
   2. luxmark: GPU - Microphone
   3. luxmark: GPU - Hotel
   4. mandelgpu: GPU
   5. mandelbulbgpu: GPU
   6. juliagpu: GPU
   7. darktable: Server Room - OpenCL
   8. darktable: Masskrug - OpenCL
   9. darktable: Boat - OpenCL
  10. rodinia: OpenCL Heartwall
  11. shoc: OpenCL - Texture Read Bandwidth
  12. shoc: OpenCL - Bus Speed Readback
  13. shoc: OpenCL - Bus Speed Download
  14. shoc: OpenCL - Max SP Flops
  15. shoc: OpenCL - FFT SP
  16. shoc: OpenCL - Triad

Results per configuration (columns 1 to 16; "-" where no result is listed):
  Radeon R9 Fury - ROCm: 11995, 5695, 1201, 82051996.27, 44388927.12, 73072755.80, 1.48, 6.09, 4.98, 6.45, 214.53, 10.86, 11.32, 5330.67, 399.71, 10.59
  Radeon RX 480 - ROCm: 9196, -, 987, 59296261.87, 49050438.67, 70675082.10, 0.99, 5.93, 5.72, 7.28, 193.49, 8.38, 8.37, 5815.52, 403.22, 7.94
  Radeon RX 460 - ROCm: 3664, -, 381, 28295516.33, 29562658.90, 46101692.27, 2.48, 7.05, 9.57, 13.51, 91.14, 5.27, 5.72, 2158.12, 158.21, 5.21
  Radeon RX 460 - AMDGPU-PRO: 5547, 2623, 897, 35552080.15, 32208376.98, 50807022.25, 2.83, 7.20, 9.51, 7.97, 77.35, 7.14, 6.93, 2066.69, 245.13, 6.25
  Radeon RX 480 - AMDGPU-PRO: 14066, 6924, 2399, 81101281.90, 48517365.80, 81972594.40, 0.99, 5.76, 4.37, 5.35, 160.57, 14.20, 13.66, 5750.69, 508.20, 9.40
  Radeon R9 Fury - AMDGPU-PRO: 19394, 7681, 2402, 107202116.40, 43447360.40, 75992404.70, 1.79, 6.30, 4.22, 6.38, 223.25, 14.21, 13.69, 7131.18, 751.86, 4.12
  GeForce GTX 1060: 11768, 5204, 2092, 112043183.47, 63345982.20, 115523522.73, 1.20, 5.90, 4.67, 3.36, 393.69, 13.22, 12.78, 4780.88, 296.88, 11.85
  GeForce GTX 1070: 16215, 7302, 3023, 159458228.23, 79620073.63, 144431468.40, 0.99, 5.74, 3.87, -, 446.64, 13.22, 12.78, 7115.54, 456.72, 12.08
  GeForce GTX 1080: 12968, 6388, 2993, 206148858.53, 91109498.40, 165302847.33, 0.99, 5.73, 3.72, -, 520.51, 13.22, 12.78, 9415.48, 573.71, 12.20
  GeForce GTX 1050: 6656, 3300, 1128, 51548791.30, 37667402.03, 64896787.13, 11.78, 15.16, 15.45, 5.27, 282.49, 13.11, 12.75, 2115.38, 223.30, 11.25
  GeForce GTX 1050 Ti: 7391, 3612, 1334, 64272664.57, 44889116.70, 78171484.97, 11.01, 15.44, 13.97, 3.65, 316.10, 13.22, 12.78, 2697.13, 188.16, 11.38
LuxMark 3.0
OpenCL Device: GPU - Scene: Luxball HDR
Score, More Is Better
  Radeon R9 Fury - ROCm: 11995 (SE +/- 17.34, N = 3)
  Radeon RX 480 - ROCm: 9196 (SE +/- 0.67, N = 3)
  Radeon RX 460 - ROCm: 3664 (SE +/- 17.00, N = 3)
  Radeon RX 460 - AMDGPU-PRO: 5547 (SE +/- 9.82, N = 3)
  Radeon RX 480 - AMDGPU-PRO: 14066 (SE +/- 68.10, N = 3)
  Radeon R9 Fury - AMDGPU-PRO: 19394 (SE +/- 75.47, N = 3)
  GeForce GTX 1060: 11768 (SE +/- 36.34, N = 3)
  GeForce GTX 1070: 16215 (SE +/- 2.31, N = 3)
  GeForce GTX 1080: 12968 (SE +/- 12.45, N = 3)
  GeForce GTX 1050: 6656 (SE +/- 5.20, N = 3)
  GeForce GTX 1050 Ti: 7391 (SE +/- 17.00, N = 3)
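The Luxball HDR scores above let the two AMD driver stacks be compared card by card. The following is a minimal sketch of that arithmetic, assuming the scores are read in the order the export lists them; the names `luxball_scores` and `rocm_relative_to_pro` are hypothetical and not part of the Phoronix Test Suite.

```python
# LuxMark 3.0 Luxball HDR scores copied from the results above (more is better).
luxball_scores = {
    "Radeon R9 Fury": {"ROCm": 11995, "AMDGPU-PRO": 19394},
    "Radeon RX 480": {"ROCm": 9196, "AMDGPU-PRO": 14066},
    "Radeon RX 460": {"ROCm": 3664, "AMDGPU-PRO": 5547},
}


def rocm_relative_to_pro(scores):
    """Return the ROCm score as a fraction of the AMDGPU-PRO score, per card."""
    return {card: s["ROCm"] / s["AMDGPU-PRO"] for card, s in scores.items()}


ratios = rocm_relative_to_pro(luxball_scores)
for card, ratio in ratios.items():
    print(f"{card}: ROCm reaches {ratio:.1%} of the AMDGPU-PRO score")
```

A ratio below 1.0 means the ROCm 1.4 OpenCL stack scored lower than AMDGPU-PRO on that card in this one scene; it says nothing about the other tests.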
LuxMark 3.0
OpenCL Device: GPU - Scene: Microphone
Score, More Is Better
  Radeon R9 Fury - ROCm: 5695 (SE +/- 15.04, N = 3)
  Radeon RX 460 - AMDGPU-PRO: 2623 (SE +/- 6.98, N = 3)
  Radeon RX 480 - AMDGPU-PRO: 6924 (SE +/- 13.50, N = 3)
  Radeon R9 Fury - AMDGPU-PRO: 7681 (SE +/- 17.84, N = 3)
  GeForce GTX 1060: 5204 (SE +/- 3.51, N = 3)
  GeForce GTX 1070: 7302 (SE +/- 38.17, N = 3)
  GeForce GTX 1080: 6388 (SE +/- 2.03, N = 3)
  GeForce GTX 1050: 3300 (SE +/- 3.38, N = 3)
  GeForce GTX 1050 Ti: 3612 (SE +/- 2.52, N = 3)
LuxMark 3.0
OpenCL Device: GPU - Scene: Hotel
Score, More Is Better
  Radeon R9 Fury - ROCm: 1201 (SE +/- 0.00, N = 3)
  Radeon RX 480 - ROCm: 987 (SE +/- 2.40, N = 3)
  Radeon RX 460 - ROCm: 381 (SE +/- 0.58, N = 3)
  Radeon RX 460 - AMDGPU-PRO: 897 (SE +/- 1.00, N = 3)
  Radeon RX 480 - AMDGPU-PRO: 2399 (SE +/- 6.94, N = 3)
  Radeon R9 Fury - AMDGPU-PRO: 2402 (SE +/- 11.46, N = 3)
  GeForce GTX 1060: 2092 (SE +/- 6.03, N = 3)
  GeForce GTX 1070: 3023 (SE +/- 4.91, N = 3)
  GeForce GTX 1080: 2993 (SE +/- 9.00, N = 3)
  GeForce GTX 1050: 1128 (SE +/- 5.67, N = 3)
  GeForce GTX 1050 Ti: 1334 (SE +/- 3.79, N = 3)
MandelGPU 1.3pts1
OpenCL Device: GPU
Samples/sec, More Is Better
  Radeon R9 Fury - ROCm: 82051996.27
  Radeon RX 480 - ROCm: 59296261.87
  Radeon RX 460 - ROCm: 28295516.33
  Radeon RX 460 - AMDGPU-PRO: 35552080.15
  Radeon RX 480 - AMDGPU-PRO: 81101281.90
  Radeon R9 Fury - AMDGPU-PRO: 107202116.40
  GeForce GTX 1060: 112043183.47
  GeForce GTX 1070: 159458228.23
  GeForce GTX 1080: 206148858.53
  GeForce GTX 1050: 51548791.30
  GeForce GTX 1050 Ti: 64272664.57
SE as listed (9 entries for 11 results, so not attributed to individual configurations): +/- 71744.88 (N = 3), 126265.24 (N = 3), 30521.44 (N = 3), 165178.15 (N = 2), 104172.86 (N = 3), 248567.98 (N = 3), 971382.09 (N = 3), 26110.91 (N = 3), 75826.86 (N = 3)
1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
MandelbulbGPU 1.0pts1
OpenCL Device: GPU
Samples/sec, More Is Better
  Radeon R9 Fury - ROCm: 44388927.12
  Radeon RX 480 - ROCm: 49050438.67
  Radeon RX 460 - ROCm: 29562658.90
  Radeon RX 460 - AMDGPU-PRO: 32208376.98
  Radeon RX 480 - AMDGPU-PRO: 48517365.80
  Radeon R9 Fury - AMDGPU-PRO: 43447360.40
  GeForce GTX 1060: 63345982.20
  GeForce GTX 1070: 79620073.63
  GeForce GTX 1080: 91109498.40
  GeForce GTX 1050: 37667402.03
  GeForce GTX 1050 Ti: 44889116.70
SE as listed (9 entries for 11 results, so not attributed to individual configurations): +/- 2304744.64 (N = 6), 81023.55 (N = 3), 117840.12 (N = 3), 561923.72 (N = 4), 290297.61 (N = 3), 503324.21 (N = 3), 423859.17 (N = 3), 36018.97 (N = 3), 112131.74 (N = 3)
1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
JuliaGPU 1.2pts1
OpenCL Device: GPU
Samples/sec, More Is Better
  Radeon R9 Fury - ROCm: 73072755.80
  Radeon RX 480 - ROCm: 70675082.10
  Radeon RX 460 - ROCm: 46101692.27
  Radeon RX 460 - AMDGPU-PRO: 50807022.25
  Radeon RX 480 - AMDGPU-PRO: 81972594.40
  Radeon R9 Fury - AMDGPU-PRO: 75992404.70
  GeForce GTX 1060: 115523522.73
  GeForce GTX 1070: 144431468.40
  GeForce GTX 1080: 165302847.33
  GeForce GTX 1050: 64896787.13
  GeForce GTX 1050 Ti: 78171484.97
SE as listed (10 entries for 11 results, so not attributed to individual configurations): +/- 985012.76 (N = 3), 94849.94 (N = 3), 160084.77 (N = 3), 97714.65 (N = 2), 500734.10 (N = 2), 194570.11 (N = 3), 169012.99 (N = 3), 694138.93 (N = 3), 29908.92 (N = 3), 109924.23 (N = 3)
1. (CC) gcc options: -O3 -march=native -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL -lm
Darktable 2.2.1
Test: Server Room - Acceleration: OpenCL
Seconds, Fewer Is Better
  Radeon R9 Fury - ROCm: 1.48 (SE +/- 0.02, N = 3)
  Radeon RX 480 - ROCm: 0.99 (SE +/- 0.02, N = 4)
  Radeon RX 460 - ROCm: 2.48 (SE +/- 0.03, N = 3)
  Radeon RX 460 - AMDGPU-PRO: 2.83 (SE +/- 0.02, N = 3)
  Radeon RX 480 - AMDGPU-PRO: 0.99 (SE +/- 0.01, N = 3)
  Radeon R9 Fury - AMDGPU-PRO: 1.79 (SE +/- 0.04, N = 6)
  GeForce GTX 1060: 1.20 (SE +/- 0.01, N = 3)
  GeForce GTX 1070: 0.99 (SE +/- 0.00, N = 3)
  GeForce GTX 1080: 0.99 (SE +/- 0.00, N = 3)
  GeForce GTX 1050: 11.78 (SE +/- 0.01, N = 3)
  GeForce GTX 1050 Ti: 11.01 (SE +/- 0.01, N = 3)
Darktable 2.2.1
Test: Masskrug - Acceleration: OpenCL
Seconds, Fewer Is Better
  Radeon R9 Fury - ROCm: 6.09 (SE +/- 0.00, N = 3)
  Radeon RX 480 - ROCm: 5.93 (SE +/- 0.08, N = 3)
  Radeon RX 460 - ROCm: 7.05 (SE +/- 0.09, N = 3)
  Radeon RX 460 - AMDGPU-PRO: 7.20 (SE +/- 0.03, N = 3)
  Radeon RX 480 - AMDGPU-PRO: 5.76 (SE +/- 0.04, N = 3)
  Radeon R9 Fury - AMDGPU-PRO: 6.30 (SE +/- 0.02, N = 3)
  GeForce GTX 1060: 5.90 (SE +/- 0.01, N = 3)
  GeForce GTX 1070: 5.74 (SE +/- 0.00, N = 3)
  GeForce GTX 1080: 5.73 (SE +/- 0.01, N = 3)
  GeForce GTX 1050: 15.16 (SE +/- 0.00, N = 3)
  GeForce GTX 1050 Ti: 15.44 (SE +/- 0.00, N = 3)
Darktable 2.2.1
Test: Boat - Acceleration: OpenCL
Seconds, Fewer Is Better
  Radeon R9 Fury - ROCm: 4.98 (SE +/- 0.03, N = 3)
  Radeon RX 480 - ROCm: 5.72 (SE +/- 0.02, N = 3)
  Radeon RX 460 - ROCm: 9.57 (SE +/- 0.03, N = 3)
  Radeon RX 460 - AMDGPU-PRO: 9.51 (SE +/- 0.77, N = 6)
  Radeon RX 480 - AMDGPU-PRO: 4.37 (SE +/- 0.01, N = 3)
  Radeon R9 Fury - AMDGPU-PRO: 4.22 (SE +/- 0.07, N = 3)
  GeForce GTX 1060: 4.67 (SE +/- 0.01, N = 3)
  GeForce GTX 1070: 3.87 (SE +/- 0.00, N = 3)
  GeForce GTX 1080: 3.72 (SE +/- 0.00, N = 3)
  GeForce GTX 1050: 15.45 (SE +/- 0.01, N = 3)
  GeForce GTX 1050 Ti: 13.97 (SE +/- 0.01, N = 3)
Rodinia 2.4
Test: OpenCL Heartwall
Seconds, Fewer Is Better
  Radeon R9 Fury - ROCm: 6.45 (SE +/- 0.16, N = 6)
  Radeon RX 480 - ROCm: 7.28 (SE +/- 0.03, N = 3)
  Radeon RX 460 - ROCm: 13.51 (SE +/- 0.08, N = 3)
  Radeon RX 460 - AMDGPU-PRO: 7.97 (SE +/- 0.01, N = 3)
  Radeon RX 480 - AMDGPU-PRO: 5.35 (SE +/- 0.02, N = 3)
  Radeon R9 Fury - AMDGPU-PRO: 6.38 (SE +/- 0.07, N = 3)
  GeForce GTX 1060: 3.36 (SE +/- 0.05, N = 5)
  GeForce GTX 1050: 5.27 (SE +/- 0.04, N = 3)
  GeForce GTX 1050 Ti: 3.65 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O2 -lOpenCL
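Because Heartwall reports run time, where fewer seconds is better, a relative comparison has to invert the ratio used for score-based tests. The sketch below shows one way to express the times above as speedups over a chosen baseline; the baseline choice and the names `heartwall_seconds` and `speedup_over` are illustrative assumptions, not something the export defines.

```python
# Rodinia 2.4 OpenCL Heartwall run times in seconds, copied from the results above.
heartwall_seconds = {
    "Radeon R9 Fury - ROCm": 6.45,
    "Radeon RX 480 - ROCm": 7.28,
    "Radeon RX 460 - ROCm": 13.51,
    "Radeon RX 460 - AMDGPU-PRO": 7.97,
    "Radeon RX 480 - AMDGPU-PRO": 5.35,
    "Radeon R9 Fury - AMDGPU-PRO": 6.38,
    "GeForce GTX 1060": 3.36,
    "GeForce GTX 1050": 5.27,
    "GeForce GTX 1050 Ti": 3.65,
}


def speedup_over(times, baseline_name):
    """For a fewer-is-better metric, speedup = baseline time / measured time."""
    baseline_time = times[baseline_name]
    return {name: baseline_time / t for name, t in times.items()}


# Assumption for illustration: use the slowest listed result as the baseline.
speedups = speedup_over(heartwall_seconds, "Radeon RX 460 - ROCm")
for name, factor in sorted(speedups.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {factor:.2f}x")
```

The GeForce GTX 1070 and GTX 1080 are absent here because the export lists no Heartwall result for them.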
SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Texture Read Bandwidth
GB/s, More Is Better
  Radeon R9 Fury - ROCm: 214.53 (SE +/- 4.26, N = 3)
  Radeon RX 480 - ROCm: 193.49 (SE +/- 1.30, N = 3)
  Radeon RX 460 - ROCm: 91.14 (SE +/- 0.16, N = 3)
  Radeon RX 460 - AMDGPU-PRO: 77.35 (SE +/- 0.69, N = 3)
  Radeon RX 480 - AMDGPU-PRO: 160.57 (SE +/- 0.37, N = 3)
  Radeon R9 Fury - AMDGPU-PRO: 223.25 (SE +/- 1.03, N = 3)
  GeForce GTX 1060: 393.69 (SE +/- 0.96, N = 3)
  GeForce GTX 1070: 446.64 (SE +/- 0.12, N = 3)
  GeForce GTX 1080: 520.51 (SE +/- 1.14, N = 3)
  GeForce GTX 1050: 282.49 (SE +/- 0.98, N = 3)
  GeForce GTX 1050 Ti: 316.10 (SE +/- 1.06, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt
   Additional options listed for eight of the configurations: -lSHOCCommonMPI -pthread -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Bus Speed Readback
GB/s, More Is Better
  Radeon R9 Fury - ROCm: 10.86 (SE +/- 0.03, N = 3)
  Radeon RX 480 - ROCm: 8.38 (SE +/- 0.00, N = 3)
  Radeon RX 460 - ROCm: 5.27 (SE +/- 0.01, N = 3)
  Radeon RX 460 - AMDGPU-PRO: 7.14 (SE +/- 0.00, N = 3)
  Radeon RX 480 - AMDGPU-PRO: 14.20 (SE +/- 0.00, N = 3)
  Radeon R9 Fury - AMDGPU-PRO: 14.21 (SE +/- 0.01, N = 3)
  GeForce GTX 1060: 13.22 (SE +/- 0.00, N = 3)
  GeForce GTX 1070: 13.22 (SE +/- 0.00, N = 3)
  GeForce GTX 1080: 13.22 (SE +/- 0.00, N = 3)
  GeForce GTX 1050: 13.11 (SE +/- 0.03, N = 3)
  GeForce GTX 1050 Ti: 13.22 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt
   Additional options listed for eight of the configurations: -lSHOCCommonMPI -pthread -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Bus Speed Download
GB/s, More Is Better
  Radeon R9 Fury - ROCm: 11.32 (SE +/- 0.01, N = 3)
  Radeon RX 480 - ROCm: 8.37 (SE +/- 0.00, N = 3)
  Radeon RX 460 - ROCm: 5.72 (SE +/- 0.00, N = 3)
  Radeon RX 460 - AMDGPU-PRO: 6.93 (SE +/- 0.00, N = 3)
  Radeon RX 480 - AMDGPU-PRO: 13.66 (SE +/- 0.00, N = 3)
  Radeon R9 Fury - AMDGPU-PRO: 13.69 (SE +/- 0.01, N = 3)
  GeForce GTX 1060: 12.78 (SE +/- 0.00, N = 3)
  GeForce GTX 1070: 12.78 (SE +/- 0.00, N = 3)
  GeForce GTX 1080: 12.78 (SE +/- 0.00, N = 3)
  GeForce GTX 1050: 12.75 (SE +/- 0.00, N = 3)
  GeForce GTX 1050 Ti: 12.78 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt
   Additional options listed for eight of the configurations: -lSHOCCommonMPI -pthread -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Max SP Flops
GFLOPS, More Is Better
  Radeon R9 Fury - ROCm: 5330.67 (SE +/- 369.63, N = 6)
  Radeon RX 480 - ROCm: 5815.52 (SE +/- 4.14, N = 3)
  Radeon RX 460 - ROCm: 2158.12 (SE +/- 0.07, N = 3)
  Radeon RX 460 - AMDGPU-PRO: 2066.69 (SE +/- 18.41, N = 3)
  Radeon RX 480 - AMDGPU-PRO: 5750.69 (SE +/- 30.75, N = 3)
  Radeon R9 Fury - AMDGPU-PRO: 7131.18 (SE +/- 0.69, N = 3)
  GeForce GTX 1060: 4780.88 (SE +/- 22.70, N = 3)
  GeForce GTX 1070: 7115.54 (SE +/- 52.21, N = 3)
  GeForce GTX 1080: 9415.48 (SE +/- 70.36, N = 3)
  GeForce GTX 1050: 2115.38 (SE +/- 0.01, N = 3)
  GeForce GTX 1050 Ti: 2697.13 (SE +/- 5.23, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt
   Additional options listed for eight of the configurations: -lSHOCCommonMPI -pthread -lmpi_cxx -lmpi
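The Max SP Flops figures can be cross-checked against the GPU Compute Cores counts that the export reports for the NVIDIA cards. The sketch below derives the clock each result would imply, assuming every core retires one fused multiply-add (2 FLOPs) per cycle; that per-cycle figure and the names `implied_clock_ghz` and `implied_clocks` are assumptions made for illustration, so the output is a rough sanity check and not a measured frequency.

```python
# SHOC Max SP Flops results (GFLOPS) for the NVIDIA cards, copied from above.
max_sp_gflops = {
    "GeForce GTX 1060": 4780.88,
    "GeForce GTX 1070": 7115.54,
    "GeForce GTX 1080": 9415.48,
    "GeForce GTX 1050": 2115.38,
    "GeForce GTX 1050 Ti": 2697.13,
}

# "GPU Compute Cores" as reported in the OpenCL details of this result file.
compute_cores = {
    "GeForce GTX 1060": 1280,
    "GeForce GTX 1070": 1920,
    "GeForce GTX 1080": 2560,
    "GeForce GTX 1050": 640,
    "GeForce GTX 1050 Ti": 768,
}

# Assumption: one fused multiply-add per core per cycle, counted as 2 FLOPs.
FLOPS_PER_CORE_PER_CYCLE = 2


def implied_clock_ghz(gflops, cores, flops_per_cycle=FLOPS_PER_CORE_PER_CYCLE):
    """Clock in GHz that would yield `gflops` at the assumed per-cycle throughput."""
    return gflops / (cores * flops_per_cycle)


implied_clocks = {
    gpu: implied_clock_ghz(max_sp_gflops[gpu], compute_cores[gpu])
    for gpu in max_sp_gflops
}
for gpu, ghz in implied_clocks.items():
    print(f"{gpu}: about {ghz:.2f} GHz implied")
```

If the implied clocks land close to plausible boost frequencies, the measured throughput is consistent with the cards running near their arithmetic peak under that assumption.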
SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: FFT SP
GFLOPS, More Is Better
  Radeon R9 Fury - ROCm: 399.71 (SE +/- 0.44, N = 3)
  Radeon RX 480 - ROCm: 403.22 (SE +/- 6.05, N = 3)
  Radeon RX 460 - ROCm: 158.21 (SE +/- 0.04, N = 3)
  Radeon RX 460 - AMDGPU-PRO: 245.13 (SE +/- 1.23, N = 3)
  Radeon RX 480 - AMDGPU-PRO: 508.20 (SE +/- 2.19, N = 3)
  Radeon R9 Fury - AMDGPU-PRO: 751.86 (SE +/- 14.35, N = 3)
  GeForce GTX 1060: 296.88 (SE +/- 4.87, N = 3)
  GeForce GTX 1070: 456.72 (SE +/- 6.56, N = 6)
  GeForce GTX 1080: 573.71 (SE +/- 6.31, N = 3)
  GeForce GTX 1050: 223.30 (SE +/- 2.58, N = 3)
  GeForce GTX 1050 Ti: 188.16 (SE +/- 2.31, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt
   Additional options listed for eight of the configurations: -lSHOCCommonMPI -pthread -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Triad
GB/s, More Is Better
  Radeon R9 Fury - ROCm: 10.59 (SE +/- 0.04, N = 3)
  Radeon RX 480 - ROCm: 7.94 (SE +/- 0.01, N = 3)
  Radeon RX 460 - ROCm: 5.21 (SE +/- 0.00, N = 3)
  Radeon RX 460 - AMDGPU-PRO: 6.25 (SE +/- 0.10, N = 3)
  Radeon RX 480 - AMDGPU-PRO: 9.40 (SE +/- 0.14, N = 4)
  Radeon R9 Fury - AMDGPU-PRO: 4.12 (SE +/- 0.01, N = 3)
  GeForce GTX 1060: 11.85 (SE +/- 0.00, N = 3)
  GeForce GTX 1070: 12.08 (SE +/- 0.00, N = 3)
  GeForce GTX 1080: 12.20 (SE +/- 0.00, N = 3)
  GeForce GTX 1050: 11.25 (SE +/- 0.01, N = 3)
  GeForce GTX 1050 Ti: 11.38 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt
   Additional options listed for eight of the configurations: -lSHOCCommonMPI -pthread -lmpi_cxx -lmpi
Phoronix Test Suite v10.8.5