NVIDIA vs. OpenCL ROCm Linux vs. AMDGPU-PRO Benchmarks

ROCm 1.4 benchmarks on Ubuntu 16.04 compared to AMDGPU-PRO, now with NVIDIA comparison points. OpenCL benchmarks by Michael Larabel for a future article on Phoronix.com.

HTML result view exported from: https://openbenchmarking.org/result/1701193-RI-OPENCLCOM83&grs&sro

Configurations tested: GeForce GTX 1050, GeForce GTX 1050 Ti, GeForce GTX 1060, GeForce GTX 1070, GeForce GTX 1080, Radeon RX 460 - AMDGPU-PRO, Radeon RX 480 - AMDGPU-PRO, Radeon R9 Fury - AMDGPU-PRO, Radeon RX 460 - ROCm, Radeon RX 480 - ROCm, Radeon R9 Fury - ROCm

Base configuration (as reported for GeForce GTX 1050):
  Processor: Intel Xeon E3-1280 v5 @ 4.00GHz (8 Cores)
  Motherboard: MSI C236A WORKSTATION (MS-7998) v1.0
  Chipset: Intel Sky Lake
  Memory: 16384MB
  Disk: 256GB TOSHIBA-RD400
  Graphics: Zotac NVIDIA GeForce GTX 1050 2048MB (1075/3504MHz)
  Audio: Realtek ALC1150
  Network: Intel Connection
  OS: Ubuntu 16.04
  Kernel: 4.4.0-59-generic (x86_64)
  Desktop: Unity 7.4.0
  Display Server: X Server 1.18.3
  Display Driver: NVIDIA 375.26
  OpenGL: 4.5.0
  OpenCL: OpenCL 1.2 CUDA 8.0.0
  Vulkan: 1.0.24
  Compiler: GCC 5.4.0 20160609
  File-System: ext4
  Screen Resolution: 3840x2160

Fields listed as changed for each later configuration:
  GeForce GTX 1050 Ti: Graphics: eVGA NVIDIA GeForce GTX 1050 Ti 4096MB (1341/3504MHz)
  GeForce GTX 1060: Graphics: NVIDIA GeForce GTX 1060 6GB 6144MB (418/4006MHz)
  GeForce GTX 1070: Graphics: NVIDIA GeForce GTX 1070 8192MB (1504/4006MHz)
  GeForce GTX 1080: Graphics: NVIDIA GeForce GTX 1080 8192MB (109/5005MHz)
  Radeon RX 460 - AMDGPU-PRO: Graphics: AMD Radeon RX 460 2048MB; Monitor: Acer B286HK; Display Driver: amdgpu 1.1.99; OpenGL: 4.5.13462; OpenCL: OpenCL 2.0 AMD-APP (2236.5)
  Radeon RX 480 - AMDGPU-PRO: Graphics: AMD Radeon RX 480 8192MB
  Radeon R9 Fury - AMDGPU-PRO: Graphics: Sapphire AMD Radeon R9 Fury 4096MB
  Radeon RX 460 - ROCm: Graphics: LLVMpipe; Kernel: 4.6.0-kfd-compute-rocm-rel-1.4-16 (x86_64); Display Driver: modesetting 1.18.3; OpenGL: 3.3 Mesa 11.2.0 Gallium 0.4; OpenCL: OpenCL 2.0 AMD-APP (2300.5); Compiler: GCC 5.4.0 20160609 + Clang 4.0 + LLVM 4.0.0
  Radeon RX 480 - ROCm: no changed fields listed
  Radeon R9 Fury - ROCm: Graphics: Sapphire AMD Radeon R9 FURY / NANO 3968MB; OpenGL: 4.1 Mesa 11.2.0 Gallium 0.4

Compiler Details - --build=x86_64-linux-gnu --disable-browser-plugin --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-default-libstdcxx-abi=new --with-multilib-list=m32,m64,mx32 --with-tune=generic -v

Processor Details - Scaling Governor: intel_pstate powersave

OpenCL Details - GPU Compute Cores:
  GeForce GTX 1050: 640
  GeForce GTX 1050 Ti: 768
  GeForce GTX 1060: 1280
  GeForce GTX 1070: 1920
  GeForce GTX 1080: 2560

Graphics Details - Radeon RX 460 - AMDGPU-PRO, Radeon RX 480 - AMDGPU-PRO, Radeon R9 Fury - AMDGPU-PRO, Radeon R9 Fury - ROCm: GLAMOR

Environment Details - Radeon RX 460 - ROCm, Radeon RX 480 - ROCm: LIBGL_ALWAYS_SOFTWARE=1

Results overview ("-" marks a test with no result for that configuration)

NVIDIA GeForce:
Test                                    GTX 1050      GTX 1050 Ti   GTX 1060      GTX 1070      GTX 1080
luxmark: GPU - Hotel                    1128          1334          2092          3023          2993
mandelgpu: GPU                          51548791.30   64272664.57   112043183.47  159458228.23  206148858.53
shoc: OpenCL - Texture Read Bandwidth   282.49        316.10        393.69        446.64        520.51
luxmark: GPU - Luxball HDR              6656          7391          11768         16215         12968
shoc: OpenCL - FFT SP                   223.30        188.16        296.88        456.72        573.71
shoc: OpenCL - Max SP Flops             2115.38       2697.13       4780.88       7115.54       9415.48
darktable: Boat - OpenCL                15.45         13.97         4.67          3.87          3.72
rodinia: OpenCL Heartwall               5.27          3.65          3.36          -             -
juliagpu: GPU                           64896787.13   78171484.97   115523522.73  144431468.40  165302847.33
mandelbulbgpu: GPU                      37667402.03   44889116.70   63345982.20   79620073.63   91109498.40
shoc: OpenCL - Triad                    11.25         11.38         11.85         12.08         12.20
luxmark: GPU - Microphone               3300          3612          5204          7302          6388
shoc: OpenCL - Bus Speed Readback       13.11         13.22         13.22         13.22         13.22
darktable: Masskrug - OpenCL            15.16         15.44         5.90          5.74          5.73
shoc: OpenCL - Bus Speed Download       12.75         12.78         12.78         12.78         12.78
darktable: Server Room - OpenCL         11.78         11.01         1.20          0.99          0.99

AMD Radeon (PRO = AMDGPU-PRO):
Test                                    RX 460 PRO    RX 480 PRO    R9 Fury PRO   RX 460 ROCm   RX 480 ROCm   R9 Fury ROCm
luxmark: GPU - Hotel                    897           2399          2402          381           987           1201
mandelgpu: GPU                          35552080.15   81101281.90   107202116.40  28295516.33   59296261.87   82051996.27
shoc: OpenCL - Texture Read Bandwidth   77.35         160.57        223.25        91.14         193.49        214.53
luxmark: GPU - Luxball HDR              5547          14066         19394         3664          9196          11995
shoc: OpenCL - FFT SP                   245.13        508.20        751.86        158.21        403.22        399.71
shoc: OpenCL - Max SP Flops             2066.69       5750.69       7131.18       2158.12       5815.52       5330.67
darktable: Boat - OpenCL                9.51          4.37          4.22          9.57          5.72          4.98
rodinia: OpenCL Heartwall               7.97          5.35          6.38          13.51         7.28          6.45
juliagpu: GPU                           50807022.25   81972594.40   75992404.70   46101692.27   70675082.10   73072755.80
mandelbulbgpu: GPU                      32208376.98   48517365.80   43447360.40   29562658.90   49050438.67   44388927.12
shoc: OpenCL - Triad                    6.25          9.40          4.12          5.21          7.94          10.59
luxmark: GPU - Microphone               2623          6924          7681          -             -             5695
shoc: OpenCL - Bus Speed Readback       7.14          14.20         14.21         5.27          8.38          10.86
darktable: Masskrug - OpenCL            7.20          5.76          6.30          7.05          5.93          6.09
shoc: OpenCL - Bus Speed Download       6.93          13.66         13.69         5.72          8.37          11.32
darktable: Server Room - OpenCL         2.83          0.99          1.79          2.48          0.99          1.48

LuxMark 3.0
OpenCL Device: GPU - Scene: Hotel
Score, More Is Better

GeForce GTX 1050: 1128 (SE +/- 5.67, N = 3)
GeForce GTX 1050 Ti: 1334 (SE +/- 3.79, N = 3)
GeForce GTX 1060: 2092 (SE +/- 6.03, N = 3)
GeForce GTX 1070: 3023 (SE +/- 4.91, N = 3)
GeForce GTX 1080: 2993 (SE +/- 9.00, N = 3)
Radeon R9 Fury - AMDGPU-PRO: 2402 (SE +/- 11.46, N = 3)
Radeon R9 Fury - ROCm: 1201 (SE +/- 0.00, N = 3)
Radeon RX 460 - AMDGPU-PRO: 897 (SE +/- 1.00, N = 3)
Radeon RX 460 - ROCm: 381 (SE +/- 0.58, N = 3)
Radeon RX 480 - AMDGPU-PRO: 2399 (SE +/- 6.94, N = 3)
Radeon RX 480 - ROCm: 987 (SE +/- 2.40, N = 3)

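The LuxMark Hotel scores above can be turned into a direct ROCm versus AMDGPU-PRO comparison per Radeon card. The following is a minimal sketch, not part of the original result export; the dictionary layout and the function name `rocm_relative` are illustrative assumptions, and only the scores themselves come from the results.

```python
# Scores copied from the LuxMark 3.0 Hotel results (higher is better).
# The data layout and helper name are illustrative, not from the export.
hotel_scores = {
    "Radeon RX 460": {"AMDGPU-PRO": 897, "ROCm": 381},
    "Radeon RX 480": {"AMDGPU-PRO": 2399, "ROCm": 987},
    "Radeon R9 Fury": {"AMDGPU-PRO": 2402, "ROCm": 1201},
}


def rocm_relative(scores):
    """Return the ROCm score as a fraction of the AMDGPU-PRO score per card."""
    return {card: s["ROCm"] / s["AMDGPU-PRO"] for card, s in scores.items()}


ratios = rocm_relative(hotel_scores)
for card, ratio in ratios.items():
    print(f"{card}: ROCm reaches {ratio:.0%} of the AMDGPU-PRO score")
```

For a "More Is Better" metric such as this one, a ratio below 1.0 means the ROCm stack scored lower than AMDGPU-PRO on the same card.
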
MandelGPU 1.3pts1
OpenCL Device: GPU
Samples/sec, More Is Better

GeForce GTX 1050: 51548791.30
GeForce GTX 1050 Ti: 64272664.57
GeForce GTX 1060: 112043183.47
GeForce GTX 1070: 159458228.23
GeForce GTX 1080: 206148858.53
Radeon R9 Fury - AMDGPU-PRO: 107202116.40
Radeon R9 Fury - ROCm: 82051996.27
Radeon RX 460 - AMDGPU-PRO: 35552080.15
Radeon RX 460 - ROCm: 28295516.33
Radeon RX 480 - AMDGPU-PRO: 81101281.90
Radeon RX 480 - ROCm: 59296261.87

Standard errors as listed in the export (9 entries for 11 results, so not matched to individual cards): SE +/- 26110.91, N = 3; SE +/- 75826.86, N = 3; SE +/- 104172.86, N = 3; SE +/- 248567.98, N = 3; SE +/- 971382.09, N = 3; SE +/- 71744.88, N = 3; SE +/- 165178.15, N = 2; SE +/- 30521.44, N = 3; SE +/- 126265.24, N = 3

1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Texture Read Bandwidth
GB/s, More Is Better

GeForce GTX 1050: 282.49 (SE +/- 0.98, N = 3)
GeForce GTX 1050 Ti: 316.10 (SE +/- 1.06, N = 3)
GeForce GTX 1060: 393.69 (SE +/- 0.96, N = 3)
GeForce GTX 1070: 446.64 (SE +/- 0.12, N = 3)
GeForce GTX 1080: 520.51 (SE +/- 1.14, N = 3)
Radeon R9 Fury - AMDGPU-PRO: 223.25 (SE +/- 1.03, N = 3)
Radeon R9 Fury - ROCm: 214.53 (SE +/- 4.26, N = 3)
Radeon RX 460 - AMDGPU-PRO: 77.35 (SE +/- 0.69, N = 3)
Radeon RX 460 - ROCm: 91.14 (SE +/- 0.16, N = 3)
Radeon RX 480 - AMDGPU-PRO: 160.57 (SE +/- 0.37, N = 3)
Radeon RX 480 - ROCm: 193.49 (SE +/- 1.30, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt
Eight result entries additionally list: -lSHOCCommonMPI -pthread -lmpi_cxx -lmpi

LuxMark 3.0
OpenCL Device: GPU - Scene: Luxball HDR
Score, More Is Better

GeForce GTX 1050: 6656 (SE +/- 5.20, N = 3)
GeForce GTX 1050 Ti: 7391 (SE +/- 17.00, N = 3)
GeForce GTX 1060: 11768 (SE +/- 36.34, N = 3)
GeForce GTX 1070: 16215 (SE +/- 2.31, N = 3)
GeForce GTX 1080: 12968 (SE +/- 12.45, N = 3)
Radeon R9 Fury - AMDGPU-PRO: 19394 (SE +/- 75.47, N = 3)
Radeon R9 Fury - ROCm: 11995 (SE +/- 17.34, N = 3)
Radeon RX 460 - AMDGPU-PRO: 5547 (SE +/- 9.82, N = 3)
Radeon RX 460 - ROCm: 3664 (SE +/- 17.00, N = 3)
Radeon RX 480 - AMDGPU-PRO: 14066 (SE +/- 68.10, N = 3)
Radeon RX 480 - ROCm: 9196 (SE +/- 0.67, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: FFT SP
GFLOPS, More Is Better

GeForce GTX 1050: 223.30 (SE +/- 2.58, N = 3)
GeForce GTX 1050 Ti: 188.16 (SE +/- 2.31, N = 3)
GeForce GTX 1060: 296.88 (SE +/- 4.87, N = 3)
GeForce GTX 1070: 456.72 (SE +/- 6.56, N = 6)
GeForce GTX 1080: 573.71 (SE +/- 6.31, N = 3)
Radeon R9 Fury - AMDGPU-PRO: 751.86 (SE +/- 14.35, N = 3)
Radeon R9 Fury - ROCm: 399.71 (SE +/- 0.44, N = 3)
Radeon RX 460 - AMDGPU-PRO: 245.13 (SE +/- 1.23, N = 3)
Radeon RX 460 - ROCm: 158.21 (SE +/- 0.04, N = 3)
Radeon RX 480 - AMDGPU-PRO: 508.20 (SE +/- 2.19, N = 3)
Radeon RX 480 - ROCm: 403.22 (SE +/- 6.05, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt
Eight result entries additionally list: -lSHOCCommonMPI -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Max SP Flops
GFLOPS, More Is Better

GeForce GTX 1050: 2115.38 (SE +/- 0.01, N = 3)
GeForce GTX 1050 Ti: 2697.13 (SE +/- 5.23, N = 3)
GeForce GTX 1060: 4780.88 (SE +/- 22.70, N = 3)
GeForce GTX 1070: 7115.54 (SE +/- 52.21, N = 3)
GeForce GTX 1080: 9415.48 (SE +/- 70.36, N = 3)
Radeon R9 Fury - AMDGPU-PRO: 7131.18 (SE +/- 0.69, N = 3)
Radeon R9 Fury - ROCm: 5330.67 (SE +/- 369.63, N = 6)
Radeon RX 460 - AMDGPU-PRO: 2066.69 (SE +/- 18.41, N = 3)
Radeon RX 460 - ROCm: 2158.12 (SE +/- 0.07, N = 3)
Radeon RX 480 - AMDGPU-PRO: 5750.69 (SE +/- 30.75, N = 3)
Radeon RX 480 - ROCm: 5815.52 (SE +/- 4.14, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt
Eight result entries additionally list: -lSHOCCommonMPI -pthread -lmpi_cxx -lmpi

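One way to read the Max SP Flops figures for the GeForce cards is to normalize them by the GPU compute core counts reported in the OpenCL details of this result file. The sketch below is an illustration only; the variable and function names are assumptions, and the export reports core counts for the NVIDIA cards only, so the Radeon cards are left out.

```python
# Measured GFLOPS from the SHOC Max SP Flops results (higher is better).
max_sp_gflops = {
    "GeForce GTX 1050": 2115.38,
    "GeForce GTX 1050 Ti": 2697.13,
    "GeForce GTX 1060": 4780.88,
    "GeForce GTX 1070": 7115.54,
    "GeForce GTX 1080": 9415.48,
}

# GPU compute core counts as reported in the OpenCL details of the export.
compute_cores = {
    "GeForce GTX 1050": 640,
    "GeForce GTX 1050 Ti": 768,
    "GeForce GTX 1060": 1280,
    "GeForce GTX 1070": 1920,
    "GeForce GTX 1080": 2560,
}


def gflops_per_core(gflops, cores):
    """Divide each card's measured GFLOPS by its compute core count."""
    return {gpu: gflops[gpu] / cores[gpu] for gpu in gflops}


per_core = gflops_per_core(max_sp_gflops, compute_cores)
for gpu, value in per_core.items():
    print(f"{gpu}: {value:.2f} GFLOPS per compute core")
```

Because the per-core figure also depends on clock speed during the run, it is a rough normalization rather than an efficiency measurement.
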
Darktable 2.2.1
Test: Boat - Acceleration: OpenCL
Seconds, Fewer Is Better

GeForce GTX 1050: 15.45 (SE +/- 0.01, N = 3)
GeForce GTX 1050 Ti: 13.97 (SE +/- 0.01, N = 3)
GeForce GTX 1060: 4.67 (SE +/- 0.01, N = 3)
GeForce GTX 1070: 3.87 (SE +/- 0.00, N = 3)
GeForce GTX 1080: 3.72 (SE +/- 0.00, N = 3)
Radeon R9 Fury - AMDGPU-PRO: 4.22 (SE +/- 0.07, N = 3)
Radeon R9 Fury - ROCm: 4.98 (SE +/- 0.03, N = 3)
Radeon RX 460 - AMDGPU-PRO: 9.51 (SE +/- 0.77, N = 6)
Radeon RX 460 - ROCm: 9.57 (SE +/- 0.03, N = 3)
Radeon RX 480 - AMDGPU-PRO: 4.37 (SE +/- 0.01, N = 3)
Radeon RX 480 - ROCm: 5.72 (SE +/- 0.02, N = 3)

Rodinia 2.4
Test: OpenCL Heartwall
Seconds, Fewer Is Better

GeForce GTX 1050: 5.27 (SE +/- 0.04, N = 3)
GeForce GTX 1050 Ti: 3.65 (SE +/- 0.04, N = 3)
GeForce GTX 1060: 3.36 (SE +/- 0.05, N = 5)
Radeon R9 Fury - AMDGPU-PRO: 6.38 (SE +/- 0.07, N = 3)
Radeon R9 Fury - ROCm: 6.45 (SE +/- 0.16, N = 6)
Radeon RX 460 - AMDGPU-PRO: 7.97 (SE +/- 0.01, N = 3)
Radeon RX 460 - ROCm: 13.51 (SE +/- 0.08, N = 3)
Radeon RX 480 - AMDGPU-PRO: 5.35 (SE +/- 0.02, N = 3)
Radeon RX 480 - ROCm: 7.28 (SE +/- 0.03, N = 3)

1. (CXX) g++ options: -O2 -lOpenCL

JuliaGPU 1.2pts1
OpenCL Device: GPU
Samples/sec, More Is Better

GeForce GTX 1050: 64896787.13
GeForce GTX 1050 Ti: 78171484.97
GeForce GTX 1060: 115523522.73
GeForce GTX 1070: 144431468.40
GeForce GTX 1080: 165302847.33
Radeon R9 Fury - AMDGPU-PRO: 75992404.70
Radeon R9 Fury - ROCm: 73072755.80
Radeon RX 460 - AMDGPU-PRO: 50807022.25
Radeon RX 460 - ROCm: 46101692.27
Radeon RX 480 - AMDGPU-PRO: 81972594.40
Radeon RX 480 - ROCm: 70675082.10

Standard errors as listed in the export (10 entries for 11 results, so not matched to individual cards): SE +/- 29908.92, N = 3; SE +/- 109924.23, N = 3; SE +/- 194570.11, N = 3; SE +/- 169012.99, N = 3; SE +/- 694138.93, N = 3; SE +/- 985012.76, N = 3; SE +/- 97714.65, N = 2; SE +/- 160084.77, N = 3; SE +/- 500734.10, N = 2; SE +/- 94849.94, N = 3

1. (CC) gcc options: -O3 -march=native -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL -lm

MandelbulbGPU 1.0pts1
OpenCL Device: GPU
Samples/sec, More Is Better

GeForce GTX 1050: 37667402.03
GeForce GTX 1050 Ti: 44889116.70
GeForce GTX 1060: 63345982.20
GeForce GTX 1070: 79620073.63
GeForce GTX 1080: 91109498.40
Radeon R9 Fury - AMDGPU-PRO: 43447360.40
Radeon R9 Fury - ROCm: 44388927.12
Radeon RX 460 - AMDGPU-PRO: 32208376.98
Radeon RX 460 - ROCm: 29562658.90
Radeon RX 480 - AMDGPU-PRO: 48517365.80
Radeon RX 480 - ROCm: 49050438.67

Standard errors as listed in the export (9 entries for 11 results, so not matched to individual cards): SE +/- 36018.97, N = 3; SE +/- 112131.74, N = 3; SE +/- 290297.61, N = 3; SE +/- 503324.21, N = 3; SE +/- 423859.17, N = 3; SE +/- 2304744.64, N = 6; SE +/- 561923.72, N = 4; SE +/- 117840.12, N = 3; SE +/- 81023.55, N = 3

1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Triad
GB/s, More Is Better

GeForce GTX 1050: 11.25 (SE +/- 0.01, N = 3)
GeForce GTX 1050 Ti: 11.38 (SE +/- 0.00, N = 3)
GeForce GTX 1060: 11.85 (SE +/- 0.00, N = 3)
GeForce GTX 1070: 12.08 (SE +/- 0.00, N = 3)
GeForce GTX 1080: 12.20 (SE +/- 0.00, N = 3)
Radeon R9 Fury - AMDGPU-PRO: 4.12 (SE +/- 0.01, N = 3)
Radeon R9 Fury - ROCm: 10.59 (SE +/- 0.04, N = 3)
Radeon RX 460 - AMDGPU-PRO: 6.25 (SE +/- 0.10, N = 3)
Radeon RX 460 - ROCm: 5.21 (SE +/- 0.00, N = 3)
Radeon RX 480 - AMDGPU-PRO: 9.40 (SE +/- 0.14, N = 4)
Radeon RX 480 - ROCm: 7.94 (SE +/- 0.01, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt
Eight result entries additionally list: -lSHOCCommonMPI -pthread -lmpi_cxx -lmpi

LuxMark 3.0
OpenCL Device: GPU - Scene: Microphone
Score, More Is Better

GeForce GTX 1050: 3300 (SE +/- 3.38, N = 3)
GeForce GTX 1050 Ti: 3612 (SE +/- 2.52, N = 3)
GeForce GTX 1060: 5204 (SE +/- 3.51, N = 3)
GeForce GTX 1070: 7302 (SE +/- 38.17, N = 3)
GeForce GTX 1080: 6388 (SE +/- 2.03, N = 3)
Radeon R9 Fury - AMDGPU-PRO: 7681 (SE +/- 17.84, N = 3)
Radeon R9 Fury - ROCm: 5695 (SE +/- 15.04, N = 3)
Radeon RX 460 - AMDGPU-PRO: 2623 (SE +/- 6.98, N = 3)
Radeon RX 480 - AMDGPU-PRO: 6924 (SE +/- 13.50, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Bus Speed Readback
GB/s, More Is Better

GeForce GTX 1050: 13.11 (SE +/- 0.03, N = 3)
GeForce GTX 1050 Ti: 13.22 (SE +/- 0.00, N = 3)
GeForce GTX 1060: 13.22 (SE +/- 0.00, N = 3)
GeForce GTX 1070: 13.22 (SE +/- 0.00, N = 3)
GeForce GTX 1080: 13.22 (SE +/- 0.00, N = 3)
Radeon R9 Fury - AMDGPU-PRO: 14.21 (SE +/- 0.01, N = 3)
Radeon R9 Fury - ROCm: 10.86 (SE +/- 0.03, N = 3)
Radeon RX 460 - AMDGPU-PRO: 7.14 (SE +/- 0.00, N = 3)
Radeon RX 460 - ROCm: 5.27 (SE +/- 0.01, N = 3)
Radeon RX 480 - AMDGPU-PRO: 14.20 (SE +/- 0.00, N = 3)
Radeon RX 480 - ROCm: 8.38 (SE +/- 0.00, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt
Eight result entries additionally list: -lSHOCCommonMPI -pthread -lmpi_cxx -lmpi

Darktable 2.2.1
Test: Masskrug - Acceleration: OpenCL
Seconds, Fewer Is Better

GeForce GTX 1050: 15.16 (SE +/- 0.00, N = 3)
GeForce GTX 1050 Ti: 15.44 (SE +/- 0.00, N = 3)
GeForce GTX 1060: 5.90 (SE +/- 0.01, N = 3)
GeForce GTX 1070: 5.74 (SE +/- 0.00, N = 3)
GeForce GTX 1080: 5.73 (SE +/- 0.01, N = 3)
Radeon R9 Fury - AMDGPU-PRO: 6.30 (SE +/- 0.02, N = 3)
Radeon R9 Fury - ROCm: 6.09 (SE +/- 0.00, N = 3)
Radeon RX 460 - AMDGPU-PRO: 7.20 (SE +/- 0.03, N = 3)
Radeon RX 460 - ROCm: 7.05 (SE +/- 0.09, N = 3)
Radeon RX 480 - AMDGPU-PRO: 5.76 (SE +/- 0.04, N = 3)
Radeon RX 480 - ROCm: 5.93 (SE +/- 0.08, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10
Target: OpenCL - Benchmark: Bus Speed Download
GB/s, More Is Better

GeForce GTX 1050: 12.75 (SE +/- 0.00, N = 3)
GeForce GTX 1050 Ti: 12.78 (SE +/- 0.00, N = 3)
GeForce GTX 1060: 12.78 (SE +/- 0.00, N = 3)
GeForce GTX 1070: 12.78 (SE +/- 0.00, N = 3)
GeForce GTX 1080: 12.78 (SE +/- 0.00, N = 3)
Radeon R9 Fury - AMDGPU-PRO: 13.69 (SE +/- 0.01, N = 3)
Radeon R9 Fury - ROCm: 11.32 (SE +/- 0.01, N = 3)
Radeon RX 460 - AMDGPU-PRO: 6.93 (SE +/- 0.00, N = 3)
Radeon RX 460 - ROCm: 5.72 (SE +/- 0.00, N = 3)
Radeon RX 480 - AMDGPU-PRO: 13.66 (SE +/- 0.00, N = 3)
Radeon RX 480 - ROCm: 8.37 (SE +/- 0.00, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt
Eight result entries additionally list: -lSHOCCommonMPI -pthread -lmpi_cxx -lmpi

Darktable 2.2.1
Test: Server Room - Acceleration: OpenCL
Seconds, Fewer Is Better

GeForce GTX 1050: 11.78 (SE +/- 0.01, N = 3)
GeForce GTX 1050 Ti: 11.01 (SE +/- 0.01, N = 3)
GeForce GTX 1060: 1.20 (SE +/- 0.01, N = 3)
GeForce GTX 1070: 0.99 (SE +/- 0.00, N = 3)
GeForce GTX 1080: 0.99 (SE +/- 0.00, N = 3)
Radeon R9 Fury - AMDGPU-PRO: 1.79 (SE +/- 0.04, N = 6)
Radeon R9 Fury - ROCm: 1.48 (SE +/- 0.02, N = 3)
Radeon RX 460 - AMDGPU-PRO: 2.83 (SE +/- 0.02, N = 3)
Radeon RX 460 - ROCm: 2.48 (SE +/- 0.03, N = 3)
Radeon RX 480 - AMDGPU-PRO: 0.99 (SE +/- 0.01, N = 3)
Radeon RX 480 - ROCm: 0.99 (SE +/- 0.02, N = 4)

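For "Fewer Is Better" results such as the Darktable Server Room times above, a relative comparison divides the baseline time by each card's time, the inverse of what one would do for a score. The following sketch assumes the GeForce GTX 1050 as the baseline purely for illustration; the function name `speedup_over` and the data layout are not part of the export.

```python
# Run times in seconds from the Darktable 2.2.1 Server Room results
# (lower is better).
server_room_seconds = {
    "GeForce GTX 1050": 11.78,
    "GeForce GTX 1050 Ti": 11.01,
    "GeForce GTX 1060": 1.20,
    "GeForce GTX 1070": 0.99,
    "GeForce GTX 1080": 0.99,
    "Radeon R9 Fury - AMDGPU-PRO": 1.79,
    "Radeon R9 Fury - ROCm": 1.48,
    "Radeon RX 460 - AMDGPU-PRO": 2.83,
    "Radeon RX 460 - ROCm": 2.48,
    "Radeon RX 480 - AMDGPU-PRO": 0.99,
    "Radeon RX 480 - ROCm": 0.99,
}


def speedup_over(baseline, times):
    """Return baseline_time / time for every entry (values above 1 are faster)."""
    base = times[baseline]
    return {name: base / seconds for name, seconds in times.items()}


# The choice of baseline is an assumption made for this example.
speedups = speedup_over("GeForce GTX 1050", server_room_seconds)
for name, factor in sorted(speedups.items(), key=lambda item: -item[1]):
    print(f"{name}: {factor:.1f}x")
```
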
Phoronix Test Suite v10.8.5