Radeon ROCm 2.0 OpenCL Compute Versus NVIDIA Linux

ROCm 2.0 Linux GPGPU/compute benchmarks for a future article on Phoronix.com by Michael Larabel.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1812285-SP-ROCM20NVI57

Run Management

Result Identifier     Date Run            Test Duration
GTX 980 Ti            December 23 2018    26 Minutes
GTX TITAN X GM200     December 22 2018    26 Minutes
GTX 1060              December 28 2018    20 Minutes
GTX 1070              December 23 2018    26 Minutes
GTX 1080              December 22 2018    26 Minutes
GTX 1080 Ti           December 23 2018    26 Minutes
RTX 2080              December 23 2018    26 Minutes
RTX 2080 Ti           December 23 2018    25 Minutes
TITAN RTX             December 22 2018    25 Minutes
RX 580                December 28 2018    17 Minutes
RX Vega 56            December 28 2018    16 Minutes
RX Vega 64            December 28 2018    18 Minutes
Average                                   23 Minutes


Radeon ROCm 2.0 OpenCL Compute Versus NVIDIA Linux (OpenBenchmarking.org)

System details, TITAN RTX run:

  Processor:          Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads)
  Motherboard:        ASUS PRIME Z390-A (0602 BIOS)
  Chipset:            Intel Cannon Lake PCH Shared SRAM
  Memory:             16384MB
  Disk:               2000GB SABRENT + Samsung SSD 970 EVO 250GB
  Graphics:           NVIDIA TITAN RTX 24GB (1350/7000MHz)
  Audio:              Realtek ALC1220
  Monitor:            Acer B286HK
  Network:            Intel Connection
  OS:                 Ubuntu 18.04
  Kernel:             4.19.5-041905-generic (x86_64)
  Desktop:            GNOME Shell 3.28.3
  Display Server:     X Server 1.19.6
  Display Driver:     NVIDIA 415.23
  OpenGL:             4.6.0
  OpenCL:             OpenCL 1.2 CUDA 10.0.132
  Vulkan:             1.1.84
  Compiler:           GCC 7.3.0 + LLVM 6.0.0 + CUDA 10.0
  File-System:        ext4
  Screen Resolution:  3840x2160

Fields listed for the other runs:

  GTX TITAN X GM200   Graphics: NVIDIA GeForce GTX TITAN X 12GB (1001/3505MHz)
  GTX 1080            Graphics: NVIDIA GeForce GTX 1080 8GB (1607/5005MHz)
  RTX 2080            Graphics: Zotac NVIDIA GeForce RTX 2080 8GB (1515/7000MHz)
  GTX 1080 Ti         Graphics: NVIDIA GeForce GTX 1080 Ti 11GB (1480/5508MHz)
  GTX 1070            Graphics: NVIDIA GeForce GTX 1070 8GB (1506/4006MHz)
  GTX 980 Ti          Graphics: NVIDIA GeForce GTX 980 Ti 6GB (999/3505MHz)
  RTX 2080 Ti         Graphics: NVIDIA GeForce RTX 2080 Ti 11GB (1350/7000MHz)
  RX Vega 64          Graphics: AMD Radeon RX Vega 8GB (1630/945MHz)
                      Kernel: 4.15.0-43-generic (x86_64)
                      OpenGL: 4.5 Mesa 19.0.0-devel (git-17218a0406) (LLVM 8.0.0)
                      OpenCL: OpenCL 2.1 AMD-APP (2783.0)
                      Vulkan: 1.1.90
  RX 580              Graphics: MSI AMD Radeon RX 470/480 8GB (1366/2000MHz)
                      Kernel: 4.19.5-041905-generic (x86_64)
  RX Vega 56          Graphics: AMD Radeon RX Vega 8GB (1590/800MHz)
                      Kernel: 4.15.0-43-generic (x86_64)
  GTX 1060            Graphics: NVIDIA GeForce GTX 1060 6GB (1506/4006MHz)
                      Kernel: 4.19.5-041905-generic (x86_64)
                      Display Driver: NVIDIA 415.23
                      OpenGL: 4.6.0
                      OpenCL: OpenCL 1.2 CUDA 10.0.132
                      Vulkan: 1.1.84

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate performance

OpenCL Details
- TITAN RTX: GPU Compute Cores: 4608
- GTX TITAN X GM200: GPU Compute Cores: 3072
- GTX 1080: GPU Compute Cores: 2560
- RTX 2080: GPU Compute Cores: 2944
- GTX 1080 Ti: GPU Compute Cores: 3584
- GTX 1070: GPU Compute Cores: 1920
- GTX 980 Ti: GPU Compute Cores: 2816
- RTX 2080 Ti: GPU Compute Cores: 4352
- GTX 1060: GPU Compute Cores: 1280

Python Details
- Python 2.7.15rc1 + Python 3.6.7

Security Details
- __user pointer sanitization + Full generic retpoline IBPB IBRS_FW + SSB disabled via prctl and seccomp

[Figure: Result Overview (Phoronix Test Suite). Relative performance of all twelve GPUs, on a scale from 100% to 699%, across Darktable, clpeak, SHOC Scalable HeterOgeneous Computing, Mixbench, LuxMark, cl-mem, and Parboil.]

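Radeon ROCm 2.0 OpenCL Compute Versus NVIDIA Linux: Results (OpenBenchmarking.org)

A dash marks a test with no result for that GPU.

Test                              TITAN RTX   GTX TITAN X GM200   GTX 1080   RTX 2080   GTX 1080 Ti   GTX 1070
darktable: Boat - OpenCL          1.61        3.11                2.72       1.88       2.27          2.87
darktable: Server Room - OpenCL   0.73        1.36                1.12       0.80       1.02          1.10
parboil: OpenCL TPACF             0.63        1.36                1.06       0.88       0.77          1.08
rodinia: OpenCL Particle Filter   4.26        9.03                6.54       6.17       4.97          8.30
mixbench: Single Precision        17324       6557                8570       11029      11605         6410
clpeak: Global Memory Bandwidth   528         263                 222        368        329           196
clpeak: Integer Compute INT       15398       1783                2450       10199      3366          1684
fahbench                          297         121                 155        237        198           140
luxmark: GPU - Luxball HDR        45932       17299               13823      29641      21562         17288
luxmark: GPU - Microphone         30528       10929               8732       19855      13732         9980
luxmark: GPU - Hotel              9884        4135                3823       6589       5581          3877
shoc: OpenCL - FFT SP             1548        726                 575        1083       972           452
cl-mem: Copy                      484         218                 209        328        317           187

Test                              GTX 980 Ti   RTX 2080 Ti   RX Vega 64   RX 580   RX Vega 56   GTX 1060
darktable: Boat - OpenCL          3.27         1.62          3.58         13.60    13.64        3.77
darktable: Server Room - OpenCL   1.52         0.73          1.76         4.19     4.21         1.99
parboil: OpenCL TPACF             1.39         0.64          1.36         1.75     1.41         1.39
rodinia: OpenCL Particle Filter   10.22        4.45          -            -        -            12.07
mixbench: Single Precision        5860         16175         12458        5915     10519        4346
clpeak: Global Memory Bandwidth   264          505           362          205      317          147
clpeak: Integer Compute INT       1616         14840         2494         1252     2006         1285
fahbench                          114          294           -            -        -            103
luxmark: GPU - Luxball HDR        16910        42693         32545        15270    30649        12238
luxmark: GPU - Microphone         11105        28476         -            -        -            6963
luxmark: GPU - Hotel              3879         9191          -            -        -            -
shoc: OpenCL - FFT SP             728          1443          1074         548      932          302
cl-mem: Copy                      217          454           222          184      203          139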
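
The results above mix "fewer is better" timings with "more is better" throughput figures, so comparing two cards across tests needs a normalization step first. The following is a minimal Python sketch of one way to do that: it turns each result into a speed ratio and summarizes the ratios with a geometric mean. The choice of cards (RX Vega 64 against GTX 1080), the restriction to the tests both cards completed, and the helper names `relative_speed` and `geometric_mean` are assumptions made for illustration. This is not necessarily how the Phoronix Test Suite computes its own geometric mean.

```python
from math import exp, log

# Values copied from the result table above:
# test name -> (RX Vega 64, GTX 1080, higher_is_better)
RESULTS = {
    "darktable: Boat - OpenCL (s)": (3.58, 2.72, False),
    "darktable: Server Room - OpenCL (s)": (1.76, 1.12, False),
    "parboil: OpenCL TPACF (s)": (1.36, 1.06, False),
    "mixbench: Single Precision (GFLOPS)": (12458, 8570, True),
    "clpeak: Global Memory Bandwidth (GBPS)": (362, 222, True),
    "clpeak: Integer Compute INT (GIOPS)": (2494, 2450, True),
    "luxmark: GPU - Luxball HDR (score)": (32545, 13823, True),
    "shoc: OpenCL - FFT SP (GFLOPS)": (1074, 575, True),
    "cl-mem: Copy (GB/s)": (222, 209, True),
}


def relative_speed(a, b, higher_is_better):
    """Return how many times faster A is than B (a value above 1 means A wins)."""
    return a / b if higher_is_better else b / a


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return exp(sum(log(v) for v in values) / len(values))


ratios = {
    name: relative_speed(a, b, hib) for name, (a, b, hib) in RESULTS.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
print(f"Geometric mean (RX Vega 64 vs GTX 1080): {geometric_mean(ratios.values()):.2f}x")
```

A geometric mean is used here rather than an arithmetic mean because the inputs are ratios: a 2x win and a 2x loss then cancel out to 1x.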

Darktable

Darktable is an open-source photography workflow application. This test uses any system-installed Darktable program or, on Windows, automatically downloads the pre-built binary from the project. Learn more via the OpenBenchmarking.org test page.

Darktable 2.4.2 - Test: Boat - Acceleration: OpenCL (Seconds, Fewer Is Better)

GPU                  Avg      SE +/-   N   Min      Max
GTX 1060             3.77     -        -   -        -
RX Vega 56           13.64    0.04     3   13.57    13.68
RX 580               13.60    0.07     3   13.50    13.73
RX Vega 64           3.58     0.03     3   3.52     3.61
RTX 2080 Ti          1.62     0.00     3   1.61     1.62
GTX 980 Ti           3.27     0.00     3   3.26     3.27
GTX 1070             2.87     0.00     3   2.86     2.87
GTX 1080 Ti          2.27     0.01     3   2.25     2.28
RTX 2080             1.88     0.00     3   1.88     1.89
GTX 1080             2.72     0.01     3   2.72     2.73
GTX TITAN X GM200    3.11     0.00     3   3.10     3.12
TITAN RTX            1.61     0.01     3   1.60     1.62

Darktable 2.4.2 - Test: Server Room - Acceleration: OpenCL (Seconds, Fewer Is Better)

GPU                  Avg      SE +/-   N   Min      Max
GTX 1060             1.99     -        -   -        -
RX Vega 56           4.21     0.01     3   4.20     4.22
RX 580               4.19     0.00     3   4.19     4.19
RX Vega 64           1.76     0.00     3   1.76     1.76
RTX 2080 Ti          0.73     0.00     3   0.73     0.73
GTX 980 Ti           1.52     0.00     3   1.52     1.52
GTX 1070             1.10     0.00     3   1.09     1.10
GTX 1080 Ti          1.02     0.00     3   1.02     1.03
RTX 2080             0.80     0.00     3   0.79     0.80
GTX 1080             1.12     0.00     3   1.11     1.12
GTX TITAN X GM200    1.36     0.00     3   1.36     1.37
TITAN RTX            0.73     0.00     3   0.73     0.74

Parboil

The Parboil Benchmarks from the IMPACT Research Group at the University of Illinois are a set of throughput computing applications for studying computing architectures and compilers. Parboil test cases support OpenMP, OpenCL, and CUDA multi-processing environments. At this time, however, the test profile uses only the OpenMP and OpenCL test workloads. Learn more via the OpenBenchmarking.org test page.

Parboil 2.5 - Test: OpenCL TPACF (Seconds, Fewer Is Better)

GPU                  Avg      SE +/-   N   Min      Max
GTX 1060             1.39     0.02     6   1.36     1.49
RX Vega 56           1.41     0.00     3   1.40     1.41
RX 580               1.75     0.01     3   1.74     1.76
RX Vega 64           1.36     0.00     3   1.35     1.37
RTX 2080 Ti          0.64     0.03     3   0.60     0.71
GTX 980 Ti           1.39     0.05     3   1.29     1.47
GTX 1070             1.08     0.03     3   1.05     1.14
GTX 1080 Ti          0.77     0.03     3   0.73     0.84
RTX 2080             0.88     0.03     3   0.86     0.94
GTX 1080             1.06     0.04     3   1.02     1.15
GTX TITAN X GM200    1.36     0.01     3   1.33     1.37
TITAN RTX            0.63     0.04     3   0.59     0.70

1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Rodinia

Rodinia is a benchmark suite focused on speeding up compute-intensive applications with hardware accelerators. The included applications support the CUDA, OpenMP, and OpenCL parallel models. At the moment this profile uses the OpenCL and OpenMP test binaries. Learn more via the OpenBenchmarking.org test page.

Rodinia 2.4 - Test: OpenCL Particle Filter (Seconds, Fewer Is Better)

GPU                  Avg      SE +/-   N   Min      Max
GTX 1060             12.07    0.01     3   12.05    12.08
RTX 2080 Ti          4.45     0.05     3   4.40     4.54
GTX 980 Ti           10.22    0.07     3   10.10    10.31
GTX 1070             8.30     0.01     3   8.28     8.31
GTX 1080 Ti          4.97     0.02     3   4.94     5.02
RTX 2080             6.17     0.03     3   6.12     6.21
GTX 1080             6.54     0.07     3   6.44     6.68
GTX TITAN X GM200    9.03     0.16     3   8.75     9.28
TITAN RTX            4.26     0.07     3   4.18     4.41

1. (CXX) g++ options: -O2 -lOpenCL

Mixbench

A benchmark suite for GPUs on mixed operational intensity kernels. Learn more via the OpenBenchmarking.org test page.

Mixbench 2016-06-06 - Benchmark: Single Precision (GFLOPS, More Is Better)

GPU                  Avg         SE +/-    N   Min         Max
GTX 1060             4345.54     0.99      3   4343.67     4347.03
RX Vega 56           10519.07    3.03      3   10513.67    10524.15
RX 580               5915.46     0.59      3   5914.30     5916.18
RX Vega 64           12457.93    5.02      3   12449.03    12466.39
RTX 2080 Ti          16174.84    21.81     3   16132.42    16204.83
GTX 980 Ti           5859.77     1.73      3   5856.75     5862.75
GTX 1070             6409.63     48.40     3   6312.87     6460.56
GTX 1080 Ti          11605.00    548.77    3   10507.47    12157.20
RTX 2080             11028.67    1.47      3   11025.88    11030.85
GTX 1080             8569.53     9.71      3   8550.35     8581.72
GTX TITAN X GM200    6557.02     6.28      3   6547.86     6569.05
TITAN RTX            17323.57    8.55      3   17309.25    17338.82

1. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2
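
The "SE +/-" figures can be related back to the Min / Avg / Max values. With N = 3, the unlisted middle run follows from the average, and a standard error can then be recomputed. The sketch below does this for the GTX 1080 Ti Mixbench result (Min 10507.47, Avg 11605, Max 12157.2), which has by far the largest spread in this test. It assumes the reported figure is the sample standard deviation divided by the square root of N. That assumption is not stated on this page, although the recomputed value lands close to the reported 548.77. The function names are illustrative only.

```python
from math import sqrt


def third_sample(minimum, average, maximum):
    """With exactly three runs, the remaining run follows from the mean."""
    return 3 * average - minimum - maximum


def standard_error(samples):
    """Sample standard deviation (n - 1 denominator) divided by sqrt(n)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return sqrt(variance) / sqrt(n)


# GTX 1080 Ti, Mixbench Single Precision (GFLOPS), values as printed above.
lo, avg, hi = 10507.47, 11605.0, 12157.2
runs = [lo, third_sample(lo, avg, hi), hi]
se = standard_error(runs)

print("reconstructed runs:", [round(r, 2) for r in runs])
print(f"recomputed SE: {se:.2f} (reported: 548.77)")
```

The reconstruction only works for N = 3 and inherits the rounding of the printed values, so small differences from the reported figure are expected on other rows.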

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak - OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)

GPU                  Avg       SE +/-   N   Min       Max
GTX 1060             146.69    0.00     3   146.69    146.69
RX Vega 56           317.46    0.02     3   317.43    317.50
RX 580               204.54    1.53     3   201.89    207.20
RX Vega 64           362.13    0.01     3   362.11    362.15
RTX 2080 Ti          504.78    0.74     3   503.99    506.25
GTX 980 Ti           263.73    0.01     3   263.71    263.74
GTX 1070             196.33    0.02     3   196.29    196.36
GTX 1080 Ti          329.14    0.70     3   328.41    330.54
RTX 2080             367.82    0.23     3   367.58    368.28
GTX 1080             222.07    0.02     3   222.04    222.11
GTX TITAN X GM200    263.32    0.26     3   262.81    263.58
TITAN RTX            527.70    1.05     3   526.65    529.80

clpeak - OpenCL Test: Integer Compute INT (GIOPS, More Is Better)

GPU                  Avg         SE +/-     N   Min         Max
GTX 1060             1284.87     4.82       3   1278.20     1294.23
RX Vega 56           2006.08     1.26       3   2003.55     2007.38
RX 580               1252.49     0.01       3   1252.48     1252.51
RX Vega 64           2493.52     2.18       3   2490.91     2497.86
RTX 2080 Ti          14840.11    697.03     3   13446.24    15557.13
GTX 980 Ti           1616.16     21.99      3   1572.18     1638.91
GTX 1070             1684.04     18.17      3   1652.30     1715.25
GTX 1080 Ti          3365.65     18.09      3   3330.06     3389.05
RTX 2080             10199.41    652.84     3   8893.73     10853.76
GTX 1080             2449.53     11.35      3   2429.74     2469.05
GTX TITAN X GM200    1783.21     23.76      3   1737.17     1816.40
TITAN RTX            15397.85    1168.75    3   13061.35    16625.41

FAHBench

FAHBench is a Folding@Home benchmark on the GPU. Learn more via the OpenBenchmarking.org test page.

FAHBench 2.3.2 (Ns Per Day, More Is Better)

GPU                  Avg       SE +/-   N   Min       Max
GTX 1060             102.75    0.07     3   102.61    102.83
RTX 2080 Ti          294.39    0.29     3   293.89    294.91
GTX 980 Ti           114.29    0.03     3   114.22    114.33
GTX 1070             140.29    0.15     3   140.00    140.45
GTX 1080 Ti          197.70    0.16     3   197.40    197.94
RTX 2080             236.54    0.23     3   236.16    236.96
GTX 1080             155.17    0.15     3   154.88    155.35
GTX TITAN X GM200    120.84    0.08     3   120.70    120.96
TITAN RTX            296.63    0.64     3   295.48    297.68

LuxMark

LuxMark is a multi-platform OpenCL benchmark using LuxRender. It supports targeting different OpenCL devices and has multiple scenes available for rendering. LuxMark is a fully open-source OpenCL program with real-world rendering examples. Learn more via the OpenBenchmarking.org test page.

LuxMark 3.1 - OpenCL Device: GPU - Scene: Luxball HDR (Score, More Is Better)

GPU                  Avg         SE +/-    N   Min      Max
GTX 1060             12238.33    0.88      3   12237    12240
RX Vega 56           30649.33    37.86     3   30609    30725
RX 580               15269.67    16.67     3   15253    15303
RX Vega 64           32545.00    545.54    4   31755    34129
RTX 2080 Ti          42693.00    66.64     3   42611    42825
GTX 980 Ti           16910.00    26.00     3   16884    16962
GTX 1070             17288.00    0.58      3   17287    17289
GTX 1080 Ti          21561.67    119.86    3   21322    21686
RTX 2080             29640.67    2.73      3   29637    29646
GTX 1080             13822.67    29.04     3   13766    13862
GTX TITAN X GM200    17299.00    11.50     3   17287    17322
TITAN RTX            45932.00    41.88     3   45851    45991

LuxMark 3.1 - OpenCL Device: GPU - Scene: Microphone (Score, More Is Better)

GPU                  Avg         SE +/-   N   Min      Max
GTX 1060             6962.67     0.33     3   6962     6963
RTX 2080 Ti          28476.00    18.48    3   28444    28508
GTX 980 Ti           11105.33    23.78    3   11058    11133
GTX 1070             9980.33     1.20     3   9978     9982
GTX 1080 Ti          13732.33    12.67    3   13707    13745
RTX 2080             19855.33    30.23    3   19817    19915
GTX 1080             8732.00     15.59    3   8705     8759
GTX TITAN X GM200    10928.67    20.28    3   10898    10967
TITAN RTX            30527.67    39.86    3   30448    30570

LuxMark 3.1 - OpenCL Device: GPU - Scene: Hotel (Score, More Is Better)

GPU                  Avg        SE +/-   N   Min     Max
RTX 2080 Ti          9190.67    15.93    3   9170    9222
GTX 980 Ti           3879.00    10.50    3   3868    3900
GTX 1070             3876.67    8.21     3   3867    3893
GTX 1080 Ti          5581.33    6.01     3   5573    5593
RTX 2080             6589.00    1.15     3   6587    6591
GTX 1080             3823.33    18.33    3   3805    3860
GTX TITAN X GM200    4134.67    31.83    3   4071    4167
TITAN RTX            9884.00    38.04    3   9808    9925

SHOC Scalable HeterOgeneous Computing

This is the CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing (SHOC) benchmark suite. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better)

GPU                  Avg        SE +/-   N   Min        Max
GTX 1060             302.07     1.44     3   299.19     303.55
RX Vega 56           932.33     1.83     3   929.02     935.35
RX 580               547.54     0.10     3   547.41     547.74
RX Vega 64           1073.72    2.24     3   1069.25    1076.07
RTX 2080 Ti          1442.88    14.24    3   1427.63    1471.34
GTX 980 Ti           728.42     0.20     3   728.11     728.79
GTX 1070             452.29     0.94     3   450.87     454.08
GTX 1080 Ti          972.45     1.23     3   970.25     974.49
RTX 2080             1082.50    5.95     3   1070.61    1088.89
GTX 1080             574.56     2.68     3   570.17     579.41
GTX TITAN X GM200    725.58     0.60     3   724.49     726.54
TITAN RTX            1547.56    11.44    3   1525.13    1562.71

1. (CXX) g++ options: -O2 -lSHOCCommon -std=c++14 -lcudadevrt -lcudart_static -lrt -lpthread -ldl -lcufft

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

cl-mem 2017-01-13 - Benchmark: Copy (GB/s, More Is Better)

GPU                  Avg       SE +/-   N   Min       Max
GTX 1060             139.33    0.03     3   139.30    139.40
RX Vega 56           203.43    0.03     3   203.40    203.50
RX 580               184.23    0.03     3   184.20    184.30
RX Vega 64           221.90    0.06     3   221.80    222.00
RTX 2080 Ti          454.07    0.39     3   453.30    454.60
GTX 980 Ti           217.10    0.00     3   217.10    217.10
GTX 1070             186.87    0.07     3   186.80    187.00
GTX 1080 Ti          317.37    0.07     3   317.30    317.50
RTX 2080             327.83    0.09     3   327.70    328.00
GTX 1080             209.37    0.07     3   209.30    209.50
GTX TITAN X GM200    217.87    0.03     3   217.80    217.90
TITAN RTX            483.63    0.24     3   483.30    484.10

1. (CC) gcc options: -O2 -flto -lOpenCL