NVIDIA GPU Compute OpenCL CUDA Benchmarks

Benchmark tests run by Michael Larabel for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2009112-FI-ROCMGPUCO10

Tests in this result file by category:

BLAS (Basic Linear Algebra Sub-Routine) Tests: 2 tests
CPU Massive: 3 tests
HPC - High Performance Computing: 4 tests
Machine Learning: 2 tests
Multi-Core: 3 tests
NVIDIA GPU Compute: 14 tests
OpenCL: 5 tests

Runs

Result Identifier   Date Run            Test Duration
GTX 1060            September 08 2020   1 Hour, 10 Minutes
GTX 1070            September 07 2020   1 Hour, 6 Minutes
GTX 1070 Ti         September 06 2020   1 Hour, 11 Minutes
GTX 1080            September 07 2020   1 Hour, 4 Minutes
GTX 1650            September 09 2020   1 Hour, 10 Minutes
GTX 1650 SUPER      September 08 2020   1 Hour, 8 Minutes
GTX 1660            September 09 2020   1 Hour, 5 Minutes
GTX 1660 SUPER      September 07 2020   1 Hour, 5 Minutes
GTX 1660 Ti         September 09 2020   1 Hour, 3 Minutes
RTX 2060            September 08 2020   1 Hour, 3 Minutes
RTX 2060 SUPER      September 09 2020   1 Hour, 1 Minute
RTX 2070            September 09 2020   1 Hour, 1 Minute
RTX 2070 SUPER      September 08 2020   59 Minutes
RTX 2080            September 08 2020   1 Hour
RTX 2080 SUPER      September 08 2020   59 Minutes
RTX 2080 Ti         September 07 2020   1 Hour
TITAN RTX           September 06 2020   1 Hour, 1 Minute
Average                                 1 Hour, 4 Minutes
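
The final duration in the run list, 1 Hour, 4 Minutes, is not tied to a GPU and appears to be the average across the 17 runs. The following is a minimal Python sketch that checks this arithmetic. The helper names (`to_minutes`, `format_minutes`) are hypothetical, and rounding to the nearest minute is an assumption, since the page does not say how the figure is rounded.

```python
import re

# Durations copied from the run list above, keyed by result identifier.
DURATIONS = {
    "GTX 1060": "1 Hour, 10 Minutes",
    "GTX 1070": "1 Hour, 6 Minutes",
    "GTX 1070 Ti": "1 Hour, 11 Minutes",
    "GTX 1080": "1 Hour, 4 Minutes",
    "GTX 1650": "1 Hour, 10 Minutes",
    "GTX 1650 SUPER": "1 Hour, 8 Minutes",
    "GTX 1660": "1 Hour, 5 Minutes",
    "GTX 1660 SUPER": "1 Hour, 5 Minutes",
    "GTX 1660 Ti": "1 Hour, 3 Minutes",
    "RTX 2060": "1 Hour, 3 Minutes",
    "RTX 2060 SUPER": "1 Hour, 1 Minute",
    "RTX 2070": "1 Hour, 1 Minute",
    "RTX 2070 SUPER": "59 Minutes",
    "RTX 2080": "1 Hour",
    "RTX 2080 SUPER": "59 Minutes",
    "RTX 2080 Ti": "1 Hour",
    "TITAN RTX": "1 Hour, 1 Minute",
}


def to_minutes(text: str) -> int:
    """Convert a string such as '1 Hour, 10 Minutes' or '59 Minutes' to minutes."""
    hours = re.search(r"(\d+)\s+Hours?", text)
    minutes = re.search(r"(\d+)\s+Minutes?", text)
    total = int(hours.group(1)) * 60 if hours else 0
    total += int(minutes.group(1)) if minutes else 0
    return total


def format_minutes(total: int) -> str:
    """Format a minute count in the same style the run list uses."""
    hours, minutes = divmod(total, 60)
    parts = []
    if hours:
        parts.append(f"{hours} Hour" + ("" if hours == 1 else "s"))
    if minutes:
        parts.append(f"{minutes} Minute" + ("" if minutes == 1 else "s"))
    return ", ".join(parts) if parts else "0 Minutes"


total_minutes = sum(to_minutes(d) for d in DURATIONS.values())
mean_minutes = total_minutes / len(DURATIONS)

# Assumption: the displayed figure is rounded to the nearest minute.
average_label = format_minutes(round(mean_minutes))
print(f"{len(DURATIONS)} runs, {total_minutes} minutes total, average {average_label}")
```

Under that rounding assumption the result matches the figure shown in the run list.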



NVIDIA GPU Compute OpenCL CUDA Benchmarks Suite 1.0.0 (System)
Test suite extracted from NVIDIA GPU Compute OpenCL CUDA Benchmarks.

pts/plaidml-1.0.4 --fp16 --no-train mobilenet OPENCL
  FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
pts/plaidml-1.0.4 --no-fp16 --train mobilenet OPENCL
  FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
pts/plaidml-1.0.4 --no-fp16 --no-train mobilenet OPENCL
  FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
pts/luxcorerender-cl-1.2.0 RainbowColorsAndPrism/LuxCoreScene/render.cfg
  Scene: Rainbow Colors and Prism
pts/plaidml-1.0.4 --no-fp16 --no-train densenet201 OPENCL
  FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
pts/plaidml-1.0.4 --no-fp16 --no-train imdb_lstm OPENCL
  FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
pts/lczero-1.5.0 -b opencl
  Backend: OpenCL
pts/rodinia-1.3.1 OCL_PARTICLEFILTER
  Test: OpenCL Particle Filter
pts/arrayfire-1.1.0 cg_opencl
  Test: Conjugate Gradient OpenCL
pts/neatbench-1.0.4 gpu
  Acceleration: GPU
pts/luxcorerender-cl-1.2.0 LuxCore2.1Benchmark/LuxCoreScene/render.cfg
  Scene: LuxCore Benchmark
pts/fahbench-1.0.1
pts/mixbench-1.1.1 mixbench-ocl-ro SPGFLOPS
  Backend: OpenCL - Benchmark: Single Precision
pts/luxcorerender-cl-1.2.0 DLSC/LuxCoreScene/render.cfg
  Scene: DLSC
pts/mixbench-1.1.1 mixbench-ocl-ro DPGFLOPS
  Backend: OpenCL - Benchmark: Double Precision
pts/mixbench-1.1.1 mixbench-ocl-ro GIOPS
  Backend: OpenCL - Benchmark: Integer
pts/mixbench-1.1.1 mixbench-cuda-ro SPGFLOPS
  Backend: NVIDIA CUDA - Benchmark: Single Precision
pts/luxcorerender-cl-1.2.0 Food/LuxCoreScene/render.cfg
  Scene: Food
pts/mixbench-1.1.1 mixbench-cuda-ro DPGFLOPS
  Backend: NVIDIA CUDA - Benchmark: Double Precision
pts/mixbench-1.1.1 mixbench-cuda-ro HPGFLOPS
  Backend: NVIDIA CUDA - Benchmark: Half Precision
pts/mixbench-1.1.1 mixbench-cuda-ro GIOPS
  Backend: NVIDIA CUDA - Benchmark: Integer
pts/namd-cuda-1.1.0
  ATPase Simulation - 327,506 Atoms
pts/octanebench-1.2.1
  Total Score
pts/financebench-1.0.0 Black-Scholes/OpenCL/blackScholesAnalyticEngine.exe
  Benchmark: Black-Scholes OpenCL
pts/cl-mem-1.0.1 READ
  Benchmark: Read
pts/cl-mem-1.0.1 WRITE
  Benchmark: Write
pts/gromacs-gpu-1.1.0
  Water Benchmark
pts/cl-mem-1.0.1 COPY
  Benchmark: Copy
pts/clpeak-1.0.1 --global-bandwidth
  OpenCL Test: Global Memory Bandwidth
pts/clpeak-1.0.1 --compute-sp
  OpenCL Test: Single-Precision Float
pts/clpeak-1.0.1 --compute-dp
  OpenCL Test: Double-Precision Double
pts/clpeak-1.0.1 --compute-integer
  OpenCL Test: Integer Compute INT
pts/mandelgpu-1.3.1 0 1
  OpenCL Device: GPU
pts/viennacl-1.0.0
  OpenCL LU Factorization