NVIDIA GPU Compute

Some benchmarks by Michael Larabel.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2009116-FI-GPUCOMPUT55


NVIDIA GPU Compute Suite 1.0.0 System
Test suite extracted from NVIDIA GPU Compute.

system/darktable-1.0.4 bench.SRW output.jpg --core -d opencl -d perf Test: Boat - Acceleration: OpenCL
pts/mixbench-1.1.1 mixbench-cuda-ro HPGFLOPS Backend: NVIDIA CUDA - Benchmark: Half Precision
pts/clpeak-1.0.1 --compute-dp OpenCL Test: Double-Precision Double
pts/mandelgpu-1.3.1 0 1 OpenCL Device: GPU
pts/mixbench-1.1.1 mixbench-ocl-ro SPGFLOPS Backend: OpenCL - Benchmark: Single Precision
pts/clpeak-1.0.1 --compute-sp OpenCL Test: Single-Precision Float
pts/plaidml-1.0.4 --no-fp16 --no-train imdb_lstm OPENCL FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
pts/cl-mem-1.0.1 READ Benchmark: Read
pts/mixbench-1.1.1 mixbench-ocl-ro DPGFLOPS Backend: OpenCL - Benchmark: Double Precision
pts/clpeak-1.0.1 --global-bandwidth OpenCL Test: Global Memory Bandwidth
pts/mixbench-1.1.1 mixbench-cuda-ro SPGFLOPS Backend: NVIDIA CUDA - Benchmark: Single Precision
pts/octanebench-1.2.1 Total Score
pts/cl-mem-1.0.1 WRITE Benchmark: Write
pts/plaidml-1.0.4 --no-fp16 --no-train mobilenet OPENCL FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
pts/luxcorerender-cl-1.2.0 LuxCore2.1Benchmark/LuxCoreScene/render.cfg Scene: LuxCore Benchmark
pts/luxcorerender-cl-1.2.0 DLSC/LuxCoreScene/render.cfg Scene: DLSC
pts/plaidml-1.0.4 --no-fp16 --no-train inception_v3 OPENCL FP16: No - Mode: Inference - Network: Inception V3 - Device: OpenCL
pts/luxcorerender-cl-1.2.0 Food/LuxCoreScene/render.cfg Scene: Food
pts/plaidml-1.0.4 --fp16 --no-train mobilenet OPENCL FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
pts/rodinia-1.3.1 OCL_PARTICLEFILTER Test: OpenCL Particle Filter
pts/fahbench-1.0.1
pts/plaidml-1.0.4 --no-fp16 --no-train densenet201 OPENCL FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
pts/arrayfire-1.1.0 cg_opencl Test: Conjugate Gradient OpenCL
pts/neatbench-1.0.4 gpu Acceleration: GPU
pts/arrayfire-1.1.0 blas_opencl Test: BLAS OpenCL
pts/mixbench-1.1.1 mixbench-cuda-ro DPGFLOPS Backend: NVIDIA CUDA - Benchmark: Double Precision
system/darktable-1.0.4 server_room.NEF output.jpg --core -d opencl -d perf Test: Server Room - Acceleration: OpenCL
pts/luxcorerender-cl-1.2.0 RainbowColorsAndPrism/LuxCoreScene/render.cfg Scene: Rainbow Colors and Prism
pts/cl-mem-1.0.1 COPY Benchmark: Copy
pts/plaidml-1.0.4 --no-fp16 --no-train vgg19 OPENCL FP16: No - Mode: Inference - Network: VGG19 - Device: OpenCL
pts/gromacs-gpu-1.1.0 Water Benchmark
pts/plaidml-1.0.4 --no-fp16 --train mobilenet OPENCL FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
pts/namd-cuda-1.1.0 ATPase Simulation - 327,506 Atoms
pts/mixbench-1.1.1 mixbench-ocl-ro GIOPS Backend: OpenCL - Benchmark: Integer
pts/plaidml-1.0.4 --no-fp16 --no-train vgg16 OPENCL FP16: No - Mode: Inference - Network: VGG16 - Device: OpenCL
pts/financebench-1.0.0 Monte-Carlo/OpenCL/monteCarloEngine.exe Benchmark: Monte-Carlo OpenCL
pts/clpeak-1.0.1 --compute-integer OpenCL Test: Integer Compute INT
pts/mixbench-1.1.1 mixbench-cuda-ro GIOPS Backend: NVIDIA CUDA - Benchmark: Integer
pts/lczero-1.5.0 -b opencl Backend: OpenCL
system/darktable-1.0.4 server-rack.dng output.jpg --core -d opencl -d perf Test: Server Rack - Acceleration: OpenCL
system/darktable-1.0.4 masskrug.NEF output.jpg --core -d opencl -d perf Test: Masskrug - Acceleration: OpenCL
pts/rodinia-1.3.1 OCL_MYOCYTE Test: OpenCL Myocyte
pts/rodinia-1.3.1 CUDA_MYOCYTE Test: NVIDIA CUDA GPU Myocyte
pts/viennacl-1.0.0 OpenCL LU Factorization
pts/daphne-1.0.0 Cuda euclidean_cluster Backend: NVIDIA CUDA - Kernel: Euclidean Cluster
pts/daphne-1.0.0 OpenCl euclidean_cluster Backend: OpenCL - Kernel: Euclidean Cluster
pts/daphne-1.0.0 OpenCl ndt_mapping Backend: OpenCL - Kernel: NDT Mapping
pts/daphne-1.0.0 OpenCl points2image Backend: OpenCL - Kernel: Points2Image
pts/daphne-1.0.0 Cuda ndt_mapping Backend: NVIDIA CUDA - Kernel: NDT Mapping
pts/daphne-1.0.0 Cuda points2image Backend: NVIDIA CUDA - Kernel: Points2Image
pts/financebench-1.0.0 Black-Scholes/OpenCL/blackScholesAnalyticEngine.exe Benchmark: Black-Scholes OpenCL