NVIDIA GPU Compute Benchmarks

Benchmarks for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2106127-IB-3080C630035

RTX 3080 RBAR

Processor: AMD Ryzen 9 5900X 12-Core @ 3.70GHz (12 Cores / 24 Threads), Motherboard: ASUS ROG CROSSHAIR VIII HERO (3402 BIOS), Chipset: AMD Starship/Matisse, Memory: 16GB, Disk: 1000GB Sabrent Rocket 4.0 Plus + 2000GB, Graphics: NVIDIA GeForce RTX 3080 10GB, Audio: NVIDIA GA102 HD Audio, Monitor: ASUS VP28U, Network: Realtek RTL8125 2.5GbE + Intel I211

OS: Ubuntu 21.04, Kernel: 5.11.0-17-generic (x86_64), Desktop: GNOME Shell 3.38.4, Display Server: X Server 1.20.11, Display Driver: NVIDIA 465.31, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 11.3.116, Vulkan: 1.2.168, Compiler: GCC 10.3.0 + CUDA 11.3, File-System: ext4, Screen Resolution: 3840x2160

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-mutex --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa201009
OpenCL Notes: GPU Compute Cores: 8704
Python Notes: Python 3.9.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected

PlaidML

This test profile uses the PlaidML deep learning framework, developed by Intel, to run various benchmarks. Learn more via the OpenBenchmarking.org test page.

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.
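For readers unfamiliar with the BLAS names used in the ViennaCL results (sAXPY, dAXPY and so on), the following Python sketch shows the AXPY operation, y <- a*x + y, and one way an effective memory bandwidth figure could be derived from a timed run. In BLAS naming the "s" prefix denotes single precision (4 bytes per element) and "d" double precision (8 bytes). The function names and the three-accesses-per-element traffic model are illustrative assumptions, not ViennaCL's implementation or its reporting method.

```python
def axpy(a, x, y):
    """Return a*x + y element-wise (the BLAS level-1 AXPY operation)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [a * xi + yi for xi, yi in zip(x, y)]


def axpy_bandwidth_gbs(n, elem_bytes, seconds):
    """Effective bandwidth in GB/s for one AXPY over n elements.

    Assumption: each element costs three memory accesses
    (read x, read y, write y); caches are ignored.
    """
    bytes_moved = 3 * n * elem_bytes
    return bytes_moved / seconds / 1e9


# Small worked example of the operation itself.
print(axpy(2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]))  # [12.0, 24.0, 36.0]

# Hypothetical timing: one million single-precision elements in 1 ms.
print(axpy_bandwidth_gbs(n=1_000_000, elem_bytes=4, seconds=0.001))
```

The timing in the last line is made up for illustration; it is not a figure from this result file.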

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.
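As a point of reference for peak-throughput results, a theoretical FP32 ceiling is commonly estimated as cores x 2 x clock, counting one fused multiply-add as two floating-point operations per core per cycle. The Python sketch below applies that estimate. The core count (8704) comes from the OpenCL notes above; the 1.71 GHz clock is an assumed boost clock that this result file does not report, and sustained clocks under load may differ.

```python
def peak_tflops(cores, clock_ghz, flops_per_cycle=2):
    """Theoretical peak in TFLOPS.

    flops_per_cycle=2 assumes one fused multiply-add per core per cycle.
    """
    return cores * flops_per_cycle * clock_ghz / 1000.0


# 8704 cores is taken from the system notes; 1.71 GHz is an assumption.
estimate = peak_tflops(cores=8704, clock_ghz=1.71)
print(f"Estimated FP32 peak: {estimate:.2f} TFLOPS")
```

A measured result well below this estimate does not by itself indicate a problem, since the estimate ignores memory traffic, instruction mix, and clock behavior.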

Mixbench

A benchmark suite for GPUs on mixed operational intensity kernels. Learn more via the OpenBenchmarking.org test page.
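"Operational intensity" refers to the ratio of arithmetic operations to bytes of memory traffic. One common way to reason about it is the roofline model, in which attainable throughput is the smaller of the compute peak and intensity times memory bandwidth. The Python sketch below illustrates that bound; the peak and bandwidth values are placeholder assumptions, not measurements from this result file, and the code is not taken from Mixbench.

```python
def attainable_gflops(intensity, peak_gflops, bandwidth_gbs):
    """Roofline bound: min(compute peak, intensity * memory bandwidth).

    intensity is in FLOPs per byte, bandwidth in GB/s.
    """
    return min(peak_gflops, intensity * bandwidth_gbs)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Intensity (FLOPs/byte) above which a kernel becomes compute-bound."""
    return peak_gflops / bandwidth_gbs


PEAK_GFLOPS = 30000.0  # hypothetical compute peak
BANDWIDTH_GBS = 750.0  # hypothetical memory bandwidth

for intensity in (0.25, 4.0, 64.0):
    bound = attainable_gflops(intensity, PEAK_GFLOPS, BANDWIDTH_GBS)
    print(f"{intensity:6.2f} FLOPs/byte -> at most {bound:.1f} GFLOPS")

print("ridge point:", ridge_point(PEAK_GFLOPS, BANDWIDTH_GBS), "FLOPs/byte")
```

With these placeholder values, kernels below 40 FLOPs per byte are limited by memory bandwidth and kernels above it by compute.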

ArrayFire

ArrayFire is a GPU and CPU numeric processing library. This test uses the built-in CPU and OpenCL ArrayFire benchmarks. Learn more via the OpenBenchmarking.org test page.

vkpeak

Vkpeak is a Vulkan compute benchmark inspired by OpenCL's clpeak. Vkpeak provides Vulkan compute performance measurements for FP16 / FP32 / FP64 / INT16 / INT32 scalar and vec4 performance. Learn more via the OpenBenchmarking.org test page.

Hashcat

Hashcat is an open-source, advanced password recovery tool supporting GPU acceleration with OpenCL, NVIDIA CUDA, and Radeon ROCm. Learn more via the OpenBenchmarking.org test page.

IndigoBench

This is a test of Indigo Renderer's IndigoBench benchmark. Learn more via the OpenBenchmarking.org test page.

LuxCoreRender

LuxCoreRender is an open-source 3D physically based renderer formerly known as LuxRender. LuxCoreRender supports CPU-based rendering as well as GPU acceleration via OpenCL, NVIDIA CUDA, and NVIDIA OptiX interfaces. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

FAHBench

FAHBench is a Folding@Home benchmark on the GPU. Learn more via the OpenBenchmarking.org test page.

OctaneBench

OctaneBench is a test of OctaneRender on the GPU and requires NVIDIA CUDA. Learn more via the OpenBenchmarking.org test page.

Chaos Group V-RAY

This is a test of Chaos Group's V-RAY benchmark. V-RAY is a commercial renderer that can integrate with various creator software products like SketchUp and 3ds Max. The V-RAY benchmark is standalone and supports CPU and NVIDIA CUDA/RTX based rendering. Learn more via the OpenBenchmarking.org test page.

NAMD CUDA

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. This version of the NAMD test profile uses CUDA GPU acceleration. Learn more via the OpenBenchmarking.org test page.

VkResample

VkResample is a Vulkan-based image upscaling library based on VkFFT. The sample input file is upscaling a 4K image to 8K using Vulkan-based GPU acceleration. Learn more via the OpenBenchmarking.org test page.

Blender

Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles benchmark with various sample files. GPU computing via OpenCL, NVIDIA OptiX, and NVIDIA CUDA is supported. Learn more via the OpenBenchmarking.org test page.

RealSR-NCNN

RealSR-NCNN is an NCNN neural network implementation of the RealSR project and accelerated using the Vulkan API. RealSR is the Real-World Super Resolution via Kernel Estimation and Noise Injection. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image by a scale of 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project and accelerated using the Vulkan API. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Betsy GPU Compressor

Betsy is an open-source GPU compressor implementing various GPU texture compression techniques. Betsy is written in GLSL so that its compute shaders can run under Vulkan or OpenGL. Learn more via the OpenBenchmarking.org test page.

RedShift Demo

This is a test of MAXON's RedShift demo build that currently requires NVIDIA GPU acceleration. Learn more via the OpenBenchmarking.org test page.

78 Results Shown

PlaidML:
No - Inference - ResNet 50 - OpenCL
No - Inference - VGG16 - OpenCL
No - Inference - VGG19 - OpenCL
cl-mem:
Read
Write
Copy
ViennaCL:
OpenCL BLAS - sCOPY
OpenCL BLAS - sAXPY
OpenCL BLAS - dCOPY
OpenCL BLAS - dAXPY
OpenCL BLAS - dDOT
SHOC Scalable HeterOgeneous Computing
ViennaCL:
OpenCL BLAS - sDOT
OpenCL BLAS - dGEMV-N
OpenCL BLAS - dGEMV-T
clpeak
Mixbench:
NVIDIA CUDA - Single Precision
NVIDIA CUDA - Double Precision
NVIDIA CUDA - Half Precision
ArrayFire
clpeak:
Single-Precision Float
Double-Precision Double
vkpeak:
fp32-scalar
fp32-vec4
fp16-scalar
fp16-vec4
fp64-scalar
fp64-vec4
SHOC Scalable HeterOgeneous Computing:
OpenCL - FFT SP
OpenCL - GEMM SGEMM_N
OpenCL - S3D
ViennaCL:
OpenCL BLAS - dGEMM-NN
OpenCL BLAS - dGEMM-NT
OpenCL BLAS - dGEMM-TN
OpenCL BLAS - dGEMM-TT
SHOC Scalable HeterOgeneous Computing
Mixbench
clpeak
vkpeak:
int32-scalar
int32-vec4
int16-scalar
int16-vec4
Hashcat:
MD5
SHA1
SHA-512
7-Zip
TrueCrypt RIPEMD160 + XTS
IndigoBench:
OpenCL GPU - Supercar
OpenCL GPU - Bedroom
LuxCoreRender:
DLSC - GPU
Rainbow Colors and Prism - GPU
LuxCore Benchmark - GPU
Orange Juice - GPU
Danish Mood - GPU
LeelaChessZero
FAHBench
OctaneBench
Chaos Group V-RAY:
NVIDIA CUDA GPU
NVIDIA RTX GPU
NAMD CUDA
ArrayFire
VkResample
Blender:
BMW27 - CUDA
BMW27 - NVIDIA OptiX
Classroom - CUDA
Classroom - NVIDIA OptiX
Fishy Cat - CUDA
Fishy Cat - NVIDIA OptiX
Pabellon Barcelona - CUDA
Pabellon Barcelona - NVIDIA OptiX
Barbershop - CUDA
Barbershop - NVIDIA OptiX
RealSR-NCNN:
4x - Yes
4x - No
Waifu2x-NCNN Vulkan
Betsy GPU Compressor:
ETC1 - Highest
ETC2 RGB - Highest
RedShift Demo

RTX 3080 RBAR

Testing initiated at 12 June 2021 11:11 by user phoronix.
