cuda-RTX2080-2021Nov eVGA GeForce RTX 2080 FTW3 Ultra (OC BIOS) testing with an NZXT N7 Z490 (P1.80 BIOS) and Intel Core i5-11600K on Ubuntu 21.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2111240-TAD-CUDARTX214&grs
cuda-RTX2080-2021Nov system details (RTX 2080)
Processor: Intel Core i5-11600K @ 4.90GHz (6 Cores / 12 Threads)
Motherboard: NZXT N7 Z490 (P1.80 BIOS)
Chipset: Intel Comet Lake PCH
Memory: 16GB
Disk: 1000GB Western Digital WDS100T3XHC-00SJG0 + 512GB INTEL SSDPEKKW512G8 + 4001GB Seagate ST4000VN008-2DR1
Graphics: NVIDIA GeForce RTX 2080 8GB
Audio: Realtek ALC1220
Monitor: LG TV SSCR2
Network: Realtek RTL8125 2.5GbE + Intel Wi-Fi 6 AX200
OS: Ubuntu 21.10
Kernel: 5.13.0-21-generic (x86_64)
Desktop: GNOME Shell 40.5
Display Server: X Server 1.20.13
Display Driver: NVIDIA 495.44
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 11.5.100
Vulkan: 1.2.186
Compiler: GCC 10.3.0 + Clang 13.0.0-2
File-System: ext4
Screen Resolution: 3840x2160

System notes
Transparent Huge Pages: madvise
Compiler configuration: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-mutex --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-h9G0XI/gcc-10-10.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-h9G0XI/gcc-10-10.3.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor notes: Scaling Governor: intel_pstate powersave - CPU Microcode: 0x40 - Thermald 2.4.6
Graphics notes: BAR1 / Visible vRAM Size: 256 MiB - GPU Compute Cores: 2944
Python: Python 3.9.7
Security: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
cuda-RTX2080-2021Nov result overview
Test | RTX 2080
neatbench: GPU | 2080
clpeak: Global Memory Bandwidth | 368.50
clpeak: Double-Precision Double | 363.60
clpeak: Single-Precision Float | 8924.60
clpeak: Integer Compute INT | 9352.06
mandelgpu: GPU | 354984648.5
blender: Pabellon Barcelona - NVIDIA OptiX | 150.54
blender: Pabellon Barcelona - CUDA | 349.37
blender: Barbershop - NVIDIA OptiX | 941.59
blender: Classroom - NVIDIA OptiX | 92.65
blender: Barbershop - CUDA | 560.30
blender: Fishy Cat - CUDA | 97.03
blender: Classroom - CUDA | 163.55
blender: BMW27 - CUDA | 52.27
indigobench: OpenCL GPU - Supercar | 25.742
indigobench: OpenCL GPU - Bedroom | 8.132
ncnn: Vulkan GPU - regnety_400m | 2.06
ncnn: Vulkan GPU - squeezenet_ssd | 4.62
ncnn: Vulkan GPU - yolov4-tiny | 6.62
ncnn: Vulkan GPU - resnet50 | 3.58
ncnn: Vulkan GPU - alexnet | 2.05
ncnn: Vulkan GPU - resnet18 | 1.64
ncnn: Vulkan GPU - vgg16 | 6.92
ncnn: Vulkan GPU - googlenet | 3.73
ncnn: Vulkan GPU - blazeface | 0.80
ncnn: Vulkan GPU - mnasnet | 1.65
ncnn: Vulkan GPU - shufflenet-v2 | 1.47
ncnn: Vulkan GPU - mobilenet-v2 | 1.59
ncnn: Vulkan GPU - mobilenet | 4.00
viennacl: OpenCL BLAS - dGEMM-TT | 342
viennacl: OpenCL BLAS - dGEMM-NT | 341
viennacl: OpenCL BLAS - dGEMM-TN | 347
viennacl: OpenCL BLAS - dGEMM-NN | 344
viennacl: OpenCL BLAS - dGEMV-T | 343
viennacl: OpenCL BLAS - dGEMV-N | 381
viennacl: OpenCL BLAS - dDOT | 405
viennacl: OpenCL BLAS - dAXPY | 387
viennacl: OpenCL BLAS - dCOPY | 368
viennacl: OpenCL BLAS - sDOT | 256
viennacl: OpenCL BLAS - sAXPY | 325
viennacl: OpenCL BLAS - sCOPY | 280
viennacl: CPU BLAS - dGEMM-TT | 28.3
viennacl: CPU BLAS - dGEMM-TN | 28.9
viennacl: CPU BLAS - dGEMM-NT | 27.0
viennacl: CPU BLAS - dGEMM-NN | 28.0
viennacl: CPU BLAS - dGEMV-T | 49.8
viennacl: CPU BLAS - dGEMV-N | 48.7
viennacl: CPU BLAS - dDOT | 45.1
viennacl: CPU BLAS - dAXPY | 34.0
viennacl: CPU BLAS - dCOPY | 22.5
viennacl: CPU BLAS - sDOT | 47
viennacl: CPU BLAS - sAXPY | 35.6
viennacl: CPU BLAS - sCOPY | 23.4
financebench: Black-Scholes OpenCL | 11.904
luxcorerender: Rainbow Colors and Prism - GPU | 12.10
luxcorerender: LuxCore Benchmark - GPU | 3.67
luxcorerender: Orange Juice - GPU | 4.79
luxcorerender: Danish Mood - GPU | 2.86
luxcorerender: DLSC - GPU | 4.31
arrayfire: Conjugate Gradient OpenCL | 2.068
lczero: OpenCL | 31943
fahbench | 265.6517
octanebench: Total Score | 270.154655
vkresample: 2x - Single | 19.052
vkresample: 2x - Double | 213.162
betsy: ETC2 RGB - Highest | 6.730
betsy: ETC1 - Highest | 4.975
namd-cuda: ATPase Simulation - 327,506 Atoms | 0.17916
cl-mem: Write | 330.5
cl-mem: Read | 396.1
cl-mem: Copy | 294.0
shoc: OpenCL - Texture Read Bandwidth | 1183.85
shoc: OpenCL - Bus Speed Readback | 13.1891
shoc: OpenCL - Bus Speed Download | 12.6731
shoc: OpenCL - Max SP Flops | 11598.6
shoc: OpenCL - GEMM SGEMM_N | 3334.85
shoc: OpenCL - Reduction | 322.206
shoc: OpenCL - MD5 Hash | 25.8290
shoc: OpenCL - FFT SP | 1098.11
shoc: OpenCL - Triad | 12.3415
shoc: OpenCL - S3D | 197.076
hashcat: TrueCrypt RIPEMD160 + XTS | 483067
hashcat: SHA-512 | 1671066667
hashcat: 7-Zip | 667567
hashcat: SHA1 | 13039400000
hashcat: MD5 | 41316300000
vkfft | 29593
waifu2x-ncnn: 2x - 3 - Yes | 4.745
realsr-ncnn: 4x - Yes | 66.165
realsr-ncnn: 4x - No | 10.567
vkpeak: int16-vec4 | 9950.42
vkpeak: int16-scalar | 7605.25
vkpeak: int32-vec4 | 11470.55
vkpeak: int32-scalar | 11560.21
vkpeak: fp64-vec4 | 364.79
vkpeak: fp64-scalar | 363.41
vkpeak: fp16-vec4 | 22884.74
vkpeak: fp16-scalar | 11660.50
vkpeak: fp32-vec4 | 11647.53
vkpeak: fp32-scalar | 11666.70
blender: Fishy Cat - NVIDIA OptiX | 53.14
blender: BMW27 - NVIDIA OptiX | 35.04
ncnn: Vulkan GPU - efficientnet-b0 | 2.93
ncnn: Vulkan GPU - mobilenet-v3 | 1.99
waifu2x-ncnn: 2x - 3 - No | (no value listed)
NeatBench 5 Acceleration: GPU (FPS, More Is Better) - RTX 2080: 2080; SE +/- 0.00, N = 3
clpeak OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better) - RTX 2080: 368.50; SE +/- 0.16, N = 3; (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak OpenCL Test: Double-Precision Double (GFLOPS, More Is Better) - RTX 2080: 363.60; SE +/- 0.01, N = 3; (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak OpenCL Test: Single-Precision Float (GFLOPS, More Is Better) - RTX 2080: 8924.60; SE +/- 29.57, N = 3; (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak OpenCL Test: Integer Compute INT (GIOPS, More Is Better) - RTX 2080: 9352.06; SE +/- 104.43, N = 15; (CXX) g++ options: -O3 -rdynamic -lOpenCL
MandelGPU 1.3pts1 OpenCL Device: GPU (Samples/sec, More Is Better) - RTX 2080: 354984648.5; SE +/- 798592.26, N = 3; (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
Blender 2.92 Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, Fewer Is Better) - RTX 2080: 150.54; SE +/- 0.04, N = 3
Blender 2.92 Blend File: Pabellon Barcelona - Compute: CUDA (Seconds, Fewer Is Better) - RTX 2080: 349.37; SE +/- 0.04, N = 3
Blender 2.92 Blend File: Barbershop - Compute: NVIDIA OptiX (Seconds, Fewer Is Better) - RTX 2080: 941.59; SE +/- 0.50, N = 3
Blender 2.92 Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better) - RTX 2080: 92.65; SE +/- 0.06, N = 3
Blender 2.92 Blend File: Barbershop - Compute: CUDA (Seconds, Fewer Is Better) - RTX 2080: 560.30; SE +/- 0.80, N = 3
Blender 2.92 Blend File: Fishy Cat - Compute: CUDA (Seconds, Fewer Is Better) - RTX 2080: 97.03; SE +/- 0.07, N = 3
Blender 2.92 Blend File: Classroom - Compute: CUDA (Seconds, Fewer Is Better) - RTX 2080: 163.55; SE +/- 0.16, N = 3
Blender 2.92 Blend File: BMW27 - Compute: CUDA (Seconds, Fewer Is Better) - RTX 2080: 52.27; SE +/- 0.12, N = 3
IndigoBench 4.4 Acceleration: OpenCL GPU - Scene: Supercar (M samples/s, More Is Better) - RTX 2080: 25.74; SE +/- 0.02, N = 3
IndigoBench 4.4 Acceleration: OpenCL GPU - Scene: Bedroom (M samples/s, More Is Better) - RTX 2080: 8.132; SE +/- 0.005, N = 3
NCNN 20210720 Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better) - RTX 2080: 2.06; SE +/- 0.01, N = 2; MIN: 2.02 / MAX: 3.39
NCNN 20210720 Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better) - RTX 2080: 4.62; SE +/- 0.11, N = 3; MIN: 4.09 / MAX: 28.4
NCNN 20210720 Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better) - RTX 2080: 6.62; SE +/- 0.01, N = 3; MIN: 6.4 / MAX: 12.03
NCNN 20210720 Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better) - RTX 2080: 3.58; SE +/- 0.01, N = 3; MIN: 3.54 / MAX: 7.28
NCNN 20210720 Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better) - RTX 2080: 2.05; SE +/- 0.00, N = 3; MIN: 1.86 / MAX: 2.51
NCNN 20210720 Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better) - RTX 2080: 1.64; SE +/- 0.00, N = 2; MIN: 1.62 / MAX: 1.85
NCNN 20210720 Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better) - RTX 2080: 6.92; SE +/- 0.02, N = 3; MIN: 6.41 / MAX: 18.43
NCNN 20210720 Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better) - RTX 2080: 3.73; SE +/- 0.08, N = 3; MIN: 3.24 / MAX: 11.72
NCNN 20210720 Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better) - RTX 2080: 0.80; SE +/- 0.00, N = 3; MIN: 0.78 / MAX: 1.9
NCNN 20210720 Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better) - RTX 2080: 1.65; SE +/- 0.00, N = 3; MIN: 1.63 / MAX: 2.09
NCNN 20210720 Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better) - RTX 2080: 1.47; SE +/- 0.00, N = 3; MIN: 1.44 / MAX: 2.91
NCNN 20210720 Target: Vulkan GPU - Model: mobilenet-v2 (ms, Fewer Is Better) - RTX 2080: 1.59; SE +/- 0.00, N = 3; MIN: 1.56 / MAX: 3.07
NCNN 20210720 Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better) - RTX 2080: 4.00; SE +/- 0.00, N = 3; MIN: 3.94 / MAX: 4.16
NCNN 20210720 build for all Vulkan GPU results above: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMM-TT (GFLOPs/s, More Is Better) - RTX 2080: 342
ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMM-NT (GFLOPs/s, More Is Better) - RTX 2080: 341; SE +/- 0.00, N = 2
ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMM-TN (GFLOPs/s, More Is Better) - RTX 2080: 347; SE +/- 0.67, N = 3
ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMM-NN (GFLOPs/s, More Is Better) - RTX 2080: 344; SE +/- 1.00, N = 2
ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMV-T (GB/s, More Is Better) - RTX 2080: 343; SE +/- 0.33, N = 3
ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMV-N (GB/s, More Is Better) - RTX 2080: 381; SE +/- 0.33, N = 3
ViennaCL 1.7.1 Test: OpenCL BLAS - dDOT (GB/s, More Is Better) - RTX 2080: 405; SE +/- 0.00, N = 3
ViennaCL 1.7.1 Test: OpenCL BLAS - dAXPY (GB/s, More Is Better) - RTX 2080: 387; SE +/- 0.33, N = 3
ViennaCL 1.7.1 Test: OpenCL BLAS - dCOPY (GB/s, More Is Better) - RTX 2080: 368; SE +/- 0.00, N = 3
ViennaCL 1.7.1 Test: OpenCL BLAS - sDOT (GB/s, More Is Better) - RTX 2080: 256; SE +/- 0.33, N = 3
ViennaCL 1.7.1 Test: OpenCL BLAS - sAXPY (GB/s, More Is Better) - RTX 2080: 325; SE +/- 0.33, N = 3
ViennaCL 1.7.1 Test: OpenCL BLAS - sCOPY (GB/s, More Is Better) - RTX 2080: 280; SE +/- 0.67, N = 3
ViennaCL 1.7.1 Test: CPU BLAS - dGEMM-TT (GFLOPs/s, More Is Better) - RTX 2080: 28.3; SE +/- 0.03, N = 3
ViennaCL 1.7.1 Test: CPU BLAS - dGEMM-TN (GFLOPs/s, More Is Better) - RTX 2080: 28.9; SE +/- 0.03, N = 3
ViennaCL 1.7.1 Test: CPU BLAS - dGEMM-NT (GFLOPs/s, More Is Better) - RTX 2080: 27.0; SE +/- 0.09, N = 3
ViennaCL 1.7.1 Test: CPU BLAS - dGEMM-NN (GFLOPs/s, More Is Better) - RTX 2080: 28.0; SE +/- 0.03, N = 3
ViennaCL 1.7.1 Test: CPU BLAS - dGEMV-T (GB/s, More Is Better) - RTX 2080: 49.8; SE +/- 0.03, N = 3
ViennaCL 1.7.1 Test: CPU BLAS - dGEMV-N (GB/s, More Is Better) - RTX 2080: 48.7; SE +/- 0.07, N = 3
ViennaCL 1.7.1 Test: CPU BLAS - dDOT (GB/s, More Is Better) - RTX 2080: 45.1; SE +/- 0.00, N = 3
ViennaCL 1.7.1 Test: CPU BLAS - dAXPY (GB/s, More Is Better) - RTX 2080: 34.0; SE +/- 0.03, N = 3
ViennaCL 1.7.1 Test: CPU BLAS - dCOPY (GB/s, More Is Better) - RTX 2080: 22.5; SE +/- 0.00, N = 3
ViennaCL 1.7.1 Test: CPU BLAS - sDOT (GB/s, More Is Better) - RTX 2080: 47; SE +/- 0.00, N = 3
ViennaCL 1.7.1 Test: CPU BLAS - sAXPY (GB/s, More Is Better) - RTX 2080: 35.6; SE +/- 0.03, N = 3
ViennaCL 1.7.1 Test: CPU BLAS - sCOPY (GB/s, More Is Better) - RTX 2080: 23.4; SE +/- 0.00, N = 3
ViennaCL 1.7.1 build for all results above: (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
FinanceBench 2016-07-25 Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better) - RTX 2080: 11.90; SE +/- 0.14, N = 3; (CXX) g++ options: -O3 -march=native -fopenmp
LuxCoreRender 2.5 Scene: Rainbow Colors and Prism - Acceleration: GPU (M samples/sec, More Is Better) - RTX 2080: 12.10; SE +/- 0.05, N = 3; MIN: 10.32 / MAX: 12.85
LuxCoreRender 2.5 Scene: LuxCore Benchmark - Acceleration: GPU (M samples/sec, More Is Better) - RTX 2080: 3.67; SE +/- 0.01, N = 3; MIN: 0.65 / MAX: 4.71
LuxCoreRender 2.5 Scene: Orange Juice - Acceleration: GPU (M samples/sec, More Is Better) - RTX 2080: 4.79; SE +/- 0.01, N = 3; MIN: 3.99 / MAX: 5.29
LuxCoreRender 2.5 Scene: Danish Mood - Acceleration: GPU (M samples/sec, More Is Better) - RTX 2080: 2.86; SE +/- 0.02, N = 15; MIN: 0.56 / MAX: 3.83
LuxCoreRender 2.5 Scene: DLSC - Acceleration: GPU (M samples/sec, More Is Better) - RTX 2080: 4.31; SE +/- 0.01, N = 3; MIN: 2.98 / MAX: 4.5
ArrayFire 3.7 Test: Conjugate Gradient OpenCL (ms, Fewer Is Better) - RTX 2080: 2.068; SE +/- 0.002, N = 3; (CXX) g++ options: -rdynamic
LeelaChessZero 0.28 Backend: OpenCL (Nodes Per Second, More Is Better) - RTX 2080: 31943; SE +/- 157.25, N = 3; (CXX) g++ options: -flto -pthread
FAHBench 2.3.2 (Ns Per Day, More Is Better) - RTX 2080: 265.65; SE +/- 0.37, N = 3
OctaneBench 2020.1 Total Score (Score, More Is Better) - RTX 2080: 270.15
VkResample 1.0 Upscale: 2x - Precision: Single (ms, Fewer Is Better) - RTX 2080: 19.05; SE +/- 0.01, N = 3; (CXX) g++ options: -O3
VkResample 1.0 Upscale: 2x - Precision: Double (ms, Fewer Is Better) - RTX 2080: 213.16; SE +/- 0.10, N = 3; (CXX) g++ options: -O3
Betsy GPU Compressor 1.1 Beta Codec: ETC2 RGB - Quality: Highest (Seconds, Fewer Is Better) - RTX 2080: 6.730; SE +/- 0.047, N = 3; (CXX) g++ options: -O3 -O2 -ldl
Betsy GPU Compressor 1.1 Beta Codec: ETC1 - Quality: Highest (Seconds, Fewer Is Better) - RTX 2080: 4.975; SE +/- 0.024, N = 3; (CXX) g++ options: -O3 -O2 -ldl
NAMD CUDA 2.14 ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better) - RTX 2080: 0.17916; SE +/- 0.00031, N = 3
cl-mem 2017-01-13 Benchmark: Write (GB/s, More Is Better) - RTX 2080: 330.5; SE +/- 1.10, N = 3
cl-mem 2017-01-13 Benchmark: Read (GB/s, More Is Better) - RTX 2080: 396.1; SE +/- 0.13, N = 3
cl-mem 2017-01-13 Benchmark: Copy (GB/s, More Is Better) - RTX 2080: 294.0; SE +/- 0.15, N = 3
cl-mem 2017-01-13 build for all results above: (CC) gcc options: -O2 -flto -lOpenCL
SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better) - RTX 2080: 1183.85; SE +/- 1.01, N = 3
SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better) - RTX 2080: 13.19; SE +/- 0.01, N = 3
SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Bus Speed Download (GB/s, More Is Better) - RTX 2080: 12.67; SE +/- 0.00, N = 3
SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, More Is Better) - RTX 2080: 11598.6; SE +/- 58.75, N = 3
SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS, More Is Better) - RTX 2080: 3334.85; SE +/- 1.38, N = 3
SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Reduction (GB/s, More Is Better) - RTX 2080: 322.21; SE +/- 0.16, N = 3
SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: MD5 Hash (GHash/s, More Is Better) - RTX 2080: 25.83; SE +/- 0.01, N = 3
SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better) - RTX 2080: 1098.11; SE +/- 1.42, N = 3
SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Triad (GB/s, More Is Better) - RTX 2080: 12.34; SE +/- 0.00, N = 3
SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: S3D (GFLOPS, More Is Better) - RTX 2080: 197.08; SE +/- 0.34, N = 3
SHOC 2020-04-17 build for all results above: (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
Hashcat 6.2.4 Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better) - RTX 2080: 483067; SE +/- 2216.85, N = 3
Hashcat 6.2.4 Benchmark: SHA-512 (H/s, More Is Better) - RTX 2080: 1671066667; SE +/- 1849624.59, N = 3
Hashcat 6.2.4 Benchmark: 7-Zip (H/s, More Is Better) - RTX 2080: 667567; SE +/- 731.06, N = 3
Hashcat 6.2.4 Benchmark: SHA1 (H/s, More Is Better) - RTX 2080: 13039400000; SE +/- 19055270.49, N = 3
Hashcat 6.2.4 Benchmark: MD5 (H/s, More Is Better) - RTX 2080: 41316300000; SE +/- 144315811.10, N = 3
VkFFT 1.1.1 (Benchmark Score, More Is Better) - RTX 2080: 29593; SE +/- 64.47, N = 3; (CXX) g++ options: -O3
Waifu2x-NCNN Vulkan 20200818 Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better) - RTX 2080: 4.745; SE +/- 0.017, N = 3
RealSR-NCNN 20200818 Scale: 4x - TAA: Yes (Seconds, Fewer Is Better) - RTX 2080: 66.17; SE +/- 0.10, N = 3
RealSR-NCNN 20200818 Scale: 4x - TAA: No (Seconds, Fewer Is Better) - RTX 2080: 10.57; SE +/- 0.10, N = 3
vkpeak 20210424 int16-vec4 (GIOPS, More Is Better) - RTX 2080: 9950.42; SE +/- 39.50, N = 3
vkpeak 20210424 int16-scalar (GIOPS, More Is Better) - RTX 2080: 7605.25; SE +/- 19.71, N = 3
vkpeak 20210424 int32-vec4 (GIOPS, More Is Better) - RTX 2080: 11470.55; SE +/- 29.16, N = 3
vkpeak 20210424 int32-scalar (GIOPS, More Is Better) - RTX 2080: 11560.21; SE +/- 51.10, N = 3
vkpeak 20210424 fp64-vec4 (GFLOPS, More Is Better) - RTX 2080: 364.79; SE +/- 1.38, N = 2
vkpeak 20210424 fp64-scalar (GFLOPS, More Is Better) - RTX 2080: 363.41; SE +/- 1.59, N = 3
vkpeak 20210424 fp16-vec4 (GFLOPS, More Is Better) - RTX 2080: 22884.74; SE +/- 154.37, N = 3
vkpeak 20210424 fp16-scalar (GFLOPS, More Is Better) - RTX 2080: 11660.50; SE +/- 78.41, N = 3
vkpeak 20210424 fp32-vec4 (GFLOPS, More Is Better) - RTX 2080: 11647.53; SE +/- 77.92, N = 3
vkpeak 20210424 fp32-scalar (GFLOPS, More Is Better) - RTX 2080: 11666.70; SE +/- 101.27, N = 3
Blender 2.92 Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better) - RTX 2080: 53.14; SE +/- 2.68, N = 15
Blender 2.92 Blend File: BMW27 - Compute: NVIDIA OptiX (Seconds, Fewer Is Better) - RTX 2080: 35.04; SE +/- 2.72, N = 15
NCNN 20210720 Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better) - RTX 2080: 2.93; SE +/- 0.15, N = 3; MIN: 2.75 / MAX: 6.96; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20210720 Target: Vulkan GPU - Model: mobilenet-v3 (ms, Fewer Is Better) - RTX 2080: 1.99; SE +/- 0.16, N = 3; MIN: 1.8 / MAX: 5.58; (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
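The Blender 2.92 results in this export give render times for both the CUDA and NVIDIA OptiX back ends on the same five scenes, so the two can be compared directly. The following is a minimal Python sketch of that arithmetic, not part of the Phoronix Test Suite output. The dictionary and function names (`cuda_seconds`, `optix_seconds`, `cuda_over_optix`) are hypothetical; the times are copied from the Blender results in this export. A ratio above 1 means OptiX finished faster than CUDA, a ratio below 1 means it was slower.

```python
# Render times in seconds, copied from the Blender 2.92 results in this export
# (Seconds, Fewer Is Better). The variable names are illustrative only.
cuda_seconds = {
    "BMW27": 52.27,
    "Classroom": 163.55,
    "Fishy Cat": 97.03,
    "Pabellon Barcelona": 349.37,
    "Barbershop": 560.30,
}
optix_seconds = {
    "BMW27": 35.04,
    "Classroom": 92.65,
    "Fishy Cat": 53.14,
    "Pabellon Barcelona": 150.54,
    "Barbershop": 941.59,
}


def cuda_over_optix(cuda, optix):
    """Return CUDA time divided by OptiX time for every scene present in both."""
    return {scene: cuda[scene] / optix[scene] for scene in cuda if scene in optix}


ratios = cuda_over_optix(cuda_seconds, optix_seconds)

# Print scenes from the largest OptiX advantage to the smallest.
for scene, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    verdict = "OptiX faster" if ratio > 1 else "OptiX slower"
    print(f"{scene}: {ratio:.2f}x ({verdict})")
```

With these numbers, Barbershop is the only scene where the OptiX run took longer than the CUDA run. The Fishy Cat and BMW27 OptiX results also carry a much larger standard error (N = 15) than the other Blender runs, so their ratios are the least certain.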
Phoronix Test Suite v10.8.4