Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2307077-NE-2307020PT71

2307020-PTS-GPUREVIEW1 - Phoronix Test Suite
2307020-PTS-GPUREVIEW1: RTX 4080 16GB PNY REVIEW
HTML result view exported from: https://openbenchmarking.org/result/2307077-NE-2307020PT71&sro&grr .
2307020-PTS-GPUREVIEW1

System details (only the graphics card is listed as differing between the four systems):

Processor: Intel Xeon w9-3495X @ 4.80GHz (56 Cores / 112 Threads)
Motherboard: ASUS Pro WS W790E-SAGE SE (0506 BIOS)
Chipset: Intel Device 7aa7
Memory: 8 x 32 GB DDR5-4812MT/s Hynix HMCG88AEBRA115N
Disk: 6401GB Micron_9300_MTFDHAL6T4TDR + 0GB Virtual HDisk0
Audio: Realtek ALC1220
Monitor: BenQ PD2720U
Network: 2 x Intel X710 for 10GBASE-T
OS: Ubuntu 22.04
Kernel: 6.3.0-060300-generic (x86_64)
Desktop: GNOME Shell 42.5
Display Server: X Server 1.21.1.4
Display Driver: NVIDIA 530.41.03
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.1.98
Vulkan: 1.3.236
Compiler: GCC 11.3.0 + CUDA 12.1
File-System: ext4
Screen Resolution: 3840x2160

Graphics:
RTX 3090 24GB -Zotac: NVIDIA GeForce RTX 3090 24GB
RTX 2080 Ti 22GB -Dell: NVIDIA GeForce RTX 2080 Ti 22GB
RTX 4090 24GB -Nvidia: NVIDIA GeForce RTX 4090 24GB
RTX 4080 16GB -Pny: NVIDIA GeForce RTX 4080 16GB

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate powersave (EPP: performance)
- CPU Microcode: 0x2b000390

Graphics Details
- RTX 3090 24GB -Zotac: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 94.02.26.48.65
- RTX 2080 Ti 22GB -Dell: BAR1 / Visible vRAM Size: 256 MiB - vBIOS Version: 90.02.30.40.4d
- RTX 4090 24GB -Nvidia: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 95.02.20.00.03
- RTX 4080 16GB -Pny: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.03.0e.00.67

OpenCL Details
- RTX 3090 24GB -Zotac: GPU Compute Cores: 10496
- RTX 2080 Ti 22GB -Dell: GPU Compute Cores: 4352
- RTX 4090 24GB -Nvidia: GPU Compute Cores: 16384
- RTX 4080 16GB -Pny: GPU Compute Cores: 9728

Python Details
- Python 3.10.6

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence
- srbds: Not affected
- tsx_async_abort: Not affected
2307020-PTS-GPUREVIEW1

Tests, in column order:
shoc: OpenCL - Max SP Flops
vkfft:
lczero: OpenCL
ncnn: Vulkan GPU - FastestDet
ncnn: Vulkan GPU - vision_transformer
ncnn: Vulkan GPU - regnety_400m
ncnn: Vulkan GPU - squeezenet_ssd
ncnn: Vulkan GPU - yolov4-tiny
ncnn: Vulkan GPU - resnet50
ncnn: Vulkan GPU - alexnet
ncnn: Vulkan GPU - resnet18
ncnn: Vulkan GPU - vgg16
ncnn: Vulkan GPU - googlenet
ncnn: Vulkan GPU - blazeface
ncnn: Vulkan GPU - efficientnet-b0
ncnn: Vulkan GPU - mnasnet
ncnn: Vulkan GPU - shufflenet-v2
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2
ncnn: Vulkan GPU - mobilenet
vkpeak: int16-vec4
vkpeak: int16-scalar
vkpeak: int32-vec4
vkpeak: int32-scalar
vkpeak: fp64-vec4
vkpeak: fp64-scalar
vkpeak: fp16-vec4
vkpeak: fp16-scalar
vkpeak: fp32-vec4
vkpeak: fp32-scalar
octanebench: Total Score
fahbench:
v-ray: NVIDIA RTX GPU
v-ray: NVIDIA CUDA GPU
viennacl: CPU BLAS - dGEMM-TT
viennacl: CPU BLAS - dGEMM-TN
viennacl: CPU BLAS - dGEMM-NT
viennacl: CPU BLAS - dGEMM-NN
viennacl: CPU BLAS - dGEMV-T
viennacl: CPU BLAS - dGEMV-N
viennacl: CPU BLAS - dDOT
viennacl: CPU BLAS - dAXPY
viennacl: CPU BLAS - dCOPY
viennacl: CPU BLAS - sDOT
viennacl: CPU BLAS - sAXPY
viennacl: CPU BLAS - sCOPY
indigobench: OpenCL GPU - Bedroom
indigobench: OpenCL GPU - Supercar
namd-cuda: ATPase Simulation - 327,506 Atoms
blender: Barbershop - NVIDIA OptiX
gromacs: NVIDIA CUDA GPU - water_GMX50_bare
blender: Fishy Cat - NVIDIA OptiX
realsr-ncnn: 4x - Yes
blender: BMW27 - NVIDIA OptiX
vkresample: 2x - Double
viennacl: OpenCL BLAS - dGEMM-TT
viennacl: OpenCL BLAS - dGEMM-TN
viennacl: OpenCL BLAS - dGEMM-NT
viennacl: OpenCL BLAS - dGEMM-NN
viennacl: OpenCL BLAS - dGEMV-T
viennacl: OpenCL BLAS - dGEMV-N
viennacl: OpenCL BLAS - dDOT
viennacl: OpenCL BLAS - dAXPY
viennacl: OpenCL BLAS - dCOPY
viennacl: OpenCL BLAS - sDOT
viennacl: OpenCL BLAS - sAXPY
viennacl: OpenCL BLAS - sCOPY
hashcat: MD5
blender: Pabellon Barcelona - NVIDIA OptiX
blender: Classroom - NVIDIA OptiX
vkresample: 2x - Single
shoc: OpenCL - Texture Read Bandwidth
hashcat: TrueCrypt RIPEMD160 + XTS
clpeak: Double-Precision Double
hashcat: SHA-512
caffe: GoogleNet - NVIDIA CUDA - 1000
hashcat: SHA1
realsr-ncnn: 4x - No
hashcat: 7-Zip
caffe: AlexNet - NVIDIA CUDA - 1000
rodinia: OpenCL Particle Filter
waifu2x-ncnn: 2x - 3 - Yes
caffe: GoogleNet - NVIDIA CUDA - 200
cl-mem: Copy
cl-mem: Write
cl-mem: Read
arrayfire: Conjugate Gradient OpenCL
clpeak: Global Memory Bandwidth
caffe: GoogleNet - NVIDIA CUDA - 100
mandelgpu: GPU
shoc: OpenCL - GEMM SGEMM_N
caffe: AlexNet - NVIDIA CUDA - 200
shoc: OpenCL - FFT SP
shoc: OpenCL - S3D
shoc: OpenCL - Triad
caffe: AlexNet - NVIDIA CUDA - 100
shoc: OpenCL - Bus Speed Readback
clpeak: Single-Precision Float
shoc: OpenCL - Bus Speed Download
shoc: OpenCL - Reduction
clpeak: Integer Compute INT
neatbench: GPU
financebench: Black-Scholes OpenCL
shoc: OpenCL - MD5 Hash

Results per system, listed in test order (the RTX 2080 Ti row has fewer entries than there are tests, so it does not align one-to-one with the list above):

RTX 3090 24GB -Zotac:
37738.4 41547 14316 13.52 324.88 25.04 57.27 49.10 6.35 21.31 21.45 4.40 28.44 5.38 23.97 19.16 17.57 18.78 18.25 37.93 16205.98 13295.72 20074.10 20305.75 645.31 645.24 40002.93 20161.59 26392.09 20403.35 669.371026 316.5721 2878 2052 143 141 120 107 749 172 606 984 834 764 2272 1842 20.772 51.701 0.06742 52.28 23.743 10.84 31.170 6.13 121.353 584 587 588 585 371 185 658 716 600 372 495 363 63700926667 16.07 14.25 9.359 2148.86 731327 640.15 2595250000 24142.8 20634050000 6.509 1095950 6584.81 3.768 3.536 4830.59 359.2 734.3 823.3 1.583 814.92 2418.86 568293355.5 7912.59 1321.36 2349.39 427.491 24.5486 664.597 26.3975 35000.44 25.1930 392.804 17864.54 3090 5.796 41.3324

RTX 2080 Ti 22GB -Dell:
16106.4 33404 13228 6.80 363.75 23.13 61.74 52.48 5.85 21.50 22.13 7.07 28.22 5.49 26.09 7.03 6.11 18.77 6.35 38.20 12975.65 10270.26 15752.98 15963.59 503.53 500.91 30649.13 15521.12 15859.12 15974.22 349.490185 292.6940 1306 950 145 142 119 106.8 742 179 658 1058 923 747 2261 1755 11.152 32.598 0.09515 92.87 15.377 16.89 52.755 8.93 153.146 458 460 461 459 370 297 524 518 475 305 392 306 51503250000 27.42 22.69 14.790 1147.26 580260 506.01 2047183333 16300900000 9.239 840275 4.527 4.307 318.1 452.2 543.8 1.679 504.66 442866316.9 4858.13 1504.87 270.801 12.6064 13.2046 14076.71 12.8530 366.058 11826.76 2080 8.840 32.5181

RTX 4090 24GB -Nvidia:
88606.0 59427 15097 5.08 289.04 5.46 8.99 35.23 7.28 5.08 4.50 25.14 5.77 3.95 6.42 4.39 4.85 4.97 4.42 9.21 39280.82 29507.17 44062.52 44276.72 1396.26 1394.44 87698.39 44269.79 58607.46 44315.35 1326.237674 423.2069 5478 4265 144 137 115 120 771 188 671 1098 946 784 2360 1874 35.296 78.445 0.04850 30.55 43.470 5.73 20.187 3.70 55.925 1340 1293 1280 1150 443 220 719 772 661 446 568 444 154333333333 8.42 7.47 7.799 2980.15 1857800 1391.51 6297766667 16499.3 49393366667 5.109 2746938 4379.70 2.085 2.490 3304.15 410.3 785.8 886.0 0.8803 871.56 1660.01 951815274.2 27175.9 883.953 2779.48 644.603 25.0717 447.341 26.3724 79463.36 25.1311 967.142 40697.73 4090 2.890 93.9019

RTX 4080 16GB -Pny:
55455.8 48201 15316 5.18 290.15 5.32 9.13 35.87 5.16 17.57 4.78 7.90 5.71 3.89 6.30 4.37 4.78 5.02 4.32 9.36 24616.84 18441.46 27595.51 27758.93 873.67 873.52 54827.66 27669.37 36655.69 27771.88 977.52814 424.3492 4080 3126 137 135 122 103.5 787 187 678 1085 956 783 2422 1901 26.080 66.030 0.05427 37.97 32.044 7.59 24.374 4.43 89.237 849 831 795 774 435 224 597 608 540 417 487 385 96891966667 10.41 9.32 11.543 3065.76 1146560 869.25 3936350000 16310.5 30881533333 5.630 1747350 5228.00 2.685 2.761 3279.17 382.0 567.7 623.2 1.122 611.67 1649.29 772392077.3 16952.6 1053.88 1831.86 423.578 24.7590 537.491 26.3935 47802.87 25.1007 1020.235 24524.34 4080 4.404 60.5581

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Max SP Flops
GFLOPS, More Is Better
RTX 2080 Ti 22GB -Dell: 16106.4 (SE +/- 102.77, N = 3)
RTX 3090 24GB -Zotac: 37738.4 (SE +/- 292.47, N = 3)
RTX 4080 16GB -Pny: 55455.8 (SE +/- 73.47, N = 3)
RTX 4090 24GB -Nvidia: 88606.0 (SE +/- 19.25, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
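
The SHOC Max SP Flops averages reported above can be expressed as multiples of one card's result. The following is a minimal sketch, assuming the RTX 2080 Ti is taken as the baseline; the baseline choice and the helper name `relative_performance` are illustrative and are not part of the Phoronix Test Suite.

```python
# Reported SHOC Max SP Flops averages (GFLOPS, more is better),
# copied from the result above.
max_sp_flops = {
    "RTX 2080 Ti 22GB -Dell": 16106.4,
    "RTX 3090 24GB -Zotac": 37738.4,
    "RTX 4080 16GB -Pny": 55455.8,
    "RTX 4090 24GB -Nvidia": 88606.0,
}


def relative_performance(results, baseline, higher_is_better=True):
    """Return each result as a multiple of the baseline system's result.

    For "fewer is better" metrics the ratio is inverted, so a value
    above 1.0 always means "faster than the baseline".
    (Hypothetical helper, written for this illustration only.)
    """
    base = results[baseline]
    if higher_is_better:
        return {name: value / base for name, value in results.items()}
    return {name: base / value for name, value in results.items()}


if __name__ == "__main__":
    ratios = relative_performance(max_sp_flops, "RTX 2080 Ti 22GB -Dell")
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.2f}x")
```

For the "Fewer Is Better" results in this file (for example the NCNN timings in ms), the same helper could be called with `higher_is_better=False`.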

VkFFT 1.1.1
Benchmark Score, More Is Better
RTX 2080 Ti 22GB -Dell: 33404 (SE +/- 163.05, N = 3)
RTX 3090 24GB -Zotac: 41547 (SE +/- 475.08, N = 9)
RTX 4080 16GB -Pny: 48201 (SE +/- 213.00, N = 3)
RTX 4090 24GB -Nvidia: 59427 (SE +/- 72.97, N = 3)
1. (CXX) g++ options: -O3

LeelaChessZero 0.28
Backend: OpenCL
Nodes Per Second, More Is Better
RTX 2080 Ti 22GB -Dell: 13228 (SE +/- 135.58, N = 4)
RTX 3090 24GB -Zotac: 14316 (SE +/- 52.37, N = 3)
RTX 4080 16GB -Pny: 15316 (SE +/- 126.87, N = 3)
RTX 4090 24GB -Nvidia: 15097 (SE +/- 198.64, N = 3)
1. (CXX) g++ options: -flto -pthread

NCNN 20220729
Target: Vulkan GPU - Model: FastestDet
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 6.80 (SE +/- 0.10, N = 3; MIN: 5.02 / MAX: 32.78)
RTX 3090 24GB -Zotac: 13.52 (SE +/- 3.64, N = 3; MIN: 5.33 / MAX: 35.07)
RTX 4080 16GB -Pny: 5.18 (SE +/- 0.16, N = 3; MIN: 4.28 / MAX: 25.38)
RTX 4090 24GB -Nvidia: 5.08 (SE +/- 0.04, N = 3; MIN: 4.3 / MAX: 23.08)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU - Model: vision_transformer
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 363.75 (SE +/- 0.57, N = 3; MIN: 316.04 / MAX: 471.43)
RTX 3090 24GB -Zotac: 324.88 (SE +/- 0.55, N = 3; MIN: 283.26 / MAX: 423.43)
RTX 4080 16GB -Pny: 290.15 (SE +/- 3.05, N = 3; MIN: 247.6 / MAX: 837.91)
RTX 4090 24GB -Nvidia: 289.04 (SE +/- 1.34, N = 3; MIN: 260.04 / MAX: 661.43)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU - Model: regnety_400m
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 23.13 (SE +/- 2.01, N = 3; MIN: 8.74 / MAX: 39.68)
RTX 3090 24GB -Zotac: 25.04 (SE +/- 0.25, N = 3; MIN: 11.25 / MAX: 44.6)
RTX 4080 16GB -Pny: 5.32 (SE +/- 0.02, N = 3; MIN: 4.68 / MAX: 22.84)
RTX 4090 24GB -Nvidia: 5.46 (SE +/- 0.04, N = 3; MIN: 4.78 / MAX: 28.36)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU - Model: squeezenet_ssd
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 61.74 (SE +/- 0.50, N = 3; MIN: 23.2 / MAX: 85.24)
RTX 3090 24GB -Zotac: 57.27 (SE +/- 0.43, N = 3; MIN: 16.86 / MAX: 94.95)
RTX 4080 16GB -Pny: 9.13 (SE +/- 0.11, N = 3; MIN: 7.71 / MAX: 66.82)
RTX 4090 24GB -Nvidia: 8.99 (SE +/- 0.05, N = 3; MIN: 8.18 / MAX: 62.47)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU - Model: yolov4-tiny
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 52.48 (SE +/- 0.13, N = 3; MIN: 20.11 / MAX: 75.46)
RTX 3090 24GB -Zotac: 49.10 (SE +/- 0.40, N = 3; MIN: 17.59 / MAX: 76.07)
RTX 4080 16GB -Pny: 35.87 (SE +/- 0.08, N = 3; MIN: 11.56 / MAX: 61.96)
RTX 4090 24GB -Nvidia: 35.23 (SE +/- 0.10, N = 3; MIN: 11.4 / MAX: 59.2)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU - Model: resnet50
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 5.85 (SE +/- 0.14, N = 3; MIN: 4.75 / MAX: 26.01)
RTX 3090 24GB -Zotac: 6.35 (SE +/- 0.14, N = 3; MIN: 5 / MAX: 21.95)
RTX 4080 16GB -Pny: 5.16 (SE +/- 0.04, N = 3; MIN: 4.78 / MAX: 21.85)
RTX 4090 24GB -Nvidia: 7.28 (SE +/- 0.77, N = 3; MIN: 3.88 / MAX: 30.89)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU - Model: alexnet
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 21.50 (SE +/- 0.30, N = 3; MIN: 3.33 / MAX: 40.17)
RTX 3090 24GB -Zotac: 21.31 (SE +/- 0.22, N = 3; MIN: 7.31 / MAX: 35.87)
RTX 4080 16GB -Pny: 17.57 (SE +/- 0.19, N = 3; MIN: 7.8 / MAX: 33.45)
RTX 4090 24GB -Nvidia: 5.08 (SE +/- 0.08, N = 3; MIN: 3.13 / MAX: 27.91)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU - Model: resnet18
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 22.13 (SE +/- 0.55, N = 3; MIN: 9.01 / MAX: 43.23)
RTX 3090 24GB -Zotac: 21.45 (SE +/- 0.21, N = 3; MIN: 8.94 / MAX: 40.36)
RTX 4080 16GB -Pny: 4.78 (SE +/- 0.06, N = 3; MIN: 4.18 / MAX: 28.49)
RTX 4090 24GB -Nvidia: 4.50 (SE +/- 0.10, N = 3; MIN: 3.89 / MAX: 21.9)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU - Model: vgg16
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 7.07 (SE +/- 1.00, N = 3; MIN: 4.46 / MAX: 48.35)
RTX 3090 24GB -Zotac: 4.40 (SE +/- 0.11, N = 3; MIN: 3.72 / MAX: 24.39)
RTX 4080 16GB -Pny: 7.90 (SE +/- 3.46, N = 3; MIN: 3.84 / MAX: 36.49)
RTX 4090 24GB -Nvidia: 25.14 (SE +/- 0.10, N = 3; MIN: 12.85 / MAX: 33.3)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU - Model: googlenet
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 28.22 (SE +/- 0.49, N = 3; MIN: 15.18 / MAX: 38.54)
RTX 3090 24GB -Zotac: 28.44 (SE +/- 0.13, N = 3; MIN: 14.74 / MAX: 44.82)
RTX 4080 16GB -Pny: 5.71 (SE +/- 0.11, N = 3; MIN: 5.03 / MAX: 22.85)
RTX 4090 24GB -Nvidia: 5.77 (SE +/- 0.10, N = 3; MIN: 4.99 / MAX: 22.76)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU - Model: blazeface
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 5.49 (SE +/- 0.07, N = 3; MIN: 3.72 / MAX: 33.58)
RTX 3090 24GB -Zotac: 5.38 (SE +/- 0.22, N = 3; MIN: 4.06 / MAX: 22.14)
RTX 4080 16GB -Pny: 3.89 (SE +/- 0.10, N = 3; MIN: 3.33 / MAX: 36.65)
RTX 4090 24GB -Nvidia: 3.95 (SE +/- 0.08, N = 3; MIN: 3.22 / MAX: 30.17)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU - Model: efficientnet-b0
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 26.09 (SE +/- 0.20, N = 3; MIN: 12.37 / MAX: 40.64)
RTX 3090 24GB -Zotac: 23.97 (SE +/- 0.63, N = 3; MIN: 10.15 / MAX: 37.01)
RTX 4080 16GB -Pny: 6.30 (SE +/- 0.10, N = 3; MIN: 5.32 / MAX: 26.26)
RTX 4090 24GB -Nvidia: 6.42 (SE +/- 0.36, N = 3; MIN: 5.35 / MAX: 28.33)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU - Model: mnasnet
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 7.03 (SE +/- 0.73, N = 3; MIN: 4.55 / MAX: 25.62)
RTX 3090 24GB -Zotac: 19.16 (SE +/- 0.13, N = 3; MIN: 7.92 / MAX: 33.04)
RTX 4080 16GB -Pny: 4.37 (SE +/- 0.02, N = 3; MIN: 3.71 / MAX: 23.07)
RTX 4090 24GB -Nvidia: 4.39 (SE +/- 0.10, N = 2; MIN: 3.79 / MAX: 22)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU - Model: shufflenet-v2
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 6.11 (SE +/- 0.28, N = 3; MIN: 4.77 / MAX: 33.2)
RTX 3090 24GB -Zotac: 17.57 (SE +/- 0.06, N = 3; MIN: 8.24 / MAX: 39.36)
RTX 4080 16GB -Pny: 4.78 (SE +/- 0.01, N = 3; MIN: 4.03 / MAX: 23.06)
RTX 4090 24GB -Nvidia: 4.85 (SE +/- 0.04, N = 3; MIN: 4.28 / MAX: 25.38)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 18.77 (SE +/- 0.83, N = 3; MIN: 7.68 / MAX: 35.81)
RTX 3090 24GB -Zotac: 18.78 (SE +/- 0.18, N = 3; MIN: 8.92 / MAX: 34.42)
RTX 4080 16GB -Pny: 5.02 (SE +/- 0.11, N = 3; MIN: 4.12 / MAX: 23.74)
RTX 4090 24GB -Nvidia: 4.97 (SE +/- 0.06, N = 3; MIN: 4.23 / MAX: 23.03)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 6.35 (SE +/- 0.39, N = 3; MIN: 4.5 / MAX: 27.11)
RTX 3090 24GB -Zotac: 18.25 (SE +/- 0.28, N = 3; MIN: 7.4 / MAX: 39.06)
RTX 4080 16GB -Pny: 4.32 (SE +/- 0.10, N = 3; MIN: 3.68 / MAX: 23.77)
RTX 4090 24GB -Nvidia: 4.42 (SE +/- 0.01, N = 3; MIN: 3.68 / MAX: 20.27)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20220729
Target: Vulkan GPU - Model: mobilenet
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 38.20 (SE +/- 0.18, N = 3; MIN: 10.75 / MAX: 66.72)
RTX 3090 24GB -Zotac: 37.93 (SE +/- 0.21, N = 3; MIN: 14.41 / MAX: 61.76)
RTX 4080 16GB -Pny: 9.36 (SE +/- 0.06, N = 3; MIN: 8.28 / MAX: 49.63)
RTX 4090 24GB -Nvidia: 9.21 (SE +/- 0.03, N = 3; MIN: 8.39 / MAX: 47.45)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

vkpeak 20210424
int16-vec4
GIOPS, More Is Better
RTX 2080 Ti 22GB -Dell: 12975.65 (SE +/- 4.25, N = 3)
RTX 3090 24GB -Zotac: 16205.98 (SE +/- 4.87, N = 3)
RTX 4080 16GB -Pny: 24616.84 (SE +/- 0.85, N = 3)
RTX 4090 24GB -Nvidia: 39280.82 (SE +/- 1.77, N = 3)

vkpeak 20210424
int16-scalar
GIOPS, More Is Better
RTX 2080 Ti 22GB -Dell: 10270.26 (SE +/- 0.89, N = 3)
RTX 3090 24GB -Zotac: 13295.72 (SE +/- 0.25, N = 3)
RTX 4080 16GB -Pny: 18441.46 (SE +/- 0.34, N = 3)
RTX 4090 24GB -Nvidia: 29507.17 (SE +/- 4.42, N = 3)

vkpeak 20210424
int32-vec4
GIOPS, More Is Better
RTX 2080 Ti 22GB -Dell: 15752.98 (SE +/- 15.45, N = 3)
RTX 3090 24GB -Zotac: 20074.10 (SE +/- 27.49, N = 3)
RTX 4080 16GB -Pny: 27595.51 (SE +/- 37.47, N = 3)
RTX 4090 24GB -Nvidia: 44062.52 (SE +/- 3.11, N = 3)

vkpeak 20210424
int32-scalar
GIOPS, More Is Better
RTX 2080 Ti 22GB -Dell: 15963.59 (SE +/- 39.60, N = 3)
RTX 3090 24GB -Zotac: 20305.75 (SE +/- 10.90, N = 3)
RTX 4080 16GB -Pny: 27758.93 (SE +/- 2.25, N = 3)
RTX 4090 24GB -Nvidia: 44276.72 (SE +/- 25.10, N = 3)

vkpeak 20210424
fp64-vec4
GFLOPS, More Is Better
RTX 2080 Ti 22GB -Dell: 503.53 (SE +/- 0.03, N = 3)
RTX 3090 24GB -Zotac: 645.31 (SE +/- 1.70, N = 3)
RTX 4080 16GB -Pny: 873.67 (SE +/- 0.01, N = 3)
RTX 4090 24GB -Nvidia: 1396.26 (SE +/- 1.25, N = 3)

vkpeak 20210424
fp64-scalar
GFLOPS, More Is Better
RTX 2080 Ti 22GB -Dell: 500.91 (SE +/- 1.28, N = 3)
RTX 3090 24GB -Zotac: 645.24 (SE +/- 1.62, N = 3)
RTX 4080 16GB -Pny: 873.52 (SE +/- 0.08, N = 3)
RTX 4090 24GB -Nvidia: 1394.44 (SE +/- 0.73, N = 3)

vkpeak 20210424
fp16-vec4
GFLOPS, More Is Better
RTX 2080 Ti 22GB -Dell: 30649.13 (SE +/- 26.46, N = 3)
RTX 3090 24GB -Zotac: 40002.93 (SE +/- 64.52, N = 3)
RTX 4080 16GB -Pny: 54827.66 (SE +/- 0.37, N = 3)
RTX 4090 24GB -Nvidia: 87698.39 (SE +/- 98.17, N = 3)

vkpeak 20210424
fp16-scalar
GFLOPS, More Is Better
RTX 2080 Ti 22GB -Dell: 15521.12 (SE +/- 61.99, N = 3)
RTX 3090 24GB -Zotac: 20161.59 (SE +/- 30.30, N = 3)
RTX 4080 16GB -Pny: 27669.37 (SE +/- 0.86, N = 3)
RTX 4090 24GB -Nvidia: 44269.79 (SE +/- 7.21, N = 3)

vkpeak 20210424
fp32-vec4
GFLOPS, More Is Better
RTX 2080 Ti 22GB -Dell: 15859.12 (SE +/- 77.45, N = 3)
RTX 3090 24GB -Zotac: 26392.09 (SE +/- 43.15, N = 3)
RTX 4080 16GB -Pny: 36655.69 (SE +/- 13.71, N = 3)
RTX 4090 24GB -Nvidia: 58607.46 (SE +/- 13.83, N = 3)

vkpeak 20210424
fp32-scalar
GFLOPS, More Is Better
RTX 2080 Ti 22GB -Dell: 15974.22 (SE +/- 136.44, N = 3)
RTX 3090 24GB -Zotac: 20403.35 (SE +/- 34.37, N = 3)
RTX 4080 16GB -Pny: 27771.88 (SE +/- 35.38, N = 3)
RTX 4090 24GB -Nvidia: 44315.35 (SE +/- 10.09, N = 3)

OctaneBench 2020.1
Total Score
Score, More Is Better
RTX 2080 Ti 22GB -Dell: 349.49
RTX 3090 24GB -Zotac: 669.37
RTX 4080 16GB -Pny: 977.53
RTX 4090 24GB -Nvidia: 1326.24

FAHBench 2.3.2
Ns Per Day, More Is Better
RTX 2080 Ti 22GB -Dell: 292.69 (SE +/- 0.79, N = 3)
RTX 3090 24GB -Zotac: 316.57 (SE +/- 0.41, N = 3)
RTX 4080 16GB -Pny: 424.35 (SE +/- 0.63, N = 3)
RTX 4090 24GB -Nvidia: 423.21 (SE +/- 0.24, N = 3)

Chaos Group V-RAY 5.02
Mode: NVIDIA RTX GPU
vrays, More Is Better
RTX 2080 Ti 22GB -Dell: 1306 (SE +/- 10.17, N = 3)
RTX 3090 24GB -Zotac: 2878 (SE +/- 18.12, N = 3)
RTX 4080 16GB -Pny: 4080 (SE +/- 30.20, N = 3)
RTX 4090 24GB -Nvidia: 5478 (SE +/- 63.00, N = 3)

Chaos Group V-RAY 5.02
Mode: NVIDIA CUDA GPU
vpaths, More Is Better
RTX 2080 Ti 22GB -Dell: 950 (SE +/- 0.67, N = 3)
RTX 3090 24GB -Zotac: 2052 (SE +/- 2.85, N = 3)
RTX 4080 16GB -Pny: 3126 (SE +/- 2.73, N = 3)
RTX 4090 24GB -Nvidia: 4265 (SE +/- 2.33, N = 3)

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s, More Is Better
RTX 2080 Ti 22GB -Dell: 145 (SE +/- 4.93, N = 15)
RTX 3090 24GB -Zotac: 143 (SE +/- 5.83, N = 15)
RTX 4080 16GB -Pny: 137 (SE +/- 6.46, N = 11)
RTX 4090 24GB -Nvidia: 144 (SE +/- 9.73, N = 5)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s, More Is Better
RTX 2080 Ti 22GB -Dell: 142 (SE +/- 4.47, N = 15)
RTX 3090 24GB -Zotac: 141 (SE +/- 5.51, N = 15)
RTX 4080 16GB -Pny: 135 (SE +/- 4.72, N = 12)
RTX 4090 24GB -Nvidia: 137 (SE +/- 9.50, N = 5)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s, More Is Better
RTX 2080 Ti 22GB -Dell: 119 (SE +/- 2.19, N = 15)
RTX 3090 24GB -Zotac: 120 (SE +/- 4.37, N = 15)
RTX 4080 16GB -Pny: 122 (SE +/- 4.18, N = 12)
RTX 4090 24GB -Nvidia: 115 (SE +/- 0.48, N = 4)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s, More Is Better
RTX 2080 Ti 22GB -Dell: 106.8 (SE +/- 1.59, N = 15)
RTX 3090 24GB -Zotac: 107.0 (SE +/- 1.05, N = 15)
RTX 4080 16GB -Pny: 103.5 (SE +/- 1.29, N = 12)
RTX 4090 24GB -Nvidia: 120.0 (SE +/- 10.59, N = 5)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-T
GB/s, More Is Better
RTX 2080 Ti 22GB -Dell: 742 (SE +/- 6.17, N = 14)
RTX 3090 24GB -Zotac: 749 (SE +/- 4.88, N = 15)
RTX 4080 16GB -Pny: 787 (SE +/- 7.65, N = 12)
RTX 4090 24GB -Nvidia: 771 (SE +/- 2.58, N = 5)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-N
GB/s, More Is Better
RTX 2080 Ti 22GB -Dell: 179 (SE +/- 1.84, N = 15)
RTX 3090 24GB -Zotac: 172 (SE +/- 8.48, N = 15)
RTX 4080 16GB -Pny: 187 (SE +/- 0.91, N = 12)
RTX 4090 24GB -Nvidia: 188 (SE +/- 1.74, N = 5)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: CPU BLAS - dDOT
GB/s, More Is Better
RTX 2080 Ti 22GB -Dell: 658 (SE +/- 4.70, N = 15)
RTX 3090 24GB -Zotac: 606 (SE +/- 1.68, N = 15)
RTX 4080 16GB -Pny: 678 (SE +/- 8.14, N = 12)
RTX 4090 24GB -Nvidia: 671 (SE +/- 6.71, N = 5)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: CPU BLAS - dAXPY
GB/s, More Is Better
RTX 2080 Ti 22GB -Dell: 1058 (SE +/- 5.45, N = 15)
RTX 3090 24GB -Zotac: 984 (SE +/- 6.89, N = 15)
RTX 4080 16GB -Pny: 1085 (SE +/- 7.44, N = 12)
RTX 4090 24GB -Nvidia: 1098 (SE +/- 22.23, N = 5)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: CPU BLAS - dCOPY
GB/s, More Is Better
RTX 2080 Ti 22GB -Dell: 923 (SE +/- 8.86, N = 15)
RTX 3090 24GB -Zotac: 834 (SE +/- 7.74, N = 15)
RTX 4080 16GB -Pny: 956 (SE +/- 9.35, N = 12)
RTX 4090 24GB -Nvidia: 946 (SE +/- 11.59, N = 5)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: CPU BLAS - sDOT
GB/s, More Is Better
RTX 2080 Ti 22GB -Dell: 747 (SE +/- 4.44, N = 15)
RTX 3090 24GB -Zotac: 764 (SE +/- 6.38, N = 15)
RTX 4080 16GB -Pny: 783 (SE +/- 5.06, N = 12)
RTX 4090 24GB -Nvidia: 784 (SE +/- 7.38, N = 5)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: CPU BLAS - sAXPY
GB/s, More Is Better
RTX 2080 Ti 22GB -Dell: 2261 (SE +/- 16.08, N = 15)
RTX 3090 24GB -Zotac: 2272 (SE +/- 18.29, N = 15)
RTX 4080 16GB -Pny: 2422 (SE +/- 31.16, N = 12)
RTX 4090 24GB -Nvidia: 2360 (SE +/- 13.04, N = 5)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: CPU BLAS - sCOPY
GB/s, More Is Better
RTX 2080 Ti 22GB -Dell: 1755 (SE +/- 17.80, N = 15)
RTX 3090 24GB -Zotac: 1842 (SE +/- 12.27, N = 15)
RTX 4080 16GB -Pny: 1901 (SE +/- 13.45, N = 12)
RTX 4090 24GB -Nvidia: 1874 (SE +/- 19.65, N = 5)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Bedroom
M samples/s, More Is Better
RTX 2080 Ti 22GB -Dell: 11.15 (SE +/- 0.00, N = 3)
RTX 3090 24GB -Zotac: 20.77 (SE +/- 0.01, N = 3)
RTX 4080 16GB -Pny: 26.08 (SE +/- 0.02, N = 3)
RTX 4090 24GB -Nvidia: 35.30 (SE +/- 0.03, N = 3)

IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Supercar
M samples/s, More Is Better
RTX 2080 Ti 22GB -Dell: 32.60 (SE +/- 0.05, N = 3)
RTX 3090 24GB -Zotac: 51.70 (SE +/- 0.02, N = 3)
RTX 4080 16GB -Pny: 66.03 (SE +/- 0.07, N = 3)
RTX 4090 24GB -Nvidia: 78.45 (SE +/- 0.01, N = 3)

NAMD CUDA 2.14
ATPase Simulation - 327,506 Atoms
days/ns, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 0.09515 (SE +/- 0.00074, N = 15)
RTX 3090 24GB -Zotac: 0.06742 (SE +/- 0.00094, N = 3)
RTX 4080 16GB -Pny: 0.05427 (SE +/- 0.00113, N = 15)
RTX 4090 24GB -Nvidia: 0.04850 (SE +/- 0.00119, N = 15)
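
NAMD CUDA reports days/ns (fewer is better), whereas FAHBench and GROMACS in this file report ns per day (more is better). The two units are reciprocals of each other. The sketch below converts the NAMD averages to ns per day; the helper name `to_ns_per_day` is illustrative, and the converted figures are not comparable across the different simulations.

```python
# Reported NAMD CUDA averages in days/ns (fewer is better),
# copied from the result above.
namd_days_per_ns = {
    "RTX 2080 Ti 22GB -Dell": 0.09515,
    "RTX 3090 24GB -Zotac": 0.06742,
    "RTX 4080 16GB -Pny": 0.05427,
    "RTX 4090 24GB -Nvidia": 0.04850,
}


def to_ns_per_day(days_per_ns):
    """Convert days per simulated nanosecond to nanoseconds per day.

    (Hypothetical helper, written for this illustration only.)
    """
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


if __name__ == "__main__":
    for name, value in namd_days_per_ns.items():
        print(f"{name}: {to_ns_per_day(value):.2f} ns/day")
```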

Blender 3.6
Blend File: Barbershop - Compute: NVIDIA OptiX
Seconds, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 92.87 (SE +/- 0.09, N = 3)
RTX 3090 24GB -Zotac: 52.28 (SE +/- 0.01, N = 3)
RTX 4080 16GB -Pny: 37.97 (SE +/- 0.07, N = 3)
RTX 4090 24GB -Nvidia: 30.55 (SE +/- 0.13, N = 3)

GROMACS 2023
Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare
Ns Per Day, More Is Better
RTX 2080 Ti 22GB -Dell: 15.38 (SE +/- 0.00, N = 3)
RTX 3090 24GB -Zotac: 23.74 (SE +/- 0.05, N = 3)
RTX 4080 16GB -Pny: 32.04 (SE +/- 0.02, N = 3)
RTX 4090 24GB -Nvidia: 43.47 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3

Blender 3.6
Blend File: Fishy Cat - Compute: NVIDIA OptiX
Seconds, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 16.89 (SE +/- 0.21, N = 4)
RTX 3090 24GB -Zotac: 10.84 (SE +/- 0.07, N = 15)
RTX 4080 16GB -Pny: 7.59 (SE +/- 0.07, N = 15)
RTX 4090 24GB -Nvidia: 5.73 (SE +/- 0.07, N = 15)

RealSR-NCNN 20200818
Scale: 4x - TAA: Yes
Seconds, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 52.76 (SE +/- 0.07, N = 3)
RTX 3090 24GB -Zotac: 31.17 (SE +/- 0.18, N = 3)
RTX 4080 16GB -Pny: 24.37 (SE +/- 0.04, N = 3)
RTX 4090 24GB -Nvidia: 20.19 (SE +/- 0.04, N = 3)

Blender 3.6
Blend File: BMW27 - Compute: NVIDIA OptiX
Seconds, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 8.93 (SE +/- 0.06, N = 15)
RTX 3090 24GB -Zotac: 6.13 (SE +/- 0.06, N = 15)
RTX 4080 16GB -Pny: 4.43 (SE +/- 0.06, N = 15)
RTX 4090 24GB -Nvidia: 3.70 (SE +/- 0.06, N = 15)

VkResample 1.0
Upscale: 2x - Precision: Double
ms, Fewer Is Better
RTX 2080 Ti 22GB -Dell: 153.15 (SE +/- 0.26, N = 3)
RTX 3090 24GB -Zotac: 121.35 (SE +/- 0.06, N = 3)
RTX 4080 16GB -Pny: 89.24 (SE +/- 0.06, N = 3)
RTX 4090 24GB -Nvidia: 55.93 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3
ViennaCL Test: OpenCL BLAS - dGEMM-TT OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMM-TT RTX 2080 Ti 22GB -Dell RTX 3090 24GB -Zotac RTX 4080 16GB -Pny RTX 4090 24GB -Nvidia 300 600 900 1200 1500 SE +/- 1.45, N = 3 SE +/- 1.20, N = 3 SE +/- 0.00, N = 3 458 584 849 1340 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-TN (GFLOPs/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 460 (SE +/- 0.50, N = 2)
  RTX 3090 24GB -Zotac: 587 (SE +/- 1.67, N = 3)
  RTX 4080 16GB -Pny: 831 (SE +/- 0.67, N = 3)
  RTX 4090 24GB -Nvidia: 1293 (SE +/- 3.33, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-NT (GFLOPs/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 461 (SE +/- 0.58, N = 3)
  RTX 3090 24GB -Zotac: 588 (SE +/- 1.67, N = 3)
  RTX 4080 16GB -Pny: 795 (SE +/- 1.00, N = 3)
  RTX 4090 24GB -Nvidia: 1280 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMM-NN (GFLOPs/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 459 (SE +/- 0.88, N = 3)
  RTX 3090 24GB -Zotac: 585 (SE +/- 1.67, N = 3)
  RTX 4080 16GB -Pny: 774 (SE +/- 0.88, N = 3)
  RTX 4090 24GB -Nvidia: 1150 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMV-T (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 370 (SE +/- 0.58, N = 3)
  RTX 3090 24GB -Zotac: 371 (SE +/- 0.58, N = 3)
  RTX 4080 16GB -Pny: 435 (SE +/- 0.67, N = 3)
  RTX 4090 24GB -Nvidia: 443 (SE +/- 0.00, N = 2)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: OpenCL BLAS - dGEMV-N (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 297 (SE +/- 0.33, N = 3)
  RTX 3090 24GB -Zotac: 185 (SE +/- 0.33, N = 3)
  RTX 4080 16GB -Pny: 224 (SE +/- 0.00, N = 3)
  RTX 4090 24GB -Nvidia: 220 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: OpenCL BLAS - dDOT (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 524 (SE +/- 0.33, N = 3)
  RTX 3090 24GB -Zotac: 658 (SE +/- 0.33, N = 3)
  RTX 4080 16GB -Pny: 597 (SE +/- 0.33, N = 3)
  RTX 4090 24GB -Nvidia: 719 (SE +/- 0.67, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: OpenCL BLAS - dAXPY (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 518 (SE +/- 0.33, N = 3)
  RTX 3090 24GB -Zotac: 716 (SE +/- 1.00, N = 3)
  RTX 4080 16GB -Pny: 608 (SE +/- 0.33, N = 3)
  RTX 4090 24GB -Nvidia: 772 (SE +/- 0.33, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: OpenCL BLAS - dCOPY (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 475 (SE +/- 0.33, N = 3)
  RTX 3090 24GB -Zotac: 600 (SE +/- 1.20, N = 3)
  RTX 4080 16GB -Pny: 540 (SE +/- 0.33, N = 3)
  RTX 4090 24GB -Nvidia: 661 (SE +/- 0.33, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: OpenCL BLAS - sDOT (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 305 (SE +/- 0.33, N = 3)
  RTX 3090 24GB -Zotac: 372 (SE +/- 0.33, N = 3)
  RTX 4080 16GB -Pny: 417 (SE +/- 0.33, N = 3)
  RTX 4090 24GB -Nvidia: 446 (SE +/- 0.33, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: OpenCL BLAS - sAXPY (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 392 (SE +/- 0.33, N = 3)
  RTX 3090 24GB -Zotac: 495 (SE +/- 0.67, N = 3)
  RTX 4080 16GB -Pny: 487 (SE +/- 0.33, N = 3)
  RTX 4090 24GB -Nvidia: 568 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1 - Test: OpenCL BLAS - sCOPY (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 306 (SE +/- 1.73, N = 3)
  RTX 3090 24GB -Zotac: 363 (SE +/- 1.20, N = 3)
  RTX 4080 16GB -Pny: 385 (SE +/- 0.33, N = 3)
  RTX 4090 24GB -Nvidia: 444 (SE +/- 0.88, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
Hashcat 6.2.4 - Benchmark: MD5 (H/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 51503250000 (SE +/- 232076903.56, N = 6)
  RTX 3090 24GB -Zotac: 63700926667 (SE +/- 474267340.07, N = 15)
  RTX 4080 16GB -Pny: 96891966667 (SE +/- 34207656.71, N = 6)
  RTX 4090 24GB -Nvidia: 154333333333 (SE +/- 243127767.05, N = 6)
Blender 3.6 - Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: 27.42 (SE +/- 0.04, N = 3)
  RTX 3090 24GB -Zotac: 16.07 (SE +/- 0.01, N = 3)
  RTX 4080 16GB -Pny: 10.41 (SE +/- 0.01, N = 5)
  RTX 4090 24GB -Nvidia: 8.42 (SE +/- 0.02, N = 5)
Blender 3.6 - Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: 22.69 (SE +/- 0.03, N = 3)
  RTX 3090 24GB -Zotac: 14.25 (SE +/- 0.01, N = 4)
  RTX 4080 16GB -Pny: 9.32 (SE +/- 0.02, N = 5)
  RTX 4090 24GB -Nvidia: 7.47 (SE +/- 0.01, N = 6)
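Because some charts are "Fewer Is Better" and others "More Is Better", raw values are not directly comparable across tests. The sketch below normalizes one chart against a baseline card so that a value above 1.0 always means "faster than the baseline". It uses the Blender Classroom timings listed above; the function name `relative_performance` is illustrative and not part of the Phoronix Test Suite.

```python
def relative_performance(results, baseline, fewer_is_better):
    """Return each result as a speedup factor relative to the baseline entry."""
    base = results[baseline]
    if fewer_is_better:
        # Lower times are better, so the baseline time goes in the numerator.
        return {name: base / value for name, value in results.items()}
    # Higher throughput is better, so the baseline goes in the denominator.
    return {name: value / base for name, value in results.items()}


# Blender 3.6, Blend File: Classroom, NVIDIA OptiX (seconds, fewer is better).
classroom = {
    "RTX 2080 Ti 22GB -Dell": 22.69,
    "RTX 3090 24GB -Zotac": 14.25,
    "RTX 4080 16GB -Pny": 9.32,
    "RTX 4090 24GB -Nvidia": 7.47,
}

speedups = relative_performance(classroom, "RTX 2080 Ti 22GB -Dell", fewer_is_better=True)
for name, factor in speedups.items():
    print(f"{name}: {factor:.2f}x")
```

For a "More Is Better" chart such as the Hashcat results, the same function can be called with `fewer_is_better=False`.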
VkResample 1.0 - Upscale: 2x - Precision: Single (ms, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: 14.790 (SE +/- 0.023, N = 5)
  RTX 3090 24GB -Zotac: 9.359 (SE +/- 0.008, N = 5)
  RTX 4080 16GB -Pny: 11.543 (SE +/- 0.003, N = 5)
  RTX 4090 24GB -Nvidia: 7.799 (SE +/- 0.003, N = 5)
  1. (CXX) g++ options: -O3
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 1147.26 (SE +/- 0.54, N = 3)
  RTX 3090 24GB -Zotac: 2148.86 (SE +/- 0.38, N = 3)
  RTX 4080 16GB -Pny: 3065.76 (SE +/- 2.39, N = 7)
  RTX 4090 24GB -Nvidia: 2980.15 (SE +/- 0.95, N = 6)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
Hashcat 6.2.4 - Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 580260 (SE +/- 4392.30, N = 10)
  RTX 3090 24GB -Zotac: 731327 (SE +/- 5945.49, N = 15)
  RTX 4080 16GB -Pny: 1146560 (SE +/- 8051.58, N = 15)
  RTX 4090 24GB -Nvidia: 1857800 (SE +/- 13557.46, N = 7)
clpeak 1.1.2 - OpenCL Test: Double-Precision Double (GFLOPS, More Is Better)
  RTX 2080 Ti 22GB -Dell: 506.01 (SE +/- 1.50, N = 4)
  RTX 3090 24GB -Zotac: 640.15 (SE +/- 0.46, N = 5)
  RTX 4080 16GB -Pny: 869.25 (SE +/- 1.29, N = 6)
  RTX 4090 24GB -Nvidia: 1391.51 (SE +/- 1.85, N = 6)
  1. (CXX) g++ options: -O3
Hashcat 6.2.4 - Benchmark: SHA-512 (H/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 2047183333 (SE +/- 6223365.47, N = 6)
  RTX 3090 24GB -Zotac: 2595250000 (SE +/- 2013247.79, N = 6)
  RTX 4080 16GB -Pny: 3936350000 (SE +/- 1634982.16, N = 6)
  RTX 4090 24GB -Nvidia: 6297766667 (SE +/- 2101057.93, N = 6)
Caffe 2020-02-13 - Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000 (Milli-Seconds, Fewer Is Better)
  RTX 3090 24GB -Zotac: 24142.8 (SE +/- 30.41, N = 3)
  RTX 4080 16GB -Pny: 16310.5 (SE +/- 18.97, N = 3)
  RTX 4090 24GB -Nvidia: 16499.3 (SE +/- 24.30, N = 3)
  1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
Hashcat 6.2.4 - Benchmark: SHA1 (H/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 16300900000 (SE +/- 39682498.24, N = 6)
  RTX 3090 24GB -Zotac: 20634050000 (SE +/- 15661518.66, N = 6)
  RTX 4080 16GB -Pny: 30881533333 (SE +/- 19028429.02, N = 6)
  RTX 4090 24GB -Nvidia: 49393366667 (SE +/- 18148896.51, N = 6)
RealSR-NCNN 20200818 - Scale: 4x - TAA: No (Seconds, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: 9.239 (SE +/- 0.025, N = 5)
  RTX 3090 24GB -Zotac: 6.509 (SE +/- 0.015, N = 6)
  RTX 4080 16GB -Pny: 5.630 (SE +/- 0.015, N = 7)
  RTX 4090 24GB -Nvidia: 5.109 (SE +/- 0.018, N = 7)
Hashcat 6.2.4 - Benchmark: 7-Zip (H/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 840275 (SE +/- 1608.10, N = 8)
  RTX 3090 24GB -Zotac: 1095950 (SE +/- 1997.77, N = 8)
  RTX 4080 16GB -Pny: 1747350 (SE +/- 1869.68, N = 8)
  RTX 4090 24GB -Nvidia: 2746938 (SE +/- 1870.54, N = 8)
Caffe 2020-02-13 - Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000 (Milli-Seconds, Fewer Is Better)
  RTX 3090 24GB -Zotac: 6584.81 (SE +/- 2.76, N = 6)
  RTX 4080 16GB -Pny: 5228.00 (SE +/- 3.07, N = 6)
  RTX 4090 24GB -Nvidia: 4379.70 (SE +/- 2.74, N = 7)
  1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
Rodinia 3.1 - Test: OpenCL Particle Filter (Seconds, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: 4.527 (SE +/- 0.031, N = 8)
  RTX 3090 24GB -Zotac: 3.768 (SE +/- 0.029, N = 10)
  RTX 4080 16GB -Pny: 2.685 (SE +/- 0.005, N = 10)
  RTX 4090 24GB -Nvidia: 2.085 (SE +/- 0.080, N = 3)
  1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl
Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: 4.307 (SE +/- 0.007, N = 8)
  RTX 3090 24GB -Zotac: 3.536 (SE +/- 0.005, N = 9)
  RTX 4080 16GB -Pny: 2.761 (SE +/- 0.004, N = 10)
  RTX 4090 24GB -Nvidia: 2.490 (SE +/- 0.003, N = 10)
Caffe 2020-02-13 - Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200 (Milli-Seconds, Fewer Is Better)
  RTX 3090 24GB -Zotac: 4830.59 (SE +/- 2.81, N = 7)
  RTX 4080 16GB -Pny: 3279.17 (SE +/- 3.25, N = 8)
  RTX 4090 24GB -Nvidia: 3304.15 (SE +/- 3.64, N = 8)
  1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
cl-mem 2017-01-13 - Benchmark: Copy (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 318.1 (SE +/- 0.27, N = 8)
  RTX 3090 24GB -Zotac: 359.2 (SE +/- 0.21, N = 10)
  RTX 4080 16GB -Pny: 382.0 (SE +/- 0.03, N = 9)
  RTX 4090 24GB -Nvidia: 410.3 (SE +/- 0.05, N = 10)
  1. (CC) gcc options: -O2 -flto -lOpenCL
cl-mem 2017-01-13 - Benchmark: Write (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 452.2 (SE +/- 0.85, N = 8)
  RTX 3090 24GB -Zotac: 734.3 (SE +/- 0.49, N = 10)
  RTX 4080 16GB -Pny: 567.7 (SE +/- 0.54, N = 9)
  RTX 4090 24GB -Nvidia: 785.8 (SE +/- 0.36, N = 10)
  1. (CC) gcc options: -O2 -flto -lOpenCL
cl-mem 2017-01-13 - Benchmark: Read (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 543.8 (SE +/- 0.15, N = 8)
  RTX 3090 24GB -Zotac: 823.3 (SE +/- 0.81, N = 9)
  RTX 4080 16GB -Pny: 623.2 (SE +/- 0.71, N = 9)
  RTX 4090 24GB -Nvidia: 886.0 (SE +/- 0.31, N = 10)
  1. (CC) gcc options: -O2 -flto -lOpenCL
ArrayFire 3.7 - Test: Conjugate Gradient OpenCL (ms, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: 1.6790 (SE +/- 0.0055, N = 9)
  RTX 3090 24GB -Zotac: 1.5830 (SE +/- 0.0020, N = 9)
  RTX 4080 16GB -Pny: 1.1220 (SE +/- 0.0013, N = 10)
  RTX 4090 24GB -Nvidia: 0.8803 (SE +/- 0.0014, N = 9)
  1. (CXX) g++ options: -rdynamic
clpeak 1.1.2 - OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
  RTX 2080 Ti 22GB -Dell: 504.66 (SE +/- 0.36, N = 9)
  RTX 3090 24GB -Zotac: 814.92 (SE +/- 0.01, N = 10)
  RTX 4080 16GB -Pny: 611.67 (SE +/- 0.17, N = 10)
  RTX 4090 24GB -Nvidia: 871.56 (SE +/- 0.04, N = 10)
  1. (CXX) g++ options: -O3
Caffe 2020-02-13 - Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100 (Milli-Seconds, Fewer Is Better)
  RTX 3090 24GB -Zotac: 2418.86 (SE +/- 1.69, N = 9)
  RTX 4080 16GB -Pny: 1649.29 (SE +/- 2.33, N = 10)
  RTX 4090 24GB -Nvidia: 1660.01 (SE +/- 1.34, N = 10)
  1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
MandelGPU 1.3pts1 - OpenCL Device: GPU (Samples/sec, More Is Better)
  RTX 2080 Ti 22GB -Dell: 442866316.9 (SE +/- 1540493.49, N = 10)
  RTX 3090 24GB -Zotac: 568293355.5 (SE +/- 959338.71, N = 11)
  RTX 4080 16GB -Pny: 772392077.3 (SE +/- 1340958.21, N = 12)
  RTX 4090 24GB -Nvidia: 951815274.2 (SE +/- 3825360.26, N = 12)
  1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS, More Is Better)
  RTX 2080 Ti 22GB -Dell: 4858.13 (SE +/- 33.05, N = 10)
  RTX 3090 24GB -Zotac: 7912.59 (SE +/- 16.79, N = 11)
  RTX 4080 16GB -Pny: 16952.60 (SE +/- 224.54, N = 15)
  RTX 4090 24GB -Nvidia: 27175.90 (SE +/- 164.66, N = 13)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
Caffe 2020-02-13 - Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200 (Milli-Seconds, Fewer Is Better)
  RTX 3090 24GB -Zotac: 1321.36 (SE +/- 1.00, N = 10)
  RTX 4080 16GB -Pny: 1053.88 (SE +/- 1.43, N = 10)
  RTX 4090 24GB -Nvidia: 883.95 (SE +/- 0.87, N = 11)
  1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better)
  RTX 2080 Ti 22GB -Dell: 1504.87 (SE +/- 0.70, N = 11)
  RTX 3090 24GB -Zotac: 2349.39 (SE +/- 0.59, N = 12)
  RTX 4080 16GB -Pny: 1831.86 (SE +/- 2.58, N = 12)
  RTX 4090 24GB -Nvidia: 2779.48 (SE +/- 1.36, N = 12)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: S3D (GFLOPS, More Is Better)
  RTX 2080 Ti 22GB -Dell: 270.80 (SE +/- 0.15, N = 12)
  RTX 3090 24GB -Zotac: 427.49 (SE +/- 0.35, N = 12)
  RTX 4080 16GB -Pny: 423.58 (SE +/- 0.38, N = 13)
  RTX 4090 24GB -Nvidia: 644.60 (SE +/- 0.22, N = 12)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Triad (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 12.61 (SE +/- 0.00, N = 12)
  RTX 3090 24GB -Zotac: 24.55 (SE +/- 0.00, N = 12)
  RTX 4080 16GB -Pny: 24.76 (SE +/- 0.01, N = 12)
  RTX 4090 24GB -Nvidia: 25.07 (SE +/- 0.01, N = 12)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
Caffe 2020-02-13 - Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100 (Milli-Seconds, Fewer Is Better)
  RTX 3090 24GB -Zotac: 664.60 (SE +/- 1.37, N = 11)
  RTX 4080 16GB -Pny: 537.49 (SE +/- 1.06, N = 11)
  RTX 4090 24GB -Nvidia: 447.34 (SE +/- 1.11, N = 11)
  1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 13.20 (SE +/- 0.00, N = 12)
  RTX 3090 24GB -Zotac: 26.40 (SE +/- 0.00, N = 13)
  RTX 4080 16GB -Pny: 26.39 (SE +/- 0.00, N = 14)
  RTX 4090 24GB -Nvidia: 26.37 (SE +/- 0.00, N = 13)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
clpeak 1.1.2 - OpenCL Test: Single-Precision Float (GFLOPS, More Is Better)
  RTX 2080 Ti 22GB -Dell: 14076.71 (SE +/- 80.60, N = 11)
  RTX 3090 24GB -Zotac: 35000.44 (SE +/- 10.34, N = 15)
  RTX 4080 16GB -Pny: 47802.87 (SE +/- 50.90, N = 14)
  RTX 4090 24GB -Nvidia: 79463.36 (SE +/- 103.68, N = 14)
  1. (CXX) g++ options: -O3
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Download (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 12.85 (SE +/- 0.00, N = 12)
  RTX 3090 24GB -Zotac: 25.19 (SE +/- 0.02, N = 13)
  RTX 4080 16GB -Pny: 25.10 (SE +/- 0.02, N = 13)
  RTX 4090 24GB -Nvidia: 25.13 (SE +/- 0.02, N = 13)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Reduction (GB/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 366.06 (SE +/- 0.06, N = 12)
  RTX 3090 24GB -Zotac: 392.80 (SE +/- 0.11, N = 13)
  RTX 4080 16GB -Pny: 1020.24 (SE +/- 13.90, N = 15)
  RTX 4090 24GB -Nvidia: 967.14 (SE +/- 9.75, N = 15)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
clpeak 1.1.2 - OpenCL Test: Integer Compute INT (GIOPS, More Is Better)
  RTX 2080 Ti 22GB -Dell: 11826.76 (SE +/- 86.60, N = 15)
  RTX 3090 24GB -Zotac: 17864.54 (SE +/- 36.29, N = 15)
  RTX 4080 16GB -Pny: 24524.34 (SE +/- 20.64, N = 14)
  RTX 4090 24GB -Nvidia: 40697.73 (SE +/- 49.03, N = 14)
  1. (CXX) g++ options: -O3
NeatBench 5 - Acceleration: GPU (FPS, More Is Better)
  RTX 2080 Ti 22GB -Dell: 2080 (SE +/- 0.00, N = 13)
  RTX 3090 24GB -Zotac: 3090 (SE +/- 0.00, N = 13)
  RTX 4080 16GB -Pny: 4080 (SE +/- 0.00, N = 14)
  RTX 4090 24GB -Nvidia: 4090 (SE +/- 0.00, N = 14)
FinanceBench 2016-07-25 - Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: 8.840 (SE +/- 0.110, N = 15)
  RTX 3090 24GB -Zotac: 5.796 (SE +/- 0.002, N = 14)
  RTX 4080 16GB -Pny: 4.404 (SE +/- 0.013, N = 15)
  RTX 4090 24GB -Nvidia: 2.890 (SE +/- 0.005, N = 15)
  1. (CXX) g++ options: -O3 -march=native -fopenmp
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: MD5 Hash (GHash/s, More Is Better)
  RTX 2080 Ti 22GB -Dell: 32.52 (SE +/- 0.06, N = 14)
  RTX 3090 24GB -Zotac: 41.33 (SE +/- 0.05, N = 14)
  RTX 4080 16GB -Pny: 60.56 (SE +/- 0.65, N = 15)
  RTX 4090 24GB -Nvidia: 93.90 (SE +/- 0.95, N = 15)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
GPU Temperature Monitor - Phoronix Test Suite System Monitoring (Celsius)
  RTX 2080 Ti 22GB -Dell: Min: 44 / Avg: 69.38 / Max: 83
  RTX 3090 24GB -Zotac: Min: 41 / Avg: 65.62 / Max: 84
  RTX 4080 16GB -Pny: Min: 31 / Avg: 44.05 / Max: 73
  RTX 4090 24GB -Nvidia: Min: 32 / Avg: 43.13 / Max: 66
GPU Power Consumption Monitor - Phoronix Test Suite System Monitoring (Watts)
  RTX 2080 Ti 22GB -Dell: Min: 18.41 / Avg: 148.08 / Max: 347.3
  RTX 3090 24GB -Zotac: Min: 18.42 / Avg: 170.36 / Max: 367.21
  RTX 4080 16GB -Pny: Min: 4 / Avg: 92.31 / Max: 331.21
  RTX 4090 24GB -Nvidia: Min: 6 / Avg: 110.92 / Max: 456.05
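The overall monitor summaries use a "Min: a / Avg: b / Max: c" triple per card. For readers who want to post-process this text export, the following is a rough sketch of extracting those triples with a regular expression and pairing them with the card labels in chart order. The helper name `parse_monitor` is hypothetical, and the pairing assumes the triples appear in the same order as the labels, which is how the charts above are laid out.

```python
import re

# One "Min: a / Avg: b / Max: c" triple as it appears in the export.
TRIPLE = re.compile(r"Min:\s*([\d.]+)\s*/\s*Avg:\s*([\d.]+)\s*/\s*Max:\s*([\d.]+)")


def parse_monitor(text, labels):
    """Map each label to its (min, avg, max) tuple, in order of appearance."""
    triples = [tuple(float(x) for x in match) for match in TRIPLE.findall(text)]
    if len(triples) != len(labels):
        raise ValueError(f"expected {len(labels)} triples, found {len(triples)}")
    return dict(zip(labels, triples))


labels = [
    "RTX 2080 Ti 22GB -Dell",
    "RTX 3090 24GB -Zotac",
    "RTX 4080 16GB -Pny",
    "RTX 4090 24GB -Nvidia",
]

# Overall GPU power consumption summary (Watts), copied from the export.
power_text = (
    "Min: 18.41 / Avg: 148.08 / Max: 347.3 "
    "Min: 18.42 / Avg: 170.36 / Max: 367.21 "
    "Min: 4 / Avg: 92.31 / Max: 331.21 "
    "Min: 6 / Avg: 110.92 / Max: 456.05"
)

power = parse_monitor(power_text, labels)
for name, (lo, avg, hi) in power.items():
    print(f"{name}: avg {avg} W, peak {hi} W")
```

The length check guards against charts that cover fewer cards, such as the Caffe monitors, which list only three.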
Waifu2x-NCNN Vulkan 20200818 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 59.0 / Avg: 68.2 / Max: 75.0
  RTX 3090 24GB -Zotac: Min: 47.0 / Avg: 61.2 / Max: 73.0
  RTX 4080 16GB -Pny: Min: 40.0 / Avg: 42.9 / Max: 51.0
  RTX 4090 24GB -Nvidia: Min: 39.0 / Avg: 41.6 / Max: 46.0
Waifu2x-NCNN Vulkan 20200818 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 22.8 / Avg: 138.6 / Max: 261.5
  RTX 3090 24GB -Zotac: Min: 27.5 / Avg: 154.7 / Max: 304.3
  RTX 4080 16GB -Pny: Min: 14.0 / Avg: 72.7 / Max: 204.6
  RTX 4090 24GB -Nvidia: Min: 7.1 / Avg: 93.5 / Max: 229.6
RealSR-NCNN 20200818 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 60.0 / Avg: 69.3 / Max: 76.0
  RTX 3090 24GB -Zotac: Min: 48.0 / Avg: 61.7 / Max: 77.0
  RTX 4080 16GB -Pny: Min: 42.0 / Avg: 47.6 / Max: 58.0
  RTX 4090 24GB -Nvidia: Min: 42.0 / Avg: 45.1 / Max: 53.0
RealSR-NCNN 20200818 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 23.3 / Avg: 150.8 / Max: 256.7
  RTX 3090 24GB -Zotac: Min: 28.7 / Avg: 166.5 / Max: 354.3
  RTX 4080 16GB -Pny: Min: 14.1 / Avg: 102.2 / Max: 294.6
  RTX 4090 24GB -Nvidia: Min: 7.1 / Avg: 105.0 / Max: 327.6
RealSR-NCNN 20200818 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 46.0 / Avg: 74.2 / Max: 80.0
  RTX 3090 24GB -Zotac: Min: 52.0 / Avg: 74.9 / Max: 84.0
  RTX 4080 16GB -Pny: Min: 42.0 / Avg: 57.3 / Max: 64.0
  RTX 4090 24GB -Nvidia: Min: 41.0 / Avg: 51.6 / Max: 56.0
RealSR-NCNN 20200818 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 20.4 / Avg: 225.5 / Max: 257.1
  RTX 3090 24GB -Zotac: Min: 33.0 / Avg: 292.5 / Max: 356.8
  RTX 4080 16GB -Pny: Min: 13.9 / Avg: 221.3 / Max: 300.8
  RTX 4090 24GB -Nvidia: Min: 6.6 / Avg: 234.7 / Max: 332.3
NCNN 20220729 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 46.0 / Avg: 49.8 / Max: 61.0
  RTX 3090 24GB -Zotac: Min: 42.0 / Avg: 51.6 / Max: 64.0
  RTX 4080 16GB -Pny: Min: 35.0 / Avg: 38.2 / Max: 43.0
  RTX 4090 24GB -Nvidia: Min: 33.0 / Avg: 37.4 / Max: 45.0
NCNN 20220729 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 19.2 / Avg: 31.7 / Max: 108.8
  RTX 3090 24GB -Zotac: Min: 18.8 / Avg: 36.1 / Max: 148.3
  RTX 4080 16GB -Pny: Min: 4.2 / Avg: 18.5 / Max: 66.4
  RTX 4090 24GB -Nvidia: Min: 6.4 / Avg: 13.6 / Max: 59.6
vkpeak 20210424 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 56.0 / Avg: 77.3 / Max: 83.0
  RTX 3090 24GB -Zotac: Min: 42.0 / Avg: 71.3 / Max: 79.0
  RTX 4080 16GB -Pny: Min: 43.0 / Avg: 57.3 / Max: 66.0
  RTX 4090 24GB -Nvidia: Min: 42.0 / Avg: 53.9 / Max: 65.0
vkpeak 20210424 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 21.4 / Avg: 217.8 / Max: 258.8
  RTX 3090 24GB -Zotac: Min: 24.9 / Avg: 265.1 / Max: 330.5
  RTX 4080 16GB -Pny: Min: 13.6 / Avg: 171.1 / Max: 245.8
  RTX 4090 24GB -Nvidia: Min: 6.6 / Avg: 203.4 / Max: 347.2
VkResample 1.0 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 55.0 / Avg: 65.4 / Max: 73.0
  RTX 3090 24GB -Zotac: Min: 47.0 / Avg: 59.2 / Max: 74.0
  RTX 4080 16GB -Pny: Min: 38.0 / Avg: 43.6 / Max: 50.0
  RTX 4090 24GB -Nvidia: Min: 38.0 / Avg: 41.6 / Max: 47.0
VkResample 1.0 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 21.1 / Avg: 114.5 / Max: 208.0
  RTX 3090 24GB -Zotac: Min: 19.1 / Avg: 124.7 / Max: 254.8
  RTX 4080 16GB -Pny: Min: 4.2 / Avg: 59.1 / Max: 135.5
  RTX 4090 24GB -Nvidia: Min: 6.7 / Avg: 58.1 / Max: 182.3
VkResample 1.0 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 57.0 / Avg: 59.7 / Max: 69.0
  RTX 3090 24GB -Zotac: Min: 48.0 / Avg: 53.5 / Max: 73.0
  RTX 4080 16GB -Pny: Min: 36.0 / Avg: 38.1 / Max: 46.0
  RTX 4090 24GB -Nvidia: Min: 35.0 / Avg: 37.0 / Max: 43.0
VkResample 1.0 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 21.4 / Avg: 60.8 / Max: 255.9
  RTX 3090 24GB -Zotac: Min: 19.2 / Avg: 78.0 / Max: 357.0
  RTX 4080 16GB -Pny: Min: 4.3 / Avg: 35.1 / Max: 203.1
  RTX 4090 24GB -Nvidia: Min: 6.5 / Avg: 37.5 / Max: 270.4
VkFFT 1.1.1 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 54.0 / Avg: 68.6 / Max: 77.0
  RTX 3090 24GB -Zotac: Min: 47.0 / Avg: 62.8 / Max: 74.0
  RTX 4080 16GB -Pny: Min: 37.0 / Avg: 39.6 / Max: 46.0
  RTX 4090 24GB -Nvidia: Min: 35.0 / Avg: 36.9 / Max: 43.0
VkFFT 1.1.1 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 21.5 / Avg: 119.5 / Max: 264.1
  RTX 3090 24GB -Zotac: Min: 19.1 / Avg: 140.5 / Max: 367.2
  RTX 4080 16GB -Pny: Min: 11.2 / Avg: 53.2 / Max: 215.9
  RTX 4090 24GB -Nvidia: Min: 6.9 / Avg: 61.1 / Max: 277.8
Caffe 2020-02-13 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 3090 24GB -Zotac: Min: 47.0 / Avg: 72.8 / Max: 78.0
  RTX 4080 16GB -Pny: Min: 37.0 / Avg: 46.1 / Max: 50.0
  RTX 4090 24GB -Nvidia: Min: 37.0 / Avg: 42.3 / Max: 45.0
Caffe 2020-02-13 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 3090 24GB -Zotac: Min: 18.5 / Avg: 252.8 / Max: 299.7
  RTX 4080 16GB -Pny: Min: 13.4 / Avg: 121.4 / Max: 156.0
  RTX 4090 24GB -Nvidia: Min: 6.9 / Avg: 129.1 / Max: 162.7
Caffe 2020-02-13 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 3090 24GB -Zotac: Min: 47.0 / Avg: 64.6 / Max: 76.0
  RTX 4080 16GB -Pny: Min: 37.0 / Avg: 41.8 / Max: 48.0
  RTX 4090 24GB -Nvidia: Min: 36.0 / Avg: 39.3 / Max: 43.0
Caffe 2020-02-13 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 3090 24GB -Zotac: Min: 25.4 / Avg: 184.4 / Max: 290.0
  RTX 4080 16GB -Pny: Min: 12.6 / Avg: 78.5 / Max: 154.8
  RTX 4090 24GB -Nvidia: Min: 7.0 / Avg: 87.3 / Max: 162.9
Caffe 2020-02-13 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 3090 24GB -Zotac: Min: 47.0 / Avg: 61.3 / Max: 74.0
  RTX 4080 16GB -Pny: Min: 38.0 / Avg: 40.8 / Max: 47.0
  RTX 4090 24GB -Nvidia: Min: 37.0 / Avg: 38.5 / Max: 42.0
Caffe 2020-02-13 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 3090 24GB -Zotac: Min: 19.3 / Avg: 154.8 / Max: 289.3
  RTX 4080 16GB -Pny: Min: 13.4 / Avg: 62.2 / Max: 154.8
  RTX 4090 24GB -Nvidia: Min: 7.0 / Avg: 71.0 / Max: 160.9
Caffe 2020-02-13 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 3090 24GB -Zotac: Min: 48.0 / Avg: 68.9 / Max: 80.0
  RTX 4080 16GB -Pny: Min: 37.0 / Avg: 43.9 / Max: 50.0
  RTX 4090 24GB -Nvidia: Min: 36.0 / Avg: 40.7 / Max: 45.0
Caffe 2020-02-13 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 3090 24GB -Zotac: Min: 19.1 / Avg: 224.3 / Max: 341.6
  RTX 4080 16GB -Pny: Min: 10.7 / Avg: 104.4 / Max: 178.5
  RTX 4090 24GB -Nvidia: Min: 7.0 / Avg: 114.8 / Max: 203.5
Caffe 2020-02-13 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 3090 24GB -Zotac: Min: 50.0 / Avg: 61.4 / Max: 77.0
  RTX 4080 16GB -Pny: Min: 38.0 / Avg: 40.3 / Max: 48.0
  RTX 4090 24GB -Nvidia: Min: 38.0 / Avg: 39.5 / Max: 44.0
Caffe 2020-02-13 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 3090 24GB -Zotac: Min: 19.3 / Avg: 144.4 / Max: 336.9
  RTX 4080 16GB -Pny: Min: 13.4 / Avg: 60.0 / Max: 179.5
  RTX 4090 24GB -Nvidia: Min: 6.8 / Avg: 69.2 / Max: 202.4
Caffe 2020-02-13 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 3090 24GB -Zotac: Min: 47.0 / Avg: 60.1 / Max: 75.0
  RTX 4080 16GB -Pny: Min: 39.0 / Avg: 42.1 / Max: 50.0
  RTX 4090 24GB -Nvidia: Min: 40.0 / Avg: 42.4 / Max: 49.0
Caffe 2020-02-13 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 3090 24GB -Zotac: Min: 18.7 / Avg: 125.3 / Max: 335.1
  RTX 4080 16GB -Pny: Min: 14.0 / Avg: 48.0 / Max: 178.0
  RTX 4090 24GB -Nvidia: Min: 7.0 / Avg: 56.9 / Max: 202.9
Blender 3.6 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 59.0 / Avg: 78.7 / Max: 82.0
  RTX 3090 24GB -Zotac: Min: 47.0 / Avg: 75.6 / Max: 82.0
  RTX 4080 16GB -Pny: Min: 40.0 / Avg: 51.0 / Max: 56.0
  RTX 4090 24GB -Nvidia: Min: 40.0 / Avg: 48.5 / Max: 54.0
Blender 3.6 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 22.8 / Avg: 226.2 / Max: 257.6
  RTX 3090 24GB -Zotac: Min: 18.6 / Avg: 292.2 / Max: 358.2
  RTX 4080 16GB -Pny: Min: 13.7 / Avg: 161.8 / Max: 215.2
  RTX 4090 24GB -Nvidia: Min: 6.8 / Avg: 189.4 / Max: 276.9
Blender 3.6 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 60.0 / Avg: 75.8 / Max: 80.0
  RTX 3090 24GB -Zotac: Min: 47.0 / Avg: 71.9 / Max: 80.0
  RTX 4080 16GB -Pny: Min: 39.0 / Avg: 48.0 / Max: 54.0
  RTX 4090 24GB -Nvidia: Min: 39.0 / Avg: 45.3 / Max: 50.0
Blender 3.6 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 22.7 / Avg: 212.9 / Max: 257.2
  RTX 3090 24GB -Zotac: Min: 20.5 / Avg: 264.0 / Max: 361.3
  RTX 4080 16GB -Pny: Min: 12.8 / Avg: 140.8 / Max: 239.5
  RTX 4090 24GB -Nvidia: Min: 6.9 / Avg: 172.2 / Max: 302.0
Blender 3.6 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 60.0 / Avg: 74.4 / Max: 80.0
  RTX 3090 24GB -Zotac: Min: 48.0 / Avg: 70.7 / Max: 82.0
  RTX 4080 16GB -Pny: Min: 40.0 / Avg: 48.2 / Max: 57.0
  RTX 4090 24GB -Nvidia: Min: 41.0 / Avg: 45.8 / Max: 54.0
Blender 3.6 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 22.9 / Avg: 195.0 / Max: 256.9
  RTX 3090 24GB -Zotac: Min: 19.4 / Avg: 234.9 / Max: 344.1
  RTX 4080 16GB -Pny: Min: 13.4 / Avg: 116.8 / Max: 223.6
  RTX 4090 24GB -Nvidia: Min: 6.9 / Avg: 136.7 / Max: 303.2
Blender 3.6 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 59.0 / Avg: 75.4 / Max: 81.0
  RTX 3090 24GB -Zotac: Min: 48.0 / Avg: 73.0 / Max: 81.0
  RTX 4080 16GB -Pny: Min: 39.0 / Avg: 48.3 / Max: 55.0
  RTX 4090 24GB -Nvidia: Min: 40.0 / Avg: 46.5 / Max: 52.0
Blender 3.6 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 22.5 / Avg: 212.6 / Max: 255.0
  RTX 3090 24GB -Zotac: Min: 19.2 / Avg: 267.4 / Max: 353.6
  RTX 4080 16GB -Pny: Min: 5.3 / Avg: 142.2 / Max: 230.1
  RTX 4090 24GB -Nvidia: Min: 7.1 / Avg: 162.1 / Max: 291.8
Blender 3.6 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 60.0 / Avg: 72.6 / Max: 78.0
  RTX 3090 24GB -Zotac: Min: 46.0 / Avg: 67.3 / Max: 79.0
  RTX 4080 16GB -Pny: Min: 41.0 / Avg: 45.3 / Max: 51.0
  RTX 4090 24GB -Nvidia: Min: 42.0 / Avg: 44.8 / Max: 50.0
Blender 3.6 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min: 23.3 / Avg: 173.8 / Max: 254.3
  RTX 3090 24GB -Zotac: Min: 19.2 / Avg: 215.9 / Max: 349.9
  RTX 4080 16GB -Pny: Min: 14.3 / Avg: 95.0 / Max: 198.6
  RTX 4090 24GB -Nvidia: Min: 7.0 / Avg: 107.7 / Max: 242.2
Chaos Group V-RAY GPU Temperature Monitor Min Avg Max RTX 2080 Ti 22GB -Dell 55.0 71.7 82.0 RTX 3090 24GB -Zotac 47.0 68.3 80.0 RTX 4080 16GB -Pny 39.0 48.0 54.0 RTX 4090 24GB -Nvidia 41.0 48.6 54.0 OpenBenchmarking.org Celsius, Fewer Is Better Chaos Group V-RAY 5.02 GPU Temperature Monitor 20 40 60 80 100
Chaos Group V-RAY GPU Power Consumption Monitor Min Avg Max RTX 2080 Ti 22GB -Dell 21.6 167.9 253.1 RTX 3090 24GB -Zotac 18.9 235.8 358.0 RTX 4080 16GB -Pny 4.5 130.0 211.7 RTX 4090 24GB -Nvidia 6.8 169.0 272.7 OpenBenchmarking.org Watts, Fewer Is Better Chaos Group V-RAY 5.02 GPU Power Consumption Monitor 100 200 300 400 500
Chaos Group V-RAY GPU Temperature Monitor Min Avg Max RTX 2080 Ti 22GB -Dell 57.0 73.5 82.0 RTX 3090 24GB -Zotac 48.0 71.5 82.0 RTX 4080 16GB -Pny 42.0 51.8 57.0 RTX 4090 24GB -Nvidia 42.0 50.8 57.0 OpenBenchmarking.org Celsius, Fewer Is Better Chaos Group V-RAY 5.02 GPU Temperature Monitor 20 40 60 80 100
Chaos Group V-RAY GPU Power Consumption Monitor Min Avg Max RTX 2080 Ti 22GB -Dell 22.7 181.0 256.4 RTX 3090 24GB -Zotac 19.2 250.0 356.2 RTX 4080 16GB -Pny 13.4 159.5 227.6 RTX 4090 24GB -Nvidia 7.2 199.1 296.3 OpenBenchmarking.org Watts, Fewer Is Better Chaos Group V-RAY 5.02 GPU Power Consumption Monitor 100 200 300 400 500
IndigoBench GPU Temperature Monitor Min Avg Max RTX 2080 Ti 22GB -Dell 61.0 76.6 82.0 RTX 3090 24GB -Zotac 50.0 77.7 84.0 RTX 4080 16GB -Pny 42.0 54.0 59.0 RTX 4090 24GB -Nvidia 41.0 51.7 57.0 OpenBenchmarking.org Celsius, Fewer Is Better IndigoBench 4.4 GPU Temperature Monitor 20 40 60 80 100
IndigoBench 4.4: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     23.1  213.7  254.5
RTX 3090 24GB -Zotac       20.2  287.2  353.5
RTX 4080 16GB -Pny         14.0  180.5  230.3
RTX 4090 24GB -Nvidia       6.7  211.2  296.7

IndigoBench 4.4: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     57.0   76.0   81.0
RTX 3090 24GB -Zotac       52.0   77.6   83.0
RTX 4080 16GB -Pny         40.0   51.9   56.0
RTX 4090 24GB -Nvidia      42.0   49.4   52.0

IndigoBench 4.4: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     21.3  211.7  259.5
RTX 3090 24GB -Zotac       20.9  282.9  356.8
RTX 4080 16GB -Pny         13.0  159.4  207.9
RTX 4090 24GB -Nvidia       6.7  172.9  252.7

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     56.0   58.0   60.0
RTX 3090 24GB -Zotac       52.0   58.1   62.0
RTX 4080 16GB -Pny         38.0   39.5   41.0
RTX 4090 24GB -Nvidia      40.0   41.5   43.0

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     21.3   56.5   80.7
RTX 3090 24GB -Zotac       19.8   85.0  125.6
RTX 4080 16GB -Pny         12.2   30.6   47.0
RTX 4090 24GB -Nvidia       6.6   39.8   58.8

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     57.0   59.4   62.0
RTX 3090 24GB -Zotac       52.0   56.2   60.0
RTX 4080 16GB -Pny         35.0   37.0   38.0
RTX 4090 24GB -Nvidia      38.0   39.2   40.0

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.0   53.4   76.5
RTX 3090 24GB -Zotac       19.8   72.1  117.2
RTX 4080 16GB -Pny         12.1   28.5   40.3
RTX 4090 24GB -Nvidia       6.5   33.8   55.0

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     57.0   62.9   68.0
RTX 3090 24GB -Zotac       52.0   62.9   72.0
RTX 4080 16GB -Pny         36.0   36.7   39.0
RTX 4090 24GB -Nvidia      34.0   36.4   40.0

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     21.7   93.4  198.2
RTX 3090 24GB -Zotac       20.0  125.2  273.7
RTX 4080 16GB -Pny         13.3   39.3   98.6
RTX 4090 24GB -Nvidia       6.4   41.6  108.0

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     59.0   62.4   68.0
RTX 3090 24GB -Zotac       50.0   60.2   73.0
RTX 4080 16GB -Pny         36.0   38.6   55.0
RTX 4090 24GB -Nvidia      35.0   35.6   37.0

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.1   84.4  261.2
RTX 3090 24GB -Zotac       19.5  105.9  323.3
RTX 4080 16GB -Pny         12.8   44.7  278.8
RTX 4090 24GB -Nvidia       7.0   41.7  150.7

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     56.0   64.4   71.0
RTX 3090 24GB -Zotac       53.0   63.8   74.0
RTX 4080 16GB -Pny         36.0   38.5   46.0
RTX 4090 24GB -Nvidia      36.0   37.1   42.0

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.3  107.8  225.4
RTX 3090 24GB -Zotac       25.8  138.9  317.8
RTX 4080 16GB -Pny         11.9   51.2  195.6
RTX 4090 24GB -Nvidia       7.0   59.0  207.2

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     58.0   60.7   64.0
RTX 3090 24GB -Zotac       49.0   58.0   65.0
RTX 4080 16GB -Pny         37.0   38.1   40.0
RTX 4090 24GB -Nvidia      37.0   38.0   39.0

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     23.2   73.4  184.7
RTX 3090 24GB -Zotac       24.9   98.4  223.7
RTX 4080 16GB -Pny         12.2   38.7   72.2
RTX 4090 24GB -Nvidia       7.0   45.5  106.3

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     59.0   74.0   78.0
RTX 3090 24GB -Zotac       50.0   72.2   79.0
RTX 4080 16GB -Pny         37.0   43.1   49.0
RTX 4090 24GB -Nvidia      37.0   41.8   47.0

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.2  184.6  246.0
RTX 3090 24GB -Zotac       22.8  253.9  357.5
RTX 4080 16GB -Pny         13.5   98.3  224.9
RTX 4090 24GB -Nvidia       7.0  125.0  250.8

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell       55  73.85     82
RTX 3090 24GB -Zotac         52  70.48     82
RTX 4080 16GB -Pny           38  43.44     73
RTX 4090 24GB -Nvidia        39  43.63     66

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell    21.69 175.84 257.38
RTX 3090 24GB -Zotac      20.35 201.61 329.19
RTX 4080 16GB -Pny        13.84  90.17 331.21
RTX 4090 24GB -Nvidia      6.77  116.7 447.83

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     51.0   55.1   59.0
RTX 3090 24GB -Zotac       52.0   58.8   65.0
RTX 4080 16GB -Pny         41.0   42.7   45.0
RTX 4090 24GB -Nvidia      37.0   39.1   41.0

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     20.6   66.6  107.8
RTX 3090 24GB -Zotac       19.7   86.6  154.5
RTX 4080 16GB -Pny         13.9   38.2   62.5
RTX 4090 24GB -Nvidia       6.6   43.8   72.7

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     44.0   49.7   54.0
RTX 3090 24GB -Zotac       55.0   61.1   68.0
RTX 4080 16GB -Pny         38.0   40.1   42.0
RTX 4090 24GB -Nvidia      34.0   36.1   38.0

SHOC Scalable HeterOgeneous Computing 2020-04-17: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     20.1   63.6  102.9
RTX 3090 24GB -Zotac       20.3   93.5  158.5
RTX 4080 16GB -Pny         13.7   37.0   60.9
RTX 4090 24GB -Nvidia       6.4   43.4   72.8

ViennaCL 1.7.1: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     44.0   49.2   58.0
RTX 3090 24GB -Zotac       48.0   53.3   57.0
RTX 4080 16GB -Pny         34.0   35.7   38.0
RTX 4090 24GB -Nvidia      32.0   33.9   37.0

ViennaCL 1.7.1: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     18.4   21.6   43.4
RTX 3090 24GB -Zotac       19.0   26.2   37.2
RTX 4080 16GB -Pny          4.0   12.1   18.8
RTX 4090 24GB -Nvidia       6.0    8.0   15.8

ViennaCL 1.7.1: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     59.0   72.2   78.0
RTX 3090 24GB -Zotac       49.0   70.8   83.0
RTX 4080 16GB -Pny         37.0   44.0   54.0
RTX 4090 24GB -Nvidia      36.0   41.5   48.0

ViennaCL 1.7.1: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.4  173.6  255.0
RTX 3090 24GB -Zotac       28.9  232.9  357.9
RTX 4080 16GB -Pny         13.4  109.2  223.4
RTX 4090 24GB -Nvidia       6.9  128.0  256.4

MandelGPU 1.3pts1: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     60.0   67.6   74.0
RTX 3090 24GB -Zotac       50.0   62.9   79.0
RTX 4080 16GB -Pny         37.0   40.9   52.0
RTX 4090 24GB -Nvidia      36.0   38.8   46.0

MandelGPU 1.3pts1: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.6  126.4  254.8
RTX 3090 24GB -Zotac       19.7  149.1  321.4
RTX 4080 16GB -Pny         14.2   65.5  184.4
RTX 4090 24GB -Nvidia       6.9   68.6  215.0

cl-mem 2017-01-13: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     60.0   67.6   73.0
RTX 3090 24GB -Zotac       50.0   64.4   74.0
RTX 4080 16GB -Pny         37.0   40.3   43.0
RTX 4090 24GB -Nvidia      35.0   38.0   40.0

cl-mem 2017-01-13: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     23.0  127.1  210.3
RTX 3090 24GB -Zotac       19.4  168.8  328.7
RTX 4080 16GB -Pny         10.7   80.4  155.4
RTX 4090 24GB -Nvidia       6.8   90.7  205.9

cl-mem 2017-01-13: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     60.0   67.7   73.0
RTX 3090 24GB -Zotac       49.0   64.5   74.0
RTX 4080 16GB -Pny         38.0   40.7   43.0
RTX 4090 24GB -Nvidia      36.0   37.9   40.0

cl-mem 2017-01-13: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     23.2  126.9  212.8
RTX 3090 24GB -Zotac       19.7  171.2  327.8
RTX 4080 16GB -Pny         12.8   77.9  155.5
RTX 4090 24GB -Nvidia       6.8   87.4  204.4

cl-mem 2017-01-13: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     59.0   67.8   73.0
RTX 3090 24GB -Zotac       48.0   63.9   73.0
RTX 4080 16GB -Pny         39.0   41.6   44.0
RTX 4090 24GB -Nvidia      38.0   39.6   42.0

cl-mem 2017-01-13: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.9  130.3  212.4
RTX 3090 24GB -Zotac       19.8  166.2  329.4
RTX 4080 16GB -Pny         12.9   78.5  155.7
RTX 4090 24GB -Nvidia       6.8   92.1  205.2

LeelaChessZero 0.28: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell       55  79.42     82
RTX 3090 24GB -Zotac         53  75.37     83
RTX 4080 16GB -Pny           38  46.95     50
RTX 4090 24GB -Nvidia        36  43.56     46

LeelaChessZero 0.28: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell    21.91 229.31 264.57
RTX 3090 24GB -Zotac      28.54 263.63 349.64
RTX 4080 16GB -Pny        13.99 117.32 160.59
RTX 4090 24GB -Nvidia       6.5 122.59 165.25

FinanceBench 2016-07-25: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     55.0   56.8   58.0
RTX 3090 24GB -Zotac       52.0   55.8   60.0
RTX 4080 16GB -Pny         35.0   36.4   38.0
RTX 4090 24GB -Nvidia      33.0   34.7   36.0

FinanceBench 2016-07-25: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     21.3   45.4   73.1
RTX 3090 24GB -Zotac       19.5   66.6  119.2
RTX 4080 16GB -Pny         13.0   26.6   39.8
RTX 4090 24GB -Nvidia       6.4   29.3   51.3

NeatBench 5: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     56.0   58.3   61.0
RTX 3090 24GB -Zotac       50.0   57.3   63.0
RTX 4080 16GB -Pny         34.0   34.9   37.0
RTX 4090 24GB -Nvidia      33.0   34.4   36.0

NeatBench 5: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.1   68.0  119.6
RTX 3090 24GB -Zotac       23.6   85.4  168.2
RTX 4080 16GB -Pny          4.6   27.9   89.5
RTX 4090 24GB -Nvidia       6.9   36.3   73.8

clpeak 1.1.2: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     59.0   62.5   68.0
RTX 3090 24GB -Zotac       50.0   61.6   72.0
RTX 4080 16GB -Pny         35.0   38.5   56.0
RTX 4090 24GB -Nvidia      36.0   38.5   55.0

clpeak 1.1.2: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.3   90.1  254.1
RTX 3090 24GB -Zotac       23.7  127.5  323.8
RTX 4080 16GB -Pny         14.8   58.3  297.3
RTX 4090 24GB -Nvidia       7.1   83.9  421.2

clpeak 1.1.2: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     59.0   70.0   75.0
RTX 3090 24GB -Zotac       52.0   67.1   73.0
RTX 4080 16GB -Pny         35.0   39.2   41.0
RTX 4090 24GB -Nvidia      36.0   39.5   41.0

clpeak 1.1.2: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.4  145.6  183.3
RTX 3090 24GB -Zotac       20.3  170.7  209.4
RTX 4080 16GB -Pny         12.8   70.5   93.0
RTX 4090 24GB -Nvidia       7.0   90.3  122.6

clpeak 1.1.2: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     60.0   65.9   73.0
RTX 3090 24GB -Zotac       50.0   61.1   73.0
RTX 4080 16GB -Pny         36.0   38.0   52.0
RTX 4090 24GB -Nvidia      36.0   38.8   52.0

clpeak 1.1.2: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     21.8  109.9  254.5
RTX 3090 24GB -Zotac       22.4  114.3  268.7
RTX 4080 16GB -Pny         14.3   56.1  255.4
RTX 4090 24GB -Nvidia       6.9   70.4  356.0

clpeak 1.1.2: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     59.0   66.4   74.0
RTX 3090 24GB -Zotac       50.0   62.0   75.0
RTX 4080 16GB -Pny         36.0   38.4   43.0
RTX 4090 24GB -Nvidia      35.0   37.4   40.0

clpeak 1.1.2: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.0  120.7  246.3
RTX 3090 24GB -Zotac       20.3  147.9  355.6
RTX 4080 16GB -Pny         14.6   68.3  177.2
RTX 4090 24GB -Nvidia       6.8   79.8  231.0

ArrayFire 3.7: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     59.0   64.4   71.0
RTX 3090 24GB -Zotac       49.0   61.9   72.0
RTX 4080 16GB -Pny         37.0   39.4   42.0
RTX 4090 24GB -Nvidia      36.0   38.5   41.0

ArrayFire 3.7: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.4  103.9  212.7
RTX 3090 24GB -Zotac       24.2  149.2  279.0
RTX 4080 16GB -Pny         14.7   58.6  132.1
RTX 4090 24GB -Nvidia       6.7   47.6  113.6

Rodinia 3.1: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     62.0   69.2   74.0
RTX 3090 24GB -Zotac       52.0   64.9   77.0
RTX 4080 16GB -Pny         39.0   43.8   49.0
RTX 4090 24GB -Nvidia      41.0   44.6   50.0

Rodinia 3.1: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     23.3  133.0  219.9
RTX 3090 24GB -Zotac       19.9  148.1  254.9
RTX 4080 16GB -Pny         14.5   63.1  123.6
RTX 4090 24GB -Nvidia       6.8   67.5  167.3

OctaneBench 2020.1: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     58.0   80.0   83.0
RTX 3090 24GB -Zotac       50.0   79.3   82.0
RTX 4080 16GB -Pny         35.0   54.7   59.0
RTX 4090 24GB -Nvidia      35.0   55.0   60.0

OctaneBench 2020.1: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.6  242.0  259.3
RTX 3090 24GB -Zotac       28.6  337.3  362.8
RTX 4080 16GB -Pny         12.7  220.4  259.6
RTX 4090 24GB -Nvidia       6.8  279.5  322.5

NAMD CUDA 2.14: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     60.0   67.3   74.0
RTX 3090 24GB -Zotac       47.0   54.3   78.0
RTX 4080 16GB -Pny         36.0   39.5   52.0
RTX 4090 24GB -Nvidia      34.0   37.3   44.0

NAMD CUDA 2.14: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     23.6  127.2  297.8
RTX 3090 24GB -Zotac       19.0   68.9  327.9
RTX 4080 16GB -Pny         13.4   69.1  263.4
RTX 4090 24GB -Nvidia       6.4   54.4  281.7

GROMACS 2023: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     59.0   73.6   80.0
RTX 3090 24GB -Zotac       48.0   71.1   82.0
RTX 4080 16GB -Pny         35.0   48.9   62.0
RTX 4090 24GB -Nvidia      36.0   45.6   57.0

GROMACS 2023: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.7  194.8  276.3
RTX 3090 24GB -Zotac       28.0  243.2  365.1
RTX 4080 16GB -Pny         12.5  176.8  317.1
RTX 4090 24GB -Nvidia       6.7  182.8  404.4

FAHBench 2.3.2: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     55.0   69.5   81.0
RTX 3090 24GB -Zotac       46.0   66.8   80.0
RTX 4080 16GB -Pny         33.0   39.7   46.0
RTX 4090 24GB -Nvidia      35.0   40.0   44.0

FAHBench 2.3.2: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     21.7  160.0  255.5
RTX 3090 24GB -Zotac       19.0  192.6  299.0
RTX 4080 16GB -Pny         10.4   88.1  141.6
RTX 4090 24GB -Nvidia       6.5   96.4  148.8

Hashcat 6.2.4: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     59.0   65.3   73.0
RTX 3090 24GB -Zotac       47.0   60.7   77.0
RTX 4080 16GB -Pny         35.0   42.8   63.0
RTX 4090 24GB -Nvidia      39.0   44.8   62.0

Hashcat 6.2.4: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.8  115.4  347.3
RTX 3090 24GB -Zotac       19.1  138.5  326.4
RTX 4080 16GB -Pny         12.4   94.7  308.4
RTX 4090 24GB -Nvidia       6.6  136.8  444.0

Hashcat 6.2.4: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     59.0   65.6   73.0
RTX 3090 24GB -Zotac       45.0   60.4   76.0
RTX 4080 16GB -Pny         36.0   43.0   63.0
RTX 4090 24GB -Nvidia      41.0   46.6   62.0

Hashcat 6.2.4: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.9  121.2  333.4
RTX 3090 24GB -Zotac       19.1  155.4  349.1
RTX 4080 16GB -Pny         13.2  101.3  323.7
RTX 4090 24GB -Nvidia       6.8  144.6  456.1

Hashcat 6.2.4: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     58.0   69.2   76.0
RTX 3090 24GB -Zotac       45.0   65.2   78.0
RTX 4080 16GB -Pny         35.0   48.9   64.0
RTX 4090 24GB -Nvidia      41.0   52.5   65.0

Hashcat 6.2.4: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.3  156.5  321.2
RTX 3090 24GB -Zotac       20.5  192.6  326.5
RTX 4080 16GB -Pny         12.9  157.1  290.4
RTX 4090 24GB -Nvidia       6.8  231.7  417.6

Hashcat 6.2.4: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     58.0   68.8   76.0
RTX 3090 24GB -Zotac       46.0   65.4   78.0
RTX 4080 16GB -Pny         34.0   47.5   62.0
RTX 4090 24GB -Nvidia      40.0   50.8   64.0

Hashcat 6.2.4: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     22.5  154.3  261.7
RTX 3090 24GB -Zotac       19.1  192.7  330.0
RTX 4080 16GB -Pny         12.7  148.9  281.6
RTX 4090 24GB -Nvidia       6.7  190.1  401.1

Hashcat 6.2.4: GPU Temperature Monitor (Celsius, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     50.0   65.0   74.0
RTX 3090 24GB -Zotac       50.0   67.5   82.0
RTX 4080 16GB -Pny         31.0   45.7   63.0
RTX 4090 24GB -Nvidia      36.0   49.1   65.0

Hashcat 6.2.4: GPU Power Consumption Monitor (Watts, Fewer Is Better)
                            Min    Avg    Max
RTX 2080 Ti 22GB -Dell     21.1  147.0  255.1
RTX 3090 24GB -Zotac       20.4  191.9  326.0
RTX 4080 16GB -Pny          6.8  142.4  304.5
RTX 4090 24GB -Nvidia       6.7  197.8  444.1

Phoronix Test Suite v10.8.4