Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2307077-NE-2307020PT71

2307020-PTS-GPUREVIEW1
RTX 4080 16GB PNY Review
HTML result view exported from: https://openbenchmarking.org/result/2307077-NE-2307020PT71&sor&grt
2307020-PTS-GPUREVIEW1 System Configuration (common to all four GPU runs):
  Processor: Intel Xeon w9-3495X @ 4.80GHz (56 Cores / 112 Threads)
  Motherboard: ASUS Pro WS W790E-SAGE SE (0506 BIOS)
  Chipset: Intel Device 7aa7
  Memory: 8 x 32 GB DDR5-4812MT/s Hynix HMCG88AEBRA115N
  Disk: 6401GB Micron_9300_MTFDHAL6T4TDR + 0GB Virtual HDisk0
  Graphics: NVIDIA GeForce RTX 3090 24GB / NVIDIA GeForce RTX 2080 Ti 22GB / NVIDIA GeForce RTX 4090 24GB / NVIDIA GeForce RTX 4080 16GB
  Audio: Realtek ALC1220
  Monitor: BenQ PD2720U
  Network: 2 x Intel X710 for 10GBASE-T
  OS: Ubuntu 22.04
  Kernel: 6.3.0-060300-generic (x86_64)
  Desktop: GNOME Shell 42.5
  Display Server: X Server 1.21.1.4
  Display Driver: NVIDIA 530.41.03
  OpenGL: 4.6.0
  OpenCL: OpenCL 3.0 CUDA 12.1.98
  Vulkan: 1.3.236
  Compiler: GCC 11.3.0 + CUDA 12.1
  File-System: ext4
  Screen Resolution: 3840x2160

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-aYxV0E/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: intel_pstate powersave (EPP: performance); CPU Microcode: 0x2b000390

Graphics Details:
  RTX 3090 24GB -Zotac: BAR1 / Visible vRAM Size: 32768 MiB; vBIOS Version: 94.02.26.48.65
  RTX 2080 Ti 22GB -Dell: BAR1 / Visible vRAM Size: 256 MiB; vBIOS Version: 90.02.30.40.4d
  RTX 4090 24GB -Nvidia: BAR1 / Visible vRAM Size: 32768 MiB; vBIOS Version: 95.02.20.00.03
  RTX 4080 16GB -Pny: BAR1 / Visible vRAM Size: 16384 MiB; vBIOS Version: 95.03.0e.00.67

OpenCL Details (GPU Compute Cores):
  RTX 3090 24GB -Zotac: 10496
  RTX 2080 Ti 22GB -Dell: 4352
  RTX 4090 24GB -Nvidia: 16384
  RTX 4080 16GB -Pny: 9728

Python Details: Python 3.10.6

Security Details: itlb_multihit: Not affected; l1tf: Not affected; mds: Not affected; meltdown: Not affected; mmio_stale_data: Not affected; retbleed: Not affected; spec_store_bypass: Mitigation of SSB disabled via prctl; spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization; spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence; srbds: Not affected; tsx_async_abort: Not affected
2307020-PTS-GPUREVIEW1 result summary. Tests run, in result order:
  arrayfire: Conjugate Gradient OpenCL
  blender: BMW27, Classroom, Fishy Cat, Pabellon Barcelona, Barbershop (all NVIDIA OptiX)
  caffe: AlexNet and GoogleNet (NVIDIA CUDA; 100, 200, 1000 iterations)
  v-ray: NVIDIA CUDA GPU; NVIDIA RTX GPU
  cl-mem: Read, Write, Copy
  clpeak: Global Memory Bandwidth, Single-Precision Float, Double-Precision Double, Integer Compute INT
  fahbench
  financebench: Black-Scholes OpenCL
  gromacs: NVIDIA CUDA GPU - water_GMX50_bare
  hashcat: MD5, SHA1, SHA-512, 7-Zip, TrueCrypt RIPEMD160 + XTS
  indigobench: OpenCL GPU - Supercar, OpenCL GPU - Bedroom
  lczero: OpenCL
  mandelgpu: GPU
  namd-cuda: ATPase Simulation - 327,506 Atoms
  ncnn (Vulkan GPU): mobilenet, mobilenet-v2, mobilenet-v3, shufflenet-v2, mnasnet, efficientnet-b0, blazeface, googlenet, vgg16, resnet18, alexnet, resnet50, yolov4-tiny, squeezenet_ssd, regnety_400m, vision_transformer, FastestDet
  neatbench: GPU
  octanebench: Total Score
  realsr-ncnn: 4x - Yes, 4x - No
  rodinia: OpenCL Particle Filter
  shoc (OpenCL): Bus Speed Download, Bus Speed Readback, Max SP Flops, Texture Read Bandwidth, FFT SP, GEMM SGEMM_N, MD5 Hash, Reduction, Triad, S3D
  viennacl (OpenCL BLAS): sCOPY, sAXPY, sDOT, dCOPY, dAXPY, dDOT, dGEMV-N, dGEMV-T, dGEMM-NN, dGEMM-NT, dGEMM-TN, dGEMM-TT
  viennacl (CPU BLAS): sCOPY, sAXPY, sDOT, dCOPY, dAXPY, dDOT, dGEMV-N, dGEMV-T, dGEMM-NN, dGEMM-NT, dGEMM-TN, dGEMM-TT
  vkfft
  vkpeak: fp32-scalar, fp32-vec4, fp16-scalar, fp16-vec4, fp64-scalar, fp64-vec4, int32-scalar, int32-vec4, int16-scalar, int16-vec4
  vkresample: 2x - Single, 2x - Double
  waifu2x-ncnn: 2x - 3 - Yes

Raw result values, one vector per GPU in the test order above (tests without a result for a given GPU are skipped):

RTX 3090 24GB -Zotac:
  1.583 6.13 14.25 10.84 16.07 52.28 664.597 1321.36 6584.81 2418.86 4830.59 24142.8
  2052 2878 823.3 734.3 359.2 814.92 35000.44 640.15 17864.54 316.5721 5.796 23.743
  63700926667 20634050000 2595250000 1095950 731327 51.701 20.772 14316 568293355.5 0.06742
  37.93 18.25 18.78 17.57 19.16 23.97 5.38 28.44 4.40 21.45 21.31 6.35 49.10 57.27 25.04 324.88 13.52
  3090 669.371026 31.170 6.509 3.768 25.1930 26.3975 37738.4 2148.86 2349.39 7912.59 41.3324 392.804 24.5486 427.491
  363 495 372 600 716 658 185 371 585 588 587 584
  1842 2272 764 834 984 606 172 749 107 120 141 143
  41547 20403.35 26392.09 20161.59 40002.93 645.24 645.31 20305.75 20074.10 13295.72 16205.98
  9.359 121.353 3.536

RTX 2080 Ti 22GB -Dell:
  1.679 8.93 22.69 16.89 27.42 92.87
  950 1306 543.8 452.2 318.1 504.66 14076.71 506.01 11826.76 292.6940 8.840 15.377
  51503250000 16300900000 2047183333 840275 580260 32.598 11.152 13228 442866316.9 0.09515
  38.20 6.35 18.77 6.11 7.03 26.09 5.49 28.22 7.07 22.13 21.50 5.85 52.48 61.74 23.13 363.75 6.80
  2080 349.490185 52.755 9.239 4.527 12.8530 13.2046 16106.4 1147.26 1504.87 4858.13 32.5181 366.058 12.6064 270.801
  306 392 305 475 518 524 297 370 459 461 460 458
  1755 2261 747 923 1058 658 179 742 106.8 119 142 145
  33404 15974.22 15859.12 15521.12 30649.13 500.91 503.53 15963.59 15752.98 10270.26 12975.65
  14.790 153.146 4.307

RTX 4090 24GB -Nvidia:
  0.8803 3.70 7.47 5.73 8.42 30.55 447.341 883.953 4379.70 1660.01 3304.15 16499.3
  4265 5478 886.0 785.8 410.3 871.56 79463.36 1391.51 40697.73 423.2069 2.890 43.470
  154333333333 49393366667 6297766667 2746938 1857800 78.445 35.296 15097 951815274.2 0.04850
  9.21 4.42 4.97 4.85 4.39 6.42 3.95 5.77 25.14 4.50 5.08 7.28 35.23 8.99 5.46 289.04 5.08
  4090 1326.237674 20.187 5.109 2.085 25.1311 26.3724 88606.0 2980.15 2779.48 27175.9 93.9019 967.142 25.0717 644.603
  444 568 446 661 772 719 220 443 1150 1280 1293 1340
  1874 2360 784 946 1098 671 188 771 120 115 137 144
  59427 44315.35 58607.46 44269.79 87698.39 1394.44 1396.26 44276.72 44062.52 29507.17 39280.82
  7.799 55.925 2.490

RTX 4080 16GB -Pny:
  1.122 4.43 9.32 7.59 10.41 37.97 537.491 1053.88 5228.00 1649.29 3279.17 16310.5
  3126 4080 623.2 567.7 382.0 611.67 47802.87 869.25 24524.34 424.3492 4.404 32.044
  96891966667 30881533333 3936350000 1747350 1146560 66.030 26.080 15316 772392077.3 0.05427
  9.36 4.32 5.02 4.78 4.37 6.30 3.89 5.71 7.90 4.78 17.57 5.16 35.87 9.13 5.32 290.15 5.18
  4080 977.52814 24.374 5.630 2.685 25.1007 26.3935 55455.8 3065.76 1831.86 16952.6 60.5581 1020.235 24.7590 423.578
  385 487 417 540 608 597 224 435 774 795 831 849
  1901 2422 783 956 1085 678 187 787 103.5 122 135 137
  48201 27771.88 36655.69 27669.37 54827.66 873.52 873.67 27758.93 27595.51 18441.46 24616.84
  11.543 89.237 2.761
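The summary above lists the test identifiers once, then one flat vector of result values per GPU in that same test order. A minimal sketch of pairing the two back up, using a small excerpt of the values from this file (the variable names are illustrative, not a Phoronix Test Suite API):

```python
# Small excerpt of the summary data: three tests and the matching leading
# values from each GPU's result vector in this file.
tests = [
    "arrayfire: Conjugate Gradient OpenCL (ms)",
    "blender: BMW27 - NVIDIA OptiX (s)",
    "blender: Classroom - NVIDIA OptiX (s)",
]

# One result vector per GPU, in the same order as the test list above.
vectors = {
    "RTX 4090 24GB": [0.8803, 3.70, 7.47],
    "RTX 4080 16GB": [1.1220, 4.43, 9.32],
    "RTX 3090 24GB": [1.5830, 6.13, 14.25],
    "RTX 2080 Ti 22GB": [1.6790, 8.93, 22.69],
}

# Rebuild a per-test view: {test: {gpu: value}}
table = {
    test: {gpu: vals[i] for gpu, vals in vectors.items()}
    for i, test in enumerate(tests)
}

print(table["blender: BMW27 - NVIDIA OptiX (s)"]["RTX 4090 24GB"])  # 3.7
```

In the full file each vector simply continues through every test listed, skipping tests that have no result for that GPU (for example, the RTX 2080 Ti has no Caffe results).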
ArrayFire 3.7, Test: Conjugate Gradient OpenCL (ms, fewer is better):
  RTX 4090 24GB -Nvidia: 0.8803 (SE +/- 0.0014, N = 9)
  RTX 4080 16GB -Pny: 1.1220 (SE +/- 0.0013, N = 10)
  RTX 3090 24GB -Zotac: 1.5830 (SE +/- 0.0020, N = 9)
  RTX 2080 Ti 22GB -Dell: 1.6790 (SE +/- 0.0055, N = 9)
  1. (CXX) g++ options: -rdynamic

ArrayFire 3.7, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4090 24GB -Nvidia: Min 6.7 / Avg 47.6 / Max 113.6
  RTX 4080 16GB -Pny: Min 14.7 / Avg 58.6 / Max 132.1
  RTX 2080 Ti 22GB -Dell: Min 22.4 / Avg 103.9 / Max 212.7
  RTX 3090 24GB -Zotac: Min 24.2 / Avg 149.2 / Max 279.0

ArrayFire 3.7, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 36.0 / Avg 38.5 / Max 41.0
  RTX 4080 16GB -Pny: Min 37.0 / Avg 39.4 / Max 42.0
  RTX 3090 24GB -Zotac: Min 49.0 / Avg 61.9 / Max 72.0
  RTX 2080 Ti 22GB -Dell: Min 59.0 / Avg 64.4 / Max 71.0
Blender 3.6, Blend File: BMW27, Compute: NVIDIA OptiX (Seconds, fewer is better):
  RTX 4090 24GB -Nvidia: 3.70 (SE +/- 0.06, N = 15)
  RTX 4080 16GB -Pny: 4.43 (SE +/- 0.06, N = 15)
  RTX 3090 24GB -Zotac: 6.13 (SE +/- 0.06, N = 15)
  RTX 2080 Ti 22GB -Dell: 8.93 (SE +/- 0.06, N = 15)

Blender 3.6, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 14.3 / Avg 95.0 / Max 198.6
  RTX 4090 24GB -Nvidia: Min 7.0 / Avg 107.7 / Max 242.2
  RTX 2080 Ti 22GB -Dell: Min 23.3 / Avg 173.8 / Max 254.3
  RTX 3090 24GB -Zotac: Min 19.2 / Avg 215.9 / Max 349.9

Blender 3.6, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 42.0 / Avg 44.8 / Max 50.0
  RTX 4080 16GB -Pny: Min 41.0 / Avg 45.3 / Max 51.0
  RTX 3090 24GB -Zotac: Min 46.0 / Avg 67.3 / Max 79.0
  RTX 2080 Ti 22GB -Dell: Min 60.0 / Avg 72.6 / Max 78.0
Blender 3.6, Blend File: Classroom, Compute: NVIDIA OptiX (Seconds, fewer is better):
  RTX 4090 24GB -Nvidia: 7.47 (SE +/- 0.01, N = 6)
  RTX 4080 16GB -Pny: 9.32 (SE +/- 0.02, N = 5)
  RTX 3090 24GB -Zotac: 14.25 (SE +/- 0.01, N = 4)
  RTX 2080 Ti 22GB -Dell: 22.69 (SE +/- 0.03, N = 3)

Blender 3.6, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 5.3 / Avg 142.2 / Max 230.1
  RTX 4090 24GB -Nvidia: Min 7.1 / Avg 162.1 / Max 291.8
  RTX 2080 Ti 22GB -Dell: Min 22.5 / Avg 212.6 / Max 255.0
  RTX 3090 24GB -Zotac: Min 19.2 / Avg 267.4 / Max 353.6

Blender 3.6, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 40.0 / Avg 46.5 / Max 52.0
  RTX 4080 16GB -Pny: Min 39.0 / Avg 48.3 / Max 55.0
  RTX 3090 24GB -Zotac: Min 48.0 / Avg 73.0 / Max 81.0
  RTX 2080 Ti 22GB -Dell: Min 59.0 / Avg 75.4 / Max 81.0
Blender 3.6, Blend File: Fishy Cat, Compute: NVIDIA OptiX (Seconds, fewer is better):
  RTX 4090 24GB -Nvidia: 5.73 (SE +/- 0.07, N = 15)
  RTX 4080 16GB -Pny: 7.59 (SE +/- 0.07, N = 15)
  RTX 3090 24GB -Zotac: 10.84 (SE +/- 0.07, N = 15)
  RTX 2080 Ti 22GB -Dell: 16.89 (SE +/- 0.21, N = 4)

Blender 3.6, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 13.4 / Avg 116.8 / Max 223.6
  RTX 4090 24GB -Nvidia: Min 6.9 / Avg 136.7 / Max 303.2
  RTX 2080 Ti 22GB -Dell: Min 22.9 / Avg 195.0 / Max 256.9
  RTX 3090 24GB -Zotac: Min 19.4 / Avg 234.9 / Max 344.1

Blender 3.6, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 41.0 / Avg 45.8 / Max 54.0
  RTX 4080 16GB -Pny: Min 40.0 / Avg 48.2 / Max 57.0
  RTX 3090 24GB -Zotac: Min 48.0 / Avg 70.7 / Max 82.0
  RTX 2080 Ti 22GB -Dell: Min 60.0 / Avg 74.4 / Max 80.0
Blender 3.6, Blend File: Pabellon Barcelona, Compute: NVIDIA OptiX (Seconds, fewer is better):
  RTX 4090 24GB -Nvidia: 8.42 (SE +/- 0.02, N = 5)
  RTX 4080 16GB -Pny: 10.41 (SE +/- 0.01, N = 5)
  RTX 3090 24GB -Zotac: 16.07 (SE +/- 0.01, N = 3)
  RTX 2080 Ti 22GB -Dell: 27.42 (SE +/- 0.04, N = 3)

Blender 3.6, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 12.8 / Avg 140.8 / Max 239.5
  RTX 4090 24GB -Nvidia: Min 6.9 / Avg 172.2 / Max 302.0
  RTX 2080 Ti 22GB -Dell: Min 22.7 / Avg 212.9 / Max 257.2
  RTX 3090 24GB -Zotac: Min 20.5 / Avg 264.0 / Max 361.3

Blender 3.6, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 39.0 / Avg 45.3 / Max 50.0
  RTX 4080 16GB -Pny: Min 39.0 / Avg 48.0 / Max 54.0
  RTX 3090 24GB -Zotac: Min 47.0 / Avg 71.9 / Max 80.0
  RTX 2080 Ti 22GB -Dell: Min 60.0 / Avg 75.8 / Max 80.0
Blender 3.6, Blend File: Barbershop, Compute: NVIDIA OptiX (Seconds, fewer is better):
  RTX 4090 24GB -Nvidia: 30.55 (SE +/- 0.13, N = 3)
  RTX 4080 16GB -Pny: 37.97 (SE +/- 0.07, N = 3)
  RTX 3090 24GB -Zotac: 52.28 (SE +/- 0.01, N = 3)
  RTX 2080 Ti 22GB -Dell: 92.87 (SE +/- 0.09, N = 3)

Blender 3.6, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 13.7 / Avg 161.8 / Max 215.2
  RTX 4090 24GB -Nvidia: Min 6.8 / Avg 189.4 / Max 276.9
  RTX 2080 Ti 22GB -Dell: Min 22.8 / Avg 226.2 / Max 257.6
  RTX 3090 24GB -Zotac: Min 18.6 / Avg 292.2 / Max 358.2

Blender 3.6, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 40.0 / Avg 48.5 / Max 54.0
  RTX 4080 16GB -Pny: Min 40.0 / Avg 51.0 / Max 56.0
  RTX 3090 24GB -Zotac: Min 47.0 / Avg 75.6 / Max 82.0
  RTX 2080 Ti 22GB -Dell: Min 59.0 / Avg 78.7 / Max 82.0
Caffe 2020-02-13, Model: AlexNet, Acceleration: NVIDIA CUDA, Iterations: 100 (Milli-Seconds, fewer is better):
  RTX 4090 24GB -Nvidia: 447.34 (SE +/- 1.11, N = 11)
  RTX 4080 16GB -Pny: 537.49 (SE +/- 1.06, N = 11)
  RTX 3090 24GB -Zotac: 664.60 (SE +/- 1.37, N = 11)
  1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe 2020-02-13, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 14.0 / Avg 48.0 / Max 178.0
  RTX 4090 24GB -Nvidia: Min 7.0 / Avg 56.9 / Max 202.9
  RTX 3090 24GB -Zotac: Min 18.7 / Avg 125.3 / Max 335.1

Caffe 2020-02-13, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4080 16GB -Pny: Min 39.0 / Avg 42.1 / Max 50.0
  RTX 4090 24GB -Nvidia: Min 40.0 / Avg 42.4 / Max 49.0
  RTX 3090 24GB -Zotac: Min 47.0 / Avg 60.1 / Max 75.0
Caffe 2020-02-13, Model: AlexNet, Acceleration: NVIDIA CUDA, Iterations: 200 (Milli-Seconds, fewer is better):
  RTX 4090 24GB -Nvidia: 883.95 (SE +/- 0.87, N = 11)
  RTX 4080 16GB -Pny: 1053.88 (SE +/- 1.43, N = 10)
  RTX 3090 24GB -Zotac: 1321.36 (SE +/- 1.00, N = 10)
  1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe 2020-02-13, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 13.4 / Avg 60.0 / Max 179.5
  RTX 4090 24GB -Nvidia: Min 6.8 / Avg 69.2 / Max 202.4
  RTX 3090 24GB -Zotac: Min 19.3 / Avg 144.4 / Max 336.9

Caffe 2020-02-13, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 38.0 / Avg 39.5 / Max 44.0
  RTX 4080 16GB -Pny: Min 38.0 / Avg 40.3 / Max 48.0
  RTX 3090 24GB -Zotac: Min 50.0 / Avg 61.4 / Max 77.0
Caffe 2020-02-13, Model: AlexNet, Acceleration: NVIDIA CUDA, Iterations: 1000 (Milli-Seconds, fewer is better):
  RTX 4090 24GB -Nvidia: 4379.70 (SE +/- 2.74, N = 7)
  RTX 4080 16GB -Pny: 5228.00 (SE +/- 3.07, N = 6)
  RTX 3090 24GB -Zotac: 6584.81 (SE +/- 2.76, N = 6)
  1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe 2020-02-13, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 10.7 / Avg 104.4 / Max 178.5
  RTX 4090 24GB -Nvidia: Min 7.0 / Avg 114.8 / Max 203.5
  RTX 3090 24GB -Zotac: Min 19.1 / Avg 224.3 / Max 341.6

Caffe 2020-02-13, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 36.0 / Avg 40.7 / Max 45.0
  RTX 4080 16GB -Pny: Min 37.0 / Avg 43.9 / Max 50.0
  RTX 3090 24GB -Zotac: Min 48.0 / Avg 68.9 / Max 80.0
Caffe 2020-02-13, Model: GoogleNet, Acceleration: NVIDIA CUDA, Iterations: 100 (Milli-Seconds, fewer is better):
  RTX 4080 16GB -Pny: 1649.29 (SE +/- 2.33, N = 10)
  RTX 4090 24GB -Nvidia: 1660.01 (SE +/- 1.34, N = 10)
  RTX 3090 24GB -Zotac: 2418.86 (SE +/- 1.69, N = 9)
  1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe 2020-02-13, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 13.4 / Avg 62.2 / Max 154.8
  RTX 4090 24GB -Nvidia: Min 7.0 / Avg 71.0 / Max 160.9
  RTX 3090 24GB -Zotac: Min 19.3 / Avg 154.8 / Max 289.3

Caffe 2020-02-13, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 37.0 / Avg 38.5 / Max 42.0
  RTX 4080 16GB -Pny: Min 38.0 / Avg 40.8 / Max 47.0
  RTX 3090 24GB -Zotac: Min 47.0 / Avg 61.3 / Max 74.0
Caffe 2020-02-13, Model: GoogleNet, Acceleration: NVIDIA CUDA, Iterations: 200 (Milli-Seconds, fewer is better):
  RTX 4080 16GB -Pny: 3279.17 (SE +/- 3.25, N = 8)
  RTX 4090 24GB -Nvidia: 3304.15 (SE +/- 3.64, N = 8)
  RTX 3090 24GB -Zotac: 4830.59 (SE +/- 2.81, N = 7)
  1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe 2020-02-13, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 12.6 / Avg 78.5 / Max 154.8
  RTX 4090 24GB -Nvidia: Min 7.0 / Avg 87.3 / Max 162.9
  RTX 3090 24GB -Zotac: Min 25.4 / Avg 184.4 / Max 290.0

Caffe 2020-02-13, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 36.0 / Avg 39.3 / Max 43.0
  RTX 4080 16GB -Pny: Min 37.0 / Avg 41.8 / Max 48.0
  RTX 3090 24GB -Zotac: Min 47.0 / Avg 64.6 / Max 76.0
Caffe 2020-02-13, Model: GoogleNet, Acceleration: NVIDIA CUDA, Iterations: 1000 (Milli-Seconds, fewer is better):
  RTX 4080 16GB -Pny: 16310.5 (SE +/- 18.97, N = 3)
  RTX 4090 24GB -Nvidia: 16499.3 (SE +/- 24.30, N = 3)
  RTX 3090 24GB -Zotac: 24142.8 (SE +/- 30.41, N = 3)
  1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lcrypto -lcurl -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas

Caffe 2020-02-13, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 13.4 / Avg 121.4 / Max 156.0
  RTX 4090 24GB -Nvidia: Min 6.9 / Avg 129.1 / Max 162.7
  RTX 3090 24GB -Zotac: Min 18.5 / Avg 252.8 / Max 299.7

Caffe 2020-02-13, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 37.0 / Avg 42.3 / Max 45.0
  RTX 4080 16GB -Pny: Min 37.0 / Avg 46.1 / Max 50.0
  RTX 3090 24GB -Zotac: Min 47.0 / Avg 72.8 / Max 78.0
Chaos Group V-RAY 5.02, Mode: NVIDIA CUDA GPU (vpaths, more is better):
  RTX 4090 24GB -Nvidia: 4265 (SE +/- 2.33, N = 3)
  RTX 4080 16GB -Pny: 3126 (SE +/- 2.73, N = 3)
  RTX 3090 24GB -Zotac: 2052 (SE +/- 2.85, N = 3)
  RTX 2080 Ti 22GB -Dell: 950 (SE +/- 0.67, N = 3)

Chaos Group V-RAY 5.02, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 13.4 / Avg 159.5 / Max 227.6
  RTX 2080 Ti 22GB -Dell: Min 22.7 / Avg 181.0 / Max 256.4
  RTX 4090 24GB -Nvidia: Min 7.2 / Avg 199.1 / Max 296.3
  RTX 3090 24GB -Zotac: Min 19.2 / Avg 250.0 / Max 356.2

Chaos Group V-RAY 5.02, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 42.0 / Avg 50.8 / Max 57.0
  RTX 4080 16GB -Pny: Min 42.0 / Avg 51.8 / Max 57.0
  RTX 3090 24GB -Zotac: Min 48.0 / Avg 71.5 / Max 82.0
  RTX 2080 Ti 22GB -Dell: Min 57.0 / Avg 73.5 / Max 82.0
Chaos Group V-RAY 5.02, Mode: NVIDIA RTX GPU (vrays, more is better):
  RTX 4090 24GB -Nvidia: 5478 (SE +/- 63.00, N = 3)
  RTX 4080 16GB -Pny: 4080 (SE +/- 30.20, N = 3)
  RTX 3090 24GB -Zotac: 2878 (SE +/- 18.12, N = 3)
  RTX 2080 Ti 22GB -Dell: 1306 (SE +/- 10.17, N = 3)

Chaos Group V-RAY 5.02, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 4.5 / Avg 130.0 / Max 211.7
  RTX 2080 Ti 22GB -Dell: Min 21.6 / Avg 167.9 / Max 253.1
  RTX 4090 24GB -Nvidia: Min 6.8 / Avg 169.0 / Max 272.7
  RTX 3090 24GB -Zotac: Min 18.9 / Avg 235.8 / Max 358.0

Chaos Group V-RAY 5.02, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4080 16GB -Pny: Min 39.0 / Avg 48.0 / Max 54.0
  RTX 4090 24GB -Nvidia: Min 41.0 / Avg 48.6 / Max 54.0
  RTX 3090 24GB -Zotac: Min 47.0 / Avg 68.3 / Max 80.0
  RTX 2080 Ti 22GB -Dell: Min 55.0 / Avg 71.7 / Max 82.0
cl-mem 2017-01-13, Benchmark: Read (GB/s, more is better):
  RTX 4090 24GB -Nvidia: 886.0 (SE +/- 0.31, N = 10)
  RTX 3090 24GB -Zotac: 823.3 (SE +/- 0.81, N = 9)
  RTX 4080 16GB -Pny: 623.2 (SE +/- 0.71, N = 9)
  RTX 2080 Ti 22GB -Dell: 543.8 (SE +/- 0.15, N = 8)
  1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem 2017-01-13, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 12.9 / Avg 78.5 / Max 155.7
  RTX 4090 24GB -Nvidia: Min 6.8 / Avg 92.1 / Max 205.2
  RTX 2080 Ti 22GB -Dell: Min 22.9 / Avg 130.3 / Max 212.4
  RTX 3090 24GB -Zotac: Min 19.8 / Avg 166.2 / Max 329.4

cl-mem 2017-01-13, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 38.0 / Avg 39.6 / Max 42.0
  RTX 4080 16GB -Pny: Min 39.0 / Avg 41.6 / Max 44.0
  RTX 3090 24GB -Zotac: Min 48.0 / Avg 63.9 / Max 73.0
  RTX 2080 Ti 22GB -Dell: Min 59.0 / Avg 67.8 / Max 73.0
cl-mem 2017-01-13, Benchmark: Write (GB/s, more is better):
  RTX 4090 24GB -Nvidia: 785.8 (SE +/- 0.36, N = 10)
  RTX 3090 24GB -Zotac: 734.3 (SE +/- 0.49, N = 10)
  RTX 4080 16GB -Pny: 567.7 (SE +/- 0.54, N = 9)
  RTX 2080 Ti 22GB -Dell: 452.2 (SE +/- 0.85, N = 8)
  1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem 2017-01-13, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 12.8 / Avg 77.9 / Max 155.5
  RTX 4090 24GB -Nvidia: Min 6.8 / Avg 87.4 / Max 204.4
  RTX 2080 Ti 22GB -Dell: Min 23.2 / Avg 126.9 / Max 212.8
  RTX 3090 24GB -Zotac: Min 19.7 / Avg 171.2 / Max 327.8

cl-mem 2017-01-13, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 36.0 / Avg 37.9 / Max 40.0
  RTX 4080 16GB -Pny: Min 38.0 / Avg 40.7 / Max 43.0
  RTX 3090 24GB -Zotac: Min 49.0 / Avg 64.5 / Max 74.0
  RTX 2080 Ti 22GB -Dell: Min 60.0 / Avg 67.7 / Max 73.0
cl-mem 2017-01-13, Benchmark: Copy (GB/s, more is better):
  RTX 4090 24GB -Nvidia: 410.3 (SE +/- 0.05, N = 10)
  RTX 4080 16GB -Pny: 382.0 (SE +/- 0.03, N = 9)
  RTX 3090 24GB -Zotac: 359.2 (SE +/- 0.21, N = 10)
  RTX 2080 Ti 22GB -Dell: 318.1 (SE +/- 0.27, N = 8)
  1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem 2017-01-13, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 10.7 / Avg 80.4 / Max 155.4
  RTX 4090 24GB -Nvidia: Min 6.8 / Avg 90.7 / Max 205.9
  RTX 2080 Ti 22GB -Dell: Min 23.0 / Avg 127.1 / Max 210.3
  RTX 3090 24GB -Zotac: Min 19.4 / Avg 168.8 / Max 328.7

cl-mem 2017-01-13, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 35.0 / Avg 38.0 / Max 40.0
  RTX 4080 16GB -Pny: Min 37.0 / Avg 40.3 / Max 43.0
  RTX 3090 24GB -Zotac: Min 50.0 / Avg 64.4 / Max 74.0
  RTX 2080 Ti 22GB -Dell: Min 60.0 / Avg 67.6 / Max 73.0
clpeak 1.1.2, OpenCL Test: Global Memory Bandwidth (GBPS, more is better):
  RTX 4090 24GB -Nvidia: 871.56 (SE +/- 0.04, N = 10)
  RTX 3090 24GB -Zotac: 814.92 (SE +/- 0.01, N = 10)
  RTX 4080 16GB -Pny: 611.67 (SE +/- 0.17, N = 10)
  RTX 2080 Ti 22GB -Dell: 504.66 (SE +/- 0.36, N = 9)
  1. (CXX) g++ options: -O3

clpeak 1.1.2, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 14.6 / Avg 68.3 / Max 177.2
  RTX 4090 24GB -Nvidia: Min 6.8 / Avg 79.8 / Max 231.0
  RTX 2080 Ti 22GB -Dell: Min 22.0 / Avg 120.7 / Max 246.3
  RTX 3090 24GB -Zotac: Min 20.3 / Avg 147.9 / Max 355.6

clpeak 1.1.2, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 35.0 / Avg 37.4 / Max 40.0
  RTX 4080 16GB -Pny: Min 36.0 / Avg 38.4 / Max 43.0
  RTX 3090 24GB -Zotac: Min 50.0 / Avg 62.0 / Max 75.0
  RTX 2080 Ti 22GB -Dell: Min 59.0 / Avg 66.4 / Max 74.0
clpeak 1.1.2, OpenCL Test: Single-Precision Float (GFLOPS, more is better):
  RTX 4090 24GB -Nvidia: 79463.36 (SE +/- 103.68, N = 14)
  RTX 4080 16GB -Pny: 47802.87 (SE +/- 50.90, N = 14)
  RTX 3090 24GB -Zotac: 35000.44 (SE +/- 10.34, N = 15)
  RTX 2080 Ti 22GB -Dell: 14076.71 (SE +/- 80.60, N = 11)
  1. (CXX) g++ options: -O3

clpeak 1.1.2, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 14.3 / Avg 56.1 / Max 255.4
  RTX 4090 24GB -Nvidia: Min 6.9 / Avg 70.4 / Max 356.0
  RTX 2080 Ti 22GB -Dell: Min 21.8 / Avg 109.9 / Max 254.5
  RTX 3090 24GB -Zotac: Min 22.4 / Avg 114.3 / Max 268.7

clpeak 1.1.2, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4080 16GB -Pny: Min 36.0 / Avg 38.0 / Max 52.0
  RTX 4090 24GB -Nvidia: Min 36.0 / Avg 38.8 / Max 52.0
  RTX 3090 24GB -Zotac: Min 50.0 / Avg 61.1 / Max 73.0
  RTX 2080 Ti 22GB -Dell: Min 60.0 / Avg 65.9 / Max 73.0
clpeak 1.1.2, OpenCL Test: Double-Precision Double (GFLOPS, more is better):
  RTX 4090 24GB -Nvidia: 1391.51 (SE +/- 1.85, N = 6)
  RTX 4080 16GB -Pny: 869.25 (SE +/- 1.29, N = 6)
  RTX 3090 24GB -Zotac: 640.15 (SE +/- 0.46, N = 5)
  RTX 2080 Ti 22GB -Dell: 506.01 (SE +/- 1.50, N = 4)
  1. (CXX) g++ options: -O3

clpeak 1.1.2, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 12.8 / Avg 70.5 / Max 93.0
  RTX 4090 24GB -Nvidia: Min 7.0 / Avg 90.3 / Max 122.6
  RTX 2080 Ti 22GB -Dell: Min 22.4 / Avg 145.6 / Max 183.3
  RTX 3090 24GB -Zotac: Min 20.3 / Avg 170.7 / Max 209.4

clpeak 1.1.2, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4080 16GB -Pny: Min 35.0 / Avg 39.2 / Max 41.0
  RTX 4090 24GB -Nvidia: Min 36.0 / Avg 39.5 / Max 41.0
  RTX 3090 24GB -Zotac: Min 52.0 / Avg 67.1 / Max 73.0
  RTX 2080 Ti 22GB -Dell: Min 59.0 / Avg 70.0 / Max 75.0
clpeak 1.1.2, OpenCL Test: Integer Compute INT (GIOPS, more is better):
  RTX 4090 24GB -Nvidia: 40697.73 (SE +/- 49.03, N = 14)
  RTX 4080 16GB -Pny: 24524.34 (SE +/- 20.64, N = 14)
  RTX 3090 24GB -Zotac: 17864.54 (SE +/- 36.29, N = 15)
  RTX 2080 Ti 22GB -Dell: 11826.76 (SE +/- 86.60, N = 15)
  1. (CXX) g++ options: -O3

clpeak 1.1.2, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 14.8 / Avg 58.3 / Max 297.3
  RTX 4090 24GB -Nvidia: Min 7.1 / Avg 83.9 / Max 421.2
  RTX 2080 Ti 22GB -Dell: Min 22.3 / Avg 90.1 / Max 254.1
  RTX 3090 24GB -Zotac: Min 23.7 / Avg 127.5 / Max 323.8

clpeak 1.1.2, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 36.0 / Avg 38.5 / Max 55.0
  RTX 4080 16GB -Pny: Min 35.0 / Avg 38.5 / Max 56.0
  RTX 3090 24GB -Zotac: Min 50.0 / Avg 61.6 / Max 72.0
  RTX 2080 Ti 22GB -Dell: Min 59.0 / Avg 62.5 / Max 68.0
FAHBench 2.3.2 (Ns Per Day, more is better):
  RTX 4080 16GB -Pny: 424.35 (SE +/- 0.63, N = 3)
  RTX 4090 24GB -Nvidia: 423.21 (SE +/- 0.24, N = 3)
  RTX 3090 24GB -Zotac: 316.57 (SE +/- 0.41, N = 3)
  RTX 2080 Ti 22GB -Dell: 292.69 (SE +/- 0.79, N = 3)

FAHBench 2.3.2, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 10.4 / Avg 88.1 / Max 141.6
  RTX 4090 24GB -Nvidia: Min 6.5 / Avg 96.4 / Max 148.8
  RTX 2080 Ti 22GB -Dell: Min 21.7 / Avg 160.0 / Max 255.5
  RTX 3090 24GB -Zotac: Min 19.0 / Avg 192.6 / Max 299.0

FAHBench 2.3.2, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4080 16GB -Pny: Min 33.0 / Avg 39.7 / Max 46.0
  RTX 4090 24GB -Nvidia: Min 35.0 / Avg 40.0 / Max 44.0
  RTX 3090 24GB -Zotac: Min 46.0 / Avg 66.8 / Max 80.0
  RTX 2080 Ti 22GB -Dell: Min 55.0 / Avg 69.5 / Max 81.0
FinanceBench 2016-07-25, Benchmark: Black-Scholes OpenCL (ms, fewer is better):
  RTX 4090 24GB -Nvidia: 2.890 (SE +/- 0.005, N = 15)
  RTX 4080 16GB -Pny: 4.404 (SE +/- 0.013, N = 15)
  RTX 3090 24GB -Zotac: 5.796 (SE +/- 0.002, N = 14)
  RTX 2080 Ti 22GB -Dell: 8.840 (SE +/- 0.110, N = 15)
  1. (CXX) g++ options: -O3 -march=native -fopenmp

FinanceBench 2016-07-25, GPU Power Consumption Monitor (Watts, fewer is better):
  RTX 4080 16GB -Pny: Min 13.0 / Avg 26.6 / Max 39.8
  RTX 4090 24GB -Nvidia: Min 6.4 / Avg 29.3 / Max 51.3
  RTX 2080 Ti 22GB -Dell: Min 21.3 / Avg 45.4 / Max 73.1
  RTX 3090 24GB -Zotac: Min 19.5 / Avg 66.6 / Max 119.2

FinanceBench 2016-07-25, GPU Temperature Monitor (Celsius, fewer is better):
  RTX 4090 24GB -Nvidia: Min 33.0 / Avg 34.7 / Max 36.0
  RTX 4080 16GB -Pny: Min 35.0 / Avg 36.4 / Max 38.0
  RTX 3090 24GB -Zotac: Min 52.0 / Avg 55.8 / Max 60.0
  RTX 2080 Ti 22GB -Dell: Min 55.0 / Avg 56.8 / Max 58.0
GPU Power Consumption Monitor, Phoronix Test Suite System Monitoring (Watts, overall across all tests):
  RTX 4080 16GB -Pny: Min 4 / Avg 92.31 / Max 331.21
  RTX 4090 24GB -Nvidia: Min 6 / Avg 110.92 / Max 456.05
  RTX 2080 Ti 22GB -Dell: Min 18.41 / Avg 148.08 / Max 347.3
  RTX 3090 24GB -Zotac: Min 18.42 / Avg 170.36 / Max 367.21

GPU Temperature Monitor, Phoronix Test Suite System Monitoring (Celsius, overall across all tests):
  RTX 4090 24GB -Nvidia: Min 32 / Avg 43.13 / Max 66
  RTX 4080 16GB -Pny: Min 31 / Avg 44.05 / Max 73
  RTX 3090 24GB -Zotac: Min 41 / Avg 65.62 / Max 84
  RTX 2080 Ti 22GB -Dell: Min 44 / Avg 69.38 / Max 83
GROMACS 2023 - Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
  RTX 4090 24GB -Nvidia: 43.47 (SE +/- 0.02, N = 3)
  RTX 4080 16GB -Pny: 32.04 (SE +/- 0.02, N = 3)
  RTX 3090 24GB -Zotac: 23.74 (SE +/- 0.05, N = 3)
  RTX 2080 Ti 22GB -Dell: 15.38 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3
GROMACS 2023 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 12.5 / Avg 176.8 / Max 317.1
  RTX 4090 24GB -Nvidia: Min 6.7 / Avg 182.8 / Max 404.4
  RTX 2080 Ti 22GB -Dell: Min 22.7 / Avg 194.8 / Max 276.3
  RTX 3090 24GB -Zotac: Min 28.0 / Avg 243.2 / Max 365.1
GROMACS 2023 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 36.0 / Avg 45.6 / Max 57.0
  RTX 4080 16GB -Pny: Min 35.0 / Avg 48.9 / Max 62.0
  RTX 3090 24GB -Zotac: Min 48.0 / Avg 71.1 / Max 82.0
  RTX 2080 Ti 22GB -Dell: Min 59.0 / Avg 73.6 / Max 80.0
Hashcat 6.2.4 - Benchmark: MD5 (H/s, More Is Better)
  RTX 4090 24GB -Nvidia: 154333333333 (SE +/- 243127767.05, N = 6)
  RTX 4080 16GB -Pny: 96891966667 (SE +/- 34207656.71, N = 6)
  RTX 3090 24GB -Zotac: 63700926667 (SE +/- 474267340.07, N = 15)
  RTX 2080 Ti 22GB -Dell: 51503250000 (SE +/- 232076903.56, N = 6)
Hashcat 6.2.4 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 6.8 / Avg 142.4 / Max 304.5
  RTX 2080 Ti 22GB -Dell: Min 21.1 / Avg 147.0 / Max 255.1
  RTX 3090 24GB -Zotac: Min 20.4 / Avg 191.9 / Max 326.0
  RTX 4090 24GB -Nvidia: Min 6.7 / Avg 197.8 / Max 444.1
Hashcat 6.2.4 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 31.0 / Avg 45.7 / Max 63.0
  RTX 4090 24GB -Nvidia: Min 36.0 / Avg 49.1 / Max 65.0
  RTX 2080 Ti 22GB -Dell: Min 50.0 / Avg 65.0 / Max 74.0
  RTX 3090 24GB -Zotac: Min 50.0 / Avg 67.5 / Max 82.0
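The H/s figures above count hash evaluations per second. A tiny single-threaded CPU sketch of the same measurement idea (a hypothetical helper for illustration only; Hashcat's hand-tuned GPU kernels are orders of magnitude faster than a Python loop):

```python
import hashlib
import time

def md5_hashrate(seconds=0.25, msg=b"benchmark" * 8):
    """Count MD5 digests computed in a fixed wall-clock window."""
    count = 0
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        hashlib.md5(msg).digest()
        count += 1
    return count / seconds  # hashes per second
```

Against the ~154 GH/s the RTX 4090 posts here, a CPU-side Python loop like this manages only on the order of millions of hashes per second.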
Hashcat 6.2.4 - Benchmark: SHA1 (H/s, More Is Better)
  RTX 4090 24GB -Nvidia: 49393366667 (SE +/- 18148896.51, N = 6)
  RTX 4080 16GB -Pny: 30881533333 (SE +/- 19028429.02, N = 6)
  RTX 3090 24GB -Zotac: 20634050000 (SE +/- 15661518.66, N = 6)
  RTX 2080 Ti 22GB -Dell: 16300900000 (SE +/- 39682498.24, N = 6)
Hashcat 6.2.4 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 12.7 / Avg 148.9 / Max 281.6
  RTX 2080 Ti 22GB -Dell: Min 22.5 / Avg 154.3 / Max 261.7
  RTX 4090 24GB -Nvidia: Min 6.7 / Avg 190.1 / Max 401.1
  RTX 3090 24GB -Zotac: Min 19.1 / Avg 192.7 / Max 330.0
Hashcat 6.2.4 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 34.0 / Avg 47.5 / Max 62.0
  RTX 4090 24GB -Nvidia: Min 40.0 / Avg 50.8 / Max 64.0
  RTX 3090 24GB -Zotac: Min 46.0 / Avg 65.4 / Max 78.0
  RTX 2080 Ti 22GB -Dell: Min 58.0 / Avg 68.8 / Max 76.0
Hashcat 6.2.4 - Benchmark: SHA-512 (H/s, More Is Better)
  RTX 4090 24GB -Nvidia: 6297766667 (SE +/- 2101057.93, N = 6)
  RTX 4080 16GB -Pny: 3936350000 (SE +/- 1634982.16, N = 6)
  RTX 3090 24GB -Zotac: 2595250000 (SE +/- 2013247.79, N = 6)
  RTX 2080 Ti 22GB -Dell: 2047183333 (SE +/- 6223365.47, N = 6)
Hashcat 6.2.4 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 2080 Ti 22GB -Dell: Min 22.3 / Avg 156.5 / Max 321.2
  RTX 4080 16GB -Pny: Min 12.9 / Avg 157.1 / Max 290.4
  RTX 3090 24GB -Zotac: Min 20.5 / Avg 192.6 / Max 326.5
  RTX 4090 24GB -Nvidia: Min 6.8 / Avg 231.7 / Max 417.6
Hashcat 6.2.4 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 35.0 / Avg 48.9 / Max 64.0
  RTX 4090 24GB -Nvidia: Min 41.0 / Avg 52.5 / Max 65.0
  RTX 3090 24GB -Zotac: Min 45.0 / Avg 65.2 / Max 78.0
  RTX 2080 Ti 22GB -Dell: Min 58.0 / Avg 69.2 / Max 76.0
Hashcat 6.2.4 - Benchmark: 7-Zip (H/s, More Is Better)
  RTX 4090 24GB -Nvidia: 2746938 (SE +/- 1870.54, N = 8)
  RTX 4080 16GB -Pny: 1747350 (SE +/- 1869.68, N = 8)
  RTX 3090 24GB -Zotac: 1095950 (SE +/- 1997.77, N = 8)
  RTX 2080 Ti 22GB -Dell: 840275 (SE +/- 1608.10, N = 8)
Hashcat 6.2.4 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 13.2 / Avg 101.3 / Max 323.7
  RTX 2080 Ti 22GB -Dell: Min 22.9 / Avg 121.2 / Max 333.4
  RTX 4090 24GB -Nvidia: Min 6.8 / Avg 144.6 / Max 456.1
  RTX 3090 24GB -Zotac: Min 19.1 / Avg 155.4 / Max 349.1
Hashcat 6.2.4 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 36.0 / Avg 43.0 / Max 63.0
  RTX 4090 24GB -Nvidia: Min 41.0 / Avg 46.6 / Max 62.0
  RTX 3090 24GB -Zotac: Min 45.0 / Avg 60.4 / Max 76.0
  RTX 2080 Ti 22GB -Dell: Min 59.0 / Avg 65.6 / Max 73.0
Hashcat 6.2.4 - Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better)
  RTX 4090 24GB -Nvidia: 1857800 (SE +/- 13557.46, N = 7)
  RTX 4080 16GB -Pny: 1146560 (SE +/- 8051.58, N = 15)
  RTX 3090 24GB -Zotac: 731327 (SE +/- 5945.49, N = 15)
  RTX 2080 Ti 22GB -Dell: 580260 (SE +/- 4392.30, N = 10)
Hashcat 6.2.4 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 12.4 / Avg 94.7 / Max 308.4
  RTX 2080 Ti 22GB -Dell: Min 22.8 / Avg 115.4 / Max 347.3
  RTX 4090 24GB -Nvidia: Min 6.6 / Avg 136.8 / Max 444.0
  RTX 3090 24GB -Zotac: Min 19.1 / Avg 138.5 / Max 326.4
Hashcat 6.2.4 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 35.0 / Avg 42.8 / Max 63.0
  RTX 4090 24GB -Nvidia: Min 39.0 / Avg 44.8 / Max 62.0
  RTX 3090 24GB -Zotac: Min 47.0 / Avg 60.7 / Max 77.0
  RTX 2080 Ti 22GB -Dell: Min 59.0 / Avg 65.3 / Max 73.0
IndigoBench 4.4 - Acceleration: OpenCL GPU - Scene: Supercar (M samples/s, More Is Better)
  RTX 4090 24GB -Nvidia: 78.45 (SE +/- 0.01, N = 3)
  RTX 4080 16GB -Pny: 66.03 (SE +/- 0.07, N = 3)
  RTX 3090 24GB -Zotac: 51.70 (SE +/- 0.02, N = 3)
  RTX 2080 Ti 22GB -Dell: 32.60 (SE +/- 0.05, N = 3)
IndigoBench 4.4 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 13.0 / Avg 159.4 / Max 207.9
  RTX 4090 24GB -Nvidia: Min 6.7 / Avg 172.9 / Max 252.7
  RTX 2080 Ti 22GB -Dell: Min 21.3 / Avg 211.7 / Max 259.5
  RTX 3090 24GB -Zotac: Min 20.9 / Avg 282.9 / Max 356.8
IndigoBench 4.4 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 42.0 / Avg 49.4 / Max 52.0
  RTX 4080 16GB -Pny: Min 40.0 / Avg 51.9 / Max 56.0
  RTX 2080 Ti 22GB -Dell: Min 57.0 / Avg 76.0 / Max 81.0
  RTX 3090 24GB -Zotac: Min 52.0 / Avg 77.6 / Max 83.0
IndigoBench 4.4 - Acceleration: OpenCL GPU - Scene: Bedroom (M samples/s, More Is Better)
  RTX 4090 24GB -Nvidia: 35.30 (SE +/- 0.03, N = 3)
  RTX 4080 16GB -Pny: 26.08 (SE +/- 0.02, N = 3)
  RTX 3090 24GB -Zotac: 20.77 (SE +/- 0.01, N = 3)
  RTX 2080 Ti 22GB -Dell: 11.15 (SE +/- 0.00, N = 3)
IndigoBench 4.4 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 14.0 / Avg 180.5 / Max 230.3
  RTX 4090 24GB -Nvidia: Min 6.7 / Avg 211.2 / Max 296.7
  RTX 2080 Ti 22GB -Dell: Min 23.1 / Avg 213.7 / Max 254.5
  RTX 3090 24GB -Zotac: Min 20.2 / Avg 287.2 / Max 353.5
IndigoBench 4.4 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 41.0 / Avg 51.7 / Max 57.0
  RTX 4080 16GB -Pny: Min 42.0 / Avg 54.0 / Max 59.0
  RTX 2080 Ti 22GB -Dell: Min 61.0 / Avg 76.6 / Max 82.0
  RTX 3090 24GB -Zotac: Min 50.0 / Avg 77.7 / Max 84.0
LeelaChessZero 0.28 - Backend: OpenCL (Nodes Per Second, More Is Better)
  RTX 4080 16GB -Pny: 15316 (SE +/- 126.87, N = 3)
  RTX 4090 24GB -Nvidia: 15097 (SE +/- 198.64, N = 3)
  RTX 3090 24GB -Zotac: 14316 (SE +/- 52.37, N = 3)
  RTX 2080 Ti 22GB -Dell: 13228 (SE +/- 135.58, N = 4)
  1. (CXX) g++ options: -flto -pthread
LeelaChessZero 0.28 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 14.0 / Avg 117.3 / Max 160.6
  RTX 4090 24GB -Nvidia: Min 6.5 / Avg 122.6 / Max 165.3
  RTX 2080 Ti 22GB -Dell: Min 21.9 / Avg 229.3 / Max 264.6
  RTX 3090 24GB -Zotac: Min 28.5 / Avg 263.6 / Max 349.6
LeelaChessZero 0.28 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 36.0 / Avg 43.6 / Max 46.0
  RTX 4080 16GB -Pny: Min 38.0 / Avg 47.0 / Max 50.0
  RTX 3090 24GB -Zotac: Min 53.0 / Avg 75.4 / Max 83.0
  RTX 2080 Ti 22GB -Dell: Min 55.0 / Avg 79.4 / Max 82.0
MandelGPU 1.3pts1 - OpenCL Device: GPU (Samples/sec, More Is Better)
  RTX 4090 24GB -Nvidia: 951815274.2 (SE +/- 3825360.26, N = 12)
  RTX 4080 16GB -Pny: 772392077.3 (SE +/- 1340958.21, N = 12)
  RTX 3090 24GB -Zotac: 568293355.5 (SE +/- 959338.71, N = 11)
  RTX 2080 Ti 22GB -Dell: 442866316.9 (SE +/- 1540493.49, N = 10)
  1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
MandelGPU 1.3pts1 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 14.2 / Avg 65.5 / Max 184.4
  RTX 4090 24GB -Nvidia: Min 6.9 / Avg 68.6 / Max 215.0
  RTX 2080 Ti 22GB -Dell: Min 22.6 / Avg 126.4 / Max 254.8
  RTX 3090 24GB -Zotac: Min 19.7 / Avg 149.1 / Max 321.4
MandelGPU 1.3pts1 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 36.0 / Avg 38.8 / Max 46.0
  RTX 4080 16GB -Pny: Min 37.0 / Avg 40.9 / Max 52.0
  RTX 3090 24GB -Zotac: Min 50.0 / Avg 62.9 / Max 79.0
  RTX 2080 Ti 22GB -Dell: Min 60.0 / Avg 67.6 / Max 74.0
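MandelGPU's Samples/sec metric counts Mandelbrot evaluations per second. The core escape-time loop each GPU thread runs is simple (a minimal Python sketch of the standard algorithm, not MandelGPU's actual OpenCL kernel):

```python
def mandel_escape(cr, ci, max_iter=256):
    """Iterate z -> z^2 + c; return the iteration count at escape (|z| > 2),
    or max_iter if the point never escapes (i.e. it is in the set)."""
    zr = zi = 0.0
    for i in range(max_iter):
        zr, zi = zr * zr - zi * zi + cr, 2.0 * zr * zi + ci
        if zr * zr + zi * zi > 4.0:
            return i
    return max_iter
```

The benchmark evaluates this per-pixel loop massively in parallel across the complex plane, which is why it scales almost purely with shader throughput.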
NAMD CUDA 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
  RTX 4090 24GB -Nvidia: 0.04850 (SE +/- 0.00119, N = 15)
  RTX 4080 16GB -Pny: 0.05427 (SE +/- 0.00113, N = 15)
  RTX 3090 24GB -Zotac: 0.06742 (SE +/- 0.00094, N = 3)
  RTX 2080 Ti 22GB -Dell: 0.09515 (SE +/- 0.00074, N = 15)
NAMD CUDA 2.14 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 6.4 / Avg 54.4 / Max 281.7
  RTX 3090 24GB -Zotac: Min 19.0 / Avg 68.9 / Max 327.9
  RTX 4080 16GB -Pny: Min 13.4 / Avg 69.1 / Max 263.4
  RTX 2080 Ti 22GB -Dell: Min 23.6 / Avg 127.2 / Max 297.8
NAMD CUDA 2.14 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 34.0 / Avg 37.3 / Max 44.0
  RTX 4080 16GB -Pny: Min 36.0 / Avg 39.5 / Max 52.0
  RTX 3090 24GB -Zotac: Min 47.0 / Avg 54.3 / Max 78.0
  RTX 2080 Ti 22GB -Dell: Min 60.0 / Avg 67.3 / Max 74.0
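NAMD here reports days/ns (wall-clock days needed to simulate one nanosecond), where lower is better; many NAMD discussions instead quote ns/day, which is simply the reciprocal:

```python
def ns_per_day(days_per_ns):
    """Convert NAMD's days/ns metric to the more common ns/day."""
    return 1.0 / days_per_ns

# e.g. the RTX 4090's 0.04850 days/ns above is about 20.6 ns/day,
# versus roughly 10.5 ns/day for the RTX 2080 Ti's 0.09515
```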
NCNN 20220729 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
  RTX 4090 24GB -Nvidia: 9.21 (SE +/- 0.03, N = 3; MIN: 8.39 / MAX: 47.45)
  RTX 4080 16GB -Pny: 9.36 (SE +/- 0.06, N = 3; MIN: 8.28 / MAX: 49.63)
  RTX 3090 24GB -Zotac: 37.93 (SE +/- 0.21, N = 3; MIN: 14.41 / MAX: 61.76)
  RTX 2080 Ti 22GB -Dell: 38.20 (SE +/- 0.18, N = 3; MIN: 10.75 / MAX: 66.72)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread (applies to all NCNN results below)
NCNN 20220729 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  RTX 4080 16GB -Pny: 4.32 (SE +/- 0.10, N = 3; MIN: 3.68 / MAX: 23.77)
  RTX 4090 24GB -Nvidia: 4.42 (SE +/- 0.01, N = 3; MIN: 3.68 / MAX: 20.27)
  RTX 2080 Ti 22GB -Dell: 6.35 (SE +/- 0.39, N = 3; MIN: 4.5 / MAX: 27.11)
  RTX 3090 24GB -Zotac: 18.25 (SE +/- 0.28, N = 3; MIN: 7.4 / MAX: 39.06)
NCNN 20220729 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  RTX 4090 24GB -Nvidia: 4.97 (SE +/- 0.06, N = 3; MIN: 4.23 / MAX: 23.03)
  RTX 4080 16GB -Pny: 5.02 (SE +/- 0.11, N = 3; MIN: 4.12 / MAX: 23.74)
  RTX 2080 Ti 22GB -Dell: 18.77 (SE +/- 0.83, N = 3; MIN: 7.68 / MAX: 35.81)
  RTX 3090 24GB -Zotac: 18.78 (SE +/- 0.18, N = 3; MIN: 8.92 / MAX: 34.42)
NCNN 20220729 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  RTX 4080 16GB -Pny: 4.78 (SE +/- 0.01, N = 3; MIN: 4.03 / MAX: 23.06)
  RTX 4090 24GB -Nvidia: 4.85 (SE +/- 0.04, N = 3; MIN: 4.28 / MAX: 25.38)
  RTX 2080 Ti 22GB -Dell: 6.11 (SE +/- 0.28, N = 3; MIN: 4.77 / MAX: 33.2)
  RTX 3090 24GB -Zotac: 17.57 (SE +/- 0.06, N = 3; MIN: 8.24 / MAX: 39.36)
NCNN 20220729 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
  RTX 4080 16GB -Pny: 4.37 (SE +/- 0.02, N = 3; MIN: 3.71 / MAX: 23.07)
  RTX 4090 24GB -Nvidia: 4.39 (SE +/- 0.10, N = 2; MIN: 3.79 / MAX: 22)
  RTX 2080 Ti 22GB -Dell: 7.03 (SE +/- 0.73, N = 3; MIN: 4.55 / MAX: 25.62)
  RTX 3090 24GB -Zotac: 19.16 (SE +/- 0.13, N = 3; MIN: 7.92 / MAX: 33.04)
NCNN 20220729 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  RTX 4080 16GB -Pny: 6.30 (SE +/- 0.10, N = 3; MIN: 5.32 / MAX: 26.26)
  RTX 4090 24GB -Nvidia: 6.42 (SE +/- 0.36, N = 3; MIN: 5.35 / MAX: 28.33)
  RTX 3090 24GB -Zotac: 23.97 (SE +/- 0.63, N = 3; MIN: 10.15 / MAX: 37.01)
  RTX 2080 Ti 22GB -Dell: 26.09 (SE +/- 0.20, N = 3; MIN: 12.37 / MAX: 40.64)
NCNN 20220729 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
  RTX 4080 16GB -Pny: 3.89 (SE +/- 0.10, N = 3; MIN: 3.33 / MAX: 36.65)
  RTX 4090 24GB -Nvidia: 3.95 (SE +/- 0.08, N = 3; MIN: 3.22 / MAX: 30.17)
  RTX 3090 24GB -Zotac: 5.38 (SE +/- 0.22, N = 3; MIN: 4.06 / MAX: 22.14)
  RTX 2080 Ti 22GB -Dell: 5.49 (SE +/- 0.07, N = 3; MIN: 3.72 / MAX: 33.58)
NCNN 20220729 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
  RTX 4080 16GB -Pny: 5.71 (SE +/- 0.11, N = 3; MIN: 5.03 / MAX: 22.85)
  RTX 4090 24GB -Nvidia: 5.77 (SE +/- 0.10, N = 3; MIN: 4.99 / MAX: 22.76)
  RTX 2080 Ti 22GB -Dell: 28.22 (SE +/- 0.49, N = 3; MIN: 15.18 / MAX: 38.54)
  RTX 3090 24GB -Zotac: 28.44 (SE +/- 0.13, N = 3; MIN: 14.74 / MAX: 44.82)
NCNN 20220729 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
  RTX 3090 24GB -Zotac: 4.40 (SE +/- 0.11, N = 3; MIN: 3.72 / MAX: 24.39)
  RTX 2080 Ti 22GB -Dell: 7.07 (SE +/- 1.00, N = 3; MIN: 4.46 / MAX: 48.35)
  RTX 4080 16GB -Pny: 7.90 (SE +/- 3.46, N = 3; MIN: 3.84 / MAX: 36.49)
  RTX 4090 24GB -Nvidia: 25.14 (SE +/- 0.10, N = 3; MIN: 12.85 / MAX: 33.3)
NCNN 20220729 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
  RTX 4090 24GB -Nvidia: 4.50 (SE +/- 0.10, N = 3; MIN: 3.89 / MAX: 21.9)
  RTX 4080 16GB -Pny: 4.78 (SE +/- 0.06, N = 3; MIN: 4.18 / MAX: 28.49)
  RTX 3090 24GB -Zotac: 21.45 (SE +/- 0.21, N = 3; MIN: 8.94 / MAX: 40.36)
  RTX 2080 Ti 22GB -Dell: 22.13 (SE +/- 0.55, N = 3; MIN: 9.01 / MAX: 43.23)
NCNN 20220729 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
  RTX 4090 24GB -Nvidia: 5.08 (SE +/- 0.08, N = 3; MIN: 3.13 / MAX: 27.91)
  RTX 4080 16GB -Pny: 17.57 (SE +/- 0.19, N = 3; MIN: 7.8 / MAX: 33.45)
  RTX 3090 24GB -Zotac: 21.31 (SE +/- 0.22, N = 3; MIN: 7.31 / MAX: 35.87)
  RTX 2080 Ti 22GB -Dell: 21.50 (SE +/- 0.30, N = 3; MIN: 3.33 / MAX: 40.17)
NCNN 20220729 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
  RTX 4080 16GB -Pny: 5.16 (SE +/- 0.04, N = 3; MIN: 4.78 / MAX: 21.85)
  RTX 2080 Ti 22GB -Dell: 5.85 (SE +/- 0.14, N = 3; MIN: 4.75 / MAX: 26.01)
  RTX 3090 24GB -Zotac: 6.35 (SE +/- 0.14, N = 3; MIN: 5 / MAX: 21.95)
  RTX 4090 24GB -Nvidia: 7.28 (SE +/- 0.77, N = 3; MIN: 3.88 / MAX: 30.89)
NCNN 20220729 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
  RTX 4090 24GB -Nvidia: 35.23 (SE +/- 0.10, N = 3; MIN: 11.4 / MAX: 59.2)
  RTX 4080 16GB -Pny: 35.87 (SE +/- 0.08, N = 3; MIN: 11.56 / MAX: 61.96)
  RTX 3090 24GB -Zotac: 49.10 (SE +/- 0.40, N = 3; MIN: 17.59 / MAX: 76.07)
  RTX 2080 Ti 22GB -Dell: 52.48 (SE +/- 0.13, N = 3; MIN: 20.11 / MAX: 75.46)
NCNN 20220729 - Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  RTX 4090 24GB -Nvidia: 8.99 (SE +/- 0.05, N = 3; MIN: 8.18 / MAX: 62.47)
  RTX 4080 16GB -Pny: 9.13 (SE +/- 0.11, N = 3; MIN: 7.71 / MAX: 66.82)
  RTX 3090 24GB -Zotac: 57.27 (SE +/- 0.43, N = 3; MIN: 16.86 / MAX: 94.95)
  RTX 2080 Ti 22GB -Dell: 61.74 (SE +/- 0.50, N = 3; MIN: 23.2 / MAX: 85.24)
NCNN 20220729 - Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
  RTX 4080 16GB -Pny: 5.32 (SE +/- 0.02, N = 3; MIN: 4.68 / MAX: 22.84)
  RTX 4090 24GB -Nvidia: 5.46 (SE +/- 0.04, N = 3; MIN: 4.78 / MAX: 28.36)
  RTX 2080 Ti 22GB -Dell: 23.13 (SE +/- 2.01, N = 3; MIN: 8.74 / MAX: 39.68)
  RTX 3090 24GB -Zotac: 25.04 (SE +/- 0.25, N = 3; MIN: 11.25 / MAX: 44.6)
NCNN 20220729 - Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better)
  RTX 4090 24GB -Nvidia: 289.04 (SE +/- 1.34, N = 3; MIN: 260.04 / MAX: 661.43)
  RTX 4080 16GB -Pny: 290.15 (SE +/- 3.05, N = 3; MIN: 247.6 / MAX: 837.91)
  RTX 3090 24GB -Zotac: 324.88 (SE +/- 0.55, N = 3; MIN: 283.26 / MAX: 423.43)
  RTX 2080 Ti 22GB -Dell: 363.75 (SE +/- 0.57, N = 3; MIN: 316.04 / MAX: 471.43)
NCNN 20220729 - Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better)
  RTX 4090 24GB -Nvidia: 5.08 (SE +/- 0.04, N = 3; MIN: 4.3 / MAX: 23.08)
  RTX 4080 16GB -Pny: 5.18 (SE +/- 0.16, N = 3; MIN: 4.28 / MAX: 25.38)
  RTX 2080 Ti 22GB -Dell: 6.80 (SE +/- 0.10, N = 3; MIN: 5.02 / MAX: 32.78)
  RTX 3090 24GB -Zotac: 13.52 (SE +/- 3.64, N = 3; MIN: 5.33 / MAX: 35.07)
NCNN 20220729 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 6.4 / Avg 13.6 / Max 59.6
  RTX 4080 16GB -Pny: Min 4.2 / Avg 18.5 / Max 66.4
  RTX 2080 Ti 22GB -Dell: Min 19.2 / Avg 31.7 / Max 108.8
  RTX 3090 24GB -Zotac: Min 18.8 / Avg 36.1 / Max 148.3
NCNN 20220729 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 33.0 / Avg 37.4 / Max 45.0
  RTX 4080 16GB -Pny: Min 35.0 / Avg 38.2 / Max 43.0
  RTX 2080 Ti 22GB -Dell: Min 46.0 / Avg 49.8 / Max 61.0
  RTX 3090 24GB -Zotac: Min 42.0 / Avg 51.6 / Max 64.0
NeatBench 5 - Acceleration: GPU (FPS, More Is Better)
  RTX 4090 24GB -Nvidia: 4090 (SE +/- 0.00, N = 14)
  RTX 4080 16GB -Pny: 4080 (SE +/- 0.00, N = 14)
  RTX 3090 24GB -Zotac: 3090 (SE +/- 0.00, N = 13)
  RTX 2080 Ti 22GB -Dell: 2080 (SE +/- 0.00, N = 13)
NeatBench 5 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 4.6 / Avg 27.9 / Max 89.5
  RTX 4090 24GB -Nvidia: Min 6.9 / Avg 36.3 / Max 73.8
  RTX 2080 Ti 22GB -Dell: Min 22.1 / Avg 68.0 / Max 119.6
  RTX 3090 24GB -Zotac: Min 23.6 / Avg 85.4 / Max 168.2
NeatBench 5 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 33.0 / Avg 34.4 / Max 36.0
  RTX 4080 16GB -Pny: Min 34.0 / Avg 34.9 / Max 37.0
  RTX 3090 24GB -Zotac: Min 50.0 / Avg 57.3 / Max 63.0
  RTX 2080 Ti 22GB -Dell: Min 56.0 / Avg 58.3 / Max 61.0
OctaneBench 2020.1 - Total Score (Score, More Is Better)
  RTX 4090 24GB -Nvidia: 1326.24
  RTX 4080 16GB -Pny: 977.53
  RTX 3090 24GB -Zotac: 669.37
  RTX 2080 Ti 22GB -Dell: 349.49
OctaneBench 2020.1 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 12.7 / Avg 220.4 / Max 259.6
  RTX 2080 Ti 22GB -Dell: Min 22.6 / Avg 242.0 / Max 259.3
  RTX 4090 24GB -Nvidia: Min 6.8 / Avg 279.5 / Max 322.5
  RTX 3090 24GB -Zotac: Min 28.6 / Avg 337.3 / Max 362.8
OctaneBench 2020.1 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 35.0 / Avg 54.7 / Max 59.0
  RTX 4090 24GB -Nvidia: Min 35.0 / Avg 55.0 / Max 60.0
  RTX 3090 24GB -Zotac: Min 50.0 / Avg 79.3 / Max 82.0
  RTX 2080 Ti 22GB -Dell: Min 58.0 / Avg 80.0 / Max 83.0
RealSR-NCNN 20200818 - Scale: 4x - TAA: Yes (Seconds, Fewer Is Better)
  RTX 4090 24GB -Nvidia: 20.19 (SE +/- 0.04, N = 3)
  RTX 4080 16GB -Pny: 24.37 (SE +/- 0.04, N = 3)
  RTX 3090 24GB -Zotac: 31.17 (SE +/- 0.18, N = 3)
  RTX 2080 Ti 22GB -Dell: 52.76 (SE +/- 0.07, N = 3)
RealSR-NCNN 20200818 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 13.9 / Avg 221.3 / Max 300.8
  RTX 2080 Ti 22GB -Dell: Min 20.4 / Avg 225.5 / Max 257.1
  RTX 4090 24GB -Nvidia: Min 6.6 / Avg 234.7 / Max 332.3
  RTX 3090 24GB -Zotac: Min 33.0 / Avg 292.5 / Max 356.8
RealSR-NCNN 20200818 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 41.0 / Avg 51.6 / Max 56.0
  RTX 4080 16GB -Pny: Min 42.0 / Avg 57.3 / Max 64.0
  RTX 2080 Ti 22GB -Dell: Min 46.0 / Avg 74.2 / Max 80.0
  RTX 3090 24GB -Zotac: Min 52.0 / Avg 74.9 / Max 84.0
RealSR-NCNN 20200818 - Scale: 4x - TAA: No (Seconds, Fewer Is Better)
  RTX 4090 24GB -Nvidia: 5.109 (SE +/- 0.018, N = 7)
  RTX 4080 16GB -Pny: 5.630 (SE +/- 0.015, N = 7)
  RTX 3090 24GB -Zotac: 6.509 (SE +/- 0.015, N = 6)
  RTX 2080 Ti 22GB -Dell: 9.239 (SE +/- 0.025, N = 5)
RealSR-NCNN 20200818 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 14.1 / Avg 102.2 / Max 294.6
  RTX 4090 24GB -Nvidia: Min 7.1 / Avg 105.0 / Max 327.6
  RTX 2080 Ti 22GB -Dell: Min 23.3 / Avg 150.8 / Max 256.7
  RTX 3090 24GB -Zotac: Min 28.7 / Avg 166.5 / Max 354.3
RealSR-NCNN 20200818 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 42.0 / Avg 45.1 / Max 53.0
  RTX 4080 16GB -Pny: Min 42.0 / Avg 47.6 / Max 58.0
  RTX 3090 24GB -Zotac: Min 48.0 / Avg 61.7 / Max 77.0
  RTX 2080 Ti 22GB -Dell: Min 60.0 / Avg 69.3 / Max 76.0
Rodinia 3.1 - Test: OpenCL Particle Filter (Seconds, Fewer Is Better)
  RTX 4090 24GB -Nvidia: 2.085 (SE +/- 0.080, N = 3)
  RTX 4080 16GB -Pny: 2.685 (SE +/- 0.005, N = 10)
  RTX 3090 24GB -Zotac: 3.768 (SE +/- 0.029, N = 10)
  RTX 2080 Ti 22GB -Dell: 4.527 (SE +/- 0.031, N = 8)
  1. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl
Rodinia 3.1 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 14.5 / Avg 63.1 / Max 123.6
  RTX 4090 24GB -Nvidia: Min 6.8 / Avg 67.5 / Max 167.3
  RTX 2080 Ti 22GB -Dell: Min 23.3 / Avg 133.0 / Max 219.9
  RTX 3090 24GB -Zotac: Min 19.9 / Avg 148.1 / Max 254.9
Rodinia 3.1 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 39.0 / Avg 43.8 / Max 49.0
  RTX 4090 24GB -Nvidia: Min 41.0 / Avg 44.6 / Max 50.0
  RTX 3090 24GB -Zotac: Min 52.0 / Avg 64.9 / Max 77.0
  RTX 2080 Ti 22GB -Dell: Min 62.0 / Avg 69.2 / Max 74.0
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Download (GB/s, More Is Better)
  RTX 3090 24GB -Zotac: 25.19 (SE +/- 0.02, N = 13)
  RTX 4090 24GB -Nvidia: 25.13 (SE +/- 0.02, N = 13)
  RTX 4080 16GB -Pny: 25.10 (SE +/- 0.02, N = 13)
  RTX 2080 Ti 22GB -Dell: 12.85 (SE +/- 0.00, N = 12)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC 2020-04-17 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 13.7 / Avg 37.0 / Max 60.9
  RTX 4090 24GB -Nvidia: Min 6.4 / Avg 43.4 / Max 72.8
  RTX 2080 Ti 22GB -Dell: Min 20.1 / Avg 63.6 / Max 102.9
  RTX 3090 24GB -Zotac: Min 20.3 / Avg 93.5 / Max 158.5
SHOC 2020-04-17 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 34.0 / Avg 36.1 / Max 38.0
  RTX 4080 16GB -Pny: Min 38.0 / Avg 40.1 / Max 42.0
  RTX 2080 Ti 22GB -Dell: Min 44.0 / Avg 49.7 / Max 54.0
  RTX 3090 24GB -Zotac: Min 55.0 / Avg 61.1 / Max 68.0
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better)
  RTX 3090 24GB -Zotac: 26.40 (SE +/- 0.00, N = 13)
  RTX 4080 16GB -Pny: 26.39 (SE +/- 0.00, N = 14)
  RTX 4090 24GB -Nvidia: 26.37 (SE +/- 0.00, N = 13)
  RTX 2080 Ti 22GB -Dell: 13.20 (SE +/- 0.00, N = 12)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC 2020-04-17 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 13.9 / Avg 38.2 / Max 62.5
  RTX 4090 24GB -Nvidia: Min 6.6 / Avg 43.8 / Max 72.7
  RTX 2080 Ti 22GB -Dell: Min 20.6 / Avg 66.6 / Max 107.8
  RTX 3090 24GB -Zotac: Min 19.7 / Avg 86.6 / Max 154.5
SHOC 2020-04-17 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 37.0 / Avg 39.1 / Max 41.0
  RTX 4080 16GB -Pny: Min 41.0 / Avg 42.7 / Max 45.0
  RTX 2080 Ti 22GB -Dell: Min 51.0 / Avg 55.1 / Max 59.0
  RTX 3090 24GB -Zotac: Min 52.0 / Avg 58.8 / Max 65.0
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, More Is Better)
  RTX 4090 24GB -Nvidia: 88606.0 (SE +/- 19.25, N = 3)
  RTX 4080 16GB -Pny: 55455.8 (SE +/- 73.47, N = 3)
  RTX 3090 24GB -Zotac: 37738.4 (SE +/- 292.47, N = 3)
  RTX 2080 Ti 22GB -Dell: 16106.4 (SE +/- 102.77, N = 3)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC 2020-04-17 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 13.84 / Avg 90.17 / Max 331.21
  RTX 4090 24GB -Nvidia: Min 6.77 / Avg 116.7 / Max 447.83
  RTX 2080 Ti 22GB -Dell: Min 21.69 / Avg 175.84 / Max 257.38
  RTX 3090 24GB -Zotac: Min 20.35 / Avg 201.61 / Max 329.19
SHOC 2020-04-17 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 38 / Avg 43.44 / Max 73
  RTX 4090 24GB -Nvidia: Min 39 / Avg 43.63 / Max 66
  RTX 3090 24GB -Zotac: Min 52 / Avg 70.48 / Max 82
  RTX 2080 Ti 22GB -Dell: Min 55 / Avg 73.85 / Max 82
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
  RTX 4080 16GB -Pny: 3065.76 (SE +/- 2.39, N = 7)
  RTX 4090 24GB -Nvidia: 2980.15 (SE +/- 0.95, N = 6)
  RTX 3090 24GB -Zotac: 2148.86 (SE +/- 0.38, N = 3)
  RTX 2080 Ti 22GB -Dell: 1147.26 (SE +/- 0.54, N = 3)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC 2020-04-17 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 13.5 / Avg 98.3 / Max 224.9
  RTX 4090 24GB -Nvidia: Min 7.0 / Avg 125.0 / Max 250.8
  RTX 2080 Ti 22GB -Dell: Min 22.2 / Avg 184.6 / Max 246.0
  RTX 3090 24GB -Zotac: Min 22.8 / Avg 253.9 / Max 357.5
SHOC 2020-04-17 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 37.0 / Avg 41.8 / Max 47.0
  RTX 4080 16GB -Pny: Min 37.0 / Avg 43.1 / Max 49.0
  RTX 3090 24GB -Zotac: Min 50.0 / Avg 72.2 / Max 79.0
  RTX 2080 Ti 22GB -Dell: Min 59.0 / Avg 74.0 / Max 78.0
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better)
  RTX 4090 24GB -Nvidia: 2779.48 (SE +/- 1.36, N = 12)
  RTX 3090 24GB -Zotac: 2349.39 (SE +/- 0.59, N = 12)
  RTX 4080 16GB -Pny: 1831.86 (SE +/- 2.58, N = 12)
  RTX 2080 Ti 22GB -Dell: 1504.87 (SE +/- 0.70, N = 11)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC 2020-04-17 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 12.2 / Avg 38.7 / Max 72.2
  RTX 4090 24GB -Nvidia: Min 7.0 / Avg 45.5 / Max 106.3
  RTX 2080 Ti 22GB -Dell: Min 23.2 / Avg 73.4 / Max 184.7
  RTX 3090 24GB -Zotac: Min 24.9 / Avg 98.4 / Max 223.7
SHOC 2020-04-17 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 37.0 / Avg 38.0 / Max 39.0
  RTX 4080 16GB -Pny: Min 37.0 / Avg 38.1 / Max 40.0
  RTX 3090 24GB -Zotac: Min 49.0 / Avg 58.0 / Max 65.0
  RTX 2080 Ti 22GB -Dell: Min 58.0 / Avg 60.7 / Max 64.0
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS, More Is Better)
  RTX 4090 24GB -Nvidia: 27175.90 (SE +/- 164.66, N = 13)
  RTX 4080 16GB -Pny: 16952.60 (SE +/- 224.54, N = 15)
  RTX 3090 24GB -Zotac: 7912.59 (SE +/- 16.79, N = 11)
  RTX 2080 Ti 22GB -Dell: 4858.13 (SE +/- 33.05, N = 10)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC 2020-04-17 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4080 16GB -Pny: Min 11.9 / Avg 51.2 / Max 195.6
  RTX 4090 24GB -Nvidia: Min 7.0 / Avg 59.0 / Max 207.2
  RTX 2080 Ti 22GB -Dell: Min 22.3 / Avg 107.8 / Max 225.4
  RTX 3090 24GB -Zotac: Min 25.8 / Avg 138.9 / Max 317.8
SHOC 2020-04-17 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 36.0 / Avg 37.1 / Max 42.0
  RTX 4080 16GB -Pny: Min 36.0 / Avg 38.5 / Max 46.0
  RTX 3090 24GB -Zotac: Min 53.0 / Avg 63.8 / Max 74.0
  RTX 2080 Ti 22GB -Dell: Min 56.0 / Avg 64.4 / Max 71.0
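GEMM GFLOPS figures follow from the standard operation count: a square single-precision matrix multiply of size n performs about 2n^3 floating-point operations (n^3 multiplies plus n^3 adds), divided by runtime. A minimal sketch of that arithmetic (illustrative only; the matrix size SHOC actually uses is not recorded in this result file):

```python
def gemm_gflops(n, seconds):
    """GFLOPS for an n x n x n matrix multiply: ~2*n^3 FLOPs over the runtime."""
    return 2.0 * n**3 / seconds / 1e9

# e.g. a hypothetical 4096x4096 SGEMM finishing in 10 ms would be ~13.7 TFLOPS
```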
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: MD5 Hash (GHash/s, More Is Better)
  RTX 4090 24GB -Nvidia: 93.90 (SE +/- 0.95, N = 15)
  RTX 4080 16GB -Pny: 60.56 (SE +/- 0.65, N = 15)
  RTX 3090 24GB -Zotac: 41.33 (SE +/- 0.05, N = 14)
  RTX 2080 Ti 22GB -Dell: 32.52 (SE +/- 0.06, N = 14)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC 2020-04-17 - GPU Power Consumption Monitor (Watts, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 7.0 / Avg 41.7 / Max 150.7
  RTX 4080 16GB -Pny: Min 12.8 / Avg 44.7 / Max 278.8
  RTX 2080 Ti 22GB -Dell: Min 22.1 / Avg 84.4 / Max 261.2
  RTX 3090 24GB -Zotac: Min 19.5 / Avg 105.9 / Max 323.3
SHOC 2020-04-17 - GPU Temperature Monitor (Celsius, Fewer Is Better)
  RTX 4090 24GB -Nvidia: Min 35.0 / Avg 35.6 / Max 37.0
  RTX 4080 16GB -Pny: Min 36.0 / Avg 38.6 / Max 55.0
  RTX 3090 24GB -Zotac: Min 50.0 / Avg 60.2 / Max 73.0
  RTX 2080 Ti 22GB -Dell: Min 59.0 / Avg 62.4 / Max 68.0
SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Reduction  (GB/s, More Is Better)
    RTX 4080 16GB -Pny:      1020.24  (SE +/- 13.90, N = 15)
    RTX 4090 24GB -Nvidia:    967.14  (SE +/- 9.75, N = 15)
    RTX 3090 24GB -Zotac:     392.80  (SE +/- 0.11, N = 13)
    RTX 2080 Ti 22GB -Dell:   366.06  (SE +/- 0.06, N = 12)
    1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

GPU Power Consumption Monitor  (Watts, Fewer Is Better)  Min / Avg / Max:
    RTX 4080 16GB -Pny:      13.3 / 39.3 / 98.6
    RTX 4090 24GB -Nvidia:    6.4 / 41.6 / 108.0
    RTX 2080 Ti 22GB -Dell:  21.7 / 93.4 / 198.2
    RTX 3090 24GB -Zotac:    20.0 / 125.2 / 273.7

GPU Temperature Monitor  (Celsius, Fewer Is Better)  Min / Avg / Max:
    RTX 4090 24GB -Nvidia:   34.0 / 36.4 / 40.0
    RTX 4080 16GB -Pny:      36.0 / 36.7 / 39.0
    RTX 2080 Ti 22GB -Dell:  57.0 / 62.9 / 68.0
    RTX 3090 24GB -Zotac:    52.0 / 62.9 / 72.0
SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Triad  (GB/s, More Is Better)
    RTX 4090 24GB -Nvidia:   25.07  (SE +/- 0.01, N = 12)
    RTX 4080 16GB -Pny:      24.76  (SE +/- 0.01, N = 12)
    RTX 3090 24GB -Zotac:    24.55  (SE +/- 0.00, N = 12)
    RTX 2080 Ti 22GB -Dell:  12.61  (SE +/- 0.00, N = 12)
    1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

GPU Power Consumption Monitor  (Watts, Fewer Is Better)  Min / Avg / Max:
    RTX 4080 16GB -Pny:      12.1 / 28.5 / 40.3
    RTX 4090 24GB -Nvidia:    6.5 / 33.8 / 55.0
    RTX 2080 Ti 22GB -Dell:  22.0 / 53.4 / 76.5
    RTX 3090 24GB -Zotac:    19.8 / 72.1 / 117.2

GPU Temperature Monitor  (Celsius, Fewer Is Better)  Min / Avg / Max:
    RTX 4080 16GB -Pny:      35.0 / 37.0 / 38.0
    RTX 4090 24GB -Nvidia:   38.0 / 39.2 / 40.0
    RTX 3090 24GB -Zotac:    52.0 / 56.2 / 60.0
    RTX 2080 Ti 22GB -Dell:  57.0 / 59.4 / 62.0
SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: S3D  (GFLOPS, More Is Better)
    RTX 4090 24GB -Nvidia:   644.60  (SE +/- 0.22, N = 12)
    RTX 3090 24GB -Zotac:    427.49  (SE +/- 0.35, N = 12)
    RTX 4080 16GB -Pny:      423.58  (SE +/- 0.38, N = 13)
    RTX 2080 Ti 22GB -Dell:  270.80  (SE +/- 0.15, N = 12)
    1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

GPU Power Consumption Monitor  (Watts, Fewer Is Better)  Min / Avg / Max:
    RTX 4080 16GB -Pny:      12.2 / 30.6 / 47.0
    RTX 4090 24GB -Nvidia:    6.6 / 39.8 / 58.8
    RTX 2080 Ti 22GB -Dell:  21.3 / 56.5 / 80.7
    RTX 3090 24GB -Zotac:    19.8 / 85.0 / 125.6

GPU Temperature Monitor  (Celsius, Fewer Is Better)  Min / Avg / Max:
    RTX 4080 16GB -Pny:      38.0 / 39.5 / 41.0
    RTX 4090 24GB -Nvidia:   40.0 / 41.5 / 43.0
    RTX 2080 Ti 22GB -Dell:  56.0 / 58.0 / 60.0
    RTX 3090 24GB -Zotac:    52.0 / 58.1 / 62.0
ViennaCL 1.7.1

Test: OpenCL BLAS - sCOPY  (GB/s, More Is Better)
    RTX 4090 24GB -Nvidia:   444  (SE +/- 0.88, N = 3)
    RTX 4080 16GB -Pny:      385  (SE +/- 0.33, N = 3)
    RTX 3090 24GB -Zotac:    363  (SE +/- 1.20, N = 3)
    RTX 2080 Ti 22GB -Dell:  306  (SE +/- 1.73, N = 3)

Test: OpenCL BLAS - sAXPY  (GB/s, More Is Better)
    RTX 4090 24GB -Nvidia:   568  (SE +/- 0.00, N = 3)
    RTX 3090 24GB -Zotac:    495  (SE +/- 0.67, N = 3)
    RTX 4080 16GB -Pny:      487  (SE +/- 0.33, N = 3)
    RTX 2080 Ti 22GB -Dell:  392  (SE +/- 0.33, N = 3)

Test: OpenCL BLAS - sDOT  (GB/s, More Is Better)
    RTX 4090 24GB -Nvidia:   446  (SE +/- 0.33, N = 3)
    RTX 4080 16GB -Pny:      417  (SE +/- 0.33, N = 3)
    RTX 3090 24GB -Zotac:    372  (SE +/- 0.33, N = 3)
    RTX 2080 Ti 22GB -Dell:  305  (SE +/- 0.33, N = 3)

Test: OpenCL BLAS - dCOPY  (GB/s, More Is Better)
    RTX 4090 24GB -Nvidia:   661  (SE +/- 0.33, N = 3)
    RTX 3090 24GB -Zotac:    600  (SE +/- 1.20, N = 3)
    RTX 4080 16GB -Pny:      540  (SE +/- 0.33, N = 3)
    RTX 2080 Ti 22GB -Dell:  475  (SE +/- 0.33, N = 3)

Test: OpenCL BLAS - dAXPY  (GB/s, More Is Better)
    RTX 4090 24GB -Nvidia:   772  (SE +/- 0.33, N = 3)
    RTX 3090 24GB -Zotac:    716  (SE +/- 1.00, N = 3)
    RTX 4080 16GB -Pny:      608  (SE +/- 0.33, N = 3)
    RTX 2080 Ti 22GB -Dell:  518  (SE +/- 0.33, N = 3)

Test: OpenCL BLAS - dDOT  (GB/s, More Is Better)
    RTX 4090 24GB -Nvidia:   719  (SE +/- 0.67, N = 3)
    RTX 3090 24GB -Zotac:    658  (SE +/- 0.33, N = 3)
    RTX 4080 16GB -Pny:      597  (SE +/- 0.33, N = 3)
    RTX 2080 Ti 22GB -Dell:  524  (SE +/- 0.33, N = 3)

    1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1

Test: OpenCL BLAS - dGEMV-N  (GB/s, More Is Better)
    RTX 2080 Ti 22GB -Dell:  297  (SE +/- 0.33, N = 3)
    RTX 4080 16GB -Pny:      224  (SE +/- 0.00, N = 3)
    RTX 4090 24GB -Nvidia:   220  (SE +/- 0.00, N = 3)
    RTX 3090 24GB -Zotac:    185  (SE +/- 0.33, N = 3)

Test: OpenCL BLAS - dGEMV-T  (GB/s, More Is Better)
    RTX 4090 24GB -Nvidia:   443  (SE +/- 0.00, N = 2)
    RTX 4080 16GB -Pny:      435  (SE +/- 0.67, N = 3)
    RTX 3090 24GB -Zotac:    371  (SE +/- 0.58, N = 3)
    RTX 2080 Ti 22GB -Dell:  370  (SE +/- 0.58, N = 3)

Test: OpenCL BLAS - dGEMM-NN  (GFLOPs/s, More Is Better)
    RTX 4090 24GB -Nvidia:   1150  (SE +/- 0.00, N = 3)
    RTX 4080 16GB -Pny:       774  (SE +/- 0.88, N = 3)
    RTX 3090 24GB -Zotac:     585  (SE +/- 1.67, N = 3)
    RTX 2080 Ti 22GB -Dell:   459  (SE +/- 0.88, N = 3)

Test: OpenCL BLAS - dGEMM-NT  (GFLOPs/s, More Is Better)
    RTX 4090 24GB -Nvidia:   1280  (SE +/- 0.00, N = 3)
    RTX 4080 16GB -Pny:       795  (SE +/- 1.00, N = 3)
    RTX 3090 24GB -Zotac:     588  (SE +/- 1.67, N = 3)
    RTX 2080 Ti 22GB -Dell:   461  (SE +/- 0.58, N = 3)

Test: OpenCL BLAS - dGEMM-TN  (GFLOPs/s, More Is Better)
    RTX 4090 24GB -Nvidia:   1293  (SE +/- 3.33, N = 3)
    RTX 4080 16GB -Pny:       831  (SE +/- 0.67, N = 3)
    RTX 3090 24GB -Zotac:     587  (SE +/- 1.67, N = 3)
    RTX 2080 Ti 22GB -Dell:   460  (SE +/- 0.50, N = 2)

Test: OpenCL BLAS - dGEMM-TT  (GFLOPs/s, More Is Better)
    RTX 4090 24GB -Nvidia:   1340  (SE +/- 0.00, N = 3)
    RTX 4080 16GB -Pny:       849  (SE +/- 1.20, N = 3)
    RTX 3090 24GB -Zotac:     584  (SE +/- 1.45, N = 3)
    RTX 2080 Ti 22GB -Dell:   458

    1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1
GPU Power Consumption Monitor  (Watts, Fewer Is Better)  Min / Avg / Max:
    RTX 4080 16GB -Pny:      13.4 / 109.2 / 223.4
    RTX 4090 24GB -Nvidia:    6.9 / 128.0 / 256.4
    RTX 2080 Ti 22GB -Dell:  22.4 / 173.6 / 255.0
    RTX 3090 24GB -Zotac:    28.9 / 232.9 / 357.9

GPU Temperature Monitor  (Celsius, Fewer Is Better)  Min / Avg / Max:
    RTX 4090 24GB -Nvidia:   36.0 / 41.5 / 48.0
    RTX 4080 16GB -Pny:      37.0 / 44.0 / 54.0
    RTX 3090 24GB -Zotac:    49.0 / 70.8 / 83.0
    RTX 2080 Ti 22GB -Dell:  59.0 / 72.2 / 78.0
ViennaCL 1.7.1

Test: CPU BLAS - sCOPY  (GB/s, More Is Better)
    RTX 4080 16GB -Pny:      1901  (SE +/- 13.45, N = 12)
    RTX 4090 24GB -Nvidia:   1874  (SE +/- 19.65, N = 5)
    RTX 3090 24GB -Zotac:    1842  (SE +/- 12.27, N = 15)
    RTX 2080 Ti 22GB -Dell:  1755  (SE +/- 17.80, N = 15)

Test: CPU BLAS - sAXPY  (GB/s, More Is Better)
    RTX 4080 16GB -Pny:      2422  (SE +/- 31.16, N = 12)
    RTX 4090 24GB -Nvidia:   2360  (SE +/- 13.04, N = 5)
    RTX 3090 24GB -Zotac:    2272  (SE +/- 18.29, N = 15)
    RTX 2080 Ti 22GB -Dell:  2261  (SE +/- 16.08, N = 15)

Test: CPU BLAS - sDOT  (GB/s, More Is Better)
    RTX 4090 24GB -Nvidia:   784  (SE +/- 7.38, N = 5)
    RTX 4080 16GB -Pny:      783  (SE +/- 5.06, N = 12)
    RTX 3090 24GB -Zotac:    764  (SE +/- 6.38, N = 15)
    RTX 2080 Ti 22GB -Dell:  747  (SE +/- 4.44, N = 15)

Test: CPU BLAS - dCOPY  (GB/s, More Is Better)
    RTX 4080 16GB -Pny:      956  (SE +/- 9.35, N = 12)
    RTX 4090 24GB -Nvidia:   946  (SE +/- 11.59, N = 5)
    RTX 2080 Ti 22GB -Dell:  923  (SE +/- 8.86, N = 15)
    RTX 3090 24GB -Zotac:    834  (SE +/- 7.74, N = 15)

Test: CPU BLAS - dAXPY  (GB/s, More Is Better)
    RTX 4090 24GB -Nvidia:   1098  (SE +/- 22.23, N = 5)
    RTX 4080 16GB -Pny:      1085  (SE +/- 7.44, N = 12)
    RTX 2080 Ti 22GB -Dell:  1058  (SE +/- 5.45, N = 15)
    RTX 3090 24GB -Zotac:     984  (SE +/- 6.89, N = 15)

Test: CPU BLAS - dDOT  (GB/s, More Is Better)
    RTX 4080 16GB -Pny:      678  (SE +/- 8.14, N = 12)
    RTX 4090 24GB -Nvidia:   671  (SE +/- 6.71, N = 5)
    RTX 2080 Ti 22GB -Dell:  658  (SE +/- 4.70, N = 15)
    RTX 3090 24GB -Zotac:    606  (SE +/- 1.68, N = 15)

    1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1

Test: CPU BLAS - dGEMV-N  (GB/s, More Is Better)
    RTX 4090 24GB -Nvidia:   188  (SE +/- 1.74, N = 5)
    RTX 4080 16GB -Pny:      187  (SE +/- 0.91, N = 12)
    RTX 2080 Ti 22GB -Dell:  179  (SE +/- 1.84, N = 15)
    RTX 3090 24GB -Zotac:    172  (SE +/- 8.48, N = 15)

Test: CPU BLAS - dGEMV-T  (GB/s, More Is Better)
    RTX 4080 16GB -Pny:      787  (SE +/- 7.65, N = 12)
    RTX 4090 24GB -Nvidia:   771  (SE +/- 2.58, N = 5)
    RTX 3090 24GB -Zotac:    749  (SE +/- 4.88, N = 15)
    RTX 2080 Ti 22GB -Dell:  742  (SE +/- 6.17, N = 14)

Test: CPU BLAS - dGEMM-NN  (GFLOPs/s, More Is Better)
    RTX 4090 24GB -Nvidia:   120.0  (SE +/- 10.59, N = 5)
    RTX 3090 24GB -Zotac:    107.0  (SE +/- 1.05, N = 15)
    RTX 2080 Ti 22GB -Dell:  106.8  (SE +/- 1.59, N = 15)
    RTX 4080 16GB -Pny:      103.5  (SE +/- 1.29, N = 12)

Test: CPU BLAS - dGEMM-NT  (GFLOPs/s, More Is Better)
    RTX 4080 16GB -Pny:      122  (SE +/- 4.18, N = 12)
    RTX 3090 24GB -Zotac:    120  (SE +/- 4.37, N = 15)
    RTX 2080 Ti 22GB -Dell:  119  (SE +/- 2.19, N = 15)
    RTX 4090 24GB -Nvidia:   115  (SE +/- 0.48, N = 4)

Test: CPU BLAS - dGEMM-TN  (GFLOPs/s, More Is Better)
    RTX 2080 Ti 22GB -Dell:  142  (SE +/- 4.47, N = 15)
    RTX 3090 24GB -Zotac:    141  (SE +/- 5.51, N = 15)
    RTX 4090 24GB -Nvidia:   137  (SE +/- 9.50, N = 5)
    RTX 4080 16GB -Pny:      135  (SE +/- 4.72, N = 12)

Test: CPU BLAS - dGEMM-TT  (GFLOPs/s, More Is Better)
    RTX 2080 Ti 22GB -Dell:  145  (SE +/- 4.93, N = 15)
    RTX 4090 24GB -Nvidia:   144  (SE +/- 9.73, N = 5)
    RTX 3090 24GB -Zotac:    143  (SE +/- 5.83, N = 15)
    RTX 4080 16GB -Pny:      137  (SE +/- 6.46, N = 11)

    1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1
GPU Power Consumption Monitor  (Watts, Fewer Is Better)  Min / Avg / Max:
    RTX 4090 24GB -Nvidia:    6.0 / 8.0 / 15.8
    RTX 4080 16GB -Pny:       4.0 / 12.1 / 18.8
    RTX 2080 Ti 22GB -Dell:  18.4 / 21.6 / 43.4
    RTX 3090 24GB -Zotac:    19.0 / 26.2 / 37.2

GPU Temperature Monitor  (Celsius, Fewer Is Better)  Min / Avg / Max:
    RTX 4090 24GB -Nvidia:   32.0 / 33.9 / 37.0
    RTX 4080 16GB -Pny:      34.0 / 35.7 / 38.0
    RTX 2080 Ti 22GB -Dell:  44.0 / 49.2 / 58.0
    RTX 3090 24GB -Zotac:    48.0 / 53.3 / 57.0
VkFFT 1.1.1  (Benchmark Score, More Is Better)
    RTX 4090 24GB -Nvidia:   59427  (SE +/- 72.97, N = 3)
    RTX 4080 16GB -Pny:      48201  (SE +/- 213.00, N = 3)
    RTX 3090 24GB -Zotac:    41547  (SE +/- 475.08, N = 9)
    RTX 2080 Ti 22GB -Dell:  33404  (SE +/- 163.05, N = 3)
    1. (CXX) g++ options: -O3

GPU Power Consumption Monitor  (Watts, Fewer Is Better)  Min / Avg / Max:
    RTX 4080 16GB -Pny:      11.2 / 53.2 / 215.9
    RTX 4090 24GB -Nvidia:    6.9 / 61.1 / 277.8
    RTX 2080 Ti 22GB -Dell:  21.5 / 119.5 / 264.1
    RTX 3090 24GB -Zotac:    19.1 / 140.5 / 367.2

GPU Temperature Monitor  (Celsius, Fewer Is Better)  Min / Avg / Max:
    RTX 4090 24GB -Nvidia:   35.0 / 36.9 / 43.0
    RTX 4080 16GB -Pny:      37.0 / 39.6 / 46.0
    RTX 3090 24GB -Zotac:    47.0 / 62.8 / 74.0
    RTX 2080 Ti 22GB -Dell:  54.0 / 68.6 / 77.0
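Because the VkFFT score is unitless, it is easiest to read as a relative speedup against a common baseline. An illustrative normalization of the scores above, using the RTX 2080 Ti as the baseline (this calculation is not part of the exported result file):

```python
# Normalize the VkFFT benchmark scores (copied from the table above)
# against the slowest card to express them as relative speedups.
vkfft_scores = {
    "RTX 4090 24GB": 59427,
    "RTX 4080 16GB": 48201,
    "RTX 3090 24GB": 41547,
    "RTX 2080 Ti 22GB": 33404,
}
baseline = vkfft_scores["RTX 2080 Ti 22GB"]
speedup = {gpu: score / baseline for gpu, score in vkfft_scores.items()}
for gpu, s in speedup.items():
    print(f"{gpu}: {s:.2f}x vs RTX 2080 Ti")
```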
vkpeak 20210424

fp32-scalar  (GFLOPS, More Is Better)
    RTX 4090 24GB -Nvidia:   44315.35  (SE +/- 10.09, N = 3)
    RTX 4080 16GB -Pny:      27771.88  (SE +/- 35.38, N = 3)
    RTX 3090 24GB -Zotac:    20403.35  (SE +/- 34.37, N = 3)
    RTX 2080 Ti 22GB -Dell:  15974.22  (SE +/- 136.44, N = 3)

fp32-vec4  (GFLOPS, More Is Better)
    RTX 4090 24GB -Nvidia:   58607.46  (SE +/- 13.83, N = 3)
    RTX 4080 16GB -Pny:      36655.69  (SE +/- 13.71, N = 3)
    RTX 3090 24GB -Zotac:    26392.09  (SE +/- 43.15, N = 3)
    RTX 2080 Ti 22GB -Dell:  15859.12  (SE +/- 77.45, N = 3)

fp16-scalar  (GFLOPS, More Is Better)
    RTX 4090 24GB -Nvidia:   44269.79  (SE +/- 7.21, N = 3)
    RTX 4080 16GB -Pny:      27669.37  (SE +/- 0.86, N = 3)
    RTX 3090 24GB -Zotac:    20161.59  (SE +/- 30.30, N = 3)
    RTX 2080 Ti 22GB -Dell:  15521.12  (SE +/- 61.99, N = 3)

fp16-vec4  (GFLOPS, More Is Better)
    RTX 4090 24GB -Nvidia:   87698.39  (SE +/- 98.17, N = 3)
    RTX 4080 16GB -Pny:      54827.66  (SE +/- 0.37, N = 3)
    RTX 3090 24GB -Zotac:    40002.93  (SE +/- 64.52, N = 3)
    RTX 2080 Ti 22GB -Dell:  30649.13  (SE +/- 26.46, N = 3)

fp64-scalar  (GFLOPS, More Is Better)
    RTX 4090 24GB -Nvidia:   1394.44  (SE +/- 0.73, N = 3)
    RTX 4080 16GB -Pny:       873.52  (SE +/- 0.08, N = 3)
    RTX 3090 24GB -Zotac:     645.24  (SE +/- 1.62, N = 3)
    RTX 2080 Ti 22GB -Dell:   500.91  (SE +/- 1.28, N = 3)

fp64-vec4  (GFLOPS, More Is Better)
    RTX 4090 24GB -Nvidia:   1396.26  (SE +/- 1.25, N = 3)
    RTX 4080 16GB -Pny:       873.67  (SE +/- 0.01, N = 3)
    RTX 3090 24GB -Zotac:     645.31  (SE +/- 1.70, N = 3)
    RTX 2080 Ti 22GB -Dell:   503.53  (SE +/- 0.03, N = 3)
vkpeak 20210424

int32-scalar  (GIOPS, More Is Better)
    RTX 4090 24GB -Nvidia:   44276.72  (SE +/- 25.10, N = 3)
    RTX 4080 16GB -Pny:      27758.93  (SE +/- 2.25, N = 3)
    RTX 3090 24GB -Zotac:    20305.75  (SE +/- 10.90, N = 3)
    RTX 2080 Ti 22GB -Dell:  15963.59  (SE +/- 39.60, N = 3)

int32-vec4  (GIOPS, More Is Better)
    RTX 4090 24GB -Nvidia:   44062.52  (SE +/- 3.11, N = 3)
    RTX 4080 16GB -Pny:      27595.51  (SE +/- 37.47, N = 3)
    RTX 3090 24GB -Zotac:    20074.10  (SE +/- 27.49, N = 3)
    RTX 2080 Ti 22GB -Dell:  15752.98  (SE +/- 15.45, N = 3)

int16-scalar  (GIOPS, More Is Better)
    RTX 4090 24GB -Nvidia:   29507.17  (SE +/- 4.42, N = 3)
    RTX 4080 16GB -Pny:      18441.46  (SE +/- 0.34, N = 3)
    RTX 3090 24GB -Zotac:    13295.72  (SE +/- 0.25, N = 3)
    RTX 2080 Ti 22GB -Dell:  10270.26  (SE +/- 0.89, N = 3)

int16-vec4  (GIOPS, More Is Better)
    RTX 4090 24GB -Nvidia:   39280.82  (SE +/- 1.77, N = 3)
    RTX 4080 16GB -Pny:      24616.84  (SE +/- 0.85, N = 3)
    RTX 3090 24GB -Zotac:    16205.98  (SE +/- 4.87, N = 3)
    RTX 2080 Ti 22GB -Dell:  12975.65  (SE +/- 4.25, N = 3)
vkpeak 20210424
GPU Power Consumption Monitor  (Watts, Fewer Is Better)  Min / Avg / Max:
    RTX 4080 16GB -Pny:      13.6 / 171.1 / 245.8
    RTX 4090 24GB -Nvidia:    6.6 / 203.4 / 347.2
    RTX 2080 Ti 22GB -Dell:  21.4 / 217.8 / 258.8
    RTX 3090 24GB -Zotac:    24.9 / 265.1 / 330.5

GPU Temperature Monitor  (Celsius, Fewer Is Better)  Min / Avg / Max:
    RTX 4090 24GB -Nvidia:   42.0 / 53.9 / 65.0
    RTX 4080 16GB -Pny:      43.0 / 57.3 / 66.0
    RTX 3090 24GB -Zotac:    42.0 / 71.3 / 79.0
    RTX 2080 Ti 22GB -Dell:  56.0 / 77.3 / 83.0
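One derived figure worth pulling out of the vkpeak results is the fp16-vec4 to fp32-vec4 throughput ratio, which shows how much each card gains from half precision in the vectorized shader. An illustrative calculation over the numbers reported above (not part of the vkpeak output itself):

```python
# Ratio of fp16-vec4 to fp32-vec4 throughput per GPU, using the GFLOPS
# values copied from the vkpeak tables above.
fp32_vec4 = {
    "RTX 4090 24GB": 58607.46,
    "RTX 4080 16GB": 36655.69,
    "RTX 3090 24GB": 26392.09,
    "RTX 2080 Ti 22GB": 15859.12,
}
fp16_vec4 = {
    "RTX 4090 24GB": 87698.39,
    "RTX 4080 16GB": 54827.66,
    "RTX 3090 24GB": 40002.93,
    "RTX 2080 Ti 22GB": 30649.13,
}
fp16_gain = {gpu: fp16_vec4[gpu] / fp32_vec4[gpu] for gpu in fp32_vec4}
for gpu, ratio in fp16_gain.items():
    print(f"{gpu}: {ratio:.2f}x fp16-vec4 gain over fp32-vec4")
```

On these numbers the Ampere and Ada cards land near 1.5x, while the Turing-based RTX 2080 Ti shows close to a 1.9x gain, largely because its fp32-vec4 result barely improves on its fp32-scalar result.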
VkResample 1.0
Upscale: 2x - Precision: Single  (ms, Fewer Is Better)
    RTX 4090 24GB -Nvidia:    7.799  (SE +/- 0.003, N = 5)
    RTX 3090 24GB -Zotac:     9.359  (SE +/- 0.008, N = 5)
    RTX 4080 16GB -Pny:      11.543  (SE +/- 0.003, N = 5)
    RTX 2080 Ti 22GB -Dell:  14.790  (SE +/- 0.023, N = 5)
    1. (CXX) g++ options: -O3

GPU Power Consumption Monitor  (Watts, Fewer Is Better)  Min / Avg / Max:
    RTX 4080 16GB -Pny:       4.3 / 35.1 / 203.1
    RTX 4090 24GB -Nvidia:    6.5 / 37.5 / 270.4
    RTX 2080 Ti 22GB -Dell:  21.4 / 60.8 / 255.9
    RTX 3090 24GB -Zotac:    19.2 / 78.0 / 357.0

GPU Temperature Monitor  (Celsius, Fewer Is Better)  Min / Avg / Max:
    RTX 4090 24GB -Nvidia:   35.0 / 37.0 / 43.0
    RTX 4080 16GB -Pny:      36.0 / 38.1 / 46.0
    RTX 3090 24GB -Zotac:    48.0 / 53.5 / 73.0
    RTX 2080 Ti 22GB -Dell:  57.0 / 59.7 / 69.0
VkResample 1.0
Upscale: 2x - Precision: Double  (ms, Fewer Is Better)
    RTX 4090 24GB -Nvidia:    55.93  (SE +/- 0.05, N = 3)
    RTX 4080 16GB -Pny:       89.24  (SE +/- 0.06, N = 3)
    RTX 3090 24GB -Zotac:    121.35  (SE +/- 0.06, N = 3)
    RTX 2080 Ti 22GB -Dell:  153.15  (SE +/- 0.26, N = 3)
    1. (CXX) g++ options: -O3

GPU Power Consumption Monitor  (Watts, Fewer Is Better)  Min / Avg / Max:
    RTX 4090 24GB -Nvidia:    6.7 / 58.1 / 182.3
    RTX 4080 16GB -Pny:       4.2 / 59.1 / 135.5
    RTX 2080 Ti 22GB -Dell:  21.1 / 114.5 / 208.0
    RTX 3090 24GB -Zotac:    19.1 / 124.7 / 254.8

GPU Temperature Monitor  (Celsius, Fewer Is Better)  Min / Avg / Max:
    RTX 4090 24GB -Nvidia:   38.0 / 41.6 / 47.0
    RTX 4080 16GB -Pny:      38.0 / 43.6 / 50.0
    RTX 3090 24GB -Zotac:    47.0 / 59.2 / 74.0
    RTX 2080 Ti 22GB -Dell:  55.0 / 65.4 / 73.0
Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: Yes  (Seconds, Fewer Is Better)
    RTX 4090 24GB -Nvidia:   2.490  (SE +/- 0.003, N = 10)
    RTX 4080 16GB -Pny:      2.761  (SE +/- 0.004, N = 10)
    RTX 3090 24GB -Zotac:    3.536  (SE +/- 0.005, N = 9)
    RTX 2080 Ti 22GB -Dell:  4.307  (SE +/- 0.007, N = 8)

GPU Power Consumption Monitor  (Watts, Fewer Is Better)  Min / Avg / Max:
    RTX 4080 16GB -Pny:      14.0 / 72.7 / 204.6
    RTX 4090 24GB -Nvidia:    7.1 / 93.5 / 229.6
    RTX 2080 Ti 22GB -Dell:  22.8 / 138.6 / 261.5
    RTX 3090 24GB -Zotac:    27.5 / 154.7 / 304.3

GPU Temperature Monitor  (Celsius, Fewer Is Better)  Min / Avg / Max:
    RTX 4090 24GB -Nvidia:   39.0 / 41.6 / 46.0
    RTX 4080 16GB -Pny:      40.0 / 42.9 / 51.0
    RTX 3090 24GB -Zotac:    47.0 / 61.2 / 73.0
    RTX 2080 Ti 22GB -Dell:  59.0 / 68.2 / 75.0
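The SE figures reported throughout this file support quick interval estimates around each mean. As an illustration (a normal-approximation 95% interval, mean plus or minus 1.96 times SE, which is not something the result file reports itself), applied to the Waifu2x runtimes above:

```python
# Normal-approximation 95% confidence intervals (mean +/- 1.96 * SE)
# for the Waifu2x-NCNN Vulkan runtimes listed above, in seconds.
results = {  # gpu: (mean, standard error)
    "RTX 4090 24GB": (2.490, 0.003),
    "RTX 4080 16GB": (2.761, 0.004),
    "RTX 3090 24GB": (3.536, 0.005),
    "RTX 2080 Ti 22GB": (4.307, 0.007),
}
ci95 = {gpu: (mean - 1.96 * se, mean + 1.96 * se)
        for gpu, (mean, se) in results.items()}
for gpu, (lo, hi) in ci95.items():
    print(f"{gpu}: {lo:.3f} .. {hi:.3f} s")
```

The intervals here are tiny relative to the gaps between cards, so the ordering of the four GPUs is well outside measurement noise.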
Phoronix Test Suite v10.8.4