nvidia rtx 5090 compute benchmarks
Tests for a future article. Intel Core Ultra 9 285K testing with an ASUS ROG MAXIMUS Z890 HERO (1203 BIOS) and ASUS NVIDIA GeForce RTX 5090 32GB on Ubuntu 24.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2501242-PTS-NVIDIART00&grw .
System Details
Result identifiers: rtx 5090, NVIDIA 5090, GeForce RTX 5090
Processor: Intel Core Ultra 9 285K @ 5.10GHz (24 Cores)
Motherboard: ASUS ROG MAXIMUS Z890 HERO (1203 BIOS)
Chipset: Intel Device ae7f
Memory: 2 x 16GB DDR5-6400MT/s Micron CP16G64C38U5B.M8D1
Disk: 1000GB Western Digital WDS100T1X0E-00AFY0 + 4001GB Western Digital WD_BLACK SN850X 4000GB
Graphics: ASUS NVIDIA GeForce RTX 5090 32GB
Audio: Intel Device 7f50
Monitor: ASUS VP28U
Network: Realtek Device 8126 + Intel I226-V + Intel Wi-Fi 7
OS: Ubuntu 24.10
Kernel: 6.11.0-13-generic (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server 1.21.1.13
Display Driver: NVIDIA 570.86.10
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.8.51 + OpenCL 3.0
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- nouveau.modeset=0
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate performance (EPP: default)
- CPU Microcode: 0x114
- Thermald 2.5.8

Graphics Details
- BAR1 / Visible vRAM Size: 32768 MiB
- vBIOS Version: 98.02.2e.00.03

OpenCL Details
- GPU Compute Cores: 21760

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- reg_file_data_sampling: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected
Result Overview
Test | rtx 5090 | NVIDIA 5090 | GeForce RTX 5090
shoc: OpenCL - S3D | 1117.54 | 1120.51 | 1120.88
shoc: OpenCL - Triad | 27.8329 | 27.8098 | 27.7455
shoc: OpenCL - FFT SP | 4398.39 | 4400.85 | 4375.95
shoc: OpenCL - MD5 Hash | 142.407 | 142.24 | 142.532
shoc: OpenCL - Reduction | 837.207 | 836.98 | 837.243
shoc: OpenCL - GEMM SGEMM_N | 35937.2 | 36016.5 | 35961.3
shoc: OpenCL - Max SP Flops | 124615 | 124556 | 124646
shoc: OpenCL - Bus Speed Download | 28.7867 | 28.787 | 28.7881
shoc: OpenCL - Bus Speed Readback | 28.6895 | 28.5856 | 28.1406
shoc: OpenCL - Texture Read Bandwidth | 2870.68 | 2872.53 | 2875.5
ncnn: Vulkan GPU - mobilenet | 42.44 | 39.74 | 38.91
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 | 10.21 | 11.25 | 8.37
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 | 4.68 | 4.28 | 4.29
ncnn: Vulkan GPU - shufflenet-v2 | 4.53 | 4.04 | 5.12
ncnn: Vulkan GPU - mnasnet | 9.8 | 4.77 | 6.55
opencl-benchmark: FP32 Compute | 117.847 | 117.864 | 117.881
ncnn: Vulkan GPU - efficientnet-b0 | 27.44 | 20.78 | 19.79
ncnn: Vulkan GPU - blazeface | 2.67 | 3.05 | 2.67
ncnn: Vulkan GPU - googlenet | 13.92 | 12.78 | 18.06
ncnn: Vulkan GPU - vgg16 | 37.05 | 38.73 | 37.49
ncnn: Vulkan GPU - resnet18 | 10.93 | 9.41 | 11.01
ncnn: Vulkan GPU - alexnet | 8.74 | 8.55 | 8.99
ncnn: Vulkan GPU - resnet50 | 28.58 | 30.1 | 29.47
ncnn: Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3 | 42.44 | 39.74 | 38.91
ncnn: Vulkan GPU - yolov4-tiny | 39.34 | 38.91 | 36.86
ncnn: Vulkan GPU - squeezenet_ssd | 22.02 | 26.39 | 25.57
ncnn: Vulkan GPU - regnety_400m | 47.9 | 52.59 | 36.2
ncnn: Vulkan GPU - vision_transformer | 62.76 | 62.86 | 62.83
ncnn: Vulkan GPU - FastestDet | 11.01 | 8.68 | 9.29
v-ray: NVIDIA RTX GPU | 11923 | 11923 | 11923
opencl-benchmark: INT16 Compute | 54.018 | 54.037 | 53.953
v-ray: NVIDIA CUDA GPU | 4851 | 4882 | 4882
opencl-benchmark: INT8 Compute | 41.795 | 41.724 | 41.757
opencl-benchmark: INT32 Compute | 61.759 | 61.759 | 61.773
opencl-benchmark: FP16 Compute | 122.914 | 122.944 | 122.941
opencl-benchmark: INT64 Compute | 4.396 | 4.4 | 4.392
opencl-benchmark: FP64 Compute | 1.95 | 1.95 | 1.951
blender: BMW27 - NVIDIA CUDA | 4.72 | 4.72 | 4.65
blender: BMW27 - NVIDIA OptiX | 2.92 | 2.97 | 2.98
blender: Junkshop - NVIDIA CUDA | 8.99 | 8.98 | 9
blender: Classroom - NVIDIA CUDA | 8.38 | 8.38 | 8.38
blender: Fishy Cat - NVIDIA CUDA | 8.92 | 8.92 | 8.92
blender: Junkshop - NVIDIA OptiX | 5.66 | 5.67 | 5.66
opencl-benchmark: Memory Bandwidth Coalesced Write | 1687.49 | 1679.44 | 1680.23
opencl-benchmark: Memory Bandwidth Coalesced Read | 1596.24 | 1603.93 | 1596.88
blender: Barbershop - NVIDIA CUDA | 35.14 | 35.28 | 35.1
blender: Classroom - NVIDIA OptiX | 6.16 | 6.14 | 6.17
blender: Fishy Cat - NVIDIA OptiX | 4.55 | 4.56 | 4.55
blender: Barbershop - NVIDIA OptiX | 24.33 | 24.6 | 24.5
blender: Pabellon Barcelona - NVIDIA CUDA | 17.35 | 17.34 | 17.44
blender: Pabellon Barcelona - NVIDIA OptiX | 7 | 7.06 | 7.04
indigobench: OpenCL GPU - Bedroom | 42.759 | 42.852 | 42.924
indigobench: OpenCL GPU - Supercar | 92.7 | 92.52 | 92.711
hashcat: MD5 | 106848250000 | 106216550000 | 106544000000
hashcat: SHA1 | 68852500000 | 69072700000 | 69104300000
hashcat: 7-Zip | 3272300 | 3276600 | 3264300
hashcat: SHA-512 | 8900400000 | 8901200000 | 8895900000
hashcat: TrueCrypt RIPEMD160 + XTS | 2776000 | 2770700 | 2777600
namd-cuda: ATPase Simulation - 327,506 Atoms | 0.05810 | 0.05851 | 0.05943
clpeak: Kernel Latency | 5.15 | 5.16 | 5.15
clpeak: Integer Compute | 62151.94 | 62119.51 | 62178.05
clpeak: Integer 24-bit Compute | 61843.11 | 61866.86 | 61903.93
clpeak: Global Memory Bandwidth | 1562.97 | 1564.49 | 1564.61
clpeak: Double-Precision Compute | 1976.9 | 1977.26 | 1976.78
clpeak: Single-Precision Compute | 121415.53 | 121438.41 | 121419.57
clpeak: Transfer Bandwidth enqueueReadBuffer | 13.83 | 13.89 | 13.78
clpeak: Transfer Bandwidth enqueueWriteBuffer | 18.41 | 18.42 | 18.33
realsr-ncnn: 4x - No | 4.63 | 4.4 | 4.446
realsr-ncnn: 4x - Yes | 13.484 | 13.504 | 13.486
vkfft: FFT + iFFT R2C / C2R | 164933 | 164801 | 164873
vkfft: FFT + iFFT C2C 1D batched in half precision | 302221 | 305671 | 300737
vkfft: FFT + iFFT C2C Bluestein in single precision | 36054 | 36026 | 35954
vkfft: FFT + iFFT C2C 1D batched in double precision | 63738 | 62773 | 63637
vkfft: FFT + iFFT C2C 1D batched in single precision | 237717 | 235884 | 239575
vkfft: FFT + iFFT C2C multidimensional in single precision | 144624 | 144600 | 144695
vkfft: FFT + iFFT C2C Bluestein benchmark in double precision | 9931 | 9940 | 9934
vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling | 243937 | 243913 | 241913
vkpeak: fp32-scalar | 63013.32 | 63035.62 | 63035.62
vkpeak: fp32-vec4 | 83296.84 | 83290.22 | 83257.59
vkpeak: fp16-scalar | 62611.78 | 62611.17 | 62597.51
vkpeak: fp16-vec4 | 72592.93 | 72578.62 | 72575.33
vkpeak: fp64-scalar | 1967.37 | 1967.49 | 1967.43
vkpeak: fp64-vec4 | 1965.7 | 1965.39 | 1965.32
vkpeak: int32-scalar | 62142.56 | 62142.07 | 62141.02
vkpeak: int32-vec4 | 61885.37 | 61894.95 | 61914.51
vkpeak: int16-scalar | 40006.6 | 39989.7 | 39998.42
vkpeak: int16-vec4 | 43806.13 | 43799.96 | 43803.6
vkresample: 2x - Double | 103.505 | 103.478 | 103.453
vkresample: 2x - Single | 5.648 | 5.649 | 5.648
waifu2x-ncnn: 2x - 3 - Yes | 2.299 | 2.267 | 2.278
fluidx3d: FP32-FP32 | 9524 | 9527 | 9525
fluidx3d: FP32-FP16C | 19140 | 19121 | 19135
fluidx3d: FP32-FP16S | 18499 | 18496 | 18500
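The three result identifiers (rtx 5090, NVIDIA 5090, GeForce RTX 5090) report the same tests on the same system, so the overview can be used to gauge how repeatable each test is. The following is a minimal sketch, not part of the Phoronix Test Suite output: it copies three rows from the overview and expresses the sample standard deviation as a percentage of the mean. The `results` dictionary and the `spread_percent` helper are illustrative names introduced here.

```python
from statistics import mean, stdev

# Values copied from the overview, in the order
# (rtx 5090, NVIDIA 5090, GeForce RTX 5090).
results = {
    "shoc: OpenCL - S3D": (1117.54, 1120.51, 1120.88),
    "ncnn: Vulkan GPU - mnasnet": (9.8, 4.77, 6.55),
    "blender: BMW27 - NVIDIA CUDA": (4.72, 4.72, 4.65),
}


def spread_percent(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


spreads = {name: spread_percent(values) for name, values in results.items()}
for name, pct in spreads.items():
    print(f"{name}: {pct:.1f}% spread across the three results")
```

With only three samples per test this is a rough indicator, but it separates the tightly clustered compute and render results from the NCNN latencies, which vary much more between results.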
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: S3D (GFLOPS, More Is Better)
rtx 5090: 1117.54 | NVIDIA 5090: 1120.51 | GeForce RTX 5090: 1120.88

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Triad (GB/s, More Is Better)
rtx 5090: 27.83 | NVIDIA 5090: 27.81 | GeForce RTX 5090: 27.75

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better)
rtx 5090: 4398.39 | NVIDIA 5090: 4400.85 | GeForce RTX 5090: 4375.95

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: MD5 Hash (GHash/s, More Is Better)
rtx 5090: 142.41 | NVIDIA 5090: 142.24 | GeForce RTX 5090: 142.53

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Reduction (GB/s, More Is Better)
rtx 5090: 837.21 | NVIDIA 5090: 836.98 | GeForce RTX 5090: 837.24

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS, More Is Better)
rtx 5090: 35937.2 | NVIDIA 5090: 36016.5 | GeForce RTX 5090: 35961.3

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, More Is Better)
rtx 5090: 124615 | NVIDIA 5090: 124556 | GeForce RTX 5090: 124646

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Download (GB/s, More Is Better)
rtx 5090: 28.79 | NVIDIA 5090: 28.79 | GeForce RTX 5090: 28.79

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better)
rtx 5090: 28.69 | NVIDIA 5090: 28.59 | GeForce RTX 5090: 28.14

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
rtx 5090: 2870.68 | NVIDIA 5090: 2872.53 | GeForce RTX 5090: 2875.50

1. (CXX) g++ options (all SHOC tests above): -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
NCNN 20241226 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
rtx 5090: 42.44 (MIN: 8.34 / MAX: 76.17) | NVIDIA 5090: 39.74 (MIN: 8.16 / MAX: 76.72) | GeForce RTX 5090: 38.91 (MIN: 8.9 / MAX: 75.48)

NCNN 20241226 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
rtx 5090: 10.21 (MIN: 3.85 / MAX: 64.73) | NVIDIA 5090: 11.25 (MIN: 3.82 / MAX: 63.94) | GeForce RTX 5090: 8.37 (MIN: 3.84 / MAX: 63.9)

NCNN 20241226 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
rtx 5090: 4.68 (MIN: 4.08 / MAX: 57.94) | NVIDIA 5090: 4.28 (MIN: 4.05 / MAX: 5.15) | GeForce RTX 5090: 4.29 (MIN: 4.06 / MAX: 5.93)

NCNN 20241226 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
rtx 5090: 4.53 (MIN: 3.88 / MAX: 57.59) | NVIDIA 5090: 4.04 (MIN: 3.9 / MAX: 5.77) | GeForce RTX 5090: 5.12 (MIN: 3.91 / MAX: 67.5)

NCNN 20241226 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
rtx 5090: 9.80 (MIN: 3.75 / MAX: 63.5) | NVIDIA 5090: 4.77 (MIN: 3.69 / MAX: 54.04) | GeForce RTX 5090: 6.55 (MIN: 3.69 / MAX: 59.21)

1. (CXX) g++ options (all NCNN tests above): -O3 -rdynamic -lgomp -lpthread
ProjectPhysX OpenCL-Benchmark 1.6 - Operation: FP32 Compute (TFLOPs/s, More Is Better)
rtx 5090: 117.85 | NVIDIA 5090: 117.86 | GeForce RTX 5090: 117.88
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
NCNN 20241226 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
rtx 5090: 27.44 (MIN: 6.34 / MAX: 109.98) | NVIDIA 5090: 20.78 (MIN: 6.36 / MAX: 109.98) | GeForce RTX 5090: 19.79 (MIN: 6.3 / MAX: 110.05)

NCNN 20241226 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
rtx 5090: 2.67 (MIN: 2.4 / MAX: 41.21) | NVIDIA 5090: 3.05 (MIN: 2.38 / MAX: 49.95) | GeForce RTX 5090: 2.67 (MIN: 2.38 / MAX: 25.86)

NCNN 20241226 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
rtx 5090: 13.92 (MIN: 7.49 / MAX: 98.37) | NVIDIA 5090: 12.78 (MIN: 7.62 / MAX: 95.87) | GeForce RTX 5090: 18.06 (MIN: 7.48 / MAX: 98.98)

NCNN 20241226 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
rtx 5090: 37.05 (MIN: 22.53 / MAX: 46.22) | NVIDIA 5090: 38.73 (MIN: 20.99 / MAX: 46.14) | GeForce RTX 5090: 37.49 (MIN: 23.61 / MAX: 45.51)

NCNN 20241226 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
rtx 5090: 10.93 (MIN: 4.48 / MAX: 44.35) | NVIDIA 5090: 9.41 (MIN: 4.47 / MAX: 43.1) | GeForce RTX 5090: 11.01 (MIN: 4.51 / MAX: 42.89)

NCNN 20241226 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
rtx 5090: 8.74 (MIN: 3.21 / MAX: 22.23) | NVIDIA 5090: 8.55 (MIN: 3.18 / MAX: 21.75) | GeForce RTX 5090: 8.99 (MIN: 3.19 / MAX: 21.78)

NCNN 20241226 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
rtx 5090: 28.58 (MIN: 10.02 / MAX: 89.48) | NVIDIA 5090: 30.10 (MIN: 10.13 / MAX: 90.77) | GeForce RTX 5090: 29.47 (MIN: 10.07 / MAX: 90.59)

NCNN 20241226 - Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3 (ms, Fewer Is Better)
rtx 5090: 42.44 (MIN: 8.34 / MAX: 76.17) | NVIDIA 5090: 39.74 (MIN: 8.16 / MAX: 76.72) | GeForce RTX 5090: 38.91 (MIN: 8.9 / MAX: 75.48)

NCNN 20241226 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
rtx 5090: 39.34 (MIN: 15.92 / MAX: 49.04) | NVIDIA 5090: 38.91 (MIN: 15.12 / MAX: 48.75) | GeForce RTX 5090: 36.86 (MIN: 11.12 / MAX: 47.83)

NCNN 20241226 - Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
rtx 5090: 22.02 (MIN: 7.41 / MAX: 92.66) | NVIDIA 5090: 26.39 (MIN: 7.39 / MAX: 95.48) | GeForce RTX 5090: 25.57 (MIN: 7.22 / MAX: 94.51)

NCNN 20241226 - Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
rtx 5090: 47.90 (MIN: 21.96 / MAX: 421.33) | NVIDIA 5090: 52.59 (MIN: 21.91 / MAX: 425.58) | GeForce RTX 5090: 36.20 (MIN: 21.98 / MAX: 458.49)

NCNN 20241226 - Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better)
rtx 5090: 62.76 (MIN: 40.3 / MAX: 105.61) | NVIDIA 5090: 62.86 (MIN: 42.12 / MAX: 106.46) | GeForce RTX 5090: 62.83 (MIN: 41.21 / MAX: 109.15)

NCNN 20241226 - Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better)
rtx 5090: 11.01 (MIN: 5.09 / MAX: 92.36) | NVIDIA 5090: 8.68 (MIN: 5.07 / MAX: 85.5) | GeForce RTX 5090: 9.29 (MIN: 5.01 / MAX: 89.47)

1. (CXX) g++ options (all NCNN tests above): -O3 -rdynamic -lgomp -lpthread
Chaos Group V-RAY 6.0 - Mode: NVIDIA RTX GPU (vpaths, More Is Better)
rtx 5090: 11923 | NVIDIA 5090: 11923 | GeForce RTX 5090: 11923
ProjectPhysX OpenCL-Benchmark 1.6 - Operation: INT16 Compute (TIOPs/s, More Is Better)
rtx 5090: 54.02 | NVIDIA 5090: 54.04 | GeForce RTX 5090: 53.95
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
Chaos Group V-RAY 6.0 - Mode: NVIDIA CUDA GPU (vpaths, More Is Better)
rtx 5090: 4851 | NVIDIA 5090: 4882 | GeForce RTX 5090: 4882
ProjectPhysX OpenCL-Benchmark 1.6 - Operation: INT8 Compute (TIOPs/s, More Is Better)
rtx 5090: 41.80 | NVIDIA 5090: 41.72 | GeForce RTX 5090: 41.76

ProjectPhysX OpenCL-Benchmark 1.6 - Operation: INT32 Compute (TIOPs/s, More Is Better)
rtx 5090: 61.76 | NVIDIA 5090: 61.76 | GeForce RTX 5090: 61.77

ProjectPhysX OpenCL-Benchmark 1.6 - Operation: FP16 Compute (TFLOPs/s, More Is Better)
rtx 5090: 122.91 | NVIDIA 5090: 122.94 | GeForce RTX 5090: 122.94

ProjectPhysX OpenCL-Benchmark 1.6 - Operation: INT64 Compute (TIOPs/s, More Is Better)
rtx 5090: 4.396 | NVIDIA 5090: 4.400 | GeForce RTX 5090: 4.392

ProjectPhysX OpenCL-Benchmark 1.6 - Operation: FP64 Compute (TFLOPs/s, More Is Better)
rtx 5090: 1.950 | NVIDIA 5090: 1.950 | GeForce RTX 5090: 1.951

1. (CXX) g++ options (all OpenCL-Benchmark tests above): -std=c++17 -pthread -lOpenCL
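The ProjectPhysX OpenCL-Benchmark floating-point figures are easier to compare as ratios. As a small illustrative sketch (the `tflops` dictionary and variable names are introduced here, and only the values reported for the rtx 5090 result are used), the measured FP32 throughput can be divided by the FP64 and FP16 throughput:

```python
# Measured throughput in TFLOPs/s for the "rtx 5090" result,
# as reported by ProjectPhysX OpenCL-Benchmark 1.6 in this document.
tflops = {"FP64": 1.950, "FP32": 117.85, "FP16": 122.91}

# How many FP32 operations complete in the time of one FP64 operation.
fp32_per_fp64 = tflops["FP32"] / tflops["FP64"]

# FP16 throughput relative to FP32 in this OpenCL test.
fp16_vs_fp32 = tflops["FP16"] / tflops["FP32"]

print(f"FP32 is {fp32_per_fp64:.1f}x the measured FP64 rate")
print(f"FP16 is {fp16_vs_fp32:.2f}x the measured FP32 rate")
```

These are ratios of measured results from one benchmark, not architectural specifications.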
Blender 4.3 - Blend File: BMW27 - Compute: NVIDIA CUDA (Seconds, Fewer Is Better)
rtx 5090: 4.72 | NVIDIA 5090: 4.72 | GeForce RTX 5090: 4.65

Blender 4.3 - Blend File: BMW27 - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
rtx 5090: 2.92 | NVIDIA 5090: 2.97 | GeForce RTX 5090: 2.98

Blender 4.3 - Blend File: Junkshop - Compute: NVIDIA CUDA (Seconds, Fewer Is Better)
rtx 5090: 8.99 | NVIDIA 5090: 8.98 | GeForce RTX 5090: 9.00

Blender 4.3 - Blend File: Classroom - Compute: NVIDIA CUDA (Seconds, Fewer Is Better)
rtx 5090: 8.38 | NVIDIA 5090: 8.38 | GeForce RTX 5090: 8.38

Blender 4.3 - Blend File: Fishy Cat - Compute: NVIDIA CUDA (Seconds, Fewer Is Better)
rtx 5090: 8.92 | NVIDIA 5090: 8.92 | GeForce RTX 5090: 8.92

Blender 4.3 - Blend File: Junkshop - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
rtx 5090: 5.66 | NVIDIA 5090: 5.67 | GeForce RTX 5090: 5.66
ProjectPhysX OpenCL-Benchmark 1.6 - Operation: Memory Bandwidth Coalesced Write (GB/s, More Is Better)
rtx 5090: 1687.49 | NVIDIA 5090: 1679.44 | GeForce RTX 5090: 1680.23

ProjectPhysX OpenCL-Benchmark 1.6 - Operation: Memory Bandwidth Coalesced Read (GB/s, More Is Better)
rtx 5090: 1596.24 | NVIDIA 5090: 1603.93 | GeForce RTX 5090: 1596.88

1. (CXX) g++ options (both tests above): -std=c++17 -pthread -lOpenCL
Blender 4.3 - Blend File: Barbershop - Compute: NVIDIA CUDA (Seconds, Fewer Is Better)
rtx 5090: 35.14 | NVIDIA 5090: 35.28 | GeForce RTX 5090: 35.10

Blender 4.3 - Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
rtx 5090: 6.16 | NVIDIA 5090: 6.14 | GeForce RTX 5090: 6.17

Blender 4.3 - Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
rtx 5090: 4.55 | NVIDIA 5090: 4.56 | GeForce RTX 5090: 4.55

Blender 4.3 - Blend File: Barbershop - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
rtx 5090: 24.33 | NVIDIA 5090: 24.60 | GeForce RTX 5090: 24.50

Blender 4.3 - Blend File: Pabellon Barcelona - Compute: NVIDIA CUDA (Seconds, Fewer Is Better)
rtx 5090: 17.35 | NVIDIA 5090: 17.34 | GeForce RTX 5090: 17.44

Blender 4.3 - Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
rtx 5090: 7.00 | NVIDIA 5090: 7.06 | GeForce RTX 5090: 7.04
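Each Blender scene was rendered with both the CUDA and the OptiX back end, so the two times can be turned into a per-scene speedup. The sketch below is illustrative only: it uses the render times reported for the rtx 5090 result, and the `render_seconds` and `speedup` names are introduced here.

```python
# Blender 4.3 render times in seconds for the "rtx 5090" result: (CUDA, OptiX).
render_seconds = {
    "BMW27": (4.72, 2.92),
    "Junkshop": (8.99, 5.66),
    "Classroom": (8.38, 6.16),
    "Fishy Cat": (8.92, 4.55),
    "Barbershop": (35.14, 24.33),
    "Pabellon Barcelona": (17.35, 7.00),
}

# Lower time is better, so speedup = CUDA time / OptiX time.
speedup = {scene: cuda / optix for scene, (cuda, optix) in render_seconds.items()}

for scene, factor in sorted(speedup.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{scene}: OptiX finishes {factor:.2f}x faster than CUDA")
```

On these numbers OptiX is faster in every scene, with the gap widest for Pabellon Barcelona and narrowest for Classroom.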
IndigoBench 4.4 - Acceleration: OpenCL GPU - Scene: Bedroom (M samples/s, More Is Better)
rtx 5090: 42.76 | NVIDIA 5090: 42.85 | GeForce RTX 5090: 42.92

IndigoBench 4.4 - Acceleration: OpenCL GPU - Scene: Supercar (M samples/s, More Is Better)
rtx 5090: 92.70 | NVIDIA 5090: 92.52 | GeForce RTX 5090: 92.71
Hashcat 6.2.4 - Benchmark: MD5 (H/s, More Is Better)
rtx 5090: 106848250000 (SE +/- 102551750000.00, N = 2) | NVIDIA 5090: 106216550000 (SE +/- 101883450000.00, N = 2) | GeForce RTX 5090: 106544000000 (SE +/- 102256000000.00, N = 2)

Hashcat 6.2.4 - Benchmark: SHA1 (H/s, More Is Better)
rtx 5090: 68852500000 | NVIDIA 5090: 69072700000 | GeForce RTX 5090: 69104300000

Hashcat 6.2.4 - Benchmark: 7-Zip (H/s, More Is Better)
rtx 5090: 3272300 | NVIDIA 5090: 3276600 | GeForce RTX 5090: 3264300

Hashcat 6.2.4 - Benchmark: SHA-512 (H/s, More Is Better)
rtx 5090: 8900400000 | NVIDIA 5090: 8901200000 | GeForce RTX 5090: 8895900000

Hashcat 6.2.4 - Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better)
rtx 5090: 2776000 | NVIDIA 5090: 2770700 | GeForce RTX 5090: 2777600
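The Hashcat MD5 result is the only one above that carries a standard error, and that error is nearly as large as the mean itself. A hedged way to read it: assuming the reported SE is the sample standard deviation divided by sqrt(N) (an assumption about how the value was computed, not something stated in this export), then for N = 2 the two underlying samples are exactly mean - SE and mean + SE. The sketch below applies that identity; `reported` and `implied_samples` are illustrative names.

```python
# Reported Hashcat MD5 mean and standard error (H/s), N = 2 per result.
reported = {
    "rtx 5090": (106848250000, 102551750000),
    "NVIDIA 5090": (106216550000, 101883450000),
    "GeForce RTX 5090": (106544000000, 102256000000),
}


def implied_samples(mean, se):
    """Assuming SE = sample_stdev / sqrt(2) with N = 2, return the two samples.

    With two samples x1 and x2, sample_stdev = |x1 - x2| / sqrt(2),
    so SE = |x1 - x2| / 2 and the samples are mean - SE and mean + SE.
    """
    return mean - se, mean + se


implied = {run: implied_samples(mean, se) for run, (mean, se) in reported.items()}
for run, (low, high) in implied.items():
    print(f"{run}: {low / 1e9:.1f} GH/s and {high / 1e9:.1f} GH/s")
```

Under that assumption each MD5 figure would be the average of two very different samples, so the mean should be read with caution.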
NAMD CUDA 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
rtx 5090: 0.05810 | NVIDIA 5090: 0.05851 | GeForce RTX 5090: 0.05943
clpeak 1.1.2 - OpenCL Test: Kernel Latency (us, Fewer Is Better)
rtx 5090: 5.15 | NVIDIA 5090: 5.16 | GeForce RTX 5090: 5.15

clpeak 1.1.2 - OpenCL Test: Integer Compute (GIOPS, More Is Better)
rtx 5090: 62151.94 | NVIDIA 5090: 62119.51 | GeForce RTX 5090: 62178.05

clpeak 1.1.2 - OpenCL Test: Integer 24-bit Compute (GIOPS, More Is Better)
rtx 5090: 61843.11 | NVIDIA 5090: 61866.86 | GeForce RTX 5090: 61903.93

clpeak 1.1.2 - OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
rtx 5090: 1562.97 | NVIDIA 5090: 1564.49 | GeForce RTX 5090: 1564.61

clpeak 1.1.2 - OpenCL Test: Double-Precision Compute (GFLOPS, More Is Better)
rtx 5090: 1976.90 | NVIDIA 5090: 1977.26 | GeForce RTX 5090: 1976.78

clpeak 1.1.2 - OpenCL Test: Single-Precision Compute (GFLOPS, More Is Better)
rtx 5090: 121415.53 | NVIDIA 5090: 121438.41 | GeForce RTX 5090: 121419.57

clpeak 1.1.2 - OpenCL Test: Transfer Bandwidth enqueueReadBuffer (GBPS, More Is Better)
rtx 5090: 13.83 | NVIDIA 5090: 13.89 | GeForce RTX 5090: 13.78

clpeak 1.1.2 - OpenCL Test: Transfer Bandwidth enqueueWriteBuffer (GBPS, More Is Better)
rtx 5090: 18.41 | NVIDIA 5090: 18.42 | GeForce RTX 5090: 18.33

1. (CXX) g++ options (all clpeak tests above): -O3
RealSR-NCNN 20200818 - Scale: 4x - TAA: No (Seconds, Fewer Is Better)
rtx 5090: 4.630 | NVIDIA 5090: 4.400 | GeForce RTX 5090: 4.446

RealSR-NCNN 20200818 - Scale: 4x - TAA: Yes (Seconds, Fewer Is Better)
rtx 5090: 13.48 | NVIDIA 5090: 13.50 | GeForce RTX 5090: 13.49
VkFFT 1.3.4 - Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better)
rtx 5090: 164933 | NVIDIA 5090: 164801 | GeForce RTX 5090: 164873

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
rtx 5090: 302221 | NVIDIA 5090: 305671 | GeForce RTX 5090: 300737

VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better)
rtx 5090: 36054 | NVIDIA 5090: 36026 | GeForce RTX 5090: 35954

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, More Is Better)
rtx 5090: 63738 | NVIDIA 5090: 62773 | GeForce RTX 5090: 63637

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better)
rtx 5090: 237717 | NVIDIA 5090: 235884 | GeForce RTX 5090: 239575

VkFFT 1.3.4 - Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
rtx 5090: 144624 | NVIDIA 5090: 144600 | GeForce RTX 5090: 144695

VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, More Is Better)
rtx 5090: 9931 | NVIDIA 5090: 9940 | GeForce RTX 5090: 9934

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
rtx 5090: 243937 | NVIDIA 5090: 243913 | GeForce RTX 5090: 241913

1. (CXX) g++ options (all VkFFT tests above): -O3
vkpeak 20240505 - fp32-scalar (GFLOPS, More Is Better)
rtx 5090: 63013.32 | NVIDIA 5090: 63035.62 | GeForce RTX 5090: 63035.62

vkpeak 20240505 - fp32-vec4 (GFLOPS, More Is Better)
rtx 5090: 83296.84 | NVIDIA 5090: 83290.22 | GeForce RTX 5090: 83257.59

vkpeak 20240505 - fp16-scalar (GFLOPS, More Is Better)
rtx 5090: 62611.78 | NVIDIA 5090: 62611.17 | GeForce RTX 5090: 62597.51

vkpeak 20240505 - fp16-vec4 (GFLOPS, More Is Better)
rtx 5090: 72592.93 | NVIDIA 5090: 72578.62 | GeForce RTX 5090: 72575.33

vkpeak 20240505 - fp64-scalar (GFLOPS, More Is Better)
rtx 5090: 1967.37 | NVIDIA 5090: 1967.49 | GeForce RTX 5090: 1967.43

vkpeak 20240505 - fp64-vec4 (GFLOPS, More Is Better)
rtx 5090: 1965.70 | NVIDIA 5090: 1965.39 | GeForce RTX 5090: 1965.32

vkpeak 20240505 - int32-scalar (GIOPS, More Is Better)
rtx 5090: 62142.56 | NVIDIA 5090: 62142.07 | GeForce RTX 5090: 62141.02

vkpeak 20240505 - int32-vec4 (GIOPS, More Is Better)
rtx 5090: 61885.37 | NVIDIA 5090: 61894.95 | GeForce RTX 5090: 61914.51

vkpeak 20240505 - int16-scalar (GIOPS, More Is Better)
rtx 5090: 40006.60 | NVIDIA 5090: 39989.70 | GeForce RTX 5090: 39998.42

vkpeak 20240505 - int16-vec4 (GIOPS, More Is Better)
rtx 5090: 43806.13 | NVIDIA 5090: 43799.96 | GeForce RTX 5090: 43803.60
VkResample 1.0 - Upscale: 2x - Precision: Double (ms, Fewer Is Better)
rtx 5090: 103.51 | NVIDIA 5090: 103.48 | GeForce RTX 5090: 103.45

VkResample 1.0 - Upscale: 2x - Precision: Single (ms, Fewer Is Better)
rtx 5090: 5.648 | NVIDIA 5090: 5.649 | GeForce RTX 5090: 5.648

1. (CXX) g++ options (both VkResample tests above): -O3
Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better)
rtx 5090: 2.299 | NVIDIA 5090: 2.267 | GeForce RTX 5090: 2.278
FluidX3D 3.0 - Test: FP32-FP32 (MLUPs/s, More Is Better)
rtx 5090: 9524 | NVIDIA 5090: 9527 | GeForce RTX 5090: 9525

FluidX3D 3.0 - Test: FP32-FP16C (MLUPs/s, More Is Better)
rtx 5090: 19140 | NVIDIA 5090: 19121 | GeForce RTX 5090: 19135

FluidX3D 3.0 - Test: FP32-FP16S (MLUPs/s, More Is Better)
rtx 5090: 18499 | NVIDIA 5090: 18496 | GeForce RTX 5090: 18500
Phoronix Test Suite v10.8.5