nvidia rtx 5090 compute benchmarks
Tests for a future article. Intel Core Ultra 9 285K testing with an ASUS ROG MAXIMUS Z890 HERO (1203 BIOS) and ASUS NVIDIA GeForce RTX 5090 32GB on Ubuntu 24.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2501242-PTS-NVIDIART00&gru .
System details (listed once for all three result identifiers: rtx 5090, NVIDIA 5090, GeForce RTX 5090)
Processor: Intel Core Ultra 9 285K @ 5.10GHz (24 Cores)
Motherboard: ASUS ROG MAXIMUS Z890 HERO (1203 BIOS)
Chipset: Intel Device ae7f
Memory: 2 x 16GB DDR5-6400MT/s Micron CP16G64C38U5B.M8D1
Disk: 1000GB Western Digital WDS100T1X0E-00AFY0 + 4001GB Western Digital WD_BLACK SN850X 4000GB
Graphics: ASUS NVIDIA GeForce RTX 5090 32GB
Audio: Intel Device 7f50
Monitor: ASUS VP28U
Network: Realtek Device 8126 + Intel I226-V + Intel Wi-Fi 7
OS: Ubuntu 24.10
Kernel: 6.11.0-13-generic (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server 1.21.1.13
Display Driver: NVIDIA 570.86.10
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.8.51 + OpenCL 3.0
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 3840x2160
Kernel Details: nouveau.modeset=0 - Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_pstate performance (EPP: default) - CPU Microcode: 0x114 - Thermald 2.5.8
Graphics Details: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 98.02.2e.00.03
OpenCL Details: GPU Compute Cores: 21760
Security Details: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
Results overview
Test | rtx 5090 | NVIDIA 5090 | GeForce RTX 5090
vkfft: FFT + iFFT R2C / C2R | 164933 | 164801 | 164873
vkfft: FFT + iFFT C2C 1D batched in half precision | 302221 | 305671 | 300737
vkfft: FFT + iFFT C2C Bluestein in single precision | 36054 | 36026 | 35954
vkfft: FFT + iFFT C2C 1D batched in double precision | 63738 | 62773 | 63637
vkfft: FFT + iFFT C2C 1D batched in single precision | 237717 | 235884 | 239575
vkfft: FFT + iFFT C2C multidimensional in single precision | 144624 | 144600 | 144695
vkfft: FFT + iFFT C2C Bluestein benchmark in double precision | 9931 | 9940 | 9934
vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling | 243937 | 243913 | 241913
shoc: OpenCL - Triad | 27.8329 | 27.8098 | 27.7455
shoc: OpenCL - Reduction | 837.207 | 836.98 | 837.243
shoc: OpenCL - Bus Speed Download | 28.7867 | 28.787 | 28.7881
shoc: OpenCL - Bus Speed Readback | 28.6895 | 28.5856 | 28.1406
shoc: OpenCL - Texture Read Bandwidth | 2870.68 | 2872.53 | 2875.5
opencl-benchmark: Memory Bandwidth Coalesced Read | 1596.24 | 1603.93 | 1596.88
opencl-benchmark: Memory Bandwidth Coalesced Write | 1687.49 | 1679.44 | 1680.23
clpeak: Global Memory Bandwidth | 1562.97 | 1564.49 | 1564.61
clpeak: Transfer Bandwidth enqueueReadBuffer | 13.83 | 13.89 | 13.78
clpeak: Transfer Bandwidth enqueueWriteBuffer | 18.41 | 18.42 | 18.33
vkpeak: fp32-scalar | 63013.32 | 63035.62 | 63035.62
vkpeak: fp32-vec4 | 83296.84 | 83290.22 | 83257.59
vkpeak: fp16-scalar | 62611.78 | 62611.17 | 62597.51
vkpeak: fp16-vec4 | 72592.93 | 72578.62 | 72575.33
vkpeak: fp64-scalar | 1967.37 | 1967.49 | 1967.43
vkpeak: fp64-vec4 | 1965.7 | 1965.39 | 1965.32
shoc: OpenCL - S3D | 1117.54 | 1120.51 | 1120.88
shoc: OpenCL - FFT SP | 4398.39 | 4400.85 | 4375.95
shoc: OpenCL - GEMM SGEMM_N | 35937.2 | 36016.5 | 35961.3
shoc: OpenCL - Max SP Flops | 124615 | 124556 | 124646
clpeak: Double-Precision Compute | 1976.9 | 1977.26 | 1976.78
clpeak: Single-Precision Compute | 121415.53 | 121438.41 | 121419.57
shoc: OpenCL - MD5 Hash | 142.407 | 142.24 | 142.532
vkpeak: int32-scalar | 62142.56 | 62142.07 | 62141.02
vkpeak: int32-vec4 | 61885.37 | 61894.95 | 61914.51
vkpeak: int16-scalar | 40006.6 | 39989.7 | 39998.42
vkpeak: int16-vec4 | 43806.13 | 43799.96 | 43803.6
clpeak: Integer Compute | 62151.94 | 62119.51 | 62178.05
clpeak: Integer 24-bit Compute | 61843.11 | 61866.86 | 61903.93
hashcat: MD5 | 106848250000 | 106216550000 | 106544000000
hashcat: SHA1 | 68852500000 | 69072700000 | 69104300000
hashcat: 7-Zip | 3272300 | 3276600 | 3264300
hashcat: SHA-512 | 8900400000 | 8901200000 | 8895900000
hashcat: TrueCrypt RIPEMD160 + XTS | 2776000 | 2770700 | 2777600
indigobench: OpenCL GPU - Bedroom | 42.759 | 42.852 | 42.924
indigobench: OpenCL GPU - Supercar | 92.7 | 92.52 | 92.711
fluidx3d: FP32-FP32 | 9524 | 9527 | 9525
fluidx3d: FP32-FP16C | 19140 | 19121 | 19135
fluidx3d: FP32-FP16S | 18499 | 18496 | 18500
opencl-benchmark: FP64 Compute | 1.95 | 1.95 | 1.951
opencl-benchmark: FP32 Compute | 117.847 | 117.864 | 117.881
opencl-benchmark: FP16 Compute | 122.914 | 122.944 | 122.941
opencl-benchmark: INT64 Compute | 4.396 | 4.4 | 4.392
opencl-benchmark: INT32 Compute | 61.759 | 61.759 | 61.773
opencl-benchmark: INT16 Compute | 54.018 | 54.037 | 53.953
opencl-benchmark: INT8 Compute | 41.795 | 41.724 | 41.757
v-ray: NVIDIA RTX GPU | 11923 | 11923 | 11923
v-ray: NVIDIA CUDA GPU | 4851 | 4882 | 4882
namd-cuda: ATPase Simulation - 327,506 Atoms | 0.05810 | 0.05851 | 0.05943
vkresample: 2x - Double | 103.505 | 103.478 | 103.453
vkresample: 2x - Single | 5.648 | 5.649 | 5.648
ncnn: Vulkan GPU - mobilenet | 42.44 | 39.74 | 38.91
ncnn: Vulkan GPU - mobilenet-v2 | 10.21 | 11.25 | 8.37
ncnn: Vulkan GPU - mobilenet-v3 | 4.68 | 4.28 | 4.29
ncnn: Vulkan GPU - shufflenet-v2 | 4.53 | 4.04 | 5.12
ncnn: Vulkan GPU - mnasnet | 9.8 | 4.77 | 6.55
ncnn: Vulkan GPU - efficientnet-b0 | 27.44 | 20.78 | 19.79
ncnn: Vulkan GPU - blazeface | 2.67 | 3.05 | 2.67
ncnn: Vulkan GPU - googlenet | 13.92 | 12.78 | 18.06
ncnn: Vulkan GPU - vgg16 | 37.05 | 38.73 | 37.49
ncnn: Vulkan GPU - resnet18 | 10.93 | 9.41 | 11.01
ncnn: Vulkan GPU - alexnet | 8.74 | 8.55 | 8.99
ncnn: Vulkan GPU - resnet50 | 28.58 | 30.1 | 29.47
ncnn: Vulkan GPU - mobilenetv2-yolov3 | 42.44 | 39.74 | 38.91
ncnn: Vulkan GPU - yolov4-tiny | 39.34 | 38.91 | 36.86
ncnn: Vulkan GPU - squeezenet_ssd | 22.02 | 26.39 | 25.57
ncnn: Vulkan GPU - regnety_400m | 47.9 | 52.59 | 36.2
ncnn: Vulkan GPU - vision_transformer | 62.76 | 62.86 | 62.83
ncnn: Vulkan GPU - FastestDet | 11.01 | 8.68 | 9.29
realsr-ncnn: 4x - No | 4.63 | 4.4 | 4.446
realsr-ncnn: 4x - Yes | 13.484 | 13.504 | 13.486
waifu2x-ncnn: 2x - 3 - Yes | 2.299 | 2.267 | 2.278
blender: BMW27 - NVIDIA CUDA | 4.72 | 4.72 | 4.65
blender: BMW27 - NVIDIA OptiX | 2.92 | 2.97 | 2.98
blender: Junkshop - NVIDIA CUDA | 8.99 | 8.98 | 9
blender: Classroom - NVIDIA CUDA | 8.38 | 8.38 | 8.38
blender: Fishy Cat - NVIDIA CUDA | 8.92 | 8.92 | 8.92
blender: Junkshop - NVIDIA OptiX | 5.66 | 5.67 | 5.66
blender: Barbershop - NVIDIA CUDA | 35.14 | 35.28 | 35.1
blender: Classroom - NVIDIA OptiX | 6.16 | 6.14 | 6.17
blender: Fishy Cat - NVIDIA OptiX | 4.55 | 4.56 | 4.55
blender: Barbershop - NVIDIA OptiX | 24.33 | 24.6 | 24.5
blender: Pabellon Barcelona - NVIDIA CUDA | 17.35 | 17.34 | 17.44
blender: Pabellon Barcelona - NVIDIA OptiX | 7 | 7.06 | 7.04
clpeak: Kernel Latency | 5.15 | 5.16 | 5.15
VkFFT 1.3.4 - Test: FFT + iFFT R2C / C2R. Benchmark Score, More Is Better. rtx 5090: 164933; NVIDIA 5090: 164801; GeForce RTX 5090: 164873. (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in half precision. Benchmark Score, More Is Better. rtx 5090: 302221; NVIDIA 5090: 305671; GeForce RTX 5090: 300737. (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein in single precision. Benchmark Score, More Is Better. rtx 5090: 36054; NVIDIA 5090: 36026; GeForce RTX 5090: 35954. (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in double precision. Benchmark Score, More Is Better. rtx 5090: 63738; NVIDIA 5090: 62773; GeForce RTX 5090: 63637. (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision. Benchmark Score, More Is Better. rtx 5090: 237717; NVIDIA 5090: 235884; GeForce RTX 5090: 239575. (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C multidimensional in single precision. Benchmark Score, More Is Better. rtx 5090: 144624; NVIDIA 5090: 144600; GeForce RTX 5090: 144695. (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein benchmark in double precision. Benchmark Score, More Is Better. rtx 5090: 9931; NVIDIA 5090: 9940; GeForce RTX 5090: 9934. (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling. Benchmark Score, More Is Better. rtx 5090: 243937; NVIDIA 5090: 243913; GeForce RTX 5090: 241913. (CXX) g++ options: -O3
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Triad. GB/s, More Is Better. rtx 5090: 27.83; NVIDIA 5090: 27.81; GeForce RTX 5090: 27.75. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Reduction. GB/s, More Is Better. rtx 5090: 837.21; NVIDIA 5090: 836.98; GeForce RTX 5090: 837.24. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Download. GB/s, More Is Better. rtx 5090: 28.79; NVIDIA 5090: 28.79; GeForce RTX 5090: 28.79. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Readback. GB/s, More Is Better. rtx 5090: 28.69; NVIDIA 5090: 28.59; GeForce RTX 5090: 28.14. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Texture Read Bandwidth. GB/s, More Is Better. rtx 5090: 2870.68; NVIDIA 5090: 2872.53; GeForce RTX 5090: 2875.50. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
ProjectPhysX OpenCL-Benchmark 1.6 - Operation: Memory Bandwidth Coalesced Read. GB/s, More Is Better. rtx 5090: 1596.24; NVIDIA 5090: 1603.93; GeForce RTX 5090: 1596.88. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark 1.6 - Operation: Memory Bandwidth Coalesced Write. GB/s, More Is Better. rtx 5090: 1687.49; NVIDIA 5090: 1679.44; GeForce RTX 5090: 1680.23. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
clpeak 1.1.2 - OpenCL Test: Global Memory Bandwidth. GBPS, More Is Better. rtx 5090: 1562.97; NVIDIA 5090: 1564.49; GeForce RTX 5090: 1564.61. (CXX) g++ options: -O3
clpeak 1.1.2 - OpenCL Test: Transfer Bandwidth enqueueReadBuffer. GBPS, More Is Better. rtx 5090: 13.83; NVIDIA 5090: 13.89; GeForce RTX 5090: 13.78. (CXX) g++ options: -O3
clpeak 1.1.2 - OpenCL Test: Transfer Bandwidth enqueueWriteBuffer. GBPS, More Is Better. rtx 5090: 18.41; NVIDIA 5090: 18.42; GeForce RTX 5090: 18.33. (CXX) g++ options: -O3
vkpeak 20240505 - fp32-scalar. GFLOPS, More Is Better. rtx 5090: 63013.32; NVIDIA 5090: 63035.62; GeForce RTX 5090: 63035.62
vkpeak 20240505 - fp32-vec4. GFLOPS, More Is Better. rtx 5090: 83296.84; NVIDIA 5090: 83290.22; GeForce RTX 5090: 83257.59
vkpeak 20240505 - fp16-scalar. GFLOPS, More Is Better. rtx 5090: 62611.78; NVIDIA 5090: 62611.17; GeForce RTX 5090: 62597.51
vkpeak 20240505 - fp16-vec4. GFLOPS, More Is Better. rtx 5090: 72592.93; NVIDIA 5090: 72578.62; GeForce RTX 5090: 72575.33
vkpeak 20240505 - fp64-scalar. GFLOPS, More Is Better. rtx 5090: 1967.37; NVIDIA 5090: 1967.49; GeForce RTX 5090: 1967.43
vkpeak 20240505 - fp64-vec4. GFLOPS, More Is Better. rtx 5090: 1965.70; NVIDIA 5090: 1965.39; GeForce RTX 5090: 1965.32
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: S3D. GFLOPS, More Is Better. rtx 5090: 1117.54; NVIDIA 5090: 1120.51; GeForce RTX 5090: 1120.88. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: FFT SP. GFLOPS, More Is Better. rtx 5090: 4398.39; NVIDIA 5090: 4400.85; GeForce RTX 5090: 4375.95. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: GEMM SGEMM_N. GFLOPS, More Is Better. rtx 5090: 35937.2; NVIDIA 5090: 36016.5; GeForce RTX 5090: 35961.3. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Max SP Flops. GFLOPS, More Is Better. rtx 5090: 124615; NVIDIA 5090: 124556; GeForce RTX 5090: 124646. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
clpeak 1.1.2 - OpenCL Test: Double-Precision Compute. GFLOPS, More Is Better. rtx 5090: 1976.90; NVIDIA 5090: 1977.26; GeForce RTX 5090: 1976.78. (CXX) g++ options: -O3
clpeak 1.1.2 - OpenCL Test: Single-Precision Compute. GFLOPS, More Is Better. rtx 5090: 121415.53; NVIDIA 5090: 121438.41; GeForce RTX 5090: 121419.57. (CXX) g++ options: -O3
SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: MD5 Hash. GHash/s, More Is Better. rtx 5090: 142.41; NVIDIA 5090: 142.24; GeForce RTX 5090: 142.53. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
vkpeak 20240505 - int32-scalar. GIOPS, More Is Better. rtx 5090: 62142.56; NVIDIA 5090: 62142.07; GeForce RTX 5090: 62141.02
vkpeak 20240505 - int32-vec4. GIOPS, More Is Better. rtx 5090: 61885.37; NVIDIA 5090: 61894.95; GeForce RTX 5090: 61914.51
vkpeak 20240505 - int16-scalar. GIOPS, More Is Better. rtx 5090: 40006.60; NVIDIA 5090: 39989.70; GeForce RTX 5090: 39998.42
vkpeak 20240505 - int16-vec4. GIOPS, More Is Better. rtx 5090: 43806.13; NVIDIA 5090: 43799.96; GeForce RTX 5090: 43803.60
clpeak 1.1.2 - OpenCL Test: Integer Compute. GIOPS, More Is Better. rtx 5090: 62151.94; NVIDIA 5090: 62119.51; GeForce RTX 5090: 62178.05. (CXX) g++ options: -O3
clpeak 1.1.2 - OpenCL Test: Integer 24-bit Compute. GIOPS, More Is Better. rtx 5090: 61843.11; NVIDIA 5090: 61866.86; GeForce RTX 5090: 61903.93. (CXX) g++ options: -O3
Hashcat 6.2.4 - Benchmark: MD5. H/s, More Is Better. rtx 5090: 106848250000 (SE +/- 102551750000.00, N = 2); NVIDIA 5090: 106216550000 (SE +/- 101883450000.00, N = 2); GeForce RTX 5090: 106544000000 (SE +/- 102256000000.00, N = 2)
Hashcat 6.2.4 - Benchmark: SHA1. H/s, More Is Better. rtx 5090: 68852500000; NVIDIA 5090: 69072700000; GeForce RTX 5090: 69104300000
Hashcat 6.2.4 - Benchmark: 7-Zip. H/s, More Is Better. rtx 5090: 3272300; NVIDIA 5090: 3276600; GeForce RTX 5090: 3264300
Hashcat 6.2.4 - Benchmark: SHA-512. H/s, More Is Better. rtx 5090: 8900400000; NVIDIA 5090: 8901200000; GeForce RTX 5090: 8895900000
Hashcat 6.2.4 - Benchmark: TrueCrypt RIPEMD160 + XTS. H/s, More Is Better. rtx 5090: 2776000; NVIDIA 5090: 2770700; GeForce RTX 5090: 2777600
IndigoBench 4.4 - Acceleration: OpenCL GPU - Scene: Bedroom. M samples/s, More Is Better. rtx 5090: 42.76; NVIDIA 5090: 42.85; GeForce RTX 5090: 42.92
IndigoBench 4.4 - Acceleration: OpenCL GPU - Scene: Supercar. M samples/s, More Is Better. rtx 5090: 92.70; NVIDIA 5090: 92.52; GeForce RTX 5090: 92.71
FluidX3D 3.0 - Test: FP32-FP32. MLUPs/s, More Is Better. rtx 5090: 9524; NVIDIA 5090: 9527; GeForce RTX 5090: 9525
FluidX3D 3.0 - Test: FP32-FP16C. MLUPs/s, More Is Better. rtx 5090: 19140; NVIDIA 5090: 19121; GeForce RTX 5090: 19135
FluidX3D 3.0 - Test: FP32-FP16S. MLUPs/s, More Is Better. rtx 5090: 18499; NVIDIA 5090: 18496; GeForce RTX 5090: 18500
ProjectPhysX OpenCL-Benchmark 1.6 - Operation: FP64 Compute. TFLOPs/s, More Is Better. rtx 5090: 1.950; NVIDIA 5090: 1.950; GeForce RTX 5090: 1.951. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark 1.6 - Operation: FP32 Compute. TFLOPs/s, More Is Better. rtx 5090: 117.85; NVIDIA 5090: 117.86; GeForce RTX 5090: 117.88. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark 1.6 - Operation: FP16 Compute. TFLOPs/s, More Is Better. rtx 5090: 122.91; NVIDIA 5090: 122.94; GeForce RTX 5090: 122.94. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark 1.6 - Operation: INT64 Compute. TIOPs/s, More Is Better. rtx 5090: 4.396; NVIDIA 5090: 4.400; GeForce RTX 5090: 4.392. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark 1.6 - Operation: INT32 Compute. TIOPs/s, More Is Better. rtx 5090: 61.76; NVIDIA 5090: 61.76; GeForce RTX 5090: 61.77. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark 1.6 - Operation: INT16 Compute. TIOPs/s, More Is Better. rtx 5090: 54.02; NVIDIA 5090: 54.04; GeForce RTX 5090: 53.95. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark 1.6 - Operation: INT8 Compute. TIOPs/s, More Is Better. rtx 5090: 41.80; NVIDIA 5090: 41.72; GeForce RTX 5090: 41.76. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
Chaos Group V-RAY 6.0 - Mode: NVIDIA RTX GPU. vpaths, More Is Better. rtx 5090: 11923; NVIDIA 5090: 11923; GeForce RTX 5090: 11923
Chaos Group V-RAY 6.0 - Mode: NVIDIA CUDA GPU. vpaths, More Is Better. rtx 5090: 4851; NVIDIA 5090: 4882; GeForce RTX 5090: 4882
NAMD CUDA 2.14 - ATPase Simulation - 327,506 Atoms. days/ns, Fewer Is Better. rtx 5090: 0.05810; NVIDIA 5090: 0.05851; GeForce RTX 5090: 0.05943
VkResample 1.0 - Upscale: 2x - Precision: Double. ms, Fewer Is Better. rtx 5090: 103.51; NVIDIA 5090: 103.48; GeForce RTX 5090: 103.45. (CXX) g++ options: -O3
VkResample 1.0 - Upscale: 2x - Precision: Single. ms, Fewer Is Better. rtx 5090: 5.648; NVIDIA 5090: 5.649; GeForce RTX 5090: 5.648. (CXX) g++ options: -O3
NCNN 20241226 - Target: Vulkan GPU - Model: mobilenet. ms, Fewer Is Better. rtx 5090: 42.44 (MIN: 8.34 / MAX: 76.17); NVIDIA 5090: 39.74 (MIN: 8.16 / MAX: 76.72); GeForce RTX 5090: 38.91 (MIN: 8.9 / MAX: 75.48). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: mobilenet-v2. ms, Fewer Is Better. rtx 5090: 10.21 (MIN: 3.85 / MAX: 64.73); NVIDIA 5090: 11.25 (MIN: 3.82 / MAX: 63.94); GeForce RTX 5090: 8.37 (MIN: 3.84 / MAX: 63.9). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: mobilenet-v3. ms, Fewer Is Better. rtx 5090: 4.68 (MIN: 4.08 / MAX: 57.94); NVIDIA 5090: 4.28 (MIN: 4.05 / MAX: 5.15); GeForce RTX 5090: 4.29 (MIN: 4.06 / MAX: 5.93). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: shufflenet-v2. ms, Fewer Is Better. rtx 5090: 4.53 (MIN: 3.88 / MAX: 57.59); NVIDIA 5090: 4.04 (MIN: 3.9 / MAX: 5.77); GeForce RTX 5090: 5.12 (MIN: 3.91 / MAX: 67.5). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: mnasnet. ms, Fewer Is Better. rtx 5090: 9.80 (MIN: 3.75 / MAX: 63.5); NVIDIA 5090: 4.77 (MIN: 3.69 / MAX: 54.04); GeForce RTX 5090: 6.55 (MIN: 3.69 / MAX: 59.21). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: efficientnet-b0. ms, Fewer Is Better. rtx 5090: 27.44 (MIN: 6.34 / MAX: 109.98); NVIDIA 5090: 20.78 (MIN: 6.36 / MAX: 109.98); GeForce RTX 5090: 19.79 (MIN: 6.3 / MAX: 110.05). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: blazeface. ms, Fewer Is Better. rtx 5090: 2.67 (MIN: 2.4 / MAX: 41.21); NVIDIA 5090: 3.05 (MIN: 2.38 / MAX: 49.95); GeForce RTX 5090: 2.67 (MIN: 2.38 / MAX: 25.86). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: googlenet. ms, Fewer Is Better. rtx 5090: 13.92 (MIN: 7.49 / MAX: 98.37); NVIDIA 5090: 12.78 (MIN: 7.62 / MAX: 95.87); GeForce RTX 5090: 18.06 (MIN: 7.48 / MAX: 98.98). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: vgg16. ms, Fewer Is Better. rtx 5090: 37.05 (MIN: 22.53 / MAX: 46.22); NVIDIA 5090: 38.73 (MIN: 20.99 / MAX: 46.14); GeForce RTX 5090: 37.49 (MIN: 23.61 / MAX: 45.51). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: resnet18. ms, Fewer Is Better. rtx 5090: 10.93 (MIN: 4.48 / MAX: 44.35); NVIDIA 5090: 9.41 (MIN: 4.47 / MAX: 43.1); GeForce RTX 5090: 11.01 (MIN: 4.51 / MAX: 42.89). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: alexnet. ms, Fewer Is Better. rtx 5090: 8.74 (MIN: 3.21 / MAX: 22.23); NVIDIA 5090: 8.55 (MIN: 3.18 / MAX: 21.75); GeForce RTX 5090: 8.99 (MIN: 3.19 / MAX: 21.78). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: resnet50. ms, Fewer Is Better. rtx 5090: 28.58 (MIN: 10.02 / MAX: 89.48); NVIDIA 5090: 30.10 (MIN: 10.13 / MAX: 90.77); GeForce RTX 5090: 29.47 (MIN: 10.07 / MAX: 90.59). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: mobilenetv2-yolov3. ms, Fewer Is Better. rtx 5090: 42.44 (MIN: 8.34 / MAX: 76.17); NVIDIA 5090: 39.74 (MIN: 8.16 / MAX: 76.72); GeForce RTX 5090: 38.91 (MIN: 8.9 / MAX: 75.48). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: yolov4-tiny. ms, Fewer Is Better. rtx 5090: 39.34 (MIN: 15.92 / MAX: 49.04); NVIDIA 5090: 38.91 (MIN: 15.12 / MAX: 48.75); GeForce RTX 5090: 36.86 (MIN: 11.12 / MAX: 47.83). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: squeezenet_ssd. ms, Fewer Is Better. rtx 5090: 22.02 (MIN: 7.41 / MAX: 92.66); NVIDIA 5090: 26.39 (MIN: 7.39 / MAX: 95.48); GeForce RTX 5090: 25.57 (MIN: 7.22 / MAX: 94.51). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: regnety_400m. ms, Fewer Is Better. rtx 5090: 47.90 (MIN: 21.96 / MAX: 421.33); NVIDIA 5090: 52.59 (MIN: 21.91 / MAX: 425.58); GeForce RTX 5090: 36.20 (MIN: 21.98 / MAX: 458.49). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: vision_transformer. ms, Fewer Is Better. rtx 5090: 62.76 (MIN: 40.3 / MAX: 105.61); NVIDIA 5090: 62.86 (MIN: 42.12 / MAX: 106.46); GeForce RTX 5090: 62.83 (MIN: 41.21 / MAX: 109.15). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20241226 - Target: Vulkan GPU - Model: FastestDet. ms, Fewer Is Better. rtx 5090: 11.01 (MIN: 5.09 / MAX: 92.36); NVIDIA 5090: 8.68 (MIN: 5.07 / MAX: 85.5); GeForce RTX 5090: 9.29 (MIN: 5.01 / MAX: 89.47). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
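The NCNN averages above differ much more between the three result identifiers than most other tests in this export. As a minimal sketch, assuming the three identifiers are repeat runs on the one system listed in the system details, the spread can be expressed as (max - min) / mean. The helper name `relative_spread` is hypothetical and is not part of the Phoronix Test Suite. The numbers are copied from the NCNN results in this export.

```python
from statistics import mean


def relative_spread(values):
    """Return (max - min) / mean of the values, as a percentage."""
    avg = mean(values)
    return (max(values) - min(values)) / avg * 100.0


# Average inference times in ms (fewer is better), in the order
# rtx 5090, NVIDIA 5090, GeForce RTX 5090, copied from this export.
ncnn_results = {
    "mnasnet": [9.80, 4.77, 6.55],
    "regnety_400m": [47.90, 52.59, 36.20],
    "vision_transformer": [62.76, 62.86, 62.83],
}

for model, runs in ncnn_results.items():
    print(f"{model}: {relative_spread(runs):.1f}% spread across identifiers")
```

A large spread on a short-running model suggests that the reported average is dominated by outliers (compare the MIN and MAX values), so such results are better read alongside the MIN figures than on their own.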
RealSR-NCNN 20200818 - Scale: 4x - TAA: No. Seconds, Fewer Is Better. rtx 5090: 4.630; NVIDIA 5090: 4.400; GeForce RTX 5090: 4.446
RealSR-NCNN 20200818 - Scale: 4x - TAA: Yes. Seconds, Fewer Is Better. rtx 5090: 13.48; NVIDIA 5090: 13.50; GeForce RTX 5090: 13.49
Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: Yes. Seconds, Fewer Is Better. rtx 5090: 2.299; NVIDIA 5090: 2.267; GeForce RTX 5090: 2.278
Blender 4.3 - Blend File: BMW27 - Compute: NVIDIA CUDA. Seconds, Fewer Is Better. rtx 5090: 4.72; NVIDIA 5090: 4.72; GeForce RTX 5090: 4.65
Blender 4.3 - Blend File: BMW27 - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. rtx 5090: 2.92; NVIDIA 5090: 2.97; GeForce RTX 5090: 2.98
Blender 4.3 - Blend File: Junkshop - Compute: NVIDIA CUDA. Seconds, Fewer Is Better. rtx 5090: 8.99; NVIDIA 5090: 8.98; GeForce RTX 5090: 9.00
Blender 4.3 - Blend File: Classroom - Compute: NVIDIA CUDA. Seconds, Fewer Is Better. rtx 5090: 8.38; NVIDIA 5090: 8.38; GeForce RTX 5090: 8.38
Blender 4.3 - Blend File: Fishy Cat - Compute: NVIDIA CUDA. Seconds, Fewer Is Better. rtx 5090: 8.92; NVIDIA 5090: 8.92; GeForce RTX 5090: 8.92
Blender 4.3 - Blend File: Junkshop - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. rtx 5090: 5.66; NVIDIA 5090: 5.67; GeForce RTX 5090: 5.66
Blender 4.3 - Blend File: Barbershop - Compute: NVIDIA CUDA. Seconds, Fewer Is Better. rtx 5090: 35.14; NVIDIA 5090: 35.28; GeForce RTX 5090: 35.10
Blender 4.3 - Blend File: Classroom - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. rtx 5090: 6.16; NVIDIA 5090: 6.14; GeForce RTX 5090: 6.17
Blender 4.3 - Blend File: Fishy Cat - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. rtx 5090: 4.55; NVIDIA 5090: 4.56; GeForce RTX 5090: 4.55
Blender 4.3 - Blend File: Barbershop - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. rtx 5090: 24.33; NVIDIA 5090: 24.60; GeForce RTX 5090: 24.50
Blender 4.3 - Blend File: Pabellon Barcelona - Compute: NVIDIA CUDA. Seconds, Fewer Is Better. rtx 5090: 17.35; NVIDIA 5090: 17.34; GeForce RTX 5090: 17.44
Blender 4.3 - Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. rtx 5090: 7.00; NVIDIA 5090: 7.06; GeForce RTX 5090: 7.04
clpeak 1.1.2 - OpenCL Test: Kernel Latency. us, Fewer Is Better. rtx 5090: 5.15; NVIDIA 5090: 5.16; GeForce RTX 5090: 5.15. (CXX) g++ options: -O3
Phoronix Test Suite v10.8.5