NVIDIA RTX 5090 Compute Benchmarks Performance

Tests for a future article. Intel Core Ultra 9 285K testing with an ASUS ROG MAXIMUS Z890 HERO (1203 BIOS) and ASUS NVIDIA GeForce RTX 5090 32GB on Ubuntu 24.10 via the Phoronix Test Suite.

vkpeak

RealSR-NCNN

RealSR-NCNN is an NCNN neural network implementation of the RealSR project, accelerated using the Vulkan API. RealSR stands for Real-World Super-Resolution via Kernel Estimation and Noise Injection. NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. This test profile times how long it takes to upscale a sample image by 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project, accelerated using the Vulkan API. NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. This test profile times how long it takes to upscale a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Scale: 2x - Denoise: 3 - TAA: No

rtx 5090: The test run did not produce a result.

NVIDIA 5090: The test run did not produce a result.

GeForce RTX 5090: The test run did not produce a result.

VkFFT

VkFFT is a Fast Fourier Transform (FFT) library that is GPU-accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different transform sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

Hashcat

Hashcat is an open-source, advanced password recovery tool supporting GPU acceleration with OpenCL, NVIDIA CUDA, and Radeon ROCm. Learn more via the OpenBenchmarking.org test page.
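
To make the Hashcat results easier to interpret, here is a minimal CPU-only sketch of what a hash-rate figure counts: candidate passwords hashed and compared against a target digest per unit of time. It uses Python's standard hashlib and is not how Hashcat itself is implemented; the function name recover_md5 and the sample word list are hypothetical, chosen only for illustration.

```python
import hashlib
import time


def recover_md5(target_hex, candidates):
    """Try each candidate until its MD5 digest matches target_hex.

    Returns (recovered_word_or_None, candidates_tried, elapsed_seconds).
    """
    tried = 0
    start = time.perf_counter()
    for word in candidates:
        tried += 1
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        if digest == target_hex:
            return word, tried, time.perf_counter() - start
    return None, tried, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical target and word list, for illustration only.
    target = hashlib.md5(b"hunter2").hexdigest()
    words = ["password", "letmein", "hunter2", "qwerty"]
    found, tried, elapsed = recover_md5(target, words)
    print(f"recovered={found!r} after {tried} candidates")
    if elapsed > 0:
        # Hash rate = candidates hashed per second.
        print(f"approximate rate: {tried / elapsed:.0f} H/s")
```

A GPU tool runs the same hash-and-compare loop across many thousands of candidates in parallel, which is why its rates are reported in much larger units than this single-threaded loop would reach.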

SHOC Scalable HeterOgeneous Computing

This is the CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

ProjectPhysX OpenCL-Benchmark

NAMD CUDA

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. This version of the NAMD test profile uses CUDA GPU acceleration. Learn more via the OpenBenchmarking.org test page.

VkResample

VkResample is a Vulkan-based image upscaling library built on VkFFT. The sample test upscales a 4K image to 8K using Vulkan-based GPU acceleration. Learn more via the OpenBenchmarking.org test page.
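
As a rough illustration of FFT-based upscaling, the sketch below doubles the length of a 1D signal by zero-padding its spectrum. This page does not describe VkResample's internals, so treating zero-padding in the frequency domain as the technique is an assumption; the naive O(n^2) DFT and the function names dft, idft, and upsample_2x are likewise illustrative only.

```python
import cmath
import math


def dft(x):
    """Naive forward discrete Fourier transform (O(n^2))."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform, normalized by 1/n."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def upsample_2x(signal):
    """Double the sample count of a real signal by zero-padding its spectrum."""
    n = len(signal)
    spectrum = dft(signal)
    half = (n + 1) // 2
    # Keep the low (non-negative) frequencies at the start and the negative
    # frequencies at the end, with n zero bins inserted in between.
    padded = spectrum[:half] + [0j] * n + spectrum[half:]
    # The inverse transform divides by 2n instead of n, so scale by 2.
    # Taking the real part discards the residual imaginary component.
    return [2 * value.real for value in idft(padded)]


if __name__ == "__main__":
    coarse = [math.sin(2 * math.pi * t / 8) for t in range(8)]
    fine = upsample_2x(coarse)
    print(len(coarse), "->", len(fine))
```

The original samples reappear at the even indices of the output and the odd indices hold the interpolated values. A 2D image would apply the same idea along both axes.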

FluidX3D

clpeak

clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.
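
Measured peak-compute figures are usually read against a theoretical ceiling. The sketch below estimates peak FP32 throughput as cores x FLOPs per clock x clock speed. The core count of 21760 comes from the OpenCL notes of this result; the 2.4 GHz clock and the 2 FLOPs per core per clock (one fused multiply-add) are assumptions, since this page reports no GPU clock.

```python
def peak_fp32_tflops(cores, clock_ghz, flops_per_core_per_clock=2):
    """Theoretical peak in TFLOPS: cores * FLOPs per clock * GHz / 1000."""
    return cores * flops_per_core_per_clock * clock_ghz / 1000.0


if __name__ == "__main__":
    CORES = 21760            # "GPU Compute Cores" from the OpenCL notes
    ASSUMED_CLOCK_GHZ = 2.4  # hypothetical clock, not reported on this page
    estimate = peak_fp32_tflops(CORES, ASSUMED_CLOCK_GHZ)
    print(f"estimated peak FP32: {estimate:.1f} TFLOPS")
```

Real kernels rarely sustain this ceiling, and actual clocks vary with load and cooling, so the number is only a reference point for the measured results.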

NCNN

Blender

IndigoBench

This is a test of Indigo Renderer's IndigoBench benchmark. Learn more via the OpenBenchmarking.org test page.

Chaos Group V-RAY

This is a test of Chaos Group's V-RAY benchmark. V-RAY is a commercial renderer that can integrate with various creator software products like SketchUp and 3ds Max. The V-RAY benchmark is standalone and supports CPU and NVIDIA CUDA/RTX based rendering. Learn more via the OpenBenchmarking.org test page.

93 Results Shown

vkpeak:
fp32-scalar
fp32-vec4
fp16-scalar
fp16-vec4
fp64-scalar
fp64-vec4
int32-scalar
int32-vec4
int16-scalar
int16-vec4
RealSR-NCNN:
4x - No
4x - Yes
Waifu2x-NCNN Vulkan
VkFFT:
FFT + iFFT R2C / C2R
FFT + iFFT C2C 1D batched in half precision
FFT + iFFT C2C Bluestein in single precision
FFT + iFFT C2C 1D batched in double precision
FFT + iFFT C2C 1D batched in single precision
FFT + iFFT C2C multidimensional in single precision
FFT + iFFT C2C Bluestein benchmark in double precision
FFT + iFFT C2C 1D batched in single precision, no reshuffling
Hashcat:
MD5
SHA1
7-Zip
SHA-512
TrueCrypt RIPEMD160 + XTS
SHOC Scalable HeterOgeneous Computing:
OpenCL - S3D
OpenCL - Triad
OpenCL - FFT SP
OpenCL - MD5 Hash
OpenCL - Reduction
OpenCL - GEMM SGEMM_N
OpenCL - Max SP Flops
OpenCL - Bus Speed Download
OpenCL - Bus Speed Readback
OpenCL - Texture Read Bandwidth
ProjectPhysX OpenCL-Benchmark:
FP64 Compute
FP32 Compute
FP16 Compute
INT64 Compute
INT32 Compute
INT16 Compute
INT8 Compute
Memory Bandwidth Coalesced Read
Memory Bandwidth Coalesced Write
NAMD CUDA
VkResample:
2x - Double
2x - Single
FluidX3D:
FP32-FP32
FP32-FP16C
FP32-FP16S
clpeak:
Kernel Latency
Integer Compute
Integer 24-bit Compute
Global Memory Bandwidth
Double-Precision Compute
Single-Precision Compute
Transfer Bandwidth enqueueReadBuffer
Transfer Bandwidth enqueueWriteBuffer
NCNN:
Vulkan GPU - mobilenet
Vulkan GPU - mobilenet-v2
Vulkan GPU - mobilenet-v3
Vulkan GPU - shufflenet-v2
Vulkan GPU - mnasnet
Vulkan GPU - efficientnet-b0
Vulkan GPU - blazeface
Vulkan GPU - googlenet
Vulkan GPU - vgg16
Vulkan GPU - resnet18
Vulkan GPU - alexnet
Vulkan GPU - resnet50
Vulkan GPU - mobilenetv2-yolov3
Vulkan GPU - yolov4-tiny
Vulkan GPU - squeezenet_ssd
Vulkan GPU - regnety_400m
Vulkan GPU - vision_transformer
Vulkan GPU - FastestDet
Blender:
BMW27 - NVIDIA CUDA
BMW27 - NVIDIA OptiX
Junkshop - NVIDIA CUDA
Classroom - NVIDIA CUDA
Fishy Cat - NVIDIA CUDA
Junkshop - NVIDIA OptiX
Barbershop - NVIDIA CUDA
Classroom - NVIDIA OptiX
Fishy Cat - NVIDIA OptiX
Barbershop - NVIDIA OptiX
Pabellon Barcelona - NVIDIA CUDA
Pabellon Barcelona - NVIDIA OptiX
IndigoBench:
OpenCL GPU - Bedroom
OpenCL GPU - Supercar
Chaos Group V-RAY:
NVIDIA RTX GPU
NVIDIA CUDA GPU

rtx 5090

Kernel Notes: nouveau.modeset=0 - Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: intel_pstate performance (EPP: default) - CPU Microcode: 0x114 - Thermald 2.5.8
Graphics Notes: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 98.02.2e.00.03
OpenCL Notes: GPU Compute Cores: 21760
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Testing initiated at 24 January 2025 16:55 by user pts.

NVIDIA 5090

Testing initiated at 24 January 2025 17:49 by user pts.

GeForce RTX 5090

Processor: Intel Core Ultra 9 285K @ 5.10GHz (24 Cores), Motherboard: ASUS ROG MAXIMUS Z890 HERO (1203 BIOS), Chipset: Intel Device ae7f, Memory: 2 x 16GB DDR5-6400MT/s Micron CP16G64C38U5B.M8D1, Disk: 1000GB Western Digital WDS100T1X0E-00AFY0 + 4001GB Western Digital WD_BLACK SN850X 4000GB, Graphics: ASUS NVIDIA GeForce RTX 5090 32GB, Audio: Intel Device 7f50, Monitor: ASUS VP28U, Network: Realtek Device 8126 + Intel I226-V + Intel Wi-Fi 7

OS: Ubuntu 24.10, Kernel: 6.11.0-13-generic (x86_64), Desktop: GNOME Shell 47.0, Display Server: X Server 1.21.1.13, Display Driver: NVIDIA 570.86.10, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.8.51 + OpenCL 3.0, Compiler: GCC 14.2.0, File-System: ext4, Screen Resolution: 3840x2160

Testing initiated at 24 January 2025 18:39 by user pts.
