nvidia rtx 5090 compute benchmarks

Tests for a future article. Intel Core Ultra 9 285K testing with an ASUS ROG MAXIMUS Z890 HERO (1203 BIOS) and ASUS NVIDIA GeForce RTX 5090 32GB on Ubuntu 24.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2501242-PTS-NVIDIART00

Run Management

Result Identifier    Date          Test Duration
rtx 5090             January 24    1 Hour, 45 Minutes
NVIDIA 5090          January 24    1 Hour, 42 Minutes
GeForce RTX 5090     January 24    1 Hour, 41 Minutes
Average                            1 Hour, 43 Minutes


nvidia rtx 5090 compute benchmarks (OpenBenchmarking.org, Phoronix Test Suite)

Processor: Intel Core Ultra 9 285K @ 5.10GHz (24 Cores)
Motherboard: ASUS ROG MAXIMUS Z890 HERO (1203 BIOS)
Chipset: Intel Device ae7f
Memory: 2 x 16GB DDR5-6400MT/s Micron CP16G64C38U5B.M8D1
Disk: 1000GB Western Digital WDS100T1X0E-00AFY0 + 4001GB Western Digital WD_BLACK SN850X 4000GB
Graphics: ASUS NVIDIA GeForce RTX 5090 32GB
Audio: Intel Device 7f50
Monitor: ASUS VP28U
Network: Realtek Device 8126 + Intel I226-V + Intel Wi-Fi 7
OS: Ubuntu 24.10
Kernel: 6.11.0-13-generic (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server 1.21.1.13
Display Driver: NVIDIA 570.86.10
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.8.51 + OpenCL 3.0
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 3840x2160

System Logs
- nouveau.modeset=0
- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: intel_pstate performance (EPP: default)
- CPU Microcode: 0x114
- Thermald 2.5.8
- BAR1 / Visible vRAM Size: 32768 MiB
- vBIOS Version: 98.02.2e.00.03
- GPU Compute Cores: 21760
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Result Overview (Phoronix Test Suite): relative performance of rtx 5090, NVIDIA 5090, and GeForce RTX 5090 on a 100% to 107% scale, across NCNN, RealSR-NCNN, NAMD CUDA, Waifu2x-NCNN Vulkan, Blender, Chaos Group V-RAY, SHOC Scalable HeterOgeneous Computing, IndigoBench, VkFFT, clpeak, Hashcat, ProjectPhysX OpenCL-Benchmark, FluidX3D, VkResample, and vkpeak.
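
The result overview expresses each run as a percentage relative to the others. The page does not state how the Phoronix Test Suite computes those percentages, so the following is only a minimal sketch of one plausible approach: scale each test so the slowest run is 100%, invert "Fewer Is Better" tests, and combine tests with a geometric mean. The function names and the choice of three sample tests are illustrative assumptions; the values are taken from this result file.

```python
from math import prod

RUNS = ("rtx 5090", "NVIDIA 5090", "GeForce RTX 5090")

# test name -> (values in RUNS order, True when "More Is Better")
results = {
    "vkpeak fp32-scalar (GFLOPS)": ((63013.32, 63035.62, 63035.62), True),
    "RealSR-NCNN 4x, TAA: No (Seconds)": ((4.630, 4.400, 4.446), False),
    "Blender BMW27, NVIDIA OptiX (Seconds)": ((2.92, 2.97, 2.98), False),
}


def normalize(values, higher_is_better):
    """Scale the runs of one test so the slowest run equals 1.0."""
    if higher_is_better:
        worst = min(values)
        return [v / worst for v in values]
    worst = max(values)  # for timings, the largest value is the slowest
    return [worst / v for v in values]


def geometric_mean(xs):
    """Geometric mean of a non-empty sequence of positive numbers."""
    return prod(xs) ** (1.0 / len(xs))


normalized = [normalize(vals, hib) for vals, hib in results.values()]
for i, run in enumerate(RUNS):
    overall = geometric_mean([per_test[i] for per_test in normalized])
    print(f"{run}: {overall * 100:.1f}% of the slowest run (geometric mean)")
```

A geometric mean is used here because the tests have different units and magnitudes; an arithmetic mean of raw values would be dominated by the largest numbers.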

Result table: per-test values for rtx 5090, NVIDIA 5090, and GeForce RTX 5090, covering the tests listed under each benchmark heading that follows.

vkpeak

vkpeak 20240505 (More Is Better)

Test            Unit      rtx 5090    NVIDIA 5090    GeForce RTX 5090
fp32-scalar     GFLOPS    63013.32    63035.62       63035.62
fp32-vec4       GFLOPS    83296.84    83290.22       83257.59
fp16-scalar     GFLOPS    62611.78    62611.17       62597.51
fp16-vec4       GFLOPS    72592.93    72578.62       72575.33
fp64-scalar     GFLOPS    1967.37     1967.49        1967.43
fp64-vec4       GFLOPS    1965.70     1965.39        1965.32
int32-scalar    GIOPS     62142.56    62142.07       62141.02
int32-vec4      GIOPS     61885.37    61894.95       61914.51
int16-scalar    GIOPS     40006.60    39989.70       39998.42
int16-vec4      GIOPS     43806.13    43799.96       43803.60

RealSR-NCNN

RealSR-NCNN is an NCNN neural network implementation of the RealSR project and accelerated using the Vulkan API. RealSR is the Real-World Super Resolution via Kernel Estimation and Noise Injection. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image by a scale of 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.

RealSR-NCNN 20200818 (Seconds, Fewer Is Better)

Test                    rtx 5090    NVIDIA 5090    GeForce RTX 5090
Scale: 4x - TAA: No     4.630       4.400          4.446
Scale: 4x - TAA: Yes    13.48       13.50          13.49

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project and accelerated using the Vulkan API. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Scale: 2x - Denoise: 3 - TAA: No

rtx 5090: The test run did not produce a result.

NVIDIA 5090: The test run did not produce a result.

GeForce RTX 5090: The test run did not produce a result.

Waifu2x-NCNN Vulkan 20200818 (Seconds, Fewer Is Better)

Test                                  rtx 5090    NVIDIA 5090    GeForce RTX 5090
Scale: 2x - Denoise: 3 - TAA: Yes     2.299       2.267          2.278

VkFFT

VkFFT is a Fast Fourier Transform (FFT) library that is GPU accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

VkFFT 1.3.4 (Benchmark Score, More Is Better)

Test                                                             rtx 5090    NVIDIA 5090    GeForce RTX 5090
FFT + iFFT R2C / C2R                                             164933      164801         164873
FFT + iFFT C2C 1D batched in half precision                      302221      305671         300737
FFT + iFFT C2C Bluestein in single precision                     36054       36026          35954
FFT + iFFT C2C 1D batched in double precision                    63738       62773          63637
FFT + iFFT C2C 1D batched in single precision                    237717      235884         239575
FFT + iFFT C2C multidimensional in single precision              144624      144600         144695
FFT + iFFT C2C Bluestein benchmark in double precision           9931        9940           9934
FFT + iFFT C2C 1D batched in single precision, no reshuffling    243937      243913         241913

1. (CXX) g++ options: -O3

Hashcat

Hashcat is an open-source, advanced password recovery tool supporting GPU acceleration with OpenCL, NVIDIA CUDA, and Radeon ROCm. Learn more via the OpenBenchmarking.org test page.

Hashcat 6.2.4 (H/s, More Is Better)

Benchmark                    rtx 5090        NVIDIA 5090     GeForce RTX 5090
MD5                          106848250000    106216550000    106544000000
SHA1                         68852500000     69072700000     69104300000
7-Zip                        3272300         3276600         3264300
SHA-512                      8900400000      8901200000      8895900000
TrueCrypt RIPEMD160 + XTS    2776000         2770700         2777600

MD5 standard error: rtx 5090: SE +/- 102551750000.00, N = 2; NVIDIA 5090: SE +/- 101883450000.00, N = 2; GeForce RTX 5090: SE +/- 102256000000.00, N = 2

SHOC Scalable HeterOgeneous Computing

This is the CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL (More Is Better)

Benchmark                 Unit       rtx 5090    NVIDIA 5090    GeForce RTX 5090
S3D                       GFLOPS     1117.54     1120.51        1120.88
Triad                     GB/s       27.83       27.81          27.75
FFT SP                    GFLOPS     4398.39     4400.85        4375.95
MD5 Hash                  GHash/s    142.41      142.24         142.53
Reduction                 GB/s       837.21      836.98         837.24
GEMM SGEMM_N              GFLOPS     35937.2     36016.5        35961.3
Max SP Flops              GFLOPS     124615      124556         124646
Bus Speed Download        GB/s       28.79       28.79          28.79
Bus Speed Readback        GB/s       28.69       28.59          28.14
Texture Read Bandwidth    GB/s       2870.68     2872.53        2875.50

1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.6 (More Is Better)

Operation                           Unit        rtx 5090    NVIDIA 5090    GeForce RTX 5090
FP64 Compute                        TFLOPs/s    1.950       1.950          1.951
FP32 Compute                        TFLOPs/s    117.85      117.86         117.88
FP16 Compute                        TFLOPs/s    122.91      122.94         122.94
INT64 Compute                       TIOPs/s     4.396       4.400          4.392
INT32 Compute                       TIOPs/s     61.76       61.76          61.77
INT16 Compute                       TIOPs/s     54.02       54.04          53.95
INT8 Compute                        TIOPs/s     41.80       41.72          41.76
Memory Bandwidth Coalesced Read     GB/s        1596.24     1603.93        1596.88
Memory Bandwidth Coalesced Write    GB/s        1687.49     1679.44        1680.23

1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

NAMD CUDA

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. This version of the NAMD test profile uses CUDA GPU acceleration. Learn more via the OpenBenchmarking.org test page.

NAMD CUDA 2.14 (days/ns, Fewer Is Better)

Test                                 rtx 5090    NVIDIA 5090    GeForce RTX 5090
ATPase Simulation - 327,506 Atoms    0.05810     0.05851        0.05943
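
NAMD results here are reported in days/ns, where fewer is better. Molecular dynamics throughput is often quoted the other way around, as ns/day, which is simply the reciprocal. The short sketch below converts the three reported values; the function name is an illustrative choice, not part of NAMD or the Phoronix Test Suite.

```python
# days/ns values reported for the ATPase Simulation (327,506 atoms)
days_per_ns = {
    "rtx 5090": 0.05810,
    "NVIDIA 5090": 0.05851,
    "GeForce RTX 5090": 0.05943,
}


def ns_per_day(days_per_ns_value):
    """Convert a days/ns figure into simulated nanoseconds per day."""
    return 1.0 / days_per_ns_value


for run, value in days_per_ns.items():
    print(f"{run}: {ns_per_day(value):.2f} ns/day")
```

All three runs work out to roughly 17 ns/day, and after the conversion a higher number is better.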

VkResample

VkResample is a Vulkan-based image upscaling library based on VkFFT. The sample test upscales a 4K image to 8K using Vulkan-based GPU acceleration. Learn more via the OpenBenchmarking.org test page.

VkResample 1.0 (ms, Fewer Is Better)

Test                               rtx 5090    NVIDIA 5090    GeForce RTX 5090
Upscale: 2x - Precision: Double    103.51      103.48         103.45
Upscale: 2x - Precision: Single    5.648       5.649          5.648

1. (CXX) g++ options: -O3

FluidX3D

FluidX3D 3.0 (MLUPs/s, More Is Better)

Test          rtx 5090    NVIDIA 5090    GeForce RTX 5090
FP32-FP32     9524        9527           9525
FP32-FP16C    19140       19121          19135
FP32-FP16S    18499       18496          18500

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak 1.1.2 (Kernel Latency: Fewer Is Better; all other tests: More Is Better)

OpenCL Test                              Unit      rtx 5090     NVIDIA 5090    GeForce RTX 5090
Kernel Latency                           us        5.15         5.16           5.15
Integer Compute                          GIOPS     62151.94     62119.51       62178.05
Integer 24-bit Compute                   GIOPS     61843.11     61866.86       61903.93
Global Memory Bandwidth                  GBPS      1562.97      1564.49        1564.61
Double-Precision Compute                 GFLOPS    1976.90      1977.26        1976.78
Single-Precision Compute                 GFLOPS    121415.53    121438.41      121419.57
Transfer Bandwidth enqueueReadBuffer     GBPS      13.83        13.89          13.78
Transfer Bandwidth enqueueWriteBuffer    GBPS      18.41        18.42          18.33

1. (CXX) g++ options: -O3

NCNN

NCNN 20241226 (ms, Fewer Is Better). Each cell shows the result followed by (MIN / MAX).

Target - Model                                       rtx 5090                  NVIDIA 5090               GeForce RTX 5090
Vulkan GPU - mobilenet                               42.44 (8.34 / 76.17)      39.74 (8.16 / 76.72)      38.91 (8.9 / 75.48)
Vulkan GPU-v2-v2 - mobilenet-v2                      10.21 (3.85 / 64.73)      11.25 (3.82 / 63.94)      8.37 (3.84 / 63.9)
Vulkan GPU-v3-v3 - mobilenet-v3                      4.68 (4.08 / 57.94)       4.28 (4.05 / 5.15)        4.29 (4.06 / 5.93)
Vulkan GPU - shufflenet-v2                           4.53 (3.88 / 57.59)       4.04 (3.9 / 5.77)         5.12 (3.91 / 67.5)
Vulkan GPU - mnasnet                                 9.80 (3.75 / 63.5)        4.77 (3.69 / 54.04)       6.55 (3.69 / 59.21)
Vulkan GPU - efficientnet-b0                         27.44 (6.34 / 109.98)     20.78 (6.36 / 109.98)     19.79 (6.3 / 110.05)
Vulkan GPU - blazeface                               2.67 (2.4 / 41.21)        3.05 (2.38 / 49.95)       2.67 (2.38 / 25.86)
Vulkan GPU - googlenet                               13.92 (7.49 / 98.37)      12.78 (7.62 / 95.87)      18.06 (7.48 / 98.98)
Vulkan GPU - vgg16                                   37.05 (22.53 / 46.22)     38.73 (20.99 / 46.14)     37.49 (23.61 / 45.51)
Vulkan GPU - resnet18                                10.93 (4.48 / 44.35)      9.41 (4.47 / 43.1)        11.01 (4.51 / 42.89)
Vulkan GPU - alexnet                                 8.74 (3.21 / 22.23)       8.55 (3.18 / 21.75)       8.99 (3.19 / 21.78)
Vulkan GPU - resnet50                                28.58 (10.02 / 89.48)     30.10 (10.13 / 90.77)     29.47 (10.07 / 90.59)
Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3    42.44 (8.34 / 76.17)      39.74 (8.16 / 76.72)      38.91 (8.9 / 75.48)
Vulkan GPU - yolov4-tiny                             39.34 (15.92 / 49.04)     38.91 (15.12 / 48.75)     36.86 (11.12 / 47.83)
Vulkan GPU - squeezenet_ssd                          22.02 (7.41 / 92.66)      26.39 (7.39 / 95.48)      25.57 (7.22 / 94.51)
Vulkan GPU - regnety_400m                            47.90 (21.96 / 421.33)    52.59 (21.91 / 425.58)    36.20 (21.98 / 458.49)
Vulkan GPU - vision_transformer                      62.76 (40.3 / 105.61)     62.86 (42.12 / 106.46)    62.83 (41.21 / 109.15)
Vulkan GPU - FastestDet                              11.01 (5.09 / 92.36)      8.68 (5.07 / 85.5)        9.29 (5.01 / 89.47)

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
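
The NCNN numbers vary far more between the three result identifiers than the synthetic tests do. Assuming the three identifiers are repeat runs of the same configuration (the page lists a single system), a quick way to quantify that is the run-to-run spread, (max - min) / min. The sketch below applies it to a few values from this result file; the function name and the choice of samples are illustrative.

```python
def spread_percent(values):
    """Run-to-run spread as a percentage of the smallest value."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0


# values in order: rtx 5090, NVIDIA 5090, GeForce RTX 5090
samples = {
    "NCNN regnety_400m (ms)": (47.90, 52.59, 36.20),
    "NCNN mnasnet (ms)": (9.80, 4.77, 6.55),
    "NCNN vision_transformer (ms)": (62.76, 62.86, 62.83),
    "vkpeak fp32-scalar (GFLOPS)": (63013.32, 63035.62, 63035.62),
}

for name, values in samples.items():
    print(f"{name}: {spread_percent(values):.2f}% spread")
```

A spread of tens of percent, as with regnety_400m or mnasnet, suggests differences between runs on those models should be read as noise rather than as a real performance gap.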

Blender

Blender 4.3 (Seconds, Fewer Is Better)

Blend File            Compute         rtx 5090    NVIDIA 5090    GeForce RTX 5090
BMW27                 NVIDIA CUDA     4.72        4.72           4.65
BMW27                 NVIDIA OptiX    2.92        2.97           2.98
Junkshop              NVIDIA CUDA     8.99        8.98           9.00
Classroom             NVIDIA CUDA     8.38        8.38           8.38
Fishy Cat             NVIDIA CUDA     8.92        8.92           8.92
Junkshop              NVIDIA OptiX    5.66        5.67           5.66
Barbershop            NVIDIA CUDA     35.14       35.28          35.10
Classroom             NVIDIA OptiX    6.16        6.14           6.17
Fishy Cat             NVIDIA OptiX    4.55        4.56           4.55
Barbershop            NVIDIA OptiX    24.33       24.60          24.50
Pabellon Barcelona    NVIDIA CUDA     17.35       17.34          17.44
Pabellon Barcelona    NVIDIA OptiX    7.00        7.06           7.04

IndigoBench

This is a test of Indigo Renderer's IndigoBench benchmark. Learn more via the OpenBenchmarking.org test page.

IndigoBench 4.4, Acceleration: OpenCL GPU (M samples/s, More Is Better)

Scene       rtx 5090    NVIDIA 5090    GeForce RTX 5090
Bedroom     42.76       42.85          42.92
Supercar    92.70       92.52          92.71

Chaos Group V-RAY

This is a test of Chaos Group's V-RAY benchmark. V-RAY is a commercial renderer that can integrate with various creator software products like SketchUp and 3ds Max. The V-RAY benchmark is standalone and supports CPU and NVIDIA CUDA/RTX based rendering. Learn more via the OpenBenchmarking.org test page.

Chaos Group V-RAY 6.0 (vpaths, More Is Better)

Mode               rtx 5090    NVIDIA 5090    GeForce RTX 5090
NVIDIA RTX GPU     11923       11923          11923
NVIDIA CUDA GPU    4851        4882           4882

93 Results Shown

vkpeak:
  fp32-scalar
  fp32-vec4
  fp16-scalar
  fp16-vec4
  fp64-scalar
  fp64-vec4
  int32-scalar
  int32-vec4
  int16-scalar
  int16-vec4
RealSR-NCNN:
  4x - No
  4x - Yes
Waifu2x-NCNN Vulkan
VkFFT:
  FFT + iFFT R2C / C2R
  FFT + iFFT C2C 1D batched in half precision
  FFT + iFFT C2C Bluestein in single precision
  FFT + iFFT C2C 1D batched in double precision
  FFT + iFFT C2C 1D batched in single precision
  FFT + iFFT C2C multidimensional in single precision
  FFT + iFFT C2C Bluestein benchmark in double precision
  FFT + iFFT C2C 1D batched in single precision, no reshuffling
Hashcat:
  MD5
  SHA1
  7-Zip
  SHA-512
  TrueCrypt RIPEMD160 + XTS
SHOC Scalable HeterOgeneous Computing:
  OpenCL - S3D
  OpenCL - Triad
  OpenCL - FFT SP
  OpenCL - MD5 Hash
  OpenCL - Reduction
  OpenCL - GEMM SGEMM_N
  OpenCL - Max SP Flops
  OpenCL - Bus Speed Download
  OpenCL - Bus Speed Readback
  OpenCL - Texture Read Bandwidth
ProjectPhysX OpenCL-Benchmark:
  FP64 Compute
  FP32 Compute
  FP16 Compute
  INT64 Compute
  INT32 Compute
  INT16 Compute
  INT8 Compute
  Memory Bandwidth Coalesced Read
  Memory Bandwidth Coalesced Write
NAMD CUDA
VkResample:
  2x - Double
  2x - Single
FluidX3D:
  FP32-FP32
  FP32-FP16C
  FP32-FP16S
clpeak:
  Kernel Latency
  Integer Compute
  Integer 24-bit Compute
  Global Memory Bandwidth
  Double-Precision Compute
  Single-Precision Compute
  Transfer Bandwidth enqueueReadBuffer
  Transfer Bandwidth enqueueWriteBuffer
NCNN:
  Vulkan GPU - mobilenet
  Vulkan GPU-v2-v2 - mobilenet-v2
  Vulkan GPU-v3-v3 - mobilenet-v3
  Vulkan GPU - shufflenet-v2
  Vulkan GPU - mnasnet
  Vulkan GPU - efficientnet-b0
  Vulkan GPU - blazeface
  Vulkan GPU - googlenet
  Vulkan GPU - vgg16
  Vulkan GPU - resnet18
  Vulkan GPU - alexnet
  Vulkan GPU - resnet50
  Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3
  Vulkan GPU - yolov4-tiny
  Vulkan GPU - squeezenet_ssd
  Vulkan GPU - regnety_400m
  Vulkan GPU - vision_transformer
  Vulkan GPU - FastestDet
Blender:
  BMW27 - NVIDIA CUDA
  BMW27 - NVIDIA OptiX
  Junkshop - NVIDIA CUDA
  Classroom - NVIDIA CUDA
  Fishy Cat - NVIDIA CUDA
  Junkshop - NVIDIA OptiX
  Barbershop - NVIDIA CUDA
  Classroom - NVIDIA OptiX
  Fishy Cat - NVIDIA OptiX
  Barbershop - NVIDIA OptiX
  Pabellon Barcelona - NVIDIA CUDA
  Pabellon Barcelona - NVIDIA OptiX
IndigoBench:
  OpenCL GPU - Bedroom
  OpenCL GPU - Supercar
Chaos Group V-RAY:
  NVIDIA RTX GPU
  NVIDIA CUDA GPU