nvidia rtx 5090 compute benchmarks

Intel Core Ultra 9 285K testing with an ASUS ROG MAXIMUS Z890 HERO (1203 BIOS) and ASUS NVIDIA GeForce RTX 5090 32GB on Ubuntu 24.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2501241-PTS-NVIDIART86

Run Management

Result Identifier | Date Run | Test Duration
rtx 5090 | January 24 | 1 Hour, 45 Minutes
NVIDIA 5090 | January 24 | 1 Hour, 42 Minutes
GeForce RTX 5090 | January 24 | 1 Hour, 41 Minutes
Average | | 1 Hour, 43 Minutes

Processor: Intel Core Ultra 9 285K @ 5.10GHz (24 Cores)
Motherboard: ASUS ROG MAXIMUS Z890 HERO (1203 BIOS)
Chipset: Intel Device ae7f
Memory: 2 x 16GB DDR5-6400MT/s Micron CP16G64C38U5B.M8D1
Disk: 1000GB Western Digital WDS100T1X0E-00AFY0 + 4001GB Western Digital WD_BLACK SN850X 4000GB
Graphics: ASUS NVIDIA GeForce RTX 5090 32GB
Audio: Intel Device 7f50
Monitor: ASUS VP28U
Network: Realtek Device 8126 + Intel I226-V + Intel Wi-Fi 7
OS: Ubuntu 24.10
Kernel: 6.11.0-13-generic (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server 1.21.1.13
Display Driver: NVIDIA 570.86.10
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.8.51 + OpenCL 3.0
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 3840x2160

System Logs
- nouveau.modeset=0 - Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: intel_pstate performance (EPP: default) - CPU Microcode: 0x114 - Thermald 2.5.8
- BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 98.02.2e.00.03
- GPU Compute Cores: 21760
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Result Overview (Phoronix Test Suite chart comparing the rtx 5090, NVIDIA 5090 and GeForce RTX 5090 runs; axis from 100% to 107%). Tests shown: NCNN, RealSR-NCNN, NAMD CUDA, Waifu2x-NCNN Vulkan, Blender, Chaos Group V-RAY, SHOC Scalable HeterOgeneous Computing, IndigoBench, VkFFT, clpeak, Hashcat, ProjectPhysX OpenCL-Benchmark, FluidX3D, VkResample, vkpeak.

nvidia rtx 5090 compute benchmarks

Test | rtx 5090 | NVIDIA 5090 | GeForce RTX 5090
ncnn: Vulkan GPU - mnasnet | 9.8 | 4.77 | 6.55
ncnn: Vulkan GPU - regnety_400m | 47.9 | 52.59 | 36.2
ncnn: Vulkan GPU - googlenet | 13.92 | 12.78 | 18.06
ncnn: Vulkan GPU - efficientnet-b0 | 27.44 | 20.78 | 19.79
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 | 10.21 | 11.25 | 8.37
ncnn: Vulkan GPU - FastestDet | 11.01 | 8.68 | 9.29
ncnn: Vulkan GPU - shufflenet-v2 | 4.53 | 4.04 | 5.12
ncnn: Vulkan GPU - squeezenet_ssd | 22.02 | 26.39 | 25.57
ncnn: Vulkan GPU - resnet18 | 10.93 | 9.41 | 11.01
ncnn: Vulkan GPU - blazeface | 2.67 | 3.05 | 2.67
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 | 4.68 | 4.28 | 4.29
ncnn: Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3 | 42.44 | 39.74 | 38.91
ncnn: Vulkan GPU - mobilenet | 42.44 | 39.74 | 38.91
ncnn: Vulkan GPU - yolov4-tiny | 39.34 | 38.91 | 36.86
ncnn: Vulkan GPU - resnet50 | 28.58 | 30.1 | 29.47
realsr-ncnn: 4x - No | 4.63 | 4.4 | 4.446
ncnn: Vulkan GPU - alexnet | 8.74 | 8.55 | 8.99
ncnn: Vulkan GPU - vgg16 | 37.05 | 38.73 | 37.49
namd-cuda: ATPase Simulation - 327,506 Atoms | 0.05810 | 0.05851 | 0.05943
blender: BMW27 - NVIDIA OptiX | 2.92 | 2.97 | 2.98
shoc: OpenCL - Bus Speed Readback | 28.6895 | 28.5856 | 28.1406
vkfft: FFT + iFFT C2C 1D batched in half precision | 302221 | 305671 | 300737
vkfft: FFT + iFFT C2C 1D batched in single precision | 237717 | 235884 | 239575
vkfft: FFT + iFFT C2C 1D batched in double precision | 63738 | 62773 | 63637
blender: BMW27 - NVIDIA CUDA | 4.72 | 4.72 | 4.65
waifu2x-ncnn: 2x - 3 - Yes | 2.299 | 2.267 | 2.278
blender: Barbershop - NVIDIA OptiX | 24.33 | 24.6 | 24.5
blender: Pabellon Barcelona - NVIDIA OptiX | 7 | 7.06 | 7.04
vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling | 243937 | 243913 | 241913
clpeak: Transfer Bandwidth enqueueReadBuffer | 13.83 | 13.89 | 13.78
v-ray: NVIDIA CUDA GPU | 4851 | 4882 | 4882
hashcat: MD5 | 106848250000 | 106216550000 | 106544000000
blender: Pabellon Barcelona - NVIDIA CUDA | 17.35 | 17.34 | 17.44
shoc: OpenCL - FFT SP | 4398.39 | 4400.85 | 4375.95
blender: Barbershop - NVIDIA CUDA | 35.14 | 35.28 | 35.1
clpeak: Transfer Bandwidth enqueueWriteBuffer | 18.41 | 18.42 | 18.33
blender: Classroom - NVIDIA OptiX | 6.16 | 6.14 | 6.17
opencl-benchmark: Memory Bandwidth Coalesced Read | 1596.24 | 1603.93 | 1596.88
opencl-benchmark: Memory Bandwidth Coalesced Write | 1687.49 | 1679.44 | 1680.23
indigobench: OpenCL GPU - Bedroom | 42.759 | 42.852 | 42.924
hashcat: 7-Zip | 3272300 | 3276600 | 3264300
hashcat: SHA1 | 68852500000 | 69072700000 | 69104300000
shoc: OpenCL - Triad | 27.8329 | 27.8098 | 27.7455
shoc: OpenCL - S3D | 1117.54 | 1120.51 | 1120.88
vkfft: FFT + iFFT C2C Bluestein in single precision | 36054 | 36026 | 35954
hashcat: TrueCrypt RIPEMD160 + XTS | 2776000 | 2770700 | 2777600
blender: Junkshop - NVIDIA CUDA | 8.99 | 8.98 | 9
shoc: OpenCL - GEMM SGEMM_N | 35937.2 | 36016.5 | 35961.3
blender: Fishy Cat - NVIDIA OptiX | 4.55 | 4.56 | 4.55
indigobench: OpenCL GPU - Supercar | 92.7 | 92.52 | 92.711
shoc: OpenCL - MD5 Hash | 142.407 | 142.24 | 142.532
clpeak: Kernel Latency | 5.15 | 5.16 | 5.15
opencl-benchmark: INT64 Compute | 4.396 | 4.4 | 4.392
blender: Junkshop - NVIDIA OptiX | 5.66 | 5.67 | 5.66
opencl-benchmark: INT8 Compute | 41.795 | 41.724 | 41.757
shoc: OpenCL - Texture Read Bandwidth | 2870.68 | 2872.53 | 2875.5
ncnn: Vulkan GPU - vision_transformer | 62.76 | 62.86 | 62.83
opencl-benchmark: INT16 Compute | 54.018 | 54.037 | 53.953
realsr-ncnn: 4x - Yes | 13.484 | 13.504 | 13.486
clpeak: Global Memory Bandwidth | 1562.97 | 1564.49 | 1564.61
fluidx3d: FP32-FP16C | 19140 | 19121 | 19135
clpeak: Integer 24-bit Compute | 61843.11 | 61866.86 | 61903.93
clpeak: Integer Compute | 62151.94 | 62119.51 | 62178.05
vkfft: FFT + iFFT C2C Bluestein benchmark in double precision | 9931 | 9940 | 9934
vkfft: FFT + iFFT R2C / C2R | 164933 | 164801 | 164873
shoc: OpenCL - Max SP Flops | 124615 | 124556 | 124646
vkfft: FFT + iFFT C2C multidimensional in single precision | 144624 | 144600 | 144695
hashcat: SHA-512 | 8900400000 | 8901200000 | 8895900000
opencl-benchmark: FP64 Compute | 1.95 | 1.95 | 1.951
vkresample: 2x - Double | 103.505 | 103.478 | 103.453
vkpeak: fp32-vec4 | 83296.84 | 83290.22 | 83257.59
vkpeak: int32-vec4 | 61885.37 | 61894.95 | 61914.51
vkpeak: int16-scalar | 40006.6 | 39989.7 | 39998.42
vkpeak: fp32-scalar | 63013.32 | 63035.62 | 63035.62
fluidx3d: FP32-FP32 | 9524 | 9527 | 9525
shoc: OpenCL - Reduction | 837.207 | 836.98 | 837.243
opencl-benchmark: FP32 Compute | 117.847 | 117.864 | 117.881
opencl-benchmark: FP16 Compute | 122.914 | 122.944 | 122.941
clpeak: Double-Precision Compute | 1976.9 | 1977.26 | 1976.78
vkpeak: fp16-vec4 | 72592.93 | 72578.62 | 72575.33
vkpeak: fp16-scalar | 62611.78 | 62611.17 | 62597.51
opencl-benchmark: INT32 Compute | 61.759 | 61.759 | 61.773
fluidx3d: FP32-FP16S | 18499 | 18496 | 18500
vkpeak: fp64-vec4 | 1965.7 | 1965.39 | 1965.32
clpeak: Single-Precision Compute | 121415.53 | 121438.41 | 121419.57
vkresample: 2x - Single | 5.648 | 5.649 | 5.648
vkpeak: int16-vec4 | 43806.13 | 43799.96 | 43803.6
vkpeak: fp64-scalar | 1967.37 | 1967.49 | 1967.43
shoc: OpenCL - Bus Speed Download | 28.7867 | 28.787 | 28.7881
vkpeak: int32-scalar | 62142.56 | 62142.07 | 62141.02
v-ray: NVIDIA RTX GPU; blender: Fishy Cat - NVIDIA CUDA; blender: Classroom - NVIDIA CUDA; waifu2x-ncnn: 2x - 3 - No (values run together in the source) | 119238.928.38 | 119238.928.38 | 119238.928.38
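The three runs agree closely on most tests but diverge widely on some NCNN models. As a rough illustration (a sketch, not part of the Phoronix Test Suite output), the spread of a test can be expressed as the gap between its slowest and fastest run relative to the fastest. The helper name `spread_percent` is hypothetical, and the three sample rows are copied from the results in this file.

```python
# Per-run results copied from this result file (ms or seconds, lower is better).
results = {
    "ncnn: Vulkan GPU - mnasnet": {
        "rtx 5090": 9.8, "NVIDIA 5090": 4.77, "GeForce RTX 5090": 6.55,
    },
    "ncnn: Vulkan GPU - vgg16": {
        "rtx 5090": 37.05, "NVIDIA 5090": 38.73, "GeForce RTX 5090": 37.49,
    },
    "blender: BMW27 - NVIDIA OptiX": {
        "rtx 5090": 2.92, "NVIDIA 5090": 2.97, "GeForce RTX 5090": 2.98,
    },
}


def spread_percent(runs):
    """Hypothetical helper: gap between worst and best run, as % of the best."""
    best, worst = min(runs.values()), max(runs.values())
    return (worst - best) / best * 100.0


for test, runs in results.items():
    print(f"{test}: {spread_percent(runs):.1f}% spread")
```

On these rows the mnasnet result varies by roughly a factor of two between runs, while the Blender result stays within a few percent.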

NCNN

NCNN 20241226, Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
  rtx 5090: 9.80 (MIN: 3.75 / MAX: 63.5)
  GeForce RTX 5090: 6.55 (MIN: 3.69 / MAX: 59.21)
  NVIDIA 5090: 4.77 (MIN: 3.69 / MAX: 54.04)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
  NVIDIA 5090: 52.59 (MIN: 21.91 / MAX: 425.58)
  rtx 5090: 47.90 (MIN: 21.96 / MAX: 421.33)
  GeForce RTX 5090: 36.20 (MIN: 21.98 / MAX: 458.49)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
  GeForce RTX 5090: 18.06 (MIN: 7.48 / MAX: 98.98)
  rtx 5090: 13.92 (MIN: 7.49 / MAX: 98.37)
  NVIDIA 5090: 12.78 (MIN: 7.62 / MAX: 95.87)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  rtx 5090: 27.44 (MIN: 6.34 / MAX: 109.98)
  NVIDIA 5090: 20.78 (MIN: 6.36 / MAX: 109.98)
  GeForce RTX 5090: 19.79 (MIN: 6.3 / MAX: 110.05)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  NVIDIA 5090: 11.25 (MIN: 3.82 / MAX: 63.94)
  rtx 5090: 10.21 (MIN: 3.85 / MAX: 64.73)
  GeForce RTX 5090: 8.37 (MIN: 3.84 / MAX: 63.9)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better)
  rtx 5090: 11.01 (MIN: 5.09 / MAX: 92.36)
  GeForce RTX 5090: 9.29 (MIN: 5.01 / MAX: 89.47)
  NVIDIA 5090: 8.68 (MIN: 5.07 / MAX: 85.5)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  GeForce RTX 5090: 5.12 (MIN: 3.91 / MAX: 67.5)
  rtx 5090: 4.53 (MIN: 3.88 / MAX: 57.59)
  NVIDIA 5090: 4.04 (MIN: 3.9 / MAX: 5.77)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  NVIDIA 5090: 26.39 (MIN: 7.39 / MAX: 95.48)
  GeForce RTX 5090: 25.57 (MIN: 7.22 / MAX: 94.51)
  rtx 5090: 22.02 (MIN: 7.41 / MAX: 92.66)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
  GeForce RTX 5090: 11.01 (MIN: 4.51 / MAX: 42.89)
  rtx 5090: 10.93 (MIN: 4.48 / MAX: 44.35)
  NVIDIA 5090: 9.41 (MIN: 4.47 / MAX: 43.1)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
  NVIDIA 5090: 3.05 (MIN: 2.38 / MAX: 49.95)
  GeForce RTX 5090: 2.67 (MIN: 2.38 / MAX: 25.86)
  rtx 5090: 2.67 (MIN: 2.4 / MAX: 41.21)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  rtx 5090: 4.68 (MIN: 4.08 / MAX: 57.94)
  GeForce RTX 5090: 4.29 (MIN: 4.06 / MAX: 5.93)
  NVIDIA 5090: 4.28 (MIN: 4.05 / MAX: 5.15)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3 (ms, Fewer Is Better)
  rtx 5090: 42.44 (MIN: 8.34 / MAX: 76.17)
  NVIDIA 5090: 39.74 (MIN: 8.16 / MAX: 76.72)
  GeForce RTX 5090: 38.91 (MIN: 8.9 / MAX: 75.48)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
  rtx 5090: 42.44 (MIN: 8.34 / MAX: 76.17)
  NVIDIA 5090: 39.74 (MIN: 8.16 / MAX: 76.72)
  GeForce RTX 5090: 38.91 (MIN: 8.9 / MAX: 75.48)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
  rtx 5090: 39.34 (MIN: 15.92 / MAX: 49.04)
  NVIDIA 5090: 38.91 (MIN: 15.12 / MAX: 48.75)
  GeForce RTX 5090: 36.86 (MIN: 11.12 / MAX: 47.83)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
  NVIDIA 5090: 30.10 (MIN: 10.13 / MAX: 90.77)
  GeForce RTX 5090: 29.47 (MIN: 10.07 / MAX: 90.59)
  rtx 5090: 28.58 (MIN: 10.02 / MAX: 89.48)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

RealSR-NCNN

RealSR-NCNN is an NCNN neural network implementation of the RealSR project and accelerated using the Vulkan API. RealSR is the Real-World Super Resolution via Kernel Estimation and Noise Injection. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image by a scale of 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.

RealSR-NCNN 20200818, Scale: 4x - TAA: No (Seconds, Fewer Is Better)
  rtx 5090: 4.630
  GeForce RTX 5090: 4.446
  NVIDIA 5090: 4.400

NCNN

NCNN 20241226, Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
  GeForce RTX 5090: 8.99 (MIN: 3.19 / MAX: 21.78)
  rtx 5090: 8.74 (MIN: 3.21 / MAX: 22.23)
  NVIDIA 5090: 8.55 (MIN: 3.18 / MAX: 21.75)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20241226, Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
  NVIDIA 5090: 38.73 (MIN: 20.99 / MAX: 46.14)
  GeForce RTX 5090: 37.49 (MIN: 23.61 / MAX: 45.51)
  rtx 5090: 37.05 (MIN: 22.53 / MAX: 46.22)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NAMD CUDA

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. This version of the NAMD test profile uses CUDA GPU acceleration. Learn more via the OpenBenchmarking.org test page.

NAMD CUDA 2.14, ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
  GeForce RTX 5090: 0.05943
  NVIDIA 5090: 0.05851
  rtx 5090: 0.05810
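NAMD throughput is often quoted the other way around, in nanoseconds of simulated time per day. The conversion is a simple reciprocal; the sketch below applies it to the three values above. The function name `ns_per_day` is only illustrative.

```python
# NAMD CUDA 2.14 results in days/ns, copied from the graph above.
days_per_ns = {
    "GeForce RTX 5090": 0.05943,
    "NVIDIA 5090": 0.05851,
    "rtx 5090": 0.05810,
}


def ns_per_day(days_per_nanosecond):
    """Invert days/ns into ns/day (higher is better)."""
    return 1.0 / days_per_nanosecond


for run, value in days_per_ns.items():
    print(f"{run}: {ns_per_day(value):.2f} ns/day")
```

All three runs land at roughly 17 ns/day on this ATPase workload.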

Blender

Blender 4.3, Blend File: BMW27 - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  GeForce RTX 5090: 2.98
  NVIDIA 5090: 2.97
  rtx 5090: 2.92

SHOC Scalable HeterOgeneous Computing

This is the CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better)
  GeForce RTX 5090: 28.14
  NVIDIA 5090: 28.59
  rtx 5090: 28.69
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

VkFFT

VkFFT is a Fast Fourier Transform (FFT) library that is GPU accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

VkFFT 1.3.4, Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
  GeForce RTX 5090: 300737
  rtx 5090: 302221
  NVIDIA 5090: 305671
1. (CXX) g++ options: -O3

VkFFT 1.3.4, Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better)
  NVIDIA 5090: 235884
  rtx 5090: 237717
  GeForce RTX 5090: 239575
1. (CXX) g++ options: -O3

VkFFT 1.3.4, Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, More Is Better)
  NVIDIA 5090: 62773
  GeForce RTX 5090: 63637
  rtx 5090: 63738
1. (CXX) g++ options: -O3

Blender

Blender 4.3, Blend File: BMW27 - Compute: NVIDIA CUDA (Seconds, Fewer Is Better)
  NVIDIA 5090: 4.72
  rtx 5090: 4.72
  GeForce RTX 5090: 4.65

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project and accelerated using the Vulkan API. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Waifu2x-NCNN Vulkan 20200818, Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better)
  rtx 5090: 2.299
  GeForce RTX 5090: 2.278
  NVIDIA 5090: 2.267

Blender

Blender 4.3, Blend File: Barbershop - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  NVIDIA 5090: 24.60
  GeForce RTX 5090: 24.50
  rtx 5090: 24.33

Blender 4.3, Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  NVIDIA 5090: 7.06
  GeForce RTX 5090: 7.04
  rtx 5090: 7.00

VkFFT

VkFFT 1.3.4, Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
  GeForce RTX 5090: 241913
  NVIDIA 5090: 243913
  rtx 5090: 243937
1. (CXX) g++ options: -O3

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak 1.1.2, OpenCL Test: Transfer Bandwidth enqueueReadBuffer (GBPS, More Is Better)
  GeForce RTX 5090: 13.78
  rtx 5090: 13.83
  NVIDIA 5090: 13.89
1. (CXX) g++ options: -O3

Chaos Group V-RAY

This is a test of Chaos Group's V-RAY benchmark. V-RAY is a commercial renderer that can integrate with various creator software products like SketchUp and 3ds Max. The V-RAY benchmark is standalone and supports CPU and NVIDIA CUDA/RTX based rendering. Learn more via the OpenBenchmarking.org test page.

Chaos Group V-RAY 6.0, Mode: NVIDIA CUDA GPU (vpaths, More Is Better)
  rtx 5090: 4851
  NVIDIA 5090: 4882
  GeForce RTX 5090: 4882

Hashcat

Hashcat is an open-source, advanced password recovery tool supporting GPU acceleration with OpenCL, NVIDIA CUDA, and Radeon ROCm. Learn more via the OpenBenchmarking.org test page.

Hashcat 6.2.4, Benchmark: MD5 (H/s, More Is Better)
  NVIDIA 5090: 106216550000 (SE +/- 101883450000.00, N = 2)
  GeForce RTX 5090: 106544000000 (SE +/- 102256000000.00, N = 2)
  rtx 5090: 106848250000 (SE +/- 102551750000.00, N = 2)
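Raw H/s figures of this size are hard to compare by eye. The following sketch, using only the three MD5 values above, converts them to GH/s and reports the gap between the slowest and fastest run. The helper name `to_gigahashes` is an illustrative choice, not a Hashcat or Phoronix Test Suite function.

```python
# MD5 results in H/s, copied from the Hashcat 6.2.4 graph above.
md5_hashes_per_second = {
    "NVIDIA 5090": 106216550000,
    "GeForce RTX 5090": 106544000000,
    "rtx 5090": 106848250000,
}


def to_gigahashes(hashes_per_second):
    """Convert H/s to GH/s (1 GH/s = 1e9 H/s)."""
    return hashes_per_second / 1e9


for run, value in md5_hashes_per_second.items():
    print(f"{run}: {to_gigahashes(value):.2f} GH/s")

# Relative gap between the slowest and fastest run, in percent.
slowest = min(md5_hashes_per_second.values())
fastest = max(md5_hashes_per_second.values())
gap_percent = (fastest - slowest) / slowest * 100.0
print(f"run-to-run gap: {gap_percent:.2f}%")
```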

Blender

Blender 4.3, Blend File: Pabellon Barcelona - Compute: NVIDIA CUDA (Seconds, Fewer Is Better)
  GeForce RTX 5090: 17.44
  rtx 5090: 17.35
  NVIDIA 5090: 17.34

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better)
  GeForce RTX 5090: 4375.95
  rtx 5090: 4398.39
  NVIDIA 5090: 4400.85
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

Blender

Blender 4.3, Blend File: Barbershop - Compute: NVIDIA CUDA (Seconds, Fewer Is Better)
  NVIDIA 5090: 35.28
  rtx 5090: 35.14
  GeForce RTX 5090: 35.10

clpeak

clpeak 1.1.2, OpenCL Test: Transfer Bandwidth enqueueWriteBuffer (GBPS, More Is Better)
  GeForce RTX 5090: 18.33
  rtx 5090: 18.41
  NVIDIA 5090: 18.42
1. (CXX) g++ options: -O3

Blender

Blender 4.3, Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  GeForce RTX 5090: 6.17
  rtx 5090: 6.16
  NVIDIA 5090: 6.14

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.6, Operation: Memory Bandwidth Coalesced Read (GB/s, More Is Better)
  rtx 5090: 1596.24
  GeForce RTX 5090: 1596.88
  NVIDIA 5090: 1603.93
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark 1.6, Operation: Memory Bandwidth Coalesced Write (GB/s, More Is Better)
  NVIDIA 5090: 1679.44
  GeForce RTX 5090: 1680.23
  rtx 5090: 1687.49
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

IndigoBench

This is a test of Indigo Renderer's IndigoBench benchmark. Learn more via the OpenBenchmarking.org test page.

IndigoBench 4.4, Acceleration: OpenCL GPU - Scene: Bedroom (M samples/s, More Is Better)
  rtx 5090: 42.76
  NVIDIA 5090: 42.85
  GeForce RTX 5090: 42.92

Hashcat

Hashcat 6.2.4, Benchmark: 7-Zip (H/s, More Is Better)
  GeForce RTX 5090: 3264300
  rtx 5090: 3272300
  NVIDIA 5090: 3276600

Hashcat 6.2.4, Benchmark: SHA1 (H/s, More Is Better)
  rtx 5090: 68852500000
  NVIDIA 5090: 69072700000
  GeForce RTX 5090: 69104300000

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Triad (GB/s, More Is Better)
  GeForce RTX 5090: 27.75
  NVIDIA 5090: 27.81
  rtx 5090: 27.83
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: S3D (GFLOPS, More Is Better)
  rtx 5090: 1117.54
  NVIDIA 5090: 1120.51
  GeForce RTX 5090: 1120.88
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

VkFFT

VkFFT 1.3.4, Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better)
  GeForce RTX 5090: 35954
  NVIDIA 5090: 36026
  rtx 5090: 36054
1. (CXX) g++ options: -O3

Hashcat

Hashcat 6.2.4, Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better)
  NVIDIA 5090: 2770700
  rtx 5090: 2776000
  GeForce RTX 5090: 2777600

Blender

Blender 4.3, Blend File: Junkshop - Compute: NVIDIA CUDA (Seconds, Fewer Is Better)
  GeForce RTX 5090: 9.00
  rtx 5090: 8.99
  NVIDIA 5090: 8.98

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS, More Is Better)
  rtx 5090: 35937.2
  GeForce RTX 5090: 35961.3
  NVIDIA 5090: 36016.5
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

Blender

Blender 4.3, Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  NVIDIA 5090: 4.56
  GeForce RTX 5090: 4.55
  rtx 5090: 4.55

IndigoBench

IndigoBench 4.4, Acceleration: OpenCL GPU - Scene: Supercar (M samples/s, More Is Better)
  NVIDIA 5090: 92.52
  rtx 5090: 92.70
  GeForce RTX 5090: 92.71

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: MD5 Hash (GHash/s, More Is Better)
  NVIDIA 5090: 142.24
  rtx 5090: 142.41
  GeForce RTX 5090: 142.53
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

clpeak

clpeak 1.1.2, OpenCL Test: Kernel Latency (us, Fewer Is Better)
  NVIDIA 5090: 5.16
  GeForce RTX 5090: 5.15
  rtx 5090: 5.15
1. (CXX) g++ options: -O3

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.6, Operation: INT64 Compute (TIOPs/s, More Is Better)
  GeForce RTX 5090: 4.392
  rtx 5090: 4.396
  NVIDIA 5090: 4.400
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

Blender

Blender 4.3, Blend File: Junkshop - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  NVIDIA 5090: 5.67
  GeForce RTX 5090: 5.66
  rtx 5090: 5.66

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.6, Operation: INT8 Compute (TIOPs/s, More Is Better)
  NVIDIA 5090: 41.72
  GeForce RTX 5090: 41.76
  rtx 5090: 41.80
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
  rtx 5090: 2870.68
  NVIDIA 5090: 2872.53
  GeForce RTX 5090: 2875.50
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

NCNN

NCNN 20241226, Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better)
  NVIDIA 5090: 62.86 (MIN: 42.12 / MAX: 106.46)
  GeForce RTX 5090: 62.83 (MIN: 41.21 / MAX: 109.15)
  rtx 5090: 62.76 (MIN: 40.3 / MAX: 105.61)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.6, Operation: INT16 Compute (TIOPs/s, More Is Better)
  GeForce RTX 5090: 53.95
  rtx 5090: 54.02
  NVIDIA 5090: 54.04
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

RealSR-NCNN

RealSR-NCNN 20200818, Scale: 4x - TAA: Yes (Seconds, Fewer Is Better)
  NVIDIA 5090: 13.50
  GeForce RTX 5090: 13.49
  rtx 5090: 13.48

clpeak

clpeak 1.1.2, OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
  rtx 5090: 1562.97
  NVIDIA 5090: 1564.49
  GeForce RTX 5090: 1564.61
1. (CXX) g++ options: -O3

FluidX3D

FluidX3D 3.0, Test: FP32-FP16C (MLUPs/s, More Is Better)
  NVIDIA 5090: 19121
  GeForce RTX 5090: 19135
  rtx 5090: 19140

clpeak

clpeak 1.1.2, OpenCL Test: Integer 24-bit Compute (GIOPS, More Is Better)
  rtx 5090: 61843.11
  NVIDIA 5090: 61866.86
  GeForce RTX 5090: 61903.93
1. (CXX) g++ options: -O3

clpeak 1.1.2, OpenCL Test: Integer Compute (GIOPS, More Is Better)
  NVIDIA 5090: 62119.51
  rtx 5090: 62151.94
  GeForce RTX 5090: 62178.05
1. (CXX) g++ options: -O3

VkFFT

VkFFT 1.3.4, Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, More Is Better)
  rtx 5090: 9931
  GeForce RTX 5090: 9934
  NVIDIA 5090: 9940
1. (CXX) g++ options: -O3

VkFFT 1.3.4, Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better)
  NVIDIA 5090: 164801
  GeForce RTX 5090: 164873
  rtx 5090: 164933
1. (CXX) g++ options: -O3

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, More Is Better)
  NVIDIA 5090: 124556
  rtx 5090: 124615
  GeForce RTX 5090: 124646
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

VkFFT

VkFFT 1.3.4, Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
  NVIDIA 5090: 144600
  rtx 5090: 144624
  GeForce RTX 5090: 144695
1. (CXX) g++ options: -O3

Hashcat

Hashcat 6.2.4, Benchmark: SHA-512 (H/s, More Is Better)
  GeForce RTX 5090: 8895900000
  rtx 5090: 8900400000
  NVIDIA 5090: 8901200000

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.6, Operation: FP64 Compute (TFLOPs/s, More Is Better)
  rtx 5090: 1.950
  NVIDIA 5090: 1.950
  GeForce RTX 5090: 1.951
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

VkResample

VkResample is a Vulkan-based image upscaling library based on VkFFT. The sample input file is upscaling a 4K image to 8K using Vulkan-based GPU acceleration. Learn more via the OpenBenchmarking.org test page.

VkResample 1.0 - Upscale: 2x - Precision: Double
ms, Fewer Is Better
rtx 5090: 103.51
NVIDIA 5090: 103.48
GeForce RTX 5090: 103.45
1. (CXX) g++ options: -O3

vkpeak

vkpeak 20240505 - fp32-vec4
GFLOPS, More Is Better
GeForce RTX 5090: 83257.59
NVIDIA 5090: 83290.22
rtx 5090: 83296.84

vkpeak 20240505 - int32-vec4
GIOPS, More Is Better
rtx 5090: 61885.37
NVIDIA 5090: 61894.95
GeForce RTX 5090: 61914.51

vkpeak 20240505 - int16-scalar
GIOPS, More Is Better
NVIDIA 5090: 39989.70
GeForce RTX 5090: 39998.42
rtx 5090: 40006.60

vkpeak 20240505 - fp32-scalar
GFLOPS, More Is Better
rtx 5090: 63013.32
NVIDIA 5090: 63035.62
GeForce RTX 5090: 63035.62

FluidX3D

FluidX3D 3.0 - Test: FP32-FP32
MLUPs/s, More Is Better
rtx 5090: 9524
GeForce RTX 5090: 9525
NVIDIA 5090: 9527

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Reduction
GB/s, More Is Better
NVIDIA 5090: 836.98
rtx 5090: 837.21
GeForce RTX 5090: 837.24
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.6 - Operation: FP32 Compute
TFLOPs/s, More Is Better
rtx 5090: 117.85
NVIDIA 5090: 117.86
GeForce RTX 5090: 117.88
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark 1.6 - Operation: FP16 Compute
TFLOPs/s, More Is Better
rtx 5090: 122.91
GeForce RTX 5090: 122.94
NVIDIA 5090: 122.94
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
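
To put the ProjectPhysX OpenCL-Benchmark compute figures in relation to one another, the short Python sketch below divides the measured throughputs. It uses the "rtx 5090" run values from this page (FP64 1.950, FP32 117.85, FP16 122.91 TFLOPs/s); the variable names are illustrative assumptions, and the ratios describe only these measurements, not a hardware specification.

```python
# Measured throughput in TFLOPs/s from the "rtx 5090" run
# (ProjectPhysX OpenCL-Benchmark 1.6 results on this page).
fp64_tflops = 1.950
fp32_tflops = 117.85
fp16_tflops = 122.91

# Ratios between precisions, derived purely from the measured values.
fp32_to_fp64 = fp32_tflops / fp64_tflops
fp16_to_fp32 = fp16_tflops / fp32_tflops

print(f"FP32 throughput is about {fp32_to_fp64:.1f}x the FP64 throughput")
print(f"FP16 throughput is about {fp16_to_fp32:.2f}x the FP32 throughput")
```

In this test, FP16 is only marginally faster than FP32, while FP64 is roughly sixty times slower than FP32.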

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak 1.1.2 - OpenCL Test: Double-Precision Compute
GFLOPS, More Is Better
GeForce RTX 5090: 1976.78
rtx 5090: 1976.90
NVIDIA 5090: 1977.26
1. (CXX) g++ options: -O3

vkpeak

vkpeak 20240505 - fp16-vec4
GFLOPS, More Is Better
GeForce RTX 5090: 72575.33
NVIDIA 5090: 72578.62
rtx 5090: 72592.93

vkpeak 20240505 - fp16-scalar
GFLOPS, More Is Better
GeForce RTX 5090: 62597.51
NVIDIA 5090: 62611.17
rtx 5090: 62611.78

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.6 - Operation: INT32 Compute
TIOPs/s, More Is Better
rtx 5090: 61.76
NVIDIA 5090: 61.76
GeForce RTX 5090: 61.77
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

FluidX3D

FluidX3D 3.0 - Test: FP32-FP16S
MLUPs/s, More Is Better
NVIDIA 5090: 18496
rtx 5090: 18499
GeForce RTX 5090: 18500

vkpeak

vkpeak 20240505 - fp64-vec4
GFLOPS, More Is Better
GeForce RTX 5090: 1965.32
NVIDIA 5090: 1965.39
rtx 5090: 1965.70

clpeak

clpeak 1.1.2 - OpenCL Test: Single-Precision Compute
GFLOPS, More Is Better
rtx 5090: 121415.53
GeForce RTX 5090: 121419.57
NVIDIA 5090: 121438.41
1. (CXX) g++ options: -O3

VkResample

VkResample 1.0 - Upscale: 2x - Precision: Single
ms, Fewer Is Better
NVIDIA 5090: 5.649
GeForce RTX 5090: 5.648
rtx 5090: 5.648
1. (CXX) g++ options: -O3

vkpeak

vkpeak 20240505 - int16-vec4
GIOPS, More Is Better
NVIDIA 5090: 43799.96
GeForce RTX 5090: 43803.60
rtx 5090: 43806.13

vkpeak 20240505 - fp64-scalar
GFLOPS, More Is Better
rtx 5090: 1967.37
GeForce RTX 5090: 1967.43
NVIDIA 5090: 1967.49

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL - Benchmark: Bus Speed Download
GB/s, More Is Better
rtx 5090: 28.79
NVIDIA 5090: 28.79
GeForce RTX 5090: 28.79
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

vkpeak

vkpeak 20240505 - int32-scalar
GIOPS, More Is Better
GeForce RTX 5090: 62141.02
NVIDIA 5090: 62142.07
rtx 5090: 62142.56

Chaos Group V-RAY

This is a test of Chaos Group's V-RAY benchmark. V-RAY is a commercial renderer that can integrate with various creator software products like SketchUp and 3ds Max. The V-RAY benchmark is standalone and supports CPU and NVIDIA CUDA/RTX based rendering. Learn more via the OpenBenchmarking.org test page.

Chaos Group V-RAY 6.0 - Mode: NVIDIA RTX GPU
vpaths, More Is Better
rtx 5090: 11923
NVIDIA 5090: 11923
GeForce RTX 5090: 11923

Blender

Blender 4.3 - Blend File: Fishy Cat - Compute: NVIDIA CUDA
Seconds, Fewer Is Better
GeForce RTX 5090: 8.92
NVIDIA 5090: 8.92
rtx 5090: 8.92

Blender 4.3 - Blend File: Classroom - Compute: NVIDIA CUDA
Seconds, Fewer Is Better
GeForce RTX 5090: 8.38
NVIDIA 5090: 8.38
rtx 5090: 8.38

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project and accelerated using the Vulkan API. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Scale: 2x - Denoise: 3 - TAA: No

rtx 5090: The test run did not produce a result.

NVIDIA 5090: The test run did not produce a result.

GeForce RTX 5090: The test run did not produce a result.

93 Results Shown

NCNN:
  Vulkan GPU - mnasnet
  Vulkan GPU - regnety_400m
  Vulkan GPU - googlenet
  Vulkan GPU - efficientnet-b0
  Vulkan GPU - mobilenet-v2
  Vulkan GPU - FastestDet
  Vulkan GPU - shufflenet-v2
  Vulkan GPU - squeezenet_ssd
  Vulkan GPU - resnet18
  Vulkan GPU - blazeface
  Vulkan GPU - mobilenet-v3
  Vulkan GPU - mobilenetv2-yolov3
  Vulkan GPU - mobilenet
  Vulkan GPU - yolov4-tiny
  Vulkan GPU - resnet50
RealSR-NCNN
NCNN:
  Vulkan GPU - alexnet
  Vulkan GPU - vgg16
NAMD CUDA
Blender
SHOC Scalable HeterOgeneous Computing
VkFFT:
  FFT + iFFT C2C 1D batched in half precision
  FFT + iFFT C2C 1D batched in single precision
  FFT + iFFT C2C 1D batched in double precision
Blender
Waifu2x-NCNN Vulkan
Blender:
  Barbershop - NVIDIA OptiX
  Pabellon Barcelona - NVIDIA OptiX
VkFFT
clpeak
Chaos Group V-RAY
Hashcat
Blender
SHOC Scalable HeterOgeneous Computing
Blender
clpeak
Blender
ProjectPhysX OpenCL-Benchmark:
  Memory Bandwidth Coalesced Read
  Memory Bandwidth Coalesced Write
IndigoBench
Hashcat:
  7-Zip
  SHA1
SHOC Scalable HeterOgeneous Computing:
  OpenCL - Triad
  OpenCL - S3D
VkFFT
Hashcat
Blender
SHOC Scalable HeterOgeneous Computing
Blender
IndigoBench
SHOC Scalable HeterOgeneous Computing
clpeak
ProjectPhysX OpenCL-Benchmark
Blender
ProjectPhysX OpenCL-Benchmark
SHOC Scalable HeterOgeneous Computing
NCNN
ProjectPhysX OpenCL-Benchmark
RealSR-NCNN
clpeak
FluidX3D
clpeak:
  Integer 24-bit Compute
  Integer Compute
VkFFT:
  FFT + iFFT C2C Bluestein benchmark in double precision
  FFT + iFFT R2C / C2R
SHOC Scalable HeterOgeneous Computing
VkFFT
Hashcat
ProjectPhysX OpenCL-Benchmark
VkResample
vkpeak:
  fp32-vec4
  int32-vec4
  int16-scalar
  fp32-scalar
FluidX3D
SHOC Scalable HeterOgeneous Computing
ProjectPhysX OpenCL-Benchmark:
  FP32 Compute
  FP16 Compute
clpeak
vkpeak:
  fp16-vec4
  fp16-scalar
ProjectPhysX OpenCL-Benchmark
FluidX3D
vkpeak
clpeak
VkResample
vkpeak:
  int16-vec4
  fp64-scalar
SHOC Scalable HeterOgeneous Computing
vkpeak
Chaos Group V-RAY
Blender:
  Fishy Cat - NVIDIA CUDA
  Classroom - NVIDIA CUDA