NVIDIA RTX 5080 / RTX 5090 Compute Benchmarks

Benchmarks for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2501297-PTS-NVIDIART00

Run Management

RTX 5090: run date January 24, test duration 1 Hour
RTX 5080: run date January 28, test duration 2 Hours, 17 Minutes
NVIDIA RTX 5080: run date January 28, test duration 2 Hours, 15 Minutes
Average test duration: 1 Hour, 51 Minutes


RTX 5090:
  Processor: Intel Core Ultra 9 285K @ 5.10GHz (24 Cores)
  Motherboard: ASUS ROG MAXIMUS Z890 HERO (1203 BIOS)
  Chipset: Intel Device ae7f
  Memory: 2 x 16GB DDR5-6400MT/s Micron CP16G64C38U5B.M8D1
  Disk: 1000GB Western Digital WDS100T1X0E-00AFY0 + 4001GB Western Digital WD_BLACK SN850X 4000GB
  Graphics: ASUS NVIDIA GeForce RTX 5090 32GB
  Audio: Intel Device 7f50
  Monitor: ASUS VP28U
  Network: Realtek Device 8126 + Intel I226-V + Intel Wi-Fi 7
  OS: Ubuntu 24.10
  Kernel: 6.11.0-13-generic (x86_64)
  Desktop: GNOME Shell 47.0
  Display Server: X Server 1.21.1.13
  Display Driver: NVIDIA 570.86.10
  OpenGL: 4.6.0
  OpenCL: OpenCL 3.0 CUDA 12.8.51 + OpenCL 3.0
  Compiler: GCC 14.2.0
  File-System: ext4
  Screen Resolution: 3840x2160
Changed values listed for the RTX 5080 and NVIDIA RTX 5080 runs:
  Graphics: ASUS NVIDIA GeForce RTX 5080 16GB
  Kernel: 6.11.0-14-generic (x86_64)
  OpenCL: OpenCL 3.0 CUDA 12.8.51
  Compiler: GCC 14.2.0 + CUDA 12.8

Kernel Details
  RTX 5090: nouveau.modeset=0 - Transparent Huge Pages: madvise
  RTX 5080: Transparent Huge Pages: madvise
  NVIDIA RTX 5080: Transparent Huge Pages: madvise

Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  Scaling Governor: intel_pstate performance (EPP: default) - CPU Microcode: 0x114 - Thermald 2.5.8

Graphics Details
  RTX 5090: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 98.02.2e.00.03
  RTX 5080: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 98.03.3b.00.01
  NVIDIA RTX 5080: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 98.03.3b.00.01

OpenCL Details
  RTX 5090: GPU Compute Cores: 21760
  RTX 5080: GPU Compute Cores: 10752
  NVIDIA RTX 5080: GPU Compute Cores: 10752

Security Details
  gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Result Overview (Phoronix Test Suite): relative performance chart for RTX 5090, RTX 5080, and NVIDIA RTX 5080 on a 100% to 211% scale, covering NCNN, VkResample, ProjectPhysX OpenCL-Benchmark, FluidX3D, Hashcat, Chaos Group V-RAY, Blender, clpeak, IndigoBench, SHOC Scalable HeterOgeneous Computing, Waifu2x-NCNN Vulkan, and RealSR-NCNN.

Result table: one row per test and one column per run (RTX 5090, RTX 5080, NVIDIA RTX 5080). The per-test values are given in the individual result sections that follow.

RealSR-NCNN

RealSR-NCNN is an NCNN implementation of the RealSR project, accelerated using the Vulkan API. RealSR stands for Real-World Super-Resolution via Kernel Estimation and Noise Injection. NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. This test profile times how long it takes to increase the resolution of a sample image by a scale of 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.

RealSR-NCNN 20200818 (Seconds, Fewer Is Better)
  Scale: 4x - TAA: No: RTX 5090 = 4.630, RTX 5080 = 5.221, NVIDIA RTX 5080 = 5.153
  Scale: 4x - TAA: Yes: RTX 5090 = 13.48, RTX 5080 = 18.68, NVIDIA RTX 5080 = 18.64

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN implementation of the Waifu2x converter project, accelerated using the Vulkan API. NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Waifu2x-NCNN Vulkan 20200818 (Seconds, Fewer Is Better)
  Scale: 2x - Denoise: 3 - TAA: Yes: RTX 5090 = 2.299, RTX 5080 = 2.930, NVIDIA RTX 5080 = 2.733

Hashcat

Hashcat is an open-source, advanced password recovery tool supporting GPU acceleration with OpenCL, NVIDIA CUDA, and Radeon ROCm. Learn more via the OpenBenchmarking.org test page.

Hashcat 6.2.4 (H/s, More Is Better)
  Benchmark: MD5: RTX 5090 = 106848250000, RTX 5080 = 99435000000, NVIDIA RTX 5080 = 99815800000 (SE +/- 102551750000.00, N = 2)
  Benchmark: SHA1: RTX 5090 = 68852500000, RTX 5080 = 32522500000, NVIDIA RTX 5080 = 32633100000
  Benchmark: 7-Zip: RTX 5090 = 3272300, RTX 5080 = 1657000, NVIDIA RTX 5080 = 1664500
  Benchmark: SHA-512: RTX 5090 = 8900400000, RTX 5080 = 4243000000, NVIDIA RTX 5080 = 4265500000
  Benchmark: TrueCrypt RIPEMD160 + XTS: RTX 5090 = 2776000, RTX 5080 = 1294700, NVIDIA RTX 5080 = 1294200
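
The Hashcat figures can be reduced to per-benchmark speedups and a single geometric mean, similar in spirit to the normalized overview chart. The following is a minimal sketch, not the Phoronix Test Suite's own calculation: the helper names `speedup` and `geometric_mean` are hypothetical, and only the RTX 5090 and RTX 5080 runs are compared. Note that the MD5 result carries a very large reported standard error (N = 2), so its ratio should be read with caution.

```python
from math import prod

# Hashcat 6.2.4 results in H/s, copied from this result file.
# Each tuple is (RTX 5090, RTX 5080).
hashcat_results = {
    "MD5": (106848250000, 99435000000),
    "SHA1": (68852500000, 32522500000),
    "7-Zip": (3272300, 1657000),
    "SHA-512": (8900400000, 4243000000),
    "TrueCrypt RIPEMD160 + XTS": (2776000, 1294700),
}


def speedup(faster: float, slower: float) -> float:
    """Ratio of two 'more is better' results (hypothetical helper)."""
    return faster / slower


def geometric_mean(values) -> float:
    """Geometric mean of positive ratios (hypothetical helper)."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


ratios = {name: speedup(a, b) for name, (a, b) in hashcat_results.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
print(f"Geometric mean: {geometric_mean(ratios.values()):.2f}x")
```

A geometric mean is used rather than an arithmetic mean so that no single benchmark with large absolute numbers dominates the summary.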

SHOC Scalable HeterOgeneous Computing

This test profile covers the CUDA and OpenCL versions of Vetter's Scalable HeterOgeneous Computing (SHOC) benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL (More Is Better)
  Benchmark: S3D (GFLOPS): RTX 5090 = 1117.54, RTX 5080 = 591.44, NVIDIA RTX 5080 = 592.12
  Benchmark: Triad (GB/s): RTX 5090 = 27.83, RTX 5080 = 27.85, NVIDIA RTX 5080 = 27.54
  Benchmark: FFT SP (GFLOPS): RTX 5090 = 4398.39, RTX 5080 = 2381.80, NVIDIA RTX 5080 = 2381.92
  Benchmark: MD5 Hash (GHash/s): RTX 5090 = 142.41, RTX 5080 = 69.56, NVIDIA RTX 5080 = 69.39
  Benchmark: Reduction (GB/s): RTX 5090 = 837.21, RTX 5080 = 858.04, NVIDIA RTX 5080 = 857.86
  Benchmark: GEMM SGEMM_N (GFLOPS): RTX 5090 = 35937.2, RTX 5080 = 18948.1, NVIDIA RTX 5080 = 19013.1
  Benchmark: Bus Speed Download (GB/s): RTX 5090 = 28.79, RTX 5080 = 28.79, NVIDIA RTX 5080 = 28.79
  Benchmark: Bus Speed Readback (GB/s): RTX 5090 = 28.69, RTX 5080 = 28.38, NVIDIA RTX 5080 = 28.56
  Benchmark: Texture Read Bandwidth (GB/s): RTX 5090 = 2870.68, RTX 5080 = 2728.64, NVIDIA RTX 5080 = 2734.06
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.6 (More Is Better)
  Operation: FP64 Compute (TFLOPs/s): RTX 5090 = 1.95, RTX 5080 = 0.95, NVIDIA RTX 5080 = 0.95
  Operation: FP32 Compute (TFLOPs/s): RTX 5090 = 117.85, RTX 5080 = 57.85, NVIDIA RTX 5080 = 58.12
  Operation: FP16 Compute (TFLOPs/s): RTX 5090 = 122.91, RTX 5080 = 59.47, NVIDIA RTX 5080 = 59.47
  Operation: INT64 Compute (TIOPs/s): RTX 5090 = 4.396, RTX 5080 = 4.276, NVIDIA RTX 5080 = 4.300
  Operation: INT32 Compute (TIOPs/s): RTX 5090 = 61.76, RTX 5080 = 30.21, NVIDIA RTX 5080 = 30.20
  Operation: INT16 Compute (TIOPs/s): RTX 5090 = 54.02, RTX 5080 = 27.22, NVIDIA RTX 5080 = 27.26
  Operation: INT8 Compute (TIOPs/s): RTX 5090 = 41.80, RTX 5080 = 20.37, NVIDIA RTX 5080 = 20.17
  Operation: Memory Bandwidth Coalesced Read (GB/s): RTX 5090 = 1596.24, RTX 5080 = 909.66, NVIDIA RTX 5080 = 910.07
  Operation: Memory Bandwidth Coalesced Write (GB/s): RTX 5090 = 1687.49, RTX 5080 = 911.11, NVIDIA RTX 5080 = 911.96
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
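
One way to read the OpenCL-Benchmark numbers is to compare the measured RTX 5090 / RTX 5080 speedup against the ratio of GPU compute cores reported in the OpenCL Details (21760 vs. 10752). The sketch below is illustrative only: `scaling_efficiency` is a hypothetical helper, and it assumes that core count alone is the relevant scaling factor, ignoring clock speeds, memory configuration, and power limits.

```python
# GPU compute core counts as reported in this result file's OpenCL Details.
CORES = {"RTX 5090": 21760, "RTX 5080": 10752}

# Selected ProjectPhysX OpenCL-Benchmark 1.6 results from this result file.
RESULTS = {
    "FP32 Compute (TFLOPs/s)": {"RTX 5090": 117.85, "RTX 5080": 57.85},
    "INT32 Compute (TIOPs/s)": {"RTX 5090": 61.76, "RTX 5080": 30.21},
    "INT64 Compute (TIOPs/s)": {"RTX 5090": 4.396, "RTX 5080": 4.276},
    "Memory Bandwidth Coalesced Read (GB/s)": {"RTX 5090": 1596.24, "RTX 5080": 909.66},
}


def scaling_efficiency(result: dict, fast: str = "RTX 5090", slow: str = "RTX 5080") -> float:
    """Measured speedup divided by the core-count ratio.

    A value near 1.0 means the result scales roughly with core count;
    a value well below 1.0 suggests another limit (assumption: equal clocks).
    """
    measured = result[fast] / result[slow]
    core_ratio = CORES[fast] / CORES[slow]
    return measured / core_ratio


for name, result in RESULTS.items():
    print(f"{name}: efficiency {scaling_efficiency(result):.2f}")
```

Under these assumptions, FP32 and INT32 throughput track the core count closely, while memory bandwidth and INT64 throughput do not, which is expected since they depend on resources other than the number of compute cores.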

NAMD CUDA

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. This version of the NAMD test profile uses CUDA GPU acceleration. Learn more via the OpenBenchmarking.org test page.

NAMD CUDA 2.14 (days/ns, Fewer Is Better)
  ATPase Simulation - 327,506 Atoms: RTX 5090 = 0.05810

ATPase Simulation - 327,506 Atoms

RTX 5080: The test run did not produce a result. E: FATAL ERROR: No simulation config file specified on command line.

NVIDIA RTX 5080: The test run did not produce a result. E: FATAL ERROR: No simulation config file specified on command line.
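
The NAMD CUDA 2.14 profile reports days/ns (fewer is better), whereas the NAMD CUDA 3.0.1 results later in this file report ns/day (more is better). The two units are reciprocals, so a conversion can be sketched as below. The function name is hypothetical, and a converted value is not a like-for-like comparison with the 3.0.1 results because the NAMD versions differ.

```python
def days_per_ns_to_ns_per_day(days_per_ns: float) -> float:
    """Convert a NAMD 'days/ns' figure to 'ns/day' (simple reciprocal)."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# RTX 5090, NAMD CUDA 2.14, ATPase Simulation - 327,506 Atoms
rtx_5090_ns_per_day = days_per_ns_to_ns_per_day(0.05810)
print(f"{rtx_5090_ns_per_day:.2f} ns/day")
```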

VkResample

VkResample is a Vulkan-based image upscaling library based on VkFFT. The sample input file is upscaling a 4K image to 8K using Vulkan-based GPU acceleration. Learn more via the OpenBenchmarking.org test page.

VkResample 1.0 (ms, Fewer Is Better)
  Upscale: 2x - Precision: Double: RTX 5090 = 103.51, RTX 5080 = 212.35, NVIDIA RTX 5080 = 212.31
  Upscale: 2x - Precision: Single: RTX 5090 = 5.648, RTX 5080 = 10.278, NVIDIA RTX 5080 = 10.276
1. (CXX) g++ options: -O3

FluidX3D

FluidX3D 3.0 (MLUPs/s, More Is Better)
  Test: FP32-FP32: RTX 5090 = 9524, RTX 5080 = 5174, NVIDIA RTX 5080 = 5174
  Test: FP32-FP16C: RTX 5090 = 19140, RTX 5080 = 10328, NVIDIA RTX 5080 = 10330
  Test: FP32-FP16S: RTX 5090 = 18499, RTX 5080 = 10276, NVIDIA RTX 5080 = 10275

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak 1.1.2
  OpenCL Test: Kernel Latency (us, Fewer Is Better): RTX 5090 = 5.15, RTX 5080 = 4.87, NVIDIA RTX 5080 = 4.87
  OpenCL Test: Integer Compute (GIOPS, More Is Better): RTX 5090 = 62151.94, RTX 5080 = 30128.04, NVIDIA RTX 5080 = 30173.73
  OpenCL Test: Integer 24-bit Compute (GIOPS, More Is Better): RTX 5090 = 61843.11, RTX 5080 = 30086.50, NVIDIA RTX 5080 = 30059.43
  OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better): RTX 5090 = 1562.97, RTX 5080 = 849.23, NVIDIA RTX 5080 = 849.74
  OpenCL Test: Double-Precision Compute (GFLOPS, More Is Better): RTX 5090 = 1976.90, RTX 5080 = 962.39, NVIDIA RTX 5080 = 962.45
  OpenCL Test: Single-Precision Compute (GFLOPS, More Is Better): RTX 5090 = 121415.53, RTX 5080 = 59218.89, NVIDIA RTX 5080 = 59305.85
  OpenCL Test: Transfer Bandwidth enqueueReadBuffer (GBPS, More Is Better): RTX 5090 = 13.83, RTX 5080 = 13.95, NVIDIA RTX 5080 = 13.70
  OpenCL Test: Transfer Bandwidth enqueueWriteBuffer (GBPS, More Is Better): RTX 5090 = 18.41, RTX 5080 = 18.64, NVIDIA RTX 5080 = 18.49
1. (CXX) g++ options: -O3

NCNN

NCNN 20241226 (ms, Fewer Is Better)
  Target: Vulkan GPU - Model: mobilenet: RTX 5090 = 42.44 (MIN: 8.34 / MAX: 76.17), RTX 5080 = 71.27 (MIN: 8.2 / MAX: 79.42), NVIDIA RTX 5080 = 71.02 (MIN: 8.7 / MAX: 81.03)
  Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2: RTX 5090 = 10.21 (MIN: 3.85 / MAX: 64.73), RTX 5080 = 41.52 (MIN: 3.78 / MAX: 68.66), NVIDIA RTX 5080 = 33.96 (MIN: 3.85 / MAX: 68.9)
  Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3: RTX 5090 = 4.68 (MIN: 4.08 / MAX: 57.94), RTX 5080 = 4.22 (MIN: 4.04 / MAX: 4.96), NVIDIA RTX 5080 = 6.00 (MIN: 4.05 / MAX: 79)
  Target: Vulkan GPU - Model: shufflenet-v2: RTX 5090 = 4.53 (MIN: 3.88 / MAX: 57.59), RTX 5080 = 10.82 (MIN: 3.9 / MAX: 74.52), NVIDIA RTX 5080 = 9.75 (MIN: 3.91 / MAX: 74.73)
  Target: Vulkan GPU - Model: mnasnet: RTX 5090 = 9.80 (MIN: 3.75 / MAX: 63.5), RTX 5080 = 37.59 (MIN: 3.7 / MAX: 66.4), NVIDIA RTX 5080 = 29.33 (MIN: 3.68 / MAX: 66.87)
  Target: Vulkan GPU - Model: efficientnet-b0: RTX 5090 = 27.44 (MIN: 6.34 / MAX: 109.98), RTX 5080 = 63.39 (MIN: 6.28 / MAX: 114.02), NVIDIA RTX 5080 = 65.77 (MIN: 6.28 / MAX: 115.25)
  Target: Vulkan GPU - Model: blazeface: RTX 5090 = 2.67 (MIN: 2.4 / MAX: 41.21), RTX 5080 = 6.42 (MIN: 2.36 / MAX: 52.25), NVIDIA RTX 5080 = 4.08 (MIN: 2.36 / MAX: 52.23)
  Target: Vulkan GPU - Model: googlenet: RTX 5090 = 13.92 (MIN: 7.49 / MAX: 98.37), RTX 5080 = 36.58 (MIN: 7.53 / MAX: 102.56), NVIDIA RTX 5080 = 42.54 (MIN: 7.59 / MAX: 104.85)
  Target: Vulkan GPU - Model: vgg16: RTX 5090 = 37.05 (MIN: 22.53 / MAX: 46.22), RTX 5080 = 42.01 (MIN: 23.96 / MAX: 46.31), NVIDIA RTX 5080 = 42.47 (MIN: 25.67 / MAX: 47.21)
  Target: Vulkan GPU - Model: resnet18: RTX 5090 = 10.93 (MIN: 4.48 / MAX: 44.35), RTX 5080 = 17.42 (MIN: 4.48 / MAX: 44.53), NVIDIA RTX 5080 = 16.31 (MIN: 4.4 / MAX: 44.98)
  Target: Vulkan GPU - Model: alexnet: RTX 5090 = 8.74 (MIN: 3.21 / MAX: 22.23), RTX 5080 = 18.54 (MIN: 3.21 / MAX: 22.49), NVIDIA RTX 5080 = 17.60 (MIN: 3.2 / MAX: 23.03)
  Target: Vulkan GPU - Model: resnet50: RTX 5090 = 28.58 (MIN: 10.02 / MAX: 89.48), RTX 5080 = 44.76 (MIN: 9.99 / MAX: 92.95), NVIDIA RTX 5080 = 57.57 (MIN: 9.95 / MAX: 94)
  Target: Vulkan GPUv2-yolov3v2-yolov3 - Model: mobilenetv2-yolov3: RTX 5090 = 42.44 (MIN: 8.34 / MAX: 76.17), RTX 5080 = 71.27 (MIN: 8.2 / MAX: 79.42), NVIDIA RTX 5080 = 71.02 (MIN: 8.7 / MAX: 81.03)
  Target: Vulkan GPU - Model: yolov4-tiny: RTX 5090 = 39.34 (MIN: 15.92 / MAX: 49.04), RTX 5080 = 44.09 (MIN: 14.62 / MAX: 49.42), NVIDIA RTX 5080 = 44.55 (MIN: 13.75 / MAX: 49.72)
  Target: Vulkan GPU - Model: squeezenet_ssd: RTX 5090 = 22.02 (MIN: 7.41 / MAX: 92.66), RTX 5080 = 68.02 (MIN: 7.34 / MAX: 99.35), NVIDIA RTX 5080 = 68.01 (MIN: 7.27 / MAX: 102.26)
  Target: Vulkan GPU - Model: regnety_400m: RTX 5090 = 47.90 (MIN: 21.96 / MAX: 421.33), RTX 5080 = 165.97 (MIN: 21.95 / MAX: 492.35), NVIDIA RTX 5080 = 155.22 (MIN: 21.96 / MAX: 495.67)
  Target: Vulkan GPU - Model: vision_transformer: RTX 5090 = 62.76 (MIN: 40.3 / MAX: 105.61), RTX 5080 = 107.87 (MIN: 40.48 / MAX: 118.93), NVIDIA RTX 5080 = 108.87 (MIN: 45.11 / MAX: 120.3)
  Target: Vulkan GPU - Model: FastestDet: RTX 5090 = 11.01 (MIN: 5.09 / MAX: 92.36), RTX 5080 = 43.00 (MIN: 5.09 / MAX: 96.19), NVIDIA RTX 5080 = 32.48 (MIN: 5.11 / MAX: 96.35)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Blender

Blender 4.3 (Seconds, Fewer Is Better)
  Blend File: BMW27 - Compute: NVIDIA CUDA: RTX 5090 = 4.72, RTX 5080 = 7.34, NVIDIA RTX 5080 = 7.29
  Blend File: BMW27 - Compute: NVIDIA OptiX: RTX 5090 = 2.92, RTX 5080 = 4.12, NVIDIA RTX 5080 = 4.10
  Blend File: Junkshop - Compute: NVIDIA CUDA: RTX 5090 = 8.99, RTX 5080 = 13.80, NVIDIA RTX 5080 = 12.94
  Blend File: Classroom - Compute: NVIDIA CUDA: RTX 5090 = 8.38, RTX 5080 = 14.08, NVIDIA RTX 5080 = 14.05
  Blend File: Fishy Cat - Compute: NVIDIA CUDA: RTX 5090 = 8.92, RTX 5080 = 14.40, NVIDIA RTX 5080 = 14.43
  Blend File: Junkshop - Compute: NVIDIA OptiX: RTX 5090 = 5.66, RTX 5080 = 7.84, NVIDIA RTX 5080 = 7.77
  Blend File: Barbershop - Compute: NVIDIA CUDA: RTX 5090 = 35.14, RTX 5080 = 58.30, NVIDIA RTX 5080 = 58.07
  Blend File: Classroom - Compute: NVIDIA OptiX: RTX 5090 = 6.16, RTX 5080 = 9.47, NVIDIA RTX 5080 = 9.43
  Blend File: Fishy Cat - Compute: NVIDIA OptiX: RTX 5090 = 4.55, RTX 5080 = 7.15, NVIDIA RTX 5080 = 7.19
  Blend File: Barbershop - Compute: NVIDIA OptiX: RTX 5090 = 24.33, RTX 5080 = 39.48, NVIDIA RTX 5080 = 39.09
  Blend File: Pabellon Barcelona - Compute: NVIDIA CUDA: RTX 5090 = 17.35, RTX 5080 = 31.55, NVIDIA RTX 5080 = 31.54
  Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX: RTX 5090 = 7.00, RTX 5080 = 10.60, NVIDIA RTX 5080 = 10.39

IndigoBench

This is a test of Indigo Renderer's IndigoBench benchmark. Learn more via the OpenBenchmarking.org test page.

IndigoBench 4.4 (M samples/s, More Is Better)
  Acceleration: OpenCL GPU - Scene: Bedroom: RTX 5090 = 42.76, RTX 5080 = 26.29, NVIDIA RTX 5080 = 26.27
  Acceleration: OpenCL GPU - Scene: Supercar: RTX 5090 = 92.70, RTX 5080 = 68.35, NVIDIA RTX 5080 = 68.32

Chaos Group V-RAY

This is a test of Chaos Group's V-RAY benchmark. V-RAY is a commercial renderer that can integrate with various creator software products like SketchUp and 3ds Max. The V-RAY benchmark is standalone and supports CPU and NVIDIA CUDA/RTX based rendering. Learn more via the OpenBenchmarking.org test page.

Chaos Group V-RAY 6.0 (vpaths, More Is Better)
  Mode: NVIDIA RTX GPU: RTX 5090 = 11923, RTX 5080 = 7388, NVIDIA RTX 5080 = 7372
  Mode: NVIDIA CUDA GPU: RTX 5090 = 4851, RTX 5080 = 2528, NVIDIA RTX 5080 = 2528

NAMD CUDA

NAMD CUDA 3.0.1 (ns/day, More Is Better)
  Input: ATPase with 327,506 Atoms: RTX 5080 = 14.57, NVIDIA RTX 5080 = 14.44
  Input: STMV with 1,066,628 Atoms: RTX 5080 = 4.02164, NVIDIA RTX 5080 = 4.01155

Llama.cpp

Llama.cpp b4397, Backend: NVIDIA CUDA (Tokens Per Second, More Is Better)
  Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128: RTX 5080 = 97.00, NVIDIA RTX 5080 = 97.03
  Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512: RTX 5080 = 7804.82, NVIDIA RTX 5080 = 7799.24
  Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024: RTX 5080 = 7292.84, NVIDIA RTX 5080 = 7304.83
  Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048: RTX 5080 = 6627.59, NVIDIA RTX 5080 = 6630.67
  Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128: RTX 5080 = 105.09, NVIDIA RTX 5080 = 105.25
  Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512: RTX 5080 = 3878.92, NVIDIA RTX 5080 = 3849.78
  Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024: RTX 5080 = 3831.53, NVIDIA RTX 5080 = 3832.03
  Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048: RTX 5080 = 3705.87, NVIDIA RTX 5080 = 3687.77
  Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128: RTX 5080 = 101.93, NVIDIA RTX 5080 = 101.84
  Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512: RTX 5080 = 7865.65, NVIDIA RTX 5080 = 7864.88
  Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024: RTX 5080 = 7327.35, NVIDIA RTX 5080 = 7331.65
  Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048: RTX 5080 = 6642.52, NVIDIA RTX 5080 = 6653.97
1. (CXX) g++ options: -O3

88 Results Shown

RealSR-NCNN:
  4x - No
  4x - Yes
Waifu2x-NCNN Vulkan
Hashcat:
  MD5
  SHA1
  7-Zip
  SHA-512
  TrueCrypt RIPEMD160 + XTS
SHOC Scalable HeterOgeneous Computing:
  OpenCL - S3D
  OpenCL - Triad
  OpenCL - FFT SP
  OpenCL - MD5 Hash
  OpenCL - Reduction
  OpenCL - GEMM SGEMM_N
  OpenCL - Bus Speed Download
  OpenCL - Bus Speed Readback
  OpenCL - Texture Read Bandwidth
ProjectPhysX OpenCL-Benchmark:
  FP64 Compute
  FP32 Compute
  FP16 Compute
  INT64 Compute
  INT32 Compute
  INT16 Compute
  INT8 Compute
  Memory Bandwidth Coalesced Read
  Memory Bandwidth Coalesced Write
NAMD CUDA
VkResample:
  2x - Double
  2x - Single
FluidX3D:
  FP32-FP32
  FP32-FP16C
  FP32-FP16S
clpeak:
  Kernel Latency
  Integer Compute
  Integer 24-bit Compute
  Global Memory Bandwidth
  Double-Precision Compute
  Single-Precision Compute
  Transfer Bandwidth enqueueReadBuffer
  Transfer Bandwidth enqueueWriteBuffer
NCNN:
  Vulkan GPU - mobilenet
  Vulkan GPU-v2-v2 - mobilenet-v2
  Vulkan GPU-v3-v3 - mobilenet-v3
  Vulkan GPU - shufflenet-v2
  Vulkan GPU - mnasnet
  Vulkan GPU - efficientnet-b0
  Vulkan GPU - blazeface
  Vulkan GPU - googlenet
  Vulkan GPU - vgg16
  Vulkan GPU - resnet18
  Vulkan GPU - alexnet
  Vulkan GPU - resnet50
  Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3
  Vulkan GPU - yolov4-tiny
  Vulkan GPU - squeezenet_ssd
  Vulkan GPU - regnety_400m
  Vulkan GPU - vision_transformer
  Vulkan GPU - FastestDet
Blender:
  BMW27 - NVIDIA CUDA
  BMW27 - NVIDIA OptiX
  Junkshop - NVIDIA CUDA
  Classroom - NVIDIA CUDA
  Fishy Cat - NVIDIA CUDA
  Junkshop - NVIDIA OptiX
  Barbershop - NVIDIA CUDA
  Classroom - NVIDIA OptiX
  Fishy Cat - NVIDIA OptiX
  Barbershop - NVIDIA OptiX
  Pabellon Barcelona - NVIDIA CUDA
  Pabellon Barcelona - NVIDIA OptiX
IndigoBench:
  OpenCL GPU - Bedroom
  OpenCL GPU - Supercar
Chaos Group V-RAY:
  NVIDIA RTX GPU
  NVIDIA CUDA GPU
NAMD CUDA:
  ATPase with 327,506 Atoms
  STMV with 1,066,628 Atoms
Llama.cpp:
  NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128
  NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512
  NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024
  NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048
  NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128
  NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512
  NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024
  NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048
  NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128
  NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512
  NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024
  NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048