NVK Vulkan

Some early NVK Vulkan benchmarks by Michael Larabel.

HTML result view exported from: https://openbenchmarking.org/result/2308119-PTS-NVKVULKA25&sor.

System Details

Shared by all configurations:
  Motherboard: ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS)
  Chipset: AMD Device 14d8
  Memory: 32GB
  Disk: Western Digital WD_BLACK SN850X 1000GB + 4001GB
  Monitor: ASUS MG28U
  Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
  OS: Ubuntu 23.04
  Desktop: GNOME Shell 44.3
  Compiler: GCC 12.3.0
  File-System: ext4
  Screen Resolution: 3840x2160

RTX 2060: NVIDIA 535
  Processor: AMD Ryzen 9 7950X 16-Core @ 4.50GHz (16 Cores / 32 Threads)
  Graphics: NVIDIA GeForce RTX 2060 6GB
  Audio: NVIDIA TU106 HD Audio
  Kernel: 6.2.0-27-generic (x86_64)
  Display Server: X Server 1.21.1.7
  Display Driver: NVIDIA 535.98
  OpenGL: 4.6.0

RTX 2060: NVK
  Processor: AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads)
  Graphics: NVIDIA NV166 6GB
  Audio: NVIDIA TU106 HD Audio
  Kernel: 6.5.0-rc2-phx-nvk (x86_64)
  Display Server: X Server 1.21.1.7 + Wayland
  Display Driver: nouveau
  OpenGL: 4.3 Mesa 23.3~git2308100600.81cae3~oibaf~l (git-81cae3d 2023-08-10 lunar-oibaf-ppa)

RTX 3060 Ti: NVIDIA 535
  Processor: AMD Ryzen 9 7950X 16-Core @ 4.50GHz (16 Cores / 32 Threads)
  Graphics: NVIDIA GeForce RTX 3060 Ti 8GB
  Audio: NVIDIA GA104 HD Audio
  Kernel: 6.2.0-27-generic (x86_64)
  Display Server: X Server 1.21.1.7
  Display Driver: NVIDIA 535.98
  OpenGL: 4.6.0

RTX 3060 Ti: NVK
  Processor: AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads)
  Graphics: NVIDIA NV174 8GB
  Audio: NVIDIA GA104 HD Audio
  Kernel: 6.5.0-rc2-phx-nvk (x86_64)
  Display Server: X Server 1.21.1.7 + Wayland
  Display Driver: nouveau
  OpenGL: 4.3 Mesa 23.3~git2308100600.81cae3~oibaf~l (git-81cae3d 2023-08-10 lunar-oibaf-ppa)

RTX 3070 Ti: NVIDIA 535
  Processor: AMD Ryzen 9 7950X 16-Core @ 4.50GHz (16 Cores / 32 Threads)
  Graphics: NVIDIA GeForce RTX 3070 Ti 8GB
  Audio: NVIDIA GA104 HD Audio
  Kernel: 6.2.0-27-generic (x86_64)
  Display Server: X Server 1.21.1.7
  Display Driver: NVIDIA 535.98
  OpenGL: 4.6.0

RTX 3070 Ti: NVK
  Processor: AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads)
  Graphics: NVIDIA NV174 8GB
  Audio: NVIDIA GA104 HD Audio
  Kernel: 6.5.0-rc2-phx-nvk (x86_64)
  Display Server: X Server 1.21.1.7 + Wayland
  Display Driver: nouveau
  OpenGL: 4.3 Mesa 23.3~git2308100600.81cae3~oibaf~l (git-81cae3d 2023-08-10 lunar-oibaf-ppa)

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- RTX 2060: NVIDIA 535, RTX 3060 Ti: NVIDIA 535, RTX 3070 Ti: NVIDIA 535: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- RTX 2060: NVK, RTX 3060 Ti: NVK, RTX 3070 Ti: NVK: Scaling Governor: amd-pstate-epp performance (EPP: performance) - CPU Microcode: 0xa601203

Graphics Details
- RTX 2060: NVIDIA 535: BAR1 / Visible vRAM Size: 256 MiB - vBIOS Version: 90.06.2e.00.05
- RTX 3060 Ti: NVIDIA 535: BAR1 / Visible vRAM Size: 8192 MiB - vBIOS Version: 94.04.25.00.2c
- RTX 3070 Ti: NVIDIA 535: BAR1 / Visible vRAM Size: 8192 MiB - vBIOS Version: 94.04.5b.00.02

Security Details
- RTX 2060: NVIDIA 535, RTX 3060 Ti: NVIDIA 535, RTX 3070 Ti: NVIDIA 535: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
- RTX 2060: NVK, RTX 3060 Ti: NVK, RTX 3070 Ti: NVK: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Vbios Version Details
- RTX 2060: NVK, RTX 3060 Ti: NVK, RTX 3070 Ti: NVK: 102-RAPHAEL-008

Result Overview

A dash marks a test with no result for that configuration.

                                           RTX 2060                RTX 3060 Ti             RTX 3070 Ti             RTX 3060 Ti
Test                                       NVIDIA 535  NVK         NVIDIA 535  NVK         NVIDIA 535  NVK
yquake2: Vulkan - On - On - 1920 x 1080    844.6       13.4        1264.2      16.9        1508.5      17.0        -
yquake2: Vulkan - On - On - 2560 x 1440    525.5       8.1         797         10.0        975.3       10.2        -
yquake2: Vulkan - Off - On - 1920 x 1080   885.7       13.7        1314.0      17.1        1581.3      17.3        -
yquake2: Vulkan - Off - On - 2560 x 1440   550.7       8.1         827.3       10          1000.6      10.2        -
yquake2: Vulkan - On - Off - 1920 x 1080   2279.1      38.7        2972.4      49.6        3292.1      48.4        -
yquake2: Vulkan - On - Off - 2560 x 1440   1713.4      26.2        2142.4      31          2391.0      31.2        -
yquake2: Vulkan - On - Off - 3840 x 2160   810.6       15          1093.5      15.9        1193.2      16.6        -
yquake2: Vulkan - Off - Off - 1920 x 1080  2418.8      38.9        2991.6      49.9        3241.2      48.6        -
yquake2: Vulkan - Off - Off - 2560 x 1440  1708.9      26.3        2212.9      31.1        2396.9      31.3        -
yquake2: Vulkan - Off - Off - 3840 x 2160  806.6       15          1092.8      15.9        1203.8      16.6        -
vkmark: 1920 x 1080 - Mailbox              -           77          -           151         -           139         -
vkmark: 2560 x 1440 - Mailbox              -           54          -           83          -           82          -
vkpeak: fp32-scalar                        7369.72     62.70       9680.86     56.94       11642.72    70.80       -
vkpeak: fp32-vec4                          7350.02     75.23       12800.88    89.92       15425.47    110.30      -
vkpeak: int32-scalar                       -           63.14       -           59.24       -           72.83       -
vkpeak: int32-vec4                         -           75.44       -           71.02       -           86.96       -
vkfft: FFT + iFFT R2C / C2R                23340       816         33922       1104        42534       1053        -
vkpeak: fp16-scalar                        7259.09     -           9710.01     -           11692.30    -           9693.47
vkpeak: fp16-vec4                          14318.66    -           19169.71    -           23093.81    -           19134.98

yquake2

Renderer: Vulkan - AF: On - MSAA: On - Resolution: 1920 x 1080

yquake2 8.10 - Frames Per Second, More Is Better

             NVIDIA 535                      NVK
RTX 3070 Ti  1508.5 (SE +/- 8.31, N = 3)     17.0 (SE +/- 0.17, N = 3)
RTX 3060 Ti  1264.2 (SE +/- 11.86, N = 3)    16.9 (SE +/- 0.07, N = 3)
RTX 2060     844.6 (SE +/- 3.63, N = 3)      13.4 (SE +/- 0.10, N = 11)

1. (CC) gcc options: -shared -lm -ldl -rdynamic -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
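To put the gap between the two drivers in perspective, the frame rates above can be turned into a simple ratio. The following is a minimal sketch, not part of the exported result: the dictionary layout and the `speedup` helper are illustrative assumptions, and the numbers are copied from the 1920 x 1080, AF: On, MSAA: On result.

```python
# Frames per second copied from the yquake2 1920 x 1080 (AF: On, MSAA: On) result.
# The data layout and helper name are hypothetical, chosen only for illustration.
results = {
    "RTX 3070 Ti": {"NVIDIA 535": 1508.5, "NVK": 17.0},
    "RTX 3060 Ti": {"NVIDIA 535": 1264.2, "NVK": 16.9},
    "RTX 2060": {"NVIDIA 535": 844.6, "NVK": 13.4},
}


def speedup(fps_by_driver):
    """Return how many times faster NVIDIA 535 ran than NVK."""
    return fps_by_driver["NVIDIA 535"] / fps_by_driver["NVK"]


ratios = {gpu: speedup(fps) for gpu, fps in results.items()}

for gpu, ratio in ratios.items():
    print(f"{gpu}: NVIDIA 535 is roughly {ratio:.0f}x faster than NVK")
```

The same helper can be applied to any other test in this result that has both an NVIDIA 535 and an NVK value.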

yquake2

Renderer: Vulkan - AF: On - MSAA: On - Resolution: 2560 x 1440

yquake2 8.10 - Frames Per Second, More Is Better

             NVIDIA 535                      NVK
RTX 3070 Ti  975.3 (SE +/- 1.73, N = 3)      10.2 (SE +/- 0.00, N = 3)
RTX 3060 Ti  797.0 (SE +/- 2.31, N = 3)      10.0 (SE +/- 0.03, N = 3)
RTX 2060     525.5 (SE +/- 5.55, N = 3)      8.1 (SE +/- 0.00, N = 3)

yquake2

Renderer: Vulkan - AF: Off - MSAA: On - Resolution: 1920 x 1080

yquake2 8.10 - Frames Per Second, More Is Better

             NVIDIA 535                      NVK
RTX 3070 Ti  1581.3 (SE +/- 6.57, N = 3)     17.3 (SE +/- 0.03, N = 3)
RTX 3060 Ti  1314.0 (SE +/- 7.88, N = 3)     17.1 (SE +/- 0.03, N = 3)
RTX 2060     885.7 (SE +/- 7.68, N = 3)      13.7 (SE +/- 0.00, N = 3)

yquake2

Renderer: Vulkan - AF: Off - MSAA: On - Resolution: 2560 x 1440

yquake2 8.10 - Frames Per Second, More Is Better

             NVIDIA 535                      NVK
RTX 3070 Ti  1000.6 (SE +/- 2.79, N = 3)     10.2 (SE +/- 0.00, N = 3)
RTX 3060 Ti  827.3 (SE +/- 3.27, N = 3)      10.0 (SE +/- 0.00, N = 3)
RTX 2060     550.7 (SE +/- 2.57, N = 3)      8.1 (SE +/- 0.00, N = 3)

yquake2

Renderer: Vulkan - AF: On - MSAA: Off - Resolution: 1920 x 1080

yquake2 8.10 - Frames Per Second, More Is Better

             NVIDIA 535                      NVK
RTX 3070 Ti  3292.1 (SE +/- 26.37, N = 3)    48.4 (SE +/- 0.21, N = 3)
RTX 3060 Ti  2972.4 (SE +/- 30.28, N = 3)    49.6 (SE +/- 0.06, N = 3)
RTX 2060     2279.1 (SE +/- 16.65, N = 3)    38.7 (SE +/- 0.22, N = 3)

yquake2

Renderer: Vulkan - AF: On - MSAA: Off - Resolution: 2560 x 1440

yquake2 8.10 - Frames Per Second, More Is Better

             NVIDIA 535                      NVK
RTX 3070 Ti  2391.0 (SE +/- 10.91, N = 3)    31.2 (SE +/- 0.03, N = 3)
RTX 3060 Ti  2142.4 (SE +/- 0.00, N = 3)     31.0 (SE +/- 0.00, N = 3)
RTX 2060     1713.4 (SE +/- 23.66, N = 3)    26.2 (SE +/- 0.07, N = 3)

yquake2

Renderer: Vulkan - AF: On - MSAA: Off - Resolution: 3840 x 2160

yquake2 8.10 - Frames Per Second, More Is Better

             NVIDIA 535                      NVK
RTX 3070 Ti  1193.2 (SE +/- 5.25, N = 3)     16.6 (SE +/- 0.00, N = 3)
RTX 3060 Ti  1093.5 (SE +/- 6.67, N = 3)     15.9 (SE +/- 0.00, N = 3)
RTX 2060     810.6 (SE +/- 1.51, N = 3)      15.0 (SE +/- 0.00, N = 3)

yquake2

Renderer: Vulkan - AF: Off - MSAA: Off - Resolution: 1920 x 1080

yquake2 8.10 - Frames Per Second, More Is Better

             NVIDIA 535                      NVK
RTX 3070 Ti  3241.2 (SE +/- 19.20, N = 3)    48.6 (SE +/- 0.17, N = 3)
RTX 3060 Ti  2991.6 (SE +/- 31.93, N = 15)   49.9 (SE +/- 0.12, N = 3)
RTX 2060     2418.8 (SE +/- 24.71, N = 15)   38.9 (SE +/- 0.25, N = 3)

yquake2

Renderer: Vulkan - AF: Off - MSAA: Off - Resolution: 2560 x 1440

yquake2 8.10 - Frames Per Second, More Is Better

             NVIDIA 535                      NVK
RTX 3070 Ti  2396.9 (SE +/- 3.03, N = 3)     31.3 (SE +/- 0.03, N = 3)
RTX 3060 Ti  2212.9 (SE +/- 24.36, N = 3)    31.1 (SE +/- 0.03, N = 3)
RTX 2060     1708.9 (SE +/- 18.42, N = 5)    26.3 (SE +/- 0.09, N = 3)

yquake2

Renderer: Vulkan - AF: Off - MSAA: Off - Resolution: 3840 x 2160

yquake2 8.10 - Frames Per Second, More Is Better

             NVIDIA 535                      NVK
RTX 3070 Ti  1203.8 (SE +/- 2.30, N = 3)     16.6 (SE +/- 0.03, N = 3)
RTX 3060 Ti  1092.8 (SE +/- 1.64, N = 3)     15.9 (SE +/- 0.00, N = 3)
RTX 2060     806.6 (SE +/- 7.55, N = 3)      15.0 (SE +/- 0.00, N = 3)

VKMark

Resolution: 1920 x 1080 - Present Mode: Mailbox

VKMark 2022-05-16 - VKMark Score, More Is Better

             NVK
RTX 3060 Ti  151 (SE +/- 0.58, N = 3)
RTX 3070 Ti  139 (SE +/- 0.00, N = 3)
RTX 2060     77 (SE +/- 0.33, N = 3)

1. (CXX) g++ options: -pthread -ldl -std=c++14 -O0 -MD -MQ -MF

VKMark

Resolution: 2560 x 1440 - Present Mode: Mailbox

VKMark 2022-05-16 - VKMark Score, More Is Better

             NVK
RTX 3060 Ti  83 (SE +/- 0.00, N = 3)
RTX 3070 Ti  82 (SE +/- 0.00, N = 3)
RTX 2060     54 (SE +/- 0.00, N = 3)

vkpeak

fp32-scalar

vkpeak 20230730 - GFLOPS, More Is Better

             NVIDIA 535                      NVK
RTX 3070 Ti  11642.72 (SE +/- 22.64, N = 3)  70.80 (SE +/- 0.01, N = 3)
RTX 3060 Ti  9680.86 (SE +/- 9.07, N = 3)    56.94 (SE +/- 0.01, N = 3)
RTX 2060     7369.72 (SE +/- 60.39, N = 3)   62.70 (SE +/- 0.16, N = 3)
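The standard errors reported with each result are easier to compare across drivers when expressed relative to the mean. The sketch below is an illustration rather than part of the export: the `relative_se_percent` helper is a hypothetical name, and the pairing of each SE with its configuration assumes the SE values appear in the same order as the results (RTX 3070 Ti, RTX 3060 Ti, RTX 2060, with NVIDIA 535 before NVK).

```python
# (mean GFLOPS, standard error) copied from the vkpeak fp32-scalar result.
# Assumption: SE values are listed in the same order as the result values.
samples = {
    "RTX 3070 Ti: NVIDIA 535": (11642.72, 22.64),
    "RTX 3070 Ti: NVK": (70.80, 0.01),
    "RTX 3060 Ti: NVIDIA 535": (9680.86, 9.07),
    "RTX 3060 Ti: NVK": (56.94, 0.01),
    "RTX 2060: NVIDIA 535": (7369.72, 60.39),
    "RTX 2060: NVK": (62.70, 0.16),
}


def relative_se_percent(mean, se):
    """Express a standard error as a percentage of the mean."""
    return 100.0 * se / mean


relative_se = {
    name: relative_se_percent(mean, se) for name, (mean, se) in samples.items()
}

for name, pct in relative_se.items():
    print(f"{name}: SE is {pct:.2f}% of the mean")
```

Under that pairing assumption, every configuration stays below one percent, so run-to-run noise is small compared with the difference between the two drivers.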

vkpeak

fp32-vec4

vkpeak 20230730 - GFLOPS, More Is Better

             NVIDIA 535                      NVK
RTX 3070 Ti  15425.47 (SE +/- 0.67, N = 3)   110.30 (SE +/- 0.05, N = 3)
RTX 3060 Ti  12800.88 (SE +/- 1.16, N = 3)   89.92 (SE +/- 0.00, N = 3)
RTX 2060     7350.02 (SE +/- 33.15, N = 3)   75.23 (SE +/- 0.08, N = 3)

vkpeak

int32-scalar

vkpeak 20230730 - GIOPS, More Is Better

             NVK
RTX 3070 Ti  72.83 (SE +/- 0.01, N = 3)
RTX 2060     63.14 (SE +/- 0.12, N = 3)
RTX 3060 Ti  59.24 (SE +/- 0.00, N = 3)

vkpeak

int32-vec4

vkpeak 20230730 - GIOPS, More Is Better

             NVK
RTX 3070 Ti  86.96 (SE +/- 0.03, N = 3)
RTX 2060     75.44 (SE +/- 0.04, N = 3)
RTX 3060 Ti  71.02 (SE +/- 0.00, N = 3)

VkFFT

Test: FFT + iFFT R2C / C2R

VkFFT 1.2.31 - Benchmark Score, More Is Better

             NVIDIA 535                      NVK
RTX 3070 Ti  42534 (SE +/- 264.61, N = 3)    1053 (SE +/- 1.76, N = 3)
RTX 3060 Ti  33922 (SE +/- 12.44, N = 3)     1104 (SE +/- 0.58, N = 3)
RTX 2060     23340 (SE +/- 15.52, N = 3)     816 (SE +/- 9.50, N = 2)

1. (CXX) g++ options: -O3

vkpeak

fp16-scalar

vkpeak 20230730 - GFLOPS, More Is Better

RTX 3070 Ti: NVIDIA 535  11692.30 (SE +/- 29.33, N = 3)
RTX 3060 Ti: NVIDIA 535  9710.01 (SE +/- 16.04, N = 3)
RTX 3060 Ti              9693.92 (SE +/- 0.89, N = 3)
RTX 2060: NVIDIA 535     7323.43 (SE +/- 19.13, N = 3)

vkpeak

fp16-vec4

vkpeak 20230730 - GFLOPS, More Is Better

RTX 3070 Ti: NVIDIA 535  23093.81 (SE +/- 59.94, N = 3)
RTX 3060 Ti: NVIDIA 535  19169.71 (SE +/- 32.39, N = 3)
RTX 3060 Ti              19135.85 (SE +/- 0.27, N = 3)
RTX 2060: NVIDIA 535     14432.17 (SE +/- 65.37, N = 3)


Phoronix Test Suite v10.8.4