Vulkan Compute

AMD Ryzen 9 5900X 12-Core testing with an ASUS ROG CROSSHAIR VIII HERO (3501 BIOS) and eVGA NVIDIA GeForce RTX 3060 12GB on Ubuntu 21.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2107307-PTS-VULKANCO44

Result identifier: NVIDIA RTX 3060
Date run: July 30
Test duration: 2 Hours, 55 Minutes

Vulkan Compute Benchmarks: system details (OpenBenchmarking.org, Phoronix Test Suite 10.6.1)

Processor: AMD Ryzen 9 5900X 12-Core @ 3.70GHz (12 Cores / 24 Threads)
Motherboard: ASUS ROG CROSSHAIR VIII HERO (3501 BIOS)
Chipset: AMD Starship/Matisse
Memory: 16GB
Disk: 1000GB Sabrent Rocket 4.0 Plus + 2000GB
Graphics: eVGA NVIDIA GeForce RTX 3060 12GB
Audio: NVIDIA Device 228e
Monitor: ASUS VP28U
Network: Realtek RTL8125 2.5GbE + Intel I211
OS: Ubuntu 21.04
Kernel: 5.11.0-25-generic (x86_64)
Desktop: GNOME Shell 3.38.4
Display Server: X Server 1.20.11
Display Driver: NVIDIA 470.57.02
OpenGL: 4.6.0
Vulkan: 1.2.175
Compiler: GCC 10.3.0
File-System: ext4
Screen Resolution: 3840x2160

System Logs

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-mutex --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
- CPU Microcode: 0xa201009
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected

Vulkan Compute: result overview (NVIDIA RTX 3060)

vkpeak: fp32-scalar: 6829.75 GFLOPS
vkpeak: fp32-vec4: 9079.50 GFLOPS
vkpeak: fp16-scalar: 6849.64 GFLOPS
vkpeak: fp16-vec4: 13242.60 GFLOPS
vkpeak: fp64-scalar: 214.25 GFLOPS
vkpeak: fp64-vec4: 214.29 GFLOPS
vkpeak: int32-scalar: 6830.28 GIOPS
vkpeak: int32-vec4: 6766.46 GIOPS
vkpeak: int16-scalar: 4480.11 GIOPS
vkpeak: int16-vec4: 5957.52 GIOPS
realsr-ncnn: 4x - No: 10.505 seconds
realsr-ncnn: 4x - Yes: 67.903 seconds
waifu2x-ncnn: 2x - 3 - Yes: 4.974 seconds
vkfft: 27337 (benchmark score)
vkresample: 2x - Single: 23.149 ms
ncnn: Vulkan GPU - mobilenet: 4.57 ms
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2: 1.95 ms
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3: 2.18 ms
ncnn: Vulkan GPU - shufflenet-v2: 1.74 ms
ncnn: Vulkan GPU - mnasnet: 2.04 ms
ncnn: Vulkan GPU - efficientnet-b0: 3.25 ms
ncnn: Vulkan GPU - blazeface: 0.98 ms
ncnn: Vulkan GPU - googlenet: 4.27 ms
ncnn: Vulkan GPU - vgg16: 7.3 ms
ncnn: Vulkan GPU - alexnet: 2.10 ms
ncnn: Vulkan GPU - resnet50: 4.19 ms
ncnn: Vulkan GPU - yolov4-tiny: 7.21 ms
ncnn: Vulkan GPU - squeezenet_ssd: 4.88 ms
ncnn: Vulkan GPU - regnety_400m: 2.44 ms
ncnn: Vulkan GPU - resnet18: 1.93 ms

vkpeak

Vkpeak is a Vulkan compute benchmark inspired by OpenCL's clpeak. It measures Vulkan compute throughput for FP16, FP32, FP64, INT16, and INT32 operations in both scalar and vec4 form. Learn more via the OpenBenchmarking.org test page.

vkpeak 20210424 results, NVIDIA RTX 3060 (OpenBenchmarking.org; more is better):

fp32-scalar: 6829.75 GFLOPS (SE +/- 10.62, N = 3)
fp32-vec4: 9079.50 GFLOPS (SE +/- 0.30, N = 3)
fp16-scalar: 6849.64 GFLOPS (SE +/- 17.95, N = 3)
fp16-vec4: 13242.60 GFLOPS (SE +/- 2.31, N = 3)
fp64-scalar: 214.25 GFLOPS (SE +/- 0.03, N = 3)
fp64-vec4: 214.29 GFLOPS (SE +/- 0.01, N = 3)
int32-scalar: 6830.28 GIOPS (SE +/- 0.58, N = 3)
int32-vec4: 6766.46 GIOPS (SE +/- 17.45, N = 3)
int16-scalar: 4480.11 GIOPS (SE +/- 0.29, N = 3)
int16-vec4: 5957.52 GIOPS (SE +/- 0.03, N = 3)
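
One way to read the vkpeak figures is to compare vec4 throughput against scalar throughput for each data type. The following is a minimal sketch in Python that uses only the numbers reported above; the names `VKPEAK` and `vec4_speedup` are illustrative and not part of vkpeak or the Phoronix Test Suite.

```python
# vkpeak results reported above for the NVIDIA RTX 3060
# (GFLOPS for floating point, GIOPS for integer types).
VKPEAK = {
    "fp32": {"scalar": 6829.75, "vec4": 9079.50},
    "fp16": {"scalar": 6849.64, "vec4": 13242.60},
    "fp64": {"scalar": 214.25, "vec4": 214.29},
    "int32": {"scalar": 6830.28, "vec4": 6766.46},
    "int16": {"scalar": 4480.11, "vec4": 5957.52},
}


def vec4_speedup(results):
    """Return the vec4 / scalar throughput ratio for each data type."""
    return {dtype: r["vec4"] / r["scalar"] for dtype, r in results.items()}


if __name__ == "__main__":
    for dtype, ratio in vec4_speedup(VKPEAK).items():
        print(f"{dtype:>5}: vec4 is {ratio:.2f}x scalar")
```

A ratio near 1.0 (as for fp64 and int32 here) means the vec4 variant gained little over scalar on this run, while fp16 shows the largest gain.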

RealSR-NCNN

RealSR-NCNN is an NCNN neural network implementation of the RealSR project, accelerated using the Vulkan API. RealSR stands for Real-World Super-Resolution via Kernel Estimation and Noise Injection. NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. This test profile times how long it takes to increase the resolution of a sample image by a scale of 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.

RealSR-NCNN 20200818 results, NVIDIA RTX 3060 (OpenBenchmarking.org; seconds, fewer is better):

Scale: 4x - TAA: No: 10.51 (SE +/- 0.00, N = 3)
Scale: 4x - TAA: Yes: 67.90 (SE +/- 0.06, N = 3)

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project, accelerated using the Vulkan API. NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Waifu2x-NCNN Vulkan 20200818 result, NVIDIA RTX 3060 (OpenBenchmarking.org; seconds, fewer is better):

Scale: 2x - Denoise: 3 - TAA: Yes: 4.974 (SE +/- 0.004, N = 3)

VkFFT

VkFFT is a Fast Fourier Transform (FFT) library that is GPU-accelerated by means of the Vulkan API. The VkFFT benchmark measures FFT performance across many different sizes before returning an overall benchmark score. Learn more via the OpenBenchmarking.org test page.

VkFFT 1.1.1 result, NVIDIA RTX 3060 (OpenBenchmarking.org; benchmark score, more is better):

27337 (SE +/- 308.54, N = 3)

(CXX) g++ options: -O3 -pthread
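
The "SE +/-" figure is easier to judge relative to the result itself. The sketch below expresses the VkFFT standard error as a percentage of the reported score and, under the assumption that SE is the sample standard deviation divided by the square root of N (the result file does not state the formula), backs out the implied run-to-run standard deviation. The function names are illustrative only.

```python
import math


def relative_standard_error(mean, se):
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


def implied_sample_stddev(se, n):
    """Return the sample standard deviation implied by SE = s / sqrt(n).

    This relationship is an assumption about how SE was computed.
    """
    return se * math.sqrt(n)


if __name__ == "__main__":
    # VkFFT 1.1.1 result reported above: score 27337, SE +/- 308.54, N = 3.
    score, se, n = 27337, 308.54, 3
    print(f"relative SE: {relative_standard_error(score, se):.2f}%")
    print(f"implied std dev: {implied_sample_stddev(se, n):.1f}")
```

The same helpers can be applied to any other result in this file by substituting its value, SE, and N.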

VkResample

VkResample is a Vulkan-based image upscaling library based on VkFFT. The sample input upscales a 4K image to 8K using Vulkan-based GPU acceleration. Learn more via the OpenBenchmarking.org test page.

VkResample 1.0 result, NVIDIA RTX 3060 (OpenBenchmarking.org; ms, fewer is better):

Upscale: 2x - Precision: Single: 23.15 (SE +/- 0.02, N = 3)

(CXX) g++ options: -O3 -pthread

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. Learn more via the OpenBenchmarking.org test page.

NCNN 20210720 results, NVIDIA RTX 3060 (OpenBenchmarking.org; ms, fewer is better):

Target: Vulkan GPU - Model: mobilenet: 4.57 (SE +/- 0.01, N = 3; MIN: 4.52 / MAX: 4.92)
Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2: 1.95 (SE +/- 0.00, N = 3; MIN: 1.92 / MAX: 2.86)
Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3: 2.18 (SE +/- 0.00, N = 3; MIN: 2.16 / MAX: 3.53)
Target: Vulkan GPU - Model: shufflenet-v2: 1.74 (SE +/- 0.00, N = 3; MIN: 1.71 / MAX: 3.74)
Target: Vulkan GPU - Model: mnasnet: 2.04 (SE +/- 0.01, N = 3; MIN: 2.02 / MAX: 4.46)
Target: Vulkan GPU - Model: efficientnet-b0: 3.25 (SE +/- 0.01, N = 3; MIN: 3.22 / MAX: 3.7)
Target: Vulkan GPU - Model: blazeface: 0.98 (SE +/- 0.00, N = 3; MIN: 0.95 / MAX: 2.33)
Target: Vulkan GPU - Model: googlenet: 4.27 (SE +/- 0.08, N = 3; MIN: 3.95 / MAX: 15.21)
Target: Vulkan GPU - Model: vgg16: 7.3 (SE +/- 0.00, N = 3; MIN: 7.17 / MAX: 15.83)
Target: Vulkan GPU - Model: alexnet: 2.10 (SE +/- 0.00, N = 3; MIN: 2.07 / MAX: 4.52)
Target: Vulkan GPU - Model: resnet50: 4.19 (SE +/- 0.01, N = 3; MIN: 4.17 / MAX: 4.84)
Target: Vulkan GPU - Model: yolov4-tiny: 7.21 (SE +/- 0.01, N = 3; MIN: 7.02 / MAX: 7.52)
Target: Vulkan GPU - Model: squeezenet_ssd: 4.88 (SE +/- 0.07, N = 3; MIN: 4.67 / MAX: 10.31)
Target: Vulkan GPU - Model: regnety_400m: 2.44 (SE +/- 0.00, N = 3; MIN: 2.41 / MAX: 3.44)
Target: Vulkan GPU - Model: resnet18: 1.93 (SE +/- 0.00, N = 2; MIN: 1.91 / MAX: 2.25)

(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread
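
The NCNN results are per-inference times in milliseconds. As a rough aid, the sketch below converts them to inferences per second, assuming inferences run back to back at the reported average time with no batching or overlap; that assumption is not stated in the result file, so the output is an estimate, not a measured throughput. The names `NCNN_MS` and `inferences_per_second` are illustrative.

```python
# Average NCNN inference times (ms) reported above for the NVIDIA RTX 3060.
NCNN_MS = {
    "mobilenet": 4.57,
    "mobilenet-v2": 1.95,
    "mobilenet-v3": 2.18,
    "shufflenet-v2": 1.74,
    "mnasnet": 2.04,
    "efficientnet-b0": 3.25,
    "blazeface": 0.98,
    "googlenet": 4.27,
    "vgg16": 7.3,
    "alexnet": 2.10,
    "resnet50": 4.19,
    "yolov4-tiny": 7.21,
    "squeezenet_ssd": 4.88,
    "regnety_400m": 2.44,
    "resnet18": 1.93,
}


def inferences_per_second(latency_ms):
    """Convert a per-inference time in ms to inferences per second.

    Assumes sequential, unbatched execution at the average time.
    """
    return 1000.0 / latency_ms


if __name__ == "__main__":
    # List models from fastest to slowest.
    for model, ms in sorted(NCNN_MS.items(), key=lambda kv: kv[1]):
        print(f"{model:<16} {ms:>5.2f} ms  ~{inferences_per_second(ms):7.1f} inf/s")
```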