vulkan benchmarks

AMD Ryzen 9 7950X 16-Core testing with an ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS) and AMD Radeon RX 6700 XT on Ubuntu 23.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2308012-PTS-VULKANBE49&grr&sor.

System details (runs a, b, c)

Processor: AMD Ryzen 9 7950X 16-Core @ 4.50GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS)
Chipset: AMD Device 14d8
Memory: 32GB
Disk: Western Digital WD_BLACK SN850X 1000GB + 4001GB
Graphics: AMD Radeon RX 6700 XT (2855/1000MHz)
Audio: AMD Navi 21/23
Monitor: ASUS MG28U
Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
OS: Ubuntu 23.04
Kernel: 6.4.6-060406-generic (x86_64)
Desktop: GNOME Shell 44.2
Display Server: X Server 1.21.1.7 + Wayland
OpenGL: 4.6 Mesa 23.3~git2307260600.87109c~oibaf~l (git-87109c3 2023-07-26 lunar-oibaf-ppa) (LLVM 15.0.7 DRM 3.52)
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa601203

Graphics Details: BAR1 / Visible vRAM Size: 12272 MB - vBIOS Version: 113-D5121100-101

Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Results overview (runs a, b, c): vkpeak, VkFFT, NCNN (CPU and Vulkan GPU targets), and VkResample. The per-test values are listed in the sections that follow.

vkpeak

int16-vec4

vkpeak 20230730 - int16-vec4 (GIOPS, More Is Better)
b: 23396.59
c: 23385.44
a: 23123.77
SE +/- 21.55, N = 3
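The "SE +/- ..., N = 3" figures in these results read as a standard error over N trial runs. As a minimal sketch, assuming SE here means the standard error of the mean (sample standard deviation divided by the square root of N), it could be computed as below. The per-trial readings are not part of this export, so the `trials` values and the `standard_error` helper are hypothetical and for illustration only.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-trial GIOPS readings; NOT taken from this result file.
trials = [23100.0, 23120.0, 23150.0]
print(
    f"mean = {statistics.mean(trials):.2f}, "
    f"SE +/- {standard_error(trials):.2f}, N = {len(trials)}"
)
```

A small SE relative to the mean suggests the trials agreed closely; it says nothing about differences between runs a, b, and c.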

vkpeak

int16-scalar

vkpeak 20230730 - int16-scalar (GIOPS, More Is Better)
a: 13102.75
b: 13070.81
c: 13063.86
SE +/- 1.30, N = 3

vkpeak

int32-vec4

vkpeak 20230730 - int32-vec4 (GIOPS, More Is Better)
a: 2658.73
b: 2640.08
c: 2638.69
SE +/- 0.26, N = 3

vkpeak

int32-scalar

vkpeak 20230730 - int32-scalar (GIOPS, More Is Better)
a: 2272.62
b: 2269.25
c: 2269.06
SE +/- 0.34, N = 3

vkpeak

fp64-vec4

vkpeak 20230730 - fp64-vec4 (GFLOPS, More Is Better)
a: 841.80
b: 836.55
c: 836.16
SE +/- 0.32, N = 3

vkpeak

fp64-scalar

vkpeak 20230730 - fp64-scalar (GFLOPS, More Is Better)
a: 841.40
b: 839.20
c: 839.01
SE +/- 0.22, N = 3

vkpeak

fp16-vec4

vkpeak 20230730 - fp16-vec4 (GFLOPS, More Is Better)
b: 23390.44
c: 23387.26
a: 23232.42
SE +/- 5.96, N = 3

vkpeak

fp16-scalar

vkpeak 20230730 - fp16-scalar (GFLOPS, More Is Better)
a: 13154.15
b: 13145.19
c: 13136.79
SE +/- 4.01, N = 3

vkpeak

fp32-vec4

vkpeak 20230730 - fp32-vec4 (GFLOPS, More Is Better)
c: 12822.01
b: 12808.59
a: 12730.08
SE +/- 1.81, N = 3

vkpeak

fp32-scalar

vkpeak 20230730 - fp32-scalar (GFLOPS, More Is Better)
a: 13190.09
c: 12860.56
b: 12807.06
SE +/- 4.18, N = 3
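To put the run-to-run differences in perspective, the following sketch computes the relative spread (highest minus lowest, as a percentage of the lowest) for the fp32-scalar values of runs a, c, and b given above. The `relative_spread` helper is a hypothetical name used only for this illustration; it is not part of the Phoronix Test Suite.

```python
def relative_spread(values):
    """Return (max - min) / min as a percentage."""
    values = list(values)
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0


# vkpeak fp32-scalar GFLOPS for each run, copied from the result above.
fp32_scalar = {"a": 13190.09, "c": 12860.56, "b": 12807.06}

print(f"fp32-scalar spread: {relative_spread(fp32_scalar.values()):.2f}%")
```

The same helper can be applied to any of the other result sets to see which tests vary most between runs.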

VkFFT

Test: FFT + iFFT C2C 1D batched in double precision

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, More Is Better)
c: 20847
b: 20822
a: 20816
SE +/- 11.67, N = 3
1. (CXX) g++ options: -O3

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20230517 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
c: 3.17 (MIN: 3.15 / MAX: 3.74)
a: 3.18 (MIN: 3.14 / MAX: 3.82)
SE +/- 0.00, N = 2
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

NCNN 20230517 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
a: 3.17 (MIN: 3.11 / MAX: 3.73)
c: 3.20 (MIN: 3.16 / MAX: 3.68)
SE +/- 0.02, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

VkFFT

Test: FFT + iFFT C2C Bluestein benchmark in double precision

VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, More Is Better)
a: 4717
b: 4695
c: 4670
SE +/- 0.33, N = 3
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better)
c: 47971
b: 47948
a: 47887
SE +/- 9.54, N = 3
1. (CXX) g++ options: -O3

NCNN

Target: CPU - Model: FastestDet

NCNN 20230517 - Target: CPU - Model: FastestDet (ms, Fewer Is Better)
a: 3.62 (MIN: 2.7 / MAX: 4.54)
b: 4.05 (MIN: 4.02 / MAX: 4.35)
c: 4.11 (MIN: 4.08 / MAX: 4.4)
SE +/- 0.45, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vision_transformer

NCNN 20230517 - Target: CPU - Model: vision_transformer (ms, Fewer Is Better)
c: 31.77 (MIN: 31.61 / MAX: 35.68)
b: 31.95 (MIN: 31.79 / MAX: 32.33)
a: 32.49 (MIN: 31.67 / MAX: 40.11)
SE +/- 0.29, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: regnety_400m

NCNN 20230517 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
a: 8.18 (MIN: 8.07 / MAX: 9.68)
b: 8.18 (MIN: 8.12 / MAX: 8.86)
c: 8.27 (MIN: 8.22 / MAX: 9.18)
SE +/- 0.04, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: squeezenet_ssd

NCNN 20230517 - Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
b: 7.06 (MIN: 7.01 / MAX: 7.55)
a: 7.07 (MIN: 7.01 / MAX: 8.07)
c: 7.10 (MIN: 7.05 / MAX: 7.65)
SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20230517 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
b: 12.74 (MIN: 12.66 / MAX: 13.28)
c: 12.81 (MIN: 12.74 / MAX: 13.2)
a: 12.90 (MIN: 12.69 / MAX: 15.88)
SE +/- 0.11, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet50

NCNN 20230517 - Target: CPU - Model: resnet50 (ms, Fewer Is Better)
b: 10.01 (MIN: 9.85 / MAX: 11.06)
c: 10.11 (MIN: 9.95 / MAX: 16.18)
a: 10.20 (MIN: 9.84 / MAX: 12.48)
SE +/- 0.23, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: alexnet

NCNN 20230517 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
c: 4.31 (MIN: 4.26 / MAX: 4.98)
b: 4.32 (MIN: 4.26 / MAX: 5.15)
a: 4.41 (MIN: 4.24 / MAX: 5.16)
SE +/- 0.11, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet18

NCNN 20230517 - Target: CPU - Model: resnet18 (ms, Fewer Is Better)
b: 5.20 (MIN: 5.1 / MAX: 5.9)
c: 5.24 (MIN: 5.15 / MAX: 6.09)
a: 5.29 (MIN: 5.09 / MAX: 6.29)
SE +/- 0.07, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vgg16

NCNN 20230517 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
c: 23.45 (MIN: 23.26 / MAX: 24.51)
b: 23.49 (MIN: 23.36 / MAX: 24.62)
a: 23.75 (MIN: 23.31 / MAX: 25.12)
SE +/- 0.30, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: googlenet

NCNN 20230517 - Target: CPU - Model: googlenet (ms, Fewer Is Better)
b: 7.82 (MIN: 7.73 / MAX: 8.65)
c: 7.93 (MIN: 7.82 / MAX: 8.91)
a: 7.94 (MIN: 7.71 / MAX: 8.73)
SE +/- 0.11, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: blazeface

NCNN 20230517 - Target: CPU - Model: blazeface (ms, Fewer Is Better)
a: 1.38 (MIN: 1.35 / MAX: 2.06)
b: 1.38 (MIN: 1.35 / MAX: 1.67)
c: 1.39 (MIN: 1.36 / MAX: 1.53)
SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20230517 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
b: 3.85 (MIN: 3.81 / MAX: 4.42)
c: 3.88 (MIN: 3.84 / MAX: 4.41)
a: 3.90 (MIN: 3.82 / MAX: 4.51)
SE +/- 0.05, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20230517 - Target: CPU - Model: mnasnet (ms, Fewer Is Better)
a: 2.97 (MIN: 2.92 / MAX: 3.48)
b: 2.97 (MIN: 2.93 / MAX: 3.45)
c: 2.99 (MIN: 2.96 / MAX: 3.44)
SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20230517 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
a: 3.34 (MIN: 3.3 / MAX: 3.85)
b: 3.34 (MIN: 3.31 / MAX: 3.77)
c: 3.35 (MIN: 3.31 / MAX: 3.8)
SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20230517 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
a: 3.16 (MIN: 3.1 / MAX: 3.8)
b: 3.16 (MIN: 3.11 / MAX: 3.61)
c: 3.18 (MIN: 3.13 / MAX: 3.84)
SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: mobilenet

NCNN 20230517 - Target: CPU - Model: mobilenet (ms, Fewer Is Better)
b: 7.97 (MIN: 7.94 / MAX: 8.26)
c: 8.02 (MIN: 7.98 / MAX: 8.33)
a: 8.05 (MIN: 7.97 / MAX: 9.07)
SE +/- 0.03, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: FastestDet

NCNN 20230517 - Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better)
b: 4.07 (MIN: 4.04 / MAX: 4.53)
c: 4.09 (MIN: 4.05 / MAX: 5.5)
a: 4.10 (MIN: 4.06 / MAX: 4.81)
SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vision_transformer

NCNN 20230517 - Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better)
c: 31.79 (MIN: 31.63 / MAX: 35.57)
b: 31.85 (MIN: 31.69 / MAX: 33.06)
a: 31.88 (MIN: 31.55 / MAX: 37.47)
SE +/- 0.09, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20230517 - Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
c: 8.00 (MIN: 7.94 / MAX: 8.88)
a: 8.16 (MIN: 7.9 / MAX: 8.99)
b: 8.21 (MIN: 8.14 / MAX: 8.84)
SE +/- 0.11, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20230517 - Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
c: 7.06 (MIN: 7 / MAX: 8.03)
b: 7.07 (MIN: 7 / MAX: 8.07)
a: 7.09 (MIN: 6.98 / MAX: 7.95)
SE +/- 0.02, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20230517 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
c: 12.81 (MIN: 12.73 / MAX: 13.08)
a: 12.84 (MIN: 12.69 / MAX: 15.33)
b: 12.87 (MIN: 12.76 / MAX: 13.73)
SE +/- 0.02, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20230517 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
b: 10.00 (MIN: 9.92 / MAX: 12.35)
c: 10.00 (MIN: 9.91 / MAX: 11.15)
a: 10.01 (MIN: 9.88 / MAX: 11.4)
SE +/- 0.02, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20230517 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
a: 4.31 (MIN: 4.24 / MAX: 5.2)
b: 4.33 (MIN: 4.28 / MAX: 5.16)
c: 4.33 (MIN: 4.26 / MAX: 10.59)
SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20230517 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
c: 5.21 (MIN: 5.11 / MAX: 6.04)
b: 5.23 (MIN: 5.13 / MAX: 6.18)
a: 5.28 (MIN: 5.17 / MAX: 6.16)
SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20230517 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
a: 23.51 (MIN: 23.29 / MAX: 24.68)
c: 23.54 (MIN: 23.33 / MAX: 24.61)
b: 23.56 (MIN: 23.34 / MAX: 24.72)
SE +/- 0.04, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20230517 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
c: 7.80 (MIN: 7.72 / MAX: 8.74)
b: 7.85 (MIN: 7.76 / MAX: 8.76)
a: 7.90 (MIN: 7.74 / MAX: 9.54)
SE +/- 0.02, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20230517 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
b: 1.37 (MIN: 1.35 / MAX: 1.75)
c: 1.37 (MIN: 1.35 / MAX: 1.82)
a: 1.38 (MIN: 1.34 / MAX: 1.85)
SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20230517 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
b: 3.82 (MIN: 3.78 / MAX: 4.39)
c: 3.83 (MIN: 3.79 / MAX: 4.61)
a: 3.86 (MIN: 3.8 / MAX: 4.6)
SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20230517 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
b: 2.95 (MIN: 2.92 / MAX: 3.42)
c: 2.96 (MIN: 2.93 / MAX: 3.41)
a: 2.98 (MIN: 2.92 / MAX: 4.03)
SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20230517 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
c: 3.32 (MIN: 3.29 / MAX: 4.19)
b: 3.33 (MIN: 3.3 / MAX: 3.59)
a: 3.35 (MIN: 3.29 / MAX: 3.85)
SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

NCNN 20230517 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
c: 3.13 (MIN: 3.08 / MAX: 3.85)
b: 3.14 (MIN: 3.1 / MAX: 3.73)
a: 3.17 (MIN: 3.09 / MAX: 3.78)
SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20230517 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
c: 8.03 (MIN: 7.98 / MAX: 8.84)
b: 8.04 (MIN: 7.95 / MAX: 14.33)
a: 8.05 (MIN: 7.95 / MAX: 8.89)
SE +/- 0.02, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
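The NCNN "CPU" and "Vulkan GPU" timings reported above sit very close to each other. As a rough sketch, the ratio of Vulkan GPU time to CPU time can be computed for a few models from the run-a values. The three models chosen and the `gpu_to_cpu_ratio` helper are arbitrary picks for illustration, and the calculation makes no claim about why the numbers are so similar.

```python
def gpu_to_cpu_ratio(cpu_ms, gpu_ms):
    """Ratio of Vulkan GPU time to CPU time; below 1.0 means the GPU target was faster."""
    return gpu_ms / cpu_ms


# Run-a average times in ms, copied from the NCNN results above:
# model -> (CPU target, Vulkan GPU target)
run_a_ms = {
    "vision_transformer": (32.49, 31.88),
    "vgg16": (23.75, 23.51),
    "resnet50": (10.20, 10.01),
}

ratios = {model: gpu_to_cpu_ratio(cpu, gpu) for model, (cpu, gpu) in run_a_ms.items()}
for model, ratio in ratios.items():
    print(f"{model}: Vulkan GPU / CPU = {ratio:.3f}")
```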

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
b: 50643
c: 50596
a: 50504
SE +/- 8.89, N = 3
1. (CXX) g++ options: -O3

NCNN

Target: CPU-v3-v3-v3 - Model: FastestDet

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: FastestDet (ms, Fewer Is Better)
b: 4.07 (MIN: 4.03 / MAX: 5.83)
c: 4.08 (MIN: 4.05 / MAX: 4.36)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3 - Model: vision_transformer

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: vision_transformer (ms, Fewer Is Better)
b: 31.65 (MIN: 31.53 / MAX: 32.23)
c: 31.78 (MIN: 31.64 / MAX: 34.51)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3 - Model: regnety_400m

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: regnety_400m (ms, Fewer Is Better)
b: 8.05 (MIN: 8 / MAX: 8.58)
c: 8.14 (MIN: 8.08 / MAX: 8.69)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3 - Model: squeezenet_ssd

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: squeezenet_ssd (ms, Fewer Is Better)
b: 7.03 (MIN: 6.97 / MAX: 7.88)
c: 7.04 (MIN: 6.96 / MAX: 7.83)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3 - Model: yolov4-tiny

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: yolov4-tiny (ms, Fewer Is Better)
c: 12.89 (MIN: 12.84 / MAX: 13.19)
b: 12.98 (MIN: 12.73 / MAX: 35.55)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3 - Model: resnet50

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: resnet50 (ms, Fewer Is Better)
b: 10.01 (MIN: 9.89 / MAX: 10.86)
c: 10.33 (MIN: 10.16 / MAX: 13.97)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3 - Model: alexnet

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: alexnet (ms, Fewer Is Better)
c: 4.28 (MIN: 4.24 / MAX: 5.12)
b: 4.29 (MIN: 4.24 / MAX: 5.64)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3 - Model: resnet18

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: resnet18 (ms, Fewer Is Better)
b: 5.21 (MIN: 5.12 / MAX: 6.22)
c: 5.26 (MIN: 5.18 / MAX: 6.27)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3 - Model: vgg16

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: vgg16 (ms, Fewer Is Better)
b: 23.50 (MIN: 23.3 / MAX: 24.41)
c: 23.99 (MIN: 23.72 / MAX: 24.98)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3 - Model: googlenet

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: googlenet (ms, Fewer Is Better)
b: 7.84 (MIN: 7.74 / MAX: 8.7)
c: 7.88 (MIN: 7.79 / MAX: 8.78)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3 - Model: blazeface

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: blazeface (ms, Fewer Is Better)
b: 1.37 (MIN: 1.35 / MAX: 1.52)
c: 1.38 (MIN: 1.36 / MAX: 1.58)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3 - Model: efficientnet-b0

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: efficientnet-b0 (ms, Fewer Is Better)
b: 3.82 (MIN: 3.79 / MAX: 4.34)
c: 3.89 (MIN: 3.83 / MAX: 9.72)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3 - Model: mnasnet

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: mnasnet (ms, Fewer Is Better)
b: 2.97 (MIN: 2.94 / MAX: 3.43)
c: 2.97 (MIN: 2.94 / MAX: 3.43)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3 - Model: shufflenet-v2

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: shufflenet-v2 (ms, Fewer Is Better)
b: 3.33 (MIN: 3.3 / MAX: 3.79)
c: 3.34 (MIN: 3.32 / MAX: 3.79)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3-v3-v3 - Model: mobilenet-v3

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
b: 3.16 (MIN: 3.12 / MAX: 3.69)
c: 3.16 (MIN: 3.12 / MAX: 3.7)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3-v2-v2 - Model: mobilenet-v2

NCNN 20230517 - Target: CPU-v3-v3-v3-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
c: 3.14 (MIN: 3.1 / MAX: 3.67)
b: 3.15 (MIN: 3.1 / MAX: 3.68)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3-v3 - Model: mobilenet

NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: mobilenet (ms, Fewer Is Better)
c: 8.00 (MIN: 7.96 / MAX: 8.63)
b: 8.01 (MIN: 7.95 / MAX: 8.95)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: FastestDet

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: FastestDet (ms, Fewer Is Better)
c: 3.69 (MIN: 3.66 / MAX: 3.92)
b: 4.06 (MIN: 4.03 / MAX: 4.3)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: vision_transformer

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: vision_transformer (ms, Fewer Is Better)
c: 31.66 (MIN: 31.52 / MAX: 32.14)
b: 31.71 (MIN: 31.56 / MAX: 33.03)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: regnety_400m

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: regnety_400m (ms, Fewer Is Better)
c: 7.98 (MIN: 7.93 / MAX: 8.65)
b: 8.27 (MIN: 8.22 / MAX: 9.01)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: squeezenet_ssd

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: squeezenet_ssd (ms, Fewer Is Better)
c: 7.07 (MIN: 7.01 / MAX: 7.75)
b: 7.14 (MIN: 7.06 / MAX: 7.95)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: yolov4-tiny

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: yolov4-tiny (ms, Fewer Is Better)
b: 12.77 (MIN: 12.69 / MAX: 13.71)
c: 12.86 (MIN: 12.76 / MAX: 13.98)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: resnet50

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: resnet50 (ms, Fewer Is Better)
b: 9.87 (MIN: 9.79 / MAX: 10.73)
c: 10.03 (MIN: 9.93 / MAX: 10.96)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: alexnet

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: alexnet (ms, Fewer Is Better)
c: 4.30 (MIN: 4.26 / MAX: 5.16)
b: 4.42 (MIN: 4.32 / MAX: 5.1)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: resnet18

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: resnet18 (ms, Fewer Is Better)
c: 5.23 (MIN: 5.11 / MAX: 6.03)
b: 5.42 (MIN: 5.36 / MAX: 6.27)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: vgg16

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: vgg16 (ms, Fewer Is Better)
b: 23.42 (MIN: 23.27 / MAX: 24.32)
c: 23.54 (MIN: 23.32 / MAX: 24.54)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: googlenet

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: googlenet (ms, Fewer Is Better)
c: 7.83 (MIN: 7.74 / MAX: 8.61)
b: 7.97 (MIN: 7.89 / MAX: 8.7)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: blazeface

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: blazeface (ms, Fewer Is Better)
c: 1.36 (MIN: 1.34 / MAX: 1.44)
b: 1.37 (MIN: 1.35 / MAX: 1.39)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: efficientnet-b0

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: efficientnet-b0 (ms, Fewer Is Better)
c: 3.82 (MIN: 3.78 / MAX: 4.53)
b: 3.85 (MIN: 3.82 / MAX: 4.48)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: mnasnet

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: mnasnet (ms, Fewer Is Better)
b: 2.96 (MIN: 2.93 / MAX: 3.4)
c: 2.96 (MIN: 2.93 / MAX: 3.41)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: shufflenet-v2

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: shufflenet-v2 (ms, Fewer Is Better)
b: 3.33 (MIN: 3.3 / MAX: 3.77)
c: 3.33 (MIN: 3.31 / MAX: 3.81)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3-v3-v3 - Model: mobilenet-v3

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
b: 3.16 (MIN: 3.11 / MAX: 3.75)
c: 3.17 (MIN: 3.11 / MAX: 8.89)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3-v2-v2 - Model: mobilenet-v2

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
b: 3.15 (MIN: 3.11 / MAX: 3.88)
c: 3.15 (MIN: 3.11 / MAX: 3.85)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3-v3 - Model: mobilenet

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: mobilenet (ms, Fewer Is Better)
c: 7.95 (MIN: 7.89 / MAX: 8.79)
b: 8.00 (MIN: 7.95 / MAX: 8.99)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

VkFFT

Test: FFT + iFFT C2C Bluestein in single precision

VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better)
a: 11340
c: 11311
b: 11273
SE +/- 62.67, N = 3
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in half precision

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
b: 91812
c: 91744
a: 91597
SE +/- 83.55, N = 3
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C multidimensional in single precision

VkFFT 1.2.31 - Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
a: 33001
c: 32812
b: 32751
SE +/- 57.83, N = 3
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT R2C / C2R

VkFFT 1.2.31 - Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better)
c: 43021
b: 42163
a: 42105
SE +/- 200.55, N = 3
1. (CXX) g++ options: -O3

VkResample

Upscale: 2x - Precision: Single

VkResample 1.0 - Upscale: 2x - Precision: Single (ms, Fewer Is Better)
a: 11.69
c: 11.69
b: 11.69
SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3


Phoronix Test Suite v10.8.5