vulkan-benchmarks
AMD Ryzen 9 7950X 16-Core testing with an ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS) and NVIDIA GeForce RTX 4090 24GB on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2308069-PTS-VULKANBE16&grr&sor .
vulkan-benchmarks system details

Configurations: a, b, c, d, e, f, g, h, i, 4080, 4080 rep, 4080 xxx, 4080 zzz, 3090, 3090 rep, 3070, RTX 3070 Ti, 4090, 4090 rep, nv 4090

First listed configuration:
Processor: AMD Ryzen 9 7950X 16-Core @ 4.50GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS)
Chipset: AMD Device 14d8
Memory: 32GB
Disk: Western Digital WD_BLACK SN850X 1000GB + 4001GB
Graphics: AMD Radeon RX 6700 XT (2855/1000MHz)
Audio: AMD Navi 21/23
Monitor: ASUS MG28U
Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
OS: Ubuntu 23.04
Kernel: 6.4.6-060406-generic (x86_64)
Desktop: GNOME Shell 44.2
Display Server: X Server 1.21.1.7 + Wayland
OpenGL: 4.6 Mesa 23.3~git2307260600.87109c~oibaf~l (git-87109c3 2023-07-26 lunar-oibaf-ppa) (LLVM 15.0.7 DRM 3.52)
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 3840x2160

Values that differ in later configurations, in export order (the table also has a Display Driver column):
MSI NVIDIA GeForce RTX 4060 8GB
NVIDIA Device 22be
X Server 1.21.1.7
NVIDIA 535.86.05
4.6.0
eVGA NVIDIA GeForce RTX 3060 12GB
NVIDIA GA106 HD Audio
NVIDIA GeForce RTX 3060 Ti 8GB
NVIDIA GA104 HD Audio
2560x1440
NVIDIA GeForce RTX 4080 16GB
NVIDIA Device 22bb
3840x2160
NVIDIA GeForce RTX 3090 24GB
NVIDIA GA102 HD Audio
NVIDIA GeForce RTX 3070 8GB
NVIDIA GA104 HD Audio
2560x1440
NVIDIA GeForce RTX 3070 Ti 8GB
NVIDIA GeForce RTX 4090 24GB
NVIDIA AD102 HD Audio
3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- a: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa601203
- b: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa601203
- c: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa601203
- d: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- e: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- f: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- g: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- h: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- i: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- 4080: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- 4080 rep: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- 4080 xxx: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- 4080 zzz: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- 3090: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- 3090 rep: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- 3070: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- RTX 3070 Ti: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- 4090: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- 4090 rep: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- nv 4090: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203

Graphics Details
- a: BAR1 / Visible vRAM Size: 12272 MB - vBIOS Version: 113-D5121100-101
- b: BAR1 / Visible vRAM Size: 12272 MB - vBIOS Version: 113-D5121100-101
- c: BAR1 / Visible vRAM Size: 12272 MB - vBIOS Version: 113-D5121100-101
- d: BAR1 / Visible vRAM Size: 8192 MiB - vBIOS Version: 95.07.31.00.e3
- e: BAR1 / Visible vRAM Size: 8192 MiB - vBIOS Version: 95.07.31.00.e3
- f: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 94.06.14.40.46
- g: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 94.06.14.40.46
- h: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 94.06.14.40.46
- i: BAR1 / Visible vRAM Size: 8192 MiB - vBIOS Version: 94.04.25.00.2c
- 4080: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.03.0e.00.04
- 4080 rep: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.03.0e.00.04
- 4080 xxx: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.03.0e.00.04
- 4080 zzz: BAR1 / Visible vRAM Size: 16384 MiB - vBIOS Version: 95.03.0e.00.04
- 3090: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 94.02.27.00.02
- 3090 rep: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 94.02.27.00.02
- 3070: BAR1 / Visible vRAM Size: 8192 MiB - vBIOS Version: 94.04.25.00.2b
- RTX 3070 Ti: BAR1 / Visible vRAM Size: 8192 MiB - vBIOS Version: 94.04.5b.00.02
- 4090: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 95.02.20.00.01
- 4090 rep: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 95.02.20.00.01
- nv 4090: BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 95.02.20.00.01

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

vulkan-benchmarks vkpeak: fp16-vec4 vkpeak: int32-scalar vkpeak: int16-vec4 vkpeak: int32-vec4 vkpeak: int16-scalar vkpeak: fp16-scalar vkpeak: fp32-vec4 vkpeak: fp32-scalar vkpeak: fp64-scalar vkpeak: fp64-vec4 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet-v3 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - FastestDet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - vision_transformer ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - regnety_400m ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - squeezenet_ssd ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - yolov4-tiny ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - resnet50 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - alexnet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - resnet18 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - vgg16 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - googlenet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - blazeface ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - efficientnet-b0 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mnasnet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - shufflenet-v2 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet vkfft: FFT + iFFT C2C Bluestein benchmark in double precision vkresample: 2x - Double ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - FastestDet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - vision_transformer ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - regnety_400m ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - squeezenet_ssd ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - yolov4-tiny ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - resnet50 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - alexnet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - resnet18 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - vgg16 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - googlenet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - blazeface ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - efficientnet-b0 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mnasnet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - 
shufflenet-v2 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet vkfft: FFT + iFFT C2C 1D batched in double precision ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet-v3 vkfft: FFT + iFFT C2C Bluestein in single precision ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - yolov4-tiny ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - FastestDet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - vision_transformer ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - regnety_400m ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - squeezenet_ssd ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - resnet50 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - alexnet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - resnet18 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - vgg16 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - googlenet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - blazeface ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - efficientnet-b0 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - mnasnet ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - shufflenet-v2 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU-v3-v3-v3-v3-v3-v3 - mobilenet ncnn: Vulkan GPU - FastestDet ncnn: Vulkan GPU - vision_transformer ncnn: Vulkan GPU - regnety_400m ncnn: Vulkan GPU - squeezenet_ssd ncnn: Vulkan GPU - yolov4-tiny ncnn: Vulkan GPU - resnet50 ncnn: Vulkan GPU - alexnet ncnn: Vulkan GPU - resnet18 ncnn: Vulkan GPU - vgg16 ncnn: Vulkan GPU - googlenet ncnn: Vulkan GPU - blazeface ncnn: Vulkan GPU - efficientnet-b0 ncnn: Vulkan GPU - mnasnet ncnn: Vulkan GPU - shufflenet-v2 ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU - mobilenet ncnn: CPU - FastestDet ncnn: CPU - blazeface ncnn: Vulkan GPU-v3-v3-v3-v3-v3 - mobilenet-v3 ncnn: CPU-v3-v3-v3-v3-v3 - mobilenet-v3 ncnn: CPU - vision_transformer ncnn: CPU - regnety_400m ncnn: CPU - squeezenet_ssd ncnn: CPU - yolov4-tiny ncnn: CPU - resnet50 ncnn: CPU - alexnet ncnn: CPU - resnet18 ncnn: CPU - vgg16 ncnn: CPU - 
googlenet ncnn: CPU - efficientnet-b0 ncnn: CPU - mnasnet ncnn: CPU - shufflenet-v2 ncnn: CPU-v2-v2 - mobilenet-v2 ncnn: CPU - mobilenet ncnn: CPU-v3-v3-v3 - FastestDet ncnn: CPU-v3-v3-v3 - vision_transformer ncnn: CPU-v3-v3-v3 - regnety_400m ncnn: CPU-v3-v3-v3 - squeezenet_ssd ncnn: CPU-v3-v3-v3 - yolov4-tiny ncnn: CPU-v3-v3-v3 - resnet50 ncnn: CPU-v3-v3-v3 - alexnet ncnn: CPU-v3-v3-v3 - resnet18 ncnn: CPU-v3-v3-v3 - vgg16 ncnn: CPU-v3-v3-v3 - googlenet ncnn: CPU-v3-v3-v3 - blazeface ncnn: CPU-v3-v3-v3 - efficientnet-b0 ncnn: CPU-v3-v3-v3 - mnasnet ncnn: CPU-v3-v3-v3 - shufflenet-v2 ncnn: CPU-v3-v3-v3-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3-v3 - mobilenet ncnn: Vulkan GPU-v3-v3-v3 - regnety_400m ncnn: Vulkan GPU-v3-v3-v3 - FastestDet ncnn: Vulkan GPU-v3-v3-v3 - vision_transformer ncnn: Vulkan GPU-v3-v3-v3 - squeezenet_ssd ncnn: Vulkan GPU-v3-v3-v3 - yolov4-tiny ncnn: Vulkan GPU-v3-v3-v3 - resnet50 ncnn: Vulkan GPU-v3-v3-v3 - alexnet ncnn: Vulkan GPU-v3-v3-v3 - resnet18 ncnn: Vulkan GPU-v3-v3-v3 - vgg16 ncnn: Vulkan GPU-v3-v3-v3 - googlenet ncnn: Vulkan GPU-v3-v3-v3 - blazeface ncnn: Vulkan GPU-v3-v3-v3 - efficientnet-b0 ncnn: Vulkan GPU-v3-v3-v3 - mnasnet ncnn: Vulkan GPU-v3-v3-v3 - shufflenet-v2 ncnn: Vulkan GPU-v3-v3-v3-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU-v3-v3-v3 - mobilenet vkfft: FFT + iFFT C2C 1D batched in single precision ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - FastestDet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - vision_transformer ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - regnety_400m ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - squeezenet_ssd ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - yolov4-tiny ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - resnet50 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - alexnet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - resnet18 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - vgg16 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - googlenet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - 
blazeface ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - efficientnet-b0 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mnasnet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - shufflenet-v2 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet-v3 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet ncnn: CPU-v3-v3-v3-v3-v3-v3-v3-v3 - mobilenet-v3 vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling ncnn: CPU-v3-v3-v3-v3-v3-v3 - FastestDet ncnn: CPU-v3-v3-v3-v3-v3-v3 - vision_transformer ncnn: CPU-v3-v3-v3-v3-v3-v3 - regnety_400m ncnn: CPU-v3-v3-v3-v3-v3-v3 - squeezenet_ssd ncnn: CPU-v3-v3-v3-v3-v3-v3 - yolov4-tiny ncnn: CPU-v3-v3-v3-v3-v3-v3 - resnet50 ncnn: CPU-v3-v3-v3-v3-v3-v3 - alexnet ncnn: CPU-v3-v3-v3-v3-v3-v3 - resnet18 ncnn: CPU-v3-v3-v3-v3-v3-v3 - vgg16 ncnn: CPU-v3-v3-v3-v3-v3-v3 - googlenet ncnn: CPU-v3-v3-v3-v3-v3-v3 - blazeface ncnn: CPU-v3-v3-v3-v3-v3-v3 - efficientnet-b0 ncnn: CPU-v3-v3-v3-v3-v3-v3 - mnasnet ncnn: CPU-v3-v3-v3-v3-v3-v3 - shufflenet-v2 ncnn: CPU-v3-v3-v3-v3-v3-v3-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3-v3-v3-v3-v3 - mobilenet vkfft: FFT + iFFT C2C 1D batched in half precision vkfft: FFT + iFFT C2C multidimensional in single precision vkfft: FFT + iFFT R2C / C2R vkresample: 2x - Single a b c d e f g h i 4080 4080 rep 4080 xxx 4080 zzz 3090 3090 rep 3070 RTX 3070 Ti 4090 4090 rep nv 4090 23232.42 2272.62 23123.77 2658.73 13102.75 13154.15 12730.08 13190.09 841.40 841.80 4717 3.17 20816 11340 3.18 4.1 31.88 8.16 7.09 12.84 10.01 4.31 5.28 23.51 7.90 1.38 3.86 2.98 3.35 3.17 8.05 3.62 1.38 32.49 8.18 7.07 12.90 10.20 4.41 5.29 23.75 7.94 3.90 2.97 3.34 3.16 8.05 47887 50504 91597 33001 42105 11.686 23390.44 2269.25 23396.59 2640.08 13070.81 13145.19 12808.59 12807.06 839.2 836.55 4695 20822 11273 4.07 31.85 8.21 7.07 12.87 10 4.33 5.23 23.56 7.85 1.37 3.82 2.95 3.33 3.14 8.04 4.05 1.38 3.16 3.16 31.95 8.18 7.06 12.74 10.01 4.32 5.2 23.49 7.82 
3.85 2.97 3.34 3.16 7.97 4.07 31.65 8.05 7.03 12.98 10.01 4.29 5.21 23.5 7.84 1.37 3.82 2.97 3.33 3.15 8.01 8.27 4.06 31.71 7.14 12.77 9.87 4.42 5.42 23.42 7.97 1.37 3.85 2.96 3.33 3.15 8 47948 50643 91812 32751 42163 11.69 23387.26 2269.06 23385.44 2638.69 13063.86 13136.79 12822.01 12860.56 839.01 836.16 4670 3.2 20847 11311 3.17 4.09 31.79 8 7.06 12.81 10 4.33 5.21 23.54 7.8 1.37 3.83 2.96 3.32 3.13 8.03 4.11 1.39 3.17 3.16 31.77 8.27 7.1 12.81 10.11 4.31 5.24 23.45 7.93 3.88 2.99 3.35 3.18 8.02 4.08 31.78 8.14 7.04 12.89 10.33 4.28 5.26 23.99 7.88 1.38 3.89 2.97 3.34 3.14 8 7.98 3.69 31.66 7.07 12.86 10.03 4.3 5.23 23.54 7.83 1.36 3.82 2.96 3.33 3.15 7.95 47971 50596 91744 32812 43021 11.688 16864.47 8520.02 7352.85 8465.82 5676.02 8412.33 11251.17 8531.96 267.43 267.74 2346 500.014 3.17 12143 10719 3.17 4.11 32.12 8.17 7.08 12.85 10.10 4.31 5.23 23.56 7.85 1.38 3.87 2.97 3.35 3.16 8.02 4.08 1.38 32.43 8.23 7.09 12.95 10.00 4.30 5.23 23.51 7.85 3.85 2.98 3.35 3.17 8.10 42645 43365 85181 36328 35399 32.855 16865.29 8505.20 7336.25 8465.71 5675.99 8397.80 11231.72 8515.58 267.41 267.25 2343 500.016 3.18 12168 10560 4.08 31.93 8.10 7.05 12.87 10.10 4.31 5.22 23.60 7.85 1.38 3.84 2.96 3.33 3.14 8.04 42651 43365 85191 37090 35304 32.850 13440.97 6827.92 5959.75 6800.17 4480.59 6812.52 9006.57 6837.94 214.17 214.23 1814 500.01 3.14 10561 7571 4.24 32.92 8.34 7.08 13.17 10.26 4.64 5.48 24.55 8.15 1.38 3.86 2.97 3.4 3.13 8.27 4.22 1.37 3.15 3.15 33.56 8.08 7.23 13.32 11.05 4.36 5.69 24.19 7.92 3.87 2.97 3.55 3.15 8.45 3.85 33.47 8.34 6.97 13.07 11.05 4.83 6.13 24.45 8.07 1.43 4.04 3.12 3.4 3.16 8.56 8.5 4.2 33.36 7.09 14.34 10.25 4.35 5.3 24.12 7.94 1.37 3.85 2.96 3.33 3.16 8.65 56476 57110 104146 26238 26593 26.738 13438.4 6824.21 5956.24 6795.39 4478.41 6810.55 9003.12 6832.74 213.96 213.95 1818 500.011 10548 3.16 7574 13.14 3.14 3.92 33.32 8.38 7.14 10.33 4.35 5.28 24.04 7.96 1.38 3.84 2.97 3.34 3.17 8.5 4.07 32.42 8.36 7.1 13.64 10.34 4.87 6.22 24.2 8.96 1.41 3.86 
2.98 3.35 3.18 8.17 2.57 1.38 3.15 3.16 32.73 8.3 7.13 17.23 10.72 4.32 5.55 23.78 7.98 3.91 3 3.59 3.16 22.74 3.97 33.39 8.07 7.26 13.08 11.25 4.71 5.48 24.92 9.15 1.37 4.14 3.05 3.38 3.15 8.98 7.99 4.06 32.38 7.19 12.89 10.18 4.35 5.3 23.82 7.94 1.37 3.85 2.95 3.33 3.14 8.04 56455 57094 104171 26541 26638 26.769 13490.24 6800.6 5978.38 6772.98 4495.98 6838.32 9036.17 6810.73 213.37 210.96 10572 7622 56431 104298 26524 2417 500.006 4.87 14780 3.26 10061 13.77 3.26 3.83 38.01 8.46 7.21 13.1 5.1 5.88 27.43 10.47 1.41 4.19 3.07 3.43 3.29 10.02 5.14 36.42 9.88 7.46 15.16 12.96 6.53 5.82 27.83 8.75 1.4 5.88 3.2 3.49 3.29 10.4 2.66 1.4 3.26 3.29 37.8 9.94 8.96 14.65 12.09 5.3 5.6 30.96 10.3 4.05 2.74 5.03 3.52 8.37 5.69 36.55 8.21 8.16 15.11 14.05 5.01 5.86 29.12 10.17 1.28 4.68 3.39 3.52 3.3 10.08 7.99 4.43 38.33 8.33 15.43 11.15 4.99 5.85 29.07 10.19 1.25 4.21 2.99 3.36 3.28 9.05 69738 71163 132270 34686 33727 20.93 5579 288.201 4.19 35.6 8.67 7.71 13.93 11.48 4.98 5.92 25.67 8.79 1.44 4.04 3.1 3.48 3.26 3.29 8.43 34974 17121 13.86 3.24 4.28 34.91 8.61 7.66 11.4 4.61 5.61 25.48 8.4 1.43 4.06 3.06 3.46 3.28 8.84 4.2 35.56 8.39 7.58 13.79 10.81 4.62 5.67 25 8.42 1.41 4.01 3.07 3.43 3.26 8.44 4.42 1.4 3.27 3.28 35.07 8.24 7.73 13.85 11.16 4.75 5.69 25.37 8.42 3.99 3.05 3.41 3.28 8.73 4.2 34.2 8.33 7.64 13.81 11.11 4.66 5.7 25.1 8.45 1.44 4.02 3.08 3.46 3.31 8.43 8.45 4.2 34.13 7.66 13.79 10.95 4.69 5.65 25.04 8.49 1.42 4.05 3.09 3.44 3.29 8.43 104556 106210 211076 65869 66473 13.136 5583 288.166 4.17 33.93 8.35 7.62 13.73 11.07 4.64 5.67 25.56 8.42 1.42 4.01 3.06 3.44 3.31 3.27 8.38 35038 3.24 17287 13.55 3.26 4.14 34.1 8.24 7.55 10.8 4.72 5.61 25.05 8.49 1.41 3.98 3.03 3.39 3.27 8.4 4.21 35.07 8.56 7.64 13.67 10.84 4.65 5.61 25.04 8.4 1.41 4.05 3.08 3.44 3.29 8.57 4.34 1.42 3.27 35.28 8.67 7.86 14.03 11.76 4.67 5.68 26.11 8.52 4.09 3.06 3.43 3.28 8.41 4.18 34.27 8.57 7.59 13.55 10.79 4.65 5.63 24.91 8.52 1.42 4.02 3.09 3.43 3.27 8.48 8.44 4.09 34.29 7.63 13.68 10.84 
4.69 5.69 25.04 8.58 1.45 4.04 3.07 3.44 3.3 8.46 104491 3.28 106205 4.2 34.22 8.72 7.67 13.71 10.86 4.68 5.64 25.01 8.52 1.43 4.07 3.09 3.47 3.3 8.45 211058 70068 68279 13.136 5587 288.039 3.75 33.9 8.25 7.27 13.52 11.22 4.71 5.65 25.33 8.26 1.31 3.97 2.98 3.34 3.05 3.14 8.31 35071 3.26 17343 13.62 3.27 4.17 34.23 8.52 7.67 10.91 4.67 5.67 25.4 8.43 1.42 4.02 3.05 3.43 3.27 8.46 4.2 34.19 8.56 7.62 13.65 10.94 4.68 5.62 25.01 8.42 1.42 4.04 3.07 3.5 3.26 8.37 4.17 1.42 3.33 3.31 34.27 8.45 7.62 13.6 10.82 4.68 5.56 25.03 8.38 4.04 3.06 3.45 3.28 8.37 4.19 34.37 8.75 7.64 13.69 10.91 4.65 5.66 25 8.5 1.42 4.06 3.07 3.47 3.3 8.44 8.58 4.31 35.4 7.7 13.95 11.5 5.21 5.89 26.08 8.99 1.42 4.22 3.13 3.51 3.4 8.88 104528 3.08 106099 3.8 34.14 8.38 7.27 13.63 11.26 4.69 5.78 25.44 8.32 1.32 4.01 3 3.4 3.2 8.34 210713 67887 69068 13.137 5584 288.028 3.28 4.61 35.36 8.37 8.06 15.26 12.5 4.7 5.74 26.09 8.55 1.41 4.05 3.08 3.44 3.26 3.28 9.19 35058 3.24 17185 13.61 3.24 4.2 34.1 8.49 7.62 10.91 4.68 5.6 25.16 8.42 1.42 4.01 3.08 3.43 3.28 8.46 4.79 34.32 8.58 7.63 13.8 11.07 4.68 5.59 25.82 8.41 1.42 4.04 3.06 3.46 3.29 8.47 4.16 1.4 3.2 3.27 34.1 8.37 7.55 13.63 11.1 4.69 5.63 25.4 8.4 3.99 3.04 3.42 3.25 8.38 4.04 34.47 8.47 7.35 13.83 11.21 4.67 5.71 25.45 8.55 1.41 4.03 3.06 3.43 3.28 8.4 8.1 4.12 34.05 7.51 13.62 11.09 4.65 5.59 25.26 8.37 1.39 3.95 3.01 3.37 3.23 8.38 104543 3.06 105926 3.82 34.47 8.34 7.25 13.42 11.1 4.66 5.77 25.26 8.29 1.31 3.95 2.96 3.36 3.16 8.25 210991 70040 67689 13.126 41149.1 20909.02 16886.66 20820.09 13710.88 20845.09 27797.8 21269.72 653.13 653.15 3.16 4.08 32.16 8.2 7.05 12.86 10.03 4.3 5.2 23.5 7.83 1.39 3.86 2.97 3.36 3.15 8.01 4282 371.699 3.19 3.83 31.94 8.33 7.05 12.97 10.07 4.32 5.21 23.51 7.84 1.38 3.88 2.94 3.32 3.13 3.12 8.03 30945 3.16 14406 13.1 3.18 4.03 33.22 7.99 7.04 10.38 4.31 5.19 23.55 7.86 1.36 3.83 2.96 3.33 3.15 8.11 4.11 31.94 8.38 7.16 12.88 10.1 4.35 5.27 23.55 7.87 1.39 3.88 2.99 3.39 3.18 8.07 4.21 1.38 3.15 33.01 
8.25 7.52 14.26 10.3 4.3 5.21 23.43 7.86 3.87 2.99 3.34 3.17 8.6 4.04 31.89 7.95 7.04 12.87 10.05 4.31 5.19 23.5 7.83 1.36 3.83 2.95 3.32 3.14 8.06 8.01 4.04 31.86 7.04 12.88 9.97 4.3 5.23 23.5 7.82 1.36 3.85 2.97 3.34 3.16 8 141357 143969 4.1 32.1 8.22 7.12 12.82 10.03 4.33 5.2 23.58 7.9 1.39 3.88 2.98 3.36 3.16 8.07 255207 51005 55347 10.399 40876.12 20613.41 16878.2 20517.45 13606.79 20640.67 27393.2 20708.84 648.71 4.1 32.13 8.19 7.07 12.89 10.04 4.3 5.2 23.47 7.82 1.38 3.85 2.98 3.36 3.17 8.05 4289 371.422 3.16 4.07 31.97 8.07 7.08 12.9 9.98 4.3 5.2 23.38 7.86 1.37 3.85 2.97 3.34 3.18 3.17 8.05 31122 14449 12.86 3.19 4.08 32.09 8.34 7.09 10.04 4.31 5.22 23.52 7.89 1.39 3.87 2.99 3.37 3.19 8.04 4.08 31.91 8.02 7.06 12.83 10.06 4.31 5.2 23.48 7.82 1.37 3.86 2.97 3.36 3.17 8.03 4.08 1.38 3.15 3.15 31.8 8.24 7.12 12.77 9.95 4.31 5.29 23.43 7.9 3.86 2.97 3.35 3.17 8.01 4.11 31.93 8.09 7.09 12.82 10.07 4.3 5.2 23.54 7.85 1.37 3.85 2.96 3.33 3.17 8.03 8.25 4.1 32.11 7.08 12.84 10.01 4.3 5.24 23.43 7.85 1.38 3.85 2.97 3.36 3.17 8.01 141437 4.07 31.94 8.06 7.07 12.92 10.27 4.31 5.27 23.72 7.86 1.38 3.84 2.96 3.32 3.19 3.15 8.03 3.17 143956 4.07 31.85 8.03 7.09 12.81 10.06 4.31 5.3 23.4 7.91 1.38 3.85 2.97 3.33 3.16 8.06 265171 54814 54432 10.428 5.97 4.48 69.48 18.24 15.46 26.33 23.54 10.69 13.34 51.28 18.66 3.57 9.53 8.15 6.3 5.49 18.54 24.745 6.6 6.71 70.29 17.61 18.83 28.73 23.44 11.89 11.3 49.7 16.97 2.99 9.81 6.06 5.59 5.99 5.46 17.09 6.56 29.8 7.52 6.93 71.08 17.88 15.4 23.59 10.08 12.14 55.48 18.6 1.77 8.41 4.59 8 8.35 18.39 8.41 70.76 16.22 15.82 28.59 23.48 10.88 12.68 48.29 19.49 3.98 8.99 5.09 7.07 7.81 21.11 8.65 2.98 8.06 5.38 75.34 18 13.2 27.66 21.5 9.62 14.03 56.64 18.25 9.23 6.88 6.82 9.67 17.81 9.18 81.77 19.66 17.75 29.34 24.07 11 11.14 49.75 17 3.03 9.01 6.87 8.13 9.19 17.82 17.23 8.63 73.51 16.15 29.49 23.11 9.86 13.38 55.42 20.72 3.18 7.81 6.02 4.89 7.24 16.34 7.23 70.53 17.02 15.32 29.38 22.15 11.43 12.13 53.48 18.8 2.69 6.63 8.55 5.89 7.34 5.92 
17.06 6.43 7.12 65.41 18.25 14.27 28.41 22.19 10.59 12.64 50.32 19.2 2.53 9.19 6.07 7.81 7.22 16.52 22.064 3.52 4.25 38.32 9.10 8.31 15.56 12.52 5.41 6.69 28.53 9.58 1.51 4.55 3.25 3.92 3.69 9.98 24.805 3.64 4.32 38.27 9.05 8.65 15.00 12.42 5.34 6.40 28.53 9.90 1.34 4.37 3.10 3.89 3.44 3.66 9.52 3.62 15.21 3.61 4.41 37.88 9.19 8.28 12.73 5.53 6.57 28.63 9.84 1.49 4.60 3.34 3.75 3.66 9.62 4.26 38.03 9.07 8.47 15.54 12.60 5.25 6.28 29.06 9.87 1.60 4.53 3.40 3.98 3.56 9.35 3.94 1.60 3.65 3.76 37.91 8.83 8.13 15.20 12.73 5.49 6.08 28.36 9.69 4.72 3.26 3.77 3.76 9.43 4.33 37.86 9.02 8.39 15.42 12.11 5.55 6.18 28.40 9.65 1.79 4.78 3.11 4.09 3.41 9.62 8.89 4.26 38.29 8.29 15.44 12.35 5.67 6.23 28.40 9.86 1.71 4.73 3.37 3.95 3.66 9.62 4.14 38.50 9.14 7.45 14.64 13.15 6.25 5.94 27.86 9.97 1.40 4.17 3.12 3.48 3.24 3.83 10.02 3.70 4.18 38.04 8.42 7.57 14.57 12.81 6.17 6.22 27.98 9.68 2.48 4.74 3.24 4.02 3.91 10.03 27.183 3.36 2.82 38.82 10.05 7.4 15.69 12.98 4.94 7.78 27.31 8.91 1.45 4.14 3.19 3.47 5.25 8.46 8039 172.883 3.28 4.45 38.62 9.87 9.81 15.95 12.4 4.67 7.52 27.44 9.05 1.42 4.15 4.93 3.52 3.62 3.48 9.16 55214 3.34 20373 15.55 3.53 4.13 39.06 10.1 9.28 11.53 4.72 5.78 29.05 11.3 1.43 4.24 3.1 5.17 3.41 8.96 4.39 38.76 8.64 7.83 13.97 14.13 4.64 5.69 28.82 10.62 1.39 4.34 3.18 3.45 3.3 10.55 5.48 1.27 3.12 3.25 38.25 8.13 7.86 13.68 14.1 5.14 6 27.75 10.27 4.23 3.12 3.55 3.3 10.08 2.93 38.79 8.13 9.32 15.3 11.39 4.94 5.81 28.55 9.97 1.17 4.09 3.19 5.18 3.46 8.96 9.6 3.94 39.01 7.93 15.44 14.58 6.11 5.97 28.21 8.87 1.33 4.18 3 3.34 4.99 8.81 153896 4.62 39.35 10.11 7.43 16.05 13 5.14 6.96 27.32 8.55 1.35 4.63 3.23 3.56 3.36 4.75 10.56 3.33 152656 2.85 38.76 10.09 9.51 15.85 11.72 4.99 7.74 30.16 8.38 1.3 4.47 5.19 3.48 3.36 9.04 290342 81406 84351 9.284 3.44 3.91 38.69 8.45 9.34 16.6 11.24 5.16 5.84 29.35 10.18 1.38 4.44 3.28 5.18 3.6 8.83 8119 173.043 3.3 4.11 39.03 10.69 9.46 15.4 11.51 5.25 5.9 29.17 10.39 1.41 4.41 4.99 3.48 3.34 3.36 9.02 55383 3.33 20404 15.45 
3.31 3.96 38.17 8.64 9.16 12.17 5.33 6.05 29.12 10.38 1.46 4.04 3.13 5.27 3.34 8.37 4.59 37.59 8.48 8.22 15.72 13.08 6.79 6.01 27.04 10.65 1.4 4.09 3.15 3.49 3.31 10.23 5.27 1.45 4.9 3.53 37.05 8.78 7.98 16.79 13.73 5.23 6.77 29.24 10.86 4.3 3.18 3.47 3.3 10.75 4.16 39.12 17.15 9.3 15.34 13.82 5.27 7.75 27.25 9.29 1.34 4.34 3.23 3.59 3.45 8.74 10.34 4.16 38.73 7.81 16.39 13.57 6.58 5.81 27.59 8.9 1.41 6.28 3.1 3.4 3.31 9.54 153939 4.59 38.65 8.7 9.44 15.41 10.96 5.34 5.87 29.85 10.47 1.42 4.1 3.12 5.23 3.35 3.38 8.22 3.3 155936 3.12 38.79 10.23 9.38 13.88 12.47 5.45 8.14 30.74 8.97 1.42 4.35 5.11 3.42 3.44 10.61 287651 80999 81329 8.962 4.81 4.01 38.58 8.37 7.72 15.67 13.46 6.54 5.97 29.4 8.85 2.91 5.88 3.1 3.37 3.27 10.15 8132 172.887 3.36 2.64 38.46 10.03 7.02 15.55 13.13 4.69 7.38 28.14 8.35 1.26 4.1 3.07 3.5 3.17 3.42 8.93 54950 3.26 20601 17.3 3.47 2.81 39.18 10.09 9.21 13.63 5.14 8.16 27.89 8.61 1.18 4.12 3.16 3.51 5.1 9.41 3.93 39.04 9.81 9.11 15.26 12.45 5.2 7.44 29.29 8.7 1.4 4.1 4.7 3.51 3.43 8.45 4.51 1.33 2.61 3.35 38.9 10.17 9.11 15.4 11.41 5.18 7.82 29.54 8.93 4.37 4.77 3.45 3.6 8.15 4.06 38.99 9.55 9.37 15.62 13.68 4.67 7.61 27.04 9.02 1.16 4.04 4.61 3.46 3.39 8.91 7.73 5.92 38.58 7.72 16.61 13.13 6.11 5.84 27.77 10.01 1.07 5.26 2.54 3.17 4.45 10.54 152170 5.86 37.13 8.34 8.26 16.3 13.25 6.32 5.58 27.25 10.14 1.4 5.82 3.1 3.43 4.96 3.29 10.64 4.97 155148 3.93 38.95 8.25 7.48 17.67 13.29 6.62 6.07 27.61 10.75 1.42 5.94 3.12 3.32 3.29 12.12 292768 82875 84887 8.967 OpenBenchmarking.org
vkpeak 20230730 - fp16-vec4 (GFLOPS, More Is Better)
3090 rep: 41188.02
3090: 41149.10
b: 23390.44
c: 23387.26
a: 23232.42
e: 16865.29
d: 16864.47
h: 13490.24
f: 13440.97
g: 13438.47
Reported standard errors (as exported): SE +/- 5.96, N = 3; SE +/- 0.36, N = 3; SE +/- 0.37, N = 3

vkpeak 20230730 - int32-scalar (GIOPS, More Is Better)
3090: 20909.02
3090 rep: 20767.64
d: 8520.02
e: 8505.20
f: 6827.92
g: 6824.29
h: 6800.60
a: 2272.62
b: 2269.25
c: 2269.06
Reported standard errors (as exported): SE +/- 15.02, N = 3; SE +/- 0.03, N = 3; SE +/- 0.34, N = 3

vkpeak 20230730 - int16-vec4 (GIOPS, More Is Better)
b: 23396.59
c: 23385.44
a: 23123.77
3090: 16886.66
3090 rep: 16881.47
d: 7352.85
e: 7336.25
h: 5978.38
f: 5959.75
g: 5956.38
Reported standard errors (as exported): SE +/- 21.55, N = 3; SE +/- 17.33, N = 3; SE +/- 0.31, N = 3

vkpeak 20230730 - int32-vec4 (GIOPS, More Is Better)
3090: 20820.09
3090 rep: 20517.68
d: 8465.82
e: 8465.71
f: 6800.17
g: 6795.39
h: 6772.98
a: 2658.73
b: 2640.08
c: 2638.69
Reported standard errors (as exported): SE +/- 0.19, N = 3; SE +/- 0.05, N = 3; SE +/- 0.26, N = 3

vkpeak 20230730 - int16-scalar (GIOPS, More Is Better)
3090: 13710.88
3090 rep: 13608.57
a: 13102.75
b: 13070.81
c: 13063.86
d: 5676.02
e: 5675.99
h: 4495.98
f: 4480.59
g: 4479.22
Reported standard errors (as exported): SE +/- 1.30, N = 3; SE +/- 0.09, N = 3; SE +/- 0.02, N = 3

vkpeak 20230730 - fp16-scalar (GFLOPS, More Is Better)
3090 rep: 20953.30
3090: 20845.09
a: 13154.15
b: 13145.19
c: 13136.79
d: 8412.33
e: 8397.80
h: 6838.32
f: 6812.52
g: 6811.35
Reported standard errors (as exported): SE +/- 4.01, N = 3; SE +/- 13.46, N = 3; SE +/- 5.09, N = 3

vkpeak 20230730 - fp32-vec4 (GFLOPS, More Is Better)
3090 rep: 27807.58
3090: 27797.80
c: 12822.01
b: 12808.59
a: 12730.08
d: 11251.17
e: 11231.72
h: 9036.17
f: 9006.57
g: 9003.12
Reported standard errors (as exported): SE +/- 1.81, N = 3; SE +/- 19.37, N = 3; SE +/- 2.57, N = 3

vkpeak 20230730 - fp32-scalar (GFLOPS, More Is Better)
3090: 21269.72
3090 rep: 20925.30
a: 13190.09
c: 12860.56
b: 12807.06
d: 8531.96
e: 8515.58
f: 6837.94
g: 6832.74
h: 6810.73
Reported standard errors (as exported): SE +/- 4.18, N = 3; SE +/- 16.18, N = 3; SE +/- 0.30, N = 3

vkpeak 20230730 - fp64-scalar (GFLOPS, More Is Better)
a: 841.40
b: 839.20
c: 839.01
3090 rep: 653.63
3090: 653.13
d: 267.43
e: 267.41
f: 214.17
g: 213.96
h: 213.37
Reported standard errors (as exported): SE +/- 0.22, N = 3; SE +/- 0.01, N = 3; SE +/- 0.00, N = 3

vkpeak 20230730 - fp64-vec4 (GFLOPS, More Is Better)
a: 841.80
b: 836.55
c: 836.16
3090: 653.15
d: 267.74
e: 267.25
f: 214.23
g: 213.95
h: 210.96
Reported standard errors (as exported): SE +/- 0.32, N = 3; SE +/- 0.48, N = 3; SE +/- 0.00, N = 3

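The vkpeak figures above lend themselves to simple ratio arithmetic. As a minimal sketch, the Python below compares fp64-scalar throughput with fp32-scalar throughput for each configuration. It assumes that the labels and values in each vkpeak result pair up in the order listed; the dictionary and function names are illustrative and are not part of the Phoronix Test Suite.

```python
# Values copied from the vkpeak fp32-scalar and fp64-scalar results (GFLOPS).
# Assumption: labels and values pair up in the order they appear in the export.
fp32_scalar = {
    "3090": 21269.72, "3090 rep": 20925.30,
    "a": 13190.09, "c": 12860.56, "b": 12807.06,
    "d": 8531.96, "e": 8515.58,
    "f": 6837.94, "g": 6832.74, "h": 6810.73,
}
fp64_scalar = {
    "a": 841.40, "b": 839.20, "c": 839.01,
    "3090 rep": 653.63, "3090": 653.13,
    "d": 267.43, "e": 267.41,
    "f": 214.17, "g": 213.96, "h": 213.37,
}


def fp64_to_fp32_ratio(fp64, fp32):
    """Return fp64/fp32 throughput per configuration present in both results."""
    return {name: fp64[name] / fp32[name] for name in fp64 if name in fp32}


ratios = fp64_to_fp32_ratio(fp64_scalar, fp32_scalar)

# Print configurations from the highest to the lowest fp64:fp32 ratio.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:>8}: {ratio:.4f} (about 1/{1 / ratio:.0f} of fp32-scalar)")
```

The same pattern applies to any other pair of vkpeak results, for example fp16-vec4 against fp32-vec4.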
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
3090: 3.16 (MIN: 3.12 / MAX: 3.67)
4090: 3.36 (MIN: 3.21 / MAX: 4.83)
4090 rep: 3.44 (MIN: 3.3 / MAX: 4.34)
RTX 3070 Ti: 3.52 (MIN: 2.95 / MAX: 536.1)
nv 4090: 4.81 (MIN: 3.13 / MAX: 149.75)
3070: 5.97 (MIN: 2.84 / MAX: 111.8)
Reported standard error (as exported): SE +/- 0.17, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: FastestDet (ms, Fewer Is Better)
4090: 2.82 (MIN: 2.69 / MAX: 3.5)
4090 rep: 3.91 (MIN: 3.77 / MAX: 5.87)
nv 4090: 4.01 (MIN: 3.87 / MAX: 5.47)
3090: 4.08 (MIN: 4.04 / MAX: 4.2)
3090 rep: 4.10 (MIN: 4.06 / MAX: 4.2)
RTX 3070 Ti: 4.25 (MIN: 2.46 / MAX: 526.3)
3070: 4.48 (MIN: 2.2 / MAX: 27.6)
Reported standard error (as exported): SE +/- 0.29, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: vision_transformer (ms, Fewer Is Better)
3090 rep: 32.13 (MIN: 31.95 / MAX: 32.87)
3090: 32.16 (MIN: 31.94 / MAX: 33.7)
RTX 3070 Ti: 38.32 (MIN: 32.26 / MAX: 477.15)
nv 4090: 38.58 (MIN: 33.06 / MAX: 464.16)
4090 rep: 38.69 (MIN: 33.32 / MAX: 390.07)
4090: 38.82 (MIN: 33.83 / MAX: 435.6)
3070: 69.48 (MIN: 39.08 / MAX: 374.31)
Reported standard error (as exported): SE +/- 0.12, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: regnety_400m (ms, Fewer Is Better)
3090 rep: 8.19 (MIN: 8.12 / MAX: 8.98)
3090: 8.20 (MIN: 8.14 / MAX: 8.74)
nv 4090: 8.37 (MIN: 8.08 / MAX: 10.1)
4090 rep: 8.45 (MIN: 8.05 / MAX: 12.64)
RTX 3070 Ti: 9.10 (MIN: 7.61 / MAX: 454.62)
4090: 10.05 (MIN: 8.13 / MAX: 173.18)
3070: 18.24 (MIN: 7.5 / MAX: 201.09)
Reported standard error (as exported): SE +/- 0.20, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: squeezenet_ssd (ms, Fewer Is Better)
3090: 7.05 (MIN: 6.98 / MAX: 7.81)
3090 rep: 7.07 (MIN: 6.99 / MAX: 7.81)
4090: 7.40 (MIN: 6.81 / MAX: 8.46)
nv 4090: 7.72 (MIN: 7.13 / MAX: 8.97)
RTX 3070 Ti: 8.31 (MIN: 6.35 / MAX: 364.95)
4090 rep: 9.34 (MIN: 6.88 / MAX: 268.7)
3070: 15.46 (MIN: 7.08 / MAX: 147.31)
Reported standard error (as exported): SE +/- 0.22, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: yolov4-tiny (ms, Fewer Is Better)
3090: 12.86 (MIN: 12.74 / MAX: 13.68)
3090 rep: 12.89 (MIN: 12.79 / MAX: 13.77)
RTX 3070 Ti: 15.56 (MIN: 12.24 / MAX: 459.8)
nv 4090: 15.67 (MIN: 12.91 / MAX: 334.44)
4090: 15.69 (MIN: 13.13 / MAX: 187.93)
4090 rep: 16.60 (MIN: 12.98 / MAX: 103.04)
3070: 26.33 (MIN: 12.62 / MAX: 127.32)
Reported standard error (as exported): SE +/- 0.25, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: resnet50 (ms, Fewer Is Better)
3090: 10.03 (MIN: 9.88 / MAX: 10.86)
3090 rep: 10.04 (MIN: 9.94 / MAX: 10.91)
4090 rep: 11.24 (MIN: 10.22 / MAX: 29.96)
RTX 3070 Ti: 12.52 (MIN: 9.95 / MAX: 459.05)
4090: 12.98 (MIN: 10.26 / MAX: 145.62)
nv 4090: 13.46 (MIN: 10.6 / MAX: 340.67)
3070: 23.54 (MIN: 10.3 / MAX: 149.49)
Reported standard error (as exported): SE +/- 0.24, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: alexnet
OpenBenchmarking.org: ms, Fewer Is Better
3090: 4.30 (MIN: 4.25 / MAX: 4.63)
3090 rep: 4.30 (MIN: 4.24 / MAX: 4.85)
4090: 4.94 (MIN: 4.52 / MAX: 6.23)
4090 rep: 5.16 (MIN: 4.73 / MAX: 6.38)
RTX 3070 Ti: 5.41 (MIN: 4.23 / MAX: 364.66)
nv 4090: 6.54 (MIN: 4.56 / MAX: 110.58)
3070: 10.69 (MIN: 4.32 / MAX: 148.92)
SE +/- 0.21, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: resnet18
OpenBenchmarking.org: ms, Fewer Is Better
3090: 5.20 (MIN: 5.1 / MAX: 6.05)
3090 rep: 5.20 (MIN: 5.08 / MAX: 6.05)
4090 rep: 5.84 (MIN: 5.35 / MAX: 8.28)
nv 4090: 5.97 (MIN: 5.4 / MAX: 8.25)
RTX 3070 Ti: 6.69 (MIN: 5.06 / MAX: 462.37)
4090: 7.78 (MIN: 5.4 / MAX: 168.29)
3070: 13.34 (MIN: 5.43 / MAX: 279.86)
SE +/- 0.24, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: vgg16
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 23.47 (MIN: 23.25 / MAX: 24.24)
3090: 23.50 (MIN: 23.26 / MAX: 24.34)
4090: 27.31 (MIN: 24.27 / MAX: 230.86)
RTX 3070 Ti: 28.53 (MIN: 24.21 / MAX: 515.3)
4090 rep: 29.35 (MIN: 24.55 / MAX: 485.35)
nv 4090: 29.40 (MIN: 26.17 / MAX: 411.51)
3070: 51.28 (MIN: 24.83 / MAX: 242.12)
SE +/- 0.26, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: googlenet
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 7.82 (MIN: 7.72 / MAX: 8.6)
3090: 7.83 (MIN: 7.73 / MAX: 8.6)
nv 4090: 8.85 (MIN: 8.16 / MAX: 10.25)
4090: 8.91 (MIN: 8.3 / MAX: 10.96)
RTX 3070 Ti: 9.58 (MIN: 7.62 / MAX: 396.9)
4090 rep: 10.18 (MIN: 7.81 / MAX: 204.67)
3070: 18.66 (MIN: 7.42 / MAX: 326.73)
SE +/- 0.22, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: blazeface
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 1.38 (MIN: 1.35 / MAX: 1.88)
4090 rep: 1.38 (MIN: 1.33 / MAX: 1.98)
3090: 1.39 (MIN: 1.36 / MAX: 3.12)
4090: 1.45 (MIN: 1.38 / MAX: 2.98)
RTX 3070 Ti: 1.51 (MIN: 1.11 / MAX: 380.46)
nv 4090: 2.91 (MIN: 1.29 / MAX: 113.97)
3070: 3.57 (MIN: 1.08 / MAX: 141.04)
SE +/- 0.14, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: efficientnet-b0
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 3.85 (MIN: 3.81 / MAX: 4.6)
3090: 3.86 (MIN: 3.82 / MAX: 4.82)
4090: 4.14 (MIN: 3.93 / MAX: 5.94)
4090 rep: 4.44 (MIN: 4.24 / MAX: 5.18)
RTX 3070 Ti: 4.55 (MIN: 3.84 / MAX: 379.07)
nv 4090: 5.88 (MIN: 3.96 / MAX: 194.08)
3070: 9.53 (MIN: 3.77 / MAX: 182.53)
SE +/- 0.16, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mnasnet
OpenBenchmarking.org: ms, Fewer Is Better
3090: 2.97 (MIN: 2.93 / MAX: 3.45)
3090 rep: 2.98 (MIN: 2.94 / MAX: 3.36)
nv 4090: 3.10 (MIN: 2.97 / MAX: 3.92)
4090: 3.19 (MIN: 3.04 / MAX: 3.98)
RTX 3070 Ti: 3.25 (MIN: 2.68 / MAX: 277.21)
4090 rep: 3.28 (MIN: 3.15 / MAX: 4.32)
3070: 8.15 (MIN: 2.67 / MAX: 317.68)
SE +/- 0.11, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: shufflenet-v2
OpenBenchmarking.org: ms, Fewer Is Better
3090: 3.36 (MIN: 3.32 / MAX: 3.82)
3090 rep: 3.36 (MIN: 3.33 / MAX: 3.83)
nv 4090: 3.37 (MIN: 3.25 / MAX: 5.26)
4090: 3.47 (MIN: 3.33 / MAX: 5.01)
RTX 3070 Ti: 3.92 (MIN: 3.12 / MAX: 496.78)
4090 rep: 5.18 (MIN: 3.45 / MAX: 200.36)
3070: 6.30 (MIN: 3.28 / MAX: 147.57)
SE +/- 0.20, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v2-v2 - Model: mobilenet-v2
OpenBenchmarking.org: ms, Fewer Is Better
3090: 3.15 (MIN: 3.11 / MAX: 3.78)
3090 rep: 3.17 (MIN: 3.12 / MAX: 3.78)
nv 4090: 3.27 (MIN: 3.11 / MAX: 4.1)
4090 rep: 3.60 (MIN: 3.44 / MAX: 4.27)
RTX 3070 Ti: 3.69 (MIN: 3.07 / MAX: 544.13)
4090: 5.25 (MIN: 3.11 / MAX: 367.53)
3070: 5.49 (MIN: 2.97 / MAX: 152.08)
SE +/- 0.20, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet
OpenBenchmarking.org: ms, Fewer Is Better
3090: 8.01 (MIN: 7.96 / MAX: 8.47)
3090 rep: 8.05 (MIN: 7.98 / MAX: 8.94)
4090: 8.46 (MIN: 8.12 / MAX: 10.14)
4090 rep: 8.83 (MIN: 8.29 / MAX: 10.15)
RTX 3070 Ti: 9.98 (MIN: 7.79 / MAX: 434.9)
nv 4090: 10.15 (MIN: 8.08 / MAX: 193.04)
3070: 18.54 (MIN: 8.01 / MAX: 164.45)
SE +/- 0.24, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein benchmark in double precision
OpenBenchmarking.org: Benchmark Score, More Is Better
nv 4090: 8132
4090 rep: 8119
4090: 8039
4080 xxx: 5587
4080 zzz: 5584
4080 rep: 5583
4080: 5579
a: 4717
b: 4695
c: 4670
3090 rep: 4289
3090: 4282
i: 2417
d: 2346
e: 2343
g: 1818
f: 1814
SE +/- 0.33, N = 3; SE +/- 4.37, N = 3; SE +/- 11.20, N = 3
1. (CXX) g++ options: -O3

VkResample 1.0 - Upscale: 2x - Precision: Double
OpenBenchmarking.org: ms, Fewer Is Better
3070: 24.75
RTX 3070 Ti: 24.81
4090: 172.88
nv 4090: 172.89
4090 rep: 173.04
4080 zzz: 288.03
4080 xxx: 288.04
4080 rep: 288.17
4080: 288.20
3090 rep: 371.42
3090: 371.70
i: 500.01
f: 500.01
g: 500.01
d: 500.01
e: 500.02
SE +/- 0.01, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
1. (CXX) g++ options: -O3

NCNN 20230517 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
OpenBenchmarking.org: ms, Fewer Is Better
f: 3.14 (MIN: 3.09 / MAX: 3.54)
3090 rep: 3.16 (MIN: 3.11 / MAX: 3.62)
a: 3.17 (MIN: 3.11 / MAX: 3.73)
d: 3.17 (MIN: 3.1 / MAX: 3.83)
e: 3.18 (MIN: 3.11 / MAX: 3.78)
3090: 3.19 (MIN: 3.14 / MAX: 3.48)
c: 3.20 (MIN: 3.16 / MAX: 3.68)
4080 zzz: 3.28 (MIN: 3.13 / MAX: 4.65)
4090: 3.28 (MIN: 3.15 / MAX: 3.9)
4090 rep: 3.30 (MIN: 3.15 / MAX: 3.92)
nv 4090: 3.36 (MIN: 3.21 / MAX: 4.3)
RTX 3070 Ti: 3.64 (MIN: 2.87 / MAX: 429.02)
i: 4.87 (MIN: 3.14 / MAX: 278.98)
3070: 6.60 (MIN: 2.98 / MAX: 166.19)
SE +/- 0.02, N = 3; SE +/- 0.00, N = 2; SE +/- 0.02, N = 3; SE +/- 0.20, N = 14
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: FastestDet
OpenBenchmarking.org: ms, Fewer Is Better
nv 4090: 2.64 (MIN: 2.52 / MAX: 4.14)
4080 xxx: 3.75 (MIN: 3.63 / MAX: 5.24)
3090: 3.83 (MIN: 3.79 / MAX: 4.09)
3090 rep: 4.07 (MIN: 4.03 / MAX: 4.18)
4090 rep: 4.11 (MIN: 3.98 / MAX: 4.73)
4080 rep: 4.17 (MIN: 4.02 / MAX: 4.75)
4080: 4.19 (MIN: 4.06 / MAX: 7.41)
RTX 3070 Ti: 4.32 (MIN: 2.51 / MAX: 398.91)
4090: 4.45 (MIN: 4.29 / MAX: 5.05)
4080 zzz: 4.61 (MIN: 4.45 / MAX: 5.92)
3070: 6.71 (MIN: 2.73 / MAX: 109.52)
SE +/- 0.27, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: vision_transformer
OpenBenchmarking.org: ms, Fewer Is Better
3090: 31.94 (MIN: 31.72 / MAX: 34.34)
3090 rep: 31.97 (MIN: 31.71 / MAX: 33.78)
4080 xxx: 33.90 (MIN: 32.72 / MAX: 37.77)
4080 rep: 33.93 (MIN: 32.77 / MAX: 36.2)
4080 zzz: 35.36 (MIN: 33.87 / MAX: 42.41)
4080: 35.60 (MIN: 34.13 / MAX: 38.49)
RTX 3070 Ti: 38.27 (MIN: 32.29 / MAX: 507.7)
nv 4090: 38.46 (MIN: 32.39 / MAX: 435.46)
4090: 38.62 (MIN: 33.33 / MAX: 465)
4090 rep: 39.03 (MIN: 33.61 / MAX: 343.67)
3070: 70.29 (MIN: 39.39 / MAX: 250.19)
SE +/- 0.11, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: regnety_400m
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 8.07 (MIN: 7.99 / MAX: 8.88)
4080 xxx: 8.25 (MIN: 7.93 / MAX: 9.88)
3090: 8.33 (MIN: 8.25 / MAX: 9.32)
4080 rep: 8.35 (MIN: 8.05 / MAX: 9.76)
4080 zzz: 8.37 (MIN: 8.04 / MAX: 10.13)
4080: 8.67 (MIN: 8.3 / MAX: 14.66)
RTX 3070 Ti: 9.05 (MIN: 7.52 / MAX: 417.33)
4090: 9.87 (MIN: 7.81 / MAX: 243.06)
nv 4090: 10.03 (MIN: 7.81 / MAX: 171.2)
4090 rep: 10.69 (MIN: 8.17 / MAX: 339.6)
3070: 17.61 (MIN: 7.85 / MAX: 165.34)
SE +/- 0.24, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: squeezenet_ssd
OpenBenchmarking.org: ms, Fewer Is Better
nv 4090: 7.02 (MIN: 6.38 / MAX: 9.36)
3090: 7.05 (MIN: 6.97 / MAX: 7.95)
3090 rep: 7.08 (MIN: 7 / MAX: 7.94)
4080 xxx: 7.27 (MIN: 6.74 / MAX: 8.84)
4080 rep: 7.62 (MIN: 7.01 / MAX: 14.37)
4080: 7.71 (MIN: 7.15 / MAX: 9.1)
4080 zzz: 8.06 (MIN: 7.42 / MAX: 9.25)
RTX 3070 Ti: 8.65 (MIN: 6.64 / MAX: 544.17)
4090 rep: 9.46 (MIN: 7.03 / MAX: 160.39)
4090: 9.81 (MIN: 7.16 / MAX: 389.1)
3070: 18.83 (MIN: 6.71 / MAX: 206.11)
SE +/- 0.24, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: yolov4-tiny
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 12.90 (MIN: 12.77 / MAX: 13.92)
3090: 12.97 (MIN: 12.83 / MAX: 13.8)
4080 xxx: 13.52 (MIN: 12.72 / MAX: 21.19)
4080 rep: 13.73 (MIN: 12.78 / MAX: 20.99)
4080: 13.93 (MIN: 13.08 / MAX: 15.68)
RTX 3070 Ti: 15.00 (MIN: 12.75 / MAX: 401.37)
4080 zzz: 15.26 (MIN: 14.19 / MAX: 17.06)
4090 rep: 15.40 (MIN: 13 / MAX: 245.79)
nv 4090: 15.55 (MIN: 12.87 / MAX: 342.3)
4090: 15.95 (MIN: 13.38 / MAX: 245.18)
3070: 28.73 (MIN: 12.83 / MAX: 264.49)
SE +/- 0.19, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: resnet50
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 9.98 (MIN: 9.85 / MAX: 11.35)
3090: 10.07 (MIN: 9.95 / MAX: 10.88)
4080 rep: 11.07 (MIN: 10.16 / MAX: 13.16)
4080 xxx: 11.22 (MIN: 10.33 / MAX: 12.81)
4080: 11.48 (MIN: 10.56 / MAX: 12.93)
4090 rep: 11.51 (MIN: 10.56 / MAX: 13.22)
4090: 12.40 (MIN: 11.44 / MAX: 14.43)
RTX 3070 Ti: 12.42 (MIN: 10.23 / MAX: 444.76)
4080 zzz: 12.50 (MIN: 11.47 / MAX: 14.56)
nv 4090: 13.13 (MIN: 10.18 / MAX: 247.5)
3070: 23.44 (MIN: 10.17 / MAX: 219.36)
SE +/- 0.25, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: alexnet
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 4.30 (MIN: 4.24 / MAX: 5.11)
3090: 4.32 (MIN: 4.25 / MAX: 5.33)
4080 rep: 4.64 (MIN: 4.24 / MAX: 6)
4090: 4.67 (MIN: 4.28 / MAX: 6)
nv 4090: 4.69 (MIN: 4.28 / MAX: 6.33)
4080 zzz: 4.70 (MIN: 4.28 / MAX: 5.92)
4080 xxx: 4.71 (MIN: 4.26 / MAX: 7.21)
4080: 4.98 (MIN: 4.59 / MAX: 7.15)
4090 rep: 5.25 (MIN: 4.86 / MAX: 6.33)
RTX 3070 Ti: 5.34 (MIN: 4.25 / MAX: 221.78)
3070: 11.89 (MIN: 4.34 / MAX: 229.18)
SE +/- 0.16, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: resnet18
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 5.20 (MIN: 5.1 / MAX: 6.09)
3090: 5.21 (MIN: 5.09 / MAX: 6.13)
4080 xxx: 5.65 (MIN: 5.18 / MAX: 6.76)
4080 rep: 5.67 (MIN: 5.19 / MAX: 7.38)
4080 zzz: 5.74 (MIN: 5.18 / MAX: 8.08)
4090 rep: 5.90 (MIN: 5.43 / MAX: 7.49)
4080: 5.92 (MIN: 5.37 / MAX: 8.24)
RTX 3070 Ti: 6.40 (MIN: 5.1 / MAX: 457.07)
nv 4090: 7.38 (MIN: 5.15 / MAX: 138.85)
4090: 7.52 (MIN: 5.45 / MAX: 290.49)
3070: 11.30 (MIN: 5.3 / MAX: 181.7)
SE +/- 0.20, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: vgg16
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 23.38 (MIN: 23.19 / MAX: 24.27)
3090: 23.51 (MIN: 23.27 / MAX: 24.38)
4080 xxx: 25.33 (MIN: 24.26 / MAX: 34.98)
4080 rep: 25.56 (MIN: 24.24 / MAX: 27.92)
4080: 25.67 (MIN: 24.46 / MAX: 27.34)
4080 zzz: 26.09 (MIN: 24.58 / MAX: 30.18)
4090: 27.44 (MIN: 24.06 / MAX: 264.59)
nv 4090: 28.14 (MIN: 24.24 / MAX: 221.5)
RTX 3070 Ti: 28.53 (MIN: 23.95 / MAX: 473.83)
4090 rep: 29.17 (MIN: 24.61 / MAX: 264.85)
3070: 49.70 (MIN: 25.55 / MAX: 421.44)
SE +/- 0.28, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: googlenet
OpenBenchmarking.org: ms, Fewer Is Better
3090: 7.84 (MIN: 7.74 / MAX: 8.72)
3090 rep: 7.86 (MIN: 7.75 / MAX: 8.71)
4080 xxx: 8.26 (MIN: 7.62 / MAX: 10.47)
nv 4090: 8.35 (MIN: 7.7 / MAX: 10.46)
4080 rep: 8.42 (MIN: 7.77 / MAX: 10.52)
4080 zzz: 8.55 (MIN: 7.86 / MAX: 10.08)
4080: 8.79 (MIN: 8.08 / MAX: 10.27)
4090: 9.05 (MIN: 8.26 / MAX: 13.34)
RTX 3070 Ti: 9.90 (MIN: 7.76 / MAX: 396.66)
4090 rep: 10.39 (MIN: 7.87 / MAX: 391.66)
3070: 16.97 (MIN: 7.44 / MAX: 229.93)
SE +/- 0.19, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: blazeface
OpenBenchmarking.org: ms, Fewer Is Better
nv 4090: 1.26 (MIN: 1.2 / MAX: 1.76)
4080 xxx: 1.31 (MIN: 1.25 / MAX: 3.14)
RTX 3070 Ti: 1.34 (MIN: 1.06 / MAX: 2.66)
3090 rep: 1.37 (MIN: 1.35 / MAX: 1.48)
3090: 1.38 (MIN: 1.36 / MAX: 1.53)
4080 zzz: 1.41 (MIN: 1.34 / MAX: 1.88)
4090 rep: 1.41 (MIN: 1.35 / MAX: 1.91)
4080 rep: 1.42 (MIN: 1.35 / MAX: 2.89)
4090: 1.42 (MIN: 1.36 / MAX: 1.92)
4080: 1.44 (MIN: 1.37 / MAX: 2.07)
3070: 2.99 (MIN: 1.22 / MAX: 149.55)
SE +/- 0.03, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: efficientnet-b0
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 3.85 (MIN: 3.78 / MAX: 4.83)
3090: 3.88 (MIN: 3.83 / MAX: 4.72)
4080 xxx: 3.97 (MIN: 3.79 / MAX: 5.93)
4080 rep: 4.01 (MIN: 3.81 / MAX: 6.04)
4080: 4.04 (MIN: 3.84 / MAX: 4.83)
4080 zzz: 4.05 (MIN: 3.83 / MAX: 5.42)
nv 4090: 4.10 (MIN: 3.87 / MAX: 6.14)
4090: 4.15 (MIN: 3.93 / MAX: 5.94)
RTX 3070 Ti: 4.37 (MIN: 3.85 / MAX: 366.28)
4090 rep: 4.41 (MIN: 4.21 / MAX: 5.82)
3070: 9.81 (MIN: 3.87 / MAX: 165.38)
SE +/- 0.13, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mnasnet
OpenBenchmarking.org: ms, Fewer Is Better
3090: 2.94 (MIN: 2.9 / MAX: 3.34)
3090 rep: 2.97 (MIN: 2.94 / MAX: 3.45)
4080 xxx: 2.98 (MIN: 2.86 / MAX: 4.47)
4080 rep: 3.06 (MIN: 2.93 / MAX: 5.02)
nv 4090: 3.07 (MIN: 2.93 / MAX: 4.52)
4080 zzz: 3.08 (MIN: 2.95 / MAX: 3.88)
4080: 3.10 (MIN: 2.95 / MAX: 4.05)
RTX 3070 Ti: 3.10 (MIN: 2.61 / MAX: 4.75)
4090: 4.93 (MIN: 2.97 / MAX: 124.96)
4090 rep: 4.99 (MIN: 3.02 / MAX: 235.56)
3070: 6.06 (MIN: 2.96 / MAX: 42.7)
SE +/- 0.04, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: shufflenet-v2
OpenBenchmarking.org: ms, Fewer Is Better
3090: 3.32 (MIN: 3.29 / MAX: 3.79)
4080 xxx: 3.34 (MIN: 3.22 / MAX: 3.97)
3090 rep: 3.34 (MIN: 3.3 / MAX: 3.79)
4080 rep: 3.44 (MIN: 3.31 / MAX: 4.32)
4080 zzz: 3.44 (MIN: 3.31 / MAX: 4.85)
4080: 3.48 (MIN: 3.34 / MAX: 4.88)
4090 rep: 3.48 (MIN: 3.34 / MAX: 4.1)
nv 4090: 3.50 (MIN: 3.37 / MAX: 4.2)
4090: 3.52 (MIN: 3.38 / MAX: 4.23)
RTX 3070 Ti: 3.89 (MIN: 3.08 / MAX: 345.39)
3070: 5.59 (MIN: 3.32 / MAX: 42.33)
SE +/- 0.20, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet-v3
OpenBenchmarking.org: ms, Fewer Is Better
4080 xxx: 3.05 (MIN: 2.94 / MAX: 3.56)
3090: 3.13 (MIN: 3.09 / MAX: 3.68)
nv 4090: 3.17 (MIN: 3.04 / MAX: 4.3)
3090 rep: 3.18 (MIN: 3.13 / MAX: 3.61)
4080: 3.26 (MIN: 3.13 / MAX: 4.7)
4080 zzz: 3.26 (MIN: 3.12 / MAX: 4.74)
4080 rep: 3.31 (MIN: 3.16 / MAX: 3.93)
4090 rep: 3.34 (MIN: 3.19 / MAX: 3.99)
RTX 3070 Ti: 3.44 (MIN: 2.65 / MAX: 361.91)
4090: 3.62 (MIN: 3.47 / MAX: 4.24)
3070: 5.99 (MIN: 3.05 / MAX: 26.81)
SE +/- 0.13, N = 13
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v2-v2 - Model: mobilenet-v2
OpenBenchmarking.org: ms, Fewer Is Better
3090: 3.12 (MIN: 3.07 / MAX: 3.62)
4080 xxx: 3.14 (MIN: 3 / MAX: 3.85)
3090 rep: 3.17 (MIN: 3.12 / MAX: 3.89)
4080 rep: 3.27 (MIN: 3.08 / MAX: 4.68)
4080 zzz: 3.28 (MIN: 3.11 / MAX: 4.26)
4080: 3.29 (MIN: 3.12 / MAX: 3.99)
4090 rep: 3.36 (MIN: 3.17 / MAX: 4.8)
nv 4090: 3.42 (MIN: 3.15 / MAX: 25.1)
4090: 3.48 (MIN: 3.32 / MAX: 4.99)
RTX 3070 Ti: 3.66 (MIN: 2.73 / MAX: 398.42)
3070: 5.46 (MIN: 3.27 / MAX: 38.65)
SE +/- 0.18, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet
OpenBenchmarking.org: ms, Fewer Is Better
3090: 8.03 (MIN: 7.96 / MAX: 8.83)
3090 rep: 8.05 (MIN: 7.96 / MAX: 9.04)
4080 xxx: 8.31 (MIN: 7.85 / MAX: 10.21)
4080 rep: 8.38 (MIN: 7.94 / MAX: 10.07)
4080: 8.43 (MIN: 8.03 / MAX: 9.64)
nv 4090: 8.93 (MIN: 8.33 / MAX: 11.07)
4090 rep: 9.02 (MIN: 8.42 / MAX: 11.17)
4090: 9.16 (MIN: 8.5 / MAX: 10.51)
4080 zzz: 9.19 (MIN: 8.51 / MAX: 11.04)
RTX 3070 Ti: 9.52 (MIN: 7.97 / MAX: 420.29)
3070: 17.09 (MIN: 7.89 / MAX: 121.53)
SE +/- 0.25, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in double precision
OpenBenchmarking.org: Benchmark Score, More Is Better
4090 rep: 55383
4090: 55214
nv 4090: 54950
4080 xxx: 35071
4080 zzz: 35058
4080 rep: 35038
4080: 34974
3090 rep: 31122
3090: 30945
c: 20847
b: 20822
a: 20816
i: 14780
e: 12168
d: 12143
h: 10572
f: 10561
g: 10548
SE +/- 14.62, N = 3; SE +/- 11.67, N = 3; SE +/- 12.42, N = 3; SE +/- 10.58, N = 3
1. (CXX) g++ options: -O3

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet-v3
OpenBenchmarking.org: ms, Fewer Is Better
g: 3.16 (MIN: 3.12 / MAX: 3.58)
3090: 3.16 (MIN: 3.11 / MAX: 3.77)
4080 rep: 3.24 (MIN: 3.11 / MAX: 4.37)
4080 zzz: 3.24 (MIN: 3.1 / MAX: 3.88)
i: 3.26 (MIN: 3.11 / MAX: 4.7)
4080 xxx: 3.26 (MIN: 3.13 / MAX: 4.08)
nv 4090: 3.26 (MIN: 3.13 / MAX: 3.96)
4090: 3.30 (MIN: 3.14 / MAX: 4.82)
4090 rep: 3.33 (MIN: 3.19 / MAX: 4.79)
RTX 3070 Ti: 3.62 (MIN: 3 / MAX: 469.9)
3070: 6.56 (MIN: 3.07 / MAX: 110.87)
SE +/- 0.18, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein in single precision
OpenBenchmarking.org: Benchmark Score, More Is Better
nv 4090: 20601
4090 rep: 20404
4090: 20373
4080 xxx: 17343
4080 rep: 17287
4080 zzz: 17185
4080: 17121
3090 rep: 14449
3090: 14406
a: 11340
c: 11311
b: 11273
d: 10719
e: 10560
i: 10061
h: 7622
g: 7574
f: 7571
SE +/- 83.38, N = 3; SE +/- 62.67, N = 3; SE +/- 72.34, N = 3; SE +/- 75.16, N = 15
1. (CXX) g++ options: -O3

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: yolov4-tiny
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 12.86 (MIN: 12.76 / MAX: 13.73)
3090: 13.10 (MIN: 13.01 / MAX: 14.17)
g: 13.14 (MIN: 13 / MAX: 14.02)
4080 rep: 13.55 (MIN: 12.72 / MAX: 15.51)
4080 zzz: 13.61 (MIN: 12.67 / MAX: 19.72)
4080 xxx: 13.62 (MIN: 12.71 / MAX: 15.65)
i: 13.77 (MIN: 12.96 / MAX: 14.66)
4080: 13.86 (MIN: 13.04 / MAX: 15.04)
RTX 3070 Ti: 15.21 (MIN: 12.34 / MAX: 380.51)
4090 rep: 15.45 (MIN: 12.65 / MAX: 445.76)
4090: 15.55 (MIN: 13.11 / MAX: 307.2)
nv 4090: 17.30 (MIN: 14.66 / MAX: 441.3)
3070: 29.80 (MIN: 12.85 / MAX: 216.34)
SE +/- 0.28, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: CPU-v3-v3 - Model: mobilenet-v3
OpenBenchmarking.org: ms, Fewer Is Better
g: 3.14 (MIN: 3.1 / MAX: 3.81)
c: 3.17 (MIN: 3.15 / MAX: 3.74)
d: 3.17 (MIN: 3.12 / MAX: 3.96)
a: 3.18 (MIN: 3.14 / MAX: 3.82)
3090: 3.18 (MIN: 3.14 / MAX: 4.14)
3090 rep: 3.19 (MIN: 3.15 / MAX: 3.72)
4080: 3.24 (MIN: 3.09 / MAX: 4.73)
4080 zzz: 3.24 (MIN: 3.11 / MAX: 4.47)
i: 3.26 (MIN: 3.14 / MAX: 3.9)
4080 rep: 3.26 (MIN: 3.09 / MAX: 3.96)
4080 xxx: 3.27 (MIN: 3.13 / MAX: 3.85)
4090 rep: 3.31 (MIN: 3.16 / MAX: 4.73)
nv 4090: 3.47 (MIN: 3.32 / MAX: 4.91)
4090: 3.53 (MIN: 3.39 / MAX: 4.31)
RTX 3070 Ti: 3.61 (MIN: 2.51 / MAX: 502.85)
3070: 7.52 (MIN: 2.94 / MAX: 215)
SE +/- 0.00, N = 3; SE +/- 0.00, N = 2; SE +/- 0.21, N = 14
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: FastestDet
OpenBenchmarking.org: ms, Fewer Is Better
nv 4090: 2.81 (MIN: 2.68 / MAX: 4.38)
i: 3.83 (MIN: 3.7 / MAX: 4.57)
g: 3.92 (MIN: 3.88 / MAX: 4.72)
4090 rep: 3.96 (MIN: 3.79 / MAX: 11.36)
3090: 4.03 (MIN: 3.99 / MAX: 4.22)
4090: 4.03 (MIN: 3.89 / MAX: 4.63)
3090 rep: 4.08 (MIN: 4.04 / MAX: 4.29)
4080 rep: 4.14 (MIN: 4 / MAX: 5.6)
4080 xxx: 4.17 (MIN: 4.03 / MAX: 5.63)
4080 zzz: 4.20 (MIN: 4.01 / MAX: 11.47)
4080: 4.28 (MIN: 4.13 / MAX: 4.85)
RTX 3070 Ti: 4.41 (MIN: 2.06 / MAX: 295.24)
3070: 6.93 (MIN: 2.57 / MAX: 163.84)
SE +/- 0.20, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: vision_transformer
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 32.09 (MIN: 31.84 / MAX: 32.77)
3090: 33.22 (MIN: 33.04 / MAX: 36.99)
g: 33.32 (MIN: 31.83 / MAX: 104.12)
4080 rep: 34.10 (MIN: 32.43 / MAX: 38.75)
4080 zzz: 34.10 (MIN: 32.32 / MAX: 38.54)
4080 xxx: 34.23 (MIN: 33.08 / MAX: 37.43)
4080: 34.91 (MIN: 33.72 / MAX: 36.82)
RTX 3070 Ti: 37.88 (MIN: 32.46 / MAX: 518.57)
i: 38.01 (MIN: 32.96 / MAX: 388.09)
4090 rep: 38.17 (MIN: 32.97 / MAX: 462.63)
4090: 38.38 (MIN: 33.53 / MAX: 477.38)
nv 4090: 39.18 (MIN: 33.74 / MAX: 520.24)
3070: 71.08 (MIN: 38.84 / MAX: 374.68)
SE +/- 0.20, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: regnety_400m
OpenBenchmarking.org: ms, Fewer Is Better
3090: 7.99 (MIN: 7.92 / MAX: 8.78)
4090: 8.10 (MIN: 7.65 / MAX: 10.05)
4080 rep: 8.24 (MIN: 7.91 / MAX: 9.53)
3090 rep: 8.34 (MIN: 8.26 / MAX: 9.09)
g: 8.38 (MIN: 8.05 / MAX: 27.34)
i: 8.46 (MIN: 8.08 / MAX: 10.33)
4080 zzz: 8.49 (MIN: 8.08 / MAX: 9.72)
4080 xxx: 8.52 (MIN: 8.13 / MAX: 9.73)
4080: 8.61 (MIN: 8.21 / MAX: 10.07)
4090 rep: 8.64 (MIN: 8.3 / MAX: 10.51)
RTX 3070 Ti: 9.19 (MIN: 7.44 / MAX: 524.66)
nv 4090: 10.09 (MIN: 7.84 / MAX: 366.66)
3070: 17.88 (MIN: 7.38 / MAX: 190.77)
SE +/- 0.21, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: squeezenet_ssd
OpenBenchmarking.org: ms, Fewer Is Better
3090: 7.04 (MIN: 6.96 / MAX: 7.7)
3090 rep: 7.09 (MIN: 7.01 / MAX: 7.97)
g: 7.14 (MIN: 7.03 / MAX: 7.99)
i: 7.21 (MIN: 6.73 / MAX: 8.82)
4080 rep: 7.55 (MIN: 6.99 / MAX: 9.08)
4090: 7.57 (MIN: 7.02 / MAX: 9)
4080 zzz: 7.62 (MIN: 7 / MAX: 9.93)
4080: 7.66 (MIN: 7.09 / MAX: 8.97)
4080 xxx: 7.67 (MIN: 7.04 / MAX: 9.1)
RTX 3070 Ti: 8.28 (MIN: 6.38 / MAX: 381.81)
4090 rep: 9.16 (MIN: 6.73 / MAX: 423.75)
nv 4090: 9.21 (MIN: 6.83 / MAX: 203.62)
3070: 15.40 (MIN: 6.64 / MAX: 132.68)
SE +/- 0.25, N = 14
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: resnet50
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 10.04 (MIN: 9.94 / MAX: 10.89)
g: 10.33 (MIN: 10.2 / MAX: 11.18)
3090: 10.38 (MIN: 9.88 / MAX: 18.75)
4080 rep: 10.80 (MIN: 9.89 / MAX: 12.54)
4080 xxx: 10.91 (MIN: 9.91 / MAX: 13.07)
4080 zzz: 10.91 (MIN: 9.94 / MAX: 14.83)
4080: 11.40 (MIN: 10.5 / MAX: 13.51)
4090: 11.53 (MIN: 10.59 / MAX: 13.73)
4090 rep: 12.17 (MIN: 11.25 / MAX: 13.79)
RTX 3070 Ti: 12.73 (MIN: 9.84 / MAX: 518.97)
i: 13.10 (MIN: 10.59 / MAX: 267.95)
nv 4090: 13.63 (MIN: 10.52 / MAX: 488.94)
3070: 23.59 (MIN: 9.96 / MAX: 177.63)
SE +/- 0.23, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: alexnet
OpenBenchmarking.org: ms, Fewer Is Better
3090: 4.31 (MIN: 4.25 / MAX: 4.94)
3090 rep: 4.31 (MIN: 4.26 / MAX: 5.07)
g: 4.35 (MIN: 4.28 / MAX: 5.1)
4080: 4.61 (MIN: 4.24 / MAX: 7.25)
4080 xxx: 4.67 (MIN: 4.27 / MAX: 6.36)
4080 zzz: 4.68 (MIN: 4.26 / MAX: 6.8)
4080 rep: 4.72 (MIN: 4.25 / MAX: 7.3)
4090: 4.72 (MIN: 4.31 / MAX: 6.71)
i: 5.10 (MIN: 4.75 / MAX: 6.12)
nv 4090: 5.14 (MIN: 4.65 / MAX: 6.81)
4090 rep: 5.33 (MIN: 4.83 / MAX: 6.6)
RTX 3070 Ti: 5.53 (MIN: 4.22 / MAX: 362.62)
3070: 10.08 (MIN: 4.36 / MAX: 225.66)
SE +/- 0.22, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: resnet18
OpenBenchmarking.org: ms, Fewer Is Better
3090: 5.19 (MIN: 5.09 / MAX: 6)
3090 rep: 5.22 (MIN: 5.13 / MAX: 6.1)
g: 5.28 (MIN: 5.16 / MAX: 6.09)
4080 zzz: 5.60 (MIN: 5.09 / MAX: 7.51)
4080: 5.61 (MIN: 5.09 / MAX: 7.91)
4080 rep: 5.61 (MIN: 5.07 / MAX: 7.08)
4080 xxx: 5.67 (MIN: 5.1 / MAX: 8.06)
4090: 5.78 (MIN: 5.26 / MAX: 7.24)
i: 5.88 (MIN: 5.36 / MAX: 8.2)
4090 rep: 6.05 (MIN: 5.53 / MAX: 7.66)
RTX 3070 Ti: 6.57 (MIN: 4.91 / MAX: 391.33)
nv 4090: 8.16 (MIN: 5.39 / MAX: 397.44)
3070: 12.14 (MIN: 5.28 / MAX: 151.53)
SE +/- 0.23, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: vgg16
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 23.52 (MIN: 23.33 / MAX: 25.08)
3090: 23.55 (MIN: 23.31 / MAX: 24.48)
g: 24.04 (MIN: 23.48 / MAX: 73.3)
4080 rep: 25.05 (MIN: 23.78 / MAX: 26.95)
4080 zzz: 25.16 (MIN: 23.97 / MAX: 27.81)
4080 xxx: 25.40 (MIN: 24.05 / MAX: 27.09)
4080: 25.48 (MIN: 23.88 / MAX: 51.68)
i: 27.43 (MIN: 24.65 / MAX: 251.37)
nv 4090: 27.89 (MIN: 24.5 / MAX: 463.23)
RTX 3070 Ti: 28.63 (MIN: 24.13 / MAX: 500.18)
4090: 29.05 (MIN: 24.19 / MAX: 451.92)
4090 rep: 29.12 (MIN: 24.62 / MAX: 266.39)
3070: 55.48 (MIN: 25.94 / MAX: 298.67)
SE +/- 0.30, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: googlenet
OpenBenchmarking.org: ms, Fewer Is Better
3090: 7.86 (MIN: 7.74 / MAX: 8.62)
3090 rep: 7.89 (MIN: 7.79 / MAX: 8.84)
g: 7.96 (MIN: 7.81 / MAX: 9.05)
4080: 8.40 (MIN: 7.71 / MAX: 10.64)
4080 zzz: 8.42 (MIN: 7.78 / MAX: 10.7)
4080 xxx: 8.43 (MIN: 7.77 / MAX: 10.4)
4080 rep: 8.49 (MIN: 7.74 / MAX: 10.76)
nv 4090: 8.61 (MIN: 7.95 / MAX: 10.07)
RTX 3070 Ti: 9.84 (MIN: 7.3 / MAX: 438.04)
4090 rep: 10.38 (MIN: 7.96 / MAX: 255.68)
i: 10.47 (MIN: 8.21 / MAX: 350.07)
4090: 10.87 (MIN: 8.37 / MAX: 194.11)
3070: 18.60 (MIN: 8.02 / MAX: 292.16)
SE +/- 0.24, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: blazeface
OpenBenchmarking.org: ms, Fewer Is Better
4090: 1.16 (MIN: 1.1 / MAX: 2)
nv 4090: 1.18 (MIN: 1.11 / MAX: 1.85)
3090: 1.36 (MIN: 1.34 / MAX: 1.46)
g: 1.38 (MIN: 1.35 / MAX: 2.09)
3090 rep: 1.39 (MIN: 1.37 / MAX: 1.52)
i: 1.41 (MIN: 1.35 / MAX: 2.02)
4080 rep: 1.41 (MIN: 1.34 / MAX: 2.1)
4080 xxx: 1.42 (MIN: 1.35 / MAX: 2)
4080 zzz: 1.42 (MIN: 1.34 / MAX: 2.84)
4080: 1.43 (MIN: 1.36 / MAX: 2.06)
4090 rep: 1.46 (MIN: 1.39 / MAX: 2.91)
RTX 3070 Ti: 1.49 (MIN: 1.05 / MAX: 379.08)
3070: 1.77 (MIN: 1.08 / MAX: 12.53)
SE +/- 0.12, N = 14
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: efficientnet-b0
OpenBenchmarking.org: ms, Fewer Is Better
3090: 3.83 (MIN: 3.78 / MAX: 4.4)
g: 3.84 (MIN: 3.78 / MAX: 4.57)
3090 rep: 3.87 (MIN: 3.81 / MAX: 4.62)
4080 rep: 3.98 (MIN: 3.77 / MAX: 5.44)
4080 zzz: 4.01 (MIN: 3.79 / MAX: 5.39)
4080 xxx: 4.02 (MIN: 3.8 / MAX: 5.14)
4090 rep: 4.04 (MIN: 3.85 / MAX: 4.9)
4080: 4.06 (MIN: 3.85 / MAX: 4.97)
nv 4090: 4.12 (MIN: 3.86 / MAX: 5.39)
i: 4.19 (MIN: 4.01 / MAX: 5.09)
4090: 4.24 (MIN: 3.96 / MAX: 5.55)
RTX 3070 Ti: 4.60 (MIN: 3.79 / MAX: 336.2)
3070: 8.41 (MIN: 3.76 / MAX: 67.73)
SE +/- 0.19, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: mnasnet
OpenBenchmarking.org: ms, Fewer Is Better
3090: 2.96 (MIN: 2.93 / MAX: 3.31)
g: 2.97 (MIN: 2.93 / MAX: 3.88)
3090 rep: 2.99 (MIN: 2.96 / MAX: 3.32)
4080 rep: 3.03 (MIN: 2.91 / MAX: 4.45)
4080 xxx: 3.05 (MIN: 2.91 / MAX: 3.67)
4080: 3.06 (MIN: 2.94 / MAX: 3.67)
i: 3.07 (MIN: 2.93 / MAX: 3.84)
4080 zzz: 3.08 (MIN: 2.93 / MAX: 4.42)
4090: 3.10 (MIN: 2.97 / MAX: 3.71)
4090 rep: 3.13 (MIN: 3.01 / MAX: 3.62)
nv 4090: 3.16 (MIN: 3.02 / MAX: 4.6)
RTX 3070 Ti: 3.34 (MIN: 2.68 / MAX: 393.6)
3070: 4.59 (MIN: 2.88 / MAX: 20.12)
SE +/- 0.14, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: shufflenet-v2
OpenBenchmarking.org: ms, Fewer Is Better
3090: 3.33 (MIN: 3.29 / MAX: 3.67)
g: 3.34 (MIN: 3.31 / MAX: 4.05)
3090 rep: 3.37 (MIN: 3.33 / MAX: 3.8)
4080 rep: 3.39 (MIN: 3.26 / MAX: 3.91)
i: 3.43 (MIN: 3.3 / MAX: 4.89)
4080 xxx: 3.43 (MIN: 3.31 / MAX: 3.95)
4080 zzz: 3.43 (MIN: 3.29 / MAX: 3.87)
4080: 3.46 (MIN: 3.3 / MAX: 5.74)
nv 4090: 3.51 (MIN: 3.38 / MAX: 4.05)
RTX 3070 Ti: 3.75 (MIN: 3.2 / MAX: 361.52)
4090: 5.09 (MIN: 3.33 / MAX: 161.5)
4090 rep: 5.27 (MIN: 3.27 / MAX: 191.55)
3070: 8.00 (MIN: 3.16 / MAX: 190.15)
SE +/- 0.16, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3-v2-v2 - Model: mobilenet-v2
OpenBenchmarking.org: ms, Fewer Is Better
3090: 3.15 (MIN: 3.1 / MAX: 3.68)
g: 3.17 (MIN: 3.1 / MAX: 5.03)
3090 rep: 3.19 (MIN: 3.13 / MAX: 4)
4080 rep: 3.27 (MIN: 3.08 / MAX: 5.18)
4080 xxx: 3.27 (MIN: 3.11 / MAX: 4.73)
4080: 3.28 (MIN: 3.11 / MAX: 4.16)
4080 zzz: 3.28 (MIN: 3.09 / MAX: 4.98)
i: 3.29 (MIN: 3.1 / MAX: 3.96)
4090: 3.32 (MIN: 3.12 / MAX: 4.24)
4090 rep: 3.34 (MIN: 3.14 / MAX: 4.45)
RTX 3070 Ti: 3.66 (MIN: 3.01 / MAX: 311.25)
nv 4090: 5.10 (MIN: 3.14 / MAX: 138.88)
3070: 8.35 (MIN: 3.08 / MAX: 103.38)
SE +/- 0.15, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v3-v3-v3 - Model: mobilenet
OpenBenchmarking.org: ms, Fewer Is Better
3090 rep: 8.04 (MIN: 7.96 / MAX: 9.01)
3090: 8.11 (MIN: 8.02 / MAX: 14.2)
4090 rep: 8.37 (MIN: 7.98 / MAX: 10.71)
4080 rep: 8.40 (MIN: 7.93 / MAX: 15.25)
4080 xxx: 8.46 (MIN: 7.95 / MAX: 10.34)
4080 zzz: 8.46 (MIN: 7.97 / MAX: 10.56)
g: 8.50 (MIN: 8.42 / MAX: 9.29)
4080: 8.84 (MIN: 8.31 / MAX: 10.98)
4090: 8.96 (MIN: 8.37 / MAX: 11.12)
nv 4090: 9.41 (MIN: 8.98 / MAX: 11.38)
RTX 3070 Ti: 9.62 (MIN: 7.71 / MAX: 449.11)
i: 10.02 (MIN: 8.07 / MAX: 266.25)
3070: 18.39 (MIN: 7.92 / MAX: 173.39)
SE +/- 0.27, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 Target: Vulkan GPU - Model: FastestDet
OpenBenchmarking.org ms, Fewer Is Better
  nv 4090: 3.93 (MIN: 3.8 / MAX: 5.4)
  b: 4.07 (MIN: 4.04 / MAX: 4.53)
  g: 4.07 (MIN: 4.02 / MAX: 4.82)
  e: 4.08 (MIN: 4.03 / MAX: 5.29)
  3090 rep: 4.08 (MIN: 4.04 / MAX: 4.35)
  c: 4.09 (MIN: 4.05 / MAX: 5.5)
  a: 4.10 (MIN: 4.06 / MAX: 4.81)
  d: 4.11 (MIN: 4.01 / MAX: 9.72)
  3090: 4.11 (MIN: 4.07 / MAX: 4.29)
  4080: 4.20 (MIN: 4.02 / MAX: 4.97)
  4080 xxx: 4.20 (MIN: 4.03 / MAX: 6.49)
  4080 rep: 4.21 (MIN: 4.04 / MAX: 4.97)
  f: 4.24 (MIN: 3.88 / MAX: 24.21)
  RTX 3070 Ti: 4.26 (MIN: 2.5 / MAX: 396.93)
  4090: 4.39 (MIN: 4.25 / MAX: 5.86)
  4090 rep: 4.59 (MIN: 2.62 / MAX: 232.18)
  4080 zzz: 4.79 (MIN: 4.64 / MAX: 6.21)
  i: 5.14 (MIN: 3.7 / MAX: 81.79)
  3070: 8.41 (MIN: 2.89 / MAX: 487.78)
SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.02, N = 3; SE +/- 0.29, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: vision_transformer
OpenBenchmarking.org ms, Fewer Is Better
  c: 31.79 (MIN: 31.63 / MAX: 35.57)
  b: 31.85 (MIN: 31.69 / MAX: 33.06)
  a: 31.88 (MIN: 31.55 / MAX: 37.47)
  3090 rep: 31.91 (MIN: 31.74 / MAX: 34.28)
  e: 31.93 (MIN: 31.62 / MAX: 35.85)
  3090: 31.94 (MIN: 31.73 / MAX: 34.21)
  d: 32.12 (MIN: 31.66 / MAX: 46.9)
  g: 32.42 (MIN: 31.89 / MAX: 65.47)
  f: 32.92 (MIN: 32.67 / MAX: 36.93)
  4080 xxx: 34.19 (MIN: 32.72 / MAX: 36.79)
  4080 zzz: 34.32 (MIN: 32.58 / MAX: 41.88)
  4080 rep: 35.07 (MIN: 33.66 / MAX: 39.36)
  4080: 35.56 (MIN: 33.19 / MAX: 40.43)
  i: 36.42 (MIN: 33.49 / MAX: 224.86)
  4090 rep: 37.59 (MIN: 34.45 / MAX: 457.98)
  RTX 3070 Ti: 38.03 (MIN: 32.66 / MAX: 467.28)
  4090: 38.76 (MIN: 33.12 / MAX: 539.58)
  nv 4090: 39.04 (MIN: 33.83 / MAX: 463.88)
  3070: 70.76 (MIN: 38.81 / MAX: 250.01)
SE +/- 0.09, N = 3; SE +/- 0.07, N = 3; SE +/- 0.21, N = 3; SE +/- 0.16, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: regnety_400m
OpenBenchmarking.org ms, Fewer Is Better
  c: 8.00 (MIN: 7.94 / MAX: 8.88)
  3090 rep: 8.02 (MIN: 7.95 / MAX: 8.63)
  e: 8.10 (MIN: 7.98 / MAX: 8.84)
  a: 8.16 (MIN: 7.9 / MAX: 8.99)
  d: 8.17 (MIN: 7.99 / MAX: 8.97)
  b: 8.21 (MIN: 8.14 / MAX: 8.84)
  f: 8.34 (MIN: 7.99 / MAX: 26.72)
  g: 8.36 (MIN: 8.27 / MAX: 9.08)
  3090: 8.38 (MIN: 8.31 / MAX: 8.86)
  4080: 8.39 (MIN: 8 / MAX: 10.29)
  4090 rep: 8.48 (MIN: 8.09 / MAX: 9.64)
  4080 rep: 8.56 (MIN: 8.17 / MAX: 10.28)
  4080 xxx: 8.56 (MIN: 8.15 / MAX: 9.8)
  4080 zzz: 8.58 (MIN: 8.13 / MAX: 9.78)
  4090: 8.64 (MIN: 8.28 / MAX: 10.42)
  RTX 3070 Ti: 9.07 (MIN: 7.61 / MAX: 402.49)
  nv 4090: 9.81 (MIN: 7.82 / MAX: 241.19)
  i: 9.88 (MIN: 8.14 / MAX: 251.77)
  3070: 16.22 (MIN: 7.74 / MAX: 314.84)
SE +/- 0.02, N = 3; SE +/- 0.11, N = 3; SE +/- 0.06, N = 3; SE +/- 0.21, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: squeezenet_ssd
OpenBenchmarking.org ms, Fewer Is Better
  e: 7.05 (MIN: 6.95 / MAX: 8)
  c: 7.06 (MIN: 7 / MAX: 8.03)
  3090 rep: 7.06 (MIN: 7 / MAX: 7.82)
  b: 7.07 (MIN: 7 / MAX: 8.07)
  d: 7.08 (MIN: 6.97 / MAX: 7.99)
  f: 7.08 (MIN: 6.98 / MAX: 8.07)
  a: 7.09 (MIN: 6.98 / MAX: 7.95)
  g: 7.10 (MIN: 6.99 / MAX: 8.59)
  3090: 7.16 (MIN: 7.05 / MAX: 13.55)
  i: 7.46 (MIN: 6.9 / MAX: 8.9)
  4080: 7.58 (MIN: 6.98 / MAX: 9.05)
  4080 xxx: 7.62 (MIN: 7.01 / MAX: 9.28)
  4080 zzz: 7.63 (MIN: 7 / MAX: 9.17)
  4080 rep: 7.64 (MIN: 7.05 / MAX: 9.12)
  4090: 7.83 (MIN: 7.21 / MAX: 9.32)
  4090 rep: 8.22 (MIN: 7.56 / MAX: 9.8)
  RTX 3070 Ti: 8.47 (MIN: 6.29 / MAX: 533.92)
  nv 4090: 9.11 (MIN: 6.35 / MAX: 130.38)
  3070: 15.82 (MIN: 6.99 / MAX: 82.57)
SE +/- 0.02, N = 3; SE +/- 0.02, N = 3; SE +/- 0.02, N = 3; SE +/- 0.26, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: yolov4-tiny
OpenBenchmarking.org ms, Fewer Is Better
  c: 12.81 (MIN: 12.73 / MAX: 13.08)
  3090 rep: 12.83 (MIN: 12.74 / MAX: 13.59)
  a: 12.84 (MIN: 12.69 / MAX: 15.33)
  d: 12.85 (MIN: 12.72 / MAX: 13.93)
  b: 12.87 (MIN: 12.76 / MAX: 13.73)
  e: 12.87 (MIN: 12.68 / MAX: 13.84)
  3090: 12.88 (MIN: 12.76 / MAX: 13.67)
  f: 13.17 (MIN: 13.03 / MAX: 14.1)
  g: 13.64 (MIN: 13.04 / MAX: 76.32)
  4080 xxx: 13.65 (MIN: 12.71 / MAX: 14.99)
  4080 rep: 13.67 (MIN: 12.71 / MAX: 14.88)
  4080: 13.79 (MIN: 12.75 / MAX: 19.63)
  4080 zzz: 13.80 (MIN: 12.76 / MAX: 15.76)
  4090: 13.97 (MIN: 13.11 / MAX: 16.15)
  i: 15.16 (MIN: 12.86 / MAX: 248.64)
  nv 4090: 15.26 (MIN: 12.87 / MAX: 132.82)
  RTX 3070 Ti: 15.54 (MIN: 12.15 / MAX: 492.01)
  4090 rep: 15.72 (MIN: 13.2 / MAX: 301.81)
  3070: 28.59 (MIN: 12.87 / MAX: 325.37)
SE +/- 0.02, N = 3; SE +/- 0.01, N = 3; SE +/- 0.04, N = 3; SE +/- 0.18, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: resnet50
OpenBenchmarking.org ms, Fewer Is Better
  b: 10.00 (MIN: 9.92 / MAX: 12.35)
  c: 10.00 (MIN: 9.91 / MAX: 11.15)
  a: 10.01 (MIN: 9.88 / MAX: 11.4)
  3090 rep: 10.06 (MIN: 9.95 / MAX: 11.04)
  d: 10.10 (MIN: 9.86 / MAX: 11.08)
  e: 10.10 (MIN: 9.84 / MAX: 11.72)
  3090: 10.10 (MIN: 9.97 / MAX: 11.42)
  f: 10.26 (MIN: 10.09 / MAX: 11.22)
  g: 10.34 (MIN: 10.14 / MAX: 11.37)
  4080: 10.81 (MIN: 9.95 / MAX: 12.78)
  4080 rep: 10.84 (MIN: 9.93 / MAX: 12.81)
  4080 xxx: 10.94 (MIN: 9.95 / MAX: 12.7)
  4080 zzz: 11.07 (MIN: 10.1 / MAX: 13.23)
  nv 4090: 12.45 (MIN: 11.55 / MAX: 14.48)
  RTX 3070 Ti: 12.60 (MIN: 9.82 / MAX: 418.4)
  i: 12.96 (MIN: 10.23 / MAX: 424.46)
  4090 rep: 13.08 (MIN: 10.11 / MAX: 444.45)
  4090: 14.13 (MIN: 10.63 / MAX: 167.28)
  3070: 23.48 (MIN: 10.06 / MAX: 112.91)
SE +/- 0.02, N = 3; SE +/- 0.09, N = 3; SE +/- 0.12, N = 3; SE +/- 0.26, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: alexnet
OpenBenchmarking.org ms, Fewer Is Better
  a: 4.31 (MIN: 4.24 / MAX: 5.2)
  d: 4.31 (MIN: 4.25 / MAX: 5.28)
  e: 4.31 (MIN: 4.23 / MAX: 11.03)
  3090 rep: 4.31 (MIN: 4.26 / MAX: 5.26)
  b: 4.33 (MIN: 4.28 / MAX: 5.16)
  c: 4.33 (MIN: 4.26 / MAX: 10.59)
  3090: 4.35 (MIN: 4.28 / MAX: 7.49)
  4080: 4.62 (MIN: 4.26 / MAX: 6.15)
  f: 4.64 (MIN: 4.57 / MAX: 5.49)
  4090: 4.64 (MIN: 4.26 / MAX: 5.98)
  4080 rep: 4.65 (MIN: 4.26 / MAX: 6.53)
  4080 xxx: 4.68 (MIN: 4.26 / MAX: 6.61)
  4080 zzz: 4.68 (MIN: 4.26 / MAX: 6.23)
  g: 4.87 (MIN: 4.8 / MAX: 5.62)
  nv 4090: 5.20 (MIN: 4.82 / MAX: 7.07)
  RTX 3070 Ti: 5.25 (MIN: 4.23 / MAX: 375.94)
  i: 6.53 (MIN: 4.57 / MAX: 242.16)
  4090 rep: 6.79 (MIN: 4.23 / MAX: 262.43)
  3070: 10.88 (MIN: 4.38 / MAX: 52.99)
SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.00, N = 3; SE +/- 0.18, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: resnet18
OpenBenchmarking.org ms, Fewer Is Better
  3090 rep: 5.20 (MIN: 5.09 / MAX: 5.98)
  c: 5.21 (MIN: 5.11 / MAX: 6.04)
  e: 5.22 (MIN: 5.09 / MAX: 11.15)
  b: 5.23 (MIN: 5.13 / MAX: 6.18)
  d: 5.23 (MIN: 5.08 / MAX: 6.28)
  3090: 5.27 (MIN: 5.15 / MAX: 6.19)
  a: 5.28 (MIN: 5.17 / MAX: 6.16)
  f: 5.48 (MIN: 5.33 / MAX: 6.16)
  4080 zzz: 5.59 (MIN: 5.06 / MAX: 6.95)
  4080 rep: 5.61 (MIN: 5.11 / MAX: 7.44)
  4080 xxx: 5.62 (MIN: 5.1 / MAX: 7.65)
  4080: 5.67 (MIN: 5.18 / MAX: 7.22)
  4090: 5.69 (MIN: 5.16 / MAX: 8.22)
  i: 5.82 (MIN: 5.28 / MAX: 7.02)
  4090 rep: 6.01 (MIN: 5.44 / MAX: 8.18)
  g: 6.22 (MIN: 6.11 / MAX: 7)
  RTX 3070 Ti: 6.28 (MIN: 4.94 / MAX: 298.06)
  nv 4090: 7.44 (MIN: 5.29 / MAX: 320.54)
  3070: 12.68 (MIN: 5.39 / MAX: 262.62)
SE +/- 0.01, N = 3; SE +/- 0.03, N = 3; SE +/- 0.01, N = 3; SE +/- 0.20, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: vgg16
OpenBenchmarking.org ms, Fewer Is Better
  3090 rep: 23.48 (MIN: 23.24 / MAX: 29.21)
  a: 23.51 (MIN: 23.29 / MAX: 24.68)
  c: 23.54 (MIN: 23.33 / MAX: 24.61)
  3090: 23.55 (MIN: 23.3 / MAX: 24.45)
  b: 23.56 (MIN: 23.34 / MAX: 24.72)
  d: 23.56 (MIN: 23.24 / MAX: 24.78)
  e: 23.60 (MIN: 23.17 / MAX: 24.71)
  g: 24.20 (MIN: 23.56 / MAX: 58.31)
  f: 24.55 (MIN: 23.62 / MAX: 97.69)
  4080: 25.00 (MIN: 23.93 / MAX: 26.69)
  4080 xxx: 25.01 (MIN: 23.8 / MAX: 26.41)
  4080 rep: 25.04 (MIN: 24.06 / MAX: 27.35)
  4080 zzz: 25.82 (MIN: 24.35 / MAX: 62.94)
  4090 rep: 27.04 (MIN: 24.22 / MAX: 296.13)
  i: 27.83 (MIN: 24.98 / MAX: 262.23)
  4090: 28.82 (MIN: 24.35 / MAX: 214.1)
  RTX 3070 Ti: 29.06 (MIN: 24.11 / MAX: 541.55)
  nv 4090: 29.29 (MIN: 24.63 / MAX: 296.95)
  3070: 48.29 (MIN: 24.97 / MAX: 183.12)
SE +/- 0.04, N = 3; SE +/- 0.10, N = 3; SE +/- 0.14, N = 3; SE +/- 0.24, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: googlenet
OpenBenchmarking.org ms, Fewer Is Better
  c: 7.80 (MIN: 7.72 / MAX: 8.74)
  3090 rep: 7.82 (MIN: 7.69 / MAX: 8.61)
  b: 7.85 (MIN: 7.76 / MAX: 8.76)
  d: 7.85 (MIN: 7.71 / MAX: 8.85)
  e: 7.85 (MIN: 7.71 / MAX: 8.76)
  3090: 7.87 (MIN: 7.76 / MAX: 10.36)
  a: 7.90 (MIN: 7.74 / MAX: 9.54)
  f: 8.15 (MIN: 8.02 / MAX: 9.02)
  4080 rep: 8.40 (MIN: 7.77 / MAX: 9.78)
  4080 zzz: 8.41 (MIN: 7.72 / MAX: 9.9)
  4080: 8.42 (MIN: 7.79 / MAX: 10.01)
  4080 xxx: 8.42 (MIN: 7.73 / MAX: 10.06)
  nv 4090: 8.70 (MIN: 7.96 / MAX: 10.01)
  i: 8.75 (MIN: 8.08 / MAX: 16.01)
  g: 8.96 (MIN: 8.82 / MAX: 9.87)
  RTX 3070 Ti: 9.87 (MIN: 7.33 / MAX: 399.24)
  4090: 10.62 (MIN: 7.83 / MAX: 323.31)
  4090 rep: 10.65 (MIN: 8.29 / MAX: 236.11)
  3070: 19.49 (MIN: 7.4 / MAX: 200.01)
SE +/- 0.04, N = 3; SE +/- 0.01, N = 3; SE +/- 0.02, N = 3; SE +/- 0.22, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: blazeface
OpenBenchmarking.org ms, Fewer Is Better
  b: 1.37 (MIN: 1.35 / MAX: 1.75)
  c: 1.37 (MIN: 1.35 / MAX: 1.82)
  3090 rep: 1.37 (MIN: 1.36 / MAX: 1.46)
  a: 1.38 (MIN: 1.34 / MAX: 1.85)
  d: 1.38 (MIN: 1.34 / MAX: 2.25)
  e: 1.38 (MIN: 1.34 / MAX: 1.88)
  f: 1.38 (MIN: 1.35 / MAX: 2.08)
  3090: 1.39 (MIN: 1.37 / MAX: 1.82)
  4090: 1.39 (MIN: 1.33 / MAX: 1.94)
  i: 1.40 (MIN: 1.33 / MAX: 2)
  4090 rep: 1.40 (MIN: 1.33 / MAX: 1.93)
  nv 4090: 1.40 (MIN: 1.34 / MAX: 1.87)
  g: 1.41 (MIN: 1.38 / MAX: 2.09)
  4080: 1.41 (MIN: 1.35 / MAX: 2.01)
  4080 rep: 1.41 (MIN: 1.35 / MAX: 1.9)
  4080 xxx: 1.42 (MIN: 1.36 / MAX: 1.93)
  4080 zzz: 1.42 (MIN: 1.36 / MAX: 2.01)
  RTX 3070 Ti: 1.60 (MIN: 1.11 / MAX: 436.01)
  3070: 3.98 (MIN: 1.31 / MAX: 228.4)
SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.16, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: efficientnet-b0
OpenBenchmarking.org ms, Fewer Is Better
  b: 3.82 (MIN: 3.78 / MAX: 4.39)
  c: 3.83 (MIN: 3.79 / MAX: 4.61)
  e: 3.84 (MIN: 3.79 / MAX: 4.76)
  a: 3.86 (MIN: 3.8 / MAX: 4.6)
  f: 3.86 (MIN: 3.78 / MAX: 10.45)
  g: 3.86 (MIN: 3.82 / MAX: 4.22)
  3090 rep: 3.86 (MIN: 3.82 / MAX: 4.34)
  d: 3.87 (MIN: 3.77 / MAX: 9.91)
  3090: 3.88 (MIN: 3.84 / MAX: 4.39)
  4080: 4.01 (MIN: 3.78 / MAX: 5.34)
  4080 xxx: 4.04 (MIN: 3.82 / MAX: 5.33)
  4080 zzz: 4.04 (MIN: 3.8 / MAX: 5.31)
  4080 rep: 4.05 (MIN: 3.83 / MAX: 6.11)
  4090 rep: 4.09 (MIN: 3.87 / MAX: 5.46)
  nv 4090: 4.10 (MIN: 3.86 / MAX: 5.46)
  4090: 4.34 (MIN: 4.14 / MAX: 5.84)
  RTX 3070 Ti: 4.53 (MIN: 3.75 / MAX: 396.62)
  i: 5.88 (MIN: 4.04 / MAX: 364.21)
  3070: 8.99 (MIN: 3.71 / MAX: 129.99)
SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.02, N = 3; SE +/- 0.18, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: mnasnet
OpenBenchmarking.org ms, Fewer Is Better
  b: 2.95 (MIN: 2.92 / MAX: 3.42)
  c: 2.96 (MIN: 2.93 / MAX: 3.41)
  e: 2.96 (MIN: 2.91 / MAX: 5.9)
  d: 2.97 (MIN: 2.92 / MAX: 3.34)
  f: 2.97 (MIN: 2.93 / MAX: 3.66)
  3090 rep: 2.97 (MIN: 2.94 / MAX: 3.28)
  a: 2.98 (MIN: 2.92 / MAX: 4.03)
  g: 2.98 (MIN: 2.94 / MAX: 3.65)
  3090: 2.99 (MIN: 2.96 / MAX: 3.14)
  4080 zzz: 3.06 (MIN: 2.92 / MAX: 3.73)
  4080: 3.07 (MIN: 2.93 / MAX: 4.63)
  4080 xxx: 3.07 (MIN: 2.95 / MAX: 4.19)
  4080 rep: 3.08 (MIN: 2.94 / MAX: 3.67)
  4090 rep: 3.15 (MIN: 3 / MAX: 4.54)
  4090: 3.18 (MIN: 3.05 / MAX: 4.64)
  i: 3.20 (MIN: 3.07 / MAX: 3.86)
  RTX 3070 Ti: 3.40 (MIN: 2.72 / MAX: 432.18)
  nv 4090: 4.70 (MIN: 3 / MAX: 188.08)
  3070: 5.09 (MIN: 2.86 / MAX: 53.75)
SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.16, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: shufflenet-v2
OpenBenchmarking.org ms, Fewer Is Better
  c: 3.32 (MIN: 3.29 / MAX: 4.19)
  b: 3.33 (MIN: 3.3 / MAX: 3.59)
  e: 3.33 (MIN: 3.28 / MAX: 4.14)
  a: 3.35 (MIN: 3.29 / MAX: 3.85)
  d: 3.35 (MIN: 3.3 / MAX: 3.82)
  g: 3.35 (MIN: 3.3 / MAX: 4.02)
  3090 rep: 3.36 (MIN: 3.32 / MAX: 4.06)
  3090: 3.39 (MIN: 3.35 / MAX: 3.69)
  f: 3.40 (MIN: 3.35 / MAX: 5.89)
  4080: 3.43 (MIN: 3.3 / MAX: 4.22)
  4080 rep: 3.44 (MIN: 3.3 / MAX: 5.36)
  4090: 3.45 (MIN: 3.32 / MAX: 3.99)
  4080 zzz: 3.46 (MIN: 3.32 / MAX: 5.24)
  i: 3.49 (MIN: 3.35 / MAX: 4.24)
  4090 rep: 3.49 (MIN: 3.36 / MAX: 4.33)
  4080 xxx: 3.50 (MIN: 3.37 / MAX: 4.85)
  nv 4090: 3.51 (MIN: 3.37 / MAX: 4)
  RTX 3070 Ti: 3.98 (MIN: 3.14 / MAX: 529.82)
  3070: 7.07 (MIN: 3.25 / MAX: 243.32)
SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.20, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
OpenBenchmarking.org ms, Fewer Is Better
  c: 3.13 (MIN: 3.08 / MAX: 3.85)
  f: 3.13 (MIN: 3.07 / MAX: 3.82)
  b: 3.14 (MIN: 3.1 / MAX: 3.73)
  e: 3.14 (MIN: 3.08 / MAX: 4.06)
  d: 3.16 (MIN: 3.09 / MAX: 3.92)
  a: 3.17 (MIN: 3.09 / MAX: 3.78)
  3090 rep: 3.17 (MIN: 3.12 / MAX: 3.64)
  g: 3.18 (MIN: 3.13 / MAX: 3.9)
  3090: 3.18 (MIN: 3.14 / MAX: 3.63)
  4080: 3.26 (MIN: 3.1 / MAX: 4.12)
  4080 xxx: 3.26 (MIN: 3.1 / MAX: 3.87)
  i: 3.29 (MIN: 3.12 / MAX: 3.93)
  4080 rep: 3.29 (MIN: 3.12 / MAX: 4.14)
  4080 zzz: 3.29 (MIN: 3.11 / MAX: 3.98)
  4090: 3.30 (MIN: 3.12 / MAX: 4.82)
  4090 rep: 3.31 (MIN: 3.14 / MAX: 4.92)
  nv 4090: 3.43 (MIN: 3.25 / MAX: 4.81)
  RTX 3070 Ti: 3.56 (MIN: 3.09 / MAX: 345.01)
  3070: 7.81 (MIN: 3.07 / MAX: 154.75)
SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.14, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU - Model: mobilenet
OpenBenchmarking.org ms, Fewer Is Better
  d: 8.02 (MIN: 7.95 / MAX: 9.81)
  c: 8.03 (MIN: 7.98 / MAX: 8.84)
  3090 rep: 8.03 (MIN: 7.96 / MAX: 8.77)
  b: 8.04 (MIN: 7.95 / MAX: 14.33)
  e: 8.04 (MIN: 7.95 / MAX: 9.09)
  a: 8.05 (MIN: 7.95 / MAX: 8.89)
  3090: 8.07 (MIN: 7.99 / MAX: 8.8)
  g: 8.17 (MIN: 8.08 / MAX: 9.37)
  f: 8.27 (MIN: 8.17 / MAX: 9.04)
  4080 xxx: 8.37 (MIN: 7.97 / MAX: 16.09)
  4080: 8.44 (MIN: 7.98 / MAX: 10.55)
  nv 4090: 8.45 (MIN: 8.03 / MAX: 12.61)
  4080 zzz: 8.47 (MIN: 8.04 / MAX: 10.17)
  4080 rep: 8.57 (MIN: 7.98 / MAX: 10)
  RTX 3070 Ti: 9.35 (MIN: 7.49 / MAX: 474.12)
  4090 rep: 10.23 (MIN: 8.13 / MAX: 386.42)
  i: 10.40 (MIN: 7.97 / MAX: 455.46)
  4090: 10.55 (MIN: 8.22 / MAX: 303.1)
  3070: 21.11 (MIN: 7.98 / MAX: 322.43)
SE +/- 0.00, N = 3; SE +/- 0.02, N = 3; SE +/- 0.02, N = 3; SE +/- 0.25, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: FastestDet
OpenBenchmarking.org ms, Fewer Is Better
  g: 2.57 (MIN: 2.53 / MAX: 3.21)
  i: 2.66 (MIN: 2.54 / MAX: 3.41)
  a: 3.62 (MIN: 2.7 / MAX: 4.54)
  RTX 3070 Ti: 3.94 (MIN: 2.43 / MAX: 267.02)
  b: 4.05 (MIN: 4.02 / MAX: 4.35)
  d: 4.08 (MIN: 4.02 / MAX: 4.28)
  3090 rep: 4.08 (MIN: 4.05 / MAX: 4.84)
  c: 4.11 (MIN: 4.08 / MAX: 4.4)
  4080 zzz: 4.16 (MIN: 4 / MAX: 4.69)
  4080 xxx: 4.17 (MIN: 4.05 / MAX: 4.74)
  3090: 4.21 (MIN: 4.19 / MAX: 4.41)
  f: 4.22 (MIN: 4.18 / MAX: 4.97)
  4080 rep: 4.34 (MIN: 4.19 / MAX: 5.77)
  4080: 4.42 (MIN: 4.25 / MAX: 6.71)
  nv 4090: 4.51 (MIN: 4.34 / MAX: 5.96)
  4090 rep: 5.27 (MIN: 4.05 / MAX: 247.02)
  4090: 5.48 (MIN: 2.67 / MAX: 259.34)
  3070: 8.65 (MIN: 3.94 / MAX: 185.21)
SE +/- 0.45, N = 3; SE +/- 0.23, N = 15; SE +/- 0.01, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: blazeface
OpenBenchmarking.org ms, Fewer Is Better
  4090: 1.27 (MIN: 1.21 / MAX: 1.95)
  nv 4090: 1.33 (MIN: 1.27 / MAX: 1.77)
  f: 1.37 (MIN: 1.34 / MAX: 2.11)
  a: 1.38 (MIN: 1.35 / MAX: 2.06)
  b: 1.38 (MIN: 1.35 / MAX: 1.67)
  d: 1.38 (MIN: 1.35 / MAX: 2.05)
  g: 1.38 (MIN: 1.36 / MAX: 1.62)
  3090: 1.38 (MIN: 1.35 / MAX: 2.23)
  3090 rep: 1.38 (MIN: 1.36 / MAX: 1.71)
  c: 1.39 (MIN: 1.36 / MAX: 1.53)
  i: 1.40 (MIN: 1.34 / MAX: 2)
  4080: 1.40 (MIN: 1.34 / MAX: 2.15)
  4080 zzz: 1.40 (MIN: 1.34 / MAX: 2.1)
  4080 rep: 1.42 (MIN: 1.35 / MAX: 1.88)
  4080 xxx: 1.42 (MIN: 1.36 / MAX: 2.02)
  4090 rep: 1.45 (MIN: 1.38 / MAX: 2.96)
  RTX 3070 Ti: 1.60 (MIN: 0.95 / MAX: 433.24)
  3070: 2.98 (MIN: 1.29 / MAX: 144.96)
SE +/- 0.00, N = 3; SE +/- 0.01, N = 3; SE +/- 0.14, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: Vulkan GPU-v3-v3-v3-v3-v3 - Model: mobilenet-v3
OpenBenchmarking.org ms, Fewer Is Better
  nv 4090: 2.61 (MIN: 2.5 / MAX: 3.12)
  4090: 3.12 (MIN: 2.99 / MAX: 5.09)
  f: 3.15 (MIN: 3.1 / MAX: 3.8)
  g: 3.15 (MIN: 3.1 / MAX: 3.87)
  3090 rep: 3.15 (MIN: 3.11 / MAX: 3.83)
  b: 3.16 (MIN: 3.11 / MAX: 3.75)
  c: 3.17 (MIN: 3.11 / MAX: 8.89)
  4080 zzz: 3.20 (MIN: 3.06 / MAX: 3.84)
  i: 3.26 (MIN: 3.12 / MAX: 4.19)
  4080: 3.27 (MIN: 3.12 / MAX: 5.24)
  4080 rep: 3.27 (MIN: 3.14 / MAX: 3.99)
  4080 xxx: 3.33 (MIN: 3.19 / MAX: 4.2)
  RTX 3070 Ti: 3.65 (MIN: 2.87 / MAX: 347.75)
  4090 rep: 4.90 (MIN: 3.17 / MAX: 120.84)
  3070: 8.06 (MIN: 2.96 / MAX: 219.87)
SE +/- 0.18, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3-v3-v3 - Model: mobilenet-v3
OpenBenchmarking.org ms, Fewer Is Better
  f: 3.15 (MIN: 3.11 / MAX: 3.48)
  3090: 3.15 (MIN: 3.11 / MAX: 3.71)
  3090 rep: 3.15 (MIN: 3.11 / MAX: 3.6)
  b: 3.16 (MIN: 3.12 / MAX: 3.69)
  c: 3.16 (MIN: 3.12 / MAX: 3.7)
  g: 3.16 (MIN: 3.11 / MAX: 3.93)
  4090: 3.25 (MIN: 3.11 / MAX: 4.74)
  4080 zzz: 3.27 (MIN: 3.14 / MAX: 4.63)
  4080: 3.28 (MIN: 3.14 / MAX: 3.89)
  i: 3.29 (MIN: 3.15 / MAX: 4.32)
  4080 xxx: 3.31 (MIN: 3.16 / MAX: 5.3)
  nv 4090: 3.35 (MIN: 3.21 / MAX: 5.23)
  4090 rep: 3.53 (MIN: 3.2 / MAX: 40.81)
  RTX 3070 Ti: 3.76 (MIN: 2.89 / MAX: 366.04)
  3070: 5.38 (MIN: 2.74 / MAX: 121.29)
SE +/- 0.22, N = 14
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: vision_transformer
OpenBenchmarking.org ms, Fewer Is Better
  c: 31.77 (MIN: 31.61 / MAX: 35.68)
  3090 rep: 31.80 (MIN: 31.66 / MAX: 32.23)
  b: 31.95 (MIN: 31.79 / MAX: 32.33)
  d: 32.43 (MIN: 31.56 / MAX: 37.69)
  a: 32.49 (MIN: 31.67 / MAX: 40.11)
  g: 32.73 (MIN: 31.44 / MAX: 81.32)
  3090: 33.01 (MIN: 32.88 / MAX: 33.42)
  f: 33.56 (MIN: 32.98 / MAX: 51.93)
  4080 zzz: 34.10 (MIN: 32.65 / MAX: 37.64)
  4080 xxx: 34.27 (MIN: 32.82 / MAX: 39.79)
  4080: 35.07 (MIN: 33.14 / MAX: 43.26)
  4080 rep: 35.28 (MIN: 33.9 / MAX: 38.67)
  4090 rep: 37.05 (MIN: 33.89 / MAX: 407.84)
  i: 37.80 (MIN: 33.74 / MAX: 321.51)
  RTX 3070 Ti: 37.91 (MIN: 32.08 / MAX: 541.11)
  4090: 38.25 (MIN: 33.04 / MAX: 447.7)
  nv 4090: 38.90 (MIN: 34.2 / MAX: 300.84)
  3070: 75.34 (MIN: 38.72 / MAX: 418.01)
SE +/- 0.39, N = 3; SE +/- 0.29, N = 3; SE +/- 0.12, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: regnety_400m
OpenBenchmarking.org ms, Fewer Is Better
  f: 8.08 (MIN: 7.98 / MAX: 10.87)
  4090: 8.13 (MIN: 7.78 / MAX: 9.98)
  a: 8.18 (MIN: 8.07 / MAX: 9.68)
  b: 8.18 (MIN: 8.12 / MAX: 8.86)
  d: 8.23 (MIN: 8.03 / MAX: 8.9)
  4080: 8.24 (MIN: 7.89 / MAX: 9.52)
  3090 rep: 8.24 (MIN: 8.17 / MAX: 8.84)
  3090: 8.25 (MIN: 8.17 / MAX: 8.9)
  c: 8.27 (MIN: 8.22 / MAX: 9.18)
  g: 8.30 (MIN: 8.22 / MAX: 9.1)
  4080 zzz: 8.37 (MIN: 8.05 / MAX: 10.19)
  4080 xxx: 8.45 (MIN: 8.12 / MAX: 9.68)
  4080 rep: 8.67 (MIN: 8.22 / MAX: 15.29)
  4090 rep: 8.78 (MIN: 8.45 / MAX: 10.05)
  RTX 3070 Ti: 8.83 (MIN: 7.65 / MAX: 351.08)
  i: 9.94 (MIN: 7.43 / MAX: 166.02)
  nv 4090: 10.17 (MIN: 8.12 / MAX: 209.53)
  3070: 18.00 (MIN: 7.91 / MAX: 176.28)
SE +/- 0.04, N = 3; SE +/- 0.06, N = 3; SE +/- 0.19, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: squeezenet_ssd
OpenBenchmarking.org ms, Fewer Is Better
  b: 7.06 (MIN: 7.01 / MAX: 7.55)
  a: 7.07 (MIN: 7.01 / MAX: 8.07)
  d: 7.09 (MIN: 6.99 / MAX: 9.39)
  c: 7.10 (MIN: 7.05 / MAX: 7.65)
  3090 rep: 7.12 (MIN: 7.05 / MAX: 7.63)
  g: 7.13 (MIN: 7.04 / MAX: 8.43)
  f: 7.23 (MIN: 7.15 / MAX: 8.02)
  4090 rep: 7.31 (MIN: 6.71 / MAX: 9.3)
  3090: 7.52 (MIN: 7.45 / MAX: 7.74)
  4080 zzz: 7.55 (MIN: 7 / MAX: 8.72)
  4080 xxx: 7.62 (MIN: 7.01 / MAX: 8.84)
  4080: 7.73 (MIN: 7.13 / MAX: 9.7)
  4080 rep: 7.86 (MIN: 7.22 / MAX: 10.84)
  4090: 7.86 (MIN: 7.25 / MAX: 8.98)
  RTX 3070 Ti: 8.13 (MIN: 6.37 / MAX: 399.11)
  i: 8.96 (MIN: 6.92 / MAX: 244.02)
  nv 4090: 9.11 (MIN: 6.77 / MAX: 101.58)
  3070: 13.20 (MIN: 6.9 / MAX: 68.61)
SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.24, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: yolov4-tiny
OpenBenchmarking.org ms, Fewer Is Better
  b: 12.74 (MIN: 12.66 / MAX: 13.28)
  3090 rep: 12.77 (MIN: 12.7 / MAX: 13.02)
  c: 12.81 (MIN: 12.74 / MAX: 13.2)
  a: 12.90 (MIN: 12.69 / MAX: 15.88)
  d: 12.95 (MIN: 12.75 / MAX: 18.88)
  f: 13.32 (MIN: 12.95 / MAX: 35.49)
  4080 xxx: 13.60 (MIN: 12.8 / MAX: 16.23)
  4080 zzz: 13.63 (MIN: 12.77 / MAX: 15.36)
  4090: 13.68 (MIN: 12.83 / MAX: 14.63)
  4080: 13.85 (MIN: 12.84 / MAX: 16.75)
  4080 rep: 14.03 (MIN: 13.15 / MAX: 15.97)
  3090: 14.26 (MIN: 14.17 / MAX: 14.53)
  i: 14.65 (MIN: 12.44 / MAX: 202.68)
  RTX 3070 Ti: 15.20 (MIN: 12.69 / MAX: 431.37)
  4090 rep: 15.38 (MIN: 12.32 / MAX: 188.07)
  nv 4090: 15.40 (MIN: 12.35 / MAX: 321.43)
  g: 17.23 (MIN: 12.99 / MAX: 196.66)
  3070: 27.66 (MIN: 12.74 / MAX: 294.9)
SE +/- 0.11, N = 3; SE +/- 0.05, N = 3; SE +/- 0.18, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: resnet50
OpenBenchmarking.org ms, Fewer Is Better
  3090 rep: 9.95 (MIN: 9.85 / MAX: 10.72)
  d: 10.00 (MIN: 9.86 / MAX: 11.02)
  b: 10.01 (MIN: 9.85 / MAX: 11.06)
  c: 10.11 (MIN: 9.95 / MAX: 16.18)
  a: 10.20 (MIN: 9.84 / MAX: 12.48)
  3090: 10.30 (MIN: 9.82 / MAX: 17.56)
  g: 10.72 (MIN: 10.1 / MAX: 108.3)
  4080 xxx: 10.82 (MIN: 9.9 / MAX: 12.26)
  f: 11.05 (MIN: 10.14 / MAX: 162.88)
  4080 zzz: 11.10 (MIN: 10.2 / MAX: 13.06)
  4080: 11.16 (MIN: 10.29 / MAX: 15.03)
  nv 4090: 11.41 (MIN: 10.57 / MAX: 12.22)
  4080 rep: 11.76 (MIN: 10.68 / MAX: 44.94)
  i: 12.09 (MIN: 11.16 / MAX: 13.48)
  RTX 3070 Ti: 12.73 (MIN: 10.18 / MAX: 541.92)
  4090 rep: 12.73 (MIN: 10.22 / MAX: 181.72)
  4090: 14.10 (MIN: 10.27 / MAX: 287)
  3070: 21.50 (MIN: 10.24 / MAX: 116.85)
SE +/- 0.01, N = 3; SE +/- 0.23, N = 3; SE +/- 0.26, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: alexnet
OpenBenchmarking.org ms, Fewer Is Better
  d: 4.30 (MIN: 4.23 / MAX: 5.32)
  3090: 4.30 (MIN: 4.25 / MAX: 4.83)
  c: 4.31 (MIN: 4.26 / MAX: 4.98)
  3090 rep: 4.31 (MIN: 4.26 / MAX: 5.18)
  b: 4.32 (MIN: 4.26 / MAX: 5.15)
  g: 4.32 (MIN: 4.25 / MAX: 5.17)
  f: 4.36 (MIN: 4.29 / MAX: 5.7)
  a: 4.41 (MIN: 4.24 / MAX: 5.16)
  4080 rep: 4.67 (MIN: 4.27 / MAX: 5.88)
  4080 xxx: 4.68 (MIN: 4.28 / MAX: 6.37)
  4080 zzz: 4.69 (MIN: 4.29 / MAX: 5.78)
  4080: 4.75 (MIN: 4.31 / MAX: 13.88)
  4090: 5.14 (MIN: 4.73 / MAX: 6.32)
  4090 rep: 5.14 (MIN: 4.76 / MAX: 6.26)
  nv 4090: 5.18 (MIN: 4.75 / MAX: 7.12)
  i: 5.30 (MIN: 4.92 / MAX: 7.18)
  RTX 3070 Ti: 5.49 (MIN: 4.26 / MAX: 363.39)
  3070: 9.62 (MIN: 4.31 / MAX: 147.6)
SE +/- 0.01, N = 3; SE +/- 0.11, N = 3; SE +/- 0.21, N = 14
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: resnet18
OpenBenchmarking.org ms, Fewer Is Better
  b: 5.20 (MIN: 5.1 / MAX: 5.9)
  3090: 5.21 (MIN: 5.09 / MAX: 6.04)
  d: 5.23 (MIN: 5.1 / MAX: 6.28)
  c: 5.24 (MIN: 5.15 / MAX: 6.09)
  a: 5.29 (MIN: 5.09 / MAX: 6.29)
  3090 rep: 5.29 (MIN: 5.18 / MAX: 6.19)
  g: 5.55 (MIN: 5.19 / MAX: 25.4)
  4080 xxx: 5.56 (MIN: 5.09 / MAX: 6.84)
  i: 5.60 (MIN: 5.13 / MAX: 6.83)
  4080 zzz: 5.63 (MIN: 5.08 / MAX: 7.55)
  4080 rep: 5.68 (MIN: 5.17 / MAX: 7.45)
  f: 5.69 (MIN: 5.22 / MAX: 92.59)
  4080: 5.69 (MIN: 5.16 / MAX: 7.68)
  4090: 6.00 (MIN: 5.47 / MAX: 7.29)
  RTX 3070 Ti: 6.08 (MIN: 4.97 / MAX: 245.95)
  4090 rep: 6.77 (MIN: 6.16 / MAX: 8.42)
  nv 4090: 7.82 (MIN: 5.54 / MAX: 303.05)
  3070: 14.03 (MIN: 5 / MAX: 303.38)
SE +/- 0.02, N = 3; SE +/- 0.07, N = 3; SE +/- 0.17, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: vgg16
OpenBenchmarking.org ms, Fewer Is Better
  3090: 23.43 (MIN: 23.2 / MAX: 24.1)
  3090 rep: 23.43 (MIN: 23.23 / MAX: 24.39)
  c: 23.45 (MIN: 23.26 / MAX: 24.51)
  b: 23.49 (MIN: 23.36 / MAX: 24.62)
  d: 23.51 (MIN: 23.19 / MAX: 24.68)
  a: 23.75 (MIN: 23.31 / MAX: 25.12)
  g: 23.78 (MIN: 23.52 / MAX: 24.89)
  f: 24.19 (MIN: 23.99 / MAX: 30.98)
  4080 xxx: 25.03 (MIN: 23.85 / MAX: 28.9)
  4080: 25.37 (MIN: 24.26 / MAX: 36.52)
  4080 zzz: 25.40 (MIN: 24.09 / MAX: 32.86)
  4080 rep: 26.11 (MIN: 24.54 / MAX: 30.29)
  4090: 27.75 (MIN: 24.58 / MAX: 282.59)
  4090 rep: 28.19 (MIN: 24.69 / MAX: 205.72)
  RTX 3070 Ti: 28.36 (MIN: 24.13 / MAX: 449.57)
  nv 4090: 29.54 (MIN: 24.77 / MAX: 364.86)
  i: 30.96 (MIN: 25.92 / MAX: 328.63)
  3070: 56.64 (MIN: 25.75 / MAX: 367.74)
SE +/- 0.05, N = 3; SE +/- 0.30, N = 3; SE +/- 0.23, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: googlenet
OpenBenchmarking.org ms, Fewer Is Better
  b: 7.82 (MIN: 7.73 / MAX: 8.65)
  d: 7.85 (MIN: 7.71 / MAX: 8.83)
  3090: 7.86 (MIN: 7.76 / MAX: 8.74)
  3090 rep: 7.90 (MIN: 7.79 / MAX: 8.74)
  f: 7.92 (MIN: 7.8 / MAX: 8.96)
  c: 7.93 (MIN: 7.82 / MAX: 8.91)
  a: 7.94 (MIN: 7.71 / MAX: 8.73)
  g: 7.98 (MIN: 7.86 / MAX: 8.78)
  4080 xxx: 8.38 (MIN: 7.72 / MAX: 10.05)
  4080 zzz: 8.40 (MIN: 7.72 / MAX: 10.5)
  4080: 8.42 (MIN: 7.75 / MAX: 9.96)
  4080 rep: 8.52 (MIN: 7.84 / MAX: 10.21)
  nv 4090: 8.93 (MIN: 8.27 / MAX: 10.68)
  4090 rep: 9.53 (MIN: 8.86 / MAX: 11.44)
  RTX 3070 Ti: 9.69 (MIN: 7.29 / MAX: 407.61)
  4090: 10.27 (MIN: 7.95 / MAX: 115.68)
  i: 10.30 (MIN: 8.19 / MAX: 349.57)
  3070: 18.25 (MIN: 7.5 / MAX: 267.89)
SE +/- 0.01, N = 3; SE +/- 0.11, N = 3; SE +/- 0.22, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: efficientnet-b0
OpenBenchmarking.org ms, Fewer Is Better
  b: 3.85 (MIN: 3.81 / MAX: 4.42)
  d: 3.85 (MIN: 3.81 / MAX: 4.46)
  3090 rep: 3.86 (MIN: 3.81 / MAX: 4.75)
  f: 3.87 (MIN: 3.81 / MAX: 4.97)
  3090: 3.87 (MIN: 3.83 / MAX: 4.69)
  c: 3.88 (MIN: 3.84 / MAX: 4.41)
  a: 3.90 (MIN: 3.82 / MAX: 4.51)
  g: 3.91 (MIN: 3.85 / MAX: 4.64)
  4080: 3.99 (MIN: 3.79 / MAX: 5.83)
  4080 zzz: 3.99 (MIN: 3.8 / MAX: 5.69)
  4090 rep: 4.03 (MIN: 3.86 / MAX: 4.82)
  4080 xxx: 4.04 (MIN: 3.83 / MAX: 5.71)
  i: 4.05 (MIN: 3.78 / MAX: 5.45)
  4080 rep: 4.09 (MIN: 3.86 / MAX: 5.59)
  4090: 4.23 (MIN: 3.98 / MAX: 12.23)
  nv 4090: 4.37 (MIN: 4.15 / MAX: 5.96)
  RTX 3070 Ti: 4.72 (MIN: 3.37 / MAX: 486.93)
  3070: 9.23 (MIN: 3.43 / MAX: 156.19)
SE +/- 0.00, N = 3; SE +/- 0.05, N = 3; SE +/- 0.19, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: mnasnet
OpenBenchmarking.org ms, Fewer Is Better
  i: 2.74 (MIN: 2.62 / MAX: 4.22)
  a: 2.97 (MIN: 2.92 / MAX: 3.48)
  b: 2.97 (MIN: 2.93 / MAX: 3.45)
  f: 2.97 (MIN: 2.93 / MAX: 3.95)
  3090 rep: 2.97 (MIN: 2.93 / MAX: 3.28)
  d: 2.98 (MIN: 2.94 / MAX: 3.83)
  c: 2.99 (MIN: 2.96 / MAX: 3.44)
  3090: 2.99 (MIN: 2.95 / MAX: 3.88)
  g: 3.00 (MIN: 2.96 / MAX: 3.68)
  4080 zzz: 3.04 (MIN: 2.91 / MAX: 4.47)
  4080: 3.05 (MIN: 2.92 / MAX: 3.82)
  4080 rep: 3.06 (MIN: 2.94 / MAX: 4.51)
  4080 xxx: 3.06 (MIN: 2.94 / MAX: 4.45)
  4090: 3.12 (MIN: 2.98 / MAX: 3.79)
  4090 rep: 3.18 (MIN: 3.05 / MAX: 3.8)
  RTX 3070 Ti: 3.26 (MIN: 2.46 / MAX: 277.54)
  nv 4090: 4.77 (MIN: 3.07 / MAX: 97.57)
  3070: 6.88 (MIN: 3.05 / MAX: 110.25)
SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.14, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: shufflenet-v2
OpenBenchmarking.org ms, Fewer Is Better
  a: 3.34 (MIN: 3.3 / MAX: 3.85)
  b: 3.34 (MIN: 3.31 / MAX: 3.77)
  3090: 3.34 (MIN: 3.3 / MAX: 4.19)
  c: 3.35 (MIN: 3.31 / MAX: 3.8)
  d: 3.35 (MIN: 3.3 / MAX: 3.82)
  3090 rep: 3.35 (MIN: 3.31 / MAX: 3.68)
  4080: 3.41 (MIN: 3.28 / MAX: 4.87)
  4080 zzz: 3.42 (MIN: 3.28 / MAX: 4.19)
  4080 rep: 3.43 (MIN: 3.3 / MAX: 4.15)
  4080 xxx: 3.45 (MIN: 3.32 / MAX: 3.85)
  nv 4090: 3.45 (MIN: 3.32 / MAX: 4.91)
  4090 rep: 3.47 (MIN: 3.33 / MAX: 4.93)
  f: 3.55 (MIN: 3.27 / MAX: 22.86)
  4090: 3.55 (MIN: 3.39 / MAX: 5.48)
  g: 3.59 (MIN: 3.3 / MAX: 25.28)
  RTX 3070 Ti: 3.77 (MIN: 3.02 / MAX: 511.95)
  i: 5.03 (MIN: 3.07 / MAX: 228.55)
  3070: 6.82 (MIN: 3.16 / MAX: 64.72)
SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.19, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v2-v2 - Model: mobilenet-v2
OpenBenchmarking.org ms, Fewer Is Better
  f: 3.15 (MIN: 3.1 / MAX: 3.65)
  a: 3.16 (MIN: 3.1 / MAX: 3.8)
  b: 3.16 (MIN: 3.11 / MAX: 3.61)
  g: 3.16 (MIN: 3.11 / MAX: 3.83)
  d: 3.17 (MIN: 3.1 / MAX: 8.86)
  3090: 3.17 (MIN: 3.12 / MAX: 4.05)
  3090 rep: 3.17 (MIN: 3.11 / MAX: 4.94)
  c: 3.18 (MIN: 3.13 / MAX: 3.84)
  4080 zzz: 3.25 (MIN: 3.09 / MAX: 4.51)
  4080: 3.28 (MIN: 3.11 / MAX: 3.88)
  4080 rep: 3.28 (MIN: 3.11 / MAX: 4)
  4080 xxx: 3.28 (MIN: 3.1 / MAX: 4.05)
  4090: 3.30 (MIN: 3.11 / MAX: 4.81)
  4090 rep: 3.30 (MIN: 3.13 / MAX: 3.97)
  i: 3.52 (MIN: 3.29 / MAX: 19.18)
  nv 4090: 3.60 (MIN: 3.43 / MAX: 4.62)
  RTX 3070 Ti: 3.76 (MIN: 2.6 / MAX: 364.73)
  3070: 9.67 (MIN: 3.19 / MAX: 225.84)
SE +/- 0.00, N = 3; SE +/- 0.02, N = 3; SE +/- 0.17, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU - Model: mobilenet
OpenBenchmarking.org ms, Fewer Is Better
  b: 7.97 (MIN: 7.94 / MAX: 8.26)
  3090 rep: 8.01 (MIN: 7.96 / MAX: 9.85)
  c: 8.02 (MIN: 7.98 / MAX: 8.33)
  a: 8.05 (MIN: 7.97 / MAX: 9.07)
  d: 8.10 (MIN: 7.94 / MAX: 14.4)
  nv 4090: 8.15 (MIN: 7.73 / MAX: 9.34)
  i: 8.37 (MIN: 8.15 / MAX: 9.75)
  4080 xxx: 8.37 (MIN: 7.96 / MAX: 9.72)
  4080 zzz: 8.38 (MIN: 7.94 / MAX: 10.16)
  4080 rep: 8.41 (MIN: 8.14 / MAX: 11.03)
  4090 rep: 8.43 (MIN: 8.04 / MAX: 18.04)
  f: 8.45 (MIN: 8.37 / MAX: 9.44)
  3090: 8.60 (MIN: 8.5 / MAX: 13.72)
  4080: 8.73 (MIN: 8.15 / MAX: 10.96)
  RTX 3070 Ti: 9.43 (MIN: 7.95 / MAX: 398.1)
  4090: 10.08 (MIN: 8.1 / MAX: 118.32)
  3070: 17.81 (MIN: 8.05 / MAX: 159.41)
  g: 22.74 (MIN: 8.24 / MAX: 1264.67)
SE +/- 0.03, N = 3; SE +/- 0.06, N = 3; SE +/- 0.22, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3 - Model: FastestDet
OpenBenchmarking.org ms, Fewer Is Better
  4090: 2.93 (MIN: 2.84 / MAX: 3.38)
  f: 3.85 (MIN: 3.8 / MAX: 4.65)
  g: 3.97 (MIN: 3.92 / MAX: 4.75)
  4080 zzz: 4.04 (MIN: 3.89 / MAX: 5.01)
  3090: 4.04 (MIN: 4.01 / MAX: 4.15)
  nv 4090: 4.06 (MIN: 3.91 / MAX: 5.78)
  b: 4.07 (MIN: 4.03 / MAX: 5.83)
  c: 4.08 (MIN: 4.05 / MAX: 4.36)
  3090 rep: 4.11 (MIN: 4.07 / MAX: 4.21)
  4090 rep: 4.16 (MIN: 4 / MAX: 5.58)
  4080 rep: 4.18 (MIN: 4.03 / MAX: 5.07)
  4080 xxx: 4.19 (MIN: 4.04 / MAX: 5.47)
  4080: 4.20 (MIN: 4.06 / MAX: 4.86)
  RTX 3070 Ti: 4.33 (MIN: 2.59 / MAX: 433.58)
  i: 5.69 (MIN: 3.69 / MAX: 261.71)
  3070: 9.18 (MIN: 3.64 / MAX: 122.65)
SE +/- 0.27, N = 14
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3 - Model: vision_transformer
OpenBenchmarking.org ms, Fewer Is Better
  b: 31.65 (MIN: 31.53 / MAX: 32.23)
  c: 31.78 (MIN: 31.64 / MAX: 34.51)
  3090: 31.89 (MIN: 31.66 / MAX: 39.97)
  3090 rep: 31.93 (MIN: 31.76 / MAX: 33.09)
  g: 33.39 (MIN: 32.73 / MAX: 88.83)
  f: 33.47 (MIN: 32.89 / MAX: 74.09)
  4080: 34.20 (MIN: 32.92 / MAX: 36.19)
  4080 rep: 34.27 (MIN: 33.07 / MAX: 37.01)
  4080 xxx: 34.37 (MIN: 33.01 / MAX: 38.7)
  4080 zzz: 34.47 (MIN: 33.32 / MAX: 37.42)
  i: 36.55 (MIN: 33 / MAX: 209.38)
  RTX 3070 Ti: 37.86 (MIN: 32.9 / MAX: 463.9)
  4090: 38.79 (MIN: 33.95 / MAX: 457.41)
  nv 4090: 38.99 (MIN: 34.17 / MAX: 473.06)
  4090 rep: 39.12 (MIN: 33.92 / MAX: 465.83)
  3070: 81.77 (MIN: 44.4 / MAX: 460.28)
SE +/- 0.18, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 Target: CPU-v3-v3-v3 - Model: regnety_400m
OpenBenchmarking.org ms, Fewer Is Better
  3090: 7.95 (MIN: 7.88 / MAX: 8.67)
  b: 8.05 (MIN: 8 / MAX: 8.58)
  g: 8.07 (MIN: 7.97 / MAX: 8.81)
  3090 rep: 8.09 (MIN: 7.99 / MAX: 14.25)
  4090: 8.13 (MIN: 7.75 / MAX: 10.05)
  c: 8.14 (MIN: 8.08 / MAX: 8.69)
  i: 8.21 (MIN: 7.9 / MAX: 9.99)
  4080: 8.33 (MIN: 8.02 / MAX: 9.64)
  f: 8.34 (MIN: 8.26 / MAX: 9.3)
  4080 zzz: 8.47 (MIN: 8.13 / MAX: 10.27)
  4080 rep: 8.57 (MIN: 8.21 / MAX: 10.39)
  4080 xxx: 8.75 (MIN: 8.35 / MAX: 10.08)
  RTX 3070 Ti: 9.02 (MIN: 7.69 / MAX: 501.76)
  nv 4090: 9.55 (MIN: 7.5 / MAX: 193.79)
  4090 rep: 17.15 (MIN: 8.02 / MAX: 773.45)
  3070: 19.66 (MIN: 7.5 / MAX: 235.36)
SE +/- 0.21, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: squeezenet_ssd
OpenBenchmarking.org, ms, Fewer Is Better
f: 6.97 (MIN: 6.83 / MAX: 13.87)
b: 7.03 (MIN: 6.97 / MAX: 7.88)
c: 7.04 (MIN: 6.96 / MAX: 7.83)
3090: 7.04 (MIN: 6.96 / MAX: 7.74)
3090 rep: 7.09 (MIN: 7.02 / MAX: 7.99)
g: 7.26 (MIN: 7.14 / MAX: 8.59)
4080 zzz: 7.35 (MIN: 6.79 / MAX: 9.82)
4080 rep: 7.59 (MIN: 7.02 / MAX: 8.87)
4080: 7.64 (MIN: 7.05 / MAX: 9.9)
4080 xxx: 7.64 (MIN: 7.03 / MAX: 9.19)
i: 8.16 (MIN: 7.51 / MAX: 9.94)
RTX 3070 Ti: 8.39 (MIN: 6.53 / MAX: 436.05)
4090 rep: 9.30 (MIN: 6.92 / MAX: 310.91)
4090: 9.32 (MIN: 7.1 / MAX: 172.56)
nv 4090: 9.37 (MIN: 7.07 / MAX: 281.92)
3070: 17.75 (MIN: 6.47 / MAX: 272.11)
SE +/- 0.24, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: yolov4-tiny
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 12.82 (MIN: 12.72 / MAX: 13.48)
3090: 12.87 (MIN: 12.75 / MAX: 13.58)
c: 12.89 (MIN: 12.84 / MAX: 13.19)
b: 12.98 (MIN: 12.73 / MAX: 35.55)
f: 13.07 (MIN: 12.95 / MAX: 14.55)
g: 13.08 (MIN: 12.96 / MAX: 13.83)
4080 rep: 13.55 (MIN: 12.75 / MAX: 14.74)
4080 xxx: 13.69 (MIN: 12.73 / MAX: 15.68)
4080: 13.81 (MIN: 12.84 / MAX: 15.1)
4080 zzz: 13.83 (MIN: 12.89 / MAX: 15.4)
i: 15.11 (MIN: 12.93 / MAX: 151.45)
4090: 15.30 (MIN: 12.87 / MAX: 144.73)
4090 rep: 15.34 (MIN: 12.94 / MAX: 157.95)
RTX 3070 Ti: 15.42 (MIN: 12.21 / MAX: 414.81)
nv 4090: 15.62 (MIN: 12.99 / MAX: 184)
3070: 29.34 (MIN: 12.17 / MAX: 245.34)
SE +/- 0.14, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: resnet50
OpenBenchmarking.org, ms, Fewer Is Better
b: 10.01 (MIN: 9.89 / MAX: 10.86)
3090: 10.05 (MIN: 9.85 / MAX: 12.64)
3090 rep: 10.07 (MIN: 9.94 / MAX: 11.06)
c: 10.33 (MIN: 10.16 / MAX: 13.97)
4080 rep: 10.79 (MIN: 9.91 / MAX: 12.75)
4080 xxx: 10.91 (MIN: 9.91 / MAX: 13.1)
f: 11.05 (MIN: 10.46 / MAX: 112.6)
4080: 11.11 (MIN: 10.19 / MAX: 13.03)
4080 zzz: 11.21 (MIN: 10.3 / MAX: 13.25)
g: 11.25 (MIN: 10.55 / MAX: 118.12)
4090: 11.39 (MIN: 10.48 / MAX: 13.29)
RTX 3070 Ti: 12.11 (MIN: 10.16 / MAX: 382.56)
nv 4090: 13.68 (MIN: 10.25 / MAX: 566.67)
4090 rep: 13.82 (MIN: 10.34 / MAX: 245.6)
i: 14.05 (MIN: 11.69 / MAX: 252.21)
3070: 24.07 (MIN: 10.02 / MAX: 218.35)
SE +/- 0.22, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: alexnet
OpenBenchmarking.org, ms, Fewer Is Better
c: 4.28 (MIN: 4.24 / MAX: 5.12)
b: 4.29 (MIN: 4.24 / MAX: 5.64)
3090 rep: 4.30 (MIN: 4.24 / MAX: 4.99)
3090: 4.31 (MIN: 4.25 / MAX: 5.13)
4080 rep: 4.65 (MIN: 4.26 / MAX: 6.13)
4080 xxx: 4.65 (MIN: 4.28 / MAX: 6.42)
4080: 4.66 (MIN: 4.29 / MAX: 6.1)
4080 zzz: 4.67 (MIN: 4.28 / MAX: 6.29)
nv 4090: 4.67 (MIN: 4.28 / MAX: 5.7)
g: 4.71 (MIN: 4.65 / MAX: 5.57)
f: 4.83 (MIN: 4.76 / MAX: 5.74)
4090: 4.94 (MIN: 4.51 / MAX: 6.64)
i: 5.01 (MIN: 4.6 / MAX: 6.68)
4090 rep: 5.27 (MIN: 4.78 / MAX: 7.7)
RTX 3070 Ti: 5.55 (MIN: 4.2 / MAX: 281.58)
3070: 11.00 (MIN: 4.33 / MAX: 199.92)
SE +/- 0.23, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: resnet18
OpenBenchmarking.org, ms, Fewer Is Better
3090: 5.19 (MIN: 5.09 / MAX: 6.13)
3090 rep: 5.20 (MIN: 5.1 / MAX: 5.97)
b: 5.21 (MIN: 5.12 / MAX: 6.22)
c: 5.26 (MIN: 5.18 / MAX: 6.27)
g: 5.48 (MIN: 5.37 / MAX: 6.51)
4080 rep: 5.63 (MIN: 5.09 / MAX: 7.75)
4080 xxx: 5.66 (MIN: 5.14 / MAX: 7.49)
4080: 5.70 (MIN: 5.15 / MAX: 7.9)
4080 zzz: 5.71 (MIN: 5.12 / MAX: 8.19)
4090: 5.81 (MIN: 5.27 / MAX: 7.16)
i: 5.86 (MIN: 5.35 / MAX: 7.79)
f: 6.13 (MIN: 5.41 / MAX: 151.51)
RTX 3070 Ti: 6.18 (MIN: 5.17 / MAX: 262.79)
nv 4090: 7.61 (MIN: 5.23 / MAX: 90.18)
4090 rep: 7.75 (MIN: 5.57 / MAX: 125.43)
3070: 11.14 (MIN: 4.79 / MAX: 65.12)
SE +/- 0.19, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: vgg16
OpenBenchmarking.org, ms, Fewer Is Better
b: 23.50 (MIN: 23.3 / MAX: 24.41)
3090: 23.50 (MIN: 23.17 / MAX: 24.44)
3090 rep: 23.54 (MIN: 23.33 / MAX: 24.41)
c: 23.99 (MIN: 23.72 / MAX: 24.98)
f: 24.45 (MIN: 24.26 / MAX: 25.26)
4080 rep: 24.91 (MIN: 23.8 / MAX: 26.87)
g: 24.92 (MIN: 24.58 / MAX: 31.89)
4080 xxx: 25.00 (MIN: 23.91 / MAX: 27.99)
4080: 25.10 (MIN: 24.12 / MAX: 27.57)
4080 zzz: 25.45 (MIN: 24.22 / MAX: 27.73)
nv 4090: 27.04 (MIN: 24.33 / MAX: 215.56)
4090 rep: 27.25 (MIN: 24.14 / MAX: 379.93)
RTX 3070 Ti: 28.40 (MIN: 24.12 / MAX: 509.06)
4090: 28.55 (MIN: 24.05 / MAX: 201.8)
i: 29.12 (MIN: 26.33 / MAX: 310.23)
3070: 49.75 (MIN: 25.45 / MAX: 273.86)
SE +/- 0.24, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: googlenet
OpenBenchmarking.org, ms, Fewer Is Better
3090: 7.83 (MIN: 7.71 / MAX: 8.8)
b: 7.84 (MIN: 7.74 / MAX: 8.7)
3090 rep: 7.85 (MIN: 7.75 / MAX: 8.69)
c: 7.88 (MIN: 7.79 / MAX: 8.78)
f: 8.07 (MIN: 7.92 / MAX: 8.86)
4080: 8.45 (MIN: 7.79 / MAX: 10.32)
4080 xxx: 8.50 (MIN: 7.79 / MAX: 9.94)
4080 rep: 8.52 (MIN: 7.81 / MAX: 10.78)
4080 zzz: 8.55 (MIN: 7.85 / MAX: 10.35)
nv 4090: 9.02 (MIN: 8.41 / MAX: 11.08)
g: 9.15 (MIN: 7.84 / MAX: 198.46)
4090 rep: 9.29 (MIN: 7.98 / MAX: 83.03)
RTX 3070 Ti: 9.65 (MIN: 7.59 / MAX: 472.81)
4090: 9.97 (MIN: 7.67 / MAX: 258.52)
i: 10.17 (MIN: 7.94 / MAX: 150.01)
3070: 17.00 (MIN: 7.35 / MAX: 277.79)
SE +/- 0.24, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: blazeface
OpenBenchmarking.org, ms, Fewer Is Better
nv 4090: 1.16 (MIN: 1.11 / MAX: 1.67)
4090: 1.17 (MIN: 1.11 / MAX: 1.9)
i: 1.28 (MIN: 1.23 / MAX: 1.73)
4090 rep: 1.34 (MIN: 1.27 / MAX: 1.95)
3090: 1.36 (MIN: 1.34 / MAX: 1.46)
b: 1.37 (MIN: 1.35 / MAX: 1.52)
g: 1.37 (MIN: 1.34 / MAX: 2.07)
3090 rep: 1.37 (MIN: 1.35 / MAX: 1.46)
c: 1.38 (MIN: 1.36 / MAX: 1.58)
4080 zzz: 1.41 (MIN: 1.34 / MAX: 1.91)
4080 rep: 1.42 (MIN: 1.36 / MAX: 2.2)
4080 xxx: 1.42 (MIN: 1.36 / MAX: 1.92)
f: 1.43 (MIN: 1.4 / MAX: 1.77)
4080: 1.44 (MIN: 1.37 / MAX: 3.45)
RTX 3070 Ti: 1.79 (MIN: 1.13 / MAX: 312.12)
3070: 3.03 (MIN: 1.28 / MAX: 96.94)
SE +/- 0.19, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: efficientnet-b0
OpenBenchmarking.org, ms, Fewer Is Better
b: 3.82 (MIN: 3.79 / MAX: 4.34)
3090: 3.83 (MIN: 3.78 / MAX: 4.41)
3090 rep: 3.85 (MIN: 3.81 / MAX: 4.53)
c: 3.89 (MIN: 3.83 / MAX: 9.72)
4080: 4.02 (MIN: 3.82 / MAX: 5.66)
4080 rep: 4.02 (MIN: 3.82 / MAX: 5.39)
4080 zzz: 4.03 (MIN: 3.82 / MAX: 5.43)
f: 4.04 (MIN: 3.99 / MAX: 4.82)
nv 4090: 4.04 (MIN: 3.78 / MAX: 4.9)
4080 xxx: 4.06 (MIN: 3.83 / MAX: 5.55)
4090: 4.09 (MIN: 3.86 / MAX: 4.83)
g: 4.14 (MIN: 4.09 / MAX: 5.13)
4090 rep: 4.34 (MIN: 4.16 / MAX: 5.28)
i: 4.68 (MIN: 4.48 / MAX: 6.02)
RTX 3070 Ti: 4.78 (MIN: 3.82 / MAX: 411.19)
3070: 9.01 (MIN: 3.98 / MAX: 188.57)
SE +/- 0.22, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: mnasnet
OpenBenchmarking.org, ms, Fewer Is Better
3090: 2.95 (MIN: 2.92 / MAX: 3.29)
3090 rep: 2.96 (MIN: 2.94 / MAX: 3.38)
b: 2.97 (MIN: 2.94 / MAX: 3.43)
c: 2.97 (MIN: 2.94 / MAX: 3.43)
g: 3.05 (MIN: 3.01 / MAX: 3.88)
4080 zzz: 3.06 (MIN: 2.93 / MAX: 3.64)
4080 xxx: 3.07 (MIN: 2.94 / MAX: 3.6)
4080: 3.08 (MIN: 2.94 / MAX: 4.52)
4080 rep: 3.09 (MIN: 2.95 / MAX: 4.52)
RTX 3070 Ti: 3.11 (MIN: 2.8 / MAX: 4.98)
f: 3.12 (MIN: 3.08 / MAX: 3.86)
4090: 3.19 (MIN: 3.06 / MAX: 3.75)
4090 rep: 3.23 (MIN: 3.1 / MAX: 3.75)
i: 3.39 (MIN: 3.26 / MAX: 4.86)
nv 4090: 4.61 (MIN: 2.78 / MAX: 222.99)
3070: 6.87 (MIN: 2.93 / MAX: 216.41)
SE +/- 0.04, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: shufflenet-v2
OpenBenchmarking.org, ms, Fewer Is Better
3090: 3.32 (MIN: 3.28 / MAX: 3.66)
b: 3.33 (MIN: 3.3 / MAX: 3.79)
3090 rep: 3.33 (MIN: 3.3 / MAX: 3.67)
c: 3.34 (MIN: 3.32 / MAX: 3.79)
g: 3.38 (MIN: 3.34 / MAX: 4.15)
f: 3.40 (MIN: 3.35 / MAX: 4.17)
4080 rep: 3.43 (MIN: 3.3 / MAX: 4.03)
4080 zzz: 3.43 (MIN: 3.31 / MAX: 3.94)
4080: 3.46 (MIN: 3.34 / MAX: 3.93)
nv 4090: 3.46 (MIN: 3.32 / MAX: 5.2)
4080 xxx: 3.47 (MIN: 3.33 / MAX: 5.01)
i: 3.52 (MIN: 3.39 / MAX: 4.05)
4090 rep: 3.59 (MIN: 3.46 / MAX: 4.09)
RTX 3070 Ti: 4.09 (MIN: 3.12 / MAX: 435.28)
4090: 5.18 (MIN: 3.34 / MAX: 283.54)
3070: 8.13 (MIN: 3.09 / MAX: 147.21)
SE +/- 0.21, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v2-v2 - Model: mobilenet-v2
OpenBenchmarking.org, ms, Fewer Is Better
c: 3.14 (MIN: 3.1 / MAX: 3.67)
3090: 3.14 (MIN: 3.08 / MAX: 3.7)
b: 3.15 (MIN: 3.1 / MAX: 3.68)
g: 3.15 (MIN: 3.1 / MAX: 3.63)
f: 3.16 (MIN: 3.09 / MAX: 3.89)
3090 rep: 3.17 (MIN: 3.11 / MAX: 4.5)
4080 rep: 3.27 (MIN: 3.1 / MAX: 4.34)
4080 zzz: 3.28 (MIN: 3.1 / MAX: 4)
i: 3.30 (MIN: 3.14 / MAX: 4.82)
4080 xxx: 3.30 (MIN: 3.12 / MAX: 4.03)
4080: 3.31 (MIN: 3.12 / MAX: 4.76)
nv 4090: 3.39 (MIN: 3.21 / MAX: 4.24)
RTX 3070 Ti: 3.41 (MIN: 2.99 / MAX: 184.91)
4090 rep: 3.45 (MIN: 3.23 / MAX: 4.55)
4090: 3.46 (MIN: 3.29 / MAX: 4.38)
3070: 9.19 (MIN: 3.04 / MAX: 232.12)
SE +/- 0.10, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3 - Model: mobilenet
OpenBenchmarking.org, ms, Fewer Is Better
c: 8.00 (MIN: 7.96 / MAX: 8.63)
b: 8.01 (MIN: 7.95 / MAX: 8.95)
3090 rep: 8.03 (MIN: 7.98 / MAX: 8.77)
3090: 8.06 (MIN: 7.94 / MAX: 13.92)
4080 zzz: 8.40 (MIN: 8.12 / MAX: 10.11)
4080: 8.43 (MIN: 7.99 / MAX: 10.44)
4080 xxx: 8.44 (MIN: 7.97 / MAX: 10.71)
4080 rep: 8.48 (MIN: 7.96 / MAX: 10.32)
f: 8.56 (MIN: 8.04 / MAX: 75.44)
4090 rep: 8.74 (MIN: 8.25 / MAX: 10.5)
nv 4090: 8.91 (MIN: 8.33 / MAX: 10.07)
4090: 8.96 (MIN: 8.39 / MAX: 10.77)
g: 8.98 (MIN: 8.1 / MAX: 124.43)
RTX 3070 Ti: 9.62 (MIN: 7.76 / MAX: 454.91)
i: 10.08 (MIN: 8.08 / MAX: 286.28)
3070: 17.82 (MIN: 7.57 / MAX: 211.62)
SE +/- 0.23, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: regnety_400m
OpenBenchmarking.org, ms, Fewer Is Better
nv 4090: 7.73 (MIN: 7.43 / MAX: 9.41)
c: 7.98 (MIN: 7.93 / MAX: 8.65)
g: 7.99 (MIN: 7.91 / MAX: 8.8)
i: 7.99 (MIN: 7.62 / MAX: 9.27)
3090: 8.01 (MIN: 7.93 / MAX: 8.35)
4080 zzz: 8.10 (MIN: 7.77 / MAX: 15.42)
3090 rep: 8.25 (MIN: 8.12 / MAX: 14)
b: 8.27 (MIN: 8.22 / MAX: 9.01)
4080 rep: 8.44 (MIN: 8.04 / MAX: 10.17)
4080: 8.45 (MIN: 8.05 / MAX: 10.3)
f: 8.50 (MIN: 8.04 / MAX: 30.12)
4080 xxx: 8.58 (MIN: 8.23 / MAX: 10.39)
RTX 3070 Ti: 8.89 (MIN: 7.74 / MAX: 476.28)
4090: 9.60 (MIN: 7.66 / MAX: 210.23)
4090 rep: 10.34 (MIN: 8.21 / MAX: 214.16)
3070: 17.23 (MIN: 7.8 / MAX: 193.14)
SE +/- 0.25, N = 14
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: FastestDet
OpenBenchmarking.org, ms, Fewer Is Better
c: 3.69 (MIN: 3.66 / MAX: 3.92)
4090: 3.94 (MIN: 3.8 / MAX: 5.41)
g: 3.97 (MIN: 3.93 / MAX: 4.73)
3090: 4.04 (MIN: 4 / MAX: 4.15)
b: 4.06 (MIN: 4.03 / MAX: 4.3)
4080 rep: 4.09 (MIN: 3.92 / MAX: 5.5)
3090 rep: 4.10 (MIN: 4.06 / MAX: 4.21)
4080 zzz: 4.12 (MIN: 3.97 / MAX: 6.99)
4090 rep: 4.16 (MIN: 4.03 / MAX: 4.73)
f: 4.20 (MIN: 4.15 / MAX: 4.92)
4080: 4.20 (MIN: 4.04 / MAX: 5.82)
RTX 3070 Ti: 4.26 (MIN: 2.71 / MAX: 347.03)
4080 xxx: 4.31 (MIN: 4.14 / MAX: 6.11)
i: 4.43 (MIN: 4.28 / MAX: 5.01)
nv 4090: 5.92 (MIN: 4.25 / MAX: 103.26)
3070: 8.63 (MIN: 4.27 / MAX: 144.3)
SE +/- 0.15, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: vision_transformer
OpenBenchmarking.org, ms, Fewer Is Better
c: 31.66 (MIN: 31.52 / MAX: 32.14)
b: 31.71 (MIN: 31.56 / MAX: 33.03)
3090: 31.86 (MIN: 31.58 / MAX: 35.84)
3090 rep: 32.11 (MIN: 31.94 / MAX: 33.01)
g: 32.38 (MIN: 32.04 / MAX: 51.55)
f: 33.36 (MIN: 32.83 / MAX: 76.21)
4080 zzz: 34.05 (MIN: 32.83 / MAX: 38.57)
4080: 34.13 (MIN: 32.98 / MAX: 36.11)
4080 rep: 34.29 (MIN: 33.11 / MAX: 40.12)
4080 xxx: 35.40 (MIN: 33.93 / MAX: 39.3)
RTX 3070 Ti: 38.29 (MIN: 32.31 / MAX: 557.38)
i: 38.33 (MIN: 34.14 / MAX: 246.43)
nv 4090: 38.58 (MIN: 33.77 / MAX: 476.18)
4090 rep: 38.73 (MIN: 33.81 / MAX: 362.17)
4090: 39.01 (MIN: 33.91 / MAX: 411.66)
3070: 73.51 (MIN: 39.27 / MAX: 288.2)
SE +/- 0.13, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: squeezenet_ssd
OpenBenchmarking.org, ms, Fewer Is Better
3090: 7.04 (MIN: 6.97 / MAX: 7.76)
c: 7.07 (MIN: 7.01 / MAX: 7.75)
3090 rep: 7.08 (MIN: 7.01 / MAX: 7.93)
f: 7.09 (MIN: 6.98 / MAX: 8.01)
b: 7.14 (MIN: 7.06 / MAX: 7.95)
g: 7.19 (MIN: 6.99 / MAX: 23.11)
4080 zzz: 7.51 (MIN: 6.94 / MAX: 9.51)
4080 rep: 7.63 (MIN: 7.02 / MAX: 9.71)
4080: 7.66 (MIN: 7.02 / MAX: 9.08)
4080 xxx: 7.70 (MIN: 7.11 / MAX: 9.19)
nv 4090: 7.72 (MIN: 7.12 / MAX: 23.25)
4090 rep: 7.81 (MIN: 7.24 / MAX: 9.04)
4090: 7.93 (MIN: 7.31 / MAX: 9.45)
RTX 3070 Ti: 8.29 (MIN: 6.37 / MAX: 448.22)
i: 8.33 (MIN: 6.32 / MAX: 222.03)
3070: 16.15 (MIN: 7.25 / MAX: 210.69)
SE +/- 0.19, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: yolov4-tiny
OpenBenchmarking.org, ms, Fewer Is Better
b: 12.77 (MIN: 12.69 / MAX: 13.71)
3090 rep: 12.84 (MIN: 12.76 / MAX: 13.7)
c: 12.86 (MIN: 12.76 / MAX: 13.98)
3090: 12.88 (MIN: 12.75 / MAX: 13.79)
g: 12.89 (MIN: 12.65 / MAX: 27.99)
4080 zzz: 13.62 (MIN: 12.75 / MAX: 15.79)
4080 rep: 13.68 (MIN: 12.77 / MAX: 15.57)
4080: 13.79 (MIN: 12.79 / MAX: 15.92)
4080 xxx: 13.95 (MIN: 13.03 / MAX: 15.9)
f: 14.34 (MIN: 14.23 / MAX: 15.12)
i: 15.43 (MIN: 13.1 / MAX: 210.2)
RTX 3070 Ti: 15.44 (MIN: 12.61 / MAX: 387.62)
4090: 15.44 (MIN: 12.92 / MAX: 211.43)
4090 rep: 16.39 (MIN: 12.97 / MAX: 369.64)
nv 4090: 16.61 (MIN: 12.32 / MAX: 375.99)
3070: 29.49 (MIN: 13.03 / MAX: 182.99)
SE +/- 0.23, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: resnet50
OpenBenchmarking.org, ms, Fewer Is Better
b: 9.87 (MIN: 9.79 / MAX: 10.73)
3090: 9.97 (MIN: 9.86 / MAX: 10.84)
3090 rep: 10.01 (MIN: 9.91 / MAX: 10.74)
c: 10.03 (MIN: 9.93 / MAX: 10.96)
g: 10.18 (MIN: 10.01 / MAX: 11.25)
f: 10.25 (MIN: 10.05 / MAX: 11.08)
4080 rep: 10.84 (MIN: 9.93 / MAX: 12.83)
4080: 10.95 (MIN: 9.91 / MAX: 17.11)
4080 zzz: 11.09 (MIN: 10.18 / MAX: 13.12)
i: 11.15 (MIN: 10.31 / MAX: 12.97)
4080 xxx: 11.50 (MIN: 10.5 / MAX: 13.47)
RTX 3070 Ti: 12.35 (MIN: 9.83 / MAX: 424.28)
nv 4090: 13.13 (MIN: 10.56 / MAX: 323.44)
4090 rep: 13.57 (MIN: 10.45 / MAX: 199.55)
4090: 14.58 (MIN: 10.67 / MAX: 324.82)
3070: 23.11 (MIN: 10.22 / MAX: 140.41)
SE +/- 0.27, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: alexnet
OpenBenchmarking.org, ms, Fewer Is Better
c: 4.30 (MIN: 4.26 / MAX: 5.16)
3090: 4.30 (MIN: 4.25 / MAX: 5.08)
3090 rep: 4.30 (MIN: 4.25 / MAX: 4.7)
f: 4.35 (MIN: 4.27 / MAX: 5.16)
g: 4.35 (MIN: 4.26 / MAX: 5.85)
b: 4.42 (MIN: 4.32 / MAX: 5.1)
4080 zzz: 4.65 (MIN: 4.26 / MAX: 5.97)
4080: 4.69 (MIN: 4.26 / MAX: 6.15)
4080 rep: 4.69 (MIN: 4.26 / MAX: 7.17)
i: 4.99 (MIN: 4.59 / MAX: 6.56)
4080 xxx: 5.21 (MIN: 4.79 / MAX: 6.66)
RTX 3070 Ti: 5.67 (MIN: 4.21 / MAX: 365.75)
4090: 6.11 (MIN: 4.73 / MAX: 81.72)
nv 4090: 6.11 (MIN: 4.83 / MAX: 124.76)
4090 rep: 6.58 (MIN: 4.61 / MAX: 91.07)
3070: 9.86 (MIN: 4.25 / MAX: 157.02)
SE +/- 0.23, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: resnet18
OpenBenchmarking.org, ms, Fewer Is Better
c: 5.23 (MIN: 5.11 / MAX: 6.03)
3090: 5.23 (MIN: 5.1 / MAX: 6.07)
3090 rep: 5.24 (MIN: 5.14 / MAX: 5.99)
f: 5.30 (MIN: 5.17 / MAX: 5.93)
g: 5.30 (MIN: 5.19 / MAX: 6.24)
b: 5.42 (MIN: 5.36 / MAX: 6.27)
4080 zzz: 5.59 (MIN: 5.09 / MAX: 7.7)
4080: 5.65 (MIN: 5.14 / MAX: 6.93)
4080 rep: 5.69 (MIN: 5.11 / MAX: 6.94)
4090 rep: 5.81 (MIN: 5.3 / MAX: 6.82)
nv 4090: 5.84 (MIN: 5.35 / MAX: 7.72)
i: 5.85 (MIN: 5.3 / MAX: 8.27)
4080 xxx: 5.89 (MIN: 5.36 / MAX: 7.53)
4090: 5.97 (MIN: 5.46 / MAX: 7.02)
RTX 3070 Ti: 6.23 (MIN: 4.99 / MAX: 309.18)
3070: 13.38 (MIN: 5.43 / MAX: 208.42)
SE +/- 0.16, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: vgg16
OpenBenchmarking.org, ms, Fewer Is Better
b: 23.42 (MIN: 23.27 / MAX: 24.32)
3090 rep: 23.43 (MIN: 23.26 / MAX: 24.3)
3090: 23.50 (MIN: 23.23 / MAX: 24.26)
c: 23.54 (MIN: 23.32 / MAX: 24.54)
g: 23.82 (MIN: 23.62 / MAX: 24.63)
f: 24.12 (MIN: 23.57 / MAX: 46.44)
4080: 25.04 (MIN: 23.87 / MAX: 28.04)
4080 rep: 25.04 (MIN: 23.81 / MAX: 27.15)
4080 zzz: 25.26 (MIN: 24.14 / MAX: 27.73)
4080 xxx: 26.08 (MIN: 24.52 / MAX: 27.73)
4090 rep: 27.59 (MIN: 24.34 / MAX: 396.09)
nv 4090: 27.77 (MIN: 24.82 / MAX: 264.66)
4090: 28.21 (MIN: 24.57 / MAX: 270.76)
RTX 3070 Ti: 28.40 (MIN: 23.98 / MAX: 456)
i: 29.07 (MIN: 24.45 / MAX: 263.33)
3070: 55.42 (MIN: 25.32 / MAX: 281.46)
SE +/- 0.27, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: googlenet
OpenBenchmarking.org, ms, Fewer Is Better
3090: 7.82 (MIN: 7.69 / MAX: 8.6)
c: 7.83 (MIN: 7.74 / MAX: 8.61)
3090 rep: 7.85 (MIN: 7.75 / MAX: 8.64)
f: 7.94 (MIN: 7.8 / MAX: 8.78)
g: 7.94 (MIN: 7.79 / MAX: 9.59)
b: 7.97 (MIN: 7.89 / MAX: 8.7)
4080 zzz: 8.37 (MIN: 7.76 / MAX: 10.31)
4080: 8.49 (MIN: 7.82 / MAX: 11.98)
4080 rep: 8.58 (MIN: 7.79 / MAX: 10.48)
4090: 8.87 (MIN: 8.18 / MAX: 11.09)
4090 rep: 8.90 (MIN: 8.22 / MAX: 11.07)
4080 xxx: 8.99 (MIN: 8.25 / MAX: 10.27)
RTX 3070 Ti: 9.86 (MIN: 7.54 / MAX: 396.21)
nv 4090: 10.01 (MIN: 7.29 / MAX: 259.11)
i: 10.19 (MIN: 7.73 / MAX: 212.36)
3070: 20.72 (MIN: 7.49 / MAX: 355.33)
SE +/- 0.21, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: blazeface
OpenBenchmarking.org, ms, Fewer Is Better
nv 4090: 1.07 (MIN: 1.02 / MAX: 1.52)
i: 1.25 (MIN: 1.19 / MAX: 2.61)
4090: 1.33 (MIN: 1.27 / MAX: 1.98)
c: 1.36 (MIN: 1.34 / MAX: 1.44)
3090: 1.36 (MIN: 1.34 / MAX: 1.61)
b: 1.37 (MIN: 1.35 / MAX: 1.39)
f: 1.37 (MIN: 1.35 / MAX: 1.62)
g: 1.37 (MIN: 1.34 / MAX: 1.7)
3090 rep: 1.38 (MIN: 1.36 / MAX: 1.9)
4080 zzz: 1.39 (MIN: 1.34 / MAX: 1.89)
4090 rep: 1.41 (MIN: 1.35 / MAX: 1.89)
4080: 1.42 (MIN: 1.35 / MAX: 2.15)
4080 xxx: 1.42 (MIN: 1.36 / MAX: 1.92)
4080 rep: 1.45 (MIN: 1.36 / MAX: 8.73)
RTX 3070 Ti: 1.71 (MIN: 1.09 / MAX: 448.17)
3070: 3.18 (MIN: 1.31 / MAX: 185.03)
SE +/- 0.18, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: efficientnet-b0
OpenBenchmarking.org, ms, Fewer Is Better
c: 3.82 (MIN: 3.78 / MAX: 4.53)
b: 3.85 (MIN: 3.82 / MAX: 4.48)
f: 3.85 (MIN: 3.8 / MAX: 4.6)
g: 3.85 (MIN: 3.8 / MAX: 4.62)
3090: 3.85 (MIN: 3.8 / MAX: 4.43)
3090 rep: 3.85 (MIN: 3.81 / MAX: 4.62)
4080 zzz: 3.95 (MIN: 3.76 / MAX: 4.84)
4080 rep: 4.04 (MIN: 3.81 / MAX: 5.08)
4080: 4.05 (MIN: 3.83 / MAX: 5)
4090: 4.18 (MIN: 4 / MAX: 5.25)
i: 4.21 (MIN: 3.96 / MAX: 4.94)
4080 xxx: 4.22 (MIN: 4 / MAX: 5.58)
RTX 3070 Ti: 4.73 (MIN: 3.79 / MAX: 418.72)
nv 4090: 5.26 (MIN: 3.48 / MAX: 250.88)
4090 rep: 6.28 (MIN: 3.91 / MAX: 337.73)
3070: 7.81 (MIN: 3.73 / MAX: 159.47)
SE +/- 0.18, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: mnasnet
OpenBenchmarking.org, ms, Fewer Is Better
nv 4090: 2.54 (MIN: 2.44 / MAX: 3.58)
g: 2.95 (MIN: 2.91 / MAX: 3.64)
b: 2.96 (MIN: 2.93 / MAX: 3.4)
c: 2.96 (MIN: 2.93 / MAX: 3.41)
f: 2.96 (MIN: 2.92 / MAX: 3.81)
3090: 2.97 (MIN: 2.92 / MAX: 3.28)
3090 rep: 2.97 (MIN: 2.94 / MAX: 3.39)
i: 2.99 (MIN: 2.86 / MAX: 4.38)
4090: 3.00 (MIN: 2.89 / MAX: 3.46)
4080 zzz: 3.01 (MIN: 2.91 / MAX: 3.6)
4080 rep: 3.07 (MIN: 2.94 / MAX: 3.72)
4080: 3.09 (MIN: 2.94 / MAX: 3.79)
4090 rep: 3.10 (MIN: 2.97 / MAX: 3.72)
4080 xxx: 3.13 (MIN: 3 / MAX: 5.1)
RTX 3070 Ti: 3.37 (MIN: 2.86 / MAX: 278.87)
3070: 6.02 (MIN: 2.79 / MAX: 50.49)
SE +/- 0.16, N = 14
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: shufflenet-v2
OpenBenchmarking.org, ms, Fewer Is Better
nv 4090: 3.17 (MIN: 3.04 / MAX: 3.78)
b: 3.33 (MIN: 3.3 / MAX: 3.77)
c: 3.33 (MIN: 3.31 / MAX: 3.81)
f: 3.33 (MIN: 3.29 / MAX: 3.99)
g: 3.33 (MIN: 3.29 / MAX: 4.1)
3090: 3.34 (MIN: 3.31 / MAX: 3.6)
4090: 3.34 (MIN: 3.23 / MAX: 4.78)
i: 3.36 (MIN: 3.25 / MAX: 4.02)
3090 rep: 3.36 (MIN: 3.32 / MAX: 3.7)
4080 zzz: 3.37 (MIN: 3.25 / MAX: 3.95)
4090 rep: 3.40 (MIN: 3.26 / MAX: 4.84)
4080: 3.44 (MIN: 3.31 / MAX: 4.88)
4080 rep: 3.44 (MIN: 3.32 / MAX: 4.16)
4080 xxx: 3.51 (MIN: 3.37 / MAX: 4.26)
RTX 3070 Ti: 3.95 (MIN: 3.19 / MAX: 410.41)
3070: 4.89 (MIN: 3.04 / MAX: 18.32)
SE +/- 0.22, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3-v2-v2 - Model: mobilenet-v2
OpenBenchmarking.org, ms, Fewer Is Better
g: 3.14 (MIN: 3.09 / MAX: 3.61)
b: 3.15 (MIN: 3.11 / MAX: 3.88)
c: 3.15 (MIN: 3.11 / MAX: 3.85)
f: 3.16 (MIN: 3.1 / MAX: 3.71)
3090: 3.16 (MIN: 3.11 / MAX: 3.51)
3090 rep: 3.17 (MIN: 3.12 / MAX: 4.03)
4080 zzz: 3.23 (MIN: 3.06 / MAX: 4.66)
i: 3.28 (MIN: 3.09 / MAX: 5.28)
4080: 3.29 (MIN: 3.12 / MAX: 4.64)
4080 rep: 3.30 (MIN: 3.12 / MAX: 4.7)
4090 rep: 3.31 (MIN: 3.12 / MAX: 4.6)
4080 xxx: 3.40 (MIN: 3.23 / MAX: 4.8)
RTX 3070 Ti: 3.66 (MIN: 3.01 / MAX: 437.59)
nv 4090: 4.45 (MIN: 2.65 / MAX: 216.76)
4090: 4.99 (MIN: 3.1 / MAX: 201.8)
3070: 7.24 (MIN: 3.04 / MAX: 261.68)
SE +/- 0.18, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3-v3 - Model: mobilenet
OpenBenchmarking.org, ms, Fewer Is Better
c: 7.95 (MIN: 7.89 / MAX: 8.79)
b: 8.00 (MIN: 7.95 / MAX: 8.99)
3090: 8.00 (MIN: 7.94 / MAX: 8.78)
3090 rep: 8.01 (MIN: 7.95 / MAX: 8.35)
g: 8.04 (MIN: 7.93 / MAX: 8.86)
4080 zzz: 8.38 (MIN: 7.95 / MAX: 10.41)
4080: 8.43 (MIN: 7.99 / MAX: 10.66)
4080 rep: 8.46 (MIN: 7.99 / MAX: 10.62)
f: 8.65 (MIN: 8.55 / MAX: 9.53)
4090: 8.81 (MIN: 8.32 / MAX: 10.7)
4080 xxx: 8.88 (MIN: 8.31 / MAX: 10.01)
i: 9.05 (MIN: 8.48 / MAX: 11.28)
4090 rep: 9.54 (MIN: 8.94 / MAX: 10.54)
RTX 3070 Ti: 9.62 (MIN: 7.76 / MAX: 502.83)
nv 4090: 10.54 (MIN: 8.41 / MAX: 134.08)
3070: 16.34 (MIN: 8.13 / MAX: 80.69)
SE +/- 0.26, N = 15
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in single precision
OpenBenchmarking.org, Benchmark Score, More Is Better
4090 rep: 153939
4090: 153896
nv 4090: 152170
3090 rep: 141437
3090: 141357
4080: 104556
4080 zzz: 104543
4080 xxx: 104528
4080 rep: 104491
i: 69738
f: 56476
g: 56455
h: 56431
c: 47971
b: 47948
a: 47887
e: 42651
d: 42645
SE +/- 25.50, N = 3
SE +/- 9.54, N = 3
SE +/- 1.67, N = 3
SE +/- 2.73, N = 3
1. (CXX) g++ options: -O3
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: FastestDet
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 4.07 (MIN: 4.04 / MAX: 4.25)
RTX 3070 Ti: 4.14 (MIN: 3.73 / MAX: 5.07)
4090 rep: 4.59 (MIN: 4.44 / MAX: 5.2)
4090: 4.62 (MIN: 4.48 / MAX: 5.16)
nv 4090: 5.86 (MIN: 3.9 / MAX: 190.17)
3070: 7.23 (MIN: 3.75 / MAX: 121.71)
SE +/- 0.15, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: vision_transformer
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 31.94 (MIN: 31.73 / MAX: 32.75)
nv 4090: 37.13 (MIN: 33.97 / MAX: 443.1)
RTX 3070 Ti: 38.50 (MIN: 33.7 / MAX: 418.06)
4090 rep: 38.65 (MIN: 33.07 / MAX: 476.08)
4090: 39.35 (MIN: 34.22 / MAX: 466.65)
3070: 70.53 (MIN: 39.2 / MAX: 276.33)
SE +/- 0.10, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: regnety_400m
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 8.06 (MIN: 7.98 / MAX: 8.6)
nv 4090: 8.34 (MIN: 8.01 / MAX: 12.36)
4090 rep: 8.70 (MIN: 8.29 / MAX: 12.6)
RTX 3070 Ti: 9.14 (MIN: 8.14 / MAX: 400.02)
4090: 10.11 (MIN: 8.03 / MAX: 259.38)
3070: 17.02 (MIN: 7.65 / MAX: 216.63)
SE +/- 0.54, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: squeezenet_ssd
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 7.07 (MIN: 6.98 / MAX: 9.71)
4090: 7.43 (MIN: 6.84 / MAX: 8.82)
RTX 3070 Ti: 7.45 (MIN: 6.59 / MAX: 9.11)
nv 4090: 8.26 (MIN: 7.64 / MAX: 11.08)
4090 rep: 9.44 (MIN: 7.17 / MAX: 94.63)
3070: 15.32 (MIN: 6.66 / MAX: 139.17)
SE +/- 0.14, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: yolov4-tiny
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 12.92 (MIN: 12.79 / MAX: 18.5)
RTX 3070 Ti: 14.64 (MIN: 12.77 / MAX: 383.28)
4090 rep: 15.41 (MIN: 12.75 / MAX: 226.87)
4090: 16.05 (MIN: 12.93 / MAX: 474.03)
nv 4090: 16.30 (MIN: 14.11 / MAX: 184.46)
3070: 29.38 (MIN: 12.95 / MAX: 201.31)
SE +/- 0.94, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: resnet50
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 10.27 (MIN: 10.12 / MAX: 11.19)
4090 rep: 10.96 (MIN: 10.09 / MAX: 12.99)
4090: 13.00 (MIN: 10.34 / MAX: 397.57)
RTX 3070 Ti: 13.15 (MIN: 10.26 / MAX: 349.93)
nv 4090: 13.25 (MIN: 10.61 / MAX: 154.12)
3070: 22.15 (MIN: 10.11 / MAX: 123.04)
SE +/- 0.30, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: alexnet
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 4.31 (MIN: 4.26 / MAX: 4.83)
4090: 5.14 (MIN: 4.75 / MAX: 7.34)
4090 rep: 5.34 (MIN: 4.87 / MAX: 6.57)
RTX 3070 Ti: 6.25 (MIN: 4.27 / MAX: 334.55)
nv 4090: 6.32 (MIN: 4.26 / MAX: 195.95)
3070: 11.43 (MIN: 4.24 / MAX: 178.83)
SE +/- 0.57, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: resnet18
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 5.27 (MIN: 5.15 / MAX: 6.11)
nv 4090: 5.58 (MIN: 5.09 / MAX: 6.98)
4090 rep: 5.87 (MIN: 5.41 / MAX: 7.58)
RTX 3070 Ti: 5.94 (MIN: 5.32 / MAX: 8.32)
4090: 6.96 (MIN: 5.3 / MAX: 242.18)
3070: 12.13 (MIN: 5.32 / MAX: 123.4)
SE +/- 0.05, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: vgg16
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 23.72 (MIN: 23.56 / MAX: 24.59)
nv 4090: 27.25 (MIN: 24.12 / MAX: 252.53)
4090: 27.32 (MIN: 24.36 / MAX: 262.38)
RTX 3070 Ti: 27.86 (MIN: 24.17 / MAX: 416.36)
4090 rep: 29.85 (MIN: 24.25 / MAX: 400.86)
3070: 53.48 (MIN: 25.52 / MAX: 296.52)
SE +/- 0.28, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: googlenet
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 7.86 (MIN: 7.75 / MAX: 8.57)
4090: 8.55 (MIN: 7.85 / MAX: 11.39)
RTX 3070 Ti: 9.97 (MIN: 8.16 / MAX: 381.49)
nv 4090: 10.14 (MIN: 7.85 / MAX: 257.61)
4090 rep: 10.47 (MIN: 7.86 / MAX: 191.94)
3070: 18.80 (MIN: 7.78 / MAX: 141.46)
SE +/- 0.55, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: blazeface
OpenBenchmarking.org, ms, Fewer Is Better
4090: 1.35 (MIN: 1.28 / MAX: 1.84)
3090 rep: 1.38 (MIN: 1.36 / MAX: 1.73)
RTX 3070 Ti: 1.40 (MIN: 1.28 / MAX: 1.91)
nv 4090: 1.40 (MIN: 1.34 / MAX: 1.86)
4090 rep: 1.42 (MIN: 1.36 / MAX: 2.03)
3070: 2.69 (MIN: 1.35 / MAX: 48.81)
SE +/- 0.04, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: efficientnet-b0
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 3.84 (MIN: 3.8 / MAX: 4.67)
4090 rep: 4.10 (MIN: 3.88 / MAX: 5.04)
RTX 3070 Ti: 4.17 (MIN: 3.86 / MAX: 5.52)
4090: 4.63 (MIN: 4.38 / MAX: 6.01)
nv 4090: 5.82 (MIN: 3.98 / MAX: 197.79)
3070: 6.63 (MIN: 3.75 / MAX: 22.34)
SE +/- 0.08, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mnasnet
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 2.96 (MIN: 2.92 / MAX: 3.27)
nv 4090: 3.10 (MIN: 2.97 / MAX: 3.73)
RTX 3070 Ti: 3.12 (MIN: 2.97 / MAX: 4.65)
4090 rep: 3.12 (MIN: 3 / MAX: 4.1)
4090: 3.23 (MIN: 3.08 / MAX: 4.73)
3070: 8.55 (MIN: 2.99 / MAX: 185.5)
SE +/- 0.02, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: shufflenet-v2
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 3.32 (MIN: 3.29 / MAX: 3.62)
nv 4090: 3.43 (MIN: 3.29 / MAX: 5.31)
RTX 3070 Ti: 3.48 (MIN: 3.33 / MAX: 5.22)
4090: 3.56 (MIN: 3.43 / MAX: 4.24)
4090 rep: 5.23 (MIN: 3.34 / MAX: 185.57)
3070: 5.89 (MIN: 3.19 / MAX: 97.88)
SE +/- 0.02, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet-v3
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 3.19 (MIN: 3.13 / MAX: 3.61)
RTX 3070 Ti: 3.24 (MIN: 3.05 / MAX: 5.14)
4090 rep: 3.35 (MIN: 3.22 / MAX: 3.99)
4090: 3.36 (MIN: 3.22 / MAX: 4.62)
nv 4090: 4.96 (MIN: 3.14 / MAX: 189.43)
3070: 7.34 (MIN: 3.09 / MAX: 155.33)
SE +/- 0.04, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v2-v2 - Model: mobilenet-v2
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 3.15 (MIN: 3.1 / MAX: 3.75)
nv 4090: 3.29 (MIN: 3.12 / MAX: 4.27)
4090 rep: 3.38 (MIN: 3.2 / MAX: 4)
RTX 3070 Ti: 3.83 (MIN: 3.11 / MAX: 343.21)
4090: 4.75 (MIN: 2.93 / MAX: 147.66)
3070: 5.92 (MIN: 3.16 / MAX: 103.24)
SE +/- 0.53, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet
OpenBenchmarking.org, ms, Fewer Is Better
3090 rep: 8.03 (MIN: 7.97 / MAX: 8.91)
4090 rep: 8.22 (MIN: 7.75 / MAX: 9.41)
RTX 3070 Ti: 10.02 (MIN: 7.8 / MAX: 372.36)
4090: 10.56 (MIN: 8.32 / MAX: 239.95)
nv 4090: 10.64 (MIN: 8.4 / MAX: 127.99)
3070: 17.06 (MIN: 8 / MAX: 101.45)
SE +/- 0.13, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v3-v3 - Model: mobilenet-v3
OpenBenchmarking.org, ms, Fewer Is Better
4080 zzz: 3.06 (MIN: 2.94 / MAX: 3.94)
4080 xxx: 3.08 (MIN: 2.97 / MAX: 3.67)
3090 rep: 3.17 (MIN: 3.12 / MAX: 3.75)
4080 rep: 3.28 (MIN: 3.13 / MAX: 4.78)
4090 rep: 3.30 (MIN: 3.15 / MAX: 3.91)
4090: 3.33 (MIN: 3.2 / MAX: 4.4)
RTX 3070 Ti: 3.70 (MIN: 2.98 / MAX: 261.6)
nv 4090: 4.97 (MIN: 3.15 / MAX: 291.01)
3070: 6.43 (MIN: 2.85 / MAX: 164.91)
SE +/- 0.53, N = 3
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
  4090 rep: 155936
  nv 4090: 155148
  4090: 152656
  3090: 143969
  3090 rep: 143956
  4080: 106210
  4080 rep: 106205
  4080 xxx: 106099
  4080 zzz: 105926
  i: 71163
  f: 57110
  g: 57094
  b: 50643
  c: 50596
  a: 50504
  e: 43365
  d: 43365
  SE +/- 8.89, N = 3; SE +/- 2.08, N = 3; SE +/- 2.33, N = 3
  1. (CXX) g++ options: -O3
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: FastestDet (ms, Fewer Is Better)
  4090: 2.85 (MIN: 2.74 / MAX: 4.36)
  4090 rep: 3.12 (MIN: 2.97 / MAX: 4.42)
  4080 xxx: 3.80 (MIN: 3.65 / MAX: 6.08)
  4080 zzz: 3.82 (MIN: 3.65 / MAX: 9.77)
  nv 4090: 3.93 (MIN: 3.76 / MAX: 11.77)
  3090 rep: 4.07 (MIN: 4.03 / MAX: 4.2)
  3090: 4.10 (MIN: 4.07 / MAX: 4.34)
  RTX 3070 Ti: 4.18 (MIN: 2.53 / MAX: 295.11)
  4080 rep: 4.20 (MIN: 4.04 / MAX: 5.63)
  3070: 7.12 (MIN: 3.72 / MAX: 188.7)
  SE +/- 0.87, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: vision_transformer (ms, Fewer Is Better)
  3090 rep: 31.85 (MIN: 31.67 / MAX: 35.74)
  3090: 32.10 (MIN: 31.9 / MAX: 33.03)
  4080 xxx: 34.14 (MIN: 32.5 / MAX: 37.13)
  4080 rep: 34.22 (MIN: 33.01 / MAX: 37.09)
  4080 zzz: 34.47 (MIN: 33.05 / MAX: 39.69)
  RTX 3070 Ti: 38.04 (MIN: 33.11 / MAX: 346.94)
  4090: 38.76 (MIN: 33.38 / MAX: 423.24)
  4090 rep: 38.79 (MIN: 34.02 / MAX: 460.15)
  nv 4090: 38.95 (MIN: 34.04 / MAX: 486.96)
  3070: 65.41 (MIN: 39.08 / MAX: 230.59)
  SE +/- 0.11, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: regnety_400m (ms, Fewer Is Better)
  3090 rep: 8.03 (MIN: 7.97 / MAX: 8.65)
  3090: 8.22 (MIN: 8.14 / MAX: 8.67)
  nv 4090: 8.25 (MIN: 7.87 / MAX: 10.07)
  4080 zzz: 8.34 (MIN: 8.03 / MAX: 10.23)
  4080 xxx: 8.38 (MIN: 8.04 / MAX: 9.63)
  RTX 3070 Ti: 8.42 (MIN: 7.66 / MAX: 10.74)
  4080 rep: 8.72 (MIN: 8.32 / MAX: 10.48)
  4090: 10.09 (MIN: 8.01 / MAX: 418.58)
  4090 rep: 10.23 (MIN: 8.22 / MAX: 197.1)
  3070: 18.25 (MIN: 7.8 / MAX: 238.29)
  SE +/- 0.29, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: squeezenet_ssd (ms, Fewer Is Better)
  3090 rep: 7.09 (MIN: 7.02 / MAX: 7.86)
  3090: 7.12 (MIN: 7.04 / MAX: 7.97)
  4080 zzz: 7.25 (MIN: 6.72 / MAX: 8.05)
  4080 xxx: 7.27 (MIN: 6.73 / MAX: 8.77)
  nv 4090: 7.48 (MIN: 6.85 / MAX: 9.67)
  RTX 3070 Ti: 7.57 (MIN: 6.69 / MAX: 10)
  4080 rep: 7.67 (MIN: 7.06 / MAX: 9.96)
  4090 rep: 9.38 (MIN: 6.77 / MAX: 224.11)
  4090: 9.51 (MIN: 7.11 / MAX: 307.17)
  3070: 14.27 (MIN: 7.01 / MAX: 51.13)
  SE +/- 0.23, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: yolov4-tiny (ms, Fewer Is Better)
  3090 rep: 12.81 (MIN: 12.7 / MAX: 13.69)
  3090: 12.82 (MIN: 12.72 / MAX: 13.66)
  4080 zzz: 13.42 (MIN: 12.65 / MAX: 16.19)
  4080 xxx: 13.63 (MIN: 12.77 / MAX: 16.93)
  4080 rep: 13.71 (MIN: 12.78 / MAX: 15.62)
  4090 rep: 13.88 (MIN: 13.09 / MAX: 14.77)
  RTX 3070 Ti: 14.57 (MIN: 12.33 / MAX: 312.42)
  4090: 15.85 (MIN: 13.26 / MAX: 253.23)
  nv 4090: 17.67 (MIN: 14.92 / MAX: 343.93)
  3070: 28.41 (MIN: 12.49 / MAX: 151.04)
  SE +/- 0.81, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: resnet50 (ms, Fewer Is Better)
  3090: 10.03 (MIN: 9.93 / MAX: 10.87)
  3090 rep: 10.06 (MIN: 9.86 / MAX: 11.9)
  4080 rep: 10.86 (MIN: 9.98 / MAX: 12.46)
  4080 zzz: 11.10 (MIN: 10.19 / MAX: 18.3)
  4080 xxx: 11.26 (MIN: 10.32 / MAX: 13.29)
  4090: 11.72 (MIN: 10.8 / MAX: 12.8)
  4090 rep: 12.47 (MIN: 11.5 / MAX: 14.68)
  RTX 3070 Ti: 12.81 (MIN: 10.06 / MAX: 349.03)
  nv 4090: 13.29 (MIN: 10.54 / MAX: 456.82)
  3070: 22.19 (MIN: 10.16 / MAX: 181.74)
  SE +/- 0.04, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: alexnet (ms, Fewer Is Better)
  3090 rep: 4.31 (MIN: 4.26 / MAX: 5.07)
  3090: 4.33 (MIN: 4.26 / MAX: 5.19)
  4080 zzz: 4.66 (MIN: 4.24 / MAX: 5.97)
  4080 rep: 4.68 (MIN: 4.27 / MAX: 6.08)
  4080 xxx: 4.69 (MIN: 4.26 / MAX: 6.07)
  4090: 4.99 (MIN: 4.56 / MAX: 6.91)
  4090 rep: 5.45 (MIN: 4.93 / MAX: 7.98)
  RTX 3070 Ti: 6.17 (MIN: 4.5 / MAX: 261.75)
  nv 4090: 6.62 (MIN: 4.28 / MAX: 339.62)
  3070: 10.59 (MIN: 4.3 / MAX: 177.68)
  SE +/- 0.43, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: resnet18 (ms, Fewer Is Better)
  3090: 5.20 (MIN: 5.1 / MAX: 6.16)
  3090 rep: 5.30 (MIN: 5.21 / MAX: 6.24)
  4080 rep: 5.64 (MIN: 5.11 / MAX: 7.51)
  4080 zzz: 5.77 (MIN: 5.22 / MAX: 7.06)
  4080 xxx: 5.78 (MIN: 5.21 / MAX: 6.97)
  nv 4090: 6.07 (MIN: 5.49 / MAX: 15.12)
  RTX 3070 Ti: 6.22 (MIN: 5.3 / MAX: 8.22)
  4090: 7.74 (MIN: 5.25 / MAX: 312.09)
  4090 rep: 8.14 (MIN: 5.39 / MAX: 122.47)
  3070: 12.64 (MIN: 5.3 / MAX: 53.81)
  SE +/- 0.30, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: vgg16 (ms, Fewer Is Better)
  3090 rep: 23.40 (MIN: 23.2 / MAX: 24.07)
  3090: 23.58 (MIN: 23.35 / MAX: 24.43)
  4080 rep: 25.01 (MIN: 23.88 / MAX: 26.66)
  4080 zzz: 25.26 (MIN: 24.29 / MAX: 27.75)
  4080 xxx: 25.44 (MIN: 24.27 / MAX: 27.68)
  nv 4090: 27.61 (MIN: 24.67 / MAX: 401.29)
  RTX 3070 Ti: 27.98 (MIN: 24.35 / MAX: 423.63)
  4090: 30.16 (MIN: 24.66 / MAX: 332.49)
  4090 rep: 30.74 (MIN: 25.36 / MAX: 428.68)
  3070: 50.32 (MIN: 25.92 / MAX: 281.06)
  SE +/- 0.54, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: googlenet (ms, Fewer Is Better)
  3090: 7.90 (MIN: 7.8 / MAX: 8.73)
  3090 rep: 7.91 (MIN: 7.81 / MAX: 8.62)
  4080 zzz: 8.29 (MIN: 7.63 / MAX: 9.87)
  4080 xxx: 8.32 (MIN: 7.71 / MAX: 10.39)
  4090: 8.38 (MIN: 7.78 / MAX: 10.43)
  4080 rep: 8.52 (MIN: 7.85 / MAX: 10.56)
  4090 rep: 8.97 (MIN: 8.22 / MAX: 10.51)
  RTX 3070 Ti: 9.68 (MIN: 8.16 / MAX: 382.41)
  nv 4090: 10.75 (MIN: 7.92 / MAX: 447.83)
  3070: 19.20 (MIN: 7.84 / MAX: 193.36)
  SE +/- 0.80, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: blazeface (ms, Fewer Is Better)
  4090: 1.30 (MIN: 1.24 / MAX: 1.92)
  4080 zzz: 1.31 (MIN: 1.25 / MAX: 1.76)
  4080 xxx: 1.32 (MIN: 1.26 / MAX: 2.03)
  3090 rep: 1.38 (MIN: 1.35 / MAX: 1.64)
  3090: 1.39 (MIN: 1.37 / MAX: 1.48)
  4090 rep: 1.42 (MIN: 1.34 / MAX: 2.37)
  nv 4090: 1.42 (MIN: 1.34 / MAX: 1.99)
  4080 rep: 1.43 (MIN: 1.36 / MAX: 2.02)
  RTX 3070 Ti: 2.48 (MIN: 1.17 / MAX: 344.52)
  3070: 2.53 (MIN: 1.08 / MAX: 118.73)
  SE +/- 0.48, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: efficientnet-b0 (ms, Fewer Is Better)
  3090 rep: 3.85 (MIN: 3.81 / MAX: 4.75)
  3090: 3.88 (MIN: 3.83 / MAX: 4.61)
  4080 zzz: 3.95 (MIN: 3.79 / MAX: 4.59)
  4080 xxx: 4.01 (MIN: 3.83 / MAX: 5.28)
  4080 rep: 4.07 (MIN: 3.85 / MAX: 4.79)
  4090 rep: 4.35 (MIN: 4.08 / MAX: 5.62)
  4090: 4.47 (MIN: 4.23 / MAX: 5.82)
  RTX 3070 Ti: 4.74 (MIN: 3.68 / MAX: 295.7)
  nv 4090: 5.94 (MIN: 3.97 / MAX: 208.59)
  3070: 9.19 (MIN: 3.85 / MAX: 131.42)
  SE +/- 0.46, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: mnasnet (ms, Fewer Is Better)
  4080 zzz: 2.96 (MIN: 2.85 / MAX: 3.82)
  3090 rep: 2.97 (MIN: 2.93 / MAX: 3.3)
  3090: 2.98 (MIN: 2.95 / MAX: 3.9)
  4080 xxx: 3.00 (MIN: 2.88 / MAX: 4.37)
  4080 rep: 3.09 (MIN: 2.96 / MAX: 4.98)
  nv 4090: 3.12 (MIN: 2.98 / MAX: 3.71)
  RTX 3070 Ti: 3.24 (MIN: 2.9 / MAX: 5.34)
  4090 rep: 5.11 (MIN: 2.96 / MAX: 247.47)
  4090: 5.19 (MIN: 3.04 / MAX: 436.91)
  3070: 6.07 (MIN: 2.94 / MAX: 129.1)
  SE +/- 0.13, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: shufflenet-v2 (ms, Fewer Is Better)
  nv 4090: 3.32 (MIN: 3.19 / MAX: 4.76)
  3090 rep: 3.33 (MIN: 3.3 / MAX: 3.78)
  4080 zzz: 3.36 (MIN: 3.23 / MAX: 3.99)
  3090: 3.36 (MIN: 3.32 / MAX: 3.66)
  4080 xxx: 3.40 (MIN: 3.28 / MAX: 3.87)
  4090 rep: 3.42 (MIN: 3.29 / MAX: 3.94)
  4080 rep: 3.47 (MIN: 3.33 / MAX: 5.39)
  4090: 3.48 (MIN: 3.35 / MAX: 4.05)
  RTX 3070 Ti: 4.02 (MIN: 3.27 / MAX: 328.59)
  3070: 7.81 (MIN: 3.3 / MAX: 131.26)
  SE +/- 0.60, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  4080 zzz: 3.16 (MIN: 3.01 / MAX: 5.17)
  3090: 3.16 (MIN: 3.11 / MAX: 3.95)
  3090 rep: 3.16 (MIN: 3.09 / MAX: 4.06)
  4080 xxx: 3.20 (MIN: 3.05 / MAX: 4.67)
  nv 4090: 3.29 (MIN: 3.13 / MAX: 4.29)
  4080 rep: 3.30 (MIN: 3.11 / MAX: 4.01)
  4090: 3.36 (MIN: 3.21 / MAX: 4.78)
  4090 rep: 3.44 (MIN: 3.27 / MAX: 4.93)
  RTX 3070 Ti: 3.91 (MIN: 3.04 / MAX: 394.66)
  3070: 7.22 (MIN: 3.17 / MAX: 69.66)
  SE +/- 0.53, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3-v3-v3-v3-v3 - Model: mobilenet (ms, Fewer Is Better)
  3090 rep: 8.06 (MIN: 8 / MAX: 8.96)
  3090: 8.07 (MIN: 8.01 / MAX: 8.62)
  4080 zzz: 8.25 (MIN: 7.78 / MAX: 9.61)
  4080 xxx: 8.34 (MIN: 7.89 / MAX: 9.42)
  4080 rep: 8.45 (MIN: 8.01 / MAX: 10.86)
  4090: 9.04 (MIN: 8.49 / MAX: 10.96)
  RTX 3070 Ti: 10.03 (MIN: 7.86 / MAX: 346.64)
  4090 rep: 10.61 (MIN: 8.34 / MAX: 225.97)
  nv 4090: 12.12 (MIN: 9.16 / MAX: 505.01)
  3070: 16.52 (MIN: 7.9 / MAX: 82.53)
  SE +/- 0.14, N = 3
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
  nv 4090: 292768
  4090: 290342
  4090 rep: 287651
  3090 rep: 265171
  3090: 255207
  4080: 211076
  4080 rep: 211058
  4080 zzz: 210991
  4080 xxx: 210713
  i: 132270
  h: 104298
  g: 104171
  f: 104146
  b: 91812
  c: 91744
  a: 91597
  e: 85191
  d: 85181
  SE +/- 133.47, N = 3; SE +/- 83.55, N = 3; SE +/- 26.03, N = 3; SE +/- 18.50, N = 3
  1. (CXX) g++ options: -O3
VkFFT 1.2.31 - Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
  nv 4090: 82875
  4090: 81406
  4090 rep: 80999
  4080 rep: 70068
  4080 zzz: 70040
  4080 xxx: 67887
  4080: 65869
  3090 rep: 54814
  3090: 51005
  e: 37090
  d: 36328
  i: 34686
  a: 33001
  c: 32812
  b: 32751
  g: 26541
  f: 26238
  SE +/- 555.86, N = 3; SE +/- 437.33, N = 3; SE +/- 116.12, N = 3; SE +/- 57.83, N = 3
  1. (CXX) g++ options: -O3
VkFFT 1.2.31 - Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better)
  nv 4090: 84887
  4090: 84351
  4090 rep: 81329
  4080 xxx: 69068
  4080 rep: 68279
  4080 zzz: 67689
  4080: 66473
  3090: 55347
  3090 rep: 54432
  c: 43021
  b: 42163
  a: 42105
  d: 35399
  e: 35304
  i: 33727
  g: 26638
  f: 26593
  h: 26524
  SE +/- 796.66, N = 3; SE +/- 200.55, N = 3; SE +/- 118.74, N = 3; SE +/- 3.71, N = 3
  1. (CXX) g++ options: -O3
VkResample 1.0 - Upscale: 2x - Precision: Single (ms, Fewer Is Better)
  4090 rep: 8.962
  nv 4090: 8.967
  4090: 9.284
  3090: 10.399
  3090 rep: 10.428
  a: 11.686
  c: 11.688
  b: 11.690
  4080 zzz: 13.126
  4080: 13.136
  4080 rep: 13.136
  4080 xxx: 13.137
  i: 20.930
  3070: 22.064
  f: 26.738
  g: 26.769
  RTX 3070 Ti: 27.183
  e: 32.850
  d: 32.855
  SE +/- 0.001, N = 3; SE +/- 0.029, N = 3; SE +/- 0.000, N = 3; SE +/- 0.004, N = 3
  1. (CXX) g++ options: -O3
Phoronix Test Suite v10.8.5