vulkan tests
AMD Ryzen Threadripper 3990X 64-Core testing with a Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2308016-PTS-VULKANTE10&grs&sor
vulkan tests: system details (runs a, b, c)
  Processor: AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads)
  Motherboard: Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS)
  Chipset: AMD Starship/Matisse
  Memory: 128GB
  Disk: Samsung SSD 970 EVO Plus 500GB
  Graphics: AMD Radeon RX 5700 8GB (1750/875MHz)
  Audio: AMD Navi 10 HDMI Audio
  Monitor: DELL P2415Q
  Network: Intel I211 + Intel Wi-Fi 6 AX200
  OS: Ubuntu 23.04
  Kernel: 6.2.0-26-generic (x86_64)
  Desktop: GNOME Shell 44.0
  Display Server: X Server + Wayland
  OpenGL: 4.6 Mesa 23.0.2 (LLVM 15.0.7 DRM 3.49)
  Compiler: GCC 12.2.0
  File-System: ext4
  Screen Resolution: 3840x2160
Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x830107a
Graphics Details - BAR1 / Visible vRAM Size: 256 MB - vBIOS Version: 113-D1820201-101
Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
vulkan tests: result overview
Test | a | b | c
ncnn: Vulkan GPU - alexnet | 15.22 | 15.34 | 13.07
ncnn: CPU - resnet50 | 35.20 | 39.03 | 38.18
ncnn: CPU - regnety_400m | 45.43 | 48.03 | 47.02
ncnn: CPU - vgg16 | 45.64 | 47.53 | 46.88
ncnn: Vulkan GPU - vision_transformer | 75.17 | 72.26 | 72.70
ncnn: Vulkan GPU - resnet50 | 38.91 | 39.84 | 38.43
ncnn: CPU - mobilenet | 25.21 | 25.78 | 25.96
ncnn: Vulkan GPU - yolov4-tiny | 36.72 | 37.79 | 37.36
ncnn: Vulkan GPU - mobilenet | 24.66 | 25.25 | 25.37
ncnn: CPU - squeezenet_ssd | 22.30 | 22.27 | 22.89
ncnn: CPU - resnet18 | 19.07 | 19.40 | 18.91
ncnn: CPU - googlenet | 30.87 | 31.55 | 31.64
ncnn: Vulkan GPU - resnet18 | 19.20 | 19.42 | 19.04
ncnn: CPU - yolov4-tiny | 37.72 | 37.38 | 37.14
ncnn: Vulkan GPU - regnety_400m | 46.73 | 46.22 | 46.10
ncnn: Vulkan GPU - vgg16 | 47.52 | 48.13 | 47.50
ncnn: Vulkan GPU - squeezenet_ssd | 21.81 | 22.01 | 21.78
ncnn: Vulkan GPU - googlenet | 31.43 | 31.24 | 31.46
vkpeak: fp32-scalar | 7836.61 | 7856.19 | 7866.83
vkpeak: fp32-vec4 | 7781.34 | 7808.56 | 7808.48
vkpeak: fp16-vec4 | 12516.94 | 12550.91 | 12559.31
vkfft: FFT + iFFT R2C / C2R | 25525 | 25447 | 25451
vkpeak: int16-vec4 | 12531.35 | 12550.38 | 12563.14
vkfft: FFT + iFFT C2C 1D batched in half precision | 100903 | 101063 | 100839
vkpeak: fp16-scalar | 7853.66 | 7860.29 | 7866.33
vkfft: FFT + iFFT C2C multidimensional in single precision | 16993 | 16976 | 16967
vkfft: FFT + iFFT C2C Bluestein in single precision | 7268 | 7258 | 7259
vkpeak: fp64-vec4 | 493.28 | 492.96 | 493.57
vkpeak: fp64-scalar | 493.36 | 493.01 | 493.51
vkpeak: int32-vec4 | 1574.82 | 1574.60 | 1575.91
vkpeak: int32-scalar | 1214.41 | 1213.48 | 1214.40
vkfft: FFT + iFFT C2C Bluestein benchmark in double precision | 3066 | 3064 | 3064
vkpeak: int16-scalar | 7856.21 | 7852.03 | 7857.01
vkresample: 2x - Single | 12.770 | 12.771 | 12.775
vkfft: FFT + iFFT C2C 1D batched in double precision | 18431 | 18428 | 18426
vkfft: FFT + iFFT C2C 1D batched in single precision | 58944 | 58947 | 58938
vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling | 64140 | 64139 | 64131
ncnn: Vulkan GPU - FastestDet | 14.25 | 14.53 | 13.88
ncnn: Vulkan GPU - blazeface | 7.22 | 7.09 | 6.08
ncnn: Vulkan GPU - efficientnet-b0 | 18.42 | 19.47 | 18.02
ncnn: Vulkan GPU - mnasnet | 14.18 | 13.72 | 12.93
ncnn: Vulkan GPU - shufflenet-v2 | 14.75 | 14.27 | 13.29
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 | 12.97 | 13.25 | 12.57
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 | 15.71 | 14.78 | 13.63
ncnn: CPU - FastestDet | 14.66 | 13.70 | 16.03
ncnn: CPU - vision_transformer | 73.47 | 74.11 | 75.96
ncnn: CPU - alexnet | 11.76 | 13.97 | 12.37
ncnn: CPU - blazeface | 5.89 | 7.12 | 6.95
ncnn: CPU - efficientnet-b0 | 18.00 | 19.69 | 19.06
ncnn: CPU - mnasnet | 13.13 | 12.90 | 15.27
ncnn: CPU - shufflenet-v2 | 14.08 | 14.38 | 14.12
ncnn: CPU-v3-v3 - mobilenet-v3 | 12.73 | 13.45 | 13.60
ncnn: CPU-v2-v2 - mobilenet-v2 | 13.83 | 14.49 | 16.01
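The run-to-run spread in the overview table can be quantified with a few lines of Python. The sketch below is illustrative only: it copies five rows from the table by hand, and the helper name `spread_percent` is an assumption of this example, not part of the Phoronix Test Suite.

```python
# Values copied by hand from the overview table (runs a, b, c).
results = {
    "ncnn: Vulkan GPU - alexnet": (15.22, 15.34, 13.07),
    "ncnn: CPU - resnet50": (35.20, 39.03, 38.18),
    "vkpeak: fp32-scalar": (7836.61, 7856.19, 7866.83),
    "vkfft: FFT + iFFT R2C / C2R": (25525, 25447, 25451),
    "vkresample: 2x - Single": (12.770, 12.771, 12.775),
}


def spread_percent(values):
    """Return (max - min) / min as a percentage (hypothetical helper)."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0


spreads = {name: spread_percent(vals) for name, vals in results.items()}

# Print the rows from largest to smallest spread.
for name, pct in sorted(spreads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:32s} {pct:6.2f}%")
```

On these five rows the two NCNN latencies differ by roughly 11 to 17 percent between runs, while the vkpeak, VkFFT and VkResample rows stay within about half a percent.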
NCNN 20230517 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
  c: 13.07 (SE +/- 0.28, N = 3; MIN: 10.71 / MAX: 16.27)
  a: 15.22 (SE +/- 0.36, N = 3; MIN: 11.93 / MAX: 21.72)
  b: 15.34 (SE +/- 0.17, N = 12; MIN: 12.01 / MAX: 62)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  a: 35.20 (SE +/- 0.28, N = 3; MIN: 30.71 / MAX: 97.13)
  c: 38.18 (SE +/- 1.02, N = 3; MIN: 32.56 / MAX: 45.61)
  b: 39.03 (SE +/- 0.53, N = 3; MIN: 34.39 / MAX: 46.86)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
  a: 45.43 (SE +/- 1.49, N = 3; MIN: 39.85 / MAX: 90.63)
  c: 47.02 (SE +/- 0.33, N = 3; MIN: 41.08 / MAX: 93.32)
  b: 48.03 (SE +/- 0.70, N = 3; MIN: 41.67 / MAX: 101.38)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  a: 45.64 (SE +/- 0.74, N = 3; MIN: 42.07 / MAX: 55.26)
  c: 46.88 (SE +/- 0.66, N = 3; MIN: 43.93 / MAX: 59.11)
  b: 47.53 (SE +/- 0.34, N = 3; MIN: 44.76 / MAX: 58.97)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better)
  b: 72.26 (SE +/- 0.80, N = 12; MIN: 69.21 / MAX: 182.16)
  c: 72.70 (SE +/- 1.92, N = 3; MIN: 69.83 / MAX: 164)
  a: 75.17 (SE +/- 2.14, N = 3; MIN: 69.97 / MAX: 147.06)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
  c: 38.43 (SE +/- 0.59, N = 3; MIN: 33.37 / MAX: 44.58)
  a: 38.91 (SE +/- 0.49, N = 3; MIN: 33.52 / MAX: 45.34)
  b: 39.84 (SE +/- 0.24, N = 12; MIN: 32.55 / MAX: 100.04)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: mobilenet (ms, Fewer Is Better)
  a: 25.21 (SE +/- 0.05, N = 3; MIN: 21.72 / MAX: 91.8)
  b: 25.78 (SE +/- 0.28, N = 3; MIN: 22.57 / MAX: 80.73)
  c: 25.96 (SE +/- 0.21, N = 3; MIN: 22.99 / MAX: 30.78)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
  a: 36.72 (SE +/- 0.55, N = 3; MIN: 32.78 / MAX: 42.58)
  c: 37.36 (SE +/- 0.31, N = 3; MIN: 33.45 / MAX: 45.21)
  b: 37.79 (SE +/- 0.27, N = 12; MIN: 33.15 / MAX: 109.38)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
  a: 24.66 (SE +/- 0.20, N = 3; MIN: 22.49 / MAX: 28.96)
  b: 25.25 (SE +/- 0.39, N = 12; MIN: 21.74 / MAX: 83.58)
  c: 25.37 (SE +/- 0.04, N = 3; MIN: 22.3 / MAX: 30.62)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  b: 22.27 (SE +/- 0.39, N = 3; MIN: 20.98 / MAX: 27.69)
  a: 22.30 (SE +/- 0.47, N = 3; MIN: 20.67 / MAX: 28.29)
  c: 22.89 (SE +/- 0.30, N = 3; MIN: 20.98 / MAX: 29.54)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  c: 18.91 (SE +/- 0.03, N = 3; MIN: 16.19 / MAX: 66.7)
  a: 19.07 (SE +/- 0.16, N = 3; MIN: 16.37 / MAX: 82.83)
  b: 19.40 (SE +/- 0.21, N = 3; MIN: 16.25 / MAX: 94.41)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: googlenet (ms, Fewer Is Better)
  a: 30.87 (SE +/- 0.21, N = 3; MIN: 27.29 / MAX: 90.04)
  b: 31.55 (SE +/- 0.22, N = 3; MIN: 29.53 / MAX: 39)
  c: 31.64 (SE +/- 0.41, N = 3; MIN: 26.58 / MAX: 37.83)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
  c: 19.04 (SE +/- 0.17, N = 3; MIN: 16.24 / MAX: 22.76)
  a: 19.20 (SE +/- 0.06, N = 3; MIN: 16.68 / MAX: 25.67)
  b: 19.42 (SE +/- 0.09, N = 12; MIN: 16.13 / MAX: 88.96)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  c: 37.14 (SE +/- 0.31, N = 3; MIN: 33.68 / MAX: 43.48)
  b: 37.38 (SE +/- 0.58, N = 3; MIN: 33.26 / MAX: 45.88)
  a: 37.72 (SE +/- 0.32, N = 3; MIN: 34.14 / MAX: 48.53)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
  c: 46.10 (SE +/- 0.73, N = 3; MIN: 39.52 / MAX: 107.36)
  b: 46.22 (SE +/- 0.51, N = 12; MIN: 39.54 / MAX: 121.82)
  a: 46.73 (SE +/- 0.37, N = 3; MIN: 39.79 / MAX: 151.9)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
  c: 47.50 (SE +/- 0.08, N = 3; MIN: 45.06 / MAX: 62.77)
  a: 47.52 (SE +/- 0.06, N = 3; MIN: 45.43 / MAX: 59.56)
  b: 48.13 (SE +/- 0.15, N = 12; MIN: 44.54 / MAX: 88.13)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  c: 21.78 (SE +/- 0.20, N = 3; MIN: 20.62 / MAX: 26.09)
  a: 21.81 (SE +/- 0.21, N = 3; MIN: 20.68 / MAX: 25.84)
  b: 22.01 (SE +/- 0.08, N = 12; MIN: 20.55 / MAX: 81.08)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
  b: 31.24 (SE +/- 0.16, N = 12; MIN: 27.01 / MAX: 86.35)
  a: 31.43 (SE +/- 0.18, N = 3; MIN: 26.65 / MAX: 84.75)
  c: 31.46 (SE +/- 0.10, N = 3; MIN: 26.64 / MAX: 38.17)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
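The SE figures reported with each result give a rough way to judge whether a gap between two runs is larger than the measurement noise. The sketch below is an illustration, not something the Phoronix Test Suite itself computes: it divides the difference of two means by their combined standard error, using the Vulkan GPU alexnet and vgg16 results for runs c and a. The function name `z_score` is an assumption of this example, and with only N = 3 samples per run the result is a coarse indicator rather than a formal significance test.

```python
import math


def z_score(mean_x, se_x, mean_y, se_y):
    """Difference of two means in units of their combined standard error."""
    return abs(mean_x - mean_y) / math.sqrt(se_x**2 + se_y**2)


# (mean in ms, SE) copied by hand from the Vulkan GPU results above.
alexnet_c, alexnet_a = (13.07, 0.28), (15.22, 0.36)
vgg16_c, vgg16_a = (47.50, 0.08), (47.52, 0.06)

z_alexnet = z_score(*alexnet_c, *alexnet_a)
z_vgg16 = z_score(*vgg16_c, *vgg16_a)

print(f"alexnet c vs a: {z_alexnet:.2f} combined standard errors")
print(f"vgg16   c vs a: {z_vgg16:.2f} combined standard errors")
```

For these inputs the alexnet gap is about 4.7 combined standard errors, while the vgg16 gap is about 0.2, so only the former stands out from the noise on this measure.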
vkpeak 20230730 - fp32-scalar (GFLOPS, More Is Better)
  c: 7866.83 (SE +/- 1.93, N = 3)
  b: 7856.19 (SE +/- 1.09, N = 3)
  a: 7836.61 (SE +/- 0.62, N = 3)
vkpeak 20230730 - fp32-vec4 (GFLOPS, More Is Better)
  b: 7808.56 (SE +/- 7.17, N = 3)
  c: 7808.48 (SE +/- 8.31, N = 3)
  a: 7781.34 (SE +/- 7.57, N = 3)
vkpeak 20230730 - fp16-vec4 (GFLOPS, More Is Better)
  c: 12559.31 (SE +/- 1.25, N = 3)
  b: 12550.91 (SE +/- 1.15, N = 3)
  a: 12516.94 (SE +/- 1.96, N = 3)
VkFFT 1.2.31 - Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better)
  a: 25525 (SE +/- 73.86, N = 3)
  c: 25451 (SE +/- 2.03, N = 3)
  b: 25447 (SE +/- 4.33, N = 3)
  1. (CXX) g++ options: -O3
vkpeak 20230730 - int16-vec4 (GIOPS, More Is Better)
  c: 12563.14 (SE +/- 1.54, N = 3)
  b: 12550.38 (SE +/- 0.77, N = 3)
  a: 12531.35 (SE +/- 0.58, N = 3)
VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
  b: 101063 (SE +/- 43.64, N = 3)
  a: 100903 (SE +/- 98.36, N = 3)
  c: 100839 (SE +/- 219.62, N = 3)
  1. (CXX) g++ options: -O3
vkpeak 20230730 - fp16-scalar (GFLOPS, More Is Better)
  c: 7866.33 (SE +/- 1.85, N = 3)
  b: 7860.29 (SE +/- 0.75, N = 3)
  a: 7853.66 (SE +/- 1.36, N = 3)
VkFFT 1.2.31 - Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
  a: 16993 (SE +/- 51.10, N = 3)
  b: 16976 (SE +/- 12.01, N = 3)
  c: 16967 (SE +/- 9.53, N = 3)
  1. (CXX) g++ options: -O3
VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better)
  a: 7268 (SE +/- 15.24, N = 3)
  c: 7259 (SE +/- 4.70, N = 3)
  b: 7258 (SE +/- 2.67, N = 3)
  1. (CXX) g++ options: -O3
vkpeak 20230730 - fp64-vec4 (GFLOPS, More Is Better)
  c: 493.57 (SE +/- 0.10, N = 3)
  a: 493.28 (SE +/- 0.11, N = 3)
  b: 492.96 (SE +/- 0.03, N = 3)
vkpeak 20230730 - fp64-scalar (GFLOPS, More Is Better)
  c: 493.51 (SE +/- 0.07, N = 3)
  a: 493.36 (SE +/- 0.10, N = 3)
  b: 493.01 (SE +/- 0.03, N = 3)
vkpeak 20230730 - int32-vec4 (GIOPS, More Is Better)
  c: 1575.91 (SE +/- 0.21, N = 3)
  a: 1574.82 (SE +/- 0.17, N = 3)
  b: 1574.60 (SE +/- 0.05, N = 3)
vkpeak 20230730 - int32-scalar (GIOPS, More Is Better)
  a: 1214.41 (SE +/- 0.27, N = 3)
  c: 1214.40 (SE +/- 0.32, N = 3)
  b: 1213.48 (SE +/- 0.05, N = 3)
VkFFT 1.2.31 - Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, More Is Better)
  a: 3066 (SE +/- 4.33, N = 3)
  c: 3064 (SE +/- 1.53, N = 3)
  b: 3064 (SE +/- 1.00, N = 3)
  1. (CXX) g++ options: -O3
vkpeak 20230730 - int16-scalar (GIOPS, More Is Better)
  c: 7857.01 (SE +/- 0.96, N = 3)
  a: 7856.21 (SE +/- 0.44, N = 3)
  b: 7852.03 (SE +/- 0.51, N = 3)
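The vkpeak floating-point figures are easier to compare as ratios against a common baseline. The sketch below is illustrative: it takes the run "a" GFLOPS values from the vkpeak results above and divides each by the fp32-scalar figure. The helper name `relative_to` is an assumption of this example, and the choice of fp32-scalar as baseline is arbitrary.

```python
# Run "a" throughput in GFLOPS, copied by hand from the vkpeak results above.
vkpeak_a = {
    "fp32-scalar": 7836.61,
    "fp32-vec4": 7781.34,
    "fp16-scalar": 7853.66,
    "fp16-vec4": 12516.94,
    "fp64-scalar": 493.36,
    "fp64-vec4": 493.28,
}


def relative_to(baseline, table):
    """Return each entry divided by the baseline entry (hypothetical helper)."""
    base = table[baseline]
    return {name: value / base for name, value in table.items()}


ratios = relative_to("fp32-scalar", vkpeak_a)
for name, ratio in ratios.items():
    print(f"{name:12s} {ratio:6.3f}x fp32-scalar")
```

With these inputs, fp64 throughput comes out at roughly one sixteenth of fp32-scalar, and fp16-vec4 at roughly 1.6 times fp32-scalar.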
VkResample 1.0 - Upscale: 2x - Precision: Single (ms, Fewer Is Better)
  a: 12.77 (SE +/- 0.00, N = 3)
  b: 12.77 (SE +/- 0.00, N = 3)
  c: 12.78 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3
VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, More Is Better)
  a: 18431 (SE +/- 6.17, N = 3)
  b: 18428 (SE +/- 3.61, N = 3)
  c: 18426 (SE +/- 1.86, N = 3)
  1. (CXX) g++ options: -O3
VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better)
  b: 58947 (SE +/- 3.33, N = 3)
  a: 58944 (SE +/- 1.33, N = 3)
  c: 58938 (SE +/- 1.20, N = 3)
  1. (CXX) g++ options: -O3
VkFFT 1.2.31 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
  a: 64140 (SE +/- 6.43, N = 3)
  b: 64139 (SE +/- 5.46, N = 3)
  c: 64131 (SE +/- 7.69, N = 3)
  1. (CXX) g++ options: -O3
NCNN 20230517 - Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better)
  c: 13.88 (SE +/- 0.66, N = 3; MIN: 11.52 / MAX: 18.83)
  a: 14.25 (SE +/- 1.49, N = 3; MIN: 11.35 / MAX: 22.25)
  b: 14.53 (SE +/- 0.63, N = 12; MIN: 11.05 / MAX: 27.1)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
  c: 6.08 (SE +/- 0.20, N = 3; MIN: 5.01 / MAX: 7.58)
  b: 7.09 (SE +/- 0.38, N = 12; MIN: 4.88 / MAX: 78.29)
  a: 7.22 (SE +/- 0.32, N = 3; MIN: 5.7 / MAX: 10.52)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  c: 18.02 (SE +/- 0.52, N = 3; MIN: 13.61 / MAX: 27.87)
  a: 18.42 (SE +/- 0.44, N = 3; MIN: 14.24 / MAX: 27.56)
  b: 19.47 (SE +/- 0.83, N = 12; MIN: 13.59 / MAX: 97.32)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
  c: 12.93 (SE +/- 1.40, N = 3; MIN: 9.34 / MAX: 22.14)
  b: 13.72 (SE +/- 0.74, N = 12; MIN: 9.31 / MAX: 23.03)
  a: 14.18 (SE +/- 2.23, N = 3; MIN: 9.68 / MAX: 23.38)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  c: 13.29 (SE +/- 0.21, N = 3; MIN: 11.24 / MAX: 22.87)
  b: 14.27 (SE +/- 0.46, N = 12; MIN: 10.94 / MAX: 22.32)
  a: 14.75 (SE +/- 1.24, N = 3; MIN: 11.39 / MAX: 23.44)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  c: 12.57 (SE +/- 0.32, N = 3; MIN: 9.6 / MAX: 17.01)
  a: 12.97 (SE +/- 1.36, N = 3; MIN: 9.71 / MAX: 20.92)
  b: 13.25 (SE +/- 0.62, N = 12; MIN: 9.52 / MAX: 53.34)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  c: 13.63 (SE +/- 0.74, N = 3; MIN: 10.24 / MAX: 27.87)
  b: 14.78 (SE +/- 0.85, N = 12; MIN: 10.2 / MAX: 70.67)
  a: 15.71 (SE +/- 3.00, N = 3; MIN: 10.21 / MAX: 26.19)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: FastestDet (ms, Fewer Is Better)
  b: 13.70 (SE +/- 0.53, N = 3; MIN: 11.46 / MAX: 17.73)
  a: 14.66 (SE +/- 1.04, N = 3; MIN: 11.69 / MAX: 19.64)
  c: 16.03 (SE +/- 0.49, N = 3; MIN: 11.61 / MAX: 24.19)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: vision_transformer (ms, Fewer Is Better)
  a: 73.47 (SE +/- 2.54, N = 3; MIN: 70.04 / MAX: 83.77)
  b: 74.11 (SE +/- 2.67, N = 3; MIN: 70.44 / MAX: 113.62)
  c: 75.96 (SE +/- 2.22, N = 3; MIN: 71.06 / MAX: 93.63)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
  a: 11.76 (SE +/- 0.62, N = 3; MIN: 9.49 / MAX: 19.16)
  c: 12.37 (SE +/- 0.24, N = 3; MIN: 10.09 / MAX: 15.86)
  b: 13.97 (SE +/- 0.45, N = 3; MIN: 11.94 / MAX: 26.76)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: blazeface (ms, Fewer Is Better)
  a: 5.89 (SE +/- 0.59, N = 3; MIN: 4.78 / MAX: 9.52)
  c: 6.95 (SE +/- 0.44, N = 3; MIN: 5.22 / MAX: 9.49)
  b: 7.12 (SE +/- 0.58, N = 3; MIN: 5.21 / MAX: 13.65)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  a: 18.00 (SE +/- 1.70, N = 3; MIN: 13.54 / MAX: 69.84)
  c: 19.06 (SE +/- 1.07, N = 3; MIN: 13.85 / MAX: 27.95)
  b: 19.69 (SE +/- 1.68, N = 3; MIN: 13.76 / MAX: 27.29)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  b: 12.90 (SE +/- 1.05, N = 3; MIN: 9.19 / MAX: 19.69)
  a: 13.13 (SE +/- 1.54, N = 3; MIN: 9.27 / MAX: 22.76)
  c: 15.27 (SE +/- 0.65, N = 3; MIN: 10.33 / MAX: 25.75)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  a: 14.08 (SE +/- 1.17, N = 3; MIN: 10.99 / MAX: 18.54)
  c: 14.12 (SE +/- 1.04, N = 3; MIN: 11.12 / MAX: 22.91)
  b: 14.38 (SE +/- 0.93, N = 3; MIN: 11.05 / MAX: 20.51)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  a: 12.73 (SE +/- 1.00, N = 3; MIN: 9.51 / MAX: 20.96)
  b: 13.45 (SE +/- 0.98, N = 3; MIN: 9.44 / MAX: 19.16)
  c: 13.60 (SE +/- 0.59, N = 3; MIN: 9.51 / MAX: 51.49)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  a: 13.83 (SE +/- 1.66, N = 3; MIN: 9.93 / MAX: 25.72)
  b: 14.49 (SE +/- 1.52, N = 3; MIN: 10.12 / MAX: 23.01)
  c: 16.01 (SE +/- 1.00, N = 3; MIN: 11.37 / MAX: 25.15)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
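Because NCNN was run against both the CPU and the Vulkan GPU target, the two can be compared model by model. The sketch below is illustrative: it uses run "a" mean latencies for six models copied by hand from the results above, and the dictionary layout and the name `gpu_over_cpu` are assumptions of this example.

```python
# Run "a" mean latency in ms as (CPU, Vulkan GPU), copied by hand
# from the NCNN results above.
latency_a = {
    "alexnet": (11.76, 15.22),
    "resnet50": (35.20, 38.91),
    "vgg16": (45.64, 47.52),
    "vision_transformer": (73.47, 75.17),
    "mobilenet": (25.21, 24.66),
    "googlenet": (30.87, 31.43),
}


def gpu_over_cpu(cpu_ms, gpu_ms):
    """Latency ratio; values above 1.0 mean the Vulkan GPU target was slower."""
    return gpu_ms / cpu_ms


ratios = {model: gpu_over_cpu(*pair) for model, pair in latency_a.items()}
for model, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    print(f"{model:20s} Vulkan GPU / CPU = {ratio:.3f}")
```

Among these six models, mobilenet is the only one where the Vulkan GPU target is faster than the CPU target in run "a"; alexnet shows the largest gap in the other direction.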
Phoronix Test Suite v10.8.5