heikows3-2023-08-18-nvidia-gpu-compute

AMD EPYC 7313 16-Core testing with a GIGABYTE MZE2-G10-00 v01010101 (M07 BIOS) and ASPEED 45GB on Debian 12 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2308184-NE-HEIKOWS3219.

System details (heikows3-2023-08-18-nvidia-gpu-compute):

Processor: AMD EPYC 7313 16-Core @ 3.00GHz (16 Cores / 32 Threads)
Motherboard: GIGABYTE MZE2-G10-00 v01010101 (M07 BIOS)
Chipset: AMD Starship/Matisse
Memory: 8 x 32 GB DDR4-3200MT/s 36ASF4G72PZ-3G2E7
Disk: 7682GB Micron_7450_MTFDKCC7T6TFR + 1920GB Micron_7450_MTFDKBG1T9TFR
Graphics: ASPEED 45GB
Network: 2 x Intel I350
OS: Debian 12
Kernel: 6.2.16-3-pve (x86_64)
Display Server: X Server
Display Driver: NVIDIA
OpenCL: OpenCL 3.0 CUDA 12.2.79
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 640x480

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa00115d
- BAR1 / Visible vRAM Size: 65536 MiB - vBIOS Version: 95.02.39.00.01
- Python 3.11.2
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Result overview (heikows3-2023-08-18-nvidia-gpu-compute):

hashcat: MD5 = 125566666667
hashcat: SHA1 = 43759733333
hashcat: 7-Zip = 2064433
hashcat: SHA-512 = 5317300000
hashcat: TrueCrypt RIPEMD160 + XTS = 1464567
shoc: OpenCL - S3D = 527.277
shoc: OpenCL - Triad = 25.1750
shoc: OpenCL - FFT SP = 2134.90
shoc: OpenCL - MD5 Hash = 99.4652
shoc: OpenCL - Reduction = 877.572
shoc: OpenCL - GEMM SGEMM_N = 23502.1
shoc: OpenCL - Max SP Flops = 87811.9
shoc: OpenCL - Bus Speed Download = 26.7561
shoc: OpenCL - Bus Speed Readback = 26.3130
shoc: OpenCL - Texture Read Bandwidth = 2715.40
cl-mem: Copy = 354.2
cl-mem: Read = 697.2
cl-mem: Write = 449.8
fahbench = 395.1747
clpeak: Integer Compute INT = 40540.41
clpeak: Single-Precision Float = 82945.32
clpeak: Double-Precision Double = 1412.32
clpeak: Global Memory Bandwidth = 670.25
lczero: OpenCL = 37362
rodinia: OpenCL Particle Filter = 2.137
arrayfire: Conjugate Gradient OpenCL = 1.127
luxcorerender: DLSC - GPU = 14.66
luxcorerender: Danish Mood - GPU = 12.87
luxcorerender: Orange Juice - GPU = 12.72
luxcorerender: LuxCore Benchmark - GPU = 12.82
luxcorerender: Rainbow Colors and Prism - GPU = 27.32
financebench: Black-Scholes OpenCL = 2.791
viennacl: CPU BLAS - sCOPY = 764
viennacl: CPU BLAS - sAXPY = 1042
viennacl: CPU BLAS - sDOT = 252
viennacl: CPU BLAS - dCOPY = 274
viennacl: CPU BLAS - dAXPY = 411
viennacl: CPU BLAS - dDOT = 313.4
viennacl: CPU BLAS - dGEMV-N = 202.0
viennacl: CPU BLAS - dGEMV-T = 233
viennacl: CPU BLAS - dGEMM-NN = 73.7
viennacl: CPU BLAS - dGEMM-NT = 69.5
viennacl: CPU BLAS - dGEMM-TN = 74.1
viennacl: CPU BLAS - dGEMM-TT = 71.8
viennacl: OpenCL BLAS - sCOPY = 879
viennacl: OpenCL BLAS - sAXPY = 1253
viennacl: OpenCL BLAS - sDOT = 774
viennacl: OpenCL BLAS - dCOPY = 546
viennacl: OpenCL BLAS - dAXPY = 624
viennacl: OpenCL BLAS - dDOT = 621
viennacl: OpenCL BLAS - dGEMV-N = 196
viennacl: OpenCL BLAS - dGEMV-T = 394
viennacl: OpenCL BLAS - dGEMM-NN = 1190
viennacl: OpenCL BLAS - dGEMM-TN = 1180
ncnn: Vulkan GPU - mobilenet = 17.00
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 = 5.27
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 = 6.33
ncnn: Vulkan GPU - shufflenet-v2 = 7.41
ncnn: Vulkan GPU - mnasnet = 5.33
ncnn: Vulkan GPU - efficientnet-b0 = 7.87
ncnn: Vulkan GPU - blazeface = 2.52
ncnn: Vulkan GPU - googlenet = 14.86
ncnn: Vulkan GPU - vgg16 = 30.29
ncnn: Vulkan GPU - resnet18 = 8.71
ncnn: Vulkan GPU - alexnet = 7.88
ncnn: Vulkan GPU - resnet50 = 16.37
ncnn: Vulkan GPU - yolov4-tiny = 24.55
ncnn: Vulkan GPU - squeezenet_ssd = 12.44
ncnn: Vulkan GPU - regnety_400m = 15.38
ncnn: Vulkan GPU - vision_transformer = 98.46
ncnn: Vulkan GPU - FastestDet = 8.40
neatbench: GPU = 126

Hashcat

Benchmark: MD5

Hashcat 6.2.4 - H/s, More Is Better
Result: 125566666667 (SE +/- 733333333.33, N = 3)
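
The "SE +/-" and "N" figures that accompany each result can be turned into a relative error, which makes run-to-run noise easier to compare across tests with very different units. The following is a minimal sketch, assuming that SE is the standard error of the mean over N runs; the function names are hypothetical and are not part of the Phoronix Test Suite. The input values are the Hashcat MD5 figures from this result file.

```python
import math


def relative_standard_error(mean, se):
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


def sample_std_from_se(se, n):
    """Recover the sample standard deviation, assuming SE = SD / sqrt(N)."""
    return se * math.sqrt(n)


# Hashcat MD5 figures taken from this result file.
md5_mean = 125566666667   # H/s
md5_se = 733333333.33     # H/s
md5_n = 3                 # number of runs

rse = relative_standard_error(md5_mean, md5_se)
sd = sample_std_from_se(md5_se, md5_n)

print(f"Relative standard error: {rse:.2f}%")
print(f"Implied sample standard deviation: {sd:.3e} H/s")
```

A relative standard error well under 1% suggests a stable measurement; results with a much larger ratio are worth rerunning before drawing conclusions from them.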

Hashcat

Benchmark: SHA1

Hashcat 6.2.4 - H/s, More Is Better
Result: 43759733333 (SE +/- 156567582.14, N = 3)

Hashcat

Benchmark: 7-Zip

Hashcat 6.2.4 - H/s, More Is Better
Result: 2064433 (SE +/- 5771.29, N = 3)

Hashcat

Benchmark: SHA-512

Hashcat 6.2.4 - H/s, More Is Better
Result: 5317300000 (SE +/- 15332644.91, N = 3)

Hashcat

Benchmark: TrueCrypt RIPEMD160 + XTS

Hashcat 6.2.4 - H/s, More Is Better
Result: 1464567 (SE +/- 14853.10, N = 3)

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: S3D

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GFLOPS, More Is Better
Result: 527.28 (SE +/- 0.45, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
Result: 25.18 (SE +/- 0.13, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GFLOPS, More Is Better
Result: 2134.90 (SE +/- 4.43, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GHash/s, More Is Better
Result: 99.47 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Reduction

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
Result: 877.57 (SE +/- 0.41, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: GEMM SGEMM_N

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GFLOPS, More Is Better
Result: 23502.1 (SE +/- 16.11, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GFLOPS, More Is Better
Result: 87811.9 (SE +/- 630.72, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
Result: 26.76 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
Result: 26.31 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
Result: 2715.40 (SE +/- 1.38, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

cl-mem

Benchmark: Copy

cl-mem 2017-01-13 - GB/s, More Is Better
Result: 354.2 (SE +/- 0.00, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13 - GB/s, More Is Better
Result: 697.2 (SE +/- 0.29, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

cl-mem 2017-01-13 - GB/s, More Is Better
Result: 449.8 (SE +/- 0.47, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

FAHBench

FAHBench 2.3.2 - Ns Per Day, More Is Better
Result: 395.17 (SE +/- 0.21, N = 3)

clpeak

OpenCL Test: Integer Compute INT

clpeak 1.1.2 - GIOPS, More Is Better
Result: 40540.41 (SE +/- 549.47, N = 3)
(CXX) g++ options: -O3

clpeak

OpenCL Test: Single-Precision Float

clpeak 1.1.2 - GFLOPS, More Is Better
Result: 82945.32 (SE +/- 664.25, N = 3)
(CXX) g++ options: -O3

clpeak

OpenCL Test: Double-Precision Double

clpeak 1.1.2 - GFLOPS, More Is Better
Result: 1412.32 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O3

clpeak

OpenCL Test: Global Memory Bandwidth

clpeak 1.1.2 - GBPS, More Is Better
Result: 670.25 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O3

LeelaChessZero

Backend: OpenCL

LeelaChessZero 0.28 - Nodes Per Second, More Is Better
Result: 37362 (SE +/- 337.61, N = 3)
(CXX) g++ options: -flto -pthread

Rodinia

Test: OpenCL Particle Filter

Rodinia 3.1 - Seconds, Fewer Is Better
Result: 2.137 (SE +/- 0.021, N = 14)
(CXX) g++ options: -O2 -lOpenCL

ArrayFire

Test: Conjugate Gradient OpenCL

ArrayFire 3.7 - ms, Fewer Is Better
Result: 1.127 (SE +/- 0.008, N = 12)
(CXX) g++ options: -rdynamic

LuxCoreRender

Scene: DLSC - Acceleration: GPU

LuxCoreRender 2.6 - M samples/sec, More Is Better
Result: 14.66 (SE +/- 0.00, N = 3; MIN: 14.04 / MAX: 14.8)

LuxCoreRender

Scene: Danish Mood - Acceleration: GPU

LuxCoreRender 2.6 - M samples/sec, More Is Better
Result: 12.87 (SE +/- 0.18, N = 3; MIN: 3.79 / MAX: 15.91)

LuxCoreRender

Scene: Orange Juice - Acceleration: GPU

LuxCoreRender 2.6 - M samples/sec, More Is Better
Result: 12.72 (SE +/- 0.03, N = 3; MIN: 11.03 / MAX: 16.62)

LuxCoreRender

Scene: LuxCore Benchmark - Acceleration: GPU

LuxCoreRender 2.6 - M samples/sec, More Is Better
Result: 12.82 (SE +/- 0.02, N = 3; MIN: 3.62 / MAX: 15.6)

LuxCoreRender

Scene: Rainbow Colors and Prism - Acceleration: GPU

LuxCoreRender 2.6 - M samples/sec, More Is Better
Result: 27.32 (SE +/- 0.26, N = 3; MIN: 25.15 / MAX: 29.06)

FinanceBench

Benchmark: Black-Scholes OpenCL

FinanceBench 2016-07-25 - ms, Fewer Is Better
Result: 2.791 (SE +/- 0.004, N = 3)
(CXX) g++ options: -O3 -march=native -fopenmp

ViennaCL

Test: CPU BLAS - sCOPY

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 764 (SE +/- 25.16, N = 13)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sAXPY

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 1042 (SE +/- 28.96, N = 13)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sDOT

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 252 (SE +/- 5.14, N = 13)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dCOPY

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 274 (SE +/- 9.27, N = 13)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dAXPY

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 411 (SE +/- 14.54, N = 13)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dDOT

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 313.4 (SE +/- 26.27, N = 13)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-N

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 202.0 (SE +/- 17.46, N = 12)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-T

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 233 (SE +/- 4.86, N = 13)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NN

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
Result: 73.7 (SE +/- 0.55, N = 13)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NT

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
Result: 69.5 (SE +/- 0.56, N = 13)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TN

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
Result: 74.1 (SE +/- 0.35, N = 12)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TT

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
Result: 71.8 (SE +/- 0.48, N = 13)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sCOPY

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 879 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sAXPY

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 1253 (SE +/- 3.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sDOT

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 774 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dCOPY

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 546 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dAXPY

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 624 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dDOT

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 621 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-N

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 196 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-T

ViennaCL 1.7.1 - GB/s, More Is Better
Result: 394 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NN

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
Result: 1190 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TN

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
Result: 1180 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20230517 - ms, Fewer Is Better
Result: 17.00 (SE +/- 2.05, N = 12; MIN: 11.33 / MAX: 1631.33)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

NCNN 20230517 - ms, Fewer Is Better
Result: 5.27 (SE +/- 0.10, N = 12; MIN: 4.22 / MAX: 95.37)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

NCNN 20230517 - ms, Fewer Is Better
Result: 6.33 (SE +/- 0.74, N = 11; MIN: 3.96 / MAX: 300.99)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20230517 - ms, Fewer Is Better
Result: 7.41 (SE +/- 0.63, N = 12; MIN: 4.83 / MAX: 750.45)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20230517 - ms, Fewer Is Better
Result: 5.33 (SE +/- 0.34, N = 12; MIN: 3.83 / MAX: 270.94)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20230517 - ms, Fewer Is Better
Result: 7.87 (SE +/- 0.65, N = 12; MIN: 5.8 / MAX: 250.32)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20230517 - ms, Fewer Is Better
Result: 2.52 (SE +/- 0.18, N = 12; MIN: 1.87 / MAX: 171.42)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20230517 - ms, Fewer Is Better
Result: 14.86 (SE +/- 0.92, N = 12; MIN: 10.8 / MAX: 555.15)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20230517 - ms, Fewer Is Better
Result: 30.29 (SE +/- 2.77, N = 12; MIN: 19.27 / MAX: 995.12)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20230517 - ms, Fewer Is Better
Result: 8.71 (SE +/- 0.65, N = 12; MIN: 6.01 / MAX: 281.02)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20230517 - ms, Fewer Is Better
Result: 7.88 (SE +/- 1.51, N = 12; MIN: 4.04 / MAX: 594.05)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20230517 - ms, Fewer Is Better
Result: 16.37 (SE +/- 0.67, N = 12; MIN: 12.02 / MAX: 381.74)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20230517 - ms, Fewer Is Better
Result: 24.55 (SE +/- 1.50, N = 12; MIN: 17.26 / MAX: 1220.45)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20230517 - ms, Fewer Is Better
Result: 12.44 (SE +/- 0.35, N = 12; MIN: 9.33 / MAX: 436.16)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20230517 - ms, Fewer Is Better
Result: 15.38 (SE +/- 1.11, N = 12; MIN: 11.62 / MAX: 228.25)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vision_transformer

NCNN 20230517 - ms, Fewer Is Better
Result: 98.46 (SE +/- 9.48, N = 12; MIN: 64.31 / MAX: 2287.8)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: FastestDet

NCNN 20230517 - ms, Fewer Is Better
Result: 8.40 (SE +/- 0.70, N = 11; MIN: 5.61 / MAX: 443.04)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NeatBench

Acceleration: GPU

NeatBench 5 - FPS, More Is Better
Result: 126 (SE +/- 0.00, N = 3)


Phoronix Test Suite v10.8.5