nvidia

nvdia

HTML result view exported from: https://openbenchmarking.org/result/2106184-IB-NVIDIA10275&grt.

nvidiaProcessorMotherboardChipsetMemoryDiskGraphicsAudioMonitorNetworkOSKernelDesktopDisplay ServerDisplay DriverOpenGLOpenCLVulkanCompilerFile-SystemScreen ResolutionnvdidaIntel Xeon W-2235 @ 4.60GHz (6 Cores / 12 Threads)Dell 06JWJY (2.8.0 BIOS)Intel Sky Lake-E DMI3 Registers32GB2 x 2000GB TOSHIBA DT01ACA2NVIDIA Quadro RTX 6000 24GBRealtek ALC3234ASUS VS229Intel I219-LMUbuntu 20.045.8.0-55-generic (x86_64)GNOME Shell 3.36.9X Server 1.20.9NVIDIA 460.804.6.0OpenCL 1.2 CUDA 11.2.1621.2.155GCC 9.3.0 + CUDA 11.2ext41920x1080OpenBenchmarking.org- Transparent Huge Pages: madvise- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: intel_pstate powersave - CPU Microcode: 0x5003102- Python 3.8.5- itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Mitigation of TSX disabled

nvidiaarrayfire: Conjugate Gradient OpenCLbetsy: ETC1 - Highestbetsy: ETC2 RGB - Highestblender: BMW27 - CUDAblender: Classroom - CUDAblender: Fishy Cat - CUDAblender: Barbershop - CUDAblender: BMW27 - NVIDIA OptiXblender: Classroom - NVIDIA OptiXblender: Fishy Cat - NVIDIA OptiXblender: Barbershop - NVIDIA OptiXblender: Pabellon Barcelona - CUDAblender: Pabellon Barcelona - NVIDIA OptiXcl-mem: Copycl-mem: Readcl-mem: Writeclpeak: Integer Compute INTclpeak: Single-Precision Floatclpeak: Double-Precision Doubleclpeak: Global Memory Bandwidthfahbench: financebench: Black-Scholes OpenCLhashcat: MD5hashcat: SHA1hashcat: 7-Ziphashcat: SHA-512hashcat: TrueCrypt RIPEMD160 + XTSindigobench: OpenCL GPU - Bedroomindigobench: OpenCL GPU - Supercarlczero: OpenCLluxcorerender: DLSC - GPUluxcorerender: Danish Mood - GPUluxcorerender: Orange Juice - GPUluxcorerender: LuxCore Benchmark - GPUluxcorerender: Rainbow Colors and Prism - GPUmandelgpu: GPUmixbench: OpenCL - Integermixbench: OpenCL - Double Precisionmixbench: OpenCL - Single Precisionncnn: Vulkan GPU - mobilenetncnn: Vulkan GPU-v2-v2 - mobilenet-v2ncnn: Vulkan GPU-v3-v3 - mobilenet-v3ncnn: Vulkan GPU - shufflenet-v2ncnn: Vulkan GPU - mnasnetncnn: Vulkan GPU - efficientnet-b0ncnn: Vulkan GPU - blazefacencnn: Vulkan GPU - googlenetncnn: Vulkan GPU - vgg16ncnn: Vulkan GPU - resnet18ncnn: Vulkan GPU - alexnetncnn: Vulkan GPU - resnet50ncnn: Vulkan GPU - yolov4-tinyncnn: Vulkan GPU - squeezenet_ssdncnn: Vulkan GPU - regnety_400mrealsr-ncnn: 4x - Norealsr-ncnn: 4x - Yesrodinia: OpenCL Particle Filterviennacl: CPU BLAS - sCOPYviennacl: CPU BLAS - sAXPYviennacl: CPU BLAS - sDOTviennacl: CPU BLAS - dCOPYviennacl: CPU BLAS - dAXPYviennacl: CPU BLAS - dDOTviennacl: CPU BLAS - dGEMV-Nviennacl: CPU BLAS - dGEMV-Tviennacl: CPU BLAS - dGEMM-NNviennacl: CPU BLAS - dGEMM-NTviennacl: CPU BLAS - dGEMM-TNviennacl: CPU BLAS - dGEMM-TTviennacl: OpenCL BLAS - sCOPYviennacl: OpenCL BLAS - sAXPYviennacl: OpenCL BLAS - sDOTviennacl: OpenCL BLAS - dCOPYviennacl: OpenCL BLAS - dAXPYviennacl: OpenCL BLAS - dDOTviennacl: OpenCL BLAS - dGEMV-Nviennacl: OpenCL BLAS - dGEMV-Tviennacl: OpenCL BLAS - dGEMM-NNviennacl: OpenCL BLAS - dGEMM-NTviennacl: OpenCL BLAS - dGEMM-TNviennacl: OpenCL BLAS - dGEMM-TTvkfft: vkpeak: fp32-scalarvkpeak: fp32-vec4vkpeak: fp16-scalarvkpeak: fp16-vec4vkpeak: fp64-scalarvkpeak: fp64-vec4vkpeak: int32-scalarvkpeak: int32-vec4vkpeak: int16-scalarvkpeak: int16-vec4vkresample: 2x - Doublevkresample: 2x - Singlewaifu2x-ncnn: 2x - 3 - Yesnvdida1.6004.1415.303356.18770.77853.961050.83356.64783.05874.961042.751143.151146.23323.9569.3512.914291.1114639.03542.32531.12302.61418.1695216773333317976266667856867228523333367073311.64334.64670485.823.285.954.4615.092835959.916760.87458.0317012.0622.316.525.296.965.708.462.2116.7764.0216.1213.4533.7931.1422.0317.258.72545.6114.48725.637.741.724.837.141.142.642.421.721.522.622.33134083194975445314643745034974914983450917230.4217066.2616948.3832855.41534.46535.6717057.0316926.8011229.0014281.96149.57313.5034.279OpenBenchmarking.org

ArrayFire

Test: Conjugate Gradient OpenCL

OpenBenchmarking.orgms, Fewer Is BetterArrayFire 3.7Test: Conjugate Gradient OpenCLnvdida0.360.721.081.441.8SE +/- 0.002, N = 31.6001. (CXX) g++ options: -rdynamic

Betsy GPU Compressor

Codec: ETC1 - Quality: Highest

OpenBenchmarking.orgSeconds, Fewer Is BetterBetsy GPU Compressor 1.1 BetaCodec: ETC1 - Quality: Highestnvdida0.93171.86342.79513.72684.6585SE +/- 0.032, N = 94.1411. (CXX) g++ options: -O3 -O2 -lpthread -ldl

Betsy GPU Compressor

Codec: ETC2 RGB - Quality: Highest

OpenBenchmarking.orgSeconds, Fewer Is BetterBetsy GPU Compressor 1.1 BetaCodec: ETC2 RGB - Quality: Highestnvdida1.19322.38643.57964.77285.966SE +/- 0.015, N = 35.3031. (CXX) g++ options: -O3 -O2 -lpthread -ldl

Blender

Blend File: BMW27 - Compute: CUDA

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 2.92Blend File: BMW27 - Compute: CUDAnvdida80160240320400SE +/- 6.31, N = 6356.18

Blender

Blend File: Classroom - Compute: CUDA

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 2.92Blend File: Classroom - Compute: CUDAnvdida170340510680850SE +/- 5.19, N = 3770.77

Blender

Blend File: Fishy Cat - Compute: CUDA

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 2.92Blend File: Fishy Cat - Compute: CUDAnvdida2004006008001000SE +/- 9.79, N = 3853.96

Blender

Blend File: Barbershop - Compute: CUDA

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 2.92Blend File: Barbershop - Compute: CUDAnvdida2004006008001000SE +/- 3.47, N = 31050.83

Blender

Blend File: BMW27 - Compute: NVIDIA OptiX

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 2.92Blend File: BMW27 - Compute: NVIDIA OptiXnvdida80160240320400SE +/- 6.20, N = 9356.64

Blender

Blend File: Classroom - Compute: NVIDIA OptiX

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 2.92Blend File: Classroom - Compute: NVIDIA OptiXnvdida2004006008001000SE +/- 3.87, N = 3783.05

Blender

Blend File: Fishy Cat - Compute: NVIDIA OptiX

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 2.92Blend File: Fishy Cat - Compute: NVIDIA OptiXnvdida2004006008001000SE +/- 6.89, N = 3874.96

Blender

Blend File: Barbershop - Compute: NVIDIA OptiX

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 2.92Blend File: Barbershop - Compute: NVIDIA OptiXnvdida2004006008001000SE +/- 1.02, N = 31042.75

Blender

Blend File: Pabellon Barcelona - Compute: CUDA

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 2.92Blend File: Pabellon Barcelona - Compute: CUDAnvdida2004006008001000SE +/- 16.54, N = 91143.15

Blender

Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 2.92Blend File: Pabellon Barcelona - Compute: NVIDIA OptiXnvdida2004006008001000SE +/- 2.96, N = 31146.23

cl-mem

Benchmark: Copy

OpenBenchmarking.orgGB/s, More Is Bettercl-mem 2017-01-13Benchmark: Copynvdida70140210280350SE +/- 0.13, N = 3323.91. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

OpenBenchmarking.orgGB/s, More Is Bettercl-mem 2017-01-13Benchmark: Readnvdida120240360480600SE +/- 0.00, N = 3569.31. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

OpenBenchmarking.orgGB/s, More Is Bettercl-mem 2017-01-13Benchmark: Writenvdida110220330440550SE +/- 0.99, N = 3512.91. (CC) gcc options: -O2 -flto -lOpenCL

clpeak

OpenCL Test: Integer Compute INT

OpenBenchmarking.orgGIOPS, More Is BetterclpeakOpenCL Test: Integer Compute INTnvdida3K6K9K12K15KSE +/- 106.86, N = 1514291.111. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Single-Precision Float

OpenBenchmarking.orgGFLOPS, More Is BetterclpeakOpenCL Test: Single-Precision Floatnvdida3K6K9K12K15KSE +/- 188.11, N = 1514639.031. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Double-Precision Double

OpenBenchmarking.orgGFLOPS, More Is BetterclpeakOpenCL Test: Double-Precision Doublenvdida120240360480600SE +/- 1.43, N = 3542.321. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Global Memory Bandwidth

OpenBenchmarking.orgGBPS, More Is BetterclpeakOpenCL Test: Global Memory Bandwidthnvdida110220330440550SE +/- 0.05, N = 3531.121. (CXX) g++ options: -O3 -rdynamic -lOpenCL

FAHBench

OpenBenchmarking.orgNs Per Day, More Is BetterFAHBench 2.3.2nvdida70140210280350SE +/- 1.82, N = 3302.61

FinanceBench

Benchmark: Black-Scholes OpenCL

OpenBenchmarking.orgms, Fewer Is BetterFinanceBench 2016-07-25Benchmark: Black-Scholes OpenCLnvdida246810SE +/- 0.023, N = 38.1691. (CXX) g++ options: -O3 -march=native -fopenmp

Hashcat

Benchmark: MD5

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.1.1Benchmark: MD5nvdida11000M22000M33000M44000M55000MSE +/- 110453524.67, N = 352167733333

Hashcat

Benchmark: SHA1

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.1.1Benchmark: SHA1nvdida4000M8000M12000M16000M20000MSE +/- 43432105.38, N = 317976266667

Hashcat

Benchmark: 7-Zip

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.1.1Benchmark: 7-Zipnvdida200K400K600K800K1000KSE +/- 1690.50, N = 3856867

Hashcat

Benchmark: SHA-512

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.1.1Benchmark: SHA-512nvdida500M1000M1500M2000M2500MSE +/- 4294311.48, N = 32285233333

Hashcat

Benchmark: TrueCrypt RIPEMD160 + XTS

OpenBenchmarking.orgH/s, More Is BetterHashcat 6.1.1Benchmark: TrueCrypt RIPEMD160 + XTSnvdida140K280K420K560K700KSE +/- 145.30, N = 3670733

IndigoBench

Acceleration: OpenCL GPU - Scene: Bedroom

OpenBenchmarking.orgM samples/s, More Is BetterIndigoBench 4.4Acceleration: OpenCL GPU - Scene: Bedroomnvdida3691215SE +/- 0.01, N = 311.64

IndigoBench

Acceleration: OpenCL GPU - Scene: Supercar

OpenBenchmarking.orgM samples/s, More Is BetterIndigoBench 4.4Acceleration: OpenCL GPU - Scene: Supercarnvdida816243240SE +/- 0.05, N = 334.65

LeelaChessZero

Backend: OpenCL

OpenBenchmarking.orgNodes Per Second, More Is BetterLeelaChessZero 0.26Backend: OpenCLnvdida15003000450060007500SE +/- 20.90, N = 370481. (CXX) g++ options: -flto -pthread

LuxCoreRender

Scene: DLSC - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.5Scene: DLSC - Acceleration: GPUnvdida1.30952.6193.92855.2386.5475SE +/- 0.07, N = 35.82MIN: 2.61 / MAX: 6.12

LuxCoreRender

Scene: Danish Mood - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.5Scene: Danish Mood - Acceleration: GPUnvdida0.7381.4762.2142.9523.69SE +/- 0.04, N = 153.28MIN: 0.46 / MAX: 4.77

LuxCoreRender

Scene: Orange Juice - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.5Scene: Orange Juice - Acceleration: GPUnvdida1.33882.67764.01645.35526.694SE +/- 0.00, N = 35.95MIN: 4.95 / MAX: 7.04

LuxCoreRender

Scene: LuxCore Benchmark - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.5Scene: LuxCore Benchmark - Acceleration: GPUnvdida1.00352.0073.01054.0145.0175SE +/- 0.02, N = 34.46MIN: 0.58 / MAX: 6.19

LuxCoreRender

Scene: Rainbow Colors and Prism - Acceleration: GPU

OpenBenchmarking.orgM samples/sec, More Is BetterLuxCoreRender 2.5Scene: Rainbow Colors and Prism - Acceleration: GPUnvdida48121620SE +/- 0.44, N = 1215.09MIN: 0.21 / MAX: 17.99

MandelGPU

OpenCL Device: GPU

OpenBenchmarking.orgSamples/sec, More Is BetterMandelGPU 1.3pts1OpenCL Device: GPUnvdida600K1200K1800K2400K3000KSE +/- 6.55, N = 32835959.91. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

Mixbench

Backend: OpenCL - Benchmark: Integer

OpenBenchmarking.orgGIOPS, More Is BetterMixbench 2020-06-23Backend: OpenCL - Benchmark: Integernvdida4K8K12K16K20KSE +/- 195.94, N = 416760.871. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: OpenCL - Benchmark: Double Precision

OpenBenchmarking.orgGFLOPS, More Is BetterMixbench 2020-06-23Backend: OpenCL - Benchmark: Double Precisionnvdida100200300400500SE +/- 0.70, N = 3458.031. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: OpenCL - Benchmark: Single Precision

OpenBenchmarking.orgGFLOPS, More Is BetterMixbench 2020-06-23Backend: OpenCL - Benchmark: Single Precisionnvdida4K8K12K16K20KSE +/- 21.47, N = 317012.061. (CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

NCNN

Target: Vulkan GPU - Model: mobilenet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU - Model: mobilenetnvdida510152025SE +/- 0.22, N = 322.31MIN: 21.61 / MAX: 33.681. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2nvdida246810SE +/- 0.07, N = 36.52MIN: 5.89 / MAX: 53.671. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3nvdida1.19032.38063.57094.76125.9515SE +/- 0.02, N = 35.29MIN: 4.95 / MAX: 6.971. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU - Model: shufflenet-v2nvdida246810SE +/- 0.10, N = 36.96MIN: 6.68 / MAX: 24.731. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU - Model: mnasnetnvdida1.28252.5653.84755.136.4125SE +/- 0.08, N = 35.70MIN: 5.01 / MAX: 73.241. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU - Model: efficientnet-b0nvdida246810SE +/- 0.10, N = 38.46MIN: 7.92 / MAX: 20.011. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU - Model: blazefacenvdida0.49730.99461.49191.98922.4865SE +/- 0.02, N = 32.21MIN: 2.12 / MAX: 2.771. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU - Model: googlenetnvdida48121620SE +/- 0.11, N = 316.77MIN: 16.28 / MAX: 34.071. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU - Model: vgg16nvdida1428425670SE +/- 0.38, N = 364.02MIN: 62.95 / MAX: 114.531. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU - Model: resnet18nvdida48121620SE +/- 0.12, N = 316.12MIN: 15.7 / MAX: 52.751. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU - Model: alexnetnvdida3691215SE +/- 0.14, N = 313.45MIN: 13.12 / MAX: 15.361. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU - Model: resnet50nvdida816243240SE +/- 0.27, N = 333.79MIN: 33.09 / MAX: 71.321. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU - Model: yolov4-tinynvdida714212835SE +/- 0.18, N = 331.14MIN: 30.2 / MAX: 73.21. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU - Model: squeezenet_ssdnvdida510152025SE +/- 0.14, N = 322.03MIN: 21.46 / MAX: 89.381. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20201218Target: Vulkan GPU - Model: regnety_400mnvdida48121620SE +/- 0.06, N = 317.25MIN: 16.71 / MAX: 24.91. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

RealSR-NCNN

Scale: 4x - TAA: No

OpenBenchmarking.orgSeconds, Fewer Is BetterRealSR-NCNN 20200818Scale: 4x - TAA: Nonvdida246810SE +/- 0.055, N = 38.725

RealSR-NCNN

Scale: 4x - TAA: Yes

OpenBenchmarking.orgSeconds, Fewer Is BetterRealSR-NCNN 20200818Scale: 4x - TAA: Yesnvdida1020304050SE +/- 0.20, N = 345.61

Rodinia

Test: OpenCL Particle Filter

OpenBenchmarking.orgSeconds, Fewer Is BetterRodinia 3.1Test: OpenCL Particle Filternvdida1.00962.01923.02884.03845.048SE +/- 0.045, N = 64.4871. (CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

ViennaCL

Test: CPU BLAS - sCOPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - sCOPYnvdida612182430SE +/- 0.18, N = 1525.61. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sAXPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - sAXPYnvdida918273645SE +/- 0.44, N = 1537.71. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sDOT

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - sDOTnvdida1020304050SE +/- 0.55, N = 1541.71. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dCOPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dCOPYnvdida612182430SE +/- 0.04, N = 1524.81. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dAXPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dAXPYnvdida918273645SE +/- 0.07, N = 1537.11. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dDOT

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dDOTnvdida918273645SE +/- 0.33, N = 1541.11. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-N

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMV-Nnvdida1020304050SE +/- 0.08, N = 1542.61. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-T

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMV-Tnvdida1020304050SE +/- 0.74, N = 1242.41. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NN

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMM-NNnvdida510152025SE +/- 0.28, N = 1521.71. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NT

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMM-NTnvdida510152025SE +/- 0.23, N = 1421.51. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TN

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMM-TNnvdida510152025SE +/- 0.10, N = 1522.61. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TT

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: CPU BLAS - dGEMM-TTnvdida510152025SE +/- 0.04, N = 1422.31. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sCOPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - sCOPYnvdida70140210280350SE +/- 1.86, N = 33131. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sAXPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - sAXPYnvdida90180270360450SE +/- 0.67, N = 34081. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sDOT

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - sDOTnvdida70140210280350SE +/- 0.67, N = 33191. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dCOPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dCOPYnvdida110220330440550SE +/- 0.33, N = 34971. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dAXPY

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dAXPYnvdida1202403604806005441. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dDOT

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dDOTnvdida1102203304405505311. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-N

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMV-Nnvdida100200300400500SE +/- 0.88, N = 34641. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-T

OpenBenchmarking.orgGB/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMV-Tnvdida80160240320400SE +/- 0.88, N = 33741. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NN

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMM-NNnvdida110220330440550SE +/- 2.00, N = 25031. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NT

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMM-NTnvdida110220330440550SE +/- 4.00, N = 24971. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TN

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMM-TNnvdida110220330440550SE +/- 2.31, N = 34911. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TT

OpenBenchmarking.orgGFLOPs/s, More Is BetterViennaCL 1.7.1Test: OpenCL BLAS - dGEMM-TTnvdida110220330440550SE +/- 2.00, N = 24981. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

VkFFT

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.1.1nvdida7K14K21K28K35KSE +/- 363.50, N = 5345091. (CXX) g++ options: -O3 -pthread

vkpeak

fp32-scalar

OpenBenchmarking.orgGFLOPS, More Is Bettervkpeak 20210424fp32-scalarnvdida4K8K12K16K20KSE +/- 192.00, N = 417230.42

vkpeak

fp32-vec4

OpenBenchmarking.orgGFLOPS, More Is Bettervkpeak 20210424fp32-vec4nvdida4K8K12K16K20KSE +/- 96.79, N = 417066.26

vkpeak

fp16-scalar

OpenBenchmarking.orgGFLOPS, More Is Bettervkpeak 20210424fp16-scalarnvdida4K8K12K16K20KSE +/- 73.03, N = 416948.38

vkpeak

fp16-vec4

OpenBenchmarking.orgGFLOPS, More Is Bettervkpeak 20210424fp16-vec4nvdida7K14K21K28K35KSE +/- 85.46, N = 432855.41

vkpeak

fp64-scalar

OpenBenchmarking.orgGFLOPS, More Is Bettervkpeak 20210424fp64-scalarnvdida120240360480600SE +/- 0.02, N = 4534.46

vkpeak

fp64-vec4

OpenBenchmarking.orgGFLOPS, More Is Bettervkpeak 20210424fp64-vec4nvdida120240360480600SE +/- 1.06, N = 4535.67

vkpeak

int32-scalar

OpenBenchmarking.orgGIOPS, More Is Bettervkpeak 20210424int32-scalarnvdida4K8K12K16K20KSE +/- 39.37, N = 417057.03

vkpeak

int32-vec4

OpenBenchmarking.orgGIOPS, More Is Bettervkpeak 20210424int32-vec4nvdida4K8K12K16K20KSE +/- 33.92, N = 416926.80

vkpeak

int16-scalar

OpenBenchmarking.orgGIOPS, More Is Bettervkpeak 20210424int16-scalarnvdida2K4K6K8K10KSE +/- 0.40, N = 411229.00

vkpeak

int16-vec4

OpenBenchmarking.orgGIOPS, More Is Bettervkpeak 20210424int16-vec4nvdida3K6K9K12K15KSE +/- 27.14, N = 414281.96

VkResample

Upscale: 2x - Precision: Double

OpenBenchmarking.orgms, Fewer Is BetterVkResample 1.0Upscale: 2x - Precision: Doublenvdida306090120150SE +/- 0.04, N = 3149.571. (CXX) g++ options: -O3 -pthread

VkResample

Upscale: 2x - Precision: Single

OpenBenchmarking.orgms, Fewer Is BetterVkResample 1.0Upscale: 2x - Precision: Singlenvdida3691215SE +/- 0.01, N = 313.501. (CXX) g++ options: -O3 -pthread

Waifu2x-NCNN Vulkan

Scale: 2x - Denoise: 3 - TAA: Yes

OpenBenchmarking.orgSeconds, Fewer Is BetterWaifu2x-NCNN Vulkan 20200818Scale: 2x - Denoise: 3 - TAA: Yesnvdida0.96281.92562.88843.85124.814SE +/- 0.017, N = 34.279


Phoronix Test Suite v10.8.4