RTX 3070 Compute

AMD Ryzen 9 5900X 12-Core testing with an ASUS ROG CROSSHAIR VIII HERO (3402 BIOS) and NVIDIA GeForce RTX 3070 8GB on Ubuntu 20.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2104078-IB-RTX3070CO18&grt&sor.

Runs: 1, 2, 3

Processor: AMD Ryzen 9 5900X 12-Core @ 3.70GHz (12 Cores / 24 Threads)
Motherboard: ASUS ROG CROSSHAIR VIII HERO (3402 BIOS)
Chipset: AMD Starship/Matisse
Memory: 16GB
Disk: 1000GB Sabrent Rocket 4.0 Plus + 2000GB
Graphics: NVIDIA GeForce RTX 3070 8GB
Audio: NVIDIA Device 228b
Monitor: ASUS VP28U
Network: Realtek RTL8125 2.5GbE + Intel I211
OS: Ubuntu 20.04
Kernel: 5.8.0-48-generic (x86_64)
Desktop: GNOME Shell 3.36.7
Display Server: X Server 1.20.9
Display Driver: NVIDIA 460.67
OpenGL: 4.6.0
OpenCL: OpenCL 1.2 CUDA 11.2.162
Vulkan: 1.2.155
Compiler: GCC 9.3.0 + CUDA 11.2
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
- CPU Microcode: 0xa201009

OpenCL Details
- GPU Compute Cores: 5888

Python Details
- 1: Python 3.8.5

Security Details
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected


ArrayFire

Test: Conjugate Gradient OpenCL

ArrayFire 3.7 (ms, Fewer Is Better)
Run 1: 2.086 (SE +/- 0.000, N = 3)
Run 2: 2.094 (SE +/- 0.000, N = 3)
(CXX) g++ options: -rdynamic

Betsy GPU Compressor

Codec: ETC1 - Quality: Highest

Betsy GPU Compressor 1.1 Beta (Seconds, Fewer Is Better)
Run 2: 4.300 (SE +/- 0.019, N = 3)
Run 1: 4.309 (SE +/- 0.010, N = 3)
(CXX) g++ options: -O3 -O2 -lpthread -ldl

Betsy GPU Compressor

Codec: ETC2 RGB - Quality: Highest

Betsy GPU Compressor 1.1 Beta (Seconds, Fewer Is Better)
Run 1: 6.039 (SE +/- 0.025, N = 3)
Run 2: 6.051 (SE +/- 0.015, N = 3)
(CXX) g++ options: -O3 -O2 -lpthread -ldl

Blender

Blend File: BMW27 - Compute: CUDA

Blender 2.92 (Seconds, Fewer Is Better)
Run 2: 29.03 (SE +/- 0.01, N = 3)
Run 1: 29.07 (SE +/- 0.03, N = 3)

Blender

Blend File: Classroom - Compute: CUDA

Blender 2.92 (Seconds, Fewer Is Better)
Run 1: 76.10 (SE +/- 0.01, N = 3)
Run 2: 76.49 (SE +/- 0.02, N = 3)

Blender

Blend File: Fishy Cat - Compute: CUDA

Blender 2.92 (Seconds, Fewer Is Better)
Run 1: 54.20 (SE +/- 0.02, N = 3)
Run 2: 54.44 (SE +/- 0.04, N = 3)

Blender

Blend File: Barbershop - Compute: CUDA

Blender 2.92 (Seconds, Fewer Is Better)
Run 2: 508.38 (SE +/- 0.25, N = 3)
Run 1: 509.33 (SE +/- 0.43, N = 3)

Blender

Blend File: BMW27 - Compute: NVIDIA OptiX

Blender 2.92 (Seconds, Fewer Is Better)
Run 1: 16.11 (SE +/- 0.04, N = 3)
Run 2: 16.17 (SE +/- 0.04, N = 3)

Blender

Blend File: Classroom - Compute: NVIDIA OptiX

Blender 2.92 (Seconds, Fewer Is Better)
Run 1: 48.17 (SE +/- 0.02, N = 3)
Run 2: 48.26 (SE +/- 0.03, N = 3)

Blender

Blend File: Fishy Cat - Compute: NVIDIA OptiX

Blender 2.92 (Seconds, Fewer Is Better)
Run 1: 35.70 (SE +/- 0.01, N = 3)
Run 2: 35.84 (SE +/- 0.03, N = 3)

Blender

Blend File: Barbershop - Compute: NVIDIA OptiX

Blender 2.92 (Seconds, Fewer Is Better)
Run 2: 464.01 (SE +/- 1.26, N = 3)
Run 1: 465.14 (SE +/- 1.92, N = 3)

Blender

Blend File: Pabellon Barcelona - Compute: CUDA

Blender 2.92 (Seconds, Fewer Is Better)
Run 1: 190.29 (SE +/- 0.01, N = 3)
Run 2: 190.98 (SE +/- 0.01, N = 3)

Blender

Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX

Blender 2.92 (Seconds, Fewer Is Better)
Run 1: 76.77 (SE +/- 0.05, N = 3)
Run 2: 77.16 (SE +/- 0.04, N = 3)
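
The Blender results above cover the same five scenes under both CUDA and NVIDIA OptiX, so a per-scene ratio is a natural way to read them. The sketch below is only an illustration: it copies the Run 1 render times from this result file, and the helper name `optix_speedup` is hypothetical, not part of Blender or the Phoronix Test Suite.

```python
# Run 1 render times in seconds, copied from the Blender 2.92 results above.
cuda_seconds = {
    "BMW27": 29.07,
    "Classroom": 76.10,
    "Fishy Cat": 54.20,
    "Barbershop": 509.33,
    "Pabellon Barcelona": 190.29,
}
optix_seconds = {
    "BMW27": 16.11,
    "Classroom": 48.17,
    "Fishy Cat": 35.70,
    "Barbershop": 465.14,
    "Pabellon Barcelona": 76.77,
}


def optix_speedup(scene: str) -> float:
    """Return CUDA time divided by OptiX time (hypothetical helper).

    A value above 1.0 means the OptiX render finished faster for that scene.
    """
    return cuda_seconds[scene] / optix_seconds[scene]


for scene in cuda_seconds:
    print(f"{scene}: {optix_speedup(scene):.2f}x")
```

Because lower render time is better, the ratio is taken as CUDA over OptiX. Using Run 2 instead would only require swapping in the Run 2 values.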

Chaos Group V-RAY

Mode: NVIDIA RTX GPU

Chaos Group V-RAY 5 (vrays, More Is Better)
Run 1: 1713
Run 2: 1710
SE +/- 2.00, N = 3 (shown for one run)

Chaos Group V-RAY

Mode: NVIDIA CUDA GPU

Chaos Group V-RAY 5 (vpaths, More Is Better)
Run 1: 1342 (SE +/- 1.20, N = 3)
Run 2: 1337 (SE +/- 0.67, N = 3)

cl-mem

Benchmark: Copy

cl-mem 2017-01-13 (GB/s, More Is Better)
Run 1: 297.6 (SE +/- 0.17, N = 3)
Run 2: 296.9 (SE +/- 0.20, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13 (GB/s, More Is Better)
Run 1: 393.6 (SE +/- 0.07, N = 3)
Run 2: 393.2 (SE +/- 0.36, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

cl-mem 2017-01-13 (GB/s, More Is Better)
Run 1: 380.2 (SE +/- 0.06, N = 3)
Run 2: 379.9 (SE +/- 0.13, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

clpeak

OpenCL Test: Integer Compute INT

clpeak (GIOPS, More Is Better)
Run 1: 10264.28 (SE +/- 105.22, N = 5)
Run 2: 10202.39 (SE +/- 86.71, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Single-Precision Float

clpeak (GFLOPS, More Is Better)
Run 1: 20099.75 (SE +/- 2.13, N = 3)
Run 2: 19991.73 (SE +/- 109.12, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Double-Precision Double

clpeak (GFLOPS, More Is Better)
Run 2: 364.99 (SE +/- 0.03, N = 3)
Run 1: 360.90 (SE +/- 0.04, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Global Memory Bandwidth

clpeak (GBPS, More Is Better)
Run 2: 389.61 (SE +/- 0.03, N = 3)
Run 1: 389.57 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

FAHBench

FAHBench 2.3.2 (Ns Per Day, More Is Better)
Run 1: 267.09 (SE +/- 0.04, N = 3)
Run 2: 265.83 (SE +/- 0.14, N = 3)

FinanceBench

Benchmark: Black-Scholes OpenCL

FinanceBench 2016-07-25 (ms, Fewer Is Better)
Run 1: 10.48 (SE +/- 0.01, N = 3)
Run 2: 10.50 (SE +/- 0.04, N = 3)
(CXX) g++ options: -O3 -march=native -fopenmp

GROMACS

Water Benchmark

GROMACS 2020.3 (Ns Per Day, More Is Better)
Run 1: 8.138 (SE +/- 0.024, N = 3)
Run 2: 8.083 (SE +/- 0.016, N = 3)
(CXX) g++ options: -O3 -lpthread -ldl -lrt -lm

Hashcat

Benchmark: MD5

Hashcat 6.1.1 (H/s, More Is Better)
Run 2: 38839033333 (SE +/- 42362968.63, N = 3)
Run 1: 38778833333 (SE +/- 49139065.70, N = 3)

Hashcat

Benchmark: SHA1

Hashcat 6.1.1 (H/s, More Is Better)
Run 2: 13144266667 (SE +/- 5394235.61, N = 3)
Run 1: 13120133333 (SE +/- 21817526.08, N = 3)

Hashcat

Benchmark: 7-Zip

Hashcat 6.1.1 (H/s, More Is Better)
Run 1: 686733 (SE +/- 1386.04, N = 3)
Run 2: 686700 (SE +/- 1069.27, N = 3)

Hashcat

Benchmark: SHA-512

Hashcat 6.1.1 (H/s, More Is Better)
Run 2: 1669566667 (SE +/- 866666.67, N = 3)
Run 1: 1664800000 (SE +/- 723417.81, N = 3)

Hashcat

Benchmark: TrueCrypt RIPEMD160 + XTS

Hashcat 6.1.1 (H/s, More Is Better)
Run 2: 502833 (SE +/- 1197.68, N = 3)
Run 1: 501033 (SE +/- 1550.63, N = 3)

IndigoBench

Acceleration: OpenCL GPU - Scene: Bedroom

IndigoBench 4.4 (M samples/s, More Is Better)
Run 1: 12.92 (SE +/- 0.02, N = 3)
Run 2: 12.88 (SE +/- 0.02, N = 3)

IndigoBench

Acceleration: OpenCL GPU - Scene: Supercar

IndigoBench 4.4 (M samples/s, More Is Better)
Run 1: 37.26 (SE +/- 0.03, N = 3)
Run 2: 37.16 (SE +/- 0.02, N = 3)

LeelaChessZero

Backend: OpenCL

LeelaChessZero 0.26 (Nodes Per Second, More Is Better)
Run 2: 29117 (SE +/- 234.03, N = 3)
Run 1: 28914 (SE +/- 74.99, N = 3)
(CXX) g++ options: -flto -pthread

LuxCoreRender OpenCL

Scene: DLSC

LuxCoreRender OpenCL 2.3 (M samples/sec, More Is Better)
Run 1: 7.97 (SE +/- 0.00, N = 3; MIN: 7.86 / MAX: 8.17)
Run 2: 7.94 (SE +/- 0.00, N = 3; MIN: 7.82 / MAX: 8.14)

LuxCoreRender OpenCL

Scene: Food

LuxCoreRender OpenCL 2.3 (M samples/sec, More Is Better)
Run 2: 3.39 (SE +/- 0.02, N = 3; MIN: 0.26 / MAX: 4.22)
Run 1: 3.39 (SE +/- 0.03, N = 3; MIN: 0.23 / MAX: 4.24)

LuxCoreRender OpenCL

Scene: LuxCore Benchmark

LuxCoreRender OpenCL 2.3 (M samples/sec, More Is Better)
Run 1: 6.51 (SE +/- 0.00, N = 3; MIN: 0.27 / MAX: 7.46)
Run 2: 6.50 (SE +/- 0.01, N = 3; MIN: 0.32 / MAX: 7.45)

LuxCoreRender OpenCL

Scene: Rainbow Colors and Prism

LuxCoreRender OpenCL 2.3 (M samples/sec, More Is Better)
Run 2: 19.19 (SE +/- 0.02, N = 3; MIN: 17.89 / MAX: 20.09)
Run 1: 19.18 (SE +/- 0.04, N = 3; MIN: 17.88 / MAX: 20.08)

MandelGPU

OpenCL Device: GPU

MandelGPU 1.3pts1 (Samples/sec, More Is Better)
Run 1: 319970658.9 (SE +/- 682590.08, N = 3)
Run 2: 319688588.0 (SE +/- 1129798.58, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

Mixbench

Backend: OpenCL - Benchmark: Integer

Mixbench 2020-06-23 (GIOPS, More Is Better)
Run 1: 11433.74 (SE +/- 3.19, N = 3)
Run 2: 11336.97 (SE +/- 53.09, N = 3)
(CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: OpenCL - Benchmark: Double Precision

Mixbench 2020-06-23 (GFLOPS, More Is Better)
Run 2: 299.65 (SE +/- 0.76, N = 3)
Run 1: 295.11 (SE +/- 1.86, N = 3)
(CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

Mixbench

Backend: OpenCL - Benchmark: Single Precision

Mixbench 2020-06-23 (GFLOPS, More Is Better)
Run 1: 22081.79 (SE +/- 8.94, N = 3)
Run 2: 21965.66 (SE +/- 109.50, N = 3)
(CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

NAMD CUDA

ATPase Simulation - 327,506 Atoms

NAMD CUDA 2.14 (days/ns, Fewer Is Better)
Run 1: 0.13273 (SE +/- 0.00056, N = 3)
Run 2: 0.13388 (SE +/- 0.00147, N = 3)
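
NAMD reports days/ns (fewer is better), whereas the FAHBench and GROMACS results in this file use Ns Per Day (more is better). The two units are reciprocals, so a small conversion puts them on the same scale. The sketch below assumes only that reciprocal relationship; the function name is hypothetical, and the converted numbers are not comparable across applications because each simulates a different system.

```python
def days_per_ns_to_ns_per_day(days_per_ns: float) -> float:
    """Convert a days/ns figure to ns/day (hypothetical helper)."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# NAMD CUDA 2.14 results from this file, in days/ns.
namd_days_per_ns = {"Run 1": 0.13273, "Run 2": 0.13388}

for run, value in namd_days_per_ns.items():
    print(f"{run}: {days_per_ns_to_ns_per_day(value):.3f} ns/day")
```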

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20201218 (ms, Fewer Is Better)
Run 1: 12.85 (SE +/- 0.08, N = 3; MIN: 11.95 / MAX: 34.82)
Run 2: 13.56 (SE +/- 0.01, N = 3; MIN: 12.13 / MAX: 52.71)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

NCNN 20201218 (ms, Fewer Is Better)
Run 2: 4.35 (SE +/- 0.05, N = 3; MIN: 3.99 / MAX: 6.3)
Run 1: 4.39 (SE +/- 0.04, N = 3; MIN: 4.12 / MAX: 5.78)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

NCNN 20201218 (ms, Fewer Is Better)
Run 1: 4.22 (SE +/- 0.06, N = 3; MIN: 3.9 / MAX: 30.64)
Run 2: 4.24 (SE +/- 0.15, N = 3; MIN: 3.8 / MAX: 24.56)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20201218 (ms, Fewer Is Better)
Run 1: 4.85 (SE +/- 0.06, N = 3; MIN: 4.59 / MAX: 6.05)
Run 2: 4.92 (SE +/- 0.02, N = 3; MIN: 4.48 / MAX: 25.29)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20201218 (ms, Fewer Is Better)
Run 1: 4.05 (SE +/- 0.11, N = 3; MIN: 3.65 / MAX: 20.14)
Run 2: 4.10 (SE +/- 0.01, N = 3; MIN: 3.66 / MAX: 26.14)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20201218 (ms, Fewer Is Better)
Run 1: 5.58 (SE +/- 0.10, N = 3; MIN: 5.12 / MAX: 20.98)
Run 2: 5.62 (SE +/- 0.12, N = 3; MIN: 5.1 / MAX: 21.88)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20201218 (ms, Fewer Is Better)
Run 1: 1.84 (SE +/- 0.01, N = 3; MIN: 1.75 / MAX: 3.08)
Run 2: 1.90 (SE +/- 0.06, N = 3; MIN: 1.73 / MAX: 11.77)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20201218 (ms, Fewer Is Better)
Run 2: 13.17 (SE +/- 0.27, N = 3; MIN: 11.88 / MAX: 33.16)
Run 1: 13.27 (SE +/- 0.08, N = 3; MIN: 12.01 / MAX: 35.62)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20201218 (ms, Fewer Is Better)
Run 1: 55.71 (SE +/- 0.18, N = 3; MIN: 51.94 / MAX: 108.04)
Run 2: 55.89 (SE +/- 0.11, N = 3; MIN: 52.71 / MAX: 91.56)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20201218 (ms, Fewer Is Better)
Run 1: 13.95 (SE +/- 0.01, N = 3; MIN: 13.09 / MAX: 42.15)
Run 2: 14.03 (SE +/- 0.07, N = 3; MIN: 12.85 / MAX: 39.19)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20201218 (ms, Fewer Is Better)
Run 1: 11.08 (SE +/- 0.03, N = 3; MIN: 10.26 / MAX: 26.8)
Run 2: 11.22 (SE +/- 0.07, N = 3; MIN: 10.16 / MAX: 47.07)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20201218 (ms, Fewer Is Better)
Run 1: 24.47 (SE +/- 0.33, N = 3; MIN: 22.75 / MAX: 54.75)
Run 2: 24.75 (SE +/- 0.37, N = 3; MIN: 22.79 / MAX: 62.12)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20201218 (ms, Fewer Is Better)
Run 1: 22.20 (SE +/- 0.01, N = 3; MIN: 21.01 / MAX: 44.66)
Run 2: 23.73 (SE +/- 1.20, N = 3; MIN: 21.14 / MAX: 142.5)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
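
Each result in this file is reported as a mean with a standard error (SE) over N trials, so a gap between two runs can be read against that noise. The sketch below is a rough heuristic rather than a formal significance test: it expresses the difference as a percentage and as a multiple of the combined standard error, here for the yolov4-tiny figures above. The function name `compare_runs` is hypothetical, and treating the two runs as independent is an assumption.

```python
import math


def compare_runs(mean_a: float, se_a: float, mean_b: float, se_b: float):
    """Compare run B against run A (hypothetical helper).

    Returns (percent_change, se_ratio), where se_ratio is the absolute
    difference divided by the combined standard error sqrt(se_a^2 + se_b^2).
    """
    diff = mean_b - mean_a
    percent_change = 100.0 * diff / mean_a
    combined_se = math.hypot(se_a, se_b)
    se_ratio = abs(diff) / combined_se if combined_se > 0 else math.inf
    return percent_change, se_ratio


# NCNN yolov4-tiny (ms): Run 1 = 22.20 +/- 0.01, Run 2 = 23.73 +/- 1.20.
percent, ratio = compare_runs(22.20, 0.01, 23.73, 1.20)
print(f"Run 2 vs Run 1: {percent:+.2f}% ({ratio:.2f} combined SEs)")
```

With N = 3 per run, a ratio near 1 suggests the gap is of the same order as the run-to-run noise, which fits the much higher MAX value recorded for Run 2.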

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20201218 (ms, Fewer Is Better)
Run 1: 14.49 (SE +/- 0.15, N = 3; MIN: 13.53 / MAX: 35.44)
Run 2: 15.45 (SE +/- 0.12, N = 3; MIN: 13.94 / MAX: 74.44)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20201218 (ms, Fewer Is Better)
Run 1: 16.75 (SE +/- 0.25, N = 3; MIN: 15.59 / MAX: 33.2)
Run 2: 16.87 (SE +/- 0.32, N = 3; MIN: 15.42 / MAX: 40.88)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

OctaneBench

Total Score

OctaneBench 2020.1 (Score, More Is Better)
Run 2: 411.11
Run 1: 410.51

RealSR-NCNN

Scale: 4x - TAA: No

RealSR-NCNN 20200818 (Seconds, Fewer Is Better)
Run 2: 8.427 (SE +/- 0.093, N = 3)
Run 3: 8.437 (SE +/- 0.096, N = 3)
Run 1: 8.438 (SE +/- 0.085, N = 3)

RealSR-NCNN

Scale: 4x - TAA: Yes

RealSR-NCNN 20200818 (Seconds, Fewer Is Better)
Run 2: 49.99 (SE +/- 0.07, N = 3)
Run 1: 50.06 (SE +/- 0.09, N = 3)
Run 3: 50.18 (SE +/- 0.08, N = 3)

RedShift Demo

RedShift Demo 3.0 (Seconds, Fewer Is Better)
Run 1: 228 (SE +/- 0.88, N = 3)
Run 2: 228 (SE +/- 1.00, N = 3)

Rodinia

Test: OpenCL Particle Filter

Rodinia 3.1 (Seconds, Fewer Is Better)
Run 1: 5.958 (SE +/- 0.014, N = 3)
Run 2: 6.016 (SE +/- 0.023, N = 3)
(CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: S3D

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GFLOPS, More Is Better)
Run 2: 218.61 (SE +/- 0.28, N = 3)
Run 1: 218.44 (SE +/- 0.27, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, More Is Better)
Run 1: 24.67 (SE +/- 0.01, N = 3)
Run 2: 24.67 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GFLOPS, More Is Better)
Run 1: 1134.99 (SE +/- 0.71, N = 3)
Run 2: 1133.82 (SE +/- 0.99, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GHash/s, More Is Better)
Run 2: 25.49 (SE +/- 0.04, N = 3)
Run 1: 25.47 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Reduction

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, More Is Better)
Run 2: 325.80 (SE +/- 0.46, N = 3)
Run 1: 325.67 (SE +/- 0.56, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: GEMM SGEMM_N

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GFLOPS, More Is Better)
Run 1: 3806.57 (SE +/- 10.00, N = 3)
Run 2: 3769.31 (SE +/- 22.78, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GFLOPS, More Is Better)
Run 2: 23179.0 (SE +/- 49.54, N = 3)
Run 1: 23117.8 (SE +/- 31.09, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, More Is Better)
Run 2: 26.31 (SE +/- 0.03, N = 3)
Run 1: 26.28 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, More Is Better)
Run 1: 26.40 (SE +/- 0.01, N = 3)
Run 2: 26.39 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, More Is Better)
Run 2: 2131.28 (SE +/- 5.26, N = 3)
Run 1: 2120.56 (SE +/- 7.35, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

ViennaCL

Test: CPU BLAS - sCOPY

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 2: 63.4 (SE +/- 0.59, N = 3)
Run 1: 62.0 (SE +/- 0.84, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sAXPY

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 1: 94.0 (SE +/- 0.87, N = 3)
Run 2: 92.8 (SE +/- 2.27, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sDOT

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 1: 141 (SE +/- 1.15, N = 3)
Run 2: 139 (SE +/- 3.53, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dCOPY

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 1: 23.2 (SE +/- 0.10, N = 3)
Run 2: 22.6 (SE +/- 0.74, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dAXPY

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 1: 34.5 (SE +/- 0.22, N = 3)
Run 2: 33.4 (SE +/- 0.65, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dDOT

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 1: 44.3 (SE +/- 0.42, N = 3)
Run 2: 43.5 (SE +/- 0.62, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-N

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 2: 77.0 (SE +/- 0.49, N = 3)
Run 1: 76.5 (SE +/- 0.45, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-T

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 2: 81.9 (SE +/- 0.57, N = 3)
Run 1: 81.3 (SE +/- 1.43, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NN

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
Run 1: 54.7 (SE +/- 0.10, N = 3)
Run 2: 53.5 (SE +/- 1.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NT

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
Run 1: 53.4 (SE +/- 0.15, N = 3)
Run 2: 52.7 (SE +/- 0.70, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TN

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
Run 1: 57.0 (SE +/- 0.12, N = 3)
Run 2: 55.8 (SE +/- 1.22, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TT

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
Run 1: 55.3 (SE +/- 0.15, N = 3)
Run 2: 54.5 (SE +/- 0.97, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sCOPY

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 2: 293
Run 1: 293
SE +/- 0.58, N = 3 (shown for one run)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sAXPY

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 1: 358
Run 2: 357
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sDOT

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 1: 325
Run 2: 324
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dCOPY

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 1: 365
Run 2: 364
SE +/- 0.33, N = 3 (shown for one run)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dAXPY

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 1: 396
Run 2: 395
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dDOT

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 1: 397
Run 2: 396
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-N

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 1: 222
Run 2: 220
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-T

ViennaCL 1.7.1 (GB/s, More Is Better)
Run 1: 334
Run 2: 332
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NN

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
Run 1: 342 (SE +/- 1.00, N = 3)
Run 2: 338 (SE +/- 0.67, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NT

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
Run 1: 343
Run 2: 340
SE +/- 1.50, N = 2 (shown for one run)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TN

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
Run 1: 342
Run 2: 336
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

VkFFT

VkFFT 1.1.1 (Benchmark Score, More Is Better)
Run 2: 32323 (SE +/- 422.26, N = 3)
Run 1: 32004 (SE +/- 238.49, N = 3)
(CXX) g++ options: -O3 -pthread

VkResample

Upscale: 2x - Precision: Double

VkResample 1.0 (ms, Fewer Is Better)
Run 1: 220.45 (SE +/- 0.07, N = 3)
Run 2: 221.04 (SE +/- 0.07, N = 3)
(CXX) g++ options: -O3 -pthread

VkResample

Upscale: 2x - Precision: Single

VkResample 1.0 (ms, Fewer Is Better)
Run 1: 17.49 (SE +/- 0.01, N = 3)
Run 2: 17.53 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -pthread

Waifu2x-NCNN Vulkan

Scale: 2x - Denoise: 3 - TAA: Yes

Waifu2x-NCNN Vulkan 20200818 (Seconds, Fewer Is Better)
Run 1: 4.297 (SE +/- 0.003, N = 3)
Run 2: 4.303 (SE +/- 0.004, N = 3)
Run 3: 4.321 (SE +/- 0.001, N = 3)


Phoronix Test Suite v10.8.4