nvidia_opencl_linux

AMD WX 4150 on Ubuntu 20.04.2 with ROCm, fan speed set to max

HTML result view exported from: https://openbenchmarking.org/result/2103143-HA-2011304FI79&grt&sro.

System Details

nvidia_opencl_linux
  Processor: Intel Core i7-4700MQ @ 3.40GHz (4 Cores / 8 Threads)
  Motherboard: HP 1909 (L70 Ver. 01.42 BIOS)
  Chipset: Intel Xeon E3-1200 v3/4th
  Memory: 32GB
  Disk: 500GB Samsung SSD 860 + 256GB SAMSUNG MZ7PD256 + 500GB Seagate ST500LT012-1DG14 + 256GB SAMSUNG MZMPD256 + 128GB ED2S5
  Graphics: NVIDIA Quadro M1000M 2GB (135/405MHz)
  Audio: IDT 92HD91BXX
  Network: Intel I217-LM + Intel 7260
  OS: Ubuntu 20.04
  Kernel: 5.4.0-53-generic (x86_64)
  Desktop: GNOME Shell 3.36.4
  Display Server: X Server 1.20.8
  Display Driver: NVIDIA 450.80.02
  OpenCL: OpenCL 1.2 CUDA 11.0.228
  Vulkan: 1.2.131
  Compiler: GCC 9.3.0
  File-System: ext4
  Screen Resolution: 1920x1200

amd_opencl_linux, amd-opencl-linux (only the fields that differ from nvidia_opencl_linux)
  Motherboard: HP 1909 (L70 Ver. 01.45 BIOS)
  Disk: 500GB Samsung SSD 860 + 500GB Seagate ST500LT012-1DG14
  Graphics: Intel HD 4600 2GB (1150MHz)
  Monitor: HP ZR24w
  Kernel: 5.6.0-1042-oem (x86_64)
  Display Server: X Server 1.20.9
  OpenGL: 4.5 Mesa 20.2.6
  OpenCL: OpenCL 2.0 AMD-APP (3212.0)
  Vulkan: 1.2.145
  Screen Resolution: 3840x1200

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: intel_pstate powersave - CPU Microcode: 0x28 - Thermald 1.9.1

OpenCL Details: nvidia_opencl_linux: GPU Compute Cores: 512

Python Details: nvidia_opencl_linux: Python 3.8.5

Security Details: itlb_multihit: KVM: Mitigation of Split huge pages + l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable + mds: Mitigation of Clear buffers; SMT vulnerable + meltdown: Mitigation of PTI + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Mitigation of Microcode + tsx_async_abort: Not affected

Kernel Details: amd_opencl_linux, amd-opencl-linux: Transparent Huge Pages: madvise

Graphics Details: amd_opencl_linux, amd-opencl-linux: GLAMOR

Results Overview

Test                                           nvidia_opencl_linux  amd_opencl_linux  amd-opencl-linux
blender: BMW27 - OpenCL                        694.90               -                 -
blender: Barbershop - OpenCL                   2379.33              -                 -
cl-mem: Copy                                   60.1                 68.5              68.5
cl-mem: Read                                   67.4                 79.6              79.5
cl-mem: Write                                  63.3                 74.1              73.9
clpeak: Kernel Latency                         7.75                 6.13              6.15
clpeak: Integer Compute INT                    251.31               368.48            367.95
clpeak: Single-Precision Float                 700.29               1823.80           1824.96
clpeak: Double-Precision Double                35.63                115.44            115.54
clpeak: Global Memory Bandwidth                66.99                75.21             75.26
clpeak: Transfer Bandwidth enqueueReadBuffer   6.63                 10.36             10.41
clpeak: Transfer Bandwidth enqueueWriteBuffer  10.93                19.39             19.46
darktable: Boat - OpenCL                       11.031               8.047             8.083
darktable: Masskrug - OpenCL                   10.822               9.751             9.776
darktable: Server Rack - OpenCL                0.323                0.223             0.228
darktable: Server Room - OpenCL                4.594                2.902             2.926
luxmark: GPU - Hotel                           751                  490               491
luxmark: GPU - Microphone                      2512                 3757              3767
luxmark: GPU - Luxball HDR                     3730                 5322              5324
rodinia: OpenCL LavaMD                         3.951                -                 -
rodinia: OpenCL Myocyte                        55.338               230.345           229.859
rodinia: OpenCL Heartwall                      5.978                7.798             7.810
rodinia: OpenCL Particle Filter                45.306               -                 -
shoc: OpenCL - Triad                           10.5663              4.7502            4.7505
shoc: OpenCL - FFT SP                          122.373              217.139           217.332
shoc: OpenCL - MD5 Hash                        1.4322               2.3502            2.3492
shoc: OpenCL - Max SP Flops                    1130.12              908135            908605
shoc: OpenCL - Bus Speed Download              12.6878              5.6690            5.6658
shoc: OpenCL - Bus Speed Readback              12.7640              5.2461            5.2447
shoc: OpenCL - Texture Read Bandwidth          110.990              81.6671           81.8607
smallpt-gpu: GPU - 1920 x 1200 - Caustic       1606756748           1615700927        1615706931
smallpt-gpu: GPU - 1920 x 1200 - Cornell       1606756869           1615701063        1615707068
smallpt-gpu: GPU - 1920 x 1200 - Caustic3      1606756995           1615701203        1615707207
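The overview figures can be turned into relative results with a little arithmetic. The following is a minimal sketch, not part of the exported result: it copies three values from the overview above and divides them so that a ratio above 1 favours the amd_opencl_linux run, whichever direction counts as better for that test. The names `RESULTS` and `amd_advantage` are hypothetical and exist only for this illustration.

```python
# Values copied from the overview: (nvidia_opencl_linux, amd_opencl_linux, more_is_better).
# RESULTS and amd_advantage are illustrative names, not part of the Phoronix Test Suite.
RESULTS = {
    "clpeak: Single-Precision Float (GFLOPS)": (700.29, 1823.80, True),
    "darktable: Boat - OpenCL (Seconds)": (11.031, 8.047, False),
    "shoc: OpenCL - Triad (GB/s)": (10.5663, 4.7502, True),
}


def amd_advantage(nvidia, amd, more_is_better):
    """Return a ratio where values above 1 favour the AMD run."""
    # For throughput-style results a larger number wins; for timings a smaller one does.
    return amd / nvidia if more_is_better else nvidia / amd


if __name__ == "__main__":
    for name, (nvidia, amd, more_is_better) in RESULTS.items():
        print(f"{name}: {amd_advantage(nvidia, amd, more_is_better):.2f}x")
```

Inverting the ratio for "Fewer Is Better" tests keeps every line on the same scale, so timings and throughput figures can be read side by side.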

Blender

Blend File: BMW27 - Compute: OpenCL

Blender 2.90, Seconds, Fewer Is Better
nvidia_opencl_linux: 694.90 (SE +/- 10.50, N = 3)

Blender

Blend File: Barbershop - Compute: OpenCL

Blender 2.90, Seconds, Fewer Is Better
nvidia_opencl_linux: 2379.33 (SE +/- 5.31, N = 3)

cl-mem

Benchmark: Copy

cl-mem 2017-01-13, GB/s, More Is Better
amd-opencl-linux: 68.5 (SE +/- 0.06, N = 3)
amd_opencl_linux: 68.5 (SE +/- 0.12, N = 3)
nvidia_opencl_linux: 60.1 (SE +/- 0.00, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13, GB/s, More Is Better
amd-opencl-linux: 79.5 (SE +/- 0.07, N = 3)
amd_opencl_linux: 79.6 (SE +/- 0.03, N = 3)
nvidia_opencl_linux: 67.4 (SE +/- 0.03, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

cl-mem 2017-01-13, GB/s, More Is Better
amd-opencl-linux: 73.9 (SE +/- 0.12, N = 3)
amd_opencl_linux: 74.1 (SE +/- 0.13, N = 3)
nvidia_opencl_linux: 63.3 (SE +/- 0.03, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

clpeak

OpenCL Test: Kernel Latency

clpeak, us, Fewer Is Better
amd-opencl-linux: 6.15 (SE +/- 0.06, N = 5)
amd_opencl_linux: 6.13 (SE +/- 0.06, N = 7)
nvidia_opencl_linux: 7.75 (SE +/- 0.05, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Integer Compute INT

clpeak, GIOPS, More Is Better
amd-opencl-linux: 367.95 (SE +/- 0.05, N = 3)
amd_opencl_linux: 368.48 (SE +/- 0.11, N = 3)
nvidia_opencl_linux: 251.31 (SE +/- 1.52, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Single-Precision Float

clpeak, GFLOPS, More Is Better
amd-opencl-linux: 1824.96 (SE +/- 0.08, N = 3)
amd_opencl_linux: 1823.80 (SE +/- 0.11, N = 3)
nvidia_opencl_linux: 700.29 (SE +/- 0.31, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Double-Precision Double

clpeak, GFLOPS, More Is Better
amd-opencl-linux: 115.54 (SE +/- 0.02, N = 3)
amd_opencl_linux: 115.44 (SE +/- 0.11, N = 3)
nvidia_opencl_linux: 35.63 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Global Memory Bandwidth

clpeak, GBPS, More Is Better
amd-opencl-linux: 75.26 (SE +/- 0.01, N = 3)
amd_opencl_linux: 75.21 (SE +/- 0.06, N = 3)
nvidia_opencl_linux: 66.99 (SE +/- 0.05, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Transfer Bandwidth enqueueReadBuffer

clpeak, GBPS, More Is Better
amd-opencl-linux: 10.41 (SE +/- 0.04, N = 3)
amd_opencl_linux: 10.36 (SE +/- 0.07, N = 3)
nvidia_opencl_linux: 6.63 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Transfer Bandwidth enqueueWriteBuffer

clpeak, GBPS, More Is Better
amd-opencl-linux: 19.46 (SE +/- 0.01, N = 3)
amd_opencl_linux: 19.39 (SE +/- 0.03, N = 3)
nvidia_opencl_linux: 10.93 (SE +/- 0.06, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

Darktable

Test: Boat - Acceleration: OpenCL

Darktable 3.0.1, Seconds, Fewer Is Better
amd-opencl-linux: 8.083 (SE +/- 0.016, N = 3)
amd_opencl_linux: 8.047 (SE +/- 0.044, N = 3)
nvidia_opencl_linux: 11.031 (SE +/- 0.011, N = 3)

Darktable

Test: Masskrug - Acceleration: OpenCL

Darktable 3.0.1, Seconds, Fewer Is Better
amd-opencl-linux: 9.776 (SE +/- 0.016, N = 3)
amd_opencl_linux: 9.751 (SE +/- 0.032, N = 3)
nvidia_opencl_linux: 10.822 (SE +/- 0.011, N = 3)

Darktable

Test: Server Rack - Acceleration: OpenCL

Darktable 3.0.1, Seconds, Fewer Is Better
amd-opencl-linux: 0.228 (SE +/- 0.003, N = 3)
amd_opencl_linux: 0.223 (SE +/- 0.002, N = 15)
nvidia_opencl_linux: 0.323 (SE +/- 0.001, N = 3)

Darktable

Test: Server Room - Acceleration: OpenCL

Darktable 3.0.1, Seconds, Fewer Is Better
amd-opencl-linux: 2.926 (SE +/- 0.030, N = 3)
amd_opencl_linux: 2.902 (SE +/- 0.021, N = 15)
nvidia_opencl_linux: 4.594 (SE +/- 0.007, N = 3)

LuxMark

OpenCL Device: GPU - Scene: Hotel

LuxMark 3.1, Score, More Is Better
amd-opencl-linux: 491
amd_opencl_linux: 490
nvidia_opencl_linux: 751
Standard errors as exported (two entries for three systems): SE +/- 0.67, N = 3; SE +/- 3.48, N = 3

LuxMark

OpenCL Device: GPU - Scene: Microphone

LuxMark 3.1, Score, More Is Better
amd-opencl-linux: 3767
amd_opencl_linux: 3757
nvidia_opencl_linux: 2512
Standard errors as exported (two entries for three systems): SE +/- 10.84, N = 3; SE +/- 2.67, N = 3

LuxMark

OpenCL Device: GPU - Scene: Luxball HDR

LuxMark 3.1, Score, More Is Better
amd-opencl-linux: 5324
amd_opencl_linux: 5322
nvidia_opencl_linux: 3730
Standard errors as exported (two entries for three systems): SE +/- 3.21, N = 3; SE +/- 12.68, N = 3

Rodinia

Test: OpenCL LavaMD

Rodinia 3.1, Seconds, Fewer Is Better
nvidia_opencl_linux: 3.951 (SE +/- 0.052, N = 5)
(CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenCL Myocyte

Rodinia 3.1, Seconds, Fewer Is Better
amd-opencl-linux: 229.86 (SE +/- 0.24, N = 3)
amd_opencl_linux: 230.35 (SE +/- 0.41, N = 3)
nvidia_opencl_linux: 55.34 (SE +/- 0.27, N = 3)
(CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenCL Heartwall

Rodinia 3.1, Seconds, Fewer Is Better
amd-opencl-linux: 7.810 (SE +/- 0.013, N = 3)
amd_opencl_linux: 7.798 (SE +/- 0.026, N = 3)
nvidia_opencl_linux: 5.978 (SE +/- 0.057, N = 14)
(CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenCL Particle Filter

Rodinia 3.1, Seconds, Fewer Is Better
nvidia_opencl_linux: 45.31 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O2 -lOpenCL

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2015-11-10, GB/s, More Is Better
amd-opencl-linux: 4.7505 (SE +/- 0.0336, N = 3)
amd_opencl_linux: 4.7502 (SE +/- 0.0171, N = 3)
nvidia_opencl_linux: 10.5663 (SE +/- 0.0068, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2015-11-10, GFLOPS, More Is Better
amd-opencl-linux: 217.33 (SE +/- 0.19, N = 3)
amd_opencl_linux: 217.14 (SE +/- 0.28, N = 3)
nvidia_opencl_linux: 122.37 (SE +/- 0.82, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2015-11-10, GHash/s, More Is Better
amd-opencl-linux: 2.3492 (SE +/- 0.0007, N = 3)
amd_opencl_linux: 2.3502 (SE +/- 0.0003, N = 3)
nvidia_opencl_linux: 1.4322 (SE +/- 0.0007, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2015-11-10, GFLOPS, More Is Better
amd-opencl-linux: 908605.00 (SE +/- 5904.67, N = 3)
amd_opencl_linux: 908135.00 (SE +/- 1706.69, N = 3)
nvidia_opencl_linux: 1130.12 (SE +/- 1.98, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2015-11-10, GB/s, More Is Better
amd-opencl-linux: 5.6658 (SE +/- 0.0024, N = 3)
amd_opencl_linux: 5.6690 (SE +/- 0.0012, N = 3)
nvidia_opencl_linux: 12.6878 (SE +/- 0.0011, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2015-11-10, GB/s, More Is Better
amd-opencl-linux: 5.2447 (SE +/- 0.0023, N = 3)
amd_opencl_linux: 5.2461 (SE +/- 0.0057, N = 3)
nvidia_opencl_linux: 12.7640 (SE +/- 0.0014, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2015-11-10, GB/s, More Is Better
amd-opencl-linux: 81.86 (SE +/- 0.37, N = 3)
amd_opencl_linux: 81.67 (SE +/- 0.21, N = 3)
nvidia_opencl_linux: 110.99 (SE +/- 0.57, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SmallPT GPU

OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Caustic

SmallPT GPU 1.6pts1, Samples/sec, More Is Better
amd-opencl-linux: 1615706931 (SE +/- 25.98, N = 3)
amd_opencl_linux: 1615700927 (SE +/- 25.98, N = 3)
nvidia_opencl_linux: 1606756748 (SE +/- 24.25, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SmallPT GPU

OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Cornell

SmallPT GPU 1.6pts1, Samples/sec, More Is Better
amd-opencl-linux: 1615707068 (SE +/- 24.54, N = 3)
amd_opencl_linux: 1615701063 (SE +/- 24.54, N = 3)
nvidia_opencl_linux: 1606756869 (SE +/- 20.78, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SmallPT GPU

OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Caustic3

SmallPT GPU 1.6pts1, Samples/sec, More Is Better
amd-opencl-linux: 1615707207 (SE +/- 25.98, N = 3)
amd_opencl_linux: 1615701203 (SE +/- 26.27, N = 3)
nvidia_opencl_linux: 1606756995 (SE +/- 23.96, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL


Phoronix Test Suite v10.8.4