nvidia_opencl_linux

AMD wx4150 on Ubuntu 20.04.2 with ROCM fan speed max

HTML result view exported from: https://openbenchmarking.org/result/2103143-HA-2011304FI79&sor&grs.

nvidia_opencl_linux
Processor: Intel Core i7-4700MQ @ 3.40GHz (4 Cores / 8 Threads)
Motherboard: HP 1909 (L70 Ver. 01.42 BIOS)
Chipset: Intel Xeon E3-1200 v3/4th
Memory: 32GB
Disk: 500GB Samsung SSD 860 + 256GB SAMSUNG MZ7PD256 + 500GB Seagate ST500LT012-1DG14 + 256GB SAMSUNG MZMPD256 + 128GB ED2S5
Graphics: NVIDIA Quadro M1000M 2GB (135/405MHz)
Audio: IDT 92HD91BXX
Network: Intel I217-LM + Intel 7260
OS: Ubuntu 20.04
Kernel: 5.4.0-53-generic (x86_64)
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: NVIDIA 450.80.02
OpenCL: OpenCL 1.2 CUDA 11.0.228
Vulkan: 1.2.131
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1200

amd_opencl_linux, amd-opencl-linux (only the fields that differ from nvidia_opencl_linux are listed)
Motherboard: HP 1909 (L70 Ver. 01.45 BIOS)
Disk: 500GB Samsung SSD 860 + 500GB Seagate ST500LT012-1DG14
Graphics: Intel HD 4600 2GB (1150MHz)
Monitor: HP ZR24w
Kernel: 5.6.0-1042-oem (x86_64)
Display Server: X Server 1.20.9
OpenGL: 4.5 Mesa 20.2.6
OpenCL: OpenCL 2.0 AMD-APP (3212.0)
Vulkan: 1.2.145
Screen Resolution: 3840x1200

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate powersave - CPU Microcode: 0x28 - Thermald 1.9.1

OpenCL Details
- nvidia_opencl_linux: GPU Compute Cores: 512

Python Details
- nvidia_opencl_linux: Python 3.8.5

Security Details
- itlb_multihit: KVM: Mitigation of Split huge pages + l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable + mds: Mitigation of Clear buffers; SMT vulnerable + meltdown: Mitigation of PTI + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Mitigation of Microcode + tsx_async_abort: Not affected

Kernel Details
- amd_opencl_linux, amd-opencl-linux: Transparent Huge Pages: madvise

Graphics Details
- amd_opencl_linux, amd-opencl-linux: GLAMOR

Test                                           nvidia_opencl_linux  amd_opencl_linux  amd-opencl-linux
shoc: OpenCL - Max SP Flops                    1130.12              908135            908605
rodinia: OpenCL Myocyte                        55.338               230.345           229.859
clpeak: Double-Precision Double                35.63                115.44            115.54
clpeak: Single-Precision Float                 700.29               1823.80           1824.96
shoc: OpenCL - Bus Speed Readback              12.7640              5.2461            5.2447
shoc: OpenCL - Bus Speed Download              12.6878              5.6690            5.6658
shoc: OpenCL - Triad                           10.5663              4.7502            4.7505
clpeak: Transfer Bandwidth enqueueWriteBuffer  10.93                19.39             19.46
shoc: OpenCL - FFT SP                          122.37               217.139           217.332
shoc: OpenCL - MD5 Hash                        1.4322               2.3502            2.3492
darktable: Server Room - OpenCL                4.594                2.902             2.926
clpeak: Transfer Bandwidth enqueueReadBuffer   6.63                 10.36             10.41
luxmark: GPU - Hotel                           751                  490               491
luxmark: GPU - Microphone                      2512                 3757              3767
clpeak: Integer Compute INT                    251.31               368.48            367.95
darktable: Server Rack - OpenCL                0.323                0.223             0.228
luxmark: GPU - Luxball HDR                     3730                 5322              5324
darktable: Boat - OpenCL                       11.031               8.047             8.083
shoc: OpenCL - Texture Read Bandwidth          110.990              81.6671           81.8607
rodinia: OpenCL Heartwall                      5.978                7.798             7.810
clpeak: Kernel Latency                         7.75                 6.13              6.15
cl-mem: Read                                   67.4                 79.6              79.5
cl-mem: Write                                  63.3                 74.1              73.9
cl-mem: Copy                                   60.1                 68.5              68.5
clpeak: Global Memory Bandwidth                66.99                75.21             75.26
darktable: Masskrug - OpenCL                   10.822               9.751             9.776
smallpt-gpu: GPU - 1920 x 1200 - Caustic3      1606756995           1615701203        1615707207
smallpt-gpu: GPU - 1920 x 1200 - Cornell       1606756869           1615701063        1615707068
smallpt-gpu: GPU - 1920 x 1200 - Caustic       1606756748           1615700927        1615706931
blender: Barbershop - OpenCL                   2379.33              -                 -
blender: BMW27 - OpenCL                        694.90               -                 -
rodinia: OpenCL Particle Filter                45.306               -                 -
rodinia: OpenCL LavaMD                         3.951                -                 -

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2015-11-10 - GFLOPS, More Is Better
amd-opencl-linux: 908605.00 (SE +/- 5904.67, N = 3)
amd_opencl_linux: 908135.00 (SE +/- 1706.69, N = 3)
nvidia_opencl_linux: 1130.12 (SE +/- 1.98, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
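Every result in this export carries an "SE +/- x, N = n" annotation. The export does not spell out how that figure is derived, so the following is only a minimal sketch that assumes the conventional definition of the standard error of the mean (sample standard deviation divided by the square root of N). The function name `standard_error` and the run values are hypothetical and are not taken from this result file.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-run values, NOT taken from this result file.
hypothetical_runs = [10.0, 12.0, 14.0]

mean = statistics.mean(hypothetical_runs)
se = standard_error(hypothetical_runs)
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(hypothetical_runs)})")
```

A small SE relative to the mean suggests the runs were consistent; tests reported with N greater than 3 were presumably repeated more often than the others, though the export does not say why.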

Rodinia

Test: OpenCL Myocyte

Rodinia 3.1 - Seconds, Fewer Is Better
nvidia_opencl_linux: 55.34 (SE +/- 0.27, N = 3)
amd-opencl-linux: 229.86 (SE +/- 0.24, N = 3)
amd_opencl_linux: 230.35 (SE +/- 0.41, N = 3)
(CXX) g++ options: -O2 -lOpenCL

clpeak

OpenCL Test: Double-Precision Double

clpeak - GFLOPS, More Is Better
amd-opencl-linux: 115.54 (SE +/- 0.02, N = 3)
amd_opencl_linux: 115.44 (SE +/- 0.11, N = 3)
nvidia_opencl_linux: 35.63 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Single-Precision Float

clpeak - GFLOPS, More Is Better
amd-opencl-linux: 1824.96 (SE +/- 0.08, N = 3)
amd_opencl_linux: 1823.80 (SE +/- 0.11, N = 3)
nvidia_opencl_linux: 700.29 (SE +/- 0.31, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2015-11-10 - GB/s, More Is Better
nvidia_opencl_linux: 12.7640 (SE +/- 0.0014, N = 3)
amd_opencl_linux: 5.2461 (SE +/- 0.0057, N = 3)
amd-opencl-linux: 5.2447 (SE +/- 0.0023, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2015-11-10 - GB/s, More Is Better
nvidia_opencl_linux: 12.6878 (SE +/- 0.0011, N = 3)
amd_opencl_linux: 5.6690 (SE +/- 0.0012, N = 3)
amd-opencl-linux: 5.6658 (SE +/- 0.0024, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2015-11-10 - GB/s, More Is Better
nvidia_opencl_linux: 10.5663 (SE +/- 0.0068, N = 3)
amd-opencl-linux: 4.7505 (SE +/- 0.0336, N = 3)
amd_opencl_linux: 4.7502 (SE +/- 0.0171, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

clpeak

OpenCL Test: Transfer Bandwidth enqueueWriteBuffer

clpeak - GBPS, More Is Better
amd-opencl-linux: 19.46 (SE +/- 0.01, N = 3)
amd_opencl_linux: 19.39 (SE +/- 0.03, N = 3)
nvidia_opencl_linux: 10.93 (SE +/- 0.06, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2015-11-10 - GFLOPS, More Is Better
amd-opencl-linux: 217.33 (SE +/- 0.19, N = 3)
amd_opencl_linux: 217.14 (SE +/- 0.28, N = 3)
nvidia_opencl_linux: 122.37 (SE +/- 0.82, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2015-11-10 - GHash/s, More Is Better
amd_opencl_linux: 2.3502 (SE +/- 0.0003, N = 3)
amd-opencl-linux: 2.3492 (SE +/- 0.0007, N = 3)
nvidia_opencl_linux: 1.4322 (SE +/- 0.0007, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

Darktable

Test: Server Room - Acceleration: OpenCL

Darktable 3.0.1 - Seconds, Fewer Is Better
amd_opencl_linux: 2.902 (SE +/- 0.021, N = 15)
amd-opencl-linux: 2.926 (SE +/- 0.030, N = 3)
nvidia_opencl_linux: 4.594 (SE +/- 0.007, N = 3)

clpeak

OpenCL Test: Transfer Bandwidth enqueueReadBuffer

clpeak - GBPS, More Is Better
amd-opencl-linux: 10.41 (SE +/- 0.04, N = 3)
amd_opencl_linux: 10.36 (SE +/- 0.07, N = 3)
nvidia_opencl_linux: 6.63 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

LuxMark

OpenCL Device: GPU - Scene: Hotel

LuxMark 3.1 - Score, More Is Better
nvidia_opencl_linux: 751
amd-opencl-linux: 491
amd_opencl_linux: 490
Standard errors as listed (only two are reported for the three runs): SE +/- 3.48, N = 3; SE +/- 0.67, N = 3

LuxMark

OpenCL Device: GPU - Scene: Microphone

LuxMark 3.1 - Score, More Is Better
amd-opencl-linux: 3767
amd_opencl_linux: 3757
nvidia_opencl_linux: 2512
Standard errors as listed (only two are reported for the three runs): SE +/- 10.84, N = 3; SE +/- 2.67, N = 3

clpeak

OpenCL Test: Integer Compute INT

clpeak - GIOPS, More Is Better
amd_opencl_linux: 368.48 (SE +/- 0.11, N = 3)
amd-opencl-linux: 367.95 (SE +/- 0.05, N = 3)
nvidia_opencl_linux: 251.31 (SE +/- 1.52, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

Darktable

Test: Server Rack - Acceleration: OpenCL

Darktable 3.0.1 - Seconds, Fewer Is Better
amd_opencl_linux: 0.223 (SE +/- 0.002, N = 15)
amd-opencl-linux: 0.228 (SE +/- 0.003, N = 3)
nvidia_opencl_linux: 0.323 (SE +/- 0.001, N = 3)

LuxMark

OpenCL Device: GPU - Scene: Luxball HDR

LuxMark 3.1 - Score, More Is Better
amd-opencl-linux: 5324
amd_opencl_linux: 5322
nvidia_opencl_linux: 3730
Standard errors as listed (only two are reported for the three runs): SE +/- 3.21, N = 3; SE +/- 12.68, N = 3

Darktable

Test: Boat - Acceleration: OpenCL

Darktable 3.0.1 - Seconds, Fewer Is Better
amd_opencl_linux: 8.047 (SE +/- 0.044, N = 3)
amd-opencl-linux: 8.083 (SE +/- 0.016, N = 3)
nvidia_opencl_linux: 11.031 (SE +/- 0.011, N = 3)

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2015-11-10 - GB/s, More Is Better
nvidia_opencl_linux: 110.99 (SE +/- 0.57, N = 3)
amd-opencl-linux: 81.86 (SE +/- 0.37, N = 3)
amd_opencl_linux: 81.67 (SE +/- 0.21, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

Rodinia

Test: OpenCL Heartwall

Rodinia 3.1 - Seconds, Fewer Is Better
nvidia_opencl_linux: 5.978 (SE +/- 0.057, N = 14)
amd_opencl_linux: 7.798 (SE +/- 0.026, N = 3)
amd-opencl-linux: 7.810 (SE +/- 0.013, N = 3)
(CXX) g++ options: -O2 -lOpenCL

clpeak

OpenCL Test: Kernel Latency

clpeak - us, Fewer Is Better
amd_opencl_linux: 6.13 (SE +/- 0.06, N = 7)
amd-opencl-linux: 6.15 (SE +/- 0.06, N = 5)
nvidia_opencl_linux: 7.75 (SE +/- 0.05, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13 - GB/s, More Is Better
amd_opencl_linux: 79.6 (SE +/- 0.03, N = 3)
amd-opencl-linux: 79.5 (SE +/- 0.07, N = 3)
nvidia_opencl_linux: 67.4 (SE +/- 0.03, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

cl-mem 2017-01-13 - GB/s, More Is Better
amd_opencl_linux: 74.1 (SE +/- 0.13, N = 3)
amd-opencl-linux: 73.9 (SE +/- 0.12, N = 3)
nvidia_opencl_linux: 63.3 (SE +/- 0.03, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Copy

cl-mem 2017-01-13 - GB/s, More Is Better
amd-opencl-linux: 68.5 (SE +/- 0.06, N = 3)
amd_opencl_linux: 68.5 (SE +/- 0.12, N = 3)
nvidia_opencl_linux: 60.1 (SE +/- 0.00, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

clpeak

OpenCL Test: Global Memory Bandwidth

clpeak - GBPS, More Is Better
amd-opencl-linux: 75.26 (SE +/- 0.01, N = 3)
amd_opencl_linux: 75.21 (SE +/- 0.06, N = 3)
nvidia_opencl_linux: 66.99 (SE +/- 0.05, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

Darktable

Test: Masskrug - Acceleration: OpenCL

Darktable 3.0.1 - Seconds, Fewer Is Better
amd_opencl_linux: 9.751 (SE +/- 0.032, N = 3)
amd-opencl-linux: 9.776 (SE +/- 0.016, N = 3)
nvidia_opencl_linux: 10.822 (SE +/- 0.011, N = 3)

SmallPT GPU

OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Caustic3

SmallPT GPU 1.6pts1 - Samples/sec, More Is Better
amd-opencl-linux: 1615707207 (SE +/- 25.98, N = 3)
amd_opencl_linux: 1615701203 (SE +/- 26.27, N = 3)
nvidia_opencl_linux: 1606756995 (SE +/- 23.96, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SmallPT GPU

OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Cornell

SmallPT GPU 1.6pts1 - Samples/sec, More Is Better
amd-opencl-linux: 1615707068 (SE +/- 24.54, N = 3)
amd_opencl_linux: 1615701063 (SE +/- 24.54, N = 3)
nvidia_opencl_linux: 1606756869 (SE +/- 20.78, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SmallPT GPU

OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Caustic

SmallPT GPU 1.6pts1 - Samples/sec, More Is Better
amd-opencl-linux: 1615706931 (SE +/- 25.98, N = 3)
amd_opencl_linux: 1615700927 (SE +/- 25.98, N = 3)
nvidia_opencl_linux: 1606756748 (SE +/- 24.25, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

Blender

Blend File: Barbershop - Compute: OpenCL

Blender 2.90 - Seconds, Fewer Is Better
nvidia_opencl_linux: 2379.33 (SE +/- 5.31, N = 3)

Blender

Blend File: BMW27 - Compute: OpenCL

Blender 2.90 - Seconds, Fewer Is Better
nvidia_opencl_linux: 694.90 (SE +/- 10.50, N = 3)

Rodinia

Test: OpenCL Particle Filter

Rodinia 3.1 - Seconds, Fewer Is Better
nvidia_opencl_linux: 45.31 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenCL LavaMD

Rodinia 3.1 - Seconds, Fewer Is Better
nvidia_opencl_linux: 3.951 (SE +/- 0.052, N = 5)
(CXX) g++ options: -O2 -lOpenCL
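The results in this export mix "More Is Better" and "Fewer Is Better" metrics, so raw values cannot be compared directly across tests. As a rough sketch, a ratio can be normalized so that a value above 1.0 always means the first system did better. The helper name `relative_performance` is hypothetical and not part of the Phoronix Test Suite; the four sample rows are copied from the amd-opencl-linux and nvidia_opencl_linux figures reported in this file.

```python
def relative_performance(candidate, baseline, higher_is_better):
    """Return how many times better `candidate` is than `baseline`.

    For "More Is Better" metrics this is candidate / baseline; for
    "Fewer Is Better" metrics the ratio is inverted, so a result
    above 1.0 always favors the candidate.
    """
    if candidate <= 0 or baseline <= 0:
        raise ValueError("results must be positive")
    if higher_is_better:
        return candidate / baseline
    return baseline / candidate


# (test, amd-opencl-linux, nvidia_opencl_linux, higher_is_better)
# Values copied from the per-test results in this export.
results = [
    ("clpeak Single-Precision Float (GFLOPS)", 1824.96, 700.29, True),
    ("Rodinia OpenCL Myocyte (Seconds)", 229.86, 55.34, False),
    ("Darktable Boat (Seconds)", 8.083, 11.031, False),
    ("SHOC Bus Speed Download (GB/s)", 5.6658, 12.6878, True),
]

for name, amd, nvidia, higher_is_better in results:
    ratio = relative_performance(amd, nvidia, higher_is_better)
    print(f"{name}: amd-opencl-linux at {ratio:.2f}x nvidia_opencl_linux")
```

Ratios below 1.0 mark the tests where nvidia_opencl_linux came out ahead. Tests that only one system completed (Blender, Rodinia Particle Filter, Rodinia LavaMD) have no counterpart and are left out of the comparison.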


Phoronix Test Suite v10.8.4