nvidia_opencl_linux

AMD wx4150 on Ubuntu 20.04.2 with ROCM

HTML result view exported from: https://openbenchmarking.org/result/2103141-HA-2011304FI52&sor.

nvidia_opencl_linuxProcessorMotherboardChipsetMemoryDiskGraphicsAudioNetworkMonitorOSKernelDesktopDisplay ServerDisplay DriverOpenCLVulkanCompilerFile-SystemScreen ResolutionOpenGLnvidia_opencl_linuxamd_opencl_linuxIntel Core i7-4700MQ @ 3.40GHz (4 Cores / 8 Threads)HP 1909 (L70 Ver. 01.42 BIOS)Intel Xeon E3-1200 v3/4th32GB500GB Samsung SSD 860 + 256GB SAMSUNG MZ7PD256 + 500GB Seagate ST500LT012-1DG14 + 256GB SAMSUNG MZMPD256 + 128GB ED2S5NVIDIA Quadro M1000M 2GB (135/405MHz)IDT 92HD91BXXIntel I217-LM + Intel 7260Ubuntu 20.045.4.0-53-generic (x86_64)GNOME Shell 3.36.4X Server 1.20.8NVIDIA 450.80.02OpenCL 1.2 CUDA 11.0.2281.2.131GCC 9.3.0ext41920x1200HP 1909 (L70 Ver. 01.45 BIOS)500GB Samsung SSD 860 + 500GB Seagate ST500LT012-1DG14Intel HD 4600 2GB (1150MHz)HP ZR24w5.6.0-1042-oem (x86_64)X Server 1.20.94.5 Mesa 20.2.6OpenCL 2.0 AMD-APP (3212.0)1.2.1453840x1200OpenBenchmarking.orgCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: intel_pstate powersave - CPU Microcode: 0x28 - Thermald 1.9.1OpenCL Details- nvidia_opencl_linux: GPU Compute Cores: 512Python Details- nvidia_opencl_linux: Python 3.8.5Security Details- itlb_multihit: KVM: Mitigation of Split huge pages + l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable + mds: Mitigation of Clear buffers; SMT vulnerable + meltdown: Mitigation of PTI + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Mitigation of Microcode + tsx_async_abort: Not affected Kernel Details- amd_opencl_linux: Transparent Huge Pages: madviseGraphics Details- amd_opencl_linux: GLAMOR

nvidia_opencl_linuxshoc: OpenCL - Triadshoc: OpenCL - FFT SPshoc: OpenCL - MD5 Hashshoc: OpenCL - Max SP Flopsshoc: OpenCL - Bus Speed Downloadshoc: OpenCL - Bus Speed Readbackshoc: OpenCL - Texture Read Bandwidthcl-mem: Copycl-mem: Readcl-mem: Writerodinia: OpenCL LavaMDrodinia: OpenCL Myocyterodinia: OpenCL Heartwallrodinia: OpenCL Particle Filterdarktable: Boat - OpenCLdarktable: Masskrug - OpenCLdarktable: Server Rack - OpenCLdarktable: Server Room - OpenCLblender: BMW27 - OpenCLblender: Barbershop - OpenCLsmallpt-gpu: GPU - 1920 x 1200 - Causticsmallpt-gpu: GPU - 1920 x 1200 - Cornellsmallpt-gpu: GPU - 1920 x 1200 - Caustic3luxmark: GPU - Hotelluxmark: GPU - Microphoneluxmark: GPU - Luxball HDRclpeak: Kernel Latencyclpeak: Integer Compute INTclpeak: Single-Precision Floatclpeak: Double-Precision Doubleclpeak: Global Memory Bandwidthclpeak: Transfer Bandwidth enqueueReadBufferclpeak: Transfer Bandwidth enqueueWriteBuffernvidia_opencl_linuxamd_opencl_linux10.5663122.3731.43221130.1212.687812.7640110.99060.167.463.33.95155.3385.97845.30611.03110.8220.3234.594694.902379.33160675674816067568691606756995751251237307.75251.31700.2935.6366.996.6310.934.7502217.1392.35029081355.66905.246181.667168.579.674.1230.3457.7988.0479.7510.2232.902161570092716157010631615701203490375753226.13368.481823.80115.4475.2110.3619.39OpenBenchmarking.org

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: OpenCL - Benchmark: Triadnvidia_opencl_linuxamd_opencl_linux3691215SE +/- 0.0068, N = 3SE +/- 0.0171, N = 310.56634.75021. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

OpenBenchmarking.orgGFLOPS, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: OpenCL - Benchmark: FFT SPamd_opencl_linuxnvidia_opencl_linux50100150200250SE +/- 0.28, N = 3SE +/- 0.82, N = 3217.14122.371. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

OpenBenchmarking.orgGHash/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: OpenCL - Benchmark: MD5 Hashamd_opencl_linuxnvidia_opencl_linux0.52881.05761.58642.11522.644SE +/- 0.0003, N = 3SE +/- 0.0007, N = 32.35021.43221. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

OpenBenchmarking.orgGFLOPS, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: OpenCL - Benchmark: Max SP Flopsamd_opencl_linuxnvidia_opencl_linux200K400K600K800K1000KSE +/- 1706.69, N = 3SE +/- 1.98, N = 3908135.001130.121. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: OpenCL - Benchmark: Bus Speed Downloadnvidia_opencl_linuxamd_opencl_linux3691215SE +/- 0.0011, N = 3SE +/- 0.0012, N = 312.68785.66901. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: OpenCL - Benchmark: Bus Speed Readbacknvidia_opencl_linuxamd_opencl_linux3691215SE +/- 0.0014, N = 3SE +/- 0.0057, N = 312.76405.24611. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

OpenBenchmarking.orgGB/s, More Is BetterSHOC Scalable HeterOgeneous Computing 2015-11-10Target: OpenCL - Benchmark: Texture Read Bandwidthnvidia_opencl_linuxamd_opencl_linux20406080100SE +/- 0.57, N = 3SE +/- 0.21, N = 3110.9981.671. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

cl-mem

Benchmark: Copy

OpenBenchmarking.orgGB/s, More Is Bettercl-mem 2017-01-13Benchmark: Copyamd_opencl_linuxnvidia_opencl_linux1530456075SE +/- 0.12, N = 3SE +/- 0.00, N = 368.560.11. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

OpenBenchmarking.orgGB/s, More Is Bettercl-mem 2017-01-13Benchmark: Readamd_opencl_linuxnvidia_opencl_linux20406080100SE +/- 0.03, N = 3SE +/- 0.03, N = 379.667.41. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

OpenBenchmarking.orgGB/s, More Is Bettercl-mem 2017-01-13Benchmark: Writeamd_opencl_linuxnvidia_opencl_linux1632486480SE +/- 0.13, N = 3SE +/- 0.03, N = 374.163.31. (CC) gcc options: -O2 -flto -lOpenCL

Rodinia

Test: OpenCL LavaMD

OpenBenchmarking.orgSeconds, Fewer Is BetterRodinia 3.1Test: OpenCL LavaMDnvidia_opencl_linux0.8891.7782.6673.5564.445SE +/- 0.052, N = 53.9511. (CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenCL Myocyte

OpenBenchmarking.orgSeconds, Fewer Is BetterRodinia 3.1Test: OpenCL Myocytenvidia_opencl_linuxamd_opencl_linux50100150200250SE +/- 0.27, N = 3SE +/- 0.41, N = 355.34230.351. (CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenCL Heartwall

OpenBenchmarking.orgSeconds, Fewer Is BetterRodinia 3.1Test: OpenCL Heartwallnvidia_opencl_linuxamd_opencl_linux246810SE +/- 0.057, N = 14SE +/- 0.026, N = 35.9787.7981. (CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenCL Particle Filter

OpenBenchmarking.orgSeconds, Fewer Is BetterRodinia 3.1Test: OpenCL Particle Filternvidia_opencl_linux1020304050SE +/- 0.03, N = 345.311. (CXX) g++ options: -O2 -lOpenCL

Darktable

Test: Boat - Acceleration: OpenCL

OpenBenchmarking.orgSeconds, Fewer Is BetterDarktable 3.0.1Test: Boat - Acceleration: OpenCLamd_opencl_linuxnvidia_opencl_linux3691215SE +/- 0.044, N = 3SE +/- 0.011, N = 38.04711.031

Darktable

Test: Masskrug - Acceleration: OpenCL

OpenBenchmarking.orgSeconds, Fewer Is BetterDarktable 3.0.1Test: Masskrug - Acceleration: OpenCLamd_opencl_linuxnvidia_opencl_linux3691215SE +/- 0.032, N = 3SE +/- 0.011, N = 39.75110.822

Darktable

Test: Server Rack - Acceleration: OpenCL

OpenBenchmarking.orgSeconds, Fewer Is BetterDarktable 3.0.1Test: Server Rack - Acceleration: OpenCLamd_opencl_linuxnvidia_opencl_linux0.07270.14540.21810.29080.3635SE +/- 0.002, N = 15SE +/- 0.001, N = 30.2230.323

Darktable

Test: Server Room - Acceleration: OpenCL

OpenBenchmarking.orgSeconds, Fewer Is BetterDarktable 3.0.1Test: Server Room - Acceleration: OpenCLamd_opencl_linuxnvidia_opencl_linux1.03372.06743.10114.13485.1685SE +/- 0.021, N = 15SE +/- 0.007, N = 32.9024.594

Blender

Blend File: BMW27 - Compute: OpenCL

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 2.90Blend File: BMW27 - Compute: OpenCLnvidia_opencl_linux150300450600750SE +/- 10.50, N = 3694.90

Blender

Blend File: Barbershop - Compute: OpenCL

OpenBenchmarking.orgSeconds, Fewer Is BetterBlender 2.90Blend File: Barbershop - Compute: OpenCLnvidia_opencl_linux5001000150020002500SE +/- 5.31, N = 32379.33

SmallPT GPU

OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Caustic

OpenBenchmarking.orgSamples/sec, More Is BetterSmallPT GPU 1.6pts1OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Causticamd_opencl_linuxnvidia_opencl_linux300M600M900M1200M1500MSE +/- 25.98, N = 3SE +/- 24.25, N = 3161570092716067567481. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SmallPT GPU

OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Cornell

OpenBenchmarking.orgSamples/sec, More Is BetterSmallPT GPU 1.6pts1OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Cornellamd_opencl_linuxnvidia_opencl_linux300M600M900M1200M1500MSE +/- 24.54, N = 3SE +/- 20.78, N = 3161570106316067568691. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SmallPT GPU

OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Caustic3

OpenBenchmarking.orgSamples/sec, More Is BetterSmallPT GPU 1.6pts1OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Caustic3amd_opencl_linuxnvidia_opencl_linux300M600M900M1200M1500MSE +/- 26.27, N = 3SE +/- 23.96, N = 3161570120316067569951. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

LuxMark

OpenCL Device: GPU - Scene: Hotel

OpenBenchmarking.orgScore, More Is BetterLuxMark 3.1OpenCL Device: GPU - Scene: Hotelnvidia_opencl_linuxamd_opencl_linux160320480640800SE +/- 3.48, N = 3SE +/- 0.67, N = 3751490

LuxMark

OpenCL Device: GPU - Scene: Microphone

OpenBenchmarking.orgScore, More Is BetterLuxMark 3.1OpenCL Device: GPU - Scene: Microphoneamd_opencl_linuxnvidia_opencl_linux8001600240032004000SE +/- 10.84, N = 3SE +/- 2.67, N = 337572512

LuxMark

OpenCL Device: GPU - Scene: Luxball HDR

OpenBenchmarking.orgScore, More Is BetterLuxMark 3.1OpenCL Device: GPU - Scene: Luxball HDRamd_opencl_linuxnvidia_opencl_linux11002200330044005500SE +/- 12.68, N = 353223730

clpeak

OpenCL Test: Kernel Latency

OpenBenchmarking.orgus, Fewer Is BetterclpeakOpenCL Test: Kernel Latencyamd_opencl_linuxnvidia_opencl_linux246810SE +/- 0.06, N = 7SE +/- 0.05, N = 36.137.751. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Integer Compute INT

OpenBenchmarking.orgGIOPS, More Is BetterclpeakOpenCL Test: Integer Compute INTamd_opencl_linuxnvidia_opencl_linux80160240320400SE +/- 0.11, N = 3SE +/- 1.52, N = 3368.48251.311. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Single-Precision Float

OpenBenchmarking.orgGFLOPS, More Is BetterclpeakOpenCL Test: Single-Precision Floatamd_opencl_linuxnvidia_opencl_linux400800120016002000SE +/- 0.11, N = 3SE +/- 0.31, N = 31823.80700.291. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Double-Precision Double

OpenBenchmarking.orgGFLOPS, More Is BetterclpeakOpenCL Test: Double-Precision Doubleamd_opencl_linuxnvidia_opencl_linux306090120150SE +/- 0.11, N = 3SE +/- 0.02, N = 3115.4435.631. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Global Memory Bandwidth

OpenBenchmarking.orgGBPS, More Is BetterclpeakOpenCL Test: Global Memory Bandwidthamd_opencl_linuxnvidia_opencl_linux20406080100SE +/- 0.06, N = 3SE +/- 0.05, N = 375.2166.991. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Transfer Bandwidth enqueueReadBuffer

OpenBenchmarking.orgGBPS, More Is BetterclpeakOpenCL Test: Transfer Bandwidth enqueueReadBufferamd_opencl_linuxnvidia_opencl_linux3691215SE +/- 0.07, N = 3SE +/- 0.01, N = 310.366.631. (CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Transfer Bandwidth enqueueWriteBuffer

OpenBenchmarking.orgGBPS, More Is BetterclpeakOpenCL Test: Transfer Bandwidth enqueueWriteBufferamd_opencl_linuxnvidia_opencl_linux510152025SE +/- 0.03, N = 3SE +/- 0.06, N = 319.3910.931. (CXX) g++ options: -O3 -rdynamic -lOpenCL


Phoronix Test Suite v10.8.4