nvidia_opencl_linux

AMD WX4150 on Ubuntu 20.04.2 with ROCm, fan speed at maximum

HTML result view exported from: https://openbenchmarking.org/result/2103143-HA-2011304FI79&gru.

System details

nvidia_opencl_linux:
  Processor: Intel Core i7-4700MQ @ 3.40GHz (4 Cores / 8 Threads)
  Motherboard: HP 1909 (L70 Ver. 01.42 BIOS)
  Chipset: Intel Xeon E3-1200 v3/4th
  Memory: 32GB
  Disk: 500GB Samsung SSD 860 + 256GB SAMSUNG MZ7PD256 + 500GB Seagate ST500LT012-1DG14 + 256GB SAMSUNG MZMPD256 + 128GB ED2S5
  Graphics: NVIDIA Quadro M1000M 2GB (135/405MHz)
  Audio: IDT 92HD91BXX
  Network: Intel I217-LM + Intel 7260
  OS: Ubuntu 20.04
  Kernel: 5.4.0-53-generic (x86_64)
  Desktop: GNOME Shell 3.36.4
  Display Server: X Server 1.20.8
  Display Driver: NVIDIA 450.80.02
  OpenCL: OpenCL 1.2 CUDA 11.0.228
  Vulkan: 1.2.131
  Compiler: GCC 9.3.0
  File-System: ext4
  Screen Resolution: 1920x1200

amd_opencl_linux, amd-opencl-linux (values listed as differing from the above):
  Motherboard: HP 1909 (L70 Ver. 01.45 BIOS)
  Disk: 500GB Samsung SSD 860 + 500GB Seagate ST500LT012-1DG14
  Graphics: Intel HD 4600 2GB (1150MHz)
  Monitor: HP ZR24w
  Kernel: 5.6.0-1042-oem (x86_64)
  Display Server: X Server 1.20.9
  OpenGL: 4.5 Mesa 20.2.6
  OpenCL: OpenCL 2.0 AMD-APP (3212.0)
  Vulkan: 1.2.145
  Screen Resolution: 3840x1200

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate powersave - CPU Microcode: 0x28 - Thermald 1.9.1

OpenCL Details
- nvidia_opencl_linux: GPU Compute Cores: 512

Python Details
- nvidia_opencl_linux: Python 3.8.5

Security Details
- itlb_multihit: KVM: Mitigation of Split huge pages + l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable + mds: Mitigation of Clear buffers; SMT vulnerable + meltdown: Mitigation of PTI + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Mitigation of Microcode + tsx_async_abort: Not affected

Kernel Details
- amd_opencl_linux, amd-opencl-linux: Transparent Huge Pages: madvise

Graphics Details
- amd_opencl_linux, amd-opencl-linux: GLAMOR

Results overview

Test                                           nvidia_opencl_linux  amd_opencl_linux  amd-opencl-linux
shoc: OpenCL - Triad                           10.5663              4.7502            4.7505
shoc: OpenCL - Bus Speed Download              12.6878              5.6690            5.6658
shoc: OpenCL - Bus Speed Readback              12.7640              5.2461            5.2447
shoc: OpenCL - Texture Read Bandwidth          110.990              81.6671           81.8607
cl-mem: Copy                                   60.1                 68.5              68.5
cl-mem: Read                                   67.4                 79.6              79.5
cl-mem: Write                                  63.3                 74.1              73.9
clpeak: Global Memory Bandwidth                66.99                75.21             75.26
clpeak: Transfer Bandwidth enqueueReadBuffer   6.63                 10.36             10.41
clpeak: Transfer Bandwidth enqueueWriteBuffer  10.93                19.39             19.46
shoc: OpenCL - FFT SP                          122.373              217.139           217.332
shoc: OpenCL - Max SP Flops                    1130.12              908135            908605
clpeak: Single-Precision Float                 700.29               1823.80           1824.96
clpeak: Double-Precision Double                35.63                115.44            115.54
shoc: OpenCL - MD5 Hash                        1.4322               2.3502            2.3492
clpeak: Integer Compute INT                    251.31               368.48            367.95
smallpt-gpu: GPU - 1920 x 1200 - Caustic       1606756748           1615700927        1615706931
smallpt-gpu: GPU - 1920 x 1200 - Cornell       1606756869           1615701063        1615707068
smallpt-gpu: GPU - 1920 x 1200 - Caustic3      1606756995           1615701203        1615707207
luxmark: GPU - Hotel                           751                  490               491
luxmark: GPU - Microphone                      2512                 3757              3767
luxmark: GPU - Luxball HDR                     3730                 5322              5324
rodinia: OpenCL LavaMD                         3.951                -                 -
rodinia: OpenCL Myocyte                        55.338               230.345           229.859
rodinia: OpenCL Heartwall                      5.978                7.798             7.810
rodinia: OpenCL Particle Filter                45.306               -                 -
darktable: Boat - OpenCL                       11.031               8.047             8.083
darktable: Masskrug - OpenCL                   10.822               9.751             9.776
darktable: Server Rack - OpenCL                0.323                0.223             0.228
darktable: Server Room - OpenCL                4.594                2.902             2.926
blender: BMW27 - OpenCL                        694.90               -                 -
blender: Barbershop - OpenCL                   2379.33              -                 -
clpeak: Kernel Latency                         7.75                 6.13              6.15
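Because some tests are "More Is Better" and others "Fewer Is Better", raw values cannot be compared in one direction. The following is a minimal sketch of how a per-test ratio could be computed so that a value above 1 always means the amd_opencl_linux run did better. The four rows are copied from the overview; the function name `amd_advantage` and the dictionary layout are illustrative assumptions, not part of the Phoronix Test Suite.

```python
# Each entry: (nvidia_opencl_linux, amd_opencl_linux, more_is_better).
# Values are taken from the overview; the direction flag comes from the
# "More Is Better" / "Fewer Is Better" label on each test.
results = {
    "shoc: OpenCL - Triad": (10.5663, 4.7502, True),
    "clpeak: Single-Precision Float": (700.29, 1823.80, True),
    "rodinia: OpenCL Myocyte": (55.338, 230.345, False),
    "darktable: Boat - OpenCL": (11.031, 8.047, False),
}


def amd_advantage(nvidia, amd, more_is_better):
    """Hypothetical helper: ratio > 1 means the AMD run scored better."""
    if more_is_better:
        return amd / nvidia
    # For time-like metrics, a smaller value is the better result.
    return nvidia / amd


ratios = {name: amd_advantage(*row) for name, row in results.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

Normalizing direction this way keeps throughput and runtime tests on one scale, which is what makes a mixed table like the one above readable at a glance.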

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

GB/s, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
nvidia_opencl_linux: 10.5663 (SE +/- 0.0068, N = 3)
amd_opencl_linux: 4.7502 (SE +/- 0.0171, N = 3)
amd-opencl-linux: 4.7505 (SE +/- 0.0336, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
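Each result is reported with "SE +/- x, N = n", a standard error over n runs. The export does not include the individual run values or state the exact formula used, so the sketch below assumes the conventional definition (sample standard deviation divided by the square root of N). The `runs` list is hypothetical data chosen only to illustrate the calculation.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample stdev / sqrt(N) (assumed definition)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run values in GB/s; the real runs are not in the export.
runs = [10.55, 10.57, 10.58]

print(f"mean = {statistics.mean(runs):.4f}")
print(f"SE +/- {standard_error(runs):.4f}, N = {len(runs)}")
```

A small SE relative to the mean, as in most results here, suggests the runs were tightly clustered; a larger N on some tests may indicate that extra runs were added when variance was higher.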

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

GB/s, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
nvidia_opencl_linux: 12.6878 (SE +/- 0.0011, N = 3)
amd_opencl_linux: 5.6690 (SE +/- 0.0012, N = 3)
amd-opencl-linux: 5.6658 (SE +/- 0.0024, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

GB/s, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
nvidia_opencl_linux: 12.7640 (SE +/- 0.0014, N = 3)
amd_opencl_linux: 5.2461 (SE +/- 0.0057, N = 3)
amd-opencl-linux: 5.2447 (SE +/- 0.0023, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

GB/s, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
nvidia_opencl_linux: 110.99 (SE +/- 0.57, N = 3)
amd_opencl_linux: 81.67 (SE +/- 0.21, N = 3)
amd-opencl-linux: 81.86 (SE +/- 0.37, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

cl-mem

Benchmark: Copy

GB/s, More Is Better (cl-mem 2017-01-13)
nvidia_opencl_linux: 60.1 (SE +/- 0.00, N = 3)
amd_opencl_linux: 68.5 (SE +/- 0.12, N = 3)
amd-opencl-linux: 68.5 (SE +/- 0.06, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

GB/s, More Is Better (cl-mem 2017-01-13)
nvidia_opencl_linux: 67.4 (SE +/- 0.03, N = 3)
amd_opencl_linux: 79.6 (SE +/- 0.03, N = 3)
amd-opencl-linux: 79.5 (SE +/- 0.07, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

GB/s, More Is Better (cl-mem 2017-01-13)
nvidia_opencl_linux: 63.3 (SE +/- 0.03, N = 3)
amd_opencl_linux: 74.1 (SE +/- 0.13, N = 3)
amd-opencl-linux: 73.9 (SE +/- 0.12, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

clpeak

OpenCL Test: Global Memory Bandwidth

GBPS, More Is Better (clpeak)
nvidia_opencl_linux: 66.99 (SE +/- 0.05, N = 3)
amd_opencl_linux: 75.21 (SE +/- 0.06, N = 3)
amd-opencl-linux: 75.26 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Transfer Bandwidth enqueueReadBuffer

GBPS, More Is Better (clpeak)
nvidia_opencl_linux: 6.63 (SE +/- 0.01, N = 3)
amd_opencl_linux: 10.36 (SE +/- 0.07, N = 3)
amd-opencl-linux: 10.41 (SE +/- 0.04, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Transfer Bandwidth enqueueWriteBuffer

GBPS, More Is Better (clpeak)
nvidia_opencl_linux: 10.93 (SE +/- 0.06, N = 3)
amd_opencl_linux: 19.39 (SE +/- 0.03, N = 3)
amd-opencl-linux: 19.46 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

GFLOPS, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
nvidia_opencl_linux: 122.37 (SE +/- 0.82, N = 3)
amd_opencl_linux: 217.14 (SE +/- 0.28, N = 3)
amd-opencl-linux: 217.33 (SE +/- 0.19, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

GFLOPS, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
nvidia_opencl_linux: 1130.12 (SE +/- 1.98, N = 3)
amd_opencl_linux: 908135.00 (SE +/- 1706.69, N = 3)
amd-opencl-linux: 908605.00 (SE +/- 5904.67, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

clpeak

OpenCL Test: Single-Precision Float

GFLOPS, More Is Better (clpeak)
nvidia_opencl_linux: 700.29 (SE +/- 0.31, N = 3)
amd_opencl_linux: 1823.80 (SE +/- 0.11, N = 3)
amd-opencl-linux: 1824.96 (SE +/- 0.08, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Double-Precision Double

GFLOPS, More Is Better (clpeak)
nvidia_opencl_linux: 35.63 (SE +/- 0.02, N = 3)
amd_opencl_linux: 115.44 (SE +/- 0.11, N = 3)
amd-opencl-linux: 115.54 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

GHash/s, More Is Better (SHOC Scalable HeterOgeneous Computing 2015-11-10)
nvidia_opencl_linux: 1.4322 (SE +/- 0.0007, N = 3)
amd_opencl_linux: 2.3502 (SE +/- 0.0003, N = 3)
amd-opencl-linux: 2.3492 (SE +/- 0.0007, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

clpeak

OpenCL Test: Integer Compute INT

GIOPS, More Is Better (clpeak)
nvidia_opencl_linux: 251.31 (SE +/- 1.52, N = 3)
amd_opencl_linux: 368.48 (SE +/- 0.11, N = 3)
amd-opencl-linux: 367.95 (SE +/- 0.05, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

SmallPT GPU

OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Caustic

Samples/sec, More Is Better (SmallPT GPU 1.6pts1)
nvidia_opencl_linux: 1606756748 (SE +/- 24.25, N = 3)
amd_opencl_linux: 1615700927 (SE +/- 25.98, N = 3)
amd-opencl-linux: 1615706931 (SE +/- 25.98, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SmallPT GPU

OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Cornell

Samples/sec, More Is Better (SmallPT GPU 1.6pts1)
nvidia_opencl_linux: 1606756869 (SE +/- 20.78, N = 3)
amd_opencl_linux: 1615701063 (SE +/- 24.54, N = 3)
amd-opencl-linux: 1615707068 (SE +/- 24.54, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SmallPT GPU

OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Caustic3

Samples/sec, More Is Better (SmallPT GPU 1.6pts1)
nvidia_opencl_linux: 1606756995 (SE +/- 23.96, N = 3)
amd_opencl_linux: 1615701203 (SE +/- 26.27, N = 3)
amd-opencl-linux: 1615707207 (SE +/- 25.98, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

LuxMark

OpenCL Device: GPU - Scene: Hotel

Score, More Is Better (LuxMark 3.1)
nvidia_opencl_linux: 751
amd_opencl_linux: 490
amd-opencl-linux: 491
Standard errors as listed: SE +/- 3.48, N = 3; SE +/- 0.67, N = 3

LuxMark

OpenCL Device: GPU - Scene: Microphone

Score, More Is Better (LuxMark 3.1)
nvidia_opencl_linux: 2512
amd_opencl_linux: 3757
amd-opencl-linux: 3767
Standard errors as listed: SE +/- 2.67, N = 3; SE +/- 10.84, N = 3

LuxMark

OpenCL Device: GPU - Scene: Luxball HDR

Score, More Is Better (LuxMark 3.1)
nvidia_opencl_linux: 3730
amd_opencl_linux: 5322
amd-opencl-linux: 5324
Standard errors as listed: SE +/- 12.68, N = 3; SE +/- 3.21, N = 3

Rodinia

Test: OpenCL LavaMD

Seconds, Fewer Is Better (Rodinia 3.1)
nvidia_opencl_linux: 3.951 (SE +/- 0.052, N = 5)
(CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenCL Myocyte

Seconds, Fewer Is Better (Rodinia 3.1)
nvidia_opencl_linux: 55.34 (SE +/- 0.27, N = 3)
amd_opencl_linux: 230.35 (SE +/- 0.41, N = 3)
amd-opencl-linux: 229.86 (SE +/- 0.24, N = 3)
(CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenCL Heartwall

Seconds, Fewer Is Better (Rodinia 3.1)
nvidia_opencl_linux: 5.978 (SE +/- 0.057, N = 14)
amd_opencl_linux: 7.798 (SE +/- 0.026, N = 3)
amd-opencl-linux: 7.810 (SE +/- 0.013, N = 3)
(CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenCL Particle Filter

Seconds, Fewer Is Better (Rodinia 3.1)
nvidia_opencl_linux: 45.31 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O2 -lOpenCL

Darktable

Test: Boat - Acceleration: OpenCL

Seconds, Fewer Is Better (Darktable 3.0.1)
nvidia_opencl_linux: 11.031 (SE +/- 0.011, N = 3)
amd_opencl_linux: 8.047 (SE +/- 0.044, N = 3)
amd-opencl-linux: 8.083 (SE +/- 0.016, N = 3)

Darktable

Test: Masskrug - Acceleration: OpenCL

Seconds, Fewer Is Better (Darktable 3.0.1)
nvidia_opencl_linux: 10.822 (SE +/- 0.011, N = 3)
amd_opencl_linux: 9.751 (SE +/- 0.032, N = 3)
amd-opencl-linux: 9.776 (SE +/- 0.016, N = 3)

Darktable

Test: Server Rack - Acceleration: OpenCL

Seconds, Fewer Is Better (Darktable 3.0.1)
nvidia_opencl_linux: 0.323 (SE +/- 0.001, N = 3)
amd_opencl_linux: 0.223 (SE +/- 0.002, N = 15)
amd-opencl-linux: 0.228 (SE +/- 0.003, N = 3)

Darktable

Test: Server Room - Acceleration: OpenCL

Seconds, Fewer Is Better (Darktable 3.0.1)
nvidia_opencl_linux: 4.594 (SE +/- 0.007, N = 3)
amd_opencl_linux: 2.902 (SE +/- 0.021, N = 15)
amd-opencl-linux: 2.926 (SE +/- 0.030, N = 3)

Blender

Blend File: BMW27 - Compute: OpenCL

Seconds, Fewer Is Better (Blender 2.90)
nvidia_opencl_linux: 694.90 (SE +/- 10.50, N = 3)

Blender

Blend File: Barbershop - Compute: OpenCL

Seconds, Fewer Is Better (Blender 2.90)
nvidia_opencl_linux: 2379.33 (SE +/- 5.31, N = 3)

clpeak

OpenCL Test: Kernel Latency

us, Fewer Is Better (clpeak)
nvidia_opencl_linux: 7.75 (SE +/- 0.05, N = 3)
amd_opencl_linux: 6.13 (SE +/- 0.06, N = 7)
amd-opencl-linux: 6.15 (SE +/- 0.06, N = 5)
(CXX) g++ options: -O3 -rdynamic -lOpenCL


Phoronix Test Suite v10.8.4