nvidia_opencl_linux

AMD wx4150 on Ubuntu 20.04.2 with ROCM

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2103141-HA-2011304FI52

Run Management

Result Identifier      Date               Test Run Duration
nvidia_opencl_linux    December 01 2020   4 Hours, 2 Minutes
amd_opencl_linux       March 14 2021      1 Hour, 13 Minutes
(average)                                 2 Hours, 37 Minutes

System Details

nvidia_opencl_linux
  Processor: Intel Core i7-4700MQ @ 3.40GHz (4 Cores / 8 Threads)
  Motherboard: HP 1909 (L70 Ver. 01.42 BIOS)
  Chipset: Intel Xeon E3-1200 v3/4th
  Memory: 32GB
  Disk: 500GB Samsung SSD 860 + 256GB SAMSUNG MZ7PD256 + 500GB Seagate ST500LT012-1DG14 + 256GB SAMSUNG MZMPD256 + 128GB ED2S5
  Graphics: NVIDIA Quadro M1000M 2GB (135/405MHz)
  Audio: IDT 92HD91BXX
  Network: Intel I217-LM + Intel 7260
  OS: Ubuntu 20.04
  Kernel: 5.4.0-53-generic (x86_64)
  Desktop: GNOME Shell 3.36.4
  Display Server: X Server 1.20.8
  Display Driver: NVIDIA 450.80.02
  OpenCL: OpenCL 1.2 CUDA 11.0.228
  Vulkan: 1.2.131
  Compiler: GCC 9.3.0
  File-System: ext4
  Screen Resolution: 1920x1200

amd_opencl_linux (only the fields the source lists with their own value)
  Motherboard: HP 1909 (L70 Ver. 01.45 BIOS)
  Disk: 500GB Samsung SSD 860 + 500GB Seagate ST500LT012-1DG14
  Graphics: Intel HD 4600 2GB (1150MHz)
  Monitor: HP ZR24w
  Kernel: 5.6.0-1042-oem (x86_64)
  Display Server: X Server 1.20.9
  OpenGL: 4.5 Mesa 20.2.6
  OpenCL: OpenCL 2.0 AMD-APP (3212.0)
  Vulkan: 1.2.145
  Screen Resolution: 3840x1200

Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  Scaling Governor: intel_pstate powersave - CPU Microcode: 0x28 - Thermald 1.9.1

OpenCL Details
  nvidia_opencl_linux: GPU Compute Cores: 512

Python Details
  nvidia_opencl_linux: Python 3.8.5

Security Details
  itlb_multihit: KVM: Mitigation of Split huge pages
  l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
  mds: Mitigation of Clear buffers; SMT vulnerable
  meltdown: Mitigation of PTI
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
  srbds: Mitigation of Microcode
  tsx_async_abort: Not affected

Kernel Details
  amd_opencl_linux: Transparent Huge Pages: madvise

Graphics Details
  amd_opencl_linux: GLAMOR

nvidia_opencl_linux vs. amd_opencl_linux Comparison (Phoronix Test Suite)

Difference   Test (suite)
80257.4%     OpenCL - Max SP Flops (SHOC Scalable HeterOgeneous Computing)
316.3%       OpenCL Myocyte (Rodinia)
224%         Double-Precision Double (clpeak)
160.4%       Single-Precision Float (clpeak)
143.3%       OpenCL - Bus Speed Readback (SHOC Scalable HeterOgeneous Computing)
123.8%       OpenCL - Bus Speed Download (SHOC Scalable HeterOgeneous Computing)
122.4%       OpenCL - Triad (SHOC Scalable HeterOgeneous Computing)
77.4%        OpenCL - FFT SP (SHOC Scalable HeterOgeneous Computing)
77.4%        Transfer Bandwidth enqueueWriteBuffer (clpeak)
64.1%        OpenCL - MD5 Hash (SHOC Scalable HeterOgeneous Computing)
58.3%        Server Room - OpenCL (Darktable)
56.3%        Transfer Bandwidth enqueueReadBuffer (clpeak)
53.3%        GPU - Hotel (LuxMark)
49.6%        GPU - Microphone (LuxMark)
46.6%        Integer Compute INT (clpeak)
44.8%        Server Rack - OpenCL (Darktable)
42.7%        GPU - Luxball HDR (LuxMark)
37.1%        Boat - OpenCL (Darktable)
35.9%        OpenCL - Texture Read Bandwidth (SHOC Scalable HeterOgeneous Computing)
30.4%        OpenCL Heartwall (Rodinia)
26.4%        Kernel Latency (clpeak)
18.1%        Read (cl-mem)
17.1%        Write (cl-mem)
14%          Copy (cl-mem)
12.3%        Global Memory Bandwidth (clpeak)
11%          Masskrug - OpenCL (Darktable)
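
The percentages in the comparison above can be reproduced from the per-test results in this file. The sketch below assumes the spread is computed as the larger result divided by the smaller one, minus one; that formula is an inference from the listed figures (for example 316.3% for Rodinia Myocyte and 18.1% for cl-mem Read), not something the page states. The function name `percent_spread` is hypothetical.

```python
def percent_spread(a: float, b: float) -> float:
    """Return how much larger the bigger of two results is, in percent.

    Assumption: the comparison divides the larger value by the smaller one,
    regardless of whether the test is "more is better" or "fewer is better".
    """
    smaller, larger = sorted((a, b))
    return (larger / smaller - 1.0) * 100.0


# Result values copied from the result table in this file
# (nvidia_opencl_linux, amd_opencl_linux).
examples = {
    "rodinia: OpenCL Myocyte": (55.338, 230.345),
    "cl-mem: Read": (67.4, 79.6),
    "darktable: Server Room - OpenCL": (4.594, 2.902),
}

for test, (nvidia, amd) in examples.items():
    print(f"{test}: {percent_spread(nvidia, amd):.1f}%")
```

Because the ratio ignores direction, the figure alone does not say which system was faster; that has to be read from the unit line of each test (more is better or fewer is better).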

Result Overview

Test                                              nvidia_opencl_linux   amd_opencl_linux
blender: Barbershop - OpenCL                      2379.33               -
blender: BMW27 - OpenCL                           694.90                -
shoc: OpenCL - Max SP Flops                       1130.12               908135
rodinia: OpenCL Myocyte                           55.338                230.345
luxmark: GPU - Hotel                              751                   490
luxmark: GPU - Microphone                         2512                  3757
luxmark: GPU - Luxball HDR                        3730                  5322
rodinia: OpenCL Particle Filter                   45.306                -
smallpt-gpu: GPU - 1920 x 1200 - Caustic          1606756748            1615700927
smallpt-gpu: GPU - 1920 x 1200 - Caustic3         1606756995            1615701203
clpeak: Double-Precision Double                   35.63                 115.44
smallpt-gpu: GPU - 1920 x 1200 - Cornell          1606756869            1615701063
cl-mem: Read                                      67.4                  79.6
cl-mem: Copy                                      60.1                  68.5
cl-mem: Write                                     63.3                  74.1
rodinia: OpenCL Heartwall                         5.978                 7.798
darktable: Server Room - OpenCL                   4.594                 2.902
darktable: Boat - OpenCL                          11.031                8.047
clpeak: Kernel Latency                            7.75                  6.13
shoc: OpenCL - Texture Read Bandwidth             110.990               81.667
darktable: Masskrug - OpenCL                      10.822                9.751
clpeak: Integer Compute INT                       251.31                368.48
clpeak: Transfer Bandwidth enqueueReadBuffer      6.63                  10.36
clpeak: Transfer Bandwidth enqueueWriteBuffer     10.93                 19.39
clpeak: Global Memory Bandwidth                   66.99                 75.21
shoc: OpenCL - MD5 Hash                           1.4322                2.3502
clpeak: Single-Precision Float                    700.29                1823.80
rodinia: OpenCL LavaMD                            3.951                 -
darktable: Server Rack - OpenCL                   0.323                 0.223
shoc: OpenCL - FFT SP                             122.373               217.139
shoc: OpenCL - Triad                              10.5663               4.7502
shoc: OpenCL - Bus Speed Readback                 12.7640               5.2461
shoc: OpenCL - Bus Speed Download                 12.6878               5.6690

A dash marks a test with no result for that system.

Blender

Blender is an open-source 3D creation software project. This test is of Blender's Cycles benchmark with various sample files. GPU computing via OpenCL or CUDA is supported. Learn more via the OpenBenchmarking.org test page.

Blender 2.90 - Blend File: Barbershop - Compute: OpenCL (Seconds, Fewer Is Better)
nvidia_opencl_linux: 2379.33 (SE +/- 5.31, N = 3)

Blender 2.90 - Blend File: BMW27 - Compute: OpenCL (Seconds, Fewer Is Better)
nvidia_opencl_linux: 694.90 (SE +/- 10.50, N = 3)

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, More Is Better)
nvidia_opencl_linux: 1130.12 (SE +/- 1.98, N = 3; Min: 1127.91 / Avg: 1130.12 / Max: 1134.07)
amd_opencl_linux: 908135.00 (SE +/- 1706.69, N = 3; Min: 904918 / Avg: 908134.67 / Max: 910732)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

Rodinia

Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.

Rodinia 3.1 - Test: OpenCL Myocyte (Seconds, Fewer Is Better)
nvidia_opencl_linux: 55.34 (SE +/- 0.27, N = 3; Min: 55.04 / Avg: 55.34 / Max: 55.88)
amd_opencl_linux: 230.35 (SE +/- 0.41, N = 3; Min: 229.61 / Avg: 230.34 / Max: 231.04)
(CXX) g++ options: -O2 -lOpenCL
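
The "SE +/-" figure reported with each result appears to be the standard error of the mean over the N runs. As a rough check, the sketch below recovers the three Myocyte runs of nvidia_opencl_linux from the reported minimum, average, and maximum (possible only because N = 3) and recomputes the standard error. It assumes the sample standard deviation divided by the square root of N; the page does not state its formula, and the helper names are hypothetical.

```python
import math
import statistics


def samples_from_summary(minimum: float, average: float, maximum: float) -> list[float]:
    """Recover the three runs of an N = 3 result from its min/avg/max.

    With exactly three samples, the middle one is 3 * avg - min - max.
    The reported average is rounded, so the middle value is approximate.
    """
    middle = 3 * average - minimum - maximum
    return [minimum, middle, maximum]


def standard_error(samples: list[float]) -> float:
    """Sample standard deviation divided by sqrt(N) (assumed definition)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Rodinia 3.1, OpenCL Myocyte, nvidia_opencl_linux: Min 55.04 / Avg 55.34 / Max 55.88
runs = samples_from_summary(55.04, 55.34, 55.88)
se = standard_error(runs)
print(f"SE +/- {se:.2f}, N = {len(runs)}")
```

The same check cannot be applied to results with N greater than 3 (for example the Heartwall run with N = 14), because min/avg/max no longer determine the individual samples.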

LuxMark

LuxMark is a multi-platform OpenCL benchmark using LuxRender. LuxMark supports targeting different OpenCL devices and has multiple scenes available for rendering. LuxMark is a fully open-source OpenCL program with real-world rendering examples. Learn more via the OpenBenchmarking.org test page.

LuxMark 3.1 - OpenCL Device: GPU - Scene: Hotel (Score, More Is Better)
nvidia_opencl_linux: 751 (SE +/- 3.48, N = 3; Min: 745 / Avg: 750.67 / Max: 757)
amd_opencl_linux: 490 (SE +/- 0.67, N = 3; Min: 489 / Avg: 490.33 / Max: 491)

LuxMark 3.1 - OpenCL Device: GPU - Scene: Microphone (Score, More Is Better)
nvidia_opencl_linux: 2512 (SE +/- 2.67, N = 3; Min: 2509 / Avg: 2511.67 / Max: 2517)
amd_opencl_linux: 3757 (SE +/- 10.84, N = 3; Min: 3735 / Avg: 3756.67 / Max: 3768)

LuxMark 3.1 - OpenCL Device: GPU - Scene: Luxball HDR (Score, More Is Better)
nvidia_opencl_linux: 3730 (SE +/- 12.68, N = 3; Min: 3705 / Avg: 3730.33 / Max: 3744)
amd_opencl_linux: 5322

Rodinia

Rodinia 3.1 - Test: OpenCL Particle Filter (Seconds, Fewer Is Better)
nvidia_opencl_linux: 45.31 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O2 -lOpenCL

SmallPT GPU

SmallPT GPU is an OpenCL benchmark that is run with various PTS changes relative to upstream; multiple rendering scenes are available. Learn more via the OpenBenchmarking.org test page.

SmallPT GPU 1.6pts1 - OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Caustic (Samples/sec, More Is Better)
nvidia_opencl_linux: 1606756748 (SE +/- 24.25, N = 3; Min: 1606756706 / Avg: 1606756748 / Max: 1606756790)
amd_opencl_linux: 1615700927 (SE +/- 25.98, N = 3; Min: 1615700882 / Avg: 1615700927 / Max: 1615700972)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SmallPT GPU 1.6pts1 - OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Caustic3 (Samples/sec, More Is Better)
nvidia_opencl_linux: 1606756995 (SE +/- 23.96, N = 3; Min: 1606756954 / Avg: 1606756995.33 / Max: 1606757037)
amd_opencl_linux: 1615701203 (SE +/- 26.27, N = 3; Min: 1615701157 / Avg: 1615701202.67 / Max: 1615701248)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak - OpenCL Test: Double-Precision Double (GFLOPS, More Is Better)
nvidia_opencl_linux: 35.63 (SE +/- 0.02, N = 3; Min: 35.61 / Avg: 35.63 / Max: 35.66)
amd_opencl_linux: 115.44 (SE +/- 0.11, N = 3; Min: 115.29 / Avg: 115.44 / Max: 115.65)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

SmallPT GPU

SmallPT GPU 1.6pts1 - OpenCL Device: GPU - Resolution: 1920 x 1200 - Scene: Cornell (Samples/sec, More Is Better)
nvidia_opencl_linux: 1606756869 (SE +/- 20.78, N = 3; Min: 1606756833 / Avg: 1606756869 / Max: 1606756905)
amd_opencl_linux: 1615701063 (SE +/- 24.54, N = 3; Min: 1615701021 / Avg: 1615701063.33 / Max: 1615701106)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

cl-mem 2017-01-13 - Benchmark: Read (GB/s, More Is Better)
nvidia_opencl_linux: 67.4 (SE +/- 0.03, N = 3; Min: 67.3 / Avg: 67.37 / Max: 67.4)
amd_opencl_linux: 79.6 (SE +/- 0.03, N = 3; Min: 79.6 / Avg: 79.63 / Max: 79.7)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem 2017-01-13 - Benchmark: Copy (GB/s, More Is Better)
nvidia_opencl_linux: 60.1 (SE +/- 0.00, N = 3; Min: 60.1 / Avg: 60.1 / Max: 60.1)
amd_opencl_linux: 68.5 (SE +/- 0.12, N = 3; Min: 68.3 / Avg: 68.5 / Max: 68.7)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem 2017-01-13 - Benchmark: Write (GB/s, More Is Better)
nvidia_opencl_linux: 63.3 (SE +/- 0.03, N = 3; Min: 63.3 / Avg: 63.33 / Max: 63.4)
amd_opencl_linux: 74.1 (SE +/- 0.13, N = 3; Min: 73.8 / Avg: 74.07 / Max: 74.2)
(CC) gcc options: -O2 -flto -lOpenCL

Rodinia

Rodinia 3.1 - Test: OpenCL Heartwall (Seconds, Fewer Is Better)
nvidia_opencl_linux: 5.978 (SE +/- 0.057, N = 14; Min: 5.89 / Avg: 5.98 / Max: 6.72)
amd_opencl_linux: 7.798 (SE +/- 0.026, N = 3; Min: 7.75 / Avg: 7.8 / Max: 7.83)
(CXX) g++ options: -O2 -lOpenCL

Darktable

Darktable is an open-source photography workflow application. This test uses any system-installed Darktable program or, on Windows, automatically downloads the pre-built binary from the project. Learn more via the OpenBenchmarking.org test page.

Darktable 3.0.1 - Test: Server Room - Acceleration: OpenCL (Seconds, Fewer Is Better)
nvidia_opencl_linux: 4.594 (SE +/- 0.007, N = 3; Min: 4.58 / Avg: 4.59 / Max: 4.61)
amd_opencl_linux: 2.902 (SE +/- 0.021, N = 15; Min: 2.66 / Avg: 2.9 / Max: 2.96)

Darktable 3.0.1 - Test: Boat - Acceleration: OpenCL (Seconds, Fewer Is Better)
nvidia_opencl_linux: 11.031 (SE +/- 0.011, N = 3; Min: 11.01 / Avg: 11.03 / Max: 11.05)
amd_opencl_linux: 8.047 (SE +/- 0.044, N = 3; Min: 7.96 / Avg: 8.05 / Max: 8.1)

clpeak

clpeak - OpenCL Test: Kernel Latency (us, Fewer Is Better)
nvidia_opencl_linux: 7.75 (SE +/- 0.05, N = 3; Min: 7.68 / Avg: 7.75 / Max: 7.84)
amd_opencl_linux: 6.13 (SE +/- 0.06, N = 7; Min: 5.94 / Avg: 6.13 / Max: 6.33)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
nvidia_opencl_linux: 110.99 (SE +/- 0.57, N = 3; Min: 110.33 / Avg: 110.99 / Max: 112.12)
amd_opencl_linux: 81.67 (SE +/- 0.21, N = 3; Min: 81.27 / Avg: 81.67 / Max: 82)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

Darktable

Darktable 3.0.1 - Test: Masskrug - Acceleration: OpenCL (Seconds, Fewer Is Better)
nvidia_opencl_linux: 10.822 (SE +/- 0.011, N = 3; Min: 10.8 / Avg: 10.82 / Max: 10.84)
amd_opencl_linux: 9.751 (SE +/- 0.032, N = 3; Min: 9.69 / Avg: 9.75 / Max: 9.79)

clpeak

clpeak - OpenCL Test: Integer Compute INT (GIOPS, More Is Better)
nvidia_opencl_linux: 251.31 (SE +/- 1.52, N = 3; Min: 248.26 / Avg: 251.31 / Max: 252.83)
amd_opencl_linux: 368.48 (SE +/- 0.11, N = 3; Min: 368.26 / Avg: 368.48 / Max: 368.59)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak - OpenCL Test: Transfer Bandwidth enqueueReadBuffer (GBPS, More Is Better)
nvidia_opencl_linux: 6.63 (SE +/- 0.01, N = 3; Min: 6.62 / Avg: 6.63 / Max: 6.64)
amd_opencl_linux: 10.36 (SE +/- 0.07, N = 3; Min: 10.29 / Avg: 10.36 / Max: 10.49)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak - OpenCL Test: Transfer Bandwidth enqueueWriteBuffer (GBPS, More Is Better)
nvidia_opencl_linux: 10.93 (SE +/- 0.06, N = 3; Min: 10.81 / Avg: 10.93 / Max: 11)
amd_opencl_linux: 19.39 (SE +/- 0.03, N = 3; Min: 19.35 / Avg: 19.39 / Max: 19.45)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak - OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
nvidia_opencl_linux: 66.99 (SE +/- 0.05, N = 3; Min: 66.89 / Avg: 66.99 / Max: 67.06)
amd_opencl_linux: 75.21 (SE +/- 0.06, N = 3; Min: 75.1 / Avg: 75.21 / Max: 75.28)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: MD5 Hash (GHash/s, More Is Better)
nvidia_opencl_linux: 1.4322 (SE +/- 0.0007, N = 3; Min: 1.43 / Avg: 1.43 / Max: 1.43)
amd_opencl_linux: 2.3502 (SE +/- 0.0003, N = 3; Min: 2.35 / Avg: 2.35 / Max: 2.35)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

clpeak

clpeak - OpenCL Test: Single-Precision Float (GFLOPS, More Is Better)
nvidia_opencl_linux: 700.29 (SE +/- 0.31, N = 3; Min: 699.78 / Avg: 700.29 / Max: 700.85)
amd_opencl_linux: 1823.80 (SE +/- 0.11, N = 3; Min: 1823.66 / Avg: 1823.8 / Max: 1824.02)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

Rodinia

Rodinia 3.1 - Test: OpenCL LavaMD (Seconds, Fewer Is Better)
nvidia_opencl_linux: 3.951 (SE +/- 0.052, N = 5)
(CXX) g++ options: -O2 -lOpenCL

Darktable

Darktable 3.0.1 - Test: Server Rack - Acceleration: OpenCL (Seconds, Fewer Is Better)
nvidia_opencl_linux: 0.323 (SE +/- 0.001, N = 3; Min: 0.32 / Avg: 0.32 / Max: 0.32)
amd_opencl_linux: 0.223 (SE +/- 0.002, N = 15; Min: 0.2 / Avg: 0.22 / Max: 0.23)

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better)
nvidia_opencl_linux: 122.37 (SE +/- 0.82, N = 3; Min: 120.73 / Avg: 122.37 / Max: 123.21)
amd_opencl_linux: 217.14 (SE +/- 0.28, N = 3; Min: 216.84 / Avg: 217.14 / Max: 217.69)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: Triad (GB/s, More Is Better)
nvidia_opencl_linux: 10.5663 (SE +/- 0.0068, N = 3; Min: 10.55 / Avg: 10.57 / Max: 10.58)
amd_opencl_linux: 4.7502 (SE +/- 0.0171, N = 3; Min: 4.73 / Avg: 4.75 / Max: 4.78)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better)
nvidia_opencl_linux: 12.7640 (SE +/- 0.0014, N = 3; Min: 12.76 / Avg: 12.76 / Max: 12.77)
amd_opencl_linux: 5.2461 (SE +/- 0.0057, N = 3; Min: 5.24 / Avg: 5.25 / Max: 5.26)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: Bus Speed Download (GB/s, More Is Better)
nvidia_opencl_linux: 12.6878 (SE +/- 0.0011, N = 3; Min: 12.69 / Avg: 12.69 / Max: 12.69)
amd_opencl_linux: 5.6690 (SE +/- 0.0012, N = 3; Min: 5.67 / Avg: 5.67 / Max: 5.67)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi