opencl-clvk-benchmark-test

Intel Core i9-9900K testing with an ASUS PRIME Z390-A (0506 BIOS) and eVGA NVIDIA GeForce RTX 2070 8GB on Ubuntu 18.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1810313-SK-OPENCLCLV15
Run Management

Result Identifier    Date Run           Test Duration
CLVK 20181031        October 31 2018    12 Minutes
NVIDIA 410.73        October 31 2018    56 Minutes
Average                                 34 Minutes


opencl-clvk-benchmark-test system details (OpenBenchmarking.org, Phoronix Test Suite)

Processor: Intel Core i9-9900K @ 5.00GHz (8 Cores / 16 Threads)
Motherboard: ASUS PRIME Z390-A (0506 BIOS)
Chipset: Intel Cannon Lake PCH Shared SRAM
Memory: 16384MB
Disk: Samsung SSD 970 EVO 250GB
Graphics: eVGA NVIDIA GeForce RTX 2070 8GB (1410/7000MHz)
Audio: Realtek ALC1220
Monitor: Acer B286HK
Network: Intel Connection
OS: Ubuntu 18.10
Kernel: 4.18.0-10-generic (x86_64)
Desktop: GNOME Shell 3.30.1
Display Server: X Server 1.20.1
Display Driver: NVIDIA 410.73
OpenGL: 4.6.0
OpenCLs: OpenCL 1.2 clvk; OpenCL 1.2 CUDA 10.0.185
Compiler: GCC 8.2.0
File-System: ext4
Screen Resolution: 3840x2160

System Logs
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
- Scaling Governor: intel_pstate powersave
- GPU Compute Cores: 2304
- __user pointer sanitization + Full generic retpoline IBPB IBRS_FW + SSB disabled via prctl and seccomp

CLVK 20181031 vs. NVIDIA 410.73 Comparison (Phoronix Test Suite)

Percentage by which NVIDIA 410.73 exceeds the CLVK 20181031 baseline, SHOC Scalable HeterOgeneous Computing:

OpenCL - Bus Speed Readback: 2643.8%
OpenCL - Bus Speed Download: 2608.3%
OpenCL - Triad: 2574.5%
OpenCL - FFT SP: 1475.4%
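The comparison percentages can be approximately reproduced from the averaged results reported on this page. The sketch below assumes the chart shows the relative gain over the baseline, (value - baseline) / baseline; the function name `percent_gain` is hypothetical, and because the inputs are the rounded published averages, the last digit may differ slightly from the chart.

```python
def percent_gain(baseline: float, value: float) -> float:
    """Relative gain of `value` over `baseline`, in percent (assumed chart formula)."""
    return (value - baseline) / baseline * 100.0


# (CLVK 20181031, NVIDIA 410.73) averages as published on this page.
results = {
    "OpenCL - Bus Speed Readback": (0.48, 13.17),
    "OpenCL - Bus Speed Download": (0.48, 13.00),
    "OpenCL - Triad": (0.47, 12.57),
    "OpenCL - FFT SP": (62.80, 989.36),
}

for name, (clvk, nvidia) in results.items():
    print(f"{name}: +{percent_gain(clvk, nvidia):.1f}%")
```

With these inputs the four values land within rounding distance of the percentages listed in the comparison.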

opencl-clvk-benchmark-test results (OpenBenchmarking.org)

Test                                              CLVK 20181031    NVIDIA 410.73
shoc: OpenCL - Triad                              0.47             12.57
shoc: OpenCL - FFT SP                             62.80            989.36
shoc: OpenCL - Bus Speed Download                 0.48             13.00
shoc: OpenCL - Bus Speed Readback                 0.48             13.17
shoc: OpenCL - MD5 Hash                                            18.79
shoc: OpenCL - Max SP Flops                                        8759
shoc: OpenCL - Texture Read Bandwidth                              1106.70
cl-mem: Copy                                                       330.37
cl-mem: Read                                                       395.40
cl-mem: Write                                                      317.07
clpeak: Kernel Latency                                             3.62
clpeak: Integer Compute INT                                        8367
clpeak: Single-Precision Float                                     8679
clpeak: Double-Precision Double                                    275.80
clpeak: Global Memory Bandwidth                                    368.01
clpeak: Transfer Bandwidth enqueueReadBuffer                       11.32
clpeak: Transfer Bandwidth enqueueWriteBuffer                      12.59

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

Benchmark: Triad (GB/s, More Is Better)
  CLVK 20181031: 0.47 (SE +/- 0.01, N = 3; Min: 0.45 / Avg: 0.47 / Max: 0.48)
  NVIDIA 410.73: 12.57 (SE +/- 0.00, N = 3; Min: 12.56 / Avg: 12.57 / Max: 12.57)

Benchmark: FFT SP (GFLOPS, More Is Better)
  CLVK 20181031: 62.80 (SE +/- 0.02, N = 3; Min: 62.76 / Avg: 62.8 / Max: 62.82)
  NVIDIA 410.73: 989.36 (SE +/- 0.44, N = 3; Min: 988.79 / Avg: 989.36 / Max: 990.22)

Benchmark: Bus Speed Download (GB/s, More Is Better)
  CLVK 20181031: 0.48 (SE +/- 0.00, N = 3; Min: 0.47 / Avg: 0.48 / Max: 0.48)
  NVIDIA 410.73: 13.00 (SE +/- 0.00, N = 3; Min: 13 / Avg: 13 / Max: 13)

Benchmark: Bus Speed Readback (GB/s, More Is Better)
  CLVK 20181031: 0.48 (SE +/- 0.01, N = 12; Min: 0.46 / Avg: 0.48 / Max: 0.51)
  NVIDIA 410.73: 13.17 (SE +/- 0.00, N = 3; Min: 13.17 / Avg: 13.17 / Max: 13.17)

Benchmark: MD5 Hash (GHash/s, More Is Better)
  NVIDIA 410.73: 18.79 (SE +/- 0.02, N = 3)

Benchmark: Max SP Flops (GFLOPS, More Is Better)
  NVIDIA 410.73: 8759 (SE +/- 0.15, N = 3)

Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
  NVIDIA 410.73: 1106.70 (SE +/- 0.96, N = 3)
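As a rough check on how the "SE +/-" figures relate to the Min / Avg / Max values, the sketch below takes the NVIDIA 410.73 FFT SP result (N = 3). It rests on two assumptions that the page does not state: that SE is the sample standard deviation divided by the square root of N, and that the middle run can be recovered from the rounded average. The reconstructed middle value is therefore approximate, not measured data, and the helper names are hypothetical.

```python
import math
import statistics


def third_sample(minimum: float, average: float, maximum: float) -> float:
    """With N = 3, the remaining run follows from the mean: 3 * avg - min - max."""
    return 3 * average - minimum - maximum


def standard_error(samples: list[float]) -> float:
    """Assumed definition: sample standard deviation divided by sqrt(N)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


# NVIDIA 410.73, FFT SP: Min 988.79 / Avg 989.36 / Max 990.22 (GFLOPS).
fft_nvidia = [988.79, third_sample(988.79, 989.36, 990.22), 990.22]

print(f"reconstructed runs: {[round(x, 2) for x in fft_nvidia]}")
print(f"standard error: {standard_error(fft_nvidia):.2f}")
```

Under these assumptions the computed value comes out close to the reported SE of 0.44, which suggests (but does not prove) that this is the definition in use.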

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

cl-mem 2017-01-13
1. (CC) gcc options: -O2 -flto -lOpenCL

Benchmark: Copy (GB/s, More Is Better)
  NVIDIA 410.73: 330.37 (SE +/- 0.12, N = 3)

Benchmark: Read (GB/s, More Is Better)
  NVIDIA 410.73: 395.40 (SE +/- 0.10, N = 3)

Benchmark: Write (GB/s, More Is Better)
  NVIDIA 410.73: 317.07 (SE +/- 0.63, N = 3)

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

OpenCL Test: Kernel Latency (us, Fewer Is Better)
  NVIDIA 410.73: 3.62 (SE +/- 0.04, N = 3)

OpenCL Test: Integer Compute INT (GIOPS, More Is Better)
  NVIDIA 410.73: 8367 (SE +/- 157.16, N = 11)

OpenCL Test: Single-Precision Float (GFLOPS, More Is Better)
  NVIDIA 410.73: 8679 (SE +/- 90.96, N = 12)

OpenCL Test: Double-Precision Double (GFLOPS, More Is Better)
  NVIDIA 410.73: 275.80 (SE +/- 0.14, N = 3)

OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
  NVIDIA 410.73: 368.01 (SE +/- 0.52, N = 3)

OpenCL Test: Transfer Bandwidth enqueueReadBuffer (GBPS, More Is Better)
  NVIDIA 410.73: 11.32 (SE +/- 0.00, N = 3)

OpenCL Test: Transfer Bandwidth enqueueWriteBuffer (GBPS, More Is Better)
  NVIDIA 410.73: 12.59 (SE +/- 0.00, N = 3)