Coffeelake Beignet vs. OpenCL NEO Intel

Intel Core i7-8700K testing of OpenCL Linux drivers by Michael Larabel for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1905107-HV-COFFEELAK00
Result Identifier      Date Run        Test Duration
Beignet Git            May 10 2019     3 Hours, 24 Minutes
Intel OpenCL NEO       May 10 2019     1 Hour, 57 Minutes
Average                                2 Hours, 41 Minutes


System Details

Values apply to both runs (Beignet Git and Intel OpenCL NEO) unless a run is named.

Processor: Intel Core i7-8700K @ 4.70GHz (6 Cores / 12 Threads)
Motherboard: ASUS TUF Z370-PLUS GAMING (1802 BIOS)
Chipset: Intel 8th Gen Core
Memory: 16384MB
Disk: 128GB THNSN5128GPU7 TOSHIBA
Graphics (Beignet Git): inteldrmfb (1200MHz)
Graphics (Intel OpenCL NEO): Intel UHD 630 3GB (1200MHz)
Audio: Realtek ALC887-VD
Monitor: DELL P2415Q
Network: Intel I219-V
OS: Ubuntu 18.04
Kernel: 4.18.0-18-generic (x86_64)
Desktop: GNOME Shell 3.28.3
Display Server: X Server 1.20.1
Display Driver: modesetting 1.20.1
OpenGL: 4.5 Mesa 18.2.8
OpenCL (Beignet Git): OpenCL 2.0 beignet 1.4 (git-fc5f430c)
OpenCL (Intel OpenCL NEO): OpenCL 2.1
Compiler: GCC 7.4.0 + Clang 6.0.0-1ubuntu2 + LLVM 6.0.0
File-System: ext4
Screen Resolution: 3840x2160

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_pstate powersave
Python Details: Python 2.7.15rc1 + Python 3.6.7
Security Details: KPTI + __user pointer sanitization + Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + SSB disabled via prctl and seccomp + PTE Inversion

Beignet Git vs. Intel OpenCL NEO Comparison (Phoronix Test Suite)

Test                                           Faster Driver       Advantage
clpeak: Single-Precision Float                 Intel OpenCL NEO    751.3%
LuxMark: GPU - Hotel                           Intel OpenCL NEO    3731.3%
Rodinia: OpenCL Heartwall                      Intel OpenCL NEO    284%
SHOC: OpenCL - Max SP Flops                    Intel OpenCL NEO    216.1%
ViennaCL: OpenCL LU Factorization              Intel OpenCL NEO    149.2%
LeelaChessZero: OpenCL                         Intel OpenCL NEO    102.5%
SHOC: OpenCL - Triad                           Intel OpenCL NEO    51.2%
SHOC: OpenCL - Bus Speed Download              Intel OpenCL NEO    44.1%
SHOC: OpenCL - Bus Speed Readback              Beignet Git         33.5%
Parboil: OpenCL LBM                            Intel OpenCL NEO    13.6%
Blender: BMW27 - OpenCL                        Beignet Git         11.6%
SHOC: OpenCL - FFT SP                          Intel OpenCL NEO    10.5%
clpeak: Kernel Latency                         Beignet Git         9.8%
Parboil: OpenCL BFS                            Beignet Git         8.3%
clpeak: Global Memory Bandwidth                Intel OpenCL NEO    6%
Darktable: Server Rack - OpenCL                Beignet Git         5.6%
SHOC: OpenCL - MD5 Hash                        Intel OpenCL NEO    5.4%
SHOC: OpenCL - Texture Read Bandwidth          Beignet Git         4.6%
Blender: Fishy Cat - OpenCL                    Intel OpenCL NEO    4.4%
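The percentages in the comparison above appear consistent with a simple relative difference between the two drivers' results, with the ratio inverted for "Fewer Is Better" tests. The following is a minimal Python sketch of that arithmetic, not the Phoronix Test Suite's own code; the function names `percent_advantage` and `compare` are hypothetical, and the input values are taken from the result table in this file.

```python
def percent_advantage(winner: float, loser: float, fewer_is_better: bool) -> float:
    """Relative advantage of the winning result over the losing one, in percent."""
    # For time-like metrics a smaller value wins, so the ratio is inverted.
    ratio = loser / winner if fewer_is_better else winner / loser
    return (ratio - 1.0) * 100.0


def compare(name_a: str, value_a: float,
            name_b: str, value_b: float,
            fewer_is_better: bool) -> tuple[str, float]:
    """Return (name of the faster driver, its advantage in percent)."""
    a_wins = value_a < value_b if fewer_is_better else value_a > value_b
    if a_wins:
        return name_a, percent_advantage(value_a, value_b, fewer_is_better)
    return name_b, percent_advantage(value_b, value_a, fewer_is_better)


# clpeak Single-Precision Float (GFLOPS, More Is Better)
driver, pct = compare("Beignet Git", 53.88, "Intel OpenCL NEO", 458.68,
                      fewer_is_better=False)
print(f"{driver}: +{pct:.1f}%")  # prints "Intel OpenCL NEO: +751.3%"

# Blender BMW27 (Seconds, Fewer Is Better)
driver, pct = compare("Beignet Git", 413, "Intel OpenCL NEO", 461,
                      fewer_is_better=True)
print(f"{driver}: +{pct:.1f}%")  # prints "Beignet Git: +11.6%"
```

Inverting the ratio for time-based tests keeps the figure readable as "how much faster the winner is" regardless of the unit.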

Result Overview

Test                                               Beignet Git    Intel OpenCL NEO
blender: BMW27 - OpenCL                            413            461
blender: Classroom - OpenCL                        676            678
blender: Fishy Cat - OpenCL                        1014           971
blender: Barbershop - OpenCL                       1355           1372
blender: Pabellon Barcelona - OpenCL               929            926
clpeak: Kernel Latency                             21.70          23.83
clpeak: Single-Precision Float                     53.88          458.68
clpeak: Global Memory Bandwidth                    31.41          33.28
clpeak: Transfer Bandwidth enqueueReadBuffer       15.42          15.34
clpeak: Transfer Bandwidth enqueueWriteBuffer      40.82          41.31
comd-cl: Average Atom Update Rate                  2.58           2.58
darktable: Boat - OpenCL                           13.87          13.87
darktable: Masskrug - OpenCL                       6.48           6.48
darktable: Server Rack - OpenCL                    0.18           0.19
darktable: Server Room - OpenCL                    4.75           4.78
lczero: OpenCL                                     52.21          105.73
luxmark: GPU - Hotel                               16             613
parboil: OpenCL BFS                                0.96           1.04
parboil: OpenCL LBM                                78.56          69.15
rodinia: OpenCL Heartwall                          35.17          9.16
shoc: OpenCL - Triad                               9.20           13.91
shoc: OpenCL - FFT SP                              15.98          17.65
shoc: OpenCL - MD5 Hash                            0.37           0.39
shoc: OpenCL - Max SP Flops                        467            1476
shoc: OpenCL - Bus Speed Download                  20.48          29.51
shoc: OpenCL - Bus Speed Readback                  39.09          29.27
shoc: OpenCL - Texture Read Bandwidth              59.74          57.09
viennacl: OpenCL LU Factorization                  8.52           21.23
xsbench-cl:                                        11455503       (no result)
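Because the tests above use different units and directions (seconds, GFLOPS, GB/s), one common way to summarise such a table is to turn each row into a unitless speedup and take the geometric mean. The sketch below illustrates that idea only; it is not a figure reported in this result file, the helper names `speedup` and `geometric_mean` are hypothetical, and only three rows from the table are used as sample input.

```python
import math


def speedup(neo: float, beignet: float, fewer_is_better: bool) -> float:
    """Speedup of Intel OpenCL NEO over Beignet Git (values above 1 favour NEO)."""
    return beignet / neo if fewer_is_better else neo / beignet


def geometric_mean(values: list[float]) -> float:
    """Geometric mean, computed via logarithms to avoid overflow on long lists."""
    if not values:
        raise ValueError("need at least one value")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# (test name, Beignet Git, Intel OpenCL NEO, fewer_is_better), from the table above
rows = [
    ("blender: BMW27 - OpenCL", 413, 461, True),
    ("clpeak: Single-Precision Float", 53.88, 458.68, False),
    ("rodinia: OpenCL Heartwall", 35.17, 9.16, True),
]

speedups = [speedup(neo, beignet, fib) for _, beignet, neo, fib in rows]
print(f"Geometric mean speedup over {len(rows)} tests: {geometric_mean(speedups):.2f}x")
```

The geometric mean is used here rather than the arithmetic mean so that a single very large ratio does not dominate the summary.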

Blender

Blender is an open-source 3D creation software project. This test is of Blender's Cycles benchmark with various sample files. GPU computing via OpenCL or CUDA is supported. Learn more via the OpenBenchmarking.org test page.

Blender 2.79a, Blend File: BMW27, Compute: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 413
  Intel OpenCL NEO: 461

Blender 2.79a, Blend File: Classroom, Compute: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 676
  Intel OpenCL NEO: 678

Blender 2.79a, Blend File: Fishy Cat, Compute: OpenCL (Seconds, Fewer Is Better)
  Intel OpenCL NEO: 971
  Beignet Git: 1014

Blender 2.79a, Blend File: Barbershop, Compute: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 1355
  Intel OpenCL NEO: 1372

Blender 2.79a, Blend File: Pabellon Barcelona, Compute: OpenCL (Seconds, Fewer Is Better)
  Intel OpenCL NEO: 926
  Beignet Git: 929

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak, OpenCL Test: Kernel Latency (us, Fewer Is Better)
  Beignet Git: 21.70 (SE +/- 0.26, N = 3)
  Intel OpenCL NEO: 23.83 (SE +/- 0.10, N = 3)

clpeak, OpenCL Test: Single-Precision Float (GFLOPS, More Is Better)
  Intel OpenCL NEO: 458.68 (SE +/- 0.07, N = 3)
  Beignet Git: 53.88 (SE +/- 5.07, N = 12)

clpeak, OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
  Intel OpenCL NEO: 33.28 (SE +/- 0.09, N = 3)
  Beignet Git: 31.41 (SE +/- 2.22, N = 12)

clpeak, OpenCL Test: Transfer Bandwidth enqueueReadBuffer (GBPS, More Is Better)
  Beignet Git: 15.42 (SE +/- 0.02, N = 3)
  Intel OpenCL NEO: 15.34 (SE +/- 0.19, N = 5)

clpeak, OpenCL Test: Transfer Bandwidth enqueueWriteBuffer (GBPS, More Is Better)
  Intel OpenCL NEO: 41.31 (SE +/- 0.09, N = 3)
  Beignet Git: 40.82 (SE +/- 0.15, N = 3)

All clpeak tests: (CXX) g++ options: -O3 -rdynamic -lOpenCL
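Each result carries an "SE +/- x, N = n" annotation: the standard error of the mean over n runs. As a rough sketch of how such a figure is typically derived, the Python below divides the sample standard deviation by the square root of the run count. Whether the Phoronix Test Suite uses the sample (n - 1) or population form is an assumption here, and the per-run timings are hypothetical, since this result file only reports the aggregated values.

```python
import math
import statistics


def standard_error(samples: list[float]) -> float:
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate the standard error")
    # statistics.stdev uses the n - 1 (sample) denominator.
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-run kernel latencies in microseconds (not from this result file).
runs = [21.2, 21.8, 22.1]

mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(runs)})")
```

A larger N alongside a larger SE, as in several Beignet Git results, usually indicates that extra runs were made because the early runs varied widely.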

CoMD OpenCL

CoMD benchmark in OpenCL via GPUOpen. Learn more via the OpenBenchmarking.org test page.

CoMD OpenCL 2017-07-06, Average Atom Update Rate (us/atom/task, More Is Better)
  Intel OpenCL NEO: 2.58 (SE +/- 0.00, N = 3)
  Beignet Git: 2.58 (SE +/- 0.01, N = 3)
  (CC) gcc options: -std=c99 -O5 -lm -lOpenCL

Darktable

Darktable is an open-source photography / workflow application. This test uses any system-installed Darktable program or, on Windows, automatically downloads the pre-built binary from the project. Learn more via the OpenBenchmarking.org test page.

Darktable 2.4.2, Test: Boat, Acceleration: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 13.87 (SE +/- 0.03, N = 3)
  Intel OpenCL NEO: 13.87 (SE +/- 0.05, N = 3)

Darktable 2.4.2, Test: Masskrug, Acceleration: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 6.48 (SE +/- 0.01, N = 3)
  Intel OpenCL NEO: 6.48 (SE +/- 0.01, N = 3)

Darktable 2.4.2, Test: Server Rack, Acceleration: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 0.18 (SE +/- 0.00, N = 3)
  Intel OpenCL NEO: 0.19 (SE +/- 0.00, N = 3)

Darktable 2.4.2, Test: Server Room, Acceleration: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 4.75 (SE +/- 0.01, N = 3)
  Intel OpenCL NEO: 4.78 (SE +/- 0.01, N = 3)

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.20.1, Backend: OpenCL (Nodes Per Second, More Is Better)
  Intel OpenCL NEO: 105.73 (SE +/- 0.42, N = 3)
  Beignet Git: 52.21 (SE +/- 3.79, N = 6)
  (CXX) g++ options: -lpthread -lOpenCL -lz

LuxMark

LuxMark is a multi-platform OpenCL benchmark using LuxRender. LuxMark supports targeting different OpenCL devices and has multiple scenes available for rendering. LuxMark is a fully open-source OpenCL program with real-world rendering examples. Learn more via the OpenBenchmarking.org test page.

LuxMark 3.1, OpenCL Device: GPU, Scene: Hotel (Score, More Is Better)
  Intel OpenCL NEO: 613 (SE +/- 6.08, N = 3)
  Beignet Git: 16 (SE +/- 13.09, N = 10)

Parboil

The Parboil Benchmarks from the IMPACT Research Group at University of Illinois are a set of throughput computing applications for looking at computing architecture and compilers. Parboil test-cases support OpenMP, OpenCL, and CUDA multi-processing environments. However, at this time the test profile is just making use of the OpenMP and OpenCL test workloads. Learn more via the OpenBenchmarking.org test page.

Parboil 2.5, Test: OpenCL BFS (Seconds, Fewer Is Better)
  Beignet Git: 0.96 (SE +/- 0.00, N = 3)
  Intel OpenCL NEO: 1.04 (SE +/- 0.01, N = 3)

Parboil 2.5, Test: OpenCL LBM (Seconds, Fewer Is Better)
  Intel OpenCL NEO: 69.15 (SE +/- 0.41, N = 3)
  Beignet Git: 78.56 (SE +/- 0.03, N = 3)

Both Parboil tests: (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Rodinia

Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes the OpenCL and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.

Rodinia 2.4, Test: OpenCL Heartwall (Seconds, Fewer Is Better)
  Intel OpenCL NEO: 9.16 (SE +/- 0.04, N = 3)
  Beignet Git: 35.17 (SE +/- 2.44, N = 12)
  (CXX) g++ options: -O2 -lOpenCL

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL, Benchmark: Triad (GB/s, More Is Better)
  Intel OpenCL NEO: 13.91 (SE +/- 0.06, N = 3)
  Beignet Git: 9.20 (SE +/- 0.42, N = 15)

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL, Benchmark: FFT SP (GFLOPS, More Is Better)
  Intel OpenCL NEO: 17.65 (SE +/- 0.03, N = 3)
  Beignet Git: 15.98 (SE +/- 0.00, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL, Benchmark: MD5 Hash (GHash/s, More Is Better)
  Intel OpenCL NEO: 0.39 (SE +/- 0.00, N = 3)
  Beignet Git: 0.37 (SE +/- 0.00, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL, Benchmark: Max SP Flops (GFLOPS, More Is Better)
  Intel OpenCL NEO: 1476 (SE +/- 16.73, N = 15)
  Beignet Git: 467 (SE +/- 0.01, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL, Benchmark: Bus Speed Download (GB/s, More Is Better)
  Intel OpenCL NEO: 29.51 (SE +/- 0.15, N = 3)
  Beignet Git: 20.48 (SE +/- 0.09, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL, Benchmark: Bus Speed Readback (GB/s, More Is Better)
  Beignet Git: 39.09 (SE +/- 0.47, N = 6)
  Intel OpenCL NEO: 29.27 (SE +/- 0.43, N = 4)

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL, Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
  Beignet Git: 59.74 (SE +/- 0.30, N = 3)
  Intel OpenCL NEO: 57.09 (SE +/- 0.00, N = 3)

All SHOC tests: (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile uses ViennaCL OpenCL support and runs the included computational benchmark. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.4.2, OpenCL LU Factorization (GFLOPS, More Is Better)
  Intel OpenCL NEO: 21.23 (SE +/- 0.03, N = 3)
  Beignet Git: 8.52 (SE +/- 0.42, N = 12)
  (CXX) g++ options: -rdynamic -lOpenCL

Xsbench OpenCL

Xsbench benchmark in OpenCL via GPUOpen. Learn more via the OpenBenchmarking.org test page.

Xsbench OpenCL 2017-07-06 (Lookups/s, More Is Better)
  Beignet Git: 11455503 (SE +/- 2043.29, N = 3)
  (CC) gcc options: -std=gnu99 -fopenmp -O3 -lm -lOpenCL