Coffeelake Beignet vs. OpenCL NEO Intel

Intel Core i7-8700K testing of OpenCL Linux drivers by Michael Larabel for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 1905107-HV-COFFEELAK00

Run Management

Result Identifier    Date Run       Test Duration
Beignet Git          May 10 2019    3 Hours, 24 Minutes
Intel OpenCL NEO     May 10 2019    1 Hour, 57 Minutes

Coffeelake Beignet vs. OpenCL NEO Intel: system details (OpenBenchmarking.org)

Values apply to both runs (Beignet Git and Intel OpenCL NEO) unless a run is named.

Processor: Intel Core i7-8700K @ 4.70GHz (6 Cores / 12 Threads)
Motherboard: ASUS TUF Z370-PLUS GAMING (1802 BIOS)
Chipset: Intel 8th Gen Core
Memory: 16384MB
Disk: 128GB THNSN5128GPU7 TOSHIBA
Graphics (Beignet Git): inteldrmfb (1200MHz)
Graphics (Intel OpenCL NEO): Intel UHD 630 3GB (1200MHz)
Audio: Realtek ALC887-VD
Monitor: DELL P2415Q
Network: Intel I219-V
OS: Ubuntu 18.04
Kernel: 4.18.0-18-generic (x86_64)
Desktop: GNOME Shell 3.28.3
Display Server: X Server 1.20.1
Display Driver: modesetting 1.20.1
OpenGL: 4.5 Mesa 18.2.8
OpenCL (Beignet Git): OpenCL 2.0 beignet 1.4 (git-fc5f430c)
OpenCL (Intel OpenCL NEO): OpenCL 2.1
Compiler: GCC 7.4.0 + Clang 6.0.0-1ubuntu2 + LLVM 6.0.0
File-System: ext4
Screen Resolution: 3840x2160

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_pstate powersave
Python Details: Python 2.7.15rc1 + Python 3.6.7
Security Details: KPTI + __user pointer sanitization + Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + SSB disabled via prctl and seccomp + PTE Inversion

Beignet Git vs. Intel OpenCL NEO Comparison (Phoronix Test Suite)

Lead of the faster driver over the slower one, in chart order:

Suite            Test                              Lead       Faster driver
clpeak           Single-Precision Float            751.3%     Intel OpenCL NEO
LuxMark          GPU - Hotel                       3731.3%    Intel OpenCL NEO
Rodinia          OpenCL Heartwall                  284%       Intel OpenCL NEO
SHOC             OpenCL - Max SP Flops             216.1%     Intel OpenCL NEO
ViennaCL         OpenCL LU Factorization           149.2%     Intel OpenCL NEO
LeelaChessZero   OpenCL                            102.5%     Intel OpenCL NEO
SHOC             OpenCL - Triad                    51.2%      Intel OpenCL NEO
SHOC             OpenCL - Bus Speed Download       44.1%      Intel OpenCL NEO
SHOC             OpenCL - Bus Speed Readback       33.5%      Beignet Git
Parboil          OpenCL LBM                        13.6%      Intel OpenCL NEO
Blender          BMW27 - OpenCL                    11.6%      Beignet Git
SHOC             OpenCL - FFT SP                   10.5%      Intel OpenCL NEO
clpeak           Kernel Latency                    9.8%       Beignet Git
Parboil          OpenCL BFS                        8.3%       Beignet Git
clpeak           Global Memory Bandwidth           6%         Intel OpenCL NEO
Darktable        Server Rack - OpenCL              5.6%       Beignet Git
SHOC             OpenCL - MD5 Hash                 5.4%       Intel OpenCL NEO
SHOC             OpenCL - Texture Read Bandwidth   4.6%       Beignet Git
Blender          Fishy Cat - OpenCL                4.4%       Intel OpenCL NEO

Coffeelake Beignet vs. OpenCL NEO Intel: results overview (OpenBenchmarking.org)

Test                                              Beignet Git    Intel OpenCL NEO
darktable: Boat - OpenCL                          13.87          13.87
darktable: Masskrug - OpenCL                      6.48           6.48
darktable: Server Rack - OpenCL                   0.18           0.19
darktable: Server Room - OpenCL                   4.75           4.78
lczero: OpenCL                                    52.21          105.73
shoc: OpenCL - FFT SP                             15.98          17.65
shoc: OpenCL - MD5 Hash                           0.37           0.39
shoc: OpenCL - Max SP Flops                       467            1476
shoc: OpenCL - Bus Speed Download                 20.48          29.51
shoc: OpenCL - Bus Speed Readback                 39.09          29.27
shoc: OpenCL - Texture Read Bandwidth             59.74          57.09
parboil: OpenCL BFS                               0.96           1.04
parboil: OpenCL LBM                               78.56          69.15
rodinia: OpenCL Heartwall                         35.17          9.16
blender: BMW27 - OpenCL                           413            461
blender: Classroom - OpenCL                       676            678
blender: Fishy Cat - OpenCL                       1014           971
blender: Barbershop - OpenCL                      1355           1372
blender: Pabellon Barcelona - OpenCL              929            926
clpeak: Kernel Latency                            21.70          23.83
clpeak: Single-Precision Float                    53.88          458.68
clpeak: Global Memory Bandwidth                   31.41          33.28
clpeak: Transfer Bandwidth enqueueReadBuffer      15.42          15.34
clpeak: Transfer Bandwidth enqueueWriteBuffer     40.82          41.31
viennacl: OpenCL LU Factorization                 8.52           21.23
luxmark: GPU - Hotel                              16             613
xsbench-cl:                                       11455503       (no result)
comd-cl: Average Atom Update Rate                 2.58           2.58
shoc: OpenCL - Triad                              9.20           13.91
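The percentage leads in the comparison chart can be reproduced from the overview values. The following is a minimal sketch, not the Phoronix Test Suite's own code: the function name `lead` and the `RESULTS` list are illustrative assumptions, and the four rows are copied from the overview above together with each test's "More Is Better" / "Fewer Is Better" direction.

```python
# (test, Beignet Git, Intel OpenCL NEO, higher_is_better), copied from the overview.
RESULTS = [
    ("clpeak: Single-Precision Float (GFLOPS)", 53.88, 458.68, True),
    ("rodinia: OpenCL Heartwall (seconds)", 35.17, 9.16, False),
    ("lczero: OpenCL (nodes per second)", 52.21, 105.73, True),
    ("shoc: OpenCL - Bus Speed Readback (GB/s)", 39.09, 29.27, True),
]


def lead(beignet, neo, higher_is_better):
    """Return (faster driver, its percentage lead over the slower driver)."""
    if not higher_is_better:
        # For time-like metrics, invert so that a larger value always means faster.
        beignet, neo = 1.0 / beignet, 1.0 / neo
    if neo >= beignet:
        return "Intel OpenCL NEO", (neo / beignet - 1.0) * 100.0
    return "Beignet Git", (beignet / neo - 1.0) * 100.0


for name, beignet, neo, higher_is_better in RESULTS:
    winner, percent = lead(beignet, neo, higher_is_better)
    print(f"{name}: {winner} leads by {percent:.1f}%")
```

Treating the slower driver as the baseline is an assumption here; it is the reading under which the overview values agree with the chart's figures.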

Darktable

Darktable is an open-source photography workflow application. This test uses any system-installed Darktable program or, on Windows, automatically downloads the pre-built binary from the project. Learn more via the OpenBenchmarking.org test page.

Darktable 2.4.2, Test: Boat - Acceleration: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 13.87 (SE +/- 0.03, N = 3)
  Intel OpenCL NEO: 13.87 (SE +/- 0.05, N = 3)

Darktable 2.4.2, Test: Masskrug - Acceleration: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 6.48 (SE +/- 0.01, N = 3)
  Intel OpenCL NEO: 6.48 (SE +/- 0.01, N = 3)

Darktable 2.4.2, Test: Server Rack - Acceleration: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 0.18 (SE +/- 0.00, N = 3)
  Intel OpenCL NEO: 0.19 (SE +/- 0.00, N = 3)

Darktable 2.4.2, Test: Server Room - Acceleration: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 4.75 (SE +/- 0.01, N = 3)
  Intel OpenCL NEO: 4.78 (SE +/- 0.01, N = 3)
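Each result in this file is reported as a value with "SE +/- x, N = n", that is, a standard error over n runs. The sketch below shows how such a figure could be computed, assuming the conventional standard error of the mean (sample standard deviation divided by the square root of N); the result file does not spell out the formula. The function name `standard_error` is illustrative, and the three run times are hypothetical values, not taken from the result file.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical run times in seconds (NOT measurements from this result file).
hypothetical_runs = [10.0, 10.2, 10.4]

se = standard_error(hypothetical_runs)
mean = statistics.mean(hypothetical_runs)
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(hypothetical_runs)})")
```

Under this assumption, a larger N alongside a larger SE, as in several Beignet Git results here, would indicate run-to-run variation that additional runs did not average away.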

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.20.1, Backend: OpenCL (Nodes Per Second, More Is Better)
  Beignet Git: 52.21 (SE +/- 3.79, N = 6)
  Intel OpenCL NEO: 105.73 (SE +/- 0.42, N = 3)
  (CXX) g++ options: -lpthread -lOpenCL -lz

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better)
  Beignet Git: 15.98 (SE +/- 0.00, N = 3)
  Intel OpenCL NEO: 17.65 (SE +/- 0.03, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL - Benchmark: MD5 Hash (GHash/s, More Is Better)
  Beignet Git: 0.37 (SE +/- 0.00, N = 3)
  Intel OpenCL NEO: 0.39 (SE +/- 0.00, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, More Is Better)
  Beignet Git: 467 (SE +/- 0.01, N = 3)
  Intel OpenCL NEO: 1476 (SE +/- 16.73, N = 15)

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL - Benchmark: Bus Speed Download (GB/s, More Is Better)
  Beignet Git: 20.48 (SE +/- 0.09, N = 3)
  Intel OpenCL NEO: 29.51 (SE +/- 0.15, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better)
  Beignet Git: 39.09 (SE +/- 0.47, N = 6)
  Intel OpenCL NEO: 29.27 (SE +/- 0.43, N = 4)

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
  Beignet Git: 59.74 (SE +/- 0.30, N = 3)
  Intel OpenCL NEO: 57.09 (SE +/- 0.00, N = 3)

(CXX) g++ options for the SHOC results above: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

Parboil

The Parboil Benchmarks from the IMPACT Research Group at University of Illinois are a set of throughput computing applications for looking at computing architecture and compilers. Parboil test-cases support OpenMP, OpenCL, and CUDA multi-processing environments. However, at this time the test profile is just making use of the OpenMP and OpenCL test workloads. Learn more via the OpenBenchmarking.org test page.

Parboil 2.5, Test: OpenCL BFS (Seconds, Fewer Is Better)
  Beignet Git: 0.96 (SE +/- 0.00, N = 3)
  Intel OpenCL NEO: 1.04 (SE +/- 0.01, N = 3)

Parboil 2.5, Test: OpenCL LBM (Seconds, Fewer Is Better)
  Beignet Git: 78.56 (SE +/- 0.03, N = 3)
  Intel OpenCL NEO: 69.15 (SE +/- 0.41, N = 3)

(CXX) g++ options for the Parboil results above: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Rodinia

Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes the OpenCL and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.

Rodinia 2.4, Test: OpenCL Heartwall (Seconds, Fewer Is Better)
  Beignet Git: 35.17 (SE +/- 2.44, N = 12)
  Intel OpenCL NEO: 9.16 (SE +/- 0.04, N = 3)
  (CXX) g++ options: -O2 -lOpenCL

Blender

Blender is an open-source 3D creation software project. This test is of Blender's Cycles benchmark with various sample files. GPU computing via OpenCL or CUDA is supported. Learn more via the OpenBenchmarking.org test page.

Blender 2.79a, Blend File: BMW27 - Compute: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 413
  Intel OpenCL NEO: 461

Blender 2.79a, Blend File: Classroom - Compute: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 676
  Intel OpenCL NEO: 678

Blender 2.79a, Blend File: Fishy Cat - Compute: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 1014
  Intel OpenCL NEO: 971

Blender 2.79a, Blend File: Barbershop - Compute: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 1355
  Intel OpenCL NEO: 1372

Blender 2.79a, Blend File: Pabellon Barcelona - Compute: OpenCL (Seconds, Fewer Is Better)
  Beignet Git: 929
  Intel OpenCL NEO: 926

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak, OpenCL Test: Kernel Latency (us, Fewer Is Better)
  Beignet Git: 21.70 (SE +/- 0.26, N = 3)
  Intel OpenCL NEO: 23.83 (SE +/- 0.10, N = 3)

clpeak, OpenCL Test: Single-Precision Float (GFLOPS, More Is Better)
  Beignet Git: 53.88 (SE +/- 5.07, N = 12)
  Intel OpenCL NEO: 458.68 (SE +/- 0.07, N = 3)

clpeak, OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
  Beignet Git: 31.41 (SE +/- 2.22, N = 12)
  Intel OpenCL NEO: 33.28 (SE +/- 0.09, N = 3)

clpeak, OpenCL Test: Transfer Bandwidth enqueueReadBuffer (GBPS, More Is Better)
  Beignet Git: 15.42 (SE +/- 0.02, N = 3)
  Intel OpenCL NEO: 15.34 (SE +/- 0.19, N = 5)

clpeak, OpenCL Test: Transfer Bandwidth enqueueWriteBuffer (GBPS, More Is Better)
  Beignet Git: 40.82 (SE +/- 0.15, N = 3)
  Intel OpenCL NEO: 41.31 (SE +/- 0.09, N = 3)

(CXX) g++ options for the clpeak results above: -O3 -rdynamic -lOpenCL

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile uses ViennaCL OpenCL support and runs the included computational benchmark. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.4.2, OpenCL LU Factorization (GFLOPS, More Is Better)
  Beignet Git: 8.52 (SE +/- 0.42, N = 12)
  Intel OpenCL NEO: 21.23 (SE +/- 0.03, N = 3)
  (CXX) g++ options: -rdynamic -lOpenCL

LuxMark

LuxMark is a multi-platform OpenCL benchmark using LuxRender. LuxMark supports targeting different OpenCL devices and has multiple scenes available for rendering. LuxMark is a fully open-source OpenCL program with real-world rendering examples. Learn more via the OpenBenchmarking.org test page.

LuxMark 3.1, OpenCL Device: GPU - Scene: Hotel (Score, More Is Better)
  Beignet Git: 16 (SE +/- 13.09, N = 10)
  Intel OpenCL NEO: 613 (SE +/- 6.08, N = 3)

Xsbench OpenCL

Xsbench benchmark in OpenCL via GPUOpen. Learn more via the OpenBenchmarking.org test page.

Xsbench OpenCL 2017-07-06 (Lookups/s, More Is Better)
  Beignet Git: 11455503 (SE +/- 2043.29, N = 3)
  (CC) gcc options: -std=gnu99 -fopenmp -O3 -lm -lOpenCL

CoMD OpenCL

CoMD benchmark in OpenCL via GPUOpen. Learn more via the OpenBenchmarking.org test page.

CoMD OpenCL 2017-07-06, Average Atom Update Rate (us/atom/task, More Is Better)
  Beignet Git: 2.58 (SE +/- 0.01, N = 3)
  Intel OpenCL NEO: 2.58 (SE +/- 0.00, N = 3)
  (CC) gcc options: -std=c99 -O5 -lm -lOpenCL

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2015-11-10, Target: OpenCL - Benchmark: Triad (GB/s, More Is Better)
  Beignet Git: 9.20 (SE +/- 0.42, N = 15)
  Intel OpenCL NEO: 13.91 (SE +/- 0.06, N = 3)
  (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi