Coffeelake Beignet vs. OpenCL NEO Intel

Intel Core i7-8700K testing of OpenCL Linux drivers by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/1905107-HV-COFFEELAK00&obr_ncb=y&grr&rdt.

System configuration (Beignet Git):
Processor: Intel Core i7-8700K @ 4.70GHz (6 Cores / 12 Threads)
Motherboard: ASUS TUF Z370-PLUS GAMING (1802 BIOS)
Chipset: Intel 8th Gen Core
Memory: 16384MB
Disk: 128GB THNSN5128GPU7 TOSHIBA
Graphics: inteldrmfb (1200MHz)
Audio: Realtek ALC887-VD
Monitor: DELL P2415Q
Network: Intel I219-V
OS: Ubuntu 18.04
Kernel: 4.18.0-18-generic (x86_64)
Desktop: GNOME Shell 3.28.3
Display Server: X Server 1.20.1
Display Driver: modesetting 1.20.1
OpenGL: 4.5 Mesa 18.2.8
OpenCL: OpenCL 2.0 beignet 1.4 (git-fc5f430c)
Compiler: GCC 7.4.0 + Clang 6.0.0-1ubuntu2 + LLVM 6.0.0
File-System: ext4
Screen Resolution: 3840x2160

System configuration (Intel OpenCL NEO), listing only the fields reported differently from Beignet Git:
Graphics: Intel UHD 630 3GB (1200MHz)
OpenCL: OpenCL 2.1

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: intel_pstate powersave

Python Details: Python 2.7.15rc1 + Python 3.6.7

Security Details: KPTI + __user pointer sanitization + Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + SSB disabled via prctl and seccomp + PTE Inversion

Result overview:

Test                                             Beignet Git   Intel OpenCL NEO
lczero: OpenCL                                   52.21         105.73
luxmark: GPU - Hotel                             16            613
blender: Barbershop - OpenCL                     1355          1372
blender: Fishy Cat - OpenCL                      1014          971
blender: Pabellon Barcelona - OpenCL             929           926
blender: Classroom - OpenCL                      676           678
blender: BMW27 - OpenCL                          413           461
clpeak: Single-Precision Float                   53.88         458.68
clpeak: Global Memory Bandwidth                  31.41         33.28
shoc: OpenCL - Max SP Flops                      467           1476
shoc: OpenCL - Texture Read Bandwidth            59.74         57.09
parboil: OpenCL LBM                              78.56         69.15
rodinia: OpenCL Heartwall                        35.17         9.16
shoc: OpenCL - MD5 Hash                          0.37          0.39
xsbench-cl                                       11455503      (no result)
darktable: Boat - OpenCL                         13.87         13.87
viennacl: OpenCL LU Factorization                8.52          21.23
comd-cl: Average Atom Update Rate                2.58          2.58
clpeak: Transfer Bandwidth enqueueReadBuffer     15.42         15.34
shoc: OpenCL - Triad                             9.20          13.91
darktable: Masskrug - OpenCL                     6.48          6.48
clpeak: Transfer Bandwidth enqueueWriteBuffer    40.82         41.31
darktable: Server Room - OpenCL                  4.75          4.78
shoc: OpenCL - FFT SP                            15.98         17.65
clpeak: Kernel Latency                           21.70         23.83
parboil: OpenCL BFS                              0.96          1.04
shoc: OpenCL - Bus Speed Readback                39.09         29.27
shoc: OpenCL - Bus Speed Download                20.48         29.51
darktable: Server Rack - OpenCL                  0.18          0.19
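Because the tests mix "More Is Better" and "Fewer Is Better" units, the two drivers are easier to compare as ratios normalized so that a value above 1 always favors Intel OpenCL NEO. The following is a minimal sketch of that arithmetic, not part of the Phoronix Test Suite output. It uses a hand-picked subset of the figures reported in this result file, and the names `RESULTS`, `neo_speedup`, and `geometric_mean` are hypothetical helpers introduced only for illustration.

```python
import math

# (test, Beignet Git, Intel OpenCL NEO, higher_is_better)
# Values are copied from this result file; the selection of tests is arbitrary.
RESULTS = [
    ("lczero: OpenCL", 52.21, 105.73, True),                     # Nodes Per Second
    ("luxmark: GPU - Hotel", 16, 613, True),                     # Score
    ("blender: BMW27 - OpenCL", 413, 461, False),                # Seconds
    ("clpeak: Single-Precision Float", 53.88, 458.68, True),     # GFLOPS
    ("shoc: OpenCL - Max SP Flops", 467, 1476, True),            # GFLOPS
    ("rodinia: OpenCL Heartwall", 35.17, 9.16, False),           # Seconds
    ("viennacl: OpenCL LU Factorization", 8.52, 21.23, True),    # GFLOPS
    ("shoc: OpenCL - Bus Speed Readback", 39.09, 29.27, True),   # GB/s
    ("darktable: Boat - OpenCL", 13.87, 13.87, False),           # Seconds
]


def neo_speedup(beignet, neo, higher_is_better):
    """Ratio normalized so that > 1 means NEO did better than Beignet."""
    return neo / beignet if higher_is_better else beignet / neo


def geometric_mean(values):
    """Geometric mean, the usual way to average ratios."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


speedups = {name: neo_speedup(b, n, hib) for name, b, n, hib in RESULTS}

for name, ratio in speedups.items():
    print(f"{name:36s} {ratio:7.2f}x")
print(f"{'geometric mean (subset)':36s} {geometric_mean(speedups.values()):7.2f}x")
```

The geometric mean here covers only the listed subset, so it should not be read as an overall score for either driver.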

LeelaChessZero

Backend: OpenCL

LeelaChessZero 0.20.1 - Nodes Per Second, More Is Better
Beignet Git: 52.21 (SE +/- 3.79, N = 6)
Intel OpenCL NEO: 105.73 (SE +/- 0.42, N = 3)
(CXX) g++ options: -lpthread -lOpenCL -lz

LuxMark

OpenCL Device: GPU - Scene: Hotel

LuxMark 3.1 - Score, More Is Better
Beignet Git: 16 (SE +/- 13.09, N = 10)
Intel OpenCL NEO: 613 (SE +/- 6.08, N = 3)

Blender

Blend File: Barbershop - Compute: OpenCL

Blender 2.79a - Seconds, Fewer Is Better
Beignet Git: 1355
Intel OpenCL NEO: 1372

Blender

Blend File: Fishy Cat - Compute: OpenCL

Blender 2.79a - Seconds, Fewer Is Better
Beignet Git: 1014
Intel OpenCL NEO: 971

Blender

Blend File: Pabellon Barcelona - Compute: OpenCL

Blender 2.79a - Seconds, Fewer Is Better
Beignet Git: 929
Intel OpenCL NEO: 926

Blender

Blend File: Classroom - Compute: OpenCL

Blender 2.79a - Seconds, Fewer Is Better
Beignet Git: 676
Intel OpenCL NEO: 678

Blender

Blend File: BMW27 - Compute: OpenCL

Blender 2.79a - Seconds, Fewer Is Better
Beignet Git: 413
Intel OpenCL NEO: 461

clpeak

OpenCL Test: Single-Precision Float

clpeak - GFLOPS, More Is Better
Beignet Git: 53.88 (SE +/- 5.07, N = 12)
Intel OpenCL NEO: 458.68 (SE +/- 0.07, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Global Memory Bandwidth

clpeak - GBPS, More Is Better
Beignet Git: 31.41 (SE +/- 2.22, N = 12)
Intel OpenCL NEO: 33.28 (SE +/- 0.09, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2015-11-10 - GFLOPS, More Is Better
Beignet Git: 467 (SE +/- 0.01, N = 3)
Intel OpenCL NEO: 1476 (SE +/- 16.73, N = 15)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2015-11-10 - GB/s, More Is Better
Beignet Git: 59.74 (SE +/- 0.30, N = 3)
Intel OpenCL NEO: 57.09 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

Parboil

Test: OpenCL LBM

Parboil 2.5 - Seconds, Fewer Is Better
Beignet Git: 78.56 (SE +/- 0.03, N = 3)
Intel OpenCL NEO: 69.15 (SE +/- 0.41, N = 3)
(CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Rodinia

Test: OpenCL Heartwall

Rodinia 2.4 - Seconds, Fewer Is Better
Beignet Git: 35.17 (SE +/- 2.44, N = 12)
Intel OpenCL NEO: 9.16 (SE +/- 0.04, N = 3)
(CXX) g++ options: -O2 -lOpenCL

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2015-11-10 - GHash/s, More Is Better
Beignet Git: 0.37 (SE +/- 0.00, N = 3)
Intel OpenCL NEO: 0.39 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

Xsbench OpenCL

Xsbench OpenCL 2017-07-06 - Lookups/s, More Is Better
Beignet Git: 11455503 (SE +/- 2043.29, N = 3)
(CC) gcc options: -std=gnu99 -fopenmp -O3 -lm -lOpenCL

Darktable

Test: Boat - Acceleration: OpenCL

Darktable 2.4.2 - Seconds, Fewer Is Better
Beignet Git: 13.87 (SE +/- 0.03, N = 3)
Intel OpenCL NEO: 13.87 (SE +/- 0.05, N = 3)

ViennaCL

OpenCL LU Factorization

ViennaCL 1.4.2 - GFLOPS, More Is Better
Beignet Git: 8.52 (SE +/- 0.42, N = 12)
Intel OpenCL NEO: 21.23 (SE +/- 0.03, N = 3)
(CXX) g++ options: -rdynamic -lOpenCL

CoMD OpenCL

Average Atom Update Rate

CoMD OpenCL 2017-07-06 - us/atom/task, More Is Better
Beignet Git: 2.58 (SE +/- 0.01, N = 3)
Intel OpenCL NEO: 2.58 (SE +/- 0.00, N = 3)
(CC) gcc options: -std=c99 -O5 -lm -lOpenCL

clpeak

OpenCL Test: Transfer Bandwidth enqueueReadBuffer

clpeak - GBPS, More Is Better
Beignet Git: 15.42 (SE +/- 0.02, N = 3)
Intel OpenCL NEO: 15.34 (SE +/- 0.19, N = 5)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2015-11-10 - GB/s, More Is Better
Beignet Git: 9.20 (SE +/- 0.42, N = 15)
Intel OpenCL NEO: 13.91 (SE +/- 0.06, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

Darktable

Test: Masskrug - Acceleration: OpenCL

Darktable 2.4.2 - Seconds, Fewer Is Better
Beignet Git: 6.48 (SE +/- 0.01, N = 3)
Intel OpenCL NEO: 6.48 (SE +/- 0.01, N = 3)

clpeak

OpenCL Test: Transfer Bandwidth enqueueWriteBuffer

clpeak - GBPS, More Is Better
Beignet Git: 40.82 (SE +/- 0.15, N = 3)
Intel OpenCL NEO: 41.31 (SE +/- 0.09, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

Darktable

Test: Server Room - Acceleration: OpenCL

Darktable 2.4.2 - Seconds, Fewer Is Better
Beignet Git: 4.75 (SE +/- 0.01, N = 3)
Intel OpenCL NEO: 4.78 (SE +/- 0.01, N = 3)

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2015-11-10 - GFLOPS, More Is Better
Beignet Git: 15.98 (SE +/- 0.00, N = 3)
Intel OpenCL NEO: 17.65 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

clpeak

OpenCL Test: Kernel Latency

clpeak - us, Fewer Is Better
Beignet Git: 21.70 (SE +/- 0.26, N = 3)
Intel OpenCL NEO: 23.83 (SE +/- 0.10, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

Parboil

Test: OpenCL BFS

Parboil 2.5 - Seconds, Fewer Is Better
Beignet Git: 0.96 (SE +/- 0.00, N = 3)
Intel OpenCL NEO: 1.04 (SE +/- 0.01, N = 3)
(CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2015-11-10 - GB/s, More Is Better
Beignet Git: 39.09 (SE +/- 0.47, N = 6)
Intel OpenCL NEO: 29.27 (SE +/- 0.43, N = 4)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2015-11-10 - GB/s, More Is Better
Beignet Git: 20.48 (SE +/- 0.09, N = 3)
Intel OpenCL NEO: 29.51 (SE +/- 0.15, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

Darktable

Test: Server Rack - Acceleration: OpenCL

Darktable 2.4.2 - Seconds, Fewer Is Better
Beignet Git: 0.18 (SE +/- 0.00, N = 3)
Intel OpenCL NEO: 0.19 (SE +/- 0.00, N = 3)


Phoronix Test Suite v10.8.5