Intel OpenCL NEO Ubuntu 19.04

Intel Core i7-6770HQ OpenCL Linux benchmarking by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/1905082-HV-NEOBEIGNE73.

System details

Tested configurations: Beignet OpenCL 2.0, NEO OpenCL 2.1

Processor: Intel Core i7-6770HQ @ 3.50GHz (4 Cores / 8 Threads)
Motherboard: Intel NUC6i7KYB (KYSKLi70.86A.0037.2016.0603.1032 BIOS)
Chipset: Intel Xeon E3-1200 v5/E3-1500
Memory: 32768MB
Disk: Samsung SSD 950 PRO 512GB
Graphics: Intel Iris Pro 580 3GB (950MHz)
Audio: Realtek ALC233
Monitor: DELL P2415Q
Network: Intel I219-LM + Intel 8260
OS: Ubuntu 19.04
Kernel: 5.0.0-13-generic (x86_64)
Desktop: GNOME Shell 3.32.0
Display Server: X Server 1.20.4
Display Driver: modesetting 1.20.4
OpenGL: 4.5 Mesa 19.1.0-devel (git-bdd273d 2019-05-06 disco-oibaf-ppa)
OpenCL: OpenCL 2.0 beignet 1.3 (Beignet OpenCL 2.0); OpenCL 2.1 (NEO OpenCL 2.1)
Compiler: GCC 8.3.0
File-System: ext4
Screen Resolution: 3840x2160

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: intel_pstate powersave

Security Details: KPTI + __user pointer sanitization + Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + SSB disabled via prctl and seccomp + PTE Inversion; VMX: conditional cache flushes SMT vulnerable

Results overview (a dash marks a test with no result for that configuration)

Test                                            Beignet OpenCL 2.0   NEO OpenCL 2.1
shoc: OpenCL - Triad                            8.54                 17.18
shoc: OpenCL - FFT SP                           28.63                41.24
shoc: OpenCL - MD5 Hash                         0.89                 0.94
shoc: OpenCL - Max SP Flops                     845                  3176
shoc: OpenCL - Bus Speed Download               10.73                27.63
shoc: OpenCL - Bus Speed Readback               18.62                27.93
shoc: OpenCL - Texture Read Bandwidth           141                  128
viennacl: OpenCL LU Factorization               12.92                20.25
cl-mem: Copy                                    46.65                -
cl-mem: Read                                    43.09                -
cl-mem: Write                                   48.92                -
lczero: OpenCL                                  116                  155
darktable: Boat - OpenCL                        16.59                15.64
darktable: Masskrug - OpenCL                    12.02                10.67
darktable: Server Rack - OpenCL                 0.28                 0.52
darktable: Server Room - OpenCL                 8.41                 5.30
blender: BMW27 - OpenCL                         695                  679
blender: Barbershop - OpenCL                    2779                 2745
xsbench-cl:                                     23624949             -
juliagpu: GPU                                   49218726             66870065
clpeak: Kernel Latency                          31.25                33.80
clpeak: Single-Precision Float                  1054                 1077
clpeak: Global Memory Bandwidth                 25.64                25.78
clpeak: Transfer Bandwidth enqueueReadBuffer    9.37                 9.59
clpeak: Transfer Bandwidth enqueueWriteBuffer   28.94                29.47
comd-cl: Average Atom Update Rate               3.73                 3.61
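The overview mixes "More Is Better" and "Fewer Is Better" metrics, so raw numbers cannot be compared by eye across rows. The sketch below normalizes a handful of the figures above into a single "NEO relative to Beignet" ratio, where a value above 1 means NEO was faster. The helper names (`neo_speedup`, `summarize`) and the choice of rows are illustrative assumptions, not part of the exported result.

```python
# A few rows copied from the results overview:
# name -> (Beignet OpenCL 2.0, NEO OpenCL 2.1, higher_is_better)
RESULTS = {
    "shoc: Triad (GB/s)": (8.54, 17.18, True),
    "shoc: Max SP Flops (GFLOPS)": (845, 3176, True),
    "shoc: Texture Read Bandwidth (GB/s)": (141, 128, True),
    "darktable: Server Room (s)": (8.41, 5.30, False),
    "darktable: Server Rack (s)": (0.28, 0.52, False),
    "clpeak: Kernel Latency (us)": (31.25, 33.80, False),
}


def neo_speedup(beignet, neo, higher_is_better):
    """Return NEO's performance relative to Beignet (>1 means NEO is faster).

    For throughput-style metrics the ratio is neo / beignet; for time-style
    metrics (fewer is better) it is inverted to beignet / neo.
    """
    return neo / beignet if higher_is_better else beignet / neo


def summarize(results):
    """Map each test name to its rounded NEO-vs-Beignet ratio."""
    return {name: round(neo_speedup(*vals), 2) for name, vals in results.items()}


if __name__ == "__main__":
    for name, ratio in summarize(RESULTS).items():
        print(f"{name:40s} {ratio:5.2f}x")
```

Inverting the ratio for time-based tests keeps every row on the same scale, which makes regressions such as Darktable Server Rack stand out next to large gains such as SHOC Max SP Flops.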

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GB/s, More Is Better)
Beignet OpenCL 2.0: 8.54 (SE +/- 0.27, N = 12)
NEO OpenCL 2.1: 17.18 (SE +/- 0.15, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt
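Each result is reported with an "SE +/-" figure and a run count N. Assuming SE here is the usual standard error of the mean (sample standard deviation divided by the square root of N), the following minimal sketch shows how the Triad figures above translate into a relative error and an approximate run-to-run standard deviation. The function names are hypothetical and the interpretation of SE is an assumption, not something stated in the export.

```python
import math


def relative_se_percent(mean, se):
    """Standard error expressed as a percentage of the reported mean."""
    return 100.0 * se / mean


def sample_std_from_se(se, n):
    """Approximate sample standard deviation, assuming SE = s / sqrt(N)."""
    return se * math.sqrt(n)


if __name__ == "__main__":
    # SHOC Triad figures from the result above: (mean GB/s, SE, N)
    triad = {
        "Beignet OpenCL 2.0": (8.54, 0.27, 12),
        "NEO OpenCL 2.1": (17.18, 0.15, 3),
    }
    for name, (mean, se, n) in triad.items():
        print(
            f"{name}: SE = {relative_se_percent(mean, se):.2f}% of mean, "
            f"s ~ {sample_std_from_se(se, n):.2f} GB/s over {n} runs"
        )
```

Under that assumption, a result with a larger N and a larger SE points to noisier individual runs, since more runs were needed and the spread stayed high.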

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GFLOPS, More Is Better)
Beignet OpenCL 2.0: 28.63 (SE +/- 0.01, N = 3)
NEO OpenCL 2.1: 41.24 (SE +/- 0.04, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GHash/s, More Is Better)
Beignet OpenCL 2.0: 0.89 (SE +/- 0.00, N = 3)
NEO OpenCL 2.1: 0.94 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GFLOPS, More Is Better)
Beignet OpenCL 2.0: 845 (SE +/- 0.12, N = 3)
NEO OpenCL 2.1: 3176 (SE +/- 2.05, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GB/s, More Is Better)
Beignet OpenCL 2.0: 10.73 (SE +/- 0.14, N = 4)
NEO OpenCL 2.1: 27.63 (SE +/- 0.39, N = 4)
(CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GB/s, More Is Better)
Beignet OpenCL 2.0: 18.62 (SE +/- 0.15, N = 3)
NEO OpenCL 2.1: 27.93 (SE +/- 0.32, N = 6)
(CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2015-11-10 (GB/s, More Is Better)
Beignet OpenCL 2.0: 141 (SE +/- 0.20, N = 3)
NEO OpenCL 2.1: 128 (SE +/- 0.37, N = 3)
(CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt

ViennaCL

OpenCL LU Factorization

ViennaCL 1.4.2 (GFLOPS, More Is Better)
Beignet OpenCL 2.0: 12.92 (SE +/- 0.61, N = 15)
NEO OpenCL 2.1: 20.25 (SE +/- 0.02, N = 3)
(CXX) g++ options: -rdynamic -lOpenCL

cl-mem

Benchmark: Copy

cl-mem 2017-01-13 (GB/s, More Is Better)
Beignet OpenCL 2.0: 46.65 (SE +/- 2.67, N = 12)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13 (GB/s, More Is Better)
Beignet OpenCL 2.0: 43.09 (SE +/- 2.62, N = 12)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

cl-mem 2017-01-13 (GB/s, More Is Better)
Beignet OpenCL 2.0: 48.92 (SE +/- 1.77, N = 15)
(CC) gcc options: -O2 -flto -lOpenCL

LeelaChessZero

Backend: OpenCL

LeelaChessZero 0.20.1 (Nodes Per Second, More Is Better)
Beignet OpenCL 2.0: 116 (SE +/- 6.45, N = 12)
NEO OpenCL 2.1: 155 (SE +/- 1.39, N = 12)
(CXX) g++ options: -lpthread

Darktable

Test: Boat - Acceleration: OpenCL

Darktable 2.6.0 (Seconds, Fewer Is Better)
Beignet OpenCL 2.0: 16.59 (SE +/- 0.04, N = 3)
NEO OpenCL 2.1: 15.64 (SE +/- 0.06, N = 3)

Darktable

Test: Masskrug - Acceleration: OpenCL

Darktable 2.6.0 (Seconds, Fewer Is Better)
Beignet OpenCL 2.0: 12.02 (SE +/- 0.01, N = 3)
NEO OpenCL 2.1: 10.67 (SE +/- 0.06, N = 3)

Darktable

Test: Server Rack - Acceleration: OpenCL

Darktable 2.6.0 (Seconds, Fewer Is Better)
Beignet OpenCL 2.0: 0.28 (SE +/- 0.00, N = 4)
NEO OpenCL 2.1: 0.52 (SE +/- 0.01, N = 5)

Darktable

Test: Server Room - Acceleration: OpenCL

Darktable 2.6.0 (Seconds, Fewer Is Better)
Beignet OpenCL 2.0: 8.41 (SE +/- 0.00, N = 3)
NEO OpenCL 2.1: 5.30 (SE +/- 0.03, N = 3)

Blender

Blend File: BMW27 - Compute: OpenCL

Blender 2.79a (Seconds, Fewer Is Better)
Beignet OpenCL 2.0: 695
NEO OpenCL 2.1: 679

Blender

Blend File: Barbershop - Compute: OpenCL

Blender 2.79a (Seconds, Fewer Is Better)
Beignet OpenCL 2.0: 2779
NEO OpenCL 2.1: 2745

Xsbench OpenCL

Xsbench OpenCL 2017-07-06 (Lookups/s, More Is Better)
Beignet OpenCL 2.0: 23624949 (SE +/- 12258.51, N = 3)
(CC) gcc options: -std=gnu99 -fopenmp -O3 -lm -lOpenCL

JuliaGPU

OpenCL Device: GPU

JuliaGPU 1.2pts1 (Samples/sec, More Is Better)
Beignet OpenCL 2.0: 49218726 (SE +/- 105679.48, N = 3)
NEO OpenCL 2.1: 66870065 (SE +/- 208041.13, N = 3)
(CC) gcc options: -O3 -march=native -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL -lm

clpeak

OpenCL Test: Kernel Latency

clpeak (us, Fewer Is Better)
Beignet OpenCL 2.0: 31.25 (SE +/- 0.28, N = 15)
NEO OpenCL 2.1: 33.80 (SE +/- 0.48, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Single-Precision Float

clpeak (GFLOPS, More Is Better)
Beignet OpenCL 2.0: 1054 (SE +/- 0.23, N = 3)
NEO OpenCL 2.1: 1077 (SE +/- 3.42, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Global Memory Bandwidth

clpeak (GBPS, More Is Better)
Beignet OpenCL 2.0: 25.64 (SE +/- 0.16, N = 3)
NEO OpenCL 2.1: 25.78 (SE +/- 0.12, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Transfer Bandwidth enqueueReadBuffer

clpeak (GBPS, More Is Better)
Beignet OpenCL 2.0: 9.37 (SE +/- 0.06, N = 3)
NEO OpenCL 2.1: 9.59 (SE +/- 0.06, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

clpeak

OpenCL Test: Transfer Bandwidth enqueueWriteBuffer

clpeak (GBPS, More Is Better)
Beignet OpenCL 2.0: 28.94 (SE +/- 0.29, N = 3)
NEO OpenCL 2.1: 29.47 (SE +/- 0.14, N = 3)
(CXX) g++ options: -O3 -rdynamic -lOpenCL

CoMD OpenCL

Average Atom Update Rate

CoMD OpenCL 2017-07-06 (us/atom/task, More Is Better)
Beignet OpenCL 2.0: 3.73 (SE +/- 0.05, N = 4)
NEO OpenCL 2.1: 3.61 (SE +/- 0.02, N = 3)
(CC) gcc options: -std=c99 -O5 -lm -lOpenCL


Phoronix Test Suite v10.8.4