OpenCL 2.0 Intel Beignet CPU Celeron Kabylake Comparison

Intel OpenCL 2.0 Beignet 1.3 benchmarking with Intel CPUs. Tests by Michael Larabel for a future article on Phoronix.

HTML result view exported from: https://openbenchmarking.org/result/1701313-RI-BEIGNET1211&rdt&grs.

System configurations. The first system is listed in full; for each later system, only the fields the export lists for it are shown.

Core i5 7600K
  Processor: Intel Core i5-7600K @ 3.80GHz (4 Cores)
  Motherboard: ASUS PRIME Z270-P
  Chipset: Intel Device 591f
  Memory: 16384MB
  Disk: Samsung SSD 950 PRO 256GB
  Graphics: Intel Kabylake GT2 3072MB (1150MHz)
  Audio: Realtek ALC887-VD
  Monitor: DELL P2415Q
  Network: Realtek RTL8111/8168/8411
  OS: Clear Linux
  Kernel: 4.9.5-302.native (x86_64)
  Desktop: Xfce 4.12
  Display Server: X Server 1.19.1
  Display Driver: modesetting 1.19.1
  OpenGL: 4.5 Mesa 17.0.0-devel
  OpenCL: OpenCL 2.0 beignet 1.3
  Vulkan: 1.0.37
  Compiler: GCC 6.3.0 + Clang 3.9.1 + LLVM 3.9.1
  File-System: ext4
  Screen Resolution: 1920x1080

Core i3 7100
  Processor: Intel Core i3-7100 @ 3.90GHz (4 Cores)
  Chipset: Intel Device 590f
  Graphics: Intel Kabylake GT2 3072MB (1100MHz)

Core i7 7700K
  Processor: Intel Core i7-7700K @ 4.20GHz (8 Cores)
  Chipset: Intel Device 591f
  Graphics: Intel Kabylake GT2 3072MB (1150MHz)

Pentium G4400
  Processor: Intel Pentium G4400 @ 3.30GHz (2 Cores)
  Motherboard: MSI B150M MORTAR (MS-7972) v2.0
  Chipset: Intel Skylake
  Memory: 8192MB
  Disk: 120GB Samsung SSD 850
  Graphics: Intel HD 510 (Skylake GT1) 3072MB (1000MHz)
  Audio: Realtek ALC892

Core i5 6600K
  Processor: Intel Core i5-6600K @ 3.50GHz (4 Cores)
  Motherboard: MSI Z170A GAMING PRO (MS-7984) v1.0
  Memory: 15360MB
  Disk: 256GB TS256GSSD370S
  Graphics: Intel HD 530 (Skylake GT2) 3072MB (1150MHz)
  Audio: Realtek ALC1150
  Network: Intel Connection

Core i5 6500
  Processor: Intel Core i5-6500 @ 3.20GHz (4 Cores)
  Motherboard: Gigabyte Z170M-D3H-CF
  Memory: 8192MB
  Disk: 250GB Samsung SSD 850
  Graphics: Intel HD 530 (Skylake GT2) 3072MB (1050MHz)
  Audio: Realtek ALC892

Pentium G4600
  Processor: Intel Pentium G4600 @ 3.60GHz (4 Cores)
  Motherboard: MSI Z270-A PRO (MS-7A71) v1.0
  Chipset: Intel Device 590f
  Memory: 15360MB
  Disk: Samsung SSD 950 PRO 256GB
  Graphics: Intel Kabylake GT2 3072MB (1100MHz)
  Network: Realtek RTL8111/8168/8411

Celeron G3930
  Processor: Intel Celeron G3930 @ 2.90GHz (2 Cores)
  Graphics: Intel Kabylake GT1 3072MB (1050MHz)

Compiler Details: --build=x86_64-generic-linux --disable-libunwind-exceptions --disable-multiarch --disable-vtable-verify --enable-__cxa_atexit --enable-bootstrap --enable-clocale=gnu --enable-gnu-indirect-function --enable-languages=c,c++,fortran,go --enable-ld=default --enable-libmpx --enable-libstdcxx-pch --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --exec-prefix=/usr --includedir=/usr/include --target=x86_64-generic-linux --with-arch=westmere --with-glibc-version=2.19 --with-gnu-ld --with-isl --with-ppl=yes --with-tune=haswell

Processor Details: Scaling Governor: acpi-cpufreq performance

Results overview

Test                                   Core i5 7600K  Core i3 7100   Core i7 7700K  Pentium G4400  Core i5 6600K  Core i5 6500   Pentium G4600  Celeron G3930
mandelgpu: GPU                         9180692.03     8031118.63     9186739.20     3613230.53     9149327.20     7886706.37     8029783.93     3822342.23
shoc: OpenCL - Max SP Flops            340.69         297.39         340.71         138.13         340.71         292.46         297.39         146.02
juliagpu: GPU                          36333556.83    31987657.03    37085079.77    16211893.10    36351871.53    31757837.23    32892655.40    17060871.17
shoc: OpenCL - Bus Speed Readback      35.12          32.08          38.52          24.35          33.96          29.57          26.35          19.10
shoc: OpenCL - Triad                   8.37           6.61           12.23          6.24           8.27           7.43           7.41           6.24
shoc: OpenCL - Texture Read Bandwidth  58.11          51.52          57.89          31.98          57.60          49.86          51.56          33.76
shoc: OpenCL - Bus Speed Download      26.18          24.86          28.39          21.07          25.60          21.51          22.39          17.97
shoc: OpenCL - FFT SP                  10.90          9.67           10.88          -              10.89          9.32           9.66           -
mandelbulbgpu: GPU                     7893608.83     6991142.00     7953468.13     -              7914468.33     6852443.00     7023912.23     -
shoc: OpenCL - MD5 Hash                0.32           0.28           0.32           -              0.32           0.28           0.28           -
cl-mem: Write                          42.92          47.97          39.13          42.30          46.02          40.48          44.32          46.35
cl-mem: Read                           39.72          44.12          41.07          40.32          39.92          45.88          39.98          41.07
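For "More Is Better" metrics, the raw figures above are easier to compare as a percentage of the fastest system. The following is a minimal sketch of that arithmetic in Python, using the SHOC Max SP Flops values from this result file; the function name `relative_to_best` is hypothetical and not part of the Phoronix Test Suite.

```python
# Max SP Flops results (GFLOPS, more is better), copied from this result file.
max_sp_flops = {
    "Core i5 7600K": 340.69,
    "Core i3 7100": 297.39,
    "Core i7 7700K": 340.71,
    "Pentium G4400": 138.13,
    "Core i5 6600K": 340.71,
    "Core i5 6500": 292.46,
    "Pentium G4600": 297.39,
    "Celeron G3930": 146.02,
}


def relative_to_best(results):
    """Return each result as a percentage of the highest value (hypothetical helper)."""
    best = max(results.values())
    return {name: 100.0 * value / best for name, value in results.items()}


relative = relative_to_best(max_sp_flops)

# Print the systems from fastest to slowest.
for name, pct in sorted(relative.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<14} {pct:6.1f}%")
```

The same helper applies unchanged to any other "More Is Better" row; a "Fewer Is Better" metric would need the ratio inverted.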

MandelGPU

OpenCL Device: GPU

MandelGPU 1.3pts1. Samples/sec, More Is Better.

Core i5 7600K: 9180692.03 (SE +/- 414.49, N = 3)
Core i3 7100: 8031118.63 (SE +/- 369.28, N = 3)
Core i7 7700K: 9186739.20 (SE +/- 7061.44, N = 3)
Pentium G4400: 3613230.53 (SE +/- 1228.27, N = 3)
Core i5 6600K: 9149327.20 (SE +/- 8446.97, N = 3)
Core i5 6500: 7886706.37 (SE +/- 473.77, N = 3)
Pentium G4600: 8029783.93 (SE +/- 5671.17, N = 3)
Celeron G3930: 3822342.23 (SE +/- 29.87, N = 3)

(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
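Each result is reported with "SE +/- x, N = n". Assuming SE here denotes the standard error of the mean over N runs (an assumption; the export does not define it), the implied sample standard deviation and the relative error can be recovered as sketched below. The helper names are hypothetical.

```python
import math


def stddev_from_se(se, n):
    """Sample standard deviation implied by a standard error of the mean: s = SE * sqrt(N)."""
    return se * math.sqrt(n)


def relative_se_percent(mean, se):
    """Standard error expressed as a percentage of the reported mean."""
    return 100.0 * se / mean


# Core i5 7600K, MandelGPU: 9180692.03 samples/sec, SE +/- 414.49, N = 3
mean, se, n = 9180692.03, 414.49, 3
print(f"implied stddev: {stddev_from_se(se, n):.2f} samples/sec")
print(f"relative SE:    {relative_se_percent(mean, se):.4f}%")
```

A relative SE far below one percent, as in this case, suggests run-to-run noise is small compared with the gaps between systems.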

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2015-11-10. GFLOPS, More Is Better.

Core i5 7600K: 340.69 (SE +/- 0.01, N = 3)
Core i3 7100: 297.39 (SE +/- 0.00, N = 3)
Core i7 7700K: 340.71 (SE +/- 0.00, N = 3)
Pentium G4400: 138.13 (SE +/- 0.02, N = 3)
Core i5 6600K: 340.71 (SE +/- 0.00, N = 3)
Core i5 6500: 292.46 (SE +/- 0.17, N = 3)
Pentium G4600: 297.39 (SE +/- 0.00, N = 3)
Celeron G3930: 146.02 (SE +/- 0.00, N = 3)

(CXX) g++ options: -O2 -pipe -fexceptions -fstack-protector -malign-data=abi -ftree-vectorize -fopt-info-vec -m64 -march=westmere -mtune=haswell -O3 -mtune=intel -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpicxx -lmpi

JuliaGPU

OpenCL Device: GPU

JuliaGPU 1.2pts1. Samples/sec, More Is Better.

Core i5 7600K: 36333556.83 (SE +/- 375450.50, N = 3)
Core i3 7100: 31987657.03 (SE +/- 30806.75, N = 3)
Core i7 7700K: 37085079.77 (SE +/- 469338.36, N = 3)
Pentium G4400: 16211893.10 (SE +/- 4354.56, N = 3)
Core i5 6600K: 36351871.53 (SE +/- 8517.71, N = 3)
Core i5 6500: 31757837.23 (SE +/- 7819.95, N = 3)
Pentium G4600: 32892655.40 (SE +/- 346084.01, N = 3)
Celeron G3930: 17060871.17 (SE +/- 721.79, N = 3)

(CC) gcc options: -O3 -march=native -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL -lm

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2015-11-10. GB/s, More Is Better.

Core i5 7600K: 35.12 (SE +/- 0.36, N = 3)
Core i3 7100: 32.08 (SE +/- 0.29, N = 3)
Core i7 7700K: 38.52 (SE +/- 0.32, N = 3)
Pentium G4400: 24.35 (SE +/- 0.25, N = 3)
Core i5 6600K: 33.96 (SE +/- 0.35, N = 3)
Core i5 6500: 29.57 (SE +/- 0.28, N = 3)
Pentium G4600: 26.35 (SE +/- 0.38, N = 3)
Celeron G3930: 19.10 (SE +/- 0.15, N = 3)

(CXX) g++ options: -O2 -pipe -fexceptions -fstack-protector -malign-data=abi -ftree-vectorize -fopt-info-vec -m64 -march=westmere -mtune=haswell -O3 -mtune=intel -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpicxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2015-11-10. GB/s, More Is Better.

Core i5 7600K: 8.37 (SE +/- 0.02, N = 3)
Core i3 7100: 6.61 (SE +/- 0.01, N = 3)
Core i7 7700K: 12.23 (SE +/- 0.19, N = 3)
Pentium G4400: 6.24 (SE +/- 0.01, N = 3)
Core i5 6600K: 8.27 (SE +/- 0.03, N = 3)
Core i5 6500: 7.43 (SE +/- 0.02, N = 3)
Pentium G4600: 7.41 (SE +/- 0.12, N = 4)
Celeron G3930: 6.24 (SE +/- 0.02, N = 3)

(CXX) g++ options: -O2 -pipe -fexceptions -fstack-protector -malign-data=abi -ftree-vectorize -fopt-info-vec -m64 -march=westmere -mtune=haswell -O3 -mtune=intel -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpicxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2015-11-10. GB/s, More Is Better.

Core i5 7600K: 58.11 (SE +/- 0.00, N = 3)
Core i3 7100: 51.52 (SE +/- 0.00, N = 3)
Core i7 7700K: 57.89 (SE +/- 0.22, N = 3)
Pentium G4400: 31.98 (SE +/- 0.40, N = 3)
Core i5 6600K: 57.60 (SE +/- 0.52, N = 3)
Core i5 6500: 49.86 (SE +/- 0.35, N = 3)
Pentium G4600: 51.56 (SE +/- 0.03, N = 3)
Celeron G3930: 33.76 (SE +/- 0.42, N = 3)

(CXX) g++ options: -O2 -pipe -fexceptions -fstack-protector -malign-data=abi -ftree-vectorize -fopt-info-vec -m64 -march=westmere -mtune=haswell -O3 -mtune=intel -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpicxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2015-11-10. GB/s, More Is Better.

Core i5 7600K: 26.18 (SE +/- 0.19, N = 3)
Core i3 7100: 24.86 (SE +/- 0.03, N = 3)
Core i7 7700K: 28.39 (SE +/- 0.05, N = 3)
Pentium G4400: 21.07 (SE +/- 0.07, N = 3)
Core i5 6600K: 25.60 (SE +/- 0.06, N = 3)
Core i5 6500: 21.51 (SE +/- 0.14, N = 3)
Pentium G4600: 22.39 (SE +/- 0.07, N = 3)
Celeron G3930: 17.97 (SE +/- 0.03, N = 3)

(CXX) g++ options: -O2 -pipe -fexceptions -fstack-protector -malign-data=abi -ftree-vectorize -fopt-info-vec -m64 -march=westmere -mtune=haswell -O3 -mtune=intel -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpicxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2015-11-10. GFLOPS, More Is Better. No results are listed for the Pentium G4400 or Celeron G3930.

Core i5 7600K: 10.90 (SE +/- 0.00, N = 3)
Core i3 7100: 9.67 (SE +/- 0.00, N = 3)
Core i7 7700K: 10.88 (SE +/- 0.00, N = 3)
Core i5 6600K: 10.89 (SE +/- 0.00, N = 3)
Core i5 6500: 9.32 (SE +/- 0.01, N = 3)
Pentium G4600: 9.66 (SE +/- 0.00, N = 3)

(CXX) g++ options: -O2 -pipe -fexceptions -fstack-protector -malign-data=abi -ftree-vectorize -fopt-info-vec -m64 -march=westmere -mtune=haswell -O3 -mtune=intel -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpicxx -lmpi

MandelbulbGPU

OpenCL Device: GPU

MandelbulbGPU 1.0pts1. Samples/sec, More Is Better. No results are listed for the Pentium G4400 or Celeron G3930.

Core i5 7600K: 7893608.83 (SE +/- 16049.95, N = 3)
Core i3 7100: 6991142.00 (SE +/- 467.88, N = 3)
Core i7 7700K: 7953468.13 (SE +/- 19261.23, N = 3)
Core i5 6600K: 7914468.33 (SE +/- 864.90, N = 3)
Core i5 6500: 6852443.00 (SE +/- 16180.63, N = 3)
Pentium G4600: 7023912.23 (SE +/- 15808.74, N = 3)

(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2015-11-10. GHash/s, More Is Better. No results are listed for the Pentium G4400 or Celeron G3930.

Core i5 7600K: 0.32 (SE +/- 0.00, N = 3)
Core i3 7100: 0.28 (SE +/- 0.00, N = 3)
Core i7 7700K: 0.32 (SE +/- 0.00, N = 3)
Core i5 6600K: 0.32 (SE +/- 0.00, N = 3)
Core i5 6500: 0.28 (SE +/- 0.00, N = 3)
Pentium G4600: 0.28 (SE +/- 0.00, N = 3)

(CXX) g++ options: -O2 -pipe -fexceptions -fstack-protector -malign-data=abi -ftree-vectorize -fopt-info-vec -m64 -march=westmere -mtune=haswell -O3 -mtune=intel -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpicxx -lmpi

cl-mem

Benchmark: Write

cl-mem 2017-01-13. GB/s, More Is Better.

Core i5 7600K: 42.92 (SE +/- 2.03, N = 6)
Core i3 7100: 47.97 (SE +/- 3.35, N = 6)
Core i7 7700K: 39.13 (SE +/- 0.77, N = 3)
Pentium G4400: 42.30 (SE +/- 2.02, N = 6)
Core i5 6600K: 46.02 (SE +/- 1.25, N = 6)
Core i5 6500: 40.48 (SE +/- 2.50, N = 6)
Pentium G4600: 44.32 (SE +/- 3.56, N = 6)
Celeron G3930: 46.35 (SE +/- 2.80, N = 6)

(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13. GB/s, More Is Better.

Core i5 7600K: 39.72 (SE +/- 1.38, N = 6)
Core i3 7100: 44.12 (SE +/- 1.44, N = 6)
Core i7 7700K: 41.07 (SE +/- 2.40, N = 6)
Pentium G4400: 40.32 (SE +/- 2.78, N = 6)
Core i5 6600K: 39.92 (SE +/- 1.86, N = 6)
Core i5 6500: 45.88 (SE +/- 2.54, N = 6)
Pentium G4600: 39.98 (SE +/- 2.47, N = 6)
Celeron G3930: 41.07 (SE +/- 2.40, N = 6)

(CC) gcc options: -O2 -flto -lOpenCL


Phoronix Test Suite v10.8.4