Tiger Lake Xe Graphics Performance

Intel Core i7-1165G7 testing with a Dell 0GG9PT (1.0.3 BIOS) and Intel Xe 3GB on Ubuntu 20.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2010228-FI-TIGERLAKE89

Run Management

Result Identifier: Xe Graphics
Date Run: October 21 2020
Test Duration: 8 Hours, 42 Minutes

Processor: Intel Core i7-1165G7 @ 4.70GHz (4 Cores / 8 Threads)
Motherboard: Dell 0GG9PT (1.0.3 BIOS)
Chipset: Intel Device a0ef
Memory: 16GB
Disk: Kioxia KBG40ZNS256G NVMe 256GB
Graphics: Intel Xe 3GB (1300MHz)
Audio: Realtek ALC289
Network: Intel Device a0f0
OS: Ubuntu 20.04
Kernel: 5.9.0-050900daily20201021-generic (x86_64)
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
OpenGL: 4.6 Mesa 20.3.0-devel (git-3d51c27 2020-10-21 focal-oibaf-ppa)
OpenCL: OpenCL 3.0
Vulkan: 1.2.145
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1200

System Logs

- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: intel_pstate powersave - CPU Microcode: 0x60 - Thermald 1.9.1
- Python 2.7.18 + Python 3.8.5
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

Test | Xe Graphics
realsr-ncnn: 4x - No | 73.636
realsr-ncnn: 4x - Yes | 581.964
waifu2x-ncnn: 2x - 3 - No | 4.712
waifu2x-ncnn: 2x - 3 - Yes | 28.601
financebench: Monte-Carlo OpenCL | 451.465993
financebench: Black-Scholes OpenCL | 4.326
shoc: OpenCL - Triad | 15.2003
shoc: OpenCL - FFT SP | 155.623
shoc: OpenCL - MD5 Hash | 1.6782
shoc: OpenCL - Max SP Flops | 7820.18
shoc: OpenCL - Bus Speed Download | 53.1748
shoc: OpenCL - Bus Speed Readback | 55.8650
shoc: OpenCL - Texture Read Bandwidth | 189.598
viennacl: OpenCL LU Factorization | 53.2506
cl-mem: Copy | 47.7
cl-mem: Read | 56.4
cl-mem: Write | 47.3
etlegacy: Renderer2 - 1920 x 1200 | 106.7
tesseract: 1920 x 1200 | 132.9193
unigine-heaven: 1920 x 1200 - Fullscreen - OpenGL | 28.0101
unigine-super: 1920 x 1200 - Fullscreen - Low - OpenGL | 29.2
unigine-super: 1920 x 1200 - Fullscreen - Medium - OpenGL | 16.2
unigine-valley: 1920 x 1200 - Fullscreen - OpenGL | 30.5298
xonotic: 1920 x 1200 - Low | 279.3196982
xonotic: 1920 x 1200 - High | 188.2008468
xonotic: 1920 x 1200 - Ultra | 160.8638948
xonotic: 1920 x 1200 - Ultimate | 121.5502594
gputest: GiMark - 1920 x 1200 - Fullscreen | 2014
gputest: Furmark - 1920 x 1200 - Fullscreen | 1794
gputest: TessMark - 1920 x 1200 - Fullscreen | 5076
gputest: Triangle - 1920 x 1200 - Fullscreen | 78801
gputest: Pixmark Piano - 1920 x 1200 - Fullscreen | 586
gputest: Pixmark Volplosion - 1920 x 1200 - Fullscreen | 1590
lczero: OpenCL | 1452
rodinia: OpenCL Myocyte | 78.043
ncnn: Vulkan GPU - squeezenet | 12.02
ncnn: Vulkan GPU - mobilenet | 10.76
ncnn: Vulkan GPU - mobilenet-v2 | 4.63
ncnn: Vulkan GPU - mobilenet-v3 | 5.68
ncnn: Vulkan GPU - shufflenet-v2 | 3.14
ncnn: Vulkan GPU - mnasnet | 4.63
ncnn: Vulkan GPU - efficientnet-b0 | 11.15
ncnn: Vulkan GPU - blazeface | 1.31
ncnn: Vulkan GPU - googlenet | 10.61
ncnn: Vulkan GPU - vgg16 | 41.15
ncnn: Vulkan GPU - resnet18 | 7.45
ncnn: Vulkan GPU - alexnet | 10.10
ncnn: Vulkan GPU - resnet50 | 15.59
ncnn: Vulkan GPU - yolov4-tiny | 17.09
plaidml: No - Inference - VGG16 - OpenCL | 19.81
plaidml: No - Inference - VGG19 - OpenCL | 15.26
plaidml: No - Inference - IMDB LSTM - OpenCL | 30.66
plaidml: No - Inference - Mobilenet - OpenCL | 358.15
plaidml: No - Inference - ResNet 50 - OpenCL | 95.35
plaidml: No - Inference - DenseNet 201 - OpenCL | 19.30
plaidml: No - Inference - Inception V3 - OpenCL | 43.97
plaidml: No - Inference - NASNet Large - OpenCL | 4.34
mandelgpu: GPU | 45778725.4
smallpt-gpu: GPU - Complex | 1603345637
smallpt-gpu: GPU - Cornell | 1603345763
smallpt-gpu: GPU - Caustic3 | 1603345898
clpeak: Kernel Latency | 37.01
clpeak: Single-Precision Float | 1775.83
clpeak: Global Memory Bandwidth | 56.68
clpeak: Transfer Bandwidth enqueueWriteBuffer | 30.49
oneapi-level-zero: Peak Integer Compute | 399.613
oneapi-level-zero: Device-To-Host Bandwidth (GB/s) | 25.763381
oneapi-level-zero: Device-To-Host Bandwidth (usec) | 10419.28
oneapi-level-zero: Host-To-Device Bandwidth (GB/s) | 25.764301
oneapi-level-zero: Host-To-Device Bandwidth (usec) | 10418.91
oneapi-level-zero: Peak Kernel Launch Latency | 24.6421
oneapi-level-zero: Peak Half-Precision Compute | 3036.91
oneapi-level-zero: Peak Single-Precision Compute | 1219.51
oneapi-level-zero: Host-To-Device-To-Host Image Copy | 21.2649
oneapi-level-zero: Peak Float16 Global Memory Bandwidth | 60.2838
oneapi-level-zero: Peak System Memory Copy to Shared Memory | 12.6809

RealSR-NCNN

RealSR-NCNN is an NCNN neural network implementation of the RealSR project, accelerated using the Vulkan API. RealSR stands for Real-World Super-Resolution via Kernel Estimation and Noise Injection. NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. This test profile times how long it takes to increase the resolution of a sample image by a scale of 4x with Vulkan. Learn more via the OpenBenchmarking.org test page.

RealSR-NCNN 20200818 - Scale: 4x - TAA: No (Seconds, Fewer Is Better)
Xe Graphics: 73.64 (SE +/- 0.67, N = 3)

RealSR-NCNN 20200818 - Scale: 4x - TAA: Yes (Seconds, Fewer Is Better)
Xe Graphics: 581.96 (SE +/- 0.56, N = 3)
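
Every result in this file is reported as a mean with "SE +/-" over N runs. The following is a minimal Python sketch for reading those figures, assuming SE denotes the standard error of the mean (sample standard deviation divided by the square root of N); the result file itself does not define the term. The function names are hypothetical helpers, not part of the Phoronix Test Suite.

```python
import math


def implied_stddev(se, n):
    """Sample standard deviation implied by a standard error over n runs.

    Assumes SE = stddev / sqrt(n), which this result file does not confirm.
    """
    return se * math.sqrt(n)


def relative_se_percent(mean, se):
    """Standard error expressed as a percentage of the reported mean."""
    return 100.0 * se / mean


# Means, SE values and run counts copied from the two RealSR-NCNN results above.
realsr_results = [("TAA: No", 73.64, 0.67, 3), ("TAA: Yes", 581.96, 0.56, 3)]

for label, mean, se, n in realsr_results:
    print(f"{label}: implied stddev ~ {implied_stddev(se, n):.2f} s, "
          f"relative SE ~ {relative_se_percent(mean, se):.2f}%")
```

A small relative SE suggests the runs were tightly clustered, which helps when deciding whether a difference between two systems is larger than run-to-run noise.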

Waifu2x-NCNN Vulkan

Waifu2x-NCNN is an NCNN neural network implementation of the Waifu2x converter project and accelerated using the Vulkan API. NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. This test profile times how long it takes to increase the resolution of a sample image with Vulkan. Learn more via the OpenBenchmarking.org test page.

Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: No (Seconds, Fewer Is Better)
Xe Graphics: 4.712 (SE +/- 0.004, N = 3)

Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better)
Xe Graphics: 28.60 (SE +/- 0.03, N = 3)

FinanceBench

FinanceBench is a collection of financial program benchmarks with support for benchmarking on the GPU. Learn more via the OpenBenchmarking.org test page.

FinanceBench 2016-06-06 - Benchmark: Monte-Carlo OpenCL (ms, Fewer Is Better)
Xe Graphics: 451.47 (SE +/- 1.24, N = 3)

FinanceBench 2016-06-06 - Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better)
Xe Graphics: 4.326 (SE +/- 0.001, N = 3)

1. (CXX) g++ options: -O3 -lOpenCL

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: Triad (GB/s, More Is Better)
Xe Graphics: 15.20 (SE +/- 0.06, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better)
Xe Graphics: 155.62 (SE +/- 0.52, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: MD5 Hash (GHash/s, More Is Better)
Xe Graphics: 1.6782 (SE +/- 0.0000, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: Max SP Flops (GFLOPS, More Is Better)
Xe Graphics: 7820.18 (SE +/- 0.05, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: Bus Speed Download (GB/s, More Is Better)
Xe Graphics: 53.17 (SE +/- 0.37, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better)
Xe Graphics: 55.87 (SE +/- 0.36, N = 3)

SHOC Scalable HeterOgeneous Computing 2015-11-10 - Target: OpenCL - Benchmark: Texture Read Bandwidth (GB/s, More Is Better)
Xe Graphics: 189.60 (SE +/- 0.02, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile uses ViennaCL OpenCL support and runs the included computational benchmark. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.4.2 - OpenCL LU Factorization (GFLOPS, More Is Better)
Xe Graphics: 53.25 (SE +/- 0.06, N = 3)

1. (CXX) g++ options: -rdynamic -lOpenCL

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

cl-mem 2017-01-13 - Benchmark: Copy (GB/s, More Is Better)
Xe Graphics: 47.7 (SE +/- 0.03, N = 3)

cl-mem 2017-01-13 - Benchmark: Read (GB/s, More Is Better)
Xe Graphics: 56.4 (SE +/- 0.64, N = 3)

cl-mem 2017-01-13 - Benchmark: Write (GB/s, More Is Better)
Xe Graphics: 47.3 (SE +/- 0.09, N = 3)

1. (CC) gcc options: -O2 -flto -lOpenCL

ET: Legacy

ETLegacy is an open-source engine evolution of Wolfenstein: Enemy Territory, a World War II era first person shooter that was released for free by Splash Damage using the id Tech 3 engine. Learn more via the OpenBenchmarking.org test page.

ET: Legacy 2.75 - Renderer: Renderer2 - Resolution: 1920 x 1200 (Frames Per Second, More Is Better)
Xe Graphics: 106.7 (SE +/- 0.52, N = 3)

Tesseract

Tesseract is a fork of Cube 2 Sauerbraten with numerous graphics and game-play improvements. Tesseract has been in development since 2012 while its first release happened in May of 2014. Learn more via the OpenBenchmarking.org test page.

Tesseract 2014-05-12 - Resolution: 1920 x 1200 (Frames Per Second, More Is Better)
Xe Graphics: 132.92 (SE +/- 0.68, N = 3)

Unigine Heaven

This test calculates the average frame-rate within the Heaven demo for the Unigine engine. This engine is extremely demanding on the system's graphics card. Learn more via the OpenBenchmarking.org test page.

Unigine Heaven 4.0 - Resolution: 1920 x 1200 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, More Is Better)
Xe Graphics: 28.01 (SE +/- 0.04, N = 3)

Unigine Superposition

This test calculates the average frame-rate within the Superposition demo for the Unigine engine, released in 2017. This engine is extremely demanding on the system's graphics card. Learn more via the OpenBenchmarking.org test page.

Unigine Superposition 1.0 - Resolution: 1920 x 1200 - Mode: Fullscreen - Quality: Low - Renderer: OpenGL (Frames Per Second, More Is Better)
Xe Graphics: 29.2 (SE +/- 0.03, N = 3; MAX: 37.7)

Unigine Superposition 1.0 - Resolution: 1920 x 1200 - Mode: Fullscreen - Quality: Medium - Renderer: OpenGL (Frames Per Second, More Is Better)
Xe Graphics: 16.2 (SE +/- 0.00, N = 3; MAX: 20)

Unigine Valley

This test calculates the average frame-rate within the Valley demo for the Unigine engine, released in February 2013. This engine is extremely demanding on the system's graphics card. Unigine Valley relies upon an OpenGL 3 core profile context. Learn more via the OpenBenchmarking.org test page.

Unigine Valley 1.0 - Resolution: 1920 x 1200 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, More Is Better)
Xe Graphics: 30.53 (SE +/- 0.06, N = 3)

Xonotic

This is a benchmark of Xonotic, which is a fork of the DarkPlaces-based Nexuiz game. Development began in March of 2010 on the Xonotic game. Learn more via the OpenBenchmarking.org test page.

Xonotic 0.8.2 - Resolution: 1920 x 1200 - Effects Quality: Low (Frames Per Second, More Is Better)
Xe Graphics: 279.32 (SE +/- 0.76, N = 3; MIN: 172 / MAX: 641)

Xonotic 0.8.2 - Resolution: 1920 x 1200 - Effects Quality: High (Frames Per Second, More Is Better)
Xe Graphics: 188.20 (SE +/- 0.24, N = 3; MIN: 90 / MAX: 305)

Xonotic 0.8.2 - Resolution: 1920 x 1200 - Effects Quality: Ultra (Frames Per Second, More Is Better)
Xe Graphics: 160.86 (SE +/- 0.06, N = 3; MIN: 70 / MAX: 252)

Xonotic 0.8.2 - Resolution: 1920 x 1200 - Effects Quality: Ultimate (Frames Per Second, More Is Better)
Xe Graphics: 121.55 (SE +/- 0.37, N = 3; MIN: 35 / MAX: 211)
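
The four Xonotic results above differ only in the effects quality preset, so they can be compared directly. The sketch below is a small illustration that expresses each average frame rate as a percentage of the Low preset; the dictionary and function names are hypothetical, and the numbers are the averages reported above.

```python
# Average frame rates (FPS) copied from the Xonotic 0.8.2 results above.
XONOTIC_FPS = {"Low": 279.32, "High": 188.20, "Ultra": 160.86, "Ultimate": 121.55}


def percent_of_baseline(results, baseline):
    """Return each result as a percentage of the chosen baseline entry."""
    base = results[baseline]
    return {name: 100.0 * fps / base for name, fps in results.items()}


for preset, pct in percent_of_baseline(XONOTIC_FPS, "Low").items():
    print(f"{preset:>8}: {pct:5.1f}% of the Low preset")
```

The same helper works for any group of results that share a unit and a "More Is Better" direction.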

GpuTest

GpuTest is a cross-platform OpenGL benchmark developed at Geeks3D.com that offers tech demos such as FurMark, TessMark, and other workloads to stress various areas of GPUs and drivers. Learn more via the OpenBenchmarking.org test page.

GpuTest 0.7.0 - Test: GiMark - Resolution: 1920 x 1200 - Mode: Fullscreen (Points, More Is Better)
Xe Graphics: 2014

GpuTest 0.7.0 - Test: Furmark - Resolution: 1920 x 1200 - Mode: Fullscreen (Points, More Is Better)
Xe Graphics: 1794 (SE +/- 24.66, N = 3)

GpuTest 0.7.0 - Test: TessMark - Resolution: 1920 x 1200 - Mode: Fullscreen (Points, More Is Better)
Xe Graphics: 5076 (SE +/- 14.40, N = 3)

GpuTest 0.7.0 - Test: Triangle - Resolution: 1920 x 1200 - Mode: Fullscreen (Points, More Is Better)
Xe Graphics: 78801 (SE +/- 70.44, N = 3)

GpuTest 0.7.0 - Test: Pixmark Piano - Resolution: 1920 x 1200 - Mode: Fullscreen (Points, More Is Better)
Xe Graphics: 586 (SE +/- 3.38, N = 3)

GpuTest 0.7.0 - Test: Pixmark Volplosion - Resolution: 1920 x 1200 - Mode: Fullscreen (Points, More Is Better)
Xe Graphics: 1590 (SE +/- 10.17, N = 3)

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.26 - Backend: OpenCL (Nodes Per Second, More Is Better)
Xe Graphics: 1452 (SE +/- 9.06, N = 3)

1. (CXX) g++ options: -flto -pthread

Rodinia

Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.

Rodinia 3.1 - Test: OpenCL Myocyte (Seconds, Fewer Is Better)
Xe Graphics: 78.04 (SE +/- 8.81, N = 13)

1. (CXX) g++ options: -O2 -lOpenCL

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

NCNN 20200916 - Target: Vulkan GPU - Model: squeezenet (ms, Fewer Is Better)
Xe Graphics: 12.02 (SE +/- 0.02, N = 3; MIN: 11.8 / MAX: 13.4)

NCNN 20200916 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
Xe Graphics: 10.76 (SE +/- 0.01, N = 3; MIN: 10.57 / MAX: 13.06)

NCNN 20200916 - Target: Vulkan GPU - Model: mobilenet-v2 (ms, Fewer Is Better)
Xe Graphics: 4.63 (SE +/- 0.33, N = 3; MIN: 4.11 / MAX: 6.02)

NCNN 20200916 - Target: Vulkan GPU - Model: mobilenet-v3 (ms, Fewer Is Better)
Xe Graphics: 5.68 (SE +/- 0.11, N = 3; MIN: 5.34 / MAX: 7.8)

NCNN 20200916 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
Xe Graphics: 3.14 (SE +/- 0.01, N = 3; MIN: 2.97 / MAX: 3.41)

NCNN 20200916 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
Xe Graphics: 4.63 (SE +/- 0.03, N = 3; MIN: 4.43 / MAX: 5.28)

NCNN 20200916 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
Xe Graphics: 11.15 (SE +/- 0.02, N = 3; MIN: 11.04 / MAX: 11.89)

NCNN 20200916 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
Xe Graphics: 1.31 (SE +/- 0.04, N = 3; MIN: 1.01 / MAX: 5.34)

NCNN 20200916 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
Xe Graphics: 10.61 (SE +/- 0.02, N = 3; MIN: 10.51 / MAX: 10.95)

NCNN 20200916 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
Xe Graphics: 41.15 (SE +/- 0.03, N = 3; MIN: 40.8 / MAX: 41.54)

NCNN 20200916 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
Xe Graphics: 7.45 (SE +/- 0.12, N = 3; MIN: 7.13 / MAX: 8.04)

NCNN 20200916 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
Xe Graphics: 10.10 (SE +/- 0.02, N = 3; MIN: 9.87 / MAX: 10.26)

NCNN 20200916 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
Xe Graphics: 15.59 (SE +/- 0.01, N = 3; MIN: 15.42 / MAX: 15.86)

NCNN 20200916 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
Xe Graphics: 17.09 (SE +/- 0.03, N = 3; MIN: 16.88 / MAX: 19.59)

1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

PlaidML

This test profile uses the PlaidML deep learning framework, developed by Intel, to run various benchmarks. Learn more via the OpenBenchmarking.org test page.

PlaidML - FP16: No - Mode: Inference - Network: VGG16 - Device: OpenCL (FPS, More Is Better)
Xe Graphics: 19.81 (SE +/- 0.12, N = 3)

PlaidML - FP16: No - Mode: Inference - Network: VGG19 - Device: OpenCL (FPS, More Is Better)
Xe Graphics: 15.26 (SE +/- 0.02, N = 3)

PlaidML - FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL (FPS, More Is Better)
Xe Graphics: 30.66 (SE +/- 0.32, N = 3)

PlaidML - FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL (FPS, More Is Better)
Xe Graphics: 358.15 (SE +/- 0.19, N = 3)

PlaidML - FP16: No - Mode: Inference - Network: ResNet 50 - Device: OpenCL (FPS, More Is Better)
Xe Graphics: 95.35 (SE +/- 1.25, N = 4)

PlaidML - FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL (FPS, More Is Better)
Xe Graphics: 19.30 (SE +/- 0.05, N = 3)

PlaidML - FP16: No - Mode: Inference - Network: Inception V3 - Device: OpenCL (FPS, More Is Better)
Xe Graphics: 43.97 (SE +/- 0.11, N = 3)

PlaidML - FP16: No - Mode: Inference - Network: NASNet Large - Device: OpenCL (FPS, More Is Better)
Xe Graphics: 4.34 (SE +/- 0.00, N = 3)

MandelGPU

MandelGPU is an OpenCL benchmark and this test runs with the OpenCL rendering float4 kernel with a maximum of 4096 iterations. Learn more via the OpenBenchmarking.org test page.

MandelGPU 1.3pts1 - OpenCL Device: GPU (Samples/sec, More Is Better)
Xe Graphics: 45778725.4 (SE +/- 17622.89, N = 3)

1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SmallPT GPU

SmallPT GPU is an OpenCL benchmark that runs with various PTS changes compared to upstream. Multiple rendering scenes are available. Learn more via the OpenBenchmarking.org test page.

SmallPT GPU 1.6pts1 - OpenCL Device: GPU - Scene: Complex (Samples/sec, More Is Better)
Xe Graphics: 1603345637 (SE +/- 22.23, N = 3)

SmallPT GPU 1.6pts1 - OpenCL Device: GPU - Scene: Cornell (Samples/sec, More Is Better)
Xe Graphics: 1603345763 (SE +/- 23.67, N = 3)

SmallPT GPU 1.6pts1 - OpenCL Device: GPU - Scene: Caustic3 (Samples/sec, More Is Better)
Xe Graphics: 1603345898 (SE +/- 25.12, N = 3)

1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak - OpenCL Test: Kernel Latency (us, Fewer Is Better)
Xe Graphics: 37.01 (SE +/- 0.43, N = 15)

clpeak - OpenCL Test: Single-Precision Float (GFLOPS, More Is Better)
Xe Graphics: 1775.83 (SE +/- 0.19, N = 3)

clpeak - OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
Xe Graphics: 56.68 (SE +/- 0.05, N = 3)

clpeak - OpenCL Test: Transfer Bandwidth enqueueWriteBuffer (GBPS, More Is Better)
Xe Graphics: 30.49 (SE +/- 0.45, N = 4)

1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
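
To put the clpeak Single-Precision Float figure in context, it can be compared with a rough theoretical FP32 peak. The sketch below rests on assumptions that this result file does not state: 96 execution units for the i7-1165G7's Xe graphics, 8 FP32 lanes per execution unit, a fused multiply-add counted as two operations, and the GPU sustaining the 1300MHz clock listed in the system table. Treat the output as an estimate only.

```python
def peak_fp32_gflops(execution_units, lanes_per_eu, clock_ghz):
    """Rough theoretical FP32 peak, counting one FMA per lane per cycle as 2 FLOPs."""
    return execution_units * lanes_per_eu * 2 * clock_ghz


ASSUMED_EXECUTION_UNITS = 96  # assumption: not stated in this result file
ASSUMED_LANES_PER_EU = 8      # assumption: not stated in this result file
CLOCK_GHZ = 1.3               # from the system table: Intel Xe 3GB (1300MHz)
MEASURED_GFLOPS = 1775.83     # clpeak Single-Precision Float result above

peak = peak_fp32_gflops(ASSUMED_EXECUTION_UNITS, ASSUMED_LANES_PER_EU, CLOCK_GHZ)
efficiency = 100.0 * MEASURED_GFLOPS / peak
print(f"Estimated peak: {peak:.1f} GFLOPS, measured: {MEASURED_GFLOPS} GFLOPS "
      f"({efficiency:.1f}% of the estimate)")
```

If any of the assumed hardware parameters differ on a given system, only the three constants need to change.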

oneAPI Level Zero Tests

This test profile benchmarks the collection of Intel oneAPI Level Zero Tests. Learn more via the OpenBenchmarking.org test page.

oneAPI Level Zero Tests - Test: Peak Integer Compute (GFLOPS, More Is Better)
Xe Graphics: 399.61 (SE +/- 0.72, N = 3)

oneAPI Level Zero Tests - Test: Device-To-Host Bandwidth (GB/s, More Is Better)
Xe Graphics: 25.76 (SE +/- 0.02, N = 3)

oneAPI Level Zero Tests - Test: Device-To-Host Bandwidth (usec, Fewer Is Better)
Xe Graphics: 10419.28 (SE +/- 7.98, N = 3)

oneAPI Level Zero Tests - Test: Host-To-Device Bandwidth (GB/s, More Is Better)
Xe Graphics: 25.76 (SE +/- 0.02, N = 3)

oneAPI Level Zero Tests - Test: Host-To-Device Bandwidth (usec, Fewer Is Better)
Xe Graphics: 10418.91 (SE +/- 8.92, N = 3)

oneAPI Level Zero Tests - Test: Peak Kernel Launch Latency (us, Fewer Is Better)
Xe Graphics: 24.64 (SE +/- 1.04, N = 15)

oneAPI Level Zero Tests - Test: Peak Half-Precision Compute (GFLOPS, More Is Better)
Xe Graphics: 3036.91 (SE +/- 1.44, N = 3)

oneAPI Level Zero Tests - Test: Peak Single-Precision Compute (GB/s, More Is Better)
Xe Graphics: 1219.51 (SE +/- 0.00, N = 3)

oneAPI Level Zero Tests - Test: Host-To-Device-To-Host Image Copy (GB/s, More Is Better)
Xe Graphics: 21.26 (SE +/- 0.08, N = 3)

oneAPI Level Zero Tests - Test: Peak Float16 Global Memory Bandwidth (GB/s, More Is Better)
Xe Graphics: 60.28 (SE +/- 0.01, N = 3)

oneAPI Level Zero Tests - Test: Peak System Memory Copy to Shared Memory (GB/s, More Is Better)
Xe Graphics: 12.68 (SE +/- 0.05, N = 3)

1. (CXX) g++ options: -ldl -pthread
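
The Device-To-Host and Host-To-Device tests above each report both a bandwidth (GB/s) and a time (usec). Multiplying the two gives the amount of data moved per transfer. The sketch below assumes that both figures describe the same transfer and that GB means 10^9 bytes; neither is confirmed by the result file, and the function name is a hypothetical helper.

```python
def implied_transfer_mib(bandwidth_gb_s, time_usec):
    """Transfer size in MiB implied by a bandwidth and a transfer time.

    Assumes GB = 10^9 bytes and that both figures describe the same transfer.
    """
    size_bytes = bandwidth_gb_s * 1e9 * time_usec * 1e-6
    return size_bytes / 2**20


# Values copied from the oneAPI Level Zero bandwidth results above.
transfers = {
    "Device-To-Host": (25.76, 10419.28),
    "Host-To-Device": (25.76, 10418.91),
}

for name, (gb_s, usec) in transfers.items():
    print(f"{name}: ~{implied_transfer_mib(gb_s, usec):.1f} MiB per transfer")
```

Under these assumptions both directions work out to roughly 256 MiB per transfer, which is consistent with the two metrics being different views of the same measurement.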
