GPU Compute Lunar Lake

Intel Core Ultra 7 256V testing with an ASUS Zenbook S 14 UX5406SA_UX5406SA UX5406SA v1.0 (UX5406SA.300 BIOS) and ASUS Intel LNL 7GB on Ubuntu 24.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2411256-NE-GPUCOMPUT81
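For scripted runs, the comparison command above could be wrapped as in the following minimal Python sketch. The helper names `build_compare_command` and `run_comparison` are hypothetical, introduced only for illustration; the only thing taken from this page is the command itself and the result identifier. The sketch assumes `phoronix-test-suite` is installed and on `PATH`.

```python
import shutil
import subprocess

# Result file identifier quoted in the command above.
RESULT_ID = "2411256-NE-GPUCOMPUT81"


def build_compare_command(result_id: str) -> list:
    """Return the argv list for comparing a local system against a result file."""
    return ["phoronix-test-suite", "benchmark", result_id]


def run_comparison(result_id: str = RESULT_ID) -> int:
    """Run the comparison if the Phoronix Test Suite is installed; return its exit code."""
    if shutil.which("phoronix-test-suite") is None:
        print("phoronix-test-suite not found on PATH; nothing was run")
        return 1
    # The suite may prompt interactively, so output is not captured here.
    return subprocess.run(build_compare_command(result_id), check=False).returncode


if __name__ == "__main__":
    # Only print the command; call run_comparison() to actually start the benchmarks.
    print(" ".join(build_compare_command(RESULT_ID)))
```

Building the argument list separately from running it keeps the command easy to inspect before committing to a run that, for this result file, took over eight hours.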
Result Identifier: Core Ultra 7 256V Xe2 LNL
Date Run: November 25
Test Duration: 8 Hours, 24 Minutes


GPU Compute Lunar Lake Benchmarks:
  Processor: Intel Core Ultra 7 256V @ 4.70GHz (8 Cores)
  Motherboard: ASUS Zenbook S 14 UX5406SA_UX5406SA UX5406SA v1.0 (UX5406SA.300 BIOS)
  Chipset: Intel Device a87f
  Memory: 8 x 2GB LPDDR5-8533MT/s Samsung
  Disk: 1024GB Western Digital WD PC SN560 SDDPNQE-1T00-1102
  Graphics: ASUS Intel LNL 7GB
  Audio: Intel Lunar Lake-M HD Audio
  Network: Intel Device a840
  OS: Ubuntu 24.10
  Kernel: 6.12.0-rc6-phx-drm-next (x86_64)
  Desktop: GNOME Shell 47.0
  Display Server: X Server + Wayland
  OpenGL: 4.6 Mesa 25.0~git2411250600.45c523~oibaf~o (git-45c5231 2024-11-25 oracular-oibaf-pp
  OpenCL: OpenCL 3.0
  Compiler: GCC 14.2.0
  File-System: ext4
  Screen Resolution: 2880x1800

System Logs:
  - Transparent Huge Pages: madvise
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
  - Scaling Governor: intel_pstate powersave (EPP: balance_performance) - Platform Profile: balanced - CPU Microcode: 0x114 - Thermald 2.5.8 - ACPI Profile: balanced
  - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

GPU Compute Lunar Lake, Core Ultra 7 256V Xe2 LNL:
  opencl-benchmark: FP64 Compute: 0.243
  opencl-benchmark: FP32 Compute: 3.838
  opencl-benchmark: FP16 Compute: 7.123
  opencl-benchmark: INT64 Compute: 0.169
  opencl-benchmark: INT32 Compute: 1.237
  opencl-benchmark: INT16 Compute: 8.284
  opencl-benchmark: INT8 Compute: 6.209
  opencl-benchmark: Memory Bandwidth Coalesced Read: 33.11
  opencl-benchmark: Memory Bandwidth Coalesced Write: 28.66
  fluidx3d: FP32-FP32: 637
  fluidx3d: FP32-FP16S: 1312
  fluidx3d: FP32-FP16C: 716
  blender: BMW27 - Intel oneAPI: 60.93
  blender: Classroom - Intel oneAPI: 184.24
  blender: Fishy Cat - Intel oneAPI: 81.46
  blender: Pabellon Barcelona - Intel oneAPI: 373.55
  blender: Junkshop - Intel oneAPI: 119.60
  darktable: Boat - OpenCL: 8.376
  darktable: Masskrug - OpenCL: 4.301
  darktable: Server Room - OpenCL: 3.079
  darktable: Server Rack - OpenCL: 0.458
  gpuowl: 57885161: 197.09
  gpuowl: 77936867: 147.68
  gpuowl: 332220523: 28.90
  cl-mem: Read: 213.5
  cl-mem: Write: 153.7
  cl-mem: Copy: 180.6
  clpeak: Global Memory Bandwidth: 189.34
  clpeak: Single-Precision Compute: 3953.60
  clpeak: Double-Precision Compute: 248.02
  clpeak: Integer Compute: 1281.06
  clpeak: Integer 24-bit Compute: 1281.64
  clpeak: Transfer Bandwidth enqueueWriteBuffer: 32.30
  clpeak: Transfer Bandwidth enqueueReadBuffer: 33.29
  clpeak: Kernel Latency: 204.94
  hashcat: MD5: 6652140000
  hashcat: SHA1: 832926667
  hashcat: SHA-512: 96264920
  hashcat: TrueCrypt RIPEMD160 + XTS: 104800
  shoc: OpenCL - Bus Speed Download: 52.7541
  shoc: OpenCL - Bus Speed Readback: 51.2356
  shoc: OpenCL - Max SP Flops: 2356991
  shoc: OpenCL - Texture Read Bandwidth: 309.100
  shoc: OpenCL - FFT SP: 303.121
  shoc: OpenCL - GEMM SGEMM_N: 792.971
  shoc: OpenCL - MD5 Hash: 4.5828
  shoc: OpenCL - Reduction: 103.9406
  shoc: OpenCL - Triad: 18.4211
  shoc: OpenCL - S3D: 84.9433
  viennacl: OpenCL BLAS - sCOPY: 86.2
  viennacl: OpenCL BLAS - sAXPY: 145
  viennacl: OpenCL BLAS - sDOT: 119
  viennacl: OpenCL BLAS - dCOPY: 97.9
  viennacl: OpenCL BLAS - dAXPY: 99.2
  viennacl: OpenCL BLAS - dDOT: 94.3
  viennacl: OpenCL BLAS - dGEMV-N: 57.7
  viennacl: OpenCL BLAS - dGEMV-T: 97.5
  viennacl: OpenCL BLAS - dGEMM-NN: 132
  viennacl: OpenCL BLAS - dGEMM-NT: 134
  viennacl: OpenCL BLAS - dGEMM-TN: 129
  viennacl: OpenCL BLAS - dGEMM-TT: 135
  vkresample: 2x - Single: 26.938
  vkpeak: fp32-scalar: 1990.10
  vkpeak: fp32-vec4: 2928.44
  vkpeak: fp16-scalar: 6197.79
  vkpeak: fp16-vec4: 6532.41
  vkpeak: fp64-scalar: 218.75
  vkpeak: fp64-vec4: 213.40
  vkpeak: int32-scalar: 827.00
  vkpeak: int32-vec4: 849.00
  vkpeak: int16-scalar: 3187.82
  vkpeak: int16-vec4: 3641.00

IndigoBench

This is a test of Indigo Renderer's IndigoBench benchmark. Learn more via the OpenBenchmarking.org test page.

Acceleration: OpenCL GPU - Scene: Supercar

Core Ultra 7 256V Xe2 LNL: The test run did not produce a result. E: sh: 1: exec: ./indigobench: not found

Acceleration: OpenCL GPU - Scene: Bedroom

Core Ultra 7 256V Xe2 LNL: The test run did not produce a result. E: sh: 1: exec: ./indigobench: not found

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.6, Core Ultra 7 256V Xe2 LNL:
  Operation: FP64 Compute: 0.243 TFLOPs/s, More Is Better (SE +/- 0.000, N = 3)
  Operation: FP32 Compute: 3.838 TFLOPs/s, More Is Better (SE +/- 0.023, N = 3)
  Operation: FP16 Compute: 7.123 TFLOPs/s, More Is Better (SE +/- 0.022, N = 3)
  Operation: INT64 Compute: 0.169 TIOPs/s, More Is Better (SE +/- 0.001, N = 3)
  Operation: INT32 Compute: 1.237 TIOPs/s, More Is Better (SE +/- 0.001, N = 3)
  Operation: INT16 Compute: 8.284 TIOPs/s, More Is Better (SE +/- 0.045, N = 3)
  Operation: INT8 Compute: 6.209 TIOPs/s, More Is Better (SE +/- 0.033, N = 3)
  Operation: Memory Bandwidth Coalesced Read: 33.11 GB/s, More Is Better (SE +/- 0.13, N = 3)
  Operation: Memory Bandwidth Coalesced Write: 28.66 GB/s, More Is Better (SE +/- 0.11, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
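
Each result on this page is reported as a mean with "SE +/-" and a run count N. As a rough reading aid, the sketch below converts a few of the ProjectPhysX figures above into an approximate standard deviation and a relative standard error. It assumes "SE" is the standard error of the mean (SD divided by the square root of N); that interpretation is an assumption, not something this page states, and the `spread` helper is a hypothetical name.

```python
import math

# (mean, standard error, N) copied from the ProjectPhysX results above.
RESULTS = {
    "FP32 Compute (TFLOPs/s)": (3.838, 0.023, 3),
    "INT16 Compute (TIOPs/s)": (8.284, 0.045, 3),
    "Memory Bandwidth Coalesced Read (GB/s)": (33.11, 0.13, 3),
}


def spread(mean, se, n):
    """Return (approximate standard deviation, relative standard error in percent).

    Assumes SE is the standard error of the mean, i.e. SD / sqrt(N).
    """
    sd = se * math.sqrt(n)
    rel_se_pct = 100.0 * se / mean
    return sd, rel_se_pct


for name, (mean, se, n) in RESULTS.items():
    sd, rel = spread(mean, se, n)
    print(f"{name}: SD ~ {sd:.3f}, relative SE ~ {rel:.2f}%")
```

A relative standard error well under one percent, as for these three entries, suggests the runs were closely repeatable; results elsewhere on the page with large N and a large SE deserve more caution.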

FluidX3D

FluidX3D 3.0, Core Ultra 7 256V Xe2 LNL:
  Test: FP32-FP32: 637 MLUPs/s, More Is Better (SE +/- 12.75, N = 8)
  Test: FP32-FP16S: 1312 MLUPs/s, More Is Better (SE +/- 14.29, N = 12)
  Test: FP32-FP16C: 716 MLUPs/s, More Is Better (SE +/- 1.76, N = 3)

Blender

Blender 4.3, Core Ultra 7 256V Xe2 LNL:
  Blend File: BMW27 - Compute: Intel oneAPI: 60.93 Seconds, Fewer Is Better (SE +/- 0.65, N = 4)
  Blend File: Classroom - Compute: Intel oneAPI: 184.24 Seconds, Fewer Is Better (SE +/- 2.27, N = 4)
  Blend File: Fishy Cat - Compute: Intel oneAPI: 81.46 Seconds, Fewer Is Better (SE +/- 0.58, N = 14)
  Blend File: Pabellon Barcelona - Compute: Intel oneAPI: 373.55 Seconds, Fewer Is Better (SE +/- 12.74, N = 7)
  Blend File: Junkshop - Compute: Intel oneAPI: 119.60 Seconds, Fewer Is Better (SE +/- 1.36, N = 3)

Darktable

Darktable is an open-source photography / workflow application. This test uses any system-installed Darktable program or, on Windows, automatically downloads the pre-built binary from the project. Learn more via the OpenBenchmarking.org test page.

Darktable 4.8.1, Core Ultra 7 256V Xe2 LNL:
  Test: Boat - Acceleration: OpenCL: 8.376 Seconds, Fewer Is Better (SE +/- 0.021, N = 4)
  Test: Masskrug - Acceleration: OpenCL: 4.301 Seconds, Fewer Is Better (SE +/- 0.021, N = 7)
  Test: Server Room - Acceleration: OpenCL: 3.079 Seconds, Fewer Is Better (SE +/- 0.018, N = 8)
  Test: Server Rack - Acceleration: OpenCL: 0.458 Seconds, Fewer Is Better (SE +/- 0.008, N = 15)

GpuOwl

GpuOwl is a Mersenne primality tester leveraging OpenCL for cross-vendor GPU acceleration. Learn more via the OpenBenchmarking.org test page.

GpuOwl 7.5, Core Ultra 7 256V Xe2 LNL:
  Exponent: 57885161: 197.09 Iterations / Second, More Is Better (SE +/- 1.35, N = 3)
  Exponent: 77936867: 147.68 Iterations / Second, More Is Better (SE +/- 0.63, N = 3)
  Exponent: 332220523: 28.90 Iterations / Second, More Is Better (SE +/- 0.26, N = 3)
1. (CXX) g++ options: -O3 -lgmp -lOpenCL
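
To put the iterations-per-second figures in perspective, the sketch below estimates how long a full primality test would take at the measured rates. It assumes a complete test needs roughly as many iterations as the value of the exponent (as in the textbook Lucas-Lehmer test, which performs p - 2 squarings for exponent p) and that the benchmark rate would be sustained for the whole run. Both are assumptions for illustration, and `estimated_days` is a hypothetical helper name.

```python
# Iterations per second copied from the GpuOwl results above, keyed by exponent.
RATES = {57885161: 197.09, 77936867: 147.68, 332220523: 28.90}

SECONDS_PER_DAY = 86400


def estimated_days(exponent, iters_per_second):
    """Rough wall-clock days for a full test, assuming about `exponent` iterations."""
    return exponent / iters_per_second / SECONDS_PER_DAY


for exponent, rate in RATES.items():
    print(f"exponent {exponent}: about {estimated_days(exponent, rate):.1f} days")
```

Because the per-iteration rate falls as the exponent grows while the iteration count rises, the estimated duration grows faster than linearly with the exponent.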

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

cl-mem 2017-01-13, Core Ultra 7 256V Xe2 LNL:
  Benchmark: Read: 213.5 GB/s, More Is Better (SE +/- 2.15, N = 15)
  Benchmark: Write: 153.7 GB/s, More Is Better (SE +/- 0.25, N = 4)
  Benchmark: Copy: 180.6 GB/s, More Is Better (SE +/- 0.55, N = 5)
1. (CC) gcc options: -O2 -flto -lOpenCL

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak 1.1.2, Core Ultra 7 256V Xe2 LNL:
  OpenCL Test: Global Memory Bandwidth: 189.34 GBPS, More Is Better (SE +/- 0.30, N = 6)
  OpenCL Test: Single-Precision Compute: 3953.60 GFLOPS, More Is Better (SE +/- 0.77, N = 7)
  OpenCL Test: Double-Precision Compute: 248.02 GFLOPS, More Is Better (SE +/- 0.15, N = 3)
  OpenCL Test: Integer Compute: 1281.06 GIOPS, More Is Better (SE +/- 0.70, N = 5)
  OpenCL Test: Integer 24-bit Compute: 1281.64 GIOPS, More Is Better (SE +/- 0.69, N = 5)
  OpenCL Test: Transfer Bandwidth enqueueWriteBuffer: 32.30 GBPS, More Is Better (SE +/- 0.34, N = 4)
  OpenCL Test: Transfer Bandwidth enqueueReadBuffer: 33.29 GBPS, More Is Better (SE +/- 0.03, N = 4)
  OpenCL Test: Kernel Latency: 204.94 us, Fewer Is Better (SE +/- 1.52, N = 15)
1. (CXX) g++ options: -O3

Hashcat

Hashcat is an open-source, advanced password recovery tool supporting GPU acceleration with OpenCL, NVIDIA CUDA, and Radeon ROCm. Learn more via the OpenBenchmarking.org test page.

Hashcat 6.2.4, Core Ultra 7 256V Xe2 LNL:
  Benchmark: MD5: 6652140000 H/s, More Is Better (SE +/- 10240634.75, N = 5)
  Benchmark: SHA1: 832926667 H/s, More Is Better (SE +/- 19378351.86, N = 15)
  Benchmark: SHA-512: 96264920 H/s, More Is Better (SE +/- 32482.97, N = 5)
  Benchmark: 7-Zip: The test run did not produce a result.
  Benchmark: TrueCrypt RIPEMD160 + XTS: 104800 H/s, More Is Better (SE +/- 235.23, N = 6)
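
The Hashcat figures are given in raw hashes per second, which span several orders of magnitude. A small formatter like the hypothetical `format_hashrate` below can make them easier to compare; it uses decimal (SI) prefixes, which is a presentation choice made here rather than anything specified by this page.

```python
# Raw H/s values copied from the Hashcat results above.
HASHRATES = {
    "MD5": 6652140000,
    "SHA1": 832926667,
    "SHA-512": 96264920,
    "TrueCrypt RIPEMD160 + XTS": 104800,
}


def format_hashrate(h_per_s):
    """Format a hash rate using decimal prefixes (k = 1e3, M = 1e6, G = 1e9)."""
    for unit, scale in (("GH/s", 1e9), ("MH/s", 1e6), ("kH/s", 1e3)):
        if h_per_s >= scale:
            return f"{h_per_s / scale:.2f} {unit}"
    return f"{h_per_s:.0f} H/s"


for name, rate in HASHRATES.items():
    print(f"{name}: {format_hashrate(rate)}")
```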

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17, Core Ultra 7 256V Xe2 LNL:
  Target: OpenCL - Benchmark: Bus Speed Download: 52.75 GB/s, More Is Better (SE +/- 0.02, N = 14)
  Target: OpenCL - Benchmark: Bus Speed Readback: 51.24 GB/s, More Is Better (SE +/- 0.04, N = 13)
  Target: OpenCL - Benchmark: Max SP Flops: 2356991 GFLOPS, More Is Better (SE +/- 234243.55, N = 9)
  Target: OpenCL - Benchmark: Texture Read Bandwidth: 309.10 GB/s, More Is Better (SE +/- 1.19, N = 3)
  Target: OpenCL - Benchmark: FFT SP: 303.12 GFLOPS, More Is Better (SE +/- 0.19, N = 11)
  Target: OpenCL - Benchmark: GEMM SGEMM_N: 792.97 GFLOPS, More Is Better (SE +/- 2.49, N = 7)
  Target: OpenCL - Benchmark: MD5 Hash: 4.5828 GHash/s, More Is Better (SE +/- 0.0001, N = 8)
  Target: OpenCL - Benchmark: Reduction: 103.94 GB/s, More Is Better (SE +/- 1.71, N = 15)
  Target: OpenCL - Benchmark: Triad: 18.42 GB/s, More Is Better (SE +/- 0.77, N = 15)
  Target: OpenCL - Benchmark: S3D: 84.94 GFLOPS, More Is Better (SE +/- 1.74, N = 15)
1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.7.1, Core Ultra 7 256V Xe2 LNL:
  Test: OpenCL BLAS - sCOPY: 86.2 GB/s, More Is Better (SE +/- 1.07, N = 15)
  Test: OpenCL BLAS - sAXPY: 145 GB/s, More Is Better (SE +/- 1.60, N = 15)
  Test: OpenCL BLAS - sDOT: 119 GB/s, More Is Better (SE +/- 0.50, N = 15)
  Test: OpenCL BLAS - dCOPY: 97.9 GB/s, More Is Better (SE +/- 0.36, N = 15)
  Test: OpenCL BLAS - dAXPY: 99.2 GB/s, More Is Better (SE +/- 0.24, N = 15)
  Test: OpenCL BLAS - dDOT: 94.3 GB/s, More Is Better (SE +/- 0.55, N = 15)
  Test: OpenCL BLAS - dGEMV-N: 57.7 GB/s, More Is Better (SE +/- 0.12, N = 15)
  Test: OpenCL BLAS - dGEMV-T: 97.5 GB/s, More Is Better (SE +/- 0.11, N = 15)
  Test: OpenCL BLAS - dGEMM-NN: 132 GFLOPs/s, More Is Better (SE +/- 0.13, N = 15)
  Test: OpenCL BLAS - dGEMM-NT: 134 GFLOPs/s, More Is Better (SE +/- 0.19, N = 15)
  Test: OpenCL BLAS - dGEMM-TN: 129 GFLOPs/s, More Is Better (SE +/- 0.13, N = 15)
  Test: OpenCL BLAS - dGEMM-TT: 135 GFLOPs/s, More Is Better (SE +/- 0.11, N = 15)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

VkResample

VkResample is a Vulkan-based image upscaling library based on VkFFT. The sample input file is upscaling a 4K image to 8K using Vulkan-based GPU acceleration. Learn more via the OpenBenchmarking.org test page.

VkResample 1.0, Core Ultra 7 256V Xe2 LNL:
  Upscale: 2x - Precision: Single: 26.94 ms, Fewer Is Better (SE +/- 0.21, N = 10)
1. (CXX) g++ options: -O3

vkpeak

vkpeak 20240505, Core Ultra 7 256V Xe2 LNL:
  fp32-scalar: 1990.10 GFLOPS, More Is Better (SE +/- 0.04, N = 3)
  fp32-vec4: 2928.44 GFLOPS, More Is Better (SE +/- 0.13, N = 3)
  fp16-scalar: 6197.79 GFLOPS, More Is Better (SE +/- 0.17, N = 3)
  fp16-vec4: 6532.41 GFLOPS, More Is Better (SE +/- 0.16, N = 3)
  fp64-scalar: 218.75 GFLOPS, More Is Better (SE +/- 0.01, N = 3)
  fp64-vec4: 213.40 GFLOPS, More Is Better (SE +/- 0.01, N = 3)
  int32-scalar: 827.00 GIOPS, More Is Better (SE +/- 0.04, N = 3)
  int32-vec4: 849.00 GIOPS, More Is Better (SE +/- 0.02, N = 3)
  int16-scalar: 3187.82 GIOPS, More Is Better (SE +/- 0.08, N = 3)
  int16-vec4: 3641.00 GIOPS, More Is Better (SE +/- 0.57, N = 3)

72 Results Shown

ProjectPhysX OpenCL-Benchmark:
  FP64 Compute
  FP32 Compute
  FP16 Compute
  INT64 Compute
  INT32 Compute
  INT16 Compute
  INT8 Compute
  Memory Bandwidth Coalesced Read
  Memory Bandwidth Coalesced Write
FluidX3D:
  FP32-FP32
  FP32-FP16S
  FP32-FP16C
Blender:
  BMW27 - Intel oneAPI
  Classroom - Intel oneAPI
  Fishy Cat - Intel oneAPI
  Pabellon Barcelona - Intel oneAPI
  Junkshop - Intel oneAPI
Darktable:
  Boat - OpenCL
  Masskrug - OpenCL
  Server Room - OpenCL
  Server Rack - OpenCL
GpuOwl:
  57885161
  77936867
  332220523
cl-mem:
  Read
  Write
  Copy
clpeak:
  Global Memory Bandwidth
  Single-Precision Compute
  Double-Precision Compute
  Integer Compute
  Integer 24-bit Compute
  Transfer Bandwidth enqueueWriteBuffer
  Transfer Bandwidth enqueueReadBuffer
  Kernel Latency
Hashcat:
  MD5
  SHA1
  SHA-512
  TrueCrypt RIPEMD160 + XTS
SHOC Scalable HeterOgeneous Computing:
  OpenCL - Bus Speed Download
  OpenCL - Bus Speed Readback
  OpenCL - Max SP Flops
  OpenCL - Texture Read Bandwidth
  OpenCL - FFT SP
  OpenCL - GEMM SGEMM_N
  OpenCL - MD5 Hash
  OpenCL - Reduction
  OpenCL - Triad
  OpenCL - S3D
ViennaCL:
  OpenCL BLAS - sCOPY
  OpenCL BLAS - sAXPY
  OpenCL BLAS - sDOT
  OpenCL BLAS - dCOPY
  OpenCL BLAS - dAXPY
  OpenCL BLAS - dDOT
  OpenCL BLAS - dGEMV-N
  OpenCL BLAS - dGEMV-T
  OpenCL BLAS - dGEMM-NN
  OpenCL BLAS - dGEMM-NT
  OpenCL BLAS - dGEMM-TN
  OpenCL BLAS - dGEMM-TT
VkResample
vkpeak:
  fp32-scalar
  fp32-vec4
  fp16-scalar
  fp16-vec4
  fp64-scalar
  fp64-vec4
  int32-scalar
  int32-vec4
  int16-scalar
  int16-vec4