Intel GPU Compute Lunar Lake vs. Meteor Lake

Intel Lunar Lake OpenCL and GPU compute benchmarks compared to Meteor Lake on Ubuntu Linux, run by Michael Larabel for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2411266-NE-2411256NE67

Run Management

Result Identifier            Date Run       Test Duration
Core Ultra 7 155H Xe MTL     November 26    11 Hours, 36 Minutes
Core Ultra 7 256V Xe2 LNL    November 25    5 Hours, 38 Minutes
Average                                     8 Hours, 37 Minutes


Intel GPU Compute Lunar Lake vs. Meteor Lake

Core Ultra 7 155H Xe MTL
  Processor: Intel Core Ultra 7 155H @ 4.80GHz (16 Cores / 22 Threads)
  Motherboard: MTL Swift SFG14-72T Coral_MTH (V1.01 BIOS)
  Chipset: Intel Device 7e7f
  Memory: 8 x 2GB LPDDR5-6400MT/s Micron MT62F1G32D2DS-026
  Disk: 1024GB Micron_2550_MTFDKBA1T0TGE
  Graphics: Intel Arc MTL 8GB
  Audio: Intel Meteor Lake-P HD Audio
  Network: Intel Meteor Lake PCH CNVi WiFi
  Screen Resolution: 3840x1200

Core Ultra 7 256V Xe2 LNL
  Processor: Intel Core Ultra 7 256V @ 4.70GHz (8 Cores)
  Motherboard: ASUS Zenbook S 14 UX5406SA_UX5406SA UX5406SA v1.0 (UX5406SA.300 BIOS)
  Chipset: Intel Device a87f
  Memory: 8 x 2GB LPDDR5-8533MT/s Samsung
  Disk: 1024GB Western Digital WD PC SN560 SDDPNQE-1T00-1102
  Graphics: ASUS Intel LNL 7GB
  Audio: Intel Lunar Lake-M HD Audio
  Network: Intel Device a840
  Screen Resolution: 2880x1800

Both systems
  OS: Ubuntu 24.10
  Kernel: 6.12.0-rc6-phx-drm-next (x86_64)
  Desktop: GNOME Shell 47.0
  Display Server: X Server + Wayland
  OpenGL: 4.6 Mesa 25.0~git2411250600.45c523~oibaf~o (git-45c5231 2024-11-25 oracular-oibaf-pp
  OpenCL: OpenCL 3.0
  Compiler: GCC 14.2.0
  File-System: ext4

Kernel Details
  - Transparent Huge Pages: madvise

Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  - Core Ultra 7 155H Xe MTL: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x1e - Thermald 2.5.8
  - Core Ultra 7 256V Xe2 LNL: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - Platform Profile: balanced - CPU Microcode: 0x114 - Thermald 2.5.8 - ACPI Profile: balanced

Security Details
  - Core Ultra 7 155H Xe MTL: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: BHI_DIS_S + srbds: Not affected + tsx_async_abort: Not affected
  - Core Ultra 7 256V Xe2 LNL: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Core Ultra 7 155H Xe MTL vs. Core Ultra 7 256V Xe2 LNL Comparison (Phoronix Test Suite): bar chart of the relative difference between the two systems for each test, with bars ranging from 4% to 850.7% over the baseline.
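The percentages in the comparison can be related to the raw results with a little arithmetic. The sketch below is an illustration rather than the Phoronix Test Suite's own code: `percent_gain` is a hypothetical helper, and it assumes each bar is the relative advantage of one result over the other, with the ratio inverted for "Fewer Is Better" tests. The inputs are the FluidX3D FP32-FP32 and VkResample results reported further down the page.

```python
def percent_gain(baseline, other, fewer_is_better=False):
    """Relative advantage of `other` over `baseline`, in percent.

    Hypothetical helper: for "Fewer Is Better" tests the ratio is inverted
    so that a positive number always means `other` performed better.
    """
    ratio = baseline / other if fewer_is_better else other / baseline
    return (ratio - 1.0) * 100.0


# FluidX3D FP32-FP32 (MLUPs/s, More Is Better): MTL 67, LNL 637
fluidx3d = percent_gain(67, 637)

# VkResample 2x - Single (ms, Fewer Is Better): MTL 93.05, LNL 26.94
vkresample = percent_gain(93.05, 26.94, fewer_is_better=True)

print(f"FluidX3D FP32-FP32: {fluidx3d:.1f}%")   # 850.7%
print(f"VkResample 2x:      {vkresample:.1f}%")  # 245.4%
```

Both values match bars in the comparison (850.7% and 245.4%), which is consistent with the assumption above but does not confirm how the chart is generated.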

Intel GPU Compute Lunar Lake vs. Meteor Lake

Test                                                Core Ultra 7 155H Xe MTL  Core Ultra 7 256V Xe2 LNL
blender: BMW27 - Intel oneAPI                       70.79                     60.93
blender: Classroom - Intel oneAPI                   187.35                    184.24
blender: Fishy Cat - Intel oneAPI                   75.42                     81.46
blender: Pabellon Barcelona - Intel oneAPI          237.04                    373.55
cl-mem: Write                                       -                         153.7
cl-mem: Copy                                        -                         180.6
clpeak: Global Memory Bandwidth                     79.74                     189.34
clpeak: Double-Precision Compute                    143.02                    248.02
clpeak: Transfer Bandwidth enqueueWriteBuffer       23.10                     32.30
clpeak: Transfer Bandwidth enqueueReadBuffer        19.55                     33.29
darktable: Masskrug - OpenCL                        4.591                     4.301
darktable: Server Room - OpenCL                     3.919                     3.079
darktable: Server Rack - OpenCL                     0.562                     0.458
fluidx3d: FP32-FP32                                 67                        637
fluidx3d: FP32-FP16S                                188                       1312
fluidx3d: FP32-FP16C                                200                       716
gpuowl: 57885161                                    178.24                    197.09
gpuowl: 77936867                                    134.12                    147.68
opencl-benchmark: FP64 Compute                      0.14                      0.243
opencl-benchmark: FP32 Compute                      4.328                     3.838
opencl-benchmark: FP16 Compute                      8.137                     7.123
opencl-benchmark: INT16 Compute                     7.556                     8.284
opencl-benchmark: INT8 Compute                      2.832                     6.209
opencl-benchmark: Memory Bandwidth Coalesced Read   17.36                     33.11
opencl-benchmark: Memory Bandwidth Coalesced Write  10.95                     28.66
shoc: OpenCL - Bus Speed Download                   36.9647                   52.7541
shoc: OpenCL - Bus Speed Readback                   32.8625                   51.2356
shoc: OpenCL - FFT SP                               225.233                   303.121
shoc: OpenCL - GEMM SGEMM_N                         640.929                   792.971
shoc: OpenCL - Reduction                            65.6320                   103.9406
shoc: OpenCL - S3D                                  44.2463                   84.9433
viennacl: OpenCL BLAS - sCOPY                       68.4                      86.2
viennacl: OpenCL BLAS - sAXPY                       69.2                      145
viennacl: OpenCL BLAS - sDOT                        68.0                      119
viennacl: OpenCL BLAS - dCOPY                       68.9                      97.9
viennacl: OpenCL BLAS - dAXPY                       71.2                      99.2
viennacl: OpenCL BLAS - dDOT                        72.6                      94.3
viennacl: OpenCL BLAS - dGEMV-T                     69.4                      97.5
viennacl: OpenCL BLAS - dGEMM-NN                    123                       132
viennacl: OpenCL BLAS - dGEMM-NT                    124                       134
viennacl: OpenCL BLAS - dGEMM-TN                    124                       129
viennacl: OpenCL BLAS - dGEMM-TT                    123                       135
vkpeak: fp16-scalar                                 5768.61                   6197.79
vkpeak: fp16-vec4                                   6078.52                   6532.41
vkpeak: fp64-scalar                                 139.38                    218.75
vkpeak: fp64-vec4                                   140.65                    213.40
vkpeak: int16-scalar                                2157.64                   3187.82
vkresample: 2x - Single                             93.047                    26.938
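A table that mixes "More Is Better" and "Fewer Is Better" tests is often summarized with a geometric mean of per-test ratios. The sketch below shows the idea on four rows only, so its output is an illustration and not the overall result for this file. The helper name `lnl_speedup` and the choice of rows are assumptions made for the example; the direction of each test is taken from the per-test graphs on this page.

```python
import statistics

# (test, MTL value, LNL value, higher_is_better) for four sample rows of the table.
rows = [
    ("blender: BMW27 - Intel oneAPI", 70.79, 60.93, False),    # Seconds
    ("clpeak: Global Memory Bandwidth", 79.74, 189.34, True),  # GBPS
    ("gpuowl: 57885161", 178.24, 197.09, True),                # Iterations / Second
    ("opencl-benchmark: FP32 Compute", 4.328, 3.838, True),    # TFLOPs/s
]


def lnl_speedup(mtl, lnl, higher_is_better):
    """Ratio > 1 means Lunar Lake was faster on that test (hypothetical helper)."""
    return lnl / mtl if higher_is_better else mtl / lnl


ratios = [lnl_speedup(mtl, lnl, hib) for _, mtl, lnl, hib in rows]

# The geometric mean keeps one very large ratio from dominating the summary
# the way it would in an arithmetic mean.
geo_mean = statistics.geometric_mean(ratios)
print(f"LNL vs. MTL geometric mean over {len(rows)} sample rows: {geo_mean:.3f}x")
```

Extending `rows` to every test that completed on both systems would give a figure comparable in spirit to an overall geometric mean, provided the failed cl-mem runs on Meteor Lake are left out.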

Blender

Blender 4.3, Blend File: BMW27 - Compute: Intel oneAPI (Seconds, Fewer Is Better)
Core Ultra 7 155H Xe MTL: 70.79 (SE +/- 0.58, N = 15)
Core Ultra 7 256V Xe2 LNL: 60.93 (SE +/- 0.65, N = 4)

Blender 4.3, Blend File: Classroom - Compute: Intel oneAPI (Seconds, Fewer Is Better)
Core Ultra 7 155H Xe MTL: 187.35 (SE +/- 0.29, N = 3)
Core Ultra 7 256V Xe2 LNL: 184.24 (SE +/- 2.27, N = 4)

Blender 4.3, Blend File: Fishy Cat - Compute: Intel oneAPI (Seconds, Fewer Is Better)
Core Ultra 7 155H Xe MTL: 75.42 (SE +/- 0.68, N = 14)
Core Ultra 7 256V Xe2 LNL: 81.46 (SE +/- 0.58, N = 14)

Blender 4.3, Blend File: Pabellon Barcelona - Compute: Intel oneAPI (Seconds, Fewer Is Better)
Core Ultra 7 155H Xe MTL: 237.04 (SE +/- 0.69, N = 3)
Core Ultra 7 256V Xe2 LNL: 373.55 (SE +/- 12.74, N = 7)
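Each result on this page carries an "SE +/- x, N = y" annotation: the standard error of the mean over N runs. A minimal sketch of that calculation follows, assuming the usual definition (sample standard deviation divided by the square root of N). The exact estimator used by the Phoronix Test Suite is not stated on this page, and the run times below are hypothetical, not taken from the result file.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical render times in seconds for three runs (not from the result file).
runs = [70.1, 71.4, 70.9]

mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(runs)})")
```

A large SE relative to the mean, as in the Pabellon Barcelona result for Lunar Lake (SE +/- 12.74 on 373.55), suggests noticeable run-to-run variation and is worth keeping in mind when comparing the two systems.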

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

cl-mem 2017-01-13, Benchmark: Write (GB/s, More Is Better)
Core Ultra 7 155H Xe MTL: The test quit with a non-zero exit status.
Core Ultra 7 256V Xe2 LNL: 153.7 (SE +/- 0.25, N = 4)
1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem 2017-01-13, Benchmark: Copy (GB/s, More Is Better)
Core Ultra 7 155H Xe MTL: The test quit with a non-zero exit status.
Core Ultra 7 256V Xe2 LNL: 180.6 (SE +/- 0.55, N = 5)
1. (CC) gcc options: -O2 -flto -lOpenCL

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak 1.1.2, OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better)
Core Ultra 7 155H Xe MTL: 79.74 (SE +/- 0.00, N = 3)
Core Ultra 7 256V Xe2 LNL: 189.34 (SE +/- 0.30, N = 6)
1. (CXX) g++ options: -O3

clpeak 1.1.2, OpenCL Test: Double-Precision Compute (GFLOPS, More Is Better)
Core Ultra 7 155H Xe MTL: 143.02 (SE +/- 0.01, N = 3)
Core Ultra 7 256V Xe2 LNL: 248.02 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O3

clpeak 1.1.2, OpenCL Test: Transfer Bandwidth enqueueWriteBuffer (GBPS, More Is Better)
Core Ultra 7 155H Xe MTL: 23.10 (SE +/- 0.10, N = 3)
Core Ultra 7 256V Xe2 LNL: 32.30 (SE +/- 0.34, N = 4)
1. (CXX) g++ options: -O3

clpeak 1.1.2, OpenCL Test: Transfer Bandwidth enqueueReadBuffer (GBPS, More Is Better)
Core Ultra 7 155H Xe MTL: 19.55 (SE +/- 0.22, N = 3)
Core Ultra 7 256V Xe2 LNL: 33.29 (SE +/- 0.03, N = 4)
1. (CXX) g++ options: -O3

Darktable

Darktable sensor monitoring for Core Ultra 7 256V Xe2 LNL, in page order:
CPU Power Consumption Monitor (Watts, Fewer Is Better): Min 0.7, Avg 12.7, Max 29.1
CPU Temperature Monitor (Celsius, Fewer Is Better): Min 50.0, Avg 62.3, Max 82.0
CPU Power Consumption Monitor (Watts, Fewer Is Better): Min 0.9, Avg 8.0, Max 20.4
CPU Temperature Monitor (Celsius, Fewer Is Better): Min 48.0, Avg 51.7, Max 66.0
CPU Power Consumption Monitor (Watts, Fewer Is Better): Min 0.2, Avg 3.5, Max 12.8
CPU Temperature Monitor (Celsius, Fewer Is Better): Min 42.0, Avg 44.6, Max 54.0

Darktable 4.8.1, Test: Masskrug - Acceleration: OpenCL (Seconds, Fewer Is Better)
Core Ultra 7 155H Xe MTL: 4.591 (SE +/- 0.015, N = 7)
Core Ultra 7 256V Xe2 LNL: 4.301 (SE +/- 0.021, N = 7)

Darktable 4.8.1, Test: Server Room - Acceleration: OpenCL (Seconds, Fewer Is Better)
Core Ultra 7 155H Xe MTL: 3.919 (SE +/- 0.011, N = 7)
Core Ultra 7 256V Xe2 LNL: 3.079 (SE +/- 0.018, N = 8)

Darktable 4.8.1, Test: Server Rack - Acceleration: OpenCL (Seconds, Fewer Is Better)
Core Ultra 7 155H Xe MTL: 0.562 (SE +/- 0.002, N = 12)
Core Ultra 7 256V Xe2 LNL: 0.458 (SE +/- 0.008, N = 15)

FluidX3D

FluidX3D 3.0, Test: FP32-FP32 (MLUPs/s, More Is Better)
Core Ultra 7 155H Xe MTL: 67 (SE +/- 0.73, N = 9)
Core Ultra 7 256V Xe2 LNL: 637 (SE +/- 12.75, N = 8)

FluidX3D 3.0, Test: FP32-FP16S (MLUPs/s, More Is Better)
Core Ultra 7 155H Xe MTL: 188 (SE +/- 0.00, N = 3)
Core Ultra 7 256V Xe2 LNL: 1312 (SE +/- 14.29, N = 12)

FluidX3D 3.0, Test: FP32-FP16C (MLUPs/s, More Is Better)
Core Ultra 7 155H Xe MTL: 200 (SE +/- 0.00, N = 3)
Core Ultra 7 256V Xe2 LNL: 716 (SE +/- 1.76, N = 3)

GpuOwl

GpuOwl is a Mersenne primality tester leveraging OpenCL for cross-vendor GPU acceleration. Learn more via the OpenBenchmarking.org test page.
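For readers unfamiliar with Mersenne primality testing, the classic Lucas-Lehmer test gives a feel for the kind of work involved: repeated squaring modulo 2^p - 1. The sketch below is a plain-Python illustration of that textbook test only. GpuOwl's own implementation is not described on this page and may use a different test and a very different arithmetic strategy; the function name `is_mersenne_prime` is made up for the example.

```python
def is_mersenne_prime(p):
    """Lucas-Lehmer test: return True if 2**p - 1 is prime.

    Valid when p itself is prime. Illustrative only: exponents like the
    ones benchmarked here need fast large-number multiplication on a GPU.
    """
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime; the loop below needs odd p
    m = (1 << p) - 1  # the Mersenne number 2**p - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m  # one iteration: square, subtract 2, reduce mod m
    return s == 0


small_prime_exponents = [3, 5, 7, 11, 13, 17, 19, 23, 31]
print([p for p in small_prime_exponents if is_mersenne_prime(p)])
# [3, 5, 7, 13, 17, 19, 31]
```

The loop runs p - 2 iterations, which hints at why throughput on exponents of this size is reported as a rate (the results below use Iterations / Second).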

GpuOwl 7.5, Exponent: 57885161 (Iterations / Second, More Is Better)
Core Ultra 7 155H Xe MTL: 178.24 (SE +/- 0.04, N = 3)
Core Ultra 7 256V Xe2 LNL: 197.09 (SE +/- 1.35, N = 3)
1. (CXX) g++ options: -O3 -lgmp -lOpenCL

GpuOwl 7.5, Exponent: 77936867 (Iterations / Second, More Is Better)
Core Ultra 7 155H Xe MTL: 134.12 (SE +/- 0.02, N = 3)
Core Ultra 7 256V Xe2 LNL: 147.68 (SE +/- 0.63, N = 3)
1. (CXX) g++ options: -O3 -lgmp -lOpenCL

ProjectPhysX OpenCL-Benchmark

ProjectPhysX OpenCL-Benchmark 1.6, Operation: FP64 Compute (TFLOPs/s, More Is Better)
Core Ultra 7 155H Xe MTL: 0.140 (SE +/- 0.000, N = 3)
Core Ultra 7 256V Xe2 LNL: 0.243 (SE +/- 0.000, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark 1.6, Operation: FP32 Compute (TFLOPs/s, More Is Better)
Core Ultra 7 155H Xe MTL: 4.328 (SE +/- 0.020, N = 3)
Core Ultra 7 256V Xe2 LNL: 3.838 (SE +/- 0.023, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark 1.6, Operation: FP16 Compute (TFLOPs/s, More Is Better)
Core Ultra 7 155H Xe MTL: 8.137 (SE +/- 0.024, N = 3)
Core Ultra 7 256V Xe2 LNL: 7.123 (SE +/- 0.022, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark 1.6, Operation: INT16 Compute (TIOPs/s, More Is Better)
Core Ultra 7 155H Xe MTL: 7.556 (SE +/- 0.638, N = 3)
Core Ultra 7 256V Xe2 LNL: 8.284 (SE +/- 0.045, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark 1.6, Operation: INT8 Compute (TIOPs/s, More Is Better)
Core Ultra 7 155H Xe MTL: 2.832 (SE +/- 0.009, N = 3)
Core Ultra 7 256V Xe2 LNL: 6.209 (SE +/- 0.033, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark 1.6, Operation: Memory Bandwidth Coalesced Read (GB/s, More Is Better)
Core Ultra 7 155H Xe MTL: 17.36 (SE +/- 0.02, N = 3)
Core Ultra 7 256V Xe2 LNL: 33.11 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark 1.6, Operation: Memory Bandwidth Coalesced Write (GB/s, More Is Better)
Core Ultra 7 155H Xe MTL: 10.95 (SE +/- 0.03, N = 3)
Core Ultra 7 256V Xe2 LNL: 28.66 (SE +/- 0.11, N = 3)
1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Bus Speed Download (GB/s, More Is Better)
Core Ultra 7 155H Xe MTL: 36.96 (SE +/- 0.08, N = 15)
Core Ultra 7 256V Xe2 LNL: 52.75 (SE +/- 0.02, N = 14)
1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt (also listed on the graph: -lSHOCCommonMPI -lmpi_cxx -lmpi)

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Bus Speed Readback (GB/s, More Is Better)
Core Ultra 7 155H Xe MTL: 32.86 (SE +/- 0.19, N = 15)
Core Ultra 7 256V Xe2 LNL: 51.24 (SE +/- 0.04, N = 13)
1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt (also listed on the graph: -lSHOCCommonMPI -lmpi_cxx -lmpi)

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: FFT SP (GFLOPS, More Is Better)
Core Ultra 7 155H Xe MTL: 225.23 (SE +/- 0.04, N = 11)
Core Ultra 7 256V Xe2 LNL: 303.12 (SE +/- 0.19, N = 11)
1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt (also listed on the graph: -lSHOCCommonMPI -lmpi_cxx -lmpi)

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: GEMM SGEMM_N (GFLOPS, More Is Better)
Core Ultra 7 155H Xe MTL: 640.93 (SE +/- 0.33, N = 6)
Core Ultra 7 256V Xe2 LNL: 792.97 (SE +/- 2.49, N = 7)
1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt (also listed on the graph: -lSHOCCommonMPI -lmpi_cxx -lmpi)

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: Reduction (GB/s, More Is Better)
Core Ultra 7 155H Xe MTL: 65.63 (SE +/- 0.02, N = 7)
Core Ultra 7 256V Xe2 LNL: 103.94 (SE +/- 1.71, N = 15)
1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt (also listed on the graph: -lSHOCCommonMPI -lmpi_cxx -lmpi)

SHOC Scalable HeterOgeneous Computing 2020-04-17, Target: OpenCL - Benchmark: S3D (GFLOPS, More Is Better)
Core Ultra 7 155H Xe MTL: 44.25 (SE +/- 0.15, N = 10)
Core Ultra 7 256V Xe2 LNL: 84.94 (SE +/- 1.74, N = 15)
1. (CXX) g++ options: -O2 -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt (also listed on the graph: -lSHOCCommonMPI -lmpi_cxx -lmpi)

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.7.1, Test: OpenCL BLAS - sCOPY (GB/s, More Is Better)
Core Ultra 7 155H Xe MTL: 68.4 (SE +/- 0.13, N = 3)
Core Ultra 7 256V Xe2 LNL: 86.2 (SE +/- 1.07, N = 15)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1, Test: OpenCL BLAS - sAXPY (GB/s, More Is Better)
Core Ultra 7 155H Xe MTL: 69.2 (SE +/- 0.24, N = 3)
Core Ultra 7 256V Xe2 LNL: 145.0 (SE +/- 1.60, N = 15)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1, Test: OpenCL BLAS - sDOT (GB/s, More Is Better)
Core Ultra 7 155H Xe MTL: 68.0 (SE +/- 0.12, N = 3)
Core Ultra 7 256V Xe2 LNL: 119.0 (SE +/- 0.50, N = 15)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1, Test: OpenCL BLAS - dCOPY (GB/s, More Is Better)
Core Ultra 7 155H Xe MTL: 68.9 (SE +/- 0.42, N = 3)
Core Ultra 7 256V Xe2 LNL: 97.9 (SE +/- 0.36, N = 15)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1, Test: OpenCL BLAS - dAXPY (GB/s, More Is Better)
Core Ultra 7 155H Xe MTL: 71.2 (SE +/- 0.78, N = 3)
Core Ultra 7 256V Xe2 LNL: 99.2 (SE +/- 0.24, N = 15)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1, Test: OpenCL BLAS - dDOT (GB/s, More Is Better)
Core Ultra 7 155H Xe MTL: 72.6 (SE +/- 0.13, N = 3)
Core Ultra 7 256V Xe2 LNL: 94.3 (SE +/- 0.55, N = 15)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMV-T (GB/s, More Is Better)
Core Ultra 7 155H Xe MTL: 69.4 (SE +/- 0.27, N = 3)
Core Ultra 7 256V Xe2 LNL: 97.5 (SE +/- 0.11, N = 15)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMM-NN (GFLOPs/s, More Is Better)
Core Ultra 7 155H Xe MTL: 123 (SE +/- 0.00, N = 3)
Core Ultra 7 256V Xe2 LNL: 132 (SE +/- 0.13, N = 15)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMM-NT (GFLOPs/s, More Is Better)
Core Ultra 7 155H Xe MTL: 124 (SE +/- 0.33, N = 3)
Core Ultra 7 256V Xe2 LNL: 134 (SE +/- 0.19, N = 15)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMM-TN (GFLOPs/s, More Is Better)
Core Ultra 7 155H Xe MTL: 124 (SE +/- 0.00, N = 3)
Core Ultra 7 256V Xe2 LNL: 129 (SE +/- 0.13, N = 15)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1, Test: OpenCL BLAS - dGEMM-TT (GFLOPs/s, More Is Better)
Core Ultra 7 155H Xe MTL: 123 (SE +/- 0.00, N = 3)
Core Ultra 7 256V Xe2 LNL: 135 (SE +/- 0.11, N = 15)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

vkpeak

vkpeak 20240505, fp16-scalar (GFLOPS, More Is Better)
Core Ultra 7 155H Xe MTL: 5768.61 (SE +/- 1.11, N = 3)
Core Ultra 7 256V Xe2 LNL: 6197.79 (SE +/- 0.17, N = 3)

vkpeak 20240505, fp16-vec4 (GFLOPS, More Is Better)
Core Ultra 7 155H Xe MTL: 6078.52 (SE +/- 0.82, N = 3)
Core Ultra 7 256V Xe2 LNL: 6532.41 (SE +/- 0.16, N = 3)

vkpeak 20240505, fp64-scalar (GFLOPS, More Is Better)
Core Ultra 7 155H Xe MTL: 139.38 (SE +/- 0.02, N = 3)
Core Ultra 7 256V Xe2 LNL: 218.75 (SE +/- 0.01, N = 3)

vkpeak 20240505, fp64-vec4 (GFLOPS, More Is Better)
Core Ultra 7 155H Xe MTL: 140.65 (SE +/- 0.01, N = 3)
Core Ultra 7 256V Xe2 LNL: 213.40 (SE +/- 0.01, N = 3)

vkpeak 20240505, int16-scalar (GIOPS, More Is Better)
Core Ultra 7 155H Xe MTL: 2157.64 (SE +/- 0.70, N = 3)
Core Ultra 7 256V Xe2 LNL: 3187.82 (SE +/- 0.08, N = 3)

VkResample

VkResample is a Vulkan-based image upscaling library based on VkFFT. The sample input file is upscaling a 4K image to 8K using Vulkan-based GPU acceleration. Learn more via the OpenBenchmarking.org test page.

VkResample 1.0, Upscale: 2x - Precision: Single (ms, Fewer Is Better)
Core Ultra 7 155H Xe MTL: 93.05 (SE +/- 0.02, N = 3)
Core Ultra 7 256V Xe2 LNL: 26.94 (SE +/- 0.21, N = 10)
1. (CXX) g++ options: -O3