VIENNACL CL BLAS

AMD Ryzen 9 7945HX testing with an Alienware 0DWD2H (1.13.1 BIOS) and NVIDIA GeForce RTX 4090 Laptop GPU 16GB on cachyos rolling via the Phoronix Test Suite.

Radeon HD 8790M:

  Processor: Intel Core i5-4300M @ 3.30GHz (2 Cores / 4 Threads), Motherboard: Dell 0VWNW8 (A26 BIOS), Chipset: Intel Xeon E3-1200 v3/4th, Memory: 8GB, Disk: 128GB SAMSUNG SSD PM85, Graphics: AMD Radeon HD 8790M (1250MHz), Audio: Intel Xeon E3-1200 v3/4th, Network: Intel I217-LM + Intel Centrino Ultimate-N 6300

  OS: cachyos rolling, Kernel: 6.6.2-4-cachyos-lto (x86_64), Desktop: GNOME Shell 45.1, Display Server: X Server 1.21.1.9, OpenGL: 4.6 Mesa 24.0.0-devel (git-023fa0aa5d) (LLVM 16.0.6 DRM 3.54), OpenCL: OpenCL 1.1 Mesa 24.0.0-devel (git-023fa0aa5d), Compiler: GCC 13.2.1 20231110 + Clang 16.0.6 + LLVM 16.0.6 + CUDA 12.3, File-System: xfs, Screen Resolution: 1920x1080

IntelR HD Graphics 4600 HSW GT2 0x416:

  Processor: Intel Core i5-4300M @ 3.30GHz (2 Cores / 4 Threads), Motherboard: Dell 0VWNW8 (A26 BIOS), Chipset: Intel Xeon E3-1200 v3/4th, Memory: 8GB, Disk: 128GB SAMSUNG SSD PM85, Graphics: Intel HD 4600 HSW GT2 2GB (1250MHz), Audio: Intel Xeon E3-1200 v3/4th, Network: Intel I217-LM + Intel Centrino Ultimate-N 6300

  OS: cachyos rolling, Kernel: 6.7.6-1-cachyos-rt-bore-lto (x86_64), Desktop: KDE Plasma 5.27.10, Display Server: X Server 1.21.1.11, OpenGL: 4.6 Mesa 24.0.1-arch1.1, OpenCL: OpenCL 2.0 beignet 1.4 (git-f72309a5), Compiler: GCC 13.2.1 20230801 + Clang 16.0.6 + LLVM 16.0.6, File-System: xfs, Screen Resolution: 1920x1080

Intel HD Graphics 4600 HSW GT2 CLANG70:

  Processor: Intel Core i5-4300M @ 3.30GHz (2 Cores / 4 Threads), Motherboard: Dell 0VWNW8 (A26 BIOS), Chipset: Intel Xeon E3-1200 v3/4th, Memory: 8GB, Disk: 128GB SAMSUNG SSD PM85, Graphics: Intel HD 4600 HSW GT2 2GB (1250MHz), Audio: Intel Xeon E3-1200 v3/4th, Network: Intel I217-LM + Intel Centrino Ultimate-N 6300

  OS: cachyos rolling, Kernel: 6.7.9-1-cachyos-rt-bore-lto (x86_64), Desktop: KDE Plasma 6.0.1, Display Server: X Server 1.21.1.11, OpenGL: 4.6 Mesa 24.0.2-arch1.2, OpenCL: OpenCL 2.0 beignet 1.4 (git-f72309a5), Compiler: Clang 17.0.6 + GCC 13.2.1 20230801 + LLVM 17.0.6, File-System: xfs, Screen Resolution: 1920x1080

nVidia RTX 4090 mobile:

  Processor: AMD Ryzen 9 7945HX @ 5.46GHz (16 Cores / 32 Threads), Motherboard: Alienware 0DWD2H (1.13.1 BIOS), Chipset: AMD Device 14d8, Memory: 62GB, Disk: PC SN810 NVMe WDC 2048GB + 4001GB CT4000P3SSD8, Graphics: NVIDIA GeForce RTX 4090 Laptop GPU 16GB, Audio: NVIDIA Device 22bb, Network: Realtek RTL8125 2.5GbE + Qualcomm QCNFA765

  OS: cachyos rolling, Kernel: 6.11.0-5-cachyos-lto (x86_64), Desktop: GNOME Shell 47.0, Display Server: X Server 1.21.1.13, Display Driver: NVIDIA 560.35.03, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.6.65, Compiler: GCC 14.2.1 20240910 + Clang 18.1.8 + LLVM 18.1.8 + CUDA 12.6, File-System: zfs, Screen Resolution: 2560x1600

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-N
GB/s > Higher Is Better
Radeon HD 8790M ........ 34.5 |========
nVidia RTX 4090 mobile . 196.0 |===============================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sAXPY
GB/s > Higher Is Better
IntelR HD Graphics 4600 HSW GT2 0x416 .. 13.9 |=
IntelR HD Graphics 4600 HSW GT2 0x416 .. 13.8 |=
IntelR HD Graphics 4600 HSW GT2 0x416 .. 14.0 |=
IntelR HD Graphics 4600 HSW GT2 0x416 .. 14.1 |=
IntelR HD Graphics 4600 HSW GT2 0x416 .. 13.2 |=
IntelR HD Graphics 4600 HSW GT2 0x416 .. 14.2 |=
Intel HD Graphics 4600 HSW GT2 CLANG70 . 13.5 |=
Intel HD Graphics 4600 HSW GT2 CLANG70 . 13.3 |=
Radeon HD 8790M ........................ 31.1 |==
IntelR HD Graphics 4600 HSW GT2 0x416 .. 13.6 |=
Intel HD Graphics 4600 HSW GT2 CLANG70 . 13.4 |=
nVidia RTX 4090 mobile ................. 437.0 |===============================

ViennaCL 1.7.1
Test: OpenCL BLAS - sDOT
GB/s > Higher Is Better
IntelR HD Graphics 4600 HSW GT2 0x416 .. 15.80 |=
IntelR HD Graphics 4600 HSW GT2 0x416 .. 15.70 |=
IntelR HD Graphics 4600 HSW GT2 0x416 .. 15.30 |=
IntelR HD Graphics 4600 HSW GT2 0x416 .. 14.61 |=
IntelR HD Graphics 4600 HSW GT2 0x416 .. 16.00 |=
IntelR HD Graphics 4600 HSW GT2 0x416 .. 16.10 |=
Intel HD Graphics 4600 HSW GT2 CLANG70 . 15.00 |=
Intel HD Graphics 4600 HSW GT2 CLANG70 . 15.30 |=
Radeon HD 8790M ........................ 27.30 |==
IntelR HD Graphics 4600 HSW GT2 0x416 .. 15.60 |=
Intel HD Graphics 4600 HSW GT2 CLANG70 . 15.10 |=
nVidia RTX 4090 mobile ................. 375.00 |==============================

ViennaCL 1.7.1
Test: OpenCL BLAS - sCOPY
GB/s > Higher Is Better
IntelR HD Graphics 4600 HSW GT2 0x416 .. 13.6 |==
IntelR HD Graphics 4600 HSW GT2 0x416 .. 14.1 |==
IntelR HD Graphics 4600 HSW GT2 0x416 .. 13.8 |==
IntelR HD Graphics 4600 HSW GT2 0x416 .. 14.3 |==
IntelR HD Graphics 4600 HSW GT2 0x416 .. 14.0 |==
IntelR HD Graphics 4600 HSW GT2 0x416 .. 14.2 |==
Intel HD Graphics 4600 HSW GT2 CLANG70 . 14.6 |==
Intel HD Graphics 4600 HSW GT2 CLANG70 . 13.8 |==
Intel HD Graphics 4600 HSW GT2 CLANG70 . 13.6 |==
Intel HD Graphics 4600 HSW GT2 CLANG70 . 13.4 |==
Intel HD Graphics 4600 HSW GT2 CLANG70 . 13.5 |==
Intel HD Graphics 4600 HSW GT2 CLANG70 . 13.7 |==
Intel HD Graphics 4600 HSW GT2 CLANG70 . 14.0 |==
Intel HD Graphics 4600 HSW GT2 CLANG70 . 14.4 |==
Radeon HD 8790M ........................ 29.5 |===
IntelR HD Graphics 4600 HSW GT2 0x416 .. 14.4 |==
Intel HD Graphics 4600 HSW GT2 CLANG70 . 14.7 |==
nVidia RTX 4090 mobile ................. 267.0 |===============================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
Radeon HD 8790M ........ 38.5 |===
nVidia RTX 4090 mobile . 681.0 |===============================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
Radeon HD 8790M ........ 37.7 |===
nVidia RTX 4090 mobile . 666.0 |===============================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-T
GB/s > Higher Is Better
Radeon HD 8790M ........ 22.7 |===
nVidia RTX 4090 mobile . 373.0 |===============================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
Radeon HD 8790M ........ 39.7 |===
nVidia RTX 4090 mobile . 637.0 |===============================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
Radeon HD 8790M ........ 39.0 |===
nVidia RTX 4090 mobile . 620.0 |===============================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dCOPY
GB/s > Higher Is Better
Radeon HD 8790M ........ 35.5 |====
nVidia RTX 4090 mobile . 469.0 |===============================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dAXPY
GB/s > Higher Is Better
Radeon HD 8790M ........ 40.6 |====
nVidia RTX 4090 mobile . 521.0 |===============================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dDOT
GB/s > Higher Is Better
Radeon HD 8790M ........ 44.7 |====
nVidia RTX 4090 mobile . 518.0 |===============================================