Llama.cpp NVIDIA GeForce RTX 5090

Benchmarks by Michael Larabel for a future article on Phoronix.

RTX 3090

Processor: Intel Core Ultra 9 285K @ 5.10GHz (24 Cores), Motherboard: ASUS ROG MAXIMUS Z890 HERO (1203 BIOS), Chipset: Intel Device ae7f, Memory: 2 x 16GB DDR5-6400MT/s Micron CP16G64C38U5B.M8D1, Disk: 4001GB Western Digital WD_BLACK SN850X 4000GB + 1000GB Western Digital WDS100T1X0E-00AFY0, Graphics: ASUS NVIDIA GeForce RTX 3090 24GB, Audio: Intel Device 7f50, Monitor: ASUS VP28U, Network: Realtek Device 8126 + Intel I226-V + Intel Wi-Fi 7

OS: Ubuntu 24.10, Kernel: 6.11.0-13-generic (x86_64), Desktop: GNOME Shell 47.0, Display Server: X Server 1.21.1.13, Display Driver: NVIDIA 570.86.10, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.8.51 + OpenCL 3.0, Compiler: GCC 14.2.0 + CUDA 12.8, File-System: ext4, Screen Resolution: 3840x2160

Kernel Notes: nouveau.modeset=0 - Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: intel_pstate performance (EPP: default) - CPU Microcode: 0x114 - Thermald 2.5.8
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

RTX 4070

Changed Graphics to ASUS NVIDIA GeForce RTX 4070 12GB.

RTX 4070 SUPER

Changed Graphics to ASUS NVIDIA GeForce RTX 4070 SUPER 12GB.

RTX 4070 Ti SUPER

Changed Graphics to ASUS NVIDIA GeForce RTX 4070 Ti SUPER 16GB.

RTX 4080

Changed Graphics to ASUS NVIDIA GeForce RTX 4080 16GB.

RTX 4080 SUPER

Changed Graphics to ASUS NVIDIA GeForce RTX 4080 SUPER 16GB.

RTX 4090

Changed Graphics to ASUS NVIDIA GeForce RTX 4090 24GB.

RTX 5090

Changed Graphics to ASUS NVIDIA GeForce RTX 5090 32GB.

Graphs omitted. Each of the 8 tests listed below was shown with three graphs: Result, GPU Power Consumption, and GPU Temp.

Llama.cpp:
NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128
NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512
NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024
NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048
NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128
NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512
NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024
NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048

RTX 3090

Testing initiated at 26 January 2025 15:35 by user pts.

RTX 4070

Testing initiated at 25 January 2025 19:11 by user pts.

RTX 4070 SUPER

Testing initiated at 25 January 2025 20:29 by user pts.

RTX 4070 Ti SUPER

Testing initiated at 25 January 2025 21:55 by user pts.

RTX 4080

Testing initiated at 26 January 2025 02:01 by user pts.

RTX 4080 SUPER

Testing initiated at 25 January 2025 18:01 by user pts.

RTX 4090

Testing initiated at 26 January 2025 00:25 by user pts.

RTX 5090

Testing initiated at 25 January 2025 16:58 by user pts.
