Llama.cpp NVIDIA GeForce RTX 5090

Benchmarks by Michael Larabel for a future article on Phoronix.

HTML result view exported from: https://openbenchmarking.org/result/2501264-PTS-LLAMACPP76&grw.

Test System Details

All configurations share the same platform and differ only in graphics card:

  Processor:         Intel Core Ultra 9 285K @ 5.10GHz (24 Cores)
  Motherboard:       ASUS ROG MAXIMUS Z890 HERO (1203 BIOS)
  Chipset:           Intel Device ae7f
  Memory:            2 x 16GB DDR5-6400MT/s Micron CP16G64C38U5B.M8D1
  Disk:              4001GB Western Digital WD_BLACK SN850X 4000GB + 1000GB Western Digital WDS100T1X0E-00AFY0
  Audio:             Intel Device 7f50
  Monitor:           ASUS VP28U
  Network:           Realtek Device 8126 + Intel I226-V + Intel Wi-Fi 7
  OS:                Ubuntu 24.10
  Kernel:            6.11.0-13-generic (x86_64)
  Desktop:           GNOME Shell 47.0
  Display Server:    X Server 1.21.1.13
  Display Driver:    NVIDIA 570.86.10
  OpenGL:            4.6.0
  OpenCL:            OpenCL 3.0 CUDA 12.8.51 + OpenCL 3.0
  Compiler:          GCC 14.2.0 + CUDA 12.8
  File-System:       ext4
  Screen Resolution: 3840x2160

Graphics per configuration:

  RTX 3090:           ASUS NVIDIA GeForce RTX 3090 24GB
  RTX 4070:           ASUS NVIDIA GeForce RTX 4070 12GB
  RTX 4070 SUPER:     ASUS NVIDIA GeForce RTX 4070 SUPER 12GB
  RTX 4070 Ti SUPER:  ASUS NVIDIA GeForce RTX 4070 Ti SUPER 16GB
  RTX 4080:           ASUS NVIDIA GeForce RTX 4080 16GB
  RTX 4080 SUPER:     ASUS NVIDIA GeForce RTX 4080 SUPER 16GB
  RTX 4090:           ASUS NVIDIA GeForce RTX 4090 24GB
  RTX 5090:           ASUS NVIDIA GeForce RTX 5090 32GB

Kernel Details: nouveau.modeset=0 - Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: intel_pstate performance (EPP: default) - CPU Microcode: 0x114 - Thermald 2.5.8

Security Details: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Result Overview (Tokens Per Second, more is better; backend: NVIDIA CUDA; TG = Text Generation, PP = Prompt Processing):

                     Llama-3.1-Tulu-3-8B-Q8_0               Mistral-7B-Instruct-v0.3-Q8_0
                     TG128    PP512     PP1024    PP2048    TG128    PP512     PP1024    PP2048
  RTX 3090           90.88    5030.78   4784.57   4437.57   95.29    5046.02   4783.19   4437.49
  RTX 4070           54.47    4502.64   4198.29   3798.06   57.36    4533.01   4212.67   3803.46
  RTX 4070 SUPER     54.59    5419.57   4997.77   4436.29   57.48    5475.92   5024.48   4452.75
  RTX 4070 Ti SUPER  70.83    6328.82   5889.85   5301.35   74.55    6391.67   5924.06   5313.40
  RTX 4080           74.56    7606.74   7034.85   6228.28   78.50    7685.77   7067.75   6246.69
  RTX 4080 SUPER     77.08    7782.93   7201.82   6380.34   81.17    7862.60   7236.30   6410.73
  RTX 4090           100.51   11675.08  10889.08  9573.68   105.68   11826.41  10961.06  9593.96
  RTX 5090           158.77   12780.54  12385.34  11211.19  166.12   12884.46  12462.03  11240.15

Llama.cpp

Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397

  RTX 3090            90.88    (SE +/- 0.06, N = 3)
  RTX 4070            54.47    (SE +/- 0.00, N = 3)
  RTX 4070 SUPER      54.59    (SE +/- 0.01, N = 3)
  RTX 4070 Ti SUPER   70.83    (SE +/- 0.02, N = 3)
  RTX 4080            74.56    (SE +/- 0.01, N = 3)
  RTX 4080 SUPER      77.08    (SE +/- 0.00, N = 3)
  RTX 4090            100.51   (SE +/- 0.02, N = 3)
  RTX 5090            158.77   (SE +/- 0.06, N = 3)

  1. (CXX) g++ options: -O3
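As a quick sanity check on the text-generation numbers above, relative speedups can be computed directly from the table. This is a small standalone sketch with values copied from this result file; it is not part of the Phoronix Test Suite itself:

```python
# Tokens/s for Text Generation 128 (Llama-3.1-Tulu-3-8B-Q8_0, CUDA backend),
# copied from the result table above.
tg128 = {
    "RTX 3090": 90.88,
    "RTX 4070": 54.47,
    "RTX 4070 SUPER": 54.59,
    "RTX 4070 Ti SUPER": 70.83,
    "RTX 4080": 74.56,
    "RTX 4080 SUPER": 77.08,
    "RTX 4090": 100.51,
    "RTX 5090": 158.77,
}

# Speedup of each card relative to the RTX 3090 baseline.
baseline = tg128["RTX 3090"]
speedup = {gpu: round(tps / baseline, 2) for gpu, tps in tg128.items()}

print(speedup["RTX 5090"])  # 1.75 -> ~75% faster than the RTX 3090
print(round(tg128["RTX 5090"] / tg128["RTX 4090"], 2))  # 1.58 -> ~58% faster than the RTX 4090
```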

Llama.cpp

Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397

  RTX 3090            5030.78   (SE +/- 2.36, N = 5)
  RTX 4070            4502.64   (SE +/- 0.83, N = 5)
  RTX 4070 SUPER      5419.57   (SE +/- 0.71, N = 5)
  RTX 4070 Ti SUPER   6328.82   (SE +/- 0.97, N = 6)
  RTX 4080            7606.74   (SE +/- 1.40, N = 6)
  RTX 4080 SUPER      7782.93   (SE +/- 1.43, N = 6)
  RTX 4090            11675.08  (SE +/- 4.15, N = 7)
  RTX 5090            12780.54  (SE +/- 9.49, N = 8)

  1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397

  RTX 3090            4784.57   (SE +/- 4.02, N = 3)
  RTX 4070            4198.29   (SE +/- 3.12, N = 3)
  RTX 4070 SUPER      4997.77   (SE +/- 2.88, N = 3)
  RTX 4070 Ti SUPER   5889.85   (SE +/- 3.71, N = 4)
  RTX 4080            7034.85   (SE +/- 4.02, N = 4)
  RTX 4080 SUPER      7201.82   (SE +/- 0.54, N = 4)
  RTX 4090            10889.08  (SE +/- 2.66, N = 5)
  RTX 5090            12385.34  (SE +/- 7.60, N = 6)

  1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397

  RTX 3090            4437.57   (SE +/- 6.70, N = 3)
  RTX 4070            3798.06   (SE +/- 3.08, N = 3)
  RTX 4070 SUPER      4436.29   (SE +/- 2.42, N = 3)
  RTX 4070 Ti SUPER   5301.35   (SE +/- 4.23, N = 3)
  RTX 4080            6228.28   (SE +/- 2.85, N = 3)
  RTX 4080 SUPER      6380.34   (SE +/- 2.45, N = 3)
  RTX 4090            9573.68   (SE +/- 6.13, N = 3)
  RTX 5090            11211.19  (SE +/- 5.46, N = 4)

  1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397

  RTX 3090            95.29    (SE +/- 0.02, N = 3)
  RTX 4070            57.36    (SE +/- 0.00, N = 3)
  RTX 4070 SUPER      57.48    (SE +/- 0.00, N = 3)
  RTX 4070 Ti SUPER   74.55    (SE +/- 0.01, N = 3)
  RTX 4080            78.50    (SE +/- 0.00, N = 3)
  RTX 4080 SUPER      81.17    (SE +/- 0.01, N = 3)
  RTX 4090            105.68   (SE +/- 0.02, N = 3)
  RTX 5090            166.12   (SE +/- 0.03, N = 3)

  1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397

  RTX 3090            5046.02   (SE +/- 4.47, N = 5)
  RTX 4070            4533.01   (SE +/- 1.08, N = 5)
  RTX 4070 SUPER      5475.92   (SE +/- 1.14, N = 5)
  RTX 4070 Ti SUPER   6391.67   (SE +/- 0.47, N = 6)
  RTX 4080            7685.77   (SE +/- 1.46, N = 6)
  RTX 4080 SUPER      7862.60   (SE +/- 1.63, N = 6)
  RTX 4090            11826.41  (SE +/- 2.51, N = 8)
  RTX 5090            12884.46  (SE +/- 4.77, N = 8)

  1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397

  RTX 3090            4783.19   (SE +/- 5.32, N = 3)
  RTX 4070            4212.67   (SE +/- 3.38, N = 3)
  RTX 4070 SUPER      5024.48   (SE +/- 2.97, N = 3)
  RTX 4070 Ti SUPER   5924.06   (SE +/- 4.77, N = 4)
  RTX 4080            7067.75   (SE +/- 3.16, N = 4)
  RTX 4080 SUPER      7236.30   (SE +/- 1.89, N = 4)
  RTX 4090            10961.06  (SE +/- 2.42, N = 5)
  RTX 5090            12462.03  (SE +/- 7.53, N = 6)

  1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048

OpenBenchmarking.org - Tokens Per Second, More Is Better - Llama.cpp b4397

  RTX 3090            4437.49   (SE +/- 3.07, N = 3)
  RTX 4070            3803.46   (SE +/- 2.69, N = 3)
  RTX 4070 SUPER      4452.75   (SE +/- 2.58, N = 3)
  RTX 4070 Ti SUPER   5313.40   (SE +/- 3.82, N = 3)
  RTX 4080            6246.69   (SE +/- 3.30, N = 3)
  RTX 4080 SUPER      6410.73   (SE +/- 1.36, N = 3)
  RTX 4090            9593.96   (SE +/- 3.44, N = 3)
  RTX 5090            11240.15  (SE +/- 5.63, N = 4)

  1. (CXX) g++ options: -O3

Llama.cpp

GPU Power Consumption Monitor

OpenBenchmarking.org - Watts, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 24.97 / Avg: 327.34 / Max: 342.87
  RTX 4070            Min: 6.48 / Avg: 154.01 / Max: 159.62
  RTX 4070 SUPER      Min: 7.65 / Avg: 163.7 / Max: 170.36
  RTX 4070 Ti SUPER   Min: 5.38 / Avg: 214.26 / Max: 226.28
  RTX 4080            Min: 4.69 / Avg: 203.74 / Max: 213.53
  RTX 4080 SUPER      Min: 4.82 / Avg: 200.32 / Max: 209.63
  RTX 4090            Min: 13.89 / Avg: 257.43 / Max: 273.19
  RTX 5090            Min: 17.41 / Avg: 415.2 / Max: 457.63
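A rough tokens-per-watt figure can be derived by pairing throughput with average power draw. Note the export does not label the monitor readouts per test, so pairing this first monitor with the Text Generation 128 (Tulu 8B Q8_0) run is an assumption based on run order:

```python
# Hypothetical efficiency estimate: pairs Text Generation 128 throughput
# (Llama-3.1-Tulu-3-8B-Q8_0) with the average power draw from the monitor above.
# ASSUMPTION: this monitor corresponds to that test; the export does not say.
tg128_tps = {"RTX 4090": 100.51, "RTX 5090": 158.77}
avg_watts = {"RTX 4090": 257.43, "RTX 5090": 415.20}

tokens_per_watt = {gpu: round(tg128_tps[gpu] / avg_watts[gpu], 3) for gpu in tg128_tps}
print(tokens_per_watt)  # {'RTX 4090': 0.39, 'RTX 5090': 0.382}
```

Under this (assumed) pairing, the RTX 5090's higher generation throughput comes with proportionally higher average power draw, leaving efficiency roughly flat versus the RTX 4090.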

Llama.cpp

GPU Temperature Monitor

OpenBenchmarking.org - Celsius, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 34 / Avg: 59.63 / Max: 65
  RTX 4070            Min: 35 / Avg: 54.12 / Max: 57
  RTX 4070 SUPER      Min: 33 / Avg: 57.41 / Max: 60
  RTX 4070 Ti SUPER   Min: 25 / Avg: 53.76 / Max: 58
  RTX 4080            Min: 32 / Avg: 50.18 / Max: 53
  RTX 4080 SUPER      Min: 35 / Avg: 50.69 / Max: 53
  RTX 4090            Min: 32 / Avg: 51.59 / Max: 56
  RTX 5090            Min: 37 / Avg: 61.91 / Max: 69

Llama.cpp

GPU Power Consumption Monitor

OpenBenchmarking.org - Watts, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 22.01 / Avg: 237.66 / Max: 341.47
  RTX 4070            Min: 7.97 / Avg: 132.64 / Max: 200.11
  RTX 4070 SUPER      Min: 7.25 / Avg: 136.61 / Max: 220.65
  RTX 4070 Ti SUPER   Min: 14.33 / Avg: 169.82 / Max: 285.64
  RTX 4080            Min: 9.4 / Avg: 174.7 / Max: 319.36
  RTX 4080 SUPER      Min: 5.25 / Avg: 168 / Max: 319.7
  RTX 4090            Min: 15.06 / Avg: 193.46 / Max: 435.07
  RTX 5090            Min: 27.51 / Avg: 256.77 / Max: 560.52

Llama.cpp

GPU Temperature Monitor

OpenBenchmarking.org - Celsius, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 48 / Avg: 56.73 / Max: 61
  RTX 4070            Min: 42 / Avg: 53.58 / Max: 61
  RTX 4070 SUPER      Min: 44 / Avg: 56 / Max: 64
  RTX 4070 Ti SUPER   Min: 42 / Avg: 51.94 / Max: 60
  RTX 4080            Min: 39 / Avg: 49.47 / Max: 59
  RTX 4080 SUPER      Min: 39 / Avg: 49.64 / Max: 60
  RTX 4090            Min: 43 / Avg: 51.1 / Max: 62
  RTX 5090            Min: 50 / Avg: 56.94 / Max: 66

Llama.cpp

GPU Power Consumption Monitor

OpenBenchmarking.org - Watts, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 23.95 / Avg: 275.82 / Max: 344.66
  RTX 4070            Min: 7.55 / Avg: 158.88 / Max: 199.95
  RTX 4070 SUPER      Min: 7.89 / Avg: 169.03 / Max: 219.99
  RTX 4070 Ti SUPER   Min: 13.99 / Avg: 212.45 / Max: 285.44
  RTX 4080            Min: 4.24 / Avg: 223.14 / Max: 317.21
  RTX 4080 SUPER      Min: 5.05 / Avg: 217.04 / Max: 314.35
  RTX 4090            Min: 14.49 / Avg: 265.09 / Max: 430.27
  RTX 5090            Min: 18.52 / Avg: 345.54 / Max: 570.21

Llama.cpp

GPU Temperature Monitor

OpenBenchmarking.org - Celsius, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 51 / Avg: 60.11 / Max: 64
  RTX 4070            Min: 40 / Avg: 56.03 / Max: 62
  RTX 4070 SUPER      Min: 41 / Avg: 58.53 / Max: 66
  RTX 4070 Ti SUPER   Min: 39 / Avg: 54.45 / Max: 62
  RTX 4080            Min: 37 / Avg: 52.51 / Max: 61
  RTX 4080 SUPER      Min: 37 / Avg: 52.95 / Max: 61
  RTX 4090            Min: 40 / Avg: 53.84 / Max: 63
  RTX 5090            Min: 46 / Avg: 58.72 / Max: 68

Llama.cpp

GPU Power Consumption Monitor

OpenBenchmarking.org - Watts, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 23.39 / Avg: 306.37 / Max: 343.56
  RTX 4070            Min: 7.83 / Avg: 178.34 / Max: 200.02
  RTX 4070 SUPER      Min: 7.37 / Avg: 192.5 / Max: 220.51
  RTX 4070 Ti SUPER   Min: 14.21 / Avg: 245.09 / Max: 285.52
  RTX 4080            Min: 8.5 / Avg: 260.59 / Max: 312.27
  RTX 4080 SUPER      Min: 4.79 / Avg: 256.77 / Max: 311.16
  RTX 4090            Min: 14.48 / Avg: 321.19 / Max: 418.3
  RTX 5090            Min: 19.28 / Avg: 432.76 / Max: 574.58

Llama.cpp

GPU Temperature Monitor

OpenBenchmarking.org - Celsius, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 53 / Avg: 63.32 / Max: 66
  RTX 4070            Min: 41 / Avg: 60.41 / Max: 65
  RTX 4070 SUPER      Min: 42 / Avg: 62.98 / Max: 69
  RTX 4070 Ti SUPER   Min: 40 / Avg: 58.28 / Max: 64
  RTX 4080            Min: 38 / Avg: 56.57 / Max: 63
  RTX 4080 SUPER      Min: 38 / Avg: 56.88 / Max: 62
  RTX 4090            Min: 41 / Avg: 58.44 / Max: 66
  RTX 5090            Min: 48 / Avg: 63.88 / Max: 71

Llama.cpp

GPU Power Consumption Monitor

OpenBenchmarking.org - Watts, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 23.64 / Avg: 329.32 / Max: 344.04
  RTX 4070            Min: 6.88 / Avg: 153.72 / Max: 159.32
  RTX 4070 SUPER      Min: 7.44 / Avg: 163.71 / Max: 169.72
  RTX 4070 Ti SUPER   Min: 13.45 / Avg: 215.91 / Max: 226.33
  RTX 4080            Min: 6.77 / Avg: 203.21 / Max: 212.41
  RTX 4080 SUPER      Min: 4.53 / Avg: 199.33 / Max: 208.99
  RTX 4090            Min: 14.19 / Avg: 258 / Max: 273.81
  RTX 5090            Min: 16.89 / Avg: 413.87 / Max: 455.23

Llama.cpp

GPU Temperature Monitor

OpenBenchmarking.org - Celsius, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 52 / Avg: 62.42 / Max: 66
  RTX 4070            Min: 36 / Avg: 54.09 / Max: 56
  RTX 4070 SUPER      Min: 37 / Avg: 57.77 / Max: 60
  RTX 4070 Ti SUPER   Min: 36 / Avg: 55.47 / Max: 58
  RTX 4080            Min: 34 / Avg: 50.12 / Max: 53
  RTX 4080 SUPER      Min: 34 / Avg: 50.09 / Max: 52
  RTX 4090            Min: 35 / Avg: 52.4 / Max: 56
  RTX 5090            Min: 41 / Avg: 62.32 / Max: 69

Llama.cpp

GPU Power Consumption Monitor

OpenBenchmarking.org - Watts, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 20.53 / Avg: 240.16 / Max: 345.05
  RTX 4070            Min: 7.98 / Avg: 134.4 / Max: 201.53
  RTX 4070 SUPER      Min: 7.61 / Avg: 138.89 / Max: 220.09
  RTX 4070 Ti SUPER   Min: 14.38 / Avg: 171.84 / Max: 284.66
  RTX 4080            Min: 8.38 / Avg: 175.29 / Max: 319.75
  RTX 4080 SUPER      Min: 5.04 / Avg: 168.67 / Max: 319.11
  RTX 4090            Min: 14.41 / Avg: 196.33 / Max: 437.54
  RTX 5090            Min: 26.67 / Avg: 258.35 / Max: 561.45

Llama.cpp

GPU Temperature Monitor

OpenBenchmarking.org - Celsius, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 49 / Avg: 57.48 / Max: 62
  RTX 4070            Min: 42 / Avg: 53.79 / Max: 61
  RTX 4070 SUPER      Min: 44 / Avg: 56.15 / Max: 65
  RTX 4070 Ti SUPER   Min: 42 / Avg: 52.11 / Max: 60
  RTX 4080            Min: 39 / Avg: 49.57 / Max: 59
  RTX 4080 SUPER      Min: 39 / Avg: 50.19 / Max: 60
  RTX 4090            Min: 43 / Avg: 51.48 / Max: 62
  RTX 5090            Min: 50 / Avg: 57.19 / Max: 66

Llama.cpp

GPU Power Consumption Monitor

OpenBenchmarking.org - Watts, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 25.35 / Avg: 277.57 / Max: 341.63
  RTX 4070            Min: 7.56 / Avg: 159.8 / Max: 199.94
  RTX 4070 SUPER      Min: 8.41 / Avg: 169.72 / Max: 220.27
  RTX 4070 Ti SUPER   Min: 13.81 / Avg: 214.5 / Max: 285.22
  RTX 4080            Min: 4.25 / Avg: 225.62 / Max: 317.36
  RTX 4080 SUPER      Min: 4.69 / Avg: 218.99 / Max: 314.37
  RTX 4090            Min: 14.69 / Avg: 265.39 / Max: 431.84
  RTX 5090            Min: 18.98 / Avg: 349.69 / Max: 571.23

Llama.cpp

GPU Temperature Monitor

OpenBenchmarking.org - Celsius, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 52 / Avg: 60.73 / Max: 65
  RTX 4070            Min: 40 / Avg: 56.19 / Max: 62
  RTX 4070 SUPER      Min: 41 / Avg: 58.84 / Max: 66
  RTX 4070 Ti SUPER   Min: 39 / Avg: 54.87 / Max: 62
  RTX 4080            Min: 37 / Avg: 53.02 / Max: 61
  RTX 4080 SUPER      Min: 37 / Avg: 53.06 / Max: 62
  RTX 4090            Min: 40 / Avg: 54.27 / Max: 63
  RTX 5090            Min: 46 / Avg: 58.7 / Max: 68

Llama.cpp

GPU Power Consumption Monitor

OpenBenchmarking.org - Watts, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 23.77 / Avg: 307.97 / Max: 343.53
  RTX 4070            Min: 7.53 / Avg: 179.01 / Max: 199.51
  RTX 4070 SUPER      Min: 7.61 / Avg: 193.37 / Max: 220.56
  RTX 4070 Ti SUPER   Min: 14.11 / Avg: 246.61 / Max: 285.48
  RTX 4080            Min: 6.32 / Avg: 262.08 / Max: 312.97
  RTX 4080 SUPER      Min: 4.91 / Avg: 257.78 / Max: 310.42
  RTX 4090            Min: 14.24 / Avg: 323.62 / Max: 418.45
  RTX 5090            Min: 20.93 / Avg: 437.15 / Max: 575.44

Llama.cpp

GPU Temperature Monitor

OpenBenchmarking.org - Celsius, Fewer Is Better - Llama.cpp b4397

  RTX 3090            Min: 53 / Avg: 63.58 / Max: 66
  RTX 4070            Min: 41 / Avg: 60.34 / Max: 65
  RTX 4070 SUPER      Min: 42 / Avg: 63.2 / Max: 68
  RTX 4070 Ti SUPER   Min: 40 / Avg: 58.49 / Max: 63
  RTX 4080            Min: 38 / Avg: 56.96 / Max: 62
  RTX 4080 SUPER      Min: 38 / Avg: 57.1 / Max: 62
  RTX 4090            Min: 41 / Avg: 58.54 / Max: 65
  RTX 5090            Min: 48 / Avg: 64.08 / Max: 71


Phoronix Test Suite v10.8.5