aorus-llama-cpp

AMD Ryzen 9 9950X 16-Core testing with a Gigabyte X870 AORUS ELITE WIFI7 (F3h BIOS) and Gigabyte NVIDIA GeForce RTX 4090 24GB on openSUSE Leap 15.6 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2502038-NE-AORUSLLAM92.

AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce

Processor: AMD Ryzen 9 9950X 16-Core @ 4.30GHz (16 Cores / 32 Threads)
Motherboard: Gigabyte X870 AORUS ELITE WIFI7 (F3h BIOS)
Chipset: AMD Raphael/Granite
Memory: 4 x 32 GB DDR5-3600MT/s CMH64GX5M2B6000C38
Disk: 1000GB Samsung SSD 970 EVO Plus 1TB + 2 x 2000GB Samsung SSD 870
Graphics: Gigabyte NVIDIA GeForce RTX 4090 24GB
Audio: NVIDIA AD102 HD Audio
Monitor: SyncMaster
Network: Realtek RTL8125 2.5GbE + MEDIATEK Device 7925
OS: openSUSE Leap 15.6
Kernel: 6.4.0-150600.23.30-default (x86_64)
Display Server: X Server 1.21.1.11
Display Driver: NVIDIA
Compiler: GCC 11.3.0 + CUDA 12.6
File-System: btrfs
Screen Resolution: 1280x1024

- Transparent Huge Pages: always
- --build=x86_64-suse-linux --disable-libcc1 --disable-libssp --disable-libstdcxx-pch --disable-libvtv --disable-plugin --disable-werror --enable-cet=auto --enable-checking=release --enable-gnu-indirect-function --enable-languages=c,c++,objc,fortran,obj-c++,ada,go,d --enable-libphobos --enable-libstdcxx-allocator=new --enable-linux-futex --enable-multilib --enable-offload-targets=nvptx-none, --enable-ssp --enable-version-specific-runtime-libs --host=x86_64-suse-linux --mandir=/usr/share/man --with-arch-32=x86-64 --with-gcc-major-version-only --with-slibdir=/lib64 --with-tune=generic --without-cuda-driver --without-system-libunwind
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xb404023
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Vulnerable + spectre_v1: Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers + spectre_v2: Vulnerable; IBPB: disabled; STIBP: disabled; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Results overview for AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce (Tokens Per Second):

llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128: 5.7
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512: 29.63
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024: 31.09
llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048: 31.85
llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128: 100.83
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128: 6.02
llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512: 11536.45
llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024: 10755.70
llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048: 9540.82
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512: 29.62
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024: 30.91
llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048: 32.02
llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128: 105.74
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128: 42.74
llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512: 11646.75
llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024: 10797.67
llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048: 9555.98
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512: 48.71
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024: 51.51
llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048: 52.34
llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128: 137.78
llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512: 4793.60
llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024: 4752.53
llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048: 4573.06
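
The overview above lists CPU BLAS and NVIDIA CUDA throughput separately for each model and test. As a minimal sketch of how the two backends could be compared, the Python snippet below divides the CUDA figure by the CPU BLAS figure for each pair. The numbers are copied from the overview; the names `RESULTS` and `cuda_speedup` are hypothetical helpers introduced for this illustration and are not part of the Phoronix Test Suite or llama.cpp.

```python
# Tokens per second copied from the results overview: (CPU BLAS, NVIDIA CUDA).
# RESULTS and cuda_speedup are illustrative names, not Phoronix Test Suite APIs.
RESULTS = {
    ("Llama-3.1-Tulu-3-8B-Q8_0", "Text Generation 128"): (5.7, 100.83),
    ("Llama-3.1-Tulu-3-8B-Q8_0", "Prompt Processing 512"): (29.63, 11536.45),
    ("Llama-3.1-Tulu-3-8B-Q8_0", "Prompt Processing 1024"): (31.09, 10755.70),
    ("Llama-3.1-Tulu-3-8B-Q8_0", "Prompt Processing 2048"): (31.85, 9540.82),
    ("Mistral-7B-Instruct-v0.3-Q8_0", "Text Generation 128"): (6.02, 105.74),
    ("Mistral-7B-Instruct-v0.3-Q8_0", "Prompt Processing 512"): (29.62, 11646.75),
    ("Mistral-7B-Instruct-v0.3-Q8_0", "Prompt Processing 1024"): (30.91, 10797.67),
    ("Mistral-7B-Instruct-v0.3-Q8_0", "Prompt Processing 2048"): (32.02, 9555.98),
    ("granite-3.0-3b-a800m-instruct-Q8_0", "Text Generation 128"): (42.74, 137.78),
    ("granite-3.0-3b-a800m-instruct-Q8_0", "Prompt Processing 512"): (48.71, 4793.60),
    ("granite-3.0-3b-a800m-instruct-Q8_0", "Prompt Processing 1024"): (51.51, 4752.53),
    ("granite-3.0-3b-a800m-instruct-Q8_0", "Prompt Processing 2048"): (52.34, 4573.06),
}


def cuda_speedup(results):
    """Return {(model, test): CUDA tokens/s divided by CPU BLAS tokens/s}."""
    return {key: cuda / cpu for key, (cpu, cuda) in results.items()}


if __name__ == "__main__":
    # Print one line per model/test pair with the CUDA-over-CPU ratio.
    for (model, test), ratio in cuda_speedup(RESULTS).items():
        print(f"{model:36s} {test:24s} {ratio:8.1f}x")
```

The ratio is a simple quotient of the reported means; it ignores the standard errors given with each individual result, so it should be read as a rough comparison only.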

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 5.7 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 29.63 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 31.09 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 31.85 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 100.83 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 6.02 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 11536.45 (SE +/- 4.04, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 10755.70 (SE +/- 9.57, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 9540.82 (SE +/- 2.82, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 29.62 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 30.91 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 32.02 (SE +/- 0.14, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 105.74 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 42.74 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 11646.75 (SE +/- 1.40, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 10797.67 (SE +/- 4.21, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 9555.98 (SE +/- 1.50, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 48.71 (SE +/- 0.27, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 51.51 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 52.34 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 137.78 (SE +/- 0.59, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 4793.60 (SE +/- 9.24, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 4752.53 (SE +/- 11.92, N = 3)
1. (CXX) g++ options: -O3

Llama.cpp

Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048

Llama.cpp b4397, Tokens Per Second, More Is Better
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce: 4573.06 (SE +/- 11.17, N = 3)
1. (CXX) g++ options: -O3


Phoronix Test Suite v10.8.5