aorus-llama-cpp

AMD Ryzen 9 9950X 16-Core testing with a Gigabyte X870 AORUS ELITE WIFI7 (F3h BIOS) and Gigabyte NVIDIA GeForce RTX 4090 24GB on openSUSE Leap 15.6 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2502038-NE-AORUSLLAM92
Jump To Table - Results

Statistics

Remove Outliers Before Calculating Averages

Graph Settings

Prefer Vertical Bar Graphs

Multi-Way Comparison

Condense Multi-Option Tests Into Single Result Graphs

Table

Show Detailed System Result Table

Run Management

Result
Identifier
Performance Per
Dollar
Date
Run
  Test
  Duration
AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce
December 19 2024
  2 Hours, 2 Minutes
Only show results matching title/arguments (delimit multiple options with a comma):
Do not show results matching title/arguments (delimit multiple options with a comma):


aorus-llama-cppOpenBenchmarking.orgPhoronix Test SuiteAMD Ryzen 9 9950X 16-Core @ 4.30GHz (16 Cores / 32 Threads)Gigabyte X870 AORUS ELITE WIFI7 (F3h BIOS)AMD Raphael/Granite4 x 32 GB DDR5-3600MT/s CMH64GX5M2B6000C381000GB Samsung SSD 970 EVO Plus 1TB + 2 x 2000GB Samsung SSD 870Gigabyte NVIDIA GeForce RTX 4090 24GBNVIDIA AD102 HD AudioSyncMasterRealtek RTL8125 2.5GbE + MEDIATEK Device 7925openSUSE Leap 15.66.4.0-150600.23.30-default (x86_64)X Server 1.21.1.11NVIDIAGCC 11.3.0 + CUDA 12.6btrfs1280x1024ProcessorMotherboardChipsetMemoryDiskGraphicsAudioMonitorNetworkOSKernelDisplay ServerDisplay DriverCompilerFile-SystemScreen ResolutionAorus-llama-cpp BenchmarksSystem Logs- Transparent Huge Pages: always- --build=x86_64-suse-linux --disable-libcc1 --disable-libssp --disable-libstdcxx-pch --disable-libvtv --disable-plugin --disable-werror --enable-cet=auto --enable-checking=release --enable-gnu-indirect-function --enable-languages=c,c++,objc,fortran,obj-c++,ada,go,d --enable-libphobos --enable-libstdcxx-allocator=new --enable-linux-futex --enable-multilib --enable-offload-targets=nvptx-none, --enable-ssp --enable-version-specific-runtime-libs --host=x86_64-suse-linux --mandir=/usr/share/man --with-arch-32=x86-64 --with-gcc-major-version-only --with-slibdir=/lib64 --with-tune=generic --without-cuda-driver --without-system-libunwind - Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0xb404023- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Vulnerable + spectre_v1: Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers + spectre_v2: Vulnerable; IBPB: disabled; STIBP: disabled; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

aorus-llama-cppllama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512llama-cpp: NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128llama-cpp: NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128llama-cpp: NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce4573.064752.534793.60137.7852.3451.5148.719555.9810797.6711646.7542.74105.7432.0230.9129.629540.8210755.7011536.456.02100.8331.8531.0929.635.7OpenBenchmarking.org

Llama.cpp

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce10002000300040005000SE +/- 11.17, N = 34573.061. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce10002000300040005000SE +/- 11.92, N = 34752.531. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce10002000300040005000SE +/- 9.24, N = 34793.601. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: NVIDIA CUDA - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce306090120150SE +/- 0.59, N = 3137.781. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce1224364860SE +/- 0.13, N = 352.341. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce1224364860SE +/- 0.01, N = 351.511. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce1122334455SE +/- 0.27, N = 348.711. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce2K4K6K8K10KSE +/- 1.50, N = 39555.981. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce2K4K6K8K10KSE +/- 4.21, N = 310797.671. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce2K4K6K8K10KSE +/- 1.40, N = 311646.751. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce1020304050SE +/- 0.09, N = 342.741. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: NVIDIA CUDA - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce20406080100SE +/- 0.04, N = 3105.741. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce714212835SE +/- 0.14, N = 332.021. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce714212835SE +/- 0.09, N = 330.911. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce714212835SE +/- 0.06, N = 329.621. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce2K4K6K8K10KSE +/- 2.82, N = 39540.821. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce2K4K6K8K10KSE +/- 9.57, N = 310755.701. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce2K4K6K8K10KSE +/- 4.04, N = 311536.451. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce246810SE +/- 0.00, N = 36.021. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: NVIDIA CUDA - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce20406080100SE +/- 0.02, N = 3100.831. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce714212835SE +/- 0.13, N = 331.851. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce714212835SE +/- 0.06, N = 331.091. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce714212835SE +/- 0.06, N = 329.631. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128AMD Ryzen 9 9950X 16-Core - Gigabyte NVIDIA GeForce1.28252.5653.84755.136.4125SE +/- 0.00, N = 35.71. (CXX) g++ options: -O3

24 Results Shown

Llama.cpp:
  NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048
  NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024
  NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512
  NVIDIA CUDA - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512
  NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048
  NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024
  NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512
  CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128
  NVIDIA CUDA - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512
  NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048
  NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024
  NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512
  CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128
  NVIDIA CUDA - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512
  CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128