llama cpp epyc turin amd

AMD EPYC 9655P 96-Core testing with a Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS) and ASPEED on Ubuntu 24.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2501198-NE-LLAMNACPP45.

System under test ("a"):

  Processor:          AMD EPYC 9655P 96-Core @ 2.60GHz (96 Cores / 192 Threads)
  Motherboard:        Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS)
  Chipset:            AMD 1Ah
  Memory:             12 x 64GB DDR5-6000MT/s Micron MTC40F2046S1RC64BDY QSFF
  Disk:               3201GB Micron_7450_MTFDKCB3T2TFS
  Graphics:           ASPEED
  Network:            2 x Broadcom NetXtreme BCM5720 PCIe
  OS:                 Ubuntu 24.10
  Kernel:             6.13.0-rc4-phx-stock (x86_64)
  Desktop:            GNOME Shell 47.0
  Display Server:     X Server
  Compiler:           GCC 14.2.0
  File-System:        ext4
  Screen Resolution:  1024x768

OpenBenchmarking.org test notes:

  - Transparent Huge Pages: madvise
  - Compiler configuration: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
  - Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
  - CPU Microcode: 0xb002116
  - Security mitigations: gather_data_sampling: Not affected; itlb_multihit: Not affected; l1tf: Not affected; mds: Not affected; meltdown: Not affected; mmio_stale_data: Not affected; reg_file_data_sampling: Not affected; retbleed: Not affected; spec_rstack_overflow: Not affected; spec_store_bypass: Mitigation of SSB disabled via prctl; spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization; spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected; srbds: Not affected; tsx_async_abort: Not affected

Results summary — llama.cpp, Backend: CPU BLAS, Tokens Per Second (system "a"):

  Llama-3.1-Tulu-3-8B-Q8_0:
    Text Generation 128:      45.46
    Prompt Processing 512:   110.26
    Prompt Processing 1024:  107.69
    Prompt Processing 2048:  110.87
  granite-3.0-3b-a800m-instruct-Q8_0:
    Text Generation 128:      92.95
    Prompt Processing 512:   369.98
    Prompt Processing 1024:  371.41
    Prompt Processing 2048:  355.09
  Mistral-7B-Instruct-v0.3-Q8_0:
    Text Generation 128:      47.56
    Prompt Processing 512:   112.12
    Prompt Processing 1024:  109.84
    Prompt Processing 2048:  110.72
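As a minimal sketch of working with these figures (values transcribed from the summary table above; the per-model comparison is my own illustration, not part of the OpenBenchmarking.org export), the text-generation throughput of each model can be compared against the Tulu 8B baseline:

```python
# Tokens-per-second results transcribed from the summary table above
# (Backend: CPU BLAS, system "a").
results = {
    "Llama-3.1-Tulu-3-8B-Q8_0": {
        "tg128": 45.46, "pp512": 110.26, "pp1024": 107.69, "pp2048": 110.87,
    },
    "granite-3.0-3b-a800m-instruct-Q8_0": {
        "tg128": 92.95, "pp512": 369.98, "pp1024": 371.41, "pp2048": 355.09,
    },
    "Mistral-7B-Instruct-v0.3-Q8_0": {
        "tg128": 47.56, "pp512": 112.12, "pp1024": 109.84, "pp2048": 110.72,
    },
}

baseline = results["Llama-3.1-Tulu-3-8B-Q8_0"]["tg128"]
for model, scores in results.items():
    # Relative text-generation throughput versus the Tulu 8B result.
    ratio = scores["tg128"] / baseline
    print(f"{model}: {scores['tg128']:.2f} tok/s generation ({ratio:.2f}x baseline)")
```

The ratios make the size difference visible at a glance: the 3B granite MoE model generates roughly twice as fast as the two 7B/8B dense models, while its prompt-processing throughput is over three times higher.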

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128

Result: 45.46 Tokens Per Second (SE +/- 0.11, N = 4). Llama.cpp b4397; higher is better. Compiler: (CXX) g++ -O3.
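Each result above is reported as a mean with a standard error over N runs. As an illustration only (this is not the Phoronix Test Suite's own code, and the individual run values behind "SE +/- 0.11, N = 4" are not published in this export, so the sample data below is hypothetical), the standard error of the mean is the sample standard deviation divided by the square root of N:

```python
import statistics

def standard_error(samples):
    """Standard error of the mean: sample stdev / sqrt(N)."""
    return statistics.stdev(samples) / len(samples) ** 0.5

# Hypothetical per-run tokens/sec values, for illustration only.
runs = [45.30, 45.45, 45.55, 45.54]
print(f"mean = {statistics.mean(runs):.2f} tok/s, SE = +/- {standard_error(runs):.2f}")
```

A small SE relative to the mean (here about 0.2% for the 45.46 tok/s result) indicates the run-to-run variance was low.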

Llama.cpp

CPU Temperature Monitor

Min 26.0 / Avg 39.0 / Max 43.6 Celsius (lower is better).

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512

Result: 110.26 Tokens Per Second (SE +/- 0.53, N = 3). Llama.cpp b4397; higher is better. Compiler: (CXX) g++ -O3.

Llama.cpp

CPU Temperature Monitor

Min 33.0 / Avg 41.6 / Max 43.9 Celsius (lower is better).

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024

Result: 107.69 Tokens Per Second (SE +/- 0.59, N = 3). Llama.cpp b4397; higher is better. Compiler: (CXX) g++ -O3.

Llama.cpp

CPU Temperature Monitor

Min 33.5 / Avg 43.0 / Max 44.6 Celsius (lower is better).

Llama.cpp

Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048

Result: 110.87 Tokens Per Second (SE +/- 0.33, N = 3). Llama.cpp b4397; higher is better. Compiler: (CXX) g++ -O3.

Llama.cpp

CPU Temperature Monitor

Min 34.6 / Avg 44.4 / Max 45.9 Celsius (lower is better).

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128

Result: 92.95 Tokens Per Second (SE +/- 0.44, N = 6). Llama.cpp b4397; higher is better. Compiler: (CXX) g++ -O3.

Llama.cpp

CPU Temperature Monitor

Min 35.6 / Avg 40.1 / Max 41.6 Celsius (lower is better).

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512

Result: 369.98 Tokens Per Second (SE +/- 2.91, N = 10). Llama.cpp b4397; higher is better. Compiler: (CXX) g++ -O3.

Llama.cpp

CPU Temperature Monitor

Min 33.5 / Avg 39.7 / Max 41.5 Celsius (lower is better).

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024

Result: 371.41 Tokens Per Second (SE +/- 2.41, N = 3). Llama.cpp b4397; higher is better. Compiler: (CXX) g++ -O3.

Llama.cpp

CPU Temperature Monitor

Min 32.9 / Avg 39.8 / Max 41.9 Celsius (lower is better).

Llama.cpp

Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048

Result: 355.09 Tokens Per Second (SE +/- 2.05, N = 3). Llama.cpp b4397; higher is better. Compiler: (CXX) g++ -O3.

Llama.cpp

CPU Temperature Monitor

Min 33.3 / Avg 41.3 / Max 42.8 Celsius (lower is better).

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128

Result: 47.56 Tokens Per Second (SE +/- 0.16, N = 4). Llama.cpp b4397; higher is better. Compiler: (CXX) g++ -O3.

Llama.cpp

CPU Temperature Monitor

Min 34.0 / Avg 42.6 / Max 45.5 Celsius (lower is better).

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512

Result: 112.12 Tokens Per Second (SE +/- 1.03, N = 3). Llama.cpp b4397; higher is better. Compiler: (CXX) g++ -O3.

Llama.cpp

CPU Temperature Monitor

Min 35.4 / Avg 43.0 / Max 44.9 Celsius (lower is better).

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024

Result: 109.84 Tokens Per Second (SE +/- 0.77, N = 3). Llama.cpp b4397; higher is better. Compiler: (CXX) g++ -O3.

Llama.cpp

CPU Temperature Monitor

Min 34.5 / Avg 43.9 / Max 45.5 Celsius (lower is better).

Llama.cpp

Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048

Result: 110.72 Tokens Per Second (SE +/- 0.79, N = 3). Llama.cpp b4397; higher is better. Compiler: (CXX) g++ -O3.

Llama.cpp

CPU Temperature Monitor

Min 35.5 / Avg 44.8 / Max 46.1 Celsius (lower is better).


Phoronix Test Suite v10.8.5