llama

2 x Intel Xeon Gold 6138 testing with a GIGABYTE V4288 MD61-SC2-00 v01000100 (R22 BIOS) and ASPEED on Debian 12 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2501037-ATTA-LLAMA4177
Jump To Table - Results

Statistics

Remove Outliers Before Calculating Averages

Graph Settings

Prefer Vertical Bar Graphs

Multi-Way Comparison

Condense Multi-Option Tests Into Single Result Graphs
Condense Test Profiles With Multiple Version Results Into Single Result Graphs

Table

Show Detailed System Result Table

Run Management

Result
Identifier
Performance Per
Dollar
Date
Run
  Test
  Duration
2 x Intel Xeon Gold 6138 - ASPEED - GIGABYTE V4288
December 26 2024
  1 Hour, 50 Minutes
Only show results matching title/arguments (delimit multiple options with a comma):
Do not show results matching title/arguments (delimit multiple options with a comma):


llamaOpenBenchmarking.orgPhoronix Test Suite2 x Intel Xeon Gold 6138 (40 Cores / 80 Threads)GIGABYTE V4288 MD61-SC2-00 v01000100 (R22 BIOS)Intel Sky Lake-E DMI3 Registers16 + 16 + 32 + 32 + 32 + 32 + 32 DDR4-2666MT/s1000GB Western Digital WD10EZEX-21M + 1000GB Seagate ST1000DM003-1SB1ASPEEDS22F3502 x Intel X722 for 1GbE + 2 x QLogic FastLinQ QL41000 10/25/40/50GbEDebian 126.1.0-26-amd64 (x86_64)GCC 12.2.0ext41920x1080ProcessorMotherboardChipsetMemoryDiskGraphicsMonitorNetworkOSKernelCompilerFile-SystemScreen ResolutionLlama BenchmarksSystem Logs- Transparent Huge Pages: always- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - CPU Microcode: 0x2007006- gather_data_sampling: Mitigation of Microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable + mds: Mitigation of Clear buffers; SMT vulnerable + meltdown: Mitigation of PTI + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + reg_file_data_sampling: Not affected + retbleed: Mitigation of IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of IBRS; IBPB: conditional; STIBP: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable

llamallama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 20482 x Intel Xeon Gold 6138 - ASPEED - GIGABYTE V42885.536.0220.985.9726.0526.6025.35OpenBenchmarking.org

Llama.cpp

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 1282 x Intel Xeon Gold 6138 - ASPEED - GIGABYTE V42881.24432.48863.73294.97726.2215SE +/- 0.01, N = 25.531. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 1282 x Intel Xeon Gold 6138 - ASPEED - GIGABYTE V4288246810SE +/- 0.01, N = 26.021. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4154Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 1282 x Intel Xeon Gold 6138 - ASPEED - GIGABYTE V4288510152025SE +/- 0.05, N = 220.981. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -fopenmp -march=native -mtune=native -lopenblas

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 1282 x Intel Xeon Gold 6138 - ASPEED - GIGABYTE V42881.34332.68664.02995.37326.7165SE +/- 0.00, N = 35.971. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 5122 x Intel Xeon Gold 6138 - ASPEED - GIGABYTE V4288612182430SE +/- 0.55, N = 1226.051. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 10242 x Intel Xeon Gold 6138 - ASPEED - GIGABYTE V4288612182430SE +/- 0.51, N = 1226.601. (CXX) g++ options: -O3

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b4397Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 20482 x Intel Xeon Gold 6138 - ASPEED - GIGABYTE V4288612182430SE +/- 0.36, N = 325.351. (CXX) g++ options: -O3