nau AMD EPYC 7R13 48-Core testing with a Supermicro H12SSL-I v1.02 (2.7 BIOS) and NVIDIA GeForce RTX 4090 24GB on EndeavourOS rolling via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2403305-NE-NAU11698711 .
nau Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server Display Driver OpenGL Compiler File-System Screen Resolution AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB AMD EPYC 7R13 48-Core @ 3.73GHz (48 Cores / 96 Threads) Supermicro H12SSL-I v1.02 (2.7 BIOS) AMD Starship/Matisse 256GB 15363GB Micron_7450_MTFDKCC15T3TFR NVIDIA GeForce RTX 4090 24GB NVIDIA AD102 HD Audio 38GN950 2 x Intel X710 for 10GbE SFP+ EndeavourOS rolling 6.8.2-zen2-1-zen (x86_64) Xfce 4.18 X Server 1.21.1.11 NVIDIA 550.67 4.6.0 GCC 13.2.1 20230801 + Clang 17.0.6 + LLVM 17.0.6 + CUDA 12.4 btrfs 3840x1600 OpenBenchmarking.org - Transparent Huge Pages: always - NVCC_PREPEND_FLAGS="-ccbin /opt/cuda/bin" - --disable-libssp --disable-libstdcxx-pch --disable-werror --enable-__cxa_atexit --enable-bootstrap --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=ada,c,c++,d,fortran,go,lto,m2,objc,obj-c++ --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-build-config=bootstrap-lto --with-linker-hash-style=gnu - Scaling Governor: amd-pstate-epp performance (EPP: performance) - CPU Microcode: 0xa0011d3 - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
nau llama-cpp: llama-2-7b.Q4_0.gguf llama-cpp: llama-2-13b.Q4_0.gguf llama-cpp: llama-2-70b-chat.Q5_0.gguf AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB 27.85 15.79 2.92 OpenBenchmarking.org
Llama.cpp Model: llama-2-7b.Q4_0.gguf OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b1808 Model: llama-2-7b.Q4_0.gguf AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB 7 14 21 28 35 SE +/- 0.18, N = 15 27.85 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -fopenmp -lopenblas
Llama.cpp Model: llama-2-13b.Q4_0.gguf OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b1808 Model: llama-2-13b.Q4_0.gguf AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB 4 8 12 16 20 SE +/- 0.01, N = 3 15.79 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -fopenmp -lopenblas
Llama.cpp Model: llama-2-70b-chat.Q5_0.gguf OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b1808 Model: llama-2-70b-chat.Q5_0.gguf AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB 0.657 1.314 1.971 2.628 3.285 SE +/- 0.00, N = 3 2.92 1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -fopenmp -lopenblas
Phoronix Test Suite v10.8.5