llama tr

AMD Ryzen Threadripper 3990X 64-Core testing with a Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2401109-PTS-LLAMATR408&grt&sor .
llama tr
Runs: a, b, c, f

  Processor:          AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads)
  Motherboard:        Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS)
  Chipset:            AMD Starship/Matisse
  Memory:             128GB
  Disk:               Samsung SSD 970 EVO Plus 500GB
  Graphics:           AMD Radeon RX 5700 8GB (1750/875MHz)
  Audio:              AMD Navi 10 HDMI Audio
  Monitor:            DELL P2415Q
  Network:            Intel I211 + Intel Wi-Fi 6 AX200
  OS:                 Ubuntu 23.10
  Kernel:             6.5.0-14-generic (x86_64)
  Desktop:            GNOME Shell 45.0
  Display Server:     X Server + Wayland
  OpenGL:             4.6 Mesa 23.2.1-1ubuntu3 (LLVM 15.0.7 DRM 3.54)
  Compiler:           GCC 13.2.0
  File-System:        ext4
  Screen Resolution:  3840x2160

Kernel Details
  - Transparent Huge Pages: madvise

Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
  - CPU Microcode: 0x830107a

Security Details
  - gather_data_sampling: Not affected
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - mmio_stale_data: Not affected
  - retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection
  - spec_rstack_overflow: Mitigation of safe RET
  - spec_store_bypass: Mitigation of SSB disabled via prctl
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
  - srbds: Not affected
  - tsx_async_abort: Not affected
llama tr

  Test                                     a               b               c               f
  cachebench: Read                         10914.542344    10857.243163    10863.417473    10954.499723
  cachebench: Write                        63817.965300    64184.073674    63783.849327    64095.092643
  cachebench: Read / Modify / Write        115970.797764   117095.822748   116205.99615    117248.299362
  llama-cpp: llama-2-7b.Q4_0.gguf          14.23           14.23           14.99           14.19
  llama-cpp: llama-2-13b.Q4_0.gguf         8.21            8.01            8.02            8
  llama-cpp: llama-2-70b-chat.Q5_0.gguf    1.45            1.45            1.45            1.46
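The four runs land very close together on most tests, and a quick way to see how close is to compute the gap between the best and worst run for each test. The following is a minimal Python sketch that does this using the values copied from the overview above; the names `RESULTS`, `spread_percent`, and `best_run` are illustrative and not part of the Phoronix Test Suite.

```python
# Values copied from the result overview; keys are the run identifiers a, b, c, f.
# All six tests are "More Is Better", so the highest value is the best run.
RESULTS = {
    "cachebench: Read": {
        "a": 10914.542344, "b": 10857.243163, "c": 10863.417473, "f": 10954.499723,
    },
    "cachebench: Write": {
        "a": 63817.965300, "b": 64184.073674, "c": 63783.849327, "f": 64095.092643,
    },
    "cachebench: Read / Modify / Write": {
        "a": 115970.797764, "b": 117095.822748, "c": 116205.99615, "f": 117248.299362,
    },
    "llama-cpp: llama-2-7b.Q4_0.gguf": {"a": 14.23, "b": 14.23, "c": 14.99, "f": 14.19},
    "llama-cpp: llama-2-13b.Q4_0.gguf": {"a": 8.21, "b": 8.01, "c": 8.02, "f": 8.0},
    "llama-cpp: llama-2-70b-chat.Q5_0.gguf": {"a": 1.45, "b": 1.45, "c": 1.45, "f": 1.46},
}


def spread_percent(runs):
    """Gap between the best and worst run, as a percentage of the worst run."""
    lowest, highest = min(runs.values()), max(runs.values())
    return (highest - lowest) / lowest * 100.0


def best_run(runs):
    """Identifier of the run with the highest value (first one wins on ties)."""
    return max(runs, key=runs.get)


if __name__ == "__main__":
    for test, runs in RESULTS.items():
        print(f"{test:40s} best={best_run(runs)}  spread={spread_percent(runs):.2f}%")
```

A spread that is small compared with the reported standard error is probably run-to-run noise rather than a real difference between runs.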
CacheBench
Test: Read
MB/s, More Is Better
SE +/- 14.16, N = 3

  f: 10954.50 (MIN: 10928.99 / MAX: 10978.8)
  a: 10914.54 (MIN: 10859.9 / MAX: 10971.13)
  c: 10863.42 (MIN: 10840.77 / MAX: 10909.02)
  b: 10857.24 (MIN: 10840.61 / MAX: 10931.91)

1. (CC) gcc options: -O3 -lrt
CacheBench
Test: Write
MB/s, More Is Better
SE +/- 64.28, N = 3

  b: 64184.07 (MIN: 59624.93 / MAX: 65076.13)
  f: 64095.09 (MIN: 58078.31 / MAX: 65198.73)
  a: 63817.97 (MIN: 59125.08 / MAX: 64846.48)
  c: 63783.85 (MIN: 58980.46 / MAX: 64725.85)

1. (CC) gcc options: -O3 -lrt
CacheBench
Test: Read / Modify / Write
MB/s, More Is Better
SE +/- 213.45, N = 3

  f: 117248.30 (MIN: 96453.27 / MAX: 127988.94)
  b: 117095.82 (MIN: 98368.14 / MAX: 127792.96)
  c: 116206.00 (MIN: 97578.52 / MAX: 126854.22)
  a: 115970.80 (MIN: 96405.85 / MAX: 127016.83)

1. (CC) gcc options: -O3 -lrt
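The "SE +/- ..., N = 3" figures report a standard error over three trials. As a rough illustration, the sketch below uses the conventional definition (sample standard deviation divided by the square root of N). This is an assumption: the exact formula the Phoronix Test Suite uses is not stated in this export. The per-trial values are hypothetical, since the export gives only the aggregate for each run.

```python
import math
import statistics


def standard_error(samples):
    """Conventional standard error of the mean: sample stdev / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(len(samples))


if __name__ == "__main__":
    # HYPOTHETICAL per-trial throughputs in MB/s, NOT taken from the result file.
    trials = [10930.0, 10950.0, 10980.0]
    mean = statistics.fmean(trials)
    se = standard_error(trials)
    print(f"mean = {mean:.2f} MB/s, SE +/- {se:.2f}, N = {len(trials)}")
```

With only three trials the estimate is coarse, so small differences between runs should be read with that in mind.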
Llama.cpp b1808
Model: llama-2-7b.Q4_0.gguf
Tokens Per Second, More Is Better
SE +/- 0.00, N = 3

  c: 14.99
  b: 14.23
  a: 14.23
  f: 14.19

1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
Llama.cpp b1808
Model: llama-2-13b.Q4_0.gguf
Tokens Per Second, More Is Better
SE +/- 0.06, N = 3

  a: 8.21
  c: 8.02
  b: 8.01
  f: 8.00

1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
Llama.cpp b1808
Model: llama-2-70b-chat.Q5_0.gguf
Tokens Per Second, More Is Better
SE +/- 0.00, N = 3

  f: 1.46
  c: 1.45
  b: 1.45
  a: 1.45

1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
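Tokens per second can be easier to judge when converted to a per-token latency or to the time needed for a full response. The sketch below does that arithmetic for the run "a" figures from the Llama.cpp results above. It assumes the reported rate is sustained for the whole response, and the 256-token response length is a hypothetical value chosen only for illustration.

```python
# Tokens per second for run "a", copied from the Llama.cpp results.
TOKENS_PER_SECOND = {
    "llama-2-7b.Q4_0.gguf": 14.23,
    "llama-2-13b.Q4_0.gguf": 8.21,
    "llama-2-70b-chat.Q5_0.gguf": 1.45,
}

# HYPOTHETICAL response length, not part of the benchmark.
RESPONSE_TOKENS = 256


def ms_per_token(tokens_per_second):
    """Average latency per token in milliseconds."""
    return 1000.0 / tokens_per_second


def seconds_for(tokens, tokens_per_second):
    """Time to produce `tokens` tokens at a constant rate."""
    return tokens / tokens_per_second


if __name__ == "__main__":
    for model, tps in TOKENS_PER_SECOND.items():
        print(
            f"{model:30s} {ms_per_token(tps):8.1f} ms/token  "
            f"{seconds_for(RESPONSE_TOKENS, tps):7.1f} s for {RESPONSE_TOKENS} tokens"
        )
```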
Phoronix Test Suite v10.8.5