llama tr

AMD Ryzen Threadripper 3990X 64-Core testing with a Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2401109-PTS-LLAMATR408&rdt&grs.

llama tr - System Details (runs a, b, c, and f share the same configuration)

Processor: AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads)
Motherboard: Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS)
Chipset: AMD Starship/Matisse
Memory: 128GB
Disk: Samsung SSD 970 EVO Plus 500GB
Graphics: AMD Radeon RX 5700 8GB (1750/875MHz)
Audio: AMD Navi 10 HDMI Audio
Monitor: DELL P2415Q
Network: Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 23.10
Kernel: 6.5.0-14-generic (x86_64)
Desktop: GNOME Shell 45.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 23.2.1-1ubuntu3 (LLVM 15.0.7 DRM 3.54)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x830107a

Security Details: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

llama tr - Result Overview

Test                                              a            b            c            f
Llama.cpp: llama-2-7b.Q4_0.gguf (Tokens/s)        14.23        14.23        14.99        14.19
Llama.cpp: llama-2-13b.Q4_0.gguf (Tokens/s)       8.21         8.01         8.02         8.00
Llama.cpp: llama-2-70b-chat.Q5_0.gguf (Tokens/s)  1.45         1.45         1.45         1.46
CacheBench: Read / Modify / Write (MB/s)          115970.80    117095.82    116206.00    117248.30
CacheBench: Read (MB/s)                           10914.54     10857.24     10863.42     10954.50
CacheBench: Write (MB/s)                          63817.97     64184.07     63783.85     64095.09
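
All four runs (a, b, c, f) were executed on the same configuration, so the table above mostly reflects run-to-run variation. The short Python sketch below is not part of the PTS export; it simply re-enters the values from the table and prints the mean and peak-to-peak spread per test.

    from statistics import mean

    # Values copied from the result overview above (runs a, b, c, f).
    results = {
        "Llama.cpp llama-2-7b.Q4_0.gguf (Tokens/s)":       [14.23, 14.23, 14.99, 14.19],
        "Llama.cpp llama-2-13b.Q4_0.gguf (Tokens/s)":      [8.21, 8.01, 8.02, 8.00],
        "Llama.cpp llama-2-70b-chat.Q5_0.gguf (Tokens/s)": [1.45, 1.45, 1.45, 1.46],
        "CacheBench Read / Modify / Write (MB/s)":         [115970.80, 117095.82, 116206.00, 117248.30],
        "CacheBench Read (MB/s)":                          [10914.54, 10857.24, 10863.42, 10954.50],
        "CacheBench Write (MB/s)":                         [63817.97, 64184.07, 63783.85, 64095.09],
    }

    for test, runs in results.items():
        avg = mean(runs)
        spread_pct = (max(runs) - min(runs)) / avg * 100  # peak-to-peak spread across runs
        print(f"{test}: mean {avg:.2f}, spread {spread_pct:.1f}%")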

Llama.cpp

Model: llama-2-7b.Q4_0.gguf

OpenBenchmarking.org - Llama.cpp b1808 - Tokens Per Second, More Is Better
a: 14.23 | b: 14.23 | c: 14.99 | f: 14.19 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
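
For context, the sketch below shows one way to drive a llama.cpp build of this era (b1808 still ships a `main` example binary) against the same GGUF model from Python and take a rough wall-clock reading. The binary path, prompt, token count, and thread count are illustrative assumptions, not the parameters the pts/llama-cpp test profile uses.

    import subprocess
    import time

    # Assumed paths/parameters for illustration only; the actual PTS test
    # profile's invocation is not shown in this export.
    cmd = [
        "./main",                       # llama.cpp b1808-era example binary (assumed path)
        "-m", "llama-2-7b.Q4_0.gguf",   # quantized model file
        "-p", "Hello",                  # prompt
        "-n", "128",                    # tokens to generate
        "-t", "64",                     # threads; the 3990X has 64 physical cores
    ]

    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    elapsed = time.perf_counter() - start

    # Wall-clock only: this includes model load time, so it will read lower
    # than the per-token eval rate reported in the graph above.
    print(f"~{128 / elapsed:.2f} tokens/s including startup overhead")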

Llama.cpp

Model: llama-2-13b.Q4_0.gguf

OpenBenchmarking.org - Llama.cpp b1808 - Tokens Per Second, More Is Better
a: 8.21 | b: 8.01 | c: 8.02 | f: 8.00 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

CacheBench

Test: Read / Modify / Write

OpenBenchmarking.org - CacheBench - MB/s, More Is Better
a: 115970.80 (MIN: 96405.85 / MAX: 127016.83)
b: 117095.82 (MIN: 98368.14 / MAX: 127792.96)
c: 116206.00 (MIN: 97578.52 / MAX: 126854.22)
f: 117248.30 (MIN: 96453.27 / MAX: 127988.94)
SE +/- 213.45, N = 3
1. (CC) gcc options: -O3 -lrt
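
CacheBench itself is a small C program (built here with gcc -O3 -lrt, per the footnote). As a rough illustration of what a read/modify/write bandwidth figure means, the NumPy sketch below touches every element of a buffer (load, increment, store) and divides bytes moved by elapsed time; it is a single-threaded proxy, not the benchmark's actual implementation, and the 64 MB working set is an arbitrary choice.

    import time
    import numpy as np

    buf = np.ones(8 * 1024 * 1024, dtype=np.int64)  # 64 MB working set (illustrative size)
    iterations = 50

    start = time.perf_counter()
    for _ in range(iterations):
        buf += 1  # read each element, modify it, write it back in place
    elapsed = time.perf_counter() - start

    bytes_moved = buf.nbytes * iterations * 2  # one read + one write per element
    print(f"~{bytes_moved / elapsed / 1e6:.0f} MB/s (Python/NumPy proxy, single thread)")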

CacheBench

Test: Read

OpenBenchmarking.org - CacheBench - MB/s, More Is Better
a: 10914.54 (MIN: 10859.90 / MAX: 10971.13)
b: 10857.24 (MIN: 10840.61 / MAX: 10931.91)
c: 10863.42 (MIN: 10840.77 / MAX: 10909.02)
f: 10954.50 (MIN: 10928.99 / MAX: 10978.80)
SE +/- 14.16, N = 3
1. (CC) gcc options: -O3 -lrt

Llama.cpp

Model: llama-2-70b-chat.Q5_0.gguf

OpenBenchmarking.org - Llama.cpp b1808 - Tokens Per Second, More Is Better
a: 1.45 | b: 1.45 | c: 1.45 | f: 1.46 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
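
Averaging the four runs, the three Llama.cpp models land at roughly 14.4, 8.1, and 1.45 tokens per second. The sketch below converts those rates into wall-clock time for a hypothetical 256-token generation; the token count is illustrative only.

    # Approximate per-model averages of runs a, b, c, f from the graphs above.
    rates_tps = {
        "llama-2-7b.Q4_0.gguf": 14.4,
        "llama-2-13b.Q4_0.gguf": 8.1,
        "llama-2-70b-chat.Q5_0.gguf": 1.45,
    }
    n_tokens = 256  # hypothetical generation length

    for model, tps in rates_tps.items():
        print(f"{model}: {tps} tokens/s -> ~{n_tokens / tps:.0f} s for {n_tokens} tokens")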

CacheBench

Test: Write

OpenBenchmarking.org - CacheBench - MB/s, More Is Better
a: 63817.97 (MIN: 59125.08 / MAX: 64846.48)
b: 64184.07 (MIN: 59624.93 / MAX: 65076.13)
c: 63783.85 (MIN: 58980.46 / MAX: 64725.85)
f: 64095.09 (MIN: 58078.31 / MAX: 65198.73)
SE +/- 64.28, N = 3
1. (CC) gcc options: -O3 -lrt


Phoronix Test Suite v10.8.5