llama cache
AMD Ryzen 9 5900HX testing with an ASUS ROG Strix G513QY_G513QY G513QY v1.0 (G513QY.318 BIOS) and ASUS AMD Cezanne 512MB on Ubuntu 22.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2401145-PTS-LLAMACAC05&sor&grt .
System configuration
Runs: a, b, c, d

Processor: AMD Ryzen 9 5900HX @ 3.30GHz (8 Cores / 16 Threads)
Motherboard: ASUS ROG Strix G513QY_G513QY G513QY v1.0 (G513QY.318 BIOS)
Chipset: AMD Renoir/Cezanne
Memory: 2 x 8 GB DDR4-3200MT/s Micron 4ATF1G64HZ-3G2E2
Disk: 512GB SAMSUNG MZVLQ512HBLU-00B00
Graphics: ASUS AMD Cezanne 512MB (2500/1000MHz)
Audio: AMD Navi 21/23
Monitor: LQ156M1JW25
Network: Realtek RTL8111/8168/8411 + MEDIATEK MT7921 802.11ax PCI
OS: Ubuntu 22.10
Kernel: 5.19.0-46-generic (x86_64)
Desktop: GNOME Shell 43.0
Display Server: X Server 1.21.1.4 + Wayland
OpenGL: 4.6 Mesa 22.2.5 (LLVM 15.0.2 DRM 3.47)
Vulkan: 1.3.224
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- Platform Profile: balanced
- CPU Microcode: 0xa50000c
- ACPI Profile: balanced

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected
Results overview

Test                                Unit               a              b              c              d
cachebench: Read                    MB/s               11931.741352   11843.394980   11802.912235   11868.193731
cachebench: Write                   MB/s               66937.480129   66621.687109   66365.528626   66631.740451
cachebench: Read / Modify / Write   MB/s               132182.912278  133023.784670  132942.234510  133690.855615
llama-cpp: llama-2-7b.Q4_0.gguf     Tokens Per Second  8.54           8.51           8.61           8.61
llama-cpp: llama-2-13b.Q4_0.gguf    Tokens Per Second  4.54           4.56           4.59           4.56
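To put the run-to-run differences in the overview into perspective, the following is a minimal Python sketch that computes the relative spread between the best and worst run for each test. The per-run averages are copied from the overview above; the `results` dictionary and the `spread_percent` helper are illustrative names introduced here and are not part of the Phoronix Test Suite.

```python
# Per-run averages copied from the results overview (runs a-d).
# The dictionary layout and helper name are assumptions made for illustration.
results = {
    "cachebench: Read (MB/s)": {
        "a": 11931.741352, "b": 11843.394980, "c": 11802.912235, "d": 11868.193731,
    },
    "cachebench: Write (MB/s)": {
        "a": 66937.480129, "b": 66621.687109, "c": 66365.528626, "d": 66631.740451,
    },
    "cachebench: Read / Modify / Write (MB/s)": {
        "a": 132182.912278, "b": 133023.784670, "c": 132942.234510, "d": 133690.855615,
    },
    "llama-cpp: llama-2-7b.Q4_0.gguf (tokens/s)": {
        "a": 8.54, "b": 8.51, "c": 8.61, "d": 8.61,
    },
    "llama-cpp: llama-2-13b.Q4_0.gguf (tokens/s)": {
        "a": 4.54, "b": 4.56, "c": 4.59, "d": 4.56,
    },
}


def spread_percent(runs):
    """Return (best_run, worst_run, spread), with spread = (max - min) / min * 100.

    All tests here are "more is better", so the best run is the maximum.
    On ties, the first run in dictionary order is reported.
    """
    best = max(runs, key=runs.get)
    worst = min(runs, key=runs.get)
    return best, worst, (runs[best] - runs[worst]) / runs[worst] * 100.0


for test, runs in results.items():
    best, worst, pct = spread_percent(runs)
    print(f"{test}: best={best}, worst={worst}, spread={pct:.2f}%")
```

With these values, each test's spread is close to 1%, so the four runs are better read as repeated measurements of one configuration than as a ranking.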
CacheBench
Test: Read
MB/s, More Is Better

Run  Result    MIN       MAX
a    11931.74  11928.97  11933.85
d    11868.19  11791.66  11933.58
b    11843.39  11799.82  11871.22
c    11802.91  11762.79  11815.7

SE +/- 30.51, N = 3; SE +/- 20.21, N = 3; SE +/- 2.71, N = 3
1. (CC) gcc options: -O3 -lrt
CacheBench
Test: Write
MB/s, More Is Better

Run  Result    MIN       MAX
a    66937.48  59248.05  70414.59
d    66631.74  49747.62  70825.55
b    66621.69  58620.28  70336.15
c    66365.53  53306.3   70073.27

SE +/- 439.92, N = 3; SE +/- 148.41, N = 3; SE +/- 247.88, N = 3
1. (CC) gcc options: -O3 -lrt
CacheBench
Test: Read / Modify / Write
MB/s, More Is Better

Run  Result     MIN        MAX
d    133690.86  114809.76  141022.75
b    133023.78  113712.69  140256.3
c    132942.23  114171.26  140254.11
a    132182.91  113456.59  139428.89

SE +/- 124.56, N = 3; SE +/- 152.56, N = 3; SE +/- 146.48, N = 3
1. (CC) gcc options: -O3 -lrt
Llama.cpp b1808
Model: llama-2-7b.Q4_0.gguf
Tokens Per Second, More Is Better

Run  Result
d    8.61
c    8.61
a    8.54
b    8.51

SE +/- 0.00, N = 3; SE +/- 0.01, N = 3; SE +/- 0.00, N = 3
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
Llama.cpp b1808
Model: llama-2-13b.Q4_0.gguf
Tokens Per Second, More Is Better

Run  Result
c    4.59
d    4.56
b    4.56
a    4.54

SE +/- 0.03, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
Phoronix Test Suite v10.8.5