llama cache

AMD Ryzen 9 5900HX testing with an ASUS ROG Strix G513QY_G513QY G513QY v1.0 (G513QY.318 BIOS) and ASUS AMD Cezanne 512MB on Ubuntu 22.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2401145-PTS-LLAMACAC05.

System details

Identifiers: a, b, c, d (one set of system details listed for all four)

Processor: AMD Ryzen 9 5900HX @ 3.30GHz (8 Cores / 16 Threads)
Motherboard: ASUS ROG Strix G513QY_G513QY G513QY v1.0 (G513QY.318 BIOS)
Chipset: AMD Renoir/Cezanne
Memory: 2 x 8 GB DDR4-3200MT/s Micron 4ATF1G64HZ-3G2E2
Disk: 512GB SAMSUNG MZVLQ512HBLU-00B00
Graphics: ASUS AMD Cezanne 512MB (2500/1000MHz)
Audio: AMD Navi 21/23
Monitor: LQ156M1JW25
Network: Realtek RTL8111/8168/8411 + MEDIATEK MT7921 802.11ax PCI
OS: Ubuntu 22.10
Kernel: 5.19.0-46-generic (x86_64)
Desktop: GNOME Shell 43.0
Display Server: X Server 1.21.1.4 + Wayland
OpenGL: 4.6 Mesa 22.2.5 (LLVM 15.0.2 DRM 3.47)
Vulkan: 1.3.224
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- Platform Profile: balanced
- CPU Microcode: 0xa50000c
- ACPI Profile: balanced

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

Results overview

Test                                a              b              c              d
cachebench: Read                    11931.741352   11843.394980   11802.912235   11868.193731
cachebench: Write                   66937.480129   66621.687109   66365.528626   66631.740451
cachebench: Read / Modify / Write   132182.912278  133023.784670  132942.234510  133690.855615
llama-cpp: llama-2-7b.Q4_0.gguf     8.54           8.51           8.61           8.61
llama-cpp: llama-2-13b.Q4_0.gguf    4.54           4.56           4.59           4.56
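To put the four runs side by side, the following is a minimal Python sketch that computes the mean and the run-to-run spread for each test, using the values from the results overview. The `results` dictionary and the `spread_percent` helper are names introduced here for illustration only; they are not part of the Phoronix Test Suite. Spread is defined in this sketch as (max - min) divided by the mean, which is one of several reasonable choices.

```python
from statistics import mean

# Values copied from the results overview, in run order a, b, c, d.
# The dictionary layout is an assumption made for this sketch.
results = {
    "cachebench: Read (MB/s)": [11931.741352, 11843.394980, 11802.912235, 11868.193731],
    "cachebench: Write (MB/s)": [66937.480129, 66621.687109, 66365.528626, 66631.740451],
    "cachebench: Read / Modify / Write (MB/s)": [132182.912278, 133023.784670, 132942.234510, 133690.855615],
    "llama-cpp: llama-2-7b.Q4_0.gguf (tokens/s)": [8.54, 8.51, 8.61, 8.61],
    "llama-cpp: llama-2-13b.Q4_0.gguf (tokens/s)": [4.54, 4.56, 4.59, 4.56],
}


def spread_percent(values):
    """Return (max - min) as a percentage of the mean (hypothetical helper)."""
    return (max(values) - min(values)) / mean(values) * 100.0


for name, values in results.items():
    print(f"{name}: mean={mean(values):.2f}, spread={spread_percent(values):.2f}%")
```

A small spread relative to the mean suggests the four runs are effectively repeats of the same configuration rather than distinct setups.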

CacheBench

Test: Read

MB/s, More Is Better

Run   Result     Min        Max
a     11931.74   11928.97   11933.85
b     11843.39   11799.82   11871.22
c     11802.91   11762.79   11815.7
d     11868.19   11791.66   11933.58

SE +/- 20.21, 2.71, 30.51 (N = 3; three values listed for four runs)
1. (CC) gcc options: -O3 -lrt

CacheBench

Test: Write

MB/s, More Is Better

Run   Result     Min        Max
a     66937.48   59248.05   70414.59
b     66621.69   58620.28   70336.15
c     66365.53   53306.3    70073.27
d     66631.74   49747.62   70825.55

SE +/- 148.41, 247.88, 439.92 (N = 3; three values listed for four runs)
1. (CC) gcc options: -O3 -lrt

CacheBench

Test: Read / Modify / Write

MB/s, More Is Better

Run   Result      Min         Max
a     132182.91   113456.59   139428.89
b     133023.78   113712.69   140256.3
c     132942.23   114171.26   140254.11
d     133690.86   114809.76   141022.75

SE +/- 152.56, 146.48, 124.56 (N = 3; three values listed for four runs)
1. (CC) gcc options: -O3 -lrt

Llama.cpp

Model: llama-2-7b.Q4_0.gguf

Llama.cpp b1808, Tokens Per Second, More Is Better

Run   Result
a     8.54
b     8.51
c     8.61
d     8.61

SE +/- 0.00, 0.01, 0.00 (N = 3; three values listed for four runs)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

Llama.cpp

Model: llama-2-13b.Q4_0.gguf

Llama.cpp b1808, Tokens Per Second, More Is Better

Run   Result
a     4.54
b     4.56
c     4.59
d     4.56

SE +/- 0.00, 0.03, 0.00 (N = 3; three values listed for four runs)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
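Tokens per second can be easier to reason about as time per token. The sketch below converts the Llama.cpp results for runs a to d into milliseconds per token; the `tokens_per_second` dictionary and the `ms_per_token` function are hypothetical names used only for this illustration.

```python
# Tokens per second for runs a, b, c, d, copied from the Llama.cpp results.
# The dictionary layout is an assumption made for this sketch.
tokens_per_second = {
    "llama-2-7b.Q4_0.gguf": [8.54, 8.51, 8.61, 8.61],
    "llama-2-13b.Q4_0.gguf": [4.54, 4.56, 4.59, 4.56],
}


def ms_per_token(tps):
    """Convert a tokens-per-second rate into milliseconds per token."""
    return 1000.0 / tps


for model, runs in tokens_per_second.items():
    latencies = [f"{ms_per_token(tps):.1f} ms" for tps in runs]
    print(model, latencies)
```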


Phoronix Test Suite v10.8.4