llama

Intel Core i9-14900K testing with a ASUS PRIME Z790-P WIFI (1402 BIOS) and Intel Arc A770 DG2 16GB on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2401115-PTS-LLAMA41637.

llamaProcessorMotherboardChipsetMemoryDiskGraphicsAudioMonitorOSKernelDesktopDisplay ServerOpenGLOpenCLCompilerFile-SystemScreen ResolutionavcdIntel Core i9-14900K @ 5.70GHz (24 Cores / 32 Threads)ASUS PRIME Z790-P WIFI (1402 BIOS)Intel Device 7a272 x 16 GB DRAM-6000MT/s Corsair CMK32GX5M2B6000C36Western Digital WD_BLACK SN850X 1000GBIntel Arc A770 DG2 16GB (2400MHz)Realtek ALC897ASUS VP28UUbuntu 23.106.7.0-060700rc7daily20231224-generic (x86_64)GNOME Shell 45.1X Server 1.21.1.74.6 Mesa 24.0~git2312240600.c05261~oibaf~m (git-c05261a 2023-12-24 mantic-oibaf-ppa)OpenCL 3.0GCC 13.2.0ext41920x1080OpenBenchmarking.orgKernel Details- Transparent Huge Pages: madviseCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: intel_pstate powersave (EPP: performance) - CPU Microcode: 0x11d - Thermald 2.5.4Security Details- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected

llamacachebench: Readcachebench: Writecachebench: Read / Modify / Writellama-cpp: llama-2-7b.Q4_0.ggufllama-cpp: llama-2-13b.Q4_0.ggufavcd22912.673526161363.477243176235.84600116.388.7522911.469477161285.203466166291.19254816.448.7522594.465007161299.013930175404.26475716.608.7722913.708604161353.744637175929.32131516.508.76OpenBenchmarking.org

CacheBench

Test: Read

OpenBenchmarking.orgMB/s, More Is BetterCacheBenchTest: Readavcd5K10K15K20K25KSE +/- 0.19, N = 3SE +/- 1.01, N = 3SE +/- 318.35, N = 12SE +/- 0.19, N = 322912.6722911.4722594.4722913.71MIN: 22908.26 / MAX: 22913.68MIN: 22878.67 / MAX: 22914.42MIN: 19091.54 / MAX: 22915.61MIN: 22911.85 / MAX: 22915.861. (CC) gcc options: -O3 -lrt

CacheBench

Test: Write

OpenBenchmarking.orgMB/s, More Is BetterCacheBenchTest: Writeavcd30K60K90K120K150KSE +/- 125.82, N = 3SE +/- 51.54, N = 3SE +/- 56.50, N = 3SE +/- 71.53, N = 3161363.48161285.20161299.01161353.74MIN: 86690.74 / MAX: 183302.05MIN: 86701.91 / MAX: 183314.99MIN: 86743.71 / MAX: 183256.07MIN: 86730.26 / MAX: 183303.091. (CC) gcc options: -O3 -lrt

CacheBench

Test: Read / Modify / Write

OpenBenchmarking.orgMB/s, More Is BetterCacheBenchTest: Read / Modify / Writeavcd40K80K120K160K200KSE +/- 186.19, N = 3SE +/- 4161.43, N = 12SE +/- 718.69, N = 3SE +/- 161.29, N = 3176235.85166291.19175404.26175929.32MIN: 136248.93 / MAX: 201568.94MIN: 111634.31 / MAX: 201391.75MIN: 128368.83 / MAX: 201388.28MIN: 133100.03 / MAX: 201497.561. (CC) gcc options: -O3 -lrt

Llama.cpp

Model: llama-2-7b.Q4_0.gguf

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b1808Model: llama-2-7b.Q4_0.ggufavcd48121620SE +/- 0.03, N = 3SE +/- 0.01, N = 3SE +/- 0.13, N = 3SE +/- 0.05, N = 316.3816.4416.6016.501. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

Llama.cpp

Model: llama-2-13b.Q4_0.gguf

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b1808Model: llama-2-13b.Q4_0.ggufavcd246810SE +/- 0.01, N = 3SE +/- 0.00, N = 3SE +/- 0.01, N = 3SE +/- 0.02, N = 38.758.758.778.761. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas


Phoronix Test Suite v10.8.4