lll

AMD EPYC 8324P 32-Core testing with an AMD Cinnabar (RCB1009C BIOS) motherboard and ASPEED graphics on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2401278-NE-LLL53839360&grs.

System details (identical for runs a, b and c):

  Processor:         AMD EPYC 8324P 32-Core @ 2.65GHz (32 Cores / 64 Threads)
  Motherboard:       AMD Cinnabar (RCB1009C BIOS)
  Chipset:           AMD Device 14a4
  Memory:            6 x 32 GB DRAM-4800MT/s Samsung M321R4GA0BB0-CQKMG
  Disk:              3201GB Micron_7450_MTFDKCB3T2TFS
  Graphics:          ASPEED
  Network:           2 x Broadcom NetXtreme BCM5720 PCIe
  OS:                Ubuntu 23.10
  Kernel:            6.5.0-5-generic (x86_64)
  Desktop:           GNOME Shell
  Display Server:    X Server 1.21.1.7
  Compiler:          GCC 13.2.0
  File-System:       ext4
  Screen Resolution: 640x480

Kernel Details
  - Transparent Huge Pages: madvise

Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
  - CPU Microcode: 0xaa00212

Security Details
  - gather_data_sampling: Not affected
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - mmio_stale_data: Not affected
  - retbleed: Not affected
  - spec_rstack_overflow: Mitigation of safe RET
  - spec_store_bypass: Mitigation of SSB disabled via prctl
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
  - srbds: Not affected
  - tsx_async_abort: Not affected

Result overview:

Test                                                a             b             c
llama-cpp: llama-2-13b.Q4_0.gguf                    17.97         17.95         18.31
llama-cpp: llama-2-7b.Q4_0.gguf                     29.58         30.03         29.85
llamafile: llava-v1.5-7b-q4 - CPU                   23.54         23.57         23.76
llamafile: wizardcoder-python-34b-v1.0.Q6_K - CPU   5.5           5.53          5.51
llama-cpp: llama-2-70b-chat.Q5_0.gguf               3.42          3.43          3.43
compress-lz4: 1 - Compression Speed                 565.03        564.21        565.42
compress-lz4: 9 - Compression Speed                 28.35         28.34         28.29
compress-lz4: 3 - Compression Speed                 88.2          88.23         88.09
cachebench: Read / Modify / Write                   87264.314442  87154.946426  87247.993005
compress-lz4: 9 - Decompression Speed               3471.6        3467.4        3468.1
llamafile: mistral-7b-instruct-v0.2.Q8_0 - CPU      14.75         14.76         14.75
compress-lz4: 1 - Decompression Speed               3650.1        3649.8        3647.9
compress-lz4: 3 - Decompression Speed               3306.9        3307.6        3308.6
cachebench: Read                                    7612.835004   7615.129372   7615.214808
cachebench: Write                                   45638.468414  45639.681202  45639.009333
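Because runs a, b and c use the same system, the differences between them mostly indicate run-to-run variation. As a rough illustration, the following Python sketch computes the spread (max minus min, relative to the mean) for a few of the results in the overview above. The function name `spread_percent` and the choice of spread metric are assumptions made for this example, not part of the Phoronix Test Suite.

```python
# Values copied from the result overview (runs a, b, c).
# Only a subset of the tests is shown here for brevity.
results = {
    "llama-cpp: llama-2-13b.Q4_0.gguf": (17.97, 17.95, 18.31),
    "llama-cpp: llama-2-7b.Q4_0.gguf": (29.58, 30.03, 29.85),
    "compress-lz4: 1 - Compression Speed": (565.03, 564.21, 565.42),
}


def spread_percent(values):
    """Return (max - min) / mean as a percentage (hypothetical helper)."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean * 100.0


for name, values in results.items():
    print(f"{name}: {spread_percent(values):.2f}% spread across runs")
```

A small spread suggests the three runs are effectively equivalent for that test; the per-test standard errors reported below are the more rigorous measure.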

Llama.cpp

Model: llama-2-13b.Q4_0.gguf

Llama.cpp b1808 (Tokens Per Second, More Is Better)
  a: 17.97
  b: 17.95
  c: 18.31
Reported standard errors: SE +/- 0.04, N = 3; SE +/- 0.17, N = 3
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

Llama.cpp

Model: llama-2-7b.Q4_0.gguf

Llama.cpp b1808 (Tokens Per Second, More Is Better)
  a: 29.58
  b: 30.03
  c: 29.85
Reported standard errors: SE +/- 0.22, N = 12; SE +/- 0.12, N = 3
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

Llamafile

Test: llava-v1.5-7b-q4 - Acceleration: CPU

Llamafile 0.6 (Tokens Per Second, More Is Better)
  a: 23.54
  b: 23.57
  c: 23.76
Reported standard errors: SE +/- 0.02, N = 3; SE +/- 0.06, N = 3

Llamafile

Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU

Llamafile 0.6 (Tokens Per Second, More Is Better)
  a: 5.50
  b: 5.53
  c: 5.51
Reported standard errors: SE +/- 0.01, N = 3; SE +/- 0.01, N = 3

Llama.cpp

Model: llama-2-70b-chat.Q5_0.gguf

Llama.cpp b1808 (Tokens Per Second, More Is Better)
  a: 3.42
  b: 3.43
  c: 3.43
Reported standard errors: SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

LZ4 Compression

Compression Level: 1 - Compression Speed

LZ4 Compression 1.9.4 (MB/s, More Is Better)
  a: 565.03
  b: 564.21
  c: 565.42
Reported standard errors: SE +/- 0.34, N = 3; SE +/- 0.78, N = 3
1. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 9 - Compression Speed

LZ4 Compression 1.9.4 (MB/s, More Is Better)
  a: 28.35
  b: 28.34
  c: 28.29
Reported standard errors: SE +/- 0.01, N = 3; SE +/- 0.03, N = 3
1. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 3 - Compression Speed

LZ4 Compression 1.9.4 (MB/s, More Is Better)
  a: 88.20
  b: 88.23
  c: 88.09
Reported standard errors: SE +/- 0.01, N = 3; SE +/- 0.10, N = 3
1. (CC) gcc options: -O3

CacheBench

Test: Read / Modify / Write

CacheBench (MB/s, More Is Better)
  a: 87264.31 (MIN: 65718.12 / MAX: 90689.42)
  b: 87154.95 (MIN: 65719.2 / MAX: 90700.03)
  c: 87247.99 (MIN: 65710.28 / MAX: 90697.68)
Reported standard errors: SE +/- 87.15, N = 3; SE +/- 10.58, N = 3
1. (CC) gcc options: -O3 -lrt

LZ4 Compression

Compression Level: 9 - Decompression Speed

LZ4 Compression 1.9.4 (MB/s, More Is Better)
  a: 3471.6
  b: 3467.4
  c: 3468.1
Reported standard errors: SE +/- 0.52, N = 3; SE +/- 0.17, N = 3
1. (CC) gcc options: -O3

Llamafile

Test: mistral-7b-instruct-v0.2.Q8_0 - Acceleration: CPU

Llamafile 0.6 (Tokens Per Second, More Is Better)
  a: 14.75
  b: 14.76
  c: 14.75
Reported standard errors: SE +/- 0.08, N = 3; SE +/- 0.09, N = 3

LZ4 Compression

Compression Level: 1 - Decompression Speed

LZ4 Compression 1.9.4 (MB/s, More Is Better)
  a: 3650.1
  b: 3649.8
  c: 3647.9
Reported standard errors: SE +/- 0.58, N = 3; SE +/- 0.41, N = 3
1. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 3 - Decompression Speed

LZ4 Compression 1.9.4 (MB/s, More Is Better)
  a: 3306.9
  b: 3307.6
  c: 3308.6
Reported standard errors: SE +/- 0.71, N = 3; SE +/- 0.62, N = 3
1. (CC) gcc options: -O3

CacheBench

Test: Read

CacheBench (MB/s, More Is Better)
  a: 7612.84 (MIN: 7612.38 / MAX: 7613.67)
  b: 7615.13 (MIN: 7614.3 / MAX: 7615.59)
  c: 7615.21 (MIN: 7612.23 / MAX: 7615.74)
Reported standard errors: SE +/- 0.12, N = 3; SE +/- 0.07, N = 3
1. (CC) gcc options: -O3 -lrt

CacheBench

Test: Write

CacheBench (MB/s, More Is Better)
  a: 45638.47 (MIN: 45475.7 / MAX: 45690.04)
  b: 45639.68 (MIN: 45474.92 / MAX: 45693.28)
  c: 45639.01 (MIN: 45474.02 / MAX: 45692.39)
Reported standard errors: SE +/- 0.89, N = 3; SE +/- 0.76, N = 3
1. (CC) gcc options: -O3 -lrt


Phoronix Test Suite v10.8.5