old epyc ai: tests for a future article. AMD EPYC 7551 32-Core testing with a GIGABYTE MZ31-AR0-00 v01010101 (F10 BIOS) and ASPEED graphics on Debian 12 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2406025-NE-OLDEPYCAI16&grr&sro.
old epyc ai - System Configuration (identical for runs a, b, c)

Processor: AMD EPYC 7551 32-Core @ 2.00GHz (32 Cores / 64 Threads)
Motherboard: GIGABYTE MZ31-AR0-00 v01010101 (F10 BIOS)
Chipset: AMD 17h
Memory: 8 x 4 GB DDR4-2133MT/s 9ASF51272PZ-2G6E1
Disk: Samsung SSD 960 EVO 500GB + 31GB SanDisk 3.2Gen1
Graphics: ASPEED
Network: Realtek RTL8111/8168/8411 + 2 x Broadcom NetXtreme II BCM57810 10
OS: Debian 12
Kernel: 6.1.0-10-amd64 (x86_64)
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 1024x768

Kernel Details: Transparent Huge Pages: always
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled); CPU Microcode: 0x8001227
Security Details:
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  mmio_stale_data: Not affected
  retbleed: Mitigation of untrained return thunk; SMT vulnerable
  spec_store_bypass: Mitigation of SSB disabled via prctl
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Retpolines; IBPB: conditional; STIBP: disabled; RSB filling; PBRSB-eIBRS: Not affected
  srbds: Not affected
  tsx_async_abort: Not affected
old epyc ai - Result Summary ("-" = no result recorded for that run)

Test                                                                Unit                 a           b           c
whisper-cpp: ggml-medium.en - 2016 State of the Union               Seconds (lower=better)   3286.60625  -           -
whisper-cpp: ggml-small.en - 2016 State of the Union                Seconds (lower=better)   1415.45588  1433.90888  -
whisper-cpp: ggml-base.en - 2016 State of the Union                 Seconds (lower=better)   359.631     445.01803   -
llamafile: wizardcoder-python-34b-v1.0.Q6_K - CPU                   Tokens/s (higher=better) 1.11        -           1.27
llama-cpp: Meta-Llama-3-8B-Instruct-Q8_0.gguf                       Tokens/s (higher=better) 1.74        -           2.51
llamafile: mistral-7b-instruct-v0.2.Q5_K_M - CPU                    Tokens/s (higher=better) 5.75        -           5.3
llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - CPU                      Tokens/s (higher=better) 12.71       -           12.81
llamafile: llava-v1.6-mistral-7b.Q8_0 - CPU                         Tokens/s (higher=better) -           -           -
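As a quick sanity check on the summary table, the relative performance of each test's second run against run a can be computed directly from the recorded values. The sketch below hard-codes the numbers from the table above (run b for the two-run Whisper.cpp tests, run c for the token-throughput tests); the test labels are shortened here for readability and are not the official profile identifiers.

```python
# Relative performance of the second recorded run versus run a, using the
# values from the summary table above.  Whisper.cpp reports seconds
# (lower is better); the LLM tests report tokens/s (higher is better).
results = [
    # (test, run-a value, second-run value, higher_is_better)
    ("whisper-cpp ggml-small.en (s, run b)",    1415.45588, 1433.90888, False),
    ("whisper-cpp ggml-base.en (s, run b)",     359.631,    445.01803,  False),
    ("llamafile wizardcoder Q6_K (t/s, run c)", 1.11,       1.27,       True),
    ("llama-cpp Llama-3-8B Q8_0 (t/s, run c)",  1.74,       2.51,       True),
    ("llamafile mistral-7b Q5_K_M (t/s, run c)", 5.75,      5.30,       True),
    ("llamafile TinyLlama BF16 (t/s, run c)",   12.71,      12.81,      True),
]
for name, a, second, higher in results:
    # Normalize so that >1.0 always means the second run was faster.
    ratio = second / a if higher else a / second
    print(f"{name}: {ratio:.2f}x run a")
```

Run b tracks run a closely on Whisper.cpp small but regresses noticeably on base, while run c's largest gain over run a is on the llama.cpp Llama-3-8B test.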
Whisper.cpp 1.6.2 - Model: ggml-medium.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
  a: 3286.61
  1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2
Whisper.cpp 1.6.2 - Model: ggml-small.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
  a: 1415.46
  b: 1433.91
  1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2
Whisper.cpp 1.6.2 - Model: ggml-base.en - Input: 2016 State of the Union (Seconds, Fewer Is Better)
  a: 359.63
  b: 445.02
  1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2
Llamafile 0.8.6 - Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU (Tokens Per Second, More Is Better)
  a: 1.11
  c: 1.27
Llama.cpp b3067 - Model: Meta-Llama-3-8B-Instruct-Q8_0.gguf (Tokens Per Second, More Is Better)
  a: 1.74
  c: 2.51
  1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas
Llamafile 0.8.6 - Test: mistral-7b-instruct-v0.2.Q5_K_M - Acceleration: CPU (Tokens Per Second, More Is Better)
  a: 5.75
  c: 5.30
Llamafile 0.8.6 - Test: TinyLlama-1.1B-Chat-v1.0.BF16 - Acceleration: CPU (Tokens Per Second, More Is Better)
  a: 12.71
  c: 12.81
Phoronix Test Suite v10.8.5