new tests 9655 Dec

Benchmarks for a future article. AMD EPYC 9655P 96-Core testing with a Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS) and ASPEED graphics on Ubuntu 24.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2412134-NE-NEWTESTS903&sor&grr .
new tests 9655 Dec (runs: a, b, c)

  Processor:          AMD EPYC 9655P 96-Core @ 2.60GHz (96 Cores / 192 Threads)
  Motherboard:        Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS)
  Chipset:            AMD 1Ah
  Memory:             12 x 64GB DDR5-6000MT/s Micron MTC40F2046S1RC64BDY QSFF
  Disk:               3201GB Micron_7450_MTFDKCB3T2TFS
  Graphics:           ASPEED
  Network:            2 x Broadcom NetXtreme BCM5720 PCIe
  OS:                 Ubuntu 24.10
  Kernel:             6.13.0-rc1-phx (x86_64)
  Desktop:            GNOME Shell 47.0
  Display Server:     X Server
  Compiler:           GCC 14.2.0
  File-System:        ext4
  Screen Resolution:  1024x768

Kernel Details
  - Transparent Huge Pages: madvise

Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  - Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
  - CPU Microcode: 0xb002116

Security Details
  - gather_data_sampling: Not affected
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - mmio_stale_data: Not affected
  - reg_file_data_sampling: Not affected
  - retbleed: Not affected
  - spec_rstack_overflow: Not affected
  - spec_store_bypass: Mitigation of SSB disabled via prctl
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
  - srbds: Not affected
  - tsx_async_abort: Not affected
new tests 9655 Dec (result overview)

Test                                                               a         b         c
relion: Basic - CPU                                                137.746   150.359   151.654
srsran: PUSCH Processor Benchmark, Throughput Total                8552.1    8590.2    8537.4
vvenc: Bosphorus 4K - Fast                                         10.653    10.921    10.908
llamafile: wizardcoder-python-34b-v1.0.Q6_K - Text Generation 128  14.43     14.4      14.33
llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 128   52.75     55.07     55.16
srsran: PDSCH Processor Benchmark, Throughput Total                60763.5   59350.2   59987.8
vvenc: Bosphorus 4K - Faster                                       23.259    23.406    23.429
llamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 128        80.69     86.5      85.05
vvenc: Bosphorus 1080p - Fast                                      29.577    29.913    29.879
srsran: PUSCH Processor Benchmark, Throughput Thread               317.1     317.1     158.7
llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 128     122.87    125.5     126.93
x265: Bosphorus 4K                                                 41.46     41.47     41.13
vvenc: Bosphorus 1080p - Faster                                    62.913    62.999    63.151
llamafile: wizardcoder-python-34b-v1.0.Q6_K - Text Generation 16   5.91      14.31     14.28
srsran: PDSCH Processor Benchmark, Throughput Thread               2424.2    2428.9    2449.8
x265: Bosphorus 1080p                                              118.41    118.69    119.56
llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 16    38.72     54.57     54.49
llamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 16         65.15     86.02     79.73
llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 16      86.83     122.08    119.14
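
The overview above lists three runs (a, b, c) per test, and a few rows differ noticeably between runs. As a rough illustration only, the sketch below computes the run-to-run spread for four rows copied from the overview. The helper name spread_percent, the shortened row labels, and the choice of (max - min) / max as the spread measure are assumptions made for this example; they are not part of the Phoronix Test Suite output.

```python
# Run-to-run spread for a few rows of the result overview.
# Values are copied from the overview; row labels are shortened here.
# spread_percent is a hypothetical helper, not a Phoronix Test Suite API.
results = {
    "relion: Basic - CPU": (137.746, 150.359, 151.654),
    "srsran: PUSCH, Throughput Thread": (317.1, 317.1, 158.7),
    "llamafile: wizardcoder-python-34b Q6_K - Text Generation 16": (5.91, 14.31, 14.28),
    "x265: Bosphorus 4K": (41.46, 41.47, 41.13),
}


def spread_percent(values):
    """Return (max - min) / max of the runs, as a percentage."""
    lowest, highest = min(values), max(values)
    return (highest - lowest) / highest * 100.0


for name, runs in results.items():
    print(f"{name}: {spread_percent(runs):.1f}% spread across runs a, b, c")
```

A large spread only flags a row worth re-checking; it does not say which run is the outlier or why.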
RELION 5.0 - Test: Basic - Device: CPU
Seconds, Fewer Is Better
    a: 137.75    b: 150.36    c: 151.65
    1. (CXX) g++ options: -fPIC -std=c++14 -fopenmp -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -ljpeg -lmpi_cxx -lmpi

srsRAN Project 24.10 - Test: PUSCH Processor Benchmark, Throughput Total
Mbps, More Is Better
    b: 8590.2    a: 8552.1    c: 8537.4
    1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl

VVenC 1.13 - Video Input: Bosphorus 4K - Video Preset: Fast
Frames Per Second, More Is Better
    b: 10.92    c: 10.91    a: 10.65
    1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

Llamafile 0.8.16 - Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 128
Tokens Per Second, More Is Better
    a: 14.43    b: 14.40    c: 14.33

Llamafile 0.8.16 - Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 128
Tokens Per Second, More Is Better
    c: 55.16    b: 55.07    a: 52.75

srsRAN Project 24.10 - Test: PDSCH Processor Benchmark, Throughput Total
Mbps, More Is Better
    a: 60763.5    c: 59987.8    b: 59350.2
    1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl

VVenC 1.13 - Video Input: Bosphorus 4K - Video Preset: Faster
Frames Per Second, More Is Better
    c: 23.43    b: 23.41    a: 23.26
    1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

Llamafile 0.8.16 - Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 128
Tokens Per Second, More Is Better
    b: 86.50    c: 85.05    a: 80.69

VVenC 1.13 - Video Input: Bosphorus 1080p - Video Preset: Fast
Frames Per Second, More Is Better
    b: 29.91    c: 29.88    a: 29.58
    1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

srsRAN Project 24.10 - Test: PUSCH Processor Benchmark, Throughput Thread
Mbps, More Is Better
    b: 317.1    a: 317.1    c: 158.7
    1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl

Llamafile 0.8.16 - Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 128
Tokens Per Second, More Is Better
    c: 126.93    b: 125.50    a: 122.87

x265 4.1 - Video Input: Bosphorus 4K
Frames Per Second, More Is Better
    b: 41.47    a: 41.46    c: 41.13
    1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

VVenC 1.13 - Video Input: Bosphorus 1080p - Video Preset: Faster
Frames Per Second, More Is Better
    c: 63.15    b: 63.00    a: 62.91
    1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

Llamafile 0.8.16 - Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 16
Tokens Per Second, More Is Better
    b: 14.31    c: 14.28    a: 5.91

srsRAN Project 24.10 - Test: PDSCH Processor Benchmark, Throughput Thread
Mbps, More Is Better
    c: 2449.8    b: 2428.9    a: 2424.2
    1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl

x265 4.1 - Video Input: Bosphorus 1080p
Frames Per Second, More Is Better
    c: 119.56    b: 118.69    a: 118.41
    1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

Llamafile 0.8.16 - Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 16
Tokens Per Second, More Is Better
    b: 54.57    c: 54.49    a: 38.72

Llamafile 0.8.16 - Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 16
Tokens Per Second, More Is Better
    b: 86.02    c: 79.73    a: 65.15

Llamafile 0.8.16 - Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 16
Tokens Per Second, More Is Better
    b: 122.08    c: 119.14    a: 86.83

Phoronix Test Suite v10.8.5