new tests 9655 Dec Benchmarks for a future article. AMD EPYC 9655P 96-Core testing with a Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS) and ASPEED on Ubuntu 24.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2412134-NE-NEWTESTS903 .
new tests 9655 Dec Processor Motherboard Chipset Memory Disk Graphics Network OS Kernel Desktop Display Server Compiler File-System Screen Resolution a b c AMD EPYC 9655P 96-Core @ 2.60GHz (96 Cores / 192 Threads) Supermicro Super Server H13SSL-N v1.01 (3.0 BIOS) AMD 1Ah 12 x 64GB DDR5-6000MT/s Micron MTC40F2046S1RC64BDY QSFF 3201GB Micron_7450_MTFDKCB3T2TFS ASPEED 2 x Broadcom NetXtreme BCM5720 PCIe Ubuntu 24.10 6.13.0-rc1-phx (x86_64) GNOME Shell 47.0 X Server GCC 14.2.0 ext4 1024x768 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xb002116 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
new tests 9655 Dec relion: Basic - CPU srsran: PDSCH Processor Benchmark, Throughput Total srsran: PUSCH Processor Benchmark, Throughput Total srsran: PDSCH Processor Benchmark, Throughput Thread srsran: PUSCH Processor Benchmark, Throughput Thread vvenc: Bosphorus 4K - Fast vvenc: Bosphorus 4K - Faster vvenc: Bosphorus 1080p - Fast vvenc: Bosphorus 1080p - Faster x265: Bosphorus 4K x265: Bosphorus 1080p llamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 16 llamafile: Llama-3.2-3B-Instruct.Q6_K - Text Generation 128 llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 16 llamafile: TinyLlama-1.1B-Chat-v1.0.BF16 - Text Generation 128 llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 16 llamafile: mistral-7b-instruct-v0.2.Q5_K_M - Text Generation 128 llamafile: wizardcoder-python-34b-v1.0.Q6_K - Text Generation 16 llamafile: wizardcoder-python-34b-v1.0.Q6_K - Text Generation 128 a b c 137.746 60763.5 8552.1 2424.2 317.1 10.653 23.259 29.577 62.913 41.46 118.41 65.15 80.69 86.83 122.87 38.72 52.75 5.91 14.43 150.359 59350.2 8590.2 2428.9 317.1 10.921 23.406 29.913 62.999 41.47 118.69 86.02 86.5 122.08 125.5 54.57 55.07 14.31 14.4 151.654 59987.8 8537.4 2449.8 158.7 10.908 23.429 29.879 63.151 41.13 119.56 79.73 85.05 119.14 126.93 54.49 55.16 14.28 14.33 OpenBenchmarking.org
RELION Test: Basic - Device: CPU OpenBenchmarking.org Seconds, Fewer Is Better RELION 5.0 Test: Basic - Device: CPU a b c 30 60 90 120 150 137.75 150.36 151.65 1. (CXX) g++ options: -fPIC -std=c++14 -fopenmp -O3 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -ljpeg -lmpi_cxx -lmpi
srsRAN Project Test: PDSCH Processor Benchmark, Throughput Total OpenBenchmarking.org Mbps, More Is Better srsRAN Project 24.10 Test: PDSCH Processor Benchmark, Throughput Total a b c 13K 26K 39K 52K 65K 60763.5 59350.2 59987.8 1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Total OpenBenchmarking.org Mbps, More Is Better srsRAN Project 24.10 Test: PUSCH Processor Benchmark, Throughput Total a b c 2K 4K 6K 8K 10K 8552.1 8590.2 8537.4 1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl
srsRAN Project Test: PDSCH Processor Benchmark, Throughput Thread OpenBenchmarking.org Mbps, More Is Better srsRAN Project 24.10 Test: PDSCH Processor Benchmark, Throughput Thread a b c 500 1000 1500 2000 2500 2424.2 2428.9 2449.8 1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Thread OpenBenchmarking.org Mbps, More Is Better srsRAN Project 24.10 Test: PUSCH Processor Benchmark, Throughput Thread a b c 70 140 210 280 350 317.1 317.1 158.7 1. (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl
VVenC Video Input: Bosphorus 4K - Video Preset: Fast OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.13 Video Input: Bosphorus 4K - Video Preset: Fast a b c 3 6 9 12 15 10.65 10.92 10.91 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
VVenC Video Input: Bosphorus 4K - Video Preset: Faster OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.13 Video Input: Bosphorus 4K - Video Preset: Faster a b c 6 12 18 24 30 23.26 23.41 23.43 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
VVenC Video Input: Bosphorus 1080p - Video Preset: Fast OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.13 Video Input: Bosphorus 1080p - Video Preset: Fast a b c 7 14 21 28 35 29.58 29.91 29.88 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
VVenC Video Input: Bosphorus 1080p - Video Preset: Faster OpenBenchmarking.org Frames Per Second, More Is Better VVenC 1.13 Video Input: Bosphorus 1080p - Video Preset: Faster a b c 14 28 42 56 70 62.91 63.00 63.15 1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
x265 Video Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better x265 4.1 Video Input: Bosphorus 4K a b c 9 18 27 36 45 41.46 41.47 41.13 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
x265 Video Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better x265 4.1 Video Input: Bosphorus 1080p a b c 30 60 90 120 150 118.41 118.69 119.56 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Llamafile Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 16 OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.8.16 Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 16 a b c 20 40 60 80 100 65.15 86.02 79.73
Llamafile Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.8.16 Model: Llama-3.2-3B-Instruct.Q6_K - Test: Text Generation 128 a b c 20 40 60 80 100 80.69 86.50 85.05
Llamafile Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 16 OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.8.16 Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 16 a b c 30 60 90 120 150 86.83 122.08 119.14
Llamafile Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.8.16 Model: TinyLlama-1.1B-Chat-v1.0.BF16 - Test: Text Generation 128 a b c 30 60 90 120 150 122.87 125.50 126.93
Llamafile Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 16 OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.8.16 Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 16 a b c 12 24 36 48 60 38.72 54.57 54.49
Llamafile Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.8.16 Model: mistral-7b-instruct-v0.2.Q5_K_M - Test: Text Generation 128 a b c 12 24 36 48 60 52.75 55.07 55.16
Llamafile Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 16 OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.8.16 Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 16 a b c 4 8 12 16 20 5.91 14.31 14.28
Llamafile Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llamafile 0.8.16 Model: wizardcoder-python-34b-v1.0.Q6_K - Test: Text Generation 128 a b c 4 8 12 16 20 14.43 14.40 14.33
Phoronix Test Suite v10.8.5