feb 9950X AMD Ryzen 9 9950X 16-Core testing with a ASRock X870E Taichi (3.12.AS02 BIOS) and XFX AMD Radeon RX 7900 XTX 24GB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2502104-PTS-FEB9950X21&sor .
feb 9950X Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server OpenGL Compiler File-System Screen Resolution a b c d AMD Ryzen 9 9950X 16-Core @ 5.75GHz (16 Cores / 32 Threads) ASRock X870E Taichi (3.12.AS02 BIOS) AMD Device 14d8 2 x 16GB DDR5-6000MT/s F5-6000J2836G16G Western Digital WD_BLACK SN850X 2000GB XFX AMD Radeon RX 7900 XTX 24GB AMD Navi 31 HDMI/DP DELL U2723QE Realtek Device 8126 + MEDIATEK Device 0717 Ubuntu 24.04 6.12.3-061203-generic (x86_64) GNOME Shell 46.0 X Server 1.21.1.11 + Wayland 4.6 Mesa 24.2.0-devel (LLVM 18.1.7 DRM 3.59) GCC 13.3.0 ext4 3840x2160 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-fG75Ri/gcc-13-13.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-fG75Ri/gcc-13-13.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: amd-pstate-epp powersave (Boost: Enabled EPP: balance_performance) - CPU Microcode: 0xb404023 Python Details - Python 2.7.16 + Python 3.12.3 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
feb 9950X qmcpack: H4_ae qmcpack: Li2_STO_ae qmcpack: LiH_ae_MSD qmcpack: O_ae_pyscf_UHF qmcpack: FeCO6_b3lyp_gms liquid-dsp: 1 - 256 - 32 liquid-dsp: 1 - 256 - 57 liquid-dsp: 2 - 256 - 32 liquid-dsp: 2 - 256 - 57 liquid-dsp: 4 - 256 - 32 liquid-dsp: 4 - 256 - 57 liquid-dsp: 8 - 256 - 32 liquid-dsp: 8 - 256 - 57 liquid-dsp: 1 - 256 - 512 liquid-dsp: 16 - 256 - 32 liquid-dsp: 16 - 256 - 57 liquid-dsp: 2 - 256 - 512 liquid-dsp: 32 - 256 - 32 liquid-dsp: 32 - 256 - 57 liquid-dsp: 4 - 256 - 512 liquid-dsp: 8 - 256 - 512 liquid-dsp: 16 - 256 - 512 liquid-dsp: 32 - 256 - 512 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024 llama-cpp: CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048 a b c d 11 121.76 42.036 125.15 52.822 56787000 90088000 113930000 182470000 229660000 315060000 454100000 586180000 41410000 892520000 1096000000 82303000 1583300000 1596800000 158040000 315200000 582860000 618230000 9.17 89.38 91.42 88.41 9.71 91.25 90.33 88.09 65.08 413.38 395.79 375.05 10.36 123.24 42.005 126.79 52.295 56789000 90375000 113780000 183150000 229740000 313980000 454350000 585930000 41571000 893150000 1093500000 82576000 1587600000 1601200000 160690000 322050000 574670000 617980000 9.17 88.6 91.37 89.13 9.74 92.04 92.07 90.16 65.01 418.76 402.62 370.37 10.69 122.97 41.883 124.76 52.529 57980000 90142000 114160000 183260000 229920000 315910000 453740000 589920000 42347000 891310000 1091900000 82543000 1584800000 1599700000 160370000 312230000 580280000 618330000 9.17 93.6 90.08 89.14 9.75 88.49 89.3 89.69 65.1 406.48 394.84 371.21 10.54 123.99 41.811 124.25 52.313 59018000 90466000 113880000 182390000 229660000 315130000 454640000 592680000 42423000 892320000 1089400000 80008000 1583700000 1599300000 155760000 317070000 585540000 616800000 9.19 92.18 90.29 89.55 9.74 90.77 91.21 89.74 64.9 420.46 399.88 376.59 OpenBenchmarking.org
QMCPACK Input: H4_ae OpenBenchmarking.org Total Execution Time - Seconds, Fewer Is Better QMCPACK 4.0 Input: H4_ae b d c a 3 6 9 12 15 10.36 10.54 10.69 11.00 1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
QMCPACK Input: Li2_STO_ae OpenBenchmarking.org Total Execution Time - Seconds, Fewer Is Better QMCPACK 4.0 Input: Li2_STO_ae a c b d 30 60 90 120 150 121.76 122.97 123.24 123.99 1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
QMCPACK Input: LiH_ae_MSD OpenBenchmarking.org Total Execution Time - Seconds, Fewer Is Better QMCPACK 4.0 Input: LiH_ae_MSD d c b a 10 20 30 40 50 41.81 41.88 42.01 42.04 1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
QMCPACK Input: O_ae_pyscf_UHF OpenBenchmarking.org Total Execution Time - Seconds, Fewer Is Better QMCPACK 4.0 Input: O_ae_pyscf_UHF d c a b 30 60 90 120 150 124.25 124.76 125.15 126.79 1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
QMCPACK Input: FeCO6_b3lyp_gms OpenBenchmarking.org Total Execution Time - Seconds, Fewer Is Better QMCPACK 4.0 Input: FeCO6_b3lyp_gms b d c a 12 24 36 48 60 52.30 52.31 52.53 52.82 1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
Liquid-DSP Threads: 1 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 1 - Buffer Length: 256 - Filter Length: 32 d c b a 13M 26M 39M 52M 65M 59018000 57980000 56789000 56787000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 1 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 1 - Buffer Length: 256 - Filter Length: 57 d b c a 20M 40M 60M 80M 100M 90466000 90375000 90142000 90088000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 2 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 2 - Buffer Length: 256 - Filter Length: 32 c a d b 20M 40M 60M 80M 100M 114160000 113930000 113880000 113780000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 2 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 2 - Buffer Length: 256 - Filter Length: 57 c b a d 40M 80M 120M 160M 200M 183260000 183150000 182470000 182390000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 4 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 4 - Buffer Length: 256 - Filter Length: 32 c b d a 50M 100M 150M 200M 250M 229920000 229740000 229660000 229660000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 4 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 4 - Buffer Length: 256 - Filter Length: 57 c d a b 70M 140M 210M 280M 350M 315910000 315130000 315060000 313980000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 8 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 8 - Buffer Length: 256 - Filter Length: 32 d b a c 100M 200M 300M 400M 500M 454640000 454350000 454100000 453740000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 8 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 8 - Buffer Length: 256 - Filter Length: 57 d c a b 130M 260M 390M 520M 650M 592680000 589920000 586180000 585930000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 1 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 1 - Buffer Length: 256 - Filter Length: 512 d c b a 9M 18M 27M 36M 45M 42423000 42347000 41571000 41410000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 16 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 16 - Buffer Length: 256 - Filter Length: 32 b a d c 200M 400M 600M 800M 1000M 893150000 892520000 892320000 891310000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 16 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 16 - Buffer Length: 256 - Filter Length: 57 a b c d 200M 400M 600M 800M 1000M 1096000000 1093500000 1091900000 1089400000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 2 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 2 - Buffer Length: 256 - Filter Length: 512 b c a d 20M 40M 60M 80M 100M 82576000 82543000 82303000 80008000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 32 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 32 - Buffer Length: 256 - Filter Length: 32 b c d a 300M 600M 900M 1200M 1500M 1587600000 1584800000 1583700000 1583300000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 32 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 32 - Buffer Length: 256 - Filter Length: 57 b c d a 300M 600M 900M 1200M 1500M 1601200000 1599700000 1599300000 1596800000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 4 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 4 - Buffer Length: 256 - Filter Length: 512 b c a d 30M 60M 90M 120M 150M 160690000 160370000 158040000 155760000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 8 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 8 - Buffer Length: 256 - Filter Length: 512 b d a c 70M 140M 210M 280M 350M 322050000 317070000 315200000 312230000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 16 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 16 - Buffer Length: 256 - Filter Length: 512 d a c b 130M 260M 390M 520M 650M 585540000 582860000 580280000 574670000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 32 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.7 Threads: 32 - Buffer Length: 256 - Filter Length: 512 c a b d 130M 260M 390M 520M 650M 618330000 618230000 617980000 616800000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 d c b a 3 6 9 12 15 9.19 9.17 9.17 9.17 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 c d a b 20 40 60 80 100 93.60 92.18 89.38 88.60 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 a b d c 20 40 60 80 100 91.42 91.37 90.29 90.08 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 d c b a 20 40 60 80 100 89.55 89.14 89.13 88.41 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 c d b a 3 6 9 12 15 9.75 9.74 9.74 9.71 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 b a d c 20 40 60 80 100 92.04 91.25 90.77 88.49 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 b d a c 20 40 60 80 100 92.07 91.21 90.33 89.30 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 b d c a 20 40 60 80 100 90.16 89.74 89.69 88.09 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 c a b d 15 30 45 60 75 65.10 65.08 65.01 64.90 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 d b a c 90 180 270 360 450 420.46 418.76 413.38 406.48 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 b d a c 90 180 270 360 450 402.62 399.88 395.79 394.84 1. (CXX) g++ options: -O3
Llama.cpp Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 OpenBenchmarking.org Tokens Per Second, More Is Better Llama.cpp b4397 Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 d a c b 80 160 240 320 400 376.59 375.05 371.21 370.37 1. (CXX) g++ options: -O3
Phoronix Test Suite v10.8.5