feb 9950X

AMD Ryzen 9 9950X 16-Core testing with a ASRock X870E Taichi (3.12.AS02 BIOS) and XFX AMD Radeon RX 7900 XTX 24GB on Ubuntu 24.04 via the Phoronix Test Suite.

a:

  Processor: AMD Ryzen 9 9950X 16-Core @ 5.75GHz (16 Cores / 32 Threads), Motherboard: ASRock X870E Taichi (3.12.AS02 BIOS), Chipset: AMD Device 14d8, Memory: 2 x 16GB DDR5-6000MT/s F5-6000J2836G16G, Disk: Western Digital WD_BLACK SN850X 2000GB, Graphics: XFX AMD Radeon RX 7900 XTX 24GB, Audio: AMD Navi 31 HDMI/DP, Monitor: DELL U2723QE, Network: Realtek Device 8126 + MEDIATEK Device 0717

  OS: Ubuntu 24.04, Kernel: 6.12.3-061203-generic (x86_64), Desktop: GNOME Shell 46.0, Display Server: X Server 1.21.1.11 + Wayland, OpenGL: 4.6 Mesa 24.2.0-devel (LLVM 18.1.7 DRM 3.59), Compiler: GCC 13.3.0, File-System: ext4, Screen Resolution: 3840x2160

b:

  Processor: AMD Ryzen 9 9950X 16-Core @ 5.75GHz (16 Cores / 32 Threads), Motherboard: ASRock X870E Taichi (3.12.AS02 BIOS), Chipset: AMD Device 14d8, Memory: 2 x 16GB DDR5-6000MT/s F5-6000J2836G16G, Disk: Western Digital WD_BLACK SN850X 2000GB, Graphics: XFX AMD Radeon RX 7900 XTX 24GB, Audio: AMD Navi 31 HDMI/DP, Monitor: DELL U2723QE, Network: Realtek Device 8126 + MEDIATEK Device 0717

  OS: Ubuntu 24.04, Kernel: 6.12.3-061203-generic (x86_64), Desktop: GNOME Shell 46.0, Display Server: X Server 1.21.1.11 + Wayland, OpenGL: 4.6 Mesa 24.2.0-devel (LLVM 18.1.7 DRM 3.59), Compiler: GCC 13.3.0, File-System: ext4, Screen Resolution: 3840x2160

c:

  Processor: AMD Ryzen 9 9950X 16-Core @ 5.75GHz (16 Cores / 32 Threads), Motherboard: ASRock X870E Taichi (3.12.AS02 BIOS), Chipset: AMD Device 14d8, Memory: 2 x 16GB DDR5-6000MT/s F5-6000J2836G16G, Disk: Western Digital WD_BLACK SN850X 2000GB, Graphics: XFX AMD Radeon RX 7900 XTX 24GB, Audio: AMD Navi 31 HDMI/DP, Monitor: DELL U2723QE, Network: Realtek Device 8126 + MEDIATEK Device 0717

  OS: Ubuntu 24.04, Kernel: 6.12.3-061203-generic (x86_64), Desktop: GNOME Shell 46.0, Display Server: X Server 1.21.1.11 + Wayland, OpenGL: 4.6 Mesa 24.2.0-devel (LLVM 18.1.7 DRM 3.59), Compiler: GCC 13.3.0, File-System: ext4, Screen Resolution: 3840x2160

d:

  Processor: AMD Ryzen 9 9950X 16-Core @ 5.75GHz (16 Cores / 32 Threads), Motherboard: ASRock X870E Taichi (3.12.AS02 BIOS), Chipset: AMD Device 14d8, Memory: 2 x 16GB DDR5-6000MT/s F5-6000J2836G16G, Disk: Western Digital WD_BLACK SN850X 2000GB, Graphics: XFX AMD Radeon RX 7900 XTX 24GB, Audio: AMD Navi 31 HDMI/DP, Monitor: DELL U2723QE, Network: Realtek Device 8126 + MEDIATEK Device 0717

  OS: Ubuntu 24.04, Kernel: 6.12.3-061203-generic (x86_64), Desktop: GNOME Shell 46.0, Display Server: X Server 1.21.1.11 + Wayland, OpenGL: 4.6 Mesa 24.2.0-devel (LLVM 18.1.7 DRM 3.59), Compiler: GCC 13.3.0, File-System: ext4, Screen Resolution: 3840x2160

QMCPACK 4.0
Input: H4_ae
Total Execution Time - Seconds < Lower Is Better
a . 11.00 |====================================================================
b . 10.36 |================================================================
c . 10.69 |==================================================================
d . 10.54 |=================================================================

Llama.cpp b4397
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512
Tokens Per Second > Higher Is Better
a . 89.38 |=================================================================
b . 88.60 |================================================================
c . 93.60 |====================================================================
d . 92.18 |===================================================================

Llama.cpp b4397
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512
Tokens Per Second > Higher Is Better
a . 91.25 |===================================================================
b . 92.04 |====================================================================
c . 88.49 |=================================================================
d . 90.77 |===================================================================

Liquid-DSP 1.7
Threads: 1 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
a . 56787000 |===============================================================
b . 56789000 |===============================================================
c . 57980000 |================================================================
d . 59018000 |=================================================================

Llama.cpp b4397
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512
Tokens Per Second > Higher Is Better
a . 413.38 |==================================================================
b . 418.76 |===================================================================
c . 406.48 |=================================================================
d . 420.46 |===================================================================

Liquid-DSP 1.7
Threads: 2 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
a . 82303000 |=================================================================
b . 82576000 |=================================================================
c . 82543000 |=================================================================
d . 80008000 |===============================================================

Liquid-DSP 1.7
Threads: 4 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
a . 158040000 |===============================================================
b . 160690000 |================================================================
c . 160370000 |================================================================
d . 155760000 |==============================================================

Liquid-DSP 1.7
Threads: 8 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
a . 315200000 |===============================================================
b . 322050000 |================================================================
c . 312230000 |==============================================================
d . 317070000 |===============================================================

Llama.cpp b4397
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
a . 90.33 |===================================================================
b . 92.07 |====================================================================
c . 89.30 |==================================================================
d . 91.21 |===================================================================

Liquid-DSP 1.7
Threads: 1 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
a . 41410000 |===============================================================
b . 41571000 |================================================================
c . 42347000 |=================================================================
d . 42423000 |=================================================================

Llama.cpp b4397
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
a . 88.09 |==================================================================
b . 90.16 |====================================================================
c . 89.69 |====================================================================
d . 89.74 |====================================================================

QMCPACK 4.0
Input: O_ae_pyscf_UHF
Total Execution Time - Seconds < Lower Is Better
a . 125.15 |==================================================================
b . 126.79 |===================================================================
c . 124.76 |==================================================================
d . 124.25 |==================================================================

Llama.cpp b4397
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
a . 395.79 |==================================================================
b . 402.62 |===================================================================
c . 394.84 |==================================================================
d . 399.88 |===================================================================

Liquid-DSP 1.7
Threads: 16 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
a . 582860000 |================================================================
b . 574670000 |===============================================================
c . 580280000 |===============================================================
d . 585540000 |================================================================

QMCPACK 4.0
Input: Li2_STO_ae
Total Execution Time - Seconds < Lower Is Better
a . 121.76 |==================================================================
b . 123.24 |===================================================================
c . 122.97 |==================================================================
d . 123.99 |===================================================================

Llama.cpp b4397
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
a . 375.05 |===================================================================
b . 370.37 |==================================================================
c . 371.21 |==================================================================
d . 376.59 |===================================================================

Llama.cpp b4397
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024
Tokens Per Second > Higher Is Better
a . 91.42 |====================================================================
b . 91.37 |====================================================================
c . 90.08 |===================================================================
d . 90.29 |===================================================================

Llama.cpp b4397
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048
Tokens Per Second > Higher Is Better
a . 88.41 |===================================================================
b . 89.13 |====================================================================
c . 89.14 |====================================================================
d . 89.55 |====================================================================

Liquid-DSP 1.7
Threads: 8 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
a . 586180000 |===============================================================
b . 585930000 |===============================================================
c . 589920000 |================================================================
d . 592680000 |================================================================

QMCPACK 4.0
Input: FeCO6_b3lyp_gms
Total Execution Time - Seconds < Lower Is Better
a . 52.82 |====================================================================
b . 52.30 |===================================================================
c . 52.53 |====================================================================
d . 52.31 |===================================================================

Liquid-DSP 1.7
Threads: 4 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
a . 315060000 |================================================================
b . 313980000 |================================================================
c . 315910000 |================================================================
d . 315130000 |================================================================

Liquid-DSP 1.7
Threads: 16 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
a . 1096000000 |===============================================================
b . 1093500000 |===============================================================
c . 1091900000 |===============================================================
d . 1089400000 |===============================================================

QMCPACK 4.0
Input: LiH_ae_MSD
Total Execution Time - Seconds < Lower Is Better
a . 42.04 |====================================================================
b . 42.01 |====================================================================
c . 41.88 |====================================================================
d . 41.81 |====================================================================

Liquid-DSP 1.7
Threads: 2 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
a . 182470000 |================================================================
b . 183150000 |================================================================
c . 183260000 |================================================================
d . 182390000 |================================================================

Liquid-DSP 1.7
Threads: 1 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
a . 90088000 |=================================================================
b . 90375000 |=================================================================
c . 90142000 |=================================================================
d . 90466000 |=================================================================

Llama.cpp b4397
Backend: CPU BLAS - Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
a . 9.71 |=====================================================================
b . 9.74 |=====================================================================
c . 9.75 |=====================================================================
d . 9.74 |=====================================================================

Liquid-DSP 1.7
Threads: 2 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
a . 113930000 |================================================================
b . 113780000 |================================================================
c . 114160000 |================================================================
d . 113880000 |================================================================

Llama.cpp b4397
Backend: CPU BLAS - Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
a . 65.08 |====================================================================
b . 65.01 |====================================================================
c . 65.10 |====================================================================
d . 64.90 |====================================================================

Liquid-DSP 1.7
Threads: 32 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
a . 1596800000 |===============================================================
b . 1601200000 |===============================================================
c . 1599700000 |===============================================================
d . 1599300000 |===============================================================

Liquid-DSP 1.7
Threads: 32 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
a . 1583300000 |===============================================================
b . 1587600000 |===============================================================
c . 1584800000 |===============================================================
d . 1583700000 |===============================================================

Liquid-DSP 1.7
Threads: 32 - Buffer Length: 256 - Filter Length: 512
samples/s > Higher Is Better
a . 618230000 |================================================================
b . 617980000 |================================================================
c . 618330000 |================================================================
d . 616800000 |================================================================

Llama.cpp b4397
Backend: CPU BLAS - Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128
Tokens Per Second > Higher Is Better
a . 9.17 |=====================================================================
b . 9.17 |=====================================================================
c . 9.17 |=====================================================================
d . 9.19 |=====================================================================

Liquid-DSP 1.7
Threads: 16 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
a . 892520000 |================================================================
b . 893150000 |================================================================
c . 891310000 |================================================================
d . 892320000 |================================================================

Liquid-DSP 1.7
Threads: 8 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
a . 454100000 |================================================================
b . 454350000 |================================================================
c . 453740000 |================================================================
d . 454640000 |================================================================

Liquid-DSP 1.7
Threads: 4 - Buffer Length: 256 - Filter Length: 32
samples/s > Higher Is Better
a . 229660000 |================================================================
b . 229740000 |================================================================
c . 229920000 |================================================================
d . 229660000 |================================================================
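
Since runs a through d share one configuration, the gap between the best and worst run of a test gives a rough sense of run-to-run noise. The following is a minimal sketch, not part of the Phoronix Test Suite: the `ROW` pattern and the `run_spread` helper are hypothetical names introduced here, and the pattern assumes result rows of the form `a . 11.00 |====` as they appear in this export.

```python
import re

# Matches one result row such as "a . 11.00 |====": run label, value, bar.
# Assumption: labels are the single letters a-d used in this result file.
ROW = re.compile(r"\b([a-d]) \. (\d+(?:\.\d+)?) \|=+")


def run_spread(block: str) -> float:
    """Return (max - min) / min over the run values found in one result block."""
    values = [float(value) for _label, value in ROW.findall(block)]
    if not values:
        raise ValueError("no result rows found")
    return (max(values) - min(values)) / min(values)


# Rows copied from the QMCPACK 4.0 "Input: H4_ae" result in this file
# (bars shortened; the pattern only needs at least one "=").
h4_ae = """
a . 11.00 |====
b . 10.36 |====
c . 10.69 |====
d . 10.54 |====
"""

print(f"H4_ae spread: {run_spread(h4_ae):.1%}")  # prints "H4_ae spread: 6.2%"
```

Applied to this file, the short H4_ae run shows the widest spread of the QMCPACK inputs, while most Liquid-DSP and text-generation results differ by well under one percent between runs.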