llama cpp 7995WX

AMD Ryzen Threadripper PRO 7995WX 96-Cores testing with a HP 8B24 (U65 Ver. 01.01.04 BIOS) and NVIDIA RTX A4000 16GB on Ubuntu 23.10 via the Phoronix Test Suite.

a, b, c, d (identical configuration reported for all four runs):

  Processor: AMD Ryzen Threadripper PRO 7995WX 96-Cores @ 6.44GHz (96 Cores / 192 Threads), Motherboard: HP 8B24 (U65 Ver. 01.01.04 BIOS), Chipset: AMD Device 14a4, Memory: 8 x 16 GB DRAM-5600MT/s Hynix HMCG78AGBRA190N, Disk: 2 x 1024GB SAMSUNG MZVL21T0HCLR-00BH1, Graphics: NVIDIA RTX A4000 16GB, Audio: NVIDIA GA104 HD Audio, Monitor: ASUS VP28U, Network: Realtek RTL8111/8168/8411

  OS: Ubuntu 23.10, Kernel: 6.5.0-14-generic (x86_64), Desktop: GNOME Shell 45.0, Display Server: X Server 1.21.1.7, Display Driver: NVIDIA 535.129.03, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.2.147, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 3840x2160

Llama.cpp b1808
Model: llama-2-7b.Q4_0.gguf
Tokens Per Second > Higher Is Better
a . 31.53 |===================================================================
b . 32.17 |====================================================================
c . 30.92 |=================================================================
d . 31.09 |==================================================================

Llama.cpp b1808
Model: llama-2-13b.Q4_0.gguf
Tokens Per Second > Higher Is Better
a . 20.30 |====================================================================
b . 20.02 |===================================================================
c . 20.14 |===================================================================
d . 20.08 |===================================================================

Llama.cpp b1808
Model: llama-2-70b-chat.Q5_0.gguf
Tokens Per Second > Higher Is Better
a . 4.66 |=====================================================================
b . 4.63 |====================================================================
c . 4.66 |=====================================================================
d . 4.69 |=====================================================================
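
Because runs a through d report the same hardware and software, the difference between them can be read as a rough indication of run-to-run variation. The following is a minimal sketch, not part of the Phoronix Test Suite output: it assumes the tokens-per-second values are copied by hand from the results above, and the `summarize` helper and the spread metric (max minus min, relative to the mean) are illustrative choices.

```python
from statistics import mean

# Tokens per second for runs a, b, c, d, copied by hand from the results above.
results = {
    "llama-2-7b.Q4_0.gguf": [31.53, 32.17, 30.92, 31.09],
    "llama-2-13b.Q4_0.gguf": [20.30, 20.02, 20.14, 20.08],
    "llama-2-70b-chat.Q5_0.gguf": [4.66, 4.63, 4.66, 4.69],
}


def summarize(values):
    """Return (mean, spread), where spread is (max - min) / mean in percent."""
    avg = mean(values)
    spread_pct = (max(values) - min(values)) / avg * 100
    return avg, spread_pct


# Hypothetical summary structure: model name -> (mean, spread in percent).
summary = {model: summarize(values) for model, values in results.items()}

for model, (avg, spread_pct) in summary.items():
    print(f"{model}: mean {avg:.2f} tokens/s, spread {spread_pct:.1f}%")
```

With only four samples per model, the spread is a coarse indicator rather than a statistical confidence bound.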