vkfft nvidia

AMD Ryzen Threadripper PRO 7995WX 96-Cores testing with a HP Z6 G5 A Workstation 8B24 (U65 Ver. 01.01.04 BIOS) and NVIDIA RTX A4000 16GB on Ubuntu 23.10 via the Phoronix Test Suite.

a, b, c (identical configuration for all three runs):

  Processor: AMD Ryzen Threadripper PRO 7995WX 96-Cores @ 6.44GHz (96 Cores / 192 Threads), Motherboard: HP Z6 G5 A Workstation 8B24 (U65 Ver. 01.01.04 BIOS), Chipset: AMD Device 14a4, Memory: 8 x 16GB DRAM-5200MT/s Hynix HMCG78AGBRA190N, Disk: 2 x 1024GB SAMSUNG MZVL21T0HCLR-00BH1, Graphics: NVIDIA RTX A4000 16GB, Audio: NVIDIA GA104 HD Audio, Monitor: ASUS VP28U, Network: Realtek RTL8111/8168/8411

  OS: Ubuntu 23.10, Kernel: 6.5.0-17-generic (x86_64), Desktop: GNOME Shell 45.2, Display Server: X Server 1.21.1.7, Display Driver: NVIDIA 535.154.05, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.2.148, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 3840x2160

  VkFFT 1.3.4
  Test: FFT + iFFT R2C / C2R
  Benchmark Score > Higher Is Better
  a . 34574 |=================================================================
  b . 35977 |====================================================================
  c . 35065 |==================================================================

  VkFFT 1.3.4
  Test: FFT + iFFT C2C 1D batched in half precision
  Benchmark Score > Higher Is Better
  a . 115485 |===================================================================
  b . 114380 |==================================================================
  c . 113700 |==================================================================

  VkFFT 1.3.4
  Test: FFT + iFFT C2C Bluestein in single precision
  Benchmark Score > Higher Is Better
  a . 10187 |====================================================================
  b . 10190 |====================================================================
  c . 10170 |====================================================================

  VkFFT 1.3.4
  Test: FFT + iFFT C2C 1D batched in double precision
  Benchmark Score > Higher Is Better
  a . 15703 |====================================================================
  b . 15685 |====================================================================
  c . 15738 |====================================================================

  VkFFT 1.3.4
  Test: FFT + iFFT C2C 1D batched in single precision
  Benchmark Score > Higher Is Better
  a . 68268 |====================================================================
  b . 68251 |====================================================================
  c . 68306 |====================================================================

  VkFFT 1.3.4
  Test: FFT + iFFT C2C multidimensional in single precision
  Benchmark Score > Higher Is Better
  a . 33526 |==================================================================
  b . 34221 |====================================================================
  c . 34364 |====================================================================

  VkFFT 1.3.4
  Test: FFT + iFFT C2C Bluestein benchmark in double precision
  Benchmark Score > Higher Is Better
  a . 2752 |=====================================================================
  b . 2748 |=====================================================================
  c . 2750 |=====================================================================

  VkFFT 1.3.4
  Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling
  Benchmark Score > Higher Is Better
  a . 69653 |====================================================================
  b . 69684 |====================================================================
  c . 69704 |====================================================================