vkfft radeon pro AMD Ryzen Threadripper 7980X 64-Cores testing with a System76 Thelio Major (FA Z5 BIOS) and AMD Radeon Pro W7900 45GB on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2402189-NE-VKFFTRADE61&export=pdf&gru&sro .
vkfft radeon pro Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server OpenGL Compiler File-System Screen Resolution a b c d AMD Ryzen Threadripper 7980X 64-Cores @ 7.79GHz (64 Cores / 128 Threads) System76 Thelio Major (FA Z5 BIOS) AMD Device 14a4 4 x 32GB DRAM-4800MT/s Micron MTC20F1045S1RC48BA2 1000GB CT1000T700SSD5 AMD Radeon Pro W7900 45GB (1760/1124MHz) AMD Device 14cc DELL P2415Q Aquantia AQC113C NBase-T/IEEE + Realtek RTL8125 2.5GbE + Intel Wi-Fi 6 AX210/AX211/AX411 Ubuntu 23.10 6.5.0-17-generic (x86_64) GNOME Shell 45.2 X Server + Wayland 4.6 Mesa 23.2.1-1ubuntu3.1 (LLVM 15.0.7 DRM 3.54) GCC 13.2.0 ext4 1920x1080 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa108105 Graphics Details - BAR1 / Visible vRAM Size: 46064 MB - vBIOS Version: 113-D7070600-101 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
vkfft radeon pro vkfft: FFT + iFFT R2C / C2R vkfft: FFT + iFFT C2C 1D batched in half precision vkfft: FFT + iFFT C2C Bluestein in single precision vkfft: FFT + iFFT C2C 1D batched in double precision vkfft: FFT + iFFT C2C 1D batched in single precision vkfft: FFT + iFFT C2C multidimensional in single precision vkfft: FFT + iFFT C2C Bluestein benchmark in double precision vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling a b c d 81139 169776 21418 29808 91609 70723 6236 97352 83484 169553 21180 30259 92385 73372 6302 96782 83564 169460 21360 30157 92146 71367 6235 97044 81653 168618 21157 30085 92096 71351 6244 96809 OpenBenchmarking.org
VkFFT Test: FFT + iFFT R2C / C2R OpenBenchmarking.org Benchmark Score, More Is Better VkFFT 1.3.4 Test: FFT + iFFT R2C / C2R a b c d 20K 40K 60K 80K 100K 81139 83484 83564 81653 1. (CXX) g++ options: -O3
VkFFT Test: FFT + iFFT C2C 1D batched in half precision OpenBenchmarking.org Benchmark Score, More Is Better VkFFT 1.3.4 Test: FFT + iFFT C2C 1D batched in half precision a b c d 40K 80K 120K 160K 200K 169776 169553 169460 168618 1. (CXX) g++ options: -O3
VkFFT Test: FFT + iFFT C2C Bluestein in single precision OpenBenchmarking.org Benchmark Score, More Is Better VkFFT 1.3.4 Test: FFT + iFFT C2C Bluestein in single precision a b c d 5K 10K 15K 20K 25K 21418 21180 21360 21157 1. (CXX) g++ options: -O3
VkFFT Test: FFT + iFFT C2C 1D batched in double precision OpenBenchmarking.org Benchmark Score, More Is Better VkFFT 1.3.4 Test: FFT + iFFT C2C 1D batched in double precision a b c d 6K 12K 18K 24K 30K 29808 30259 30157 30085 1. (CXX) g++ options: -O3
VkFFT Test: FFT + iFFT C2C 1D batched in single precision OpenBenchmarking.org Benchmark Score, More Is Better VkFFT 1.3.4 Test: FFT + iFFT C2C 1D batched in single precision a b c d 20K 40K 60K 80K 100K 91609 92385 92146 92096 1. (CXX) g++ options: -O3
VkFFT Test: FFT + iFFT C2C multidimensional in single precision OpenBenchmarking.org Benchmark Score, More Is Better VkFFT 1.3.4 Test: FFT + iFFT C2C multidimensional in single precision a b c d 16K 32K 48K 64K 80K 70723 73372 71367 71351 1. (CXX) g++ options: -O3
VkFFT Test: FFT + iFFT C2C Bluestein benchmark in double precision OpenBenchmarking.org Benchmark Score, More Is Better VkFFT 1.3.4 Test: FFT + iFFT C2C Bluestein benchmark in double precision a b c d 1400 2800 4200 5600 7000 6236 6302 6235 6244 1. (CXX) g++ options: -O3
VkFFT Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling OpenBenchmarking.org Benchmark Score, More Is Better VkFFT 1.3.4 Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling a b c d 20K 40K 60K 80K 100K 97352 96782 97044 96809 1. (CXX) g++ options: -O3
Phoronix Test Suite v10.8.5