testgpu
AMD Ryzen Threadripper PRO 7965WX 24-Cores testing with an ASUS Pro WS WRX90E-SAGE SE (0803 BIOS) and Gigabyte AMD Radeon RX 7900 XT 20GB on Debian via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2412054-LORE-TESTGPU68&grs .
testgpu system details (Gigabyte AMD Radeon RX 7900 XT)
Processor: AMD Ryzen Threadripper PRO 7965WX 24-Cores @ 5.36GHz (24 Cores / 48 Threads)
Motherboard: ASUS Pro WS WRX90E-SAGE SE (0803 BIOS)
Chipset: AMD Genoa/Bergamo
Memory: 128GB
Disk: 2 x 2000GB CT2000T705SSD3
Graphics: Gigabyte AMD Radeon RX 7900 XT 20GB (2175/1249MHz)
Audio: AMD Device 14cc
Monitor: DELL U2723QE
Network: 2 x Intel X710 for 10GBASE-T
OS: Debian
Kernel: 6.11.10-amd64 (x86_64)
Desktop: KDE Plasma 6.2.3
Display Server: X Server 1.21.1.14 + Wayland
OpenGL: 4.6 Mesa 24.2.8-1 (LLVM 19.1.4 DRM 3.59)
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 3840x2160
- Transparent Huge Pages: always
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=3 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/reproducible-path/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/reproducible-path/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: amd-pstate-epp powersave (Boost: Enabled EPP: balance_performance)
- CPU Microcode: 0xa108108
- GLAMOR
- BAR1 / Visible vRAM Size: 256 MB
- vBIOS Version: 113-EXT91531-001
- Python 3.12.7
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
testgpu result summary (Gigabyte AMD Radeon RX 7900 XT)
ncnn: Vulkan GPU - vision_transformer = 53.21
ncnn: Vulkan GPU - vgg16 = 34.61
vkresample: 2x - Single = 6.458
vkresample: 2x - Double = 67.167
vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling = 81768
vkfft: FFT + iFFT C2C Bluestein benchmark in double precision = 5252
vkfft: FFT + iFFT C2C multidimensional in single precision = 54453
vkfft: FFT + iFFT C2C 1D batched in single precision = 77961
vkfft: FFT + iFFT C2C 1D batched in double precision = 21183
vkfft: FFT + iFFT C2C Bluestein in single precision = 17488
vkfft: FFT + iFFT C2C 1D batched in half precision = 110579
vkfft: FFT + iFFT R2C / C2R = 64329
waifu2x-ncnn: 2x - 3 - Yes = 4.260
realsr-ncnn: 4x - Yes = 19.490
realsr-ncnn: 4x - No = 4.470
vkpeak: int16-vec4 = 46788.55
vkpeak: int16-scalar = 23645.42
vkpeak: int32-vec4 = 6081.94
vkpeak: int32-scalar = 6109.64
vkpeak: fp64-vec4 = 990.61
vkpeak: fp64-scalar = 991.49
vkpeak: fp16-vec4 = 46366.43
vkpeak: fp16-scalar = 24824.27
vkpeak: fp32-vec4 = 21318.02
vkpeak: fp32-scalar = 24981.03
ncnn: Vulkan GPU - FastestDet = 13.83
ncnn: Vulkan GPU - regnety_400m = 24.95
ncnn: Vulkan GPU - squeezenet_ssd = 19.60
ncnn: Vulkan GPU - yolov4-tiny = 34.05
ncnn: Vulkan GPU - resnet50 = 18.97
ncnn: Vulkan GPU - alexnet = 7.62
ncnn: Vulkan GPU - resnet18 = 11.22
ncnn: Vulkan GPU - googlenet = 23.22
ncnn: Vulkan GPU - blazeface = 4.18
ncnn: Vulkan GPU - efficientnet-b0 = 11.45
ncnn: Vulkan GPU - mnasnet = 8.84
ncnn: Vulkan GPU - shufflenet-v2 = 11.47
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 = 8.75
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 = 9.61
ncnn: Vulkan GPU - mobilenet = 20.26
waifu2x-ncnn: 2x - 3 - No = (no value in the export)
NCNN 20230517 - Target: Vulkan GPU - Model: vision_transformer (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 53.21 (SE +/- 0.68, N = 12; MIN: 37.26 / MAX: 604.65) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 34.61 (SE +/- 0.48, N = 12; MIN: 22.47 / MAX: 479.77) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkResample 1.0 - Upscale: 2x - Precision: Single (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 6.458 (SE +/- 0.004, N = 3) - (CXX) g++ options: -O3
VkResample 1.0 - Upscale: 2x - Precision: Double (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 67.17 (SE +/- 0.15, N = 3) - (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 81768 (SE +/- 34.57, N = 3) - (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 5252 (SE +/- 8.45, N = 3) - (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 54453 (SE +/- 305.40, N = 3) - (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 77961 (SE +/- 34.71, N = 3) - (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 21183 (SE +/- 8.99, N = 3) - (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 17488 (SE +/- 118.39, N = 3) - (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 110579 (SE +/- 177.96, N = 3) - (CXX) g++ options: -O3
VkFFT 1.3.4 - Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 64329 (SE +/- 266.07, N = 3) - (CXX) g++ options: -O3
Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 4.260 (SE +/- 0.004, N = 3)
RealSR-NCNN 20200818 - Scale: 4x - TAA: Yes (Seconds, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 19.49 (SE +/- 0.04, N = 3)
RealSR-NCNN 20200818 - Scale: 4x - TAA: No (Seconds, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 4.470 (SE +/- 0.035, N = 3)
vkpeak 20240505 - int16-vec4 (GIOPS, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 46788.55 (SE +/- 194.52, N = 3)
vkpeak 20240505 - int16-scalar (GIOPS, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 23645.42 (SE +/- 42.40, N = 3)
vkpeak 20240505 - int32-vec4 (GIOPS, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 6081.94 (SE +/- 19.61, N = 3)
vkpeak 20240505 - int32-scalar (GIOPS, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 6109.64 (SE +/- 8.60, N = 3)
vkpeak 20240505 - fp64-vec4 (GFLOPS, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 990.61 (SE +/- 0.28, N = 3)
vkpeak 20240505 - fp64-scalar (GFLOPS, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 991.49 (SE +/- 0.10, N = 3)
vkpeak 20240505 - fp16-vec4 (GFLOPS, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 46366.43 (SE +/- 169.42, N = 3)
vkpeak 20240505 - fp16-scalar (GFLOPS, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 24824.27 (SE +/- 48.07, N = 3)
vkpeak 20240505 - fp32-vec4 (GFLOPS, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 21318.02 (SE +/- 22.98, N = 3)
vkpeak 20240505 - fp32-scalar (GFLOPS, More Is Better) - Gigabyte AMD Radeon RX 7900 XT: 24981.03 (SE +/- 37.96, N = 3)
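The vkpeak figures come in vec4/scalar pairs for each data type, so one simple way to read them is as a ratio per type. The following is a minimal sketch, not part of the Phoronix Test Suite: the dictionary `VKPEAK` and the helper `vec4_to_scalar_ratio` are hypothetical names introduced here, and the numbers are copied from the vkpeak results above.

```python
# vkpeak results copied from the export above.
# Units: GIOPS for int16/int32, GFLOPS for fp16/fp32/fp64 (more is better).
VKPEAK = {
    "int16": {"vec4": 46788.55, "scalar": 23645.42},
    "int32": {"vec4": 6081.94, "scalar": 6109.64},
    "fp16": {"vec4": 46366.43, "scalar": 24824.27},
    "fp32": {"vec4": 21318.02, "scalar": 24981.03},
    "fp64": {"vec4": 990.61, "scalar": 991.49},
}


def vec4_to_scalar_ratio(results):
    """Return vec4 throughput divided by scalar throughput for each data type."""
    return {dtype: pair["vec4"] / pair["scalar"] for dtype, pair in results.items()}


if __name__ == "__main__":
    for dtype, ratio in vec4_to_scalar_ratio(VKPEAK).items():
        print(f"{dtype}: vec4/scalar = {ratio:.2f}")
```

A ratio near 1 means the vec4 and scalar variants reached about the same throughput in this run; the sketch only restates the reported numbers and does not explain why a ratio is high or low.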
NCNN 20230517 - Target: Vulkan GPU - Model: FastestDet (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 13.83 (SE +/- 0.37, N = 12; MIN: 6.02 / MAX: 368.18) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 24.95 (SE +/- 0.64, N = 12; MIN: 15.49 / MAX: 706.04) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 19.60 (SE +/- 0.50, N = 12; MIN: 12.41 / MAX: 685.75) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 34.05 (SE +/- 0.68, N = 12; MIN: 20.65 / MAX: 478.65) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 18.97 (SE +/- 0.59, N = 12; MIN: 11.9 / MAX: 714.42) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 7.62 (SE +/- 0.23, N = 12; MIN: 5.05 / MAX: 280.54) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 11.22 (SE +/- 0.39, N = 11; MIN: 7.35 / MAX: 325.84) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 23.22 (SE +/- 0.54, N = 12; MIN: 14.77 / MAX: 462.71) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 4.18 (SE +/- 0.17, N = 12; MIN: 2.41 / MAX: 320.85) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 11.45 (SE +/- 0.31, N = 12; MIN: 6.89 / MAX: 391.33) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 8.84 (SE +/- 0.28, N = 11; MIN: 5.2 / MAX: 374.25) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 11.47 (SE +/- 0.48, N = 12; MIN: 7.41 / MAX: 781.29) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 8.75 (SE +/- 0.30, N = 12; MIN: 5.58 / MAX: 650.14) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 9.61 (SE +/- 0.22, N = 12; MIN: 5.58 / MAX: 420.93) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20230517 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better) - Gigabyte AMD Radeon RX 7900 XT: 20.26 (SE +/- 0.64, N = 12; MIN: 12.27 / MAX: 484.55) - (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
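Each result above reports a mean together with "SE +/-" and "N". Assuming SE is the standard error of the mean over N runs (an assumption about how the export labels its statistics), the run-to-run spread can be expressed as a percentage of the mean, and a standard deviation can be estimated as SE * sqrt(N). The sketch below uses hypothetical helper names and three NCNN entries copied from the results above.

```python
import math

# (mean in ms, SE in ms, N) copied from the NCNN results above.
NCNN = {
    "vision_transformer": (53.21, 0.68, 12),
    "resnet18": (11.22, 0.39, 11),
    "blazeface": (4.18, 0.17, 12),
}


def relative_standard_error(mean, se):
    """Standard error expressed as a percentage of the mean."""
    return 100.0 * se / mean


def estimated_std_dev(se, n):
    """Standard deviation implied by SE, assuming SE = SD / sqrt(N)."""
    return se * math.sqrt(n)


if __name__ == "__main__":
    for model, (mean, se, n) in NCNN.items():
        rse = relative_standard_error(mean, se)
        sd = estimated_std_dev(se, n)
        print(f"{model}: mean {mean} ms, RSE {rse:.2f}%, estimated SD {sd:.2f} ms")
```

The percentage form makes it easier to compare variability between a short test such as blazeface and a long one such as vision_transformer, since their absolute SE values are on different scales.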
Phoronix Test Suite v10.8.5