testgpu

AMD Ryzen Threadripper PRO 7965WX 24-Cores testing with an ASUS Pro WS WRX90E-SAGE SE (0803 BIOS) and Gigabyte AMD Radeon RX 7900 XT 20GB on Debian via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2412054-LORE-TESTGPU68 .
testgpu: System Details (Gigabyte AMD Radeon RX 7900 XT)

Processor: AMD Ryzen Threadripper PRO 7965WX 24-Cores @ 5.36GHz (24 Cores / 48 Threads)
Motherboard: ASUS Pro WS WRX90E-SAGE SE (0803 BIOS)
Chipset: AMD Genoa/Bergamo
Memory: 128GB
Disk: 2 x 2000GB CT2000T705SSD3
Graphics: Gigabyte AMD Radeon RX 7900 XT 20GB (2175/1249MHz)
Audio: AMD Device 14cc
Monitor: DELL U2723QE
Network: 2 x Intel X710 for 10GBASE-T
OS: Debian
Kernel: 6.11.10-amd64 (x86_64)
Desktop: KDE Plasma 6.2.3
Display Server: X Server 1.21.1.14 + Wayland
OpenGL: 4.6 Mesa 24.2.8-1 (LLVM 19.1.4 DRM 3.59)
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 3840x2160

Notes:
- Transparent Huge Pages: always
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=3 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/reproducible-path/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/reproducible-path/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: amd-pstate-epp powersave (Boost: Enabled EPP: balance_performance)
- CPU Microcode: 0xa108108
- GLAMOR
- BAR1 / Visible vRAM Size: 256 MB
- vBIOS Version: 113-EXT91531-001
- Python 3.12.7
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
testgpu: Result Overview (Gigabyte AMD Radeon RX 7900 XT)

vkpeak: fp32-scalar: 24981.03
vkpeak: fp32-vec4: 21318.02
vkpeak: fp16-scalar: 24824.27
vkpeak: fp16-vec4: 46366.43
vkpeak: fp64-scalar: 991.49
vkpeak: fp64-vec4: 990.61
vkpeak: int32-scalar: 6109.64
vkpeak: int32-vec4: 6081.94
vkpeak: int16-scalar: 23645.42
vkpeak: int16-vec4: 46788.55
realsr-ncnn: 4x - No: 4.470
realsr-ncnn: 4x - Yes: 19.490
waifu2x-ncnn: 2x - 3 - Yes: 4.260
vkfft: FFT + iFFT R2C / C2R: 64329
vkfft: FFT + iFFT C2C 1D batched in half precision: 110579
vkfft: FFT + iFFT C2C Bluestein in single precision: 17488
vkfft: FFT + iFFT C2C 1D batched in double precision: 21183
vkfft: FFT + iFFT C2C 1D batched in single precision: 77961
vkfft: FFT + iFFT C2C multidimensional in single precision: 54453
vkfft: FFT + iFFT C2C Bluestein benchmark in double precision: 5252
vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling: 81768
vkresample: 2x - Double: 67.167
vkresample: 2x - Single: 6.458
ncnn: Vulkan GPU - mobilenet: 20.26
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2: 9.61
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3: 8.75
ncnn: Vulkan GPU - shufflenet-v2: 11.47
ncnn: Vulkan GPU - mnasnet: 8.84
ncnn: Vulkan GPU - efficientnet-b0: 11.45
ncnn: Vulkan GPU - blazeface: 4.18
ncnn: Vulkan GPU - googlenet: 23.22
ncnn: Vulkan GPU - vgg16: 34.61
ncnn: Vulkan GPU - resnet18: 11.22
ncnn: Vulkan GPU - alexnet: 7.62
ncnn: Vulkan GPU - resnet50: 18.97
ncnn: Vulkan GPU - yolov4-tiny: 34.05
ncnn: Vulkan GPU - squeezenet_ssd: 19.60
ncnn: Vulkan GPU - regnety_400m: 24.95
ncnn: Vulkan GPU - vision_transformer: 53.21
ncnn: Vulkan GPU - FastestDet: 13.83
vkpeak 20240505 - fp32-scalar
GFLOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 24981.03 (SE +/- 37.96, N = 3)

vkpeak 20240505 - fp32-vec4
GFLOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 21318.02 (SE +/- 22.98, N = 3)

vkpeak 20240505 - fp16-scalar
GFLOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 24824.27 (SE +/- 48.07, N = 3)

vkpeak 20240505 - fp16-vec4
GFLOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 46366.43 (SE +/- 169.42, N = 3)

vkpeak 20240505 - fp64-scalar
GFLOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 991.49 (SE +/- 0.10, N = 3)

vkpeak 20240505 - fp64-vec4
GFLOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 990.61 (SE +/- 0.28, N = 3)

vkpeak 20240505 - int32-scalar
GIOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 6109.64 (SE +/- 8.60, N = 3)

vkpeak 20240505 - int32-vec4
GIOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 6081.94 (SE +/- 19.61, N = 3)

vkpeak 20240505 - int16-scalar
GIOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 23645.42 (SE +/- 42.40, N = 3)

vkpeak 20240505 - int16-vec4
GIOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 46788.55 (SE +/- 194.52, N = 3)
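
The vkpeak figures above are easier to compare once the standard errors are expressed relative to their means and the precisions are set against each other. The following is a minimal sketch of that arithmetic only; the dictionary layout and the helper names `relative_se_percent` and `throughput_ratio` are illustrative assumptions, not part of the Phoronix Test Suite or vkpeak, and the numbers are copied from the results above.

```python
# (mean, standard error) pairs copied from the vkpeak results above.
# fp* entries are GFLOPS, int* entries are GIOPS.
VKPEAK = {
    "fp32-scalar": (24981.03, 37.96),
    "fp32-vec4": (21318.02, 22.98),
    "fp16-scalar": (24824.27, 48.07),
    "fp16-vec4": (46366.43, 169.42),
    "fp64-scalar": (991.49, 0.10),
    "fp64-vec4": (990.61, 0.28),
    "int32-scalar": (6109.64, 8.60),
    "int32-vec4": (6081.94, 19.61),
    "int16-scalar": (23645.42, 42.40),
    "int16-vec4": (46788.55, 194.52),
}


def relative_se_percent(mean, se):
    """Standard error as a percentage of the mean."""
    return 100.0 * se / mean


def throughput_ratio(numerator, denominator, results=VKPEAK):
    """Ratio of two mean throughputs, looked up by test name."""
    return results[numerator][0] / results[denominator][0]


if __name__ == "__main__":
    for name, (mean, se) in VKPEAK.items():
        print(f"{name:13s} mean={mean:10.2f}  SE={relative_se_percent(mean, se):.2f}% of mean")
    print(f"fp32-scalar / fp64-scalar: {throughput_ratio('fp32-scalar', 'fp64-scalar'):.1f}x")
    print(f"fp16-vec4 / fp32-scalar:   {throughput_ratio('fp16-vec4', 'fp32-scalar'):.2f}x")
```

Ratios like these only compare the reported means on this one system; they say nothing about why a given precision is faster or slower.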
RealSR-NCNN 20200818 - Scale: 4x - TAA: No
Seconds, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 4.470 (SE +/- 0.035, N = 3)

RealSR-NCNN 20200818 - Scale: 4x - TAA: Yes
Seconds, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 19.49 (SE +/- 0.04, N = 3)

Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: Yes
Seconds, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 4.260 (SE +/- 0.004, N = 3)
VkFFT 1.3.4 - Test: FFT + iFFT R2C / C2R
Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 64329 (SE +/- 266.07, N = 3)
1. (CXX) g++ options: -O3

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in half precision
Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 110579 (SE +/- 177.96, N = 3)
1. (CXX) g++ options: -O3

VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein in single precision
Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 17488 (SE +/- 118.39, N = 3)
1. (CXX) g++ options: -O3

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in double precision
Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 21183 (SE +/- 8.99, N = 3)
1. (CXX) g++ options: -O3

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision
Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 77961 (SE +/- 34.71, N = 3)
1. (CXX) g++ options: -O3

VkFFT 1.3.4 - Test: FFT + iFFT C2C multidimensional in single precision
Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 54453 (SE +/- 305.40, N = 3)
1. (CXX) g++ options: -O3

VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein benchmark in double precision
Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 5252 (SE +/- 8.45, N = 3)
1. (CXX) g++ options: -O3

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling
Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 81768 (SE +/- 34.57, N = 3)
1. (CXX) g++ options: -O3
VkResample 1.0 - Upscale: 2x - Precision: Double
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 67.17 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O3

VkResample 1.0 - Upscale: 2x - Precision: Single
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 6.458 (SE +/- 0.004, N = 3)
1. (CXX) g++ options: -O3
NCNN 20230517 - Target: Vulkan GPU - Model: mobilenet
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 20.26 (SE +/- 0.64, N = 12; MIN: 12.27 / MAX: 484.55)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 9.61 (SE +/- 0.22, N = 12; MIN: 5.58 / MAX: 420.93)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 8.75 (SE +/- 0.30, N = 12; MIN: 5.58 / MAX: 650.14)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: shufflenet-v2
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 11.47 (SE +/- 0.48, N = 12; MIN: 7.41 / MAX: 781.29)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: mnasnet
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 8.84 (SE +/- 0.28, N = 11; MIN: 5.2 / MAX: 374.25)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: efficientnet-b0
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 11.45 (SE +/- 0.31, N = 12; MIN: 6.89 / MAX: 391.33)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: blazeface
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 4.18 (SE +/- 0.17, N = 12; MIN: 2.41 / MAX: 320.85)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: googlenet
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 23.22 (SE +/- 0.54, N = 12; MIN: 14.77 / MAX: 462.71)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: vgg16
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 34.61 (SE +/- 0.48, N = 12; MIN: 22.47 / MAX: 479.77)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: resnet18
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 11.22 (SE +/- 0.39, N = 11; MIN: 7.35 / MAX: 325.84)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: alexnet
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 7.62 (SE +/- 0.23, N = 12; MIN: 5.05 / MAX: 280.54)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: resnet50
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 18.97 (SE +/- 0.59, N = 12; MIN: 11.9 / MAX: 714.42)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: yolov4-tiny
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 34.05 (SE +/- 0.68, N = 12; MIN: 20.65 / MAX: 478.65)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: squeezenet_ssd
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 19.60 (SE +/- 0.50, N = 12; MIN: 12.41 / MAX: 685.75)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: regnety_400m
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 24.95 (SE +/- 0.64, N = 12; MIN: 15.49 / MAX: 706.04)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: vision_transformer
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 53.21 (SE +/- 0.68, N = 12; MIN: 37.26 / MAX: 604.65)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20230517 - Target: Vulkan GPU - Model: FastestDet
ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 13.83 (SE +/- 0.37, N = 12; MIN: 6.02 / MAX: 368.18)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
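
The NCNN results above report maxima far larger than their averages, so it can help to quantify the spread. The sketch below assumes the reported SE follows the usual definition SE = s / sqrt(N), which the export itself does not state, and uses that to recover an approximate run-to-run standard deviation alongside the MAX-to-average ratio. The helper names and the three-model subset are illustrative; the numbers are copied from the results above.

```python
import math

# (average ms, standard error, N, min ms, max ms), copied from the NCNN results above.
NCNN = {
    "mobilenet": (20.26, 0.64, 12, 12.27, 484.55),
    "blazeface": (4.18, 0.17, 12, 2.41, 320.85),
    "vision_transformer": (53.21, 0.68, 12, 37.26, 604.65),
}


def std_dev_from_se(se, n):
    """Approximate sample standard deviation, ASSUMING SE = s / sqrt(N)."""
    return se * math.sqrt(n)


def max_over_average(average, maximum):
    """How many times larger the worst observed latency is than the average."""
    return maximum / average


if __name__ == "__main__":
    for model, (avg, se, n, lo, hi) in NCNN.items():
        sd = std_dev_from_se(se, n)
        print(f"{model:20s} avg={avg:6.2f} ms  sd~{sd:5.2f} ms  max/avg={max_over_average(avg, hi):5.1f}x")
```

A large MAX-to-average ratio next to a small standard deviation would be consistent with a few isolated slow inferences rather than broadly unstable runs, though the export does not provide enough detail to confirm that.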
Phoronix Test Suite v10.8.5