testgpu

AMD Ryzen Threadripper PRO 7965WX 24-Cores testing with an ASUS Pro WS WRX90E-SAGE SE (0803 BIOS) and Gigabyte AMD Radeon RX 7900 XT 20GB on Debian via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2412054-LORE-TESTGPU68.

testgpu: system details for Gigabyte AMD Radeon RX 7900 XT

Processor: AMD Ryzen Threadripper PRO 7965WX 24-Cores @ 5.36GHz (24 Cores / 48 Threads)
Motherboard: ASUS Pro WS WRX90E-SAGE SE (0803 BIOS)
Chipset: AMD Genoa/Bergamo
Memory: 128GB
Disk: 2 x 2000GB CT2000T705SSD3
Graphics: Gigabyte AMD Radeon RX 7900 XT 20GB (2175/1249MHz)
Audio: AMD Device 14cc
Monitor: DELL U2723QE
Network: 2 x Intel X710 for 10GBASE-T
OS: Debian
Kernel: 6.11.10-amd64 (x86_64)
Desktop: KDE Plasma 6.2.3
Display Server: X Server 1.21.1.14 + Wayland
OpenGL: 4.6 Mesa 24.2.8-1 (LLVM 19.1.4 DRM 3.59)
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 3840x2160

- Transparent Huge Pages: always
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=3 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/reproducible-path/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/reproducible-path/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: amd-pstate-epp powersave (Boost: Enabled EPP: balance_performance) - CPU Microcode: 0xa108108
- GLAMOR - BAR1 / Visible vRAM Size: 256 MB - vBIOS Version: 113-EXT91531-001
- Python 3.12.7
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

testgpu: result overview for Gigabyte AMD Radeon RX 7900 XT

  Result  Test
24981.03  vkpeak: fp32-scalar
21318.02  vkpeak: fp32-vec4
24824.27  vkpeak: fp16-scalar
46366.43  vkpeak: fp16-vec4
  991.49  vkpeak: fp64-scalar
  990.61  vkpeak: fp64-vec4
 6109.64  vkpeak: int32-scalar
 6081.94  vkpeak: int32-vec4
23645.42  vkpeak: int16-scalar
46788.55  vkpeak: int16-vec4
   4.470  realsr-ncnn: 4x - No
  19.490  realsr-ncnn: 4x - Yes
   4.260  waifu2x-ncnn: 2x - 3 - Yes
   64329  vkfft: FFT + iFFT R2C / C2R
  110579  vkfft: FFT + iFFT C2C 1D batched in half precision
   17488  vkfft: FFT + iFFT C2C Bluestein in single precision
   21183  vkfft: FFT + iFFT C2C 1D batched in double precision
   77961  vkfft: FFT + iFFT C2C 1D batched in single precision
   54453  vkfft: FFT + iFFT C2C multidimensional in single precision
    5252  vkfft: FFT + iFFT C2C Bluestein benchmark in double precision
   81768  vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling
   67.17  vkresample: 2x - Double
   6.458  vkresample: 2x - Single
   20.26  ncnn: Vulkan GPU - mobilenet
    9.61  ncnn: Vulkan GPU-v2-v2 - mobilenet-v2
    8.75  ncnn: Vulkan GPU-v3-v3 - mobilenet-v3
   11.47  ncnn: Vulkan GPU - shufflenet-v2
    8.84  ncnn: Vulkan GPU - mnasnet
   11.45  ncnn: Vulkan GPU - efficientnet-b0
    4.18  ncnn: Vulkan GPU - blazeface
   23.22  ncnn: Vulkan GPU - googlenet
   34.61  ncnn: Vulkan GPU - vgg16
   11.22  ncnn: Vulkan GPU - resnet18
    7.62  ncnn: Vulkan GPU - alexnet
   18.97  ncnn: Vulkan GPU - resnet50
   34.05  ncnn: Vulkan GPU - yolov4-tiny
   19.60  ncnn: Vulkan GPU - squeezenet_ssd
   24.95  ncnn: Vulkan GPU - regnety_400m
   53.21  ncnn: Vulkan GPU - vision_transformer
   13.83  ncnn: Vulkan GPU - FastestDet

vkpeak

fp32-scalar

vkpeak 20240505 - fp32-scalar - GFLOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 24981.03 (SE +/- 37.96, N = 3)

vkpeak

fp32-vec4

vkpeak 20240505 - fp32-vec4 - GFLOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 21318.02 (SE +/- 22.98, N = 3)

vkpeak

fp16-scalar

vkpeak 20240505 - fp16-scalar - GFLOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 24824.27 (SE +/- 48.07, N = 3)

vkpeak

fp16-vec4

vkpeak 20240505 - fp16-vec4 - GFLOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 46366.43 (SE +/- 169.42, N = 3)

vkpeak

fp64-scalar

vkpeak 20240505 - fp64-scalar - GFLOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 991.49 (SE +/- 0.10, N = 3)

vkpeak

fp64-vec4

vkpeak 20240505 - fp64-vec4 - GFLOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 990.61 (SE +/- 0.28, N = 3)

vkpeak

int32-scalar

vkpeak 20240505 - int32-scalar - GIOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 6109.64 (SE +/- 8.60, N = 3)

vkpeak

int32-vec4

vkpeak 20240505 - int32-vec4 - GIOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 6081.94 (SE +/- 19.61, N = 3)

vkpeak

int16-scalar

vkpeak 20240505 - int16-scalar - GIOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 23645.42 (SE +/- 42.40, N = 3)

vkpeak

int16-vec4

vkpeak 20240505 - int16-vec4 - GIOPS, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 46788.55 (SE +/- 194.52, N = 3)
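
The vkpeak figures are easier to compare as ratios than as raw throughput. The following is a minimal Python sketch, not part of the exported result, that derives a few such ratios from the values reported above. The `vkpeak` dictionary and the `ratio` helper are illustrative names introduced here; the choice of which pairs to compare is also an assumption.

```python
# vkpeak results copied from the sections above
# (GFLOPS for the fp* tests, GIOPS for the int* tests).
vkpeak = {
    "fp32-scalar": 24981.03,
    "fp32-vec4": 21318.02,
    "fp16-scalar": 24824.27,
    "fp16-vec4": 46366.43,
    "fp64-scalar": 991.49,
    "fp64-vec4": 990.61,
    "int32-scalar": 6109.64,
    "int32-vec4": 6081.94,
    "int16-scalar": 23645.42,
    "int16-vec4": 46788.55,
}


def ratio(numerator: str, denominator: str) -> float:
    """Throughput of one vkpeak test relative to another (hypothetical helper)."""
    return vkpeak[numerator] / vkpeak[denominator]


# Pairs chosen for illustration: vec4 versus scalar within one data type,
# then other data types relative to fp32-scalar.
comparisons = [
    ("fp16-vec4", "fp16-scalar"),
    ("int16-vec4", "int16-scalar"),
    ("fp32-vec4", "fp32-scalar"),
    ("fp64-scalar", "fp32-scalar"),
    ("int32-scalar", "fp32-scalar"),
]

for num, den in comparisons:
    print(f"{num} / {den}: {ratio(num, den):.3f}")
```

On these numbers the 16-bit vec4 tests land at a little under twice their scalar counterparts, while fp64-scalar comes out at roughly 4% of fp32-scalar. The ratios describe this one run on this one system and say nothing about why the hardware or driver behaves that way.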

RealSR-NCNN

Scale: 4x - TAA: No

RealSR-NCNN 20200818 - Scale: 4x - TAA: No - Seconds, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 4.470 (SE +/- 0.035, N = 3)

RealSR-NCNN

Scale: 4x - TAA: Yes

RealSR-NCNN 20200818 - Scale: 4x - TAA: Yes - Seconds, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 19.49 (SE +/- 0.04, N = 3)

Waifu2x-NCNN Vulkan

Scale: 2x - Denoise: 3 - TAA: Yes

Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: Yes - Seconds, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 4.260 (SE +/- 0.004, N = 3)

VkFFT

Test: FFT + iFFT R2C / C2R

VkFFT 1.3.4 - Test: FFT + iFFT R2C / C2R - Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 64329 (SE +/- 266.07, N = 3)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in half precision

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in half precision - Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 110579 (SE +/- 177.96, N = 3)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C Bluestein in single precision

VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein in single precision - Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 17488 (SE +/- 118.39, N = 3)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in double precision

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in double precision - Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 21183 (SE +/- 8.99, N = 3)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision - Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 77961 (SE +/- 34.71, N = 3)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C multidimensional in single precision

VkFFT 1.3.4 - Test: FFT + iFFT C2C multidimensional in single precision - Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 54453 (SE +/- 305.40, N = 3)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C Bluestein benchmark in double precision

VkFFT 1.3.4 - Test: FFT + iFFT C2C Bluestein benchmark in double precision - Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 5252 (SE +/- 8.45, N = 3)
1. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling

VkFFT 1.3.4 - Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling - Benchmark Score, More Is Better
Gigabyte AMD Radeon RX 7900 XT: 81768 (SE +/- 34.57, N = 3)
1. (CXX) g++ options: -O3

VkResample

Upscale: 2x - Precision: Double

VkResample 1.0 - Upscale: 2x - Precision: Double - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 67.17 (SE +/- 0.15, N = 3)
1. (CXX) g++ options: -O3

VkResample

Upscale: 2x - Precision: Single

VkResample 1.0 - Upscale: 2x - Precision: Single - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 6.458 (SE +/- 0.004, N = 3)
1. (CXX) g++ options: -O3

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20230517 - Target: Vulkan GPU - Model: mobilenet - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 20.26 (SE +/- 0.64, N = 12; MIN: 12.27 / MAX: 484.55)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

NCNN 20230517 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 9.61 (SE +/- 0.22, N = 12; MIN: 5.58 / MAX: 420.93)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

NCNN 20230517 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 8.75 (SE +/- 0.30, N = 12; MIN: 5.58 / MAX: 650.14)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20230517 - Target: Vulkan GPU - Model: shufflenet-v2 - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 11.47 (SE +/- 0.48, N = 12; MIN: 7.41 / MAX: 781.29)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20230517 - Target: Vulkan GPU - Model: mnasnet - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 8.84 (SE +/- 0.28, N = 11; MIN: 5.2 / MAX: 374.25)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20230517 - Target: Vulkan GPU - Model: efficientnet-b0 - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 11.45 (SE +/- 0.31, N = 12; MIN: 6.89 / MAX: 391.33)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20230517 - Target: Vulkan GPU - Model: blazeface - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 4.18 (SE +/- 0.17, N = 12; MIN: 2.41 / MAX: 320.85)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20230517 - Target: Vulkan GPU - Model: googlenet - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 23.22 (SE +/- 0.54, N = 12; MIN: 14.77 / MAX: 462.71)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20230517 - Target: Vulkan GPU - Model: vgg16 - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 34.61 (SE +/- 0.48, N = 12; MIN: 22.47 / MAX: 479.77)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20230517 - Target: Vulkan GPU - Model: resnet18 - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 11.22 (SE +/- 0.39, N = 11; MIN: 7.35 / MAX: 325.84)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20230517 - Target: Vulkan GPU - Model: alexnet - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 7.62 (SE +/- 0.23, N = 12; MIN: 5.05 / MAX: 280.54)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20230517 - Target: Vulkan GPU - Model: resnet50 - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 18.97 (SE +/- 0.59, N = 12; MIN: 11.9 / MAX: 714.42)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20230517 - Target: Vulkan GPU - Model: yolov4-tiny - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 34.05 (SE +/- 0.68, N = 12; MIN: 20.65 / MAX: 478.65)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20230517 - Target: Vulkan GPU - Model: squeezenet_ssd - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 19.60 (SE +/- 0.50, N = 12; MIN: 12.41 / MAX: 685.75)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20230517 - Target: Vulkan GPU - Model: regnety_400m - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 24.95 (SE +/- 0.64, N = 12; MIN: 15.49 / MAX: 706.04)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vision_transformer

NCNN 20230517 - Target: Vulkan GPU - Model: vision_transformer - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 53.21 (SE +/- 0.68, N = 12; MIN: 37.26 / MAX: 604.65)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: FastestDet

NCNN 20230517 - Target: Vulkan GPU - Model: FastestDet - ms, Fewer Is Better
Gigabyte AMD Radeon RX 7900 XT: 13.83 (SE +/- 0.37, N = 12; MIN: 6.02 / MAX: 368.18)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
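
Each result above carries an "SE +/-" value and a run count N, which together give a rough sense of run-to-run noise. The following Python sketch is an illustration, not something the exported page contains. It assumes the reported SE is the standard error of the mean (sample standard deviation divided by the square root of N), which this page does not state. The `results` dictionary, the two helper functions, and the selection of tests are introduced here for illustration only.

```python
import math

# (mean, standard error, run count N) copied from the results above.
results = {
    "vkpeak fp32-scalar (GFLOPS)": (24981.03, 37.96, 3),
    "VkFFT R2C / C2R (score)": (64329, 266.07, 3),
    "NCNN mobilenet (ms)": (20.26, 0.64, 12),
    "NCNN blazeface (ms)": (4.18, 0.17, 12),
}


def relative_se_percent(mean: float, se: float) -> float:
    """Standard error expressed as a percentage of the mean."""
    return 100.0 * se / mean


def implied_std_dev(se: float, n: int) -> float:
    """Sample standard deviation implied by SE, ASSUMING SE = s / sqrt(N)."""
    return se * math.sqrt(n)


for name, (mean, se, n) in results.items():
    print(
        f"{name}: relative SE {relative_se_percent(mean, se):.2f}%, "
        f"implied std dev {implied_std_dev(se, n):.2f} over N = {n}"
    )
```

On these inputs the vkpeak and VkFFT entries come out well under 1% relative SE, while the two NCNN entries are in the 3% to 4% range. That is consistent with the NCNN MAX values sitting far above their averages (for example 484.55 ms against a 20.26 ms mean for mobilenet), though the page itself does not explain the outliers.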


Phoronix Test Suite v10.8.5