aztest

AMD Ryzen 5 5600G testing with a Gigabyte B450M DS3H-CF (F63c BIOS) and NVIDIA RTX A4000 16GB on Ubuntu 22.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2308121-NE-AZTEST58313.

aztest: NVIDIA RTX A4000
Processor: AMD Ryzen 5 5600G @ 3.90GHz (6 Cores / 12 Threads)
Motherboard: Gigabyte B450M DS3H-CF (F63c BIOS)
Chipset: AMD Renoir/Cezanne
Memory: 24GB
Disk: 1024GB ADATA SX6000LNP + Patriot M.2 P300 512GB + 500GB Samsung SSD 850
Graphics: NVIDIA RTX A4000 16GB
Audio: NVIDIA GA104 HD Audio
Monitor: LG ULTRAWIDE
Network: Realtek RTL8111/8168/8411
OS: Ubuntu 22.04
Kernel: 5.15.0-78-generic (x86_64)
Desktop: Xfce 4.16
Display Server: X Server 1.21.1.4
Display Driver: NVIDIA 535.86.10
OpenGL: 4.6.0
OpenCL: OpenCL 2.1 AMD-APP (3581.0) + OpenCL 3.0 CUDA 12.2.128
Compiler: GCC 11.4.0 + CUDA 12.2
File-System: ext4
Screen Resolution: 6000x1440

- Transparent Huge Pages: madvise
- CXXFLAGS="-O3 -march=native -Ofast -funsafe-math-optimizations" CMAKE_CUDA_FLAGS=-allow-unsupported-compiler CFLAGS="-O3 -march=native -Ofast -funsafe-math-optimizations"
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa50000d
- GLAMOR - BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 94.04.57.00.09 - GPU Compute Cores: 6144
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

aztest results summary (NVIDIA RTX A4000):

cl-mem: Copy = 35.0
vkpeak: fp32-scalar = 11340.39
vkpeak: fp32-vec4 = 14547.59
vkpeak: fp16-scalar = 11128.73
vkpeak: fp16-vec4 = 22115.72
vkpeak: fp64-scalar = 353.44
vkpeak: fp64-vec4 = 353.76
vkpeak: int32-scalar = 11216.21
vkpeak: int32-vec4 = 11016.36
vkpeak: int16-scalar = 7252.31
vkpeak: int16-vec4 = 8679.14
vkfft: FFT + iFFT R2C / C2R = 32268
vkfft: FFT + iFFT C2C 1D batched in half precision = 128766
vkfft: FFT + iFFT C2C Bluestein in single precision = 9771
vkfft: FFT + iFFT C2C 1D batched in double precision = 16788
vkfft: FFT + iFFT C2C 1D batched in single precision = 68041
vkfft: FFT + iFFT C2C multidimensional in single precision = 30639
vkfft: FFT + iFFT C2C Bluestein benchmark in double precision = 2676
vkfft: FFT + iFFT C2C 1D batched in single precision, no reshuffling = 70658

cl-mem

Benchmark: Copy

cl-mem 2017-01-13, Benchmark: Copy (GB/s, More Is Better)
NVIDIA RTX A4000: 35.0 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

vkpeak

fp32-scalar

vkpeak 20230730, fp32-scalar (GFLOPS, More Is Better)
NVIDIA RTX A4000: 11340.39 (SE +/- 96.10, N = 3)

vkpeak

fp32-vec4

vkpeak 20230730, fp32-vec4 (GFLOPS, More Is Better)
NVIDIA RTX A4000: 14547.59 (SE +/- 195.27, N = 3)

vkpeak

fp16-scalar

vkpeak 20230730, fp16-scalar (GFLOPS, More Is Better)
NVIDIA RTX A4000: 11128.73 (SE +/- 112.00, N = 3)

vkpeak

fp16-vec4

vkpeak 20230730, fp16-vec4 (GFLOPS, More Is Better)
NVIDIA RTX A4000: 22115.72 (SE +/- 77.14, N = 3)

vkpeak

fp64-scalar

vkpeak 20230730, fp64-scalar (GFLOPS, More Is Better)
NVIDIA RTX A4000: 353.44 (SE +/- 0.09, N = 3)

vkpeak

fp64-vec4

vkpeak 20230730, fp64-vec4 (GFLOPS, More Is Better)
NVIDIA RTX A4000: 353.76 (SE +/- 0.22, N = 3)

vkpeak

int32-scalar

vkpeak 20230730, int32-scalar (GIOPS, More Is Better)
NVIDIA RTX A4000: 11216.21 (SE +/- 9.02, N = 3)

vkpeak

int32-vec4

vkpeak 20230730, int32-vec4 (GIOPS, More Is Better)
NVIDIA RTX A4000: 11016.36 (SE +/- 13.81, N = 3)

vkpeak

int16-scalar

vkpeak 20230730, int16-scalar (GIOPS, More Is Better)
NVIDIA RTX A4000: 7252.31 (SE +/- 4.92, N = 3)

vkpeak

int16-vec4

vkpeak 20230730, int16-vec4 (GIOPS, More Is Better)
NVIDIA RTX A4000: 8679.14 (SE +/- 2.32, N = 3)

VkFFT

Test: FFT + iFFT R2C / C2R

VkFFT 1.2.31, Test: FFT + iFFT R2C / C2R (Benchmark Score, More Is Better)
NVIDIA RTX A4000: 32268 (SE +/- 289.61, N = 15)
1. (CXX) g++ options: -O3 -march=native -Ofast

VkFFT

Test: FFT + iFFT C2C 1D batched in half precision

VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in half precision (Benchmark Score, More Is Better)
NVIDIA RTX A4000: 128766 (SE +/- 795.78, N = 3)
1. (CXX) g++ options: -O3 -march=native -Ofast

VkFFT

Test: FFT + iFFT C2C Bluestein in single precision

VkFFT 1.2.31, Test: FFT + iFFT C2C Bluestein in single precision (Benchmark Score, More Is Better)
NVIDIA RTX A4000: 9771 (SE +/- 67.92, N = 3)
1. (CXX) g++ options: -O3 -march=native -Ofast

VkFFT

Test: FFT + iFFT C2C 1D batched in double precision

VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in double precision (Benchmark Score, More Is Better)
NVIDIA RTX A4000: 16788 (SE +/- 38.48, N = 3)
1. (CXX) g++ options: -O3 -march=native -Ofast

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision

VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in single precision (Benchmark Score, More Is Better)
NVIDIA RTX A4000: 68041 (SE +/- 40.21, N = 3)
1. (CXX) g++ options: -O3 -march=native -Ofast

VkFFT

Test: FFT + iFFT C2C multidimensional in single precision

VkFFT 1.2.31, Test: FFT + iFFT C2C multidimensional in single precision (Benchmark Score, More Is Better)
NVIDIA RTX A4000: 30639 (SE +/- 229.12, N = 3)
1. (CXX) g++ options: -O3 -march=native -Ofast

VkFFT

Test: FFT + iFFT C2C Bluestein benchmark in double precision

VkFFT 1.2.31, Test: FFT + iFFT C2C Bluestein benchmark in double precision (Benchmark Score, More Is Better)
NVIDIA RTX A4000: 2676 (SE +/- 1.45, N = 3)
1. (CXX) g++ options: -O3 -march=native -Ofast

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling

VkFFT 1.2.31, Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling (Benchmark Score, More Is Better)
NVIDIA RTX A4000: 70658 (SE +/- 40.86, N = 3)
1. (CXX) g++ options: -O3 -march=native -Ofast
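
Each result above is reported as a mean with a standard error (SE) over N runs, in units that differ from test to test. As a rough aid for comparing run-to-run noise across tests, the SE can be expressed as a percentage of the mean. The sketch below is only an illustration: the function name `relative_se_percent` and the `results` dictionary are hypothetical and are not part of the Phoronix Test Suite, and the numbers are copied by hand from a few of the results in this file.

```python
# Illustrative sketch, not part of the Phoronix Test Suite.
# (mean, standard error) pairs copied by hand from the results in this file.
results = {
    "cl-mem: Copy (GB/s)": (35.0, 0.03),
    "vkpeak: fp32-scalar (GFLOPS)": (11340.39, 96.10),
    "vkpeak: fp32-vec4 (GFLOPS)": (14547.59, 195.27),
    "vkpeak: fp64-scalar (GFLOPS)": (353.44, 0.09),
    "VkFFT: FFT + iFFT R2C / C2R (score)": (32268, 289.61),
}


def relative_se_percent(mean, se):
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


if __name__ == "__main__":
    for name, (mean, se) in results.items():
        print(f"{name}: {relative_se_percent(mean, se):.2f}% relative SE")
```

A lower percentage indicates more consistent runs. The values are unitless, so they can be compared between the GB/s, GFLOPS, GIOPS and score results.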


Phoronix Test Suite v10.8.5