opencl benchmark test AMD Ryzen Threadripper PRO 7995WX 96-Cores testing with a HP 8B24 (U65 Ver. 01.01.04 BIOS) and NVIDIA RTX A4000 16GB on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2401055-PTS-OPENCLBE94&rdt&grw .
opencl benchmark test Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server Display Driver OpenGL OpenCL Compiler File-System Screen Resolution a b c d AMD Ryzen Threadripper PRO 7995WX 96-Cores @ 6.44GHz (96 Cores / 192 Threads) HP 8B24 (U65 Ver. 01.01.04 BIOS) AMD Device 14a4 128GB 2 x 1024GB SAMSUNG MZVL21T0HCLR-00BH1 NVIDIA RTX A4000 16GB NVIDIA GA104 HD Audio ASUS VP28U Realtek RTL8111/8168/8411 Ubuntu 23.10 6.5.0-14-generic (x86_64) GNOME Shell 45.0 X Server 1.21.1.7 NVIDIA 535.129.03 4.6.0 OpenCL 3.0 CUDA 12.2.147 GCC 13.2.0 ext4 3840x2160 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa108105 Graphics Details - BAR1 / Visible vRAM Size: 256 MiB - vBIOS Version: 94.04.57.00.0b OpenCL Details - GPU Compute Cores: 6144 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
opencl benchmark test opencl-benchmark: Memory Bandwidth Coalesced Write opencl-benchmark: INT8 Compute opencl-benchmark: Memory Bandwidth Coalesced Read opencl-benchmark: INT16 Compute opencl-benchmark: INT32 Compute opencl-benchmark: INT64 Compute opencl-benchmark: FP32 Compute opencl-benchmark: FP64 Compute a b c d 406.12 8.244 399.34 8.703 10.376 2.857 22.051 0.355 406.20 8.118 399.33 8.625 10.253 2.833 22.001 0.354 406.19 8.218 399.36 8.599 10.344 2.859 22.046 0.353 406.33 8.069 399.26 8.533 10.344 2.854 21.995 0.353 OpenBenchmarking.org
ProjectPhysX OpenCL-Benchmark Operation: Memory Bandwidth Coalesced Write OpenBenchmarking.org GB/s, More Is Better ProjectPhysX OpenCL-Benchmark 1.2 Operation: Memory Bandwidth Coalesced Write a b c d 90 180 270 360 450 SE +/- 0.21, N = 3 SE +/- 0.25, N = 3 SE +/- 0.05, N = 3 SE +/- 0.10, N = 3 406.12 406.20 406.19 406.33 1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark Operation: INT8 Compute OpenBenchmarking.org TIOPs/s, More Is Better ProjectPhysX OpenCL-Benchmark 1.2 Operation: INT8 Compute a b c d 2 4 6 8 10 SE +/- 0.050, N = 3 SE +/- 0.043, N = 3 SE +/- 0.066, N = 3 SE +/- 0.025, N = 3 8.244 8.118 8.218 8.069 1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark Operation: Memory Bandwidth Coalesced Read OpenBenchmarking.org GB/s, More Is Better ProjectPhysX OpenCL-Benchmark 1.2 Operation: Memory Bandwidth Coalesced Read a b c d 90 180 270 360 450 SE +/- 0.04, N = 3 SE +/- 0.04, N = 3 SE +/- 0.02, N = 3 SE +/- 0.04, N = 3 399.34 399.33 399.36 399.26 1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark Operation: INT16 Compute OpenBenchmarking.org TIOPs/s, More Is Better ProjectPhysX OpenCL-Benchmark 1.2 Operation: INT16 Compute a b c d 2 4 6 8 10 SE +/- 0.012, N = 3 SE +/- 0.028, N = 3 SE +/- 0.092, N = 3 SE +/- 0.036, N = 3 8.703 8.625 8.599 8.533 1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark Operation: INT32 Compute OpenBenchmarking.org TIOPs/s, More Is Better ProjectPhysX OpenCL-Benchmark 1.2 Operation: INT32 Compute a b c d 3 6 9 12 15 SE +/- 0.03, N = 3 SE +/- 0.08, N = 3 SE +/- 0.08, N = 3 SE +/- 0.08, N = 3 10.38 10.25 10.34 10.34 1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark Operation: INT64 Compute OpenBenchmarking.org TIOPs/s, More Is Better ProjectPhysX OpenCL-Benchmark 1.2 Operation: INT64 Compute a b c d 0.6433 1.2866 1.9299 2.5732 3.2165 SE +/- 0.029, N = 3 SE +/- 0.018, N = 3 SE +/- 0.028, N = 3 SE +/- 0.029, N = 3 2.857 2.833 2.859 2.854 1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark Operation: FP32 Compute OpenBenchmarking.org TFLOPs/s, More Is Better ProjectPhysX OpenCL-Benchmark 1.2 Operation: FP32 Compute a b c d 5 10 15 20 25 SE +/- 0.04, N = 3 SE +/- 0.08, N = 3 SE +/- 0.04, N = 3 SE +/- 0.07, N = 3 22.05 22.00 22.05 22.00 1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
ProjectPhysX OpenCL-Benchmark Operation: FP64 Compute OpenBenchmarking.org TFLOPs/s, More Is Better ProjectPhysX OpenCL-Benchmark 1.2 Operation: FP64 Compute a b c d 0.0799 0.1598 0.2397 0.3196 0.3995 SE +/- 0.002, N = 3 SE +/- 0.002, N = 3 SE +/- 0.001, N = 3 SE +/- 0.001, N = 3 0.355 0.354 0.353 0.353 1. (CXX) g++ options: -std=c++17 -pthread -lOpenCL
Phoronix Test Suite v10.8.5