opencl-set0-yoda-prehw

Test run after swapping in the new hardware and installing Manjaro, using AMD's ROCm build of OpenCL.

HTML result view exported from: https://openbenchmarking.org/result/2311085-BILL-231107406&sro.

opencl-set0-yoda-prehw

20230927-trial, 20230928_preswitch
- Processor: Intel Core i7-7700 @ 4.20GHz (4 Cores / 8 Threads)
- Motherboard: ASUS PRIME H270M-PLUS (0809 BIOS)
- Chipset: Intel Xeon E3-1200 v6/7th + H270
- Memory: 32GB
- Disk: Samsung SSD 960 EVO 250GB + 1000GB Samsung SSD 970 EVO Plus 1TB + 3001GB Western Digital WD30EFRX-68E
- Graphics: Sapphire AMD Radeon RX 6700 XT 12GB (2725/1000MHz)
- Audio: Realtek ALC887-VD
- Monitor: PB248
- Network: Intel I219-V + Intel Wi-Fi 6 AX200
- OS: Ubuntu 22.04
- Kernel: 5.15.0-84-generic (x86_64)
- Desktop: GNOME Shell 42.9
- Display Server: X Server 1.21.1.3
- OpenGL: 4.6 Mesa 23.2.0-devel (LLVM 16.0.6 DRM 3.54)
- OpenCL: OpenCL 2.1 AMD-APP (3590.0)
- Vulkan: 1.3.252
- Compiler: GCC 11.4.0 + LLVM 14.0.0
- File-System: ext4 (ecryptfs)
- Screen Resolution: 1920x1200

20231020_postswitchperf (fields that differ from the runs above)
- Processor: AMD Ryzen 9 7950X3D 16-Core @ 4.20GHz (16 Cores / 32 Threads)
- Motherboard: ASUS TUF GAMING B650M-PLUS WIFI (0823 BIOS)
- Chipset: AMD Device 14d8
- Memory: 62GB
- Disk: 1000GB Samsung SSD 970 EVO Plus 1TB + Samsung SSD 970 EVO Plus 500GB + 3001GB Western Digital WD30EFRX-68E
- Audio: AMD Navi 21 HDMI Audio
- Network: Realtek RTL8125 2.5GbE + Realtek Device b852
- Kernel: 5.15.0-86-generic (x86_64)

20231107_postswitch_mjrperf_opencl_cmnt (fields that differ from the runs above)
- Processor: AMD Ryzen 9 7950X3D 16-Core @ 5.76GHz (16 Cores / 32 Threads)
- Disk: 1000GB Samsung SSD 970 EVO Plus 1TB + Samsung SSD 970 EVO Plus 500GB + 3001GB Western Digital WD30EFRX-68E + 4001GB Rugged USB-C + 3001GB Elements 25A2
- Graphics: Sapphire AMD Radeon RX 6700 XT 12GB (2200/2400MHz)
- Audio: AMD Navi 21/23
- OS: ManjaroLinux 23.1.0
- Kernel: 6.5.9-1-MANJARO (x86_64)
- Desktop: KDE Plasma 5.27.9
- Display Server: X Server 1.21.1.9
- OpenGL: 4.6 Mesa 23.1.9-manjaro1.1 (LLVM 16.0.6 DRM 3.54)
- Compiler: GCC 13.2.1 20230801 + Clang 16.0.6 + LLVM 16.0.6
- File-System: btrfs

20231107_postswitch_mjrperf_opencl_rocm (fields that differ from the runs above)
- OpenCL: OpenCL 2.1 AMD-APP.dbg (3570.0)

Kernel Details
- 20230927-trial: Transparent Huge Pages: madvise
- 20230928_preswitch: Transparent Huge Pages: madvise
- 20231020_postswitchperf: Transparent Huge Pages: madvise
- 20231107_postswitch_mjrperf_opencl_cmnt: Transparent Huge Pages: always
- 20231107_postswitch_mjrperf_opencl_rocm: Transparent Huge Pages: always

Compiler Details
- 20230927-trial, 20230928_preswitch, 20231020_postswitchperf (identical): --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- 20231107_postswitch_mjrperf_opencl_cmnt, 20231107_postswitch_mjrperf_opencl_rocm (identical): --disable-libssp --disable-libstdcxx-pch --disable-werror --enable-__cxa_atexit --enable-bootstrap --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=ada,c,c++,d,fortran,go,lto,objc,obj-c++ --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-build-config=bootstrap-lto --with-linker-hash-style=gnu

Processor Details
- 20230927-trial: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0xf4 - Thermald 2.4.9
- 20230928_preswitch: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0xf4 - Thermald 2.4.9
- 20231020_postswitchperf: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa601203
- 20231107_postswitch_mjrperf_opencl_cmnt: Scaling Governor: amd-pstate-epp performance (EPP: performance) - CPU Microcode: 0xa601203
- 20231107_postswitch_mjrperf_opencl_rocm: Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa601203

Graphics Details
- 20230927-trial: BAR1 / Visible vRAM Size: 256 MB - vBIOS Version: 113-D5122200-S05
- 20230928_preswitch: BAR1 / Visible vRAM Size: 256 MB - vBIOS Version: 113-D5122200-S05
- 20231020_postswitchperf: BAR1 / Visible vRAM Size: 12272 MB - vBIOS Version: 113-D5122200-S05
- 20231107_postswitch_mjrperf_opencl_cmnt: GLAMOR - BAR1 / Visible vRAM Size: 12272 MB - vBIOS Version: 102-RAPHAEL-006
- 20231107_postswitch_mjrperf_opencl_rocm: GLAMOR - BAR1 / Visible vRAM Size: 12272 MB - vBIOS Version: 102-RAPHAEL-006

Python Details
- 20230927-trial: Python 3.10.12
- 20230928_preswitch: Python 3.10.12
- 20231020_postswitchperf: Python 3.10.12
- 20231107_postswitch_mjrperf_opencl_cmnt: Python 3.10.13
- 20231107_postswitch_mjrperf_opencl_rocm: Python 3.10.13

Security Details
- 20230927-trial, 20230928_preswitch (identical): gather_data_sampling: Mitigation of Microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable + mds: Mitigation of Clear buffers; SMT vulnerable + meltdown: Mitigation of PTI + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Mitigation of IBRS + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of IBRS IBPB: conditional STIBP: conditional RSB filling PBRSB-eIBRS: Not affected + srbds: Mitigation of Microcode + tsx_async_abort: Mitigation of TSX disabled
- 20231020_postswitchperf: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET no microcode + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
- 20231107_postswitch_mjrperf_opencl_cmnt, 20231107_postswitch_mjrperf_opencl_rocm (identical): gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET no microcode + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

opencl-set0-yoda-prehw: results overview ("-" means the test has no result for that run)

Test | 20230927-trial | 20230928_preswitch | 20231020_postswitchperf | 20231107_postswitch_mjrperf_opencl_cmnt | 20231107_postswitch_mjrperf_opencl_rocm
shoc: OpenCL - Triad | 11.1138 | 10.9247 | 13.5105 | - | 23.1767
cl-mem: Copy | 281.3 | 281.2 | 281.5 | 281.8 | 281.3
fluidx3d: FP32-FP16S | 2689 | 2716 | 2620 | 2778 | 2704
clpeak: Double-Precision Compute | 808.92 | 808.68 | 788.75 | 801.19 | 795.31
parboil: OpenMP Stencil | 23.165349 | 22.160171 | 4.738262 | - | -
rodinia: OpenCL Myocyte | 12.063 | 8.657 | 47.627 | 22.363 | 52.376
viennacl: OpenCL BLAS - sCOPY | 413 | 429 | 422 | 423 | 442
viennacl: OpenCL BLAS - sAXPY | 620 | 598 | 590 | 575 | 626
viennacl: OpenCL BLAS - sDOT | 322 | 305 | 335 | 369 | 361
viennacl: OpenCL BLAS - dCOPY | 201 | 194 | 270 | 268 | 270
viennacl: OpenCL BLAS - dDOT | 187 | 210 | 326 | 316 | 324
viennacl: OpenCL BLAS - dGEMV-N | 75.8 | 76.1 | 102 | 102 | 102
viennacl: OpenCL BLAS - dGEMV-T | 173 | 199 | 290 | 281 | 286
viennacl: OpenCL BLAS - dGEMM-NN | 705 | 712 | 700 | 705 | 702
viennacl: OpenCL BLAS - dGEMM-NT | 732 | 733 | 722 | 736 | 724
viennacl: OpenCL BLAS - dGEMM-TN | 710 | 708 | 701 | 703 | 696
viennacl: OpenCL BLAS - dGEMM-TT | 729 | 730 | 716 | 731 | 719
viennacl: OpenCL BLAS - dAXPY | 221 | 223 | 302 | 298 | 301
darktable: Masskrug - OpenCL | 5.216 | 5.330 | 1.382 | 1.363 | -
darktable: Masskrug - CPU-only | 7.894 | 8.008 | 1.833 | 1.811 | 1.874
juliagpu: GPU | 137240757.5 | 142637305.6 | 428957016.6 | 574416637.8 | 570297541.0
juliagpu: CPU+GPU | 135315754.4 | 139394174.9 | 411077431.4 | 573384003.0 | 568108729.7
mandelbulbgpu: CPU+GPU | 79519746.9 | 79087731.3 | 222793824.3 | 328322358.4 | 329089248.8
mandelgpu: CPU+GPU | 266133232.1 | 268569163.2 | 330203056.3 | 316262936.2 | 315714433.5
smallpt-gpu: GPU - Caustic3 | 1695834535 | 1695878016 | 1697804201 | 1699389555 | 1699398930
luxmark: GPU - Microphone | 30063 | 29797 | 35982 | 35717 | 35105
luxmark: CPU+GPU - Microphone | 29792 | 30132 | 35363 | 35355 | 34810
lulesh-cl: | 2953.3839 | 2913.8920 | 3122.5070 | 3350.3876 | 3332.6619

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, more is better)
- 20230927-trial: 11.11 (SE +/- 0.16, N = 5)
- 20230928_preswitch: 10.92 (SE +/- 0.22, N = 3)
- 20231020_postswitchperf: 13.51 (SE +/- 0.14, N = 15)
- 20231107_postswitch_mjrperf_opencl_rocm: 23.18 (SE +/- 0.26, N = 4)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
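
To put the Triad numbers in perspective, the following minimal Python sketch computes each run's ratio to 20230928_preswitch and expresses the gap in units of the combined standard error. The means and SE values are copied from the result above; the `triad` dictionary, the `compare` helper, and the choice of baseline are illustrative assumptions, and treating the runs as independent samples is a simplification.

```python
import math

# (mean GB/s, standard error) per run, copied from the SHOC Triad result.
triad = {
    "20230927-trial": (11.11, 0.16),
    "20230928_preswitch": (10.92, 0.22),
    "20231020_postswitchperf": (13.51, 0.14),
    "20231107_postswitch_mjrperf_opencl_rocm": (23.18, 0.26),
}


def compare(baseline, candidate):
    """Return (ratio, gap in units of the combined standard error).

    Hypothetical helper: assumes the two runs are independent, so their
    standard errors add in quadrature.
    """
    b_mean, b_se = triad[baseline]
    c_mean, c_se = triad[candidate]
    combined_se = math.sqrt(b_se ** 2 + c_se ** 2)
    return c_mean / b_mean, (c_mean - b_mean) / combined_se


for run in triad:
    ratio, gap = compare("20230928_preswitch", run)
    print(f"{run}: {ratio:.2f}x of preswitch, gap = {gap:+.1f} combined SE")
```

A gap of less than about one combined SE (as between the two pre-switch runs) is within run-to-run noise, while the post-switch runs differ by many SE.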

cl-mem

Benchmark: Copy

cl-mem 2017-01-13 (GB/s, more is better)
- 20230927-trial: 281.3 (SE +/- 0.09, N = 3)
- 20230928_preswitch: 281.2 (SE +/- 0.09, N = 3)
- 20231020_postswitchperf: 281.5 (SE +/- 0.17, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 281.8 (SE +/- 0.06, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 281.3 (SE +/- 0.15, N = 3)
1. (CC) gcc options: -O2 -flto -lOpenCL

FluidX3D

Test: FP32-FP16S

FluidX3D 2.3 (MLUPs/s, more is better)
- 20230927-trial: 2689 (SE +/- 6.17, N = 3)
- 20230928_preswitch: 2716 (SE +/- 2.85, N = 3)
- 20231020_postswitchperf: 2620 (SE +/- 5.04, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 2778 (SE +/- 16.05, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 2704 (SE +/- 4.91, N = 3)

clpeak

OpenCL Test: Double-Precision Compute

clpeak 1.1.2 (GFLOPS, more is better)
- 20230927-trial: 808.92 (SE +/- 0.21, N = 3)
- 20230928_preswitch: 808.68 (SE +/- 0.28, N = 3)
- 20231020_postswitchperf: 788.75 (SE +/- 0.44, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 801.19 (SE +/- 0.47, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 795.31 (SE +/- 0.38, N = 3)
1. (CXX) g++ options: -O3

Parboil

Test: OpenMP Stencil

Parboil 2.5 (Seconds, fewer is better)
- 20230927-trial: 23.165349 (SE +/- 0.148343, N = 3)
- 20230928_preswitch: 22.160171 (SE +/- 0.024061, N = 3)
- 20231020_postswitchperf: 4.738262 (SE +/- 0.054079, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Rodinia

Test: OpenCL Myocyte

Rodinia 3.1 (Seconds, fewer is better)
- 20230927-trial: 12.063 (SE +/- 3.477, N = 15)
- 20230928_preswitch: 8.657 (SE +/- 0.142, N = 4)
- 20231020_postswitchperf: 47.627 (SE +/- 11.383, N = 15)
- 20231107_postswitch_mjrperf_opencl_cmnt: 22.363 (SE +/- 5.754, N = 15)
- 20231107_postswitch_mjrperf_opencl_rocm: 52.376 (SE +/- 0.393, N = 3)
1. (CXX) g++ options: -O2 -lOpenCL
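
Several Myocyte runs report a large SE relative to their mean, so the averages deserve some caution. The sketch below recovers an approximate standard deviation from each SE and N and flags runs whose spread is large. It assumes the reported SE is the sample standard deviation divided by sqrt(N); the `myocyte` dictionary, the `spread` helper, and the 0.25 cutoff are hypothetical choices for illustration, not part of the Phoronix Test Suite.

```python
import math

# (mean seconds, standard error, N) per run, copied from the Myocyte result.
myocyte = {
    "20230927-trial": (12.063, 3.477, 15),
    "20230928_preswitch": (8.657, 0.142, 4),
    "20231020_postswitchperf": (47.627, 11.383, 15),
    "20231107_postswitch_mjrperf_opencl_cmnt": (22.363, 5.754, 15),
    "20231107_postswitch_mjrperf_opencl_rocm": (52.376, 0.393, 3),
}

NOISY_CV = 0.25  # hypothetical cutoff for "too noisy to compare"


def spread(mean, se, n):
    """Return (standard deviation, coefficient of variation).

    Assumes SE = SD / sqrt(N), so SD = SE * sqrt(N).
    """
    sd = se * math.sqrt(n)
    return sd, sd / mean


noisy = []
for run, (mean, se, n) in myocyte.items():
    sd, cv = spread(mean, se, n)
    if cv > NOISY_CV:
        noisy.append(run)
    print(f"{run}: SD ~ {sd:.2f} s, CV ~ {cv:.0%}")

print("noisy runs:", noisy)
```

Under these assumptions the three N = 15 runs have a spread close to their own mean, which is consistent with the test suite having extended them to 15 samples.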

ViennaCL

Test: OpenCL BLAS - sCOPY

ViennaCL 1.7.1 (GB/s, more is better)
- 20230927-trial: 413 (SE +/- 19.45, N = 12)
- 20230928_preswitch: 429 (SE +/- 3.21, N = 3)
- 20231020_postswitchperf: 422 (SE +/- 5.77, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 423 (SE +/- 4.06, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 442 (SE +/- 4.91, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sAXPY

ViennaCL 1.7.1 (GB/s, more is better)
- 20230927-trial: 620 (SE +/- 0.69, N = 12)
- 20230928_preswitch: 598 (SE +/- 1.45, N = 3)
- 20231020_postswitchperf: 590 (SE +/- 8.08, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 575 (SE +/- 0.58, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 626 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sDOT

ViennaCL 1.7.1 (GB/s, more is better)
- 20230927-trial: 322 (SE +/- 3.17, N = 12)
- 20230928_preswitch: 305 (SE +/- 4.62, N = 3)
- 20231020_postswitchperf: 335 (SE +/- 5.55, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 369 (SE +/- 0.88, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 361 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dCOPY

ViennaCL 1.7.1 (GB/s, more is better)
- 20230927-trial: 201 (SE +/- 5.02, N = 12)
- 20230928_preswitch: 194 (SE +/- 11.57, N = 3)
- 20231020_postswitchperf: 270 (SE +/- 0.00, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 268 (SE +/- 0.58, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 270 (SE +/- 1.20, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dDOT

ViennaCL 1.7.1 (GB/s, more is better)
- 20230927-trial: 187 (SE +/- 11.40, N = 12)
- 20230928_preswitch: 210 (SE +/- 28.01, N = 3)
- 20231020_postswitchperf: 326 (SE +/- 1.33, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 316 (SE +/- 0.67, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 324 (SE +/- 0.88, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-N

ViennaCL 1.7.1 (GB/s, more is better)
- 20230927-trial: 75.8 (SE +/- 1.09, N = 12)
- 20230928_preswitch: 76.1 (SE +/- 0.33, N = 3)
- 20231020_postswitchperf: 102.0 (SE +/- 0.00, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 102.0 (SE +/- 0.33, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 102.0 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-T

ViennaCL 1.7.1 (GB/s, more is better)
- 20230927-trial: 173 (SE +/- 12.99, N = 11)
- 20230928_preswitch: 199 (SE +/- 27.91, N = 3)
- 20231020_postswitchperf: 290 (SE +/- 1.00, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 281 (SE +/- 3.51, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 286 (SE +/- 1.20, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NN

ViennaCL 1.7.1 (GFLOPs/s, more is better)
- 20230927-trial: 705 (SE +/- 0.36, N = 12)
- 20230928_preswitch: 712 (SE +/- 0.33, N = 3)
- 20231020_postswitchperf: 700 (SE +/- 0.58, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 705 (SE +/- 1.20, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 702 (SE +/- 0.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NT

ViennaCL 1.7.1 (GFLOPs/s, more is better)
- 20230927-trial: 732 (SE +/- 0.29, N = 12)
- 20230928_preswitch: 733 (SE +/- 0.58, N = 3)
- 20231020_postswitchperf: 722 (SE +/- 0.58, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 736 (SE +/- 0.58, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 724 (SE +/- 0.88, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TN

ViennaCL 1.7.1 (GFLOPs/s, more is better)
- 20230927-trial: 710 (SE +/- 0.29, N = 12)
- 20230928_preswitch: 708 (SE +/- 0.67, N = 3)
- 20231020_postswitchperf: 701 (SE +/- 0.67, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 703 (SE +/- 1.00, N = 2)
- 20231107_postswitch_mjrperf_opencl_rocm: 696 (SE +/- 1.20, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TT

ViennaCL 1.7.1 (GFLOPs/s, more is better)
- 20230927-trial: 729 (SE +/- 0.26, N = 12)
- 20230928_preswitch: 730 (SE +/- 0.58, N = 3)
- 20231020_postswitchperf: 716 (SE +/- 2.33, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 731 (SE +/- 0.88, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 719 (SE +/- 0.67, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dAXPY

ViennaCL 1.7.1 (GB/s, more is better)
- 20230927-trial: 221 (SE +/- 2.94, N = 11)
- 20230928_preswitch: 223 (SE +/- 5.04, N = 3)
- 20231020_postswitchperf: 302 (SE +/- 0.33, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 298 (SE +/- 0.67, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 301 (SE +/- 1.33, N = 3)
1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

Darktable

Test: Masskrug - Acceleration: OpenCL

Darktable 4.4.2 (Seconds, fewer is better)
- 20230927-trial: 5.216 (SE +/- 0.051, N = 3)
- 20230928_preswitch: 5.330 (SE +/- 0.020, N = 3)
- 20231020_postswitchperf: 1.382 (SE +/- 0.019, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 1.363 (SE +/- 0.003, N = 3)

Darktable

Test: Masskrug - Acceleration: CPU-only

Darktable 4.4.2 (Seconds, fewer is better)
- 20230927-trial: 7.894 (SE +/- 0.007, N = 3)
- 20230928_preswitch: 8.008 (SE +/- 0.016, N = 3)
- 20231020_postswitchperf: 1.833 (SE +/- 0.003, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 1.811 (SE +/- 0.004, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 1.874 (SE +/- 0.009, N = 3)

JuliaGPU

OpenCL Device: GPU

JuliaGPU 1.2pts1 (Samples/sec, more is better)
- 20230927-trial: 137240757.5 (SE +/- 251640.79, N = 3)
- 20230928_preswitch: 142637305.6 (SE +/- 2175742.71, N = 5)
- 20231020_postswitchperf: 428957016.6 (SE +/- 6247483.81, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 574416637.8 (SE +/- 2889986.82, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 570297541.0 (SE +/- 1565214.18, N = 3)
1. (CC) gcc options: -O3 -march=native -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL -lm

JuliaGPU

OpenCL Device: CPU+GPU

JuliaGPU 1.2pts1 (Samples/sec, more is better)
- 20230927-trial: 135315754.4 (SE +/- 343734.88, N = 3)
- 20230928_preswitch: 139394174.9 (SE +/- 1203139.51, N = 3)
- 20231020_postswitchperf: 411077431.4 (SE +/- 10623925.13, N = 15)
- 20231107_postswitch_mjrperf_opencl_cmnt: 573384003.0 (SE +/- 1495461.38, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 568108729.7 (SE +/- 905489.96, N = 3)
1. (CC) gcc options: -O3 -march=native -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL -lm

MandelbulbGPU

OpenCL Device: CPU+GPU

MandelbulbGPU 1.0pts1 (Samples/sec, more is better)
- 20230927-trial: 79519746.9 (SE +/- 1033260.89, N = 7)
- 20230928_preswitch: 79087731.3 (SE +/- 359470.33, N = 3)
- 20231020_postswitchperf: 222793824.3 (SE +/- 4252385.89, N = 15)
- 20231107_postswitch_mjrperf_opencl_cmnt: 328322358.4 (SE +/- 103195.31, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 329089248.8 (SE +/- 558975.72, N = 3)
1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

MandelGPU

OpenCL Device: CPU+GPU

MandelGPU 1.3pts1 (Samples/sec, more is better)
- 20230927-trial: 266133232.1 (SE +/- 1702516.89, N = 3)
- 20230928_preswitch: 268569163.2 (SE +/- 403315.78, N = 3)
- 20231020_postswitchperf: 330203056.3 (SE +/- 688999.96, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 316262936.2 (SE +/- 166696.40, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 315714433.5 (SE +/- 1514181.54, N = 3)
1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SmallPT GPU

OpenCL Device: GPU - Scene: Caustic3

SmallPT GPU 1.6pts1 (Samples/sec, more is better)
- 20230927-trial: 1695834535 (SE +/- 25.12, N = 3)
- 20230928_preswitch: 1695878016 (SE +/- 25.12, N = 3)
- 20231020_postswitchperf: 1697804201 (SE +/- 24.54, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 1699389555 (SE +/- 24.83, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 1699398930 (SE +/- 25.12, N = 3)
1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

LuxMark

OpenCL Device: GPU - Scene: Microphone

LuxMark 3.1 (Score, more is better)
- 20230927-trial: 30063 (SE +/- 72.19, N = 3)
- 20230928_preswitch: 29797 (SE +/- 174.17, N = 3)
- 20231020_postswitchperf: 35982 (SE +/- 593.35, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 35717 (SE +/- 360.26, N = 5)
- 20231107_postswitch_mjrperf_opencl_rocm: 35105 (SE +/- 290.74, N = 9)

LuxMark

OpenCL Device: CPU+GPU - Scene: Microphone

LuxMark 3.1 (Score, more is better)
- 20230927-trial: 29792 (SE +/- 177.68, N = 3)
- 20230928_preswitch: 30132 (SE +/- 6.11, N = 3)
- 20231020_postswitchperf: 35363 (SE +/- 9.74, N = 3)
- 20231107_postswitch_mjrperf_opencl_cmnt: 35355 (SE +/- 1.45, N = 3)
- 20231107_postswitch_mjrperf_opencl_rocm: 34810 (SE +/- 10.73, N = 3)

Lulesh OpenCL

Lulesh OpenCL 2017-07-06 (z/s, more is better)
- 20230927-trial: 2953.38 (SE +/- 47.73, N = 15)
- 20230928_preswitch: 2913.89 (SE +/- 36.57, N = 15)
- 20231020_postswitchperf: 3122.51 (SE +/- 42.68, N = 15)
- 20231107_postswitch_mjrperf_opencl_cmnt: 3350.39 (SE +/- 52.82, N = 12)
- 20231107_postswitch_mjrperf_opencl_rocm: 3332.66 (SE +/- 34.09, N = 15)
1. (CXX) g++ options: -std=c++11 -lOpenCL -O3 -lm


Phoronix Test Suite v10.8.5