opencl-set0-yoda-prehw
Intel Core i7-7700 testing with an ASUS PRIME H270M-PLUS (0809 BIOS) and Sapphire AMD Radeon RX 6700 XT 12GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2309274-BILL-OPENCLS98&grr

opencl-set0-yoda-prehw: system details (20230927-trial)
  Processor: Intel Core i7-7700 @ 4.20GHz (4 Cores / 8 Threads)
  Motherboard: ASUS PRIME H270M-PLUS (0809 BIOS)
  Chipset: Intel Xeon E3-1200 v6/7th + H270
  Memory: 32GB
  Disk: Samsung SSD 960 EVO 250GB + 1000GB Samsung SSD 970 EVO Plus 1TB + 3001GB Western Digital WD30EFRX-68E
  Graphics: Sapphire AMD Radeon RX 6700 XT 12GB (2725/1000MHz)
  Audio: Realtek ALC887-VD
  Monitor: PB248
  Network: Intel I219-V + Intel Wi-Fi 6 AX200
  OS: Ubuntu 22.04
  Kernel: 5.15.0-84-generic (x86_64)
  Desktop: GNOME Shell 42.9
  Display Server: X Server 1.21.1.3
  OpenGL: 4.6 Mesa 23.2.0-devel (LLVM 16.0.6 DRM 3.54)
  OpenCL: OpenCL 2.1 AMD-APP (3590.0)
  Vulkan: 1.3.252
  Compiler: GCC 11.4.0 + LLVM 14.0.0
  File-System: ext4 (ecryptfs)
  Screen Resolution: 1920x1200

Notes
  - Transparent Huge Pages: madvise
  - Compiler configuration: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
  - Scaling Governor: intel_pstate powersave (EPP: balance_performance)
  - CPU Microcode: 0xf4
  - Thermald 2.4.9
  - BAR1 / Visible vRAM Size: 256 MB
  - vBIOS Version: 113-D5122200-S05
  - Python 3.10.12
  - Security:
      gather_data_sampling: Mitigation of Microcode
      itlb_multihit: KVM: Mitigation of VMX disabled
      l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
      mds: Mitigation of Clear buffers; SMT vulnerable
      meltdown: Mitigation of PTI
      mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable
      retbleed: Mitigation of IBRS
      spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
      spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
      spectre_v2: Mitigation of IBRS IBPB: conditional STIBP: conditional RSB filling PBRSB-eIBRS: Not affected
      srbds: Mitigation of Microcode
      tsx_async_abort: Mitigation of TSX disabled

opencl-set0-yoda-prehw: results summary (20230927-trial)
  luxmark: GPU - Microphone                 30063
  luxmark: CPU+GPU - Microphone             29792
  viennacl: OpenCL BLAS - dAXPY             221
  viennacl: OpenCL BLAS - dGEMM-TT          729
  viennacl: OpenCL BLAS - dGEMM-TN          710
  viennacl: OpenCL BLAS - dGEMM-NT          732
  viennacl: OpenCL BLAS - dGEMM-NN          705
  viennacl: OpenCL BLAS - dGEMV-T           173
  viennacl: OpenCL BLAS - dGEMV-N           75.8
  viennacl: OpenCL BLAS - dDOT              187
  viennacl: OpenCL BLAS - dCOPY             201
  viennacl: OpenCL BLAS - sDOT              322
  viennacl: OpenCL BLAS - sAXPY             620
  viennacl: OpenCL BLAS - sCOPY             413
  fluidx3d: FP32-FP16S                      2689
  rodinia: OpenCL Myocyte                   12.063
  lulesh-cl:                                2953.3839
  smallpt-gpu: GPU - Caustic3               1695834535
  parboil: OpenMP Stencil                   23.165349
  mandelbulbgpu: CPU+GPU                    79519746.9
  juliagpu: CPU+GPU                         135315754.4
  juliagpu: GPU                             137240757.5
  darktable: Masskrug - OpenCL              5.216
  darktable: Masskrug - CPU-only            7.894
  cl-mem: Copy                              281.3
  clpeak: Double-Precision Compute          808.92
  mandelgpu: CPU+GPU                        266133232.1
  shoc: OpenCL - Triad                      11.1138
  smallpt-gpu: CPU - Caustic3               (no result reported)

LuxMark 3.1
OpenCL Device: GPU - Scene: Microphone
Score, More Is Better
20230927-trial: 30063 (SE +/- 72.19, N = 3)

LuxMark 3.1
OpenCL Device: CPU+GPU - Scene: Microphone
Score, More Is Better
20230927-trial: 29792 (SE +/- 177.68, N = 3)

ViennaCL 1.7.1
Test: OpenCL BLAS - dAXPY
GB/s, More Is Better
20230927-trial: 221 (SE +/- 2.94, N = 11)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TT
GFLOPs/s, More Is Better
20230927-trial: 729 (SE +/- 0.26, N = 12)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TN
GFLOPs/s, More Is Better
20230927-trial: 710 (SE +/- 0.29, N = 12)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NT
GFLOPs/s, More Is Better
20230927-trial: 732 (SE +/- 0.29, N = 12)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NN
GFLOPs/s, More Is Better
20230927-trial: 705 (SE +/- 0.36, N = 12)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-T
GB/s, More Is Better
20230927-trial: 173 (SE +/- 12.99, N = 11)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-N
GB/s, More Is Better
20230927-trial: 75.8 (SE +/- 1.09, N = 12)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: OpenCL BLAS - dDOT
GB/s, More Is Better
20230927-trial: 187 (SE +/- 11.40, N = 12)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: OpenCL BLAS - dCOPY
GB/s, More Is Better
20230927-trial: 201 (SE +/- 5.02, N = 12)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: OpenCL BLAS - sDOT
GB/s, More Is Better
20230927-trial: 322 (SE +/- 3.17, N = 12)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: OpenCL BLAS - sAXPY
GB/s, More Is Better
20230927-trial: 620 (SE +/- 0.69, N = 12)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL 1.7.1
Test: OpenCL BLAS - sCOPY
GB/s, More Is Better
20230927-trial: 413 (SE +/- 19.45, N = 12)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

FluidX3D 2.3
Test: FP32-FP16S
MLUPs/s, More Is Better
20230927-trial: 2689 (SE +/- 6.17, N = 3)

Rodinia 3.1
Test: OpenCL Myocyte
Seconds, Fewer Is Better
20230927-trial: 12.06 (SE +/- 3.48, N = 15)
(CXX) g++ options: -O2 -lOpenCL

Lulesh OpenCL 2017-07-06
z/s, More Is Better
20230927-trial: 2953.38 (SE +/- 47.73, N = 15)
(CXX) g++ options: -std=c++11 -lOpenCL -O3 -lm

SmallPT GPU 1.6pts1
OpenCL Device: GPU - Scene: Caustic3
Samples/sec, More Is Better
20230927-trial: 1695834535 (SE +/- 25.12, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

Parboil 2.5
Test: OpenMP Stencil
Seconds, Fewer Is Better
20230927-trial: 23.17 (SE +/- 0.15, N = 3)
(CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

MandelbulbGPU 1.0pts1
OpenCL Device: CPU+GPU
Samples/sec, More Is Better
20230927-trial: 79519746.9 (SE +/- 1033260.89, N = 7)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

JuliaGPU 1.2pts1
OpenCL Device: CPU+GPU
Samples/sec, More Is Better
20230927-trial: 135315754.4 (SE +/- 343734.88, N = 3)
(CC) gcc options: -O3 -march=native -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL -lm

JuliaGPU 1.2pts1
OpenCL Device: GPU
Samples/sec, More Is Better
20230927-trial: 137240757.5 (SE +/- 251640.79, N = 3)
(CC) gcc options: -O3 -march=native -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL -lm

Darktable 4.4.2
Test: Masskrug - Acceleration: OpenCL
Seconds, Fewer Is Better
20230927-trial: 5.216 (SE +/- 0.051, N = 3)

Darktable 4.4.2
Test: Masskrug - Acceleration: CPU-only
Seconds, Fewer Is Better
20230927-trial: 7.894 (SE +/- 0.007, N = 3)

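The two Darktable Masskrug results above are the only pair in this run that measures the same workload with and without OpenCL, so they can be compared directly. The sketch below is an illustration rather than part of the Phoronix Test Suite output. The variable names are hypothetical, and the times are the reported means copied from those two results. It works out to roughly a 1.5x speedup for the OpenCL path.

```python
# Mean export times in seconds, copied from the two Darktable results above.
cpu_only_seconds = 7.894
opencl_seconds = 5.216

# Speedup: how many times faster the OpenCL run finished than the CPU-only run.
speedup = cpu_only_seconds / opencl_seconds

# Fraction of wall-clock time saved by enabling OpenCL.
time_saved = 1 - opencl_seconds / cpu_only_seconds

print(f"OpenCL speedup over CPU-only: {speedup:.2f}x")
print(f"Time saved with OpenCL: {time_saved:.1%}")
```

Both inputs are means over N = 3 runs, so the ratio inherits the small run-to-run variation reported with each result.
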
cl-mem 2017-01-13
Benchmark: Copy
GB/s, More Is Better
20230927-trial: 281.3 (SE +/- 0.09, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

clpeak 1.1.2
OpenCL Test: Double-Precision Compute
GFLOPS, More Is Better
20230927-trial: 808.92 (SE +/- 0.21, N = 3)
(CXX) g++ options: -O3

MandelGPU 1.3pts1
OpenCL Device: CPU+GPU
Samples/sec, More Is Better
20230927-trial: 266133232.1 (SE +/- 1702516.89, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Triad
GB/s, More Is Better
20230927-trial: 11.11 (SE +/- 0.16, N = 5)
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

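Each result above reports a standard error (SE) next to its mean, and dividing one by the other gives a quick sense of how noisy a test was. The sketch below applies that to a handful of the results. The 5% cut-off and the names `NOISE_THRESHOLD` and `relative_se` are assumptions made for illustration, not Phoronix Test Suite settings.

```python
# (test name, reported mean, reported standard error), copied from the results above.
results = [
    ("ViennaCL dGEMV-T", 173.0, 12.99),
    ("ViennaCL dDOT", 187.0, 11.40),
    ("ViennaCL sCOPY", 413.0, 19.45),
    ("Rodinia OpenCL Myocyte", 12.06, 3.48),
    ("clpeak Double-Precision Compute", 808.92, 0.21),
]

NOISE_THRESHOLD = 0.05  # assumed cut-off for "noisy"; not a Phoronix Test Suite value


def relative_se(mean, se):
    """Return the standard error as a fraction of the reported mean."""
    return se / mean


# Tests whose standard error exceeds the assumed threshold.
noisy = [name for name, mean, se in results if relative_se(mean, se) > NOISE_THRESHOLD]

for name, mean, se in results:
    flag = "  <- noisy" if name in noisy else ""
    print(f"{name}: {relative_se(mean, se):.1%}{flag}")
```

With these inputs, Rodinia OpenCL Myocyte stands out, since its SE is more than a quarter of its mean. The clpeak result is stable to within a small fraction of a percent.
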
Phoronix Test Suite v10.8.5