garuda-unit-test AMD Ryzen 5 5600X 6-Core testing with a Gigabyte X570 I AORUS PRO WIFI (F36d BIOS) and ASUS NVIDIA GeForce RTX 3070 8GB on Garuda Soaring via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2301034-NE-GARUDAUNI78&grw .
garuda-unit-test Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server Display Driver OpenGL Vulkan Compiler File-System Screen Resolution CantRemember AMD Ryzen 5 5600X 6-Core @ 3.70GHz (6 Cores / 12 Threads) Gigabyte X570 I AORUS PRO WIFI (F36d BIOS) AMD Starship/Matisse 16GB 1000GB CT1000P1SSD8 + 2000GB CT2000P3SSD8 ASUS NVIDIA GeForce RTX 3070 8GB NVIDIA GA104 HD Audio Odyssey G52A Intel I211 + Intel Wi-Fi 6 AX200 Garuda Soaring 6.1.1-zen1-1-zen (x86_64) KDE Plasma 5.26.4 X Server 1.21.1.6 NVIDIA 525.60.11 4.6.0 1.3.224 GCC 12.2.0 btrfs 5120x1440 OpenBenchmarking.org - Transparent Huge Pages: always - --disable-libssp --disable-libstdcxx-pch --disable-werror --enable-__cxa_atexit --enable-bootstrap --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-build-config=bootstrap-lto --with-linker-hash-style=gnu - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa201016 - BAR1 / Visible vRAM Size: 256 MiB - vBIOS Version: 94.04.3a.40.20 - GPU Compute Cores: 5888 - Python 3.10.9 - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
garuda-unit-test darktable: Boat - OpenCL darktable: Masskrug - OpenCL darktable: Server Rack - OpenCL darktable: Server Room - OpenCL shoc: OpenCL - S3D shoc: OpenCL - Triad shoc: OpenCL - FFT SP shoc: OpenCL - MD5 Hash shoc: OpenCL - Reduction shoc: OpenCL - GEMM SGEMM_N shoc: OpenCL - Max SP Flops shoc: OpenCL - Bus Speed Download shoc: OpenCL - Bus Speed Readback shoc: OpenCL - Texture Read Bandwidth rodinia: OpenCL Myocyte rodinia: OpenCL Leukocyte rodinia: OpenCL Particle Filter cl-mem: Copy cl-mem: Read cl-mem: Write clpeak: Kernel Latency clpeak: Integer Compute INT clpeak: Single-Precision Float clpeak: Double-Precision Double clpeak: Global Memory Bandwidth clpeak: Transfer Bandwidth enqueueReadBuffer clpeak: Transfer Bandwidth enqueueWriteBuffer viennacl: CPU BLAS - sCOPY viennacl: CPU BLAS - sAXPY viennacl: CPU BLAS - sDOT viennacl: CPU BLAS - dCOPY viennacl: CPU BLAS - dAXPY viennacl: CPU BLAS - dDOT viennacl: CPU BLAS - dGEMV-N viennacl: CPU BLAS - dGEMV-T viennacl: CPU BLAS - dGEMM-NN viennacl: CPU BLAS - dGEMM-NT viennacl: CPU BLAS - dGEMM-TN viennacl: CPU BLAS - dGEMM-TT viennacl: OpenCL BLAS - sCOPY viennacl: OpenCL BLAS - sAXPY viennacl: OpenCL BLAS - sDOT viennacl: OpenCL BLAS - dAXPY viennacl: OpenCL BLAS - dDOT viennacl: OpenCL BLAS - dGEMV-N viennacl: OpenCL BLAS - dGEMV-T viennacl: OpenCL BLAS - dGEMM-NN viennacl: OpenCL BLAS - dGEMM-NT viennacl: OpenCL BLAS - dGEMM-TN viennacl: OpenCL BLAS - dGEMM-TT viennacl: OpenCL BLAS - dCOPY fluidx3d: FP32-FP32 fluidx3d: FP32-FP16C fluidx3d: FP32-FP16S lulesh-cl: luxmark: GPU - Hotel luxmark: CPU+GPU - Hotel luxmark: GPU - Microphone luxmark: GPU - Luxball HDR luxmark: CPU+GPU - Microphone luxmark: CPU+GPU - Luxball HDR smallpt-gpu: GPU - 5120 x 1440 - Caustic smallpt-gpu: GPU - 5120 x 1440 - Cornell smallpt-gpu: GPU - 5120 x 1440 - Caustic3 xsbench-cl: CantRemember 1.704 3.338 0.141 0.780 217.979 24.6456 1117.17 25.2819 327.380 3589.40 20986.7 26.3063 27.1039 2061.08 28.609 3.282 6.138 259.5 344.0 321.1 5.25 10411.51 20546.88 358.24 390.58 14.13 24.45 24.1 37.1 37.7 20.5 30.9 29.8 42.6 40.1 26.9 25.6 27.8 25.5 284 342 311 377 385 168 316 327 328 321 321 360 2246 4439 4506 5940.0720 11041 11142 35183 50404 35587 51047 1672726555 1672726690 1672726825 OpenBenchmarking.org
Darktable Test: Boat - Acceleration: OpenCL OpenBenchmarking.org Seconds, Fewer Is Better Darktable 4.2.0 Test: Boat - Acceleration: OpenCL CantRemember 0.3834 0.7668 1.1502 1.5336 1.917 SE +/- 0.014, N = 15 1.704
Darktable Test: Masskrug - Acceleration: OpenCL OpenBenchmarking.org Seconds, Fewer Is Better Darktable 4.2.0 Test: Masskrug - Acceleration: OpenCL CantRemember 0.7511 1.5022 2.2533 3.0044 3.7555 SE +/- 0.025, N = 3 3.338
Darktable Test: Server Rack - Acceleration: OpenCL OpenBenchmarking.org Seconds, Fewer Is Better Darktable 4.2.0 Test: Server Rack - Acceleration: OpenCL CantRemember 0.0317 0.0634 0.0951 0.1268 0.1585 SE +/- 0.001, N = 3 0.141
Darktable Test: Server Room - Acceleration: OpenCL OpenBenchmarking.org Seconds, Fewer Is Better Darktable 4.2.0 Test: Server Room - Acceleration: OpenCL CantRemember 0.1755 0.351 0.5265 0.702 0.8775 SE +/- 0.003, N = 3 0.780
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: S3D OpenBenchmarking.org GFLOPS, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: S3D CantRemember 50 100 150 200 250 SE +/- 0.19, N = 3 217.98 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: Triad OpenBenchmarking.org GB/s, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Triad CantRemember 6 12 18 24 30 SE +/- 0.00, N = 3 24.65 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: FFT SP OpenBenchmarking.org GFLOPS, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: FFT SP CantRemember 200 400 600 800 1000 SE +/- 7.28, N = 3 1117.17 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: MD5 Hash OpenBenchmarking.org GHash/s, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: MD5 Hash CantRemember 6 12 18 24 30 SE +/- 0.04, N = 3 25.28 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: Reduction OpenBenchmarking.org GB/s, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Reduction CantRemember 70 140 210 280 350 SE +/- 0.35, N = 3 327.38 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: GEMM SGEMM_N OpenBenchmarking.org GFLOPS, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: GEMM SGEMM_N CantRemember 800 1600 2400 3200 4000 SE +/- 22.79, N = 3 3589.40 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: Max SP Flops OpenBenchmarking.org GFLOPS, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Max SP Flops CantRemember 4K 8K 12K 16K 20K SE +/- 245.20, N = 3 20986.7 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: Bus Speed Download OpenBenchmarking.org GB/s, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Bus Speed Download CantRemember 6 12 18 24 30 SE +/- 0.01, N = 3 26.31 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: Bus Speed Readback OpenBenchmarking.org GB/s, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Bus Speed Readback CantRemember 6 12 18 24 30 SE +/- 0.01, N = 3 27.10 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: Texture Read Bandwidth OpenBenchmarking.org GB/s, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Texture Read Bandwidth CantRemember 400 800 1200 1600 2000 SE +/- 3.84, N = 3 2061.08 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
Rodinia Test: OpenCL Myocyte OpenBenchmarking.org Seconds, Fewer Is Better Rodinia 3.1 Test: OpenCL Myocyte CantRemember 7 14 21 28 35 SE +/- 0.38, N = 15 28.61 1. (CXX) g++ options: -O2 -lOpenCL
Rodinia Test: OpenCL Leukocyte OpenBenchmarking.org Seconds, Fewer Is Better Rodinia 3.1 Test: OpenCL Leukocyte CantRemember 0.7385 1.477 2.2155 2.954 3.6925 SE +/- 0.033, N = 6 3.282 1. (CXX) g++ options: -O2 -lOpenCL
Rodinia Test: OpenCL Particle Filter OpenBenchmarking.org Seconds, Fewer Is Better Rodinia 3.1 Test: OpenCL Particle Filter CantRemember 2 4 6 8 10 SE +/- 0.065, N = 3 6.138 1. (CXX) g++ options: -O2 -lOpenCL
cl-mem Benchmark: Copy OpenBenchmarking.org GB/s, More Is Better cl-mem 2017-01-13 Benchmark: Copy CantRemember 60 120 180 240 300 SE +/- 0.27, N = 3 259.5 1. (CC) gcc options: -O2 -flto -lOpenCL
cl-mem Benchmark: Read OpenBenchmarking.org GB/s, More Is Better cl-mem 2017-01-13 Benchmark: Read CantRemember 70 140 210 280 350 SE +/- 0.20, N = 3 344.0 1. (CC) gcc options: -O2 -flto -lOpenCL
cl-mem Benchmark: Write OpenBenchmarking.org GB/s, More Is Better cl-mem 2017-01-13 Benchmark: Write CantRemember 70 140 210 280 350 SE +/- 0.07, N = 3 321.1 1. (CC) gcc options: -O2 -flto -lOpenCL
clpeak OpenCL Test: Kernel Latency OpenBenchmarking.org us, Fewer Is Better clpeak OpenCL Test: Kernel Latency CantRemember 1.1813 2.3626 3.5439 4.7252 5.9065 SE +/- 0.01, N = 3 5.25 1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak OpenCL Test: Integer Compute INT OpenBenchmarking.org GIOPS, More Is Better clpeak OpenCL Test: Integer Compute INT CantRemember 2K 4K 6K 8K 10K SE +/- 36.62, N = 3 10411.51 1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak OpenCL Test: Single-Precision Float OpenBenchmarking.org GFLOPS, More Is Better clpeak OpenCL Test: Single-Precision Float CantRemember 4K 8K 12K 16K 20K SE +/- 53.88, N = 3 20546.88 1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak OpenCL Test: Double-Precision Double OpenBenchmarking.org GFLOPS, More Is Better clpeak OpenCL Test: Double-Precision Double CantRemember 80 160 240 320 400 SE +/- 0.60, N = 3 358.24 1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak OpenCL Test: Global Memory Bandwidth OpenBenchmarking.org GBPS, More Is Better clpeak OpenCL Test: Global Memory Bandwidth CantRemember 80 160 240 320 400 SE +/- 0.02, N = 3 390.58 1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak OpenCL Test: Transfer Bandwidth enqueueReadBuffer OpenBenchmarking.org GBPS, More Is Better clpeak OpenCL Test: Transfer Bandwidth enqueueReadBuffer CantRemember 4 8 12 16 20 SE +/- 0.05, N = 3 14.13 1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
clpeak OpenCL Test: Transfer Bandwidth enqueueWriteBuffer OpenBenchmarking.org GBPS, More Is Better clpeak OpenCL Test: Transfer Bandwidth enqueueWriteBuffer CantRemember 6 12 18 24 30 SE +/- 0.19, N = 3 24.45 1. (CXX) g++ options: -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - sCOPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - sCOPY CantRemember 6 12 18 24 30 SE +/- 0.79, N = 15 24.1 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - sAXPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - sAXPY CantRemember 9 18 27 36 45 SE +/- 0.58, N = 15 37.1 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - sDOT OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - sDOT CantRemember 9 18 27 36 45 SE +/- 1.56, N = 15 37.7 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dCOPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dCOPY CantRemember 5 10 15 20 25 SE +/- 0.25, N = 15 20.5 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dAXPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dAXPY CantRemember 7 14 21 28 35 SE +/- 0.27, N = 15 30.9 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dDOT OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dDOT CantRemember 7 14 21 28 35 SE +/- 0.50, N = 14 29.8 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dGEMV-N OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dGEMV-N CantRemember 10 20 30 40 50 SE +/- 0.70, N = 15 42.6 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dGEMV-T OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dGEMV-T CantRemember 9 18 27 36 45 SE +/- 1.19, N = 13 40.1 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dGEMM-NN OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dGEMM-NN CantRemember 6 12 18 24 30 SE +/- 0.07, N = 15 26.9 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dGEMM-NT OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dGEMM-NT CantRemember 6 12 18 24 30 SE +/- 0.15, N = 14 25.6 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dGEMM-TN OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dGEMM-TN CantRemember 7 14 21 28 35 SE +/- 0.36, N = 15 27.8 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dGEMM-TT OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dGEMM-TT CantRemember 6 12 18 24 30 SE +/- 0.24, N = 13 25.5 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - sCOPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - sCOPY CantRemember 60 120 180 240 300 SE +/- 0.67, N = 3 284 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - sAXPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - sAXPY CantRemember 70 140 210 280 350 SE +/- 4.93, N = 3 342 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - sDOT OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - sDOT CantRemember 70 140 210 280 350 SE +/- 6.56, N = 3 311 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dAXPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dAXPY CantRemember 80 160 240 320 400 SE +/- 3.84, N = 3 377 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dDOT OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dDOT CantRemember 80 160 240 320 400 SE +/- 2.03, N = 3 385 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dGEMV-N OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMV-N CantRemember 40 80 120 160 200 SE +/- 4.26, N = 3 168 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dGEMV-T OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMV-T CantRemember 70 140 210 280 350 SE +/- 3.38, N = 3 316 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dGEMM-NN OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMM-NN CantRemember 70 140 210 280 350 SE +/- 0.88, N = 3 327 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dGEMM-NT OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMM-NT CantRemember 70 140 210 280 350 SE +/- 1.45, N = 3 328 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dGEMM-TN OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMM-TN CantRemember 70 140 210 280 350 SE +/- 9.21, N = 3 321 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dGEMM-TT OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMM-TT CantRemember 70 140 210 280 350 SE +/- 10.50, N = 2 321 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dCOPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dCOPY CantRemember 80 160 240 320 400 SE +/- 4.50, N = 2 360 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
FluidX3D Test: FP32-FP32 OpenBenchmarking.org MLUPs/s, More Is Better FluidX3D 1.4 Test: FP32-FP32 CantRemember 500 1000 1500 2000 2500 SE +/- 1.53, N = 3 2246
FluidX3D Test: FP32-FP16C OpenBenchmarking.org MLUPs/s, More Is Better FluidX3D 1.4 Test: FP32-FP16C CantRemember 1000 2000 3000 4000 5000 SE +/- 8.84, N = 3 4439
FluidX3D Test: FP32-FP16S OpenBenchmarking.org MLUPs/s, More Is Better FluidX3D 1.4 Test: FP32-FP16S CantRemember 1000 2000 3000 4000 5000 SE +/- 1.53, N = 3 4506
Lulesh OpenCL OpenBenchmarking.org z/s, More Is Better Lulesh OpenCL 2017-07-06 CantRemember 1300 2600 3900 5200 6500 SE +/- 6.98, N = 3 5940.07 1. (CXX) g++ options: -std=c++11 -lOpenCL -O3 -lm
LuxMark OpenCL Device: GPU - Scene: Hotel OpenBenchmarking.org Score, More Is Better LuxMark 3.1 OpenCL Device: GPU - Scene: Hotel CantRemember 2K 4K 6K 8K 10K SE +/- 61.92, N = 3 11041
LuxMark OpenCL Device: CPU+GPU - Scene: Hotel OpenBenchmarking.org Score, More Is Better LuxMark 3.1 OpenCL Device: CPU+GPU - Scene: Hotel CantRemember 2K 4K 6K 8K 10K SE +/- 102.63, N = 12 11142
LuxMark OpenCL Device: GPU - Scene: Microphone OpenBenchmarking.org Score, More Is Better LuxMark 3.1 OpenCL Device: GPU - Scene: Microphone CantRemember 8K 16K 24K 32K 40K SE +/- 189.69, N = 3 35183
LuxMark OpenCL Device: GPU - Scene: Luxball HDR OpenBenchmarking.org Score, More Is Better LuxMark 3.1 OpenCL Device: GPU - Scene: Luxball HDR CantRemember 11K 22K 33K 44K 55K SE +/- 164.85, N = 3 50404
LuxMark OpenCL Device: CPU+GPU - Scene: Microphone OpenBenchmarking.org Score, More Is Better LuxMark 3.1 OpenCL Device: CPU+GPU - Scene: Microphone CantRemember 8K 16K 24K 32K 40K SE +/- 5.61, N = 3 35587
LuxMark OpenCL Device: CPU+GPU - Scene: Luxball HDR OpenBenchmarking.org Score, More Is Better LuxMark 3.1 OpenCL Device: CPU+GPU - Scene: Luxball HDR CantRemember 11K 22K 33K 44K 55K SE +/- 47.99, N = 3 51047
SmallPT GPU OpenCL Device: GPU - Resolution: 5120 x 1440 - Scene: Caustic OpenBenchmarking.org Samples/sec, More Is Better SmallPT GPU 1.6pts1 OpenCL Device: GPU - Resolution: 5120 x 1440 - Scene: Caustic CantRemember 400M 800M 1200M 1600M 2000M SE +/- 25.12, N = 3 1672726555 1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
SmallPT GPU OpenCL Device: GPU - Resolution: 5120 x 1440 - Scene: Cornell OpenBenchmarking.org Samples/sec, More Is Better SmallPT GPU 1.6pts1 OpenCL Device: GPU - Resolution: 5120 x 1440 - Scene: Cornell CantRemember 400M 800M 1200M 1600M 2000M SE +/- 24.25, N = 3 1672726690 1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
SmallPT GPU OpenCL Device: GPU - Resolution: 5120 x 1440 - Scene: Caustic3 OpenBenchmarking.org Samples/sec, More Is Better SmallPT GPU 1.6pts1 OpenCL Device: GPU - Resolution: 5120 x 1440 - Scene: Caustic3 CantRemember 400M 800M 1200M 1600M 2000M SE +/- 24.83, N = 3 1672726825 1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
Phoronix Test Suite v10.8.4