cl more for coffee Intel Core i7-8700K testing with a ASUS TUF Z370-PLUS GAMING (2001 BIOS) and ASUS Intel UHD 630 CFL GT2 3GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2103281-IB-CLMOREFOR18&grr .
cl more for coffee Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server OpenGL OpenCL Compiler File-System Screen Resolution 1 2 3 Intel Core i7-8700K @ 4.70GHz (6 Cores / 12 Threads) ASUS TUF Z370-PLUS GAMING (2001 BIOS) Intel 8th Gen Core 16GB 128GB Toshiba THNSN5128GPU7 ASUS Intel UHD 630 CFL GT2 3GB (1200MHz) Realtek ALC887-VD VA2431 Intel I219-V Ubuntu 20.04 5.9.0-050900rc6daily20200923-generic (x86_64) 20200922 GNOME Shell 3.36.4 X Server 1.20.9 4.6 Mesa 20.0.8 OpenCL 2.1 GCC 9.3.0 ext4 1920x1080 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: intel_pstate powersave - CPU Microcode: 0xde - Thermald 1.9.1 Security Details - itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable + mds: Mitigation of Clear buffers; SMT vulnerable + meltdown: Mitigation of PTI + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Mitigation of Microcode + tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable
cl more for coffee shoc: OpenCL - S3D gmpbench: Total Time shoc: OpenCL - Texture Read Bandwidth shoc: OpenCL - Max SP Flops shoc: OpenCL - MD5 Hash viennacl: OpenCL BLAS - dGEMM-TT viennacl: OpenCL BLAS - dGEMM-TN viennacl: OpenCL BLAS - dGEMM-NT viennacl: OpenCL BLAS - dGEMM-NN viennacl: OpenCL BLAS - dGEMV-T viennacl: OpenCL BLAS - dGEMV-N viennacl: OpenCL BLAS - dDOT viennacl: OpenCL BLAS - dAXPY viennacl: OpenCL BLAS - dCOPY viennacl: OpenCL BLAS - sDOT viennacl: OpenCL BLAS - sAXPY viennacl: OpenCL BLAS - sCOPY botan: AES-256 - Decrypt botan: AES-256 botan: ChaCha20Poly1305 - Decrypt botan: ChaCha20Poly1305 botan: Blowfish - Decrypt botan: Blowfish botan: Twofish - Decrypt botan: Twofish botan: CAST-256 - Decrypt botan: CAST-256 botan: KASUMI - Decrypt botan: KASUMI viennacl: CPU BLAS - dGEMM-TT viennacl: CPU BLAS - dGEMM-TN viennacl: CPU BLAS - dGEMM-NT viennacl: CPU BLAS - dGEMM-NN viennacl: CPU BLAS - dGEMV-T viennacl: CPU BLAS - dGEMV-N viennacl: CPU BLAS - dDOT viennacl: CPU BLAS - dAXPY viennacl: CPU BLAS - dCOPY viennacl: CPU BLAS - sDOT viennacl: CPU BLAS - sAXPY viennacl: CPU BLAS - sCOPY shoc: OpenCL - GEMM SGEMM_N shoc: OpenCL - Reduction shoc: OpenCL - FFT SP shoc: OpenCL - Triad shoc: OpenCL - Bus Speed Readback shoc: OpenCL - Bus Speed Download 1 2 3 18.7768 5871.6 56.9830 1762.69 0.3861 15.8 16.4 17.1 19.6 31.1 33.7 32.7 30.7 29.8 31.9 30.1 28.8 4524.803 4532.764 851.992 855.748 506.062 510.166 410.739 409.517 161.715 161.593 101.390 106.160 23.2 23.9 22.6 23.3 38.4 38.4 37.2 33.7 22.3 39.0 35.9 23.6 243.172 32.1662 15.3804 13.5815 28.6577 28.7658 18.7965 5879.2 56.9864 1762.06 0.3861 15.8 16.4 17.1 19.6 31.1 33.8 32.7 30.7 29.8 31.8 30.1 28.8 4524.213 4522.856 852.584 858.076 506.470 510.386 411.074 409.827 161.744 161.612 101.364 106.167 23.1 24 22.5 23.2 38.5 38.4 37.2 33.7 22.3 38.9 35.8 23.6 242.256 32.0962 15.3758 13.6328 28.8604 29.2933 18.7267 5847.4 56.9806 1763.46 0.3861 15.8 16.4 17.1 19.6 31.0 33.7 32.8 30.7 29.8 32.0 30.1 28.8 4518.229 4532.950 853.028 858.879 506.713 510.721 410.781 409.541 161.692 161.577 101.330 106.077 23.1 23.9 22.5 23.3 38.5 38.4 37.1 33.7 22.4 38.9 35.8 23.6 242.702 32.1605 15.4224 13.5835 28.7643 28.8862 OpenBenchmarking.org
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: S3D OpenBenchmarking.org GFLOPS, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: S3D 1 2 3 5 10 15 20 25 SE +/- 0.04, N = 3 SE +/- 0.07, N = 3 SE +/- 0.02, N = 3 18.78 18.80 18.73 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
GNU GMP GMPbench Total Time OpenBenchmarking.org GMPbench Score, More Is Better GNU GMP GMPbench 6.2.1 Total Time 1 2 3 1300 2600 3900 5200 6500 5871.6 5879.2 5847.4 1. (CC) gcc options: -O3 -fomit-frame-pointer -lm
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: Texture Read Bandwidth OpenBenchmarking.org GB/s, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Texture Read Bandwidth 1 2 3 13 26 39 52 65 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 56.98 56.99 56.98 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: Max SP Flops OpenBenchmarking.org GFLOPS, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Max SP Flops 1 2 3 400 800 1200 1600 2000 SE +/- 1.06, N = 3 SE +/- 1.03, N = 3 SE +/- 0.71, N = 3 1762.69 1762.06 1763.46 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: MD5 Hash OpenBenchmarking.org GHash/s, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: MD5 Hash 1 2 3 0.0869 0.1738 0.2607 0.3476 0.4345 SE +/- 0.0000, N = 3 SE +/- 0.0000, N = 3 SE +/- 0.0000, N = 3 0.3861 0.3861 0.3861 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
ViennaCL Test: OpenCL BLAS - dGEMM-TT OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMM-TT 1 2 3 4 8 12 16 20 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 15.8 15.8 15.8 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dGEMM-TN OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMM-TN 1 2 3 4 8 12 16 20 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 16.4 16.4 16.4 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dGEMM-NT OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMM-NT 1 2 3 4 8 12 16 20 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 17.1 17.1 17.1 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dGEMM-NN OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMM-NN 1 2 3 5 10 15 20 25 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.03, N = 3 19.6 19.6 19.6 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dGEMV-T OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMV-T 1 2 3 7 14 21 28 35 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 SE +/- 0.06, N = 3 31.1 31.1 31.0 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dGEMV-N OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dGEMV-N 1 2 3 8 16 24 32 40 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 SE +/- 0.06, N = 3 33.7 33.8 33.7 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dDOT OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dDOT 1 2 3 8 16 24 32 40 SE +/- 0.17, N = 3 SE +/- 0.06, N = 3 SE +/- 0.15, N = 3 32.7 32.7 32.8 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dAXPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dAXPY 1 2 3 7 14 21 28 35 SE +/- 0.00, N = 3 SE +/- 0.03, N = 3 SE +/- 0.00, N = 3 30.7 30.7 30.7 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - dCOPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - dCOPY 1 2 3 7 14 21 28 35 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 29.8 29.8 29.8 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - sDOT OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - sDOT 1 2 3 7 14 21 28 35 SE +/- 0.22, N = 3 SE +/- 0.07, N = 3 SE +/- 0.03, N = 3 31.9 31.8 32.0 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - sAXPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - sAXPY 1 2 3 7 14 21 28 35 SE +/- 0.00, N = 3 SE +/- 0.03, N = 3 SE +/- 0.00, N = 3 30.1 30.1 30.1 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: OpenCL BLAS - sCOPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: OpenCL BLAS - sCOPY 1 2 3 7 14 21 28 35 SE +/- 0.07, N = 3 SE +/- 0.03, N = 3 SE +/- 0.06, N = 3 28.8 28.8 28.8 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
Botan Test: AES-256 - Decrypt OpenBenchmarking.org MiB/s, More Is Better Botan 2.17.3 Test: AES-256 - Decrypt 1 2 3 1000 2000 3000 4000 5000 SE +/- 0.15, N = 3 SE +/- 0.46, N = 3 SE +/- 6.10, N = 3 4524.80 4524.21 4518.23 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan Test: AES-256 OpenBenchmarking.org MiB/s, More Is Better Botan 2.17.3 Test: AES-256 1 2 3 1000 2000 3000 4000 5000 SE +/- 0.36, N = 3 SE +/- 10.26, N = 3 SE +/- 0.17, N = 3 4532.76 4522.86 4532.95 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan Test: ChaCha20Poly1305 - Decrypt OpenBenchmarking.org MiB/s, More Is Better Botan 2.17.3 Test: ChaCha20Poly1305 - Decrypt 1 2 3 200 400 600 800 1000 SE +/- 0.37, N = 3 SE +/- 0.14, N = 3 SE +/- 0.08, N = 3 851.99 852.58 853.03 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan Test: ChaCha20Poly1305 OpenBenchmarking.org MiB/s, More Is Better Botan 2.17.3 Test: ChaCha20Poly1305 1 2 3 200 400 600 800 1000 SE +/- 0.66, N = 3 SE +/- 0.80, N = 3 SE +/- 0.10, N = 3 855.75 858.08 858.88 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan Test: Blowfish - Decrypt OpenBenchmarking.org MiB/s, More Is Better Botan 2.17.3 Test: Blowfish - Decrypt 1 2 3 110 220 330 440 550 SE +/- 0.16, N = 3 SE +/- 0.26, N = 3 SE +/- 0.05, N = 3 506.06 506.47 506.71 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan Test: Blowfish OpenBenchmarking.org MiB/s, More Is Better Botan 2.17.3 Test: Blowfish 1 2 3 110 220 330 440 550 SE +/- 0.30, N = 3 SE +/- 0.25, N = 3 SE +/- 0.15, N = 3 510.17 510.39 510.72 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan Test: Twofish - Decrypt OpenBenchmarking.org MiB/s, More Is Better Botan 2.17.3 Test: Twofish - Decrypt 1 2 3 90 180 270 360 450 SE +/- 0.23, N = 3 SE +/- 0.20, N = 3 SE +/- 0.26, N = 3 410.74 411.07 410.78 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan Test: Twofish OpenBenchmarking.org MiB/s, More Is Better Botan 2.17.3 Test: Twofish 1 2 3 90 180 270 360 450 SE +/- 0.24, N = 3 SE +/- 0.22, N = 3 SE +/- 0.27, N = 3 409.52 409.83 409.54 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan Test: CAST-256 - Decrypt OpenBenchmarking.org MiB/s, More Is Better Botan 2.17.3 Test: CAST-256 - Decrypt 1 2 3 40 80 120 160 200 SE +/- 0.03, N = 3 SE +/- 0.01, N = 3 SE +/- 0.04, N = 3 161.72 161.74 161.69 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan Test: CAST-256 OpenBenchmarking.org MiB/s, More Is Better Botan 2.17.3 Test: CAST-256 1 2 3 40 80 120 160 200 SE +/- 0.03, N = 3 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 161.59 161.61 161.58 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan Test: KASUMI - Decrypt OpenBenchmarking.org MiB/s, More Is Better Botan 2.17.3 Test: KASUMI - Decrypt 1 2 3 20 40 60 80 100 SE +/- 0.04, N = 3 SE +/- 0.03, N = 3 SE +/- 0.00, N = 3 101.39 101.36 101.33 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
Botan Test: KASUMI OpenBenchmarking.org MiB/s, More Is Better Botan 2.17.3 Test: KASUMI 1 2 3 20 40 60 80 100 SE +/- 0.05, N = 3 SE +/- 0.04, N = 3 SE +/- 0.01, N = 3 106.16 106.17 106.08 1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt
ViennaCL Test: CPU BLAS - dGEMM-TT OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dGEMM-TT 1 2 3 6 12 18 24 30 SE +/- 0.07, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 23.2 23.1 23.1 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dGEMM-TN OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dGEMM-TN 1 2 3 6 12 18 24 30 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 23.9 24.0 23.9 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dGEMM-NT OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dGEMM-NT 1 2 3 5 10 15 20 25 SE +/- 0.06, N = 3 SE +/- 0.00, N = 3 SE +/- 0.03, N = 3 22.6 22.5 22.5 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dGEMM-NN OpenBenchmarking.org GFLOPs/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dGEMM-NN 1 2 3 6 12 18 24 30 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 23.3 23.2 23.3 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dGEMV-T OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dGEMV-T 1 2 3 9 18 27 36 45 SE +/- 0.13, N = 3 SE +/- 0.05, N = 2 38.4 38.5 38.5 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dGEMV-N OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dGEMV-N 1 2 3 9 18 27 36 45 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 38.4 38.4 38.4 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dDOT OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dDOT 1 2 3 9 18 27 36 45 SE +/- 0.03, N = 3 SE +/- 0.00, N = 3 SE +/- 0.03, N = 3 37.2 37.2 37.1 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dAXPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dAXPY 1 2 3 8 16 24 32 40 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 33.7 33.7 33.7 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - dCOPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - dCOPY 1 2 3 5 10 15 20 25 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 22.3 22.3 22.4 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - sDOT OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - sDOT 1 2 3 9 18 27 36 45 SE +/- 0.07, N = 3 SE +/- 0.07, N = 3 SE +/- 0.03, N = 3 39.0 38.9 38.9 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - sAXPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - sAXPY 1 2 3 8 16 24 32 40 SE +/- 0.07, N = 3 SE +/- 0.03, N = 3 SE +/- 0.00, N = 3 35.9 35.8 35.8 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL Test: CPU BLAS - sCOPY OpenBenchmarking.org GB/s, More Is Better ViennaCL 1.7.1 Test: CPU BLAS - sCOPY 1 2 3 6 12 18 24 30 SE +/- 0.03, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 23.6 23.6 23.6 1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: GEMM SGEMM_N OpenBenchmarking.org GFLOPS, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: GEMM SGEMM_N 1 2 3 50 100 150 200 250 SE +/- 0.26, N = 3 SE +/- 0.14, N = 3 SE +/- 0.44, N = 3 243.17 242.26 242.70 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: Reduction OpenBenchmarking.org GB/s, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Reduction 1 2 3 7 14 21 28 35 SE +/- 0.06, N = 3 SE +/- 0.01, N = 3 SE +/- 0.08, N = 3 32.17 32.10 32.16 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: FFT SP OpenBenchmarking.org GFLOPS, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: FFT SP 1 2 3 4 8 12 16 20 SE +/- 0.02, N = 3 SE +/- 0.00, N = 3 SE +/- 0.02, N = 3 15.38 15.38 15.42 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: Triad OpenBenchmarking.org GB/s, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Triad 1 2 3 4 8 12 16 20 SE +/- 0.07, N = 3 SE +/- 0.09, N = 3 SE +/- 0.10, N = 3 13.58 13.63 13.58 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: Bus Speed Readback OpenBenchmarking.org GB/s, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Bus Speed Readback 1 2 3 7 14 21 28 35 SE +/- 0.02, N = 3 SE +/- 0.23, N = 3 SE +/- 0.07, N = 3 28.66 28.86 28.76 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
SHOC Scalable HeterOgeneous Computing Target: OpenCL - Benchmark: Bus Speed Download OpenBenchmarking.org GB/s, More Is Better SHOC Scalable HeterOgeneous Computing 2020-04-17 Target: OpenCL - Benchmark: Bus Speed Download 1 2 3 7 14 21 28 35 SE +/- 0.08, N = 3 SE +/- 0.09, N = 3 SE +/- 0.15, N = 3 28.77 29.29 28.89 1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
Phoronix Test Suite v10.8.5