20231222um79064thunderbolt.txt
AMD Ryzen 9 7940HS testing with a Shenzhen Meigao Electronic Equipment F7BSC (1.07 BIOS) and AMD Radeon PRO W6800 8GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2312228-NE-20231222U30&grt .
20231222um79064thunderbolt.txt
System details (result identifier: AMD Radeon PRO W6800)
  Processor: AMD Ryzen 9 7940HS @ 4.00GHz (8 Cores / 16 Threads)
  Motherboard: Shenzhen Meigao Electronic Equipment F7BSC (1.07 BIOS)
  Chipset: AMD Device 14e8
  Memory: 56GB
  Disk: 4097GB HP SSD FX900 Pro 4TB + 1024GB KINGSTON OM8PGP41024Q-A0
  Graphics: AMD Radeon PRO W6800 8GB (2555/1000MHz)
  Audio: AMD Navi 21 HDMI Audio
  Monitor: DELL ST2210
  Network: Realtek RTL8125 2.5GbE + Intel I210 + Intel Wi-Fi 6 AX210/AX211/AX411
  OS: Ubuntu 22.04
  Kernel: 6.2.0-39-generic (x86_64)
  Desktop: GNOME Shell 42.9
  Display Server: X Server 1.21.1.3 + Wayland
  OpenGL: 4.6 Mesa 23.0.4-0ubuntu1~22.04.1 (LLVM 15.0.7 DRM 3.56)
  OpenCL: OpenCL 2.1 AMD-APP (3602.0)
  Vulkan: 1.3.238
  Compiler: GCC 11.4.0
  File-System: ext4
  Screen Resolution: 1920x1080
- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0xa704103
- BAR1 / Visible vRAM Size: 8192 MB
- vBIOS Version: 113-D4300100-103
- Python 3.10.12
- Security mitigations:
    gather_data_sampling: Not affected
    itlb_multihit: Not affected
    l1tf: Not affected
    mds: Not affected
    meltdown: Not affected
    mmio_stale_data: Not affected
    retbleed: Not affected
    spec_rstack_overflow: Mitigation of safe RET no microcode
    spec_store_bypass: Mitigation of SSB disabled via prctl
    spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
    spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
    srbds: Not affected
    tsx_async_abort: Not affected
20231222um79064thunderbolt.txt
Results overview (AMD Radeon PRO W6800)
cl-mem
  Copy: 326.9
  Read: 413.5
  Write: 369.7
clpeak
  Kernel Latency: 19.26
  Integer Compute: 3654.24
  Integer 24-bit Compute: 15708.24
  Global Memory Bandwidth: 373.03
  Double-Precision Compute: 1172.37
  Single-Precision Compute: 17191.60
  Transfer Bandwidth enqueueReadBuffer: 4.99
  Transfer Bandwidth enqueueWriteBuffer: 21.56
darktable
  Boat - OpenCL: 2.829
  Masskrug - OpenCL: 3.626
  Server Rack - OpenCL: 0.447
  Server Room - OpenCL: 1.099
fluidx3d
  FP32-FP32: 3406
  FP32-FP16C: 5166
  FP32-FP16S: 5376
lulesh-cl: 1952.2936
luxmark
  GPU - Hotel: 19388
  CPU+GPU - Hotel: 19317
  GPU - Microphone: 112563
  GPU - Luxball HDR: 143110
  CPU+GPU - Microphone: 111311
  CPU+GPU - Luxball HDR: 142214
rodinia
  OpenCL Myocyte: 9.679
  OpenCL Leukocyte: 3.775
shoc
  OpenCL - S3D: 95.9500
  OpenCL - Triad: 1.9084
  OpenCL - FFT SP: 1469.88
  OpenCL - MD5 Hash: 24.2202
  OpenCL - Reduction: 589.718
  OpenCL - GEMM SGEMM_N: 4847.52
  OpenCL - Max SP Flops: 28075778
  OpenCL - Bus Speed Download: 1.9972
  OpenCL - Bus Speed Readback: 1.8595
  OpenCL - Texture Read Bandwidth: 913.650
smallpt-gpu
  GPU - 1920 x 1080 - Caustic: 1703273562
  GPU - 1920 x 1080 - Cornell: 1703273699
  GPU - 1920 x 1080 - Caustic3: 1703273839
viennacl
  CPU BLAS - sCOPY: 43.1
  CPU BLAS - sAXPY: 64.7
  CPU BLAS - sDOT: 45.7
  CPU BLAS - dCOPY: 37.7
  CPU BLAS - dAXPY: 56.6
  CPU BLAS - dDOT: 42.1
  CPU BLAS - dGEMV-N: 41.5
  CPU BLAS - dGEMM-NN: 48.3
  CPU BLAS - dGEMM-NT: 47.4
  CPU BLAS - dGEMM-TN: 52.8
  CPU BLAS - dGEMM-TT: 50.6
  OpenCL BLAS - sCOPY: 511
  OpenCL BLAS - sAXPY: 737
  OpenCL BLAS - sDOT: 497
  OpenCL BLAS - dCOPY: 326
  OpenCL BLAS - dAXPY: 359
  OpenCL BLAS - dDOT: 346
  OpenCL BLAS - dGEMV-N: 140
  OpenCL BLAS - dGEMV-T: 469
  OpenCL BLAS - dGEMM-NN: 972
  OpenCL BLAS - dGEMM-NT: 1010
  OpenCL BLAS - dGEMM-TN: 941
  OpenCL BLAS - dGEMM-TT: 991
xsbench-cl: (no value listed)
cl-mem 2017-01-13 (AMD Radeon PRO W6800)
  Benchmark: Copy [GB/s, More Is Better]: 326.9 (SE +/- 0.50, N = 3)
  Benchmark: Read [GB/s, More Is Better]: 413.5 (SE +/- 1.72, N = 3)
  Benchmark: Write [GB/s, More Is Better]: 369.7 (SE +/- 1.59, N = 3)
  (CC) gcc options: -O2 -flto -lOpenCL
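The "SE +/-" figures are easier to compare across tests when expressed relative to the reported result. As a minimal sketch, using only the three cl-mem values and standard errors listed above, the snippet below computes each standard error as a percentage of its mean. The helper name `relative_se_percent` is hypothetical and is not part of the Phoronix Test Suite.

```python
# Values copied from the cl-mem results: (reported GB/s, standard error).
cl_mem = {
    "Copy": (326.9, 0.50),
    "Read": (413.5, 1.72),
    "Write": (369.7, 1.59),
}


def relative_se_percent(mean, se):
    """Return the standard error as a percentage of the reported mean."""
    return 100.0 * se / mean


for name, (mean, se) in cl_mem.items():
    pct = relative_se_percent(mean, se)
    print(f"{name}: {mean} GB/s, SE is {pct:.2f}% of the result")
```

A small percentage suggests the runs were tightly clustered. With N = 3 this is only a rough indication of run-to-run variation.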
clpeak 1.1.2 (AMD Radeon PRO W6800)
  OpenCL Test: Kernel Latency [us, Fewer Is Better]: 19.26 (SE +/- 0.20, N = 15)
  OpenCL Test: Integer Compute [GIOPS, More Is Better]: 3654.24 (SE +/- 2.74, N = 3)
  OpenCL Test: Integer 24-bit Compute [GIOPS, More Is Better]: 15708.24 (SE +/- 115.06, N = 3)
  OpenCL Test: Global Memory Bandwidth [GBPS, More Is Better]: 373.03 (SE +/- 0.33, N = 3)
  OpenCL Test: Double-Precision Compute [GFLOPS, More Is Better]: 1172.37 (SE +/- 0.55, N = 3)
  OpenCL Test: Single-Precision Compute [GFLOPS, More Is Better]: 17191.60 (SE +/- 82.20, N = 3)
  OpenCL Test: Transfer Bandwidth enqueueReadBuffer [GBPS, More Is Better]: 4.99 (SE +/- 0.03, N = 3)
  OpenCL Test: Transfer Bandwidth enqueueWriteBuffer [GBPS, More Is Better]: 21.56 (SE +/- 0.18, N = 15)
  (CXX) g++ options: -O3
Darktable 3.8.1 (AMD Radeon PRO W6800)
  Test: Boat - Acceleration: OpenCL [Seconds, Fewer Is Better]: 2.829 (SE +/- 0.005, N = 3)
  Test: Masskrug - Acceleration: OpenCL [Seconds, Fewer Is Better]: 3.626 (SE +/- 0.033, N = 7)
  Test: Server Rack - Acceleration: OpenCL [Seconds, Fewer Is Better]: 0.447 (SE +/- 0.005, N = 15)
  Test: Server Room - Acceleration: OpenCL [Seconds, Fewer Is Better]: 1.099 (SE +/- 0.016, N = 3)
FluidX3D 2.9 (AMD Radeon PRO W6800)
  Test: FP32-FP32 [MLUPs/s, More Is Better]: 3406 (SE +/- 7.69, N = 3)
  Test: FP32-FP16C [MLUPs/s, More Is Better]: 5166 (SE +/- 28.21, N = 3)
  Test: FP32-FP16S [MLUPs/s, More Is Better]: 5376 (SE +/- 66.77, N = 3)
Lulesh OpenCL 2017-07-06 (AMD Radeon PRO W6800)
  [z/s, More Is Better]: 1952.29 (SE +/- 21.61, N = 5)
  (CXX) g++ options: -std=c++11 -lOpenCL -O3 -lm
LuxMark 3.1 (AMD Radeon PRO W6800)
  OpenCL Device: GPU - Scene: Hotel [Score, More Is Better]: 19388 (SE +/- 187.40, N = 6)
  OpenCL Device: CPU+GPU - Scene: Hotel [Score, More Is Better]: 19317 (SE +/- 65.58, N = 3)
  OpenCL Device: GPU - Scene: Microphone [Score, More Is Better]: 112563 (SE +/- 1027.00, N = 7)
  OpenCL Device: GPU - Scene: Luxball HDR [Score, More Is Better]: 143110 (SE +/- 505.54, N = 3)
  OpenCL Device: CPU+GPU - Scene: Microphone [Score, More Is Better]: 111311 (SE +/- 156.19, N = 3)
  OpenCL Device: CPU+GPU - Scene: Luxball HDR [Score, More Is Better]: 142214 (SE +/- 359.67, N = 3)
Rodinia 3.1 (AMD Radeon PRO W6800)
  Test: OpenCL Myocyte [Seconds, Fewer Is Better]: 9.679 (SE +/- 0.110, N = 15)
  Test: OpenCL Leukocyte [Seconds, Fewer Is Better]: 3.775 (SE +/- 0.033, N = 15)
  (CXX) g++ options: -O2 -lOpenCL
SHOC Scalable HeterOgeneous Computing 2020-04-17 (AMD Radeon PRO W6800)
  Target: OpenCL - Benchmark: S3D [GFLOPS, More Is Better]: 95.95 (SE +/- 0.71, N = 3)
  Target: OpenCL - Benchmark: Triad [GB/s, More Is Better]: 1.9084 (SE +/- 0.0030, N = 3)
  Target: OpenCL - Benchmark: FFT SP [GFLOPS, More Is Better]: 1469.88 (SE +/- 4.15, N = 3)
  Target: OpenCL - Benchmark: MD5 Hash [GHash/s, More Is Better]: 24.22 (SE +/- 0.02, N = 3)
  Target: OpenCL - Benchmark: Reduction [GB/s, More Is Better]: 589.72 (SE +/- 0.29, N = 3)
  Target: OpenCL - Benchmark: GEMM SGEMM_N [GFLOPS, More Is Better]: 4847.52 (SE +/- 50.10, N = 3)
  Target: OpenCL - Benchmark: Max SP Flops [GFLOPS, More Is Better]: 28075778 (SE +/- 320864.78, N = 9)
  Target: OpenCL - Benchmark: Bus Speed Download [GB/s, More Is Better]: 1.9972 (SE +/- 0.0000, N = 3)
  Target: OpenCL - Benchmark: Bus Speed Readback [GB/s, More Is Better]: 1.8595 (SE +/- 0.0000, N = 3)
  Target: OpenCL - Benchmark: Texture Read Bandwidth [GB/s, More Is Better]: 913.65 (SE +/- 0.93, N = 3)
  (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
SmallPT GPU 1.6pts1 (AMD Radeon PRO W6800)
  OpenCL Device: GPU - Resolution: 1920 x 1080 - Scene: Caustic [Samples/sec, More Is Better]: 1703273562 (SE +/- 25.12, N = 3)
  OpenCL Device: GPU - Resolution: 1920 x 1080 - Scene: Cornell [Samples/sec, More Is Better]: 1703273699 (SE +/- 25.12, N = 3)
  OpenCL Device: GPU - Resolution: 1920 x 1080 - Scene: Caustic3 [Samples/sec, More Is Better]: 1703273839 (SE +/- 25.40, N = 3)
  (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
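The three SmallPT GPU figures are ten-digit values that differ only in their last few hundred units, which is unusual for throughput across different scenes. As a sanity check (an observation, not something the export itself states), the sketch below interprets them as Unix epoch seconds and prints the corresponding UTC times. The helper name `as_utc` is hypothetical.

```python
from datetime import datetime, timezone

# Reported "Samples/sec" values for the three SmallPT GPU scenes.
smallpt_values = {
    "Caustic": 1703273562,
    "Cornell": 1703273699,
    "Caustic3": 1703273839,
}


def as_utc(seconds):
    """Interpret an integer as Unix epoch seconds and return a UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


for scene, value in smallpt_values.items():
    print(f"{scene}: {value} -> {as_utc(value).isoformat()}")
```

If the decoded dates line up with the 20231222 date in the result file name, the figures may be timestamps captured in place of throughput, and are worth re-checking against the raw test logs before being compared with other systems.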
ViennaCL 1.7.1, CPU BLAS (AMD Radeon PRO W6800)
  Test: CPU BLAS - sCOPY [GB/s, More Is Better]: 43.1 (SE +/- 0.09, N = 3)
  Test: CPU BLAS - sAXPY [GB/s, More Is Better]: 64.7 (SE +/- 0.18, N = 3)
  Test: CPU BLAS - sDOT [GB/s, More Is Better]: 45.7 (SE +/- 0.17, N = 3)
  Test: CPU BLAS - dCOPY [GB/s, More Is Better]: 37.7 (SE +/- 0.00, N = 3)
  Test: CPU BLAS - dAXPY [GB/s, More Is Better]: 56.6 (SE +/- 0.09, N = 3)
  Test: CPU BLAS - dDOT [GB/s, More Is Better]: 42.1 (SE +/- 0.03, N = 3)
  Test: CPU BLAS - dGEMV-N [GB/s, More Is Better]: 41.5 (SE +/- 0.00, N = 3)
  Test: CPU BLAS - dGEMM-NN [GFLOPs/s, More Is Better]: 48.3 (SE +/- 0.03, N = 3)
  Test: CPU BLAS - dGEMM-NT [GFLOPs/s, More Is Better]: 47.4 (SE +/- 0.00, N = 3)
  Test: CPU BLAS - dGEMM-TN [GFLOPs/s, More Is Better]: 52.8 (SE +/- 0.00, N = 3)
  Test: CPU BLAS - dGEMM-TT [GFLOPs/s, More Is Better]: 50.6 (SE +/- 0.19, N = 3)
  (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
ViennaCL 1.7.1, OpenCL BLAS (AMD Radeon PRO W6800)
  Test: OpenCL BLAS - sCOPY [GB/s, More Is Better]: 511 (SE +/- 0.58, N = 3)
  Test: OpenCL BLAS - sAXPY [GB/s, More Is Better]: 737 (SE +/- 0.33, N = 3)
  Test: OpenCL BLAS - sDOT [GB/s, More Is Better]: 497 (SE +/- 1.86, N = 3)
  Test: OpenCL BLAS - dCOPY [GB/s, More Is Better]: 326 (SE +/- 0.67, N = 3)
  Test: OpenCL BLAS - dAXPY [GB/s, More Is Better]: 359 (SE +/- 0.88, N = 3)
  Test: OpenCL BLAS - dDOT [GB/s, More Is Better]: 346 (SE +/- 0.67, N = 3)
  Test: OpenCL BLAS - dGEMV-N [GB/s, More Is Better]: 140 (SE +/- 1.15, N = 3)
  Test: OpenCL BLAS - dGEMV-T [GB/s, More Is Better]: 469 (SE +/- 0.33, N = 3)
  Test: OpenCL BLAS - dGEMM-NN [GFLOPs/s, More Is Better]: 972 (SE +/- 0.67, N = 3)
  Test: OpenCL BLAS - dGEMM-NT [GFLOPs/s, More Is Better]: 1010 (SE +/- 0.00, N = 3)
  Test: OpenCL BLAS - dGEMM-TN [GFLOPs/s, More Is Better]: 941 (SE +/- 1.15, N = 3)
  Test: OpenCL BLAS - dGEMM-TT [GFLOPs/s, More Is Better]: 991 (SE +/- 0.88, N = 3)
  (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
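ViennaCL is the one suite here that reports the same BLAS routines on both the CPU and the OpenCL device, so the two sets of figures can be divided directly. The sketch below, using only the values reported above, computes the OpenCL-to-CPU ratio for each routine present in both lists. The names `cpu_blas`, `opencl_blas` and `speedups` are illustrative. dGEMV-T is skipped because only the OpenCL result is listed.

```python
# ViennaCL 1.7.1 results copied from the report (GB/s, or GFLOPs/s for dGEMM).
cpu_blas = {
    "sCOPY": 43.1, "sAXPY": 64.7, "sDOT": 45.7,
    "dCOPY": 37.7, "dAXPY": 56.6, "dDOT": 42.1, "dGEMV-N": 41.5,
    "dGEMM-NN": 48.3, "dGEMM-NT": 47.4, "dGEMM-TN": 52.8, "dGEMM-TT": 50.6,
}
opencl_blas = {
    "sCOPY": 511, "sAXPY": 737, "sDOT": 497,
    "dCOPY": 326, "dAXPY": 359, "dDOT": 346, "dGEMV-N": 140, "dGEMV-T": 469,
    "dGEMM-NN": 972, "dGEMM-NT": 1010, "dGEMM-TN": 941, "dGEMM-TT": 991,
}


def speedups(cpu, opencl):
    """Return OpenCL/CPU ratios for routines that appear in both result sets."""
    return {name: opencl[name] / cpu[name] for name in cpu if name in opencl}


for name, ratio in speedups(cpu_blas, opencl_blas).items():
    print(f"{name}: OpenCL is {ratio:.1f}x the CPU result")
```

Each ratio compares two results in the same unit, so the GB/s and GFLOPs/s rows can share one table even though they are not comparable with each other.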
Phoronix Test Suite v10.8.5