20231222um79064thunderbolt.txt
AMD Ryzen 9 7940HS testing with a Shenzhen Meigao Electronic Equipment F7BSC (1.07 BIOS) and AMD Radeon PRO W6800 8GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2312228-NE-20231222U30&grs .
AMD Radeon PRO W6800
  Processor: AMD Ryzen 9 7940HS @ 4.00GHz (8 Cores / 16 Threads)
  Motherboard: Shenzhen Meigao Electronic Equipment F7BSC (1.07 BIOS)
  Chipset: AMD Device 14e8
  Memory: 56GB
  Disk: 4097GB HP SSD FX900 Pro 4TB + 1024GB KINGSTON OM8PGP41024Q-A0
  Graphics: AMD Radeon PRO W6800 8GB (2555/1000MHz)
  Audio: AMD Navi 21 HDMI Audio
  Monitor: DELL ST2210
  Network: Realtek RTL8125 2.5GbE + Intel I210 + Intel Wi-Fi 6 AX210/AX211/AX411
  OS: Ubuntu 22.04
  Kernel: 6.2.0-39-generic (x86_64)
  Desktop: GNOME Shell 42.9
  Display Server: X Server 1.21.1.3 + Wayland
  OpenGL: 4.6 Mesa 23.0.4-0ubuntu1~22.04.1 (LLVM 15.0.7 DRM 3.56)
  OpenCL: OpenCL 2.1 AMD-APP (3602.0)
  Vulkan: 1.3.238
  Compiler: GCC 11.4.0
  File-System: ext4
  Screen Resolution: 1920x1080
- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0xa704103
- BAR1 / Visible vRAM Size: 8192 MB
- vBIOS Version: 113-D4300100-103
- Python 3.10.12
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET no microcode + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
AMD Radeon PRO W6800
  lulesh-cl: 1952.2936
  luxmark: CPU+GPU - Luxball HDR: 142214
  luxmark: CPU+GPU - Microphone: 111311
  luxmark: GPU - Luxball HDR: 143110
  luxmark: GPU - Microphone: 112563
  luxmark: CPU+GPU - Hotel: 19317
  luxmark: GPU - Hotel: 19388
  smallpt-gpu: GPU - 1920 x 1080 - Caustic3: 1703273839
  smallpt-gpu: GPU - 1920 x 1080 - Cornell: 1703273699
  smallpt-gpu: GPU - 1920 x 1080 - Caustic: 1703273562
  darktable: Server Room - OpenCL: 1.099
  darktable: Server Rack - OpenCL: 0.447
  darktable: Masskrug - OpenCL: 3.626
  darktable: Boat - OpenCL: 2.829
  viennacl: OpenCL BLAS - dGEMM-TT: 991
  viennacl: OpenCL BLAS - dGEMM-TN: 941
  viennacl: OpenCL BLAS - dGEMM-NT: 1010
  viennacl: OpenCL BLAS - dGEMM-NN: 972
  viennacl: OpenCL BLAS - dGEMV-T: 469
  viennacl: OpenCL BLAS - dGEMV-N: 140
  viennacl: OpenCL BLAS - dDOT: 346
  viennacl: OpenCL BLAS - dAXPY: 359
  viennacl: OpenCL BLAS - dCOPY: 326
  viennacl: OpenCL BLAS - sDOT: 497
  viennacl: OpenCL BLAS - sAXPY: 737
  viennacl: OpenCL BLAS - sCOPY: 511
  viennacl: CPU BLAS - dGEMM-TT: 50.6
  viennacl: CPU BLAS - dGEMM-TN: 52.8
  viennacl: CPU BLAS - dGEMM-NT: 47.4
  viennacl: CPU BLAS - dGEMM-NN: 48.3
  viennacl: CPU BLAS - dGEMV-N: 41.5
  viennacl: CPU BLAS - dDOT: 42.1
  viennacl: CPU BLAS - dAXPY: 56.6
  viennacl: CPU BLAS - dCOPY: 37.7
  viennacl: CPU BLAS - sDOT: 45.7
  viennacl: CPU BLAS - sAXPY: 64.7
  viennacl: CPU BLAS - sCOPY: 43.1
  rodinia: OpenCL Leukocyte: 3.775
  rodinia: OpenCL Myocyte: 9.679
  clpeak: Transfer Bandwidth enqueueWriteBuffer: 21.56
  clpeak: Transfer Bandwidth enqueueReadBuffer: 4.99
  clpeak: Single-Precision Compute: 17191.60
  clpeak: Double-Precision Compute: 1172.37
  clpeak: Global Memory Bandwidth: 373.03
  clpeak: Integer 24-bit Compute: 15708.24
  clpeak: Integer Compute: 3654.24
  clpeak: Kernel Latency: 19.26
  fluidx3d: FP32-FP16S: 5376
  fluidx3d: FP32-FP16C: 5166
  fluidx3d: FP32-FP32: 3406
  cl-mem: Write: 369.7
  cl-mem: Read: 413.5
  cl-mem: Copy: 326.9
  shoc: OpenCL - Texture Read Bandwidth: 913.650
  shoc: OpenCL - Bus Speed Readback: 1.8595
  shoc: OpenCL - Bus Speed Download: 1.9972
  shoc: OpenCL - Max SP Flops: 28075778
  shoc: OpenCL - GEMM SGEMM_N: 4847.52
  shoc: OpenCL - Reduction: 589.718
  shoc: OpenCL - MD5 Hash: 24.2202
  shoc: OpenCL - FFT SP: 1469.88
  shoc: OpenCL - Triad: 1.9084
  shoc: OpenCL - S3D: 95.9500
  parboil: OpenCL BFS:
Lulesh OpenCL 2017-07-06 (AMD Radeon PRO W6800)
  Lulesh OpenCL: 1952.29 (z/s, More Is Better; SE +/- 21.61, N = 5)
  1. (CXX) g++ options: -std=c++11 -lOpenCL -O3 -lm
LuxMark 3.1 (AMD Radeon PRO W6800)
  OpenCL Device: CPU+GPU - Scene: Luxball HDR: 142214 (Score, More Is Better; SE +/- 359.67, N = 3)
  OpenCL Device: CPU+GPU - Scene: Microphone: 111311 (Score, More Is Better; SE +/- 156.19, N = 3)
  OpenCL Device: GPU - Scene: Luxball HDR: 143110 (Score, More Is Better; SE +/- 505.54, N = 3)
  OpenCL Device: GPU - Scene: Microphone: 112563 (Score, More Is Better; SE +/- 1027.00, N = 7)
  OpenCL Device: CPU+GPU - Scene: Hotel: 19317 (Score, More Is Better; SE +/- 65.58, N = 3)
  OpenCL Device: GPU - Scene: Hotel: 19388 (Score, More Is Better; SE +/- 187.40, N = 6)
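The CPU+GPU and GPU-only LuxMark scores above differ only slightly in each scene. As a rough check of whether those gaps exceed run-to-run noise, the sketch below divides each gap by the combined standard error of the two results. It assumes the reported "SE +/-" figures are standard errors of the mean and that the two runs are independent; the function and variable names are illustrative and not part of the Phoronix Test Suite.

```python
import math

# (score, standard error) pairs copied from the LuxMark 3.1 results above.
luxmark = {
    "Luxball HDR": {"CPU+GPU": (142214, 359.67), "GPU": (143110, 505.54)},
    "Microphone": {"CPU+GPU": (111311, 156.19), "GPU": (112563, 1027.00)},
    "Hotel": {"CPU+GPU": (19317, 65.58), "GPU": (19388, 187.40)},
}


def gap_in_standard_errors(a, b):
    """Return |a - b| divided by the combined standard error.

    Assumes both SE values are standard errors of the mean and that the
    two measurements are independent (an assumption, not something the
    result file states).
    """
    (mean_a, se_a), (mean_b, se_b) = a, b
    return abs(mean_a - mean_b) / math.hypot(se_a, se_b)


gaps = {
    scene: gap_in_standard_errors(result["CPU+GPU"], result["GPU"])
    for scene, result in luxmark.items()
}

for scene, gap in gaps.items():
    print(f"{scene}: gap = {gap:.2f} combined standard errors")
```

Under those assumptions each gap comes out below two combined standard errors, so these results alone do not show a clear difference between the two device selections.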
SmallPT GPU 1.6pts1 (AMD Radeon PRO W6800)
  OpenCL Device: GPU - Resolution: 1920 x 1080 - Scene: Caustic3: 1703273839 (Samples/sec, More Is Better; SE +/- 25.40, N = 3)
  OpenCL Device: GPU - Resolution: 1920 x 1080 - Scene: Cornell: 1703273699 (Samples/sec, More Is Better; SE +/- 25.12, N = 3)
  OpenCL Device: GPU - Resolution: 1920 x 1080 - Scene: Caustic: 1703273562 (Samples/sec, More Is Better; SE +/- 25.12, N = 3)
  1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
Darktable 3.8.1 (AMD Radeon PRO W6800)
  Test: Server Room - Acceleration: OpenCL: 1.099 (Seconds, Fewer Is Better; SE +/- 0.016, N = 3)
  Test: Server Rack - Acceleration: OpenCL: 0.447 (Seconds, Fewer Is Better; SE +/- 0.005, N = 15)
  Test: Masskrug - Acceleration: OpenCL: 3.626 (Seconds, Fewer Is Better; SE +/- 0.033, N = 7)
  Test: Boat - Acceleration: OpenCL: 2.829 (Seconds, Fewer Is Better; SE +/- 0.005, N = 3)
ViennaCL 1.7.1 (AMD Radeon PRO W6800)
  Test: OpenCL BLAS - dGEMM-TT: 991 (GFLOPs/s, More Is Better; SE +/- 0.88, N = 3)
  Test: OpenCL BLAS - dGEMM-TN: 941 (GFLOPs/s, More Is Better; SE +/- 1.15, N = 3)
  Test: OpenCL BLAS - dGEMM-NT: 1010 (GFLOPs/s, More Is Better; SE +/- 0.00, N = 3)
  Test: OpenCL BLAS - dGEMM-NN: 972 (GFLOPs/s, More Is Better; SE +/- 0.67, N = 3)
  Test: OpenCL BLAS - dGEMV-T: 469 (GB/s, More Is Better; SE +/- 0.33, N = 3)
  Test: OpenCL BLAS - dGEMV-N: 140 (GB/s, More Is Better; SE +/- 1.15, N = 3)
  Test: OpenCL BLAS - dDOT: 346 (GB/s, More Is Better; SE +/- 0.67, N = 3)
  Test: OpenCL BLAS - dAXPY: 359 (GB/s, More Is Better; SE +/- 0.88, N = 3)
  Test: OpenCL BLAS - dCOPY: 326 (GB/s, More Is Better; SE +/- 0.67, N = 3)
  Test: OpenCL BLAS - sDOT: 497 (GB/s, More Is Better; SE +/- 1.86, N = 3)
  Test: OpenCL BLAS - sAXPY: 737 (GB/s, More Is Better; SE +/- 0.33, N = 3)
  Test: OpenCL BLAS - sCOPY: 511 (GB/s, More Is Better; SE +/- 0.58, N = 3)
  Test: CPU BLAS - dGEMM-TT: 50.6 (GFLOPs/s, More Is Better; SE +/- 0.19, N = 3)
  Test: CPU BLAS - dGEMM-TN: 52.8 (GFLOPs/s, More Is Better; SE +/- 0.00, N = 3)
  Test: CPU BLAS - dGEMM-NT: 47.4 (GFLOPs/s, More Is Better; SE +/- 0.00, N = 3)
  Test: CPU BLAS - dGEMM-NN: 48.3 (GFLOPs/s, More Is Better; SE +/- 0.03, N = 3)
  Test: CPU BLAS - dGEMV-N: 41.5 (GB/s, More Is Better; SE +/- 0.00, N = 3)
  Test: CPU BLAS - dDOT: 42.1 (GB/s, More Is Better; SE +/- 0.03, N = 3)
  Test: CPU BLAS - dAXPY: 56.6 (GB/s, More Is Better; SE +/- 0.09, N = 3)
  Test: CPU BLAS - dCOPY: 37.7 (GB/s, More Is Better; SE +/- 0.00, N = 3)
  Test: CPU BLAS - sDOT: 45.7 (GB/s, More Is Better; SE +/- 0.17, N = 3)
  Test: CPU BLAS - sAXPY: 64.7 (GB/s, More Is Better; SE +/- 0.18, N = 3)
  Test: CPU BLAS - sCOPY: 43.1 (GB/s, More Is Better; SE +/- 0.09, N = 3)
  1. (CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
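The ViennaCL OpenCL BLAS and CPU BLAS figures above use the same unit for each test, so a per-test ratio gives a quick view of how much faster the OpenCL path ran in this result file. The sketch below is plain arithmetic on the reported means. dGEMV-T is left out of the comparison because no CPU BLAS result for it appears in the file, and the dictionary and variable names are illustrative.

```python
# Mean results copied from the ViennaCL 1.7.1 entries above.
# dGEMM-* values are in GFLOPs/s; all other tests are in GB/s.
opencl_blas = {
    "dGEMM-TT": 991, "dGEMM-TN": 941, "dGEMM-NT": 1010, "dGEMM-NN": 972,
    "dGEMV-T": 469, "dGEMV-N": 140, "dDOT": 346, "dAXPY": 359,
    "dCOPY": 326, "sDOT": 497, "sAXPY": 737, "sCOPY": 511,
}
cpu_blas = {
    "dGEMM-TT": 50.6, "dGEMM-TN": 52.8, "dGEMM-NT": 47.4, "dGEMM-NN": 48.3,
    "dGEMV-N": 41.5, "dDOT": 42.1, "dAXPY": 56.6,
    "dCOPY": 37.7, "sDOT": 45.7, "sAXPY": 64.7, "sCOPY": 43.1,
}

# Only tests present in both sets can be compared.
speedup = {
    name: opencl_blas[name] / cpu_blas[name]
    for name in opencl_blas
    if name in cpu_blas
}

# Print the ratios from largest to smallest.
for name, ratio in sorted(speedup.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {ratio:.1f}x")
```

On these numbers the dGEMM variants show the largest ratios (roughly 18x to 21x), while dGEMV-N shows the smallest (about 3.4x).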
Rodinia 3.1 (AMD Radeon PRO W6800)
  Test: OpenCL Leukocyte: 3.775 (Seconds, Fewer Is Better; SE +/- 0.033, N = 15)
  Test: OpenCL Myocyte: 9.679 (Seconds, Fewer Is Better; SE +/- 0.110, N = 15)
  1. (CXX) g++ options: -O2 -lOpenCL
clpeak 1.1.2 (AMD Radeon PRO W6800)
  OpenCL Test: Transfer Bandwidth enqueueWriteBuffer: 21.56 (GBPS, More Is Better; SE +/- 0.18, N = 15)
  OpenCL Test: Transfer Bandwidth enqueueReadBuffer: 4.99 (GBPS, More Is Better; SE +/- 0.03, N = 3)
  OpenCL Test: Single-Precision Compute: 17191.60 (GFLOPS, More Is Better; SE +/- 82.20, N = 3)
  OpenCL Test: Double-Precision Compute: 1172.37 (GFLOPS, More Is Better; SE +/- 0.55, N = 3)
  OpenCL Test: Global Memory Bandwidth: 373.03 (GBPS, More Is Better; SE +/- 0.33, N = 3)
  OpenCL Test: Integer 24-bit Compute: 15708.24 (GIOPS, More Is Better; SE +/- 115.06, N = 3)
  OpenCL Test: Integer Compute: 3654.24 (GIOPS, More Is Better; SE +/- 2.74, N = 3)
  OpenCL Test: Kernel Latency: 19.26 (us, Fewer Is Better; SE +/- 0.20, N = 15)
  1. (CXX) g++ options: -O3
FluidX3D 2.9 (AMD Radeon PRO W6800)
  Test: FP32-FP16S: 5376 (MLUPs/s, More Is Better; SE +/- 66.77, N = 3)
  Test: FP32-FP16C: 5166 (MLUPs/s, More Is Better; SE +/- 28.21, N = 3)
  Test: FP32-FP32: 3406 (MLUPs/s, More Is Better; SE +/- 7.69, N = 3)
cl-mem 2017-01-13 (AMD Radeon PRO W6800)
  Benchmark: Write: 369.7 (GB/s, More Is Better; SE +/- 1.59, N = 3)
  Benchmark: Read: 413.5 (GB/s, More Is Better; SE +/- 1.72, N = 3)
  Benchmark: Copy: 326.9 (GB/s, More Is Better; SE +/- 0.50, N = 3)
  1. (CC) gcc options: -O2 -flto -lOpenCL
SHOC Scalable HeterOgeneous Computing 2020-04-17 (AMD Radeon PRO W6800)
  Target: OpenCL - Benchmark: Texture Read Bandwidth: 913.65 (GB/s, More Is Better; SE +/- 0.93, N = 3)
  Target: OpenCL - Benchmark: Bus Speed Readback: 1.8595 (GB/s, More Is Better; SE +/- 0.0000, N = 3)
  Target: OpenCL - Benchmark: Bus Speed Download: 1.9972 (GB/s, More Is Better; SE +/- 0.0000, N = 3)
  Target: OpenCL - Benchmark: Max SP Flops: 28075778 (GFLOPS, More Is Better; SE +/- 320864.78, N = 9)
  Target: OpenCL - Benchmark: GEMM SGEMM_N: 4847.52 (GFLOPS, More Is Better; SE +/- 50.10, N = 3)
  Target: OpenCL - Benchmark: Reduction: 589.72 (GB/s, More Is Better; SE +/- 0.29, N = 3)
  Target: OpenCL - Benchmark: MD5 Hash: 24.22 (GHash/s, More Is Better; SE +/- 0.02, N = 3)
  Target: OpenCL - Benchmark: FFT SP: 1469.88 (GFLOPS, More Is Better; SE +/- 4.15, N = 3)
  Target: OpenCL - Benchmark: Triad: 1.9084 (GB/s, More Is Better; SE +/- 0.0030, N = 3)
  Target: OpenCL - Benchmark: S3D: 95.95 (GFLOPS, More Is Better; SE +/- 0.71, N = 3)
  1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
Phoronix Test Suite v10.8.5