OpenCL August

Fresh NVIDIA vs. Radeon OpenCL Linux benchmarks. Tests by Michael Larabel for a future article on Phoronix.com.

HTML result view exported from: https://openbenchmarking.org/result/1808234-PTS-OPENCLAU56.

System Details

Common to all configurations:
  Processor: AMD Ryzen Threadripper 2990WX 32-Core @ 3.00GHz (32 Cores / 64 Threads)
  Motherboard: ASUS ROG ZENITH EXTREME (1402 BIOS)
  Chipset: AMD Family 17h
  Memory: 32768MB
  Disk: Samsung SSD 970 EVO 500GB
  Audio: Realtek ALC1220
  Monitor: ASUS VP28U
  Network: Intel I211 Gigabit Connection + Qualcomm Atheros QCA6174 802.11ac Wireless
  OS: Ubuntu 18.04
  Kernel: 4.15.0-33-generic (x86_64)
  Desktop: GNOME Shell 3.28.3
  Display Server: X Server 1.19.6
  Compiler: GCC 7.3.0
  File-System: ext4
  Screen Resolution: 3840x2160

Radeon RX Vega 56, Radeon RX Vega 64:
  Graphics: AMD Radeon RX Vega 8176MB
  Display Driver: amdgpu 18.0.99
  OpenGL: 4.6.13536
  OpenCL: OpenCL 2.1 AMD-APP (2671.3)

GeForce GTX 1070, GeForce GTX 1070 Ti, GeForce GTX 1080, GeForce GTX 1080 Ti:
  Display Driver: NVIDIA 396.54
  OpenGL: 4.6.0
  OpenCL: OpenCL 1.2 CUDA 9.2.210
  Graphics (GeForce GTX 1070): NVIDIA GeForce GTX 1070 8192MB (1506/4006MHz)
  Graphics (GeForce GTX 1070 Ti): Zotac NVIDIA GeForce GTX 1070 Ti 8192MB (1607/4006MHz)
  Graphics (GeForce GTX 1080): NVIDIA GeForce GTX 1080 8192MB (1607/5005MHz)
  Graphics (GeForce GTX 1080 Ti): NVIDIA GeForce GTX 1080 Ti 11264MB (1480/5508MHz)

Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-as=/usr/bin/x86_64-linux-gnu-as --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-ld=/usr/bin/x86_64-linux-gnu-ld --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v

Processor Details
  Scaling Governor: acpi-cpufreq ondemand

Graphics Details
  Radeon RX Vega 56, Radeon RX Vega 64: GLAMOR

Python Details
  Radeon RX Vega 56: Python 2.7.15rc1 + Python 3.6.5

Security Details
  __user pointer sanitization + Full AMD retpoline IBPB + SSB disabled via prctl and seccomp Protection

OpenCL Details
  GeForce GTX 1070: GPU Compute Cores: 1920
  GeForce GTX 1070 Ti: GPU Compute Cores: 2432
  GeForce GTX 1080: GPU Compute Cores: 2560
  GeForce GTX 1080 Ti: GPU Compute Cores: 3584

Results Overview

Test (unit)                                   RX Vega 56     RX Vega 64     GTX 1070       GTX 1070 Ti    GTX 1080       GTX 1080 Ti
shoc: OpenCL - FFT SP (GFLOPS)                830.77         882.96         528.71         551.95         650.78         984.61
shoc: OpenCL - MD5 Hash (GHash/s)             13.10          17.12          10.70          13.80          14.40          19.72
shoc: OpenCL - Texture Read Bandwidth (GB/s)  362.97         427.74         456.10         501.46         523.65         593.14
cl-mem: Copy (GB/s)                           313.30         369.93         186.87         186.80         209.33         317.37
cl-mem: Read (GB/s)                           346.47         399.00         205.50         205.63         228.40         337.73
cl-mem: Write (GB/s)                          333.20         388.57         192.30         191.00         216.70         336.30
fahbench (Ns Per Day)                         92.48          92.46          131.18         145.38         141.38         179.34
mandelgpu: GPU (Samples/sec)                  165896206.17   192917451.23   148507150.87   184334944.10   188560526.97   250678650.87
luxmark: GPU - Hotel (Score)                  5083           5907           3820           4405           3883           5662
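
Because every test in the overview is "More Is Better", one simple way to read a row is to express each card as a percentage of a chosen baseline. The following is a minimal sketch of that arithmetic, assuming the SHOC FFT SP figures from the overview and using the Radeon RX Vega 56 as the baseline. The names `fft_sp_gflops` and `relative_to` are illustrative only and are not part of the Phoronix Test Suite.

```python
# SHOC FFT SP results in GFLOPS, copied from the results overview.
fft_sp_gflops = {
    "Radeon RX Vega 56": 830.77,
    "Radeon RX Vega 64": 882.96,
    "GeForce GTX 1070": 528.71,
    "GeForce GTX 1070 Ti": 551.95,
    "GeForce GTX 1080": 650.78,
    "GeForce GTX 1080 Ti": 984.61,
}


def relative_to(results, baseline):
    """Return each result as a percentage of the baseline entry.

    Hypothetical helper for illustration; assumes higher values are better.
    """
    base = results[baseline]
    return {name: 100.0 * value / base for name, value in results.items()}


if __name__ == "__main__":
    percentages = relative_to(fft_sp_gflops, "Radeon RX Vega 56")
    for name, pct in percentages.items():
        print(f"{name:<22} {pct:6.1f}%")
```

The same helper can be applied to any other row of the overview by swapping in that row's values. It ignores the reported standard errors, so small differences between cards should be read with the SE figures of the individual tests in mind.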

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2015-11-10
GFLOPS, More Is Better

Radeon RX Vega 56     830.77   (SE +/- 31.13, N = 12)
Radeon RX Vega 64     882.96   (SE +/- 14.14, N = 12)
GeForce GTX 1070      528.71   (SE +/- 2.10, N = 3)
GeForce GTX 1070 Ti   551.95   (SE +/- 1.30, N = 3)
GeForce GTX 1080      650.78   (SE +/- 1.65, N = 3)
GeForce GTX 1080 Ti   984.61   (SE +/- 1.72, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2015-11-10
GHash/s, More Is Better

Radeon RX Vega 56     13.10   (SE +/- 0.01, N = 3)
Radeon RX Vega 64     17.12   (SE +/- 0.00, N = 3)
GeForce GTX 1070      10.70   (SE +/- 0.02, N = 3)
GeForce GTX 1070 Ti   13.80   (SE +/- 0.00, N = 3)
GeForce GTX 1080      14.40   (SE +/- 0.02, N = 3)
GeForce GTX 1080 Ti   19.72   (SE +/- 0.02, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2015-11-10
GB/s, More Is Better

Radeon RX Vega 56     362.97   (SE +/- 1.35, N = 3)
Radeon RX Vega 64     427.74   (SE +/- 0.13, N = 3)
GeForce GTX 1070      456.10   (SE +/- 0.38, N = 3)
GeForce GTX 1070 Ti   501.46   (SE +/- 1.71, N = 3)
GeForce GTX 1080      523.65   (SE +/- 2.76, N = 3)
GeForce GTX 1080 Ti   593.14   (SE +/- 0.88, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

cl-mem

Benchmark: Copy

cl-mem 2017-01-13
GB/s, More Is Better

Radeon RX Vega 56     313.30   (SE +/- 0.30, N = 3)
Radeon RX Vega 64     369.93   (SE +/- 0.03, N = 3)
GeForce GTX 1070      186.87   (SE +/- 0.03, N = 3)
GeForce GTX 1070 Ti   186.80   (SE +/- 0.00, N = 3)
GeForce GTX 1080      209.33   (SE +/- 0.03, N = 3)
GeForce GTX 1080 Ti   317.37   (SE +/- 0.15, N = 3)

1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13
GB/s, More Is Better

Radeon RX Vega 56     346.47   (SE +/- 0.52, N = 3)
Radeon RX Vega 64     399.00   (SE +/- 0.10, N = 3)
GeForce GTX 1070      205.50   (SE +/- 0.10, N = 3)
GeForce GTX 1070 Ti   205.63   (SE +/- 0.09, N = 3)
GeForce GTX 1080      228.40   (SE +/- 0.20, N = 3)
GeForce GTX 1080 Ti   337.73   (SE +/- 0.30, N = 3)

1. (CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

cl-mem 2017-01-13
GB/s, More Is Better

Radeon RX Vega 56     333.20
Radeon RX Vega 64     388.57
GeForce GTX 1070      192.30
GeForce GTX 1070 Ti   191.00
GeForce GTX 1080      216.70
GeForce GTX 1080 Ti   336.30

Standard errors as exported (N = 3 each): SE +/- 0.40, 0.71, 0.00, 0.06, 0.10. The export lists five values for six results, so they are not assigned to individual cards here.

1. (CC) gcc options: -O2 -flto -lOpenCL

FAHBench

FAHBench 2.3.2
Ns Per Day, More Is Better

Radeon RX Vega 56     92.48    (SE +/- 0.14, N = 3)
Radeon RX Vega 64     92.46    (SE +/- 0.82, N = 3)
GeForce GTX 1070      131.18   (SE +/- 0.18, N = 3)
GeForce GTX 1070 Ti   145.38   (SE +/- 0.17, N = 3)
GeForce GTX 1080      141.38   (SE +/- 0.05, N = 3)
GeForce GTX 1080 Ti   179.34   (SE +/- 0.34, N = 3)

MandelGPU

OpenCL Device: GPU

MandelGPU 1.3pts1
Samples/sec, More Is Better

Radeon RX Vega 56     165896206.17   (SE +/- 144327.89, N = 3)
Radeon RX Vega 64     192917451.23   (SE +/- 234631.71, N = 3)
GeForce GTX 1070      148507150.87   (SE +/- 483103.56, N = 3)
GeForce GTX 1070 Ti   184334944.10   (SE +/- 572304.04, N = 3)
GeForce GTX 1080      188560526.97   (SE +/- 430789.92, N = 3)
GeForce GTX 1080 Ti   250678650.87   (SE +/- 727895.77, N = 3)

1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

LuxMark

OpenCL Device: GPU - Scene: Hotel

LuxMark 3.1
Score, More Is Better

Radeon RX Vega 56     5083
Radeon RX Vega 64     5907
GeForce GTX 1070      3820
GeForce GTX 1070 Ti   4405
GeForce GTX 1080      3883
GeForce GTX 1080 Ti   5662

Standard errors as exported (N = 3 each): SE +/- 4.18, 32.69, 1.33, 12.60. The export lists four values for six results, so they are not assigned to individual cards here.


Phoronix Test Suite v10.8.4