OpenCL High-End GPU Comparison On Linux

High-end graphics card tests of OpenCL workloads under Ubuntu 14.04 LTS Linux by Michael Larabel for a future article on Phoronix.com.

HTML result view exported from: https://openbenchmarking.org/result/1405317-PL-1405314PL18.

System configurations. The export lists every field for the first configuration and, for each later one, only the fields shown below.

GeForce GTX 680
  Processor: Intel Core i7-4770K @ 3.50GHz (8 Cores)
  Motherboard: ECS Z87H3-A2X EXTREME v1.0
  Chipset: Intel 4th Gen Core DRAM
  Memory: 16384MB
  Disk: 120GB Samsung SSD 840
  Graphics: NVIDIA GeForce GTX 680 2048MB (1006/3004MHz)
  Audio: Realtek ALC1150
  Monitor: Samsung SyncMaster
  Network: Realtek RTL8111/8168/8411
  OS: Ubuntu 14.04
  Kernel: 3.13.0-24-generic (x86_64)
  Desktop: Unity 7.2.0
  Display Server: X Server 1.15.1
  Display Driver: NVIDIA 337.19
  OpenGL: 4.3.0
  Compiler: GCC 4.8.2
  File-System: ext4
  Screen Resolution: 2560x1600

GeForce GTX 750 Ti
  Graphics: NVIDIA GeForce GTX 750 Ti 2048MB (1019/2700MHz)

GeForce GTX 760
  Graphics: NVIDIA GeForce GTX 760 2048MB (540/3004MHz)

GeForce GTX 770
  Graphics: NVIDIA GeForce GTX 770 2048MB (1045/3505MHz)

GeForce GTX 780 Ti
  Graphics: NVIDIA GeForce GTX 780 Ti 3072MB (875/3500MHz)

GeForce GTX TITAN
  Graphics: NVIDIA GeForce GTX TITAN 6144MB (836/3004MHz)

Radeon HD 7950
  Graphics: XFX AMD Radeon HD 7900 3072MB (900/1375MHz)
  Monitor: SyncMaster
  Display Driver: fglrx 14.10.2
  OpenGL: 4.3.12874

Radeon R9 270X
  Graphics: Supported device 6810 2048MB (1100/1400MHz)

Radeon R9 290
  Graphics: Supported device 67B1 4096MB (947/1250MHz)

GT740M
  Processor: Intel Core i7-4700MQ @ 3.40GHz (8 Cores)
  Motherboard: MSI MS-1758
  Chipset: Intel Xeon E3-1200 v3/4th
  Disk: 120GB INTEL SSDSC2CT12 + 1000GB HGST HTS721010A9
  Graphics: MSI NVIDIA GeForce GT 740M 2048MB (540/900MHz)
  Audio: Intel Xeon E3-1200 v3/4th
  Network: Qualcomm Atheros AR8161 Gigabit + Realtek RTL8723AE PCIe Wireless
  Kernel: 3.15.0-031500rc2-generic (x86_64)
  Desktop: KDE 4.13.0
  Display Driver: NVIDIA 331.38
  OpenGL: 4.3.0
  Screen Resolution: 1920x1080

Quadro FX 580
  Processor: Intel Celeron J1900 @ 1.99GHz (4 Cores)
  Motherboard: ASRock Q1900B-ITX
  Chipset: Intel ValleyView SSA-CUnit
  Disk: 120GB GOODRAM C50
  Graphics: NVIDIA Quadro FX 580 512MB (450/800MHz)
  Audio: Realtek ALC662 rev1
  Monitor: LG IPS224
  Network: Realtek RTL8111/8168/8411
  Kernel: 3.13.0-27-generic (x86_64)
  Desktop: Xfce 4.10
  OpenGL: 3.3.0
  Compiler: GCC 4.8.2 + CUDA 5.5

HD 6450
  Graphics: AMD Radeon HD 6450 1024MB
  Monitor: IPS224
  Display Driver: fglrx 13.35.5
  OpenGL: 4.3.12798

Compiler Details
- --build=x86_64-linux-gnu --disable-browser-plugin --disable-libmudflap --disable-werror --enable-checking=release --enable-clocale=gnu --enable-gnu-unique-object --enable-gtk-cairo --enable-java-awt=gtk --enable-java-home --enable-languages=c,c++,java,go,d,fortran,objc,obj-c++ --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-arch-directory=amd64 --with-multilib-list=m32,m64,mx32 --with-tune=generic -v

Processor Details
- GeForce GTX 680: Scaling Governor: acpi-cpufreq ondemand
- GeForce GTX 750 Ti: Scaling Governor: acpi-cpufreq ondemand
- GeForce GTX 760: Scaling Governor: acpi-cpufreq ondemand
- GeForce GTX 770: Scaling Governor: acpi-cpufreq ondemand
- GeForce GTX 780 Ti: Scaling Governor: acpi-cpufreq ondemand
- GeForce GTX TITAN: Scaling Governor: acpi-cpufreq ondemand
- Radeon HD 7950: Scaling Governor: acpi-cpufreq ondemand
- Radeon R9 270X: Scaling Governor: acpi-cpufreq ondemand
- Radeon R9 290: Scaling Governor: acpi-cpufreq ondemand
- GT740M: Scaling Governor: intel_pstate performance
- Quadro FX 580: Scaling Governor: acpi-cpufreq ondemand
- HD 6450: Scaling Governor: acpi-cpufreq ondemand

OpenCL Details
- GeForce GTX 680: GPU Compute Cores: 1536
- GeForce GTX 750 Ti: GPU Compute Cores: 640
- GeForce GTX 760: GPU Compute Cores: 1152
- GeForce GTX 770: GPU Compute Cores: 1536
- GeForce GTX 780 Ti: GPU Compute Cores: 2880
- GeForce GTX TITAN: GPU Compute Cores: 2688
- GT740M: GPU Compute Cores: 384
- Quadro FX 580: GPU Compute Cores: 32

Environment Details
- Radeon HD 7950: LIBGL_DRIVERS_PATH=/usr/lib/i386-linux-gnu/dri:/usr/lib/x86_64-linux-gnu/dri
- Radeon R9 270X: LIBGL_DRIVERS_PATH=/usr/lib/i386-linux-gnu/dri:/usr/lib/x86_64-linux-gnu/dri
- Radeon R9 290: LIBGL_DRIVERS_PATH=/usr/lib/i386-linux-gnu/dri:/usr/lib/x86_64-linux-gnu/dri
- HD 6450: LIBGL_DRIVERS_PATH=/usr/lib/fglrx/dri:/usr/lib/x86_64-linux-gnu/dri:/usr/lib/dri:/usr/lib32/fglrx/dri:/usr/lib/i386-linux-gnu/dri

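Results overview. A card missing from a table has no results for that suite; "-" marks a single missing result.

Rodinia (seconds, fewer is better)
Card                  Myocyte  Heartwall  Particle Filter
GeForce GTX 680         50.45       5.16            17.93
GeForce GTX 750 Ti      43.39       4.63            37.53
GeForce GTX 760         50.05       7.94            22.30
GeForce GTX 770         48.74       4.94            17.06
GeForce GTX 780 Ti      53.86       3.09            12.95
GeForce GTX TITAN       54.90          -            14.16
Radeon HD 7950         364.23       5.59            10.79
Radeon R9 270X         337.97       5.18            13.01
Radeon R9 290           79.56       5.63             8.27

JuliaGPU, MandelbulbGPU, MandelGPU (samples/sec on the GPU device, more is better)
Card                     JuliaGPU  MandelbulbGPU    MandelGPU
GeForce GTX 680       39685064.10    18694237.53  42837373.23
GeForce GTX 750 Ti    37946110.43    14849487.77  27866753.47
GeForce GTX 760       31803078.23    14878137.40  33501696.43
GeForce GTX 770       41256658.57    19542653.20  45359706.67
GeForce GTX 780 Ti    66967826.77    31666087.67  64490219.13
GeForce GTX TITAN     57954293.97    28313279.83  61797466.57
GT740M                11404297.57     5327063.20   8876208.80
Quadro FX 580          1675692.03      978506.40    853461.73

LuxMark (GPU score, more is better)
Card                  Room   Sala  Luxball HDR
GeForce GTX 680        654   1314         8931
GeForce GTX 750 Ti     588   1020         8105
GeForce GTX 760        547   1095         7667
GeForce GTX 770        685   1372         9343
GeForce GTX 780 Ti    1214   2406        18085
GeForce GTX TITAN     1090   2211        16406
Radeon HD 7950         983   1810        12550
Radeon R9 270X         836   1477        10671
Radeon R9 290         1312   2343        16762
GT740M                 111    235         1595
Quadro FX 580            -     21          165
HD 6450                 33     62          598

OpenDwarfs (ms, fewer is better)
Card                  LU Decomposition  Compressed Sparse Row  Cyclic Redundancy Check
GeForce GTX 680                  56.80                   5.82                     0.06
GeForce GTX 750 Ti               64.67                   5.54                     0.14
GeForce GTX 760                  69.77                   5.77                     0.07
GeForce GTX 770                  54.63                   5.72                     0.05
GeForce GTX 780 Ti               44.93                   5.43                     0.04
GeForce GTX TITAN                51.59                   5.70                     0.05
Radeon HD 7950                   83.95                  21.72                     0.03
Radeon R9 270X                   68.84                  16.88                     0.04
Radeon R9 290                    69.11                  13.69                     0.03
GT740M                          173.66                   5.46                     0.24
Quadro FX 580                   904.77                  94.84                     2.89
HD 6450                         771.37                  60.07                     2.02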

Rodinia

Test: OpenCL Myocyte

Rodinia 2.4 (Seconds, Fewer Is Better)

GeForce GTX 680:      50.45  (SE +/- 0.04, N = 3)
GeForce GTX 750 Ti:   43.39  (SE +/- 0.84, N = 3)
GeForce GTX 760:      50.05  (SE +/- 0.72, N = 3)
GeForce GTX 770:      48.74  (SE +/- 0.08, N = 3)
GeForce GTX 780 Ti:   53.86  (SE +/- 0.91, N = 3)
GeForce GTX TITAN:    54.90  (SE +/- 0.09, N = 3)
Radeon HD 7950:      364.23  (SE +/- 5.51, N = 3)
Radeon R9 270X:      337.97  (SE +/- 2.16, N = 3)
Radeon R9 290:        79.56  (SE +/- 0.21, N = 3)

1. (CXX) g++ options: -O2 -lOpenCL
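
The SE figures give a rough way to judge whether two Myocyte times are really distinguishable. The following is a minimal Python sketch, assuming the listed SE is the standard error of the mean over the N runs. The two-combined-standard-errors threshold and the function name clearly_different are choices made here for illustration, not something the export applies.

```python
import math

# Myocyte results from the chart above: (mean seconds, standard error).
myocyte = {
    "GeForce GTX 680": (50.45, 0.04),
    "GeForce GTX 760": (50.05, 0.72),
    "GeForce GTX 750 Ti": (43.39, 0.84),
}

def clearly_different(a, b, k=2.0):
    """Return True if two means differ by more than k combined standard errors.

    This is a rough heuristic (assumption), not a formal significance test.
    """
    (mean_a, se_a), (mean_b, se_b) = a, b
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) > k * combined_se

gtx680_vs_760 = clearly_different(myocyte["GeForce GTX 680"], myocyte["GeForce GTX 760"])
gtx680_vs_750ti = clearly_different(myocyte["GeForce GTX 680"], myocyte["GeForce GTX 750 Ti"])
print("GTX 680 vs GTX 760 clearly different:", gtx680_vs_760)
print("GTX 680 vs GTX 750 Ti clearly different:", gtx680_vs_750ti)
```

Under this heuristic the 0.40-second gap between the GTX 680 and GTX 760 falls inside the run-to-run noise, while the GTX 750 Ti's lead does not.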

Rodinia

Test: OpenCL Heartwall

Rodinia 2.4 (Seconds, Fewer Is Better)

GeForce GTX 680:     5.16
GeForce GTX 750 Ti:  4.63
GeForce GTX 760:     7.94
GeForce GTX 770:     4.94
GeForce GTX 780 Ti:  3.09
Radeon HD 7950:      5.59
Radeon R9 270X:      5.18
Radeon R9 290:       5.63

Standard errors in the order the export lists them (seven entries for eight results, so they are not attributed to individual cards): SE +/- 0.03, N = 3; SE +/- 0.07, N = 6; SE +/- 0.12, N = 3; SE +/- 0.03, N = 3; SE +/- 0.07, N = 3; SE +/- 0.05, N = 3; SE +/- 0.04, N = 3.

1. (CXX) g++ options: -O2 -lOpenCL

Rodinia

Test: OpenCL Particle Filter

Rodinia 2.4 (Seconds, Fewer Is Better)

GeForce GTX 680:     17.93  (SE +/- 0.07, N = 3)
GeForce GTX 750 Ti:  37.53  (SE +/- 0.66, N = 6)
GeForce GTX 760:     22.30  (SE +/- 0.34, N = 6)
GeForce GTX 770:     17.06  (SE +/- 0.01, N = 3)
GeForce GTX 780 Ti:  12.95  (SE +/- 0.22, N = 6)
GeForce GTX TITAN:   14.16  (SE +/- 0.01, N = 3)
Radeon HD 7950:      10.79  (SE +/- 0.01, N = 3)
Radeon R9 270X:      13.01  (SE +/- 0.24, N = 6)
Radeon R9 290:        8.27  (SE +/- 0.13, N = 6)

1. (CXX) g++ options: -O2 -lOpenCL
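
For a "fewer is better" metric such as these Particle Filter times, dividing a baseline time by each card's time gives a relative speed that is easier to compare at a glance. A minimal Python sketch follows. The choice of the GeForce GTX 680 as baseline and the helper name relative_speed are assumptions made for illustration.

```python
# Particle Filter times in seconds (fewer is better), taken from the results above.
particle_filter = {
    "GeForce GTX 680": 17.93,
    "GeForce GTX 750 Ti": 37.53,
    "GeForce GTX 760": 22.30,
    "GeForce GTX 770": 17.06,
    "GeForce GTX 780 Ti": 12.95,
    "GeForce GTX TITAN": 14.16,
    "Radeon HD 7950": 10.79,
    "Radeon R9 270X": 13.01,
    "Radeon R9 290": 8.27,
}

def relative_speed(times, baseline):
    """Map each card to baseline_time / card_time; values above 1.0 are faster."""
    base = times[baseline]
    return {card: base / t for card, t in times.items()}

speedups = relative_speed(particle_filter, "GeForce GTX 680")

# Print the cards from fastest to slowest relative to the baseline.
for card, factor in sorted(speedups.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{card:20s} {factor:.2f}x")
```

For "more is better" metrics (samples/sec, LuxMark score) the ratio would be inverted, that is card value divided by baseline value.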

JuliaGPU

OpenCL Device: GPU

JuliaGPU 1.2pts1 (Samples/sec, More Is Better)

GeForce GTX 680:     39685064.10  (SE +/- 24001.93, N = 3)
GeForce GTX 750 Ti:  37946110.43  (SE +/- 37139.42, N = 3)
GeForce GTX 760:     31803078.23  (SE +/- 22044.37, N = 3)
GeForce GTX 770:     41256658.57  (SE +/- 23516.24, N = 3)
GeForce GTX 780 Ti:  66967826.77  (SE +/- 22581.94, N = 3)
GeForce GTX TITAN:   57954293.97  (SE +/- 178115.56, N = 3)
GT740M:              11404297.57  (SE +/- 4800.20, N = 3)
Quadro FX 580:        1675692.03  (SE +/- 147.56, N = 3)

1. (CC) gcc options: -O3 -march=native -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL -lm
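
Because the export also reports "GPU Compute Cores" for the NVIDIA cards, the JuliaGPU throughput can be divided by the core count for a crude per-core figure. The Python sketch below does that with the numbers from this result file. It ignores clock speed and architecture differences, so it is only an illustrative normalization, and the variable names are hypothetical.

```python
# JuliaGPU samples/sec (more is better) from the results above.
julia_samples_per_sec = {
    "GeForce GTX 680": 39685064.10,
    "GeForce GTX 750 Ti": 37946110.43,
    "GeForce GTX 760": 31803078.23,
    "GeForce GTX 770": 41256658.57,
    "GeForce GTX 780 Ti": 66967826.77,
    "GeForce GTX TITAN": 57954293.97,
    "GT740M": 11404297.57,
    "Quadro FX 580": 1675692.03,
}

# "GPU Compute Cores" as listed under OpenCL Details in this result file.
compute_cores = {
    "GeForce GTX 680": 1536,
    "GeForce GTX 750 Ti": 640,
    "GeForce GTX 760": 1152,
    "GeForce GTX 770": 1536,
    "GeForce GTX 780 Ti": 2880,
    "GeForce GTX TITAN": 2688,
    "GT740M": 384,
    "Quadro FX 580": 32,
}

# Samples/sec per reported compute core (a crude normalization, see lead-in).
per_core = {
    card: julia_samples_per_sec[card] / compute_cores[card]
    for card in julia_samples_per_sec
}

for card, value in sorted(per_core.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{card:20s} {value:10.1f} samples/sec per core")
```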

MandelbulbGPU

OpenCL Device: GPU

MandelbulbGPU 1.0pts1 (Samples/sec, More Is Better)

GeForce GTX 680:     18694237.53  (SE +/- 62529.27, N = 3)
GeForce GTX 750 Ti:  14849487.77  (SE +/- 5126.17, N = 3)
GeForce GTX 760:     14878137.40  (SE +/- 6057.51, N = 3)
GeForce GTX 770:     19542653.20  (SE +/- 21521.06, N = 3)
GeForce GTX 780 Ti:  31666087.67  (SE +/- 64326.57, N = 3)
GeForce GTX TITAN:   28313279.83  (SE +/- 33588.22, N = 3)
GT740M:               5327063.20  (SE +/- 1542.92, N = 3)
Quadro FX 580:         978506.40  (SE +/- 1292.11, N = 3)

1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

MandelGPU

OpenCL Device: GPU

MandelGPU 1.3pts1 (Samples/sec, More Is Better)

GeForce GTX 680:     42837373.23  (SE +/- 6352.22, N = 3)
GeForce GTX 750 Ti:  27866753.47  (SE +/- 5193.56, N = 3)
GeForce GTX 760:     33501696.43  (SE +/- 15675.12, N = 3)
GeForce GTX 770:     45359706.67  (SE +/- 6781.46, N = 3)
GeForce GTX 780 Ti:  64490219.13  (SE +/- 274468.92, N = 3)
GeForce GTX TITAN:   61797466.57  (SE +/- 24850.67, N = 3)
GT740M:               8876208.80  (SE +/- 330.44, N = 3)
Quadro FX 580:         853461.73  (SE +/- 10.81, N = 3)

1. (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

LuxMark

OpenCL Device: GPU - Scene: Room

LuxMark 2.1beta1 (Score, More Is Better)

GeForce GTX 680:      654  (SE +/- 0.00, N = 3)
GeForce GTX 750 Ti:   588  (SE +/- 0.00, N = 3)
GeForce GTX 760:      547  (SE +/- 0.00, N = 3)
GeForce GTX 770:      685  (SE +/- 0.33, N = 3)
GeForce GTX 780 Ti:  1214  (SE +/- 0.58, N = 3)
GeForce GTX TITAN:   1090  (SE +/- 0.58, N = 3)
Radeon HD 7950:       983  (SE +/- 0.88, N = 3)
Radeon R9 270X:       836  (SE +/- 0.88, N = 3)
Radeon R9 290:       1312  (SE +/- 3.21, N = 3)
GT740M:               111  (SE +/- 0.00, N = 3)
HD 6450:               33  (SE +/- 0.00, N = 3)

LuxMark

OpenCL Device: GPU - Scene: Sala

LuxMark 2.1beta1 (Score, More Is Better)

GeForce GTX 680:     1314  (SE +/- 0.33, N = 3)
GeForce GTX 750 Ti:  1020  (SE +/- 0.00, N = 3)
GeForce GTX 760:     1095  (SE +/- 0.00, N = 3)
GeForce GTX 770:     1372  (SE +/- 2.60, N = 3)
GeForce GTX 780 Ti:  2406  (SE +/- 0.67, N = 3)
GeForce GTX TITAN:   2211  (SE +/- 0.33, N = 3)
Radeon HD 7950:      1810  (SE +/- 8.54, N = 3)
Radeon R9 270X:      1477  (SE +/- 0.67, N = 3)
Radeon R9 290:       2343  (SE +/- 1.86, N = 3)
GT740M:               235  (SE +/- 0.00, N = 3)
Quadro FX 580:         21  (SE +/- 0.00, N = 3)
HD 6450:               62  (SE +/- 0.00, N = 3)

LuxMark

OpenCL Device: GPU - Scene: Luxball HDR

LuxMark 2.1beta1 (Score, More Is Better)

GeForce GTX 680:      8931  (SE +/- 1.53, N = 3)
GeForce GTX 750 Ti:   8105  (SE +/- 2.52, N = 3)
GeForce GTX 760:      7667  (SE +/- 1.20, N = 3)
GeForce GTX 770:      9343  (SE +/- 1.86, N = 3)
GeForce GTX 780 Ti:  18085  (SE +/- 9.85, N = 3)
GeForce GTX TITAN:   16406  (SE +/- 113.19, N = 3)
Radeon HD 7950:      12550  (SE +/- 20.07, N = 3)
Radeon R9 270X:      10671  (SE +/- 19.73, N = 3)
Radeon R9 290:       16762  (SE +/- 59.37, N = 3)
GT740M:               1595  (SE +/- 1.00, N = 3)
Quadro FX 580:         165  (SE +/- 0.33, N = 3)
HD 6450:               598  (SE +/- 0.58, N = 3)

OpenDwarfs

Test: LU Decomposition

OpenDwarfs 2013-11-06 (ms, Fewer Is Better)

GeForce GTX 680:      56.80  (SE +/- 0.23, N = 3)
GeForce GTX 750 Ti:   64.67  (SE +/- 0.06, N = 3)
GeForce GTX 760:      69.77  (SE +/- 0.06, N = 3)
GeForce GTX 770:      54.63  (SE +/- 0.06, N = 3)
GeForce GTX 780 Ti:   44.93  (SE +/- 0.04, N = 3)
GeForce GTX TITAN:    51.59  (SE +/- 0.22, N = 3)
Radeon HD 7950:       83.95  (SE +/- 6.80, N = 6)
Radeon R9 270X:       68.84  (SE +/- 4.20, N = 6)
Radeon R9 290:        69.11  (SE +/- 0.62, N = 3)
GT740M:              173.66  (SE +/- 2.97, N = 3)
Quadro FX 580:       904.77  (SE +/- 2.88, N = 3)
HD 6450:             771.37  (SE +/- 0.65, N = 3)

1. (CC) gcc options: -lm -lOpenCL

OpenDwarfs

Test: Compressed Sparse Row

OpenDwarfs 2013-11-06 (ms, Fewer Is Better)

GeForce GTX 680:      5.82  (SE +/- 0.02, N = 3)
GeForce GTX 750 Ti:   5.54  (SE +/- 0.02, N = 3)
GeForce GTX 760:      5.77  (SE +/- 0.04, N = 3)
GeForce GTX 770:      5.72  (SE +/- 0.03, N = 3)
GeForce GTX 780 Ti:   5.43  (SE +/- 0.02, N = 3)
GeForce GTX TITAN:    5.70  (SE +/- 0.05, N = 3)
Radeon HD 7950:      21.72  (SE +/- 1.21, N = 6)
Radeon R9 270X:      16.88  (SE +/- 0.76, N = 6)
Radeon R9 290:       13.69  (SE +/- 0.49, N = 6)
GT740M:               5.46  (SE +/- 0.03, N = 3)
Quadro FX 580:       94.84  (SE +/- 0.21, N = 3)
HD 6450:             60.07  (SE +/- 0.71, N = 3)

1. (CC) gcc options: -lm -lOpenCL

OpenDwarfs

Test: Cyclic Redundancy Check

OpenDwarfs 2013-11-06 (ms, Fewer Is Better)

GeForce GTX 680:     0.06  (SE +/- 0.00, N = 3)
GeForce GTX 750 Ti:  0.14  (SE +/- 0.00, N = 6)
GeForce GTX 760:     0.07  (SE +/- 0.00, N = 6)
GeForce GTX 770:     0.05  (SE +/- 0.00, N = 3)
GeForce GTX 780 Ti:  0.04  (SE +/- 0.00, N = 3)
GeForce GTX TITAN:   0.05  (SE +/- 0.00, N = 3)
Radeon HD 7950:      0.03  (SE +/- 0.00, N = 6)
Radeon R9 270X:      0.04  (SE +/- 0.00, N = 6)
Radeon R9 290:       0.03  (SE +/- 0.00, N = 6)
GT740M:              0.24  (SE +/- 0.00, N = 3)
Quadro FX 580:       2.89  (SE +/- 0.00, N = 3)
HD 6450:             2.02  (SE +/- 0.00, N = 3)

1. (CC) gcc options: -lm -lOpenCL


Phoronix Test Suite v10.8.4