gpu_test_20220214_2334.txt

AMD EPYC 7443 24-Core testing with an ASUS KRPG-U8-M (0402 BIOS) and ASUS NVIDIA GeForce RTX 3090 24GB on Ubuntu 20.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2202146-NE-GPUTEST2097

Result Identifier: ASUS NVIDIA GeForce RTX 3090
Date Run: February 14 2022
Test Duration: 4 Hours, 3 Minutes

Processor: AMD EPYC 7443 24-Core @ 2.85GHz (24 Cores / 48 Threads)
Motherboard: ASUS KRPG-U8-M (0402 BIOS)
Chipset: AMD Starship/Matisse
Memory: 64GB
Disk: 4 x 1920GB KINGSTON SEDC500
Graphics: ASUS NVIDIA GeForce RTX 3090 24GB
Audio: NVIDIA Device 1aef
Network: 2 x Intel I350
OS: Ubuntu 20.04
Kernel: 5.4.0-99-generic (x86_64) 20220203
Desktop: GNOME Shell 3.36.9
Display Server: X Server 1.20.13
Display Driver: NVIDIA
OpenCL: OpenCL 3.0 CUDA 11.6.99
Vulkan: 1.3.194
Compiler: GCC 9.3.0 + CUDA 11.6
File-System: ext4
Screen Resolution: 1024x768

System Logs
- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled)
- CPU Microcode: 0xa001137
- BAR1 / Visible vRAM Size: 256 MiB
- Python 3.8.10
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected

Test                                          ASUS NVIDIA GeForce RTX 3090
neatbench: GPU                                3090
shoc: OpenCL - Triad                          22.1391
shoc: OpenCL - Reduction                      387.945
shoc: OpenCL - Bus Speed Download             26.7123
shoc: OpenCL - Bus Speed Readback             27.0734
shoc: OpenCL - Texture Read Bandwidth         2186.78
cl-mem: Copy                                  360.6
cl-mem: Read                                  817.5
cl-mem: Write                                 749.9
viennacl: CPU BLAS - sCOPY                    9.77
viennacl: CPU BLAS - sAXPY                    13.7
viennacl: CPU BLAS - sDOT                     4.90
viennacl: CPU BLAS - dCOPY                    14.9
viennacl: CPU BLAS - dAXPY                    22.0
viennacl: CPU BLAS - dDOT                     7.72
viennacl: CPU BLAS - dGEMV-N                  11.9
viennacl: CPU BLAS - dGEMV-T                  12.0
viennacl: OpenCL BLAS - sCOPY                 362
viennacl: OpenCL BLAS - sAXPY                 497
viennacl: OpenCL BLAS - sDOT                  366
viennacl: OpenCL BLAS - dCOPY                 603
viennacl: OpenCL BLAS - dAXPY                 718
viennacl: OpenCL BLAS - dDOT                  655
viennacl: OpenCL BLAS - dGEMV-N               237
viennacl: OpenCL BLAS - dGEMV-T               372
clpeak: Global Memory Bandwidth               813.99
mixbench: OpenCL - Double Precision           542.14
mixbench: OpenCL - Single Precision           38008.30
mixbench: NVIDIA CUDA - Half Precision        36859.46
mixbench: NVIDIA CUDA - Double Precision      494.88
mixbench: NVIDIA CUDA - Single Precision      34074.74
shoc: OpenCL - S3D                            428.189
shoc: OpenCL - FFT SP                         2075.23
shoc: OpenCL - GEMM SGEMM_N                   8003.66
shoc: OpenCL - Max SP Flops                   39430.8
clpeak: Single-Precision Float                35158.22
clpeak: Double-Precision Double               639.81
viennacl: CPU BLAS - dGEMM-NN                 82.1
viennacl: CPU BLAS - dGEMM-NT                 81.0
viennacl: CPU BLAS - dGEMM-TN                 84.8
viennacl: CPU BLAS - dGEMM-TT                 83.1
viennacl: OpenCL BLAS - dGEMM-NN              590
viennacl: OpenCL BLAS - dGEMM-NT              592
viennacl: OpenCL BLAS - dGEMM-TN              587
viennacl: OpenCL BLAS - dGEMM-TT              591
shoc: OpenCL - MD5 Hash                       43.3817
mixbench: OpenCL - Integer                    17349.25
mixbench: NVIDIA CUDA - Integer               16263.14
clpeak: Integer Compute INT                   18344.45
hashcat: 7-Zip                                3753467
hashcat: SHA-512                              11039266667
lczero: OpenCL                                12273
fahbench                                      343.0755
gromacs: NVIDIA CUDA GPU - water_GMX50_bare   2.874
caffe: AlexNet - NVIDIA CUDA - 100            680.227
caffe: AlexNet - NVIDIA CUDA - 200            1346.60
caffe: AlexNet - NVIDIA CUDA - 1000           6636.65
caffe: GoogleNet - NVIDIA CUDA - 100          2590.72
caffe: GoogleNet - NVIDIA CUDA - 200          5136.26
caffe: GoogleNet - NVIDIA CUDA - 1000         25531.9
arrayfire: Conjugate Gradient OpenCL          2.222
financebench: Black-Scholes OpenCL            6.900
rodinia: OpenCL Particle Filter               4.106
blender: Pabellon Barcelona - NVIDIA OptiX    no result

PlaidML

This test profile uses the PlaidML deep learning framework, developed by Intel, to run various benchmarks. Learn more via the OpenBenchmarking.org test page.

FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times).

FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times).

FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times).

FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times).

FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times).

NeatBench

NeatBench is a benchmark of the cross-platform Neat Video software, running on the CPU with optional GPU (OpenCL / CUDA) support. Learn more via the OpenBenchmarking.org test page.

NeatBench 5 - FPS, More Is Better

Acceleration: GPU

ASUS NVIDIA GeForce RTX 3090: 3090 FPS

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better

Target: OpenCL - Benchmark: Triad

ASUS NVIDIA GeForce RTX 3090: 22.14 GB/s (SE +/- 0.01, N = 3)

Target: OpenCL - Benchmark: Reduction

ASUS NVIDIA GeForce RTX 3090: 387.95 GB/s (SE +/- 0.04, N = 3)

Target: OpenCL - Benchmark: Bus Speed Download

ASUS NVIDIA GeForce RTX 3090: 26.71 GB/s (SE +/- 0.01, N = 3)

Target: OpenCL - Benchmark: Bus Speed Readback

ASUS NVIDIA GeForce RTX 3090: 27.07 GB/s (SE +/- 0.00, N = 3)

Target: OpenCL - Benchmark: Texture Read Bandwidth

ASUS NVIDIA GeForce RTX 3090: 2186.78 GB/s (SE +/- 5.55, N = 3)

(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi
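Each result in this report carries an "SE +/- x, N = n" annotation: the standard error of the mean over n runs. The sketch below shows the conventional calculation (sample standard deviation divided by the square root of n). This file does not say which estimator the Phoronix Test Suite uses, so the formula is an assumption, and the per-run values are hypothetical because only the mean, SE, and N are published here.

```python
import math
import statistics


def standard_error(samples):
    """Return (mean, standard error of the mean) for a list of run results."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    mean = statistics.fmean(samples)
    # Sample standard deviation divided by sqrt(n): the textbook definition,
    # assumed here rather than taken from the report.
    se = statistics.stdev(samples) / math.sqrt(n)
    return mean, se


# Hypothetical per-run GB/s values, NOT taken from the report.
runs = [22.12, 22.14, 22.16]
mean, se = standard_error(runs)
print(f"{mean:.2f} GB/s (SE +/- {se:.2f}, N = {len(runs)})")
```

A small SE relative to the mean, as in most of the N = 3 results here, indicates that the runs agreed closely; results that needed N = 15 generally show a larger spread.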

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

cl-mem 2017-01-13 - GB/s, More Is Better

Benchmark: Copy

ASUS NVIDIA GeForce RTX 3090: 360.6 GB/s (SE +/- 0.03, N = 3)

Benchmark: Read

ASUS NVIDIA GeForce RTX 3090: 817.5 GB/s (SE +/- 0.15, N = 3)

Benchmark: Write

ASUS NVIDIA GeForce RTX 3090: 749.9 GB/s (SE +/- 0.15, N = 3)

(CC) gcc options: -O2 -flto -lOpenCL

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ and with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.

ViennaCL 1.7.1 - GB/s, More Is Better

Test: CPU BLAS - sCOPY

ASUS NVIDIA GeForce RTX 3090: 9.77 GB/s (SE +/- 0.67, N = 15)

Test: CPU BLAS - sAXPY

ASUS NVIDIA GeForce RTX 3090: 13.7 GB/s (SE +/- 0.68, N = 15)

Test: CPU BLAS - sDOT

ASUS NVIDIA GeForce RTX 3090: 4.90 GB/s (SE +/- 0.63, N = 15)

Test: CPU BLAS - dCOPY

ASUS NVIDIA GeForce RTX 3090: 14.9 GB/s (SE +/- 0.11, N = 15)

Test: CPU BLAS - dAXPY

ASUS NVIDIA GeForce RTX 3090: 22.0 GB/s (SE +/- 0.21, N = 15)

Test: CPU BLAS - dDOT

ASUS NVIDIA GeForce RTX 3090: 7.72 GB/s (SE +/- 0.06, N = 15)

Test: CPU BLAS - dGEMV-N

ASUS NVIDIA GeForce RTX 3090: 11.9 GB/s (SE +/- 0.25, N = 15)

Test: CPU BLAS - dGEMV-T

ASUS NVIDIA GeForce RTX 3090: 12.0 GB/s (SE +/- 0.41, N = 14)

Test: OpenCL BLAS - sCOPY

ASUS NVIDIA GeForce RTX 3090: 362 GB/s (SE +/- 0.67, N = 3)

Test: OpenCL BLAS - sAXPY

ASUS NVIDIA GeForce RTX 3090: 497 GB/s (SE +/- 0.88, N = 3)

Test: OpenCL BLAS - sDOT

ASUS NVIDIA GeForce RTX 3090: 366 GB/s

Test: OpenCL BLAS - dCOPY

ASUS NVIDIA GeForce RTX 3090: 603 GB/s

Test: OpenCL BLAS - dAXPY

ASUS NVIDIA GeForce RTX 3090: 718 GB/s (SE +/- 0.58, N = 3)

Test: OpenCL BLAS - dDOT

ASUS NVIDIA GeForce RTX 3090: 655 GB/s (SE +/- 0.67, N = 3)

Test: OpenCL BLAS - dGEMV-N

ASUS NVIDIA GeForce RTX 3090: 237 GB/s

Test: OpenCL BLAS - dGEMV-T

ASUS NVIDIA GeForce RTX 3090: 372 GB/s

(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

clpeak - GBPS, More Is Better

OpenCL Test: Global Memory Bandwidth

ASUS NVIDIA GeForce RTX 3090: 813.99 GBPS (SE +/- 0.01, N = 3)

(CXX) g++ options: -O3 -rdynamic -lOpenCL

Mixbench

A benchmark suite for GPUs on mixed operational intensity kernels. Learn more via the OpenBenchmarking.org test page.

Mixbench 2020-06-23 - GFLOPS, More Is Better

Backend: OpenCL - Benchmark: Double Precision

ASUS NVIDIA GeForce RTX 3090: 542.14 GFLOPS (SE +/- 0.05, N = 3)

Backend: OpenCL - Benchmark: Single Precision

ASUS NVIDIA GeForce RTX 3090: 38008.30 GFLOPS (SE +/- 553.93, N = 15)

Backend: NVIDIA CUDA - Benchmark: Half Precision

ASUS NVIDIA GeForce RTX 3090: 36859.46 GFLOPS (SE +/- 748.28, N = 15)

Backend: NVIDIA CUDA - Benchmark: Double Precision

ASUS NVIDIA GeForce RTX 3090: 494.88 GFLOPS (SE +/- 8.67, N = 15)

Backend: NVIDIA CUDA - Benchmark: Single Precision

ASUS NVIDIA GeForce RTX 3090: 34074.74 GFLOPS (SE +/- 573.50, N = 15)

(CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2
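The Mixbench figures show a very large gap between single and double precision on this card. As a rough illustration, the sketch below computes the single-to-double throughput ratio per backend from the GFLOPS values reported above. The dictionary layout and function name are illustrative choices, not part of Mixbench or the Phoronix Test Suite.

```python
# GFLOPS figures copied from the Mixbench results in this report.
MIXBENCH_GFLOPS = {
    "OpenCL": {"single": 38008.30, "double": 542.14},
    "NVIDIA CUDA": {"single": 34074.74, "double": 494.88},
}


def single_to_double_ratio(entry):
    """Return how many times faster single precision ran than double precision."""
    return entry["single"] / entry["double"]


for backend, entry in MIXBENCH_GFLOPS.items():
    ratio = single_to_double_ratio(entry)
    print(f"{backend}: single precision is {ratio:.1f}x double precision")
```

Both backends land at roughly 70x. That is in the neighborhood of the 1:64 double-precision rate commonly cited for consumer GPUs of this generation, though that hardware figure does not come from this result file.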

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GFLOPS, More Is Better

Target: OpenCL - Benchmark: S3D

ASUS NVIDIA GeForce RTX 3090: 428.19 GFLOPS (SE +/- 0.23, N = 3)

Target: OpenCL - Benchmark: FFT SP

ASUS NVIDIA GeForce RTX 3090: 2075.23 GFLOPS (SE +/- 0.44, N = 3)

Target: OpenCL - Benchmark: GEMM SGEMM_N

ASUS NVIDIA GeForce RTX 3090: 8003.66 GFLOPS (SE +/- 36.53, N = 3)

Target: OpenCL - Benchmark: Max SP Flops

ASUS NVIDIA GeForce RTX 3090: 39430.8 GFLOPS (SE +/- 87.85, N = 3)

(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

clpeak

clpeak - GFLOPS, More Is Better

OpenCL Test: Single-Precision Float

ASUS NVIDIA GeForce RTX 3090: 35158.22 GFLOPS (SE +/- 7.08, N = 3)

OpenCL Test: Double-Precision Double

ASUS NVIDIA GeForce RTX 3090: 639.81 GFLOPS (SE +/- 2.44, N = 3)

(CXX) g++ options: -O3 -rdynamic -lOpenCL

ViennaCL

ViennaCL 1.7.1 - GFLOPs/s, More Is Better

Test: CPU BLAS - dGEMM-NN

ASUS NVIDIA GeForce RTX 3090: 82.1 GFLOPs/s (SE +/- 0.23, N = 15)

Test: CPU BLAS - dGEMM-NT

ASUS NVIDIA GeForce RTX 3090: 81.0 GFLOPs/s (SE +/- 0.20, N = 15)

Test: CPU BLAS - dGEMM-TN

ASUS NVIDIA GeForce RTX 3090: 84.8 GFLOPs/s (SE +/- 0.07, N = 15)

Test: CPU BLAS - dGEMM-TT

ASUS NVIDIA GeForce RTX 3090: 83.1 GFLOPs/s (SE +/- 0.11, N = 15)

Test: OpenCL BLAS - dGEMM-NN

ASUS NVIDIA GeForce RTX 3090: 590 GFLOPs/s (SE +/- 1.33, N = 3)

Test: OpenCL BLAS - dGEMM-NT

ASUS NVIDIA GeForce RTX 3090: 592 GFLOPs/s (SE +/- 1.53, N = 3)

Test: OpenCL BLAS - dGEMM-TN

ASUS NVIDIA GeForce RTX 3090: 587 GFLOPs/s (SE +/- 1.20, N = 3)

Test: OpenCL BLAS - dGEMM-TT

ASUS NVIDIA GeForce RTX 3090: 591 GFLOPs/s (SE +/- 1.33, N = 3)

(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
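Because ViennaCL runs the same dGEMM kernels on both the CPU BLAS and OpenCL BLAS paths, the two sets of results can be compared directly. The sketch below computes the OpenCL-over-CPU speedup for each variant from the GFLOPs/s values above. The data structure and function name are illustrative only.

```python
# (CPU BLAS, OpenCL BLAS) GFLOPs/s pairs copied from the ViennaCL dGEMM results.
DGEMM_GFLOPS = {
    "dGEMM-NN": (82.1, 590.0),
    "dGEMM-NT": (81.0, 592.0),
    "dGEMM-TN": (84.8, 587.0),
    "dGEMM-TT": (83.1, 591.0),
}


def speedup(cpu_gflops, opencl_gflops):
    """Return how many times faster the OpenCL path ran than the CPU path."""
    return opencl_gflops / cpu_gflops


for variant, (cpu, opencl) in DGEMM_GFLOPS.items():
    print(f"{variant}: OpenCL is {speedup(cpu, opencl):.2f}x the CPU result")
```

All four variants come out at about 7x, so the transpose layout makes little difference to the relative gain in this run.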

SHOC Scalable HeterOgeneous Computing

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GHash/s, More Is Better

Target: OpenCL - Benchmark: MD5 Hash

ASUS NVIDIA GeForce RTX 3090: 43.38 GHash/s (SE +/- 0.25, N = 3)

(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

Mixbench

Mixbench 2020-06-23 - GIOPS, More Is Better

Backend: OpenCL - Benchmark: Integer

ASUS NVIDIA GeForce RTX 3090: 17349.25 GIOPS (SE +/- 1.49, N = 3)

Backend: NVIDIA CUDA - Benchmark: Integer

ASUS NVIDIA GeForce RTX 3090: 16263.14 GIOPS (SE +/- 321.32, N = 15)

(CXX) g++ options: -lm -lstdc++ -lOpenCL -lrt -O2

clpeak

clpeak - GIOPS, More Is Better

OpenCL Test: Integer Compute INT

ASUS NVIDIA GeForce RTX 3090: 18344.45 GIOPS (SE +/- 170.54, N = 7)

(CXX) g++ options: -O3 -rdynamic -lOpenCL

Hashcat

Hashcat is an open-source, advanced password recovery tool supporting GPU acceleration with OpenCL, NVIDIA CUDA, and Radeon ROCm. Learn more via the OpenBenchmarking.org test page.

Hashcat 6.2.4 - H/s, More Is Better

Benchmark: MD5

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status (reported 3 times).

Benchmark: SHA1

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status (reported 3 times).

Benchmark: 7-Zip

ASUS NVIDIA GeForce RTX 3090: 3753467 H/s (SE +/- 23217.55, N = 3)

Benchmark: SHA-512

ASUS NVIDIA GeForce RTX 3090: 11039266667 H/s (SE +/- 18910696.56, N = 3)

Benchmark: TrueCrypt RIPEMD160 + XTS

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status (reported 3 times).

LuxCoreRender

LuxCoreRender is an open-source 3D physically based renderer formerly known as LuxRender. LuxCoreRender supports CPU-based rendering as well as GPU acceleration via OpenCL, NVIDIA CUDA, and NVIDIA OptiX interfaces. Learn more via the OpenBenchmarking.org test page.

Scene: DLSC - Acceleration: GPU

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status (reported 3 times). E: RUNTIME ERROR: CUDA driver API error CUDA_ERROR_OUT_OF_MEMORY (code: 2, file:/home/vsts/work/1/s/LinuxCompile/LuxCore-sdk/src/luxrays/devices/cudadevice.cpp, line: 516): out of memory

Scene: Danish Mood - Acceleration: GPU

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status (reported 3 times). E: RUNTIME ERROR: CUDA driver API error CUDA_ERROR_OUT_OF_MEMORY (code: 2, file:/home/vsts/work/1/s/LinuxCompile/LuxCore-sdk/src/luxrays/devices/cudadevice.cpp, line: 516): out of memory

Scene: Orange Juice - Acceleration: GPU

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status (reported 3 times). E: RUNTIME ERROR: CUDA driver API error CUDA_ERROR_OUT_OF_MEMORY (code: 2, file:/home/vsts/work/1/s/LinuxCompile/LuxCore-sdk/src/luxrays/devices/cudadevice.cpp, line: 516): out of memory

Scene: LuxCore Benchmark - Acceleration: GPU

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status (reported 3 times). E: RUNTIME ERROR: CUDA driver API error CUDA_ERROR_OUT_OF_MEMORY (code: 2, file:/home/vsts/work/1/s/LinuxCompile/LuxCore-sdk/src/luxrays/devices/cudadevice.cpp, line: 516): out of memory

Scene: Rainbow Colors and Prism - Acceleration: GPU

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status (reported 3 times). E: RUNTIME ERROR: CUDA driver API error CUDA_ERROR_OUT_OF_MEMORY (code: 2, file:/home/vsts/work/1/s/LinuxCompile/LuxCore-sdk/src/luxrays/devices/cudadevice.cpp, line: 516): out of memory

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.28 - Nodes Per Second, More Is Better

Backend: OpenCL

ASUS NVIDIA GeForce RTX 3090: 12273 Nodes Per Second (SE +/- 128.90, N = 5)

(CXX) g++ options: -flto -pthread

FAHBench

FAHBench is a Folding@Home benchmark on the GPU. Learn more via the OpenBenchmarking.org test page.

FAHBench 2.3.2 - Ns Per Day, More Is Better

ASUS NVIDIA GeForce RTX 3090: 343.08 Ns Per Day (SE +/- 0.24, N = 3)

GROMACS

The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package, tested with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2021.2 - Ns Per Day, More Is Better

Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare

ASUS NVIDIA GeForce RTX 3090: 2.874 Ns Per Day (SE +/- 0.003, N = 3)

(CXX) g++ options: -O3

MandelGPU

MandelGPU is an OpenCL benchmark and this test runs with the OpenCL rendering float4 kernel with a maximum of 4096 iterations. Learn more via the OpenBenchmarking.org test page.

OpenCL Device: GPU

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status (reported 3 times).

Caffe

This is a benchmark of the Caffe deep learning framework. It currently supports the AlexNet and GoogleNet models and execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.

Caffe 2020-02-13 - Milli-Seconds, Fewer Is Better

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100

ASUS NVIDIA GeForce RTX 3090: 680.23 Milli-Seconds (SE +/- 2.45, N = 3)

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200

ASUS NVIDIA GeForce RTX 3090: 1346.60 Milli-Seconds (SE +/- 0.62, N = 3)

Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000

ASUS NVIDIA GeForce RTX 3090: 6636.65 Milli-Seconds (SE +/- 3.28, N = 3)

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100

ASUS NVIDIA GeForce RTX 3090: 2590.72 Milli-Seconds (SE +/- 13.31, N = 3)

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200

ASUS NVIDIA GeForce RTX 3090: 5136.26 Milli-Seconds (SE +/- 19.01, N = 3)

Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000

ASUS NVIDIA GeForce RTX 3090: 25531.9 Milli-Seconds (SE +/- 88.92, N = 3)

(CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
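The Caffe results grow almost linearly with the iteration count, which suggests each figure is the total time for all iterations. Under that assumption (the report does not state it explicitly), the sketch below normalizes the totals to milliseconds per iteration so the three iteration counts can be compared. The dictionary and function names are illustrative.

```python
# Total milliseconds copied from the Caffe results, keyed by iteration count.
CAFFE_TOTAL_MS = {
    "AlexNet": {100: 680.23, 200: 1346.60, 1000: 6636.65},
    "GoogleNet": {100: 2590.72, 200: 5136.26, 1000: 25531.9},
}


def ms_per_iteration(total_ms, iterations):
    """Return the average time per iteration, assuming total_ms covers all of them."""
    return total_ms / iterations


for model, totals in CAFFE_TOTAL_MS.items():
    for iterations, total_ms in totals.items():
        per_iter = ms_per_iteration(total_ms, iterations)
        print(f"{model} @ {iterations} iterations: {per_iter:.2f} ms/iteration")
```

On these numbers the per-iteration cost drops slightly as the iteration count rises, which would be consistent with a fixed startup cost being amortized over longer runs.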

ArrayFire

ArrayFire is a GPU and CPU numeric processing library; this test uses the built-in CPU and OpenCL ArrayFire benchmarks. Learn more via the OpenBenchmarking.org test page.

ArrayFire 3.7 - ms, Fewer Is Better

Test: Conjugate Gradient OpenCL

ASUS NVIDIA GeForce RTX 3090: 2.222 ms (SE +/- 0.005, N = 3)

(CXX) g++ options: -rdynamic

FinanceBench

FinanceBench is a collection of financial program benchmarks with support for benchmarking on the GPU via OpenCL and CPU benchmarking with OpenMP. The FinanceBench test cases are focused on the Black-Scholes-Merton Process with Analytic European Option engine, QMC (Sobol) Monte-Carlo method (Equity Option Example), Bonds Fixed-rate bond with flat forward curve, and Repo Securities repurchase agreement. FinanceBench was originally written by the Cavazos Lab at the University of Delaware. Learn more via the OpenBenchmarking.org test page.

FinanceBench 2016-07-25 - ms, Fewer Is Better

Benchmark: Black-Scholes OpenCL

ASUS NVIDIA GeForce RTX 3090: 6.900 ms (SE +/- 0.554, N = 15)

(CXX) g++ options: -O3 -march=native -fopenmp

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

Target: Vulkan GPU

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status (reported 3 times).

RedShift Demo

This is a test of MAXON's RedShift demo build that currently requires NVIDIA GPU acceleration. Learn more via the OpenBenchmarking.org test page.

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status (reported 3 times). E: ./redshift: 3: /usr/redshift/bin/redshiftBenchmark: not found

Rodinia

Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.

Rodinia 3.1 - Seconds, Fewer Is Better

Test: OpenCL Particle Filter

ASUS NVIDIA GeForce RTX 3090: 4.106 Seconds (SE +/- 0.024, N = 3)

(CXX) g++ options: -m64 -lm -lcuda -lcudart -lcudadevrt -lcudart_static -lrt -lpthread -ldl

Blender

Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles benchmark with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported. Learn more via the OpenBenchmarking.org test page.

Blend File: BMW27 - Compute: CUDA

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times). E: Error: Cannot read file 'blender-3.0.0-linux-x64/CUDA': No such file or directory

Blend File: Classroom - Compute: CUDA

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times). E: Error: Cannot read file 'blender-3.0.0-linux-x64/CUDA': No such file or directory

Blend File: Fishy Cat - Compute: CUDA

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times). E: Error: Cannot read file 'blender-3.0.0-linux-x64/CUDA': No such file or directory

Blend File: Barbershop - Compute: CUDA

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times). E: Error: Not freed memory blocks: 4, total unfreed memory 0.000427 MB

Blend File: BMW27 - Compute: NVIDIA OptiX

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times). E: Error: Cannot read file 'blender-3.0.0-linux-x64/OPTIX': No such file or directory

Blend File: Classroom - Compute: NVIDIA OptiX

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times). E: Error: Cannot read file 'blender-3.0.0-linux-x64/OPTIX': No such file or directory

Blend File: Fishy Cat - Compute: NVIDIA OptiX

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times). E: Error: Cannot read file 'blender-3.0.0-linux-x64/OPTIX': No such file or directory

Blend File: Barbershop - Compute: NVIDIA OptiX

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times). E: Error: Not freed memory blocks: 4, total unfreed memory 0.000427 MB

Blend File: Pabellon Barcelona - Compute: CUDA

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times). E: Error: Cannot read file 'blender-3.0.0-linux-x64/CUDA': No such file or directory

Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX

ASUS NVIDIA GeForce RTX 3090: The test run did not produce a result (reported 3 times). E: Error: Cannot read file 'blender-3.0.0-linux-x64/OPTIX': No such file or directory

63 Results Shown

NeatBench
SHOC Scalable HeterOgeneous Computing:
  OpenCL - Triad
  OpenCL - Reduction
  OpenCL - Bus Speed Download
  OpenCL - Bus Speed Readback
  OpenCL - Texture Read Bandwidth
cl-mem:
  Copy
  Read
  Write
ViennaCL:
  CPU BLAS - sCOPY
  CPU BLAS - sAXPY
  CPU BLAS - sDOT
  CPU BLAS - dCOPY
  CPU BLAS - dAXPY
  CPU BLAS - dDOT
  CPU BLAS - dGEMV-N
  CPU BLAS - dGEMV-T
  OpenCL BLAS - sCOPY
  OpenCL BLAS - sAXPY
  OpenCL BLAS - sDOT
  OpenCL BLAS - dCOPY
  OpenCL BLAS - dAXPY
  OpenCL BLAS - dDOT
  OpenCL BLAS - dGEMV-N
  OpenCL BLAS - dGEMV-T
clpeak
Mixbench:
  OpenCL - Double Precision
  OpenCL - Single Precision
  NVIDIA CUDA - Half Precision
  NVIDIA CUDA - Double Precision
  NVIDIA CUDA - Single Precision
SHOC Scalable HeterOgeneous Computing:
  OpenCL - S3D
  OpenCL - FFT SP
  OpenCL - GEMM SGEMM_N
  OpenCL - Max SP Flops
clpeak:
  Single-Precision Float
  Double-Precision Double
ViennaCL:
  CPU BLAS - dGEMM-NN
  CPU BLAS - dGEMM-NT
  CPU BLAS - dGEMM-TN
  CPU BLAS - dGEMM-TT
  OpenCL BLAS - dGEMM-NN
  OpenCL BLAS - dGEMM-NT
  OpenCL BLAS - dGEMM-TN
  OpenCL BLAS - dGEMM-TT
SHOC Scalable HeterOgeneous Computing
Mixbench:
  OpenCL - Integer
  NVIDIA CUDA - Integer
clpeak
Hashcat:
  7-Zip
  SHA-512
LeelaChessZero
FAHBench
GROMACS
Caffe:
  AlexNet - NVIDIA CUDA - 100
  AlexNet - NVIDIA CUDA - 200
  AlexNet - NVIDIA CUDA - 1000
  GoogleNet - NVIDIA CUDA - 100
  GoogleNet - NVIDIA CUDA - 200
  GoogleNet - NVIDIA CUDA - 1000
ArrayFire
FinanceBench
Rodinia