RTX 30 Compute September

AMD Ryzen 9 5900X 12-Core testing with an ASUS ROG CROSSHAIR VIII HERO (3801 BIOS) and NVIDIA GeForce RTX 3090 24GB on Ubuntu 21.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2109061-TJ-RTX30COMP69
Run Management

Result Identifier    Date                 Test Duration
RTX 3070             September 06 2021    6 Hours, 15 Minutes
RTX 3090             September 06 2021    2 Hours, 7 Minutes
GeForce RTX 3090     September 06 2021    47 Minutes
Average                                   3 Hours, 3 Minutes
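
The final duration listed under Run Management, 3 Hours, 3 Minutes, matches the arithmetic mean of the three run durations. A minimal Python sketch of that arithmetic (the names are illustrative and not part of the Phoronix Test Suite):

```python
# Run durations copied from the Run Management listing, converted to minutes.
durations_min = {
    "RTX 3070": 6 * 60 + 15,
    "RTX 3090": 2 * 60 + 7,
    "GeForce RTX 3090": 47,
}


def format_duration(minutes: int) -> str:
    """Format a minute count in the same style as the result page."""
    hours, mins = divmod(minutes, 60)
    return f"{hours} Hours, {mins} Minutes"


# Integer mean of the three runs (the sum divides evenly by three here).
mean_min = sum(durations_min.values()) // len(durations_min)
print(format_duration(mean_min))
```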




HTML result view exported from: https://openbenchmarking.org/result/2109061-TJ-RTX30COMP69&gru&rdt&rro.

System Details (runs: RTX 3070, RTX 3090, GeForce RTX 3090)

Processor: AMD Ryzen 9 5900X 12-Core @ 3.70GHz (12 Cores / 24 Threads)
Motherboard: ASUS ROG CROSSHAIR VIII HERO (3801 BIOS)
Chipset: AMD Starship/Matisse
Memory: 16GB
Disk: 1000GB Western Digital WDS100T1X0E-00AFY0 + 1000GB Western Digital WD_BLACK SN850 1TB + 2000GB
Graphics: NVIDIA GeForce RTX 3070 8GB (RTX 3070 run); NVIDIA GeForce RTX 3090 24GB (RTX 3090 and GeForce RTX 3090 runs)
Audio: NVIDIA GA104 HD Audio (RTX 3070 run); NVIDIA GA102 HD Audio (RTX 3090 and GeForce RTX 3090 runs)
Monitor: ASUS VP28U
Network: Realtek RTL8125 2.5GbE + Intel I211
OS: Ubuntu 21.10
Kernel: 5.13.0-14-generic (x86_64)
Desktop: GNOME Shell 40.2
Display Server: X Server 1.20.11
Display Driver: NVIDIA 470.63.01
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 11.4.112
Vulkan: 1.2.175
Compiler: GCC 11.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-M6DaQn/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-M6DaQn/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
- CPU Microcode: 0xa201016

OpenCL Details
- RTX 3070: GPU Compute Cores: 5888
- RTX 3090: GPU Compute Cores: 10496
- GeForce RTX 3090: GPU Compute Cores: 10496

Python Details
- Python 3.9.6

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

Results overview: the per-test values for the RTX 3070, RTX 3090, and GeForce RTX 3090 runs are listed individually below.

VkFFT

VkFFT 1.1.1 - Benchmark Score, More Is Better
GeForce RTX 3090: 44567
RTX 3090: 43295
RTX 3070: 31554
SE +/- 172.10, N = 3
(CXX) g++ options: -O3 -pthread

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
GeForce RTX 3090: 25.45
RTX 3090: 25.53
RTX 3070: 24.67
SE +/- 0.01, N = 3
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Reduction

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
GeForce RTX 3090: 391.09
RTX 3090: 391.21
RTX 3070: 326.57
SE +/- 0.70, N = 3
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
RTX 3090: 26.26
RTX 3070: 26.28
SE +/- 0.02, N = 3
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
RTX 3090: 27.08
RTX 3070: 27.11
SE +/- 0.01, N = 3
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
RTX 3090: 2228.01
RTX 3070: 2130.04
SE +/- 6.58, N = 3
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

cl-mem

Benchmark: Copy

cl-mem 2017-01-13 - GB/s, More Is Better
RTX 3090: 364.3
RTX 3070: 296.8
SE +/- 0.45, N = 3
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13 - GB/s, More Is Better
RTX 3090: 826.6
RTX 3070: 393.3
SE +/- 0.30, N = 3
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

cl-mem 2017-01-13 - GB/s, More Is Better
RTX 3090: 745.0
RTX 3070: 382.0
SE +/- 0.09, N = 3
(CC) gcc options: -O2 -flto -lOpenCL

ViennaCL

Test: CPU BLAS - sCOPY

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 63.0
RTX 3070: 66.9
SE +/- 0.58, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sAXPY

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 89.6
RTX 3070: 100.8
SE +/- 0.73, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sDOT

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 136
RTX 3070: 151
SE +/- 0.88, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dCOPY

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 22.6
RTX 3070: 24.6
SE +/- 0.03, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dAXPY

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 34.8
RTX 3070: 36.5
SE +/- 0.32, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dDOT

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 43.2
RTX 3070: 48.6
SE +/- 0.50, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-N

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 77.6
RTX 3070: 82.8
SE +/- 0.09, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-T

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 83.8
RTX 3070: 89.8
SE +/- 0.10, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sCOPY

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 364
RTX 3070: 291
SE +/- 0.58, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sAXPY

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 501
RTX 3070: 359
SE +/- 0.33, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sDOT

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 351
RTX 3070: 322
SE +/- 0.67, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dCOPY

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 607
RTX 3070: 370
SE +/- 0.33, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dAXPY

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 721
RTX 3070: 396
SE +/- 0.00, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dDOT

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 663
RTX 3070: 395
SE +/- 0.00, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-N

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 236
RTX 3070: 222
SE +/- 0.58, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-T

ViennaCL 1.7.1 - GB/s, More Is Better
RTX 3090: 376
RTX 3070: 333
SE +/- 0.67, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

vkpeak

fp32-scalar

vkpeak 20210424 - GFLOPS, More Is Better
GeForce RTX 3090: 21146.01
RTX 3090: 21074.89
RTX 3070: 11589.59
SE +/- 35.78, N = 3

vkpeak

fp32-vec4

vkpeak 20210424 - GFLOPS, More Is Better
GeForce RTX 3090: 27807.75
RTX 3090: 27597.60
RTX 3070: 15319.31
SE +/- 35.81, N = 3

vkpeak

fp16-scalar

vkpeak 20210424 - GFLOPS, More Is Better
GeForce RTX 3090: 20953.65
RTX 3090: 20954.00
RTX 3070: 11590.48
SE +/- 26.04, N = 3

vkpeak

fp16-vec4

vkpeak 20210424 - GFLOPS, More Is Better
GeForce RTX 3090: 40869.19
RTX 3090: 40868.96
RTX 3070: 22503.16
SE +/- 71.70, N = 3

vkpeak

fp64-scalar

vkpeak 20210424 - GFLOPS, More Is Better
GeForce RTX 3090: 658.56
RTX 3090: 655.42
RTX 3070: 363.75
SE +/- 0.36, N = 3

vkpeak

fp64-vec4

vkpeak 20210424 - GFLOPS, More Is Better
GeForce RTX 3090: 651.20
RTX 3090: 655.31
RTX 3070: 364.29
SE +/- 0.42, N = 3

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: S3D

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GFLOPS, More Is Better
GeForce RTX 3090: 427.08
RTX 3090: 427.94
RTX 3070: 219.20
SE +/- 0.17, N = 3
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GFLOPS, More Is Better
GeForce RTX 3090: 2348.03
RTX 3090: 2349.19
RTX 3070: 1134.82
SE +/- 0.78, N = 3
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: GEMM SGEMM_N

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GFLOPS, More Is Better
GeForce RTX 3090: 8105.73
RTX 3090: 8174.54
RTX 3070: 3732.87
SE +/- 26.69, N = 3
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GFLOPS, More Is Better
RTX 3090: 40344.1
RTX 3070: 23102.9
SE +/- 59.20, N = 3
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

ViennaCL

Test: CPU BLAS - dGEMM-NN

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
RTX 3090: 52.4
RTX 3070: 53.1
SE +/- 0.26, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NT

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
RTX 3090: 52.0
RTX 3070: 53.6
SE +/- 0.12, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TN

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
RTX 3090: 56.1
RTX 3070: 58.2
SE +/- 0.00, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TT

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
RTX 3090: 54.0
RTX 3070: 56.4
SE +/- 0.09, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NN

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
RTX 3090: 600
RTX 3070: 341
SE +/- 1.33, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NT

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
RTX 3090: 602
RTX 3070: 342
SE +/- 1.33, N = 3
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TN

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
RTX 3090: 598
RTX 3070: 337
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TT

ViennaCL 1.7.1 - GFLOPs/s, More Is Better
RTX 3090: 601
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GHash/s, More Is Better
GeForce RTX 3090: 44.19
RTX 3090: 44.43
RTX 3070: 25.41
SE +/- 0.03, N = 3
(CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -pthread -lmpi_cxx -lmpi

vkpeak

int32-scalar

vkpeak 20210424 - GIOPS, More Is Better
GeForce RTX 3090: 20925.87
RTX 3090: 20836.32
RTX 3070: 11571.35
SE +/- 16.33, N = 3

vkpeak

int32-vec4

vkpeak 20210424 - GIOPS, More Is Better
GeForce RTX 3090: 20746.67
RTX 3090: 20743.80
RTX 3070: 11521.48
SE +/- 18.17, N = 3

vkpeak

int16-scalar

vkpeak 20210424 - GIOPS, More Is Better
GeForce RTX 3090: 13752.19
RTX 3090: 13646.56
RTX 3070: 7644.32
SE +/- 11.41, N = 3

vkpeak

int16-vec4

vkpeak 20210424 - GIOPS, More Is Better
GeForce RTX 3090: 17006.61
RTX 3090: 16889.26
RTX 3070: 10106.83
SE +/- 20.90, N = 3

Hashcat

Benchmark: MD5

Hashcat 6.1.1 - H/s, More Is Better
GeForce RTX 3090: 66589100000
RTX 3090: 66739300000
RTX 3070: 38925166667
SE +/- 41756290.28, N = 3

Hashcat

Benchmark: SHA1

Hashcat 6.1.1 - H/s, More Is Better
GeForce RTX 3090: 22798500000
RTX 3090: 22853700000
RTX 3070: 13194833333
SE +/- 7750340.49, N = 3

Hashcat

Benchmark: 7-Zip

Hashcat 6.1.1 - H/s, More Is Better
GeForce RTX 3090: 1161100
RTX 3090: 1154600
RTX 3070: 689667
SE +/- 1128.91, N = 3

Hashcat

Benchmark: SHA-512

Hashcat 6.1.1 - H/s, More Is Better
GeForce RTX 3090: 2881300000
RTX 3090: 2884600000
RTX 3070: 1673633333
SE +/- 735602.55, N = 3

Hashcat

Benchmark: TrueCrypt RIPEMD160 + XTS

Hashcat 6.1.1 - H/s, More Is Better
GeForce RTX 3090: 859400
RTX 3090: 857500
RTX 3070: 503167
SE +/- 800.69, N = 3

IndigoBench

Acceleration: OpenCL GPU - Scene: Bedroom

IndigoBench 4.4 - M samples/s, More Is Better
RTX 3090: 20.97
RTX 3070: 12.94
SE +/- 0.02, N = 3

IndigoBench

Acceleration: OpenCL GPU - Scene: Supercar

IndigoBench 4.4 - M samples/s, More Is Better
RTX 3090: 53.36
RTX 3070: 37.32
SE +/- 0.04, N = 3

LuxCoreRender

Scene: DLSC - Acceleration: GPU

LuxCoreRender 2.5 - M samples/sec, More Is Better
RTX 3090: 11.53 (MIN: 11.29 / MAX: 11.68)
RTX 3070: 6.68 (MIN: 6.25 / MAX: 6.94)
SE +/- 0.00, N = 3

LuxCoreRender

Scene: Danish Mood - Acceleration: GPU

LuxCoreRender 2.5 - M samples/sec, More Is Better
RTX 3090: 8.64 (MIN: 2.95 / MAX: 10.48)
RTX 3070: 5.06 (MIN: 1.5 / MAX: 6.12)
SE +/- 0.00, N = 3

LuxCoreRender

Scene: Orange Juice - Acceleration: GPU

LuxCoreRender 2.5 - M samples/sec, More Is Better
RTX 3090: 10.42 (MIN: 8.45 / MAX: 13.57)
RTX 3070: 6.99 (MIN: 5.61 / MAX: 8.46)
SE +/- 0.01, N = 3

LuxCoreRender

Scene: LuxCore Benchmark - Acceleration: GPU

LuxCoreRender 2.5 - M samples/sec, More Is Better
RTX 3090: 10.92 (MIN: 3.29 / MAX: 13.01)
RTX 3070: 6.22 (MIN: 2.05 / MAX: 7.38)
SE +/- 0.01, N = 3

LuxCoreRender

Scene: Rainbow Colors and Prism - Acceleration: GPU

LuxCoreRender 2.5 - M samples/sec, More Is Better
RTX 3090: 32.09 (MIN: 28.96 / MAX: 34.45)
RTX 3070: 21.27 (MIN: 20.13 / MAX: 22.99)
SE +/- 0.03, N = 3

LeelaChessZero

Backend: OpenCL

LeelaChessZero 0.28 - Nodes Per Second, More Is Better
RTX 3090: 53653
RTX 3070: 37790
SE +/- 178.94, N = 3
(CXX) g++ options: -flto -pthread

FAHBench

FAHBench 2.3.2 - Ns Per Day, More Is Better
RTX 3090: 344.02
RTX 3070: 265.51
SE +/- 0.05, N = 3

OctaneBench

Total Score

OctaneBench 2020.1 - Score, More Is Better
RTX 3090: 683.60
RTX 3070: 415.20

NAMD CUDA

ATPase Simulation - 327,506 Atoms

NAMD CUDA 2.14 - days/ns, Fewer Is Better
RTX 3090: 0.13155
RTX 3070: 0.12555
SE +/- 0.00049, N = 3

VkResample

Upscale: 2x - Precision: Double

VkResample 1.0 - ms, Fewer Is Better
RTX 3090: 119.82
RTX 3070: 216.41
SE +/- 0.03, N = 3
(CXX) g++ options: -O3 -pthread

VkResample

Upscale: 2x - Precision: Single

VkResample 1.0 - ms, Fewer Is Better
RTX 3090: 9.254
RTX 3070: 17.529
SE +/- 0.013, N = 3
(CXX) g++ options: -O3 -pthread

ArrayFire

Test: Conjugate Gradient OpenCL

ArrayFire 3.7 - ms, Fewer Is Better
RTX 3090: 1.480
RTX 3070: 2.089
SE +/- 0.001, N = 3
(CXX) g++ options: -rdynamic

FinanceBench

Benchmark: Black-Scholes OpenCL

FinanceBench 2016-07-25 - ms, Fewer Is Better
RTX 3090: 6.357
RTX 3070: 10.457
SE +/- 0.003, N = 3
(CXX) g++ options: -O3 -march=native -fopenmp

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 4.47 (MIN: 4.1 / MAX: 24.65)
RTX 3070: 4.31 (MIN: 4.25 / MAX: 5.94)
SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 1.98 (MIN: 1.84 / MAX: 7.62)
RTX 3070: 1.91 (MIN: 1.88 / MAX: 3.42)
SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 2.31 (MIN: 2.09 / MAX: 11.02)
RTX 3070: 2.14 (MIN: 2.11 / MAX: 3.93)
SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 1.77 (MIN: 1.67 / MAX: 9.47)
RTX 3070: 1.71 (MIN: 1.68 / MAX: 3.55)
SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 2.14 (MIN: 1.94 / MAX: 8.72)
RTX 3070: 2.01 (MIN: 1.98 / MAX: 4.82)
SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 3.34 (MIN: 3.09 / MAX: 9.48)
RTX 3070: 3.17 (MIN: 3.14 / MAX: 5.06)
SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 1.07 (MIN: 0.89 / MAX: 21.68)
RTX 3070: 0.98 (MIN: 0.95 / MAX: 2.27)
SE +/- 0.00, N = 3
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 4.35 (MIN: 4.02 / MAX: 18.18)
RTX 3070: 5.43 (MIN: 5.38 / MAX: 9.58)
SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 1.66 (MIN: 1.53 / MAX: 9.86)
RTX 3070: 1.68 (MIN: 1.66 / MAX: 1.97)
SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 1.95 (MIN: 1.76 / MAX: 18.4)
RTX 3070: 1.90 (MIN: 1.86 / MAX: 5.12)
SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 3.68 (MIN: 3.44 / MAX: 13.64)
RTX 3070: 3.74 (MIN: 3.71 / MAX: 4.06)
SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 7.20 (MIN: 6.44 / MAX: 31.2)
RTX 3070: 6.57 (MIN: 6.33 / MAX: 6.98)
SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 4.37 (MIN: 3.93 / MAX: 23.52)
RTX 3070: 4.43 (MIN: 4.1 / MAX: 7.62)
SE +/- 0.13, N = 3
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 2.52 (MIN: 2.38 / MAX: 10.9)
RTX 3070: 2.42 (MIN: 2.4 / MAX: 3.84)
SE +/- 0.01, N = 3
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20210720 - ms, Fewer Is Better
RTX 3090: 4.41 (MIN: 3.6 / MAX: 14.7)
RTX 3070: 4.13 (MIN: 3.77 / MAX: 13.51)
SE +/- 0.13, N = 2
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

RealSR-NCNN

Scale: 4x - TAA: No

RealSR-NCNN 20200818 - Seconds, Fewer Is Better
GeForce RTX 3090: 5.595
RTX 3090: 5.566
RTX 3070: 7.856
SE +/- 0.002, N = 3

RealSR-NCNN

Scale: 4x - TAA: Yes

RealSR-NCNN 20200818 - Seconds, Fewer Is Better
GeForce RTX 3090: 28.42
RTX 3090: 28.42
RTX 3070: 46.79
SE +/- 0.04, N = 3

Waifu2x-NCNN Vulkan

Scale: 2x - Denoise: 3 - TAA: Yes

Waifu2x-NCNN Vulkan 20200818 - Seconds, Fewer Is Better
GeForce RTX 3090: 3.275
RTX 3090: 3.276
RTX 3070: 4.112
SE +/- 0.006, N = 3

RedShift Demo

RedShift Demo 3.0 - Seconds, Fewer Is Better
RTX 3090: 139
RTX 3070: 226
SE +/- 0.33, N = 3

Rodinia

Test: OpenCL Particle Filter

Rodinia 3.1 - Seconds, Fewer Is Better
RTX 3090: 3.680
RTX 3070: 5.920
SE +/- 0.007, N = 3
(CXX) g++ options: -O2 -lOpenCL

Blender

Blend File: BMW27 - Compute: CUDA

Blender 2.92 - Seconds, Fewer Is Better
RTX 3090: 18.32
RTX 3070: 29.00
SE +/- 0.04, N = 3

Blender

Blend File: Classroom - Compute: CUDA

Blender 2.92 - Seconds, Fewer Is Better
RTX 3090: 51.18
RTX 3070: 76.06
SE +/- 0.01, N = 3

Blender

Blend File: Fishy Cat - Compute: CUDA

Blender 2.92 - Seconds, Fewer Is Better
RTX 3090: 34.65
RTX 3070: 54.69
SE +/- 0.04, N = 3

Blender

Blend File: Barbershop - Compute: CUDA

Blender 2.92 - Seconds, Fewer Is Better
RTX 3090: 373.32
RTX 3070: 507.39
SE +/- 0.17, N = 3

Blender

Blend File: BMW27 - Compute: NVIDIA OptiX

Blender 2.92 - Seconds, Fewer Is Better
RTX 3090: 9.52
RTX 3070: 15.82
SE +/- 0.01, N = 3

Blender

Blend File: Classroom - Compute: NVIDIA OptiX

Blender 2.92 - Seconds, Fewer Is Better
RTX 3090: 28.66
RTX 3070: 45.77
SE +/- 0.02, N = 3

Blender

Blend File: Fishy Cat - Compute: NVIDIA OptiX

Blender 2.92 - Seconds, Fewer Is Better
RTX 3090: 18.97
RTX 3070: 33.34
SE +/- 0.01, N = 3
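
One rough way to summarize the Blender 2.92 results above is to divide the RTX 3070 render time by the RTX 3090 render time for each scene and combine the ratios with a geometric mean. The sketch below is illustrative Python, not output of the Phoronix Test Suite; the times are copied from the results above, and the names `blender_seconds` and `speedup` are hypothetical.

```python
from math import prod

# Render times in seconds (fewer is better), copied from the Blender 2.92 results.
blender_seconds = {
    ("BMW27", "CUDA"): {"RTX 3090": 18.32, "RTX 3070": 29.00},
    ("Classroom", "CUDA"): {"RTX 3090": 51.18, "RTX 3070": 76.06},
    ("Fishy Cat", "CUDA"): {"RTX 3090": 34.65, "RTX 3070": 54.69},
    ("Barbershop", "CUDA"): {"RTX 3090": 373.32, "RTX 3070": 507.39},
    ("BMW27", "NVIDIA OptiX"): {"RTX 3090": 9.52, "RTX 3070": 15.82},
    ("Classroom", "NVIDIA OptiX"): {"RTX 3090": 28.66, "RTX 3070": 45.77},
    ("Fishy Cat", "NVIDIA OptiX"): {"RTX 3090": 18.97, "RTX 3070": 33.34},
}


def speedup(times: dict) -> float:
    """How many times faster the RTX 3090 finished than the RTX 3070."""
    return times["RTX 3070"] / times["RTX 3090"]


ratios = {key: speedup(times) for key, times in blender_seconds.items()}

# The geometric mean is the usual choice for averaging ratios.
geomean = prod(ratios.values()) ** (1 / len(ratios))

for (scene, backend), ratio in ratios.items():
    print(f"{scene:<10} {backend:<13} {ratio:.2f}x")
print(f"Geometric mean: {geomean:.2f}x")
```

Because these are time-based results, the ratio is taken as slower over faster; for the "More Is Better" tests the division would go the other way.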


Phoronix Test Suite v10.8.4