RTX 4070 SUPER

Intel Core i9-13900K testing with an ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS) and ASUS NVIDIA GeForce RTX 4070 SUPER 12GB on EndeavourOS rolling via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2401264-NE-RTX4070SU09&rdt&grr.

System Details

Configurations: NVIDIA RTX 4070 SUPER, RTX 4070 SUPER, NVIDIA 4070 SUPER

Processor: Intel Core i9-13900K @ 5.50GHz (24 Cores / 32 Threads)
Motherboard: ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS)
Chipset: Intel Device 7a27
Memory: 32GB
Disk: 4001GB Seagate ZP4000GP304001
Graphics: ASUS NVIDIA GeForce RTX 4070 SUPER 12GB
Audio: Realtek ALC1220
Monitor: ARZOPA
Network: Intel I226-V + Intel Device 7a70
OS: EndeavourOS rolling
Kernel: 6.7.1-arch1-1 (x86_64)
Desktop: KDE Plasma 5.27.10
Display Server: X Server 1.21.1.11
Display Driver: NVIDIA 550.40.07
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.4.74
Compiler: GCC 13.2.1 20230801 (also reported as GCC 13.2.1 20230801 + CUDA 12.3 for one of the configurations)
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: always

Compiler Details
- NVIDIA RTX 4070 SUPER, NVIDIA 4070 SUPER: --disable-libssp --disable-libstdcxx-pch --disable-werror --enable-__cxa_atexit --enable-bootstrap --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=ada,c,c++,d,fortran,go,lto,objc,obj-c++ --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-build-config=bootstrap-lto --with-linker-hash-style=gnu

Processor Details
- Scaling Governor: intel_pstate powersave (EPP: balance_performance)
- CPU Microcode: 0x11d

Graphics Details
- NVIDIA RTX 4070 SUPER, NVIDIA 4070 SUPER: BAR1 / Visible vRAM Size: 16384 MiB
- vBIOS Version: 95.04.69.00.c1

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence
- srbds: Not affected
- tsx_async_abort: Not affected

Python Details
- RTX 4070 SUPER, NVIDIA 4070 SUPER: Python 3.11.6

Environment Details
- NVIDIA 4070 SUPER: NVCC_PREPEND_FLAGS="-ccbin /opt/cuda/bin"

Result Overview

Results were recorded under three configuration labels (NVIDIA RTX 4070 SUPER, RTX 4070 SUPER, NVIDIA 4070 SUPER); each test's result is listed individually in the sections that follow. The overview also lists vkfft: FFT + iFFT C2C 1D batched in single precision, for which no value appears.
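Most results in this export are annotated in the form "SE +/- x, N = y", where N is the number of runs and SE is the standard error. As a minimal sketch of what that annotation conventionally means, assuming the usual definition (sample standard deviation divided by the square root of N; the exact formula used by the Phoronix Test Suite is not stated in this export), the helper below computes it for a set of runs. The function name and the run values are hypothetical and are not taken from this result file.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two runs are needed to estimate the standard error")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-run scores, for illustration only (not from this result file).
runs = [100.0, 101.0, 102.0]
mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(runs)})")
```

A small SE relative to the mean indicates that the individual runs agreed closely with one another.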

GpuOwl

Exponent: 77936867

GpuOwl 7.2.1 (Iterations / Second, More Is Better)
NVIDIA RTX 4070 SUPER: 646.41 (SE +/- 0.00, N = 3)

GpuOwl

Exponent: 332220523

GpuOwl 7.2.1 (Iterations / Second, More Is Better)
NVIDIA RTX 4070 SUPER: 137.44 (SE +/- 0.00, N = 3)

OctaneBench

Total Score

OctaneBench 2020.1 (Score, More Is Better)
NVIDIA 4070 SUPER: 720.97

GpuOwl

Exponent: 57885161

GpuOwl 7.2.1 (Iterations / Second, More Is Better)
NVIDIA RTX 4070 SUPER: 869.07 (SE +/- 1.26, N = 3)

FAHBench

FAHBench 2.3.2 (Ns Per Day, More Is Better)
NVIDIA 4070 SUPER: 366.06 (SE +/- 0.39, N = 3)

LuxCoreRender

Scene: LuxCore Benchmark - Acceleration: GPU

LuxCoreRender 2.6 (M samples/sec, More Is Better)
NVIDIA 4070 SUPER: 12.82 (SE +/- 0.02, N = 3; MIN: 4.84 / MAX: 14.62)

LuxCoreRender

Scene: DLSC - Acceleration: GPU

LuxCoreRender 2.6 (M samples/sec, More Is Better)
NVIDIA 4070 SUPER: 13.59 (SE +/- 0.01, N = 3; MIN: 12.52 / MAX: 13.84)

IndigoBench

Acceleration: OpenCL GPU - Scene: Bedroom

IndigoBench 4.4 (M samples/s, More Is Better)
NVIDIA 4070 SUPER: 19.80 (SE +/- 0.01, N = 3)

VkResample

Upscale: 2x - Precision: Double

VkResample 1.0 (ms, Fewer Is Better)
NVIDIA 4070 SUPER: 339.59 (SE +/- 0.30, N = 3)
(CXX) g++ options: -O3

IndigoBench

Acceleration: OpenCL GPU - Scene: Supercar

IndigoBench 4.4 (M samples/s, More Is Better)
NVIDIA 4070 SUPER: 52.81 (SE +/- 0.03, N = 3)

LuxCoreRender

Scene: Orange Juice - Acceleration: GPU

LuxCoreRender 2.6 (M samples/sec, More Is Better)
NVIDIA 4070 SUPER: 11.72 (SE +/- 0.07, N = 3; MIN: 9.6 / MAX: 15.44)

LuxCoreRender

Scene: Danish Mood - Acceleration: GPU

LuxCoreRender 2.6 (M samples/sec, More Is Better)
NVIDIA 4070 SUPER: 10.56 (SE +/- 0.08, N = 3; MIN: 3.7 / MAX: 12.17)

Blender

Blend File: Barbershop - Compute: NVIDIA OptiX

Blender 4.0 (Seconds, Fewer Is Better)
NVIDIA 4070 SUPER: 51.30 (SE +/- 0.10, N = 3)

NAMD CUDA

ATPase Simulation - 327,506 Atoms

NAMD CUDA 2.14 (days/ns, Fewer Is Better)
NVIDIA 4070 SUPER: 0.06791 (SE +/- 0.00031, N = 3)
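The NAMD result is reported in days/ns, where fewer is better; taking the reciprocal gives the equivalent ns/day figure. The following is a minimal sketch of that conversion using the value reported above. The function name `days_per_ns_to_ns_per_day` is illustrative and is not part of NAMD or the Phoronix Test Suite.

```python
def days_per_ns_to_ns_per_day(days_per_ns: float) -> float:
    """Convert a days/ns figure (lower is better) to ns/day (higher is better)."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# Value reported for the ATPase simulation in this result file.
reported_days_per_ns = 0.06791
ns_per_day = days_per_ns_to_ns_per_day(reported_days_per_ns)
print(f"{ns_per_day:.2f} ns/day")
```

For the reported 0.06791 days/ns this works out to roughly 14.7 ns/day.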

Blender

Blend File: Fishy Cat - Compute: NVIDIA OptiX

Blender 4.0 (Seconds, Fewer Is Better)
NVIDIA 4070 SUPER: 9.45 (SE +/- 0.06, N = 13)

RealSR-NCNN

Scale: 4x - TAA: Yes

RealSR-NCNN 20200818 (Seconds, Fewer Is Better)
NVIDIA 4070 SUPER: 34.89 (SE +/- 0.02, N = 3)

RealSR-NCNN

Scale: 4x - TAA: No

RealSR-NCNN 20200818 (Seconds, Fewer Is Better)
NVIDIA 4070 SUPER: 6.323 (SE +/- 0.150, N = 15)

Blender

Blend File: BMW27 - Compute: NVIDIA OptiX

Blender 4.0 (Seconds, Fewer Is Better)
NVIDIA 4070 SUPER: 5.57 (SE +/- 0.06, N = 13)

ViennaCL

Test: CPU BLAS - dGEMM-TT

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
NVIDIA 4070 SUPER: 122 (SE +/- 2.08, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TN

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
NVIDIA 4070 SUPER: 115 (SE +/- 1.00, N = 2)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NT

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
NVIDIA 4070 SUPER: 117 (SE +/- 2.08, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NN

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
NVIDIA 4070 SUPER: 119 (SE +/- 4.04, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-T

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 109 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-N

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 102 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dDOT

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 96.8 (SE +/- 0.09, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dAXPY

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 87.2 (SE +/- 0.12, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dCOPY

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 70.8 (SE +/- 0.32, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sDOT

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 165 (SE +/- 2.73, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sAXPY

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 156 (SE +/- 2.19, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sCOPY

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 132 (SE +/- 1.20, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TT

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
NVIDIA 4070 SUPER: 613 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TN

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
NVIDIA 4070 SUPER: 599 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NT

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
NVIDIA 4070 SUPER: 584 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NN

ViennaCL 1.7.1 (GFLOPs/s, More Is Better)
NVIDIA 4070 SUPER: 577 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-T

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 389 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-N

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 210 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dDOT

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 458 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dAXPY

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 437 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dCOPY

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 423 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sDOT

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 370 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sAXPY

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 392 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sCOPY

ViennaCL 1.7.1 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 334 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: Efficientnet_v2_l

PyTorch 2.1 (batches/sec, More Is Better)
RTX 4070 SUPER: 103.17 (MIN: 95.79 / MAX: 105.15)

Blender

Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX

Blender 4.0 (Seconds, Fewer Is Better)
NVIDIA 4070 SUPER: 14.29 (SE +/- 0.03, N = 3)

Blender

Blend File: Classroom - Compute: NVIDIA OptiX

Blender 4.0 (Seconds, Fewer Is Better)
NVIDIA 4070 SUPER: 12.60 (SE +/- 0.00, N = 3)

ProjectPhysX OpenCL-Benchmark

Operation: Memory Bandwidth Coalesced Write

ProjectPhysX OpenCL-Benchmark 1.2 (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 455.01 (SE +/- 0.14, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: Memory Bandwidth Coalesced Read

ProjectPhysX OpenCL-Benchmark 1.2 (GB/s, More Is Better)
NVIDIA RTX 4070 SUPER: 464.86 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT8 Compute

ProjectPhysX OpenCL-Benchmark 1.2 (TIOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 14.31 (SE +/- 0.05, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT16 Compute

ProjectPhysX OpenCL-Benchmark 1.2 (TIOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 17.17 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT32 Compute

ProjectPhysX OpenCL-Benchmark 1.2 (TIOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 19.89 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT64 Compute

ProjectPhysX OpenCL-Benchmark 1.2 (TIOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 4.214 (SE +/- 0.015, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: FP32 Compute

ProjectPhysX OpenCL-Benchmark 1.2 (TFLOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 38.59 (SE +/- 0.03, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: FP64 Compute

ProjectPhysX OpenCL-Benchmark 1.2 (TFLOPs/s, More Is Better)
NVIDIA RTX 4070 SUPER: 0.621 (SE +/- 0.000, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-152

PyTorch 2.1 (batches/sec, More Is Better)
RTX 4070 SUPER: 194.58 (SE +/- 1.14, N = 2; MIN: 183.74 / MAX: 198.52)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: Efficientnet_v2_l

PyTorch 2.1 (batches/sec, More Is Better)
RTX 4070 SUPER: 102.60 (MIN: 94.84 / MAX: 104.25)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: Efficientnet_v2_l

PyTorch 2.1 (batches/sec, More Is Better)
RTX 4070 SUPER: 103.57 (MIN: 95.95 / MAX: 105.54)

VkResample

Upscale: 2x - Precision: Single

VkResample 1.0 (ms, Fewer Is Better)
NVIDIA 4070 SUPER: 18.49 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-50

PyTorch 2.1 (batches/sec, More Is Better)
RTX 4070 SUPER: 507.45 (SE +/- 0.92, N = 3; MIN: 423.41 / MAX: 512.88)

clpeak

OpenCL Test: Double-Precision Double

clpeak 1.1.2 (GFLOPS, More Is Better)
NVIDIA 4070 SUPER: 630.11 (SE +/- 0.98, N = 3)
(CXX) g++ options: -O3

LuxCoreRender

Scene: Rainbow Colors and Prism - Acceleration: GPU

LuxCoreRender 2.6 (M samples/sec, More Is Better)
NVIDIA 4070 SUPER: 27.67 (SE +/- 0.03, N = 3; MIN: 24.87 / MAX: 29.03)

Hashcat

Benchmark: SHA-512

Hashcat 6.2.4 (H/s, More Is Better)
NVIDIA 4070 SUPER: 3232733333 (SE +/- 1530068.99, N = 3)

Hashcat

Benchmark: SHA1

Hashcat 6.2.4 (H/s, More Is Better)
NVIDIA 4070 SUPER: 22132600000 (SE +/- 5140363.15, N = 3)

Hashcat

Benchmark: MD5

Hashcat 6.2.4 (H/s, More Is Better)
NVIDIA 4070 SUPER: 67583033333 (SE +/- 22430807.19, N = 3)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: Efficientnet_v2_l

PyTorch 2.1 (batches/sec, More Is Better)
RTX 4070 SUPER: 106.37 (MIN: 97.91 / MAX: 108.16)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-152

PyTorch 2.1 (batches/sec, More Is Better)
RTX 4070 SUPER: 195.39 (MIN: 183.94 / MAX: 198.7)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-50

PyTorch 2.1 (batches/sec, More Is Better)
RTX 4070 SUPER: 504.27 (SE +/- 4.43, N = 2; MIN: 418.22 / MAX: 512.44)

Hashcat

Benchmark: TrueCrypt RIPEMD160 + XTS

Hashcat 6.2.4 (H/s, More Is Better)
NVIDIA 4070 SUPER: 802967 (SE +/- 633.33, N = 3)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-152

PyTorch 2.1 (batches/sec, More Is Better)
RTX 4070 SUPER: 201.94 (MIN: 183.53 / MAX: 206.5)

Rodinia

Test: OpenCL Particle Filter

Rodinia 3.1 (Seconds, Fewer Is Better)
NVIDIA 4070 SUPER: 3.480 (SE +/- 0.039, N = 4)
(CXX) g++ options: -O2 -lOpenCL

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-50

PyTorch 2.1 (batches/sec, More Is Better)
RTX 4070 SUPER: 509.45 (MIN: 430.1 / MAX: 516.48)

cl-mem

Benchmark: Copy

cl-mem 2017-01-13 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 331.8 (SE +/- 0.03, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 446.2 (SE +/- 0.12, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

cl-mem 2017-01-13 (GB/s, More Is Better)
NVIDIA 4070 SUPER: 407.5 (SE +/- 1.11, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

Hashcat

Benchmark: 7-Zip

Hashcat 6.2.4 (H/s, More Is Better)
NVIDIA 4070 SUPER: 1176467 (SE +/- 1991.93, N = 3)

Waifu2x-NCNN Vulkan

Scale: 2x - Denoise: 3 - TAA: Yes

Waifu2x-NCNN Vulkan 20200818 (Seconds, Fewer Is Better)
NVIDIA 4070 SUPER: 2.855 (SE +/- 0.014, N = 3)

FinanceBench

Benchmark: Black-Scholes OpenCL

FinanceBench 2016-07-25 (ms, Fewer Is Better)
NVIDIA 4070 SUPER: 5.912 (SE +/- 0.114, N = 15)
(CXX) g++ options: -O3 -march=native -fopenmp

clpeak

OpenCL Test: Global Memory Bandwidth

clpeak 1.1.2 (GBPS, More Is Better)
NVIDIA 4070 SUPER: 437.65 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3

MandelGPU

OpenCL Device: GPU

MandelGPU 1.3pts1 (Samples/sec, More Is Better)
NVIDIA 4070 SUPER: 587219538.2 (SE +/- 467034.80, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

clpeak

OpenCL Test: Integer Compute INT

clpeak 1.1.2 (GIOPS, More Is Better)
NVIDIA 4070 SUPER: 18170.54 (SE +/- 3.14, N = 3)
(CXX) g++ options: -O3

clpeak

OpenCL Test: Single-Precision Float

clpeak 1.1.2 (GFLOPS, More Is Better)
NVIDIA 4070 SUPER: 35492.69 (SE +/- 0.99, N = 3)
(CXX) g++ options: -O3

NeatBench

Acceleration: GPU

NeatBench 5 (FPS, More Is Better)
NVIDIA 4070 SUPER: 4070 (SE +/- 0.00, N = 3)


Phoronix Test Suite v10.8.5