RTX 4070 SUPER

Intel Core i9-13900K testing with an ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS) and ASUS NVIDIA GeForce RTX 4070 SUPER 12GB on EndeavourOS rolling via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2401264-NE-RTX4070SU09
Run Management

Result Identifier        Date         Test Duration
NVIDIA RTX 4070 SUPER    January 25   27 Minutes
RTX 4070 SUPER           January 26   5 Minutes
NVIDIA 4070 SUPER        January 26   1 Hour, 27 Minutes
Average                               40 Minutes
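
The 40 Minutes figure in the run list appears to be the mean of the three test durations. A minimal sketch of that check, assuming the figure is a rounded arithmetic mean (the page does not state how it is derived):

```python
# Test durations in minutes, taken from the run list above.
durations_min = {
    "NVIDIA RTX 4070 SUPER": 27,
    "RTX 4070 SUPER": 5,
    "NVIDIA 4070 SUPER": 60 + 27,  # 1 Hour, 27 Minutes
}

# Assumption: the reported figure is the arithmetic mean, rounded to whole minutes.
mean_minutes = sum(durations_min.values()) / len(durations_min)
print(f"mean duration: {mean_minutes:.1f} minutes (rounded: {round(mean_minutes)})")
```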



HTML result view exported from: https://openbenchmarking.org/result/2401264-NE-RTX4070SU09&grw&sor.

RTX 4070 SUPER

Runs: NVIDIA RTX 4070 SUPER, RTX 4070 SUPER, NVIDIA 4070 SUPER

Processor: Intel Core i9-13900K @ 5.50GHz (24 Cores / 32 Threads)
Motherboard: ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS)
Chipset: Intel Device 7a27
Memory: 32GB
Disk: 4001GB Seagate ZP4000GP304001
Graphics: ASUS NVIDIA GeForce RTX 4070 SUPER 12GB
Audio: Realtek ALC1220
Monitor: ARZOPA
Network: Intel I226-V + Intel Device 7a70
OS: EndeavourOS rolling
Kernel: 6.7.1-arch1-1 (x86_64)
Desktop: KDE Plasma 5.27.10
Display Server: X Server 1.21.1.11
Display Driver: NVIDIA 550.40.07
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.4.74
Compiler: GCC 13.2.1 20230801 (a later run lists GCC 13.2.1 20230801 + CUDA 12.3)
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: always

Compiler Details
- NVIDIA RTX 4070 SUPER, NVIDIA 4070 SUPER: --disable-libssp --disable-libstdcxx-pch --disable-werror --enable-__cxa_atexit --enable-bootstrap --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=ada,c,c++,d,fortran,go,lto,objc,obj-c++ --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-build-config=bootstrap-lto --with-linker-hash-style=gnu

Processor Details
- Scaling Governor: intel_pstate powersave (EPP: balance_performance)
- CPU Microcode: 0x11d

Graphics Details
- NVIDIA RTX 4070 SUPER, NVIDIA 4070 SUPER: BAR1 / Visible vRAM Size: 16384 MiB
- vBIOS Version: 95.04.69.00.c1

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence
- srbds: Not affected
- tsx_async_abort: Not affected

Python Details
- RTX 4070 SUPER, NVIDIA 4070 SUPER: Python 3.11.6

Environment Details
- NVIDIA 4070 SUPER: NVCC_PREPEND_FLAGS="-ccbin /opt/cuda/bin"

RTX 4070 SUPER

Result overview, grouped by the run that produced each result.

Run: NVIDIA RTX 4070 SUPER
opencl-benchmark: Memory Bandwidth Coalesced Write: 455.01
opencl-benchmark: INT8 Compute: 14.307
opencl-benchmark: Memory Bandwidth Coalesced Read: 464.86
opencl-benchmark: INT16 Compute: 17.170
opencl-benchmark: INT32 Compute: 19.889
opencl-benchmark: INT64 Compute: 4.214
gpuowl: 332220523: 137.44
gpuowl: 57885161: 869.07
gpuowl: 77936867: 646.41
opencl-benchmark: FP32 Compute: 38.594
opencl-benchmark: FP64 Compute: 0.621

Run: RTX 4070 SUPER
pytorch: NVIDIA CUDA GPU - 1 - ResNet-152: 201.94
pytorch: NVIDIA CUDA GPU - 16 - ResNet-50: 509.45
pytorch: NVIDIA CUDA GPU - 64 - ResNet-50: 507.45
pytorch: NVIDIA CUDA GPU - 512 - ResNet-50: 504.27
pytorch: NVIDIA CUDA GPU - 32 - ResNet-152: 195.39
pytorch: NVIDIA CUDA GPU - 256 - ResNet-152: 194.58
pytorch: NVIDIA CUDA GPU - 1 - Efficientnet_v2_l: 106.37
pytorch: NVIDIA CUDA GPU - 32 - Efficientnet_v2_l: 102.60
pytorch: NVIDIA CUDA GPU - 256 - Efficientnet_v2_l: 103.17
pytorch: NVIDIA CUDA GPU - 512 - Efficientnet_v2_l: 103.57

Run: NVIDIA 4070 SUPER
rodinia: OpenCL Particle Filter: 3.480
blender: BMW27 - NVIDIA OptiX: 5.57
blender: Classroom - NVIDIA OptiX: 12.60
blender: Fishy Cat - NVIDIA OptiX: 9.45
blender: Barbershop - NVIDIA OptiX: 51.30
blender: Pabellon Barcelona - NVIDIA OptiX: 14.29
neatbench: GPU: 4070
indigobench: OpenCL GPU - Bedroom: 19.80
indigobench: OpenCL GPU - Supercar: 52.81
luxcorerender: DLSC - GPU: 13.59
luxcorerender: Danish Mood - GPU: 10.56
luxcorerender: Orange Juice - GPU: 11.72
luxcorerender: LuxCore Benchmark - GPU: 12.82
luxcorerender: Rainbow Colors and Prism - GPU: 27.67
fahbench: 366.0576
hashcat: MD5: 67583033333
hashcat: SHA1: 22132600000
hashcat: 7-Zip: 1176467
hashcat: SHA-512: 3232733333
hashcat: TrueCrypt RIPEMD160 + XTS: 802967
namd-cuda: ATPase Simulation - 327,506 Atoms: 0.06791
octanebench: Total Score: 720.973789
financebench: Black-Scholes OpenCL: 5.912
cl-mem: Copy: 331.8
cl-mem: Read: 446.2
cl-mem: Write: 407.5
clpeak: Integer Compute INT: 18170.54
clpeak: Single-Precision Float: 35492.69
clpeak: Double-Precision Double: 630.11
clpeak: Global Memory Bandwidth: 437.65
mandelgpu: GPU: 587219538.2
viennacl: CPU BLAS - sCOPY: 132
viennacl: CPU BLAS - sAXPY: 156
viennacl: CPU BLAS - sDOT: 165
viennacl: CPU BLAS - dCOPY: 70.8
viennacl: CPU BLAS - dAXPY: 87.2
viennacl: CPU BLAS - dDOT: 96.8
viennacl: CPU BLAS - dGEMV-N: 102
viennacl: CPU BLAS - dGEMV-T: 109
viennacl: CPU BLAS - dGEMM-NN: 119
viennacl: CPU BLAS - dGEMM-NT: 117
viennacl: CPU BLAS - dGEMM-TN: 115
viennacl: CPU BLAS - dGEMM-TT: 122
viennacl: OpenCL BLAS - sCOPY: 334
viennacl: OpenCL BLAS - sAXPY: 392
viennacl: OpenCL BLAS - sDOT: 370
viennacl: OpenCL BLAS - dCOPY: 423
viennacl: OpenCL BLAS - dAXPY: 437
viennacl: OpenCL BLAS - dDOT: 458
viennacl: OpenCL BLAS - dGEMV-N: 210
viennacl: OpenCL BLAS - dGEMV-T: 389
viennacl: OpenCL BLAS - dGEMM-NN: 577
viennacl: OpenCL BLAS - dGEMM-NT: 584
viennacl: OpenCL BLAS - dGEMM-TN: 599
viennacl: OpenCL BLAS - dGEMM-TT: 613
realsr-ncnn: 4x - No: 6.323
realsr-ncnn: 4x - Yes: 34.885
vkresample: 2x - Double: 339.593
vkresample: 2x - Single: 18.489
waifu2x-ncnn: 2x - 3 - Yes: 2.855

ProjectPhysX OpenCL-Benchmark

Operation: Memory Bandwidth Coalesced Write

ProjectPhysX OpenCL-Benchmark 1.2: GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 455.01 (SE +/- 0.14, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL
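
Results on this page are reported with "SE +/- x, N = n". A minimal sketch of how such a figure can be derived, assuming SE denotes the standard error of the mean (sample standard deviation divided by the square root of N). The exact formula used by the Phoronix Test Suite is an assumption here, and the three sample values are hypothetical, not the raw runs behind this result.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run bandwidth readings in GB/s (illustrative only).
runs = [454.8, 455.0, 455.3]
mean = statistics.mean(runs)
se = standard_error(runs)
print(f"{mean:.2f} GB/s, SE +/- {se:.2f}, N = {len(runs)}")
```

A small SE relative to the mean indicates that the individual runs agreed closely with one another.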

ProjectPhysX OpenCL-Benchmark

Operation: INT8 Compute

ProjectPhysX OpenCL-Benchmark 1.2: TIOPs/s, More Is Better
NVIDIA RTX 4070 SUPER: 14.31 (SE +/- 0.05, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: Memory Bandwidth Coalesced Read

ProjectPhysX OpenCL-Benchmark 1.2: GB/s, More Is Better
NVIDIA RTX 4070 SUPER: 464.86 (SE +/- 0.01, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT16 Compute

ProjectPhysX OpenCL-Benchmark 1.2: TIOPs/s, More Is Better
NVIDIA RTX 4070 SUPER: 17.17 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT32 Compute

ProjectPhysX OpenCL-Benchmark 1.2: TIOPs/s, More Is Better
NVIDIA RTX 4070 SUPER: 19.89 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

ProjectPhysX OpenCL-Benchmark

Operation: INT64 Compute

ProjectPhysX OpenCL-Benchmark 1.2: TIOPs/s, More Is Better
NVIDIA RTX 4070 SUPER: 4.214 (SE +/- 0.015, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-152

PyTorch 2.1: batches/sec, More Is Better
RTX 4070 SUPER: 201.94 (MIN: 183.53 / MAX: 206.5)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-50

PyTorch 2.1: batches/sec, More Is Better
RTX 4070 SUPER: 509.45 (MIN: 430.1 / MAX: 516.48)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-50

PyTorch 2.1: batches/sec, More Is Better
RTX 4070 SUPER: 507.45 (SE +/- 0.92, N = 3; MIN: 423.41 / MAX: 512.88)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-50

PyTorch 2.1: batches/sec, More Is Better
RTX 4070 SUPER: 504.27 (SE +/- 4.43, N = 2; MIN: 418.22 / MAX: 512.44)

GpuOwl

Exponent: 332220523

GpuOwl 7.2.1: Iterations / Second, More Is Better
NVIDIA RTX 4070 SUPER: 137.44 (SE +/- 0.00, N = 3)

GpuOwl

Exponent: 57885161

GpuOwl 7.2.1: Iterations / Second, More Is Better
NVIDIA RTX 4070 SUPER: 869.07 (SE +/- 1.26, N = 3)

GpuOwl

Exponent: 77936867

GpuOwl 7.2.1: Iterations / Second, More Is Better
NVIDIA RTX 4070 SUPER: 646.41 (SE +/- 0.00, N = 3)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-152

PyTorch 2.1: batches/sec, More Is Better
RTX 4070 SUPER: 195.39 (MIN: 183.94 / MAX: 198.7)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-152

PyTorch 2.1: batches/sec, More Is Better
RTX 4070 SUPER: 194.58 (SE +/- 1.14, N = 2; MIN: 183.74 / MAX: 198.52)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: Efficientnet_v2_l

PyTorch 2.1: batches/sec, More Is Better
RTX 4070 SUPER: 106.37 (MIN: 97.91 / MAX: 108.16)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: Efficientnet_v2_l

PyTorch 2.1: batches/sec, More Is Better
RTX 4070 SUPER: 102.60 (MIN: 94.84 / MAX: 104.25)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: Efficientnet_v2_l

PyTorch 2.1: batches/sec, More Is Better
RTX 4070 SUPER: 103.17 (MIN: 95.79 / MAX: 105.15)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: Efficientnet_v2_l

PyTorch 2.1: batches/sec, More Is Better
RTX 4070 SUPER: 103.57 (MIN: 95.95 / MAX: 105.54)

Rodinia

Test: OpenCL Particle Filter

Rodinia 3.1: Seconds, Fewer Is Better
NVIDIA 4070 SUPER: 3.480 (SE +/- 0.039, N = 4)
(CXX) g++ options: -O2 -lOpenCL

Blender

Blend File: BMW27 - Compute: NVIDIA OptiX

Blender 4.0: Seconds, Fewer Is Better
NVIDIA 4070 SUPER: 5.57 (SE +/- 0.06, N = 13)

Blender

Blend File: Classroom - Compute: NVIDIA OptiX

Blender 4.0: Seconds, Fewer Is Better
NVIDIA 4070 SUPER: 12.60 (SE +/- 0.00, N = 3)

Blender

Blend File: Fishy Cat - Compute: NVIDIA OptiX

Blender 4.0: Seconds, Fewer Is Better
NVIDIA 4070 SUPER: 9.45 (SE +/- 0.06, N = 13)

Blender

Blend File: Barbershop - Compute: NVIDIA OptiX

Blender 4.0: Seconds, Fewer Is Better
NVIDIA 4070 SUPER: 51.30 (SE +/- 0.10, N = 3)

Blender

Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX

Blender 4.0: Seconds, Fewer Is Better
NVIDIA 4070 SUPER: 14.29 (SE +/- 0.03, N = 3)

NeatBench

Acceleration: GPU

NeatBench 5: FPS, More Is Better
NVIDIA 4070 SUPER: 4070 (SE +/- 0.00, N = 3)

IndigoBench

Acceleration: OpenCL GPU - Scene: Bedroom

IndigoBench 4.4: M samples/s, More Is Better
NVIDIA 4070 SUPER: 19.80 (SE +/- 0.01, N = 3)

IndigoBench

Acceleration: OpenCL GPU - Scene: Supercar

IndigoBench 4.4: M samples/s, More Is Better
NVIDIA 4070 SUPER: 52.81 (SE +/- 0.03, N = 3)

LuxCoreRender

Scene: DLSC - Acceleration: GPU

LuxCoreRender 2.6: M samples/sec, More Is Better
NVIDIA 4070 SUPER: 13.59 (SE +/- 0.01, N = 3; MIN: 12.52 / MAX: 13.84)

LuxCoreRender

Scene: Danish Mood - Acceleration: GPU

LuxCoreRender 2.6: M samples/sec, More Is Better
NVIDIA 4070 SUPER: 10.56 (SE +/- 0.08, N = 3; MIN: 3.7 / MAX: 12.17)

ProjectPhysX OpenCL-Benchmark

Operation: FP32 Compute

ProjectPhysX OpenCL-Benchmark 1.2: TFLOPs/s, More Is Better
NVIDIA RTX 4070 SUPER: 38.59 (SE +/- 0.03, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL

LuxCoreRender

Scene: Orange Juice - Acceleration: GPU

LuxCoreRender 2.6: M samples/sec, More Is Better
NVIDIA 4070 SUPER: 11.72 (SE +/- 0.07, N = 3; MIN: 9.6 / MAX: 15.44)

LuxCoreRender

Scene: LuxCore Benchmark - Acceleration: GPU

LuxCoreRender 2.6: M samples/sec, More Is Better
NVIDIA 4070 SUPER: 12.82 (SE +/- 0.02, N = 3; MIN: 4.84 / MAX: 14.62)

LuxCoreRender

Scene: Rainbow Colors and Prism - Acceleration: GPU

LuxCoreRender 2.6: M samples/sec, More Is Better
NVIDIA 4070 SUPER: 27.67 (SE +/- 0.03, N = 3; MIN: 24.87 / MAX: 29.03)

FAHBench

FAHBench 2.3.2: Ns Per Day, More Is Better
NVIDIA 4070 SUPER: 366.06 (SE +/- 0.39, N = 3)

Hashcat

Benchmark: MD5

Hashcat 6.2.4: H/s, More Is Better
NVIDIA 4070 SUPER: 67583033333 (SE +/- 22430807.19, N = 3)

Hashcat

Benchmark: SHA1

Hashcat 6.2.4: H/s, More Is Better
NVIDIA 4070 SUPER: 22132600000 (SE +/- 5140363.15, N = 3)

Hashcat

Benchmark: 7-Zip

Hashcat 6.2.4: H/s, More Is Better
NVIDIA 4070 SUPER: 1176467 (SE +/- 1991.93, N = 3)

Hashcat

Benchmark: SHA-512

Hashcat 6.2.4: H/s, More Is Better
NVIDIA 4070 SUPER: 3232733333 (SE +/- 1530068.99, N = 3)

Hashcat

Benchmark: TrueCrypt RIPEMD160 + XTS

Hashcat 6.2.4: H/s, More Is Better
NVIDIA 4070 SUPER: 802967 (SE +/- 633.33, N = 3)
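
The Hashcat results are printed as raw H/s, which is hard to compare across algorithms by eye. A small illustrative helper for scaling them to kH/s, MH/s, or GH/s; the function name `format_hash_rate` is hypothetical, and only the input values come from this page.

```python
def format_hash_rate(hashes_per_second):
    """Scale a raw H/s figure to the largest unit that keeps the value below 1000."""
    units = ["H/s", "kH/s", "MH/s", "GH/s", "TH/s"]
    value = float(hashes_per_second)
    for unit in units:
        if value < 1000 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1000


# Raw H/s results reported above for the NVIDIA 4070 SUPER run.
hashcat_results = {
    "MD5": 67583033333,
    "SHA1": 22132600000,
    "7-Zip": 1176467,
    "SHA-512": 3232733333,
    "TrueCrypt RIPEMD160 + XTS": 802967,
}

for name, rate in hashcat_results.items():
    print(f"{name}: {format_hash_rate(rate)}")
```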

NAMD CUDA

ATPase Simulation - 327,506 Atoms

NAMD CUDA 2.14: days/ns, Fewer Is Better
NVIDIA 4070 SUPER: 0.06791 (SE +/- 0.00031, N = 3)
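
NAMD reports days/ns (fewer is better), while simulation throughput is often quoted as ns/day. The two are reciprocals, so the conversion can be sketched as follows; the function name is illustrative only.

```python
def days_per_ns_to_ns_per_day(days_per_ns):
    """Convert a days/ns figure into ns/day (its reciprocal)."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# days/ns result reported above for the ATPase simulation.
namd_days_per_ns = 0.06791
ns_per_day = days_per_ns_to_ns_per_day(namd_days_per_ns)
print(f"{ns_per_day:.2f} ns/day")
```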

OctaneBench

Total Score

OctaneBench 2020.1: Score, More Is Better
NVIDIA 4070 SUPER: 720.97

FinanceBench

Benchmark: Black-Scholes OpenCL

FinanceBench 2016-07-25: ms, Fewer Is Better
NVIDIA 4070 SUPER: 5.912 (SE +/- 0.114, N = 15)
(CXX) g++ options: -O3 -march=native -fopenmp

cl-mem

Benchmark: Copy

cl-mem 2017-01-13: GB/s, More Is Better
NVIDIA 4070 SUPER: 331.8 (SE +/- 0.03, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Read

cl-mem 2017-01-13: GB/s, More Is Better
NVIDIA 4070 SUPER: 446.2 (SE +/- 0.12, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

cl-mem

Benchmark: Write

cl-mem 2017-01-13: GB/s, More Is Better
NVIDIA 4070 SUPER: 407.5 (SE +/- 1.11, N = 3)
(CC) gcc options: -O2 -flto -lOpenCL

clpeak

OpenCL Test: Integer Compute INT

clpeak 1.1.2: GIOPS, More Is Better
NVIDIA 4070 SUPER: 18170.54 (SE +/- 3.14, N = 3)
(CXX) g++ options: -O3

clpeak

OpenCL Test: Single-Precision Float

clpeak 1.1.2: GFLOPS, More Is Better
NVIDIA 4070 SUPER: 35492.69 (SE +/- 0.99, N = 3)
(CXX) g++ options: -O3

clpeak

OpenCL Test: Double-Precision Double

clpeak 1.1.2: GFLOPS, More Is Better
NVIDIA 4070 SUPER: 630.11 (SE +/- 0.98, N = 3)
(CXX) g++ options: -O3

clpeak

OpenCL Test: Global Memory Bandwidth

clpeak 1.1.2: GBPS, More Is Better
NVIDIA 4070 SUPER: 437.65 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3

MandelGPU

OpenCL Device: GPU

MandelGPU 1.3pts1: Samples/sec, More Is Better
NVIDIA 4070 SUPER: 587219538.2 (SE +/- 467034.80, N = 3)
(CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL

ProjectPhysX OpenCL-Benchmark

Operation: FP64 Compute

ProjectPhysX OpenCL-Benchmark 1.2: TFLOPs/s, More Is Better
NVIDIA RTX 4070 SUPER: 0.621 (SE +/- 0.000, N = 3)
(CXX) g++ options: -std=c++17 -pthread -lOpenCL
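
The FP64 result is far below the FP32 Compute result from the same benchmark (38.59 TFLOPs/s). A short sketch of the ratio between the two measured figures; it uses only the values reported on this page and makes no claim about the card's rated throughput.

```python
# Measured ProjectPhysX OpenCL-Benchmark results from this page, in TFLOPs/s.
fp32_tflops = 38.59
fp64_tflops = 0.621

# Ratio of measured single-precision to double-precision throughput.
fp32_to_fp64_ratio = fp32_tflops / fp64_tflops
print(f"FP32 is about {fp32_to_fp64_ratio:.1f}x the FP64 throughput")
```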

ViennaCL

Test: CPU BLAS - sCOPY

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 132 (SE +/- 1.20, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sAXPY

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 156 (SE +/- 2.19, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - sDOT

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 165 (SE +/- 2.73, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dCOPY

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 70.8 (SE +/- 0.32, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dAXPY

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 87.2 (SE +/- 0.12, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dDOT

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 96.8 (SE +/- 0.09, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-N

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 102 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMV-T

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 109 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NN

ViennaCL 1.7.1: GFLOPs/s, More Is Better
NVIDIA 4070 SUPER: 119 (SE +/- 4.04, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-NT

ViennaCL 1.7.1: GFLOPs/s, More Is Better
NVIDIA 4070 SUPER: 117 (SE +/- 2.08, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TN

ViennaCL 1.7.1: GFLOPs/s, More Is Better
NVIDIA 4070 SUPER: 115 (SE +/- 1.00, N = 2)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: CPU BLAS - dGEMM-TT

ViennaCL 1.7.1: GFLOPs/s, More Is Better
NVIDIA 4070 SUPER: 122 (SE +/- 2.08, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sCOPY

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 334 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sAXPY

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 392 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - sDOT

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 370 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dCOPY

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 423 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dAXPY

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 437 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dDOT

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 458 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-N

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 210 (SE +/- 0.33, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMV-T

ViennaCL 1.7.1: GB/s, More Is Better
NVIDIA 4070 SUPER: 389 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NN

ViennaCL 1.7.1: GFLOPs/s, More Is Better
NVIDIA 4070 SUPER: 577 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-NT

ViennaCL 1.7.1: GFLOPs/s, More Is Better
NVIDIA 4070 SUPER: 584 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TN

ViennaCL 1.7.1: GFLOPs/s, More Is Better
NVIDIA 4070 SUPER: 599 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL

ViennaCL

Test: OpenCL BLAS - dGEMM-TT

ViennaCL 1.7.1: GFLOPs/s, More Is Better
NVIDIA 4070 SUPER: 613 (SE +/- 0.00, N = 3)
(CXX) g++ options: -fopenmp -O3 -rdynamic -lOpenCL
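
The ViennaCL results cover the same twelve operations on the CPU BLAS and OpenCL BLAS back ends, so a per-operation ratio gives a quick view of the gap between the two. A minimal sketch using only the figures reported above; the units differ between operations (GB/s or GFLOPs/s) but cancel within each ratio.

```python
# ViennaCL 1.7.1 results from this page (GB/s, or GFLOPs/s for the dGEMM tests).
cpu_blas = {
    "sCOPY": 132, "sAXPY": 156, "sDOT": 165,
    "dCOPY": 70.8, "dAXPY": 87.2, "dDOT": 96.8,
    "dGEMV-N": 102, "dGEMV-T": 109,
    "dGEMM-NN": 119, "dGEMM-NT": 117, "dGEMM-TN": 115, "dGEMM-TT": 122,
}
opencl_blas = {
    "sCOPY": 334, "sAXPY": 392, "sDOT": 370,
    "dCOPY": 423, "dAXPY": 437, "dDOT": 458,
    "dGEMV-N": 210, "dGEMV-T": 389,
    "dGEMM-NN": 577, "dGEMM-NT": 584, "dGEMM-TN": 599, "dGEMM-TT": 613,
}

# Ratio of the OpenCL result to the CPU result for each operation.
speedup = {op: opencl_blas[op] / cpu_blas[op] for op in cpu_blas}

for op, ratio in speedup.items():
    print(f"{op}: {ratio:.2f}x")
```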

RealSR-NCNN

Scale: 4x - TAA: No

RealSR-NCNN 20200818: Seconds, Fewer Is Better
NVIDIA 4070 SUPER: 6.323 (SE +/- 0.150, N = 15)

RealSR-NCNN

Scale: 4x - TAA: Yes

RealSR-NCNN 20200818: Seconds, Fewer Is Better
NVIDIA 4070 SUPER: 34.89 (SE +/- 0.02, N = 3)

VkResample

Upscale: 2x - Precision: Double

VkResample 1.0: ms, Fewer Is Better
NVIDIA 4070 SUPER: 339.59 (SE +/- 0.30, N = 3)
(CXX) g++ options: -O3

VkResample

Upscale: 2x - Precision: Single

VkResample 1.0: ms, Fewer Is Better
NVIDIA 4070 SUPER: 18.49 (SE +/- 0.00, N = 3)
(CXX) g++ options: -O3

Waifu2x-NCNN Vulkan

Scale: 2x - Denoise: 3 - TAA: Yes

Waifu2x-NCNN Vulkan 20200818: Seconds, Fewer Is Better
NVIDIA 4070 SUPER: 2.855 (SE +/- 0.014, N = 3)


Phoronix Test Suite v10.8.4