Intel Core i9-13900K testing with an ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS) and MSI NVIDIA GeForce RTX 4070 12GB on EndeavourOS rolling via the Phoronix Test Suite.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2401299-NE-2401275NE83
RTX 4070 SUPER
NVIDIA RTX 4070 SUPER:
Processor: Intel Core i9-13900K @ 5.50GHz (24 Cores / 32 Threads), Motherboard: ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS), Chipset: Intel Device 7a27, Memory: 32GB, Disk: 4001GB Seagate ZP4000GP304001, Graphics: ASUS NVIDIA GeForce RTX 4070 SUPER 12GB, Audio: Realtek ALC1220, Monitor: ARZOPA, Network: Intel I226-V + Intel Device 7a70
OS: EndeavourOS rolling, Kernel: 6.7.1-arch1-1 (x86_64), Desktop: KDE Plasma 5.27.10, Display Server: X Server 1.21.1.11, Display Driver: NVIDIA 550.40.07, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.4.74, Compiler: GCC 13.2.1 20230801, File-System: ext4, Screen Resolution: 1920x1080
NVIDIA RTX 4070:
Processor: Intel Core i9-13900K @ 5.50GHz (24 Cores / 32 Threads), Motherboard: ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS), Chipset: Intel Device 7a27, Memory: 32GB, Disk: 4001GB Seagate ZP4000GP304001, Graphics: MSI NVIDIA GeForce RTX 4070 12GB, Audio: Realtek ALC1220, Monitor: ARZOPA, Network: Intel I226-V + Intel Device 7a70
OS: EndeavourOS rolling, Kernel: 6.7.1-arch1-1 (x86_64), Desktop: KDE Plasma 5.27.10, Display Server: X Server 1.21.1.11, Display Driver: NVIDIA 550.40.07, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.4.74, Compiler: GCC 13.2.1 20230801 + CUDA 12.3, File-System: ext4, Screen Resolution: 1920x1080
clpeak 1.1.2
OpenCL Test: Integer Compute INT
GIOPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 18170.54 |=============================================
NVIDIA RTX 4070 ....... 14555.19 |====================================
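To turn a pair of result rows like the two above into a percentage, the rows can be parsed and compared directly. The following is a minimal sketch: the row layout is inferred from this file rather than from a documented format, and the names `ROW`, `parse_row` and `percent_gain` are hypothetical helpers, not part of the Phoronix Test Suite.

```python
import re

# Matches one result row, e.g. "NVIDIA RTX 4070 SUPER . 18170.54 |=====".
# The layout (label, dot padding, value, bar) is inferred from this file only.
ROW = re.compile(r"^(?P<label>.+?)\s+\.+\s+(?P<value>[\d.]+)\s+\|=+$")


def parse_row(line):
    """Return (label, value) for a result row, or None if the line is not a row."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    return m.group("label"), float(m.group("value"))


def percent_gain(new, old):
    """Relative gain of `new` over `old` for a higher-is-better metric."""
    return (new / old - 1.0) * 100.0


rows = [
    "NVIDIA RTX 4070 SUPER . 18170.54 |=============================================",
    "NVIDIA RTX 4070 ....... 14555.19 |====================================",
]
results = dict(parse_row(r) for r in rows)
gain = percent_gain(results["NVIDIA RTX 4070 SUPER"], results["NVIDIA RTX 4070"])
print(f"RTX 4070 SUPER vs RTX 4070: {gain:.1f}% higher")
```

For the clpeak integer result this works out to a gain of roughly 25% for the RTX 4070 SUPER.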
clpeak 1.1.2
OpenCL Test: Single-Precision Float
GFLOPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 35492.69 |=============================================
NVIDIA RTX 4070 ....... 28479.39 |====================================
RealSR-NCNN 20200818
Scale: 4x - TAA: Yes
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 34.89 |=======================================
NVIDIA RTX 4070 ....... 42.85 |================================================
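For "Lower Is Better" results such as the timing above, the comparison has to be inverted before it can be read as a speedup. A small sketch, assuming a hypothetical helper `speedup` that is not part of any tool used here:

```python
def speedup(candidate, baseline, higher_is_better):
    """How many times faster `candidate` is than `baseline`."""
    if higher_is_better:
        return candidate / baseline
    # For times, a smaller value is faster, so the ratio is flipped.
    return baseline / candidate


# RealSR-NCNN, Scale: 4x - TAA: Yes (seconds, lower is better)
super_s, base_s = 34.89, 42.85
ratio = speedup(super_s, base_s, higher_is_better=False)
time_saved = (1.0 - super_s / base_s) * 100.0
print(f"{ratio:.3f}x faster, {time_saved:.1f}% less time")
```

Note that the two figures differ: a run that is about 1.23x faster takes about 19% less time, not 23% less.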
ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 584 |==================================================
NVIDIA RTX 4070 ....... 477 |=========================================
ProjectPhysX OpenCL-Benchmark 1.2
Operation: INT64 Compute
TIOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 4.214 |================================================
NVIDIA RTX 4070 ....... 3.443 |=======================================
clpeak 1.1.2
OpenCL Test: Double-Precision Double
GFLOPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 630.11 |===============================================
NVIDIA RTX 4070 ....... 515.17 |======================================
VkResample 1.0
Upscale: 2x - Precision: Double
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 339.59 |======================================
NVIDIA RTX 4070 ....... 415.16 |===============================================
ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 613 |==================================================
NVIDIA RTX 4070 ....... 502 |=========================================
GpuOwl 7.2.1
Exponent: 332220523
Iterations / Second > Higher Is Better
NVIDIA RTX 4070 SUPER . 137.44 |===============================================
NVIDIA RTX 4070 ....... 112.61 |=======================================
ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 577 |==================================================
NVIDIA RTX 4070 ....... 473 |=========================================
GpuOwl 7.2.1
Exponent: 77936867
Iterations / Second > Higher Is Better
NVIDIA RTX 4070 SUPER . 646.41 |===============================================
NVIDIA RTX 4070 ....... 530.32 |=======================================
ProjectPhysX OpenCL-Benchmark 1.2
Operation: FP64 Compute
TFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 0.621 |================================================
NVIDIA RTX 4070 ....... 0.510 |=======================================
Hashcat 6.2.4
Benchmark: SHA1
H/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 22132600000 |==========================================
NVIDIA RTX 4070 ....... 18202466667 |===================================
GpuOwl 7.2.1
Exponent: 57885161
Iterations / Second > Higher Is Better
NVIDIA RTX 4070 SUPER . 869.07 |===============================================
NVIDIA RTX 4070 ....... 714.80 |=======================================
ProjectPhysX OpenCL-Benchmark 1.2
Operation: FP32 Compute
TFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 38.59 |================================================
NVIDIA RTX 4070 ....... 31.77 |========================================
Hashcat 6.2.4
Benchmark: TrueCrypt RIPEMD160 + XTS
H/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 802967 |===============================================
NVIDIA RTX 4070 ....... 660967 |=======================================
ProjectPhysX OpenCL-Benchmark 1.2
Operation: INT32 Compute
TIOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 19.89 |================================================
NVIDIA RTX 4070 ....... 16.38 |========================================
ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 599 |==================================================
NVIDIA RTX 4070 ....... 494 |=========================================
Hashcat 6.2.4
Benchmark: SHA-512
H/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 3232733333 |===========================================
NVIDIA RTX 4070 ....... 2673300000 |====================================
Hashcat 6.2.4
Benchmark: 7-Zip
H/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 1176467 |==============================================
NVIDIA RTX 4070 ....... 976967 |======================================
Hashcat 6.2.4
Benchmark: MD5
H/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 67583033333 |==========================================
NVIDIA RTX 4070 ....... 56147866667 |===================================
ProjectPhysX OpenCL-Benchmark 1.2
Operation: INT16 Compute
TIOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 17.17 |================================================
NVIDIA RTX 4070 ....... 14.28 |========================================
LuxCoreRender 2.6
Scene: Rainbow Colors and Prism - Acceleration: GPU
M samples/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 27.67 |================================================
NVIDIA RTX 4070 ....... 23.26 |========================================
LuxCoreRender 2.6
Scene: Danish Mood - Acceleration: GPU
M samples/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 10.56 |================================================
NVIDIA RTX 4070 ....... 8.89 |========================================
Libplacebo 5.229.1
Test: deband_heavy
FPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 2186.70 |==============================================
NVIDIA RTX 4070 ....... 1847.98 |=======================================
NVIDIA RTX 4070 ....... 1844.08 |=======================================
NVIDIA RTX 4070 ....... 1843.26 |=======================================
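Where a test lists several rows under the same label, as above, one way to compare is to average them first and report the spread. This sketch assumes the three "NVIDIA RTX 4070" rows are repeat runs of the same card, which the file itself does not state.

```python
from statistics import mean

# Libplacebo deband_heavy, FPS (higher is better).
# Assumption: the three identically labelled rows are repeat runs of one card.
rtx_4070_runs = [1847.98, 1844.08, 1843.26]
rtx_4070_super = 2186.70

avg = mean(rtx_4070_runs)
# Spread between the fastest and slowest run, relative to the mean.
spread_pct = (max(rtx_4070_runs) - min(rtx_4070_runs)) / avg * 100.0
gain_pct = (rtx_4070_super / avg - 1.0) * 100.0
print(f"mean={avg:.2f} FPS, spread={spread_pct:.2f}%, SUPER gain={gain_pct:.1f}%")
```

Under that assumption the run-to-run spread is a fraction of a percent, far smaller than the gap to the RTX 4070 SUPER.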
Libplacebo 5.229.1
Test: polar_nocompute
FPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 2327.55 |==============================================
NVIDIA RTX 4070 ....... 1972.78 |=======================================
NVIDIA RTX 4070 ....... 1969.19 |=======================================
NVIDIA RTX 4070 ....... 1968.37 |=======================================
ProjectPhysX OpenCL-Benchmark 1.2
Operation: INT8 Compute
TIOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 14.31 |================================================
NVIDIA RTX 4070 ....... 12.12 |=========================================
Blender 4.0
Blend File: Classroom - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 12.60 |=========================================
NVIDIA RTX 4070 ....... 14.86 |================================================
Rodinia 3.1
Test: OpenCL Particle Filter
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 3.480 |=========================================
NVIDIA RTX 4070 ....... 4.098 |================================================
LuxCoreRender 2.6
Scene: LuxCore Benchmark - Acceleration: GPU
M samples/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 12.82 |================================================
NVIDIA RTX 4070 ....... 10.92 |=========================================
Blender 4.0
Blend File: Fishy Cat - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 9.45 |=========================================
NVIDIA RTX 4070 ....... 11.03 |================================================
VkFFT 1.2.31
Test: FFT + iFFT R2C / C2R
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 54794 |================================================
NVIDIA RTX 4070 ....... 47097 |=========================================
Blender 4.0
Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 14.29 |=========================================
NVIDIA RTX 4070 ....... 16.55 |================================================
LuxCoreRender 2.6
Scene: DLSC - Acceleration: GPU
M samples/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 13.59 |================================================
NVIDIA RTX 4070 ....... 11.74 |=========================================
FAHBench 2.3.2
Ns Per Day > Higher Is Better
NVIDIA RTX 4070 SUPER . 366.06 |===============================================
NVIDIA RTX 4070 ....... 317.20 |=========================================
VkFFT 1.2.31
Test: FFT + iFFT C2C Bluestein benchmark in double precision
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 4451 |=================================================
NVIDIA RTX 4070 ....... 3886 |===========================================
Blender 4.0
Blend File: Barbershop - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 51.30 |==========================================
NVIDIA RTX 4070 ....... 58.44 |================================================
MandelGPU 1.3pts1
OpenCL Device: GPU
Samples/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 587219538.2 |==========================================
NVIDIA RTX 4070 ....... 516770131.2 |=====================================
LuxCoreRender 2.6
Scene: Orange Juice - Acceleration: GPU
M samples/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 11.72 |================================================
NVIDIA RTX 4070 ....... 10.40 |===========================================
Blender 4.0
Blend File: BMW27 - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 5.57 |============================================
NVIDIA RTX 4070 ....... 6.21 |=================================================
OctaneBench 2020.1
Total Score
Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 720.97 |===============================================
NVIDIA RTX 4070 ....... 648.00 |==========================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-50
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 509.45 |===============================================
NVIDIA RTX 4070 ....... 458.39 |==========================================
Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: Yes
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 2.855 |===========================================
NVIDIA RTX 4070 ....... 3.168 |================================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-50
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 507.45 |===============================================
NVIDIA RTX 4070 ....... 458.36 |==========================================
VkFFT 1.2.31
Test: FFT + iFFT C2C Bluestein in single precision
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 15166 |================================================
NVIDIA RTX 4070 ....... 13714 |===========================================
NAMD CUDA 2.14
ATPase Simulation - 327,506 Atoms
days/ns < Lower Is Better
NVIDIA RTX 4070 SUPER . 0.06791 |==========================================
NVIDIA RTX 4070 ....... 0.07498 |==============================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-50
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 504.27 |===============================================
NVIDIA RTX 4070 ....... 459.27 |===========================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-50
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 504.67 |===============================================
NVIDIA RTX 4070 ....... 459.93 |===========================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-50
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 501.50 |===============================================
NVIDIA RTX 4070 ....... 459.94 |===========================================
IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Supercar
M samples/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 52.81 |================================================
NVIDIA RTX 4070 ....... 48.52 |============================================
IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Bedroom
M samples/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 19.80 |================================================
NVIDIA RTX 4070 ....... 18.20 |============================================
VkFFT 1.2.31
Test: FFT + iFFT C2C 1D batched in double precision
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 24317 |================================================
NVIDIA RTX 4070 ....... 22390 |============================================
VkFFT 1.2.31
Test: FFT + iFFT C2C multidimensional in single precision
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 50299 |================================================
NVIDIA RTX 4070 ....... 47212 |=============================================
VkFFT 1.2.31
Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 75078 |==============================================
NVIDIA RTX 4070 ....... 79057 |================================================
ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 115 |================================================
NVIDIA RTX 4070 ....... 121 |==================================================
VkFFT 1.2.31
Test: FFT + iFFT C2C 1D batched in single precision
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 73929 |==============================================
NVIDIA RTX 4070 ....... 77774 |================================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-152
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 196.07 |===============================================
NVIDIA RTX 4070 ....... 186.63 |=============================================
VkFFT 1.2.31
Test: FFT + iFFT C2C 1D batched in half precision
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 131705 |=============================================
NVIDIA RTX 4070 ....... 137762 |===============================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-152
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 195.40 |===============================================
NVIDIA RTX 4070 ....... 187.26 |=============================================
ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 117 |================================================
NVIDIA RTX 4070 ....... 122 |==================================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-152
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 195.30 |===============================================
NVIDIA RTX 4070 ....... 187.51 |=============================================
ViennaCL 1.7.1
Test: OpenCL BLAS - dAXPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 437 |================================================
NVIDIA RTX 4070 ....... 455 |==================================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-152
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 195.39 |===============================================
NVIDIA RTX 4070 ....... 187.69 |=============================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-152
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 194.58 |===============================================
NVIDIA RTX 4070 ....... 187.27 |=============================================
ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 122 |==================================================
NVIDIA RTX 4070 ....... 118 |================================================
VkResample 1.0
Upscale: 2x - Precision: Single
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 18.49 |================================================
NVIDIA RTX 4070 ....... 18.02 |===============================================
ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 119 |=================================================
NVIDIA RTX 4070 ....... 122 |==================================================
ViennaCL 1.7.1
Test: OpenCL BLAS - sDOT
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 370 |==================================================
NVIDIA RTX 4070 ....... 362 |=================================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 103.57 |===============================================
NVIDIA RTX 4070 ....... 101.43 |==============================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-50
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 557.73 |===============================================
NVIDIA RTX 4070 ....... 546.76 |==============================================
ViennaCL 1.7.1
Test: CPU BLAS - sAXPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 156 |==================================================
NVIDIA RTX 4070 ....... 153 |=================================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 103.17 |===============================================
NVIDIA RTX 4070 ....... 101.24 |==============================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-152
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 201.94 |===============================================
NVIDIA RTX 4070 ....... 198.18 |==============================================
Libplacebo 5.229.1
Test: av1_grain_lap
FPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 4171.00 |==============================================
NVIDIA RTX 4070 ....... 4103.40 |=============================================
NVIDIA RTX 4070 ....... 4126.40 |==============================================
NVIDIA RTX 4070 ....... 4152.41 |==============================================
TensorFlow 2.12
Device: GPU - Batch Size: 16 - Model: VGG-16
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 1.48 |================================================
NVIDIA RTX 4070 ....... 1.50 |=================================================
TensorFlow 2.12
Device: GPU - Batch Size: 1 - Model: GoogLeNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 12.62 |===============================================
NVIDIA RTX 4070 ....... 12.78 |================================================
ViennaCL 1.7.1
Test: OpenCL BLAS - sCOPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 334 |==================================================
NVIDIA RTX 4070 ....... 330 |=================================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 106.37 |==============================================
NVIDIA RTX 4070 ....... 107.59 |===============================================
Libplacebo 5.229.1
Test: hdr_lut
FPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 3905.98 |==============================================
NVIDIA RTX 4070 ....... 3927.11 |==============================================
NVIDIA RTX 4070 ....... 3940.40 |==============================================
NVIDIA RTX 4070 ....... 3946.90 |==============================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 102.60 |===============================================
NVIDIA RTX 4070 ....... 101.55 |===============================================
ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-N
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 102 |==================================================
NVIDIA RTX 4070 ....... 103 |==================================================
ProjectPhysX OpenCL-Benchmark 1.2
Operation: Memory Bandwidth Coalesced Write
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 455.01 |===============================================
NVIDIA RTX 4070 ....... 459.43 |===============================================
TensorFlow 2.12
Device: GPU - Batch Size: 1 - Model: AlexNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 13.92 |================================================
NVIDIA RTX 4070 ....... 14.04 |================================================
ViennaCL 1.7.1
Test: OpenCL BLAS - sAXPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 392 |==================================================
NVIDIA RTX 4070 ....... 389 |==================================================
ViennaCL 1.7.1
Test: CPU BLAS - sCOPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 132 |==================================================
NVIDIA RTX 4070 ....... 131 |==================================================
TensorFlow 2.12
Device: GPU - Batch Size: 1 - Model: VGG-16
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 1.35 |=================================================
NVIDIA RTX 4070 ....... 1.36 |=================================================
TensorFlow 2.12
Device: GPU - Batch Size: 32 - Model: ResNet-50
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 5.51 |=================================================
NVIDIA RTX 4070 ....... 5.55 |=================================================
ViennaCL 1.7.1
Test: CPU BLAS - sDOT
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 165 |==================================================
NVIDIA RTX 4070 ....... 166 |==================================================
TensorFlow 2.12
Device: GPU - Batch Size: 16 - Model: ResNet-50
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 5.46 |=================================================
NVIDIA RTX 4070 ....... 5.49 |=================================================
ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-T
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 389 |==================================================
NVIDIA RTX 4070 ....... 387 |==================================================
ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-N
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 210 |==================================================
NVIDIA RTX 4070 ....... 209 |==================================================
ViennaCL 1.7.1
Test: CPU BLAS - dAXPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 87.2 |=================================================
NVIDIA RTX 4070 ....... 86.8 |=================================================
cl-mem 2017-01-13
Benchmark: Copy
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 331.8 |================================================
NVIDIA RTX 4070 ....... 330.3 |================================================
TensorFlow 2.12
Device: GPU - Batch Size: 16 - Model: AlexNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 31.59 |================================================
NVIDIA RTX 4070 ....... 31.45 |================================================
ViennaCL 1.7.1
Test: OpenCL BLAS - dDOT
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 458 |==================================================
NVIDIA RTX 4070 ....... 456 |==================================================
TensorFlow 2.12
Device: GPU - Batch Size: 512 - Model: AlexNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 35.10 |================================================
NVIDIA RTX 4070 ....... 35.21 |================================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 102.60 |===============================================
NVIDIA RTX 4070 ....... 102.90 |===============================================
ViennaCL 1.7.1
Test: CPU BLAS - dCOPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 70.8 |=================================================
NVIDIA RTX 4070 ....... 71.0 |=================================================
TensorFlow 2.12
Device: GPU - Batch Size: 32 - Model: AlexNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 33.40 |================================================
NVIDIA RTX 4070 ....... 33.32 |================================================
TensorFlow 2.12
Device: GPU - Batch Size: 1 - Model: ResNet-50
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 4.35 |=================================================
NVIDIA RTX 4070 ....... 4.34 |=================================================
cl-mem 2017-01-13
Benchmark: Write
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 407.5 |================================================
NVIDIA RTX 4070 ....... 406.7 |================================================
TensorFlow 2.12
Device: GPU - Batch Size: 64 - Model: GoogLeNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 15.52 |================================================
NVIDIA RTX 4070 ....... 15.54 |================================================
TensorFlow 2.12
Device: GPU - Batch Size: 32 - Model: GoogLeNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 15.61 |================================================
NVIDIA RTX 4070 ....... 15.63 |================================================
TensorFlow 2.12
Device: GPU - Batch Size: 64 - Model: AlexNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 33.97 |================================================
NVIDIA RTX 4070 ....... 33.93 |================================================
ViennaCL 1.7.1
Test: CPU BLAS - dDOT
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 96.8 |=================================================
NVIDIA RTX 4070 ....... 96.7 |=================================================
clpeak 1.1.2
OpenCL Test: Global Memory Bandwidth
GBPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 437.65 |===============================================
NVIDIA RTX 4070 ....... 437.21 |===============================================
ProjectPhysX OpenCL-Benchmark 1.2
Operation: Memory Bandwidth Coalesced Read
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 464.86 |===============================================
NVIDIA RTX 4070 ....... 465.18 |===============================================
TensorFlow 2.12
Device: GPU - Batch Size: 16 - Model: GoogLeNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 15.67 |================================================
NVIDIA RTX 4070 ....... 15.66 |================================================
cl-mem 2017-01-13
Benchmark: Read
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 446.2 |================================================
NVIDIA RTX 4070 ....... 446.3 |================================================
TensorFlow 2.12
Device: GPU - Batch Size: 64 - Model: ResNet-50
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 5.55 |=================================================
NVIDIA RTX 4070 ....... 5.55 |=================================================
TensorFlow 2.12
Device: GPU - Batch Size: 256 - Model: AlexNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 34.16 |================================================
TensorFlow 2.12
Device: GPU - Batch Size: 64 - Model: VGG-16
images/sec > Higher Is Better
NVIDIA RTX 4070 ....... 1.50 |=================================================
TensorFlow 2.12
Device: GPU - Batch Size: 32 - Model: VGG-16
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 1.50 |=================================================
NVIDIA RTX 4070 ....... 1.50 |=================================================
NeatBench 5
Acceleration: GPU
FPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 4070 |=================================================
NVIDIA RTX 4070 ....... 4070 |=================================================
ViennaCL 1.7.1
Test: OpenCL BLAS - dCOPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 423 |==================================================
NVIDIA RTX 4070 ....... 423 |==================================================
ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-T
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 109 |==================================================
NVIDIA RTX 4070 ....... 109 |==================================================
PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
NVIDIA RTX 4070 ....... 103.68 |===============================================
NCNN 20230517
Target: Vulkan GPU - Model: FastestDet
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 2.86 |=================================================
NVIDIA RTX 4070 ....... 2.34 |========================================
NVIDIA RTX 4070 ....... 2.67 |==============================================
NCNN 20230517
Target: Vulkan GPU - Model: vision_transformer
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 844.61 |===============================================
NVIDIA RTX 4070 ....... 281.56 |================
NVIDIA RTX 4070 ....... 382.82 |=====================
NCNN 20230517
Target: Vulkan GPU - Model: regnety_400m
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 11.11 |================================================
NVIDIA RTX 4070 ....... 6.50 |============================
NVIDIA RTX 4070 ....... 6.21 |===========================
NCNN 20230517
Target: Vulkan GPU - Model: squeezenet_ssd
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 6.86 |=================================================
NVIDIA RTX 4070 ....... 5.27 |======================================
NVIDIA RTX 4070 ....... 5.18 |=====================================
NCNN 20230517
Target: Vulkan GPU - Model: yolov4-tiny
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 63.82 |================================================
NVIDIA RTX 4070 ....... 25.11 |===================
NVIDIA RTX 4070 ....... 20.74 |================
NCNN 20230517
Target: Vulkan GPU - Model: resnet50
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 46.26 |================================================
NVIDIA RTX 4070 ....... 8.24 |=========
NVIDIA RTX 4070 ....... 8.72 |=========
NCNN 20230517
Target: Vulkan GPU - Model: alexnet
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 16.17 |================================================
NVIDIA RTX 4070 ....... 9.33 |============================
NVIDIA RTX 4070 ....... 5.78 |=================
NCNN 20230517
Target: Vulkan GPU - Model: resnet18
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 8.97 |=================================================
NVIDIA RTX 4070 ....... 8.58 |===============================================
NVIDIA RTX 4070 ....... 5.11 |============================
NCNN 20230517
Target: Vulkan GPU - Model: vgg16
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 117.81 |===============================================
NVIDIA RTX 4070 ....... 54.54 |======================
NVIDIA RTX 4070 ....... 45.52 |==================
NCNN 20230517
Target: Vulkan GPU - Model: googlenet
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 11.04 |================================================
NVIDIA RTX 4070 ....... 6.06 |==========================
NCNN 20230517
Target: Vulkan GPU - Model: blazeface
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 0.84 |=================================================
NVIDIA RTX 4070 ....... 0.84 |=================================================
NCNN 20230517
Target: Vulkan GPU - Model: efficientnet-b0
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 5.07 |=================================================
NVIDIA RTX 4070 ....... 3.46 |=================================
NVIDIA RTX 4070 ....... 3.59 |===================================
NCNN 20230517
Target: Vulkan GPU - Model: mnasnet
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 3.85 |=================================================
NVIDIA RTX 4070 ....... 2.22 |============================
NVIDIA RTX 4070 ....... 2.24 |=============================
NCNN 20230517
Target: Vulkan GPU - Model: shufflenet-v2
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 2.31 |=================================================
NVIDIA RTX 4070 ....... 2.11 |=============================================
NVIDIA RTX 4070 ....... 2.08 |============================================
NCNN 20230517
Target: Vulkan GPU - Model: mobilenet-v3
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 2.25 |=============
NVIDIA RTX 4070 ....... 8.71 |=================================================
NVIDIA RTX 4070 ....... 2.15 |============
NCNN 20230517
Target: Vulkan GPU - Model: mobilenet-v2
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 3.03 |================================
NVIDIA RTX 4070 ....... 4.69 |=================================================
NVIDIA RTX 4070 ....... 2.48 |==========================
NCNN 20230517
Target: Vulkan GPU - Model: mobilenet
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 8.62 |=========================================
NVIDIA RTX 4070 ....... 10.14 |================================================
NVIDIA RTX 4070 ....... 7.20 |==================================
Libplacebo 5.229.1
Test: hdr_peakdetect
FPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 3292.37 |============================================
NVIDIA RTX 4070 ....... 3310.02 |============================================
NVIDIA RTX 4070 ....... 3452.43 |==============================================
NVIDIA RTX 4070 ....... 3329.26 |============================================
TensorFlow 2.12
Device: GPU - Batch Size: 512 - Model: VGG-16
images/sec > Higher Is Better
TensorFlow 2.12
Device: GPU - Batch Size: 256 - Model: VGG-16
images/sec > Higher Is Better
PlaidML
FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
Examples Per Second > Higher Is Better
PlaidML
FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better
PlaidML
FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better
PlaidML
FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
Examples Per Second > Higher Is Better
PlaidML
FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better
NCNN 20230517
Target: Vulkan GPU
ms < Lower Is Better
Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better
Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better
Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better
Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better
Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better
Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better
FinanceBench 2016-07-25
Benchmark: Black-Scholes OpenCL
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 5.912 |=========================================
NVIDIA RTX 4070 ....... 6.906 |================================================
ArrayFire 3.9
Test: Conjugate Gradient OpenCL
LeelaChessZero 0.30
Backend: OpenCL
Nodes Per Second > Higher Is Better
Libplacebo 5.229.1
FPS > Higher Is Better
Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: No
Seconds < Lower Is Better
RealSR-NCNN 20200818
Scale: 4x - TAA: No
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 6.323 |===========================================
NVIDIA RTX 4070 ....... 7.092 |================================================
vkpeak 20230730
GFLOPS > Higher Is Better