RTX 4070 SUPER

Intel Core i9-13900K testing with an ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS) and ASUS NVIDIA GeForce RTX 4070 SUPER 12GB on EndeavourOS rolling via the Phoronix Test Suite.

NVIDIA RTX 4070 SUPER:

  Processor: Intel Core i9-13900K @ 5.50GHz (24 Cores / 32 Threads), Motherboard: ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS), Chipset: Intel Device 7a27, Memory: 32GB, Disk: 4001GB Seagate ZP4000GP304001, Graphics: ASUS NVIDIA GeForce RTX 4070 SUPER 12GB, Audio: Realtek ALC1220, Monitor: ARZOPA, Network: Intel I226-V + Intel Device 7a70

  OS: EndeavourOS rolling, Kernel: 6.7.1-arch1-1 (x86_64), Desktop: KDE Plasma 5.27.10, Display Server: X Server 1.21.1.11, Display Driver: NVIDIA 550.40.07, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.4.74, Compiler: GCC 13.2.1 20230801, File-System: ext4, Screen Resolution: 1920x1080

Libplacebo 5.229.1
Test: av1_grain_lap
FPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 4171.00 |==============================================

Libplacebo 5.229.1
Test: hdr_lut
FPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 3905.98 |==============================================

Libplacebo 5.229.1
Test: hdr_peakdetect
FPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 3292.37 |==============================================

Libplacebo 5.229.1
Test: polar_nocompute
FPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 2327.55 |==============================================

Libplacebo 5.229.1
Test: deband_heavy
FPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 2186.70 |==============================================

TensorFlow 2.12
Device: GPU - Batch Size: 64 - Model: ResNet-50
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 5.55 |=================================================

TensorFlow 2.12
Device: GPU - Batch Size: 64 - Model: GoogLeNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 15.52 |================================================

TensorFlow 2.12
Device: GPU - Batch Size: 32 - Model: ResNet-50
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 5.51 |=================================================

TensorFlow 2.12
Device: GPU - Batch Size: 32 - Model: GoogLeNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 15.61 |================================================

TensorFlow 2.12
Device: GPU - Batch Size: 16 - Model: ResNet-50
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 5.46 |=================================================

TensorFlow 2.12
Device: GPU - Batch Size: 16 - Model: GoogLeNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 15.67 |================================================

TensorFlow 2.12
Device: GPU - Batch Size: 512 - Model: AlexNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 35.10 |================================================

TensorFlow 2.12
Device: GPU - Batch Size: 256 - Model: AlexNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 34.16 |================================================

TensorFlow 2.12
Device: GPU - Batch Size: 1 - Model: ResNet-50
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 4.35 |=================================================

TensorFlow 2.12
Device: GPU - Batch Size: 1 - Model: GoogLeNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 12.62 |================================================

TensorFlow 2.12
Device: GPU - Batch Size: 64 - Model: AlexNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 33.97 |================================================

TensorFlow 2.12
Device: GPU - Batch Size: 32 - Model: AlexNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 33.4 |=================================================

TensorFlow 2.12
Device: GPU - Batch Size: 16 - Model: AlexNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 31.59 |================================================

TensorFlow 2.12
Device: GPU - Batch Size: 32 - Model: VGG-16
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 1.50 |=================================================

TensorFlow 2.12
Device: GPU - Batch Size: 16 - Model: VGG-16
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 1.48 |=================================================

TensorFlow 2.12
Device: GPU - Batch Size: 1 - Model: AlexNet
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 13.92 |================================================

TensorFlow 2.12
Device: GPU - Batch Size: 1 - Model: VGG-16
images/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 1.35 |=================================================

NeatBench 5
Acceleration: GPU
FPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 4070 |=================================================

MandelGPU 1.3pts1
OpenCL Device: GPU
Samples/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 587219538.2 |==========================================

IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Supercar
M samples/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 52.81 |================================================

IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Bedroom
M samples/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 19.80 |================================================

Blender 4.0
Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 14.29 |================================================

Blender 4.0
Blend File: Barbershop - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 51.30 |================================================

Blender 4.0
Blend File: Fishy Cat - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 9.45 |=================================================

Blender 4.0
Blend File: Classroom - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 12.60 |================================================

Blender 4.0
Blend File: BMW27 - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 5.57 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 613 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 599 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 584 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 577 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-T
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 389 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-N
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 210 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dDOT
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 458 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dAXPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 437 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dCOPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 423 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sDOT
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 370 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sAXPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 392 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sCOPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 334 |==================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 122 |==================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 115 |==================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 117 |==================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 119 |==================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-T
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 109 |==================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-N
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 102 |==================================================

ViennaCL 1.7.1
Test: CPU BLAS - dDOT
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 96.8 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - dAXPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 87.2 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - dCOPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 70.8 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - sDOT
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 165 |==================================================

ViennaCL 1.7.1
Test: CPU BLAS - sAXPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 156 |==================================================

ViennaCL 1.7.1
Test: CPU BLAS - sCOPY
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 132 |==================================================

LuxCoreRender 2.6
Scene: Rainbow Colors and Prism - Acceleration: GPU
M samples/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 27.67 |================================================

LuxCoreRender 2.6
Scene: LuxCore Benchmark - Acceleration: GPU
M samples/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 12.82 |================================================

LuxCoreRender 2.6
Scene: Orange Juice - Acceleration: GPU
M samples/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 11.72 |================================================

LuxCoreRender 2.6
Scene: Danish Mood - Acceleration: GPU
M samples/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 10.56 |================================================

LuxCoreRender 2.6
Scene: DLSC - Acceleration: GPU
M samples/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 13.59 |================================================

Rodinia 3.1
Test: OpenCL Particle Filter
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 3.480 |================================================

clpeak 1.1.2
OpenCL Test: Global Memory Bandwidth
GBPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 437.65 |===============================================

clpeak 1.1.2
OpenCL Test: Double-Precision Double
GFLOPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 630.11 |===============================================

clpeak 1.1.2
OpenCL Test: Single-Precision Float
GFLOPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 35492.69 |=============================================

clpeak 1.1.2
OpenCL Test: Integer Compute INT
GIOPS > Higher Is Better
NVIDIA RTX 4070 SUPER . 18170.54 |=============================================

FAHBench 2.3.2
Ns Per Day > Higher Is Better
NVIDIA RTX 4070 SUPER . 366.06 |===============================================

OctaneBench 2020.1
Total Score
Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 720.97 |===============================================

VkResample 1.0
Upscale: 2x - Precision: Single
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 18.49 |================================================

VkResample 1.0
Upscale: 2x - Precision: Double
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 339.59 |===============================================

NAMD CUDA 2.14
ATPase Simulation - 327,506 Atoms
days/ns < Lower Is Better
NVIDIA RTX 4070 SUPER . 0.06791 |==============================================

cl-mem 2017-01-13
Benchmark: Write
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 407.5 |================================================

cl-mem 2017-01-13
Benchmark: Read
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 446.2 |================================================

cl-mem 2017-01-13
Benchmark: Copy
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 331.8 |================================================

Hashcat 6.2.4
Benchmark: TrueCrypt RIPEMD160 + XTS
H/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 802967 |===============================================

Hashcat 6.2.4
Benchmark: SHA-512
H/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 3232733333 |===========================================

Hashcat 6.2.4
Benchmark: 7-Zip
H/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 1176467 |==============================================

Hashcat 6.2.4
Benchmark: SHA1
H/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 22132600000 |==========================================

Hashcat 6.2.4
Benchmark: MD5
H/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 67583033333 |==========================================

VkFFT 1.2.31
Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 75078 |================================================

VkFFT 1.2.31
Test: FFT + iFFT C2C Bluestein benchmark in double precision
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 4451 |=================================================

VkFFT 1.2.31
Test: FFT + iFFT C2C multidimensional in single precision
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 50299 |================================================

VkFFT 1.2.31
Test: FFT + iFFT C2C 1D batched in single precision
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 73929 |================================================

VkFFT 1.2.31
Test: FFT + iFFT C2C 1D batched in double precision
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 24317 |================================================

VkFFT 1.2.31
Test: FFT + iFFT C2C Bluestein in single precision
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 15166 |================================================

VkFFT 1.2.31
Test: FFT + iFFT C2C 1D batched in half precision
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 131705 |===============================================

VkFFT 1.2.31
Test: FFT + iFFT R2C / C2R
Benchmark Score > Higher Is Better
NVIDIA RTX 4070 SUPER . 54794 |================================================

Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: Yes
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 2.855 |================================================

RealSR-NCNN 20200818
Scale: 4x - TAA: Yes
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 34.89 |================================================

GpuOwl 7.2.1
Exponent: 332220523
Iterations / Second > Higher Is Better
NVIDIA RTX 4070 SUPER . 137.44 |===============================================

GpuOwl 7.2.1
Exponent: 77936867
Iterations / Second > Higher Is Better
NVIDIA RTX 4070 SUPER . 646.41 |===============================================

GpuOwl 7.2.1
Exponent: 57885161
Iterations / Second > Higher Is Better
NVIDIA RTX 4070 SUPER . 869.07 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 103.57 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 103.17 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 102.60 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 102.60 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 106.37 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-152
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 195.30 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-152
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 194.58 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-152
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 196.07 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-50
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 504.27 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-152
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 195.39 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-50
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 504.67 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-152
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 195.40 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-50
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 507.45 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-50
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 501.50 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-50
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 509.45 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-152
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 201.94 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-50
batches/sec > Higher Is Better
NVIDIA RTX 4070 SUPER . 557.73 |===============================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: Memory Bandwidth Coalesced Write
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 455.01 |===============================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: Memory Bandwidth Coalesced Read
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 464.86 |===============================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: INT8 Compute
TIOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 14.31 |================================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: INT16 Compute
TIOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 17.17 |================================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: INT32 Compute
TIOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 19.89 |================================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: INT64 Compute
TIOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 4.214 |================================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: FP32 Compute
TFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 38.59 |================================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: FP64 Compute
TFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 0.621 |================================================

NCNN 20230517
Target: Vulkan GPU - Model: FastestDet
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 2.86 |=================================================

NCNN 20230517
Target: Vulkan GPU - Model: vision_transformer
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 844.61 |===============================================

NCNN 20230517
Target: Vulkan GPU - Model: regnety_400m
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 11.11 |================================================

NCNN 20230517
Target: Vulkan GPU - Model: squeezenet_ssd
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 6.86 |=================================================

NCNN 20230517
Target: Vulkan GPU - Model: yolov4-tiny
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 63.82 |================================================

NCNN 20230517
Target: Vulkan GPU - Model: resnet50
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 46.26 |================================================

NCNN 20230517
Target: Vulkan GPU - Model: alexnet
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 16.17 |================================================

NCNN 20230517
Target: Vulkan GPU - Model: resnet18
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 8.97 |=================================================

NCNN 20230517
Target: Vulkan GPU - Model: vgg16
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 117.81 |===============================================

NCNN 20230517
Target: Vulkan GPU - Model: googlenet
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 11.04 |================================================

NCNN 20230517
Target: Vulkan GPU - Model: blazeface
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 0.84 |=================================================

NCNN 20230517
Target: Vulkan GPU - Model: efficientnet-b0
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 5.07 |=================================================

NCNN 20230517
Target: Vulkan GPU - Model: mnasnet
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 3.85 |=================================================

NCNN 20230517
Target: Vulkan GPU - Model: shufflenet-v2
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 2.31 |=================================================

NCNN 20230517
Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 2.25 |=================================================

NCNN 20230517
Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 3.03 |=================================================

NCNN 20230517
Target: Vulkan GPU - Model: mobilenet
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 8.62 |=================================================

TensorFlow 2.12
Device: GPU - Batch Size: 512 - Model: VGG-16
images/sec > Higher Is Better

TensorFlow 2.12
Device: GPU - Batch Size: 256 - Model: VGG-16
images/sec > Higher Is Better

TensorFlow 2.12
Device: GPU - Batch Size: 64 - Model: VGG-16
images/sec > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

NCNN 20230517
Target: Vulkan GPU
ms < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

FinanceBench 2016-07-25
Benchmark: Black-Scholes OpenCL
ms < Lower Is Better
NVIDIA RTX 4070 SUPER . 5.912 |================================================

ArrayFire 3.9
Test: Conjugate Gradient OpenCL

LeelaChessZero 0.30
Backend: OpenCL
Nodes Per Second > Higher Is Better

Libplacebo 5.229.1
FPS > Higher Is Better

Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: No
Seconds < Lower Is Better

RealSR-NCNN 20200818
Scale: 4x - TAA: No
Seconds < Lower Is Better
NVIDIA RTX 4070 SUPER . 6.323 |================================================

vkpeak 20230730
GFLOPS > Higher Is Better

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better