RTX 4070 SUPER

Intel Core i9-13900K testing with an ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS) and ASUS NVIDIA GeForce RTX 4070 SUPER 12GB on EndeavourOS rolling via the Phoronix Test Suite.

NVIDIA RTX 4070 SUPER:

  Processor: Intel Core i9-13900K @ 5.50GHz (24 Cores / 32 Threads), Motherboard: ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS), Chipset: Intel Device 7a27, Memory: 32GB, Disk: 4001GB Seagate ZP4000GP304001, Graphics: ASUS NVIDIA GeForce RTX 4070 SUPER 12GB, Audio: Realtek ALC1220, Monitor: ARZOPA, Network: Intel I226-V + Intel Device 7a70

  OS: EndeavourOS rolling, Kernel: 6.7.1-arch1-1 (x86_64), Desktop: KDE Plasma 5.27.10, Display Server: X Server 1.21.1.11, Display Driver: NVIDIA 550.40.07, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.4.74, Compiler: GCC 13.2.1 20230801, File-System: ext4, Screen Resolution: 1920x1080

RTX 4070 SUPER:

  Processor: Intel Core i9-13900K @ 5.50GHz (24 Cores / 32 Threads), Motherboard: ASUS TUF GAMING Z790-PRO WIFI (1401 BIOS), Chipset: Intel Device 7a27, Memory: 32GB, Disk: 4001GB Seagate ZP4000GP304001, Graphics: ASUS NVIDIA GeForce RTX 4070 SUPER 12GB, Audio: Realtek ALC1220, Monitor: ARZOPA, Network: Intel I226-V + Intel Device 7a70

  OS: EndeavourOS rolling, Kernel: 6.7.1-arch1-1 (x86_64), Desktop: KDE Plasma 5.27.10, Display Server: X Server 1.21.1.11, Display Driver: NVIDIA 550.40.07, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.4.74, Compiler: GCC 13.2.1 20230801, File-System: ext4, Screen Resolution: 1920x1080

ProjectPhysX OpenCL-Benchmark 1.2
Operation: FP64 Compute
TFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 0.621 |================================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: FP32 Compute
TFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 38.59 |================================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: INT64 Compute
TIOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 4.214 |================================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: INT32 Compute
TIOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 19.89 |================================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: INT16 Compute
TIOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 17.17 |================================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: INT8 Compute
TIOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 14.31 |================================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: Memory Bandwidth Coalesced Read
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 464.86 |===============================================

ProjectPhysX OpenCL-Benchmark 1.2
Operation: Memory Bandwidth Coalesced Write
GB/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 455.01 |===============================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-50
batches/sec > Higher Is Better

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-152
batches/sec > Higher Is Better
RTX 4070 SUPER . 201.94 |======================================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-50
batches/sec > Higher Is Better
RTX 4070 SUPER . 509.45 |======================================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-50
batches/sec > Higher Is Better

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-50
batches/sec > Higher Is Better
RTX 4070 SUPER . 507.45 |======================================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-152
batches/sec > Higher Is Better

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-50
batches/sec > Higher Is Better

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-152
batches/sec > Higher Is Better
RTX 4070 SUPER . 195.39 |======================================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-50
batches/sec > Higher Is Better
RTX 4070 SUPER . 504.27 |======================================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-152
batches/sec > Higher Is Better

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-152
batches/sec > Higher Is Better
RTX 4070 SUPER . 194.58 |======================================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-152
batches/sec > Higher Is Better

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
RTX 4070 SUPER . 106.37 |======================================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
RTX 4070 SUPER . 102.60 |======================================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
RTX 4070 SUPER . 103.17 |======================================================

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: Efficientnet_v2_l
batches/sec > Higher Is Better
RTX 4070 SUPER . 103.57 |======================================================
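
The result entries above follow a regular shape: a "Higher Is Better" header, then an identifier, a " . " separator, the value, and a bar. As a rough sketch only (not part of the Phoronix Test Suite), the values could be pulled out with a regular expression. The names RESULT_RE, parse_results and SAMPLE are hypothetical, and SAMPLE is a shortened excerpt of the listing with truncated bars. Headers that carry no result line are skipped.

```python
import re

# Matches "<identifier> . <value> |====" following a "Higher Is Better" header.
# The identifier may not contain ':', '>' or '|', so a header that has no
# result line of its own cannot be swallowed into the next identifier.
RESULT_RE = re.compile(
    r"Higher Is Better\s+(?P<ident>[^:>|]+?)\s+\.\s+(?P<value>\d+(?:\.\d+)?)\s+\|=+"
)


def parse_results(text):
    """Return (identifier, value) pairs in the order they appear in text."""
    return [(m["ident"], float(m["value"])) for m in RESULT_RE.finditer(text)]


# Hypothetical excerpt copied from the listing; bars shortened for brevity.
SAMPLE = """\
ProjectPhysX OpenCL-Benchmark 1.2
Operation: FP64 Compute
TFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 0.621 |====

ProjectPhysX OpenCL-Benchmark 1.2
Operation: FP32 Compute
TFLOPs/s > Higher Is Better
NVIDIA RTX 4070 SUPER . 38.59 |====

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-50
batches/sec > Higher Is Better

PyTorch 2.1
Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-152
batches/sec > Higher Is Better
RTX 4070 SUPER . 201.94 |====
"""

results = parse_results(SAMPLE)
for ident, value in results:
    print(f"{ident}: {value}")
```

Because the pattern treats any whitespace as a separator, it should also cope with copies of the file in which the line breaks have been collapsed into spaces.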