RTX 4070 SUPER Suite

RTX 4070 SUPER Suite 1.0.0
System
Test suite extracted from RTX 4070 SUPER.

pts/gpuowl-1.0.0 -iters 100000 -prp 77936867 Exponent: 77936867
pts/gpuowl-1.0.0 -iters 20000 -prp 332220523 Exponent: 332220523
pts/octanebench-1.3.0 Total Score
pts/vkpeak-1.1.0
pts/gpuowl-1.0.0 -iters 100000 -prp 57885161 Exponent: 57885161
pts/fahbench-1.0.2
pts/luxcorerender-1.4.0 LuxCore2.1Benchmark/LuxCoreScene/render.cfg -D renderengine.type PATHOCL -D opencl.native.threads.count 0 -D context.cuda.optix.enable 0 Scene: LuxCore Benchmark - Acceleration: GPU
pts/luxcorerender-1.4.0 DLSC/LuxCoreScene/render.cfg -D renderengine.type PATHOCL -D opencl.native.threads.count 0 -D context.cuda.optix.enable 0 Scene: DLSC - Acceleration: GPU
pts/indigobench-1.1.0 --gpuonly --scenes bedroom Acceleration: OpenCL GPU - Scene: Bedroom
pts/vkresample-1.0.2 -u 2 -p 1 Upscale: 2x - Precision: Double
pts/indigobench-1.1.0 --gpuonly --scenes supercar Acceleration: OpenCL GPU - Scene: Supercar
pts/luxcorerender-1.4.0 OrangeJuice/LuxCoreScene/render.cfg -D renderengine.type PATHOCL -D opencl.native.threads.count 0 -D context.cuda.optix.enable 0 Scene: Orange Juice - Acceleration: GPU
pts/luxcorerender-1.4.0 DanishMood/LuxCoreScene/render.cfg -D renderengine.type PATHOCL -D opencl.native.threads.count 0 -D context.cuda.optix.enable 0 Scene: Danish Mood - Acceleration: GPU
pts/blender-4.0.0 -b ../barbershop_interior_gpu.blend -o output.test -x 1 -F JPEG -f 1 -- --cycles-device OPTIX Blend File: Barbershop - Compute: NVIDIA OptiX
pts/namd-cuda-1.1.1 ATPase Simulation - 327,506 Atoms
pts/blender-4.0.0 -b ../fishy_cat_gpu.blend -o output.test -x 1 -F JPEG -f 1 -- --cycles-device OPTIX Blend File: Fishy Cat - Compute: NVIDIA OptiX
pts/realsr-ncnn-1.0.0 -s 4 -x Scale: 4x - TAA: Yes
pts/realsr-ncnn-1.0.0 -s 4 Scale: 4x - TAA: No
pts/blender-4.0.0 -b ../bmw27_gpu.blend -o output.test -x 1 -F JPEG -f 1 -- --cycles-device OPTIX Blend File: BMW27 - Compute: NVIDIA OptiX
pts/viennacl-1.1.0 dense_blas-bench-cpu Test: CPU BLAS - dGEMM-TT
pts/viennacl-1.1.0 dense_blas-bench-cpu Test: CPU BLAS - dGEMM-TN
pts/viennacl-1.1.0 dense_blas-bench-cpu Test: CPU BLAS - dGEMM-NT
pts/viennacl-1.1.0 dense_blas-bench-cpu Test: CPU BLAS - dGEMM-NN
pts/viennacl-1.1.0 dense_blas-bench-cpu Test: CPU BLAS - dGEMV-T
pts/viennacl-1.1.0 dense_blas-bench-cpu Test: CPU BLAS - dGEMV-N
pts/viennacl-1.1.0 dense_blas-bench-cpu Test: CPU BLAS - dDOT
pts/viennacl-1.1.0 dense_blas-bench-cpu Test: CPU BLAS - dAXPY
pts/viennacl-1.1.0 dense_blas-bench-cpu Test: CPU BLAS - dCOPY
pts/viennacl-1.1.0 dense_blas-bench-cpu Test: CPU BLAS - sDOT
pts/viennacl-1.1.0 dense_blas-bench-cpu Test: CPU BLAS - sAXPY
pts/viennacl-1.1.0 dense_blas-bench-cpu Test: CPU BLAS - sCOPY
pts/viennacl-1.1.0 dense_blas-bench-opencl Test: OpenCL BLAS - dGEMM-TT
pts/viennacl-1.1.0 dense_blas-bench-opencl Test: OpenCL BLAS - dGEMM-TN
pts/viennacl-1.1.0 dense_blas-bench-opencl Test: OpenCL BLAS - dGEMM-NT
pts/viennacl-1.1.0 dense_blas-bench-opencl Test: OpenCL BLAS - dGEMM-NN
pts/viennacl-1.1.0 dense_blas-bench-opencl Test: OpenCL BLAS - dGEMV-T
pts/viennacl-1.1.0 dense_blas-bench-opencl Test: OpenCL BLAS - dGEMV-N
pts/viennacl-1.1.0 dense_blas-bench-opencl Test: OpenCL BLAS - dDOT
pts/viennacl-1.1.0 dense_blas-bench-opencl Test: OpenCL BLAS - dAXPY
pts/viennacl-1.1.0 dense_blas-bench-opencl Test: OpenCL BLAS - dCOPY
pts/viennacl-1.1.0 dense_blas-bench-opencl Test: OpenCL BLAS - sDOT
pts/viennacl-1.1.0 dense_blas-bench-opencl Test: OpenCL BLAS - sAXPY
pts/viennacl-1.1.0 dense_blas-bench-opencl Test: OpenCL BLAS - sCOPY
pts/pytorch-1.0.1 cuda 256 efficientnet_v2_l Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: Efficientnet_v2_l
pts/blender-4.0.0 -b ../pavillon_barcelone_gpu.blend -o output.test -x 1 -F JPEG -f 1 -- --cycles-device OPTIX Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX
pts/blender-4.0.0 -b ../classroom_gpu.blend -o output.test -x 1 -F JPEG -f 1 -- --cycles-device OPTIX Blend File: Classroom - Compute: NVIDIA OptiX
pts/opencl-benchmark-1.0.0 Operation: Memory Bandwidth Coalesced Write
pts/opencl-benchmark-1.0.0 Operation: Memory Bandwidth Coalesced Read
pts/opencl-benchmark-1.0.0 Operation: INT8 Compute
pts/opencl-benchmark-1.0.0 Operation: INT16 Compute
pts/opencl-benchmark-1.0.0 Operation: INT32 Compute
pts/opencl-benchmark-1.0.0 Operation: INT64 Compute
pts/opencl-benchmark-1.0.0 Operation: FP32 Compute
pts/opencl-benchmark-1.0.0 Operation: FP64 Compute
pts/pytorch-1.0.1 cuda 256 resnet152 Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-152
pts/pytorch-1.0.1 cuda 32 efficientnet_v2_l Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: Efficientnet_v2_l
pts/pytorch-1.0.1 cuda 512 efficientnet_v2_l Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: Efficientnet_v2_l
pts/vkresample-1.0.2 -u 2 -p 0 Upscale: 2x - Precision: Single
pts/pytorch-1.0.1 cuda 64 resnet50 Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-50
pts/clpeak-1.1.0 --compute-dp OpenCL Test: Double-Precision Double
pts/luxcorerender-1.4.0 RainbowColorsAndPrism/LuxCoreScene/render.cfg -D renderengine.type PATHOCL -D opencl.native.threads.count 0 -D context.cuda.optix.enable 0 Scene: Rainbow Colors and Prism - Acceleration: GPU
pts/hashcat-1.1.1 -m 1700 Benchmark: SHA-512
pts/hashcat-1.1.1 -m 100 Benchmark: SHA1
pts/hashcat-1.1.1 -m 0 Benchmark: MD5
pts/pytorch-1.0.1 cuda 1 efficientnet_v2_l Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: Efficientnet_v2_l
pts/pytorch-1.0.1 cuda 32 resnet152 Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-152
pts/pytorch-1.0.1 cuda 512 resnet50 Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-50
pts/hashcat-1.1.1 -m 6211 Benchmark: TrueCrypt RIPEMD160 + XTS
pts/pytorch-1.0.1 cuda 1 resnet152 Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-152
pts/rodinia-1.3.2 OCL_PARTICLEFILTER Test: OpenCL Particle Filter
pts/pytorch-1.0.1 cuda 16 resnet50 Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-50
pts/cl-mem-1.0.1 COPY Benchmark: Copy
pts/cl-mem-1.0.1 READ Benchmark: Read
pts/cl-mem-1.0.1 WRITE Benchmark: Write
pts/hashcat-1.1.1 -m 11600 Benchmark: 7-Zip
pts/waifu2x-ncnn-1.0.0 -s 2 -n 3 -x Scale: 2x - Denoise: 3 - TAA: Yes
pts/financebench-1.1.1 Black-Scholes/OpenCL/blackScholesAnalyticEngine.exe Benchmark: Black-Scholes OpenCL
pts/pytorch-1.0.1 cuda 64 efficientnet_v2_l Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: Efficientnet_v2_l
pts/clpeak-1.1.0 --global-bandwidth OpenCL Test: Global Memory Bandwidth
pts/mandelgpu-1.3.1 0 1 OpenCL Device: GPU
pts/pytorch-1.0.1 cuda 256 resnet50 Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-50
pts/pytorch-1.0.1 cuda 512 resnet152 Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-152
pts/waifu2x-ncnn-1.0.0 -s 2 -n 3 Scale: 2x - Denoise: 3 - TAA: No
pts/pytorch-1.0.1 cuda 64 resnet152 Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-152
pts/pytorch-1.0.1 cuda 1 resnet50 Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-50
pts/pytorch-1.0.1 cuda 16 efficientnet_v2_l Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: Efficientnet_v2_l
pts/clpeak-1.1.0 --compute-integer OpenCL Test: Integer Compute INT
pts/pytorch-1.0.1 cuda 16 resnet152 Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-152
pts/pytorch-1.0.1 cuda 32 resnet50 Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-50
pts/clpeak-1.1.0 --compute-sp OpenCL Test: Single-Precision Float
pts/neatbench-1.0.4 gpu Acceleration: GPU
pts/plaidml-1.0.4 --no-fp16 --no-train densenet201 OPENCL FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
pts/plaidml-1.0.4 --fp16 --no-train mobilenet OPENCL FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
pts/plaidml-1.0.4 --no-fp16 --no-train mobilenet OPENCL FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
pts/plaidml-1.0.4 --no-fp16 --no-train imdb_lstm OPENCL FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
pts/plaidml-1.0.4 --no-fp16 --train mobilenet OPENCL FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
pts/gromacs-1.8.0 cuda-build water-cut1.0_GMX50_bare/1536 Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare
pts/lczero-1.7.0 -b opencl Backend: OpenCL
pts/libplacebo-1.1.0
pts/shoc-1.2.0 -opencl -benchmark GEMM Target: OpenCL - Benchmark: GEMM SGEMM_N
pts/mixbench-1.1.1 mixbench-ocl-ro SPGFLOPS Backend: OpenCL - Benchmark: Single Precision
pts/mixbench-1.1.1 mixbench-ocl-ro DPGFLOPS Backend: OpenCL - Benchmark: Double Precision
pts/vkfft-1.2.0 -vkfft 6 Test: FFT + iFFT R2C / C2R
pts/ncnn-1.5.0 Target: Vulkan GPU
pts/caffe-1.5.0 --model=../models/bvlc_googlenet/deploy.prototxt -gpu all -iterations 1000 Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000
pts/caffe-1.5.0 --model=../models/bvlc_googlenet/deploy.prototxt -gpu all -iterations 200 Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200
pts/caffe-1.5.0 --model=../models/bvlc_alexnet/deploy.prototxt -gpu all -iterations 1000 Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000
pts/caffe-1.5.0 --model=../models/bvlc_alexnet/deploy.prototxt -gpu all -iterations 200 Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200
pts/caffe-1.5.0 --model=../models/bvlc_alexnet/deploy.prototxt -gpu all -iterations 100 Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100
pts/betsy-1.0.0 --codec=etc2_rgb --quality=2 Codec: ETC2 RGB - Quality: Highest
pts/shoc-1.2.0 -opencl -benchmark DeviceMemory Target: OpenCL - Benchmark: Texture Read Bandwidth
pts/shoc-1.2.0 -opencl -benchmark BusSpeedDownload Target: OpenCL - Benchmark: Bus Speed Download
pts/shoc-1.2.0 -opencl -benchmark Reduction Target: OpenCL - Benchmark: Reduction
pts/shoc-1.2.0 -opencl -benchmark FFT Target: OpenCL - Benchmark: FFT SP
pts/mixbench-1.1.1 mixbench-cuda-ro SPGFLOPS Backend: NVIDIA CUDA - Benchmark: Single Precision
pts/mixbench-1.1.1 mixbench-cuda-ro DPGFLOPS Backend: NVIDIA CUDA - Benchmark: Double Precision
pts/mixbench-1.1.1 mixbench-cuda-ro HPGFLOPS Backend: NVIDIA CUDA - Benchmark: Half Precision
pts/mixbench-1.1.1 mixbench-cuda-ro GIOPS Backend: NVIDIA CUDA - Benchmark: Integer
pts/mixbench-1.1.1 mixbench-ocl-ro GIOPS Backend: OpenCL - Benchmark: Integer
pts/vkfft-1.2.0 -vkfft 7 Test: FFT + iFFT C2C Bluestein in single precision
pts/vkfft-1.2.0 -vkfft 2 Test: FFT + iFFT C2C 1D batched in half precision
pts/caffe-1.5.0 --model=../models/bvlc_googlenet/deploy.prototxt -gpu all -iterations 100 Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100
pts/arrayfire-1.2.0 cg_opencl Test: Conjugate Gradient OpenCL
pts/shoc-1.2.0 -opencl -benchmark BusSpeedReadback Target: OpenCL - Benchmark: Bus Speed Readback
pts/shoc-1.2.0 -opencl -benchmark MaxFlops Target: OpenCL - Benchmark: Max SP Flops
pts/shoc-1.2.0 -opencl -benchmark MD5Hash Target: OpenCL - Benchmark: MD5 Hash
pts/shoc-1.2.0 -opencl -benchmark Triad Target: OpenCL - Benchmark: Triad
pts/shoc-1.2.0 -opencl -benchmark S3D Target: OpenCL - Benchmark: S3D
pts/vkfft-1.2.0 -vkfft 3 Test: FFT + iFFT C2C multidimensional in single precision
pts/vkfft-1.2.0 -vkfft 1 Test: FFT + iFFT C2C 1D batched in double precision
pts/betsy-1.0.0 --codec=etc1 --quality=2 Codec: ETC1 - Quality: Highest
pts/vkfft-1.2.0 -vkfft 5 Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling
pts/vkfft-1.2.0 -vkfft 8 Test: FFT + iFFT C2C Bluestein benchmark in double precision
pts/vkfft-1.2.0 -vkfft 0 Test: FFT + iFFT C2C 1D batched in single precision
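
Every entry in the suite begins with a test profile identifier such as pts/gpuowl-1.0.0, followed by optional arguments and a description. As a rough illustration of that naming scheme, the Python sketch below splits an entry into repository, test name, and version, then counts how often each test appears. The function name parse_entry, the regular expression, and the three sample entries are assumptions made for this example; they are not part of the Phoronix Test Suite itself.

```python
import re
from collections import Counter

# Assumed identifier shape: "<repo>/<name>-<major>.<minor>.<patch>",
# e.g. "pts/gpuowl-1.0.0". This pattern is inferred from the listing above.
PROFILE_RE = re.compile(
    r"^(?P<repo>[a-z0-9]+)/(?P<name>[a-z0-9-]+)-(?P<version>\d+\.\d+\.\d+)$"
)


def parse_entry(line):
    """Split one suite entry into its profile parts and the remaining text."""
    head, _, rest = line.strip().partition(" ")
    match = PROFILE_RE.match(head)
    if match is None:
        raise ValueError(f"not a test profile identifier: {head!r}")
    return {
        "repo": match.group("repo"),
        "name": match.group("name"),
        "version": match.group("version"),
        "rest": rest,  # arguments and description, left unparsed
    }


# Three entries copied from the suite, used here only as sample input.
sample_entries = [
    "pts/gpuowl-1.0.0 -iters 100000 -prp 77936867 Exponent: 77936867",
    "pts/gpuowl-1.0.0 -iters 20000 -prp 332220523 Exponent: 332220523",
    "pts/namd-cuda-1.1.1 ATPase Simulation - 327,506 Atoms",
]

runs_per_test = Counter(parse_entry(e)["name"] for e in sample_entries)
for name, count in sorted(runs_per_test.items()):
    print(f"{name}: {count}")
```

The sketch leaves the arguments and description as a single string because the listing does not mark where one ends and the other begins.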