RTX 4070 SUPER
sudo apt install vulkan-headers vulkan-tools libvulkan-dev
HTML result view exported from: https://openbenchmarking.org/result/2412102-NE-INTELGPU716&rdt&grr.
TensorFlow
Device: GPU - Batch Size: 64 - Model: VGG-16
IndigoBench
Acceleration: OpenCL GPU - Scene: Bedroom
TensorFlow
Device: GPU - Batch Size: 32 - Model: VGG-16
TensorFlow
Device: GPU - Batch Size: 32 - Model: VGG-16
TensorFlow
Device: GPU - Batch Size: 64 - Model: VGG-16
TensorFlow
Device: GPU - Batch Size: 16 - Model: VGG-16
NCNN
Target: Vulkan GPU - Model: vgg16
NCNN
Target: Vulkan GPU - Model: FastestDet
NCNN
Target: Vulkan GPU - Model: vision_transformer
NCNN
Target: Vulkan GPU - Model: regnety_400m
NCNN
Target: Vulkan GPU - Model: squeezenet_ssd
NCNN
Target: Vulkan GPU - Model: yolov4-tiny
NCNN
Target: Vulkan GPU - Model: resnet50
NCNN
Target: Vulkan GPU - Model: alexnet
NCNN
Target: Vulkan GPU - Model: resnet18
NCNN
Target: Vulkan GPU - Model: googlenet
NCNN
Target: Vulkan GPU - Model: blazeface
NCNN
Target: Vulkan GPU - Model: efficientnet-b0
NCNN
Target: Vulkan GPU - Model: mnasnet
NCNN
Target: Vulkan GPU - Model: shufflenet-v2
NCNN
Target: Vulkan GPU - Model: mobilenet-v3
NCNN
Target: Vulkan GPU - Model: mobilenet-v2
NCNN
Target: Vulkan GPU - Model: mobilenet
NCNN
Target: Vulkan GPU - Model: mobilenetv2-yolov3
TensorFlow
Device: GPU - Batch Size: 16 - Model: VGG-16
TensorFlow
Device: GPU - Batch Size: 64 - Model: ResNet-50
TensorFlow
Device: GPU - Batch Size: 64 - Model: ResNet-50
SPECViewPerf 2020
Resolution: 1920 x 1080 - Viewset: CREO-03
Unigine Heaven
Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL
SPECViewPerf 2020
Resolution: 1920 x 1080 - Viewset: SOLIDWORKS-07
SPECViewPerf 2020
Resolution: 1920 x 1080 - Viewset: MAYA-06
TensorFlow
Device: GPU - Batch Size: 32 - Model: ResNet-50
SPECViewPerf 2020
Resolution: 1920 x 1080 - Viewset: MEDICAL-03
TensorFlow
Device: GPU - Batch Size: 32 - Model: ResNet-50
TensorFlow
Device: GPU - Batch Size: 16 - Model: ResNet-50
SPECViewPerf 2020
Resolution: 1920 x 1080 - Viewset: SNX-04
Unigine Valley
Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL
SPECViewPerf 2020
Resolution: 1920 x 1080 - Viewset: CATIA-06
TensorFlow
Device: GPU - Batch Size: 16 - Model: AlexNet
TensorFlow
Device: GPU - Batch Size: 16 - Model: ResNet-50
vkpeak
int32-scalar
vkpeak
int32-vec4
vkpeak
int16-vec4
vkpeak
int16-scalar
vkpeak
fp16-vec4
vkpeak
fp16-scalar
vkpeak
fp32-vec4
vkpeak
fp32-scalar
TensorFlow
Device: GPU - Batch Size: 64 - Model: AlexNet
VkFFT
Test: FFT + iFFT C2C 1D batched in half precision
Xonotic
Resolution: 1920 x 1080 - Effects Quality: High
SPECViewPerf 2020
Resolution: 1920 x 1080 - Viewset: ENERGY-03
TensorFlow
Device: GPU - Batch Size: 32 - Model: GoogLeNet
TensorFlow
Device: GPU - Batch Size: 64 - Model: GoogLeNet
LuxMark
OpenCL Device: GPU - Scene: Microphone
LuxMark
OpenCL Device: GPU - Scene: Hotel
LuxMark
OpenCL Device: CPU+GPU - Scene: Microphone
LuxMark
OpenCL Device: GPU - Scene: Luxball HDR
LuxMark
OpenCL Device: CPU+GPU - Scene: Luxball HDR
LuxMark
OpenCL Device: CPU+GPU - Scene: Hotel
TensorFlow
Device: GPU - Batch Size: 32 - Model: GoogLeNet
TensorFlow
Device: GPU - Batch Size: 64 - Model: GoogLeNet
IndigoBench
Acceleration: OpenCL GPU - Scene: Supercar
TensorFlow
Device: GPU - Batch Size: 32 - Model: AlexNet
VkFFT
Test: FFT + iFFT C2C multidimensional in single precision
TensorFlow
Device: GPU - Batch Size: 16 - Model: GoogLeNet
Xonotic
Resolution: 1920 x 1080 - Effects Quality: Ultimate
SHOC Scalable HeterOgeneous Computing
Target: OpenCL - Benchmark: Texture Read Bandwidth
TensorFlow
Device: GPU - Batch Size: 1 - Model: VGG-16
TensorFlow
Device: GPU - Batch Size: 16 - Model: GoogLeNet
VkFFT
Test: FFT + iFFT C2C 1D batched in half precision
IndigoBench
Acceleration: CPU - Scene: Bedroom
IndigoBench
Acceleration: CPU - Scene: Supercar
TensorFlow
Device: GPU - Batch Size: 64 - Model: AlexNet
VkFFT
Test: FFT + iFFT C2C 1D batched in single precision
TensorFlow
Device: GPU - Batch Size: 32 - Model: AlexNet
vkpeak
fp16-vec4
vkpeak
fp16-scalar
vkpeak
fp32-vec4
vkpeak
fp32-scalar
VkFFT
Test: FFT + iFFT C2C 1D batched in single precision
RealSR-NCNN
Scale: 4x - TAA: Yes
Blender
Blend File: Barbershop - Compute: NVIDIA OptiX
VkFFT
Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling
VkFFT
Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling
Xonotic
Resolution: 1920 x 1080 - Effects Quality: Ultra
VkFFT
Test: FFT + iFFT C2C Bluestein in single precision
TensorFlow
Device: GPU - Batch Size: 1 - Model: ResNet-50
VkFFT
Test: FFT + iFFT R2C / C2R
Blender
Blend File: Fishy Cat - Compute: NVIDIA OptiX
Xonotic
Resolution: 1920 x 1080 - Effects Quality: Low
TensorFlow
Device: GPU - Batch Size: 1 - Model: VGG-16
OpenArena
Resolution: 1920 x 1080 - Total Frame Time
OpenArena
Resolution: 1920 x 1080
VkFFT
Test: FFT + iFFT C2C Bluestein benchmark in double precision
TensorFlow
Device: GPU - Batch Size: 16 - Model: AlexNet
ParaView
Test: Many Spheres - Resolution: 1920 x 1080
ParaView
Test: Many Spheres - Resolution: 1920 x 1080
VkFFT
Test: FFT + iFFT C2C multidimensional in single precision
ProjectPhysX OpenCL-Benchmark
Operation: FP16 Compute
TensorFlow
Device: GPU - Batch Size: 1 - Model: AlexNet
Blender
Blend File: BMW27 - Compute: NVIDIA OptiX
FAHBench
TensorFlow
Device: GPU - Batch Size: 1 - Model: ResNet-50
VkFFT
Test: FFT + iFFT C2C 1D batched in double precision
ProjectPhysX OpenCL-Benchmark
Operation: INT8 Compute
ProjectPhysX OpenCL-Benchmark
Operation: Memory Bandwidth Coalesced Read
ProjectPhysX OpenCL-Benchmark
Operation: INT16 Compute
ProjectPhysX OpenCL-Benchmark
Operation: Memory Bandwidth Coalesced Write
ProjectPhysX OpenCL-Benchmark
Operation: INT32 Compute
ProjectPhysX OpenCL-Benchmark
Operation: INT64 Compute
ProjectPhysX OpenCL-Benchmark
Operation: FP32 Compute
SHOC Scalable HeterOgeneous Computing
Target: OpenCL - Benchmark: Max SP Flops
ViennaCL
Test: CPU BLAS - dGEMM-TT
ViennaCL
Test: CPU BLAS - dGEMM-TN
ViennaCL
Test: CPU BLAS - dGEMM-NT
ViennaCL
Test: CPU BLAS - dGEMM-NN
ViennaCL
Test: CPU BLAS - dGEMV-T
ViennaCL
Test: CPU BLAS - dGEMV-N
ViennaCL
Test: CPU BLAS - dDOT
ViennaCL
Test: CPU BLAS - dAXPY
ViennaCL
Test: CPU BLAS - dCOPY
ViennaCL
Test: CPU BLAS - sDOT
ViennaCL
Test: CPU BLAS - sAXPY
ViennaCL
Test: CPU BLAS - sCOPY
VkFFT
Test: FFT + iFFT R2C / C2R
ViennaCL
Test: OpenCL BLAS - dGEMM-TT
ViennaCL
Test: OpenCL BLAS - dGEMM-TN
ViennaCL
Test: OpenCL BLAS - dGEMM-NT
ViennaCL
Test: OpenCL BLAS - dGEMM-NN
ViennaCL
Test: OpenCL BLAS - dGEMV-T
ViennaCL
Test: OpenCL BLAS - dGEMV-N
ViennaCL
Test: OpenCL BLAS - dDOT
ViennaCL
Test: OpenCL BLAS - dAXPY
ViennaCL
Test: OpenCL BLAS - dCOPY
VkResample
Upscale: 2x - Precision: Double
VkFFT
Test: FFT + iFFT C2C Bluestein in single precision
RealSR-NCNN
Scale: 4x - TAA: No
Blender
Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX
ViennaCL
Test: OpenCL BLAS - sCOPY
ViennaCL
Test: OpenCL BLAS - sAXPY
Blender
Blend File: Classroom - Compute: NVIDIA OptiX
ViennaCL
Test: OpenCL BLAS - sDOT
clpeak
OpenCL Test: Single-Precision Float
clpeak
OpenCL Test: Integer Compute INT
Hashcat
Benchmark: SHA1
Hashcat
Benchmark: 7-Zip
clpeak
OpenCL Test: Double-Precision Double
TensorFlow
Device: GPU - Batch Size: 1 - Model: GoogLeNet
VkResample
Upscale: 2x - Precision: Single
Hashcat
Benchmark: SHA-512
MandelGPU
OpenCL Device: GPU
Hashcat
Benchmark: MD5
Darktable
Test: Boat - Acceleration: OpenCL
ParaView
Test: Wavelet Volume - Resolution: 1920 x 1080
ParaView
Test: Wavelet Volume - Resolution: 1920 x 1080
TensorFlow
Device: GPU - Batch Size: 1 - Model: GoogLeNet
cl-mem
Benchmark: Read
cl-mem
Benchmark: Write
SHOC Scalable HeterOgeneous Computing
Target: OpenCL - Benchmark: GEMM SGEMM_N
TensorFlow
Device: GPU - Batch Size: 1 - Model: AlexNet
ProjectPhysX OpenCL-Benchmark
Operation: FP64 Compute
SHOC Scalable HeterOgeneous Computing
Target: OpenCL - Benchmark: Reduction
Waifu2x-NCNN Vulkan
Scale: 2x - Denoise: 3 - TAA: Yes
cl-mem
Benchmark: Copy
Hashcat
Benchmark: TrueCrypt RIPEMD160 + XTS
Darktable
Test: Boat - Acceleration: CPU-only
ParaView
Test: Wavelet Contour - Resolution: 1920 x 1080
ParaView
Test: Wavelet Contour - Resolution: 1920 x 1080
clpeak
OpenCL Test: Global Memory Bandwidth
Darktable
Test: Masskrug - Acceleration: OpenCL
Darktable
Test: Server Room - Acceleration: OpenCL
FinanceBench
Benchmark: Black-Scholes OpenCL
Darktable
Test: Masskrug - Acceleration: CPU-only
Darktable
Test: Server Room - Acceleration: CPU-only
SHOC Scalable HeterOgeneous Computing
Target: OpenCL - Benchmark: Triad
SHOC Scalable HeterOgeneous Computing
Target: OpenCL - Benchmark: MD5 Hash
SHOC Scalable HeterOgeneous Computing
Target: OpenCL - Benchmark: Bus Speed Readback
Darktable
Test: Server Rack - Acceleration: OpenCL
SHOC Scalable HeterOgeneous Computing
Target: OpenCL - Benchmark: Bus Speed Download
SHOC Scalable HeterOgeneous Computing
Target: OpenCL - Benchmark: FFT SP
SHOC Scalable HeterOgeneous Computing
Target: OpenCL - Benchmark: S3D
Darktable
Test: Server Rack - Acceleration: CPU-only
NeatBench
Acceleration: GPU
Phoronix Test Suite v10.8.5