gpu_init_bench

AMD Ryzen 5 5600X 6-Core testing with a ASUS TUF GAMING X570-PLUS (4021 BIOS) and NVIDIA GeForce RTX 3070 Ti 8GB on Ubuntu 20.04 via the Phoronix Test Suite.

qpu_init_bench:

  Processor: AMD Ryzen 5 5600X 6-Core @ 3.70GHz (6 Cores / 12 Threads), Motherboard: ASUS TUF GAMING X570-PLUS (4021 BIOS), Chipset: AMD Starship/Matisse, Memory: 32GB, Disk: 250GB Samsung SSD 860 + 1000GB Samsung SSD 870, Graphics: NVIDIA GeForce RTX 3070 Ti 8GB, Audio: NVIDIA Device 228b, Monitor: S24E510C, Network: Realtek RTL8111/8168/8411

  OS: Ubuntu 20.04, Kernel: 5.4.0-42-generic (x86_64), Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Display Driver: NVIDIA 470.94, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 11.4.176, Vulkan: 1.2.175, Compiler: GCC 9.3.0, File-System: ext4, Screen Resolution: 1920x1080

vkpeak 20210424
fp32-scalar
GFLOPS > Higher Is Better
qpu_init_bench . 11797.83 |====================================================

vkpeak 20210424
fp32-vec4
GFLOPS > Higher Is Better
qpu_init_bench . 15649.98 |====================================================

vkpeak 20210424
fp16-scalar
GFLOPS > Higher Is Better
qpu_init_bench . 11830.50 |====================================================

vkpeak 20210424
fp16-vec4
GFLOPS > Higher Is Better
qpu_init_bench . 23000.40 |====================================================

vkpeak 20210424
fp64-scalar
GFLOPS > Higher Is Better
qpu_init_bench . 369.09 |======================================================

vkpeak 20210424
fp64-vec4
GFLOPS > Higher Is Better
qpu_init_bench . 370.28 |======================================================

vkpeak 20210424
int32-scalar
GIOPS > Higher Is Better
qpu_init_bench . 11763.56 |====================================================

vkpeak 20210424
int32-vec4
GIOPS > Higher Is Better
qpu_init_bench . 11696.96 |====================================================

vkpeak 20210424
int16-scalar
GIOPS > Higher Is Better
qpu_init_bench . 7764.50 |=====================================================

vkpeak 20210424
int16-vec4
GIOPS > Higher Is Better
qpu_init_bench . 10277.20 |====================================================

RealSR-NCNN 20200818
Scale: 4x - TAA: No
Seconds < Lower Is Better
qpu_init_bench . 7.309 |=======================================================

RealSR-NCNN 20200818
Scale: 4x - TAA: Yes
Seconds < Lower Is Better
qpu_init_bench . 41.32 |=======================================================

Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: No
Seconds < Lower Is Better

Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: Yes
Seconds < Lower Is Better
qpu_init_bench . 3.936 |=======================================================

VkFFT 1.1.1
Benchmark Score > Higher Is Better
qpu_init_bench . 36861 |=======================================================

Hashcat 6.2.4
Benchmark: MD5
H/s > Higher Is Better
qpu_init_bench . 42545333333 |=================================================

Hashcat 6.2.4
Benchmark: SHA1
H/s > Higher Is Better
qpu_init_bench . 13385600000 |=================================================

Hashcat 6.2.4
Benchmark: 7-Zip
H/s > Higher Is Better
qpu_init_bench . 692400 |======================================================

Hashcat 6.2.4
Benchmark: SHA-512
H/s > Higher Is Better
qpu_init_bench . 1705800000 |==================================================

Hashcat 6.2.4
Benchmark: TrueCrypt RIPEMD160 + XTS
H/s > Higher Is Better
qpu_init_bench . 510833 |======================================================

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Integer

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Integer

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Double Precision

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Single Precision

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Half Precision

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Double Precision

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Single Precision

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: S3D
GFLOPS > Higher Is Better
qpu_init_bench . 285.06 |======================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Triad
GB/s > Higher Is Better
qpu_init_bench . 6.5570 |======================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: FFT SP
GFLOPS > Higher Is Better
qpu_init_bench . 1529.56 |=====================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: MD5 Hash
GHash/s > Higher Is Better
qpu_init_bench . 26.51 |=======================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Reduction
GB/s > Higher Is Better
qpu_init_bench . 367.87 |======================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: GEMM SGEMM_N
GFLOPS > Higher Is Better
qpu_init_bench . 4724.99 |=====================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Max SP Flops
GFLOPS > Higher Is Better
qpu_init_bench . 23442.2 |=====================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Download
GB/s > Higher Is Better
qpu_init_bench . 6.3218 |======================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Readback
GB/s > Higher Is Better
qpu_init_bench . 6.7638 |======================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Texture Read Bandwidth
GB/s > Higher Is Better
qpu_init_bench . 1982.41 |=====================================================

Libplacebo 2.72.2
FPS > Higher Is Better

cl-mem 2017-01-13
Benchmark: Copy
GB/s > Higher Is Better
qpu_init_bench . 339.1 |=======================================================

cl-mem 2017-01-13
Benchmark: Read
GB/s > Higher Is Better
qpu_init_bench . 539.5 |=======================================================

cl-mem 2017-01-13
Benchmark: Write
GB/s > Higher Is Better
qpu_init_bench . 521.9 |=======================================================

NAMD CUDA 2.14
ATPase Simulation - 327,506 Atoms
days/ns < Lower Is Better
qpu_init_bench . 0.20725 |=====================================================

Betsy GPU Compressor 1.1 Beta
Codec: ETC1 - Quality: Highest
Seconds < Lower Is Better
qpu_init_bench . 4.324 |=======================================================

Betsy GPU Compressor 1.1 Beta
Codec: ETC2 RGB - Quality: Highest
Seconds < Lower Is Better
qpu_init_bench . 5.827 |=======================================================

VkResample 1.0
Upscale: 2x - Precision: Double
ms < Lower Is Better
qpu_init_bench . 210.98 |======================================================

VkResample 1.0
Upscale: 2x - Precision: Single
ms < Lower Is Better
qpu_init_bench . 13.49 |=======================================================

OctaneBench 2020.1
Total Score
Score > Higher Is Better
qpu_init_bench . 452.53 |======================================================

RedShift Demo 3.0
Seconds < Lower Is Better

FAHBench 2.3.2
Ns Per Day > Higher Is Better
qpu_init_bench . 263.85 |======================================================

LeelaChessZero 0.28
Backend: OpenCL
Nodes Per Second > Higher Is Better

ArrayFire 3.7
Test: Conjugate Gradient OpenCL
ms < Lower Is Better
qpu_init_bench . 1.740 |=======================================================

LuxCoreRender 2.5
Scene: DLSC - Acceleration: GPU
M samples/sec > Higher Is Better
qpu_init_bench . 6.99 |========================================================

LuxCoreRender 2.5
Scene: Danish Mood - Acceleration: GPU
M samples/sec > Higher Is Better
qpu_init_bench . 4.50 |========================================================

LuxCoreRender 2.5
Scene: Orange Juice - Acceleration: GPU
M samples/sec > Higher Is Better
qpu_init_bench . 6.96 |========================================================

LuxCoreRender 2.5
Scene: LuxCore Benchmark - Acceleration: GPU
M samples/sec > Higher Is Better
qpu_init_bench . 5.68 |========================================================

LuxCoreRender 2.5
Scene: Rainbow Colors and Prism - Acceleration: GPU
M samples/sec > Higher Is Better
qpu_init_bench . 21.33 |=======================================================

FinanceBench 2016-07-25
Benchmark: Black-Scholes OpenCL
ms < Lower Is Better
qpu_init_bench . 9.958 |=======================================================

ViennaCL 1.7.1
Test: CPU BLAS - sCOPY
GB/s > Higher Is Better
qpu_init_bench . 24.0 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - sAXPY
GB/s > Higher Is Better
qpu_init_bench . 36.2 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - sDOT
GB/s > Higher Is Better
qpu_init_bench . 42.3 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dCOPY
GB/s > Higher Is Better
qpu_init_bench . 19 |==========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dAXPY
GB/s > Higher Is Better
qpu_init_bench . 28.3 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dDOT
GB/s > Higher Is Better
qpu_init_bench . 33.0 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-N
GB/s > Higher Is Better
qpu_init_bench . 37.3 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-T
GB/s > Higher Is Better
qpu_init_bench . 37.6 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
qpu_init_bench . 32.7 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
qpu_init_bench . 32.3 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
qpu_init_bench . 33.9 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
qpu_init_bench . 33.5 |========================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sCOPY
GB/s > Higher Is Better
qpu_init_bench . 331 |=========================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sAXPY
GB/s > Higher Is Better
qpu_init_bench . 433 |=========================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sDOT
GB/s > Higher Is Better
qpu_init_bench . 362 |=========================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dCOPY
GB/s > Higher Is Better
qpu_init_bench . 475 |=========================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dAXPY
GB/s > Higher Is Better
qpu_init_bench . 526 |=========================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dDOT
GB/s > Higher Is Better
qpu_init_bench . 516 |=========================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-N
GB/s > Higher Is Better
qpu_init_bench . 237 |=========================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-T
GB/s > Higher Is Better
qpu_init_bench . 368 |=========================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
qpu_init_bench . 349 |=========================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
qpu_init_bench . 351 |=========================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
qpu_init_bench . 348 |=========================================================

GROMACS 2021.2
Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

NCNN 20210720
Target: Vulkan GPU - Model: mobilenet
ms < Lower Is Better
qpu_init_bench . 4.01 |========================================================

NCNN 20210720
Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
ms < Lower Is Better
qpu_init_bench . 1.75 |========================================================

NCNN 20210720
Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
ms < Lower Is Better
qpu_init_bench . 1.99 |========================================================

NCNN 20210720
Target: Vulkan GPU - Model: shufflenet-v2
ms < Lower Is Better
qpu_init_bench . 1.57 |========================================================

NCNN 20210720
Target: Vulkan GPU - Model: mnasnet
ms < Lower Is Better
qpu_init_bench . 1.84 |========================================================

NCNN 20210720
Target: Vulkan GPU - Model: efficientnet-b0
ms < Lower Is Better
qpu_init_bench . 3.00 |========================================================

NCNN 20210720
Target: Vulkan GPU - Model: blazeface
ms < Lower Is Better
qpu_init_bench . 0.79 |========================================================

NCNN 20210720
Target: Vulkan GPU - Model: googlenet
ms < Lower Is Better
qpu_init_bench . 3.75 |========================================================

NCNN 20210720
Target: Vulkan GPU - Model: vgg16
ms < Lower Is Better
qpu_init_bench . 5.05 |========================================================

NCNN 20210720
Target: Vulkan GPU - Model: resnet18
ms < Lower Is Better
qpu_init_bench . 1.56 |========================================================

NCNN 20210720
Target: Vulkan GPU - Model: alexnet
ms < Lower Is Better
qpu_init_bench . 1.74 |========================================================

NCNN 20210720
Target: Vulkan GPU - Model: resnet50
ms < Lower Is Better
qpu_init_bench . 3.53 |========================================================

NCNN 20210720
Target: Vulkan GPU - Model: yolov4-tiny
ms < Lower Is Better
qpu_init_bench . 6.18 |========================================================

NCNN 20210720
Target: Vulkan GPU - Model: squeezenet_ssd
ms < Lower Is Better
qpu_init_bench . 4.17 |========================================================

NCNN 20210720
Target: Vulkan GPU - Model: regnety_400m
ms < Lower Is Better
qpu_init_bench . 2.25 |========================================================

PlaidML
FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
Examples Per Second > Higher Is Better

Blender 3.0
Blend File: BMW27 - Compute: CUDA
Seconds < Lower Is Better
qpu_init_bench . 17.01 |=======================================================

Blender 3.0
Blend File: Classroom - Compute: CUDA
Seconds < Lower Is Better
qpu_init_bench . 34.88 |=======================================================

Blender 3.0
Blend File: Fishy Cat - Compute: CUDA
Seconds < Lower Is Better
qpu_init_bench . 36.95 |=======================================================

Blender 3.0
Blend File: Barbershop - Compute: CUDA
Seconds < Lower Is Better
qpu_init_bench . 146.45 |======================================================

Blender 3.0
Blend File: BMW27 - Compute: NVIDIA OptiX
Seconds < Lower Is Better
qpu_init_bench . 9.50 |========================================================

Blender 3.0
Blend File: Classroom - Compute: NVIDIA OptiX
Seconds < Lower Is Better
qpu_init_bench . 25.15 |=======================================================

Blender 3.0
Blend File: Fishy Cat - Compute: NVIDIA OptiX
Seconds < Lower Is Better
qpu_init_bench . 17.69 |=======================================================

Blender 3.0
Blend File: Barbershop - Compute: NVIDIA OptiX
Seconds < Lower Is Better
qpu_init_bench . 84.78 |=======================================================

Blender 3.0
Blend File: Pabellon Barcelona - Compute: CUDA
Seconds < Lower Is Better
qpu_init_bench . 79.58 |=======================================================

Blender 3.0
Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX
Seconds < Lower Is Better
qpu_init_bench . 25.59 |=======================================================

IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Bedroom
M samples/s > Higher Is Better
qpu_init_bench . 13.73 |=======================================================

IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Supercar
M samples/s > Higher Is Better
qpu_init_bench . 38.64 |=======================================================

MandelGPU 1.3pts1
OpenCL Device: GPU
Samples/sec > Higher Is Better
qpu_init_bench . 288849901.7 |=================================================

clpeak
OpenCL Test: Integer Compute INT
GIOPS > Higher Is Better
qpu_init_bench . 10822.61 |====================================================

clpeak
OpenCL Test: Single-Precision Float
GFLOPS > Higher Is Better
qpu_init_bench . 21443.22 |====================================================

clpeak
OpenCL Test: Double-Precision Double
GFLOPS > Higher Is Better
qpu_init_bench . 371.07 |======================================================

clpeak
OpenCL Test: Global Memory Bandwidth
GBPS > Higher Is Better
qpu_init_bench . 529.63 |======================================================

NeatBench 5
Acceleration: GPU
FPS > Higher Is Better
qpu_init_bench . 3070 |========================================================