gpu_test_results

2 x Intel Xeon Silver 4210 testing with a Dell 0804P1 (2.24.0 BIOS) and NVIDIA Quadro RTX 4000 8GB on Ubuntu 22.04 via the Phoronix Test Suite.

gpu_tests:

  Processor: 2 x Intel Xeon Silver 4210 @ 3.20GHz (20 Cores / 40 Threads), Motherboard: Dell 0804P1 (2.24.0 BIOS), Chipset: Intel Sky Lake-E DMI3 Registers, Memory: 32GB, Disk: 2 x Toshiba KXG60ZNV256G NVMe 256GB, Graphics: NVIDIA Quadro RTX 4000 8GB, Audio: Realtek ALC3234, Network: Intel I219-LM

  OS: Ubuntu 22.04, Kernel: 5.15.0-41-generic (x86_64), Display Server: X Server, Display Driver: NVIDIA, OpenCL: OpenCL 3.0 CUDA 11.7.89, Vulkan: 1.2.204, Compiler: GCC 11.2.0, File-System: ext4, Screen Resolution: 1024x768

Hashcat 6.2.4
Benchmark: MD5
H/s > Higher Is Better
gpu_tests . 25831366667 |======================================================

Hashcat 6.2.4
Benchmark: SHA1
H/s > Higher Is Better
gpu_tests . 8761500000 |=======================================================

Hashcat 6.2.4
Benchmark: 7-Zip
H/s > Higher Is Better
gpu_tests . 435567 |===========================================================

Hashcat 6.2.4
Benchmark: SHA-512
H/s > Higher Is Better
gpu_tests . 1098900000 |=======================================================

Hashcat 6.2.4
Benchmark: TrueCrypt RIPEMD160 + XTS
H/s > Higher Is Better
gpu_tests . 314167 |===========================================================

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Integer

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Integer

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Double Precision

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Single Precision

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Half Precision

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Double Precision

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Single Precision

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: S3D
GFLOPS > Higher Is Better
gpu_tests . 169.21 |===========================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Triad
GB/s > Higher Is Better
gpu_tests . 12.12 |============================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: FFT SP
GFLOPS > Higher Is Better
gpu_tests . 719.74 |===========================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: MD5 Hash
GHash/s > Higher Is Better
gpu_tests . 17.11 |============================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Reduction
GB/s > Higher Is Better
gpu_tests . 315.33 |===========================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: GEMM SGEMM_N
GFLOPS > Higher Is Better
gpu_tests . 2876.27 |==========================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Max SP Flops
GFLOPS > Higher Is Better
gpu_tests . 8469.32 |==========================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Download
GB/s > Higher Is Better
gpu_tests . 12.24 |============================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Readback
GB/s > Higher Is Better
gpu_tests . 13.17 |============================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Texture Read Bandwidth
GB/s > Higher Is Better
gpu_tests . 1085.43 |==========================================================

cl-mem 2017-01-13
Benchmark: Copy
GB/s > Higher Is Better
gpu_tests . 280.3 |============================================================

cl-mem 2017-01-13
Benchmark: Read
GB/s > Higher Is Better
gpu_tests . 379.6 |============================================================

cl-mem 2017-01-13
Benchmark: Write
GB/s > Higher Is Better
gpu_tests . 324.5 |============================================================

RedShift Demo 3.0
Seconds < Lower Is Better

FAHBench 2.3.2
Ns Per Day > Higher Is Better
gpu_tests . 191.16 |===========================================================

LeelaChessZero 0.28
Backend: OpenCL
Nodes Per Second > Higher Is Better
gpu_tests . 3748 |=============================================================

Rodinia 3.1
Test: OpenCL Particle Filter
Seconds < Lower Is Better
gpu_tests . 8.096 |============================================================

ArrayFire 3.7
Test: Conjugate Gradient OpenCL
ms < Lower Is Better
gpu_tests . 2.207 |============================================================

LuxCoreRender 2.6
Scene: DLSC - Acceleration: GPU
M samples/sec > Higher Is Better
gpu_tests . 3.35 |=============================================================

LuxCoreRender 2.6
Scene: Danish Mood - Acceleration: GPU
M samples/sec > Higher Is Better
gpu_tests . 2.19 |=============================================================

LuxCoreRender 2.6
Scene: Orange Juice - Acceleration: GPU
M samples/sec > Higher Is Better
gpu_tests . 3.54 |=============================================================

LuxCoreRender 2.6
Scene: LuxCore Benchmark - Acceleration: GPU
M samples/sec > Higher Is Better
gpu_tests . 2.76 |=============================================================

LuxCoreRender 2.6
Scene: Rainbow Colors and Prism - Acceleration: GPU
M samples/sec > Higher Is Better
gpu_tests . 10.77 |============================================================

FinanceBench 2016-07-25
Benchmark: Black-Scholes OpenCL
ms < Lower Is Better
gpu_tests . 22.21 |============================================================

ViennaCL 1.7.1
Test: CPU BLAS - sCOPY
GB/s > Higher Is Better
gpu_tests . 29.52 |============================================================

ViennaCL 1.7.1
Test: CPU BLAS - sAXPY
GB/s > Higher Is Better
gpu_tests . 34.9 |=============================================================

ViennaCL 1.7.1
Test: CPU BLAS - sDOT
GB/s > Higher Is Better
gpu_tests . 48.20 |============================================================

ViennaCL 1.7.1
Test: CPU BLAS - dCOPY
GB/s > Higher Is Better
gpu_tests . 27.7 |=============================================================

ViennaCL 1.7.1
Test: CPU BLAS - dAXPY
GB/s > Higher Is Better
gpu_tests . 31.3 |=============================================================

ViennaCL 1.7.1
Test: CPU BLAS - dDOT
GB/s > Higher Is Better
gpu_tests . 40.2 |=============================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-N
GB/s > Higher Is Better
gpu_tests . 35.3 |=============================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-T
GB/s > Higher Is Better
gpu_tests . 38.86 |============================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
gpu_tests . 36.4 |=============================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
gpu_tests . 38.5 |=============================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
gpu_tests . 37.9 |=============================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
gpu_tests . 38.6 |=============================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sCOPY
GB/s > Higher Is Better
gpu_tests . 262 |==============================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sAXPY
GB/s > Higher Is Better
gpu_tests . 318 |==============================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sDOT
GB/s > Higher Is Better
gpu_tests . 248 |==============================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dCOPY
GB/s > Higher Is Better
gpu_tests . 355 |==============================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dAXPY
GB/s > Higher Is Better
gpu_tests . 372 |==============================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dDOT
GB/s > Higher Is Better
gpu_tests . 384 |==============================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-N
GB/s > Higher Is Better
gpu_tests . 356 |==============================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-T
GB/s > Higher Is Better
gpu_tests . 312 |==============================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
gpu_tests . 254 |==============================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
gpu_tests . 254 |==============================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
gpu_tests . 251 |==============================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
gpu_tests . 255 |==============================================================

GROMACS 2022.1
Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

NCNN 20210720
Target: Vulkan GPU
ms < Lower Is Better

PlaidML
FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
Examples Per Second > Higher Is Better

Blender 3.2
Blend File: BMW27 - Compute: CUDA
Seconds < Lower Is Better
gpu_tests . 28.30 |============================================================

Blender 3.2
Blend File: Classroom - Compute: CUDA
Seconds < Lower Is Better
gpu_tests . 61.40 |============================================================

Blender 3.2
Blend File: Fishy Cat - Compute: CUDA
Seconds < Lower Is Better
gpu_tests . 64.09 |============================================================

Blender 3.2
Blend File: Barbershop - Compute: CUDA
Seconds < Lower Is Better
gpu_tests . 264.98 |===========================================================

Blender 3.2
Blend File: BMW27 - Compute: NVIDIA OptiX
Seconds < Lower Is Better
gpu_tests . 15.54 |============================================================

Blender 3.2
Blend File: Classroom - Compute: NVIDIA OptiX
Seconds < Lower Is Better
gpu_tests . 42.03 |============================================================

Blender 3.2
Blend File: Fishy Cat - Compute: NVIDIA OptiX
Seconds < Lower Is Better
gpu_tests . 31.94 |============================================================

Blender 3.2
Blend File: Barbershop - Compute: NVIDIA OptiX
Seconds < Lower Is Better
gpu_tests . 159.30 |===========================================================

Blender 3.2
Blend File: Pabellon Barcelona - Compute: CUDA
Seconds < Lower Is Better
gpu_tests . 154.46 |===========================================================

Blender 3.2
Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX
Seconds < Lower Is Better
gpu_tests . 46.79 |============================================================

MandelGPU 1.3pts1
OpenCL Device: GPU
Samples/sec > Higher Is Better

clpeak
OpenCL Test: Integer Compute INT
GIOPS > Higher Is Better
gpu_tests . 6079.47 |==========================================================

clpeak
OpenCL Test: Single-Precision Float
GFLOPS > Higher Is Better
gpu_tests . 6563.48 |==========================================================

clpeak
OpenCL Test: Double-Precision Double
GFLOPS > Higher Is Better
gpu_tests . 266.52 |===========================================================

clpeak
OpenCL Test: Global Memory Bandwidth
GBPS > Higher Is Better
gpu_tests . 345.40 |===========================================================

NeatBench 5
Acceleration: GPU
FPS > Higher Is Better
gpu_tests . 30.9 |=============================================================