ngpu.test

AMD Ryzen 9 7950X 16-Core testing with a ASUS ProArt X670E-CREATOR WIFI (1004 BIOS) and NVIDIA GeForce RTX 4090 24GB on Ubuntu 22.04 via the Phoronix Test Suite.

pts_nvidia-gpu-compute:

  Processor: AMD Ryzen 9 7950X 16-Core @ 4.50GHz (16 Cores / 32 Threads), Motherboard: ASUS ProArt X670E-CREATOR WIFI (1004 BIOS), Chipset: AMD Device 14d8, Memory: 2 x 32 GB DDR5-5600MT/s Kingston KF556C40-32, Disk: 2000GB Samsung SSD 980 PRO 2TB, Graphics: NVIDIA GeForce RTX 4090 24GB, Audio: NVIDIA Device 22ba, Monitor: LG Ultra HD, Network: Intel I225-V + MEDIATEK Device 0616

  OS: Ubuntu 22.04, Kernel: 5.19.0-38-generic (x86_64), Desktop: KDE Plasma 5.24.7, Display Server: X Server 1.21.1.3, Display Driver: NVIDIA 525.105.17, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.0.151, Vulkan: 1.3.224, Compiler: GCC 11.3.0, File-System: xfs, Screen Resolution: 3840x2160

vkpeak 20210424
fp32-scalar
GFLOPS > Higher Is Better
pts_nvidia-gpu-compute . 45133.99 |============================================

vkpeak 20210424
fp32-vec4
GFLOPS > Higher Is Better
pts_nvidia-gpu-compute . 59636.96 |============================================

vkpeak 20210424
fp16-scalar
GFLOPS > Higher Is Better
pts_nvidia-gpu-compute . 45059.05 |============================================

vkpeak 20210424
fp16-vec4
GFLOPS > Higher Is Better
pts_nvidia-gpu-compute . 89348.10 |============================================

vkpeak 20210424
fp64-scalar
GFLOPS > Higher Is Better
pts_nvidia-gpu-compute . 1419.89 |=============================================

vkpeak 20210424
fp64-vec4
GFLOPS > Higher Is Better
pts_nvidia-gpu-compute . 1419.84 |=============================================

vkpeak 20210424
int32-scalar
GIOPS > Higher Is Better
pts_nvidia-gpu-compute . 45045.95 |============================================

vkpeak 20210424
int32-vec4
GIOPS > Higher Is Better
pts_nvidia-gpu-compute . 44803.57 |============================================

vkpeak 20210424
int16-scalar
GIOPS > Higher Is Better
pts_nvidia-gpu-compute . 29976.88 |============================================

vkpeak 20210424
int16-vec4
GIOPS > Higher Is Better
pts_nvidia-gpu-compute . 39957.84 |============================================

RealSR-NCNN 20200818
Scale: 4x - TAA: No
Seconds < Lower Is Better
pts_nvidia-gpu-compute . 4.211 |===============================================

RealSR-NCNN 20200818
Scale: 4x - TAA: Yes
Seconds < Lower Is Better
pts_nvidia-gpu-compute . 19.12 |===============================================

Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: No
Seconds < Lower Is Better

Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: Yes
Seconds < Lower Is Better
pts_nvidia-gpu-compute . 2.468 |===============================================

VkFFT 1.1.1
Benchmark Score > Higher Is Better
pts_nvidia-gpu-compute . 63576 |===============================================

Hashcat 6.2.4
Benchmark: MD5
H/s > Higher Is Better
pts_nvidia-gpu-compute . 156200000000 |========================================

Hashcat 6.2.4
Benchmark: SHA1
H/s > Higher Is Better
pts_nvidia-gpu-compute . 49962233333 |=========================================

Hashcat 6.2.4
Benchmark: 7-Zip
H/s > Higher Is Better
pts_nvidia-gpu-compute . 2741600 |=============================================

Hashcat 6.2.4
Benchmark: SHA-512
H/s > Higher Is Better
pts_nvidia-gpu-compute . 6389066667 |==========================================

Hashcat 6.2.4
Benchmark: TrueCrypt RIPEMD160 + XTS
H/s > Higher Is Better
pts_nvidia-gpu-compute . 1867900 |=============================================

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Integer

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Integer

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Double Precision

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Single Precision

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Half Precision

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Double Precision

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Single Precision

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: S3D
GFLOPS > Higher Is Better
pts_nvidia-gpu-compute . 646.63 |==============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Triad
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 26.39 |===============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: FFT SP
GFLOPS > Higher Is Better
pts_nvidia-gpu-compute . 2778.27 |=============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: MD5 Hash
GHash/s > Higher Is Better
pts_nvidia-gpu-compute . 93.47 |===============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Reduction
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 992.53 |==============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: GEMM SGEMM_N
GFLOPS > Higher Is Better
pts_nvidia-gpu-compute . 28105.9 |=============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Max SP Flops
GFLOPS > Higher Is Better
pts_nvidia-gpu-compute . 90084.7 |=============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Download
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 26.89 |===============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Readback
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 26.40 |===============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Texture Read Bandwidth
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 3032.40 |=============================================

Libplacebo 5.229.1
FPS > Higher Is Better

cl-mem 2017-01-13
Benchmark: Copy
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 411.6 |===============================================

cl-mem 2017-01-13
Benchmark: Read
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 885.5 |===============================================

cl-mem 2017-01-13
Benchmark: Write
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 799.4 |===============================================

NAMD CUDA 2.14
ATPase Simulation - 327,506 Atoms
days/ns < Lower Is Better
pts_nvidia-gpu-compute . 0.06968 |=============================================

Betsy GPU Compressor 1.1 Beta
Codec: ETC1 - Quality: Highest
Seconds < Lower Is Better

Betsy GPU Compressor 1.1 Beta
Codec: ETC2 RGB - Quality: Highest
Seconds < Lower Is Better

VkResample 1.0
Upscale: 2x - Precision: Double
ms < Lower Is Better
pts_nvidia-gpu-compute . 55.39 |===============================================

VkResample 1.0
Upscale: 2x - Precision: Single
ms < Lower Is Better
pts_nvidia-gpu-compute . 7.784 |===============================================

OctaneBench 2020.1
Total Score
Score > Higher Is Better
pts_nvidia-gpu-compute . 1330.49 |=============================================

RedShift Demo 3.0
Seconds < Lower Is Better

FAHBench 2.3.2
Ns Per Day > Higher Is Better
pts_nvidia-gpu-compute . 440.50 |==============================================

clpeak 1.1.2
OpenCL Test: Integer Compute INT
GIOPS > Higher Is Better
pts_nvidia-gpu-compute . 40863.0 |=============================================

clpeak 1.1.2
OpenCL Test: Single-Precision Float
GFLOPS > Higher Is Better
pts_nvidia-gpu-compute . 78857.40 |============================================

clpeak 1.1.2
OpenCL Test: Double-Precision Double
GFLOPS > Higher Is Better
pts_nvidia-gpu-compute . 1413.35 |=============================================

clpeak 1.1.2
OpenCL Test: Global Memory Bandwidth
GBPS > Higher Is Better
pts_nvidia-gpu-compute . 870.86 |==============================================

LeelaChessZero 0.28
Backend: OpenCL
Nodes Per Second > Higher Is Better
pts_nvidia-gpu-compute . 46246 |===============================================

Rodinia 3.1
Test: OpenCL Particle Filter
Seconds < Lower Is Better
pts_nvidia-gpu-compute . 2.095 |===============================================

ArrayFire 3.7
Test: Conjugate Gradient OpenCL
ms < Lower Is Better
pts_nvidia-gpu-compute . 0.8731 |==============================================

LuxCoreRender 2.6
Scene: DLSC - Acceleration: GPU
M samples/sec > Higher Is Better
pts_nvidia-gpu-compute . 22.43 |===============================================

LuxCoreRender 2.6
Scene: Danish Mood - Acceleration: GPU
M samples/sec > Higher Is Better
pts_nvidia-gpu-compute . 18.84 |===============================================

LuxCoreRender 2.6
Scene: Orange Juice - Acceleration: GPU
M samples/sec > Higher Is Better
pts_nvidia-gpu-compute . 17.64 |===============================================

LuxCoreRender 2.6
Scene: LuxCore Benchmark - Acceleration: GPU
M samples/sec > Higher Is Better
pts_nvidia-gpu-compute . 19.26 |===============================================

LuxCoreRender 2.6
Scene: Rainbow Colors and Prism - Acceleration: GPU
M samples/sec > Higher Is Better
pts_nvidia-gpu-compute . 40.94 |===============================================

FinanceBench 2016-07-25
Benchmark: Black-Scholes OpenCL
ms < Lower Is Better
pts_nvidia-gpu-compute . 2.934 |===============================================

ViennaCL 1.7.1
Test: CPU BLAS - sCOPY
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 208 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - sAXPY
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 305 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - sDOT
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 300 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - dCOPY
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 58.7 |================================================

ViennaCL 1.7.1
Test: CPU BLAS - dAXPY
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 88.3 |================================================

ViennaCL 1.7.1
Test: CPU BLAS - dDOT
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 93.4 |================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-N
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 101.8 |===============================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-T
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 127 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
pts_nvidia-gpu-compute . 110 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
pts_nvidia-gpu-compute . 106 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
pts_nvidia-gpu-compute . 117 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
pts_nvidia-gpu-compute . 111 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sCOPY
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 480 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sAXPY
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 618 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sDOT
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 451 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dCOPY
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 661 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dAXPY
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 773 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dDOT
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 730 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-N
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 221 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-T
GB/s > Higher Is Better
pts_nvidia-gpu-compute . 446 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
pts_nvidia-gpu-compute . 1167 |================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
pts_nvidia-gpu-compute . 1290 |================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
pts_nvidia-gpu-compute . 1310 |================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
pts_nvidia-gpu-compute . 1357 |================================================

GROMACS 2023
Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

NCNN 20220729
Target: Vulkan GPU - Model: mobilenet
ms < Lower Is Better
pts_nvidia-gpu-compute . 2.68 |================================================

NCNN 20220729
Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
ms < Lower Is Better
pts_nvidia-gpu-compute . 1.11 |================================================

NCNN 20220729
Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
ms < Lower Is Better
pts_nvidia-gpu-compute . 1.53 |================================================

NCNN 20220729
Target: Vulkan GPU - Model: shufflenet-v2
ms < Lower Is Better
pts_nvidia-gpu-compute . 1.28 |================================================

NCNN 20220729
Target: Vulkan GPU - Model: mnasnet
ms < Lower Is Better
pts_nvidia-gpu-compute . 0.99 |================================================

NCNN 20220729
Target: Vulkan GPU - Model: efficientnet-b0
ms < Lower Is Better
pts_nvidia-gpu-compute . 2.27 |================================================

NCNN 20220729
Target: Vulkan GPU - Model: blazeface
ms < Lower Is Better
pts_nvidia-gpu-compute . 0.84 |================================================

NCNN 20220729
Target: Vulkan GPU - Model: googlenet
ms < Lower Is Better
pts_nvidia-gpu-compute . 2.30 |================================================

NCNN 20220729
Target: Vulkan GPU - Model: vgg16
ms < Lower Is Better
pts_nvidia-gpu-compute . 1.99 |================================================

NCNN 20220729
Target: Vulkan GPU - Model: resnet18
ms < Lower Is Better
pts_nvidia-gpu-compute . 1.47 |================================================

NCNN 20220729
Target: Vulkan GPU - Model: alexnet
ms < Lower Is Better
pts_nvidia-gpu-compute . 1.99 |================================================

NCNN 20220729
Target: Vulkan GPU - Model: resnet50
ms < Lower Is Better
pts_nvidia-gpu-compute . 2.07 |================================================

NCNN 20220729
Target: Vulkan GPU - Model: yolov4-tiny
ms < Lower Is Better
pts_nvidia-gpu-compute . 5.15 |================================================

NCNN 20220729
Target: Vulkan GPU - Model: squeezenet_ssd
ms < Lower Is Better
pts_nvidia-gpu-compute . 3.43 |================================================

NCNN 20220729
Target: Vulkan GPU - Model: regnety_400m
ms < Lower Is Better
pts_nvidia-gpu-compute . 1.36 |================================================

NCNN 20220729
Target: Vulkan GPU - Model: vision_transformer
ms < Lower Is Better
pts_nvidia-gpu-compute . 135.14 |==============================================

NCNN 20220729
Target: Vulkan GPU - Model: FastestDet
ms < Lower Is Better
pts_nvidia-gpu-compute . 1.99 |================================================

PlaidML
FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
Examples Per Second > Higher Is Better

Blender 3.5
Blend File: BMW27 - Compute: NVIDIA OptiX
Seconds < Lower Is Better
pts_nvidia-gpu-compute . 13.27 |===============================================

Blender 3.5
Blend File: Classroom - Compute: NVIDIA OptiX
Seconds < Lower Is Better
pts_nvidia-gpu-compute . 7.07 |================================================

Blender 3.5
Blend File: Fishy Cat - Compute: NVIDIA OptiX
Seconds < Lower Is Better
pts_nvidia-gpu-compute . 5.38 |================================================

Blender 3.5
Blend File: Barbershop - Compute: NVIDIA OptiX
Seconds < Lower Is Better
pts_nvidia-gpu-compute . 29.75 |===============================================

Blender 3.5
Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX
Seconds < Lower Is Better
pts_nvidia-gpu-compute . 8.12 |================================================

IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Bedroom
M samples/s > Higher Is Better
pts_nvidia-gpu-compute . 35.66 |===============================================

IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Supercar
M samples/s > Higher Is Better
pts_nvidia-gpu-compute . 80.07 |===============================================

MandelGPU 1.3pts1
OpenCL Device: GPU
Samples/sec > Higher Is Better
pts_nvidia-gpu-compute . 907383065.7 |=========================================

NeatBench 5
Acceleration: GPU
FPS > Higher Is Better
pts_nvidia-gpu-compute . 4090 |================================================