gpu_compute_nvidia0_run1

Intel Core i7-12700K testing with a Gigabyte B660 DS3H DDR4 (F4 BIOS) and Gigabyte NVIDIA GeForce RTX 3070 8GB on Ubuntu 22.04 via the Phoronix Test Suite.

gpu_compute_0_1:

  Processor: Intel Core i7-12700K @ 4.90GHz (12 Cores / 20 Threads), Motherboard: Gigabyte B660 DS3H DDR4 (F4 BIOS), Chipset: Intel Device 7aa7, Memory: 64GB, Disk: 2000GB PNY CS2130 2TB SSD + 4 x 6001GB Western Digital WD60EZAZ-00S, Graphics: Gigabyte NVIDIA GeForce RTX 3070 8GB, Audio: Realtek ALC897, Monitor: BenQ PD2700U, Network: Realtek RTL8111/8168/8411

  OS: Ubuntu 22.04, Kernel: 5.15.0-71-generic (x86_64), Desktop: GNOME Shell 42.5, Display Server: X Server 1.21.1.4, Display Driver: NVIDIA 530.41.03, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.1.98, Vulkan: 1.3.236, Compiler: GCC 11.3.0 + CUDA 11.5, File-System: ext4, Screen Resolution: 3840x2160

NeatBench 5
Acceleration: GPU
FPS > Higher Is Better
gpu_compute_0_1 . 3070

MandelGPU 1.3pts1
OpenCL Device: GPU
Samples/sec > Higher Is Better
gpu_compute_0_1 . 399785618.2

IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Supercar
M samples/s > Higher Is Better
gpu_compute_0_1 . 70.56

IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Bedroom
M samples/s > Higher Is Better
gpu_compute_0_1 . 25.88

Blender 3.5
Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX
Seconds < Lower Is Better
gpu_compute_0_1 . 14.15

Blender 3.5
Blend File: Barbershop - Compute: NVIDIA OptiX
Seconds < Lower Is Better
gpu_compute_0_1 . 49.68

Blender 3.5
Blend File: Fishy Cat - Compute: NVIDIA OptiX
Seconds < Lower Is Better
gpu_compute_0_1 . 10.45

Blender 3.5
Blend File: Classroom - Compute: NVIDIA OptiX
Seconds < Lower Is Better
gpu_compute_0_1 . 12.65

Blender 3.5
Blend File: BMW27 - Compute: NVIDIA OptiX
Seconds < Lower Is Better
gpu_compute_0_1 . 5.33

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
gpu_compute_0_1 . 330

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-T
GB/s > Higher Is Better
gpu_compute_0_1 . 332

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-N
GB/s > Higher Is Better
gpu_compute_0_1 . 176

ViennaCL 1.7.1
Test: OpenCL BLAS - dDOT
GB/s > Higher Is Better
gpu_compute_0_1 . 402

ViennaCL 1.7.1
Test: OpenCL BLAS - dAXPY
GB/s > Higher Is Better
gpu_compute_0_1 . 400

ViennaCL 1.7.1
Test: OpenCL BLAS - dCOPY
GB/s > Higher Is Better
gpu_compute_0_1 . 377

ViennaCL 1.7.1
Test: OpenCL BLAS - sDOT
GB/s > Higher Is Better
gpu_compute_0_1 . 328

ViennaCL 1.7.1
Test: OpenCL BLAS - sAXPY
GB/s > Higher Is Better
gpu_compute_0_1 . 361

ViennaCL 1.7.1
Test: OpenCL BLAS - sCOPY
GB/s > Higher Is Better
gpu_compute_0_1 . 285

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
gpu_compute_0_1 . 67.8

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
gpu_compute_0_1 . 59.0

ViennaCL 1.7.1
Test: CPU BLAS - dDOT
GB/s > Higher Is Better
gpu_compute_0_1 . 38.6

ViennaCL 1.7.1
Test: CPU BLAS - dAXPY
GB/s > Higher Is Better
gpu_compute_0_1 . 37.2

ViennaCL 1.7.1
Test: CPU BLAS - dCOPY
GB/s > Higher Is Better
gpu_compute_0_1 . 34.3

ViennaCL 1.7.1
Test: CPU BLAS - sDOT
GB/s > Higher Is Better
gpu_compute_0_1 . 51.0

ViennaCL 1.7.1
Test: CPU BLAS - sAXPY
GB/s > Higher Is Better
gpu_compute_0_1 . 48.0

ViennaCL 1.7.1
Test: CPU BLAS - sCOPY
GB/s > Higher Is Better
gpu_compute_0_1 . 45.8

FinanceBench 2016-07-25
Benchmark: Black-Scholes OpenCL
ms < Lower Is Better
gpu_compute_0_1 . 9.801

LuxCoreRender 2.6
Scene: Rainbow Colors and Prism - Acceleration: GPU
M samples/sec > Higher Is Better
gpu_compute_0_1 . 35.97

LuxCoreRender 2.6
Scene: LuxCore Benchmark - Acceleration: GPU
M samples/sec > Higher Is Better
gpu_compute_0_1 . 12.50

LuxCoreRender 2.6
Scene: Orange Juice - Acceleration: GPU
M samples/sec > Higher Is Better
gpu_compute_0_1 . 13.98

LuxCoreRender 2.6
Scene: Danish Mood - Acceleration: GPU
M samples/sec > Higher Is Better
gpu_compute_0_1 . 9.93

LuxCoreRender 2.6
Scene: DLSC - Acceleration: GPU
M samples/sec > Higher Is Better
gpu_compute_0_1 . 16.43

ArrayFire 3.7
Test: Conjugate Gradient OpenCL
ms < Lower Is Better
gpu_compute_0_1 . 2.110

Rodinia 3.1
Test: OpenCL Particle Filter
Seconds < Lower Is Better
gpu_compute_0_1 . 6.108

LeelaChessZero 0.28
Backend: OpenCL
Nodes Per Second > Higher Is Better
gpu_compute_0_1 . 11591

clpeak 1.1.2
OpenCL Test: Global Memory Bandwidth
GBPS > Higher Is Better
gpu_compute_0_1 . 390.85

clpeak 1.1.2
OpenCL Test: Double-Precision Double
GFLOPS > Higher Is Better
gpu_compute_0_1 . 354.00

clpeak 1.1.2
OpenCL Test: Single-Precision Float
GFLOPS > Higher Is Better
gpu_compute_0_1 . 19632.77

clpeak 1.1.2
OpenCL Test: Integer Compute INT
GIOPS > Higher Is Better
gpu_compute_0_1 . 10137.57

FAHBench 2.3.2
Ns Per Day > Higher Is Better
gpu_compute_0_1 . 253.77

OctaneBench 2020.1
Total Score
Score > Higher Is Better
gpu_compute_0_1 . 405.76

VkResample 1.0
Upscale: 2x - Precision: Single
ms < Lower Is Better
gpu_compute_0_1 . 17.40

VkResample 1.0
Upscale: 2x - Precision: Double
ms < Lower Is Better
gpu_compute_0_1 . 219.15

NAMD CUDA 2.14
ATPase Simulation - 327,506 Atoms
days/ns < Lower Is Better
gpu_compute_0_1 . 0.34927

cl-mem 2017-01-13
Benchmark: Write
GB/s > Higher Is Better
gpu_compute_0_1 . 388.7

cl-mem 2017-01-13
Benchmark: Read
GB/s > Higher Is Better
gpu_compute_0_1 . 395.0

cl-mem 2017-01-13
Benchmark: Copy
GB/s > Higher Is Better
gpu_compute_0_1 . 295.9

Libplacebo 5.229.1
Test: hdr_peakdetect
FPS > Higher Is Better
gpu_compute_0_1 . 1858.13

Libplacebo 5.229.1
Test: polar_nocompute
FPS > Higher Is Better
gpu_compute_0_1 . 401.15

Libplacebo 5.229.1
Test: deband_heavy
FPS > Higher Is Better
gpu_compute_0_1 . 683.98

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Single Precision
GFLOPS > Higher Is Better
gpu_compute_0_1 . 20980.61

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Double Precision
GFLOPS > Higher Is Better
gpu_compute_0_1 . 294.24

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Half Precision
GFLOPS > Higher Is Better
gpu_compute_0_1 . 22037.60

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Single Precision
GFLOPS > Higher Is Better
gpu_compute_0_1 . 21538.33

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Double Precision
GFLOPS > Higher Is Better
gpu_compute_0_1 . 298.80

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Integer
GIOPS > Higher Is Better
gpu_compute_0_1 . 9812.41

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Integer
GIOPS > Higher Is Better
gpu_compute_0_1 . 11162.25

Hashcat 6.2.4
Benchmark: TrueCrypt RIPEMD160 + XTS
H/s > Higher Is Better
gpu_compute_0_1 . 962500

Hashcat 6.2.4
Benchmark: SHA-512
H/s > Higher Is Better
gpu_compute_0_1 . 3647233333

Hashcat 6.2.4
Benchmark: 7-Zip
H/s > Higher Is Better
gpu_compute_0_1 . 1244967

Hashcat 6.2.4
Benchmark: SHA1
H/s > Higher Is Better
gpu_compute_0_1 . 25076066667

Hashcat 6.2.4
Benchmark: MD5
H/s > Higher Is Better
gpu_compute_0_1 . 80251200000

VkFFT 1.1.1
Benchmark Score > Higher Is Better
gpu_compute_0_1 . 33111

Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: Yes
Seconds < Lower Is Better
gpu_compute_0_1 . 4.104

RealSR-NCNN 20200818
Scale: 4x - TAA: Yes
Seconds < Lower Is Better
gpu_compute_0_1 . 48.87

RealSR-NCNN 20200818
Scale: 4x - TAA: No
Seconds < Lower Is Better
gpu_compute_0_1 . 8.346

vkpeak 20210424
int16-vec4
GIOPS > Higher Is Better
gpu_compute_0_1 . 9838.10

vkpeak 20210424
int16-scalar
GIOPS > Higher Is Better
gpu_compute_0_1 . 7436.90

vkpeak 20210424
int32-vec4
GIOPS > Higher Is Better
gpu_compute_0_1 . 11206.15

vkpeak 20210424
int32-scalar
GIOPS > Higher Is Better
gpu_compute_0_1 . 11252.47

vkpeak 20210424
fp64-vec4
GFLOPS > Higher Is Better
gpu_compute_0_1 . 354.1

vkpeak 20210424
fp64-scalar
GFLOPS > Higher Is Better
gpu_compute_0_1 . 353.66

vkpeak 20210424
fp16-vec4
GFLOPS > Higher Is Better
gpu_compute_0_1 . 22221.79

vkpeak 20210424
fp16-scalar
GFLOPS > Higher Is Better
gpu_compute_0_1 . 11276.80

vkpeak 20210424
fp32-vec4
GFLOPS > Higher Is Better
gpu_compute_0_1 . 14920.77

vkpeak 20210424
fp32-scalar
GFLOPS > Higher Is Better
gpu_compute_0_1 . 11271.30

PlaidML
FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

NCNN 20220729
Target: Vulkan GPU
ms < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

GROMACS 2023
Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
gpu_compute_0_1 . 62.6

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
gpu_compute_0_1 . 59.6

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-T
GB/s > Higher Is Better
gpu_compute_0_1 . 39.0

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-N
GB/s > Higher Is Better
gpu_compute_0_1 . 30.94

RedShift Demo 3.0
Seconds < Lower Is Better

Betsy GPU Compressor 1.1 Beta
Codec: ETC2 RGB - Quality: Highest
Seconds < Lower Is Better

Betsy GPU Compressor 1.1 Beta
Codec: ETC1 - Quality: Highest
Seconds < Lower Is Better

Libplacebo 5.229.1
Test: av1_grain_lap
FPS > Higher Is Better
gpu_compute_0_1 . 165.02

Libplacebo 5.229.1
Test: hdr_lut
FPS > Higher Is Better
gpu_compute_0_1 . 400.45

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: GEMM SGEMM_N

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Reduction

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: S3D

Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: No
Seconds < Lower Is Better