phoenix_tes1

AMD Ryzen 7 7700X 8-Core testing with a MSI PRO B650-P WIFI (MS-7D78) v1.0 (1.B0 BIOS) and NVIDIA GeForce RTX 3090 Ti 24GB on Ubuntu 23.04 via the Phoronix Test Suite.

first_test_nvidia:

Processor: AMD Ryzen 7 7700X 8-Core @ 4.50GHz (8 Cores / 16 Threads), Motherboard: MSI PRO B650-P WIFI (MS-7D78) v1.0 (1.B0 BIOS), Chipset: AMD Device 14d8, Memory: 32GB, Disk: 1000GB SHPP41-1000GM, Graphics: NVIDIA GeForce RTX 3090 Ti 24GB, Audio: NVIDIA GA102 HD Audio, Monitor: DELL S3221QS, Network: Realtek RTL8125 2.5GbE + MEDIATEK MT7922 802.11ax PCI

OS: Ubuntu 23.04, Kernel: 6.2.0-39-generic (x86_64), Desktop: GNOME Shell 44.3, Display Server: X Server 1.21.1.7, Display Driver: NVIDIA 550.76, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.4.131, Compiler: GCC 12.3.0 + CUDA 11.8, File-System: ext4, Screen Resolution: 1920x1080

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Integer

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Integer

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Double Precision

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Single Precision

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Half Precision

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Double Precision

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Single Precision

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: S3D

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Reduction

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: GEMM SGEMM_N

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Texture Read Bandwidth

VkFFT 1.3.4
Test: FFT + iFFT R2C / C2R
Benchmark Score > Higher Is Better
first_test_nvidia . 56730 |====================================================

VkFFT 1.3.4
Test: FFT + iFFT C2C 1D batched in half precision
Benchmark Score > Higher Is Better
first_test_nvidia . 166438 |===================================================

VkFFT 1.3.4
Test: FFT + iFFT C2C Bluestein in single precision
Benchmark Score > Higher Is Better
first_test_nvidia . 14716 |====================================================

VkFFT 1.3.4
Test: FFT + iFFT C2C 1D batched in double precision
Benchmark Score > Higher Is Better
first_test_nvidia . 26356 |====================================================

VkFFT 1.3.4
Test: FFT + iFFT C2C 1D batched in single precision
Benchmark Score > Higher Is Better
first_test_nvidia . 129620 |===================================================

VkFFT 1.3.4
Test: FFT + iFFT C2C multidimensional in single precision
Benchmark Score > Higher Is Better
first_test_nvidia . 53534 |====================================================

VkFFT 1.3.4
Test: FFT + iFFT C2C Bluestein benchmark in double precision
Benchmark Score > Higher Is Better
first_test_nvidia . 4460 |=====================================================

VkFFT 1.3.4
Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling
Benchmark Score > Higher Is Better
first_test_nvidia . 130931 |===================================================

PlaidML
FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
Examples Per Second > Higher Is Better

Libplacebo 6.338.2
FPS > Higher Is Better

NeatBench 5
Acceleration: GPU
FPS > Higher Is Better
first_test_nvidia . 3090 |=====================================================

cl-mem 2017-01-13
Benchmark: Copy
GB/s > Higher Is Better
first_test_nvidia . 375.3 |====================================================

cl-mem 2017-01-13
Benchmark: Read
GB/s > Higher Is Better
first_test_nvidia . 887.3 |====================================================

cl-mem 2017-01-13
Benchmark: Write
GB/s > Higher Is Better
first_test_nvidia . 780.0 |====================================================

ViennaCL 1.7.1
Test: CPU BLAS - sCOPY
GB/s > Higher Is Better
first_test_nvidia . 41.4 |=====================================================

ViennaCL 1.7.1
Test: CPU BLAS - sAXPY
GB/s > Higher Is Better
first_test_nvidia . 59.9 |=====================================================

ViennaCL 1.7.1
Test: CPU BLAS - sDOT
GB/s > Higher Is Better
first_test_nvidia . 66.1 |=====================================================

ViennaCL 1.7.1
Test: CPU BLAS - dCOPY
GB/s > Higher Is Better
first_test_nvidia . 26.7 |=====================================================

ViennaCL 1.7.1
Test: CPU BLAS - dAXPY
GB/s > Higher Is Better
first_test_nvidia . 39.6 |=====================================================

ViennaCL 1.7.1
Test: CPU BLAS - dDOT
GB/s > Higher Is Better
first_test_nvidia . 36.74 |====================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-N
GB/s > Higher Is Better
first_test_nvidia . 42.5 |=====================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-T
GB/s > Higher Is Better
first_test_nvidia . 41.6 |=====================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sCOPY
GB/s > Higher Is Better
first_test_nvidia . 378 |======================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sAXPY
GB/s > Higher Is Better
first_test_nvidia . 517 |======================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sDOT
GB/s > Higher Is Better
first_test_nvidia . 386 |======================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dCOPY
GB/s > Higher Is Better
first_test_nvidia . 630 |======================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dAXPY
GB/s > Higher Is Better
first_test_nvidia . 758 |======================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dDOT
GB/s > Higher Is Better
first_test_nvidia . 676 |======================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-N
GB/s > Higher Is Better
first_test_nvidia . 194 |======================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-T
GB/s > Higher Is Better
first_test_nvidia . 389 |======================================================

clpeak 1.1.2
OpenCL Test: Global Memory Bandwidth
GBPS > Higher Is Better
first_test_nvidia . 875.80 |===================================================

vkpeak 20230730
fp32-scalar
GFLOPS > Higher Is Better
first_test_nvidia . 21660.78 |=================================================

vkpeak 20230730
fp32-vec4
GFLOPS > Higher Is Better
first_test_nvidia . 28743.57 |=================================================

vkpeak 20230730
fp16-scalar
GFLOPS > Higher Is Better
first_test_nvidia . 21659.94 |=================================================

vkpeak 20230730
fp16-vec4
GFLOPS > Higher Is Better
first_test_nvidia . 42791.90 |=================================================

vkpeak 20230730
fp64-scalar
GFLOPS > Higher Is Better
first_test_nvidia . 676.99 |===================================================

vkpeak 20230730
fp64-vec4
GFLOPS > Higher Is Better
first_test_nvidia . 677.05 |===================================================

clpeak 1.1.2
OpenCL Test: Single-Precision Float
GFLOPS > Higher Is Better
first_test_nvidia . 38660.51 |=================================================

clpeak 1.1.2
OpenCL Test: Double-Precision Double
GFLOPS > Higher Is Better
first_test_nvidia . 682.86 |===================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
first_test_nvidia . 54.0 |=====================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
first_test_nvidia . 53.3 |=====================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
first_test_nvidia . 55.8 |=====================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
first_test_nvidia . 55.9 |=====================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
first_test_nvidia . 616 |======================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
first_test_nvidia . 618 |======================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
first_test_nvidia . 617 |======================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
first_test_nvidia . 615 |======================================================

vkpeak 20230730
int32-scalar
GIOPS > Higher Is Better
first_test_nvidia . 21596.79 |=================================================

vkpeak 20230730
int32-vec4
GIOPS > Higher Is Better
first_test_nvidia . 21486.45 |=================================================

vkpeak 20230730
int16-scalar
GIOPS > Higher Is Better
first_test_nvidia . 14236.27 |=================================================

vkpeak 20230730
int16-vec4
GIOPS > Higher Is Better
first_test_nvidia . 18957.17 |=================================================

clpeak 1.1.2
OpenCL Test: Integer Compute INT
GIOPS > Higher Is Better
first_test_nvidia . 19797.45 |=================================================

Hashcat 6.2.4
Benchmark: MD5
H/s > Higher Is Better
first_test_nvidia . 77873933333 |==============================================

Hashcat 6.2.4
Benchmark: SHA1
H/s > Higher Is Better
first_test_nvidia . 24389300000 |==============================================

Hashcat 6.2.4
Benchmark: 7-Zip
H/s > Higher Is Better
first_test_nvidia . 1240067 |==================================================

Hashcat 6.2.4
Benchmark: SHA-512
H/s > Higher Is Better
first_test_nvidia . 3567933333 |===============================================

Hashcat 6.2.4
Benchmark: TrueCrypt RIPEMD160 + XTS
H/s > Higher Is Better
first_test_nvidia . 920967 |===================================================

IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Bedroom
M samples/s > Higher Is Better
first_test_nvidia . 22.09 |====================================================

IndigoBench 4.4
Acceleration: OpenCL GPU - Scene: Supercar
M samples/s > Higher Is Better
first_test_nvidia . 55.01 |====================================================

LuxCoreRender 2.6
Scene: DLSC - Acceleration: GPU
M samples/sec > Higher Is Better
first_test_nvidia . 15.37 |====================================================

LuxCoreRender 2.6
Scene: Danish Mood - Acceleration: GPU
M samples/sec > Higher Is Better
first_test_nvidia . 10.27 |====================================================

LuxCoreRender 2.6
Scene: Orange Juice - Acceleration: GPU
M samples/sec > Higher Is Better
first_test_nvidia . 12.96 |====================================================

LuxCoreRender 2.6
Scene: LuxCore Benchmark - Acceleration: GPU
M samples/sec > Higher Is Better
first_test_nvidia . 12.93 |====================================================

LuxCoreRender 2.6
Scene: Rainbow Colors and Prism - Acceleration: GPU
M samples/sec > Higher Is Better
first_test_nvidia . 36.28 |====================================================

LeelaChessZero 0.30
Backend: OpenCL
Nodes Per Second > Higher Is Better

FAHBench 2.3.2
Ns Per Day > Higher Is Better
first_test_nvidia . 357.89 |===================================================

GROMACS 2024
Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better

MandelGPU 1.3pts1
OpenCL Device: GPU
Samples/sec > Higher Is Better
first_test_nvidia . 512205954.7 |==============================================

OctaneBench 2020.1
Total Score
Score > Higher Is Better
first_test_nvidia . 710.34 |===================================================

Chaos Group V-RAY 6.0
Mode: NVIDIA RTX GPU
Vrays > Higher Is Better

Chaos Group V-RAY 6.0
Mode: NVIDIA CUDA GPU
Vrays > Higher Is Better

ArrayFire 3.9
Test: Conjugate Gradient OpenCL

NAMD CUDA 2.14
ATPase Simulation - 327,506 Atoms
days/ns < Lower Is Better
first_test_nvidia . 0.11664 |==================================================

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

VkResample 1.0
Upscale: 2x - Precision: Double
ms < Lower Is Better
first_test_nvidia . 314.43 |===================================================

VkResample 1.0
Upscale: 2x - Precision: Single
ms < Lower Is Better
first_test_nvidia . 9.484 |====================================================

FinanceBench 2016-07-25
Benchmark: Black-Scholes OpenCL
ms < Lower Is Better

NCNN 20230517
Target: Vulkan GPU - Model: mobilenet
ms < Lower Is Better
first_test_nvidia . 8.26 |=====================================================

NCNN 20230517
Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
ms < Lower Is Better
first_test_nvidia . 2.09 |=====================================================

NCNN 20230517
Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
ms < Lower Is Better
first_test_nvidia . 1.82 |=====================================================

NCNN 20230517
Target: Vulkan GPU - Model: shufflenet-v2
ms < Lower Is Better
first_test_nvidia . 1.49 |=====================================================

NCNN 20230517
Target: Vulkan GPU - Model: mnasnet
ms < Lower Is Better
first_test_nvidia . 1.90 |=====================================================

NCNN 20230517
Target: Vulkan GPU - Model: efficientnet-b0
ms < Lower Is Better
first_test_nvidia . 2.97 |=====================================================

NCNN 20230517
Target: Vulkan GPU - Model: blazeface
ms < Lower Is Better
first_test_nvidia . 0.60 |=====================================================

NCNN 20230517
Target: Vulkan GPU - Model: googlenet
ms < Lower Is Better
first_test_nvidia . 6.92 |=====================================================

NCNN 20230517
Target: Vulkan GPU - Model: vgg16
ms < Lower Is Better
first_test_nvidia . 36.77 |====================================================

NCNN 20230517
Target: Vulkan GPU - Model: resnet18
ms < Lower Is Better
first_test_nvidia . 5.68 |=====================================================

NCNN 20230517
Target: Vulkan GPU - Model: alexnet
ms < Lower Is Better
first_test_nvidia . 5.34 |=====================================================

NCNN 20230517
Target: Vulkan GPU - Model: resnet50
ms < Lower Is Better
first_test_nvidia . 11.28 |====================================================

NCNN 20230517
Target: Vulkan GPU - Model: yolov4-tiny
ms < Lower Is Better
first_test_nvidia . 14.96 |====================================================

NCNN 20230517
Target: Vulkan GPU - Model: squeezenet_ssd
ms < Lower Is Better
first_test_nvidia . 5.78 |=====================================================

NCNN 20230517
Target: Vulkan GPU - Model: regnety_400m
ms < Lower Is Better
first_test_nvidia . 5.05 |=====================================================

NCNN 20230517
Target: Vulkan GPU - Model: vision_transformer
ms < Lower Is Better
first_test_nvidia . 53.28 |====================================================

NCNN 20230517
Target: Vulkan GPU - Model: FastestDet
ms < Lower Is Better
first_test_nvidia . 1.97 |=====================================================

RealSR-NCNN 20200818
Scale: 4x - TAA: No
Seconds < Lower Is Better
first_test_nvidia . 5.378 |====================================================

RealSR-NCNN 20200818
Scale: 4x - TAA: Yes
Seconds < Lower Is Better
first_test_nvidia . 27.12 |====================================================

Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: No
Seconds < Lower Is Better

Waifu2x-NCNN Vulkan 20200818
Scale: 2x - Denoise: 3 - TAA: Yes
Seconds < Lower Is Better
first_test_nvidia . 3.111 |====================================================

Betsy GPU Compressor 1.1 Beta
Codec: ETC1 - Quality: Highest
Seconds < Lower Is Better

Betsy GPU Compressor 1.1 Beta
Codec: ETC2 RGB - Quality: Highest
Seconds < Lower Is Better

RedShift Demo 3.0
Seconds < Lower Is Better

Rodinia 3.1
Test: OpenCL Particle Filter
Seconds < Lower Is Better

Blender 4.1
Blend File: BMW27 - Compute: NVIDIA OptiX
Seconds < Lower Is Better
first_test_nvidia . 6.13 |=====================================================

Blender 4.1
Blend File: Junkshop - Compute: NVIDIA OptiX
Seconds < Lower Is Better
first_test_nvidia . 11.14 |====================================================

Blender 4.1
Blend File: Classroom - Compute: NVIDIA OptiX
Seconds < Lower Is Better
first_test_nvidia . 14.57 |====================================================

Blender 4.1
Blend File: Fishy Cat - Compute: NVIDIA OptiX
Seconds < Lower Is Better
first_test_nvidia . 10.08 |====================================================

Blender 4.1
Blend File: Barbershop - Compute: NVIDIA OptiX
Seconds < Lower Is Better
first_test_nvidia . 53.39 |====================================================

Blender 4.1
Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX
Seconds < Lower Is Better
first_test_nvidia . 16.50 |====================================================