gooxi-ai

KVM testing on Ubuntu 24.04 via the Phoronix Test Suite.

Nvidia Compute:

  Processor: Intel Xeon Gold 6226R (32 Cores), Motherboard: QEMU Standard PC (i440FX + PIIX 1996) (rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org BIOS), Chipset: Intel 440FX 82441FX PMC, Memory: 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 16 GB + 14240 MB RAM, Disk: 1100GB QEMU HDD, Graphics: NVIDIA RTX A4000 16GB, Audio: NVIDIA GA104 HD Audio, Monitor: QEMU Monitor, Network: Red Hat Virtio device

  OS: Ubuntu 24.04, Kernel: 6.8.0-39-generic (x86_64), Display Server: X Server, Display Driver: NVIDIA, Compiler: GCC 13.2.0 + CUDA 12.4, File-System: ext4, Screen Resolution: 1280x800, System Layer: KVM

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Integer

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Double Precision

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Single Precision

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: S3D

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Reduction

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: GEMM SGEMM_N

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Texture Read Bandwidth

cl-mem 2017-01-13
Benchmark: Copy

cl-mem 2017-01-13
Benchmark: Read

cl-mem 2017-01-13
Benchmark: Write

PlaidML
FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
Examples Per Second > Higher Is Better

NeatBench 5
Acceleration: GPU
FPS > Higher Is Better

ViennaCL 1.7.1
Test: CPU BLAS - sCOPY
GB/s > Higher Is Better
Nvidia Compute . 120.4 |=======================================================

ViennaCL 1.7.1
Test: CPU BLAS - sAXPY
GB/s > Higher Is Better
Nvidia Compute . 194 |=========================================================

ViennaCL 1.7.1
Test: CPU BLAS - sDOT
GB/s > Higher Is Better
Nvidia Compute . 158 |=========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dCOPY
GB/s > Higher Is Better
Nvidia Compute . 51.4 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dAXPY
GB/s > Higher Is Better
Nvidia Compute . 78.8 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dDOT
GB/s > Higher Is Better
Nvidia Compute . 78.0 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-N
GB/s > Higher Is Better
Nvidia Compute . 77.7 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-T
GB/s > Higher Is Better
Nvidia Compute . 91.0 |========================================================

clpeak 1.1.2
OpenCL Test: Global Memory Bandwidth
GBPS > Higher Is Better
Nvidia Compute . 377.03 |======================================================

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Half Precision
GFLOPS > Higher Is Better
Nvidia Compute . 21382.24 |====================================================

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Double Precision
GFLOPS > Higher Is Better
Nvidia Compute . 300.57 |======================================================

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Single Precision
GFLOPS > Higher Is Better
Nvidia Compute . 20359.29 |====================================================

clpeak 1.1.2
OpenCL Test: Single-Precision Float
GFLOPS > Higher Is Better
Nvidia Compute . 18537.60 |====================================================

clpeak 1.1.2
OpenCL Test: Double-Precision Double
GFLOPS > Higher Is Better
Nvidia Compute . 361.53 |======================================================

ViennaCL 1.7.1
Test: OpenCL BLAS
GFLOPS > Higher Is Better

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
Nvidia Compute . 73.3 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
Nvidia Compute . 71.5 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
Nvidia Compute . 73.2 |========================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
Nvidia Compute . 73.3 |========================================================

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Integer
GIOPS > Higher Is Better
Nvidia Compute . 9803.67 |=====================================================

clpeak 1.1.2
OpenCL Test: Integer Compute INT
GIOPS > Higher Is Better
Nvidia Compute . 9572.62 |=====================================================

Hashcat 6.2.4
Benchmark: MD5
H/s > Higher Is Better
Nvidia Compute . 176473950000 |================================================

Hashcat 6.2.4
Benchmark: SHA1
H/s > Higher Is Better
Nvidia Compute . 58534106250 |=================================================

Hashcat 6.2.4
Benchmark: 7-Zip
H/s > Higher Is Better
Nvidia Compute . 4960333 |=====================================================

Hashcat 6.2.4
Benchmark: SHA-512
H/s > Higher Is Better
Nvidia Compute . 15311200000 |=================================================

Hashcat 6.2.4
Benchmark: TrueCrypt RIPEMD160 + XTS
H/s > Higher Is Better
Nvidia Compute . 4035300 |=====================================================

LuxCoreRender 2.6
Scene: DLSC - Acceleration: GPU
M samples/sec > Higher Is Better
Nvidia Compute . 65.45 |=======================================================

LuxCoreRender 2.6
Scene: Danish Mood - Acceleration: GPU
M samples/sec > Higher Is Better
Nvidia Compute . 40.41 |=======================================================

LuxCoreRender 2.6
Scene: Orange Juice - Acceleration: GPU
M samples/sec > Higher Is Better
Nvidia Compute . 56.97 |=======================================================

LuxCoreRender 2.6
Scene: LuxCore Benchmark - Acceleration: GPU
M samples/sec > Higher Is Better
Nvidia Compute . 33.02 |=======================================================

LuxCoreRender 2.6
Scene: Rainbow Colors and Prism - Acceleration: GPU
M samples/sec > Higher Is Better
Nvidia Compute . 135.27 |======================================================

LeelaChessZero 0.30
Backend: OpenCL
Nodes Per Second > Higher Is Better

FAHBench 2.3.2
Ns Per Day > Higher Is Better
Nvidia Compute . 234.62 |======================================================

GROMACS 2024
Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better

MandelGPU 1.3pts1
OpenCL Device: GPU
Samples/sec > Higher Is Better

ArrayFire 3.9
Test: Conjugate Gradient OpenCL

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better

FinanceBench 2016-07-25
Benchmark: Black-Scholes OpenCL
ms < Lower Is Better

NCNN 20230517
Target: Vulkan GPU - Model: mobilenet
ms < Lower Is Better
Nvidia Compute . 23.45 |=======================================================

NCNN 20230517
Target: Vulkan GPU - Model: mobilenet-v2
ms < Lower Is Better
Nvidia Compute . 8.43 |========================================================

NCNN 20230517
Target: Vulkan GPU - Model: mobilenet-v3
ms < Lower Is Better
Nvidia Compute . 7.06 |========================================================

NCNN 20230517
Target: Vulkan GPU - Model: shufflenet-v2
ms < Lower Is Better
Nvidia Compute . 7.93 |========================================================

NCNN 20230517
Target: Vulkan GPU - Model: mnasnet
ms < Lower Is Better
Nvidia Compute . 7.16 |========================================================

NCNN 20230517
Target: Vulkan GPU - Model: efficientnet-b0
ms < Lower Is Better
Nvidia Compute . 10.04 |=======================================================

NCNN 20230517
Target: Vulkan GPU - Model: blazeface
ms < Lower Is Better
Nvidia Compute . 3.20 |========================================================

NCNN 20230517
Target: Vulkan GPU - Model: googlenet
ms < Lower Is Better
Nvidia Compute . 20.07 |=======================================================

NCNN 20230517
Target: Vulkan GPU - Model: vgg16
ms < Lower Is Better
Nvidia Compute . 50.78 |=======================================================

NCNN 20230517
Target: Vulkan GPU - Model: resnet18
ms < Lower Is Better
Nvidia Compute . 12.56 |=======================================================

NCNN 20230517
Target: Vulkan GPU - Model: alexnet
ms < Lower Is Better
Nvidia Compute . 8.39 |========================================================

NCNN 20230517
Target: Vulkan GPU - Model: resnet50
ms < Lower Is Better
Nvidia Compute . 24.68 |=======================================================

NCNN 20230517
Target: Vulkan GPU - Model: mobilenetv2-yolov3
ms < Lower Is Better
Nvidia Compute . 23.45 |=======================================================

NCNN 20230517
Target: Vulkan GPU - Model: yolov4-tiny
ms < Lower Is Better
Nvidia Compute . 36.46 |=======================================================

NCNN 20230517
Target: Vulkan GPU - Model: squeezenet_ssd
ms < Lower Is Better
Nvidia Compute . 18.61 |=======================================================

NCNN 20230517
Target: Vulkan GPU - Model: regnety_400m
ms < Lower Is Better
Nvidia Compute . 22.22 |=======================================================

NCNN 20230517
Target: Vulkan GPU - Model: vision_transformer
ms < Lower Is Better
Nvidia Compute . 75.18 |=======================================================

NCNN 20230517
Target: Vulkan GPU - Model: FastestDet
ms < Lower Is Better
Nvidia Compute . 8.67 |========================================================

RedShift Demo 3.0
Seconds < Lower Is Better

Rodinia 3.1
Test: OpenCL Particle Filter
Seconds < Lower Is Better

Blender 4.2
Blend File: BMW27 - Compute: NVIDIA OptiX
Seconds < Lower Is Better
Nvidia Compute . 7.64 |========================================================

Blender 4.2
Blend File: Junkshop - Compute: NVIDIA OptiX
Seconds < Lower Is Better
Nvidia Compute . 22.53 |=======================================================

Blender 4.2
Blend File: Classroom - Compute: NVIDIA OptiX
Seconds < Lower Is Better
Nvidia Compute . 13.41 |=======================================================

Blender 4.2
Blend File: Fishy Cat - Compute: NVIDIA OptiX
Seconds < Lower Is Better
Nvidia Compute . 11.21 |=======================================================

Blender 4.2
Blend File: Barbershop - Compute: NVIDIA OptiX
Seconds < Lower Is Better
Nvidia Compute . 50.38 |=======================================================

Blender 4.2
Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX
Seconds < Lower Is Better
Nvidia Compute . 18.22 |=======================================================