2

KVM testing on Ubuntu 20.04 via the Phoronix Test Suite.

NVIDIA A100 80GB PCIe:

  Processor: 14 x Intel Xeon Gold 6342 (14 Cores), Motherboard: Nutanix AHV (nutanix-ahv-2.20220304.0.2619.el7 BIOS), Chipset: Intel 440FX 82441FX PMC, Memory: 4 x 16384 MB RAM, Disk: 428GB VDISK, Graphics: NVIDIA A100 80GB PCIe, Network: Red Hat Virtio device

  OS: Ubuntu 20.04, Kernel: 5.4.0-172-generic (x86_64), Display Driver: NVIDIA, OpenCL: OpenCL 3.0 CUDA 12.2.148, Vulkan: 1.3.242, Compiler: GCC 9.4.0 + CUDA 12.3, File-System: ext4, Screen Resolution: 1024x768, System Layer: KVM

ArrayFire 3.9
Test: Conjugate Gradient OpenCL
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 1.988 |================================================

Blender 4.0
Blend File: BMW27 - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA A100 80GB PCIe . 27.52 |================================================

Blender 4.0
Blend File: Classroom - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA A100 80GB PCIe . 20.82 |================================================

Blender 4.0
Blend File: Fishy Cat - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA A100 80GB PCIe . 22.36 |================================================

Blender 4.0
Blend File: Barbershop - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA A100 80GB PCIe . 83.64 |================================================

Blender 4.0
Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX
Seconds < Lower Is Better
NVIDIA A100 80GB PCIe . 44.99 |================================================

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better
NVIDIA A100 80GB PCIe . 857.69 |===============================================

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better
NVIDIA A100 80GB PCIe . 1709.23 |==============================================

Caffe 2020-02-13
Model: AlexNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better
NVIDIA A100 80GB PCIe . 8505.73 |==============================================

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 100
Milli-Seconds < Lower Is Better
NVIDIA A100 80GB PCIe . 3190.99 |==============================================

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 200
Milli-Seconds < Lower Is Better
NVIDIA A100 80GB PCIe . 6316.84 |==============================================

Caffe 2020-02-13
Model: GoogleNet - Acceleration: NVIDIA CUDA - Iterations: 1000
Milli-Seconds < Lower Is Better
NVIDIA A100 80GB PCIe . 31538.7 |==============================================

cl-mem 2017-01-13
Benchmark: Copy
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 234.8 |================================================

cl-mem 2017-01-13
Benchmark: Read
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 796.1 |================================================

cl-mem 2017-01-13
Benchmark: Write
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 1405.8 |===============================================

clpeak 1.1.2
OpenCL Test: Integer Compute INT
GIOPS > Higher Is Better
NVIDIA A100 80GB PCIe . 19208.70 |=============================================

clpeak 1.1.2
OpenCL Test: Single-Precision Float
GFLOPS > Higher Is Better
NVIDIA A100 80GB PCIe . 19311.06 |=============================================

clpeak 1.1.2
OpenCL Test: Double-Precision Double
GFLOPS > Higher Is Better
NVIDIA A100 80GB PCIe . 9689.03 |==============================================

clpeak 1.1.2
OpenCL Test: Global Memory Bandwidth
GBPS > Higher Is Better
NVIDIA A100 80GB PCIe . 1495.36 |==============================================

FAHBench 2.3.2
Ns Per Day > Higher Is Better
NVIDIA A100 80GB PCIe . 258.60 |===============================================

FinanceBench 2016-07-25
Benchmark: Black-Scholes OpenCL
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 1.035 |================================================

GROMACS 2024
Implementation: NVIDIA CUDA GPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better
NVIDIA A100 80GB PCIe . 25.61 |================================================

Hashcat 6.2.4
Benchmark: MD5
H/s > Higher Is Better

Hashcat 6.2.4
Benchmark: SHA1
H/s > Higher Is Better

Hashcat 6.2.4
Benchmark: 7-Zip
H/s > Higher Is Better

Hashcat 6.2.4
Benchmark: SHA-512
H/s > Higher Is Better

Hashcat 6.2.4
Benchmark: TrueCrypt RIPEMD160 + XTS
H/s > Higher Is Better

LeelaChessZero 0.30
Backend: OpenCL
Nodes Per Second > Higher Is Better

LuxCoreRender 2.6
Scene: DLSC - Acceleration: GPU
M samples/sec > Higher Is Better

LuxCoreRender 2.6
Scene: Danish Mood - Acceleration: GPU
M samples/sec > Higher Is Better

LuxCoreRender 2.6
Scene: Orange Juice - Acceleration: GPU
M samples/sec > Higher Is Better

LuxCoreRender 2.6
Scene: LuxCore Benchmark - Acceleration: GPU
M samples/sec > Higher Is Better

LuxCoreRender 2.6
Scene: Rainbow Colors and Prism - Acceleration: GPU
M samples/sec > Higher Is Better

MandelGPU 1.3pts1
OpenCL Device: GPU
Samples/sec > Higher Is Better

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Integer
GIOPS > Higher Is Better
NVIDIA A100 80GB PCIe . 18824.22 |=============================================

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Integer

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Double Precision
GFLOPS > Higher Is Better
NVIDIA A100 80GB PCIe . 9542.09 |==============================================

Mixbench 2020-06-23
Backend: OpenCL - Benchmark: Single Precision
GFLOPS > Higher Is Better
NVIDIA A100 80GB PCIe . 18866.42 |=============================================

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Half Precision

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Double Precision

Mixbench 2020-06-23
Backend: NVIDIA CUDA - Benchmark: Single Precision

NCNN 20230517
Target: Vulkan GPU - Model: mobilenet
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 14.70 |================================================

NCNN 20230517
Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 4.99 |=================================================

NCNN 20230517
Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 4.24 |=================================================

NCNN 20230517
Target: Vulkan GPU - Model: shufflenet-v2
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 4.36 |=================================================

NCNN 20230517
Target: Vulkan GPU - Model: efficientnet-b0
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 5.95 |=================================================

NCNN 20230517
Target: Vulkan GPU - Model: blazeface
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 1.57 |=================================================

NCNN 20230517
Target: Vulkan GPU - Model: googlenet
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 12.48 |================================================

NCNN 20230517
Target: Vulkan GPU - Model: vgg16
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 32.12 |================================================

NCNN 20230517
Target: Vulkan GPU - Model: resnet18
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 8.00 |=================================================

NCNN 20230517
Target: Vulkan GPU - Model: alexnet
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 5.54 |=================================================

NCNN 20230517
Target: Vulkan GPU - Model: resnet50
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 18.19 |================================================

NCNN 20230517
Target: Vulkan GPU - Model: yolov4-tiny
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 22.49 |================================================

NCNN 20230517
Target: Vulkan GPU - Model: squeezenet_ssd
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 10.93 |================================================

NCNN 20230517
Target: Vulkan GPU - Model: regnety_400m
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 11.38 |================================================

NCNN 20230517
Target: Vulkan GPU - Model: vision_transformer
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 83.82 |================================================

NCNN 20230517
Target: Vulkan GPU - Model: FastestDet
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 5.10 |=================================================

NCNN 20230517
Target: Vulkan GPU - Model: mnasnet
ms < Lower Is Better
NVIDIA A100 80GB PCIe . 4.40 |=================================================

NeatBench 5
Acceleration: GPU
FPS > Higher Is Better

PlaidML
FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL
Examples Per Second > Higher Is Better

PlaidML
FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL
Examples Per Second > Higher Is Better

RedShift Demo 3.0
Seconds < Lower Is Better

Rodinia 3.1
Test: OpenCL Particle Filter
Seconds < Lower Is Better

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: S3D
GFLOPS > Higher Is Better
NVIDIA A100 80GB PCIe . 815.73 |===============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Triad
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 24.80 |================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: FFT SP
GFLOPS > Higher Is Better
NVIDIA A100 80GB PCIe . 4423.01 |==============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: MD5 Hash
GHash/s > Higher Is Better
NVIDIA A100 80GB PCIe . 42.76 |================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Reduction
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 236.03 |===============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: GEMM SGEMM_N
GFLOPS > Higher Is Better
NVIDIA A100 80GB PCIe . 13470.7 |==============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Max SP Flops
GFLOPS > Higher Is Better
NVIDIA A100 80GB PCIe . 19366.2 |==============================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Download
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 25.31 |================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Bus Speed Readback
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 26.40 |================================================

SHOC Scalable HeterOgeneous Computing 2020-04-17
Target: OpenCL - Benchmark: Texture Read Bandwidth
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 1582.12 |==============================================

ViennaCL 1.7.1
Test: CPU BLAS - sCOPY
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 147.4 |================================================

ViennaCL 1.7.1
Test: CPU BLAS - sAXPY
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 178 |==================================================

ViennaCL 1.7.1
Test: CPU BLAS - sDOT
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 84.1 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - dCOPY
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 73.0 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - dAXPY
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 118 |==================================================

ViennaCL 1.7.1
Test: CPU BLAS - dDOT
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 107 |==================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-N
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 97.1 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMV-T
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 75.8 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
NVIDIA A100 80GB PCIe . 26.3 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
NVIDIA A100 80GB PCIe . 26.1 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
NVIDIA A100 80GB PCIe . 25.9 |=================================================

ViennaCL 1.7.1
Test: CPU BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
NVIDIA A100 80GB PCIe . 26.4 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sCOPY
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 232 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sAXPY
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 312 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - sDOT
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 227 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dCOPY
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 440 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dAXPY
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 572 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dDOT
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 435 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-N
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 68.2 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMV-T
GB/s > Higher Is Better
NVIDIA A100 80GB PCIe . 245 |==================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NN
GFLOPs/s > Higher Is Better
NVIDIA A100 80GB PCIe . 4243 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-NT
GFLOPs/s > Higher Is Better
NVIDIA A100 80GB PCIe . 4653 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TN
GFLOPs/s > Higher Is Better
NVIDIA A100 80GB PCIe . 4220 |=================================================

ViennaCL 1.7.1
Test: OpenCL BLAS - dGEMM-TT
GFLOPs/s > Higher Is Better
NVIDIA A100 80GB PCIe . 4270 |=================================================
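
Each reported result in this export has the same shape: a direction marker (`< Lower Is Better` or `> Higher Is Better`), the system name, a dot, the measured value, and an ASCII bar. The following is a minimal Python sketch for pulling those values out of the text, assuming that shape holds throughout. `parse_results` is a hypothetical helper written for illustration, not part of the Phoronix Test Suite, and it deliberately skips tests that list no result (such as the Hashcat entries).

```python
import re


def parse_results(text, system):
    """Return (direction, value) pairs for every result reported for `system`.

    Hypothetical helper: assumes each result looks like
    '<unit> < Lower Is Better <system> . <value> |====', with any
    whitespace (spaces or newlines) between the parts.
    """
    pattern = re.compile(
        r"[<>]\s+(Lower|Higher) Is Better\s+"  # direction marker
        + re.escape(system)                    # system name, matched literally
        + r"\s+\.\s+(\d+(?:\.\d+)?)\s+\|"      # ' . <value> |'
    )
    return [(m.group(1).lower(), float(m.group(2)))
            for m in pattern.finditer(text)]


if __name__ == "__main__":
    # Small excerpt in the same shape as the report; the RedShift entry
    # has no result and is therefore not returned.
    sample = (
        "ArrayFire 3.9 Test: Conjugate Gradient OpenCL ms < Lower Is Better "
        "NVIDIA A100 80GB PCIe . 1.988 |==== "
        "RedShift Demo 3.0 Seconds < Lower Is Better "
        "cl-mem 2017-01-13\nBenchmark: Copy\nGB/s > Higher Is Better\n"
        "NVIDIA A100 80GB PCIe . 234.8 |====\n"
    )
    print(parse_results(sample, "NVIDIA A100 80GB PCIe"))
```

The sketch returns only the direction and the numeric value because those are the parts that can be delimited unambiguously; test names and units would need extra assumptions about where one header ends and the next begins.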