2

KVM testing on Ubuntu 20.04 via the Phoronix Test Suite.

NVIDIA A100 80GB PCIe

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-9QDOt0/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: CPU Microcode: 0x1
Graphics Notes: BAR1 / Visible vRAM Size: 131072 MiB - vBIOS Version: 92.00.90.00.0f
Python Notes: Python 3.8.10
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Unknown: No mitigations + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected

14 x Intel Xeon Gold 6342 - NVIDIA A100 80GB PCIe -

Processor: 14 x Intel Xeon Gold 6342 (14 Cores), Motherboard: Nutanix AHV (nutanix-ahv-2.20220304.0.2619.el7 BIOS), Chipset: Intel 440FX 82441FX PMC, Memory: 4 x 16384 MB RAM, Disk: 428GB VDISK, Graphics: NVIDIA A100 80GB PCIe, Network: Red Hat Virtio device

OS: Ubuntu 20.04, Kernel: 5.4.0-172-generic (x86_64), Display Driver: NVIDIA, OpenCL: OpenCL 3.0 CUDA 12.2.148, Vulkan: 1.3.242, Compiler: GCC 9.4.0 + CUDA 12.3, File-System: ext4, Screen Resolution: 1024x768, System Layer: KVM

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-9QDOt0/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: CPU Microcode: 0x1
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Unknown: No mitigations + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected

PlaidML

This test profile uses the PlaidML deep learning framework, developed by Intel, to run various benchmarks. Learn more via the OpenBenchmarking.org test page.

FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL

NVIDIA A100 80GB PCIe: The test run did not produce a result (reported 3 times).

FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL

NVIDIA A100 80GB PCIe: The test run did not produce a result (reported 3 times).

FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL

NVIDIA A100 80GB PCIe: The test run did not produce a result (reported 3 times).

FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL

NVIDIA A100 80GB PCIe: The test run did not produce a result (reported 3 times).

FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL

NVIDIA A100 80GB PCIe: The test run did not produce a result (reported 3 times).

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

Backend: OpenCL

NVIDIA A100 80GB PCIe: The test run did not produce a result (reported 3 times).

Caffe

This is a benchmark of the Caffe deep learning framework. It currently supports the AlexNet and GoogleNet models, with execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.

SHOC Scalable HeterOgeneous Computing

This test uses the CUDA and OpenCL versions of Vetter's Scalable HeterOgeneous Computing (SHOC) benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. Learn more via the OpenBenchmarking.org test page.

GROMACS

This test runs the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

Rodinia

Rodinia is a benchmark suite focused on running compute-intensive applications on accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile currently uses select OpenCL, NVIDIA CUDA, and OpenMP test binaries. Learn more via the OpenBenchmarking.org test page.

Test: OpenCL Particle Filter

NVIDIA A100 80GB PCIe: The test run did not produce a result (reported 3 times). E: ERROR: clEnqueueWriteBuffer arrayX_GPU (size:400000) => -2002801765
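
The status value in the error above can be checked against the core OpenCL status codes, which CL/cl.h defines as small negative integers. The following is a minimal sketch, assuming only a subset of those codes; the helper name describe_cl_status is hypothetical and is not part of Rodinia or the Phoronix Test Suite. It shows that -2002801765 does not match any code in this subset, which may point to an uninitialized or corrupted status value rather than a genuine OpenCL error, although the log alone cannot confirm that.

```python
# Subset of the core OpenCL status codes from CL/cl.h (assumption: not the full list).
OPENCL_STATUS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -30: "CL_INVALID_VALUE",
    -36: "CL_INVALID_COMMAND_QUEUE",
    -38: "CL_INVALID_MEM_OBJECT",
}


def describe_cl_status(code: int) -> str:
    """Hypothetical helper: name a known status code or flag it as unrecognized."""
    if code in OPENCL_STATUS:
        return OPENCL_STATUS[code]
    # Show the 32-bit pattern as well, which can help spot garbage values.
    return f"unrecognized status {code} (0x{code & 0xFFFFFFFF:08X})"


# Value copied from the Particle Filter error line.
print(describe_cl_status(-2002801765))
```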

ArrayFire

ArrayFire is a GPU and CPU numeric processing library. This test uses the built-in CPU and OpenCL ArrayFire benchmarks. Learn more via the OpenBenchmarking.org test page.

Blender

Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles performance with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported as well as HIP for AMD Radeon GPUs and Intel oneAPI for Intel Graphics. Learn more via the OpenBenchmarking.org test page.

NeatBench

NeatBench is a benchmark of the cross-platform Neat Video software on the CPU, with optional GPU (OpenCL / CUDA) support. Learn more via the OpenBenchmarking.org test page.

Acceleration: GPU

NVIDIA A100 80GB PCIe: The test run did not produce a result (reported 3 times).

LuxCoreRender

LuxCoreRender is an open-source 3D physically based renderer formerly known as LuxRender. LuxCoreRender supports CPU-based rendering as well as GPU acceleration via OpenCL, NVIDIA CUDA, and NVIDIA OptiX interfaces. Learn more via the OpenBenchmarking.org test page.

Scene: DLSC - Acceleration: GPU

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status (reported 3 times). E: RUNTIME ERROR: CUDA driver API error CUDA_ERROR_UNSUPPORTED_PTX_VERSION (code: 222, file:/home/vsts/work/1/s/LinuxCompile/LuxCore-sdk/src/luxrays/utils/cuda.cpp, line: 251): the provided PTX was compiled with an unsupported toolchain.

Scene: Danish Mood - Acceleration: GPU

Scene: Orange Juice - Acceleration: GPU

Scene: LuxCore Benchmark - Acceleration: GPU

Scene: Rainbow Colors and Prism - Acceleration: GPU

FAHBench

FAHBench is a Folding@Home benchmark on the GPU. Learn more via the OpenBenchmarking.org test page.

Hashcat

Hashcat is an open-source, advanced password recovery tool supporting GPU acceleration with OpenCL, NVIDIA CUDA, and Radeon ROCm. Learn more via the OpenBenchmarking.org test page.

Benchmark: MD5

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status (reported 3 times). E: ptxas application ptx input, line 9; fatal : Unsupported .version 8.3; current version is '8.2'
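
The ptxas message above can be read alongside the system table, which lists the compiler as "GCC 9.4.0 + CUDA 12.3" while the OpenCL string reports "CUDA 12.2.148". One plausible reading is that PTX generated at the CUDA 12.3 level is being handed to a component that only accepts the CUDA 12.2 level. The following is a minimal sketch, assuming the PTX ISA to CUDA release mapping from NVIDIA's PTX release notes; the function explain_ptx_mismatch and the table name are hypothetical and are not part of Hashcat.

```python
import re

# PTX ISA versions and the CUDA 12.x releases that introduced them
# (assumption: taken from NVIDIA's PTX ISA release notes).
PTX_TO_CUDA = {"8.0": "12.0", "8.1": "12.1", "8.2": "12.2", "8.3": "12.3"}

# Pattern for the ptxas diagnostic seen in the Hashcat log.
PTX_ERROR = re.compile(
    r"Unsupported \.version (\d+\.\d+); current version is '(\d+\.\d+)'"
)


def explain_ptx_mismatch(message: str) -> str:
    """Hypothetical helper: turn a ptxas version error into a readable summary."""
    match = PTX_ERROR.search(message)
    if match is None:
        return "no PTX version mismatch found"
    wanted, supported = match.groups()
    return (
        f"PTX {wanted} (CUDA {PTX_TO_CUDA.get(wanted, '?')}) requested, "
        f"but only PTX {supported} (CUDA {PTX_TO_CUDA.get(supported, '?')}) is accepted"
    )


log_line = (
    "ptxas application ptx input, line 9; fatal : "
    "Unsupported .version 8.3; current version is '8.2'"
)
print(explain_ptx_mismatch(log_line))
```

If this reading is right, aligning the toolkit and driver to the same CUDA release would be the first thing to try, though the log does not prove it is the only cause.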

Benchmark: SHA1

Benchmark: 7-Zip

Benchmark: SHA-512

Benchmark: TrueCrypt RIPEMD160 + XTS

Mixbench

A benchmark suite for GPUs on mixed operational intensity kernels. Learn more via the OpenBenchmarking.org test page.

Backend: NVIDIA CUDA - Benchmark: Integer

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status (reported 3 times).

Backend: NVIDIA CUDA - Benchmark: Half Precision

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status (reported 3 times).

Backend: NVIDIA CUDA - Benchmark: Double Precision

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status (reported 3 times).

Backend: NVIDIA CUDA - Benchmark: Single Precision

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status (reported 3 times).

RedShift Demo

This is a test of MAXON's RedShift demo build that currently requires NVIDIA GPU acceleration. Learn more via the OpenBenchmarking.org test page.

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status (reported 3 times). E: ./redshift: 3: /usr/redshift/bin/redshiftBenchmark: not found

FinanceBench

FinanceBench is a collection of financial program benchmarks with support for benchmarking on the GPU via OpenCL and on the CPU with OpenMP. The FinanceBench test cases focus on the Black-Scholes-Merton Process with Analytic European Option engine, the QMC (Sobol) Monte-Carlo method (Equity Option Example), Bonds (fixed-rate bond with flat forward curve), and Repo (securities repurchase agreement). FinanceBench was originally written by the Cavazos Lab at University of Delaware. Learn more via the OpenBenchmarking.org test page.
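
For readers unfamiliar with the first test case, the closed-form Black-Scholes price of a European option can be sketched in a few lines. This is a textbook illustration under stated assumptions (no dividends, constant rate and volatility), not FinanceBench's own code; the function names and the sample inputs are hypothetical.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def bs_european(spot: float, strike: float, rate: float, vol: float, years: float):
    """Return (call, put) prices from the closed-form Black-Scholes formula."""
    d1 = (log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt(years))
    d2 = d1 - vol * sqrt(years)
    discount = exp(-rate * years)  # present value of one unit paid at expiry
    call = spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    put = strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put


# Illustrative inputs only: at-the-money option, 5% rate, 20% volatility, 1 year.
call, put = bs_european(100.0, 100.0, 0.05, 0.2, 1.0)
print(f"call={call:.4f} put={put:.4f}")
```

Each option is priced independently of the others, which is why this kind of workload maps naturally onto OpenCL and OpenMP.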

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

MandelGPU

MandelGPU is an OpenCL benchmark. This test runs the OpenCL float4 rendering kernel with a maximum of 4096 iterations. Learn more via the OpenBenchmarking.org test page.
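
The iteration cap mentioned above bounds the per-pixel escape-time loop of a Mandelbrot renderer. The following is a scalar sketch of that loop, assuming the standard escape radius of 2; it is not MandelGPU's float4 OpenCL kernel, and the function name is hypothetical.

```python
MAX_ITERATIONS = 4096  # iteration cap stated in the test description


def escape_iterations(c: complex, max_iterations: int = MAX_ITERATIONS) -> int:
    """Count iterations of z = z*z + c before |z| exceeds 2, up to the cap."""
    z = 0j
    for i in range(max_iterations):
        z = z * z + c
        if abs(z) > 2.0:
            return i  # the point escaped, so it lies outside the set
    return max_iterations  # never escaped within the cap


print(escape_iterations(0j), escape_iterations(2 + 2j))
```

Points inside the set always run the full 4096 iterations, so the cost per pixel varies widely across the image.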

OpenCL Device: GPU

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status (reported 3 times).

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.

76 Results Shown

Caffe:
AlexNet - NVIDIA CUDA - 100
AlexNet - NVIDIA CUDA - 200
AlexNet - NVIDIA CUDA - 1000
GoogleNet - NVIDIA CUDA - 100
GoogleNet - NVIDIA CUDA - 200
GoogleNet - NVIDIA CUDA - 1000
SHOC Scalable HeterOgeneous Computing:
OpenCL - S3D
OpenCL - Triad
OpenCL - FFT SP
OpenCL - MD5 Hash
OpenCL - Reduction
OpenCL - GEMM SGEMM_N
OpenCL - Max SP Flops
OpenCL - Bus Speed Download
OpenCL - Bus Speed Readback
OpenCL - Texture Read Bandwidth
NCNN:
Vulkan GPU - mobilenet
Vulkan GPU-v2-v2 - mobilenet-v2
Vulkan GPU-v3-v3 - mobilenet-v3
Vulkan GPU - shufflenet-v2
Vulkan GPU - efficientnet-b0
Vulkan GPU - blazeface
Vulkan GPU - googlenet
Vulkan GPU - vgg16
Vulkan GPU - resnet18
Vulkan GPU - alexnet
Vulkan GPU - resnet50
Vulkan GPU - yolov4-tiny
Vulkan GPU - squeezenet_ssd
Vulkan GPU - regnety_400m
Vulkan GPU - vision_transformer
Vulkan GPU - FastestDet
Vulkan GPU - mnasnet
GROMACS
ArrayFire
Blender:
BMW27 - NVIDIA OptiX
Classroom - NVIDIA OptiX
Fishy Cat - NVIDIA OptiX
Barbershop - NVIDIA OptiX
Pabellon Barcelona - NVIDIA OptiX
FAHBench
Mixbench:
OpenCL - Integer
OpenCL - Double Precision
OpenCL - Single Precision
FinanceBench
cl-mem:
Copy
Read
Write
clpeak:
Integer Compute INT
Single-Precision Float
Double-Precision Double
Global Memory Bandwidth
ViennaCL:
CPU BLAS - sCOPY
CPU BLAS - sAXPY
CPU BLAS - sDOT
CPU BLAS - dCOPY
CPU BLAS - dAXPY
CPU BLAS - dDOT
CPU BLAS - dGEMV-N
CPU BLAS - dGEMV-T
CPU BLAS - dGEMM-NN
CPU BLAS - dGEMM-NT
CPU BLAS - dGEMM-TN
CPU BLAS - dGEMM-TT
OpenCL BLAS - sCOPY
OpenCL BLAS - sAXPY
OpenCL BLAS - sDOT
OpenCL BLAS - dCOPY
OpenCL BLAS - dAXPY
OpenCL BLAS - dDOT
OpenCL BLAS - dGEMV-N
OpenCL BLAS - dGEMV-T
OpenCL BLAS - dGEMM-NN
OpenCL BLAS - dGEMM-NT
OpenCL BLAS - dGEMM-TN
OpenCL BLAS - dGEMM-TT

NVIDIA A100 80GB PCIe

Testing initiated at 23 February 2024 14:18 by user root.

14 x Intel Xeon Gold 6342 - NVIDIA A100 80GB PCIe -

Testing initiated at 23 February 2024 16:35 by user root.
