2

KVM testing on Ubuntu 20.04 via the Phoronix Test Suite.

NVIDIA A100 80GB PCIe

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-9QDOt0/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: CPU Microcode: 0x1
Graphics Notes: BAR1 / Visible vRAM Size: 131072 MiB - vBIOS Version: 92.00.90.00.0f
Python Notes: Python 3.8.10
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Unknown: No mitigations + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected

14 x Intel Xeon Gold 6342 - NVIDIA A100 80GB PCIe

Processor: 14 x Intel Xeon Gold 6342 (14 Cores), Motherboard: Nutanix AHV (nutanix-ahv-2.20220304.0.2619.el7 BIOS), Chipset: Intel 440FX 82441FX PMC, Memory: 4 x 16384 MB RAM, Disk: 428GB VDISK, Graphics: NVIDIA A100 80GB PCIe, Network: Red Hat Virtio device

OS: Ubuntu 20.04, Kernel: 5.4.0-172-generic (x86_64), Display Driver: NVIDIA, OpenCL: OpenCL 3.0 CUDA 12.2.148, Vulkan: 1.3.242, Compiler: GCC 9.4.0 + CUDA 12.3, File-System: ext4, Screen Resolution: 1024x768, System Layer: KVM

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. Learn more via the OpenBenchmarking.org test page.

ViennaCL

ViennaCL is an open-source linear algebra library written in C++ with support for OpenCL and OpenMP. This test profile makes use of ViennaCL's built-in benchmarks. Learn more via the OpenBenchmarking.org test page.

Blender

Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles performance with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported as well as HIP for AMD Radeon GPUs and Intel oneAPI for Intel Graphics. Learn more via the OpenBenchmarking.org test page.

FAHBench

FAHBench is a Folding@Home benchmark on the GPU. Learn more via the OpenBenchmarking.org test page.

GROMACS

The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package, tested with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

Caffe

This is a benchmark of the Caffe deep learning framework. It currently supports the AlexNet and GoogleNet models and execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.

LuxCoreRender

LuxCoreRender is an open-source 3D physically based renderer formerly known as LuxRender. LuxCoreRender supports CPU-based rendering as well as GPU acceleration via OpenCL, NVIDIA CUDA, and NVIDIA OptiX interfaces. Learn more via the OpenBenchmarking.org test page.

Scene: LuxCore Benchmark - Acceleration: GPU

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status. E: RUNTIME ERROR: CUDA driver API error CUDA_ERROR_UNSUPPORTED_PTX_VERSION (code: 222, file:/home/vsts/work/1/s/LinuxCompile/LuxCore-sdk/src/luxrays/utils/cuda.cpp, line: 251): the provided PTX was compiled with an unsupported toolchain.

Scene: Danish Mood - Acceleration: GPU

PlaidML

This test profile uses the PlaidML deep learning framework, developed by Intel, to run various benchmarks. Learn more via the OpenBenchmarking.org test page.

FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL

NVIDIA A100 80GB PCIe: The test run did not produce a result.

FP16: No - Mode: Training - Network: Mobilenet - Device: OpenCL

NVIDIA A100 80GB PCIe: The test run did not produce a result.

LuxCoreRender

Scene: DLSC - Acceleration: GPU

Scene: Orange Juice - Acceleration: GPU

ArrayFire

ArrayFire is a GPU and CPU numeric processing library. This test uses the built-in CPU and OpenCL ArrayFire benchmarks. Learn more via the OpenBenchmarking.org test page.

Hashcat

Hashcat is an open-source, advanced password recovery tool supporting GPU acceleration with OpenCL, NVIDIA CUDA, and Radeon ROCm. Learn more via the OpenBenchmarking.org test page.

Benchmark: MD5

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status. E: ptxas application ptx input, line 9; fatal : Unsupported .version 8.3; current version is '8.2'
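
The version mismatch reported by ptxas in the Hashcat MD5 failure can be extracted mechanically when triaging logs like this one. The following is a minimal Python sketch, assuming the error line has the wording quoted in this result; the function name `parse_ptx_mismatch` is hypothetical and is not part of Hashcat, CUDA, or the Phoronix Test Suite.

```python
import re

# Pattern modelled on the ptxas message quoted in this result.
# Assumption: other ptxas releases may word the error differently.
PTX_MISMATCH = re.compile(
    r"Unsupported \.version (\d+)\.(\d+); current version is '(\d+)\.(\d+)'"
)


def parse_ptx_mismatch(line):
    """Return ((requested_major, requested_minor),
    (supported_major, supported_minor)), or None if the line does not match."""
    match = PTX_MISMATCH.search(line)
    if match is None:
        return None
    req_major, req_minor, sup_major, sup_minor = (int(g) for g in match.groups())
    return (req_major, req_minor), (sup_major, sup_minor)


if __name__ == "__main__":
    sample = (
        "ptxas application ptx input, line 9; fatal : "
        "Unsupported .version 8.3; current version is '8.2'"
    )
    requested, supported = parse_ptx_mismatch(sample)
    print("PTX requested:", requested, "PTX supported:", supported)
```

In this run the generated PTX targets ISA 8.3 while the installed ptxas accepts up to 8.2, which is consistent with the CUDA 12.3 compiler and the CUDA 12.2.148 OpenCL runtime listed in the system details.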

Mixbench

A benchmark suite for GPUs on mixed operational intensity kernels. Learn more via the OpenBenchmarking.org test page.

Hashcat

Benchmark: SHA-512

Benchmark: SHA1

Benchmark: 7-Zip

cl-mem

A basic OpenCL memory benchmark. Learn more via the OpenBenchmarking.org test page.

Hashcat

Benchmark: TrueCrypt RIPEMD160 + XTS

PlaidML

FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL

NVIDIA A100 80GB PCIe: The test run did not produce a result.

FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL

NVIDIA A100 80GB PCIe: The test run did not produce a result.

FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL

NVIDIA A100 80GB PCIe: The test run did not produce a result.

FinanceBench

FinanceBench is a collection of financial program benchmarks with support for benchmarking on the GPU via OpenCL and CPU benchmarking with OpenMP. The FinanceBench test cases are focused on Black-Scholes-Merton Process with Analytic European Option engine, QMC (Sobol) Monte-Carlo method (Equity Option Example), Bonds Fixed-rate bond with flat forward curve, and Repo Securities repurchase agreement. FinanceBench was originally written by the Cavazos Lab at University of Delaware. Learn more via the OpenBenchmarking.org test page.

clpeak

Clpeak is designed to test the peak capabilities of OpenCL devices. Learn more via the OpenBenchmarking.org test page.

NeatBench

NeatBench is a benchmark of the cross-platform Neat Video software on the CPU, with optional GPU (OpenCL / CUDA) support. Learn more via the OpenBenchmarking.org test page.

Acceleration: GPU

NVIDIA A100 80GB PCIe: The test run did not produce a result.

Rodinia

Rodinia is a suite focused upon accelerating compute-intensive applications with accelerators. CUDA, OpenMP, and OpenCL parallel models are supported by the included applications. This profile utilizes select OpenCL, NVIDIA CUDA and OpenMP test binaries at the moment. Learn more via the OpenBenchmarking.org test page.

Test: OpenCL Particle Filter

NVIDIA A100 80GB PCIe: The test run did not produce a result. E: ERROR: clEnqueueWriteBuffer arrayX_GPU (size:400000) => -2002801765

LuxCoreRender

Scene: Rainbow Colors and Prism - Acceleration: GPU

MandelGPU

MandelGPU is an OpenCL benchmark and this test runs with the OpenCL rendering float4 kernel with a maximum of 4096 iterations. Learn more via the OpenBenchmarking.org test page.

OpenCL Device: GPU

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status.

Mixbench

Backend: NVIDIA CUDA - Benchmark: Single Precision

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status.

Backend: NVIDIA CUDA - Benchmark: Half Precision

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status.

Backend: NVIDIA CUDA - Benchmark: Double Precision

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status.

Backend: NVIDIA CUDA - Benchmark: Integer

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status.

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

Backend: OpenCL

NVIDIA A100 80GB PCIe: The test run did not produce a result.

RedShift Demo

This is a test of MAXON's RedShift demo build that currently requires NVIDIA GPU acceleration. Learn more via the OpenBenchmarking.org test page.

NVIDIA A100 80GB PCIe: The test quit with a non-zero exit status. E: ./redshift: 3: /usr/redshift/bin/redshiftBenchmark: not found

76 Results Shown

SHOC Scalable HeterOgeneous Computing
NCNN:
Vulkan GPU - mnasnet
Vulkan GPU - FastestDet
Vulkan GPU - vision_transformer
Vulkan GPU - regnety_400m
Vulkan GPU - squeezenet_ssd
Vulkan GPU - yolov4-tiny
Vulkan GPU - resnet50
Vulkan GPU - alexnet
Vulkan GPU - resnet18
Vulkan GPU - vgg16
Vulkan GPU - googlenet
Vulkan GPU - blazeface
Vulkan GPU - efficientnet-b0
Vulkan GPU - shufflenet-v2
Vulkan GPU - mobilenet-v3
Vulkan GPU - mobilenet-v2
Vulkan GPU - mobilenet
ViennaCL:
CPU BLAS - dGEMM-TT
CPU BLAS - dGEMM-TN
CPU BLAS - dGEMM-NT
CPU BLAS - dGEMM-NN
CPU BLAS - dGEMV-T
CPU BLAS - dGEMV-N
CPU BLAS - dDOT
CPU BLAS - dAXPY
CPU BLAS - dCOPY
CPU BLAS - sDOT
CPU BLAS - sAXPY
CPU BLAS - sCOPY
Blender
FAHBench
Blender:
Barbershop - NVIDIA OptiX
Fishy Cat - NVIDIA OptiX
Pabellon Barcelona - NVIDIA OptiX
GROMACS
Caffe
Blender
ViennaCL:
OpenCL BLAS - dGEMM-TT
OpenCL BLAS - dGEMM-TN
OpenCL BLAS - dGEMM-NT
OpenCL BLAS - dGEMM-NN
OpenCL BLAS - dGEMV-T
OpenCL BLAS - dGEMV-N
OpenCL BLAS - dDOT
OpenCL BLAS - dAXPY
OpenCL BLAS - dCOPY
OpenCL BLAS - sDOT
OpenCL BLAS - sAXPY
OpenCL BLAS - sCOPY
Caffe
SHOC Scalable HeterOgeneous Computing
Caffe
ArrayFire
Caffe
Mixbench
cl-mem:
Write
Copy
Read
SHOC Scalable HeterOgeneous Computing
FinanceBench
clpeak
Caffe
clpeak
SHOC Scalable HeterOgeneous Computing:
OpenCL - Reduction
OpenCL - FFT SP
OpenCL - S3D
OpenCL - Bus Speed Readback
OpenCL - GEMM SGEMM_N
Caffe
SHOC Scalable HeterOgeneous Computing
clpeak:
Double-Precision Double
Integer Compute INT
Mixbench:
OpenCL - Single Precision
OpenCL - Double Precision
SHOC Scalable HeterOgeneous Computing

NVIDIA A100 80GB PCIe

Testing initiated at 23 February 2024 14:18 by user root.

14 x Intel Xeon Gold 6342 - NVIDIA A100 80GB PCIe

Testing initiated at 23 February 2024 16:35 by user root.
