a40-ml

KVM testing on Ubuntu 22.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2412124-NE-A40ML481010

Result Identifier: NVIDIA A40 - 80 x Intel Xeon
Date Run: December 11
Test Duration: 3 Hours, 23 Minutes


A40-ml Benchmarks

Processor: 80 x Intel Xeon (Icelake) (80 Cores)
Motherboard: Nutanix AHV (0.0.0 BIOS)
Chipset: Intel 440FX 82441FX PMC
Memory: 8 x 16 GB RAM Red Hat
Disk: 8796GB VDISK
Graphics: NVIDIA A40 48GB
Network: Red Hat Virtio device
OS: Ubuntu 22.04
Kernel: 6.5.0-45-generic (x86_64)
Display Driver: NVIDIA
Vulkan: 1.3.255
Compiler: GCC 11.4.0
File-System: ext4
Screen Resolution: 1280x1024
System Layer: KVM

System Logs

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- CPU Microcode: 0x1
- ??.??.??.??.??
- Python 3.10.12
- gather_data_sampling: Unknown: Dependent on hypervisor status
  + itlb_multihit: Not affected
  + l1tf: Not affected
  + mds: Not affected
  + meltdown: Not affected
  + mmio_stale_data: Vulnerable: Clear buffers attempted no microcode; SMT Host state unknown
  + retbleed: Not affected
  + spec_rstack_overflow: Not affected
  + spec_store_bypass: Mitigation of SSB disabled via prctl
  + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence; BHI: Syscall hardening KVM: SW loop
  + srbds: Not affected
  + tsx_async_abort: Not affected
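
The per-vulnerability status strings in the system logs resemble what the Linux kernel exposes under sysfs. The following is a minimal sketch for collecting the same information on another machine, assuming the usual directory /sys/devices/system/cpu/vulnerabilities is present; the function name read_mitigations is illustrative and not part of the Phoronix Test Suite.

```python
from pathlib import Path

# Assumed default location of the kernel's vulnerability status files.
DEFAULT_DIR = "/sys/devices/system/cpu/vulnerabilities"


def read_mitigations(base=DEFAULT_DIR):
    """Map each vulnerability name to its status string.

    Returns an empty dict when the directory does not exist
    (for example on non-Linux systems).
    """
    root = Path(base)
    if not root.is_dir():
        return {}
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(root.iterdir())
        if entry.is_file()
    }


if __name__ == "__main__":
    for name, status in read_mitigations().items():
        print(f"{name}: {status}")
```

Comparing this output between the guest and another system can help explain entries such as "Dependent on hypervisor status", which only appear inside a virtual machine.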

a40-ml result overview (NVIDIA A40 - 80 x Intel Xeon)

Test                                              Result     Unit
litert: DeepLab V3                                4389.58    Microseconds
litert: SqueezeNet                                2809.08    Microseconds
litert: Inception V4                              21955.2    Microseconds
litert: NASNet Mobile                             37733.5    Microseconds
litert: Mobilenet Float                           1791.25    Microseconds
litert: Mobilenet Quant                           1719.85    Microseconds
litert: Inception ResNet V2                       25768.9    Microseconds
litert: Quantized COCO SSD MobileNet v1           3369.02    Microseconds
rbenchmark:                                       0.1618     Seconds
numpy:                                            359.47     Score
deepspeech: CPU                                   98.09820   Seconds
rnnoise: 26 Minute Long Talking Sample            16.922     Seconds
tensorflow-lite: SqueezeNet                       2726.29    Microseconds
tensorflow-lite: Inception V4                     21435.0    Microseconds
tensorflow-lite: NASNet Mobile                    32777.4    Microseconds
tensorflow-lite: Mobilenet Float                  1857.44    Microseconds
tensorflow-lite: Mobilenet Quant                  3085.49    Microseconds
tensorflow-lite: Inception ResNet V2              40922.8    Microseconds
shoc: OpenCL - Max SP Flops                       37081.3    GFLOPS
shoc: OpenCL - Bus Speed Download                 25.3330    GB/s
shoc: OpenCL - Bus Speed Readback                 26.4022    GB/s
shoc: OpenCL - Texture Read Bandwidth             1935.67    GB/s
shoc: OpenCL - S3D                                347.337    GFLOPS
shoc: OpenCL - Triad                              24.4174    GB/s
shoc: OpenCL - FFT SP                             1822.04    GFLOPS
shoc: OpenCL - MD5 Hash                           41.2709    GHash/s
shoc: OpenCL - Reduction                          334.478    GB/s
shoc: OpenCL - GEMM SGEMM_N                       6124.24    GFLOPS
onednn: IP Shapes 1D - CPU                        0.786227   ms
onednn: IP Shapes 3D - CPU                        1.61857    ms
onednn: Convolution Batch Shapes Auto - CPU       3.11680    ms
onednn: Deconvolution Batch shapes_1d - CPU       7.21383    ms
onednn: Deconvolution Batch shapes_3d - CPU       1.32330    ms
onednn: Recurrent Neural Network Training - CPU   1129.72    ms
onednn: Recurrent Neural Network Inference - CPU  766.906    ms

LiteRT

Model: DeepLab V3 (LiteRT 2024-10-15; Microseconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 4389.58 (SE +/- 50.26, N = 3)

Model: SqueezeNet (LiteRT 2024-10-15; Microseconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 2809.08 (SE +/- 23.23, N = 3)

Model: Inception V4 (LiteRT 2024-10-15; Microseconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 21955.2 (SE +/- 210.50, N = 3)

Model: NASNet Mobile (LiteRT 2024-10-15; Microseconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 37733.5 (SE +/- 372.70, N = 3)

Model: Mobilenet Float (LiteRT 2024-10-15; Microseconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 1791.25 (SE +/- 17.09, N = 3)

Model: Mobilenet Quant (LiteRT 2024-10-15; Microseconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 1719.85 (SE +/- 18.75, N = 3)

Model: Inception ResNet V2 (LiteRT 2024-10-15; Microseconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 25768.9 (SE +/- 190.27, N = 3)

Model: Quantized COCO SSD MobileNet v1 (LiteRT 2024-10-15; Microseconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 3369.02 (SE +/- 6.32, N = 3)
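
To judge how noisy each LiteRT result is, the reported standard error (SE) can be expressed as a percentage of the mean. The following is a minimal sketch using values copied from the LiteRT results above; the names LITERT_RESULTS and relative_se are illustrative and not part of the Phoronix Test Suite.

```python
# (mean inference time in microseconds, standard error) per LiteRT model,
# copied from the results in this file.
LITERT_RESULTS = {
    "DeepLab V3": (4389.58, 50.26),
    "SqueezeNet": (2809.08, 23.23),
    "Inception V4": (21955.2, 210.50),
    "NASNet Mobile": (37733.5, 372.70),
    "Mobilenet Float": (1791.25, 17.09),
    "Mobilenet Quant": (1719.85, 18.75),
    "Inception ResNet V2": (25768.9, 190.27),
    "Quantized COCO SSD MobileNet v1": (3369.02, 6.32),
}


def relative_se(mean, se):
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


for model, (mean, se) in LITERT_RESULTS.items():
    print(f"{model}: {relative_se(mean, se):.2f}% relative SE")
```

A low percentage suggests the runs agreed closely; a high one suggests the mean should be read with more caution.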

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). The Phoronix Test Suite also provides pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if complementary metrics are desired. Learn more via the OpenBenchmarking.org test page.

Device: CPU - Batch Size: 1 - Model: VGG-16

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 1 - Model: VGG-16

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 1 - Model: AlexNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 16 - Model: VGG-16

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 32 - Model: VGG-16

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 64 - Model: VGG-16

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 1 - Model: AlexNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 16 - Model: VGG-16

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 32 - Model: VGG-16

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 64 - Model: VGG-16

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 16 - Model: AlexNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 256 - Model: VGG-16

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 32 - Model: AlexNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 512 - Model: VGG-16

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 64 - Model: AlexNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 16 - Model: AlexNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 256 - Model: VGG-16

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 32 - Model: AlexNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 512 - Model: VGG-16

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 64 - Model: AlexNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 1 - Model: GoogLeNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 1 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 256 - Model: AlexNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 512 - Model: AlexNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 1 - Model: GoogLeNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 1 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 256 - Model: AlexNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 512 - Model: AlexNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 16 - Model: GoogLeNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 16 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 32 - Model: GoogLeNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 32 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 64 - Model: GoogLeNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 64 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 16 - Model: GoogLeNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 16 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 32 - Model: GoogLeNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 32 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 64 - Model: GoogLeNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 64 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 256 - Model: GoogLeNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 256 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 512 - Model: GoogLeNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: CPU - Batch Size: 512 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 256 - Model: GoogLeNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 256 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 512 - Model: GoogLeNet

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'

Device: GPU - Batch Size: 512 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'absl'
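
Every TensorFlow run above fails with the same error, a missing Python module named absl. The following is a minimal pre-flight sketch that checks whether a module can be found before a run is started. The helper name missing_modules is hypothetical. As far as I know the absl module is shipped by the absl-py package on PyPI, but that is an assumption and is not stated in this result file.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["absl"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

The check is only meaningful when run with the same Python interpreter and environment that the test profile uses.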

LeelaChessZero

Backend: BLAS

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status.

R Benchmark

This test is a quick-running survey of general R performance. Learn more via the OpenBenchmarking.org test page.

R Benchmark (Seconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 0.1618 (SE +/- 0.0011, N = 3)

1. R scripting front-end version 4.1.2 (2021-11-01)

Numpy Benchmark

This is a test to obtain the general Numpy performance. Learn more via the OpenBenchmarking.org test page.

Numpy Benchmark (Score, More Is Better)

NVIDIA A40 - 80 x Intel Xeon: 359.47 (SE +/- 1.09, N = 3)

DeepSpeech

Mozilla DeepSpeech is a speech-to-text engine powered by TensorFlow for machine learning and derived from Baidu's Deep Speech research paper. This test profile times the speech-to-text process for a roughly three minute audio recording. Learn more via the OpenBenchmarking.org test page.

Acceleration: CPU (DeepSpeech 0.6; Seconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 98.10 (SE +/- 0.53, N = 3)

RNNoise

RNNoise is a recurrent neural network for audio noise reduction developed by Mozilla and Xiph.Org. This test profile is a single-threaded test measuring the time to denoise a sample 26 minute long 16-bit RAW audio file using this recurrent neural network noise suppression library. Learn more via the OpenBenchmarking.org test page.

Input: 26 Minute Long Talking Sample (RNNoise 0.2; Seconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 16.92 (SE +/- 0.17, N = 15)

1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden

PyTorch

This is a benchmark of PyTorch making use of pytorch-benchmark [https://github.com/LukasHedegaard/pytorch-benchmark]. Learn more via the OpenBenchmarking.org test page.

Device: CPU - Batch Size: 1 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 1 - Model: ResNet-152

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 16 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 32 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 64 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 16 - Model: ResNet-152

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 256 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 32 - Model: ResNet-152

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 512 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 64 - Model: ResNet-152

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 256 - Model: ResNet-152

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 512 - Model: ResNet-152

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 256 - Model: Efficientnet_v2_l

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-152

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-152

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-152

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-50

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-152

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-152

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-152

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: Efficientnet_v2_l

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: Efficientnet_v2_l

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: Efficientnet_v2_l

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: Efficientnet_v2_l

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: Efficientnet_v2_l

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: Efficientnet_v2_l

NVIDIA A40 - 80 x Intel Xeon: The test quit with a non-zero exit status. E: ImportError: libnvJitLink.so.12: cannot open shared object file: No such file or directory
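
Every PyTorch run above fails because the shared library libnvJitLink.so.12 cannot be opened. The following is a minimal sketch that asks the dynamic loader whether a library can be opened from its default search path. The helper name can_load is hypothetical. A negative result only shows that the default loader path does not contain the library; it does not cover copies bundled inside a Python package directory.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    name = "libnvJitLink.so.12"
    status = "found" if can_load(name) else "not found"
    print(f"{name}: {status}")
```

Running the check inside the same environment as the test profile gives the most relevant answer, since library search paths can differ between shells and virtual environments.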

TensorFlow Lite

This is a benchmark of the TensorFlow Lite implementation focused on TensorFlow machine learning for mobile, IoT, edge, and other cases. The current Linux support is limited to running on CPUs. This test profile is measuring the average inference time. Learn more via the OpenBenchmarking.org test page.

Model: SqueezeNet (TensorFlow Lite 2022-05-18; Microseconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 2726.29 (SE +/- 15.01, N = 3)

Model: Inception V4 (TensorFlow Lite 2022-05-18; Microseconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 21435.0 (SE +/- 92.43, N = 3)

Model: NASNet Mobile (TensorFlow Lite 2022-05-18; Microseconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 32777.4 (SE +/- 229.61, N = 12)

Model: Mobilenet Float (TensorFlow Lite 2022-05-18; Microseconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 1857.44 (SE +/- 8.77, N = 3)

Model: Mobilenet Quant (TensorFlow Lite 2022-05-18; Microseconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 3085.49 (SE +/- 31.88, N = 5)

Model: Inception ResNet V2 (TensorFlow Lite 2022-05-18; Microseconds, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 40922.8 (SE +/- 4628.73, N = 12)
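
Six models appear in both the LiteRT and TensorFlow Lite results, so the two can be compared directly. The following is a minimal sketch using the mean times copied from this file; the names LITERT, TFLITE and ratio are illustrative. The two suites use different software versions (2024-10-15 and 2022-05-18), so the ratios are a rough comparison rather than a controlled one.

```python
# Mean inference times in microseconds for the models present in both suites,
# copied from the LiteRT and TensorFlow Lite results in this file.
LITERT = {
    "SqueezeNet": 2809.08,
    "Inception V4": 21955.2,
    "NASNet Mobile": 37733.5,
    "Mobilenet Float": 1791.25,
    "Mobilenet Quant": 1719.85,
    "Inception ResNet V2": 25768.9,
}
TFLITE = {
    "SqueezeNet": 2726.29,
    "Inception V4": 21435.0,
    "NASNet Mobile": 32777.4,
    "Mobilenet Float": 1857.44,
    "Mobilenet Quant": 3085.49,
    "Inception ResNet V2": 40922.8,
}


def ratio(model):
    """LiteRT time divided by TensorFlow Lite time; below 1.0 means LiteRT was faster."""
    return LITERT[model] / TFLITE[model]


for model in LITERT:
    print(f"{model}: LiteRT / TensorFlow Lite = {ratio(model):.2f}")
```

The TensorFlow Lite Inception ResNet V2 result carries a standard error of 4628.73 over 12 runs, so its ratio is the least reliable of the six.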

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.

Target: OpenCL - Benchmark: Max SP Flops (SHOC Scalable HeterOgeneous Computing 2020-04-17; GFLOPS, More Is Better)

NVIDIA A40 - 80 x Intel Xeon: 37081.3 (SE +/- 0.19, N = 3)

Target: OpenCL - Benchmark: Bus Speed Download (SHOC Scalable HeterOgeneous Computing 2020-04-17; GB/s, More Is Better)

NVIDIA A40 - 80 x Intel Xeon: 25.33 (SE +/- 0.00, N = 3)

Target: OpenCL - Benchmark: Bus Speed Readback (SHOC Scalable HeterOgeneous Computing 2020-04-17; GB/s, More Is Better)

NVIDIA A40 - 80 x Intel Xeon: 26.40 (SE +/- 0.00, N = 3)

Target: OpenCL - Benchmark: Texture Read Bandwidth (SHOC Scalable HeterOgeneous Computing 2020-04-17; GB/s, More Is Better)

NVIDIA A40 - 80 x Intel Xeon: 1935.67 (SE +/- 1.74, N = 3)

Target: OpenCL - Benchmark: S3D (SHOC Scalable HeterOgeneous Computing 2020-04-17; GFLOPS, More Is Better)

NVIDIA A40 - 80 x Intel Xeon: 347.34 (SE +/- 0.20, N = 3)

Target: OpenCL - Benchmark: Triad (SHOC Scalable HeterOgeneous Computing 2020-04-17; GB/s, More Is Better)

NVIDIA A40 - 80 x Intel Xeon: 24.42 (SE +/- 0.01, N = 3)

Target: OpenCL - Benchmark: FFT SP (SHOC Scalable HeterOgeneous Computing 2020-04-17; GFLOPS, More Is Better)

NVIDIA A40 - 80 x Intel Xeon: 1822.04 (SE +/- 0.32, N = 3)

Target: OpenCL - Benchmark: MD5 Hash (SHOC Scalable HeterOgeneous Computing 2020-04-17; GHash/s, More Is Better)

NVIDIA A40 - 80 x Intel Xeon: 41.27 (SE +/- 0.00, N = 3)

Target: OpenCL - Benchmark: Reduction (SHOC Scalable HeterOgeneous Computing 2020-04-17; GB/s, More Is Better)

NVIDIA A40 - 80 x Intel Xeon: 334.48 (SE +/- 0.54, N = 3)

Target: OpenCL - Benchmark: GEMM SGEMM_N (SHOC Scalable HeterOgeneous Computing 2020-04-17; GFLOPS, More Is Better)

NVIDIA A40 - 80 x Intel Xeon: 6124.24 (SE +/- 27.83, N = 3)

1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
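
The SHOC results report both a synthetic peak (Max SP Flops) and several kernels measured in GFLOPS, so each kernel can be expressed as a fraction of that measured peak. The following is a minimal sketch using values copied from this file; the names MAX_SP_GFLOPS, KERNEL_GFLOPS and fraction_of_peak are illustrative. The peak used here is the measured Max SP Flops figure, not a vendor specification.

```python
# Measured single-precision peak from the SHOC "Max SP Flops" result (GFLOPS).
MAX_SP_GFLOPS = 37081.3

# SHOC OpenCL kernels reported in GFLOPS, copied from the results in this file.
KERNEL_GFLOPS = {
    "S3D": 347.34,
    "FFT SP": 1822.04,
    "GEMM SGEMM_N": 6124.24,
}


def fraction_of_peak(gflops, peak=MAX_SP_GFLOPS):
    """Return a kernel's throughput as a percentage of the measured peak."""
    return 100.0 * gflops / peak


for kernel, gflops in KERNEL_GFLOPS.items():
    print(f"{kernel}: {fraction_of_peak(gflops):.1f}% of measured Max SP Flops")
```

This kind of ratio helps separate kernels that are limited by arithmetic throughput from those that are more likely limited by memory bandwidth or launch overhead.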

oneDNN

Harness: IP Shapes 1D - Engine: CPU (oneDNN 3.6; ms, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 0.786227 (SE +/- 0.003359, N = 3; MIN: 0.71)

Harness: IP Shapes 3D - Engine: CPU (oneDNN 3.6; ms, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 1.61857 (SE +/- 0.02310, N = 3; MIN: 1.47)

Harness: Convolution Batch Shapes Auto - Engine: CPU (oneDNN 3.6; ms, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 3.11680 (SE +/- 0.03515, N = 3; MIN: 2.86)

Harness: Deconvolution Batch shapes_1d - Engine: CPU (oneDNN 3.6; ms, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 7.21383 (SE +/- 0.13056, N = 15; MIN: 5.8)

Harness: Deconvolution Batch shapes_3d - Engine: CPU (oneDNN 3.6; ms, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 1.32330 (SE +/- 0.01177, N = 3; MIN: 1.19)

Harness: Recurrent Neural Network Training - Engine: CPU (oneDNN 3.6; ms, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 1129.72 (SE +/- 6.18, N = 3; MIN: 1084.9)

Harness: Recurrent Neural Network Inference - Engine: CPU (oneDNN 3.6; ms, Fewer Is Better)

NVIDIA A40 - 80 x Intel Xeon: 766.91 (SE +/- 5.38, N = 15; MIN: 692.95)

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl