xeon febby

2 x INTEL XEON PLATINUM 8592+ testing with a Quanta Cloud QuantaGrid D54Q-2U S6Q-MB-MPS (3B05.TEL4P1 BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.

a

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0x21000161
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected

b

Processor: 2 x INTEL XEON PLATINUM 8592+ @ 3.90GHz (128 Cores / 256 Threads), Motherboard: Quanta Cloud QuantaGrid D54Q-2U S6Q-MB-MPS (3B05.TEL4P1 BIOS), Chipset: Intel Device 1bce, Memory: 1008GB, Disk: 3201GB Micron_7450_MTFDKCB3T2TFS, Graphics: ASPEED, Network: 2 x Intel X710 for 10GBASE-T

OS: Ubuntu 23.10, Kernel: 6.6.0-060600-generic (x86_64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1024x768

PyTorch

This is a benchmark of PyTorch making use of pytorch-benchmark [https://github.com/LukasHedegaard/pytorch-benchmark]. Currently this test profile is catered to CPU-based testing. Learn more via the OpenBenchmarking.org test page.

Llama.cpp

Llama.cpp is a port of Facebook's LLaMA model in C/C++ developed by Georgi Gerganov. Llama.cpp allows the inference of LLaMA and other supported models in C/C++. For CPU inference Llama.cpp supports AVX2/AVX-512, ARM NEON, and other modern ISAs along with features like OpenBLAS usage. Learn more via the OpenBenchmarking.org test page.

Quicksilver

Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.

Llama.cpp

Llamafile

Mozilla's Llamafile allows distributing and running large language models (LLMs) as a single file. Llamafile aims to make open-source LLMs more accessible to developers and users. Llamafile supports a variety of models, CPUs and GPUs, and other options. Learn more via the OpenBenchmarking.org test page.

Quicksilver

Llamafile

NAMD

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.

Llamafile

NAMD

ONNX Runtime

ONNX Runtime is developed by Microsoft and partners as a open-source, cross-platform, high performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Model Zoo. Learn more via the OpenBenchmarking.org test page.

Result

Inference Time Cost (ms)

PyTorch

ONNX Runtime

Result

Inference Time Cost (ms)

Speedb

Speedb is a next-generation key value storage engine that is RocksDB compatible and aiming for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.

ONNX Runtime

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Speedb

Speedb is a next-generation key value storage engine that is RocksDB compatible and aiming for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.

ONNX Runtime

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

dav1d

Dav1d is an open-source, speedy AV1 video decoder supporting modern SIMD CPU features. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.

GROMACS

The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package testing with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

PyTorch

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

Intel Open Image Denoise

Open Image Denoise is a denoising library for ray-tracing and part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

TensorFlow

Y-Cruncher

Y-Cruncher is a multi-threaded Pi benchmark capable of computing Pi to trillions of digits. Learn more via the OpenBenchmarking.org test page.

Intel Open Image Denoise

Open Image Denoise is a denoising library for ray-tracing and part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

Y-Cruncher

Y-Cruncher is a multi-threaded Pi benchmark capable of computing Pi to trillions of digits. Learn more via the OpenBenchmarking.org test page.

TensorFlow

52 Results Shown

a

Testing initiated at 19 February 2024 21:44 by user phoronix.

b

OS: Ubuntu 23.10, Kernel: 6.6.0-060600-generic (x86_64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1024x768

Testing initiated at 20 February 2024 00:25 by user phoronix.

xeon febby

View

Statistics

Graph Settings

Multi-Way Comparison

Table

Run Management

a

b

PyTorch

Llama.cpp

Quicksilver

Llama.cpp

Llamafile

Quicksilver

Llamafile

NAMD

Llamafile

NAMD

ONNX Runtime

PyTorch

ONNX Runtime

Speedb

ONNX Runtime

Speedb

ONNX Runtime

dav1d

GROMACS

PyTorch

TensorFlow

Intel Open Image Denoise

TensorFlow

Y-Cruncher

Intel Open Image Denoise

Y-Cruncher

TensorFlow

52 Results Shown

a

b