AMD EPYC 8534P

AMD EPYC 8534P 64-Core testing with an AMD Cinnabar (RCB1009C BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.

a

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xaa00212
Java Notes: OpenJDK Runtime Environment (build 11.0.20+8-post-Ubuntu-1ubuntu1)
Python Notes: Python 3.11.5
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

b

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xaa00212
Python Notes: Python 3.11.5
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

c

Processor: AMD EPYC 8534P 64-Core @ 2.30GHz (64 Cores / 128 Threads), Motherboard: AMD Cinnabar (RCB1009C BIOS), Chipset: AMD Device 14a4, Memory: 6 x 32 GB DRAM-4800MT/s Samsung M321R4GA0BB0-CQKMG, Disk: 3201GB Micron_7450_MTFDKCB3T2TFS, Graphics: ASPEED, Network: 2 x Broadcom NetXtreme BCM5720 PCIe

OS: Ubuntu 23.10, Kernel: 6.5.0-5-generic (x86_64), Desktop: GNOME Shell, Display Server: X Server 1.21.1.7, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 640x480
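
The Security Notes lines above pack one entry per vulnerability into a single string, written as "name: status" and joined by " + ". A minimal sketch of splitting such a line into a lookup table follows; the function name parse_security_notes is hypothetical and the input is a shortened excerpt of the line reported above, not the full string.

```python
def parse_security_notes(notes: str) -> dict[str, str]:
    """Split a 'name: status + name: status' string into a dict."""
    result: dict[str, str] = {}
    for entry in notes.split(" + "):
        # Split on the first ': ' only, so statuses that contain
        # further colons (e.g. spectre_v2) stay intact.
        name, _, status = entry.partition(": ")
        result[name.strip()] = status.strip()
    return result


# Shortened excerpt of the Security Notes line from this result file.
notes = (
    "gather_data_sampling: Not affected + retbleed: Not affected + "
    "spec_rstack_overflow: Mitigation of safe RET + "
    "spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional "
    "STIBP: always-on RSB filling PBRSB-eIBRS: Not affected"
)

parsed = parse_security_notes(notes)
# Names whose status reports an active mitigation.
mitigated = sorted(k for k, v in parsed.items() if v.startswith("Mitigation"))
print(mitigated)
```

Splitting on the first ": " is a deliberate choice here, because the spectre_v2 status itself contains several colon-separated sub-fields.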

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). The Phoronix Test Suite also provides pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if complementary metrics are desired. Learn more via the OpenBenchmarking.org test page.

Quicksilver

Quicksilver is a proxy application that represents some elements of the Mercury workload by solving a simplified dynamic Monte Carlo particle transport problem. Quicksilver is developed by Lawrence Livermore National Laboratory (LLNL) and this test profile currently makes use of the OpenMP CPU threaded code path. Learn more via the OpenBenchmarking.org test page.

Y-Cruncher

Y-Cruncher is a multi-threaded Pi benchmark capable of computing Pi to trillions of digits. Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.
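
Each DeepSparse result in this file is reported twice, once as items/sec and once as ms/batch. As a rough sketch of how the two units relate, assuming a synchronous single-stream run in which one batch is processed at a time, throughput can be estimated from latency as shown below. The function name and the 20 ms figure are hypothetical and are not values from this result file; the relation does not hold for the asynchronous multi-stream scenario, where several batches are in flight at once.

```python
def items_per_sec_from_latency(ms_per_batch: float, batch_size: int = 1) -> float:
    """Estimate throughput from per-batch latency.

    Assumes batches are processed strictly one after another
    (synchronous single-stream), so no work overlaps.
    """
    if ms_per_batch <= 0:
        raise ValueError("ms_per_batch must be positive")
    return batch_size * 1000.0 / ms_per_batch


# Hypothetical latency, not taken from the results above.
estimate = items_per_sec_from_latency(20.0)
print(f"{estimate:.1f} items/sec")
```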

PyTorch

This is a benchmark of PyTorch making use of pytorch-benchmark [https://github.com/LukasHedegaard/pytorch-benchmark]. Currently this test profile is geared toward CPU-based testing. Learn more via the OpenBenchmarking.org test page.

Llama.cpp

Llama.cpp is a port of Facebook's LLaMA model in C/C++ developed by Georgi Gerganov. Llama.cpp allows the inference of LLaMA and other supported models in C/C++. For CPU inference Llama.cpp supports AVX2/AVX-512, ARM NEON, and other modern ISAs along with features like OpenBLAS usage. Learn more via the OpenBenchmarking.org test page.

Llamafile

Mozilla's Llamafile allows distributing and running large language models (LLMs) as a single file. Llamafile aims to make open-source LLMs more accessible to developers and users. Llamafile supports a variety of models, CPUs and GPUs, and other options. Learn more via the OpenBenchmarking.org test page.

SVT-AV1

This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format; this test uses a sample YUV video file. Learn more via the OpenBenchmarking.org test page.

Speedb

Speedb is a next-generation key-value storage engine that is RocksDB-compatible and aims for stability, efficiency, and performance. Learn more via the OpenBenchmarking.org test page.

83 Results Shown

TensorFlow
Quicksilver
Y-Cruncher
TensorFlow:
CPU - 16 - ResNet-50
CPU - 512 - ResNet-50
Y-Cruncher
Neural Magic DeepSparse:
NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream:
items/sec
ms/batch
NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream:
items/sec
ms/batch
NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream:
items/sec
ms/batch
NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream:
items/sec
ms/batch
ResNet-50, Baseline - Asynchronous Multi-Stream:
items/sec
ms/batch
ResNet-50, Baseline - Synchronous Single-Stream:
items/sec
ms/batch
ResNet-50, Sparse INT8 - Asynchronous Multi-Stream:
items/sec
ms/batch
ResNet-50, Sparse INT8 - Synchronous Single-Stream:
items/sec
ms/batch
CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream:
items/sec
ms/batch
CV Detection, YOLOv5s COCO - Synchronous Single-Stream:
items/sec
ms/batch
BERT-Large, NLP Question Answering - Asynchronous Multi-Stream:
items/sec
ms/batch
BERT-Large, NLP Question Answering - Synchronous Single-Stream:
items/sec
ms/batch
CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream:
items/sec
ms/batch
CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream:
items/sec
ms/batch
CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream:
items/sec
ms/batch
CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream:
items/sec
ms/batch
Quicksilver
Neural Magic DeepSparse:
NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream:
items/sec
ms/batch
NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream:
items/sec
ms/batch
CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream:
items/sec
ms/batch
CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream:
items/sec
ms/batch
BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream:
items/sec
ms/batch
BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream:
items/sec
ms/batch
NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream:
items/sec
ms/batch
NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream:
items/sec
ms/batch
PyTorch:
CPU - 1 - ResNet-50
CPU - 1 - ResNet-152
CPU - 16 - ResNet-50
CPU - 16 - ResNet-152
CPU - 512 - ResNet-50
CPU - 512 - ResNet-152
CPU - 1 - Efficientnet_v2_l
CPU - 16 - Efficientnet_v2_l
CPU - 512 - Efficientnet_v2_l
Llama.cpp:
llama-2-7b.Q4_0.gguf
llama-2-13b.Q4_0.gguf
llama-2-70b-chat.Q5_0.gguf
Llamafile:
llava-v1.5-7b-q4 - CPU
mistral-7b-instruct-v0.2.Q8_0 - CPU
wizardcoder-python-34b-v1.0.Q6_K - CPU
SVT-AV1
Quicksilver
SVT-AV1:
Preset 8 - Bosphorus 4K
Preset 12 - Bosphorus 4K
Preset 13 - Bosphorus 4K
Preset 4 - Bosphorus 1080p
Preset 8 - Bosphorus 1080p
Preset 12 - Bosphorus 1080p
Preset 13 - Bosphorus 1080p
Speedb:
Rand Read
Update Rand
Read While Writing
Read Rand Write Rand

a

Testing initiated at 27 January 2024 22:40 by user phoronix.

b

Testing initiated at 28 January 2024 01:06 by user phoronix.

c

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xaa00212
Python Notes: Python 3.11.5
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Testing initiated at 28 January 2024 02:56 by user phoronix.
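
The three "Testing initiated" timestamps above give a rough sense of how the runs were spaced. The sketch below computes the start-to-start gaps; it assumes the timestamps share one time zone and an English month-name locale. The gaps are upper bounds on how long runs a and b took, not measured run durations.

```python
from datetime import datetime

FMT = "%d %B %Y %H:%M"  # e.g. "27 January 2024 22:40"

# Start times as listed in this result file.
starts = {
    "a": "27 January 2024 22:40",
    "b": "28 January 2024 01:06",
    "c": "28 January 2024 02:56",
}
parsed = {run: datetime.strptime(text, FMT) for run, text in starts.items()}

# Start-to-start gaps in minutes.
gap_a_to_b = (parsed["b"] - parsed["a"]).total_seconds() / 60
gap_b_to_c = (parsed["c"] - parsed["b"]).total_seconds() / 60
print(gap_a_to_b, gap_b_to_c)
```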
