Deepspaarse 17 Benchmarks - OpenBenchmarking.org

AMD Ryzen 9 7950X 16-Core testing with a ASUS ROG STRIX X670E-E GAMING WIFI (1905 BIOS) and NVIDIA GeForce RTX 3080 10GB on Ubuntu 23.10 via the Phoronix Test Suite.

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

44 Results Shown

Neural Magic DeepSparse:
CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream:
ms/batch
items/sec
Llama2 Chat 7b Quantized - Asynchronous Multi-Stream:
ms/batch
items/sec
Llama2 Chat 7b Quantized - Synchronous Single-Stream:
ms/batch
items/sec
NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream:
ms/batch
items/sec
NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream:
ms/batch
items/sec
BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream:
ms/batch
items/sec
BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream:
ms/batch
items/sec
NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream:
ms/batch
items/sec
NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream:
ms/batch
items/sec
NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream:
ms/batch
items/sec
NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream:
ms/batch
items/sec
CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream:
ms/batch
items/sec
CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream:
ms/batch
items/sec
NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream:
ms/batch
items/sec
NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream:
ms/batch
items/sec
ResNet-50, Baseline - Synchronous Single-Stream:
ms/batch
items/sec
ResNet-50, Sparse INT8 - Synchronous Single-Stream:
ms/batch
items/sec
ResNet-50, Sparse INT8 - Asynchronous Multi-Stream:
ms/batch
items/sec
CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream:
ms/batch
items/sec
CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream:
ms/batch
items/sec
ResNet-50, Baseline - Asynchronous Multi-Stream:
ms/batch
items/sec
CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream:
ms/batch
items/sec

a

Kernel Notes: Transparent Huge Pages: madvise
Processor Notes: Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa601206
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

Testing initiated at 15 March 2024 17:18 by user pts.

b

Testing initiated at 15 March 2024 18:09 by user pts.

c

Testing initiated at 15 March 2024 18:28 by user pts.

d

Testing initiated at 15 March 2024 18:47 by user pts.

e

Processor: AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads), Motherboard: ASUS ROG STRIX X670E-E GAMING WIFI (1905 BIOS), Chipset: AMD Device 14d8, Memory: 2 x 16GB DRAM-6000MT/s G Skill F5-6000J3038F16G, Disk: 2000GB Samsung SSD 980 PRO 2TB + Western Digital WD_BLACK SN850X 2000GB, Graphics: NVIDIA GeForce RTX 3080 10GB, Audio: NVIDIA GA102 HD Audio, Monitor: DELL U2723QE, Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411

OS: Ubuntu 23.10, Kernel: 6.7.0-060700-generic (x86_64), Desktop: GNOME Shell 45.2, Display Server: X Server 1.21.1.7, Display Driver: NVIDIA 550.54.14, OpenGL: 4.6.0, OpenCL: OpenCL 3.0 CUDA 12.4.89, Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 3840x2160

Testing initiated at 15 March 2024 19:06 by user pts.

deepspaarse 17

View

Statistics

Graph Settings

Multi-Way Comparison

Table

Run Management

a

b

c

d

e

Neural Magic DeepSparse

44 Results Shown

a

b

c

d

e