epyc 9654 AMD March

Tests for a future article. 2 x AMD EPYC 9654 96-Core testing with a AMD Titanite_4G (RTI1004D BIOS) and ASPEED on Ubuntu 23.04 via the Phoronix Test Suite.

a

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-l0Aoyl/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-l0Aoyl/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa101111
Python Notes: Python 3.10.9
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

b

c

Processor: AMD EPYC 9654 96-Core @ 3.71GHz (96 Cores / 192 Threads), Motherboard: AMD Titanite_4G (RTI1004D BIOS), Chipset: AMD Device 14a4, Memory: 768GB, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Monitor: VGA HDMI, Network: Broadcom NetXtreme BCM5720 PCIe

OS: Ubuntu 23.04, Kernel: 5.19.0-21-generic (x86_64), Desktop: GNOME Shell 43.1, Display Server: X Server 1.21.1.4, Vulkan: 1.3.224, Compiler: GCC 12.2.0, File-System: ext4, Screen Resolution: 1920x1080

d

e

Changed Processor to 2 x AMD EPYC 9654 96-Core @ 3.71GHz (192 Cores / 384 Threads).

Changed Memory to 1520GB.

RocksDB

This is a benchmark of Meta/Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

OpenCV

This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.

MariaDB

This is a MariaDB MySQL database server benchmark making use of mysqlslap. Learn more via the OpenBenchmarking.org test page.

FFmpeg

This is a benchmark of the FFmpeg multimedia framework. The FFmpeg test profile is making use of a modified version of vbench from Columbia University's Architecture and Design Lab (ARCADE) [http://arcade.cs.columbia.edu/vbench/] that is a benchmark for video-as-a-service workloads. The test profile offers the options of a range of vbench scenarios based on freely distributable video content and offers the options of using the x264 or x265 video encoders for transcoding. Learn more via the OpenBenchmarking.org test page.

OpenCV

This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.

Timed LLVM Compilation

This test times how long it takes to compile/build the LLVM compiler stack. Learn more via the OpenBenchmarking.org test page.

TensorFlow

OpenCV

This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.

MariaDB

This is a MariaDB MySQL database server benchmark making use of mysqlslap. Learn more via the OpenBenchmarking.org test page.

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

FFmpeg

ClickHouse

ClickHouse is an open-source, high performance OLAP data management system. This test profile uses ClickHouse's standard benchmark recommendations per https://clickhouse.com/docs/en/operations/performance-test/ / https://github.com/ClickHouse/ClickBench/tree/main/clickhouse with the 100 million rows web analytics dataset. The reported value is the query processing time using the geometric mean of all separate queries performed as an aggregate. Learn more via the OpenBenchmarking.org test page.

MariaDB

This is a MariaDB MySQL database server benchmark making use of mysqlslap. Learn more via the OpenBenchmarking.org test page.

FFmpeg

PostgreSQL

This is a benchmark of PostgreSQL using the integrated pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.

OpenCV

This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.

Timed Node.js Compilation

This test profile times how long it takes to build/compile Node.js itself from source. Node.js is a JavaScript run-time built from the Chrome V8 JavaScript engine while itself is written in C/C++. Learn more via the OpenBenchmarking.org test page.

Timed LLVM Compilation

This test times how long it takes to compile/build the LLVM compiler stack. Learn more via the OpenBenchmarking.org test page.

TensorFlow

Timed Godot Game Engine Compilation

This test times how long it takes to compile the Godot Game Engine. Godot is a popular, open-source, cross-platform 2D/3D game engine and is built using the SCons build system and targeting the X11 platform. Learn more via the OpenBenchmarking.org test page.

ONNX Runtime

ONNX Runtime is developed by Microsoft and partners as a open-source, cross-platform, high performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Model Zoo. Learn more via the OpenBenchmarking.org test page.

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

nginx

This is a benchmark of the lightweight Nginx HTTP(S) web-server. This Nginx web server benchmark test profile makes use of the wrk program for facilitating the HTTP requests over a fixed period time with a configurable number of concurrent clients/connections. HTTPS with a self-signed OpenSSL certificate is used by this test for local benchmarking. Learn more via the OpenBenchmarking.org test page.

Apache HTTP Server

This is a test of the Apache HTTPD web server. This Apache HTTPD web server benchmark test profile makes use of the wrk program for facilitating the HTTP requests over a fixed period time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.

ONNX Runtime

Result

Inference Time Cost (ms)

OpenCV

This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.

ONNX Runtime

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

Result

Inference Time Cost (ms)

TensorFlow

Zstd Compression

This test measures the time needed to compress/decompress a sample file (silesia.tar) using Zstd (Zstandard) compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.

OpenCV

This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.

TensorFlow

Darmstadt Automotive Parallel Heterogeneous Suite

DAPHNE is the Darmstadt Automotive Parallel HeterogeNEous Benchmark Suite with OpenCL / CUDA / OpenMP test cases for these automotive benchmarks for evaluating programming models in context to vehicle autonomous driving capabilities. Learn more via the OpenBenchmarking.org test page.

Zstd Compression

Memcached

Memcached is a high performance, distributed memory object caching system. This Memcached test profiles makes use of memtier_benchmark for excuting this CPU/memory-focused server benchmark. Learn more via the OpenBenchmarking.org test page.

Zstd Compression

Build2

This test profile measures the time to bootstrap/install the build2 C++ build toolchain from source. Build2 is a cross-platform build toolchain for C/C++ code and features Cargo-like features. Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

TensorFlow

Neural Magic DeepSparse

RocksDB

This is a benchmark of Meta/Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.

John The Ripper

This is a benchmark of John The Ripper, which is a password cracker. Learn more via the OpenBenchmarking.org test page.

RocksDB

This is a benchmark of Meta/Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.

John The Ripper

This is a benchmark of John The Ripper, which is a password cracker. Learn more via the OpenBenchmarking.org test page.

OpenSSL

RocksDB

This is a benchmark of Meta/Facebook's RocksDB as an embeddable persistent key-value store for fast storage based on Google's LevelDB. Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse

FFmpeg

nginx

Connections: 200

d: The test quit with a non-zero exit status.

e: The test quit with a non-zero exit status.

Apache HTTP Server

Concurrent Requests: 200

d: The test quit with a non-zero exit status.

e: The test quit with a non-zero exit status.

Neural Magic DeepSparse

TensorFlow

Neural Magic DeepSparse

FFmpeg

Neural Magic DeepSparse

TensorFlow

Neural Magic DeepSparse

OpenCV

This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.

TensorFlow

OpenCV

This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.

John The Ripper

This is a benchmark of John The Ripper, which is a password cracker. Learn more via the OpenBenchmarking.org test page.

GROMACS

The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package testing with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

TensorFlow

Darmstadt Automotive Parallel Heterogeneous Suite

SPECFEM3D

simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and using a variety of their built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.

TensorFlow

SPECFEM3D

Embree

Darmstadt Automotive Parallel Heterogeneous Suite

Embree

Timed FFmpeg Compilation

This test times how long it takes to build the FFmpeg multimedia library. Learn more via the OpenBenchmarking.org test page.

TensorFlow

SPECFEM3D

TensorFlow

SPECFEM3D

dav1d

Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.

Video Input: Chimera 1080p 10-bit

d: The test quit with a non-zero exit status.

e: The test quit with a non-zero exit status.

TensorFlow

Google Draco

Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression. Learn more via the OpenBenchmarking.org test page.

dav1d

Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.

Video Input: Chimera 1080p

d: The test quit with a non-zero exit status.

e: The test quit with a non-zero exit status.

Google Draco

dav1d

Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.

Video Input: Summer Nature 4K

d: The test quit with a non-zero exit status.

e: The test quit with a non-zero exit status.

Embree

dav1d

Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.

Video Input: Summer Nature 1080p

d: The test quit with a non-zero exit status.

e: The test quit with a non-zero exit status.

nginx

Connections: 1000

a: The test quit with a non-zero exit status.

c: The test quit with a non-zero exit status.

b: The test quit with a non-zero exit status.

d: The test quit with a non-zero exit status.

e: The test quit with a non-zero exit status.

Apache HTTP Server

Concurrent Requests: 1000

a: The test quit with a non-zero exit status.

c: The test quit with a non-zero exit status.

b: The test quit with a non-zero exit status.

d: The test quit with a non-zero exit status.

e: The test quit with a non-zero exit status.

Concurrent Requests: 100

a: The test quit with a non-zero exit status.

c: The test quit with a non-zero exit status.

b: The test quit with a non-zero exit status.

d: The test quit with a non-zero exit status.

e: The test quit with a non-zero exit status.

nginx

Connections: 100

a: The test quit with a non-zero exit status.

c: The test quit with a non-zero exit status.

b: The test quit with a non-zero exit status.

d: The test quit with a non-zero exit status.

e: The test quit with a non-zero exit status.

199 Results Shown

RocksDB
TensorFlow
OpenCV
MariaDB
FFmpeg:
libx264 - Upload:
FPS
Seconds
OpenCV
Timed LLVM Compilation
TensorFlow
OpenCV
MariaDB
OpenSSL:
SHA512
SHA256
AES-256-GCM
AES-128-GCM
ChaCha20-Poly1305
ChaCha20
FFmpeg:
libx264 - Video On Demand:
FPS
Seconds
libx264 - Platform:
FPS
Seconds
ClickHouse:
100M Rows Hits Dataset, Third Run
100M Rows Hits Dataset, Second Run
100M Rows Hits Dataset, First Run / Cold Cache
MariaDB:
2048
1024
512
FFmpeg:
libx265 - Upload:
FPS
Seconds
libx265 - Platform:
FPS
Seconds
libx265 - Video On Demand:
FPS
Seconds
PostgreSQL:
100 - 800 - Read Only - Average Latency
100 - 800 - Read Only
100 - 1000 - Read Write - Average Latency
100 - 1000 - Read Write
100 - 800 - Read Write - Average Latency
100 - 800 - Read Write
100 - 1000 - Read Only - Average Latency
100 - 1000 - Read Only
1 - 1000 - Read Write - Average Latency
1 - 1000 - Read Write
1 - 800 - Read Write - Average Latency
1 - 800 - Read Write
1 - 1000 - Read Only - Average Latency
1 - 1000 - Read Only
1 - 800 - Read Only - Average Latency
1 - 800 - Read Only
OpenCV
Timed Node.js Compilation
Timed LLVM Compilation
TensorFlow
Timed Godot Game Engine Compilation
ONNX Runtime:
fcn-resnet101-11 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
GPT-2 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
GPT-2 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
fcn-resnet101-11 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
bertsquad-12 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
bertsquad-12 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
yolov4 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
yolov4 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
ArcFace ResNet-100 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
ArcFace ResNet-100 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
Faster R-CNN R-50-FPN-int8 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
Faster R-CNN R-50-FPN-int8 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
nginx
Apache HTTP Server
ONNX Runtime:
CaffeNet 12-int8 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
OpenCV
ONNX Runtime:
CaffeNet 12-int8 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
ResNet50 v1-12-int8 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
ResNet50 v1-12-int8 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
super-resolution-10 - CPU - Parallel:
Inference Time Cost (ms)
Inferences Per Second
super-resolution-10 - CPU - Standard:
Inference Time Cost (ms)
Inferences Per Second
TensorFlow
Zstd Compression:
19, Long Mode - Decompression Speed
19, Long Mode - Compression Speed
OpenCV
TensorFlow
Darmstadt Automotive Parallel Heterogeneous Suite
Zstd Compression:
19 - Decompression Speed
19 - Compression Speed
Memcached:
1:100
1:5
1:10
Zstd Compression:
12 - Decompression Speed
12 - Compression Speed
8, Long Mode - Decompression Speed
8, Long Mode - Compression Speed
8 - Decompression Speed
8 - Compression Speed
Build2
Neural Magic DeepSparse:
CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream:
ms/batch
items/sec
TensorFlow
Neural Magic DeepSparse:
NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream:
ms/batch
items/sec
RocksDB:
Rand Fill Sync
Update Rand
Rand Fill
John The Ripper
RocksDB:
Read Rand Write Rand
Read While Writing
John The Ripper
OpenSSL:
RSA4096:
verify/s
sign/s
RocksDB
Neural Magic DeepSparse:
NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream:
ms/batch
items/sec
NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream:
ms/batch
items/sec
FFmpeg:
libx265 - Live:
FPS
Seconds
nginx
Apache HTTP Server
Neural Magic DeepSparse:
NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream:
ms/batch
items/sec
NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream:
ms/batch
items/sec
TensorFlow
Neural Magic DeepSparse:
NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream:
ms/batch
items/sec
NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream:
ms/batch
items/sec
NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream:
ms/batch
items/sec
NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream:
ms/batch
items/sec
NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream:
ms/batch
items/sec
CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream:
ms/batch
items/sec
NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream:
ms/batch
items/sec
FFmpeg:
libx264 - Live:
FPS
Seconds
Neural Magic DeepSparse:
CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream:
ms/batch
items/sec
CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream:
ms/batch
items/sec
TensorFlow
Neural Magic DeepSparse:
NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream:
ms/batch
items/sec
CV Detection, YOLOv5s COCO - Synchronous Single-Stream:
ms/batch
items/sec
CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream:
ms/batch
items/sec
OpenCV
TensorFlow
OpenCV
John The Ripper:
WPA PSK
bcrypt
Blowfish
GROMACS
TensorFlow:
CPU - 32 - GoogLeNet
CPU - 256 - AlexNet
Darmstadt Automotive Parallel Heterogeneous Suite
SPECFEM3D
TensorFlow
SPECFEM3D
Embree
Darmstadt Automotive Parallel Heterogeneous Suite
Embree
Timed FFmpeg Compilation
TensorFlow
SPECFEM3D:
Homogeneous Halfspace
Tomographic Model
TensorFlow
SPECFEM3D
dav1d
TensorFlow
Google Draco
dav1d
Google Draco
dav1d
Embree:
Pathtracer - Crown
Pathtracer ISPC - Crown
Pathtracer - Asian Dragon
Pathtracer ISPC - Asian Dragon
dav1d

a

Testing initiated at 28 March 2023 14:53 by user phoronix.

b

Testing initiated at 28 March 2023 18:31 by user phoronix.

c

Testing initiated at 28 March 2023 21:17 by user phoronix.

d

Testing initiated at 29 March 2023 06:43 by user phoronix.

e

Processor: 2 x AMD EPYC 9654 96-Core @ 3.71GHz (192 Cores / 384 Threads), Motherboard: AMD Titanite_4G (RTI1004D BIOS), Chipset: AMD Device 14a4, Memory: 1520GB, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Monitor: VGA HDMI, Network: Broadcom NetXtreme BCM5720 PCIe

Testing initiated at 29 March 2023 10:47 by user phoronix.

epyc 9654 AMD March

View

Statistics

Graph Settings

Additional Graphs

Multi-Way Comparison

Table

Run Management

a

b

c

d

e

RocksDB

TensorFlow

OpenCV

MariaDB

FFmpeg

OpenCV

Timed LLVM Compilation

TensorFlow

OpenCV

MariaDB

OpenSSL

FFmpeg

ClickHouse

MariaDB

FFmpeg

PostgreSQL

OpenCV

Timed Node.js Compilation

Timed LLVM Compilation

TensorFlow

Timed Godot Game Engine Compilation

ONNX Runtime

nginx

Apache HTTP Server

ONNX Runtime

OpenCV

ONNX Runtime

TensorFlow

Zstd Compression

OpenCV

TensorFlow

Darmstadt Automotive Parallel Heterogeneous Suite

Zstd Compression

Memcached

Zstd Compression

Build2

Neural Magic DeepSparse

TensorFlow

Neural Magic DeepSparse

RocksDB

John The Ripper

RocksDB

John The Ripper

OpenSSL

RocksDB

Neural Magic DeepSparse

FFmpeg

nginx

Apache HTTP Server

Neural Magic DeepSparse

TensorFlow

Neural Magic DeepSparse

FFmpeg

Neural Magic DeepSparse

TensorFlow

Neural Magic DeepSparse

OpenCV

TensorFlow

OpenCV

John The Ripper

GROMACS

TensorFlow

Darmstadt Automotive Parallel Heterogeneous Suite

SPECFEM3D

TensorFlow

SPECFEM3D

Embree