okt

Tests for a future article. AMD Ryzen 9 3900XT 12-Core testing with a MSI MEG X570 GODLIKE (MS-7C34) v1.0 (1.B3 BIOS) and AMD Radeon RX 56/64 8GB on Ubuntu 22.04 via the Phoronix Test Suite.

a

Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Notes: NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Notes: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x8701021
Graphics Notes: BAR1 / Visible vRAM Size: 256 MB - vBIOS Version: 113-D0500100-102
Java Notes: OpenJDK Runtime Environment (build 11.0.20.1+1-post-Ubuntu-0ubuntu122.04)
Python Notes: Python 3.10.12
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
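
The security notes above pack one "name: status" pair per vulnerability, joined with " + ". As a minimal sketch, assuming that layout holds for the whole line, the following Python splits such a string into a dictionary and lists the entries that report a mitigation. The function name parse_security_notes is hypothetical, and the sample is a shortened excerpt of the line above.

```python
def parse_security_notes(notes):
    """Split a ' + '-separated notes string into {vulnerability: status}."""
    result = {}
    for entry in notes.split(" + "):
        # Only the first ': ' separates name from status; later ones
        # (e.g. 'IBPB: conditional') belong to the status text.
        name, _, status = entry.partition(": ")
        result[name.strip()] = status.strip()
    return result


# Shortened excerpt of the Security Notes line, for illustration only.
sample = (
    "l1tf: Not affected + "
    "retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + "
    "spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on"
)
parsed = parse_security_notes(sample)
mitigated = sorted(k for k, v in parsed.items() if v.startswith("Mitigation"))
print(mitigated)  # ['retbleed', 'spectre_v2']
```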

b

Processor: AMD Ryzen 9 3900XT 12-Core @ 3.80GHz (12 Cores / 24 Threads), Motherboard: MSI MEG X570 GODLIKE (MS-7C34) v1.0 (1.B3 BIOS), Chipset: AMD Starship/Matisse, Memory: 16GB, Disk: 500GB Seagate FireCuda 520 SSD ZP500GM30002, Graphics: AMD Radeon RX 56/64 8GB (1630/945MHz), Audio: AMD Vega 10 HDMI Audio, Monitor: ASUS MG28U, Network: Realtek Device 2600 + Realtek Killer E3000 2.5GbE + Intel Wi-Fi 6 AX200

OS: Ubuntu 22.04, Kernel: 6.2.0-35-generic (x86_64), Desktop: GNOME Shell 42.2, Display Server: X Server + Wayland, OpenGL: 4.6 Mesa 22.0.1 (LLVM 13.0.1 DRM 3.49), Vulkan: 1.3.204, Compiler: GCC 11.4.0, File-System: ext4, Screen Resolution: 3840x2160

SQLite

This is a simple benchmark of SQLite. At present this test profile just measures the time to perform a pre-defined number of insertions on an indexed database with a variable number of concurrent repetitions -- up to the maximum number of CPU threads available. Learn more via the OpenBenchmarking.org test page.
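
As a rough illustration of the kind of workload described above, the following Python sketch times a fixed number of insertions into an indexed SQLite table from several concurrent workers. It is not the test profile's actual code: the table layout, the row count, the function name timed_inserts, and the choice to give each worker its own database file are all assumptions made for the example.

```python
import os
import sqlite3
import tempfile
import threading
import time


def _insert_rows(db_path, rows):
    """Create an indexed table in db_path and insert `rows` rows into it."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE t (id INTEGER, payload TEXT)")
        conn.execute("CREATE INDEX t_id ON t (id)")
        with conn:  # commits once after the whole batch
            conn.executemany(
                "INSERT INTO t (id, payload) VALUES (?, ?)",
                ((i, f"row-{i}") for i in range(rows)),
            )
    finally:
        conn.close()


def timed_inserts(workers, rows=1000):
    """Run `workers` concurrent insert jobs; return (seconds, total rows)."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = [os.path.join(tmp, f"bench-{n}.db") for n in range(workers)]
        threads = [
            threading.Thread(target=_insert_rows, args=(p, rows)) for p in paths
        ]
        start = time.perf_counter()
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        elapsed = time.perf_counter() - start

        # Count the rows afterwards to confirm every worker finished.
        total = 0
        for p in paths:
            conn = sqlite3.connect(p)
            total += conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
            conn.close()
    return elapsed, total


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        seconds, count = timed_inserts(n)
        print(f"{n} worker(s): {count} rows in {seconds:.3f} s")
```

The worker counts 1, 2, 4 and 8 in the usage loop are taken from the SQLite entries in the results index.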

3DMark Wild Life Extreme

This test profile only automates the vendor build of 3DMark through its command-line / JSON support. You must already have obtained the properly licensed 3DMark binaries from UL and placed them in the Phoronix Test Suite download cache on your system; without them, this test profile will simply fail. The profile launches the 3DMark benchmark at your desired resolution and captures the results within the Phoronix Test Suite. Learn more via the OpenBenchmarking.org test page.

QuantLib

QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost, and the built-in benchmark used here reports the QuantLib Benchmark Index score. Learn more via the OpenBenchmarking.org test page.

Crypto++

Crypto++ is a C++ class library of cryptographic algorithms. Learn more via the OpenBenchmarking.org test page.

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient, a newer scientific benchmark from Sandia National Labs focused on supercomputer testing with modern real-world workloads compared to HPCC. Learn more via the OpenBenchmarking.org test page.

CloverLeaf

CloverLeaf is a Lagrangian-Eulerian hydrodynamics benchmark. This test profile currently makes use of CloverLeaf's OpenMP version. Learn more via the OpenBenchmarking.org test page.

CP2K Molecular Dynamics

CP2K is an open-source molecular dynamics software package focused on quantum chemistry and solid-state physics. More details on the CP2K benchmark test cases can be found at https://www.cp2k.org/performance. Learn more via the OpenBenchmarking.org test page.

Input: H2O-DFT-LS

a: The test quit with a non-zero exit status. E: mpirun noticed that process rank 6 with PID 0 on node phoronix-MS-7C34 exited on signal 9 (Killed).

b: The test quit with a non-zero exit status. E: mpirun noticed that process rank 10 with PID 0 on node phoronix-MS-7C34 exited on signal 9 (Killed).
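
Both runs report a process exiting on signal 9, which is SIGKILL. The log does not say what sent the signal; the kernel's out-of-memory killer is one common source, but nothing here confirms that. As a small POSIX-only sketch of how such a termination looks from a parent process, the following Python starts a placeholder child, kills it with SIGKILL, and decodes the resulting return code.

```python
import signal
import subprocess
import sys

# Start a placeholder child that would otherwise sleep for a minute,
# then terminate it with SIGKILL (signal 9).
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
child.send_signal(signal.SIGKILL)
child.wait()

# On POSIX, a negative return code -N means "terminated by signal N".
sig = signal.Signals(-child.returncode)
print(sig.name, int(sig))  # SIGKILL 9
```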

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

Palabos

The Palabos library is a framework for general purpose Computational Fluid Dynamics (CFD). Palabos uses a kernel based on the Lattice Boltzmann method. This test profile uses the Palabos MPI-based Cavity3D benchmark. Learn more via the OpenBenchmarking.org test page.

QMCPACK

QMCPACK is a modern high-performance open-source Quantum Monte Carlo (QMC) simulation code, making use of MPI for this benchmark of the H2O example code. QMCPACK is a production-level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids. QMCPACK is supported by the U.S. Department of Energy. Learn more via the OpenBenchmarking.org test page.

OpenRadioss

OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis. OpenRadioss is based on Altair Radioss and was open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/ and https://github.com/OpenRadioss/ModelExchange/tree/main/Examples. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.

Z3 Theorem Prover

The Z3 Theorem Prover / SMT solver is developed by Microsoft Research under the MIT license. Learn more via the OpenBenchmarking.org test page.

nekRS

nekRS is an open-source Navier-Stokes solver based on the spectral element method. nekRS supports both CPU and GPU/accelerator execution, though this test profile is currently configured for CPU execution. nekRS is part of Nek5000 from the Mathematics and Computer Science (MCS) division at Argonne National Laboratory. This nekRS benchmark is primarily relevant to large core count HPC servers and otherwise may be very time consuming on smaller systems. Learn more via the OpenBenchmarking.org test page.

srsRAN Project

srsRAN Project is a complete ORAN-native 5G RAN solution created by Software Radio Systems (SRS). The srsRAN Project radio suite was formerly known as srsLTE and can be used for building your own software-defined radio (SDR) 4G/5G mobile network. Learn more via the OpenBenchmarking.org test page.

easyWave

The easyWave software allows simulating tsunami generation and propagation in the context of early warning systems. EasyWave supports making use of OpenMP for CPU multi-threading and there are also GPU ports available but not currently incorporated as part of this test profile. The easyWave tsunami generation software is run with one of the example/reference input files for measuring the CPU execution time. Learn more via the OpenBenchmarking.org test page.

dav1d

Dav1d is an open-source, speedy AV1 video decoder supporting modern SIMD CPU features. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.

Embree

SVT-AV1

VVenC

VVenC is the Fraunhofer Versatile Video Encoder as a fast/efficient H.266/VVC encoder. The vvenc encoder makes use of SIMD Everywhere (SIMDe). The vvenc software is published under the Clear BSD License. Learn more via the OpenBenchmarking.org test page.

Intel Open Image Denoise

OpenVKL

OpenVKL is the Intel Open Volume Kernel Library that offers high-performance volume computation kernels and is part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay

Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

libavif avifenc

This is a test of the AOMedia libavif library testing the encoding of a JPEG image to AV1 Image Format (AVIF). Learn more via the OpenBenchmarking.org test page.

Timed GCC Compilation

This test times how long it takes to build the GNU Compiler Collection (GCC) open-source compiler. Learn more via the OpenBenchmarking.org test page.

Timed Gem5 Compilation

This test times how long it takes to compile Gem5. Gem5 is a simulator for computer system architecture research. Gem5 is widely used for computer architecture research within the industry, academia, and more. Learn more via the OpenBenchmarking.org test page.

Timed Godot Game Engine Compilation

This test times how long it takes to compile the Godot Game Engine. Godot is a popular, open-source, cross-platform 2D/3D game engine and is built using the SCons build system and targeting the X11 platform. Learn more via the OpenBenchmarking.org test page.

Timed LLVM Compilation

This test times how long it takes to compile/build the LLVM compiler stack. Learn more via the OpenBenchmarking.org test page.

Timed Node.js Compilation

This test profile times how long it takes to build/compile Node.js itself from source. Node.js is a JavaScript runtime built on the Chrome V8 JavaScript engine and is itself written in C/C++. Learn more via the OpenBenchmarking.org test page.

Time To Compile

a: The test quit with a non-zero exit status. E: g++: fatal error: Killed signal terminated program cc1plus

Build2

This test profile measures the time to bootstrap/install the build2 C++ build toolchain from source. Build2 is a cross-platform build toolchain for C/C++ code and offers Cargo-like features. Learn more via the OpenBenchmarking.org test page.

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

a: The test run did not produce a result.

b: The test run did not produce a result.

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

a: The test run did not produce a result.

b: The test run did not produce a result.

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

a: The test run did not produce a result.

b: The test run did not produce a result.

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

a: The test run did not produce a result.

b: The test run did not produce a result.

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

a: The test run did not produce a result.

b: The test run did not produce a result.

OSPRay Studio

Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

Opus Codec Encoding

Opus is an open audio codec. Opus is a lossy audio compression format designed primarily for interactive real-time applications over the Internet. This test uses Opus-Tools and measures the time required to encode a WAV file to Opus five times. Learn more via the OpenBenchmarking.org test page.
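
The measurement described above, timing five back-to-back encodes, can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper name time_repeated is hypothetical, and a no-op Python command stands in for the encoder so the sketch runs without Opus-Tools installed.

```python
import subprocess
import sys
import time


def time_repeated(cmd, runs=5):
    """Run `cmd` `runs` times in sequence and return total wall-clock seconds."""
    start = time.perf_counter()
    for _ in range(runs):
        subprocess.run(
            cmd,
            check=True,  # raise if any run exits with a non-zero status
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    return time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder command. With Opus-Tools installed, one might instead pass
    # ["opusenc", "input.wav", "output.opus"] (file names are hypothetical).
    total = time_repeated([sys.executable, "-c", "pass"], runs=5)
    print(f"5 runs took {total:.3f} s")
```

Because check=True is set, a failed run raises an exception instead of contributing a misleadingly short time to the total.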

Cpuminer-Opt

Cpuminer-Opt is a fork of cpuminer-multi that carries a wide range of CPU performance optimizations for measuring the potential cryptocurrency mining performance of the CPU/processor with a wide variety of cryptocurrencies. The benchmark reports the hash speed for the CPU mining performance for the selected cryptocurrency. Learn more via the OpenBenchmarking.org test page.

Liquid-DSP

LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.

Apache IoTDB

Apache IoTDB is a time series database and this benchmark is facilitated using the IoT Benchmark [https://github.com/thulab/iot-benchmark/]. Learn more via the OpenBenchmarking.org test page.

Device Count: 800 - Batch Size Per Write: 100 - Sensor Count: 200 - Client Number: 400

a: The test quit with a non-zero exit status.

b: The test quit with a non-zero exit status.

Memcached

Memcached is a high performance, distributed memory object caching system. This Memcached test profile makes use of memtier_benchmark for executing this CPU/memory-focused server benchmark. Learn more via the OpenBenchmarking.org test page.

DuckDB

DuckDB is an in-process SQL OLAP database management system optimized for analytics and features a vectorized and parallel engine. Learn more via the OpenBenchmarking.org test page.

Benchmark: IMDB

a: The test run did not produce a result.

b: The test run did not produce a result.

Benchmark: TPC-H Parquet

a: The test run did not produce a result.

b: The test run did not produce a result.

PostgreSQL

This is a benchmark of PostgreSQL using the integrated pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse

Stress-NG

Stress-NG is a Linux stress tool developed by Colin Ian King. Learn more via the OpenBenchmarking.org test page.

GPAW

GPAW is a density-functional theory (DFT) Python code based on the projector-augmented wave (PAW) method and the atomic simulation environment (ASE). Learn more via the OpenBenchmarking.org test page.

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

Blender

OpenVINO

This is a test of Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.

Apache Cassandra

This is a benchmark of the Apache Cassandra NoSQL database management system making use of cassandra-stress. Learn more via the OpenBenchmarking.org test page.

Apache Hadoop

This is a benchmark of Apache Hadoop making use of its built-in name-node throughput benchmark (NNThroughputBenchmark). Learn more via the OpenBenchmarking.org test page.

nginx

This is a benchmark of the lightweight Nginx HTTP(S) web server. This Nginx web server benchmark test profile makes use of the wrk program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients/connections. HTTPS with a self-signed OpenSSL certificate is used by this test for local benchmarking. Learn more via the OpenBenchmarking.org test page.
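
To make the "fixed period, configurable number of concurrent connections" setup concrete, here is a minimal Python sketch that builds a wrk command line. It uses wrk's -t (threads), -c (connections) and -d (duration) options. The URL, thread count and duration are placeholders, not values taken from the test profile, and the helper name wrk_command is hypothetical.

```python
def wrk_command(url, connections, duration_s=60, threads=4):
    """Build a wrk argument list: -t threads, -c open connections, -d duration."""
    if connections < threads:
        # wrk itself rejects fewer connections than threads.
        raise ValueError("connections must be >= threads")
    return [
        "wrk",
        "-t", str(threads),
        "-c", str(connections),
        "-d", f"{duration_s}s",
        url,
    ]


if __name__ == "__main__":
    # Connection counts as listed for nginx in the results index;
    # the URL is a placeholder for a locally served HTTPS endpoint.
    for clients in (100, 200, 500, 1000):
        print(" ".join(wrk_command("https://localhost:8443/", clients)))
```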

Apache HTTP Server

This is a test of the Apache HTTPD web server. This Apache HTTPD web server benchmark test profile makes use of the wrk program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.

Concurrent Requests: 100

a: The test quit with a non-zero exit status. E: ./apache: 2: ./wrk-4.2.0/wrk: not found

b: The test quit with a non-zero exit status. E: ./apache: 2: ./wrk-4.2.0/wrk: not found

Concurrent Requests: 200

a: The test quit with a non-zero exit status. E: ./apache: 2: ./wrk-4.2.0/wrk: not found

b: The test quit with a non-zero exit status. E: ./apache: 2: ./wrk-4.2.0/wrk: not found

Concurrent Requests: 500

a: The test quit with a non-zero exit status. E: ./apache: 2: ./wrk-4.2.0/wrk: not found

b: The test quit with a non-zero exit status. E: ./apache: 2: ./wrk-4.2.0/wrk: not found

Concurrent Requests: 1000

a: The test quit with a non-zero exit status. E: ./apache: 2: ./wrk-4.2.0/wrk: not found

b: The test quit with a non-zero exit status. E: ./apache: 2: ./wrk-4.2.0/wrk: not found

Whisper.cpp

Whisper.cpp is a port of OpenAI's Whisper model in C/C++. Whisper.cpp is developed by Georgi Gerganov for transcribing WAV audio files to text / speech recognition. Whisper.cpp supports ARM NEON, x86 AVX, and other advanced CPU features. Learn more via the OpenBenchmarking.org test page.

BRL-CAD

BRL-CAD is a cross-platform, open-source solid modeling system with a built-in benchmark mode. Learn more via the OpenBenchmarking.org test page.

345 Results Shown

SQLite:
1
2
4
8
3DMark Wild Life Extreme
QuantLib:
Multi-Threaded
Single-Threaded
Crypto++
High Performance Conjugate Gradient
CloverLeaf:
clover_bm
clover_bm64_short
CP2K Molecular Dynamics:
H20-64
Fayalite-FIST
libxsmm:
128
32
64
Palabos
QMCPACK:
H4_ae
Li2_STO_ae
LiH_ae_MSD
simple-H2O
O_ae_pyscf_UHF
FeCO6_b3lyp_gms
OpenRadioss:
Bumper Beam
Chrysler Neon 1M
Cell Phone Drop Test
Bird Strike on Windshield
Rubber O-Ring Seal Installation
INIVOL and Fluid Structure Interaction Drop Container
Z3 Theorem Prover:
1.smt2
2.smt2
nekRS:
Kershaw
TurboPipe Periodic
srsRAN Project:
Downlink Processor Benchmark
PUSCH Processor Benchmark, Throughput Total
PUSCH Processor Benchmark, Throughput Thread
easyWave:
e2Asean Grid + BengkuluSept2007 Source - 240
e2Asean Grid + BengkuluSept2007 Source - 1200
dav1d:
Chimera 1080p
Summer Nature 4K
Summer Nature 1080p
Chimera 1080p 10-bit
Embree:
Pathtracer - Crown
Pathtracer ISPC - Crown
Pathtracer - Asian Dragon
Pathtracer - Asian Dragon Obj
Pathtracer ISPC - Asian Dragon
Pathtracer ISPC - Asian Dragon Obj
SVT-AV1:
Preset 4 - Bosphorus 4K
Preset 8 - Bosphorus 4K
Preset 12 - Bosphorus 4K
Preset 13 - Bosphorus 4K
Preset 4 - Bosphorus 1080p
Preset 8 - Bosphorus 1080p
Preset 12 - Bosphorus 1080p
Preset 13 - Bosphorus 1080p
VVenC:
Bosphorus 4K - Fast
Bosphorus 4K - Faster
Bosphorus 1080p - Fast
Bosphorus 1080p - Faster
Intel Open Image Denoise:
RT.hdr_alb_nrm.3840x2160 - CPU-Only
RT.ldr_alb_nrm.3840x2160 - CPU-Only
RTLightmap.hdr.4096x4096 - CPU-Only
OpenVKL:
vklBenchmarkCPU ISPC
vklBenchmarkCPU Scalar
OSPRay:
particle_volume/ao/real_time
particle_volume/scivis/real_time
particle_volume/pathtracer/real_time
gravity_spheres_volume/dim_512/ao/real_time
gravity_spheres_volume/dim_512/scivis/real_time
gravity_spheres_volume/dim_512/pathtracer/real_time
libavif avifenc:
0
2
6
6, Lossless
10, Lossless
Timed GCC Compilation
Timed Gem5 Compilation
Timed Godot Game Engine Compilation
Timed LLVM Compilation:
Ninja
Unix Makefiles
Timed Node.js Compilation
Build2
oneDNN:
IP Shapes 1D - f32 - CPU
IP Shapes 3D - f32 - CPU
IP Shapes 1D - u8s8f32 - CPU
IP Shapes 3D - u8s8f32 - CPU
Convolution Batch Shapes Auto - f32 - CPU
Deconvolution Batch shapes_1d - f32 - CPU
Deconvolution Batch shapes_3d - f32 - CPU
Convolution Batch Shapes Auto - u8s8f32 - CPU
Deconvolution Batch shapes_1d - u8s8f32 - CPU
Deconvolution Batch shapes_3d - u8s8f32 - CPU
Recurrent Neural Network Training - f32 - CPU
Recurrent Neural Network Inference - f32 - CPU
Recurrent Neural Network Training - u8s8f32 - CPU
Recurrent Neural Network Inference - u8s8f32 - CPU
Recurrent Neural Network Training - bf16bf16bf16 - CPU
Recurrent Neural Network Inference - bf16bf16bf16 - CPU
OSPRay Studio:
1 - 4K - 1 - Path Tracer - CPU
2 - 4K - 1 - Path Tracer - CPU
3 - 4K - 1 - Path Tracer - CPU
1 - 4K - 16 - Path Tracer - CPU
1 - 4K - 32 - Path Tracer - CPU
2 - 4K - 16 - Path Tracer - CPU
2 - 4K - 32 - Path Tracer - CPU
3 - 4K - 16 - Path Tracer - CPU
3 - 4K - 32 - Path Tracer - CPU
1 - 1080p - 1 - Path Tracer - CPU
2 - 1080p - 1 - Path Tracer - CPU
3 - 1080p - 1 - Path Tracer - CPU
1 - 1080p - 16 - Path Tracer - CPU
1 - 1080p - 32 - Path Tracer - CPU
2 - 1080p - 16 - Path Tracer - CPU
2 - 1080p - 32 - Path Tracer - CPU
3 - 1080p - 16 - Path Tracer - CPU
3 - 1080p - 32 - Path Tracer - CPU
Opus Codec Encoding
Cpuminer-Opt:
Magi
scrypt
Deepcoin
Ringcoin
Blake-2 S
Garlicoin
Skeincoin
Myriad-Groestl
LBC, LBRY Credits
Quad SHA-256, Pyrite
Triple SHA-256, Onecoin
Liquid-DSP:
1 - 256 - 32
1 - 256 - 57
2 - 256 - 32
2 - 256 - 57
4 - 256 - 32
4 - 256 - 57
8 - 256 - 32
8 - 256 - 57
1 - 256 - 512
16 - 256 - 32
16 - 256 - 57
2 - 256 - 512
24 - 256 - 32
24 - 256 - 57
4 - 256 - 512
8 - 256 - 512
16 - 256 - 512
24 - 256 - 512
Apache IoTDB:
800 - 1 - 200 - 100:
point/sec
Average Latency
800 - 1 - 200 - 400:
point/sec
Average Latency
800 - 1 - 500 - 100:
point/sec
Average Latency
800 - 1 - 500 - 400:
point/sec
Average Latency
800 - 1 - 800 - 100:
point/sec
Average Latency
800 - 1 - 800 - 400:
point/sec
Average Latency
800 - 100 - 200 - 100:
point/sec
Average Latency
800 - 100 - 500 - 100:
point/sec
Average Latency
800 - 100 - 500 - 400:
point/sec
Average Latency
800 - 100 - 800 - 100:
point/sec
Average Latency
800 - 100 - 800 - 400:
point/sec
Average Latency
Memcached:
1:10
1:100
PostgreSQL:
100 - 1000 - Read Only
100 - 1000 - Read Only - Average Latency
100 - 1000 - Read Write
100 - 1000 - Read Write - Average Latency
TensorFlow:
CPU - 16 - ResNet-50
CPU - 32 - ResNet-50
Neural Magic DeepSparse:
NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream:
items/sec
ms/batch
NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream:
items/sec
ms/batch
NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream:
items/sec
ms/batch
NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream:
items/sec
ms/batch
NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream:
items/sec
ms/batch
NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream:
items/sec
ms/batch
NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream:
items/sec
ms/batch
NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream:
items/sec
ms/batch
ResNet-50, Baseline - Asynchronous Multi-Stream:
items/sec
ms/batch
ResNet-50, Baseline - Synchronous Single-Stream:
items/sec
ms/batch
ResNet-50, Sparse INT8 - Asynchronous Multi-Stream:
items/sec
ms/batch
ResNet-50, Sparse INT8 - Synchronous Single-Stream:
items/sec
ms/batch
CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream:
items/sec
ms/batch
CV Detection, YOLOv5s COCO - Synchronous Single-Stream:
items/sec
ms/batch
BERT-Large, NLP Question Answering - Asynchronous Multi-Stream:
items/sec
ms/batch
BERT-Large, NLP Question Answering - Synchronous Single-Stream:
items/sec
ms/batch
CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream:
items/sec
ms/batch
CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream:
items/sec
ms/batch
CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream:
items/sec
ms/batch
CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream:
items/sec
ms/batch
NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream:
items/sec
ms/batch
NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream:
items/sec
ms/batch
CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream:
items/sec
ms/batch
CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream:
items/sec
ms/batch
BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream:
items/sec
ms/batch
BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream:
items/sec
ms/batch
NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream:
items/sec
ms/batch
NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream:
items/sec
ms/batch
NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream:
items/sec
ms/batch
NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream:
items/sec
ms/batch
Stress-NG:
Hash
MMAP
NUMA
Pipe
Poll
Zlib
Futex
MEMFD
Mutex
Atomic
Crypto
Malloc
Cloning
Forking
Pthread
AVL Tree
IO_uring
SENDFILE
CPU Cache
CPU Stress
Semaphores
Matrix Math
Vector Math
AVX-512 VNNI
Function Call
x86_64 RdRand
Floating Point
Matrix 3D Math
Memory Copying
Vector Shuffle
Mixed Scheduler
Socket Activity
Wide Vector Math
Context Switching
Fused Multiply-Add
Vector Floating Point
Glibc C String Functions
Glibc Qsort Data Sorting
System V Message Passing
GPAW
NCNN:
CPU - mobilenet
CPU-v2-v2 - mobilenet-v2
CPU-v3-v3 - mobilenet-v3
CPU - shufflenet-v2
CPU - mnasnet
CPU - efficientnet-b0
CPU - blazeface
CPU - googlenet
CPU - vgg16
CPU - resnet18
CPU - alexnet
CPU - resnet50
CPU - yolov4-tiny
CPU - squeezenet_ssd
CPU - regnety_400m
CPU - vision_transformer
CPU - FastestDet
Blender:
BMW27 - CPU-Only
Classroom - CPU-Only
Fishy Cat - CPU-Only
Barbershop - CPU-Only
Pabellon Barcelona - CPU-Only
OpenVINO:
Face Detection FP16 - CPU:
FPS
ms
Person Detection FP16 - CPU:
FPS
ms
Person Detection FP32 - CPU:
FPS
ms
Vehicle Detection FP16 - CPU:
FPS
ms
Face Detection FP16-INT8 - CPU:
FPS
ms
Face Detection Retail FP16 - CPU:
FPS
ms
Road Segmentation ADAS FP16 - CPU:
FPS
ms
Vehicle Detection FP16-INT8 - CPU:
FPS
ms
Weld Porosity Detection FP16 - CPU:
FPS
ms
Face Detection Retail FP16-INT8 - CPU:
FPS
ms
Road Segmentation ADAS FP16-INT8 - CPU:
FPS
ms
Machine Translation EN To DE FP16 - CPU:
FPS
ms
Weld Porosity Detection FP16-INT8 - CPU:
FPS
ms
Person Vehicle Bike Detection FP16 - CPU:
FPS
ms
Handwritten English Recognition FP16 - CPU:
FPS
ms
Age Gender Recognition Retail 0013 FP16 - CPU:
FPS
ms
Handwritten English Recognition FP16-INT8 - CPU:
FPS
ms
Age Gender Recognition Retail 0013 FP16-INT8 - CPU:
FPS
ms
Apache Cassandra
Apache Hadoop
nginx:
100
200
500
1000
Whisper.cpp:
ggml-base.en - 2016 State of the Union
ggml-small.en - 2016 State of the Union
ggml-medium.en - 2016 State of the Union
BRL-CAD

a

Testing initiated at 30 October 2023 06:49 by user phoronix.

b

Testing initiated at 30 October 2023 17:16 by user phoronix.
