Google Cloud c3 Sapphire Rapids

Benchmarks by Michael Larabel for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2303218-PTS-2303217N47

Result Identifier: c3-highcpu-8 SPR
Test Date: March 21 2023
Test Run Duration: 13 Hours, 29 Minutes


Google Cloud C3 Sapphire Rapids Performance - System Details

Processor: Intel Xeon Platinum 8481C (4 Cores / 8 Threads)
Motherboard: Google Compute Engine c3-highcpu-8
Chipset: Intel 440FX 82441FX PMC
Memory: 16GB
Disk: 322GB nvme_card-pd
Network: Google Compute Engine Virtual
OS: Ubuntu 22.10
Kernel: 5.19.0-1015-gcp (x86_64)
Vulkan: 1.3.224
Compiler: GCC 12.2.0
File-System: ext4
System Layer: KVM

System Logs:
- Transparent Huge Pages: madvise
- Compiler configured with: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- CPU Microcode: 0xffffffff
- Python 3.10.7
- Security: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected


nekRS

nekRS is an open-source Navier-Stokes solver based on the spectral element method. NekRS supports both CPU and GPU/accelerator execution, though this test profile is currently configured for CPU execution. NekRS is part of Nek5000 from the Mathematics and Computer Science (MCS) division at Argonne National Laboratory. This nekRS benchmark is primarily relevant to large core count HPC servers and otherwise may be very time consuming. Learn more via the OpenBenchmarking.org test page.

nekRS 22.0, Input: TurboPipe Periodic - FLOP/s, more is better - c3-highcpu-8 SPR: 30667900000 (SE +/- 92013205.57, N = 3)

OpenRadioss

OpenRadioss is an open-source, AGPL-licensed finite element solver for dynamic event analysis. OpenRadioss is based on Altair Radioss, which was open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.

OpenRadioss 2022.10.13, Model: Bird Strike on Windshield - Seconds, fewer is better - c3-highcpu-8 SPR: 595.14 (SE +/- 0.30, N = 3)

OpenCV

This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.

OpenCV 4.7, Test: Image Processing - ms, fewer is better - c3-highcpu-8 SPR: 128163 (SE +/- 1624.35, N = 12)

OpenCV 4.7, Test: Stitching - ms, fewer is better - c3-highcpu-8 SPR: 214760 (SE +/- 1973.06, N = 7)

CockroachDB

CockroachDB is a cloud-native, distributed SQL database for data-intensive applications. This test profile uses a serverless CockroachDB configuration to test various CockroachDB workloads on the local host with a single node. Learn more via the OpenBenchmarking.org test page.
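For readers wanting to approximate these workloads by hand outside of the Phoronix Test Suite, a minimal sketch against a local single-node cluster could look like the following (illustrative flags and connection string, not necessarily the exact invocation used by the test profile):

    cockroach start-single-node --insecure --store=/tmp/crdb-bench --background
    cockroach workload init kv 'postgresql://root@localhost:26257?sslmode=disable'
    # KV workload with 95% reads and 128 concurrent workers for 60 seconds
    cockroach workload run kv --read-percent=95 --concurrency=128 --duration=60s \
        'postgresql://root@localhost:26257?sslmode=disable'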

CockroachDB 22.2, Workload: MoVR - Concurrency: 128 - ops/s, more is better - c3-highcpu-8 SPR: 458.8 (SE +/- 5.62, N = 15)

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries. Learn more via the OpenBenchmarking.org test page.
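As a rough point of reference, the underlying reference benchmark can be driven by hand along these lines (repository path and flag values are illustrative, not the exact arguments the test profile passes):

    git clone https://github.com/tensorflow/benchmarks.git
    cd benchmarks/scripts/tf_cnn_benchmarks
    # ResNet-50 CPU throughput at batch size 64
    python3 tf_cnn_benchmarks.py --device=cpu --data_format=NHWC \
        --model=resnet50 --batch_size=64 --num_batches=100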

TensorFlow 2.10, Device: CPU - Batch Size: 64 - Model: ResNet-50 - images/sec, more is better - c3-highcpu-8 SPR: 15.69 (SE +/- 0.01, N = 3)

OpenVKL

OpenVKL is the Intel Open Volume Kernel Library that offers high-performance volume computation kernels and is part of the Intel oneAPI Rendering Toolkit. Learn more via the OpenBenchmarking.org test page.

OpenVKL 1.3.1, Benchmark: vklBenchmark ISPC - Items / Sec, more is better - c3-highcpu-8 SPR: 98 (SE +/- 0.33, N = 3; MIN: 11 / MAX: 1579)

VVenC

VVenC 1.7, Video Input: Bosphorus 4K - Video Preset: Fast - Frames Per Second, more is better - c3-highcpu-8 SPR: 1.556 (SE +/- 0.005, N = 3)

SPECFEM3D

SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and uses a variety of its built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.

SPECFEM3D 4.0, Model: Layered Halfspace - Seconds, fewer is better - c3-highcpu-8 SPR: 372.09 (SE +/- 0.64, N = 3)

OpenRadioss

OpenRadioss is an open-source, AGPL-licensed finite element solver for dynamic event analysis. OpenRadioss is based on Altair Radioss, which was open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.

OpenRadioss 2022.10.13, Model: Rubber O-Ring Seal Installation - Seconds, fewer is better - c3-highcpu-8 SPR: 367.00 (SE +/- 0.20, N = 3)

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.28, Backend: Eigen - Nodes Per Second, more is better - c3-highcpu-8 SPR: 1221 (SE +/- 7.69, N = 3)

LeelaChessZero 0.28, Backend: BLAS - Nodes Per Second, more is better - c3-highcpu-8 SPR: 1272 (SE +/- 16.59, N = 3)

SPECFEM3D

SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and uses a variety of its built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.

SPECFEM3D 4.0, Model: Water-layered Halfspace - Seconds, fewer is better - c3-highcpu-8 SPR: 321.63 (SE +/- 0.31, N = 3)

MariaDB

This is a MariaDB MySQL database server benchmark making use of mysqlslap. Learn more via the OpenBenchmarking.org test page.
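A hand-run approximation with mysqlslap might look like this (flag values illustrative; the test profile's exact query mix and server tuning may differ):

    mysqlslap --user=root --auto-generate-sql \
        --concurrency=4096 --number-of-queries=100000 --iterations=3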

MariaDB 11.0.1, Clients: 4096 - Queries Per Second, more is better - c3-highcpu-8 SPR: 317 (SE +/- 2.62, N = 3)

Blender

Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles performance with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported as well as HIP for AMD Radeon GPUs and Intel oneAPI for Intel Graphics. Learn more via the OpenBenchmarking.org test page.

Blender 3.4, Blend File: BMW27 - Compute: CPU-Only - Seconds, fewer is better - c3-highcpu-8 SPR: 315.24 (SE +/- 1.13, N = 3)

MariaDB

This is a MariaDB MySQL database server benchmark making use of mysqlslap. Learn more via the OpenBenchmarking.org test page.

MariaDB 11.0.1, Clients: 2048 - Queries Per Second, more is better - c3-highcpu-8 SPR: 332 (SE +/- 3.01, N = 3)

OpenRadioss

OpenRadioss is an open-source, AGPL-licensed finite element solver for dynamic event analysis. OpenRadioss is based on Altair Radioss, which was open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.

OpenRadioss 2022.10.13, Model: Bumper Beam - Seconds, fewer is better - c3-highcpu-8 SPR: 303.38 (SE +/- 0.51, N = 3)

Intel Open Image Denoise

Open Image Denoise is a denoising library for ray tracing and is part of the Intel oneAPI Rendering Toolkit. Learn more via the OpenBenchmarking.org test page.

Intel Open Image Denoise 1.4.0, Run: RTLightmap.hdr.4096x4096 - Images / Sec, more is better - c3-highcpu-8 SPR: 0.12 (SE +/- 0.00, N = 3)

Timed Linux Kernel Compilation

This test times how long it takes to build the Linux kernel in a default configuration (defconfig) for the architecture being tested or alternatively an allmodconfig for building all possible kernel modules for the build. Learn more via the OpenBenchmarking.org test page.
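The manual equivalent is essentially a timed default-configuration build from a kernel source tree (assuming the usual build dependencies are installed):

    make defconfig
    time make -j"$(nproc)"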

Timed Linux Kernel Compilation 6.1, Build: defconfig - Seconds, fewer is better - c3-highcpu-8 SPR: 244.80 (SE +/- 0.64, N = 3)

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.10, Device: CPU - Batch Size: 32 - Model: ResNet-50 - images/sec, more is better - c3-highcpu-8 SPR: 14.93 (SE +/- 0.01, N = 3)

BRL-CAD

BRL-CAD is a cross-platform, open-source solid modeling system with built-in benchmark mode. Learn more via the OpenBenchmarking.org test page.

BRL-CAD 7.34 - VGR Performance Metric, more is better - c3-highcpu-8 SPR: 71072

GROMACS

GROMACS (GROningen MAchine for Chemical Simulations) is a molecular dynamics package tested here with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2023, Implementation: MPI CPU - Input: water_GMX50_bare - Ns Per Day, more is better - c3-highcpu-8 SPR: 0.777 (SE +/- 0.001, N = 3)

OpenRadioss

OpenRadioss is an open-source, AGPL-licensed finite element solver for dynamic event analysis. OpenRadioss is based on Altair Radioss, which was open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.

OpenRadioss 2022.10.13, Model: Cell Phone Drop Test - Seconds, fewer is better - c3-highcpu-8 SPR: 219.73 (SE +/- 0.38, N = 3)

OpenCV

This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.

OpenCV 4.7, Test: Graph API - ms, fewer is better - c3-highcpu-8 SPR: 219931 (SE +/- 931.36, N = 3)

SPECFEM3D

SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and uses a variety of its built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.

SPECFEM3D 4.0, Model: Homogeneous Halfspace - Seconds, fewer is better - c3-highcpu-8 SPR: 179.55 (SE +/- 0.09, N = 3)

7-Zip Compression

This is a test of 7-Zip compression/decompression with its integrated benchmark feature. Learn more via the OpenBenchmarking.org test page.
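The integrated benchmark can also be run directly with the 7z command-line tool, for example:

    7z b            # run the built-in compression/decompression benchmark
    7z b -mmt8      # optionally pin the benchmark to 8 threads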

7-Zip Compression 22.01, Test: Decompression Rating - MIPS, more is better - c3-highcpu-8 SPR: 20468 (SE +/- 183.58, N = 15)

7-Zip Compression 22.01, Test: Compression Rating - MIPS, more is better - c3-highcpu-8 SPR: 35306 (SE +/- 246.17, N = 15)

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
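Roughly equivalent manual invocations of openssl speed would be along these lines (algorithm names and thread count are illustrative, not the exact arguments the test profile uses):

    openssl speed -multi "$(nproc)" -evp aes-256-gcm
    openssl speed -multi "$(nproc)" -evp chacha20-poly1305
    openssl speed -multi "$(nproc)" sha256
    openssl speed -multi "$(nproc)" rsa4096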

OpenSSL 3.1, Algorithm: SHA256 - byte/s, more is better - c3-highcpu-8 SPR: 4283873987 (SE +/- 2722265.85, N = 3)

OpenSSL 3.1, Algorithm: ChaCha20-Poly1305 - byte/s, more is better - c3-highcpu-8 SPR: 15970781140 (SE +/- 21990039.72, N = 3)

OpenSSL 3.1, Algorithm: AES-256-GCM - byte/s, more is better - c3-highcpu-8 SPR: 48008361573 (SE +/- 48074910.93, N = 3)

OpenSSL 3.1, Algorithm: AES-128-GCM - byte/s, more is better - c3-highcpu-8 SPR: 57594077823 (SE +/- 33921495.47, N = 3)

OpenSSL 3.1, Algorithm: ChaCha20 - byte/s, more is better - c3-highcpu-8 SPR: 22091557637 (SE +/- 35781789.46, N = 3)

OpenSSL 3.1, Algorithm: SHA512 - byte/s, more is better - c3-highcpu-8 SPR: 1568572920 (SE +/- 1152941.98, N = 3)

VVenC

VVenC 1.7, Video Input: Bosphorus 4K - Video Preset: Faster - Frames Per Second, more is better - c3-highcpu-8 SPR: 3.524 (SE +/- 0.000, N = 3)

NAMD

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.

NAMD 2.14, ATPase Simulation - 327,506 Atoms - days/ns, fewer is better - c3-highcpu-8 SPR: 3.35779 (SE +/- 0.00254, N = 3)

OpenFOAM

OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.

OpenFOAM 10, Input: drivaerFastback, Small Mesh Size - Execution Time - Seconds, fewer is better - c3-highcpu-8 SPR: 422.68

OpenFOAM 10, Input: drivaerFastback, Small Mesh Size - Mesh Time - Seconds, fewer is better - c3-highcpu-8 SPR: 62.04

SPECFEM3D

SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and uses a variety of its built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.

SPECFEM3D 4.0, Model: Tomographic Model - Seconds, fewer is better - c3-highcpu-8 SPR: 143.94 (SE +/- 1.65, N = 3)

PostgreSQL

This is a benchmark of PostgreSQL using the integrated pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.
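A hand-run equivalent of the read-only configurations shown below would be roughly as follows (database name and duration are illustrative):

    createdb pgbench_test
    pgbench -i -s 100 pgbench_test             # initialize at scaling factor 100
    pgbench -S -c 800 -j 8 -T 60 pgbench_test  # select-only, 800 clients, 60 seconds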

PostgreSQL 15, Scaling Factor: 100 - Clients: 800 - Mode: Read Only - Average Latency - ms, fewer is better - c3-highcpu-8 SPR: 2.565 (SE +/- 0.028, N = 3)

PostgreSQL 15, Scaling Factor: 100 - Clients: 800 - Mode: Read Only - TPS, more is better - c3-highcpu-8 SPR: 311942 (SE +/- 3369.27, N = 3)

PostgreSQL 15, Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency - ms, fewer is better - c3-highcpu-8 SPR: 3.405 (SE +/- 0.028, N = 3)

PostgreSQL 15, Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - TPS, more is better - c3-highcpu-8 SPR: 293725 (SE +/- 2414.33, N = 3)

SPECFEM3D

SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and uses a variety of its built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.

SPECFEM3D 4.0, Model: Mount St. Helens - Seconds, fewer is better - c3-highcpu-8 SPR: 139.52 (SE +/- 0.10, N = 3)

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.10, Device: CPU - Batch Size: 16 - Model: ResNet-50 - images/sec, more is better - c3-highcpu-8 SPR: 14.20 (SE +/- 0.04, N = 3)

Intel Open Image Denoise

Open Image Denoise is a denoising library for ray-tracing and part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

Intel Open Image Denoise 1.4.0, Run: RT.hdr_alb_nrm.3840x2160 - Images / Sec, more is better - c3-highcpu-8 SPR: 0.24 (SE +/- 0.00, N = 3)

OSPRay Studio

Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay Studio 0.11, Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - ms, fewer is better - c3-highcpu-8 SPR: 25819 (SE +/- 364.36, N = 3)

Timed FFmpeg Compilation

This test times how long it takes to build the FFmpeg multimedia library. Learn more via the OpenBenchmarking.org test page.
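The manual equivalent is essentially a timed default build from a checked-out FFmpeg source tree:

    ./configure
    time make -j"$(nproc)"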

Timed FFmpeg Compilation 6.0, Time To Compile - Seconds, fewer is better - c3-highcpu-8 SPR: 120.44 (SE +/- 0.10, N = 3)

VVenC

VVenC 1.7, Video Input: Bosphorus 1080p - Video Preset: Fast - Frames Per Second, more is better - c3-highcpu-8 SPR: 5.305 (SE +/- 0.018, N = 3)

Embree

Embree 4.0.1, Binary: Pathtracer ISPC - Model: Crown - Frames Per Second, more is better - c3-highcpu-8 SPR: 5.8475 (SE +/- 0.0120, N = 3; MIN: 5.81 / MAX: 5.92)

CockroachDB

CockroachDB is a cloud-native, distributed SQL database for data-intensive applications. This test profile uses a serverless CockroachDB configuration to test various CockroachDB workloads on the local host with a single node. Learn more via the OpenBenchmarking.org test page.

CockroachDB 22.2, Workload: KV, 50% Reads - Concurrency: 128 - ops/s, more is better - c3-highcpu-8 SPR: 19321.6 (SE +/- 40.86, N = 3)

CockroachDB 22.2, Workload: KV, 95% Reads - Concurrency: 128 - ops/s, more is better - c3-highcpu-8 SPR: 24960.1 (SE +/- 127.78, N = 3)

miniBUDE

MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1 - Billion Interactions/s, more is better - c3-highcpu-8 SPR: 7.546 (SE +/- 0.001, N = 3)

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1 - GFInst/s, more is better - c3-highcpu-8 SPR: 188.65 (SE +/- 0.03, N = 3)

OSPRay Studio

Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay Studio 0.11, Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - ms, fewer is better - c3-highcpu-8 SPR: 20952 (SE +/- 4.91, N = 3)

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.

oneDNN 3.0, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU - ms, fewer is better - c3-highcpu-8 SPR: 4660.70 (SE +/- 0.86, N = 3; MIN: 4648.25)

nginx

This is a benchmark of the lightweight Nginx HTTP(S) web server. This Nginx web server benchmark test profile makes use of the wrk program for facilitating the HTTP requests over a fixed period of time with a configurable number of concurrent clients/connections. HTTPS with a self-signed OpenSSL certificate is used by this test for local benchmarking. Learn more via the OpenBenchmarking.org test page.
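A comparable manual run with wrk against a locally configured Nginx instance might look like this (URL, port, and thread/connection counts are illustrative):

    wrk -t 8 -c 1000 -d 60s https://localhost:8089/index.html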

nginx 1.23.2, Connections: 1000 - Requests Per Second, more is better - c3-highcpu-8 SPR: 32118.58 (SE +/- 22.42, N = 3)

nginx 1.23.2, Connections: 4000 - Requests Per Second, more is better - c3-highcpu-8 SPR: 32814.75 (SE +/- 27.93, N = 3)

nginx 1.23.2, Connections: 100 - Requests Per Second, more is better - c3-highcpu-8 SPR: 36310.35 (SE +/- 23.95, N = 3)

nginx 1.23.2, Connections: 500 - Requests Per Second, more is better - c3-highcpu-8 SPR: 34672.65 (SE +/- 321.85, N = 3)

nginx 1.23.2, Connections: 200 - Requests Per Second, more is better - c3-highcpu-8 SPR: 35602.10 (SE +/- 83.52, N = 3)

Embree

Embree 4.0.1, Binary: Pathtracer ISPC - Model: Asian Dragon - Frames Per Second, more is better - c3-highcpu-8 SPR: 7.3642 (SE +/- 0.0079, N = 3; MIN: 7.33 / MAX: 7.44)

OpenCV

This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.

OpenCV 4.7, Test: Core - ms, fewer is better - c3-highcpu-8 SPR: 87372 (SE +/- 280.31, N = 3)

uvg266

uvg266 is an open-source VVC/H.266 (Versatile Video Coding) encoder based on Kvazaar as part of the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.

uvg266 0.4.1, Video Input: Bosphorus 4K - Video Preset: Very Fast - Frames Per Second, more is better - c3-highcpu-8 SPR: 6.99 (SE +/- 0.03, N = 3)

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.

oneDNN 3.0, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU - ms, fewer is better - c3-highcpu-8 SPR: 2337.31 (SE +/- 1.76, N = 3; MIN: 2326.77)

uvg266

uvg266 is an open-source VVC/H.266 (Versatile Video Coding) encoder based on Kvazaar as part of the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.

uvg266 0.4.1, Video Input: Bosphorus 4K - Video Preset: Super Fast - Frames Per Second, more is better - c3-highcpu-8 SPR: 7.48 (SE +/- 0.00, N = 3)

Zstd Compression

This test measures the time needed to compress/decompress a sample file (silesia.tar) using Zstd (Zstandard) compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.
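zstd's own benchmark mode can reproduce this style of measurement by hand, for example (assuming a local copy of silesia.tar):

    zstd -b19 silesia.tar            # level 19 compression/decompression speed
    zstd -b19 --long silesia.tar     # the same with long-distance matching ("long mode")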

Zstd Compression 1.5.4, Compression Level: 19 - Decompression Speed - MB/s, more is better - c3-highcpu-8 SPR: 905.2 (SE +/- 1.50, N = 3)

Zstd Compression 1.5.4, Compression Level: 19 - Compression Speed - MB/s, more is better - c3-highcpu-8 SPR: 10.3 (SE +/- 0.12, N = 3)

Memcached

Memcached is a high performance, distributed memory object caching system. This Memcached test profile makes use of memtier_benchmark for executing this CPU/memory-focused server benchmark. Learn more via the OpenBenchmarking.org test page.
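A rough manual equivalent (server options, client counts, and run time are illustrative) would be:

    memcached -d -p 11211
    memtier_benchmark -s 127.0.0.1 -p 11211 -P memcache_binary \
        --ratio=1:10 -t 4 -c 50 --test-time=60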

Memcached 1.6.18, Set To Get Ratio: 1:10 - Ops/sec, more is better - c3-highcpu-8 SPR: 1044947.13 (SE +/- 3280.78, N = 3)

Memcached 1.6.18, Set To Get Ratio: 1:100 - Ops/sec, more is better - c3-highcpu-8 SPR: 1030937.28 (SE +/- 11157.73, N = 3)

uvg266

uvg266 is an open-source VVC/H.266 (Versatile Video Coding) encoder based on Kvazaar as part of the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.

uvg266 0.4.1, Video Input: Bosphorus 4K - Video Preset: Ultra Fast - Frames Per Second, more is better - c3-highcpu-8 SPR: 9.12 (SE +/- 0.01, N = 3)

OpenCV

This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.

OpenCV 4.7, Test: Object Detection - ms, fewer is better - c3-highcpu-8 SPR: 38999 (SE +/- 384.74, N = 5)

Zstd Compression

This test measures the time needed to compress/decompress a sample file (silesia.tar) using Zstd (Zstandard) compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.

Zstd Compression 1.5.4, Compression Level: 19, Long Mode - Decompression Speed - MB/s, more is better - c3-highcpu-8 SPR: 907.2 (SE +/- 1.48, N = 3)

Zstd Compression 1.5.4, Compression Level: 19, Long Mode - Compression Speed - MB/s, more is better - c3-highcpu-8 SPR: 6.5 (SE +/- 0.00, N = 3)

John The Ripper

This is a benchmark of John The Ripper, which is a password cracker. Learn more via the OpenBenchmarking.org test page.
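John's built-in self-benchmark can be invoked per format, e.g. (the format names here are the upstream ones and may not map one-to-one onto the test titles below):

    john --test --format=md5crypt
    john --test --format=bcrypt
    john --test --format=HMAC-SHA512
    john --test --format=wpapsk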

John The Ripper 2023.03.14, Test: MD5 - Real C/S, more is better - c3-highcpu-8 SPR: 765713 (SE +/- 1472.13, N = 3)

John The Ripper 2023.03.14, Test: HMAC-SHA512 - Real C/S, more is better - c3-highcpu-8 SPR: 37509000 (SE +/- 21071.31, N = 3)

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.

OpenSSL 3.1, Algorithm: RSA4096 - verify/s, more is better - c3-highcpu-8 SPR: 67857.7 (SE +/- 8.53, N = 3)

OpenSSL 3.1, Algorithm: RSA4096 - sign/s, more is better - c3-highcpu-8 SPR: 2062.7 (SE +/- 1.17, N = 3)

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.
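The underlying utility can also be driven by hand; a minimal sketch (the SparseZoo stub is a placeholder and the flag values are illustrative, not the test profile's exact arguments) is:

    pip install deepsparse
    deepsparse.benchmark zoo:<sparsezoo-model-stub> --scenario async --batch_size 1 --time 60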

Neural Magic DeepSparse 1.3.2, Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream - ms/batch, fewer is better - c3-highcpu-8 SPR: 305.90 (SE +/- 0.37, N = 3)

Neural Magic DeepSparse 1.3.2, Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream - items/sec, more is better - c3-highcpu-8 SPR: 6.5372 (SE +/- 0.0078, N = 3)

Neural Magic DeepSparse 1.3.2, Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream - ms/batch, fewer is better - c3-highcpu-8 SPR: 528.02 (SE +/- 1.78, N = 3)

Neural Magic DeepSparse 1.3.2, Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream - items/sec, more is better - c3-highcpu-8 SPR: 3.7873 (SE +/- 0.0130, N = 3)

Neural Magic DeepSparse 1.3.2, Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream - ms/batch, fewer is better - c3-highcpu-8 SPR: 530.61 (SE +/- 3.34, N = 3)

Neural Magic DeepSparse 1.3.2, Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream - items/sec, more is better - c3-highcpu-8 SPR: 3.7693 (SE +/- 0.0239, N = 3)

Neural Magic DeepSparse 1.3.2, Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream - ms/batch, fewer is better - c3-highcpu-8 SPR: 123.29 (SE +/- 0.95, N = 3)

Neural Magic DeepSparse 1.3.2, Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream - items/sec, more is better - c3-highcpu-8 SPR: 16.22 (SE +/- 0.13, N = 3)

VVenC

VVenC 1.7, Video Input: Bosphorus 1080p - Video Preset: Faster - Frames Per Second, more is better - c3-highcpu-8 SPR: 12.92 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse 1.3.2, Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream - ms/batch, fewer is better - c3-highcpu-8 SPR: 38.51 (SE +/- 0.04, N = 3)

Neural Magic DeepSparse 1.3.2, Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream - items/sec, more is better - c3-highcpu-8 SPR: 51.89 (SE +/- 0.05, N = 3)

Neural Magic DeepSparse 1.3.2, Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream - ms/batch, fewer is better - c3-highcpu-8 SPR: 104.31 (SE +/- 0.14, N = 3)

Neural Magic DeepSparse 1.3.2, Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream - items/sec, more is better - c3-highcpu-8 SPR: 19.17 (SE +/- 0.03, N = 3)

Neural Magic DeepSparse 1.3.2, Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream - ms/batch, fewer is better - c3-highcpu-8 SPR: 60.39 (SE +/- 0.28, N = 3)

Neural Magic DeepSparse 1.3.2, Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream - items/sec, more is better - c3-highcpu-8 SPR: 33.11 (SE +/- 0.15, N = 3)

Google Draco

Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression. Learn more via the OpenBenchmarking.org test page.

Google Draco 1.5.6, Model: Lion - ms, fewer is better - c3-highcpu-8 SPR: 6250 (SE +/- 80.23, N = 15)

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse 1.3.2, Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream - ms/batch, fewer is better - c3-highcpu-8 SPR: 30.80 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse 1.3.2, Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream - items/sec, more is better - c3-highcpu-8 SPR: 64.87 (SE +/- 0.04, N = 3)

Xcompact3d Incompact3d

Xcompact3d Incompact3d is a Fortran-MPI based, finite-difference high-performance code for solving the incompressible Navier-Stokes equations along with as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.

Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 129 Cells Per Direction - Seconds, fewer is better - c3-highcpu-8 SPR: 32.39 (SE +/- 0.02, N = 3)

OpenCV

This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.

OpenCV 4.7, Test: Video - ms, fewer is better - c3-highcpu-8 SPR: 31654 (SE +/- 198.80, N = 3)

John The Ripper

This is a benchmark of John The Ripper, which is a password cracker. Learn more via the OpenBenchmarking.org test page.

John The Ripper 2023.03.14, Test: WPA PSK - Real C/S, more is better - c3-highcpu-8 SPR: 28818 (SE +/- 27.15, N = 3)

John The Ripper 2023.03.14, Test: bcrypt - Real C/S, more is better - c3-highcpu-8 SPR: 6932 (SE +/- 0.67, N = 3)

John The Ripper 2023.03.14, Test: Blowfish - Real C/S, more is better - c3-highcpu-8 SPR: 6930 (SE +/- 2.08, N = 3)

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.

oneDNN 3.0, Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU - ms, fewer is better - c3-highcpu-8 SPR: 1.47145 (SE +/- 0.00472, N = 3; MIN: 1.4)

uvg266

uvg266 is an open-source VVC/H.266 (Versatile Video Coding) encoder based on Kvazaar as part of the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.

uvg266 0.4.1, Video Input: Bosphorus 1080p - Video Preset: Very Fast - Frames Per Second, more is better - c3-highcpu-8 SPR: 32.39 (SE +/- 0.25, N = 3)

uvg266 0.4.1, Video Input: Bosphorus 1080p - Video Preset: Super Fast - Frames Per Second, more is better - c3-highcpu-8 SPR: 34.50 (SE +/- 0.03, N = 3)

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.

oneDNN 3.0, Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU - ms, fewer is better - c3-highcpu-8 SPR: 1.50004 (SE +/- 0.00340, N = 3; MIN: 1.37)

uvg266

uvg266 is an open-source VVC/H.266 (Versatile Video Coding) encoder based on Kvazaar as part of the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.

uvg266 0.4.1, Video Input: Bosphorus 1080p - Video Preset: Ultra Fast - Frames Per Second, more is better - c3-highcpu-8 SPR: 42.24 (SE +/- 0.01, N = 3)

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.

oneDNN 3.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU - ms, fewer is better - c3-highcpu-8 SPR: 0.968986 (SE +/- 0.013114, N = 3; MIN: 0.92)

oneDNN 3.0, Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU - ms, fewer is better - c3-highcpu-8 SPR: 5.34218 (SE +/- 0.00535, N = 3; MIN: 4.94)

Google Draco

Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression. Learn more via the OpenBenchmarking.org test page.

Google Draco 1.5.6, Model: Church Facade - ms, fewer is better - c3-highcpu-8 SPR: 7573 (SE +/- 10.48, N = 3)

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.

oneDNN 3.0, Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU - ms, fewer is better - c3-highcpu-8 SPR: 4.17707 (SE +/- 0.00870, N = 3; MIN: 4.04)

oneDNN 3.0, Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU - ms, fewer is better - c3-highcpu-8 SPR: 3.54944 (SE +/- 0.01072, N = 3; MIN: 3.47)

110 Results Shown

nekRS
OpenRadioss
OpenCV:
  Image Processing
  Stitching
CockroachDB
TensorFlow
OpenVKL
VVenC
SPECFEM3D
OpenRadioss
LeelaChessZero:
  Eigen
  BLAS
SPECFEM3D
MariaDB
Blender
MariaDB
OpenRadioss
Intel Open Image Denoise
Timed Linux Kernel Compilation
TensorFlow
BRL-CAD
GROMACS
OpenRadioss
OpenCV
SPECFEM3D
7-Zip Compression:
  Decompression Rating
  Compression Rating
OpenSSL:
  SHA256
  ChaCha20-Poly1305
  AES-256-GCM
  AES-128-GCM
  ChaCha20
  SHA512
VVenC
NAMD
OpenFOAM:
  drivaerFastback, Small Mesh Size - Execution Time
  drivaerFastback, Small Mesh Size - Mesh Time
SPECFEM3D
PostgreSQL:
  100 - 800 - Read Only - Average Latency
  100 - 800 - Read Only
  100 - 1000 - Read Only - Average Latency
  100 - 1000 - Read Only
SPECFEM3D
TensorFlow
Intel Open Image Denoise
OSPRay Studio
Timed FFmpeg Compilation
VVenC
Embree
CockroachDB:
  KV, 50% Reads - 128
  KV, 95% Reads - 128
miniBUDE:
  OpenMP - BM1:
    Billion Interactions/s
    GFInst/s
OSPRay Studio
oneDNN
nginx:
  1000
  4000
  100
  500
  200
Embree
OpenCV
uvg266
oneDNN
uvg266
Zstd Compression:
  19 - Decompression Speed
  19 - Compression Speed
Memcached:
  1:10
  1:100
uvg266
OpenCV
Zstd Compression:
  19, Long Mode - Decompression Speed
  19, Long Mode - Compression Speed
John The Ripper:
  MD5
  HMAC-SHA512
OpenSSL:
  RSA4096:
    verify/s
    sign/s
Neural Magic DeepSparse:
  CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream:
    ms/batch
    items/sec
VVenC
Neural Magic DeepSparse:
  NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream:
    ms/batch
    items/sec
  NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream:
    ms/batch
    items/sec
Google Draco
Neural Magic DeepSparse:
  CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream:
    ms/batch
    items/sec
Xcompact3d Incompact3d
OpenCV
John The Ripper:
  WPA PSK
  bcrypt
  Blowfish
oneDNN
uvg266:
  Bosphorus 1080p - Very Fast
  Bosphorus 1080p - Super Fast
oneDNN
uvg266
oneDNN:
  Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU
  IP Shapes 3D - bf16bf16bf16 - CPU
Google Draco
oneDNN:
  Convolution Batch Shapes Auto - bf16bf16bf16 - CPU
  Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU