Benchmarks by Michael Larabel for a future article.
c3-highcpu-8 SPR Processor: Intel Xeon Platinum 8481C (4 Cores / 8 Threads), Motherboard: Google Compute Engine c3-highcpu-8, Chipset: Intel 440FX 82441FX PMC, Memory: 16GB, Disk: 322GB nvme_card-pd, Network: Google Compute Engine Virtual
OS: Ubuntu 22.10, Kernel: 5.19.0-1015-gcp (x86_64), Vulkan: 1.3.224, Compiler: GCC 12.2.0, File-System: ext4, System Layer: KVM
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: CPU Microcode: 0xffffffff
Python Notes: Python 3.10.7
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
Google Cloud c3 Sapphire Rapids - Result Overview (c3-highcpu-8 SPR)
(DeepSparse and miniBUDE entries report two figures each; the original overview lists those configurations twice.)
opencv: Object Detection = 38999
opencv: Image Processing = 128163
opencv: Stitching = 214760
opencv: Graph API = 219931
opencv: Video = 31654
opencv: Core = 87372
brl-cad: VGR Performance Metric = 71072
nginx: 4000 = 32814.75
nginx: 1000 = 32118.58
nginx: 500 = 34672.65
nginx: 200 = 35602.10
nginx: 100 = 36310.35
blender: BMW27 - CPU-Only = 315.24
draco: Church Facade = 7573
draco: Lion = 6250
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream = 530.6100 / 3.7693
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream = 123.2931 / 16.2194
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream = 305.8991 / 6.5372
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream = 60.3858 / 33.1064
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream = 30.7952 / 64.8747
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream = 104.3102 / 19.1677
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream = 38.5120 / 51.8926
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream = 528.0193 / 3.7873
tensorflow: CPU - 64 - ResNet-50 = 15.69
tensorflow: CPU - 32 - ResNet-50 = 14.93
tensorflow: CPU - 16 - ResNet-50 = 14.20
pgbench: 100 - 1000 - Read Only - Average Latency = 3.405
pgbench: 100 - 1000 - Read Only = 293725
pgbench: 100 - 800 - Read Only - Average Latency = 2.565
pgbench: 100 - 800 - Read Only = 311942
mysqlslap: 4096 = 317
mysqlslap: 2048 = 332
gromacs: MPI CPU - water_GMX50_bare = 0.777
memcached: 1:100 = 1030937.28
memcached: 1:10 = 1044947.13
cockroach: KV, 95% Reads - 128 = 24960.1
cockroach: KV, 50% Reads - 128 = 19321.6
cockroach: MoVR - 128 = 458.8
openssl: ChaCha20-Poly1305 = 15970781140
openssl: AES-256-GCM = 48008361573
openssl: AES-128-GCM = 57594077823
openssl: ChaCha20 = 22091557637
openssl: RSA4096 (verify/s) = 67857.7
openssl: RSA4096 (sign/s) = 2062.7
openssl: SHA512 = 1568572920
openssl: SHA256 = 4283873987
ospray-studio: 3 - 4K - 1 - Path Tracer = 25819
ospray-studio: 1 - 4K - 1 - Path Tracer = 20952
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU = 0.968986
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU = 2337.31
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU = 4660.70
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU = 3.54944
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU = 1.47145
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU = 4.17707
onednn: IP Shapes 3D - bf16bf16bf16 - CPU = 5.34218
onednn: IP Shapes 1D - bf16bf16bf16 - CPU = 1.50004
build-linux-kernel: defconfig = 244.800
build-ffmpeg: Time To Compile = 120.438
compress-7zip: Decompression Rating = 20468
compress-7zip: Compression Rating = 35306
openvkl: vklBenchmark ISPC = 98
oidn: RTLightmap.hdr.4096x4096 = 0.12
oidn: RT.hdr_alb_nrm.3840x2160 = 0.24
uvg266: Bosphorus 1080p - Ultra Fast = 42.24
uvg266: Bosphorus 1080p - Super Fast = 34.50
uvg266: Bosphorus 1080p - Very Fast = 32.39
uvg266: Bosphorus 4K - Ultra Fast = 9.12
uvg266: Bosphorus 4K - Super Fast = 7.48
uvg266: Bosphorus 4K - Very Fast = 6.99
embree: Pathtracer ISPC - Asian Dragon = 7.3642
embree: Pathtracer ISPC - Crown = 5.8475
john-the-ripper: MD5 = 765713
john-the-ripper: HMAC-SHA512 = 37509000
john-the-ripper: Blowfish = 6930
john-the-ripper: WPA PSK = 28818
john-the-ripper: bcrypt = 6932
compress-zstd: 19, Long Mode - Decompression Speed = 907.2
compress-zstd: 19, Long Mode - Compression Speed = 6.5
compress-zstd: 19 - Decompression Speed = 905.2
compress-zstd: 19 - Compression Speed = 10.3
specfem3d: Water-layered Halfspace = 321.625251668
specfem3d: Homogeneous Halfspace = 179.545901627
specfem3d: Tomographic Model = 143.937022875
specfem3d: Layered Halfspace = 372.089480857
specfem3d: Mount St. Helens = 139.524469614
openradioss: Rubber O-Ring Seal Installation = 367.00
openradioss: Bird Strike on Windshield = 595.14
openradioss: Cell Phone Drop Test = 219.73
openradioss: Bumper Beam = 303.38
openfoam: drivaerFastback, Small Mesh Size - Execution Time = 422.67836
openfoam: drivaerFastback, Small Mesh Size - Mesh Time = 62.044277
incompact3d: input.i3d 129 Cells Per Direction = 32.3943163
nekrs: TurboPipe Periodic = 30667900000
namd: ATPase Simulation - 327,506 Atoms = 3.35779
minibude: OpenMP - BM1 = 7.546 / 188.651
lczero: Eigen = 1221
lczero: BLAS = 1272
OpenCV This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.
OpenCV 4.7 - Test: Object Detection - c3-highcpu-8 SPR: 38999 ms (fewer is better), SE +/- 384.74, N = 5
OpenCV 4.7 - Test: Image Processing - c3-highcpu-8 SPR: 128163 ms (fewer is better), SE +/- 1624.35, N = 12
OpenCV 4.7 - Test: Stitching - c3-highcpu-8 SPR: 214760 ms (fewer is better), SE +/- 1973.06, N = 7
OpenCV 4.7 - Test: Graph API - c3-highcpu-8 SPR: 219931 ms (fewer is better), SE +/- 931.36, N = 3
OpenCV 4.7 - Test: Video - c3-highcpu-8 SPR: 31654 ms (fewer is better), SE +/- 198.80, N = 3
OpenCV 4.7 - Test: Core - c3-highcpu-8 SPR: 87372 ms (fewer is better), SE +/- 280.31, N = 3
1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
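Each result in this file is the mean of N runs together with its standard error (SE). As a rough guide to how those figures are produced, here is a minimal Python sketch assuming SE is the standard error of the mean; the per-run timings below are hypothetical, since the raw runs are not included in this result file.

    import math
    import statistics

    # Hypothetical per-run timings in ms (the raw runs are not part of this file).
    runs = [38541, 39870, 38102, 39333, 39149]

    n = len(runs)
    mean = statistics.fmean(runs)               # the reported result
    se = statistics.stdev(runs) / math.sqrt(n)  # standard error of the mean

    print(f"{mean:.0f} ms, SE +/- {se:.2f}, N = {n}")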
BRL-CAD BRL-CAD is a cross-platform, open-source solid modeling system with built-in benchmark mode. Learn more via the OpenBenchmarking.org test page.
BRL-CAD 7.34 - VGR Performance Metric - c3-highcpu-8 SPR: 71072 VGR Performance Metric (more is better). 1. (CXX) g++ options: -std=c++14 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lregex_brl -lz_brl -lnetpbm -ldl -lm -ltk8.6
nginx This is a benchmark of the lightweight Nginx HTTP(S) web server. This Nginx web server benchmark makes use of the wrk program for generating the HTTP requests over a fixed period of time with a configurable number of concurrent clients/connections. HTTPS with a self-signed OpenSSL certificate is used by this test for local benchmarking. Learn more via the OpenBenchmarking.org test page.
nginx 1.23.2 - Connections: 4000 - c3-highcpu-8 SPR: 32814.75 Requests Per Second (more is better), SE +/- 27.93, N = 3
nginx 1.23.2 - Connections: 1000 - c3-highcpu-8 SPR: 32118.58 Requests Per Second (more is better), SE +/- 22.42, N = 3
nginx 1.23.2 - Connections: 500 - c3-highcpu-8 SPR: 34672.65 Requests Per Second (more is better), SE +/- 321.85, N = 3
nginx 1.23.2 - Connections: 200 - c3-highcpu-8 SPR: 35602.10 Requests Per Second (more is better), SE +/- 83.52, N = 3
nginx 1.23.2 - Connections: 100 - c3-highcpu-8 SPR: 36310.35 Requests Per Second (more is better), SE +/- 23.95, N = 3
1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
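Throughput is roughly flat from 100 to 4000 connections, which suggests the 4-core instance, rather than the offered load, is the bottleneck. A small Python sketch over the figures above makes the per-connection drop-off explicit (pure arithmetic on the reported numbers):

    # nginx requests per second from the results above, keyed by connection count.
    nginx_rps = {100: 36310.35, 200: 35602.10, 500: 34672.65,
                 1000: 32118.58, 4000: 32814.75}

    for connections, rps in nginx_rps.items():
        # With a saturated server, extra connections mostly queue rather than add throughput.
        per_conn = rps / connections
        print(f"{connections:>5} connections: {rps:9.2f} req/s total, {per_conn:7.2f} req/s per connection")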
Blender Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles performance with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported as well as HIP for AMD Radeon GPUs and Intel oneAPI for Intel Graphics. Learn more via the OpenBenchmarking.org test page.
Blender 3.4 - Blend File: BMW27 - Compute: CPU-Only - c3-highcpu-8 SPR: 315.24 Seconds (fewer is better), SE +/- 1.13, N = 3
Google Draco Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression. Learn more via the OpenBenchmarking.org test page.
Google Draco 1.5.6 - Model: Church Facade - c3-highcpu-8 SPR: 7573 ms (fewer is better), SE +/- 10.48, N = 3. 1. (CXX) g++ options: -O3
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream - c3-highcpu-8 SPR: 3.7873 items/sec (more is better), SE +/- 0.0130, N = 3
TensorFlow This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries too. Learn more via the OpenBenchmarking.org test page.
TensorFlow 2.10 - Device: CPU - Batch Size: 64 - Model: ResNet-50 - c3-highcpu-8 SPR: 15.69 images/sec (more is better), SE +/- 0.01, N = 3
PostgreSQL This is a benchmark of PostgreSQL using the integrated pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.
PostgreSQL 15 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency - c3-highcpu-8 SPR: 3.405 ms (fewer is better), SE +/- 0.028, N = 3
PostgreSQL 15 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - c3-highcpu-8 SPR: 293725 TPS (more is better), SE +/- 2414.33, N = 3
PostgreSQL 15 - Scaling Factor: 100 - Clients: 800 - Mode: Read Only - Average Latency - c3-highcpu-8 SPR: 2.565 ms (fewer is better), SE +/- 0.028, N = 3
PostgreSQL 15 - Scaling Factor: 100 - Clients: 800 - Mode: Read Only - c3-highcpu-8 SPR: 311942 TPS (more is better), SE +/- 3369.27, N = 3
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
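As a sanity check, the reported average latency should be close to the client count divided by the transactions per second (Little's law for a closed-loop benchmark), and the numbers above agree. A short Python sketch using only the reported figures:

    # (clients, transactions per second, reported average latency in ms)
    pgbench_runs = [(1000, 293725, 3.405), (800, 311942, 2.565)]

    for clients, tps, reported_ms in pgbench_runs:
        derived_ms = clients / tps * 1000  # Little's law: latency = in-flight clients / throughput
        print(f"{clients} clients: derived {derived_ms:.3f} ms vs reported {reported_ms:.3f} ms")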
MariaDB 11.0.1 - Clients: 2048 - c3-highcpu-8 SPR: 332 Queries Per Second (more is better), SE +/- 3.01, N = 3. 1. (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lcrypt -lz -lm -lssl -lcrypto -lpthread -ldl
GROMACS The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package testing with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
GROMACS 2023 - Implementation: MPI CPU - Input: water_GMX50_bare - c3-highcpu-8 SPR: 0.777 Ns Per Day (more is better), SE +/- 0.001, N = 3. 1. (CXX) g++ options: -O3
Memcached Memcached is a high performance, distributed memory object caching system. This Memcached test profile makes use of memtier_benchmark for executing this CPU/memory-focused server benchmark. Learn more via the OpenBenchmarking.org test page.
Memcached 1.6.18 - Set To Get Ratio: 1:100 - c3-highcpu-8 SPR: 1030937.28 Ops/sec (more is better), SE +/- 11157.73, N = 3
Memcached 1.6.18 - Set To Get Ratio: 1:10 - c3-highcpu-8 SPR: 1044947.13 Ops/sec (more is better), SE +/- 3280.78, N = 3
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
CockroachDB CockroachDB is a cloud-native, distributed SQL database for data-intensive applications. This test profile uses a server-less CockroachDB configuration to test various CockroachDB workloads on the local host with a single node. Learn more via the OpenBenchmarking.org test page.
CockroachDB 22.2 - Workload: KV, 95% Reads - Concurrency: 128 - c3-highcpu-8 SPR: 24960.1 ops/s (more is better), SE +/- 127.78, N = 3
OpenSSL OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
OpenSSL 3.1 - Algorithm: ChaCha20-Poly1305 - c3-highcpu-8 SPR: 15970781140 byte/s (more is better), SE +/- 21990039.72, N = 3
OpenSSL 3.1 - Algorithm: AES-256-GCM - c3-highcpu-8 SPR: 48008361573 byte/s (more is better), SE +/- 48074910.93, N = 3
OpenSSL 3.1 - Algorithm: AES-128-GCM - c3-highcpu-8 SPR: 57594077823 byte/s (more is better), SE +/- 33921495.47, N = 3
OpenSSL 3.1 - Algorithm: ChaCha20 - c3-highcpu-8 SPR: 22091557637 byte/s (more is better), SE +/- 35781789.46, N = 3
OpenSSL 3.1 - Algorithm: RSA4096 - c3-highcpu-8 SPR: 67857.7 verify/s (more is better), SE +/- 8.53, N = 3
OpenSSL 3.1 - Algorithm: RSA4096 - c3-highcpu-8 SPR: 2062.7 sign/s (more is better), SE +/- 1.17, N = 3
OpenSSL 3.1 - Algorithm: SHA512 - c3-highcpu-8 SPR: 1568572920 byte/s (more is better), SE +/- 1152941.98, N = 3
OpenSSL 3.1 - Algorithm: SHA256 - c3-highcpu-8 SPR: 4283873987 byte/s (more is better), SE +/- 2722265.85, N = 3
1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
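The raw byte/s figures are easier to compare once converted to GB/s; dividing by the instance's 8 threads also gives a rough per-thread figure, under the assumption that the work was spread evenly across all threads. A small Python sketch over the values above:

    # OpenSSL 3.1 throughput results from above, in bytes per second.
    openssl_bps = {
        "AES-128-GCM": 57_594_077_823,
        "AES-256-GCM": 48_008_361_573,
        "ChaCha20": 22_091_557_637,
        "ChaCha20-Poly1305": 15_970_781_140,
        "SHA256": 4_283_873_987,
        "SHA512": 1_568_572_920,
    }

    THREADS = 8  # c3-highcpu-8: 4 cores / 8 threads; an even per-thread split is an assumption

    for algo, bps in openssl_bps.items():
        print(f"{algo:18s}: {bps / 1e9:6.2f} GB/s total, ~{bps / 1e9 / THREADS:5.2f} GB/s per thread")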
OSPRay Studio Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
OSPRay Studio 0.11 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - c3-highcpu-8 SPR: 25819 ms (fewer is better), SE +/- 364.36, N = 3
OSPRay Studio 0.11 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer - c3-highcpu-8 SPR: 20952 ms (fewer is better), SE +/- 4.91, N = 3
1. (CXX) g++ options: -O3 -lm -ldl
oneDNN This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU - c3-highcpu-8 SPR: 0.968986 ms (fewer is better), SE +/- 0.013114, N = 3, MIN: 0.92
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU - c3-highcpu-8 SPR: 2337.31 ms (fewer is better), SE +/- 1.76, N = 3, MIN: 2326.77
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU - c3-highcpu-8 SPR: 4660.70 ms (fewer is better), SE +/- 0.86, N = 3, MIN: 4648.25
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU - c3-highcpu-8 SPR: 3.54944 ms (fewer is better), SE +/- 0.01072, N = 3, MIN: 3.47
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU - c3-highcpu-8 SPR: 1.47145 ms (fewer is better), SE +/- 0.00472, N = 3, MIN: 1.4
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU - c3-highcpu-8 SPR: 4.17707 ms (fewer is better), SE +/- 0.00870, N = 3, MIN: 4.04
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU - c3-highcpu-8 SPR: 5.34218 ms (fewer is better), SE +/- 0.00535, N = 3, MIN: 4.94
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU - c3-highcpu-8 SPR: 1.50004 ms (fewer is better), SE +/- 0.00340, N = 3, MIN: 1.37
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
VVenC
VVenC 1.7 - Video Input: Bosphorus 1080p - Video Preset: Faster - c3-highcpu-8 SPR: 12.92 Frames Per Second (more is better), SE +/- 0.01, N = 3
VVenC 1.7 - Video Input: Bosphorus 1080p - Video Preset: Fast - c3-highcpu-8 SPR: 5.305 Frames Per Second (more is better), SE +/- 0.018, N = 3
VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Faster - c3-highcpu-8 SPR: 3.524 Frames Per Second (more is better), SE +/- 0.000, N = 3
VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Fast - c3-highcpu-8 SPR: 1.556 Frames Per Second (more is better), SE +/- 0.005, N = 3
1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Super Fast - c3-highcpu-8 SPR: 34.50 Frames Per Second (more is better), SE +/- 0.03, N = 3
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Very Fast - c3-highcpu-8 SPR: 32.39 Frames Per Second (more is better), SE +/- 0.25, N = 3
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast - c3-highcpu-8 SPR: 9.12 Frames Per Second (more is better), SE +/- 0.01, N = 3
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Super Fast - c3-highcpu-8 SPR: 7.48 Frames Per Second (more is better), SE +/- 0.00, N = 3
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Very Fast - c3-highcpu-8 SPR: 6.99 Frames Per Second (more is better), SE +/- 0.03, N = 3
Embree
Embree 4.0.1 - Binary: Pathtracer ISPC - Model: Asian Dragon - c3-highcpu-8 SPR: 7.3642 Frames Per Second (more is better), SE +/- 0.0079, N = 3, MIN: 7.33 / MAX: 7.44
Embree 4.0.1 - Binary: Pathtracer ISPC - Model: Crown - c3-highcpu-8 SPR: 5.8475 Frames Per Second (more is better), SE +/- 0.0120, N = 3, MIN: 5.81 / MAX: 5.92
Zstd Compression This test measures the time needed to compress/decompress a sample file (silesia.tar) using Zstd (Zstandard) compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Decompression Speed - c3-highcpu-8 SPR: 907.2 MB/s (more is better), SE +/- 1.48, N = 3. 1. (CC) gcc options: -O3 -pthread -lz -llzma
SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and using a variety of their built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.
SPECFEM3D 4.0 - Model: Water-layered Halfspace - c3-highcpu-8 SPR: 321.63 Seconds (fewer is better), SE +/- 0.31, N = 3
SPECFEM3D 4.0 - Model: Homogeneous Halfspace - c3-highcpu-8 SPR: 179.55 Seconds (fewer is better), SE +/- 0.09, N = 3
SPECFEM3D 4.0 - Model: Tomographic Model - c3-highcpu-8 SPR: 143.94 Seconds (fewer is better), SE +/- 1.65, N = 3
SPECFEM3D 4.0 - Model: Layered Halfspace - c3-highcpu-8 SPR: 372.09 Seconds (fewer is better), SE +/- 0.64, N = 3
SPECFEM3D 4.0 - Model: Mount St. Helens - c3-highcpu-8 SPR: 139.52 Seconds (fewer is better), SE +/- 0.10, N = 3
1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
OpenRadioss OpenRadioss is an open-source, AGPL-licensed finite element solver for dynamic event analysis. It is based on Altair Radioss, which was open-sourced in 2022. This finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/. This test currently uses a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.
OpenRadioss 2022.10.13 - Model: Rubber O-Ring Seal Installation - c3-highcpu-8 SPR: 367.00 Seconds (fewer is better), SE +/- 0.20, N = 3
OpenFOAM OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.
OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Execution Time - c3-highcpu-8 SPR: 422.68 Seconds (fewer is better)
OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Mesh Time - c3-highcpu-8 SPR: 62.04 Seconds (fewer is better)
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lphysicalProperties -lspecie -lfiniteVolume -lfvModels -lgenericPatchFields -lmeshTools -lsampling -lOpenFOAM -ldl -lm
Xcompact3d Incompact3d Xcompact3d/Incompact3d is a Fortran-MPI based, finite-difference, high-performance code for solving the incompressible Navier-Stokes equations and as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.
Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 129 Cells Per Direction - c3-highcpu-8 SPR: 32.39 Seconds (fewer is better), SE +/- 0.02, N = 3. 1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
nekRS nekRS is an open-source Navier-Stokes solver based on the spectral element method. nekRS supports both CPU and GPU/accelerator execution, though this test profile is currently configured for CPU execution. nekRS is part of Nek5000 from the Mathematics and Computer Science (MCS) division at Argonne National Laboratory. This nekRS benchmark is primarily relevant to large core count HPC servers and may otherwise be very time consuming. Learn more via the OpenBenchmarking.org test page.
nekRS 22.0 - Input: TurboPipe Periodic - c3-highcpu-8 SPR: 30667900000 FLOP/s (more is better), SE +/- 92013205.57, N = 3. 1. (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -lmpi_cxx -lmpi
NAMD NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.
NAMD 2.14 - ATPase Simulation - 327,506 Atoms - c3-highcpu-8 SPR: 3.35779 days/ns (fewer is better), SE +/- 0.00254, N = 3
miniBUDE MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1 - c3-highcpu-8 SPR: 7.546 Billion Interactions/s (more is better), SE +/- 0.001, N = 3
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1 - c3-highcpu-8 SPR: 188.65 GFInst/s (more is better), SE +/- 0.03, N = 3
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
Testing initiated at 21 March 2023 03:08 by user michael_larabel.