Benchmarks by Michael Larabel for a future article.
c3-highcpu-8 SPR Processor: Intel Xeon Platinum 8481C (4 Cores / 8 Threads), Motherboard: Google Compute Engine c3-highcpu-8, Chipset: Intel 440FX 82441FX PMC, Memory: 16GB, Disk: 322GB nvme_card-pd, Network: Google Compute Engine Virtual
OS: Ubuntu 22.10, Kernel: 5.19.0-1015-gcp (x86_64), Vulkan: 1.3.224, Compiler: GCC 12.2.0, File-System: ext4, System Layer: KVM
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: CPU Microcode: 0xffffffff
Python Notes: Python 3.10.7
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
Google Cloud c3 Sapphire Rapids
Test = c3-highcpu-8 SPR
compress-7zip: Compression Rating = 35306
compress-7zip: Decompression Rating = 20468
blender: BMW27 - CPU-Only = 315.24
brl-cad: VGR Performance Metric = 71072
cockroach: MoVR - 128 = 458.8
cockroach: KV, 50% Reads - 128 = 19321.6
cockroach: KV, 95% Reads - 128 = 24960.1
embree: Pathtracer ISPC - Crown = 5.8475
embree: Pathtracer ISPC - Asian Dragon = 7.3642
draco: Lion = 6250
draco: Church Facade = 7573
gromacs: MPI CPU - water_GMX50_bare = 0.777
oidn: RT.hdr_alb_nrm.3840x2160 = 0.24
oidn: RTLightmap.hdr.4096x4096 = 0.12
john-the-ripper: bcrypt = 6932
john-the-ripper: WPA PSK = 28818
john-the-ripper: Blowfish = 6930
john-the-ripper: HMAC-SHA512 = 37509000
john-the-ripper: MD5 = 765713
lczero: BLAS = 1272
lczero: Eigen = 1221
mysqlslap: 2048 = 332
mysqlslap: 4096 = 317
memcached: 1:10 = 1044947.13
memcached: 1:100 = 1030937.28
minibude: OpenMP - BM1 = 188.651
minibude: OpenMP - BM1 = 7.546
namd: ATPase Simulation - 327,506 Atoms = 3.35779
nekrs: TurboPipe Periodic = 30667900000
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream = 3.7873
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream = 528.0193
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream = 51.8926
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream = 38.5120
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream = 19.1677
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream = 104.3102
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream = 64.8747
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream = 30.7952
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream = 33.1064
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream = 60.3858
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream = 6.5372
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream = 305.8991
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream = 16.2194
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream = 123.2931
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream = 3.7693
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream = 530.6100
nginx: 100 = 36310.35
nginx: 200 = 35602.10
nginx: 500 = 34672.65
nginx: 1000 = 32118.58
nginx: 4000 = 32814.75
onednn: IP Shapes 1D - bf16bf16bf16 - CPU = 1.50004
onednn: IP Shapes 3D - bf16bf16bf16 - CPU = 5.34218
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU = 4.17707
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU = 1.47145
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU = 3.54944
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU = 4660.70
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU = 2337.31
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU = 0.968986
opencv: Core = 87372
opencv: Video = 31654
opencv: Graph API = 219931
opencv: Stitching = 214760
opencv: Image Processing = 128163
opencv: Object Detection = 38999
openfoam: drivaerFastback, Small Mesh Size - Mesh Time = 62.044277
openfoam: drivaerFastback, Small Mesh Size - Execution Time = 422.67836
openradioss: Bumper Beam = 303.38
openradioss: Cell Phone Drop Test = 219.73
openradioss: Bird Strike on Windshield = 595.14
openradioss: Rubber O-Ring Seal Installation = 367.00
openssl: SHA256 = 4283873987
openssl: SHA512 = 1568572920
openssl: RSA4096 = 2062.7
openssl: RSA4096 = 67857.7
openssl: ChaCha20 = 22091557637
openssl: AES-128-GCM = 57594077823
openssl: AES-256-GCM = 48008361573
openssl: ChaCha20-Poly1305 = 15970781140
openvkl: vklBenchmark ISPC = 98
ospray-studio: 1 - 4K - 1 - Path Tracer = 20952
ospray-studio: 3 - 4K - 1 - Path Tracer = 25819
pgbench: 100 - 800 - Read Only = 311942
pgbench: 100 - 800 - Read Only - Average Latency = 2.565
pgbench: 100 - 1000 - Read Only = 293725
pgbench: 100 - 1000 - Read Only - Average Latency = 3.405
specfem3d: Mount St. Helens = 139.524469614
specfem3d: Layered Halfspace = 372.089480857
specfem3d: Tomographic Model = 143.937022875
specfem3d: Homogeneous Halfspace = 179.545901627
specfem3d: Water-layered Halfspace = 321.625251668
tensorflow: CPU - 16 - ResNet-50 = 14.20
tensorflow: CPU - 32 - ResNet-50 = 14.93
tensorflow: CPU - 64 - ResNet-50 = 15.69
build-ffmpeg: Time To Compile = 120.438
build-linux-kernel: defconfig = 244.800
uvg266: Bosphorus 4K - Very Fast = 6.99
uvg266: Bosphorus 4K - Super Fast = 7.48
uvg266: Bosphorus 4K - Ultra Fast = 9.12
uvg266: Bosphorus 1080p - Very Fast = 32.39
uvg266: Bosphorus 1080p - Super Fast = 34.50
uvg266: Bosphorus 1080p - Ultra Fast = 42.24
incompact3d: input.i3d 129 Cells Per Direction = 32.3943163
compress-zstd: 19 - Compression Speed = 10.3
compress-zstd: 19 - Decompression Speed = 905.2
compress-zstd: 19, Long Mode - Compression Speed = 6.5
compress-zstd: 19, Long Mode - Decompression Speed = 907.2
Blender
Blender is an open-source 3D creation and modeling software project. This test is of Blender's Cycles performance with various sample files. GPU computing via NVIDIA OptiX and NVIDIA CUDA is currently supported as well as HIP for AMD Radeon GPUs and Intel oneAPI for Intel Graphics. Learn more via the OpenBenchmarking.org test page.
Blender 3.4, Blend File: BMW27 - Compute: CPU-Only. Seconds, Fewer Is Better. c3-highcpu-8 SPR: 315.24 (SE +/- 1.13, N = 3)
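The SE +/- and N figures attached to each result are easier to judge with a little arithmetic. The Python sketch below expresses the standard error as a percentage of the mean and backs out an implied sample standard deviation. It assumes the reported SE is the usual standard error of the mean, s / sqrt(N), which this page does not state explicitly. The function names are illustrative and are not part of the Phoronix Test Suite.

```python
import math


def relative_standard_error(mean: float, se: float) -> float:
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


def implied_sample_stddev(se: float, n: int) -> float:
    """Back out the sample standard deviation, ASSUMING SE = s / sqrt(N)."""
    return se * math.sqrt(n)


# Values copied from the Blender BMW27 result above.
blender_mean, blender_se, blender_n = 315.24, 1.13, 3

print(f"relative SE: {relative_standard_error(blender_mean, blender_se):.2f} %")
print(f"implied stddev: {implied_sample_stddev(blender_se, blender_n):.2f} s")
```

The same two helpers apply to any result on this page that reports both SE and N; a larger relative SE, or a run count above the default of 3, suggests the test suite saw more run-to-run variation.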
BRL-CAD
BRL-CAD is a cross-platform, open-source solid modeling system with a built-in benchmark mode. Learn more via the OpenBenchmarking.org test page.
BRL-CAD 7.34, VGR Performance Metric. VGR Performance Metric, More Is Better. c3-highcpu-8 SPR: 71072
(CXX) g++ options: -std=c++14 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lregex_brl -lz_brl -lnetpbm -ldl -lm -ltk8.6
CockroachDB
CockroachDB is a cloud-native, distributed SQL database for data-intensive applications. This test profile uses a server-less CockroachDB configuration to test various Cockroach workloads on the local host with a single node. Learn more via the OpenBenchmarking.org test page.
CockroachDB 22.2, Workload: MoVR - Concurrency: 128. ops/s, More Is Better. c3-highcpu-8 SPR: 458.8 (SE +/- 5.62, N = 15)
Embree
Embree 4.0.1, Binary: Pathtracer ISPC - Model: Crown. Frames Per Second, More Is Better. c3-highcpu-8 SPR: 5.8475 (SE +/- 0.0120, N = 3, MIN: 5.81 / MAX: 5.92)
Embree 4.0.1, Binary: Pathtracer ISPC - Model: Asian Dragon. Frames Per Second, More Is Better. c3-highcpu-8 SPR: 7.3642 (SE +/- 0.0079, N = 3, MIN: 7.33 / MAX: 7.44)
Google Draco
Draco is a library developed by Google for compressing/decompressing 3D geometric meshes and point clouds. This test profile uses some Artec3D PLY models as the sample 3D model input formats for Draco compression/decompression. Learn more via the OpenBenchmarking.org test page.
Google Draco 1.5.6, Model: Lion. ms, Fewer Is Better. c3-highcpu-8 SPR: 6250 (SE +/- 80.23, N = 15)
(CXX) g++ options: -O3
GROMACS
The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package, tested with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.
GROMACS 2023, Implementation: MPI CPU - Input: water_GMX50_bare. Ns Per Day, More Is Better. c3-highcpu-8 SPR: 0.777 (SE +/- 0.001, N = 3)
(CXX) g++ options: -O3
MariaDB 11.0.1, Clients: 4096. Queries Per Second, More Is Better. c3-highcpu-8 SPR: 317 (SE +/- 2.62, N = 3)
(CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lcrypt -lz -lm -lssl -lcrypto -lpthread -ldl
Memcached
Memcached is a high-performance, distributed memory object caching system. This Memcached test profile makes use of memtier_benchmark for executing this CPU/memory-focused server benchmark. Learn more via the OpenBenchmarking.org test page.
Memcached 1.6.18, Set To Get Ratio: 1:10. Ops/sec, More Is Better. c3-highcpu-8 SPR: 1044947.13 (SE +/- 3280.78, N = 3)
Memcached 1.6.18, Set To Get Ratio: 1:100. Ops/sec, More Is Better. c3-highcpu-8 SPR: 1030937.28 (SE +/- 11157.73, N = 3)
(CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
miniBUDE
MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1. GFInst/s, More Is Better. c3-highcpu-8 SPR: 188.65 (SE +/- 0.03, N = 3)
miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1. Billion Interactions/s, More Is Better. c3-highcpu-8 SPR: 7.546 (SE +/- 0.001, N = 3)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
NAMD
NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.
NAMD 2.14, ATPase Simulation - 327,506 Atoms. days/ns, Fewer Is Better. c3-highcpu-8 SPR: 3.35779 (SE +/- 0.00254, N = 3)
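NAMD reports days/ns (fewer is better), whereas the GROMACS result earlier on this page is in ns per day (more is better). The Python sketch below simply inverts the NAMD figure so it can be read in the other unit, and expresses it as wall-clock hours per simulated nanosecond. The function name is illustrative. The two tests simulate different systems, so this is only a change of unit and does not make the NAMD and GROMACS numbers comparable.

```python
def days_per_ns_to_ns_per_day(days_per_ns: float) -> float:
    """Invert a days/ns figure to obtain ns/day."""
    if days_per_ns <= 0:
        raise ValueError("days_per_ns must be positive")
    return 1.0 / days_per_ns


# Value copied from the NAMD ATPase result above.
namd_days_per_ns = 3.35779

ns_per_day = days_per_ns_to_ns_per_day(namd_days_per_ns)
hours_per_ns = namd_days_per_ns * 24.0  # wall-clock hours per simulated ns

print(f"{ns_per_day:.4f} ns/day")
print(f"{hours_per_ns:.1f} hours per simulated ns")
```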
nekRS
nekRS is an open-source Navier Stokes solver based on the spectral element method. NekRS supports both CPU and GPU/accelerator execution, though this test profile is currently configured for CPU execution. NekRS is part of Nek5000 of the Mathematics and Computer Science (MCS) at Argonne National Laboratory. This nekRS benchmark is primarily relevant to large core count HPC servers and otherwise may be very time consuming. Learn more via the OpenBenchmarking.org test page.
nekRS 22.0, Input: TurboPipe Periodic. FLOP/s, More Is Better. c3-highcpu-8 SPR: 30667900000 (SE +/- 92013205.57, N = 3)
(CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -lmpi_cxx -lmpi
nginx
This is a benchmark of the lightweight Nginx HTTP(S) web server. This Nginx web server benchmark test profile makes use of the wrk program for issuing HTTP requests over a fixed period of time with a configurable number of concurrent clients/connections. HTTPS with a self-signed OpenSSL certificate is used by this test for local benchmarking. Learn more via the OpenBenchmarking.org test page.
nginx 1.23.2, Connections: 100. Requests Per Second, More Is Better. c3-highcpu-8 SPR: 36310.35 (SE +/- 23.95, N = 3)
nginx 1.23.2, Connections: 200. Requests Per Second, More Is Better. c3-highcpu-8 SPR: 35602.10 (SE +/- 83.52, N = 3)
nginx 1.23.2, Connections: 500. Requests Per Second, More Is Better. c3-highcpu-8 SPR: 34672.65 (SE +/- 321.85, N = 3)
nginx 1.23.2, Connections: 1000. Requests Per Second, More Is Better. c3-highcpu-8 SPR: 32118.58 (SE +/- 22.42, N = 3)
nginx 1.23.2, Connections: 4000. Requests Per Second, More Is Better. c3-highcpu-8 SPR: 32814.75 (SE +/- 27.93, N = 3)
(CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
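To see how throughput moves as the connection count rises, the five nginx results above can be compared against the 100-connection run. The Python sketch below is a minimal illustration of that arithmetic; the helper name and the choice of the 100-connection result as the baseline are assumptions made for this example, not something the test profile prescribes.

```python
def percent_change_vs_baseline(results: dict[int, float], baseline: int) -> dict[int, float]:
    """Return the percentage change of every result relative to results[baseline]."""
    base = results[baseline]
    return {key: 100.0 * (value - base) / base for key, value in results.items()}


# Requests per second by connection count, copied from the nginx results above.
nginx_rps = {
    100: 36310.35,
    200: 35602.10,
    500: 34672.65,
    1000: 32118.58,
    4000: 32814.75,
}

changes = percent_change_vs_baseline(nginx_rps, baseline=100)
for connections, pct in changes.items():
    print(f"{connections:>5} connections: {pct:+.1f} % vs 100 connections")
```

Given the small standard errors reported for most of these runs, differences of several percent between connection counts are likely to reflect the workload rather than run-to-run noise, though only three runs were taken per configuration.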
oneDNN
This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
oneDNN 3.0, Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU. ms, Fewer Is Better. c3-highcpu-8 SPR: 1.50004 (SE +/- 0.00340, N = 3, MIN: 1.37)
oneDNN 3.0, Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU. ms, Fewer Is Better. c3-highcpu-8 SPR: 5.34218 (SE +/- 0.00535, N = 3, MIN: 4.94)
oneDNN 3.0, Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU. ms, Fewer Is Better. c3-highcpu-8 SPR: 4.17707 (SE +/- 0.00870, N = 3, MIN: 4.04)
oneDNN 3.0, Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU. ms, Fewer Is Better. c3-highcpu-8 SPR: 1.47145 (SE +/- 0.00472, N = 3, MIN: 1.4)
oneDNN 3.0, Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU. ms, Fewer Is Better. c3-highcpu-8 SPR: 3.54944 (SE +/- 0.01072, N = 3, MIN: 3.47)
oneDNN 3.0, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU. ms, Fewer Is Better. c3-highcpu-8 SPR: 4660.70 (SE +/- 0.86, N = 3, MIN: 4648.25)
oneDNN 3.0, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU. ms, Fewer Is Better. c3-highcpu-8 SPR: 2337.31 (SE +/- 1.76, N = 3, MIN: 2326.77)
oneDNN 3.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU. ms, Fewer Is Better. c3-highcpu-8 SPR: 0.968986 (SE +/- 0.013114, N = 3, MIN: 0.92)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OpenCV
This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.
OpenCV 4.7, Test: Core. ms, Fewer Is Better. c3-highcpu-8 SPR: 87372 (SE +/- 280.31, N = 3)
OpenCV 4.7, Test: Video. ms, Fewer Is Better. c3-highcpu-8 SPR: 31654 (SE +/- 198.80, N = 3)
OpenCV 4.7, Test: Graph API. ms, Fewer Is Better. c3-highcpu-8 SPR: 219931 (SE +/- 931.36, N = 3)
OpenCV 4.7, Test: Stitching. ms, Fewer Is Better. c3-highcpu-8 SPR: 214760 (SE +/- 1973.06, N = 7)
OpenCV 4.7, Test: Image Processing. ms, Fewer Is Better. c3-highcpu-8 SPR: 128163 (SE +/- 1624.35, N = 12)
OpenCV 4.7, Test: Object Detection. ms, Fewer Is Better. c3-highcpu-8 SPR: 38999 (SE +/- 384.74, N = 5)
(CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
OpenFOAM
OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.
OpenFOAM 10, Input: drivaerFastback, Small Mesh Size - Mesh Time. Seconds, Fewer Is Better. c3-highcpu-8 SPR: 62.04
OpenFOAM 10, Input: drivaerFastback, Small Mesh Size - Execution Time. Seconds, Fewer Is Better. c3-highcpu-8 SPR: 422.68
(CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lphysicalProperties -lspecie -lfiniteVolume -lfvModels -lgenericPatchFields -lmeshTools -lsampling -lOpenFOAM -ldl -lm
OpenRadioss
OpenRadioss is an open-source AGPL-licensed finite element solver for dynamic event analysis. OpenRadioss is based on Altair Radioss and was open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/. This test is currently using a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.
OpenRadioss 2022.10.13, Model: Bumper Beam. Seconds, Fewer Is Better. c3-highcpu-8 SPR: 303.38 (SE +/- 0.51, N = 3)
OpenSSL
OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test profile makes use of the built-in "openssl speed" benchmarking capabilities. Learn more via the OpenBenchmarking.org test page.
OpenSSL 3.1, Algorithm: SHA256. byte/s, More Is Better. c3-highcpu-8 SPR: 4283873987 (SE +/- 2722265.85, N = 3)
OpenSSL 3.1, Algorithm: SHA512. byte/s, More Is Better. c3-highcpu-8 SPR: 1568572920 (SE +/- 1152941.98, N = 3)
OpenSSL 3.1, Algorithm: RSA4096. sign/s, More Is Better. c3-highcpu-8 SPR: 2062.7 (SE +/- 1.17, N = 3)
OpenSSL 3.1, Algorithm: RSA4096. verify/s, More Is Better. c3-highcpu-8 SPR: 67857.7 (SE +/- 8.53, N = 3)
OpenSSL 3.1, Algorithm: ChaCha20. byte/s, More Is Better. c3-highcpu-8 SPR: 22091557637 (SE +/- 35781789.46, N = 3)
OpenSSL 3.1, Algorithm: AES-128-GCM. byte/s, More Is Better. c3-highcpu-8 SPR: 57594077823 (SE +/- 33921495.47, N = 3)
OpenSSL 3.1, Algorithm: AES-256-GCM. byte/s, More Is Better. c3-highcpu-8 SPR: 48008361573 (SE +/- 48074910.93, N = 3)
OpenSSL 3.1, Algorithm: ChaCha20-Poly1305. byte/s, More Is Better. c3-highcpu-8 SPR: 15970781140 (SE +/- 21990039.72, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
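The byte/s figures above run to eleven digits, which makes them hard to compare at a glance. The Python sketch below converts them to GB/s and computes the AES-128-GCM to AES-256-GCM ratio. It assumes decimal gigabytes (1 GB = 10^9 bytes); the helper name is illustrative.

```python
BYTES_PER_GB = 1_000_000_000  # decimal gigabyte, an assumption of this sketch


def bytes_per_sec_to_gb_per_sec(value: float) -> float:
    """Convert a byte/s throughput figure to GB/s (10^9 bytes per second)."""
    return value / BYTES_PER_GB


# byte/s results copied from the OpenSSL results above.
openssl_bytes_per_sec = {
    "SHA256": 4283873987,
    "SHA512": 1568572920,
    "ChaCha20": 22091557637,
    "AES-128-GCM": 57594077823,
    "AES-256-GCM": 48008361573,
    "ChaCha20-Poly1305": 15970781140,
}

for algorithm, rate in openssl_bytes_per_sec.items():
    print(f"{algorithm:<18} {bytes_per_sec_to_gb_per_sec(rate):6.2f} GB/s")

aes_ratio = openssl_bytes_per_sec["AES-128-GCM"] / openssl_bytes_per_sec["AES-256-GCM"]
print(f"AES-128-GCM / AES-256-GCM throughput ratio: {aes_ratio:.2f}")
```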
OSPRay Studio
Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
OSPRay Studio 0.11, Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer. ms, Fewer Is Better. c3-highcpu-8 SPR: 20952 (SE +/- 4.91, N = 3)
OSPRay Studio 0.11, Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer. ms, Fewer Is Better. c3-highcpu-8 SPR: 25819 (SE +/- 364.36, N = 3)
(CXX) g++ options: -O3 -lm -ldl
PostgreSQL 15, Scaling Factor: 100 - Clients: 800 - Mode: Read Only - Average Latency. ms, Fewer Is Better. c3-highcpu-8 SPR: 2.565 (SE +/- 0.028, N = 3)
PostgreSQL 15, Scaling Factor: 100 - Clients: 1000 - Mode: Read Only. TPS, More Is Better. c3-highcpu-8 SPR: 293725 (SE +/- 2414.33, N = 3)
PostgreSQL 15, Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency. ms, Fewer Is Better. c3-highcpu-8 SPR: 3.405 (SE +/- 0.028, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
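The PostgreSQL TPS and average-latency figures are two views of the same runs. As a rough cross-check, the Python sketch below derives the average latency from the client count and the TPS using Little's law (latency = clients / throughput). It assumes a closed-loop load in which every client keeps exactly one query in flight, which this page does not confirm; the function name is illustrative. The 800-client TPS (311942) is taken from the results overview near the top of the page.

```python
def average_latency_ms(clients: int, tps: float) -> float:
    """Estimate average latency in ms via Little's law: clients / throughput.

    ASSUMES each client has exactly one transaction in flight at all times.
    """
    if tps <= 0:
        raise ValueError("tps must be positive")
    return 1000.0 * clients / tps


# (clients, TPS, reported average latency in ms) from the pgbench results.
pgbench_runs = [
    (800, 311942.0, 2.565),
    (1000, 293725.0, 3.405),
]

for clients, tps, reported_ms in pgbench_runs:
    estimate = average_latency_ms(clients, tps)
    print(f"{clients} clients: estimated {estimate:.3f} ms, reported {reported_ms} ms")
```

Under that assumption the estimates agree with the reported latencies to three decimal places, which suggests the latency results carry no information beyond the TPS results for these read-only runs.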
SPECFEM3D simulates acoustic (fluid), elastic (solid), coupled acoustic/elastic, poroelastic or seismic wave propagation in any type of conforming mesh of hexahedra. This test profile currently relies on CPU-based execution for SPECFEM3D and using a variety of their built-in examples/models for benchmarking. Learn more via the OpenBenchmarking.org test page.
SPECFEM3D 4.0, Model: Mount St. Helens. Seconds, Fewer Is Better. c3-highcpu-8 SPR: 139.52 (SE +/- 0.10, N = 3)
SPECFEM3D 4.0, Model: Layered Halfspace. Seconds, Fewer Is Better. c3-highcpu-8 SPR: 372.09 (SE +/- 0.64, N = 3)
SPECFEM3D 4.0, Model: Tomographic Model. Seconds, Fewer Is Better. c3-highcpu-8 SPR: 143.94 (SE +/- 1.65, N = 3)
SPECFEM3D 4.0, Model: Homogeneous Halfspace. Seconds, Fewer Is Better. c3-highcpu-8 SPR: 179.55 (SE +/- 0.09, N = 3)
SPECFEM3D 4.0, Model: Water-layered Halfspace. Seconds, Fewer Is Better. c3-highcpu-8 SPR: 321.63 (SE +/- 0.31, N = 3)
(F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
TensorFlow
This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries too. Learn more via the OpenBenchmarking.org test page.
TensorFlow 2.10, Device: CPU - Batch Size: 16 - Model: ResNet-50. images/sec, More Is Better. c3-highcpu-8 SPR: 14.20 (SE +/- 0.04, N = 3)
uvg266 0.4.1, Video Input: Bosphorus 4K - Video Preset: Super Fast. Frames Per Second, More Is Better. c3-highcpu-8 SPR: 7.48 (SE +/- 0.00, N = 3)
uvg266 0.4.1, Video Input: Bosphorus 4K - Video Preset: Ultra Fast. Frames Per Second, More Is Better. c3-highcpu-8 SPR: 9.12 (SE +/- 0.01, N = 3)
uvg266 0.4.1, Video Input: Bosphorus 1080p - Video Preset: Very Fast. Frames Per Second, More Is Better. c3-highcpu-8 SPR: 32.39 (SE +/- 0.25, N = 3)
uvg266 0.4.1, Video Input: Bosphorus 1080p - Video Preset: Super Fast. Frames Per Second, More Is Better. c3-highcpu-8 SPR: 34.50 (SE +/- 0.03, N = 3)
uvg266 0.4.1, Video Input: Bosphorus 1080p - Video Preset: Ultra Fast. Frames Per Second, More Is Better. c3-highcpu-8 SPR: 42.24 (SE +/- 0.01, N = 3)
VVenC
VVenC 1.7, Video Input: Bosphorus 4K - Video Preset: Fast. Frames Per Second, More Is Better. c3-highcpu-8 SPR: 1.556 (SE +/- 0.005, N = 3)
VVenC 1.7, Video Input: Bosphorus 4K - Video Preset: Faster. Frames Per Second, More Is Better. c3-highcpu-8 SPR: 3.524 (SE +/- 0.000, N = 3)
VVenC 1.7, Video Input: Bosphorus 1080p - Video Preset: Fast. Frames Per Second, More Is Better. c3-highcpu-8 SPR: 5.305 (SE +/- 0.018, N = 3)
VVenC 1.7, Video Input: Bosphorus 1080p - Video Preset: Faster. Frames Per Second, More Is Better. c3-highcpu-8 SPR: 12.92 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
Xcompact3d Incompact3d
Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equation and as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.
Xcompact3d Incompact3d 2021-03-11, Input: input.i3d 129 Cells Per Direction. Seconds, Fewer Is Better. c3-highcpu-8 SPR: 32.39 (SE +/- 0.02, N = 3)
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Zstd Compression 1.5.4, Compression Level: 19, Long Mode - Decompression Speed. MB/s, More Is Better. c3-highcpu-8 SPR: 907.2 (SE +/- 1.48, N = 3)
(CC) gcc options: -O3 -pthread -lz -llzma
Testing initiated at 21 March 2023 03:08 by user michael_larabel.