Google Cloud c3 Sapphire Rapids Benchmarks by Michael Larabel for a future article.

c3-highcpu-8 SPR:

  Processor: Intel Xeon Platinum 8481C (4 Cores / 8 Threads), Motherboard: Google Compute Engine c3-highcpu-8, Chipset: Intel 440FX 82441FX PMC, Memory: 16GB, Disk: 322GB nvme_card-pd, Network: Google Compute Engine Virtual

  OS: Ubuntu 22.10, Kernel: 5.19.0-1015-gcp (x86_64), Vulkan: 1.3.224, Compiler: GCC 12.2.0, File-System: ext4, System Layer: KVM

miniBUDE 20210901
Implementation: OpenMP - Input Deck: BM1
Billion Interactions/s > Higher Is Better
c3-highcpu-8 SPR . 7.546 |=====================================================

OpenSSL 3.1
Algorithm: SHA256
byte/s > Higher Is Better
c3-highcpu-8 SPR . 4283873987 |================================================

OpenSSL 3.1
Algorithm: SHA512
byte/s > Higher Is Better
c3-highcpu-8 SPR . 1568572920 |================================================

OpenSSL 3.1
Algorithm: ChaCha20
byte/s > Higher Is Better
c3-highcpu-8 SPR . 22091557637 |===============================================

OpenSSL 3.1
Algorithm: AES-128-GCM
byte/s > Higher Is Better
c3-highcpu-8 SPR . 57594077823 |===============================================

OpenSSL 3.1
Algorithm: AES-256-GCM
byte/s > Higher Is Better
c3-highcpu-8 SPR . 48008361573 |===============================================

OpenSSL 3.1
Algorithm: ChaCha20-Poly1305
byte/s > Higher Is Better
c3-highcpu-8 SPR . 15970781140 |===============================================

nekRS 22.0
Input: TurboPipe Periodic
FLOP/s > Higher Is Better
c3-highcpu-8 SPR . 30667900000 |===============================================

Embree 4.0.1
Binary: Pathtracer ISPC - Model: Crown
Frames Per Second > Higher Is Better
c3-highcpu-8 SPR . 5.8475 |====================================================

Embree 4.0.1
Binary: Pathtracer ISPC - Model: Asian Dragon
Frames Per Second > Higher Is Better
c3-highcpu-8 SPR . 7.3642 |====================================================

uvg266 0.4.1
Video Input: Bosphorus 4K - Video Preset: Very Fast
Frames Per Second > Higher Is Better
c3-highcpu-8 SPR . 6.99 |======================================================

uvg266 0.4.1
Video Input: Bosphorus 4K - Video Preset: Super Fast
Frames Per Second > Higher Is Better
c3-highcpu-8 SPR . 7.48 |======================================================

uvg266 0.4.1
Video Input: Bosphorus 4K - Video Preset: Ultra Fast
Frames Per Second > Higher Is Better
c3-highcpu-8 SPR . 9.12 |======================================================

uvg266 0.4.1
Video Input: Bosphorus 1080p - Video Preset: Very Fast
Frames Per Second > Higher Is Better
c3-highcpu-8 SPR . 32.39 |=====================================================

uvg266 0.4.1
Video Input: Bosphorus 1080p - Video Preset: Super Fast
Frames Per Second > Higher Is Better
c3-highcpu-8 SPR . 34.50 |=====================================================

uvg266 0.4.1
Video Input: Bosphorus 1080p - Video Preset: Ultra Fast
Frames Per Second > Higher Is Better
c3-highcpu-8 SPR . 42.24 |=====================================================

VVenC 1.7
Video Input: Bosphorus 4K - Video Preset: Fast
Frames Per Second > Higher Is Better
c3-highcpu-8 SPR . 1.556 |=====================================================

VVenC 1.7
Video Input: Bosphorus 4K - Video Preset: Faster
Frames Per Second > Higher Is Better
c3-highcpu-8 SPR . 3.524 |=====================================================

VVenC 1.7
Video Input: Bosphorus 1080p - Video Preset: Fast
Frames Per Second > Higher Is Better
c3-highcpu-8 SPR . 5.305 |=====================================================

VVenC 1.7
Video Input: Bosphorus 1080p - Video Preset: Faster
Frames Per Second > Higher Is Better
c3-highcpu-8 SPR . 12.92 |=====================================================

miniBUDE 20210901
Implementation: OpenMP - Input Deck: BM1
GFInst/s > Higher Is Better
c3-highcpu-8 SPR . 188.65 |====================================================

Intel Open Image Denoise 1.4.0
Run: RT.hdr_alb_nrm.3840x2160
Images / Sec > Higher Is Better
c3-highcpu-8 SPR . 0.24 |======================================================

Intel Open Image Denoise 1.4.0
Run: RTLightmap.hdr.4096x4096
Images / Sec > Higher Is Better
c3-highcpu-8 SPR . 0.12 |======================================================

TensorFlow 2.10
Device: CPU - Batch Size: 16 - Model: ResNet-50
images/sec > Higher Is Better
c3-highcpu-8 SPR . 14.20 |=====================================================

TensorFlow 2.10
Device: CPU - Batch Size: 32 - Model: ResNet-50
images/sec > Higher Is Better
c3-highcpu-8 SPR . 14.93 |=====================================================

TensorFlow 2.10
Device: CPU - Batch Size: 64 - Model: ResNet-50
images/sec > Higher Is Better
c3-highcpu-8 SPR . 15.69 |=====================================================

OpenVKL 1.3.1
Benchmark: vklBenchmark ISPC
Items / Sec > Higher Is Better
c3-highcpu-8 SPR . 98 |========================================================

Neural Magic DeepSparse 1.3.2
Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
c3-highcpu-8 SPR . 3.7873 |====================================================

Neural Magic DeepSparse 1.3.2
Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
c3-highcpu-8 SPR . 51.89 |=====================================================

Neural Magic DeepSparse 1.3.2
Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
c3-highcpu-8 SPR . 19.17 |=====================================================

Neural Magic DeepSparse 1.3.2
Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
c3-highcpu-8 SPR . 64.87 |=====================================================

Neural Magic DeepSparse 1.3.2
Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
c3-highcpu-8 SPR . 33.11 |=====================================================

Neural Magic DeepSparse 1.3.2
Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
c3-highcpu-8 SPR . 6.5372 |====================================================

Neural Magic DeepSparse 1.3.2
Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
c3-highcpu-8 SPR . 16.22 |=====================================================

Neural Magic DeepSparse 1.3.2
Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream
items/sec > Higher Is Better
c3-highcpu-8 SPR . 3.7693 |====================================================

Zstd Compression 1.5.4
Compression Level: 19 - Compression Speed
MB/s > Higher Is Better
c3-highcpu-8 SPR . 10.3 |======================================================

Zstd Compression 1.5.4
Compression Level: 19 - Decompression Speed
MB/s > Higher Is Better
c3-highcpu-8 SPR . 905.2 |=====================================================

Zstd Compression 1.5.4
Compression Level: 19, Long Mode - Compression Speed
MB/s > Higher Is Better
c3-highcpu-8 SPR . 6.5 |=======================================================

Zstd Compression 1.5.4
Compression Level: 19, Long Mode - Decompression Speed
MB/s > Higher Is Better
c3-highcpu-8 SPR . 907.2 |=====================================================

7-Zip Compression 22.01
Test: Compression Rating
MIPS > Higher Is Better
c3-highcpu-8 SPR . 35306 |=====================================================

7-Zip Compression 22.01
Test: Decompression Rating
MIPS > Higher Is Better
c3-highcpu-8 SPR . 20468 |=====================================================

LeelaChessZero 0.28
Backend: BLAS
Nodes Per Second > Higher Is Better
c3-highcpu-8 SPR . 1272 |======================================================

LeelaChessZero 0.28
Backend: Eigen
Nodes Per Second > Higher Is Better
c3-highcpu-8 SPR . 1221 |======================================================

GROMACS 2023
Implementation: MPI CPU - Input: water_GMX50_bare
Ns Per Day > Higher Is Better
c3-highcpu-8 SPR . 0.777 |=====================================================

CockroachDB 22.2
Workload: MoVR - Concurrency: 128
ops/s > Higher Is Better
c3-highcpu-8 SPR . 458.8 |=====================================================

CockroachDB 22.2
Workload: KV, 50% Reads - Concurrency: 128
ops/s > Higher Is Better
c3-highcpu-8 SPR . 19321.6 |===================================================

CockroachDB 22.2
Workload: KV, 95% Reads - Concurrency: 128
ops/s > Higher Is Better
c3-highcpu-8 SPR . 24960.1 |===================================================

Memcached 1.6.18
Set To Get Ratio: 1:10
Ops/sec > Higher Is Better
c3-highcpu-8 SPR . 1044947.13 |================================================

Memcached 1.6.18
Set To Get Ratio: 1:100
Ops/sec > Higher Is Better
c3-highcpu-8 SPR . 1030937.28 |================================================

MariaDB 11.0.1
Clients: 2048
Queries Per Second > Higher Is Better
c3-highcpu-8 SPR . 332 |=======================================================

MariaDB 11.0.1
Clients: 4096
Queries Per Second > Higher Is Better
c3-highcpu-8 SPR . 317 |=======================================================

John The Ripper 2023.03.14
Test: bcrypt
Real C/S > Higher Is Better
c3-highcpu-8 SPR . 6932 |======================================================

John The Ripper 2023.03.14
Test: WPA PSK
Real C/S > Higher Is Better
c3-highcpu-8 SPR . 28818 |=====================================================

John The Ripper 2023.03.14
Test: Blowfish
Real C/S > Higher Is Better
c3-highcpu-8 SPR . 6930 |======================================================

John The Ripper 2023.03.14
Test: HMAC-SHA512
Real C/S > Higher Is Better
c3-highcpu-8 SPR . 37509000 |==================================================

John The Ripper 2023.03.14
Test: MD5
Real C/S > Higher Is Better
c3-highcpu-8 SPR . 765713 |====================================================

nginx 1.23.2
Connections: 100
Requests Per Second > Higher Is Better
c3-highcpu-8 SPR . 36310.35 |==================================================

nginx 1.23.2
Connections: 200
Requests Per Second > Higher Is Better
c3-highcpu-8 SPR . 35602.10 |==================================================

nginx 1.23.2
Connections: 500
Requests Per Second > Higher Is Better
c3-highcpu-8 SPR . 34672.65 |==================================================

nginx 1.23.2
Connections: 1000
Requests Per Second > Higher Is Better
c3-highcpu-8 SPR . 32118.58 |==================================================

nginx 1.23.2
Connections: 4000
Requests Per Second > Higher Is Better
c3-highcpu-8 SPR . 32814.75 |==================================================

OpenSSL 3.1
Algorithm: RSA4096
sign/s > Higher Is Better
c3-highcpu-8 SPR . 2062.7 |====================================================

PostgreSQL 15
Scaling Factor: 100 - Clients: 800 - Mode: Read Only
TPS > Higher Is Better
c3-highcpu-8 SPR . 311942 |====================================================

PostgreSQL 15
Scaling Factor: 100 - Clients: 1000 - Mode: Read Only
TPS > Higher Is Better
c3-highcpu-8 SPR . 293725 |====================================================

OpenSSL 3.1
Algorithm: RSA4096
verify/s > Higher Is Better
c3-highcpu-8 SPR . 67857.7 |===================================================

BRL-CAD 7.34
VGR Performance Metric
VGR Performance Metric > Higher Is Better
c3-highcpu-8 SPR . 71072 |=====================================================

NAMD 2.14
ATPase Simulation - 327,506 Atoms
days/ns < Lower Is Better
c3-highcpu-8 SPR . 3.35779 |===================================================

oneDNN 3.0
Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
c3-highcpu-8 SPR . 1.50004 |===================================================

oneDNN 3.0
Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
c3-highcpu-8 SPR . 5.34218 |===================================================

oneDNN 3.0
Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
c3-highcpu-8 SPR . 4.17707 |===================================================

oneDNN 3.0
Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
c3-highcpu-8 SPR . 1.47145 |===================================================

oneDNN 3.0
Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
c3-highcpu-8 SPR . 3.54944 |===================================================

oneDNN 3.0
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
c3-highcpu-8 SPR . 4660.70 |===================================================

oneDNN 3.0
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
c3-highcpu-8 SPR . 2337.31 |===================================================

oneDNN 3.0
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU
ms < Lower Is Better
c3-highcpu-8 SPR . 0.968986 |==================================================

OSPRay Studio 0.11
Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer
ms < Lower Is Better
c3-highcpu-8 SPR . 20952 |=====================================================

OSPRay Studio 0.11
Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer
ms < Lower Is Better
c3-highcpu-8 SPR . 25819 |=====================================================

PostgreSQL 15
Scaling Factor: 100 - Clients: 800 - Mode: Read Only - Average Latency
ms < Lower Is Better
c3-highcpu-8 SPR . 2.565 |=====================================================

PostgreSQL 15
Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency
ms < Lower Is Better
c3-highcpu-8 SPR . 3.405 |=====================================================

Google Draco 1.5.6
Model: Lion
ms < Lower Is Better
c3-highcpu-8 SPR . 6250 |======================================================

Google Draco 1.5.6
Model: Church Facade
ms < Lower Is Better
c3-highcpu-8 SPR . 7573 |======================================================

OpenCV 4.7
Test: Core
ms < Lower Is Better
c3-highcpu-8 SPR . 87372 |=====================================================

OpenCV 4.7
Test: Video
ms < Lower Is Better
c3-highcpu-8 SPR . 31654 |=====================================================

OpenCV 4.7
Test: Graph API
ms < Lower Is Better
c3-highcpu-8 SPR . 219931 |====================================================

OpenCV 4.7
Test: Stitching
ms < Lower Is Better
c3-highcpu-8 SPR . 214760 |====================================================

OpenCV 4.7
Test: Image Processing
ms < Lower Is Better
c3-highcpu-8 SPR . 128163 |====================================================

OpenCV 4.7
Test: Object Detection
ms < Lower Is Better
c3-highcpu-8 SPR . 38999 |=====================================================

Neural Magic DeepSparse 1.3.2
Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
c3-highcpu-8 SPR . 528.02 |====================================================

Neural Magic DeepSparse 1.3.2
Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
c3-highcpu-8 SPR . 38.51 |=====================================================

Neural Magic DeepSparse 1.3.2
Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
c3-highcpu-8 SPR . 104.31 |====================================================

Neural Magic DeepSparse 1.3.2
Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
c3-highcpu-8 SPR . 30.80 |=====================================================

Neural Magic DeepSparse 1.3.2
Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
c3-highcpu-8 SPR . 60.39 |=====================================================

Neural Magic DeepSparse 1.3.2
Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
c3-highcpu-8 SPR . 305.90 |====================================================

Neural Magic DeepSparse 1.3.2
Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
c3-highcpu-8 SPR . 123.29 |====================================================

Neural Magic DeepSparse 1.3.2
Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream
ms/batch < Lower Is Better
c3-highcpu-8 SPR . 530.61 |====================================================

Xcompact3d Incompact3d 2021-03-11
Input: input.i3d 129 Cells Per Direction
Seconds < Lower Is Better
c3-highcpu-8 SPR . 32.39 |=====================================================

OpenFOAM 10
Input: drivaerFastback, Small Mesh Size - Mesh Time
Seconds < Lower Is Better
c3-highcpu-8 SPR . 62.04 |=====================================================

OpenFOAM 10
Input: drivaerFastback, Small Mesh Size - Execution Time
Seconds < Lower Is Better
c3-highcpu-8 SPR . 422.68 |====================================================

OpenRadioss 2022.10.13
Model: Bumper Beam
Seconds < Lower Is Better
c3-highcpu-8 SPR . 303.38 |====================================================

OpenRadioss 2022.10.13
Model: Cell Phone Drop Test
Seconds < Lower Is Better
c3-highcpu-8 SPR . 219.73 |====================================================

OpenRadioss 2022.10.13
Model: Bird Strike on Windshield
Seconds < Lower Is Better
c3-highcpu-8 SPR . 595.14 |====================================================

OpenRadioss 2022.10.13
Model: Rubber O-Ring Seal Installation
Seconds < Lower Is Better
c3-highcpu-8 SPR . 367.00 |====================================================

SPECFEM3D 4.0
Model: Mount St. Helens
Seconds < Lower Is Better
c3-highcpu-8 SPR . 139.52 |====================================================

SPECFEM3D 4.0
Model: Layered Halfspace
Seconds < Lower Is Better
c3-highcpu-8 SPR . 372.09 |====================================================

SPECFEM3D 4.0
Model: Tomographic Model
Seconds < Lower Is Better
c3-highcpu-8 SPR . 143.94 |====================================================

SPECFEM3D 4.0
Model: Homogeneous Halfspace
Seconds < Lower Is Better
c3-highcpu-8 SPR . 179.55 |====================================================

SPECFEM3D 4.0
Model: Water-layered Halfspace
Seconds < Lower Is Better
c3-highcpu-8 SPR . 321.63 |====================================================

Timed FFmpeg Compilation 6.0
Time To Compile
Seconds < Lower Is Better
c3-highcpu-8 SPR . 120.44 |====================================================

Timed Linux Kernel Compilation 6.1
Build: defconfig
Seconds < Lower Is Better
c3-highcpu-8 SPR . 244.80 |====================================================

Blender 3.4
Blend File: BMW27 - Compute: CPU-Only
Seconds < Lower Is Better
c3-highcpu-8 SPR . 315.24 |====================================================
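
The PostgreSQL 15 read-only results in this listing report both throughput (TPS) and average latency (ms) for the same client counts. As a rough cross-check, the two should be related by Little's law: throughput is approximately the number of concurrent clients divided by the average latency. The sketch below is an illustration only. It assumes that the TPS and latency figures come from the same runs and that every client stays busy for the whole run; the function names `implied_tps` and `relative_gap` are made up for this example and are not part of any benchmark tool.

```python
# Reported PostgreSQL 15 (Scaling Factor: 100, Read Only) figures copied from
# the listing: (clients, reported TPS, reported average latency in ms).
REPORTED = [
    (800, 311942, 2.565),
    (1000, 293725, 3.405),
]


def implied_tps(clients: int, latency_ms: float) -> float:
    """Little's law: throughput = concurrency / average latency (in seconds).

    Assumes all clients are continuously issuing transactions.
    """
    return clients / (latency_ms / 1000.0)


def relative_gap(clients: int, tps: float, latency_ms: float) -> float:
    """Relative difference between the implied and the reported TPS."""
    return abs(implied_tps(clients, latency_ms) - tps) / tps


if __name__ == "__main__":
    for clients, tps, latency_ms in REPORTED:
        estimate = implied_tps(clients, latency_ms)
        gap = relative_gap(clients, tps, latency_ms)
        print(f"{clients} clients: reported {tps} TPS, "
              f"implied {estimate:.0f} TPS, gap {gap:.3%}")
```

A small gap suggests the two metrics describe the same run; a large one would point to rounding in the latency figure or to clients that were idle for part of the run.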