Benchmarks by Michael Larabel for a future article.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2303226-NE-2303218PT51
HTML result view exported from: https://openbenchmarking.org/result/2303226-NE-2303218PT51&gru&rdt .
Google Cloud c3 Sapphire Rapids

System details

c3-highcpu-8 SPR
  Processor: Intel Xeon Platinum 8481C (4 Cores / 8 Threads)
  Motherboard: Google Compute Engine c3-highcpu-8
  Chipset: Intel 440FX 82441FX PMC
  Memory: 16GB
  Disk: 322GB nvme_card-pd
  Network: Google Compute Engine Virtual
  OS: Ubuntu 22.10
  Kernel: 5.19.0-1015-gcp (x86_64)
  Vulkan: 1.3.224
  Compiler: GCC 12.2.0
  File-System: ext4
  System Layer: KVM

c2-standard-8 CLX (only the fields that differ are listed)
  Processor: Intel Xeon (4 Cores / 8 Threads)
  Motherboard: Google Compute Engine c2-standard-8
  Memory: 32GB
  Disk: 322GB PersistentDisk
  Network: Red Hat Virtio device

Kernel Details - Transparent Huge Pages: madvise

Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details - CPU Microcode: 0xffffffff

Python Details - Python 3.10.7

Security Details
  c3-highcpu-8 SPR: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
  c2-standard-8 CLX: itlb_multihit: Not affected + l1tf: Not affected + mds: Mitigation of Clear buffers; SMT Host state unknown + meltdown: Not affected + mmio_stale_data: Vulnerable: Clear buffers attempted no microcode; SMT Host state unknown + retbleed: Mitigation of Enhanced IBRS + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Mitigation of Clear buffers; SMT Host state unknown
Google Cloud c3 Sapphire Rapids: result overview

Test | c3-highcpu-8 SPR | c2-standard-8 CLX
minibude: OpenMP - BM1 | 7.546 | 6.112
openssl: SHA256 | 4283873987 | 1318193530
openssl: SHA512 | 1568572920 | 1465815227
openssl: ChaCha20 | 22091557637 | 21346811813
openssl: AES-128-GCM | 57594077823 | 23237312603
openssl: AES-256-GCM | 48008361573 | 16945000630
openssl: ChaCha20-Poly1305 | 15970781140 | 10919151843
nekrs: TurboPipe Periodic | 30667900000 | 25100766667
embree: Pathtracer ISPC - Crown | 5.8475 | 3.9340
embree: Pathtracer ISPC - Asian Dragon | 7.3642 | 5.1542
uvg266: Bosphorus 4K - Very Fast | 6.99 | 5.69
uvg266: Bosphorus 4K - Super Fast | 7.48 | 6.03
uvg266: Bosphorus 4K - Ultra Fast | 9.12 | 7.45
uvg266: Bosphorus 1080p - Very Fast | 32.39 | 26.30
uvg266: Bosphorus 1080p - Super Fast | 34.50 | 27.72
uvg266: Bosphorus 1080p - Ultra Fast | 42.24 | 34.47
minibude: OpenMP - BM1 | 188.651 | 152.798
oidn: RT.hdr_alb_nrm.3840x2160 | 0.24 | 0.22
oidn: RTLightmap.hdr.4096x4096 | 0.12 | 0.11
tensorflow: CPU - 16 - ResNet-50 | 14.20 | 13.33
tensorflow: CPU - 32 - ResNet-50 | 14.93 | 14.11
tensorflow: CPU - 64 - ResNet-50 | 15.69 | 14.79
openvkl: vklBenchmark ISPC | 98 | 70
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream | 3.7873 | 2.9308
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream | 51.8926 | 62.5073
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream | 19.1677 | 15.9179
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream | 64.8747 | 59.9914
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream | 33.1064 | 29.1783
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream | 6.5372 | 6.1893
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream | 16.2194 | 13.9932
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream | 3.7693 | 2.9352
compress-zstd: 19 - Compression Speed | 10.3 | 9.24
compress-zstd: 19 - Decompression Speed | 905.2 | 701.1
compress-zstd: 19, Long Mode - Compression Speed | 6.5 | 5.86
compress-zstd: 19, Long Mode - Decompression Speed | 907.2 | 713.8
compress-7zip: Compression Rating | 35306 | 30989
compress-7zip: Decompression Rating | 20468 | 22852
lczero: BLAS | 1272 | 909
lczero: Eigen | 1221 | 902
gromacs: MPI CPU - water_GMX50_bare | 0.777 | 0.579
cockroach: MoVR - 128 | 458.8 | 534.9
cockroach: KV, 50% Reads - 128 | 19321.6 | 13684.5
cockroach: KV, 95% Reads - 128 | 24960.1 | 16184.2
memcached: 1:10 | 1044947.13 | 715723.53
memcached: 1:100 | 1030937.28 | 702291.93
mysqlslap: 2048 | 332 | 248
mysqlslap: 4096 | 317 | 237
john-the-ripper: bcrypt | 6932 | 6687
john-the-ripper: WPA PSK | 28818 | 31278
john-the-ripper: Blowfish | 6930 | 6684
john-the-ripper: HMAC-SHA512 | 37509000 | 40197000
john-the-ripper: MD5 | 765713 | 683546
nginx: 100 | 36310.35 | 25148.28
nginx: 200 | 35602.10 | 24695.91
nginx: 500 | 34672.65 | 21957.17
nginx: 1000 | 32118.58 | 21446.27
nginx: 4000 | 32814.75 | 21594.94
openssl: RSA4096 | 2062.7 | 1156.6
pgbench: 100 - 800 - Read Only | 311942 | 173317
pgbench: 100 - 1000 - Read Only | 293725 | 169643
openssl: RSA4096 | 67857.7 | 76402.3
brl-cad: VGR Performance Metric | 71072 | 50314
namd: ATPase Simulation - 327,506 Atoms | 3.35779 | 4.44534
onednn: IP Shapes 1D - bf16bf16bf16 - CPU | 1.50004 | 19.5287
onednn: IP Shapes 3D - bf16bf16bf16 - CPU | 5.34218 | 8.02416
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU | 4.17707 | 34.1893
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU | 1.47145 | 52.9421
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU | 3.54944 | 35.9164
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 4660.70 | 5767.70
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 2337.31 | 2998.03
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU | 0.968986 | 7.43260
ospray-studio: 1 - 4K - 1 - Path Tracer | 20952 | 30911
ospray-studio: 3 - 4K - 1 - Path Tracer | 25819 | 37892
pgbench: 100 - 800 - Read Only - Average Latency | 2.565 | 4.616
pgbench: 100 - 1000 - Read Only - Average Latency | 3.405 | 5.895
draco: Lion | 6250 | 7437
draco: Church Facade | 7573 | 11462
opencv: Core | 87372 | 142770
opencv: Video | 31654 | 11737
opencv: Graph API | 219931 | 236186
opencv: Stitching | 214760 | 250833
opencv: Image Processing | 128163 | 147234
opencv: Object Detection | 38999 | 58056
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream | 528.0193 | 682.3748
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream | 38.5120 | 31.9583
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream | 104.3102 | 125.4746
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream | 30.7952 | 33.2924
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream | 60.3858 | 68.5076
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream | 305.8991 | 323.0828
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream | 123.2931 | 142.8892
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream | 530.6100 | 681.3449
incompact3d: input.i3d 129 Cells Per Direction | 32.3943163 | 37.7575671
openfoam: drivaerFastback, Small Mesh Size - Mesh Time | 62.044277 | 85.518264
openfoam: drivaerFastback, Small Mesh Size - Execution Time | 422.67836 | 560.60744
openradioss: Bumper Beam | 303.38 | 390.61
openradioss: Cell Phone Drop Test | 219.73 | 291.49
openradioss: Bird Strike on Windshield | 595.14 | 724.80
openradioss: Rubber O-Ring Seal Installation | 367.00 | 523.45
specfem3d: Mount St. Helens | 139.524469614 | 145.379328519
specfem3d: Layered Halfspace | 372.089480857 | 374.756877204
specfem3d: Tomographic Model | 143.937022875 | 150.915292633
specfem3d: Homogeneous Halfspace | 179.545901627 | 190.794140810
specfem3d: Water-layered Halfspace | 321.625251668 | 347.812814070
build-ffmpeg: Time To Compile | 120.438 | 139.039
build-linux-kernel: defconfig | 244.800 | 289.662
blender: BMW27 - CPU-Only | 315.24 |
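
The two result columns are easiest to compare as a ratio. The following is a minimal sketch, not part of the original result file: the helper name `relative_performance` is hypothetical, and the three sample rows are copied from the results, with the better-direction flag taken from each test's "More Is Better" / "Fewer Is Better" label.

```python
# Hypothetical helper for comparing the two instances; not part of the
# Phoronix Test Suite. Values below are copied from the result file.
def relative_performance(c3, c2, higher_is_better=True):
    """Return how many times faster c3 is than c2 (1.0 means parity)."""
    if c3 <= 0 or c2 <= 0:
        raise ValueError("results must be positive")
    # For time-like metrics (fewer is better) the ratio is inverted.
    return c3 / c2 if higher_is_better else c2 / c3


results = {
    # name: (c3-highcpu-8 SPR, c2-standard-8 CLX, higher_is_better)
    "openssl: SHA256 (byte/s)": (4283873987, 1318193530, True),
    "compress-7zip: Decompression Rating (MIPS)": (20468, 22852, True),
    "namd: ATPase Simulation (days/ns)": (3.35779, 4.44534, False),
}

ratios = {name: relative_performance(*vals) for name, vals in results.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

A ratio above 1.0 favors c3-highcpu-8 SPR and a ratio below 1.0 favors c2-standard-8 CLX.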

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1
Billion Interactions/s, More Is Better
  c3-highcpu-8 SPR: 7.546 (SE +/- 0.001, N = 3)
  c2-standard-8 CLX: 6.112 (SE +/- 0.001, N = 3)
  1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

OpenSSL 3.1 - Algorithm: SHA256
byte/s, More Is Better
  c3-highcpu-8 SPR: 4283873987 (SE +/- 2722265.85, N = 3)
  c2-standard-8 CLX: 1318193530 (SE +/- 28303.98, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: SHA512
byte/s, More Is Better
  c3-highcpu-8 SPR: 1568572920 (SE +/- 1152941.98, N = 3)
  c2-standard-8 CLX: 1465815227 (SE +/- 2965153.70, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: ChaCha20
byte/s, More Is Better
  c3-highcpu-8 SPR: 22091557637 (SE +/- 35781789.46, N = 3)
  c2-standard-8 CLX: 21346811813 (SE +/- 1497684.67, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: AES-128-GCM
byte/s, More Is Better
  c3-highcpu-8 SPR: 57594077823 (SE +/- 33921495.47, N = 3)
  c2-standard-8 CLX: 23237312603 (SE +/- 6165180.91, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: AES-256-GCM
byte/s, More Is Better
  c3-highcpu-8 SPR: 48008361573 (SE +/- 48074910.93, N = 3)
  c2-standard-8 CLX: 16945000630 (SE +/- 3855504.36, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: ChaCha20-Poly1305
byte/s, More Is Better
  c3-highcpu-8 SPR: 15970781140 (SE +/- 21990039.72, N = 3)
  c2-standard-8 CLX: 10919151843 (SE +/- 1592104.52, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

nekRS 22.0 - Input: TurboPipe Periodic
FLOP/s, More Is Better
  c3-highcpu-8 SPR: 30667900000 (SE +/- 92013205.57, N = 3)
  c2-standard-8 CLX: 25100766667 (SE +/- 56849518.71, N = 3)
  1. (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -lmpi_cxx -lmpi

Embree 4.0.1 - Binary: Pathtracer ISPC - Model: Crown
Frames Per Second, More Is Better
  c3-highcpu-8 SPR: 5.8475 (SE +/- 0.0120, N = 3; MIN: 5.81 / MAX: 5.92)
  c2-standard-8 CLX: 3.9340 (SE +/- 0.0030, N = 3; MIN: 3.91 / MAX: 3.99)

Embree 4.0.1 - Binary: Pathtracer ISPC - Model: Asian Dragon
Frames Per Second, More Is Better
  c3-highcpu-8 SPR: 7.3642 (SE +/- 0.0079, N = 3; MIN: 7.33 / MAX: 7.44)
  c2-standard-8 CLX: 5.1542 (SE +/- 0.0119, N = 3; MIN: 5.12 / MAX: 5.22)

uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Very Fast
Frames Per Second, More Is Better
  c3-highcpu-8 SPR: 6.99 (SE +/- 0.03, N = 3)
  c2-standard-8 CLX: 5.69 (SE +/- 0.02, N = 3)

uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Super Fast
Frames Per Second, More Is Better
  c3-highcpu-8 SPR: 7.48 (SE +/- 0.00, N = 3)
  c2-standard-8 CLX: 6.03 (SE +/- 0.00, N = 3)

uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast
Frames Per Second, More Is Better
  c3-highcpu-8 SPR: 9.12 (SE +/- 0.01, N = 3)
  c2-standard-8 CLX: 7.45 (SE +/- 0.01, N = 3)

uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Very Fast
Frames Per Second, More Is Better
  c3-highcpu-8 SPR: 32.39 (SE +/- 0.25, N = 3)
  c2-standard-8 CLX: 26.30 (SE +/- 0.16, N = 3)

uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Super Fast
Frames Per Second, More Is Better
  c3-highcpu-8 SPR: 34.50 (SE +/- 0.03, N = 3)
  c2-standard-8 CLX: 27.72 (SE +/- 0.01, N = 3)

uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast
Frames Per Second, More Is Better
  c3-highcpu-8 SPR: 42.24 (SE +/- 0.01, N = 3)
  c2-standard-8 CLX: 34.47 (SE +/- 0.01, N = 3)

VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Fast
Frames Per Second, More Is Better
  c3-highcpu-8 SPR: 1.556 (SE +/- 0.005, N = 3)
  1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Faster
Frames Per Second, More Is Better
  c3-highcpu-8 SPR: 3.524 (SE +/- 0.000, N = 3)
  1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC 1.7 - Video Input: Bosphorus 1080p - Video Preset: Fast
Frames Per Second, More Is Better
  c3-highcpu-8 SPR: 5.305 (SE +/- 0.018, N = 3)
  1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC 1.7 - Video Input: Bosphorus 1080p - Video Preset: Faster
Frames Per Second, More Is Better
  c3-highcpu-8 SPR: 12.92 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1
GFInst/s, More Is Better
  c3-highcpu-8 SPR: 188.65 (SE +/- 0.03, N = 3)
  c2-standard-8 CLX: 152.80 (SE +/- 0.02, N = 3)
  1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

Intel Open Image Denoise 1.4.0 - Run: RT.hdr_alb_nrm.3840x2160
Images / Sec, More Is Better
  c3-highcpu-8 SPR: 0.24 (SE +/- 0.00, N = 3)
  c2-standard-8 CLX: 0.22 (SE +/- 0.00, N = 3)

Intel Open Image Denoise 1.4.0 - Run: RTLightmap.hdr.4096x4096
Images / Sec, More Is Better
  c3-highcpu-8 SPR: 0.12 (SE +/- 0.00, N = 3)
  c2-standard-8 CLX: 0.11 (SE +/- 0.00, N = 3)

TensorFlow 2.10 - Device: CPU - Batch Size: 16 - Model: ResNet-50
images/sec, More Is Better
  c3-highcpu-8 SPR: 14.20 (SE +/- 0.04, N = 3)
  c2-standard-8 CLX: 13.33 (SE +/- 0.02, N = 3)

TensorFlow 2.10 - Device: CPU - Batch Size: 32 - Model: ResNet-50
images/sec, More Is Better
  c3-highcpu-8 SPR: 14.93 (SE +/- 0.01, N = 3)
  c2-standard-8 CLX: 14.11 (SE +/- 0.01, N = 3)

TensorFlow 2.10 - Device: CPU - Batch Size: 64 - Model: ResNet-50
images/sec, More Is Better
  c3-highcpu-8 SPR: 15.69 (SE +/- 0.01, N = 3)
  c2-standard-8 CLX: 14.79 (SE +/- 0.18, N = 3)

OpenVKL 1.3.1 - Benchmark: vklBenchmark ISPC
Items / Sec, More Is Better
  c3-highcpu-8 SPR: 98 (SE +/- 0.33, N = 3; MIN: 11 / MAX: 1579)
  c2-standard-8 CLX: 70 (SE +/- 0.00, N = 3; MIN: 8 / MAX: 1119)

Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
  c3-highcpu-8 SPR: 3.7873 (SE +/- 0.0130, N = 3)
  c2-standard-8 CLX: 2.9308 (SE +/- 0.0001, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
  c3-highcpu-8 SPR: 51.89 (SE +/- 0.05, N = 3)
  c2-standard-8 CLX: 62.51 (SE +/- 0.07, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
  c3-highcpu-8 SPR: 19.17 (SE +/- 0.03, N = 3)
  c2-standard-8 CLX: 15.92 (SE +/- 0.05, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
  c3-highcpu-8 SPR: 64.87 (SE +/- 0.04, N = 3)
  c2-standard-8 CLX: 59.99 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
  c3-highcpu-8 SPR: 33.11 (SE +/- 0.15, N = 3)
  c2-standard-8 CLX: 29.18 (SE +/- 0.03, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
  c3-highcpu-8 SPR: 6.5372 (SE +/- 0.0078, N = 3)
  c2-standard-8 CLX: 6.1893 (SE +/- 0.0034, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
  c3-highcpu-8 SPR: 16.22 (SE +/- 0.13, N = 3)
  c2-standard-8 CLX: 13.99 (SE +/- 0.03, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream
items/sec, More Is Better
  c3-highcpu-8 SPR: 3.7693 (SE +/- 0.0239, N = 3)
  c2-standard-8 CLX: 2.9352 (SE +/- 0.0038, N = 3)

Zstd Compression 1.5.4 - Compression Level: 19 - Compression Speed
MB/s, More Is Better
  c3-highcpu-8 SPR: 10.30 (SE +/- 0.12, N = 3)
  c2-standard-8 CLX: 9.24 (SE +/- 0.08, N = 3)
  1. (CC) gcc options: -O3 -pthread -lz (the chart also lists -llzma)

Zstd Compression 1.5.4 - Compression Level: 19 - Decompression Speed
MB/s, More Is Better
  c3-highcpu-8 SPR: 905.2 (SE +/- 1.50, N = 3)
  c2-standard-8 CLX: 701.1 (SE +/- 1.45, N = 3)
  1. (CC) gcc options: -O3 -pthread -lz (the chart also lists -llzma)

Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Compression Speed
MB/s, More Is Better
  c3-highcpu-8 SPR: 6.50 (SE +/- 0.00, N = 3)
  c2-standard-8 CLX: 5.86 (SE +/- 0.00, N = 3)
  1. (CC) gcc options: -O3 -pthread -lz (the chart also lists -llzma)

Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Decompression Speed
MB/s, More Is Better
  c3-highcpu-8 SPR: 907.2 (SE +/- 1.48, N = 3)
  c2-standard-8 CLX: 713.8 (SE +/- 2.02, N = 3)
  1. (CC) gcc options: -O3 -pthread -lz (the chart also lists -llzma)

7-Zip Compression 22.01 - Test: Compression Rating
MIPS, More Is Better
  c3-highcpu-8 SPR: 35306 (SE +/- 246.17, N = 15)
  c2-standard-8 CLX: 30989 (SE +/- 248.15, N = 3)
  1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

7-Zip Compression 22.01 - Test: Decompression Rating
MIPS, More Is Better
  c3-highcpu-8 SPR: 20468 (SE +/- 183.58, N = 15)
  c2-standard-8 CLX: 22852 (SE +/- 52.92, N = 3)
  1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
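
Each result carries an "SE +/-" figure and a run count N. As a rough sketch, assuming SE denotes the standard error of the mean (so that the sample standard deviation is SE times the square root of N), the spread of a result can be put in relative terms. The function name `spread` is hypothetical; the inputs are the 7-Zip Compression Rating figures for c3-highcpu-8 SPR.

```python
import math


def spread(mean, se, n):
    """Return (relative SE in percent, implied sample standard deviation).

    Assumes `se` is the standard error of the mean over `n` runs.
    """
    if mean <= 0 or n < 1:
        raise ValueError("mean must be positive and n at least 1")
    relative_se_pct = 100.0 * se / mean
    implied_sd = se * math.sqrt(n)
    return relative_se_pct, implied_sd


# 7-Zip Compression Rating, c3-highcpu-8 SPR: 35306 MIPS, SE +/- 246.17, N = 15
rel_pct, sd = spread(35306, 246.17, 15)
print(f"relative SE: {rel_pct:.2f}%, implied SD: {sd:.0f} MIPS")
```

A relative SE well below the gap between the two systems suggests the difference is not run-to-run noise, though three runs is a small sample either way.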

LeelaChessZero 0.28 - Backend: BLAS
Nodes Per Second, More Is Better
  c3-highcpu-8 SPR: 1272 (SE +/- 16.59, N = 3)
  c2-standard-8 CLX: 909 (SE +/- 7.36, N = 3)
  1. (CXX) g++ options: -flto -pthread

LeelaChessZero 0.28 - Backend: Eigen
Nodes Per Second, More Is Better
  c3-highcpu-8 SPR: 1221 (SE +/- 7.69, N = 3)
  c2-standard-8 CLX: 902 (SE +/- 9.00, N = 3)
  1. (CXX) g++ options: -flto -pthread

GROMACS 2023 - Implementation: MPI CPU - Input: water_GMX50_bare
Ns Per Day, More Is Better
  c3-highcpu-8 SPR: 0.777 (SE +/- 0.001, N = 3)
  c2-standard-8 CLX: 0.579 (SE +/- 0.001, N = 3)
  1. (CXX) g++ options: -O3

CockroachDB 22.2 - Workload: MoVR - Concurrency: 128
ops/s, More Is Better
  c3-highcpu-8 SPR: 458.8 (SE +/- 5.62, N = 15)
  c2-standard-8 CLX: 534.9 (SE +/- 1.90, N = 3)

CockroachDB 22.2 - Workload: KV, 50% Reads - Concurrency: 128
ops/s, More Is Better
  c3-highcpu-8 SPR: 19321.6 (SE +/- 40.86, N = 3)
  c2-standard-8 CLX: 13684.5 (SE +/- 68.65, N = 3)

CockroachDB 22.2 - Workload: KV, 95% Reads - Concurrency: 128
ops/s, More Is Better
  c3-highcpu-8 SPR: 24960.1 (SE +/- 127.78, N = 3)
  c2-standard-8 CLX: 16184.2 (SE +/- 79.71, N = 3)

Memcached 1.6.18 - Set To Get Ratio: 1:10
Ops/sec, More Is Better
  c3-highcpu-8 SPR: 1044947.13 (SE +/- 3280.78, N = 3)
  c2-standard-8 CLX: 715723.53 (SE +/- 2213.84, N = 3)
  1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Memcached 1.6.18 - Set To Get Ratio: 1:100
Ops/sec, More Is Better
  c3-highcpu-8 SPR: 1030937.28 (SE +/- 11157.73, N = 3)
  c2-standard-8 CLX: 702291.93 (SE +/- 3476.80, N = 3)
  1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

MariaDB 11.0.1 - Clients: 2048
Queries Per Second, More Is Better
  c3-highcpu-8 SPR: 332 (SE +/- 3.01, N = 3)
  c2-standard-8 CLX: 248 (SE +/- 3.50, N = 3)
  1. (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lcrypt -lz -lm -lssl -lcrypto -lpthread -ldl

MariaDB 11.0.1 - Clients: 4096
Queries Per Second, More Is Better
  c3-highcpu-8 SPR: 317 (SE +/- 2.62, N = 3)
  c2-standard-8 CLX: 237 (SE +/- 2.55, N = 3)
  1. (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lcrypt -lz -lm -lssl -lcrypto -lpthread -ldl

John The Ripper 2023.03.14 - Test: bcrypt
Real C/S, More Is Better
  c3-highcpu-8 SPR: 6932 (SE +/- 0.67, N = 3)
  c2-standard-8 CLX: 6687 (SE +/- 0.67, N = 3)
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

John The Ripper 2023.03.14 - Test: WPA PSK
Real C/S, More Is Better
  c3-highcpu-8 SPR: 28818 (SE +/- 27.15, N = 3)
  c2-standard-8 CLX: 31278 (SE +/- 11.37, N = 3)
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

John The Ripper 2023.03.14 - Test: Blowfish
Real C/S, More Is Better
  c3-highcpu-8 SPR: 6930 (SE +/- 2.08, N = 3)
  c2-standard-8 CLX: 6684 (SE +/- 2.73, N = 3)
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

John The Ripper 2023.03.14 - Test: HMAC-SHA512
Real C/S, More Is Better
  c3-highcpu-8 SPR: 37509000 (SE +/- 21071.31, N = 3)
  c2-standard-8 CLX: 40197000 (SE +/- 32331.62, N = 3)
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

John The Ripper 2023.03.14 - Test: MD5
Real C/S, More Is Better
  c3-highcpu-8 SPR: 765713 (SE +/- 1472.13, N = 3)
  c2-standard-8 CLX: 683546 (SE +/- 162.53, N = 3)
  1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

nginx 1.23.2 - Connections: 100
Requests Per Second, More Is Better
  c3-highcpu-8 SPR: 36310.35 (SE +/- 23.95, N = 3)
  c2-standard-8 CLX: 25148.28 (SE +/- 33.32, N = 3)
  1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx 1.23.2 - Connections: 200
Requests Per Second, More Is Better
  c3-highcpu-8 SPR: 35602.10 (SE +/- 83.52, N = 3)
  c2-standard-8 CLX: 24695.91 (SE +/- 37.62, N = 3)
  1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx 1.23.2 - Connections: 500
Requests Per Second, More Is Better
  c3-highcpu-8 SPR: 34672.65 (SE +/- 321.85, N = 3)
  c2-standard-8 CLX: 21957.17 (SE +/- 17.44, N = 3)
  1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx 1.23.2 - Connections: 1000
Requests Per Second, More Is Better
  c3-highcpu-8 SPR: 32118.58 (SE +/- 22.42, N = 3)
  c2-standard-8 CLX: 21446.27 (SE +/- 84.22, N = 3)
  1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx 1.23.2 - Connections: 4000
Requests Per Second, More Is Better
  c3-highcpu-8 SPR: 32814.75 (SE +/- 27.93, N = 3)
  c2-standard-8 CLX: 21594.94 (SE +/- 11.52, N = 3)
  1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
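
The five nginx runs differ only in connection count, so they can also be read as a scaling series. The sketch below is illustrative only: `percent_change` is a hypothetical helper, and the requests-per-second figures are copied from the nginx results. It computes how much throughput changes between 100 and 4000 connections on each instance.

```python
def percent_change(first, last):
    """Percentage change from `first` to `last` (negative means a drop)."""
    return 100.0 * (last - first) / first


# Requests per second by connection count, copied from the nginx results.
nginx_rps = {
    "c3-highcpu-8 SPR": {
        100: 36310.35, 200: 35602.10, 500: 34672.65,
        1000: 32118.58, 4000: 32814.75,
    },
    "c2-standard-8 CLX": {
        100: 25148.28, 200: 24695.91, 500: 21957.17,
        1000: 21446.27, 4000: 21594.94,
    },
}

change = {
    system: percent_change(rps[100], rps[4000])
    for system, rps in nginx_rps.items()
}
for system, pct in change.items():
    print(f"{system}: {pct:+.1f}% from 100 to 4000 connections")
```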

OpenSSL 3.1 - Algorithm: RSA4096
sign/s, More Is Better
  c3-highcpu-8 SPR: 2062.7 (SE +/- 1.17, N = 3)
  c2-standard-8 CLX: 1156.6 (SE +/- 2.05, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

PostgreSQL 15 - Scaling Factor: 100 - Clients: 800 - Mode: Read Only
TPS, More Is Better
  c3-highcpu-8 SPR: 311942 (SE +/- 3369.27, N = 3)
  c2-standard-8 CLX: 173317 (SE +/- 822.93, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL 15 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only
TPS, More Is Better
  c3-highcpu-8 SPR: 293725 (SE +/- 2414.33, N = 3)
  c2-standard-8 CLX: 169643 (SE +/- 1014.66, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

OpenSSL 3.1 - Algorithm: RSA4096
verify/s, More Is Better
  c3-highcpu-8 SPR: 67857.7 (SE +/- 8.53, N = 3)
  c2-standard-8 CLX: 76402.3 (SE +/- 47.93, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

BRL-CAD 7.34 - VGR Performance Metric
VGR Performance Metric, More Is Better
  c3-highcpu-8 SPR: 71072
  c2-standard-8 CLX: 50314
  1. (CXX) g++ options: -std=c++14 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lregex_brl -lz_brl -lnetpbm -ldl -lm -ltk8.6

NAMD 2.14 - ATPase Simulation - 327,506 Atoms
days/ns, Fewer Is Better
  c3-highcpu-8 SPR: 3.35779 (SE +/- 0.00254, N = 3)
  c2-standard-8 CLX: 4.44534 (SE +/- 0.00771, N = 3)

oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  c3-highcpu-8 SPR: 1.50004 (SE +/- 0.00340, N = 3)
  c2-standard-8 CLX: 19.52870 (SE +/- 0.01518, N = 3; MIN: 18.98)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  c3-highcpu-8 SPR: 5.34218 (SE +/- 0.00535, N = 3; MIN: 4.94)
  c2-standard-8 CLX: 8.02416 (SE +/- 0.01190, N = 3; MIN: 7.86)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  c3-highcpu-8 SPR: 4.17707 (SE +/- 0.00870, N = 3)
  c2-standard-8 CLX: 34.18930 (SE +/- 0.00489, N = 3; MIN: 33.77)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  c3-highcpu-8 SPR: 1.47145 (SE +/- 0.00472, N = 3)
  c2-standard-8 CLX: 52.94210 (SE +/- 0.00209, N = 3; MIN: 52.7)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  c3-highcpu-8 SPR: 3.54944 (SE +/- 0.01072, N = 3)
  c2-standard-8 CLX: 35.91640 (SE +/- 0.01009, N = 3; MIN: 35.71)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  c3-highcpu-8 SPR: 4660.70 (SE +/- 0.86, N = 3; MIN: 4648.25)
  c2-standard-8 CLX: 5767.70 (SE +/- 8.38, N = 3; MIN: 5732.12)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  c3-highcpu-8 SPR: 2337.31 (SE +/- 1.76, N = 3; MIN: 2326.77)
  c2-standard-8 CLX: 2998.03 (SE +/- 0.83, N = 3; MIN: 2977.98)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  c3-highcpu-8 SPR: 0.968986 (SE +/- 0.013114, N = 3)
  c2-standard-8 CLX: 7.432600 (SE +/- 0.007542, N = 3; MIN: 7.25)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OSPRay Studio 0.11 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer (ms, fewer is better): c3-highcpu-8 SPR 20952 (SE +/- 4.91, N = 3); c2-standard-8 CLX 30911 (SE +/- 32.33, N = 3)
OSPRay Studio 0.11 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer (ms, fewer is better): c3-highcpu-8 SPR 25819 (SE +/- 364.36, N = 3); c2-standard-8 CLX 37892 (SE +/- 39.94, N = 3)
All OSPRay Studio 0.11 results: (CXX) g++ options: -O3 -lm -ldl
PostgreSQL 15 - Scaling Factor: 100 - Clients: 800 - Mode: Read Only - Average Latency (ms, fewer is better): c3-highcpu-8 SPR 2.565 (SE +/- 0.028, N = 3); c2-standard-8 CLX 4.616 (SE +/- 0.022, N = 3)
PostgreSQL 15 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency (ms, fewer is better): c3-highcpu-8 SPR 3.405 (SE +/- 0.028, N = 3); c2-standard-8 CLX 5.895 (SE +/- 0.035, N = 3)
All PostgreSQL 15 results: (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
Google Draco 1.5.6 - Model: Lion (ms, fewer is better): c3-highcpu-8 SPR 6250 (SE +/- 80.23, N = 15); c2-standard-8 CLX 7437 (SE +/- 18.52, N = 3)
Google Draco 1.5.6 - Model: Church Facade (ms, fewer is better): c3-highcpu-8 SPR 7573 (SE +/- 10.48, N = 3); c2-standard-8 CLX 11462 (SE +/- 16.76, N = 3)
All Google Draco 1.5.6 results: (CXX) g++ options: -O3
OpenCV 4.7 - Test: Core (ms, fewer is better): c3-highcpu-8 SPR 87372 (SE +/- 280.31, N = 3); c2-standard-8 CLX 142770 (SE +/- 2578.21, N = 12)
OpenCV 4.7 - Test: Video (ms, fewer is better): c3-highcpu-8 SPR 31654 (SE +/- 198.80, N = 3); c2-standard-8 CLX 11737 (SE +/- 50.28, N = 3)
OpenCV 4.7 - Test: Graph API (ms, fewer is better): c3-highcpu-8 SPR 219931 (SE +/- 931.36, N = 3); c2-standard-8 CLX 236186 (SE +/- 1570.24, N = 3)
OpenCV 4.7 - Test: Stitching (ms, fewer is better): c3-highcpu-8 SPR 214760 (SE +/- 1973.06, N = 7); c2-standard-8 CLX 250833 (SE +/- 1856.06, N = 3)
OpenCV 4.7 - Test: Image Processing (ms, fewer is better): c3-highcpu-8 SPR 128163 (SE +/- 1624.35, N = 12); c2-standard-8 CLX 147234 (SE +/- 1527.37, N = 4)
OpenCV 4.7 - Test: Object Detection (ms, fewer is better): c3-highcpu-8 SPR 38999 (SE +/- 384.74, N = 5); c2-standard-8 CLX 58056 (SE +/- 751.87, N = 3)
All OpenCV 4.7 results: (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): c3-highcpu-8 SPR 528.02 (SE +/- 1.78, N = 3); c2-standard-8 CLX 682.37 (SE +/- 0.03, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): c3-highcpu-8 SPR 38.51 (SE +/- 0.04, N = 3); c2-standard-8 CLX 31.96 (SE +/- 0.03, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): c3-highcpu-8 SPR 104.31 (SE +/- 0.14, N = 3); c2-standard-8 CLX 125.47 (SE +/- 0.40, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): c3-highcpu-8 SPR 30.80 (SE +/- 0.02, N = 3); c2-standard-8 CLX 33.29 (SE +/- 0.01, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): c3-highcpu-8 SPR 60.39 (SE +/- 0.28, N = 3); c2-standard-8 CLX 68.51 (SE +/- 0.08, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): c3-highcpu-8 SPR 305.90 (SE +/- 0.37, N = 3); c2-standard-8 CLX 323.08 (SE +/- 0.18, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): c3-highcpu-8 SPR 123.29 (SE +/- 0.95, N = 3); c2-standard-8 CLX 142.89 (SE +/- 0.29, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): c3-highcpu-8 SPR 530.61 (SE +/- 3.34, N = 3); c2-standard-8 CLX 681.34 (SE +/- 0.89, N = 3)
Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 129 Cells Per Direction (Seconds, fewer is better): c3-highcpu-8 SPR 32.39 (SE +/- 0.02, N = 3); c2-standard-8 CLX 37.76 (SE +/- 0.02, N = 3)
Xcompact3d Incompact3d result: (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Mesh Time (Seconds, fewer is better): c3-highcpu-8 SPR 62.04; c2-standard-8 CLX 85.52
OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Execution Time (Seconds, fewer is better): c3-highcpu-8 SPR 422.68; c2-standard-8 CLX 560.61
All OpenFOAM 10 results: (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lgenericPatchFields -lOpenFOAM -ldl -lm; additional libraries listed with both charts: -lphysicalProperties -lspecie -lfiniteVolume -lfvModels -lmeshTools -lsampling -ldynamicMesh
OpenRadioss 2022.10.13 - Model: Bumper Beam (Seconds, fewer is better): c3-highcpu-8 SPR 303.38 (SE +/- 0.51, N = 3); c2-standard-8 CLX 390.61 (SE +/- 0.90, N = 3)
OpenRadioss 2022.10.13 - Model: Cell Phone Drop Test (Seconds, fewer is better): c3-highcpu-8 SPR 219.73 (SE +/- 0.38, N = 3); c2-standard-8 CLX 291.49 (SE +/- 0.39, N = 3)
OpenRadioss 2022.10.13 - Model: Bird Strike on Windshield (Seconds, fewer is better): c3-highcpu-8 SPR 595.14 (SE +/- 0.30, N = 3); c2-standard-8 CLX 724.80 (SE +/- 1.51, N = 3)
OpenRadioss 2022.10.13 - Model: Rubber O-Ring Seal Installation (Seconds, fewer is better): c3-highcpu-8 SPR 367.00 (SE +/- 0.20, N = 3); c2-standard-8 CLX 523.45 (SE +/- 0.45, N = 3)
SPECFEM3D 4.0 - Model: Mount St. Helens (Seconds, fewer is better): c3-highcpu-8 SPR 139.52 (SE +/- 0.10, N = 3); c2-standard-8 CLX 145.38 (SE +/- 0.56, N = 3)
SPECFEM3D 4.0 - Model: Layered Halfspace (Seconds, fewer is better): c3-highcpu-8 SPR 372.09 (SE +/- 0.64, N = 3); c2-standard-8 CLX 374.76 (SE +/- 2.51, N = 3)
SPECFEM3D 4.0 - Model: Tomographic Model (Seconds, fewer is better): c3-highcpu-8 SPR 143.94 (SE +/- 1.65, N = 3); c2-standard-8 CLX 150.92 (SE +/- 1.30, N = 3)
SPECFEM3D 4.0 - Model: Homogeneous Halfspace (Seconds, fewer is better): c3-highcpu-8 SPR 179.55 (SE +/- 0.09, N = 3); c2-standard-8 CLX 190.79 (SE +/- 1.87, N = 3)
SPECFEM3D 4.0 - Model: Water-layered Halfspace (Seconds, fewer is better): c3-highcpu-8 SPR 321.63 (SE +/- 0.31, N = 3); c2-standard-8 CLX 347.81 (SE +/- 0.85, N = 3)
All SPECFEM3D 4.0 results: (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Timed FFmpeg Compilation 6.0 - Time To Compile (Seconds, fewer is better): c3-highcpu-8 SPR 120.44 (SE +/- 0.10, N = 3); c2-standard-8 CLX 139.04 (SE +/- 0.03, N = 3)
Timed Linux Kernel Compilation 6.1 - Build: defconfig (Seconds, fewer is better): c3-highcpu-8 SPR 244.80 (SE +/- 0.64, N = 3); c2-standard-8 CLX 289.66 (SE +/- 0.85, N = 3)
Blender 3.4 - Blend File: BMW27 - Compute: CPU-Only (Seconds, fewer is better): c3-highcpu-8 SPR 315.24 (SE +/- 1.13, N = 3); no c2-standard-8 CLX result listed
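Since every result in this part of the file is a fewer-is-better metric, the relative standing of the two instances can be read off by dividing the c2-standard-8 CLX figure by the c3-highcpu-8 SPR figure. The following is only a rough sketch of that arithmetic, not part of the Phoronix Test Suite: the helper names `speedup` and `geometric_mean` are made up for illustration, and the four results picked are an arbitrary sample copied from the figures above.

```python
from math import prod

# (c3-highcpu-8 SPR, c2-standard-8 CLX) pairs copied from the results above.
# All four are "fewer is better" metrics; the selection is arbitrary.
results = {
    "NAMD ATPase (days/ns)": (3.35779, 4.44534),
    "Timed FFmpeg Compilation (s)": (120.44, 139.04),
    "Timed Linux Kernel Compilation (s)": (244.80, 289.66),
    "OpenCV Video (ms)": (31654, 11737),
}


def speedup(spr, clx):
    """CLX figure divided by SPR figure; above 1.0 means SPR was faster."""
    return clx / spr


def geometric_mean(values):
    """Geometric mean, the usual way to average ratios."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


ratios = {name: speedup(spr, clx) for name, (spr, clx) in results.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
print(f"geometric mean of this sample: {geometric_mean(ratios.values()):.2f}x")
```

A ratio below 1.0, as with the OpenCV Video test, means the c2-standard-8 CLX instance finished sooner. The geometric mean here summarizes only the sampled rows, not the full result file.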
Phoronix Test Suite v10.8.4