Benchmarks by Michael Larabel for a future article.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2303226-NE-2303218PT51

Google Cloud c3 Sapphire Rapids

HTML result view exported from: https://openbenchmarking.org/result/2303226-NE-2303218PT51&sro&gru
Google Cloud c3 Sapphire Rapids

c3-highcpu-8 SPR
    Processor: Intel Xeon Platinum 8481C (4 Cores / 8 Threads)
    Motherboard: Google Compute Engine c3-highcpu-8
    Chipset: Intel 440FX 82441FX PMC
    Memory: 16GB
    Disk: 322GB nvme_card-pd
    Network: Google Compute Engine Virtual
    OS: Ubuntu 22.10
    Kernel: 5.19.0-1015-gcp (x86_64)
    Vulkan: 1.3.224
    Compiler: GCC 12.2.0
    File-System: ext4
    System Layer: KVM

c2-standard-8 CLX (only these fields are listed for this system)
    Processor: Intel Xeon (4 Cores / 8 Threads)
    Motherboard: Google Compute Engine c2-standard-8
    Memory: 32GB
    Disk: 322GB PersistentDisk
    Network: Red Hat Virtio device

Kernel Details
    Transparent Huge Pages: madvise

Compiler Details
    --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
    CPU Microcode: 0xffffffff

Python Details
    Python 3.10.7

Security Details
    c3-highcpu-8 SPR:
        itlb_multihit: Not affected
        l1tf: Not affected
        mds: Not affected
        meltdown: Not affected
        mmio_stale_data: Not affected
        retbleed: Not affected
        spec_store_bypass: Mitigation of SSB disabled via prctl
        spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
        spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence
        srbds: Not affected
        tsx_async_abort: Not affected
    c2-standard-8 CLX:
        itlb_multihit: Not affected
        l1tf: Not affected
        mds: Mitigation of Clear buffers; SMT Host state unknown
        meltdown: Not affected
        mmio_stale_data: Vulnerable: Clear buffers attempted no microcode; SMT Host state unknown
        retbleed: Mitigation of Enhanced IBRS
        spec_store_bypass: Mitigation of SSB disabled via prctl
        spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
        spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence
        srbds: Not affected
        tsx_async_abort: Mitigation of Clear buffers; SMT Host state unknown
Google Cloud c3 Sapphire Rapids

Test | c3-highcpu-8 SPR | c2-standard-8 CLX
minibude: OpenMP - BM1 | 7.546 | 6.112
openssl: SHA256 | 4283873987 | 1318193530
openssl: SHA512 | 1568572920 | 1465815227
openssl: ChaCha20 | 22091557637 | 21346811813
openssl: AES-128-GCM | 57594077823 | 23237312603
openssl: AES-256-GCM | 48008361573 | 16945000630
openssl: ChaCha20-Poly1305 | 15970781140 | 10919151843
nekrs: TurboPipe Periodic | 30667900000 | 25100766667
embree: Pathtracer ISPC - Crown | 5.8475 | 3.9340
embree: Pathtracer ISPC - Asian Dragon | 7.3642 | 5.1542
uvg266: Bosphorus 4K - Very Fast | 6.99 | 5.69
uvg266: Bosphorus 4K - Super Fast | 7.48 | 6.03
uvg266: Bosphorus 4K - Ultra Fast | 9.12 | 7.45
uvg266: Bosphorus 1080p - Very Fast | 32.39 | 26.30
uvg266: Bosphorus 1080p - Super Fast | 34.50 | 27.72
uvg266: Bosphorus 1080p - Ultra Fast | 42.24 | 34.47
minibude: OpenMP - BM1 | 188.651 | 152.798
oidn: RT.hdr_alb_nrm.3840x2160 | 0.24 | 0.22
oidn: RTLightmap.hdr.4096x4096 | 0.12 | 0.11
tensorflow: CPU - 16 - ResNet-50 | 14.20 | 13.33
tensorflow: CPU - 32 - ResNet-50 | 14.93 | 14.11
tensorflow: CPU - 64 - ResNet-50 | 15.69 | 14.79
openvkl: vklBenchmark ISPC | 98 | 70
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream | 3.7873 | 2.9308
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream | 51.8926 | 62.5073
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream | 19.1677 | 15.9179
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream | 64.8747 | 59.9914
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream | 33.1064 | 29.1783
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream | 6.5372 | 6.1893
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream | 16.2194 | 13.9932
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream | 3.7693 | 2.9352
compress-zstd: 19 - Compression Speed | 10.3 | 9.24
compress-zstd: 19 - Decompression Speed | 905.2 | 701.1
compress-zstd: 19, Long Mode - Compression Speed | 6.5 | 5.86
compress-zstd: 19, Long Mode - Decompression Speed | 907.2 | 713.8
compress-7zip: Compression Rating | 35306 | 30989
compress-7zip: Decompression Rating | 20468 | 22852
lczero: BLAS | 1272 | 909
lczero: Eigen | 1221 | 902
gromacs: MPI CPU - water_GMX50_bare | 0.777 | 0.579
cockroach: MoVR - 128 | 458.8 | 534.9
cockroach: KV, 50% Reads - 128 | 19321.6 | 13684.5
cockroach: KV, 95% Reads - 128 | 24960.1 | 16184.2
memcached: 1:10 | 1044947.13 | 715723.53
memcached: 1:100 | 1030937.28 | 702291.93
mysqlslap: 2048 | 332 | 248
mysqlslap: 4096 | 317 | 237
john-the-ripper: bcrypt | 6932 | 6687
john-the-ripper: WPA PSK | 28818 | 31278
john-the-ripper: Blowfish | 6930 | 6684
john-the-ripper: HMAC-SHA512 | 37509000 | 40197000
john-the-ripper: MD5 | 765713 | 683546
nginx: 100 | 36310.35 | 25148.28
nginx: 200 | 35602.10 | 24695.91
nginx: 500 | 34672.65 | 21957.17
nginx: 1000 | 32118.58 | 21446.27
nginx: 4000 | 32814.75 | 21594.94
openssl: RSA4096 | 2062.7 | 1156.6
pgbench: 100 - 800 - Read Only | 311942 | 173317
pgbench: 100 - 1000 - Read Only | 293725 | 169643
openssl: RSA4096 | 67857.7 | 76402.3
brl-cad: VGR Performance Metric | 71072 | 50314
namd: ATPase Simulation - 327,506 Atoms | 3.35779 | 4.44534
onednn: IP Shapes 1D - bf16bf16bf16 - CPU | 1.50004 | 19.5287
onednn: IP Shapes 3D - bf16bf16bf16 - CPU | 5.34218 | 8.02416
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU | 4.17707 | 34.1893
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU | 1.47145 | 52.9421
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU | 3.54944 | 35.9164
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 4660.70 | 5767.70
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 2337.31 | 2998.03
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU | 0.968986 | 7.43260
ospray-studio: 1 - 4K - 1 - Path Tracer | 20952 | 30911
ospray-studio: 3 - 4K - 1 - Path Tracer | 25819 | 37892
pgbench: 100 - 800 - Read Only - Average Latency | 2.565 | 4.616
pgbench: 100 - 1000 - Read Only - Average Latency | 3.405 | 5.895
draco: Lion | 6250 | 7437
draco: Church Facade | 7573 | 11462
opencv: Core | 87372 | 142770
opencv: Video | 31654 | 11737
opencv: Graph API | 219931 | 236186
opencv: Stitching | 214760 | 250833
opencv: Image Processing | 128163 | 147234
opencv: Object Detection | 38999 | 58056
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream | 528.0193 | 682.3748
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream | 38.5120 | 31.9583
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream | 104.3102 | 125.4746
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream | 30.7952 | 33.2924
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream | 60.3858 | 68.5076
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream | 305.8991 | 323.0828
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream | 123.2931 | 142.8892
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream | 530.6100 | 681.3449
incompact3d: input.i3d 129 Cells Per Direction | 32.3943163 | 37.7575671
openfoam: drivaerFastback, Small Mesh Size - Mesh Time | 62.044277 | 85.518264
openfoam: drivaerFastback, Small Mesh Size - Execution Time | 422.67836 | 560.60744
openradioss: Bumper Beam | 303.38 | 390.61
openradioss: Cell Phone Drop Test | 219.73 | 291.49
openradioss: Bird Strike on Windshield | 595.14 | 724.80
openradioss: Rubber O-Ring Seal Installation | 367.00 | 523.45
specfem3d: Mount St. Helens | 139.524469614 | 145.379328519
specfem3d: Layered Halfspace | 372.089480857 | 374.756877204
specfem3d: Tomographic Model | 143.937022875 | 150.915292633
specfem3d: Homogeneous Halfspace | 179.545901627 | 190.794140810
specfem3d: Water-layered Halfspace | 321.625251668 | 347.812814070
build-ffmpeg: Time To Compile | 120.438 | 139.039
build-linux-kernel: defconfig | 244.800 | 289.662
blender: BMW27 - CPU-Only | 315.24 | (not listed)
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1 (Billion Interactions/s, More Is Better)
    c2-standard-8 CLX: 6.112 (SE +/- 0.001, N = 3)
    c3-highcpu-8 SPR: 7.546 (SE +/- 0.001, N = 3)
    (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
OpenSSL 3.1 - Algorithm: SHA256 (byte/s, More Is Better)
    c2-standard-8 CLX: 1318193530 (SE +/- 28303.98, N = 3)
    c3-highcpu-8 SPR: 4283873987 (SE +/- 2722265.85, N = 3)
    (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL 3.1 - Algorithm: SHA512 (byte/s, More Is Better)
    c2-standard-8 CLX: 1465815227 (SE +/- 2965153.70, N = 3)
    c3-highcpu-8 SPR: 1568572920 (SE +/- 1152941.98, N = 3)
    (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL 3.1 - Algorithm: ChaCha20 (byte/s, More Is Better)
    c2-standard-8 CLX: 21346811813 (SE +/- 1497684.67, N = 3)
    c3-highcpu-8 SPR: 22091557637 (SE +/- 35781789.46, N = 3)
    (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL 3.1 - Algorithm: AES-128-GCM (byte/s, More Is Better)
    c2-standard-8 CLX: 23237312603 (SE +/- 6165180.91, N = 3)
    c3-highcpu-8 SPR: 57594077823 (SE +/- 33921495.47, N = 3)
    (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL 3.1 - Algorithm: AES-256-GCM (byte/s, More Is Better)
    c2-standard-8 CLX: 16945000630 (SE +/- 3855504.36, N = 3)
    c3-highcpu-8 SPR: 48008361573 (SE +/- 48074910.93, N = 3)
    (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL 3.1 - Algorithm: ChaCha20-Poly1305 (byte/s, More Is Better)
    c2-standard-8 CLX: 10919151843 (SE +/- 1592104.52, N = 3)
    c3-highcpu-8 SPR: 15970781140 (SE +/- 21990039.72, N = 3)
    (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
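The relative throughput of the two instances on the OpenSSL tests can be worked out directly from the byte/s figures listed above. What follows is a minimal Python sketch of that arithmetic; the dictionary name and the `speedup` helper are illustrative choices, not part of the Phoronix Test Suite or OpenBenchmarking.org.

```python
# Throughput in byte/s copied from the OpenSSL 3.1 results above,
# stored as (c2-standard-8 CLX, c3-highcpu-8 SPR) pairs.
OPENSSL_BYTES_PER_SEC = {
    "SHA256": (1318193530, 4283873987),
    "SHA512": (1465815227, 1568572920),
    "ChaCha20": (21346811813, 22091557637),
    "AES-128-GCM": (23237312603, 57594077823),
    "AES-256-GCM": (16945000630, 48008361573),
    "ChaCha20-Poly1305": (10919151843, 15970781140),
}


def speedup(baseline: float, candidate: float) -> float:
    """Ratio candidate / baseline for a "More Is Better" metric.

    Hypothetical helper: values above 1.0 mean the candidate is faster.
    """
    return candidate / baseline


if __name__ == "__main__":
    for algorithm, (clx, spr) in OPENSSL_BYTES_PER_SEC.items():
        print(f"{algorithm:>18}: c3 SPR is {speedup(clx, spr):.2f}x c2 CLX")
```

The same helper applies to any other "More Is Better" result in this file; "Fewer Is Better" results need the ratio inverted.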
nekRS 22.0 - Input: TurboPipe Periodic (FLOP/s, More Is Better)
    c2-standard-8 CLX: 25100766667 (SE +/- 56849518.71, N = 3)
    c3-highcpu-8 SPR: 30667900000 (SE +/- 92013205.57, N = 3)
    (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -lmpi_cxx -lmpi
Embree 4.0.1 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better)
    c2-standard-8 CLX: 3.9340 (SE +/- 0.0030, N = 3; MIN: 3.91 / MAX: 3.99)
    c3-highcpu-8 SPR: 5.8475 (SE +/- 0.0120, N = 3; MIN: 5.81 / MAX: 5.92)
Embree 4.0.1 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better)
    c2-standard-8 CLX: 5.1542 (SE +/- 0.0119, N = 3; MIN: 5.12 / MAX: 5.22)
    c3-highcpu-8 SPR: 7.3642 (SE +/- 0.0079, N = 3; MIN: 7.33 / MAX: 7.44)
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
    c2-standard-8 CLX: 5.69 (SE +/- 0.02, N = 3)
    c3-highcpu-8 SPR: 6.99 (SE +/- 0.03, N = 3)
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Super Fast (Frames Per Second, More Is Better)
    c2-standard-8 CLX: 6.03 (SE +/- 0.00, N = 3)
    c3-highcpu-8 SPR: 7.48 (SE +/- 0.00, N = 3)
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
    c2-standard-8 CLX: 7.45 (SE +/- 0.01, N = 3)
    c3-highcpu-8 SPR: 9.12 (SE +/- 0.01, N = 3)
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, More Is Better)
    c2-standard-8 CLX: 26.30 (SE +/- 0.16, N = 3)
    c3-highcpu-8 SPR: 32.39 (SE +/- 0.25, N = 3)
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Super Fast (Frames Per Second, More Is Better)
    c2-standard-8 CLX: 27.72 (SE +/- 0.01, N = 3)
    c3-highcpu-8 SPR: 34.50 (SE +/- 0.03, N = 3)
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
    c2-standard-8 CLX: 34.47 (SE +/- 0.01, N = 3)
    c3-highcpu-8 SPR: 42.24 (SE +/- 0.01, N = 3)
VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Fast (Frames Per Second, More Is Better)
    c3-highcpu-8 SPR: 1.556 (SE +/- 0.005, N = 3)
    (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Faster (Frames Per Second, More Is Better)
    c3-highcpu-8 SPR: 3.524 (SE +/- 0.000, N = 3)
    (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
VVenC 1.7 - Video Input: Bosphorus 1080p - Video Preset: Fast (Frames Per Second, More Is Better)
    c3-highcpu-8 SPR: 5.305 (SE +/- 0.018, N = 3)
    (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
VVenC 1.7 - Video Input: Bosphorus 1080p - Video Preset: Faster (Frames Per Second, More Is Better)
    c3-highcpu-8 SPR: 12.92 (SE +/- 0.01, N = 3)
    (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects
miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1 (GFInst/s, More Is Better)
    c2-standard-8 CLX: 152.80 (SE +/- 0.02, N = 3)
    c3-highcpu-8 SPR: 188.65 (SE +/- 0.03, N = 3)
    (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm
Intel Open Image Denoise 1.4.0 - Run: RT.hdr_alb_nrm.3840x2160 (Images / Sec, More Is Better)
    c2-standard-8 CLX: 0.22 (SE +/- 0.00, N = 3)
    c3-highcpu-8 SPR: 0.24 (SE +/- 0.00, N = 3)
Intel Open Image Denoise 1.4.0 - Run: RTLightmap.hdr.4096x4096 (Images / Sec, More Is Better)
    c2-standard-8 CLX: 0.11 (SE +/- 0.00, N = 3)
    c3-highcpu-8 SPR: 0.12 (SE +/- 0.00, N = 3)
TensorFlow 2.10 - Device: CPU - Batch Size: 16 - Model: ResNet-50 (images/sec, More Is Better)
    c2-standard-8 CLX: 13.33 (SE +/- 0.02, N = 3)
    c3-highcpu-8 SPR: 14.20 (SE +/- 0.04, N = 3)
TensorFlow 2.10 - Device: CPU - Batch Size: 32 - Model: ResNet-50 (images/sec, More Is Better)
    c2-standard-8 CLX: 14.11 (SE +/- 0.01, N = 3)
    c3-highcpu-8 SPR: 14.93 (SE +/- 0.01, N = 3)
TensorFlow 2.10 - Device: CPU - Batch Size: 64 - Model: ResNet-50 (images/sec, More Is Better)
    c2-standard-8 CLX: 14.79 (SE +/- 0.18, N = 3)
    c3-highcpu-8 SPR: 15.69 (SE +/- 0.01, N = 3)
OpenVKL 1.3.1 - Benchmark: vklBenchmark ISPC (Items / Sec, More Is Better)
    c2-standard-8 CLX: 70 (SE +/- 0.00, N = 3; MIN: 8 / MAX: 1119)
    c3-highcpu-8 SPR: 98 (SE +/- 0.33, N = 3; MIN: 11 / MAX: 1579)
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
    c2-standard-8 CLX: 2.9308 (SE +/- 0.0001, N = 3)
    c3-highcpu-8 SPR: 3.7873 (SE +/- 0.0130, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
    c2-standard-8 CLX: 62.51 (SE +/- 0.07, N = 3)
    c3-highcpu-8 SPR: 51.89 (SE +/- 0.05, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
    c2-standard-8 CLX: 15.92 (SE +/- 0.05, N = 3)
    c3-highcpu-8 SPR: 19.17 (SE +/- 0.03, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
    c2-standard-8 CLX: 59.99 (SE +/- 0.02, N = 3)
    c3-highcpu-8 SPR: 64.87 (SE +/- 0.04, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
    c2-standard-8 CLX: 29.18 (SE +/- 0.03, N = 3)
    c3-highcpu-8 SPR: 33.11 (SE +/- 0.15, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
    c2-standard-8 CLX: 6.1893 (SE +/- 0.0034, N = 3)
    c3-highcpu-8 SPR: 6.5372 (SE +/- 0.0078, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
    c2-standard-8 CLX: 13.99 (SE +/- 0.03, N = 3)
    c3-highcpu-8 SPR: 16.22 (SE +/- 0.13, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
    c2-standard-8 CLX: 2.9352 (SE +/- 0.0038, N = 3)
    c3-highcpu-8 SPR: 3.7693 (SE +/- 0.0239, N = 3)
Zstd Compression 1.5.4 - Compression Level: 19 - Compression Speed (MB/s, More Is Better)
    c2-standard-8 CLX: 9.24 (SE +/- 0.08, N = 3)
    c3-highcpu-8 SPR: 10.30 (SE +/- 0.12, N = 3)
    (CC) gcc options: -O3 -pthread -lz (the chart also lists -llzma once, without indicating which system it applies to)
Zstd Compression 1.5.4 - Compression Level: 19 - Decompression Speed (MB/s, More Is Better)
    c2-standard-8 CLX: 701.1 (SE +/- 1.45, N = 3)
    c3-highcpu-8 SPR: 905.2 (SE +/- 1.50, N = 3)
    (CC) gcc options: -O3 -pthread -lz (the chart also lists -llzma once, without indicating which system it applies to)
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better)
    c2-standard-8 CLX: 5.86 (SE +/- 0.00, N = 3)
    c3-highcpu-8 SPR: 6.50 (SE +/- 0.00, N = 3)
    (CC) gcc options: -O3 -pthread -lz (the chart also lists -llzma once, without indicating which system it applies to)
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better)
    c2-standard-8 CLX: 713.8 (SE +/- 2.02, N = 3)
    c3-highcpu-8 SPR: 907.2 (SE +/- 1.48, N = 3)
    (CC) gcc options: -O3 -pthread -lz (the chart also lists -llzma once, without indicating which system it applies to)
7-Zip Compression 22.01 - Test: Compression Rating (MIPS, More Is Better)
    c2-standard-8 CLX: 30989 (SE +/- 248.15, N = 3)
    c3-highcpu-8 SPR: 35306 (SE +/- 246.17, N = 15)
    (CXX) g++ options: -lpthread -ldl -O2 -fPIC
7-Zip Compression 22.01 - Test: Decompression Rating (MIPS, More Is Better)
    c2-standard-8 CLX: 22852 (SE +/- 52.92, N = 3)
    c3-highcpu-8 SPR: 20468 (SE +/- 183.58, N = 15)
    (CXX) g++ options: -lpthread -ldl -O2 -fPIC
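The "SE +/-" and "N" figures reported with each result give a rough sense of run-to-run noise. The Python sketch below expresses the 7-Zip figures above as a relative standard error and backs out an implied standard deviation. It assumes that "SE" is the standard error of the mean, SE = s / sqrt(N); the function and variable names are illustrative and do not come from the Phoronix Test Suite.

```python
import math


def relative_standard_error(mean: float, se: float) -> float:
    """Standard error expressed as a percentage of the reported mean."""
    return 100.0 * se / mean


def approx_std_dev(se: float, n: int) -> float:
    """Sample standard deviation implied by SE = s / sqrt(N).

    Assumption: the exported "SE" is the standard error of the mean.
    """
    return se * math.sqrt(n)


# 7-Zip Compression 22.01 figures copied from above: (mean MIPS, SE, N).
SEVEN_ZIP = {
    ("Compression Rating", "c2-standard-8 CLX"): (30989, 248.15, 3),
    ("Compression Rating", "c3-highcpu-8 SPR"): (35306, 246.17, 15),
    ("Decompression Rating", "c2-standard-8 CLX"): (22852, 52.92, 3),
    ("Decompression Rating", "c3-highcpu-8 SPR"): (20468, 183.58, 15),
}

if __name__ == "__main__":
    for (test, system), (mean, se, n) in SEVEN_ZIP.items():
        rse = relative_standard_error(mean, se)
        std = approx_std_dev(se, n)
        print(f"{test} / {system}: RSE {rse:.2f}%, "
              f"implied std dev ~{std:.0f} MIPS over N = {n}")
```

Under that assumption, a similar SE obtained from N = 15 runs implies a larger spread between individual runs than the same SE from N = 3.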
LeelaChessZero 0.28 - Backend: BLAS (Nodes Per Second, More Is Better)
    c2-standard-8 CLX: 909 (SE +/- 7.36, N = 3)
    c3-highcpu-8 SPR: 1272 (SE +/- 16.59, N = 3)
    (CXX) g++ options: -flto -pthread
LeelaChessZero 0.28 - Backend: Eigen (Nodes Per Second, More Is Better)
    c2-standard-8 CLX: 902 (SE +/- 9.00, N = 3)
    c3-highcpu-8 SPR: 1221 (SE +/- 7.69, N = 3)
    (CXX) g++ options: -flto -pthread
GROMACS 2023 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
    c2-standard-8 CLX: 0.579 (SE +/- 0.001, N = 3)
    c3-highcpu-8 SPR: 0.777 (SE +/- 0.001, N = 3)
    (CXX) g++ options: -O3
CockroachDB 22.2 - Workload: MoVR - Concurrency: 128 (ops/s, More Is Better)
    c2-standard-8 CLX: 534.9 (SE +/- 1.90, N = 3)
    c3-highcpu-8 SPR: 458.8 (SE +/- 5.62, N = 15)
CockroachDB 22.2 - Workload: KV, 50% Reads - Concurrency: 128 (ops/s, More Is Better)
    c2-standard-8 CLX: 13684.5 (SE +/- 68.65, N = 3)
    c3-highcpu-8 SPR: 19321.6 (SE +/- 40.86, N = 3)
CockroachDB 22.2 - Workload: KV, 95% Reads - Concurrency: 128 (ops/s, More Is Better)
    c2-standard-8 CLX: 16184.2 (SE +/- 79.71, N = 3)
    c3-highcpu-8 SPR: 24960.1 (SE +/- 127.78, N = 3)
Memcached 1.6.18 - Set To Get Ratio: 1:10 (Ops/sec, More Is Better)
    c2-standard-8 CLX: 715723.53 (SE +/- 2213.84, N = 3)
    c3-highcpu-8 SPR: 1044947.13 (SE +/- 3280.78, N = 3)
    (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Memcached 1.6.18 - Set To Get Ratio: 1:100 (Ops/sec, More Is Better)
    c2-standard-8 CLX: 702291.93 (SE +/- 3476.80, N = 3)
    c3-highcpu-8 SPR: 1030937.28 (SE +/- 11157.73, N = 3)
    (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
MariaDB 11.0.1 - Clients: 2048 (Queries Per Second, More Is Better)
    c2-standard-8 CLX: 248 (SE +/- 3.50, N = 3)
    c3-highcpu-8 SPR: 332 (SE +/- 3.01, N = 3)
    (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lcrypt -lz -lm -lssl -lcrypto -lpthread -ldl
MariaDB 11.0.1 - Clients: 4096 (Queries Per Second, More Is Better)
    c2-standard-8 CLX: 237 (SE +/- 2.55, N = 3)
    c3-highcpu-8 SPR: 317 (SE +/- 2.62, N = 3)
    (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lcrypt -lz -lm -lssl -lcrypto -lpthread -ldl
John The Ripper 2023.03.14 - Test: bcrypt (Real C/S, More Is Better)
    c2-standard-8 CLX: 6687 (SE +/- 0.67, N = 3)
    c3-highcpu-8 SPR: 6932 (SE +/- 0.67, N = 3)
    (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt
John The Ripper 2023.03.14 - Test: WPA PSK (Real C/S, More Is Better)
    c2-standard-8 CLX: 31278 (SE +/- 11.37, N = 3)
    c3-highcpu-8 SPR: 28818 (SE +/- 27.15, N = 3)
    (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt
John The Ripper 2023.03.14 - Test: Blowfish (Real C/S, More Is Better)
    c2-standard-8 CLX: 6684 (SE +/- 2.73, N = 3)
    c3-highcpu-8 SPR: 6930 (SE +/- 2.08, N = 3)
    (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt
John The Ripper 2023.03.14 - Test: HMAC-SHA512 (Real C/S, More Is Better)
    c2-standard-8 CLX: 40197000 (SE +/- 32331.62, N = 3)
    c3-highcpu-8 SPR: 37509000 (SE +/- 21071.31, N = 3)
    (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt
John The Ripper 2023.03.14 - Test: MD5 (Real C/S, More Is Better)
    c2-standard-8 CLX: 683546 (SE +/- 162.53, N = 3)
    c3-highcpu-8 SPR: 765713 (SE +/- 1472.13, N = 3)
    (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt
nginx 1.23.2 - Connections: 100 (Requests Per Second, More Is Better)
    c2-standard-8 CLX: 25148.28 (SE +/- 33.32, N = 3)
    c3-highcpu-8 SPR: 36310.35 (SE +/- 23.95, N = 3)
    (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
nginx 1.23.2 - Connections: 200 (Requests Per Second, More Is Better)
    c2-standard-8 CLX: 24695.91 (SE +/- 37.62, N = 3)
    c3-highcpu-8 SPR: 35602.10 (SE +/- 83.52, N = 3)
    (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
nginx 1.23.2 - Connections: 500 (Requests Per Second, More Is Better)
    c2-standard-8 CLX: 21957.17 (SE +/- 17.44, N = 3)
    c3-highcpu-8 SPR: 34672.65 (SE +/- 321.85, N = 3)
    (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
nginx 1.23.2 - Connections: 1000 (Requests Per Second, More Is Better)
    c2-standard-8 CLX: 21446.27 (SE +/- 84.22, N = 3)
    c3-highcpu-8 SPR: 32118.58 (SE +/- 22.42, N = 3)
    (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
nginx 1.23.2 - Connections: 4000 (Requests Per Second, More Is Better)
    c2-standard-8 CLX: 21594.94 (SE +/- 11.52, N = 3)
    c3-highcpu-8 SPR: 32814.75 (SE +/- 27.93, N = 3)
    (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
OpenSSL 3.1 - Algorithm: RSA4096 (sign/s, More Is Better)
    c2-standard-8 CLX: 1156.6 (SE +/- 2.05, N = 3)
    c3-highcpu-8 SPR: 2062.7 (SE +/- 1.17, N = 3)
    (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
PostgreSQL 15 - Scaling Factor: 100 - Clients: 800 - Mode: Read Only (TPS, More Is Better)
    c2-standard-8 CLX: 173317 (SE +/- 822.93, N = 3)
    c3-highcpu-8 SPR: 311942 (SE +/- 3369.27, N = 3)
    (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL 15 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only (TPS, More Is Better)
    c2-standard-8 CLX: 169643 (SE +/- 1014.66, N = 3)
    c3-highcpu-8 SPR: 293725 (SE +/- 2414.33, N = 3)
    (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
OpenSSL 3.1 - Algorithm: RSA4096 (verify/s, More Is Better)
    c2-standard-8 CLX: 76402.3 (SE +/- 47.93, N = 3)
    c3-highcpu-8 SPR: 67857.7 (SE +/- 8.53, N = 3)
    (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
BRL-CAD 7.34 - VGR Performance Metric (VGR Performance Metric, More Is Better)
    c2-standard-8 CLX: 50314
    c3-highcpu-8 SPR: 71072
    (CXX) g++ options: -std=c++14 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lregex_brl -lz_brl -lnetpbm -ldl -lm -ltk8.6
NAMD 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
    c2-standard-8 CLX: 4.44534 (SE +/- 0.00771, N = 3)
    c3-highcpu-8 SPR: 3.35779 (SE +/- 0.00254, N = 3)
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
    c2-standard-8 CLX: 19.52870 (SE +/- 0.01518, N = 3; MIN: 18.98)
    c3-highcpu-8 SPR: 1.50004 (SE +/- 0.00340, N = 3)
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
    c2-standard-8 CLX: 8.02416 (SE +/- 0.01190, N = 3; MIN: 7.86)
    c3-highcpu-8 SPR: 5.34218 (SE +/- 0.00535, N = 3; MIN: 4.94)
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
    c2-standard-8 CLX: 34.18930 (SE +/- 0.00489, N = 3; MIN: 33.77)
    c3-highcpu-8 SPR: 4.17707 (SE +/- 0.00870, N = 3)
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
    c2-standard-8 CLX: 52.94210 (SE +/- 0.00209, N = 3; MIN: 52.7)
    c3-highcpu-8 SPR: 1.47145 (SE +/- 0.00472, N = 3)
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
    c2-standard-8 CLX: 35.91640 (SE +/- 0.01009, N = 3; MIN: 35.71)
    c3-highcpu-8 SPR: 3.54944 (SE +/- 0.01072, N = 3)
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
    c2-standard-8 CLX: 5767.70 (SE +/- 8.38, N = 3; MIN: 5732.12)
    c3-highcpu-8 SPR: 4660.70 (SE +/- 0.86, N = 3; MIN: 4648.25)
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
    c2-standard-8 CLX: 2998.03 (SE +/- 0.83, N = 3; MIN: 2977.98)
    c3-highcpu-8 SPR: 2337.31 (SE +/- 1.76, N = 3; MIN: 2326.77)
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
    c2-standard-8 CLX: 7.432600 (SE +/- 0.007542, N = 3; MIN: 7.25)
    c3-highcpu-8 SPR: 0.968986 (SE +/- 0.013114, N = 3)
    (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
OSPRay Studio 0.11 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer (ms, Fewer Is Better)
  c2-standard-8 CLX: 30911 (SE +/- 32.33, N = 3)
  c3-highcpu-8 SPR: 20952 (SE +/- 4.91, N = 3)
OSPRay Studio 0.11 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer (ms, Fewer Is Better)
  c2-standard-8 CLX: 37892 (SE +/- 39.94, N = 3)
  c3-highcpu-8 SPR: 25819 (SE +/- 364.36, N = 3)
  1. (CXX) g++ options: -O3 -lm -ldl
PostgreSQL 15 - Scaling Factor: 100 - Clients: 800 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  c2-standard-8 CLX: 4.616 (SE +/- 0.022, N = 3)
  c3-highcpu-8 SPR: 2.565 (SE +/- 0.028, N = 3)
PostgreSQL 15 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  c2-standard-8 CLX: 5.895 (SE +/- 0.035, N = 3)
  c3-highcpu-8 SPR: 3.405 (SE +/- 0.028, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
Google Draco 1.5.6 - Model: Lion (ms, Fewer Is Better)
  c2-standard-8 CLX: 7437 (SE +/- 18.52, N = 3)
  c3-highcpu-8 SPR: 6250 (SE +/- 80.23, N = 15)
Google Draco 1.5.6 - Model: Church Facade (ms, Fewer Is Better)
  c2-standard-8 CLX: 11462 (SE +/- 16.76, N = 3)
  c3-highcpu-8 SPR: 7573 (SE +/- 10.48, N = 3)
  1. (CXX) g++ options: -O3
OpenCV 4.7 - Test: Core (ms, Fewer Is Better)
  c2-standard-8 CLX: 142770 (SE +/- 2578.21, N = 12)
  c3-highcpu-8 SPR: 87372 (SE +/- 280.31, N = 3)
OpenCV 4.7 - Test: Video (ms, Fewer Is Better)
  c2-standard-8 CLX: 11737 (SE +/- 50.28, N = 3)
  c3-highcpu-8 SPR: 31654 (SE +/- 198.80, N = 3)
OpenCV 4.7 - Test: Graph API (ms, Fewer Is Better)
  c2-standard-8 CLX: 236186 (SE +/- 1570.24, N = 3)
  c3-highcpu-8 SPR: 219931 (SE +/- 931.36, N = 3)
OpenCV 4.7 - Test: Stitching (ms, Fewer Is Better)
  c2-standard-8 CLX: 250833 (SE +/- 1856.06, N = 3)
  c3-highcpu-8 SPR: 214760 (SE +/- 1973.06, N = 7)
OpenCV 4.7 - Test: Image Processing (ms, Fewer Is Better)
  c2-standard-8 CLX: 147234 (SE +/- 1527.37, N = 4)
  c3-highcpu-8 SPR: 128163 (SE +/- 1624.35, N = 12)
OpenCV 4.7 - Test: Object Detection (ms, Fewer Is Better)
  c2-standard-8 CLX: 58056 (SE +/- 751.87, N = 3)
  c3-highcpu-8 SPR: 38999 (SE +/- 384.74, N = 5)
  1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  c2-standard-8 CLX: 682.37 (SE +/- 0.03, N = 3)
  c3-highcpu-8 SPR: 528.02 (SE +/- 1.78, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  c2-standard-8 CLX: 31.96 (SE +/- 0.03, N = 3)
  c3-highcpu-8 SPR: 38.51 (SE +/- 0.04, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  c2-standard-8 CLX: 125.47 (SE +/- 0.40, N = 3)
  c3-highcpu-8 SPR: 104.31 (SE +/- 0.14, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  c2-standard-8 CLX: 33.29 (SE +/- 0.01, N = 3)
  c3-highcpu-8 SPR: 30.80 (SE +/- 0.02, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  c2-standard-8 CLX: 68.51 (SE +/- 0.08, N = 3)
  c3-highcpu-8 SPR: 60.39 (SE +/- 0.28, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  c2-standard-8 CLX: 323.08 (SE +/- 0.18, N = 3)
  c3-highcpu-8 SPR: 305.90 (SE +/- 0.37, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  c2-standard-8 CLX: 142.89 (SE +/- 0.29, N = 3)
  c3-highcpu-8 SPR: 123.29 (SE +/- 0.95, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  c2-standard-8 CLX: 681.34 (SE +/- 0.89, N = 3)
  c3-highcpu-8 SPR: 530.61 (SE +/- 3.34, N = 3)
Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 129 Cells Per Direction (Seconds, Fewer Is Better)
  c2-standard-8 CLX: 37.76 (SE +/- 0.02, N = 3)
  c3-highcpu-8 SPR: 32.39 (SE +/- 0.02, N = 3)
  1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Mesh Time (Seconds, Fewer Is Better)
  c2-standard-8 CLX: 85.52
  c3-highcpu-8 SPR: 62.04
OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Execution Time (Seconds, Fewer Is Better)
  c2-standard-8 CLX: 560.61
  c3-highcpu-8 SPR: 422.68
  -ldynamicMesh -lphysicalProperties -lspecie -lfiniteVolume -lfvModels -lmeshTools -lsampling 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lgenericPatchFields -lOpenFOAM -ldl -lm
OpenRadioss 2022.10.13 - Model: Bumper Beam (Seconds, Fewer Is Better)
  c2-standard-8 CLX: 390.61 (SE +/- 0.90, N = 3)
  c3-highcpu-8 SPR: 303.38 (SE +/- 0.51, N = 3)
OpenRadioss 2022.10.13 - Model: Cell Phone Drop Test (Seconds, Fewer Is Better)
  c2-standard-8 CLX: 291.49 (SE +/- 0.39, N = 3)
  c3-highcpu-8 SPR: 219.73 (SE +/- 0.38, N = 3)
OpenRadioss 2022.10.13 - Model: Bird Strike on Windshield (Seconds, Fewer Is Better)
  c2-standard-8 CLX: 724.80 (SE +/- 1.51, N = 3)
  c3-highcpu-8 SPR: 595.14 (SE +/- 0.30, N = 3)
OpenRadioss 2022.10.13 - Model: Rubber O-Ring Seal Installation (Seconds, Fewer Is Better)
  c2-standard-8 CLX: 523.45 (SE +/- 0.45, N = 3)
  c3-highcpu-8 SPR: 367.00 (SE +/- 0.20, N = 3)
SPECFEM3D 4.0 - Model: Mount St. Helens (Seconds, Fewer Is Better)
  c2-standard-8 CLX: 145.38 (SE +/- 0.56, N = 3)
  c3-highcpu-8 SPR: 139.52 (SE +/- 0.10, N = 3)
SPECFEM3D 4.0 - Model: Layered Halfspace (Seconds, Fewer Is Better)
  c2-standard-8 CLX: 374.76 (SE +/- 2.51, N = 3)
  c3-highcpu-8 SPR: 372.09 (SE +/- 0.64, N = 3)
SPECFEM3D 4.0 - Model: Tomographic Model (Seconds, Fewer Is Better)
  c2-standard-8 CLX: 150.92 (SE +/- 1.30, N = 3)
  c3-highcpu-8 SPR: 143.94 (SE +/- 1.65, N = 3)
SPECFEM3D 4.0 - Model: Homogeneous Halfspace (Seconds, Fewer Is Better)
  c2-standard-8 CLX: 190.79 (SE +/- 1.87, N = 3)
  c3-highcpu-8 SPR: 179.55 (SE +/- 0.09, N = 3)
SPECFEM3D 4.0 - Model: Water-layered Halfspace (Seconds, Fewer Is Better)
  c2-standard-8 CLX: 347.81 (SE +/- 0.85, N = 3)
  c3-highcpu-8 SPR: 321.63 (SE +/- 0.31, N = 3)
  1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Timed FFmpeg Compilation 6.0 - Time To Compile (Seconds, Fewer Is Better)
  c2-standard-8 CLX: 139.04 (SE +/- 0.03, N = 3)
  c3-highcpu-8 SPR: 120.44 (SE +/- 0.10, N = 3)
Timed Linux Kernel Compilation 6.1 - Build: defconfig (Seconds, Fewer Is Better)
  c2-standard-8 CLX: 289.66 (SE +/- 0.85, N = 3)
  c3-highcpu-8 SPR: 244.80 (SE +/- 0.64, N = 3)
Blender 3.4 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better)
  c3-highcpu-8 SPR: 315.24 (SE +/- 1.13, N = 3)
Phoronix Test Suite v10.8.4