Benchmarks by Michael Larabel for a future article.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2303226-NE-2303218PT51
HTML result view exported from: https://openbenchmarking.org/result/2303226-NE-2303218PT51&grs&sro
Google Cloud c3 Sapphire Rapids

c3-highcpu-8 SPR: Processor: Intel Xeon Platinum 8481C (4 Cores / 8 Threads), Motherboard: Google Compute Engine c3-highcpu-8, Chipset: Intel 440FX 82441FX PMC, Memory: 16GB, Disk: 322GB nvme_card-pd, Network: Google Compute Engine Virtual, OS: Ubuntu 22.10, Kernel: 5.19.0-1015-gcp (x86_64), Vulkan: 1.3.224, Compiler: GCC 12.2.0, File-System: ext4, System Layer: KVM

c2-standard-8 CLX: Processor: Intel Xeon (4 Cores / 8 Threads), Motherboard: Google Compute Engine c2-standard-8, Memory: 32GB, Disk: 322GB PersistentDisk, Network: Red Hat Virtio device (fields not listed match the SPR configuration above)

Kernel Details - Transparent Huge Pages: madvise

Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details - CPU Microcode: 0xffffffff

Python Details - Python 3.10.7

Security Details:
c3-highcpu-8 SPR: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
c2-standard-8 CLX: itlb_multihit: Not affected + l1tf: Not affected + mds: Mitigation of Clear buffers; SMT Host state unknown + meltdown: Not affected + mmio_stale_data: Vulnerable: Clear buffers attempted no microcode; SMT Host state unknown + retbleed: Mitigation of Enhanced IBRS + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Mitigation of Clear buffers; SMT Host state unknown
Google Cloud c3 Sapphire Rapids onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU openssl: SHA256 openssl: AES-256-GCM opencv: Video openssl: AES-128-GCM onednn: IP Shapes 1D - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU pgbench: 100 - 800 - Read Only pgbench: 100 - 800 - Read Only - Average Latency openssl: RSA4096 pgbench: 100 - 1000 - Read Only pgbench: 100 - 1000 - Read Only - Average Latency nginx: 500 cockroach: KV, 95% Reads - 128 nginx: 4000 draco: Church Facade onednn: IP Shapes 3D - bf16bf16bf16 - CPU nginx: 1000 opencv: Object Detection embree: Pathtracer ISPC - Crown ospray-studio: 1 - 4K - 1 - Path Tracer memcached: 1:100 ospray-studio: 3 - 4K - 1 - Path Tracer openssl: ChaCha20-Poly1305 memcached: 1:10 nginx: 100 nginx: 200 embree: Pathtracer ISPC - Asian Dragon openradioss: Rubber O-Ring Seal Installation brl-cad: VGR Performance Metric cockroach: KV, 50% Reads - 128 openvkl: vklBenchmark ISPC lczero: BLAS openfoam: drivaerFastback, Small Mesh Size - Mesh Time lczero: Eigen gromacs: MPI CPU - water_GMX50_bare mysqlslap: 2048 mysqlslap: 4096 openradioss: Cell Phone Drop Test openfoam: drivaerFastback, Small Mesh Size - Execution Time namd: ATPase Simulation - 327,506 Atoms deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream compress-zstd: 19 - Decompression Speed openradioss: Bumper Beam deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU compress-zstd: 19, Long Mode - Decompression Speed uvg266: Bosphorus 1080p - Super Fast uvg266: Bosphorus 4K - 
Super Fast onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU minibude: OpenMP - BM1 minibude: OpenMP - BM1 uvg266: Bosphorus 1080p - Very Fast uvg266: Bosphorus 4K - Very Fast uvg266: Bosphorus 1080p - Ultra Fast uvg266: Bosphorus 4K - Ultra Fast nekrs: TurboPipe Periodic openradioss: Bird Strike on Windshield deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream draco: Lion build-linux-kernel: defconfig opencv: Stitching cockroach: MoVR - 128 incompact3d: input.i3d 129 Cells Per Direction deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream build-ffmpeg: Time To Compile opencv: Image Processing compress-7zip: Compression Rating deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream openssl: RSA4096 john-the-ripper: MD5 compress-7zip: Decompression Rating compress-zstd: 19 - Compression Speed compress-zstd: 19, Long Mode - Compression Speed oidn: RTLightmap.hdr.4096x4096 oidn: RT.hdr_alb_nrm.3840x2160 john-the-ripper: WPA PSK specfem3d: Water-layered Halfspace deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream opencv: Graph API john-the-ripper: HMAC-SHA512 openssl: SHA512 tensorflow: CPU - 16 - ResNet-50 specfem3d: Homogeneous Halfspace tensorflow: CPU - 64 - ResNet-50 tensorflow: CPU - 32 - ResNet-50 deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous 
Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream specfem3d: Tomographic Model specfem3d: Mount St. Helens john-the-ripper: Blowfish john-the-ripper: bcrypt openssl: ChaCha20 specfem3d: Layered Halfspace blender: BMW27 - CPU-Only opencv: Core
c3-highcpu-8 SPR (values in the test-list order above): 4.17707 0.968986 1.47145 4283873987 48008361573 31654 57594077823 1.50004 3.54944 311942 2.565 2062.7 293725 3.405 34672.65 24960.1 32814.75 7573 5.34218 32118.58 38999 5.8475 20952 1030937.28 25819 15970781140 1044947.13 36310.35 35602.10 7.3642 367.00 71072 19321.6 98 1272 62.044277 1221 0.777 332 317 219.73 422.67836 3.35779 528.0193 3.7873 905.2 303.38 3.7693 530.6100 2337.31 907.2 34.50 7.48 4660.70 188.651 7.546 32.39 6.99 42.24 9.12 30667900000 595.14 38.5120 51.8926 19.1677 104.3102 6250 244.800 214760 458.8 32.3943163 16.2194 123.2931 120.438 128163 35306 33.1064 60.3858 67857.7 765713 20468 10.3 6.5 0.12 0.24 28818 321.625251668 64.8747 30.7952 219931 37509000 1568572920 14.20 179.545901627 15.69 14.93 6.5372 305.8991 143.937022875 139.524469614 6930 6932 22091557637 372.089480857 315.24 87372
c2-standard-8 CLX (values in the test-list order above): 34.1893 7.43260 52.9421 1318193530 16945000630 11737 23237312603 19.5287 35.9164 173317 4.616 1156.6 169643 5.895 21957.17 16184.2 21594.94 11462 8.02416 21446.27 58056 3.9340 30911 702291.93 37892 10919151843 715723.53 25148.28 24695.91 5.1542 523.45 50314 13684.5 70 909 85.518264 902 0.579 248 237 291.49 560.60744 4.44534 682.3748 2.9308 701.1 390.61 2.9352 681.3449 2998.03 713.8 27.72 6.03 5767.70 152.798 6.112 26.30 5.69 34.47 7.45 25100766667 724.80 31.9583 62.5073 15.9179 125.4746 7437 289.662 250833 534.9 37.7575671 13.9932 142.8892 139.039 147234 30989 29.1783 68.5076 76402.3 683546 22852 9.24 5.86 0.11 0.22 31278 347.812814070 59.9914 33.2924 236186 40197000 1465815227 13.33 190.794140810 14.79 14.11 6.1893 323.0828 150.915292633 145.379328519 6684 6687 21346811813 374.756877204 142770
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  c2-standard-8 CLX: 34.18930 (SE +/- 0.00489, N = 3) MIN: 33.77
  c3-highcpu-8 SPR: 4.17707 (SE +/- 0.00870, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  c2-standard-8 CLX: 7.432600 (SE +/- 0.007542, N = 3) MIN: 7.25
  c3-highcpu-8 SPR: 0.968986 (SE +/- 0.013114, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  c2-standard-8 CLX: 52.94210 (SE +/- 0.00209, N = 3) MIN: 52.7
  c3-highcpu-8 SPR: 1.47145 (SE +/- 0.00472, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

OpenSSL 3.1 - Algorithm: SHA256 (byte/s, more is better)
  c2-standard-8 CLX: 1318193530 (SE +/- 28303.98, N = 3)
  c3-highcpu-8 SPR: 4283873987 (SE +/- 2722265.85, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL 3.1 - Algorithm: AES-256-GCM (byte/s, more is better)
  c2-standard-8 CLX: 16945000630 (SE +/- 3855504.36, N = 3)
  c3-highcpu-8 SPR: 48008361573 (SE +/- 48074910.93, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenCV 4.7 - Test: Video (ms, fewer is better)
  c2-standard-8 CLX: 11737 (SE +/- 50.28, N = 3)
  c3-highcpu-8 SPR: 31654 (SE +/- 198.80, N = 3)
  1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

OpenSSL 3.1 - Algorithm: AES-128-GCM (byte/s, more is better)
  c2-standard-8 CLX: 23237312603 (SE +/- 6165180.91, N = 3)
  c3-highcpu-8 SPR: 57594077823 (SE +/- 33921495.47, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  c2-standard-8 CLX: 19.52870 (SE +/- 0.01518, N = 3) MIN: 18.98
  c3-highcpu-8 SPR: 1.50004 (SE +/- 0.00340, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  c2-standard-8 CLX: 35.91640 (SE +/- 0.01009, N = 3) MIN: 35.71
  c3-highcpu-8 SPR: 3.54944 (SE +/- 0.01072, N = 3)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
PostgreSQL 15 - Scaling Factor: 100 - Clients: 800 - Mode: Read Only (TPS, more is better)
  c2-standard-8 CLX: 173317 (SE +/- 822.93, N = 3)
  c3-highcpu-8 SPR: 311942 (SE +/- 3369.27, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL 15 - Scaling Factor: 100 - Clients: 800 - Mode: Read Only - Average Latency (ms, fewer is better)
  c2-standard-8 CLX: 4.616 (SE +/- 0.022, N = 3)
  c3-highcpu-8 SPR: 2.565 (SE +/- 0.028, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

OpenSSL 3.1 - Algorithm: RSA4096 (sign/s, more is better)
  c2-standard-8 CLX: 1156.6 (SE +/- 2.05, N = 3)
  c3-highcpu-8 SPR: 2062.7 (SE +/- 1.17, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

PostgreSQL 15 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only (TPS, more is better)
  c2-standard-8 CLX: 169643 (SE +/- 1014.66, N = 3)
  c3-highcpu-8 SPR: 293725 (SE +/- 2414.33, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL 15 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency (ms, fewer is better)
  c2-standard-8 CLX: 5.895 (SE +/- 0.035, N = 3)
  c3-highcpu-8 SPR: 3.405 (SE +/- 0.028, N = 3)
  1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
nginx 1.23.2 - Connections: 500 (Requests Per Second, more is better)
  c2-standard-8 CLX: 21957.17 (SE +/- 17.44, N = 3)
  c3-highcpu-8 SPR: 34672.65 (SE +/- 321.85, N = 3)
  1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

CockroachDB 22.2 - Workload: KV, 95% Reads - Concurrency: 128 (ops/s, more is better)
  c2-standard-8 CLX: 16184.2 (SE +/- 79.71, N = 3)
  c3-highcpu-8 SPR: 24960.1 (SE +/- 127.78, N = 3)

nginx 1.23.2 - Connections: 4000 (Requests Per Second, more is better)
  c2-standard-8 CLX: 21594.94 (SE +/- 11.52, N = 3)
  c3-highcpu-8 SPR: 32814.75 (SE +/- 27.93, N = 3)
  1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

Google Draco 1.5.6 - Model: Church Facade (ms, fewer is better)
  c2-standard-8 CLX: 11462 (SE +/- 16.76, N = 3)
  c3-highcpu-8 SPR: 7573 (SE +/- 10.48, N = 3)
  1. (CXX) g++ options: -O3

oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  c2-standard-8 CLX: 8.02416 (SE +/- 0.01190, N = 3) MIN: 7.86
  c3-highcpu-8 SPR: 5.34218 (SE +/- 0.00535, N = 3) MIN: 4.94
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

nginx 1.23.2 - Connections: 1000 (Requests Per Second, more is better)
  c2-standard-8 CLX: 21446.27 (SE +/- 84.22, N = 3)
  c3-highcpu-8 SPR: 32118.58 (SE +/- 22.42, N = 3)
  1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

OpenCV 4.7 - Test: Object Detection (ms, fewer is better)
  c2-standard-8 CLX: 58056 (SE +/- 751.87, N = 3)
  c3-highcpu-8 SPR: 38999 (SE +/- 384.74, N = 5)
  1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
Embree 4.0.1 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, more is better)
  c2-standard-8 CLX: 3.9340 (SE +/- 0.0030, N = 3) MIN: 3.91 / MAX: 3.99
  c3-highcpu-8 SPR: 5.8475 (SE +/- 0.0120, N = 3) MIN: 5.81 / MAX: 5.92

OSPRay Studio 0.11 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer (ms, fewer is better)
  c2-standard-8 CLX: 30911 (SE +/- 32.33, N = 3)
  c3-highcpu-8 SPR: 20952 (SE +/- 4.91, N = 3)
  1. (CXX) g++ options: -O3 -lm -ldl

Memcached 1.6.18 - Set To Get Ratio: 1:100 (Ops/sec, more is better)
  c2-standard-8 CLX: 702291.93 (SE +/- 3476.80, N = 3)
  c3-highcpu-8 SPR: 1030937.28 (SE +/- 11157.73, N = 3)
  1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

OSPRay Studio 0.11 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer (ms, fewer is better)
  c2-standard-8 CLX: 37892 (SE +/- 39.94, N = 3)
  c3-highcpu-8 SPR: 25819 (SE +/- 364.36, N = 3)
  1. (CXX) g++ options: -O3 -lm -ldl

OpenSSL 3.1 - Algorithm: ChaCha20-Poly1305 (byte/s, more is better)
  c2-standard-8 CLX: 10919151843 (SE +/- 1592104.52, N = 3)
  c3-highcpu-8 SPR: 15970781140 (SE +/- 21990039.72, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

Memcached 1.6.18 - Set To Get Ratio: 1:10 (Ops/sec, more is better)
  c2-standard-8 CLX: 715723.53 (SE +/- 2213.84, N = 3)
  c3-highcpu-8 SPR: 1044947.13 (SE +/- 3280.78, N = 3)
  1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

nginx 1.23.2 - Connections: 100 (Requests Per Second, more is better)
  c2-standard-8 CLX: 25148.28 (SE +/- 33.32, N = 3)
  c3-highcpu-8 SPR: 36310.35 (SE +/- 23.95, N = 3)
  1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx 1.23.2 - Connections: 200 (Requests Per Second, more is better)
  c2-standard-8 CLX: 24695.91 (SE +/- 37.62, N = 3)
  c3-highcpu-8 SPR: 35602.10 (SE +/- 83.52, N = 3)
  1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

Embree 4.0.1 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, more is better)
  c2-standard-8 CLX: 5.1542 (SE +/- 0.0119, N = 3) MIN: 5.12 / MAX: 5.22
  c3-highcpu-8 SPR: 7.3642 (SE +/- 0.0079, N = 3) MIN: 7.33 / MAX: 7.44
OpenRadioss 2022.10.13 - Model: Rubber O-Ring Seal Installation (Seconds, fewer is better)
  c2-standard-8 CLX: 523.45 (SE +/- 0.45, N = 3)
  c3-highcpu-8 SPR: 367.00 (SE +/- 0.20, N = 3)

BRL-CAD 7.34 - VGR Performance Metric (more is better)
  c2-standard-8 CLX: 50314
  c3-highcpu-8 SPR: 71072
  1. (CXX) g++ options: -std=c++14 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lregex_brl -lz_brl -lnetpbm -ldl -lm -ltk8.6

CockroachDB 22.2 - Workload: KV, 50% Reads - Concurrency: 128 (ops/s, more is better)
  c2-standard-8 CLX: 13684.5 (SE +/- 68.65, N = 3)
  c3-highcpu-8 SPR: 19321.6 (SE +/- 40.86, N = 3)

OpenVKL 1.3.1 - Benchmark: vklBenchmark ISPC (Items / Sec, more is better)
  c2-standard-8 CLX: 70 (SE +/- 0.00, N = 3) MIN: 8 / MAX: 1119
  c3-highcpu-8 SPR: 98 (SE +/- 0.33, N = 3) MIN: 11 / MAX: 1579

LeelaChessZero 0.28 - Backend: BLAS (Nodes Per Second, more is better)
  c2-standard-8 CLX: 909 (SE +/- 7.36, N = 3)
  c3-highcpu-8 SPR: 1272 (SE +/- 16.59, N = 3)
  1. (CXX) g++ options: -flto -pthread

OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Mesh Time (Seconds, fewer is better)
  c2-standard-8 CLX: 85.52
  c3-highcpu-8 SPR: 62.04
  -ldynamicMesh -lphysicalProperties -lspecie -lfiniteVolume -lfvModels -lmeshTools -lsampling 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lgenericPatchFields -lOpenFOAM -ldl -lm

LeelaChessZero 0.28 - Backend: Eigen (Nodes Per Second, more is better)
  c2-standard-8 CLX: 902 (SE +/- 9.00, N = 3)
  c3-highcpu-8 SPR: 1221 (SE +/- 7.69, N = 3)
  1. (CXX) g++ options: -flto -pthread

GROMACS 2023 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, more is better)
  c2-standard-8 CLX: 0.579 (SE +/- 0.001, N = 3)
  c3-highcpu-8 SPR: 0.777 (SE +/- 0.001, N = 3)
  1. (CXX) g++ options: -O3
MariaDB 11.0.1 - Clients: 2048 (Queries Per Second, more is better)
  c2-standard-8 CLX: 248 (SE +/- 3.50, N = 3)
  c3-highcpu-8 SPR: 332 (SE +/- 3.01, N = 3)
  1. (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lcrypt -lz -lm -lssl -lcrypto -lpthread -ldl

MariaDB 11.0.1 - Clients: 4096 (Queries Per Second, more is better)
  c2-standard-8 CLX: 237 (SE +/- 2.55, N = 3)
  c3-highcpu-8 SPR: 317 (SE +/- 2.62, N = 3)
  1. (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lcrypt -lz -lm -lssl -lcrypto -lpthread -ldl

OpenRadioss 2022.10.13 - Model: Cell Phone Drop Test (Seconds, fewer is better)
  c2-standard-8 CLX: 291.49 (SE +/- 0.39, N = 3)
  c3-highcpu-8 SPR: 219.73 (SE +/- 0.38, N = 3)

OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Execution Time (Seconds, fewer is better)
  c2-standard-8 CLX: 560.61
  c3-highcpu-8 SPR: 422.68
  -ldynamicMesh -lphysicalProperties -lspecie -lfiniteVolume -lfvModels -lmeshTools -lsampling 1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lgenericPatchFields -lOpenFOAM -ldl -lm

NAMD 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, fewer is better)
  c2-standard-8 CLX: 4.44534 (SE +/- 0.00771, N = 3)
  c3-highcpu-8 SPR: 3.35779 (SE +/- 0.00254, N = 3)
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better)
  c2-standard-8 CLX: 682.37 (SE +/- 0.03, N = 3)
  c3-highcpu-8 SPR: 528.02 (SE +/- 1.78, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, more is better)
  c2-standard-8 CLX: 2.9308 (SE +/- 0.0001, N = 3)
  c3-highcpu-8 SPR: 3.7873 (SE +/- 0.0130, N = 3)

Zstd Compression 1.5.4 - Compression Level: 19 - Decompression Speed (MB/s, more is better)
  c2-standard-8 CLX: 701.1 (SE +/- 1.45, N = 3)
  c3-highcpu-8 SPR: 905.2 (SE +/- 1.50, N = 3)
  -llzma 1. (CC) gcc options: -O3 -pthread -lz

OpenRadioss 2022.10.13 - Model: Bumper Beam (Seconds, fewer is better)
  c2-standard-8 CLX: 390.61 (SE +/- 0.90, N = 3)
  c3-highcpu-8 SPR: 303.38 (SE +/- 0.51, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, more is better)
  c2-standard-8 CLX: 2.9352 (SE +/- 0.0038, N = 3)
  c3-highcpu-8 SPR: 3.7693 (SE +/- 0.0239, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better)
  c2-standard-8 CLX: 681.34 (SE +/- 0.89, N = 3)
  c3-highcpu-8 SPR: 530.61 (SE +/- 3.34, N = 3)

oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  c2-standard-8 CLX: 2998.03 (SE +/- 0.83, N = 3) MIN: 2977.98
  c3-highcpu-8 SPR: 2337.31 (SE +/- 1.76, N = 3) MIN: 2326.77
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Decompression Speed (MB/s, more is better)
  c2-standard-8 CLX: 713.8 (SE +/- 2.02, N = 3)
  c3-highcpu-8 SPR: 907.2 (SE +/- 1.48, N = 3)
  -llzma 1. (CC) gcc options: -O3 -pthread -lz
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Super Fast (Frames Per Second, more is better)
  c2-standard-8 CLX: 27.72 (SE +/- 0.01, N = 3)
  c3-highcpu-8 SPR: 34.50 (SE +/- 0.03, N = 3)

uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Super Fast (Frames Per Second, more is better)
  c2-standard-8 CLX: 6.03 (SE +/- 0.00, N = 3)
  c3-highcpu-8 SPR: 7.48 (SE +/- 0.00, N = 3)

oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  c2-standard-8 CLX: 5767.70 (SE +/- 8.38, N = 3) MIN: 5732.12
  c3-highcpu-8 SPR: 4660.70 (SE +/- 0.86, N = 3) MIN: 4648.25
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1 (GFInst/s, more is better)
  c2-standard-8 CLX: 152.80 (SE +/- 0.02, N = 3)
  c3-highcpu-8 SPR: 188.65 (SE +/- 0.03, N = 3)
  1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE 20210901 - Implementation: OpenMP - Input Deck: BM1 (Billion Interactions/s, more is better)
  c2-standard-8 CLX: 6.112 (SE +/- 0.001, N = 3)
  c3-highcpu-8 SPR: 7.546 (SE +/- 0.001, N = 3)
  1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, more is better)
  c2-standard-8 CLX: 26.30 (SE +/- 0.16, N = 3)
  c3-highcpu-8 SPR: 32.39 (SE +/- 0.25, N = 3)

uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, more is better)
  c2-standard-8 CLX: 5.69 (SE +/- 0.02, N = 3)
  c3-highcpu-8 SPR: 6.99 (SE +/- 0.03, N = 3)

uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, more is better)
  c2-standard-8 CLX: 34.47 (SE +/- 0.01, N = 3)
  c3-highcpu-8 SPR: 42.24 (SE +/- 0.01, N = 3)

uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, more is better)
  c2-standard-8 CLX: 7.45 (SE +/- 0.01, N = 3)
  c3-highcpu-8 SPR: 9.12 (SE +/- 0.01, N = 3)
nekRS 22.0 - Input: TurboPipe Periodic (FLOP/s, more is better)
  c2-standard-8 CLX: 25100766667 (SE +/- 56849518.71, N = 3)
  c3-highcpu-8 SPR: 30667900000 (SE +/- 92013205.57, N = 3)
  1. (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -lmpi_cxx -lmpi

OpenRadioss 2022.10.13 - Model: Bird Strike on Windshield (Seconds, fewer is better)
  c2-standard-8 CLX: 724.80 (SE +/- 1.51, N = 3)
  c3-highcpu-8 SPR: 595.14 (SE +/- 0.30, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better)
  c2-standard-8 CLX: 31.96 (SE +/- 0.03, N = 3)
  c3-highcpu-8 SPR: 38.51 (SE +/- 0.04, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (items/sec, more is better)
  c2-standard-8 CLX: 62.51 (SE +/- 0.07, N = 3)
  c3-highcpu-8 SPR: 51.89 (SE +/- 0.05, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (items/sec, more is better)
  c2-standard-8 CLX: 15.92 (SE +/- 0.05, N = 3)
  c3-highcpu-8 SPR: 19.17 (SE +/- 0.03, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better)
  c2-standard-8 CLX: 125.47 (SE +/- 0.40, N = 3)
  c3-highcpu-8 SPR: 104.31 (SE +/- 0.14, N = 3)

Google Draco 1.5.6 - Model: Lion (ms, fewer is better)
  c2-standard-8 CLX: 7437 (SE +/- 18.52, N = 3)
  c3-highcpu-8 SPR: 6250 (SE +/- 80.23, N = 15)
  1. (CXX) g++ options: -O3
Timed Linux Kernel Compilation 6.1 - Build: defconfig (Seconds, fewer is better)
  c2-standard-8 CLX: 289.66 (SE +/- 0.85, N = 3)
  c3-highcpu-8 SPR: 244.80 (SE +/- 0.64, N = 3)

OpenCV 4.7 - Test: Stitching (ms, fewer is better)
  c2-standard-8 CLX: 250833 (SE +/- 1856.06, N = 3)
  c3-highcpu-8 SPR: 214760 (SE +/- 1973.06, N = 7)
  1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

CockroachDB 22.2 - Workload: MoVR - Concurrency: 128 (ops/s, more is better)
  c2-standard-8 CLX: 534.9 (SE +/- 1.90, N = 3)
  c3-highcpu-8 SPR: 458.8 (SE +/- 5.62, N = 15)

Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 129 Cells Per Direction (Seconds, fewer is better)
  c2-standard-8 CLX: 37.76 (SE +/- 0.02, N = 3)
  c3-highcpu-8 SPR: 32.39 (SE +/- 0.02, N = 3)
  1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (items/sec, more is better)
  c2-standard-8 CLX: 13.99 (SE +/- 0.03, N = 3)
  c3-highcpu-8 SPR: 16.22 (SE +/- 0.13, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better)
  c2-standard-8 CLX: 142.89 (SE +/- 0.29, N = 3)
  c3-highcpu-8 SPR: 123.29 (SE +/- 0.95, N = 3)

Timed FFmpeg Compilation 6.0 - Time To Compile (Seconds, fewer is better)
  c2-standard-8 CLX: 139.04 (SE +/- 0.03, N = 3)
  c3-highcpu-8 SPR: 120.44 (SE +/- 0.10, N = 3)
OpenCV Test: Image Processing OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Image Processing c2-standard-8 CLX c3-highcpu-8 SPR 30K 60K 90K 120K 150K SE +/- 1527.37, N = 4 SE +/- 1624.35, N = 12 147234 128163 1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
7-Zip Compression Test: Compression Rating OpenBenchmarking.org MIPS, More Is Better 7-Zip Compression 22.01 Test: Compression Rating c2-standard-8 CLX c3-highcpu-8 SPR 8K 16K 24K 32K 40K SE +/- 248.15, N = 3 SE +/- 246.17, N = 15 30989 35306 1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream c2-standard-8 CLX c3-highcpu-8 SPR 8 16 24 32 40 SE +/- 0.03, N = 3 SE +/- 0.15, N = 3 29.18 33.11
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream c2-standard-8 CLX c3-highcpu-8 SPR 15 30 45 60 75 SE +/- 0.08, N = 3 SE +/- 0.28, N = 3 68.51 60.39
OpenSSL Algorithm: RSA4096 OpenBenchmarking.org verify/s, More Is Better OpenSSL 3.1 Algorithm: RSA4096 c2-standard-8 CLX c3-highcpu-8 SPR 16K 32K 48K 64K 80K SE +/- 47.93, N = 3 SE +/- 8.53, N = 3 76402.3 67857.7 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
John The Ripper Test: MD5 OpenBenchmarking.org Real C/S, More Is Better John The Ripper 2023.03.14 Test: MD5 c2-standard-8 CLX c3-highcpu-8 SPR 160K 320K 480K 640K 800K SE +/- 162.53, N = 3 SE +/- 1472.13, N = 3 683546 765713 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt
7-Zip Compression Test: Decompression Rating OpenBenchmarking.org MIPS, More Is Better 7-Zip Compression 22.01 Test: Decompression Rating c2-standard-8 CLX c3-highcpu-8 SPR 5K 10K 15K 20K 25K SE +/- 52.92, N = 3 SE +/- 183.58, N = 15 22852 20468 1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC
Zstd Compression 1.5.4 - Compression Level: 19 - Compression Speed (MB/s, more is better):
    c2-standard-8 CLX: 9.24 (SE +/- 0.08, N = 3)
    c3-highcpu-8 SPR: 10.30 (SE +/- 0.12, N = 3)
    1. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Compression Speed (MB/s, more is better):
    c2-standard-8 CLX: 5.86 (SE +/- 0.00, N = 3)
    c3-highcpu-8 SPR: 6.50 (SE +/- 0.00, N = 3)
    1. (CC) gcc options: -O3 -pthread -lz -llzma
Intel Open Image Denoise 1.4.0 - Run: RTLightmap.hdr.4096x4096 (Images / Sec, more is better):
    c2-standard-8 CLX: 0.11 (SE +/- 0.00, N = 3)
    c3-highcpu-8 SPR: 0.12 (SE +/- 0.00, N = 3)

Intel Open Image Denoise 1.4.0 - Run: RT.hdr_alb_nrm.3840x2160 (Images / Sec, more is better):
    c2-standard-8 CLX: 0.22 (SE +/- 0.00, N = 3)
    c3-highcpu-8 SPR: 0.24 (SE +/- 0.00, N = 3)

John The Ripper 2023.03.14 - Test: WPA PSK (Real C/S, more is better):
    c2-standard-8 CLX: 31278 (SE +/- 11.37, N = 3)
    c3-highcpu-8 SPR: 28818 (SE +/- 27.15, N = 3)
    1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

SPECFEM3D 4.0 - Model: Water-layered Halfspace (Seconds, fewer is better):
    c2-standard-8 CLX: 347.81 (SE +/- 0.85, N = 3)
    c3-highcpu-8 SPR: 321.63 (SE +/- 0.31, N = 3)
    1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, more is better):
    c2-standard-8 CLX: 59.99 (SE +/- 0.02, N = 3)
    c3-highcpu-8 SPR: 64.87 (SE +/- 0.04, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better):
    c2-standard-8 CLX: 33.29 (SE +/- 0.01, N = 3)
    c3-highcpu-8 SPR: 30.80 (SE +/- 0.02, N = 3)

OpenCV 4.7 - Test: Graph API (ms, fewer is better):
    c2-standard-8 CLX: 236186 (SE +/- 1570.24, N = 3)
    c3-highcpu-8 SPR: 219931 (SE +/- 931.36, N = 3)
    1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

John The Ripper 2023.03.14 - Test: HMAC-SHA512 (Real C/S, more is better):
    c2-standard-8 CLX: 40197000 (SE +/- 32331.62, N = 3)
    c3-highcpu-8 SPR: 37509000 (SE +/- 21071.31, N = 3)
    1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

OpenSSL 3.1 - Algorithm: SHA512 (byte/s, more is better):
    c2-standard-8 CLX: 1465815227 (SE +/- 2965153.70, N = 3)
    c3-highcpu-8 SPR: 1568572920 (SE +/- 1152941.98, N = 3)
    1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

TensorFlow 2.10 - Device: CPU - Batch Size: 16 - Model: ResNet-50 (images/sec, more is better):
    c2-standard-8 CLX: 13.33 (SE +/- 0.02, N = 3)
    c3-highcpu-8 SPR: 14.20 (SE +/- 0.04, N = 3)

SPECFEM3D 4.0 - Model: Homogeneous Halfspace (Seconds, fewer is better):
    c2-standard-8 CLX: 190.79 (SE +/- 1.87, N = 3)
    c3-highcpu-8 SPR: 179.55 (SE +/- 0.09, N = 3)
    1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

TensorFlow 2.10 - Device: CPU - Batch Size: 64 - Model: ResNet-50 (images/sec, more is better):
    c2-standard-8 CLX: 14.79 (SE +/- 0.18, N = 3)
    c3-highcpu-8 SPR: 15.69 (SE +/- 0.01, N = 3)

TensorFlow 2.10 - Device: CPU - Batch Size: 32 - Model: ResNet-50 (images/sec, more is better):
    c2-standard-8 CLX: 14.11 (SE +/- 0.01, N = 3)
    c3-highcpu-8 SPR: 14.93 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, more is better):
    c2-standard-8 CLX: 6.1893 (SE +/- 0.0034, N = 3)
    c3-highcpu-8 SPR: 6.5372 (SE +/- 0.0078, N = 3)

Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better):
    c2-standard-8 CLX: 323.08 (SE +/- 0.18, N = 3)
    c3-highcpu-8 SPR: 305.90 (SE +/- 0.37, N = 3)

SPECFEM3D 4.0 - Model: Tomographic Model (Seconds, fewer is better):
    c2-standard-8 CLX: 150.92 (SE +/- 1.30, N = 3)
    c3-highcpu-8 SPR: 143.94 (SE +/- 1.65, N = 3)
    1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D 4.0 - Model: Mount St. Helens (Seconds, fewer is better):
    c2-standard-8 CLX: 145.38 (SE +/- 0.56, N = 3)
    c3-highcpu-8 SPR: 139.52 (SE +/- 0.10, N = 3)
    1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

John The Ripper 2023.03.14 - Test: Blowfish (Real C/S, more is better):
    c2-standard-8 CLX: 6684 (SE +/- 2.73, N = 3)
    c3-highcpu-8 SPR: 6930 (SE +/- 2.08, N = 3)
    1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

John The Ripper 2023.03.14 - Test: bcrypt (Real C/S, more is better):
    c2-standard-8 CLX: 6687 (SE +/- 0.67, N = 3)
    c3-highcpu-8 SPR: 6932 (SE +/- 0.67, N = 3)
    1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

OpenSSL 3.1 - Algorithm: ChaCha20 (byte/s, more is better):
    c2-standard-8 CLX: 21346811813 (SE +/- 1497684.67, N = 3)
    c3-highcpu-8 SPR: 22091557637 (SE +/- 35781789.46, N = 3)
    1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

SPECFEM3D 4.0 - Model: Layered Halfspace (Seconds, fewer is better):
    c2-standard-8 CLX: 374.76 (SE +/- 2.51, N = 3)
    c3-highcpu-8 SPR: 372.09 (SE +/- 0.64, N = 3)
    1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Blender 3.4 - Blend File: BMW27 - Compute: CPU-Only (Seconds, fewer is better):
    c3-highcpu-8 SPR: 315.24 (SE +/- 1.13, N = 3)

VVenC 1.7 - Video Input: Bosphorus 1080p - Video Preset: Faster (Frames Per Second, more is better):
    c3-highcpu-8 SPR: 12.92 (SE +/- 0.01, N = 3)
    1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC 1.7 - Video Input: Bosphorus 1080p - Video Preset: Fast (Frames Per Second, more is better):
    c3-highcpu-8 SPR: 5.305 (SE +/- 0.018, N = 3)
    1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Faster (Frames Per Second, more is better):
    c3-highcpu-8 SPR: 3.524 (SE +/- 0.000, N = 3)
    1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Fast (Frames Per Second, more is better):
    c3-highcpu-8 SPR: 1.556 (SE +/- 0.005, N = 3)
    1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

OpenCV 4.7 - Test: Core (ms, fewer is better):
    c2-standard-8 CLX: 142770 (SE +/- 2578.21, N = 12)
    c3-highcpu-8 SPR: 87372 (SE +/- 280.31, N = 3)
    1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt
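As a quick way to read the raw numbers above, a short Python sketch computing the relative c3-highcpu-8 SPR vs. c2-standard-8 CLX difference for a few representative results (values transcribed from the entries above; the test selection is illustrative, not a summary of the whole file):

```python
# SPR-vs-CLX percentage difference for a few results from this file.
# Each entry: (c2-standard-8 CLX value, c3-highcpu-8 SPR value,
#              True if the metric is "fewer is better").
results = {
    "Timed Linux Kernel Compilation 6.1 defconfig (s)": (289.66, 244.80, True),
    "7-Zip 22.01 Compression Rating (MIPS)": (30989, 35306, False),
    "OpenSSL 3.1 RSA4096 (verify/s)": (76402.3, 67857.7, False),
}

for name, (clx, spr, lower_is_better) in results.items():
    # For "fewer is better" metrics, a lower SPR value is a gain for SPR.
    gain = (clx / spr - 1) if lower_is_better else (spr / clx - 1)
    print(f"{name}: SPR {gain:+.1%} vs. CLX")
```

With these three tests, SPR comes out ahead on the kernel build and 7-Zip compression while CLX keeps the lead on RSA4096 verifies, matching the directions shown in the entries above.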
Phoronix Test Suite v10.8.4