Google Cloud c3 Sapphire Rapids

Benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2303226-NE-2303218PT51.

c3-highcpu-8 SPR
Processor: Intel Xeon Platinum 8481C (4 Cores / 8 Threads)
Motherboard: Google Compute Engine c3-highcpu-8
Chipset: Intel 440FX 82441FX PMC
Memory: 16GB
Disk: 322GB nvme_card-pd
Network: Google Compute Engine Virtual
OS: Ubuntu 22.10
Kernel: 5.19.0-1015-gcp (x86_64)
Vulkan: 1.3.224
Compiler: GCC 12.2.0
File-System: ext4
System Layer: KVM

c2-standard-8 CLX
Processor: Intel Xeon (4 Cores / 8 Threads)
Motherboard: Google Compute Engine c2-standard-8
Memory: 32GB
Disk: 322GB PersistentDisk
Network: Red Hat Virtio device
(Fields not listed here are not repeated for this instance in the export.)

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- CPU Microcode: 0xffffffff

Python Details
- Python 3.10.7

Security Details
- c3-highcpu-8 SPR: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
- c2-standard-8 CLX: itlb_multihit: Not affected + l1tf: Not affected + mds: Mitigation of Clear buffers; SMT Host state unknown + meltdown: Not affected + mmio_stale_data: Vulnerable: Clear buffers attempted no microcode; SMT Host state unknown + retbleed: Mitigation of Enhanced IBRS + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Mitigation of Clear buffers; SMT Host state unknown

Result overview table (c3-highcpu-8 SPR vs. c2-standard-8 CLX): the per-test values are listed in the individual result sections that follow.

LeelaChessZero

Backend: BLAS

LeelaChessZero 0.28 (Nodes Per Second, More Is Better)
c3-highcpu-8 SPR: 1272 (SE +/- 16.59, N = 3)
c2-standard-8 CLX: 909 (SE +/- 7.36, N = 3)
(CXX) g++ options: -flto -pthread

LeelaChessZero

Backend: Eigen

LeelaChessZero 0.28 (Nodes Per Second, More Is Better)
c3-highcpu-8 SPR: 1221 (SE +/- 7.69, N = 3)
c2-standard-8 CLX: 902 (SE +/- 9.00, N = 3)
(CXX) g++ options: -flto -pthread
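
Each result is reported with a standard error (SE) over N runs. Assuming SE here is the usual standard error of the mean, the sample standard deviation can be recovered as SE * sqrt(N), and SE as a percentage of the mean gives a rough sense of run-to-run noise. A minimal sketch using the Eigen backend values; the helper names are illustrative and not part of any benchmarking tool:

```python
import math


def stddev_from_se(se: float, n: int) -> float:
    """Recover the sample standard deviation from a standard error of the mean."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return se * math.sqrt(n)


def relative_se_percent(mean: float, se: float) -> float:
    """Express the standard error as a percentage of the reported mean."""
    return 100.0 * se / mean


# (label, mean, SE, N) copied from the LeelaChessZero Eigen result.
eigen_runs = [
    ("c3-highcpu-8 SPR", 1221, 7.69, 3),
    ("c2-standard-8 CLX", 902, 9.00, 3),
]

for label, mean, se, n in eigen_runs:
    print(
        f"{label}: stddev ~ {stddev_from_se(se, n):.2f}, "
        f"SE = {relative_se_percent(mean, se):.2f}% of mean"
    )
```

With only three runs per instance, these figures are coarse estimates rather than tight bounds.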

miniBUDE

Implementation: OpenMP - Input Deck: BM1

miniBUDE 20210901 (GFInst/s, More Is Better)
c3-highcpu-8 SPR: 188.65 (SE +/- 0.03, N = 3)
c2-standard-8 CLX: 152.80 (SE +/- 0.02, N = 3)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM1

miniBUDE 20210901 (Billion Interactions/s, More Is Better)
c3-highcpu-8 SPR: 7.546 (SE +/- 0.001, N = 3)
c2-standard-8 CLX: 6.112 (SE +/- 0.001, N = 3)
(CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

NAMD

ATPase Simulation - 327,506 Atoms

NAMD 2.14 (days/ns, Fewer Is Better)
c3-highcpu-8 SPR: 3.35779 (SE +/- 0.00254, N = 3)
c2-standard-8 CLX: 4.44534 (SE +/- 0.00771, N = 3)
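
The results mix "More Is Better" and "Fewer Is Better" metrics, so a relative comparison has to invert the ratio for the latter. A minimal sketch using values copied from the LeelaChessZero BLAS and NAMD results; the function name `speedup` is illustrative and not part of any benchmarking tool:

```python
def speedup(spr: float, clx: float, more_is_better: bool) -> float:
    """Return how many times faster the SPR instance is than the CLX instance."""
    if spr <= 0 or clx <= 0:
        raise ValueError("results must be positive")
    # For throughput-style metrics a larger value wins; for time-style
    # metrics a smaller value wins, so the ratio is flipped.
    return spr / clx if more_is_better else clx / spr


# LeelaChessZero 0.28, Backend: BLAS (Nodes Per Second, More Is Better)
lczero_blas = speedup(1272, 909, more_is_better=True)

# NAMD 2.14, ATPase Simulation (days/ns, Fewer Is Better)
namd = speedup(3.35779, 4.44534, more_is_better=False)

print(f"LeelaChessZero BLAS: {lczero_blas:.2f}x")
print(f"NAMD ATPase: {namd:.2f}x")
```

A value above 1.0 means the c3-highcpu-8 SPR instance is ahead; a value below 1.0 means the c2-standard-8 CLX instance is ahead.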

nekRS

Input: TurboPipe Periodic

nekRS 22.0 (FLOP/s, More Is Better)
c3-highcpu-8 SPR: 30667900000 (SE +/- 92013205.57, N = 3)
c2-standard-8 CLX: 25100766667 (SE +/- 56849518.71, N = 3)
(CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -lmpi_cxx -lmpi

Xcompact3d Incompact3d

Input: input.i3d 129 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11 (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 32.39 (SE +/- 0.02, N = 3)
c2-standard-8 CLX: 37.76 (SE +/- 0.02, N = 3)
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

OpenFOAM

Input: drivaerFastback, Small Mesh Size - Mesh Time

OpenFOAM 10 (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 62.04
c2-standard-8 CLX: 85.52
-lphysicalProperties -lspecie -lfiniteVolume -lfvModels -lmeshTools -lsampling
-ldynamicMesh
(CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenFOAM

Input: drivaerFastback, Small Mesh Size - Execution Time

OpenFOAM 10 (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 422.68
c2-standard-8 CLX: 560.61
-lphysicalProperties -lspecie -lfiniteVolume -lfvModels -lmeshTools -lsampling
-ldynamicMesh
(CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenRadioss

Model: Bumper Beam

OpenRadioss 2022.10.13 (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 303.38 (SE +/- 0.51, N = 3)
c2-standard-8 CLX: 390.61 (SE +/- 0.90, N = 3)

OpenRadioss

Model: Cell Phone Drop Test

OpenRadioss 2022.10.13 (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 219.73 (SE +/- 0.38, N = 3)
c2-standard-8 CLX: 291.49 (SE +/- 0.39, N = 3)

OpenRadioss

Model: Bird Strike on Windshield

OpenRadioss 2022.10.13 (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 595.14 (SE +/- 0.30, N = 3)
c2-standard-8 CLX: 724.80 (SE +/- 1.51, N = 3)

OpenRadioss

Model: Rubber O-Ring Seal Installation

OpenRadioss 2022.10.13 (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 367.00 (SE +/- 0.20, N = 3)
c2-standard-8 CLX: 523.45 (SE +/- 0.45, N = 3)

SPECFEM3D

Model: Mount St. Helens

SPECFEM3D 4.0 (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 139.52 (SE +/- 0.10, N = 3)
c2-standard-8 CLX: 145.38 (SE +/- 0.56, N = 3)
(F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D

Model: Layered Halfspace

SPECFEM3D 4.0 (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 372.09 (SE +/- 0.64, N = 3)
c2-standard-8 CLX: 374.76 (SE +/- 2.51, N = 3)
(F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D

Model: Tomographic Model

SPECFEM3D 4.0 (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 143.94 (SE +/- 1.65, N = 3)
c2-standard-8 CLX: 150.92 (SE +/- 1.30, N = 3)
(F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D

Model: Homogeneous Halfspace

SPECFEM3D 4.0 (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 179.55 (SE +/- 0.09, N = 3)
c2-standard-8 CLX: 190.79 (SE +/- 1.87, N = 3)
(F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D

Model: Water-layered Halfspace

SPECFEM3D 4.0 (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 321.63 (SE +/- 0.31, N = 3)
c2-standard-8 CLX: 347.81 (SE +/- 0.85, N = 3)
(F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Zstd Compression

Compression Level: 19 - Compression Speed

Zstd Compression 1.5.4 (MB/s, More Is Better)
c3-highcpu-8 SPR: 10.30 (SE +/- 0.12, N = 3)
c2-standard-8 CLX: 9.24 (SE +/- 0.08, N = 3)
-llzma
(CC) gcc options: -O3 -pthread -lz

Zstd Compression

Compression Level: 19 - Decompression Speed

Zstd Compression 1.5.4 (MB/s, More Is Better)
c3-highcpu-8 SPR: 905.2 (SE +/- 1.50, N = 3)
c2-standard-8 CLX: 701.1 (SE +/- 1.45, N = 3)
-llzma
(CC) gcc options: -O3 -pthread -lz

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.4 (MB/s, More Is Better)
c3-highcpu-8 SPR: 6.50 (SE +/- 0.00, N = 3)
c2-standard-8 CLX: 5.86 (SE +/- 0.00, N = 3)
-llzma
(CC) gcc options: -O3 -pthread -lz

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

Zstd Compression 1.5.4 (MB/s, More Is Better)
c3-highcpu-8 SPR: 907.2 (SE +/- 1.48, N = 3)
c2-standard-8 CLX: 713.8 (SE +/- 2.02, N = 3)
-llzma
(CC) gcc options: -O3 -pthread -lz

John The Ripper

Test: bcrypt

John The Ripper 2023.03.14 (Real C/S, More Is Better)
c3-highcpu-8 SPR: 6932 (SE +/- 0.67, N = 3)
c2-standard-8 CLX: 6687 (SE +/- 0.67, N = 3)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

John The Ripper

Test: WPA PSK

John The Ripper 2023.03.14 (Real C/S, More Is Better)
c3-highcpu-8 SPR: 28818 (SE +/- 27.15, N = 3)
c2-standard-8 CLX: 31278 (SE +/- 11.37, N = 3)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

John The Ripper

Test: Blowfish

John The Ripper 2023.03.14 (Real C/S, More Is Better)
c3-highcpu-8 SPR: 6930 (SE +/- 2.08, N = 3)
c2-standard-8 CLX: 6684 (SE +/- 2.73, N = 3)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

John The Ripper

Test: HMAC-SHA512

John The Ripper 2023.03.14 (Real C/S, More Is Better)
c3-highcpu-8 SPR: 37509000 (SE +/- 21071.31, N = 3)
c2-standard-8 CLX: 40197000 (SE +/- 32331.62, N = 3)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

John The Ripper

Test: MD5

John The Ripper 2023.03.14 (Real C/S, More Is Better)
c3-highcpu-8 SPR: 765713 (SE +/- 1472.13, N = 3)
c2-standard-8 CLX: 683546 (SE +/- 162.53, N = 3)
(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

Embree

Binary: Pathtracer ISPC - Model: Crown

Embree 4.0.1 (Frames Per Second, More Is Better)
c3-highcpu-8 SPR: 5.8475 (SE +/- 0.0120, N = 3; MIN: 5.81 / MAX: 5.92)
c2-standard-8 CLX: 3.9340 (SE +/- 0.0030, N = 3; MIN: 3.91 / MAX: 3.99)

Embree

Binary: Pathtracer ISPC - Model: Asian Dragon

Embree 4.0.1 (Frames Per Second, More Is Better)
c3-highcpu-8 SPR: 7.3642 (SE +/- 0.0079, N = 3; MIN: 7.33 / MAX: 7.44)
c2-standard-8 CLX: 5.1542 (SE +/- 0.0119, N = 3; MIN: 5.12 / MAX: 5.22)

uvg266

Video Input: Bosphorus 4K - Video Preset: Very Fast

uvg266 0.4.1 (Frames Per Second, More Is Better)
c3-highcpu-8 SPR: 6.99 (SE +/- 0.03, N = 3)
c2-standard-8 CLX: 5.69 (SE +/- 0.02, N = 3)

uvg266

Video Input: Bosphorus 4K - Video Preset: Super Fast

uvg266 0.4.1 (Frames Per Second, More Is Better)
c3-highcpu-8 SPR: 7.48 (SE +/- 0.00, N = 3)
c2-standard-8 CLX: 6.03 (SE +/- 0.00, N = 3)

uvg266

Video Input: Bosphorus 4K - Video Preset: Ultra Fast

uvg266 0.4.1 (Frames Per Second, More Is Better)
c3-highcpu-8 SPR: 9.12 (SE +/- 0.01, N = 3)
c2-standard-8 CLX: 7.45 (SE +/- 0.01, N = 3)

uvg266

Video Input: Bosphorus 1080p - Video Preset: Very Fast

uvg266 0.4.1 (Frames Per Second, More Is Better)
c3-highcpu-8 SPR: 32.39 (SE +/- 0.25, N = 3)
c2-standard-8 CLX: 26.30 (SE +/- 0.16, N = 3)

uvg266

Video Input: Bosphorus 1080p - Video Preset: Super Fast

uvg266 0.4.1 (Frames Per Second, More Is Better)
c3-highcpu-8 SPR: 34.50 (SE +/- 0.03, N = 3)
c2-standard-8 CLX: 27.72 (SE +/- 0.01, N = 3)

uvg266

Video Input: Bosphorus 1080p - Video Preset: Ultra Fast

uvg266 0.4.1 (Frames Per Second, More Is Better)
c3-highcpu-8 SPR: 42.24 (SE +/- 0.01, N = 3)
c2-standard-8 CLX: 34.47 (SE +/- 0.01, N = 3)

VVenC

Video Input: Bosphorus 4K - Video Preset: Fast

VVenC 1.7 (Frames Per Second, More Is Better)
c3-highcpu-8 SPR: 1.556 (SE +/- 0.005, N = 3)
(CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC

Video Input: Bosphorus 4K - Video Preset: Faster

VVenC 1.7 (Frames Per Second, More Is Better)
c3-highcpu-8 SPR: 3.524 (SE +/- 0.000, N = 3)
(CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC

Video Input: Bosphorus 1080p - Video Preset: Fast

VVenC 1.7 (Frames Per Second, More Is Better)
c3-highcpu-8 SPR: 5.305 (SE +/- 0.018, N = 3)
(CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC

Video Input: Bosphorus 1080p - Video Preset: Faster

VVenC 1.7 (Frames Per Second, More Is Better)
c3-highcpu-8 SPR: 12.92 (SE +/- 0.01, N = 3)
(CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

Intel Open Image Denoise

Run: RT.hdr_alb_nrm.3840x2160

Intel Open Image Denoise 1.4.0 (Images / Sec, More Is Better)
c3-highcpu-8 SPR: 0.24 (SE +/- 0.00, N = 3)
c2-standard-8 CLX: 0.22 (SE +/- 0.00, N = 3)

Intel Open Image Denoise

Run: RTLightmap.hdr.4096x4096

Intel Open Image Denoise 1.4.0 (Images / Sec, More Is Better)
c3-highcpu-8 SPR: 0.12 (SE +/- 0.00, N = 3)
c2-standard-8 CLX: 0.11 (SE +/- 0.00, N = 3)

OpenVKL

Benchmark: vklBenchmark ISPC

OpenVKL 1.3.1 (Items / Sec, More Is Better)
c3-highcpu-8 SPR: 98 (SE +/- 0.33, N = 3; MIN: 11 / MAX: 1579)
c2-standard-8 CLX: 70 (SE +/- 0.00, N = 3; MIN: 8 / MAX: 1119)

7-Zip Compression

Test: Compression Rating

7-Zip Compression 22.01 (MIPS, More Is Better)
c3-highcpu-8 SPR: 35306 (SE +/- 246.17, N = 15)
c2-standard-8 CLX: 30989 (SE +/- 248.15, N = 3)
(CXX) g++ options: -lpthread -ldl -O2 -fPIC

7-Zip Compression

Test: Decompression Rating

7-Zip Compression 22.01 (MIPS, More Is Better)
c3-highcpu-8 SPR: 20468 (SE +/- 183.58, N = 15)
c2-standard-8 CLX: 22852 (SE +/- 52.92, N = 3)
(CXX) g++ options: -lpthread -ldl -O2 -fPIC

Timed FFmpeg Compilation

Time To Compile

Timed FFmpeg Compilation 6.0 (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 120.44 (SE +/- 0.10, N = 3)
c2-standard-8 CLX: 139.04 (SE +/- 0.03, N = 3)

Timed Linux Kernel Compilation

Build: defconfig

Timed Linux Kernel Compilation 6.1 (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 244.80 (SE +/- 0.64, N = 3)
c2-standard-8 CLX: 289.66 (SE +/- 0.85, N = 3)

oneDNN

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0 (ms, Fewer Is Better)
c3-highcpu-8 SPR: 1.50004 (SE +/- 0.00340, N = 3)
c2-standard-8 CLX: 19.52870 (SE +/- 0.01518, N = 3; MIN: 18.98)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0 (ms, Fewer Is Better)
c3-highcpu-8 SPR: 5.34218 (SE +/- 0.00535, N = 3; MIN: 4.94)
c2-standard-8 CLX: 8.02416 (SE +/- 0.01190, N = 3; MIN: 7.86)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0 (ms, Fewer Is Better)
c3-highcpu-8 SPR: 4.17707 (SE +/- 0.00870, N = 3)
c2-standard-8 CLX: 34.18930 (SE +/- 0.00489, N = 3; MIN: 33.77)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0 (ms, Fewer Is Better)
c3-highcpu-8 SPR: 1.47145 (SE +/- 0.00472, N = 3)
c2-standard-8 CLX: 52.94210 (SE +/- 0.00209, N = 3; MIN: 52.7)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0 (ms, Fewer Is Better)
c3-highcpu-8 SPR: 3.54944 (SE +/- 0.01072, N = 3)
c2-standard-8 CLX: 35.91640 (SE +/- 0.01009, N = 3; MIN: 35.71)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0 (ms, Fewer Is Better)
c3-highcpu-8 SPR: 4660.70 (SE +/- 0.86, N = 3; MIN: 4648.25)
c2-standard-8 CLX: 5767.70 (SE +/- 8.38, N = 3; MIN: 5732.12)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0 (ms, Fewer Is Better)
c3-highcpu-8 SPR: 2337.31 (SE +/- 1.76, N = 3; MIN: 2326.77)
c2-standard-8 CLX: 2998.03 (SE +/- 0.83, N = 3; MIN: 2977.98)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0 (ms, Fewer Is Better)
c3-highcpu-8 SPR: 0.968986 (SE +/- 0.013114, N = 3)
c2-standard-8 CLX: 7.432600 (SE +/- 0.007542, N = 3; MIN: 7.25)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

OSPRay Studio

Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer

OSPRay Studio 0.11 (ms, Fewer Is Better)
c3-highcpu-8 SPR: 20952 (SE +/- 4.91, N = 3)
c2-standard-8 CLX: 30911 (SE +/- 32.33, N = 3)
(CXX) g++ options: -O3 -lm -ldl

OSPRay Studio

Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer

OSPRay Studio 0.11 (ms, Fewer Is Better)
c3-highcpu-8 SPR: 25819 (SE +/- 364.36, N = 3)
c2-standard-8 CLX: 37892 (SE +/- 39.94, N = 3)
(CXX) g++ options: -O3 -lm -ldl

OpenSSL

Algorithm: SHA256

OpenSSL 3.1 (byte/s, More Is Better)
c3-highcpu-8 SPR: 4283873987 (SE +/- 2722265.85, N = 3)
c2-standard-8 CLX: 1318193530 (SE +/- 28303.98, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: SHA512

OpenSSL 3.1 (byte/s, More Is Better)
c3-highcpu-8 SPR: 1568572920 (SE +/- 1152941.98, N = 3)
c2-standard-8 CLX: 1465815227 (SE +/- 2965153.70, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: RSA4096

OpenSSL 3.1 (sign/s, More Is Better)
c3-highcpu-8 SPR: 2062.7 (SE +/- 1.17, N = 3)
c2-standard-8 CLX: 1156.6 (SE +/- 2.05, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: RSA4096

OpenSSL 3.1 (verify/s, More Is Better)
c3-highcpu-8 SPR: 67857.7 (SE +/- 8.53, N = 3)
c2-standard-8 CLX: 76402.3 (SE +/- 47.93, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: ChaCha20

OpenSSL 3.1 (byte/s, More Is Better)
c3-highcpu-8 SPR: 22091557637 (SE +/- 35781789.46, N = 3)
c2-standard-8 CLX: 21346811813 (SE +/- 1497684.67, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: AES-128-GCM

OpenSSL 3.1 (byte/s, More Is Better)
c3-highcpu-8 SPR: 57594077823 (SE +/- 33921495.47, N = 3)
c2-standard-8 CLX: 23237312603 (SE +/- 6165180.91, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: AES-256-GCM

OpenSSL 3.1 (byte/s, More Is Better)
c3-highcpu-8 SPR: 48008361573 (SE +/- 48074910.93, N = 3)
c2-standard-8 CLX: 16945000630 (SE +/- 3855504.36, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: ChaCha20-Poly1305

OpenSSL 3.1 (byte/s, More Is Better)
c3-highcpu-8 SPR: 15970781140 (SE +/- 21990039.72, N = 3)
c2-standard-8 CLX: 10919151843 (SE +/- 1592104.52, N = 3)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

CockroachDB

Workload: MoVR - Concurrency: 128

CockroachDB 22.2 (ops/s, More Is Better)
c3-highcpu-8 SPR: 458.8 (SE +/- 5.62, N = 15)
c2-standard-8 CLX: 534.9 (SE +/- 1.90, N = 3)

CockroachDB

Workload: KV, 50% Reads - Concurrency: 128

CockroachDB 22.2 (ops/s, More Is Better)
c3-highcpu-8 SPR: 19321.6 (SE +/- 40.86, N = 3)
c2-standard-8 CLX: 13684.5 (SE +/- 68.65, N = 3)

CockroachDB

Workload: KV, 95% Reads - Concurrency: 128

CockroachDB 22.2 (ops/s, More Is Better)
c3-highcpu-8 SPR: 24960.1 (SE +/- 127.78, N = 3)
c2-standard-8 CLX: 16184.2 (SE +/- 79.71, N = 3)

Memcached

Set To Get Ratio: 1:10

Memcached 1.6.18 (Ops/sec, More Is Better)
c3-highcpu-8 SPR: 1044947.13 (SE +/- 3280.78, N = 3)
c2-standard-8 CLX: 715723.53 (SE +/- 2213.84, N = 3)
(CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Memcached

Set To Get Ratio: 1:100

Memcached 1.6.18 (Ops/sec, More Is Better)
c3-highcpu-8 SPR: 1030937.28 (SE +/- 11157.73, N = 3)
c2-standard-8 CLX: 702291.93 (SE +/- 3476.80, N = 3)
(CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2023 (Ns Per Day, More Is Better)
c3-highcpu-8 SPR: 0.777 (SE +/- 0.001, N = 3)
c2-standard-8 CLX: 0.579 (SE +/- 0.001, N = 3)
(CXX) g++ options: -O3

MariaDB

Clients: 2048

MariaDB 11.0.1 (Queries Per Second, More Is Better)
c3-highcpu-8 SPR: 332 (SE +/- 3.01, N = 3)
c2-standard-8 CLX: 248 (SE +/- 3.50, N = 3)
(CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lcrypt -lz -lm -lssl -lcrypto -lpthread -ldl

MariaDB

Clients: 4096

MariaDB 11.0.1 (Queries Per Second, More Is Better)
c3-highcpu-8 SPR: 317 (SE +/- 2.62, N = 3)
c2-standard-8 CLX: 237 (SE +/- 2.55, N = 3)
(CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lcrypt -lz -lm -lssl -lcrypto -lpthread -ldl

PostgreSQL

Scaling Factor: 100 - Clients: 800 - Mode: Read Only

PostgreSQL 15 (TPS, More Is Better)
c3-highcpu-8 SPR: 311942 (SE +/- 3369.27, N = 3)
c2-standard-8 CLX: 173317 (SE +/- 822.93, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL

Scaling Factor: 100 - Clients: 800 - Mode: Read Only - Average Latency

PostgreSQL 15 (ms, Fewer Is Better)
c3-highcpu-8 SPR: 2.565 (SE +/- 0.028, N = 3)
c2-standard-8 CLX: 4.616 (SE +/- 0.022, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Only

PostgreSQL 15, Scaling Factor: 100 - Clients: 1000 - Mode: Read Only (TPS, More Is Better)
c3-highcpu-8 SPR: 293725 (SE +/- 2414.33, N = 3)
c2-standard-8 CLX: 169643 (SE +/- 1014.66, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency

PostgreSQL 15, Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
c3-highcpu-8 SPR: 3.405 (SE +/- 0.028, N = 3)
c2-standard-8 CLX: 5.895 (SE +/- 0.035, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

TensorFlow

Device: CPU - Batch Size: 16 - Model: ResNet-50

TensorFlow 2.10, Device: CPU - Batch Size: 16 - Model: ResNet-50 (images/sec, More Is Better)
c3-highcpu-8 SPR: 14.20 (SE +/- 0.04, N = 3)
c2-standard-8 CLX: 13.33 (SE +/- 0.02, N = 3)

TensorFlow

Device: CPU - Batch Size: 32 - Model: ResNet-50

TensorFlow 2.10, Device: CPU - Batch Size: 32 - Model: ResNet-50 (images/sec, More Is Better)
c3-highcpu-8 SPR: 14.93 (SE +/- 0.01, N = 3)
c2-standard-8 CLX: 14.11 (SE +/- 0.01, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: ResNet-50

TensorFlow 2.10, Device: CPU - Batch Size: 64 - Model: ResNet-50 (images/sec, More Is Better)
c3-highcpu-8 SPR: 15.69 (SE +/- 0.01, N = 3)
c2-standard-8 CLX: 14.79 (SE +/- 0.18, N = 3)

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
c3-highcpu-8 SPR: 3.7873 (SE +/- 0.0130, N = 3)
c2-standard-8 CLX: 2.9308 (SE +/- 0.0001, N = 3)

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
c3-highcpu-8 SPR: 528.02 (SE +/- 1.78, N = 3)
c2-standard-8 CLX: 682.37 (SE +/- 0.03, N = 3)

Neural Magic DeepSparse

Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
c3-highcpu-8 SPR: 51.89 (SE +/- 0.05, N = 3)
c2-standard-8 CLX: 62.51 (SE +/- 0.07, N = 3)

Neural Magic DeepSparse

Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
c3-highcpu-8 SPR: 38.51 (SE +/- 0.04, N = 3)
c2-standard-8 CLX: 31.96 (SE +/- 0.03, N = 3)

Neural Magic DeepSparse

Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
c3-highcpu-8 SPR: 19.17 (SE +/- 0.03, N = 3)
c2-standard-8 CLX: 15.92 (SE +/- 0.05, N = 3)

Neural Magic DeepSparse

Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
c3-highcpu-8 SPR: 104.31 (SE +/- 0.14, N = 3)
c2-standard-8 CLX: 125.47 (SE +/- 0.40, N = 3)

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
c3-highcpu-8 SPR: 64.87 (SE +/- 0.04, N = 3)
c2-standard-8 CLX: 59.99 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
c3-highcpu-8 SPR: 30.80 (SE +/- 0.02, N = 3)
c2-standard-8 CLX: 33.29 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
c3-highcpu-8 SPR: 33.11 (SE +/- 0.15, N = 3)
c2-standard-8 CLX: 29.18 (SE +/- 0.03, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
c3-highcpu-8 SPR: 60.39 (SE +/- 0.28, N = 3)
c2-standard-8 CLX: 68.51 (SE +/- 0.08, N = 3)

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
c3-highcpu-8 SPR: 6.5372 (SE +/- 0.0078, N = 3)
c2-standard-8 CLX: 6.1893 (SE +/- 0.0034, N = 3)

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
c3-highcpu-8 SPR: 305.90 (SE +/- 0.37, N = 3)
c2-standard-8 CLX: 323.08 (SE +/- 0.18, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
c3-highcpu-8 SPR: 16.22 (SE +/- 0.13, N = 3)
c2-standard-8 CLX: 13.99 (SE +/- 0.03, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
c3-highcpu-8 SPR: 123.29 (SE +/- 0.95, N = 3)
c2-standard-8 CLX: 142.89 (SE +/- 0.29, N = 3)

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
c3-highcpu-8 SPR: 3.7693 (SE +/- 0.0239, N = 3)
c2-standard-8 CLX: 2.9352 (SE +/- 0.0038, N = 3)

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2, Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
c3-highcpu-8 SPR: 530.61 (SE +/- 3.34, N = 3)
c2-standard-8 CLX: 681.34 (SE +/- 0.89, N = 3)

Google Draco

Model: Lion

Google Draco 1.5.6, Model: Lion (ms, Fewer Is Better)
c3-highcpu-8 SPR: 6250 (SE +/- 80.23, N = 15)
c2-standard-8 CLX: 7437 (SE +/- 18.52, N = 3)
1. (CXX) g++ options: -O3

Google Draco

Model: Church Facade

Google Draco 1.5.6, Model: Church Facade (ms, Fewer Is Better)
c3-highcpu-8 SPR: 7573 (SE +/- 10.48, N = 3)
c2-standard-8 CLX: 11462 (SE +/- 16.76, N = 3)
1. (CXX) g++ options: -O3

Blender

Blend File: BMW27 - Compute: CPU-Only

Blender 3.4, Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better)
c3-highcpu-8 SPR: 315.24 (SE +/- 1.13, N = 3)

nginx

Connections: 100

nginx 1.23.2, Connections: 100 (Requests Per Second, More Is Better)
c3-highcpu-8 SPR: 36310.35 (SE +/- 23.95, N = 3)
c2-standard-8 CLX: 25148.28 (SE +/- 33.32, N = 3)
1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx

Connections: 200

nginx 1.23.2, Connections: 200 (Requests Per Second, More Is Better)
c3-highcpu-8 SPR: 35602.10 (SE +/- 83.52, N = 3)
c2-standard-8 CLX: 24695.91 (SE +/- 37.62, N = 3)
1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx

Connections: 500

nginx 1.23.2, Connections: 500 (Requests Per Second, More Is Better)
c3-highcpu-8 SPR: 34672.65 (SE +/- 321.85, N = 3)
c2-standard-8 CLX: 21957.17 (SE +/- 17.44, N = 3)
1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx

Connections: 1000

nginx 1.23.2, Connections: 1000 (Requests Per Second, More Is Better)
c3-highcpu-8 SPR: 32118.58 (SE +/- 22.42, N = 3)
c2-standard-8 CLX: 21446.27 (SE +/- 84.22, N = 3)
1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx

Connections: 4000

nginx 1.23.2, Connections: 4000 (Requests Per Second, More Is Better)
c3-highcpu-8 SPR: 32814.75 (SE +/- 27.93, N = 3)
c2-standard-8 CLX: 21594.94 (SE +/- 11.52, N = 3)
1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

BRL-CAD

VGR Performance Metric

BRL-CAD 7.34, VGR Performance Metric (More Is Better)
c3-highcpu-8 SPR: 71072
c2-standard-8 CLX: 50314
1. (CXX) g++ options: -std=c++14 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lregex_brl -lz_brl -lnetpbm -ldl -lm -ltk8.6

OpenCV

Test: Core

OpenCV 4.7, Test: Core (ms, Fewer Is Better)
c3-highcpu-8 SPR: 87372 (SE +/- 280.31, N = 3)
c2-standard-8 CLX: 142770 (SE +/- 2578.21, N = 12)
1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

OpenCV

Test: Video

OpenCV 4.7, Test: Video (ms, Fewer Is Better)
c3-highcpu-8 SPR: 31654 (SE +/- 198.80, N = 3)
c2-standard-8 CLX: 11737 (SE +/- 50.28, N = 3)
1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

OpenCV

Test: Graph API

OpenCV 4.7, Test: Graph API (ms, Fewer Is Better)
c3-highcpu-8 SPR: 219931 (SE +/- 931.36, N = 3)
c2-standard-8 CLX: 236186 (SE +/- 1570.24, N = 3)
1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

OpenCV

Test: Stitching

OpenCV 4.7, Test: Stitching (ms, Fewer Is Better)
c3-highcpu-8 SPR: 214760 (SE +/- 1973.06, N = 7)
c2-standard-8 CLX: 250833 (SE +/- 1856.06, N = 3)
1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

OpenCV

Test: Image Processing

OpenCV 4.7, Test: Image Processing (ms, Fewer Is Better)
c3-highcpu-8 SPR: 128163 (SE +/- 1624.35, N = 12)
c2-standard-8 CLX: 147234 (SE +/- 1527.37, N = 4)
1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

OpenCV

Test: Object Detection

OpenCV 4.7, Test: Object Detection (ms, Fewer Is Better)
c3-highcpu-8 SPR: 38999 (SE +/- 384.74, N = 5)
c2-standard-8 CLX: 58056 (SE +/- 751.87, N = 3)
1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt


Phoronix Test Suite v10.8.4