Google Cloud c3 Sapphire Rapids

Benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2303226-NE-2303218PT51&sro.

System Details

c3-highcpu-8 SPR
  Processor: Intel Xeon Platinum 8481C (4 Cores / 8 Threads)
  Motherboard: Google Compute Engine c3-highcpu-8
  Chipset: Intel 440FX 82441FX PMC
  Memory: 16GB
  Disk: 322GB nvme_card-pd
  Network: Google Compute Engine Virtual
  OS: Ubuntu 22.10
  Kernel: 5.19.0-1015-gcp (x86_64)
  Vulkan: 1.3.224
  Compiler: GCC 12.2.0
  File-System: ext4
  System Layer: KVM

c2-standard-8 CLX (the export lists only the fields that differ from c3-highcpu-8 SPR)
  Processor: Intel Xeon (4 Cores / 8 Threads)
  Motherboard: Google Compute Engine c2-standard-8
  Memory: 32GB
  Disk: 322GB PersistentDisk
  Network: Red Hat Virtio device

Kernel Details
  - Transparent Huge Pages: madvise

Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  - CPU Microcode: 0xffffffff

Python Details
  - Python 3.10.7

Security Details
  - c3-highcpu-8 SPR: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
  - c2-standard-8 CLX: itlb_multihit: Not affected + l1tf: Not affected + mds: Mitigation of Clear buffers; SMT Host state unknown + meltdown: Not affected + mmio_stale_data: Vulnerable: Clear buffers attempted no microcode; SMT Host state unknown + retbleed: Mitigation of Enhanced IBRS + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Mitigation of Clear buffers; SMT Host state unknown

Result overview: c3-highcpu-8 SPR vs. c2-standard-8 CLX. The overview table's values were fused together in this export and cannot be separated reliably; the per-test results for both instances are listed individually below.

LeelaChessZero

Backend: BLAS

LeelaChessZero 0.28, Nodes Per Second, More Is Better
c2-standard-8 CLX: 909 (SE +/- 7.36, N = 3)
c3-highcpu-8 SPR: 1272 (SE +/- 16.59, N = 3)
1. (CXX) g++ options: -flto -pthread
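
To put the "SE +/-" figures in context, the standard error can be expressed as a percentage of the reported mean. The sketch below is only an illustration: the helper name `relative_se_percent` is hypothetical, and it assumes the SE entries in the export are listed in the same order as the configurations.

```python
def relative_se_percent(mean: float, se: float) -> float:
    """Return the standard error as a percentage of the mean."""
    return 100.0 * se / mean


# (mean, SE) pairs copied from the LeelaChessZero BLAS result above.
# Pairing each SE with a configuration by listing order is an assumption.
results = {
    "c2-standard-8 CLX": (909.0, 7.36),
    "c3-highcpu-8 SPR": (1272.0, 16.59),
}

for name, (mean, se) in results.items():
    print(f"{name}: {relative_se_percent(mean, se):.2f}% relative SE")
```

A small relative SE (around 1% here) suggests the run-to-run variation is minor compared with the gap between the two instances.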

LeelaChessZero

Backend: Eigen

LeelaChessZero 0.28, Nodes Per Second, More Is Better
c2-standard-8 CLX: 902 (SE +/- 9.00, N = 3)
c3-highcpu-8 SPR: 1221 (SE +/- 7.69, N = 3)
1. (CXX) g++ options: -flto -pthread

miniBUDE

Implementation: OpenMP - Input Deck: BM1

miniBUDE 20210901, GFInst/s, More Is Better
c2-standard-8 CLX: 152.80 (SE +/- 0.02, N = 3)
c3-highcpu-8 SPR: 188.65 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

miniBUDE

Implementation: OpenMP - Input Deck: BM1

miniBUDE 20210901, Billion Interactions/s, More Is Better
c2-standard-8 CLX: 6.112 (SE +/- 0.001, N = 3)
c3-highcpu-8 SPR: 7.546 (SE +/- 0.001, N = 3)
1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

NAMD

ATPase Simulation - 327,506 Atoms

NAMD 2.14, days/ns, Fewer Is Better
c2-standard-8 CLX: 4.44534 (SE +/- 0.00771, N = 3)
c3-highcpu-8 SPR: 3.35779 (SE +/- 0.00254, N = 3)

nekRS

Input: TurboPipe Periodic

nekRS 22.0, FLOP/s, More Is Better
c2-standard-8 CLX: 25100766667 (SE +/- 56849518.71, N = 3)
c3-highcpu-8 SPR: 30667900000 (SE +/- 92013205.57, N = 3)
1. (CXX) g++ options: -fopenmp -O2 -march=native -mtune=native -ftree-vectorize -lmpi_cxx -lmpi

Xcompact3d Incompact3d

Input: input.i3d 129 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11, Seconds, Fewer Is Better
c2-standard-8 CLX: 37.76 (SE +/- 0.02, N = 3)
c3-highcpu-8 SPR: 32.39 (SE +/- 0.02, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
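
Because some results are "More Is Better" and others "Fewer Is Better", a relative-performance figure has to flip the ratio depending on the direction. The following is a minimal sketch of that arithmetic using two results from above; the function name `speedup` and the choice of c2-standard-8 CLX as the baseline are assumptions made for illustration, not part of the export.

```python
def speedup(baseline: float, candidate: float, more_is_better: bool) -> float:
    """Return how many times faster the candidate is than the baseline.

    For throughput-style results (more is better) the ratio is
    candidate / baseline; for time-style results (fewer is better)
    it is baseline / candidate.
    """
    if more_is_better:
        return candidate / baseline
    return baseline / candidate


# LeelaChessZero BLAS, Nodes Per Second (more is better): CLX 909, SPR 1272.
lczero_blas = speedup(909.0, 1272.0, more_is_better=True)

# Xcompact3d Incompact3d, Seconds (fewer is better): CLX 37.76, SPR 32.39.
incompact3d = speedup(37.76, 32.39, more_is_better=False)

print(f"LeelaChessZero BLAS: {lczero_blas:.2f}x")
print(f"Incompact3d:         {incompact3d:.2f}x")
```

A value above 1.0 means c3-highcpu-8 SPR came out ahead on that test; a value below 1.0 means c2-standard-8 CLX did.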

OpenFOAM

Input: drivaerFastback, Small Mesh Size - Mesh Time

OpenFOAM 10, Seconds, Fewer Is Better
c2-standard-8 CLX: 85.52
c3-highcpu-8 SPR: 62.04
Additional per-configuration options as listed: -ldynamicMesh; -lphysicalProperties -lspecie -lfiniteVolume -lfvModels -lmeshTools -lsampling
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenFOAM

Input: drivaerFastback, Small Mesh Size - Execution Time

OpenFOAM 10, Seconds, Fewer Is Better
c2-standard-8 CLX: 560.61
c3-highcpu-8 SPR: 422.68
Additional per-configuration options as listed: -ldynamicMesh; -lphysicalProperties -lspecie -lfiniteVolume -lfvModels -lmeshTools -lsampling
1. (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lgenericPatchFields -lOpenFOAM -ldl -lm

OpenRadioss

Model: Bumper Beam

OpenRadioss 2022.10.13, Seconds, Fewer Is Better
c2-standard-8 CLX: 390.61 (SE +/- 0.90, N = 3)
c3-highcpu-8 SPR: 303.38 (SE +/- 0.51, N = 3)

OpenRadioss

Model: Cell Phone Drop Test

OpenRadioss 2022.10.13, Seconds, Fewer Is Better
c2-standard-8 CLX: 291.49 (SE +/- 0.39, N = 3)
c3-highcpu-8 SPR: 219.73 (SE +/- 0.38, N = 3)

OpenRadioss

Model: Bird Strike on Windshield

OpenRadioss 2022.10.13, Seconds, Fewer Is Better
c2-standard-8 CLX: 724.80 (SE +/- 1.51, N = 3)
c3-highcpu-8 SPR: 595.14 (SE +/- 0.30, N = 3)

OpenRadioss

Model: Rubber O-Ring Seal Installation

OpenRadioss 2022.10.13, Seconds, Fewer Is Better
c2-standard-8 CLX: 523.45 (SE +/- 0.45, N = 3)
c3-highcpu-8 SPR: 367.00 (SE +/- 0.20, N = 3)

SPECFEM3D

Model: Mount St. Helens

SPECFEM3D 4.0, Seconds, Fewer Is Better
c2-standard-8 CLX: 145.38 (SE +/- 0.56, N = 3)
c3-highcpu-8 SPR: 139.52 (SE +/- 0.10, N = 3)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D

Model: Layered Halfspace

SPECFEM3D 4.0, Seconds, Fewer Is Better
c2-standard-8 CLX: 374.76 (SE +/- 2.51, N = 3)
c3-highcpu-8 SPR: 372.09 (SE +/- 0.64, N = 3)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D

Model: Tomographic Model

SPECFEM3D 4.0, Seconds, Fewer Is Better
c2-standard-8 CLX: 150.92 (SE +/- 1.30, N = 3)
c3-highcpu-8 SPR: 143.94 (SE +/- 1.65, N = 3)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D

Model: Homogeneous Halfspace

SPECFEM3D 4.0, Seconds, Fewer Is Better
c2-standard-8 CLX: 190.79 (SE +/- 1.87, N = 3)
c3-highcpu-8 SPR: 179.55 (SE +/- 0.09, N = 3)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

SPECFEM3D

Model: Water-layered Halfspace

SPECFEM3D 4.0, Seconds, Fewer Is Better
c2-standard-8 CLX: 347.81 (SE +/- 0.85, N = 3)
c3-highcpu-8 SPR: 321.63 (SE +/- 0.31, N = 3)
1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

Zstd Compression

Compression Level: 19 - Compression Speed

Zstd Compression 1.5.4, MB/s, More Is Better
c2-standard-8 CLX: 9.24 (SE +/- 0.08, N = 3)
c3-highcpu-8 SPR: 10.30 (SE +/- 0.12, N = 3)
Additional per-configuration option as listed: -llzma
1. (CC) gcc options: -O3 -pthread -lz

Zstd Compression

Compression Level: 19 - Decompression Speed

Zstd Compression 1.5.4, MB/s, More Is Better
c2-standard-8 CLX: 701.1 (SE +/- 1.45, N = 3)
c3-highcpu-8 SPR: 905.2 (SE +/- 1.50, N = 3)
Additional per-configuration option as listed: -llzma
1. (CC) gcc options: -O3 -pthread -lz

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.4, MB/s, More Is Better
c2-standard-8 CLX: 5.86 (SE +/- 0.00, N = 3)
c3-highcpu-8 SPR: 6.50 (SE +/- 0.00, N = 3)
Additional per-configuration option as listed: -llzma
1. (CC) gcc options: -O3 -pthread -lz

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

Zstd Compression 1.5.4, MB/s, More Is Better
c2-standard-8 CLX: 713.8 (SE +/- 2.02, N = 3)
c3-highcpu-8 SPR: 907.2 (SE +/- 1.48, N = 3)
Additional per-configuration option as listed: -llzma
1. (CC) gcc options: -O3 -pthread -lz

John The Ripper

Test: bcrypt

John The Ripper 2023.03.14, Real C/S, More Is Better
c2-standard-8 CLX: 6687 (SE +/- 0.67, N = 3)
c3-highcpu-8 SPR: 6932 (SE +/- 0.67, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

John The Ripper

Test: WPA PSK

John The Ripper 2023.03.14, Real C/S, More Is Better
c2-standard-8 CLX: 31278 (SE +/- 11.37, N = 3)
c3-highcpu-8 SPR: 28818 (SE +/- 27.15, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

John The Ripper

Test: Blowfish

John The Ripper 2023.03.14, Real C/S, More Is Better
c2-standard-8 CLX: 6684 (SE +/- 2.73, N = 3)
c3-highcpu-8 SPR: 6930 (SE +/- 2.08, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

John The Ripper

Test: HMAC-SHA512

John The Ripper 2023.03.14, Real C/S, More Is Better
c2-standard-8 CLX: 40197000 (SE +/- 32331.62, N = 3)
c3-highcpu-8 SPR: 37509000 (SE +/- 21071.31, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

John The Ripper

Test: MD5

John The Ripper 2023.03.14, Real C/S, More Is Better
c2-standard-8 CLX: 683546 (SE +/- 162.53, N = 3)
c3-highcpu-8 SPR: 765713 (SE +/- 1472.13, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lm -lrt -lz -ldl -lcrypt

Embree

Binary: Pathtracer ISPC - Model: Crown

Embree 4.0.1, Frames Per Second, More Is Better
c2-standard-8 CLX: 3.9340 (SE +/- 0.0030, N = 3; MIN: 3.91 / MAX: 3.99)
c3-highcpu-8 SPR: 5.8475 (SE +/- 0.0120, N = 3; MIN: 5.81 / MAX: 5.92)

Embree

Binary: Pathtracer ISPC - Model: Asian Dragon

Embree 4.0.1, Frames Per Second, More Is Better
c2-standard-8 CLX: 5.1542 (SE +/- 0.0119, N = 3; MIN: 5.12 / MAX: 5.22)
c3-highcpu-8 SPR: 7.3642 (SE +/- 0.0079, N = 3; MIN: 7.33 / MAX: 7.44)

uvg266

Video Input: Bosphorus 4K - Video Preset: Very Fast

uvg266 0.4.1, Frames Per Second, More Is Better
c2-standard-8 CLX: 5.69 (SE +/- 0.02, N = 3)
c3-highcpu-8 SPR: 6.99 (SE +/- 0.03, N = 3)

uvg266

Video Input: Bosphorus 4K - Video Preset: Super Fast

uvg266 0.4.1, Frames Per Second, More Is Better
c2-standard-8 CLX: 6.03 (SE +/- 0.00, N = 3)
c3-highcpu-8 SPR: 7.48 (SE +/- 0.00, N = 3)

uvg266

Video Input: Bosphorus 4K - Video Preset: Ultra Fast

uvg266 0.4.1, Frames Per Second, More Is Better
c2-standard-8 CLX: 7.45 (SE +/- 0.01, N = 3)
c3-highcpu-8 SPR: 9.12 (SE +/- 0.01, N = 3)

uvg266

Video Input: Bosphorus 1080p - Video Preset: Very Fast

uvg266 0.4.1, Frames Per Second, More Is Better
c2-standard-8 CLX: 26.30 (SE +/- 0.16, N = 3)
c3-highcpu-8 SPR: 32.39 (SE +/- 0.25, N = 3)

uvg266

Video Input: Bosphorus 1080p - Video Preset: Super Fast

uvg266 0.4.1, Frames Per Second, More Is Better
c2-standard-8 CLX: 27.72 (SE +/- 0.01, N = 3)
c3-highcpu-8 SPR: 34.50 (SE +/- 0.03, N = 3)

uvg266

Video Input: Bosphorus 1080p - Video Preset: Ultra Fast

uvg266 0.4.1, Frames Per Second, More Is Better
c2-standard-8 CLX: 34.47 (SE +/- 0.01, N = 3)
c3-highcpu-8 SPR: 42.24 (SE +/- 0.01, N = 3)

VVenC

Video Input: Bosphorus 4K - Video Preset: Fast

VVenC 1.7, Frames Per Second, More Is Better
c3-highcpu-8 SPR: 1.556 (SE +/- 0.005, N = 3)
1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC

Video Input: Bosphorus 4K - Video Preset: Faster

VVenC 1.7, Frames Per Second, More Is Better
c3-highcpu-8 SPR: 3.524 (SE +/- 0.000, N = 3)
1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC

Video Input: Bosphorus 1080p - Video Preset: Fast

VVenC 1.7, Frames Per Second, More Is Better
c3-highcpu-8 SPR: 5.305 (SE +/- 0.018, N = 3)
1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

VVenC

Video Input: Bosphorus 1080p - Video Preset: Faster

VVenC 1.7, Frames Per Second, More Is Better
c3-highcpu-8 SPR: 12.92 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -flto=auto -fno-fat-lto-objects

Intel Open Image Denoise

Run: RT.hdr_alb_nrm.3840x2160

Intel Open Image Denoise 1.4.0, Images / Sec, More Is Better
c2-standard-8 CLX: 0.22 (SE +/- 0.00, N = 3)
c3-highcpu-8 SPR: 0.24 (SE +/- 0.00, N = 3)

Intel Open Image Denoise

Run: RTLightmap.hdr.4096x4096

Intel Open Image Denoise 1.4.0, Images / Sec, More Is Better
c2-standard-8 CLX: 0.11 (SE +/- 0.00, N = 3)
c3-highcpu-8 SPR: 0.12 (SE +/- 0.00, N = 3)

OpenVKL

Benchmark: vklBenchmark ISPC

OpenVKL 1.3.1, Items / Sec, More Is Better
c2-standard-8 CLX: 70 (SE +/- 0.00, N = 3; MIN: 8 / MAX: 1119)
c3-highcpu-8 SPR: 98 (SE +/- 0.33, N = 3; MIN: 11 / MAX: 1579)

7-Zip Compression

Test: Compression Rating

7-Zip Compression 22.01, MIPS, More Is Better
c2-standard-8 CLX: 30989 (SE +/- 248.15, N = 3)
c3-highcpu-8 SPR: 35306 (SE +/- 246.17, N = 15)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

7-Zip Compression

Test: Decompression Rating

7-Zip Compression 22.01, MIPS, More Is Better
c2-standard-8 CLX: 22852 (SE +/- 52.92, N = 3)
c3-highcpu-8 SPR: 20468 (SE +/- 183.58, N = 15)
1. (CXX) g++ options: -lpthread -ldl -O2 -fPIC

Timed FFmpeg Compilation

Time To Compile

Timed FFmpeg Compilation 6.0, Seconds, Fewer Is Better
c2-standard-8 CLX: 139.04 (SE +/- 0.03, N = 3)
c3-highcpu-8 SPR: 120.44 (SE +/- 0.10, N = 3)

Timed Linux Kernel Compilation

Build: defconfig

Timed Linux Kernel Compilation 6.1, Seconds, Fewer Is Better
c2-standard-8 CLX: 289.66 (SE +/- 0.85, N = 3)
c3-highcpu-8 SPR: 244.80 (SE +/- 0.64, N = 3)

oneDNN

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
c2-standard-8 CLX: 19.52870 (SE +/- 0.01518, N = 3; MIN: 18.98)
c3-highcpu-8 SPR: 1.50004 (SE +/- 0.00340, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
c2-standard-8 CLX: 8.02416 (SE +/- 0.01190, N = 3; MIN: 7.86)
c3-highcpu-8 SPR: 5.34218 (SE +/- 0.00535, N = 3; MIN: 4.94)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
c2-standard-8 CLX: 34.18930 (SE +/- 0.00489, N = 3; MIN: 33.77)
c3-highcpu-8 SPR: 4.17707 (SE +/- 0.00870, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
c2-standard-8 CLX: 52.94210 (SE +/- 0.00209, N = 3; MIN: 52.7)
c3-highcpu-8 SPR: 1.47145 (SE +/- 0.00472, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
c2-standard-8 CLX: 35.91640 (SE +/- 0.01009, N = 3; MIN: 35.71)
c3-highcpu-8 SPR: 3.54944 (SE +/- 0.01072, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
c2-standard-8 CLX: 5767.70 (SE +/- 8.38, N = 3; MIN: 5732.12)
c3-highcpu-8 SPR: 4660.70 (SE +/- 0.86, N = 3; MIN: 4648.25)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
c2-standard-8 CLX: 2998.03 (SE +/- 0.83, N = 3; MIN: 2977.98)
c3-highcpu-8 SPR: 2337.31 (SE +/- 1.76, N = 3; MIN: 2326.77)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 3.0, ms, Fewer Is Better
c2-standard-8 CLX: 7.432600 (SE +/- 0.007542, N = 3; MIN: 7.25)
c3-highcpu-8 SPR: 0.968986 (SE +/- 0.013114, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

OSPRay Studio

Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer

OSPRay Studio 0.11, ms, Fewer Is Better
c2-standard-8 CLX: 30911 (SE +/- 32.33, N = 3)
c3-highcpu-8 SPR: 20952 (SE +/- 4.91, N = 3)
1. (CXX) g++ options: -O3 -lm -ldl

OSPRay Studio

Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer

OSPRay Studio 0.11, ms, Fewer Is Better
c2-standard-8 CLX: 37892 (SE +/- 39.94, N = 3)
c3-highcpu-8 SPR: 25819 (SE +/- 364.36, N = 3)
1. (CXX) g++ options: -O3 -lm -ldl

OpenSSL

Algorithm: SHA256

OpenSSL 3.1, byte/s, More Is Better
c2-standard-8 CLX: 1318193530 (SE +/- 28303.98, N = 3)
c3-highcpu-8 SPR: 4283873987 (SE +/- 2722265.85, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: SHA512

OpenSSL 3.1, byte/s, More Is Better
c2-standard-8 CLX: 1465815227 (SE +/- 2965153.70, N = 3)
c3-highcpu-8 SPR: 1568572920 (SE +/- 1152941.98, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: RSA4096

OpenSSL 3.1, sign/s, More Is Better
c2-standard-8 CLX: 1156.6 (SE +/- 2.05, N = 3)
c3-highcpu-8 SPR: 2062.7 (SE +/- 1.17, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: RSA4096

OpenSSL 3.1, verify/s, More Is Better
c2-standard-8 CLX: 76402.3 (SE +/- 47.93, N = 3)
c3-highcpu-8 SPR: 67857.7 (SE +/- 8.53, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: ChaCha20

OpenSSL 3.1, byte/s, More Is Better
c2-standard-8 CLX: 21346811813 (SE +/- 1497684.67, N = 3)
c3-highcpu-8 SPR: 22091557637 (SE +/- 35781789.46, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: AES-128-GCM

OpenSSL 3.1, byte/s, More Is Better
c2-standard-8 CLX: 23237312603 (SE +/- 6165180.91, N = 3)
c3-highcpu-8 SPR: 57594077823 (SE +/- 33921495.47, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: AES-256-GCM

OpenSSL 3.1, byte/s, More Is Better
c2-standard-8 CLX: 16945000630 (SE +/- 3855504.36, N = 3)
c3-highcpu-8 SPR: 48008361573 (SE +/- 48074910.93, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

OpenSSL

Algorithm: ChaCha20-Poly1305

OpenSSL 3.1, byte/s, More Is Better
c2-standard-8 CLX: 10919151843 (SE +/- 1592104.52, N = 3)
c3-highcpu-8 SPR: 15970781140 (SE +/- 21990039.72, N = 3)
1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

CockroachDB

Workload: MoVR - Concurrency: 128

CockroachDB 22.2, ops/s, More Is Better
c2-standard-8 CLX: 534.9 (SE +/- 1.90, N = 3)
c3-highcpu-8 SPR: 458.8 (SE +/- 5.62, N = 15)

CockroachDB

Workload: KV, 50% Reads - Concurrency: 128

CockroachDB 22.2, ops/s, More Is Better
c2-standard-8 CLX: 13684.5 (SE +/- 68.65, N = 3)
c3-highcpu-8 SPR: 19321.6 (SE +/- 40.86, N = 3)

CockroachDB

Workload: KV, 95% Reads - Concurrency: 128

CockroachDB 22.2, ops/s, More Is Better
c2-standard-8 CLX: 16184.2 (SE +/- 79.71, N = 3)
c3-highcpu-8 SPR: 24960.1 (SE +/- 127.78, N = 3)

Memcached

Set To Get Ratio: 1:10

Memcached 1.6.18, Ops/sec, More Is Better
c2-standard-8 CLX: 715723.53 (SE +/- 2213.84, N = 3)
c3-highcpu-8 SPR: 1044947.13 (SE +/- 3280.78, N = 3)
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

Memcached

Set To Get Ratio: 1:100

Memcached 1.6.18, Ops/sec, More Is Better
c2-standard-8 CLX: 702291.93 (SE +/- 3476.80, N = 3)
c3-highcpu-8 SPR: 1030937.28 (SE +/- 11157.73, N = 3)
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2023, Ns Per Day, More Is Better
c2-standard-8 CLX: 0.579 (SE +/- 0.001, N = 3)
c3-highcpu-8 SPR: 0.777 (SE +/- 0.001, N = 3)
1. (CXX) g++ options: -O3

MariaDB

Clients: 2048

MariaDB 11.0.1, Queries Per Second, More Is Better
c2-standard-8 CLX: 248 (SE +/- 3.50, N = 3)
c3-highcpu-8 SPR: 332 (SE +/- 3.01, N = 3)
1. (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lcrypt -lz -lm -lssl -lcrypto -lpthread -ldl

MariaDB

Clients: 4096

MariaDB 11.0.1, Queries Per Second, More Is Better
c2-standard-8 CLX: 237 (SE +/- 2.55, N = 3)
c3-highcpu-8 SPR: 317 (SE +/- 2.62, N = 3)
1. (CXX) g++ options: -pie -fPIC -fstack-protector -O3 -lnuma -lcrypt -lz -lm -lssl -lcrypto -lpthread -ldl

PostgreSQL

Scaling Factor: 100 - Clients: 800 - Mode: Read Only

PostgreSQL 15, TPS, More Is Better
c2-standard-8 CLX: 173317 (SE +/- 822.93, N = 3)
c3-highcpu-8 SPR: 311942 (SE +/- 3369.27, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL

Scaling Factor: 100 - Clients: 800 - Mode: Read Only - Average Latency

PostgreSQL 15, ms, Fewer Is Better
c2-standard-8 CLX: 4.616 (SE +/- 0.022, N = 3)
c3-highcpu-8 SPR: 2.565 (SE +/- 0.028, N = 3)
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Only

PostgreSQL 15 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - TPS, More Is Better
c2-standard-8 CLX: 169643 (SE +/- 1014.66, N = 3)
c3-highcpu-8 SPR: 293725 (SE +/- 2414.33, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

PostgreSQL

Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency

PostgreSQL 15 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency - ms, Fewer Is Better
c2-standard-8 CLX: 5.895 (SE +/- 0.035, N = 3)
c3-highcpu-8 SPR: 3.405 (SE +/- 0.028, N = 3)
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

TensorFlow

Device: CPU - Batch Size: 16 - Model: ResNet-50

TensorFlow 2.10 - Device: CPU - Batch Size: 16 - Model: ResNet-50 - images/sec, More Is Better
c2-standard-8 CLX: 13.33 (SE +/- 0.02, N = 3)
c3-highcpu-8 SPR: 14.20 (SE +/- 0.04, N = 3)

TensorFlow

Device: CPU - Batch Size: 32 - Model: ResNet-50

TensorFlow 2.10 - Device: CPU - Batch Size: 32 - Model: ResNet-50 - images/sec, More Is Better
c2-standard-8 CLX: 14.11 (SE +/- 0.01, N = 3)
c3-highcpu-8 SPR: 14.93 (SE +/- 0.01, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: ResNet-50

TensorFlow 2.10 - Device: CPU - Batch Size: 64 - Model: ResNet-50 - images/sec, More Is Better
c2-standard-8 CLX: 14.79 (SE +/- 0.18, N = 3)
c3-highcpu-8 SPR: 15.69 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better
c2-standard-8 CLX: 2.9308 (SE +/- 0.0001, N = 3)
c3-highcpu-8 SPR: 3.7873 (SE +/- 0.0130, N = 3)

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream - ms/batch, Fewer Is Better
c2-standard-8 CLX: 682.37 (SE +/- 0.03, N = 3)
c3-highcpu-8 SPR: 528.02 (SE +/- 1.78, N = 3)

Neural Magic DeepSparse

Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better
c2-standard-8 CLX: 62.51 (SE +/- 0.07, N = 3)
c3-highcpu-8 SPR: 51.89 (SE +/- 0.05, N = 3)

Neural Magic DeepSparse

Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream - ms/batch, Fewer Is Better
c2-standard-8 CLX: 31.96 (SE +/- 0.03, N = 3)
c3-highcpu-8 SPR: 38.51 (SE +/- 0.04, N = 3)

Neural Magic DeepSparse

Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better
c2-standard-8 CLX: 15.92 (SE +/- 0.05, N = 3)
c3-highcpu-8 SPR: 19.17 (SE +/- 0.03, N = 3)

Neural Magic DeepSparse

Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream - ms/batch, Fewer Is Better
c2-standard-8 CLX: 125.47 (SE +/- 0.40, N = 3)
c3-highcpu-8 SPR: 104.31 (SE +/- 0.14, N = 3)

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better
c2-standard-8 CLX: 59.99 (SE +/- 0.02, N = 3)
c3-highcpu-8 SPR: 64.87 (SE +/- 0.04, N = 3)

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream - ms/batch, Fewer Is Better
c2-standard-8 CLX: 33.29 (SE +/- 0.01, N = 3)
c3-highcpu-8 SPR: 30.80 (SE +/- 0.02, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better
c2-standard-8 CLX: 29.18 (SE +/- 0.03, N = 3)
c3-highcpu-8 SPR: 33.11 (SE +/- 0.15, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream - ms/batch, Fewer Is Better
c2-standard-8 CLX: 68.51 (SE +/- 0.08, N = 3)
c3-highcpu-8 SPR: 60.39 (SE +/- 0.28, N = 3)

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better
c2-standard-8 CLX: 6.1893 (SE +/- 0.0034, N = 3)
c3-highcpu-8 SPR: 6.5372 (SE +/- 0.0078, N = 3)

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream - ms/batch, Fewer Is Better
c2-standard-8 CLX: 323.08 (SE +/- 0.18, N = 3)
c3-highcpu-8 SPR: 305.90 (SE +/- 0.37, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better
c2-standard-8 CLX: 13.99 (SE +/- 0.03, N = 3)
c3-highcpu-8 SPR: 16.22 (SE +/- 0.13, N = 3)

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream - ms/batch, Fewer Is Better
c2-standard-8 CLX: 142.89 (SE +/- 0.29, N = 3)
c3-highcpu-8 SPR: 123.29 (SE +/- 0.95, N = 3)

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream - items/sec, More Is Better
c2-standard-8 CLX: 2.9352 (SE +/- 0.0038, N = 3)
c3-highcpu-8 SPR: 3.7693 (SE +/- 0.0239, N = 3)

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream - ms/batch, Fewer Is Better
c2-standard-8 CLX: 681.34 (SE +/- 0.89, N = 3)
c3-highcpu-8 SPR: 530.61 (SE +/- 3.34, N = 3)

Google Draco

Model: Lion

Google Draco 1.5.6 - Model: Lion - ms, Fewer Is Better
c2-standard-8 CLX: 7437 (SE +/- 18.52, N = 3)
c3-highcpu-8 SPR: 6250 (SE +/- 80.23, N = 15)
(CXX) g++ options: -O3

Google Draco

Model: Church Facade

Google Draco 1.5.6 - Model: Church Facade - ms, Fewer Is Better
c2-standard-8 CLX: 11462 (SE +/- 16.76, N = 3)
c3-highcpu-8 SPR: 7573 (SE +/- 10.48, N = 3)
(CXX) g++ options: -O3

Blender

Blend File: BMW27 - Compute: CPU-Only

Blender 3.4 - Blend File: BMW27 - Compute: CPU-Only - Seconds, Fewer Is Better
c3-highcpu-8 SPR: 315.24 (SE +/- 1.13, N = 3)

nginx

Connections: 100

nginx 1.23.2 - Connections: 100 - Requests Per Second, More Is Better
c2-standard-8 CLX: 25148.28 (SE +/- 33.32, N = 3)
c3-highcpu-8 SPR: 36310.35 (SE +/- 23.95, N = 3)
(CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx

Connections: 200

nginx 1.23.2 - Connections: 200 - Requests Per Second, More Is Better
c2-standard-8 CLX: 24695.91 (SE +/- 37.62, N = 3)
c3-highcpu-8 SPR: 35602.10 (SE +/- 83.52, N = 3)
(CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx

Connections: 500

nginx 1.23.2 - Connections: 500 - Requests Per Second, More Is Better
c2-standard-8 CLX: 21957.17 (SE +/- 17.44, N = 3)
c3-highcpu-8 SPR: 34672.65 (SE +/- 321.85, N = 3)
(CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx

Connections: 1000

nginx 1.23.2 - Connections: 1000 - Requests Per Second, More Is Better
c2-standard-8 CLX: 21446.27 (SE +/- 84.22, N = 3)
c3-highcpu-8 SPR: 32118.58 (SE +/- 22.42, N = 3)
(CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

nginx

Connections: 4000

nginx 1.23.2 - Connections: 4000 - Requests Per Second, More Is Better
c2-standard-8 CLX: 21594.94 (SE +/- 11.52, N = 3)
c3-highcpu-8 SPR: 32814.75 (SE +/- 27.93, N = 3)
(CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2

BRL-CAD

VGR Performance Metric

BRL-CAD 7.34 - VGR Performance Metric, More Is Better
c2-standard-8 CLX: 50314
c3-highcpu-8 SPR: 71072
(CXX) g++ options: -std=c++14 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lregex_brl -lz_brl -lnetpbm -ldl -lm -ltk8.6

OpenCV

Test: Core

OpenCV 4.7 - Test: Core - ms, Fewer Is Better
c2-standard-8 CLX: 142770 (SE +/- 2578.21, N = 12)
c3-highcpu-8 SPR: 87372 (SE +/- 280.31, N = 3)
(CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

OpenCV

Test: Video

OpenCV 4.7 - Test: Video - ms, Fewer Is Better
c2-standard-8 CLX: 11737 (SE +/- 50.28, N = 3)
c3-highcpu-8 SPR: 31654 (SE +/- 198.80, N = 3)
(CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

OpenCV

Test: Graph API

OpenCV 4.7 - Test: Graph API - ms, Fewer Is Better
c2-standard-8 CLX: 236186 (SE +/- 1570.24, N = 3)
c3-highcpu-8 SPR: 219931 (SE +/- 931.36, N = 3)
(CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

OpenCV

Test: Stitching

OpenCV 4.7 - Test: Stitching - ms, Fewer Is Better
c2-standard-8 CLX: 250833 (SE +/- 1856.06, N = 3)
c3-highcpu-8 SPR: 214760 (SE +/- 1973.06, N = 7)
(CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

OpenCV

Test: Image Processing

OpenCV 4.7 - Test: Image Processing - ms, Fewer Is Better
c2-standard-8 CLX: 147234 (SE +/- 1527.37, N = 4)
c3-highcpu-8 SPR: 128163 (SE +/- 1624.35, N = 12)
(CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

OpenCV

Test: Object Detection

OpenCV 4.7 - Test: Object Detection - ms, Fewer Is Better
c2-standard-8 CLX: 58056 (SE +/- 751.87, N = 3)
c3-highcpu-8 SPR: 38999 (SE +/- 384.74, N = 5)
(CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt


Phoronix Test Suite v10.8.4