xeon jan

Intel Xeon Silver 4216 testing with a TYAN S7100AG2NR (V4.02 BIOS) and ASPEED on Debian 12 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2401144-NE-XEONJAN1706&grs&sor.

xeon janProcessorMotherboardChipsetMemoryDiskGraphicsAudioNetworkOSKernelDisplay ServerCompilerFile-SystemScreen ResolutionabcIntel Xeon Silver 4216 @ 3.20GHz (16 Cores / 32 Threads)TYAN S7100AG2NR (V4.02 BIOS)Intel Sky Lake-E DMI3 Registers6 x 8 GB DDR4-2400MT/s240GB Corsair Force MP500ASPEEDRealtek ALC8922 x Intel I350Debian 126.1.0-11-amd64 (x86_64)X ServerGCC 12.2.0ext41024x768OpenBenchmarking.orgKernel Details- Transparent Huge Pages: alwaysCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x500002c Python Details- Python 3.11.2Security Details- gather_data_sampling: Vulnerable: No microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Vulnerable: Clear buffers attempted no microcode; SMT vulnerable + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Mitigation of TSX disabled

xeon janspeedb: Rand Fill Syncspeedb: Rand Fillspeedb: Update Randtensorflow: CPU - 1 - GoogLeNetllama-cpp: llama-2-7b.Q4_0.ggufsvt-av1: Preset 12 - Bosphorus 4Kspeedb: Read While Writingsvt-av1: Preset 12 - Bosphorus 1080pdeepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Streamdeepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Streamlczero: Eigencachebench: Read / Modify / Writespeedb: Seq Filllczero: BLASpytorch: CPU - 1 - ResNet-50svt-av1: Preset 13 - Bosphorus 1080psvt-av1: Preset 8 - Bosphorus 1080ppytorch: CPU - 16 - ResNet-50quicksilver: CTS2svt-av1: Preset 4 - Bosphorus 1080psvt-av1: Preset 4 - Bosphorus 4Ksvt-av1: Preset 8 - Bosphorus 4Kspeedb: Rand Readtensorflow: CPU - 1 - ResNet-50y-cruncher: 1Bdeepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Streampytorch: CPU - 16 - Efficientnet_v2_lllama-cpp: llama-2-13b.Q4_0.ggufpytorch: CPU - 32 - ResNet-152deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Streampytorch: CPU - 16 - ResNet-152svt-av1: Preset 13 - Bosphorus 4Kspeedb: Read Rand Write Randdeepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Streamtensorflow: CPU - 1 - VGG-16pytorch: CPU - 1 - ResNet-152deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Streampytorch: CPU - 32 - Efficientnet_v2_lpytorch: CPU - 32 - ResNet-50tensorflow: CPU - 1 - AlexNetdeepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Streampytorch: CPU - 1 - Efficientnet_v2_lquicksilver: CORAL2 P2llama-cpp: llama-2-70b-chat.Q5_0.ggufdeepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Streamtensorflow: CPU - 16 - AlexNetquicksilver: CORAL2 P1deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Streamtensorflow: CPU - 16 - GoogLeNetdeepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Streamy-cruncher: 500Mdeepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Streamdeepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Streamdeepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Streamdeepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Streamtensorflow: CPU - 16 - ResNet-50deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: ResNet-50, Baseline - Asynchronous Multi-Streamdeepsparse: ResNet-50, Baseline - Asynchronous Multi-Streamdeepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Streamdeepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Streamdeepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Streamcachebench: Writedeepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Streamcachebench: Readtensorflow: CPU - 16 - VGG-16abc896237973017289117.2616.9582.8053897119165.2657.51617.32013361680.5632895651693729.51188.99345.63321.5784460007.3262.46224.384532715544.8146.09114.3844.728.78.181061.89438.1482.392164095328.1401283.96073.2711.111071.41014.7921.5618.21552.5346.9192870001.5732.363410.901583.171017000064.909947.63123.222920.6239.4169116.182768.7595164.6289125.958816.2263.4829116.281268.7723165.967248.192348.570923161.605712845.95186062.3701075.961339729802616372615.8615.8978.5443867484170.987.29357.51953359877.3371895586623829.64184.72446.68621.2784970007.3792.42323.99529156034.8745.44514.50174.768.738.091073.13188.1382.623165815627.8492286.90783.2411.171063.83794.7521.6418.25549.39926.9093540001.51736.867710.835182.831011000064.536747.51123.888620.6829.4557115.879968.9544164.5919126.09816.2563.3871116.038168.9149165.687948.273548.59823134.972437845.99526058.7446675.961015037720615113716.1916.5582.2224014397168.0127.54167.28473260843.7048175493823728.89187.02345.93521.7386070007.2512.42124.069524435334.8845.92814.30094.708.628.081060.59698.2283.269165617227.95285.9413.2611.071072.97664.7521.7418.35553.46876.9593080001.5734.420810.870583.331015000064.61747.36123.781920.5759.4646116.40368.677164.0809125.773416.2163.5329116.080368.8893165.875248.218748.65223165.587056845.20076057.9930075.96OpenBenchmarking.org

Speedb

Test: Random Fill Sync

OpenBenchmarking.orgOp/s, More Is BetterSpeedb 2.7Test: Random Fill Syncbca3K6K9K12K15K133971015089621. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb

Test: Random Fill

OpenBenchmarking.orgOp/s, More Is BetterSpeedb 2.7Test: Random Fillacb80K160K240K320K400K3797303772062980261. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb

Test: Update Random

OpenBenchmarking.orgOp/s, More Is BetterSpeedb 2.7Test: Update Randomabc40K80K120K160K200K1728911637261511371. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

TensorFlow

Device: CPU - Batch Size: 1 - Model: GoogLeNet

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 1 - Model: GoogLeNetacb4812162017.2616.1915.86

Llama.cpp

Model: llama-2-7b.Q4_0.gguf

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b1808Model: llama-2-7b.Q4_0.ggufacb4812162016.9516.5515.891. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

SVT-AV1

Encoder Mode: Preset 12 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 12 - Input: Bosphorus 4Kacb2040608010082.8182.2278.541. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Speedb

Test: Read While Writing

OpenBenchmarking.orgOp/s, More Is BetterSpeedb 2.7Test: Read While Writingcab900K1800K2700K3600K4500K4014397389711938674841. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

SVT-AV1

Encoder Mode: Preset 12 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 12 - Input: Bosphorus 1080pbca4080120160200170.98168.01165.271. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Streamcab2468107.54167.51617.2935

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Streambac2468107.51957.32017.2847

LeelaChessZero

Backend: Eigen

OpenBenchmarking.orgNodes Per Second, More Is BetterLeelaChessZero 0.30Backend: Eigenbac8162432403333321. (CXX) g++ options: -flto -pthread

CacheBench

Test: Read / Modify / Write

OpenBenchmarking.orgMB/s, More Is BetterCacheBenchTest: Read / Modify / Writeacb13K26K39K52K65K61680.5660843.7059877.34MIN: 44499.9 / MAX: 70905.52MIN: 44676.87 / MAX: 71748.63MIN: 45685.19 / MAX: 70473.281. (CC) gcc options: -O3 -lrt

Speedb

Test: Sequential Fill

OpenBenchmarking.orgOp/s, More Is BetterSpeedb 2.7Test: Sequential Fillabc120K240K360K480K600K5651695586625493821. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

LeelaChessZero

Backend: BLAS

OpenBenchmarking.orgNodes Per Second, More Is BetterLeelaChessZero 0.30Backend: BLASbca9182736453837371. (CXX) g++ options: -flto -pthread

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-50

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 1 - Model: ResNet-50bac71421283529.6429.5128.89MIN: 22.76 / MAX: 29.97MIN: 20.25 / MAX: 29.94MIN: 23.63 / MAX: 29.36

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 13 - Input: Bosphorus 1080pacb4080120160200188.99187.02184.721. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 8 - Input: Bosphorus 1080pbca112233445546.6945.9445.631. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-50

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 16 - Model: ResNet-50cab51015202521.7321.5721.27MIN: 18.85 / MAX: 21.83MIN: 17.47 / MAX: 21.75MIN: 16.96 / MAX: 21.64

Quicksilver

Input: CTS2

OpenBenchmarking.orgFigure Of Merit, More Is BetterQuicksilver 20230818Input: CTS2cba2M4M6M8M10M8607000849700084460001. (CXX) g++ options: -fopenmp -O3 -march=native

SVT-AV1

Encoder Mode: Preset 4 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 4 - Input: Bosphorus 1080pbac2468107.3797.3267.2511. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 4 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 4 - Input: Bosphorus 4Kabc0.5541.1081.6622.2162.772.4622.4232.4211. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 8 - Input: Bosphorus 4Kacb61218243024.3824.0723.991. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Speedb

Test: Random Read

OpenBenchmarking.orgOp/s, More Is BetterSpeedb 2.7Test: Random Readabc11M22M33M44M55M5327155452915603524435331. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

TensorFlow

Device: CPU - Batch Size: 1 - Model: ResNet-50

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 1 - Model: ResNet-50cba1.0982.1963.2944.3925.494.884.874.81

Y-Cruncher

Pi Digits To Calculate: 1B

OpenBenchmarking.orgSeconds, Fewer Is BetterY-Cruncher 0.8.3Pi Digits To Calculate: 1Bbca102030405045.4545.9346.09

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Streambac4812162014.5014.3814.30

PyTorch

Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_lbac1.0712.1423.2134.2845.3554.764.724.70MIN: 3.39 / MAX: 4.87MIN: 3.31 / MAX: 4.87MIN: 3.38 / MAX: 4.83

Llama.cpp

Model: llama-2-13b.Q4_0.gguf

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b1808Model: llama-2-13b.Q4_0.ggufbac2468108.738.708.621. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-152

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 32 - Model: ResNet-152abc2468108.188.098.08MIN: 7.33 / MAX: 8.24MIN: 6.96 / MAX: 8.2MIN: 7.28 / MAX: 8.15

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Streamcab20040060080010001060.601061.891073.13

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-152

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 16 - Model: ResNet-152cab2468108.228.148.13MIN: 7.13 / MAX: 8.3MIN: 6.9 / MAX: 8.25MIN: 6.89 / MAX: 8.23

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 13 - Input: Bosphorus 4Kcba2040608010083.2782.6282.391. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Speedb

Test: Read Random Write Random

OpenBenchmarking.orgOp/s, More Is BetterSpeedb 2.7Test: Read Random Write Randombca400K800K1200K1600K2000K1658156165617216409531. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Streambca71421283527.8527.9528.14

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Streambca60120180240300286.91285.94283.96

TensorFlow

Device: CPU - Batch Size: 1 - Model: VGG-16

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 1 - Model: VGG-16acb0.73581.47162.20742.94323.6793.273.263.24

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-152

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 1 - Model: ResNet-152bac369121511.1711.1111.07MIN: 9.55 / MAX: 11.26MIN: 10.08 / MAX: 11.16MIN: 10.1 / MAX: 11.15

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Streambac20040060080010001063.841071.411072.98

PyTorch

Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_lacb1.07782.15563.23344.31125.3894.794.754.75MIN: 3.39 / MAX: 4.9MIN: 3.39 / MAX: 4.88MIN: 3.34 / MAX: 4.89

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-50

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 32 - Model: ResNet-50cba51015202521.7421.6421.56MIN: 18.79 / MAX: 21.88MIN: 18.44 / MAX: 21.94MIN: 18.26 / MAX: 21.78

TensorFlow

Device: CPU - Batch Size: 1 - Model: AlexNet

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 1 - Model: AlexNetcba51015202518.3518.2518.21

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Streambac120240360480600549.40552.53553.47

PyTorch

Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_lcab2468106.956.916.90MIN: 5.05 / MAX: 7.09MIN: 5.17 / MAX: 7.03MIN: 5.18 / MAX: 7.02

Quicksilver

Input: CORAL2 P2

OpenBenchmarking.orgFigure Of Merit, More Is BetterQuicksilver 20230818Input: CORAL2 P2bca2M4M6M8M10M9354000930800092870001. (CXX) g++ options: -fopenmp -O3 -march=native

Llama.cpp

Model: llama-2-70b-chat.Q5_0.gguf

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b1808Model: llama-2-70b-chat.Q5_0.ggufbca0.33980.67961.01941.35921.6991.511.501.501. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

Neural Magic DeepSparse

Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Streambca160320480640800736.87734.42732.36

Neural Magic DeepSparse

Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Streambca369121510.8410.8710.90

TensorFlow

Device: CPU - Batch Size: 16 - Model: AlexNet

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 16 - Model: AlexNetcab2040608010083.3383.1782.83

Quicksilver

Input: CORAL2 P1

OpenBenchmarking.orgFigure Of Merit, More Is BetterQuicksilver 20230818Input: CORAL2 P1acb2M4M6M8M10M1017000010150000101100001. (CXX) g++ options: -fopenmp -O3 -march=native

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Streamacb142842567064.9164.6264.54

TensorFlow

Device: CPU - Batch Size: 16 - Model: GoogLeNet

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 16 - Model: GoogLeNetabc112233445547.6347.5147.36

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Streamacb306090120150123.22123.78123.89

Y-Cruncher

Pi Digits To Calculate: 500M

OpenBenchmarking.orgSeconds, Fewer Is BetterY-Cruncher 0.8.3Pi Digits To Calculate: 500Mcab51015202520.5820.6220.68

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Streamcba36912159.46469.45579.4169

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Streamcab306090120150116.40116.18115.88

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Streamcab153045607568.6868.7668.95

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Streamcba4080120160200164.08164.59164.63

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Streambac306090120150126.10125.96125.77

TensorFlow

Device: CPU - Batch Size: 16 - Model: ResNet-50

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 16 - Model: ResNet-50bac4812162016.2516.2216.21

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Streambac142842567063.3963.4863.53

Neural Magic DeepSparse

Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Streamacb306090120150116.28116.08116.04

Neural Magic DeepSparse

Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Streamacb153045607568.7768.8968.91

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Streambca4080120160200165.69165.88165.97

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Streambca112233445548.2748.2248.19

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Streamcba112233445548.6548.6048.57

CacheBench

Test: Write

OpenBenchmarking.orgMB/s, More Is BetterCacheBenchTest: Writecab5K10K15K20K25K23165.5923161.6123134.97MIN: 20076.82 / MAX: 24250.96MIN: 20765.44 / MAX: 24245.95MIN: 20991.01 / MAX: 24248.751. (CC) gcc options: -O3 -lrt

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Streamcab2004006008001000845.20845.95846.00

CacheBench

Test: Read

OpenBenchmarking.orgMB/s, More Is BetterCacheBenchTest: Readabc130026003900520065006062.376058.746057.99MIN: 5922.93 / MAX: 6087.27MIN: 5776.01 / MAX: 6087.87MIN: 5886 / MAX: 6083.651. (CC) gcc options: -O3 -lrt

TensorFlow

Device: CPU - Batch Size: 16 - Model: VGG-16

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 16 - Model: VGG-16cba1.3412.6824.0235.3646.7055.965.965.96


Phoronix Test Suite v10.8.4