Intel Xeon Silver 4216 testing with a TYAN S7100AG2NR (V4.02 BIOS) and ASPEED on Debian 12 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:

phoronix-test-suite benchmark 2401144-NE-XEONJAN1706
HTML result view exported from: https://openbenchmarking.org/result/2401144-NE-XEONJAN1706&sro&grs
xeon jan - System Details (configurations a, b, c are identical):

Processor: Intel Xeon Silver 4216 @ 3.20GHz (16 Cores / 32 Threads)
Motherboard: TYAN S7100AG2NR (V4.02 BIOS)
Chipset: Intel Sky Lake-E DMI3 Registers
Memory: 6 x 8 GB DDR4-2400MT/s
Disk: 240GB Corsair Force MP500
Graphics: ASPEED
Audio: Realtek ALC892
Network: 2 x Intel I350
OS: Debian 12
Kernel: 6.1.0-11-amd64 (x86_64)
Display Server: X Server
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 1024x768

Kernel Details: Transparent Huge Pages: always
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-bTRWOB/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_pstate powersave (EPP: balance_performance); CPU Microcode: 0x500002c
Python Details: Python 3.11.2
Security Details: gather_data_sampling: Vulnerable: No microcode; itlb_multihit: KVM: Mitigation of VMX disabled; l1tf: Not affected; mds: Not affected; meltdown: Not affected; mmio_stale_data: Vulnerable: Clear buffers attempted, no microcode; SMT vulnerable; retbleed: Mitigation of Enhanced IBRS; spec_rstack_overflow: Not affected; spec_store_bypass: Mitigation of SSB disabled via prctl; spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization; spectre_v2: Mitigation of Enhanced IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence; srbds: Not affected; tsx_async_abort: Mitigation of TSX disabled
Speedb 2.7 - Test: Random Fill Sync (Op/s, More Is Better)
a: 8962, b: 13397, c: 10150
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb 2.7 - Test: Random Fill (Op/s, More Is Better)
a: 379730, b: 298026, c: 377206
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb 2.7 - Test: Update Random (Op/s, More Is Better)
a: 172891, b: 163726, c: 151137
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

TensorFlow 2.12 - Device: CPU - Batch Size: 1 - Model: GoogLeNet (images/sec, More Is Better)
a: 17.26, b: 15.86, c: 16.19

Llama.cpp b1808 - Model: llama-2-7b.Q4_0.gguf (Tokens Per Second, More Is Better)
a: 16.95, b: 15.89, c: 16.55
(CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

SVT-AV1 1.8 - Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
a: 82.81, b: 78.54, c: 82.22
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Speedb 2.7 - Test: Read While Writing (Op/s, More Is Better)
a: 3897119, b: 3867484, c: 4014397
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

SVT-AV1 1.8 - Encoder Mode: Preset 12 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
a: 165.27, b: 170.98, c: 168.01
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Neural Magic DeepSparse 1.6 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
a: 7.5161, b: 7.2935, c: 7.5416

Neural Magic DeepSparse 1.6 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
a: 7.3201, b: 7.5195, c: 7.2847

LeelaChessZero 0.30 - Backend: Eigen (Nodes Per Second, More Is Better)
a: 33, b: 33, c: 32
(CXX) g++ options: -flto -pthread

CacheBench - Test: Read / Modify / Write (MB/s, More Is Better)
a: 61680.56 (MIN: 44499.9 / MAX: 70905.52), b: 59877.34 (MIN: 45685.19 / MAX: 70473.28), c: 60843.70 (MIN: 44676.87 / MAX: 71748.63)
(CC) gcc options: -O3 -lrt

Speedb 2.7 - Test: Sequential Fill (Op/s, More Is Better)
a: 565169, b: 558662, c: 549382
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

LeelaChessZero 0.30 - Backend: BLAS (Nodes Per Second, More Is Better)
a: 37, b: 38, c: 37
(CXX) g++ options: -flto -pthread

PyTorch 2.1 - Device: CPU - Batch Size: 1 - Model: ResNet-50 (batches/sec, More Is Better)
a: 29.51 (MIN: 20.25 / MAX: 29.94), b: 29.64 (MIN: 22.76 / MAX: 29.97), c: 28.89 (MIN: 23.63 / MAX: 29.36)

SVT-AV1 1.8 - Encoder Mode: Preset 13 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
a: 188.99, b: 184.72, c: 187.02
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1 1.8 - Encoder Mode: Preset 8 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
a: 45.63, b: 46.69, c: 45.94
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
PyTorch 2.1 - Device: CPU - Batch Size: 16 - Model: ResNet-50 (batches/sec, More Is Better)
a: 21.57 (MIN: 17.47 / MAX: 21.75), b: 21.27 (MIN: 16.96 / MAX: 21.64), c: 21.73 (MIN: 18.85 / MAX: 21.83)

Quicksilver 20230818 - Input: CTS2 (Figure Of Merit, More Is Better)
a: 8446000, b: 8497000, c: 8607000
(CXX) g++ options: -fopenmp -O3 -march=native

SVT-AV1 1.8 - Encoder Mode: Preset 4 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
a: 7.326, b: 7.379, c: 7.251
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1 1.8 - Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
a: 2.462, b: 2.423, c: 2.421
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1 1.8 - Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
a: 24.38, b: 23.99, c: 24.07
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Speedb 2.7 - Test: Random Read (Op/s, More Is Better)
a: 53271554, b: 52915603, c: 52443533
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

TensorFlow 2.12 - Device: CPU - Batch Size: 1 - Model: ResNet-50 (images/sec, More Is Better)
a: 4.81, b: 4.87, c: 4.88

Y-Cruncher 0.8.3 - Pi Digits To Calculate: 1B (Seconds, Fewer Is Better)
a: 46.09, b: 45.45, c: 45.93
Neural Magic DeepSparse 1.6 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
a: 14.38, b: 14.50, c: 14.30

PyTorch 2.1 - Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l (batches/sec, More Is Better)
a: 4.72 (MIN: 3.31 / MAX: 4.87), b: 4.76 (MIN: 3.39 / MAX: 4.87), c: 4.70 (MIN: 3.38 / MAX: 4.83)

Llama.cpp b1808 - Model: llama-2-13b.Q4_0.gguf (Tokens Per Second, More Is Better)
a: 8.70, b: 8.73, c: 8.62
(CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

PyTorch 2.1 - Device: CPU - Batch Size: 32 - Model: ResNet-152 (batches/sec, More Is Better)
a: 8.18 (MIN: 7.33 / MAX: 8.24), b: 8.09 (MIN: 6.96 / MAX: 8.2), c: 8.08 (MIN: 7.28 / MAX: 8.15)

Neural Magic DeepSparse 1.6 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
a: 1061.89, b: 1073.13, c: 1060.60

PyTorch 2.1 - Device: CPU - Batch Size: 16 - Model: ResNet-152 (batches/sec, More Is Better)
a: 8.14 (MIN: 6.9 / MAX: 8.25), b: 8.13 (MIN: 6.89 / MAX: 8.23), c: 8.22 (MIN: 7.13 / MAX: 8.3)

SVT-AV1 1.8 - Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
a: 82.39, b: 82.62, c: 83.27
(CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Speedb 2.7 - Test: Read Random Write Random (Op/s, More Is Better)
a: 1640953, b: 1658156, c: 1656172
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
a: 28.14, b: 27.85, c: 27.95

Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
a: 283.96, b: 286.91, c: 285.94

TensorFlow 2.12 - Device: CPU - Batch Size: 1 - Model: VGG-16 (images/sec, More Is Better)
a: 3.27, b: 3.24, c: 3.26

PyTorch 2.1 - Device: CPU - Batch Size: 1 - Model: ResNet-152 (batches/sec, More Is Better)
a: 11.11 (MIN: 10.08 / MAX: 11.16), b: 11.17 (MIN: 9.55 / MAX: 11.26), c: 11.07 (MIN: 10.1 / MAX: 11.15)

Neural Magic DeepSparse 1.6 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
a: 1071.41, b: 1063.84, c: 1072.98

PyTorch 2.1 - Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l (batches/sec, More Is Better)
a: 4.79 (MIN: 3.39 / MAX: 4.9), b: 4.75 (MIN: 3.34 / MAX: 4.89), c: 4.75 (MIN: 3.39 / MAX: 4.88)

PyTorch 2.1 - Device: CPU - Batch Size: 32 - Model: ResNet-50 (batches/sec, More Is Better)
a: 21.56 (MIN: 18.26 / MAX: 21.78), b: 21.64 (MIN: 18.44 / MAX: 21.94), c: 21.74 (MIN: 18.79 / MAX: 21.88)

TensorFlow 2.12 - Device: CPU - Batch Size: 1 - Model: AlexNet (images/sec, More Is Better)
a: 18.21, b: 18.25, c: 18.35

Neural Magic DeepSparse 1.6 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
a: 552.53, b: 549.40, c: 553.47
PyTorch 2.1 - Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l (batches/sec, More Is Better)
a: 6.91 (MIN: 5.17 / MAX: 7.03), b: 6.90 (MIN: 5.18 / MAX: 7.02), c: 6.95 (MIN: 5.05 / MAX: 7.09)

Quicksilver 20230818 - Input: CORAL2 P2 (Figure Of Merit, More Is Better)
a: 9287000, b: 9354000, c: 9308000
(CXX) g++ options: -fopenmp -O3 -march=native

Llama.cpp b1808 - Model: llama-2-70b-chat.Q5_0.gguf (Tokens Per Second, More Is Better)
a: 1.50, b: 1.51, c: 1.50
(CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

Neural Magic DeepSparse 1.6 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
a: 732.36, b: 736.87, c: 734.42

Neural Magic DeepSparse 1.6 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
a: 10.90, b: 10.84, c: 10.87

TensorFlow 2.12 - Device: CPU - Batch Size: 16 - Model: AlexNet (images/sec, More Is Better)
a: 83.17, b: 82.83, c: 83.33

Quicksilver 20230818 - Input: CORAL2 P1 (Figure Of Merit, More Is Better)
a: 10170000, b: 10110000, c: 10150000
(CXX) g++ options: -fopenmp -O3 -march=native

Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
a: 64.91, b: 64.54, c: 64.62

TensorFlow 2.12 - Device: CPU - Batch Size: 16 - Model: GoogLeNet (images/sec, More Is Better)
a: 47.63, b: 47.51, c: 47.36
Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
a: 123.22, b: 123.89, c: 123.78

Y-Cruncher 0.8.3 - Pi Digits To Calculate: 500M (Seconds, Fewer Is Better)
a: 20.62, b: 20.68, c: 20.58

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
a: 9.4169, b: 9.4557, c: 9.4646

Neural Magic DeepSparse 1.6 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
a: 116.18, b: 115.88, c: 116.40

Neural Magic DeepSparse 1.6 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
a: 68.76, b: 68.95, c: 68.68

Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
a: 164.63, b: 164.59, c: 164.08

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
a: 125.96, b: 126.10, c: 125.77

TensorFlow 2.12 - Device: CPU - Batch Size: 16 - Model: ResNet-50 (images/sec, More Is Better)
a: 16.22, b: 16.25, c: 16.21

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
a: 63.48, b: 63.39, c: 63.53
Neural Magic DeepSparse 1.6 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
a: 116.28, b: 116.04, c: 116.08

Neural Magic DeepSparse 1.6 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
a: 68.77, b: 68.91, c: 68.89

Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
a: 165.97, b: 165.69, c: 165.88

Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
a: 48.19, b: 48.27, c: 48.22

Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
a: 48.57, b: 48.60, c: 48.65

CacheBench - Test: Write (MB/s, More Is Better)
a: 23161.61 (MIN: 20765.44 / MAX: 24245.95), b: 23134.97 (MIN: 20991.01 / MAX: 24248.75), c: 23165.59 (MIN: 20076.82 / MAX: 24250.96)
(CC) gcc options: -O3 -lrt

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
a: 845.95, b: 846.00, c: 845.20

CacheBench - Test: Read (MB/s, More Is Better)
a: 6062.37 (MIN: 5922.93 / MAX: 6087.27), b: 6058.74 (MIN: 5776.01 / MAX: 6087.87), c: 6057.99 (MIN: 5886 / MAX: 6083.65)
(CC) gcc options: -O3 -lrt

TensorFlow 2.12 - Device: CPU - Batch Size: 16 - Model: VGG-16 (images/sec, More Is Better)
a: 5.96, b: 5.96, c: 5.96
Phoronix Test Suite v10.8.4