AMD EPYC 8534P

AMD EPYC 8534P 64-Core testing with a AMD Cinnabar (RCB1009C BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2401286-NE-AMDEPYC8520&sor&grr.

AMD EPYC 8534PProcessorMotherboardChipsetMemoryDiskGraphicsNetworkOSKernelDesktopDisplay ServerCompilerFile-SystemScreen ResolutionabcAMD EPYC 8534P 64-Core @ 2.30GHz (64 Cores / 128 Threads)AMD Cinnabar (RCB1009C BIOS)AMD Device 14a46 x 32 GB DRAM-4800MT/s Samsung M321R4GA0BB0-CQKMG3201GB Micron_7450_MTFDKCB3T2TFSASPEED2 x Broadcom NetXtreme BCM5720 PCIeUbuntu 23.106.5.0-5-generic (x86_64)GNOME ShellX Server 1.21.1.7GCC 13.2.0ext4640x480OpenBenchmarking.orgKernel Details- Transparent Huge Pages: madviseCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-FTCNCZ/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xaa00212 Java Details- a: OpenJDK Runtime Environment (build 11.0.20+8-post-Ubuntu-1ubuntu1)Python Details- Python 3.11.5Security Details- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

AMD EPYC 8534Ptensorflow: CPU - 512 - ResNet-50quicksilver: CTS2pytorch: CPU - 16 - Efficientnet_v2_lpytorch: CPU - 512 - Efficientnet_v2_lquicksilver: CORAL2 P2llama-cpp: llama-2-70b-chat.Q5_0.ggufpytorch: CPU - 512 - ResNet-152pytorch: CPU - 16 - ResNet-152pytorch: CPU - 1 - Efficientnet_v2_ldeepsparse: BERT-Large, NLP Question Answering - Synchronous Single-Streamdeepsparse: BERT-Large, NLP Question Answering - Synchronous Single-Streamdeepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Streamdeepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Streampytorch: CPU - 1 - ResNet-152llamafile: mistral-7b-instruct-v0.2.Q8_0 - CPUspeedb: Update Randspeedb: Read Rand Write Randspeedb: Read While Writingspeedb: Rand Readpytorch: CPU - 16 - ResNet-50pytorch: CPU - 512 - ResNet-50quicksilver: CORAL2 P1deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Streamdeepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Streamdeepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Streamdeepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Streamdeepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Streamdeepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Streamdeepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Streamdeepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Streamdeepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Streamdeepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Streamllamafile: wizardcoder-python-34b-v1.0.Q6_K - CPUdeepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Streamdeepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Streamdeepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Streamdeepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Streamdeepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Streamdeepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Streamdeepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Streamdeepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Streamdeepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Streamdeepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Streamdeepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Streamdeepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Streamdeepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Streamdeepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Streamdeepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Streamdeepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Streamdeepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Streamdeepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Streamdeepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Streamtensorflow: CPU - 16 - ResNet-50deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Streamdeepsparse: ResNet-50, Baseline - Asynchronous Multi-Streamdeepsparse: ResNet-50, Baseline - Asynchronous Multi-Streamdeepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Streamdeepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Streamdeepsparse: ResNet-50, Baseline - Synchronous Single-Streamdeepsparse: ResNet-50, Baseline - Synchronous Single-Streamdeepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Streamdeepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Streamllama-cpp: llama-2-13b.Q4_0.ggufsvt-av1: Preset 4 - Bosphorus 4Kpytorch: CPU - 1 - ResNet-50tensorflow: CPU - 1 - ResNet-50llamafile: llava-v1.5-7b-q4 - CPUllama-cpp: llama-2-7b.Q4_0.ggufy-cruncher: 1Bsvt-av1: Preset 8 - Bosphorus 4Ksvt-av1: Preset 4 - Bosphorus 1080psvt-av1: Preset 12 - Bosphorus 4Ksvt-av1: Preset 13 - Bosphorus 4Ky-cruncher: 500Msvt-av1: Preset 8 - Bosphorus 1080psvt-av1: Preset 12 - Bosphorus 1080psvt-av1: Preset 13 - Bosphorus 1080pabc103.3163200006.026.04162500003.414.3814.679.3831.082432.1619695.375245.582616.8614.2735526425065281538102028770222336.6336.44213000004.9412202.163122.19131439.7734882.432635.6549885.823635.536733.728829.640546.4063688.567533.859829.52625.515.614163.9831508.89362.393322.641544.1242101.6471313.98727.3198136.46696.59151.49121.2115822.756.605151.2523150.7692211.5649149.3881213.597653.658.47793766.130166.9993476.836766.9655477.00145.3282187.45295.3237187.619916.656.69745.125.9221.9427.959.94667.3217.115192.633195.9044.897129.024506.246599.849103.2163600006.046.05162600003.414.7914.559.5930.993832.2534694.56345.635816.4414.2934551525141961528421929844556536.1337.28213300004.9419202.111322.17971441.0421882.882935.6944883.462235.606633.726229.642846.6414685.106433.792829.58415.4615.681963.7036512.169962.159422.703643.9996101.7295313.80127.3167136.52816.5866151.55951.2107823.4536.5703152.04150.1556212.3922148.779214.547153.618.51343749.031567.0028476.678466.9522477.28265.3168187.86945.328187.480216.726.7745.605.9821.8927.949.91568.42317.63193.581191.9274.914133.524505.536599.42103.19162933336.026.03161933333.4114.5514.679.6130.851632.4023694.926445.531516.5814.2735139425080821544242029833220336.6836.71213166675.0360198.369122.15501442.4037883.114835.5891882.972135.647433.802329.575946.5963685.737333.739229.63135.5015.595864.0546510.483862.144522.674344.0566101.7894313.75287.2910137.01576.5609152.14051.2141821.13176.4896153.9350150.4754211.9097149.2086213.872853.628.50263754.337266.9723477.056866.9459477.25315.3279187.46275.4112184.658116.636.72145.335.9421.9726.509.92967.77817.565193.425194.4714.918130.345503.508603.599OpenBenchmarking.org

TensorFlow

Device: CPU - Batch Size: 512 - Model: ResNet-50

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 512 - Model: ResNet-50abc20406080100SE +/- 0.01, N = 3103.30103.20103.19

Quicksilver

Input: CTS2

OpenBenchmarking.orgFigure Of Merit, More Is BetterQuicksilver 20230818Input: CTS2bac4M8M12M16M20MSE +/- 8819.17, N = 31636000016320000162933331. (CXX) g++ options: -fopenmp -O3 -march=native

PyTorch

Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_lbca246810SE +/- 0.01, N = 36.046.026.02MIN: 5.63 / MAX: 6.16MIN: 5.54 / MAX: 6.14MIN: 5.57 / MAX: 6.13

PyTorch

Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_lbac246810SE +/- 0.02, N = 36.056.046.03MIN: 5.62 / MAX: 6.16MIN: 5.57 / MAX: 6.16MIN: 5.42 / MAX: 6.21

Quicksilver

Input: CORAL2 P2

OpenBenchmarking.orgFigure Of Merit, More Is BetterQuicksilver 20230818Input: CORAL2 P2bac3M6M9M12M15MSE +/- 6666.67, N = 31626000016250000161933331. (CXX) g++ options: -fopenmp -O3 -march=native

Llama.cpp

Model: llama-2-70b-chat.Q5_0.gguf

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b1808Model: llama-2-70b-chat.Q5_0.ggufcba0.76731.53462.30193.06923.8365SE +/- 0.01, N = 33.413.403.401. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-152

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 512 - Model: ResNet-152bca48121620SE +/- 0.06, N = 314.7914.5514.38MIN: 13.56 / MAX: 14.89MIN: 13.35 / MAX: 14.76MIN: 12.94 / MAX: 14.47

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-152

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 16 - Model: ResNet-152cab48121620SE +/- 0.08, N = 314.6714.6714.55MIN: 13.21 / MAX: 14.86MIN: 14.48 / MAX: 14.78MIN: 14.35 / MAX: 14.64

PyTorch

Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_lcba3691215SE +/- 0.04, N = 39.619.599.38MIN: 8.95 / MAX: 9.78MIN: 9.48 / MAX: 9.67MIN: 8.84 / MAX: 9.52

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: BERT-Large, NLP Question Answering - Scenario: Synchronous Single-Streamcba714212835SE +/- 0.02, N = 330.8530.9931.08

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: BERT-Large, NLP Question Answering - Scenario: Synchronous Single-Streamcba816243240SE +/- 0.02, N = 332.4032.2532.16

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Streambca150300450600750SE +/- 0.19, N = 3694.56694.93695.38

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Streambac1020304050SE +/- 0.07, N = 345.6445.5845.53

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-152

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 1 - Model: ResNet-152acb48121620SE +/- 0.05, N = 316.8616.5816.44MIN: 14.83 / MAX: 17.02MIN: 16.34 / MAX: 16.81MIN: 14.99 / MAX: 16.55

Llamafile

Test: mistral-7b-instruct-v0.2.Q8_0 - Acceleration: CPU

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.6Test: mistral-7b-instruct-v0.2.Q8_0 - Acceleration: CPUbca48121620SE +/- 0.02, N = 314.2914.2714.27

Speedb

Test: Update Random

OpenBenchmarking.orgOp/s, More Is BetterSpeedb 2.7Test: Update Randomacb80K160K240K320K400KSE +/- 2346.60, N = 33552643513943455151. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb

Test: Read Random Write Random

OpenBenchmarking.orgOp/s, More Is BetterSpeedb 2.7Test: Read Random Write Randombca500K1000K1500K2000K2500KSE +/- 12093.31, N = 32514196250808225065281. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb

Test: Read While Writing

OpenBenchmarking.orgOp/s, More Is BetterSpeedb 2.7Test: Read While Writingcab3M6M9M12M15MSE +/- 188799.32, N = 31544242015381020152842191. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb

Test: Random Read

OpenBenchmarking.orgOp/s, More Is BetterSpeedb 2.7Test: Random Readbca60M120M180M240M300MSE +/- 222389.57, N = 32984455652983322032877022231. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-50

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 16 - Model: ResNet-50cab816243240SE +/- 0.18, N = 336.6836.6336.13MIN: 30.63 / MAX: 37.51MIN: 35.5 / MAX: 37.04MIN: 34.96 / MAX: 36.54

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-50

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 512 - Model: ResNet-50bca918273645SE +/- 0.29, N = 337.2836.7136.44MIN: 35.91 / MAX: 37.74MIN: 35.24 / MAX: 37.75MIN: 35.51 / MAX: 36.77

Quicksilver

Input: CORAL2 P1

OpenBenchmarking.orgFigure Of Merit, More Is BetterQuicksilver 20230818Input: CORAL2 P1bca5M10M15M20M25MSE +/- 34801.02, N = 32133000021316667213000001. (CXX) g++ options: -fopenmp -O3 -march=native

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Streamabc1.13312.26623.39934.53245.6655SE +/- 0.0306, N = 34.94124.94195.0360

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Streamabc4080120160200SE +/- 1.20, N = 3202.16202.11198.37

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Streamcba510152025SE +/- 0.01, N = 322.1622.1822.19

Neural Magic DeepSparse

Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Streamcba30060090012001500SE +/- 0.78, N = 31442.401441.041439.77

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Streamabc2004006008001000SE +/- 0.36, N = 3882.43882.88883.11

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Streambac816243240SE +/- 0.02, N = 335.6935.6535.59

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Streamcba2004006008001000SE +/- 0.60, N = 3882.97883.46885.82

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Streamcba816243240SE +/- 0.06, N = 335.6535.6135.54

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Streambac816243240SE +/- 0.03, N = 333.7333.7333.80

Neural Magic DeepSparse

Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Streambac714212835SE +/- 0.02, N = 329.6429.6429.58

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Streamacb1122334455SE +/- 0.10, N = 346.4146.6046.64

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Streamacb150300450600750SE +/- 1.46, N = 3688.57685.74685.11

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Streamcba816243240SE +/- 0.03, N = 333.7433.7933.86

Neural Magic DeepSparse

Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Streamcba714212835SE +/- 0.02, N = 329.6329.5829.53

Llamafile

Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.6Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPUcab1.23752.4753.71254.956.1875SE +/- 0.02, N = 35.505.505.46

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Streamcab48121620SE +/- 0.03, N = 315.6015.6115.68

Neural Magic DeepSparse

Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Streamcab1428425670SE +/- 0.14, N = 364.0563.9863.70

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Streamacb110220330440550SE +/- 1.07, N = 3508.89510.48512.17

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Streamabc1428425670SE +/- 0.16, N = 362.3962.1662.14

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Streamacb510152025SE +/- 0.02, N = 322.6422.6722.70

Neural Magic DeepSparse

Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Streamacb1020304050SE +/- 0.05, N = 344.1244.0644.00

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Streamabc20406080100SE +/- 0.11, N = 3101.65101.73101.79

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Streamabc70140210280350SE +/- 0.43, N = 3313.99313.80313.75

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Streamcba246810SE +/- 0.0130, N = 37.29107.31677.3198

Neural Magic DeepSparse

Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Streamcba306090120150SE +/- 0.24, N = 3137.02136.53136.47

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Streamcba246810SE +/- 0.0014, N = 36.56096.58666.5900

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Streamcba306090120150SE +/- 0.03, N = 3152.14151.56151.49

Neural Magic DeepSparse

Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Streambac0.27320.54640.81961.09281.366SE +/- 0.0028, N = 31.21071.21151.2141

Neural Magic DeepSparse

Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Streambac2004006008001000SE +/- 1.85, N = 3823.45822.75821.13

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Streamcba246810SE +/- 0.0019, N = 36.48966.57036.6050

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Streamcba306090120150SE +/- 0.04, N = 3153.94152.04151.25

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Streambca306090120150SE +/- 0.10, N = 3150.16150.48150.77

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Streambca50100150200250SE +/- 0.12, N = 3212.39211.91211.56

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Streambca306090120150SE +/- 0.14, N = 3148.78149.21149.39

Neural Magic DeepSparse

Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Streambca50100150200250SE +/- 0.26, N = 3214.55213.87213.60

TensorFlow

Device: CPU - Batch Size: 16 - Model: ResNet-50

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 16 - Model: ResNet-50acb1224364860SE +/- 0.04, N = 353.6553.6253.61

Neural Magic DeepSparse

Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Streamacb246810SE +/- 0.0299, N = 38.47798.50268.5134

Neural Magic DeepSparse

Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Streamacb8001600240032004000SE +/- 13.07, N = 33766.133754.343749.03

Neural Magic DeepSparse

Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Streamcab1530456075SE +/- 0.04, N = 366.9767.0067.00

Neural Magic DeepSparse

Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Streamcab100200300400500SE +/- 0.22, N = 3477.06476.84476.68

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Streamcba1530456075SE +/- 0.02, N = 366.9566.9566.97

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Streambca100200300400500SE +/- 0.10, N = 3477.28477.25477.00

Neural Magic DeepSparse

Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: ResNet-50, Baseline - Scenario: Synchronous Single-Streambca1.19882.39763.59644.79525.994SE +/- 0.0047, N = 35.31685.32795.3282

Neural Magic DeepSparse

Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: ResNet-50, Baseline - Scenario: Synchronous Single-Streambca4080120160200SE +/- 0.17, N = 3187.87187.46187.45

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgms/batch, Fewer Is BetterNeural Magic DeepSparse 1.6Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Streamabc1.21752.4353.65254.876.0875SE +/- 0.0772, N = 35.32375.32805.4112

Neural Magic DeepSparse

Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream

OpenBenchmarking.orgitems/sec, More Is BetterNeural Magic DeepSparse 1.6Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Streamabc4080120160200SE +/- 2.60, N = 3187.62187.48184.66

Llama.cpp

Model: llama-2-13b.Q4_0.gguf

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b1808Model: llama-2-13b.Q4_0.ggufbac48121620SE +/- 0.04, N = 316.7216.6516.631. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

SVT-AV1

Encoder Mode: Preset 4 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 4 - Input: Bosphorus 4Kbca246810SE +/- 0.038, N = 36.7706.7216.6971. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-50

OpenBenchmarking.orgbatches/sec, More Is BetterPyTorch 2.1Device: CPU - Batch Size: 1 - Model: ResNet-50bca1020304050SE +/- 0.20, N = 345.6045.3345.12MIN: 34.79 / MAX: 46.32MIN: 43.73 / MAX: 46.27MIN: 35.13 / MAX: 45.99

TensorFlow

Device: CPU - Batch Size: 1 - Model: ResNet-50

OpenBenchmarking.orgimages/sec, More Is BetterTensorFlow 2.12Device: CPU - Batch Size: 1 - Model: ResNet-50bca1.34552.6914.03655.3826.7275SE +/- 0.04, N = 35.985.945.92

Llamafile

Test: llava-v1.5-7b-q4 - Acceleration: CPU

OpenBenchmarking.orgTokens Per Second, More Is BetterLlamafile 0.6Test: llava-v1.5-7b-q4 - Acceleration: CPUcab510152025SE +/- 0.03, N = 321.9721.9421.89

Llama.cpp

Model: llama-2-7b.Q4_0.gguf

OpenBenchmarking.orgTokens Per Second, More Is BetterLlama.cpp b1808Model: llama-2-7b.Q4_0.ggufabc714212835SE +/- 0.02, N = 327.9527.9426.501. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

Y-Cruncher

Pi Digits To Calculate: 1B

OpenBenchmarking.orgSeconds, Fewer Is BetterY-Cruncher 0.8.3Pi Digits To Calculate: 1Bbca3691215SE +/- 0.006, N = 39.9159.9299.946

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 8 - Input: Bosphorus 4Kbca1530456075SE +/- 0.17, N = 368.4267.7867.321. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 4 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 4 - Input: Bosphorus 1080pbca48121620SE +/- 0.16, N = 317.6317.5717.121. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 12 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 12 - Input: Bosphorus 4Kbca4080120160200SE +/- 2.19, N = 3193.58193.43192.631. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 13 - Input: Bosphorus 4Kacb4080120160200SE +/- 1.23, N = 3195.90194.47191.931. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Y-Cruncher

Pi Digits To Calculate: 500M

OpenBenchmarking.orgSeconds, Fewer Is BetterY-Cruncher 0.8.3Pi Digits To Calculate: 500Mabc1.10662.21323.31984.42645.533SE +/- 0.007, N = 34.8974.9144.918

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 8 - Input: Bosphorus 1080pbca306090120150SE +/- 1.85, N = 3133.52130.35129.021. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 12 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 12 - Input: Bosphorus 1080pabc110220330440550SE +/- 4.17, N = 3506.25505.54503.511. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 1.8Encoder Mode: Preset 13 - Input: Bosphorus 1080pcab130260390520650SE +/- 1.86, N = 3603.60599.85599.421. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq


Phoronix Test Suite v10.8.5