n1n1
ARMv8 Neoverse-N1 testing with a GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2403174-NE-N1N13670960&rdt&grr
n1n1 system details (configurations a, aa, b, c)
  Processor: ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores)
  Motherboard: GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS)
  Chipset: Ampere Computing LLC Altra PCI Root Complex A
  Memory: 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE
  Disk: 800GB Micron_7450_MTFDKBA800TFS
  Graphics: ASPEED
  Monitor: VGA HDMI
  Network: 2 x Intel I350
  OS: Ubuntu 23.10
  Kernel: 6.5.0-15-generic (aarch64)
  Compiler: GCC 13.2.0
  File-System: ext4
  Screen Resolution: 1024x768
Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Processor Details - Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
Python Details - Python 3.11.6
Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
n1n1 deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream stockfish: Chess Benchmark build-linux-kernel: allmodconfig deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: Llama2 Chat 7b Quantized - Asynchronous Multi-Stream deepsparse: Llama2 Chat 7b Quantized - Asynchronous Multi-Stream openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU jpegxl: JPEG - 90 openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU jpegxl: PNG - 90 onednn: Recurrent Neural Network Training - CPU build-linux-kernel: defconfig jpegxl-decode: 1 onednn: Recurrent Neural Network Inference - CPU deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP32 - CPU deepsparse: Llama2 Chat 7b Quantized - Synchronous Single-Stream deepsparse: Llama2 Chat 7b Quantized - Synchronous Single-Stream openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Vehicle 
Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Face Detection Retail FP16 - CPU deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream svt-av1: Preset 4 - Bosphorus 4K deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base 
uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream jpegxl: JPEG - 80 deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream jpegxl: PNG - 80 primesieve: 1e13 deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Synchronous Single-Stream deepsparse: ResNet-50, Baseline - Synchronous Single-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream srsran: PUSCH Processor Benchmark, Throughput Total encode-wavpack: WAV To WavPack srsran: PUSCH Processor Benchmark, Throughput Thread srsran: PDSCH Processor Benchmark, Throughput Total svt-av1: Preset 8 - 
Bosphorus 4K onednn: Deconvolution Batch shapes_1d - CPU srsran: PDSCH Processor Benchmark, Throughput Thread jpegxl-decode: All svt-av1: Preset 4 - Bosphorus 1080p jpegxl: PNG - 100 onednn: IP Shapes 1D - CPU jpegxl: JPEG - 100 draco: Church Facade svt-av1: Preset 8 - Bosphorus 1080p draco: Lion onednn: IP Shapes 3D - CPU svt-av1: Preset 12 - Bosphorus 4K svt-av1: Preset 13 - Bosphorus 4K onednn: Deconvolution Batch shapes_3d - CPU onednn: Convolution Batch Shapes Auto - CPU primesieve: 1e12 svt-av1: Preset 12 - Bosphorus 1080p compress-pbzip2: FreeBSD-13.0-RELEASE-amd64-memstick.img Compression svt-av1: Preset 13 - Bosphorus 1080p a aa b c 59028775 37.591 39.249 94.273 27.237 2.652 39.268 43.097 1602.1 46.7 14099.8 24.945 175.8 558.569 8.914 29.603 31.665 56.897 74.682 74.896 265.743 364.399 1337.5918 46.6120 23.5039 2678.2382 59449725 348.018 32.6597 30.5961 21332.8931 2.2602 11232.43 2.74 37.415 10877.53 2.84 37.895 3738.39 92.760 27.152 1460.94 3.1508 315.7450 2150.30 14.77 2156.87 14.73 77.3026 12.9298 913.41 34.90 794.06 40.11 486.11 65.60 7.5520 132.1849 193.84 164.82 357.86 89.35 156.22 204.69 108.97 293.47 146.71 217.95 224.22 142.60 216.18 147.76 143.42 222.86 194.88 163.95 95.98 333.15 47.28 676.59 38.3162 26.0871 22.80 1402.51 21.86 1462.94 38.0726 26.2531 2.644 143.8317 438.7131 19.7459 50.5976 55.0261 1149.4724 1844.1246 33.4187 1840.3677 33.5337 310.5129 202.6359 38.921 8.8709 112.5334 9.0819 109.9523 40.279 42.305 182.8789 345.1080 7.4741 133.5301 132.9525 474.8976 132.5425 476.3557 7.4691 133.6218 25.199 13936.1 24.927 20.9255 175.7 523.019 8.925 29.238 4.84065 31.121 10100 57.135 7351 2.15582 74.469 74.900 2.79626 4.29470 2.911 264.978 2.413553 363.354 1335.3456 46.715 23.4116 2688.9567 51901853 350.294 32.5272 30.7211 21231.5641 2.2754 11206.13 2.75 37.79 10891.93 2.84 39.669 3737.15 94.426 27.417 1461 3.1835 312.4633 2151.85 14.77 2140.2 14.84 77.4941 12.8977 915.17 34.79 793.31 40.15 486.9 65.49 7.5061 132.9867 193.93 164.75 358.5 89.19 155.74 
205.33 107.5 297.48 144.38 221.47 224.22 142.58 217.16 147.08 142.79 223.85 194.84 163.98 96.9 329.97 48.1 664.78 38.5246 25.9453 22.79 1402.97 21.71 1473.23 37.9374 26.3468 2.65 143.4837 439.6023 19.6933 50.7328 55.2555 1144.8012 1834.8257 33.7125 1835.2572 33.5843 311.1676 202.1471 37.766 8.8483 112.8291 9.1198 109.4784 41.309 42.441 181.8956 346.6699 7.4484 134.0016 132.7356 475.8212 131.9529 478.6418 7.4692 133.6253 25.205 13999.6 25.006 20.4308 564.893 8.921 29.494 4.88015 31.621 9847 57.027 7320 2.15178 75.167 74.958 2.78238 4.28036 2.872 265.435 2.439338 365.102 1333.7177 46.6799 23.9126 2630.334 53514996 349.915 32.5835 30.6675 21169.2953 2.2836 11196.54 2.75 35.843 10876.7 2.84 39.251 3738.53 94.496 27.396 1469.65 3.1449 316.347 2157.45 14.73 2146.07 14.8 77.119 12.9605 913.21 34.88 792 40.21 486.11 65.6 7.5924 131.4797 193.87 164.79 357.41 89.3 154.3 207.24 108.56 294.58 145.82 219.27 224.31 142.54 217.41 146.9 143.48 222.78 194.65 164.13 96.38 331.77 47.72 670.19 37.9597 26.3315 22.78 1403.65 21.89 1460.72 38.1495 26.2002 2.65 143.6025 438.2501 19.7262 50.649 55.2435 1144.7727 1833.4487 33.6797 1836.793 33.6663 312.7909 201.0244 39.315 8.8461 112.8521 8.982 111.1578 41.354 42.294 185.7196 339.9 7.4767 133.4866 131.5237 479.9901 131.8594 478.3732 7.454 133.8853 25.2 24.952 20.8925 542.103 8.926 29.544 4.88858 31.624 9848 56.789 7332 2.14878 75.015 74.604 2.80386 4.28461 2.893 264.28 2.438631 363.612 OpenBenchmarking.org
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  aa: 1337.59, b: 1335.35, c: 1333.72
  SE +/- 3.19, N = 3
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
  aa: 46.61, b: 46.72, c: 46.68
  SE +/- 0.11, N = 3
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  aa: 23.50, b: 23.41, c: 23.91
  SE +/- 0.06, N = 3
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
  aa: 2678.24, b: 2688.96, c: 2630.33
  SE +/- 6.53, N = 3
Stockfish 16.1 - Chess Benchmark (Nodes Per Second, More Is Better)
  a: 59028775, aa: 59449725, b: 51901853, c: 53514996
  SE +/- 1497045.19, N = 12
  (CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -flto -flto-partition=one -flto=jobserver
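The Stockfish figures differ more between runs than most entries in this export, so a small calculation can put the gap in context. The following is a minimal sketch in Python, not part of the Phoronix Test Suite: the helper names `percent_below_best` and `spread_in_se_units` are hypothetical, and treating the single printed SE as a yardstick for all four runs is an assumption, since the export does not say which run it belongs to.

```python
# Nodes-per-second values copied from the Stockfish 16.1 entry above.
results = {"a": 59028775, "aa": 59449725, "b": 51901853, "c": 53514996}
# The single "SE +/-" figure printed with the chart; the export does not say
# which run it belongs to, so using it as a common yardstick is an assumption.
reported_se = 1497045.19


def percent_below_best(value, best):
    """Return how far `value` falls below `best`, in percent of `best`."""
    return (best - value) / best * 100.0


def spread_in_se_units(values, se):
    """Return (max - min) of the values divided by the standard error."""
    return (max(values) - min(values)) / se


best = max(results.values())
for run, nps in results.items():
    print(f"{run}: {nps} nodes/s, {percent_below_best(nps, best):.1f}% below best")
print(f"spread = {spread_in_se_units(results.values(), reported_se):.1f} x SE")
```

A spread of several standard errors suggests the difference between runs is larger than the run-to-run noise of a single configuration, though with one SE value available this is only a rough indication.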
Timed Linux Kernel Compilation 6.8 - Build: allmodconfig (Seconds, Fewer Is Better)
  aa: 348.02, b: 350.29, c: 349.92
  SE +/- 0.68, N = 3
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  aa: 32.66, b: 32.53, c: 32.58
  SE +/- 0.01, N = 3
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  aa: 30.60, b: 30.72, c: 30.67
  SE +/- 0.01, N = 3
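Each DeepSparse test reports both a latency (ms/batch) and a throughput (items/sec). For the Synchronous Single-Stream scenario the two should be close to reciprocals of each other if batches run strictly one at a time with one item per batch. That is an assumption made here for illustration, not something the export states. The sketch below checks it against the two entries above; `implied_items_per_sec` is a hypothetical helper.

```python
# (ms/batch, items/sec) pairs copied from the two entries above.
reported = {"aa": (32.66, 30.60), "b": (32.53, 30.72), "c": (32.58, 30.67)}


def implied_items_per_sec(ms_per_batch, batch_size=1):
    """Throughput implied by a latency if batches run strictly one at a time.

    batch_size=1 is an assumption; the export does not list the batch size.
    """
    return batch_size * 1000.0 / ms_per_batch


for run, (latency_ms, items_per_sec) in reported.items():
    implied = implied_items_per_sec(latency_ms)
    gap = (implied - items_per_sec) / items_per_sec * 100.0
    print(f"{run}: implied {implied:.2f} items/sec, "
          f"reported {items_per_sec:.2f} ({gap:+.2f}%)")
```

The same reciprocal check is not expected to hold for the Asynchronous Multi-Stream entries, where several requests are in flight at once.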
Neural Magic DeepSparse 1.7 - Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
  aa: 21332.89, b: 21231.56, c: 21169.30
  SE +/- 55.68, N = 3
Neural Magic DeepSparse 1.7 - Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
  aa: 2.2602, b: 2.2754, c: 2.2836
  SE +/- 0.0074, N = 3
OpenVINO 2024.0 - Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  aa: 11232.43 (MIN: 6926.76 / MAX: 21113.44), b: 11206.13 (MIN: 7011.32 / MAX: 20429.17), c: 11196.54 (MIN: 7222.84 / MAX: 20603.63)
  SE +/- 9.32, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
  aa: 2.74, b: 2.75, c: 2.75
  SE +/- 0.00, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
JPEG-XL libjxl 0.10.1 - Input: JPEG - Quality: 90 (MP/s, More Is Better)
  a: 37.59, aa: 37.42, b: 37.79, c: 35.84
  SE +/- 0.45, N = 15
  (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
OpenVINO 2024.0 - Model: Face Detection FP16 - Device: CPU (ms, Fewer Is Better)
  aa: 10877.53 (MIN: 4104.89 / MAX: 18949.05), b: 10891.93 (MIN: 3821.31 / MAX: 19031.99), c: 10876.70 (MIN: 3255.92 / MAX: 18738.42)
  SE +/- 17.40, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Face Detection FP16 - Device: CPU (FPS, More Is Better)
  aa: 2.84, b: 2.84, c: 2.84
  SE +/- 0.01, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
JPEG-XL libjxl 0.10.1 - Input: PNG - Quality: 90 (MP/s, More Is Better)
  a: 39.25, aa: 37.90, b: 39.67, c: 39.25
  SE +/- 0.55, N = 15
  (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
oneDNN 3.4 - Harness: Recurrent Neural Network Training - Engine: CPU (ms, Fewer Is Better)
  aa: 3738.39 (MIN: 3728.79), b: 3737.15 (MIN: 3730.87), c: 3738.53 (MIN: 3730.99)
  SE +/- 2.30, N = 3
  (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
Timed Linux Kernel Compilation 6.8 - Build: defconfig (Seconds, Fewer Is Better)
  a: 94.27, aa: 92.76, b: 94.43, c: 94.50
  SE +/- 0.90, N = 3
JPEG-XL Decoding libjxl 0.10.1 - CPU Threads: 1 (MP/s, More Is Better)
  a: 27.24, aa: 27.15, b: 27.42, c: 27.40
  SE +/- 0.01, N = 3
oneDNN 3.4 - Harness: Recurrent Neural Network Inference - Engine: CPU (ms, Fewer Is Better)
  aa: 1460.94 (MIN: 1436.36), b: 1461.00 (MIN: 1442.49), c: 1469.65 (MIN: 1448.43)
  SE +/- 3.72, N = 3
  (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  aa: 3.1508, b: 3.1835, c: 3.1449
  SE +/- 0.0074, N = 3
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  aa: 315.75, b: 312.46, c: 316.35
  SE +/- 0.75, N = 3
OpenVINO 2024.0 - Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better)
  aa: 2150.30 (MIN: 491.1 / MAX: 2996.72), b: 2151.85 (MIN: 500.93 / MAX: 2975.2), c: 2157.45 (MIN: 644.54 / MAX: 2962.51)
  SE +/- 1.19, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Person Detection FP16 - Device: CPU (FPS, More Is Better)
  aa: 14.77, b: 14.77, c: 14.73
  SE +/- 0.01, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Person Detection FP32 - Device: CPU (ms, Fewer Is Better)
  aa: 2156.87 (MIN: 504.09 / MAX: 2990), b: 2140.20 (MIN: 527.18 / MAX: 2951.37), c: 2146.07 (MIN: 439.17 / MAX: 2969.83)
  SE +/- 2.39, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Person Detection FP32 - Device: CPU (FPS, More Is Better)
  aa: 14.73, b: 14.84, c: 14.80
  SE +/- 0.02, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 - Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  aa: 77.30, b: 77.49, c: 77.12
  SE +/- 0.12, N = 3
Neural Magic DeepSparse 1.7 - Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  aa: 12.93, b: 12.90, c: 12.96
  SE +/- 0.02, N = 3
OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  aa: 913.41 (MIN: 742.17 / MAX: 1356.42), b: 915.17 (MIN: 711.5 / MAX: 1350.07), c: 913.21 (MIN: 718.49 / MAX: 1350.67)
  SE +/- 0.49, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (FPS, More Is Better)
  aa: 34.90, b: 34.79, c: 34.88
  SE +/- 0.02, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better)
  aa: 794.06 (MIN: 604.52 / MAX: 1620.5), b: 793.31 (MIN: 559.01 / MAX: 1581.54), c: 792.00 (MIN: 568.74 / MAX: 1657.2)
  SE +/- 0.93, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better)
  aa: 40.11, b: 40.15, c: 40.21
  SE +/- 0.05, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16 - Device: CPU (ms, Fewer Is Better)
  aa: 486.11 (MIN: 118.22 / MAX: 849.31), b: 486.90 (MIN: 119.18 / MAX: 852.49), c: 486.11 (MIN: 171.7 / MAX: 813.73)
  SE +/- 0.88, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16 - Device: CPU (FPS, More Is Better)
  aa: 65.60, b: 65.49, c: 65.60
  SE +/- 0.12, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
  aa: 7.5520, b: 7.5061, c: 7.5924
  SE +/- 0.0154, N = 3
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
  aa: 132.18, b: 132.99, c: 131.48
  SE +/- 0.27, N = 3
OpenVINO 2024.0 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (ms, Fewer Is Better)
  aa: 193.84 (MIN: 183.19 / MAX: 407.14), b: 193.93 (MIN: 182.93 / MAX: 402.18), c: 193.87 (MIN: 182.85 / MAX: 406.51)
  SE +/- 0.04, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (FPS, More Is Better)
  aa: 164.82, b: 164.75, c: 164.79
  SE +/- 0.03, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  aa: 357.86 (MIN: 301.59 / MAX: 522.85), b: 358.50 (MIN: 300.19 / MAX: 528.83), c: 357.41 (MIN: 204.13 / MAX: 519.56)
  SE +/- 0.11, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
  aa: 89.35, b: 89.19, c: 89.30
  SE +/- 0.03, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better)
  aa: 156.22 (MIN: 44.3 / MAX: 240.55), b: 155.74 (MIN: 48.23 / MAX: 240.13), c: 154.30 (MIN: 44.57 / MAX: 239.56)
  SE +/- 0.49, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better)
  aa: 204.69, b: 205.33, c: 207.24
  SE +/- 0.66, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better)
  aa: 108.97 (MIN: 17.48 / MAX: 1207.62), b: 107.50 (MIN: 57.15 / MAX: 1202.08), c: 108.56 (MIN: 17.21 / MAX: 1188.34)
  SE +/- 0.11, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Weld Porosity Detection FP16 - Device: CPU (FPS, More Is Better)
  aa: 293.47, b: 297.48, c: 294.58
  SE +/- 0.30, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  aa: 146.71 (MIN: 96.02 / MAX: 1572.43), b: 144.38 (MIN: 96.65 / MAX: 1566.66), c: 145.82 (MIN: 96.38 / MAX: 1563.28)
  SE +/- 0.12, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
  aa: 217.95, b: 221.47, c: 219.27
  SE +/- 0.18, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Person Re-Identification Retail FP16 - Device: CPU (ms, Fewer Is Better)
  aa: 224.22 (MIN: 29.21 / MAX: 400.61), b: 224.22 (MIN: 36.4 / MAX: 368.76), c: 224.31 (MIN: 31.77 / MAX: 351.21)
  SE +/- 0.53, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Person Re-Identification Retail FP16 - Device: CPU (FPS, More Is Better)
  aa: 142.60, b: 142.58, c: 142.54
  SE +/- 0.34, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  aa: 216.18 (MIN: 206.9 / MAX: 376.9), b: 217.16 (MIN: 208.82 / MAX: 374.93), c: 217.41 (MIN: 210.44 / MAX: 372.96)
  SE +/- 1.21, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Handwritten English Recognition FP16-INT8 - Device: CPU (FPS, More Is Better)
  aa: 147.76, b: 147.08, c: 146.90
  SE +/- 0.83, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Vehicle Detection FP16 - Device: CPU (ms, Fewer Is Better)
  aa: 143.42 (MIN: 62.82 / MAX: 295.2), b: 142.79 (MIN: 60 / MAX: 245.21), c: 143.48 (MIN: 44.55 / MAX: 252.93)
  SE +/- 0.06, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Vehicle Detection FP16 - Device: CPU (FPS, More Is Better)
  aa: 222.86, b: 223.85, c: 222.78
  SE +/- 0.10, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Handwritten English Recognition FP16 - Device: CPU (ms, Fewer Is Better)
  aa: 194.88 (MIN: 185.7 / MAX: 356.13), b: 194.84 (MIN: 185.09 / MAX: 355.83), c: 194.65 (MIN: 185.45 / MAX: 358.03)
  SE +/- 0.08, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Handwritten English Recognition FP16 - Device: CPU (FPS, More Is Better)
  aa: 163.95, b: 163.98, c: 164.13
  SE +/- 0.06, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Face Detection Retail FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  aa: 95.98 (MIN: 71.43 / MAX: 140.32), b: 96.90 (MIN: 70.14 / MAX: 141.32), c: 96.38 (MIN: 69.36 / MAX: 140.93)
  SE +/- 0.18, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Face Detection Retail FP16-INT8 - Device: CPU (FPS, More Is Better)
  aa: 333.15, b: 329.97, c: 331.77
  SE +/- 0.61, N = 3
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16 - Device: CPU aa b c 11 22 33 44 55 SE +/- 0.60, N = 3 47.28 48.10 47.72 MIN: 10.17 / MAX: 121.04 MIN: 9.92 / MAX: 115.12 MIN: 9.97 / MAX: 99.86 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Face Detection Retail FP16 - Device: CPU (FPS, More Is Better). aa: 676.59; b: 664.78; c: 670.19. SE +/- 8.52, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7, Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better). aa: 38.32; b: 38.52; c: 37.96. SE +/- 0.16, N = 3.
Neural Magic DeepSparse 1.7, Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (items/sec, More Is Better). aa: 26.09; b: 25.95; c: 26.33. SE +/- 0.11, N = 3.
OpenVINO 2024.0, Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better). aa: 22.80 (MIN: 1.57 / MAX: 164.42); b: 22.79 (MIN: 1.59 / MAX: 165.35); c: 22.78 (MIN: 1.63 / MAX: 162.11). SE +/- 0.05, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better). aa: 1402.51; b: 1402.97; c: 1403.65. SE +/- 3.07, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better). aa: 21.86 (MIN: 2 / MAX: 157.1); b: 21.71 (MIN: 2.05 / MAX: 156.88); c: 21.89 (MIN: 2.07 / MAX: 156.71). SE +/- 0.02, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0, Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better). aa: 1462.94; b: 1473.23; c: 1460.72. SE +/- 1.48, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7, Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better). aa: 38.07; b: 37.94; c: 38.15. SE +/- 0.04, N = 3.
Neural Magic DeepSparse 1.7, Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (items/sec, More Is Better). aa: 26.25; b: 26.35; c: 26.20. SE +/- 0.02, N = 3.
SVT-AV1 2.0, Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better). a: 2.652; aa: 2.644; b: 2.650; c: 2.650. SE +/- 0.004, N = 3. (CXX) g++ options: -march=native
Neural Magic DeepSparse 1.7, Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). aa: 143.83; b: 143.48; c: 143.60. SE +/- 0.07, N = 3.
Neural Magic DeepSparse 1.7, Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). aa: 438.71; b: 439.60; c: 438.25. SE +/- 0.42, N = 3.
Neural Magic DeepSparse 1.7, Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better). aa: 19.75; b: 19.69; c: 19.73. SE +/- 0.03, N = 3.
Neural Magic DeepSparse 1.7, Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better). aa: 50.60; b: 50.73; c: 50.65. SE +/- 0.08, N = 3.
Neural Magic DeepSparse 1.7, Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). aa: 55.03; b: 55.26; c: 55.24. SE +/- 0.11, N = 3.
Neural Magic DeepSparse 1.7, Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). aa: 1149.47; b: 1144.80; c: 1144.77. SE +/- 2.84, N = 3.
Neural Magic DeepSparse 1.7, Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). aa: 1844.12; b: 1834.83; c: 1833.45. SE +/- 1.78, N = 3.
Neural Magic DeepSparse 1.7, Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). aa: 33.42; b: 33.71; c: 33.68. SE +/- 0.02, N = 3.
Neural Magic DeepSparse 1.7, Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). aa: 1840.37; b: 1835.26; c: 1836.79. SE +/- 1.00, N = 3.
Neural Magic DeepSparse 1.7, Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). aa: 33.53; b: 33.58; c: 33.67. SE +/- 0.04, N = 3.
Neural Magic DeepSparse 1.7, Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). aa: 310.51; b: 311.17; c: 312.79. SE +/- 0.51, N = 3.
Neural Magic DeepSparse 1.7, Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). aa: 202.64; b: 202.15; c: 201.02. SE +/- 0.34, N = 3.
JPEG-XL libjxl 0.10.1, Input: JPEG - Quality: 80 (MP/s, More Is Better). a: 39.27; aa: 38.92; b: 37.77; c: 39.32. SE +/- 0.12, N = 3. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
Neural Magic DeepSparse 1.7, Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better). aa: 8.8709; b: 8.8483; c: 8.8461. SE +/- 0.0129, N = 3.
Neural Magic DeepSparse 1.7, Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better). aa: 112.53; b: 112.83; c: 112.85. SE +/- 0.16, N = 3.
Neural Magic DeepSparse 1.7, Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better). aa: 9.0819; b: 9.1198; c: 8.9820. SE +/- 0.0713, N = 3.
Neural Magic DeepSparse 1.7, Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (items/sec, More Is Better). aa: 109.95; b: 109.48; c: 111.16. SE +/- 0.86, N = 3.
JPEG-XL libjxl 0.10.1, Input: PNG - Quality: 80 (MP/s, More Is Better). a: 43.10; aa: 40.28; b: 41.31; c: 41.35. SE +/- 0.30, N = 3. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
Primesieve 12.1, Length: 1e13 (Seconds, Fewer Is Better). aa: 42.31; b: 42.44; c: 42.29. SE +/- 0.07, N = 3. (CXX) g++ options: -O3
Neural Magic DeepSparse 1.7, Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). aa: 182.88; b: 181.90; c: 185.72. SE +/- 0.08, N = 3.
Neural Magic DeepSparse 1.7, Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). aa: 345.11; b: 346.67; c: 339.90. SE +/- 0.25, N = 3.
Neural Magic DeepSparse 1.7, Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better). aa: 7.4741; b: 7.4484; c: 7.4767. SE +/- 0.0064, N = 3.
Neural Magic DeepSparse 1.7, Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream (items/sec, More Is Better). aa: 133.53; b: 134.00; c: 133.49. SE +/- 0.11, N = 3.
Neural Magic DeepSparse 1.7, Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). aa: 132.95; b: 132.74; c: 131.52. SE +/- 0.39, N = 3.
Neural Magic DeepSparse 1.7, Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). aa: 474.90; b: 475.82; c: 479.99. SE +/- 1.33, N = 3.
Neural Magic DeepSparse 1.7, Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). aa: 132.54; b: 131.95; c: 131.86. SE +/- 0.34, N = 3.
Neural Magic DeepSparse 1.7, Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). aa: 476.36; b: 478.64; c: 478.37. SE +/- 1.18, N = 3.
Neural Magic DeepSparse 1.7, Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better). aa: 7.4691; b: 7.4692; c: 7.4540. SE +/- 0.0095, N = 3.
Neural Magic DeepSparse 1.7, Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (items/sec, More Is Better). aa: 133.62; b: 133.63; c: 133.89. SE +/- 0.17, N = 3.
srsRAN Project 23.10.1-20240219, Test: PUSCH Processor Benchmark, Throughput Total (Mbps, More Is Better). a: 1602.1 (MIN: 947.2). (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
WavPack Audio Encoding 5.7, WAV To WavPack (Seconds, Fewer Is Better). aa: 25.20; b: 25.21; c: 25.20. SE +/- 0.00, N = 5.
srsRAN Project 23.10.1-20240219, Test: PUSCH Processor Benchmark, Throughput Thread (Mbps, More Is Better). a: 46.7 (MIN: 28.9). (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
srsRAN Project 23.10.1-20240219, Test: PDSCH Processor Benchmark, Throughput Total (Mbps, More Is Better). a: 14099.8; aa: 13936.1; b: 13999.6. SE +/- 42.60, N = 3. (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
SVT-AV1 2.0, Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better). a: 24.95; aa: 24.93; b: 25.01; c: 24.95. SE +/- 0.01, N = 3. (CXX) g++ options: -march=native
oneDNN 3.4, Harness: Deconvolution Batch shapes_1d - Engine: CPU (ms, Fewer Is Better). aa: 20.93 (MIN: 19.34); b: 20.43 (MIN: 19.32); c: 20.89 (MIN: 19.81). SE +/- 0.20, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
srsRAN Project 23.10.1-20240219, Test: PDSCH Processor Benchmark, Throughput Thread (Mbps, More Is Better). a: 175.8; aa: 175.7. SE +/- 0.03, N = 3. (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
JPEG-XL Decoding libjxl 0.10.1, CPU Threads: All (MP/s, More Is Better). a: 558.57; aa: 523.02; b: 564.89; c: 542.10. SE +/- 1.96, N = 3.
SVT-AV1 2.0, Encoder Mode: Preset 4 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). a: 8.914; aa: 8.925; b: 8.921; c: 8.926. SE +/- 0.010, N = 3. (CXX) g++ options: -march=native
JPEG-XL libjxl 0.10.1, Input: PNG - Quality: 100 (MP/s, More Is Better). a: 29.60; aa: 29.24; b: 29.49; c: 29.54. SE +/- 0.04, N = 3. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
oneDNN 3.4, Harness: IP Shapes 1D - Engine: CPU (ms, Fewer Is Better). aa: 4.84065 (MIN: 4.25); b: 4.88015 (MIN: 4.23); c: 4.88858 (MIN: 4.3). SE +/- 0.01022, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
JPEG-XL libjxl 0.10.1, Input: JPEG - Quality: 100 (MP/s, More Is Better). a: 31.67; aa: 31.12; b: 31.62; c: 31.62. SE +/- 0.00, N = 3. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
Google Draco 1.5.6, Model: Church Facade (ms, Fewer Is Better). aa: 10100; b: 9847; c: 9848. SE +/- 6.24, N = 3. (CXX) g++ options: -O3
SVT-AV1 2.0, Encoder Mode: Preset 8 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). a: 56.90; aa: 57.14; b: 57.03; c: 56.79. SE +/- 0.06, N = 3. (CXX) g++ options: -march=native
Google Draco 1.5.6, Model: Lion (ms, Fewer Is Better). aa: 7351; b: 7320; c: 7332. SE +/- 1.86, N = 3. (CXX) g++ options: -O3
oneDNN 3.4, Harness: IP Shapes 3D - Engine: CPU (ms, Fewer Is Better). aa: 2.15582 (MIN: 2.06); b: 2.15178 (MIN: 2.06); c: 2.14878 (MIN: 2.06). SE +/- 0.00137, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
SVT-AV1 2.0, Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better). a: 74.68; aa: 74.47; b: 75.17; c: 75.02. SE +/- 0.28, N = 3. (CXX) g++ options: -march=native
SVT-AV1 2.0, Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better). a: 74.90; aa: 74.90; b: 74.96; c: 74.60. SE +/- 0.19, N = 3. (CXX) g++ options: -march=native
oneDNN 3.4, Harness: Deconvolution Batch shapes_3d - Engine: CPU (ms, Fewer Is Better). aa: 2.79626 (MIN: 2.68); b: 2.78238 (MIN: 2.72); c: 2.80386 (MIN: 2.7). SE +/- 0.01912, N = 12. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
oneDNN 3.4, Harness: Convolution Batch Shapes Auto - Engine: CPU (ms, Fewer Is Better). aa: 4.29470 (MIN: 4.16); b: 4.28036 (MIN: 4.17); c: 4.28461 (MIN: 4.14). SE +/- 0.01638, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
Primesieve 12.1, Length: 1e12 (Seconds, Fewer Is Better). aa: 2.911; b: 2.872; c: 2.893. SE +/- 0.003, N = 3. (CXX) g++ options: -O3
SVT-AV1 2.0, Encoder Mode: Preset 12 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). a: 265.74; aa: 264.98; b: 265.44; c: 264.28. SE +/- 0.05, N = 3. (CXX) g++ options: -march=native
Parallel BZIP2 Compression 1.1.13, FreeBSD-13.0-RELEASE-amd64-memstick.img Compression (Seconds, Fewer Is Better). aa: 2.413553; b: 2.439338; c: 2.438631. SE +/- 0.001512, N = 3. (CXX) g++ options: -O2 -pthread -lbz2 -lpthread
SVT-AV1 2.0, Encoder Mode: Preset 13 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). a: 364.40; aa: 363.35; b: 365.10; c: 363.61. SE +/- 0.57, N = 3. (CXX) g++ options: -march=native
Phoronix Test Suite v10.8.5