n1n1 ARMv8 Neoverse-N1 testing with a GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2403174-NE-N1N13670960&grt .
n1n1 Processor Motherboard Chipset Memory Disk Graphics Monitor Network OS Kernel Compiler File-System Screen Resolution a aa b c ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores) GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS) Ampere Computing LLC Altra PCI Root Complex A 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE 800GB Micron_7450_MTFDKBA800TFS ASPEED VGA HDMI 2 x Intel I350 Ubuntu 23.10 6.5.0-15-generic (aarch64) GCC 13.2.0 ext4 1024x768 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v Processor Details - Scaling Governor: cppc_cpufreq performance (Boost: Disabled) Python Details - Python 3.11.6 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
n1n1 draco: Lion draco: Church Facade jpegxl-decode: 1 jpegxl-decode: All jpegxl: PNG - 80 jpegxl: PNG - 90 jpegxl: JPEG - 80 jpegxl: JPEG - 90 jpegxl: PNG - 100 jpegxl: JPEG - 100 deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Synchronous Single-Stream deepsparse: ResNet-50, Baseline - Synchronous Single-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream deepsparse: Llama2 Chat 7b Quantized - Asynchronous Multi-Stream deepsparse: Llama2 Chat 7b Quantized - Asynchronous Multi-Stream deepsparse: Llama2 Chat 7b Quantized - Synchronous Single-Stream deepsparse: Llama2 Chat 7b Quantized - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream onednn: IP Shapes 1D - CPU onednn: IP Shapes 3D - CPU onednn: Convolution Batch Shapes Auto - CPU onednn: Deconvolution Batch shapes_1d - CPU onednn: Deconvolution Batch shapes_3d - CPU onednn: Recurrent Neural Network Training - CPU onednn: Recurrent Neural Network Inference - CPU openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP32 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Face Detection Retail FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Face Detection Retail FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16-INT8 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Handwritten English Recognition FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU compress-pbzip2: FreeBSD-13.0-RELEASE-amd64-memstick.img Compression primesieve: 1e12 primesieve: 1e13 srsran: PDSCH Processor Benchmark, Throughput Total srsran: PUSCH Processor Benchmark, Throughput Total srsran: PDSCH Processor Benchmark, Throughput Thread srsran: PUSCH Processor Benchmark, Throughput Thread stockfish: Chess Benchmark svt-av1: Preset 4 - Bosphorus 4K svt-av1: Preset 8 - Bosphorus 4K svt-av1: Preset 12 - Bosphorus 4K svt-av1: Preset 13 - Bosphorus 4K svt-av1: Preset 4 - Bosphorus 1080p svt-av1: Preset 8 - Bosphorus 1080p svt-av1: Preset 12 - Bosphorus 1080p svt-av1: Preset 13 - Bosphorus 1080p build-linux-kernel: defconfig build-linux-kernel: allmodconfig encode-wavpack: WAV To WavPack a aa b c 27.237 558.569 43.097 39.249 39.268 37.591 29.603 31.665 14099.8 1602.1 175.8 46.7 59028775 2.652 24.945 74.682 74.896 8.914 56.897 265.743 364.399 94.273 7351 10100 27.152 523.019 40.279 37.895 38.921 37.415 29.238 31.121 33.4187 1844.1246 26.0871 38.3162 1149.4724 55.0261 132.1849 7.5520 474.8976 132.9525 133.5301 7.4741 2678.2382 23.5039 315.7450 3.1508 2.2602 21332.8931 12.9298 77.3026 476.3557 132.5425 133.6218 7.4691 202.6359 310.5129 112.5334 8.8709 345.1080 182.8789 109.9523 9.0819 46.6120 1337.5918 30.5961 32.6597 438.7131 143.8317 50.5976 19.7459 33.5337 1840.3677 26.2531 38.0726 4.84065 2.15582 4.29470 20.9255 2.79626 3738.39 1460.94 2.84 10877.53 14.77 2150.30 14.73 2156.87 222.86 143.42 2.74 11232.43 676.59 47.28 65.60 486.11 89.35 357.86 293.47 108.97 333.15 95.98 34.90 913.41 40.11 794.06 217.95 146.71 204.69 156.22 164.82 193.84 163.95 194.88 142.60 224.22 1402.51 22.80 147.76 216.18 1462.94 21.86 2.413553 2.911 42.305 13936.1 175.7 59449725 2.644 24.927 74.469 74.900 8.925 57.135 264.978 363.354 92.760 348.018 25.199 7320 9847 27.417 564.893 41.309 39.669 37.766 37.79 29.494 31.621 33.7125 1834.8257 25.9453 38.5246 1144.8012 55.2555 132.9867 7.5061 475.8212 132.7356 134.0016 7.4484 2688.9567 23.4116 312.4633 3.1835 2.2754 21231.5641 12.8977 77.4941 478.6418 131.9529 133.6253 7.4692 202.1471 311.1676 112.8291 8.8483 346.6699 181.8956 109.4784 9.1198 46.715 1335.3456 30.7211 32.5272 439.6023 143.4837 50.7328 19.6933 33.5843 1835.2572 26.3468 37.9374 4.88015 2.15178 4.28036 20.4308 2.78238 3737.15 1461 2.84 10891.93 14.77 2151.85 14.84 2140.2 223.85 142.79 2.75 11206.13 664.78 48.1 65.49 486.9 89.19 358.5 297.48 107.5 329.97 96.9 34.79 915.17 40.15 793.31 221.47 144.38 205.33 155.74 164.75 193.93 163.98 194.84 142.58 224.22 1402.97 22.79 147.08 217.16 1473.23 21.71 2.439338 2.872 42.441 13999.6 51901853 2.65 25.006 75.167 74.958 8.921 57.027 265.435 365.102 94.426 350.294 25.205 7332 9848 27.396 542.103 41.354 39.251 39.315 35.843 29.544 31.624 33.6797 1833.4487 26.3315 37.9597 1144.7727 55.2435 131.4797 7.5924 479.9901 131.5237 133.4866 7.4767 2630.334 23.9126 316.347 3.1449 2.2836 21169.2953 12.9605 77.119 478.3732 131.8594 133.8853 7.454 201.0244 312.7909 112.8521 8.8461 339.9 185.7196 111.1578 8.982 46.6799 1333.7177 30.6675 32.5835 438.2501 143.6025 50.649 19.7262 33.6663 1836.793 26.2002 38.1495 4.88858 2.14878 4.28461 20.8925 2.80386 3738.53 1469.65 2.84 10876.7 14.73 2157.45 14.8 2146.07 222.78 143.48 2.75 11196.54 670.19 47.72 65.6 486.11 89.3 357.41 294.58 108.56 331.77 96.38 34.88 913.21 40.21 792 219.27 145.82 207.24 154.3 164.79 193.87 164.13 194.65 142.54 224.31 1403.65 22.78 146.9 217.41 1460.72 21.89 2.438631 2.893 42.294 53514996 2.65 24.952 75.015 74.604 8.926 56.789 264.28 363.612 94.496 349.915 25.2 OpenBenchmarking.org
Google Draco Model: Lion OpenBenchmarking.org ms, Fewer Is Better Google Draco 1.5.6 Model: Lion aa b c 1600 3200 4800 6400 8000 SE +/- 1.86, N = 3 7351 7320 7332 1. (CXX) g++ options: -O3
Google Draco Model: Church Facade OpenBenchmarking.org ms, Fewer Is Better Google Draco 1.5.6 Model: Church Facade aa b c 2K 4K 6K 8K 10K SE +/- 6.24, N = 3 10100 9847 9848 1. (CXX) g++ options: -O3
JPEG-XL Decoding libjxl CPU Threads: 1 OpenBenchmarking.org MP/s, More Is Better JPEG-XL Decoding libjxl 0.10.1 CPU Threads: 1 a aa b c 6 12 18 24 30 SE +/- 0.01, N = 3 27.24 27.15 27.42 27.40
JPEG-XL Decoding libjxl CPU Threads: All OpenBenchmarking.org MP/s, More Is Better JPEG-XL Decoding libjxl 0.10.1 CPU Threads: All a aa b c 120 240 360 480 600 SE +/- 1.96, N = 3 558.57 523.02 564.89 542.10
JPEG-XL libjxl Input: PNG - Quality: 80 OpenBenchmarking.org MP/s, More Is Better JPEG-XL libjxl 0.10.1 Input: PNG - Quality: 80 a aa b c 10 20 30 40 50 SE +/- 0.30, N = 3 43.10 40.28 41.31 41.35 1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
JPEG-XL libjxl Input: PNG - Quality: 90 OpenBenchmarking.org MP/s, More Is Better JPEG-XL libjxl 0.10.1 Input: PNG - Quality: 90 a aa b c 9 18 27 36 45 SE +/- 0.55, N = 15 39.25 37.90 39.67 39.25 1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
JPEG-XL libjxl Input: JPEG - Quality: 80 OpenBenchmarking.org MP/s, More Is Better JPEG-XL libjxl 0.10.1 Input: JPEG - Quality: 80 a aa b c 9 18 27 36 45 SE +/- 0.12, N = 3 39.27 38.92 37.77 39.32 1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
JPEG-XL libjxl Input: JPEG - Quality: 90 OpenBenchmarking.org MP/s, More Is Better JPEG-XL libjxl 0.10.1 Input: JPEG - Quality: 90 a aa b c 9 18 27 36 45 SE +/- 0.45, N = 15 37.59 37.42 37.79 35.84 1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
JPEG-XL libjxl Input: PNG - Quality: 100 OpenBenchmarking.org MP/s, More Is Better JPEG-XL libjxl 0.10.1 Input: PNG - Quality: 100 a aa b c 7 14 21 28 35 SE +/- 0.04, N = 3 29.60 29.24 29.49 29.54 1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
JPEG-XL libjxl Input: JPEG - Quality: 100 OpenBenchmarking.org MP/s, More Is Better JPEG-XL libjxl 0.10.1 Input: JPEG - Quality: 100 a aa b c 7 14 21 28 35 SE +/- 0.00, N = 3 31.67 31.12 31.62 31.62 1. (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream aa b c 8 16 24 32 40 SE +/- 0.02, N = 3 33.42 33.71 33.68
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream aa b c 400 800 1200 1600 2000 SE +/- 1.78, N = 3 1844.12 1834.83 1833.45
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream aa b c 6 12 18 24 30 SE +/- 0.11, N = 3 26.09 25.95 26.33
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream aa b c 9 18 27 36 45 SE +/- 0.16, N = 3 38.32 38.52 37.96
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream aa b c 200 400 600 800 1000 SE +/- 2.84, N = 3 1149.47 1144.80 1144.77
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream aa b c 12 24 36 48 60 SE +/- 0.11, N = 3 55.03 55.26 55.24
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream aa b c 30 60 90 120 150 SE +/- 0.27, N = 3 132.18 132.99 131.48
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream aa b c 2 4 6 8 10 SE +/- 0.0154, N = 3 7.5520 7.5061 7.5924
Neural Magic DeepSparse Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream aa b c 100 200 300 400 500 SE +/- 1.33, N = 3 474.90 475.82 479.99
Neural Magic DeepSparse Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream aa b c 30 60 90 120 150 SE +/- 0.39, N = 3 132.95 132.74 131.52
Neural Magic DeepSparse Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream aa b c 30 60 90 120 150 SE +/- 0.11, N = 3 133.53 134.00 133.49
Neural Magic DeepSparse Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream aa b c 2 4 6 8 10 SE +/- 0.0064, N = 3 7.4741 7.4484 7.4767
Neural Magic DeepSparse Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream aa b c 600 1200 1800 2400 3000 SE +/- 6.53, N = 3 2678.24 2688.96 2630.33
Neural Magic DeepSparse Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream aa b c 6 12 18 24 30 SE +/- 0.06, N = 3 23.50 23.41 23.91
Neural Magic DeepSparse Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream aa b c 70 140 210 280 350 SE +/- 0.75, N = 3 315.75 312.46 316.35
Neural Magic DeepSparse Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream aa b c 0.7163 1.4326 2.1489 2.8652 3.5815 SE +/- 0.0074, N = 3 3.1508 3.1835 3.1449
Neural Magic DeepSparse Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream aa b c 0.5138 1.0276 1.5414 2.0552 2.569 SE +/- 0.0074, N = 3 2.2602 2.2754 2.2836
Neural Magic DeepSparse Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream aa b c 5K 10K 15K 20K 25K SE +/- 55.68, N = 3 21332.89 21231.56 21169.30
Neural Magic DeepSparse Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream aa b c 3 6 9 12 15 SE +/- 0.02, N = 3 12.93 12.90 12.96
Neural Magic DeepSparse Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream aa b c 20 40 60 80 100 SE +/- 0.12, N = 3 77.30 77.49 77.12
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream aa b c 100 200 300 400 500 SE +/- 1.18, N = 3 476.36 478.64 478.37
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream aa b c 30 60 90 120 150 SE +/- 0.34, N = 3 132.54 131.95 131.86
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream aa b c 30 60 90 120 150 SE +/- 0.17, N = 3 133.62 133.63 133.89
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream aa b c 2 4 6 8 10 SE +/- 0.0095, N = 3 7.4691 7.4692 7.4540
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream aa b c 40 80 120 160 200 SE +/- 0.34, N = 3 202.64 202.15 201.02
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream aa b c 70 140 210 280 350 SE +/- 0.51, N = 3 310.51 311.17 312.79
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream aa b c 30 60 90 120 150 SE +/- 0.16, N = 3 112.53 112.83 112.85
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream aa b c 2 4 6 8 10 SE +/- 0.0129, N = 3 8.8709 8.8483 8.8461
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream aa b c 80 160 240 320 400 SE +/- 0.25, N = 3 345.11 346.67 339.90
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream aa b c 40 80 120 160 200 SE +/- 0.08, N = 3 182.88 181.90 185.72
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream aa b c 20 40 60 80 100 SE +/- 0.86, N = 3 109.95 109.48 111.16
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream aa b c 3 6 9 12 15 SE +/- 0.0713, N = 3 9.0819 9.1198 8.9820
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream aa b c 11 22 33 44 55 SE +/- 0.11, N = 3 46.61 46.72 46.68
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream aa b c 300 600 900 1200 1500 SE +/- 3.19, N = 3 1337.59 1335.35 1333.72
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream aa b c 7 14 21 28 35 SE +/- 0.01, N = 3 30.60 30.72 30.67
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream aa b c 8 16 24 32 40 SE +/- 0.01, N = 3 32.66 32.53 32.58
Neural Magic DeepSparse Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream aa b c 100 200 300 400 500 SE +/- 0.42, N = 3 438.71 439.60 438.25
Neural Magic DeepSparse Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream aa b c 30 60 90 120 150 SE +/- 0.07, N = 3 143.83 143.48 143.60
Neural Magic DeepSparse Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream aa b c 11 22 33 44 55 SE +/- 0.08, N = 3 50.60 50.73 50.65
Neural Magic DeepSparse Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream aa b c 5 10 15 20 25 SE +/- 0.03, N = 3 19.75 19.69 19.73
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream aa b c 8 16 24 32 40 SE +/- 0.04, N = 3 33.53 33.58 33.67
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream aa b c 400 800 1200 1600 2000 SE +/- 1.00, N = 3 1840.37 1835.26 1836.79
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream aa b c 6 12 18 24 30 SE +/- 0.02, N = 3 26.25 26.35 26.20
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream aa b c 9 18 27 36 45 SE +/- 0.04, N = 3 38.07 37.94 38.15
oneDNN Harness: IP Shapes 1D - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: IP Shapes 1D - Engine: CPU aa b c 1.0999 2.1998 3.2997 4.3996 5.4995 SE +/- 0.01022, N = 3 4.84065 4.88015 4.88858 MIN: 4.25 MIN: 4.23 MIN: 4.3 1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
oneDNN Harness: IP Shapes 3D - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: IP Shapes 3D - Engine: CPU aa b c 0.4851 0.9702 1.4553 1.9404 2.4255 SE +/- 0.00137, N = 3 2.15582 2.15178 2.14878 MIN: 2.06 MIN: 2.06 MIN: 2.06 1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
oneDNN Harness: Convolution Batch Shapes Auto - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Convolution Batch Shapes Auto - Engine: CPU aa b c 0.9663 1.9326 2.8989 3.8652 4.8315 SE +/- 0.01638, N = 3 4.29470 4.28036 4.28461 MIN: 4.16 MIN: 4.17 MIN: 4.14 1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
oneDNN Harness: Deconvolution Batch shapes_1d - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Deconvolution Batch shapes_1d - Engine: CPU aa b c 5 10 15 20 25 SE +/- 0.20, N = 3 20.93 20.43 20.89 MIN: 19.34 MIN: 19.32 MIN: 19.81 1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
oneDNN Harness: Deconvolution Batch shapes_3d - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Deconvolution Batch shapes_3d - Engine: CPU aa b c 0.6309 1.2618 1.8927 2.5236 3.1545 SE +/- 0.01912, N = 12 2.79626 2.78238 2.80386 MIN: 2.68 MIN: 2.72 MIN: 2.7 1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Training - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Recurrent Neural Network Training - Engine: CPU aa b c 800 1600 2400 3200 4000 SE +/- 2.30, N = 3 3738.39 3737.15 3738.53 MIN: 3728.79 MIN: 3730.87 MIN: 3730.99 1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Inference - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.4 Harness: Recurrent Neural Network Inference - Engine: CPU aa b c 300 600 900 1200 1500 SE +/- 3.72, N = 3 1460.94 1461.00 1469.65 MIN: 1436.36 MIN: 1442.49 MIN: 1448.43 1. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Face Detection FP16 - Device: CPU aa b c 0.639 1.278 1.917 2.556 3.195 SE +/- 0.01, N = 3 2.84 2.84 2.84 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Face Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection FP16 - Device: CPU aa b c 2K 4K 6K 8K 10K SE +/- 17.40, N = 3 10877.53 10891.93 10876.70 MIN: 4104.89 / MAX: 18949.05 MIN: 3821.31 / MAX: 19031.99 MIN: 3255.92 / MAX: 18738.42 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Detection FP16 - Device: CPU aa b c 4 8 12 16 20 SE +/- 0.01, N = 3 14.77 14.77 14.73 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Person Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Detection FP16 - Device: CPU aa b c 500 1000 1500 2000 2500 SE +/- 1.19, N = 3 2150.30 2151.85 2157.45 MIN: 491.1 / MAX: 2996.72 MIN: 500.93 / MAX: 2975.2 MIN: 644.54 / MAX: 2962.51 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Detection FP32 - Device: CPU aa b c 4 8 12 16 20 SE +/- 0.02, N = 3 14.73 14.84 14.80 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Person Detection FP32 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Detection FP32 - Device: CPU aa b c 500 1000 1500 2000 2500 SE +/- 2.39, N = 3 2156.87 2140.20 2146.07 MIN: 504.09 / MAX: 2990 MIN: 527.18 / MAX: 2951.37 MIN: 439.17 / MAX: 2969.83 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16 - Device: CPU aa b c 50 100 150 200 250 SE +/- 0.10, N = 3 222.86 223.85 222.78 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16 - Device: CPU aa b c 30 60 90 120 150 SE +/- 0.06, N = 3 143.42 142.79 143.48 MIN: 62.82 / MAX: 295.2 MIN: 60 / MAX: 245.21 MIN: 44.55 / MAX: 252.93 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Face Detection FP16-INT8 - Device: CPU aa b c 0.6188 1.2376 1.8564 2.4752 3.094 SE +/- 0.00, N = 3 2.74 2.75 2.75 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Face Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection FP16-INT8 - Device: CPU aa b c 2K 4K 6K 8K 10K SE +/- 9.32, N = 3 11232.43 11206.13 11196.54 MIN: 6926.76 / MAX: 21113.44 MIN: 7011.32 / MAX: 20429.17 MIN: 7222.84 / MAX: 20603.63 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16 - Device: CPU aa b c 150 300 450 600 750 SE +/- 8.52, N = 3 676.59 664.78 670.19 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Face Detection Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16 - Device: CPU aa b c 11 22 33 44 55 SE +/- 0.60, N = 3 47.28 48.10 47.72 MIN: 10.17 / MAX: 121.04 MIN: 9.92 / MAX: 115.12 MIN: 9.97 / MAX: 99.86 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16 - Device: CPU aa b c 15 30 45 60 75 SE +/- 0.12, N = 3 65.60 65.49 65.60 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Road Segmentation ADAS FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16 - Device: CPU aa b c 110 220 330 440 550 SE +/- 0.88, N = 3 486.11 486.90 486.11 MIN: 118.22 / MAX: 849.31 MIN: 119.18 / MAX: 852.49 MIN: 171.7 / MAX: 813.73 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16-INT8 - Device: CPU aa b c 20 40 60 80 100 SE +/- 0.03, N = 3 89.35 89.19 89.30 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Vehicle Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16-INT8 - Device: CPU aa b c 80 160 240 320 400 SE +/- 0.11, N = 3 357.86 358.50 357.41 MIN: 301.59 / MAX: 522.85 MIN: 300.19 / MAX: 528.83 MIN: 204.13 / MAX: 519.56 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Weld Porosity Detection FP16 - Device: CPU aa b c 60 120 180 240 300 SE +/- 0.30, N = 3 293.47 297.48 294.58 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Weld Porosity Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Weld Porosity Detection FP16 - Device: CPU aa b c 20 40 60 80 100 SE +/- 0.11, N = 3 108.97 107.50 108.56 MIN: 17.48 / MAX: 1207.62 MIN: 57.15 / MAX: 1202.08 MIN: 17.21 / MAX: 1188.34 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Face Detection Retail FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16-INT8 - Device: CPU aa b c 70 140 210 280 350 SE +/- 0.61, N = 3 333.15 329.97 331.77 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Face Detection Retail FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Face Detection Retail FP16-INT8 - Device: CPU aa b c 20 40 60 80 100 SE +/- 0.18, N = 3 95.98 96.90 96.38 MIN: 71.43 / MAX: 140.32 MIN: 70.14 / MAX: 141.32 MIN: 69.36 / MAX: 140.93 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Road Segmentation ADAS FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU aa b c 8 16 24 32 40 SE +/- 0.02, N = 3 34.90 34.79 34.88 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Road Segmentation ADAS FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Road Segmentation ADAS FP16-INT8 - Device: CPU aa b c 200 400 600 800 1000 SE +/- 0.49, N = 3 913.41 915.17 913.21 MIN: 742.17 / MAX: 1356.42 MIN: 711.5 / MAX: 1350.07 MIN: 718.49 / MAX: 1350.67 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Machine Translation EN To DE FP16 - Device: CPU aa b c 9 18 27 36 45 SE +/- 0.05, N = 3 40.11 40.15 40.21 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Machine Translation EN To DE FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Machine Translation EN To DE FP16 - Device: CPU aa b c 200 400 600 800 1000 SE +/- 0.93, N = 3 794.06 793.31 792.00 MIN: 604.52 / MAX: 1620.5 MIN: 559.01 / MAX: 1581.54 MIN: 568.74 / MAX: 1657.2 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Weld Porosity Detection FP16-INT8 - Device: CPU aa b c 50 100 150 200 250 SE +/- 0.18, N = 3 217.95 221.47 219.27 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Weld Porosity Detection FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Weld Porosity Detection FP16-INT8 - Device: CPU aa b c 30 60 90 120 150 SE +/- 0.12, N = 3 146.71 144.38 145.82 MIN: 96.02 / MAX: 1572.43 MIN: 96.65 / MAX: 1566.66 MIN: 96.38 / MAX: 1563.28 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Person Vehicle Bike Detection FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Vehicle Bike Detection FP16 - Device: CPU aa b c 50 100 150 200 250 SE +/- 0.66, N = 3 204.69 205.33 207.24 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Person Vehicle Bike Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Vehicle Bike Detection FP16 - Device: CPU aa b c 30 60 90 120 150 SE +/- 0.49, N = 3 156.22 155.74 154.30 MIN: 44.3 / MAX: 240.55 MIN: 48.23 / MAX: 240.13 MIN: 44.57 / MAX: 239.56 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Noise Suppression Poconet-Like FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Noise Suppression Poconet-Like FP16 - Device: CPU aa b c 40 80 120 160 200 SE +/- 0.03, N = 3 164.82 164.75 164.79 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Noise Suppression Poconet-Like FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Noise Suppression Poconet-Like FP16 - Device: CPU aa b c 40 80 120 160 200 SE +/- 0.04, N = 3 193.84 193.93 193.87 MIN: 183.19 / MAX: 407.14 MIN: 182.93 / MAX: 402.18 MIN: 182.85 / MAX: 406.51 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16 - Device: CPU aa b c 40 80 120 160 200 SE +/- 0.06, N = 3 163.95 163.98 164.13 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Handwritten English Recognition FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16 - Device: CPU aa b c 40 80 120 160 200 SE +/- 0.08, N = 3 194.88 194.84 194.65 MIN: 185.7 / MAX: 356.13 MIN: 185.09 / MAX: 355.83 MIN: 185.45 / MAX: 358.03 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Person Re-Identification Retail FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Person Re-Identification Retail FP16 - Device: CPU aa b c 30 60 90 120 150 SE +/- 0.34, N = 3 142.60 142.58 142.54 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Person Re-Identification Retail FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Person Re-Identification Retail FP16 - Device: CPU aa b c 50 100 150 200 250 SE +/- 0.53, N = 3 224.22 224.22 224.31 MIN: 29.21 / MAX: 400.61 MIN: 36.4 / MAX: 368.76 MIN: 31.77 / MAX: 351.21 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU aa b c 300 600 900 1200 1500 SE +/- 3.07, N = 3 1402.51 1402.97 1403.65 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU aa b c 5 10 15 20 25 SE +/- 0.05, N = 3 22.80 22.79 22.78 MIN: 1.57 / MAX: 164.42 MIN: 1.59 / MAX: 165.35 MIN: 1.63 / MAX: 162.11 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16-INT8 - Device: CPU aa b c 30 60 90 120 150 SE +/- 0.83, N = 3 147.76 147.08 146.90 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Handwritten English Recognition FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Handwritten English Recognition FP16-INT8 - Device: CPU aa b c 50 100 150 200 250 SE +/- 1.21, N = 3 216.18 217.16 217.41 MIN: 206.9 / MAX: 376.9 MIN: 208.82 / MAX: 374.93 MIN: 210.44 / MAX: 372.96 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU OpenBenchmarking.org FPS, More Is Better OpenVINO 2024.0 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU aa b c 300 600 900 1200 1500 SE +/- 1.48, N = 3 1462.94 1473.23 1460.72 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU aa b c 5 10 15 20 25 SE +/- 0.02, N = 3 21.86 21.71 21.89 MIN: 2 / MAX: 157.1 MIN: 2.05 / MAX: 156.88 MIN: 2.07 / MAX: 156.71 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Parallel BZIP2 Compression FreeBSD-13.0-RELEASE-amd64-memstick.img Compression OpenBenchmarking.org Seconds, Fewer Is Better Parallel BZIP2 Compression 1.1.13 FreeBSD-13.0-RELEASE-amd64-memstick.img Compression aa b c 0.5489 1.0978 1.6467 2.1956 2.7445 SE +/- 0.001512, N = 3 2.413553 2.439338 2.438631 1. (CXX) g++ options: -O2 -pthread -lbz2 -lpthread
Primesieve Length: 1e12 OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.1 Length: 1e12 aa b c 0.655 1.31 1.965 2.62 3.275 SE +/- 0.003, N = 3 2.911 2.872 2.893 1. (CXX) g++ options: -O3
Primesieve Length: 1e13 OpenBenchmarking.org Seconds, Fewer Is Better Primesieve 12.1 Length: 1e13 aa b c 10 20 30 40 50 SE +/- 0.07, N = 3 42.31 42.44 42.29 1. (CXX) g++ options: -O3
srsRAN Project Test: PDSCH Processor Benchmark, Throughput Total OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240219 Test: PDSCH Processor Benchmark, Throughput Total a aa b 3K 6K 9K 12K 15K SE +/- 42.60, N = 3 14099.8 13936.1 13999.6 1. (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Total OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240219 Test: PUSCH Processor Benchmark, Throughput Total a 300 600 900 1200 1500 1602.1 MIN: 947.2 1. (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
srsRAN Project Test: PDSCH Processor Benchmark, Throughput Thread OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240219 Test: PDSCH Processor Benchmark, Throughput Thread a aa 40 80 120 160 200 SE +/- 0.03, N = 3 175.8 175.7 1. (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Thread OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.10.1-20240219 Test: PUSCH Processor Benchmark, Throughput Thread a 11 22 33 44 55 46.7 MIN: 28.9 1. (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
Stockfish Chess Benchmark OpenBenchmarking.org Nodes Per Second, More Is Better Stockfish 16.1 Chess Benchmark a aa b c 13M 26M 39M 52M 65M SE +/- 1497045.19, N = 12 59028775 59449725 51901853 53514996 1. (CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -flto -flto-partition=one -flto=jobserver
SVT-AV1 Encoder Mode: Preset 4 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 4 - Input: Bosphorus 4K a aa b c 0.5967 1.1934 1.7901 2.3868 2.9835 SE +/- 0.004, N = 3 2.652 2.644 2.650 2.650 1. (CXX) g++ options: -march=native
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 8 - Input: Bosphorus 4K a aa b c 6 12 18 24 30 SE +/- 0.01, N = 3 24.95 24.93 25.01 24.95 1. (CXX) g++ options: -march=native
SVT-AV1 Encoder Mode: Preset 12 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 12 - Input: Bosphorus 4K a aa b c 20 40 60 80 100 SE +/- 0.28, N = 3 74.68 74.47 75.17 75.02 1. (CXX) g++ options: -march=native
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 13 - Input: Bosphorus 4K a aa b c 20 40 60 80 100 SE +/- 0.19, N = 3 74.90 74.90 74.96 74.60 1. (CXX) g++ options: -march=native
SVT-AV1 Encoder Mode: Preset 4 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 4 - Input: Bosphorus 1080p a aa b c 2 4 6 8 10 SE +/- 0.010, N = 3 8.914 8.925 8.921 8.926 1. (CXX) g++ options: -march=native
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 8 - Input: Bosphorus 1080p a aa b c 13 26 39 52 65 SE +/- 0.06, N = 3 56.90 57.14 57.03 56.79 1. (CXX) g++ options: -march=native
SVT-AV1 Encoder Mode: Preset 12 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 12 - Input: Bosphorus 1080p a aa b c 60 120 180 240 300 SE +/- 0.05, N = 3 265.74 264.98 265.44 264.28 1. (CXX) g++ options: -march=native
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 13 - Input: Bosphorus 1080p a aa b c 80 160 240 320 400 SE +/- 0.57, N = 3 364.40 363.35 365.10 363.61 1. (CXX) g++ options: -march=native
Timed Linux Kernel Compilation Build: defconfig OpenBenchmarking.org Seconds, Fewer Is Better Timed Linux Kernel Compilation 6.8 Build: defconfig a aa b c 20 40 60 80 100 SE +/- 0.90, N = 3 94.27 92.76 94.43 94.50
Timed Linux Kernel Compilation Build: allmodconfig OpenBenchmarking.org Seconds, Fewer Is Better Timed Linux Kernel Compilation 6.8 Build: allmodconfig aa b c 80 160 240 320 400 SE +/- 0.68, N = 3 348.02 350.29 349.92
WavPack Audio Encoding WAV To WavPack OpenBenchmarking.org Seconds, Fewer Is Better WavPack Audio Encoding 5.7 WAV To WavPack aa b c 6 12 18 24 30 SE +/- 0.00, N = 5 25.20 25.21 25.20
Phoronix Test Suite v10.8.5