n1n1
ARMv8 Neoverse-N1 testing with a GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2403174-NE-N1N13670960&grs&sor
n1n1 system details (runs a, aa, b, c)
Processor: ARMv8 Neoverse-N1 @ 3.00GHz (128 Cores)
Motherboard: GIGABYTE G242-P36-00 MP32-AR2-00 v01000100 (F31k SCP: 2.10.20220531 BIOS)
Chipset: Ampere Computing LLC Altra PCI Root Complex A
Memory: 16 x 32 GB DDR4-3200MT/s Samsung M393A4K40DB3-CWE
Disk: 800GB Micron_7450_MTFDKBA800TFS
Graphics: ASPEED
Monitor: VGA HDMI
Network: 2 x Intel I350
OS: Ubuntu 23.10
Kernel: 6.5.0-15-generic (aarch64)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1024x768
Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Processor Details: Scaling Governor: cppc_cpufreq performance (Boost: Disabled)
Python Details: Python 3.11.6
Security Details: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Mitigation of CSV2 BHB + srbds: Not affected + tsx_async_abort: Not affected
n1n1 jpegxl-decode: All jpegxl: PNG - 80 jpegxl: JPEG - 90 jpegxl: PNG - 90 jpegxl: JPEG - 80 draco: Church Facade onednn: Deconvolution Batch shapes_1d - CPU deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream build-linux-kernel: defconfig openvino: Face Detection Retail FP16 - CPU jpegxl: JPEG - 100 openvino: Face Detection Retail FP16 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU primesieve: 1e12 jpegxl: PNG - 100 openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream srsran: PDSCH Processor Benchmark, Throughput Total deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream compress-pbzip2: FreeBSD-13.0-RELEASE-amd64-memstick.img Compression deepsparse: Llama2 Chat 7b Quantized - Asynchronous Multi-Stream onednn: IP Shapes 1D - CPU jpegxl-decode: 1 openvino: Face Detection Retail FP16-INT8 - CPU openvino: 
Face Detection Retail FP16-INT8 - CPU svt-av1: Preset 12 - Bosphorus 4K deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream openvino: Person Detection FP32 - CPU deepsparse: Llama2 Chat 7b Quantized - Asynchronous Multi-Stream onednn: Deconvolution Batch shapes_3d - CPU openvino: Person Detection FP32 - CPU deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream build-linux-kernel: allmodconfig svt-av1: Preset 8 - Bosphorus 1080p onednn: Recurrent Neural Network Inference - CPU openvino: Handwritten English Recognition FP16-INT8 - CPU deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream openvino: Handwritten English Recognition FP16-INT8 - CPU deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream svt-av1: Preset 12 - Bosphorus 1080p deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: Llama2 Chat 7b Quantized - Synchronous Single-Stream deepsparse: Llama2 Chat 7b Quantized - Synchronous Single-Stream openvino: Vehicle Detection FP16 - CPU svt-av1: Preset 13 - Bosphorus 1080p openvino: Vehicle Detection FP16 - CPU deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream svt-av1: Preset 13 - Bosphorus 4K draco: Lion deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous 
Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: ResNet-50, Baseline - Synchronous Single-Stream deepsparse: ResNet-50, Baseline - Synchronous Single-Stream openvino: Face Detection FP16-INT8 - CPU primesieve: 1e13 onednn: Convolution Batch Shapes Auto - CPU openvino: Person Detection FP16 - CPU onednn: IP Shapes 3D - CPU openvino: Face Detection FP16-INT8 - CPU svt-av1: Preset 8 - Bosphorus 4K openvino: Road Segmentation ADAS FP16-INT8 - CPU deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream openvino: Vehicle Detection FP16-INT8 - CPU svt-av1: Preset 4 - Bosphorus 4K deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream openvino: Person Detection FP16 - CPU deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream openvino: Machine Translation EN To DE FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream openvino: Road Segmentation ADAS FP16-INT8 - CPU deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream openvino: Vehicle Detection FP16-INT8 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Road Segmentation ADAS FP16 - CPU openvino: Face Detection FP16 - CPU svt-av1: Preset 4 - Bosphorus 1080p openvino: Handwritten English Recognition FP16 - CPU openvino: Handwritten English 
Recognition FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU srsran: PDSCH Processor Benchmark, Throughput Thread openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Noise Suppression Poconet-Like FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU openvino: Person Re-Identification Retail FP16 - CPU onednn: Recurrent Neural Network Training - CPU encode-wavpack: WAV To WavPack openvino: Face Detection FP16 - CPU srsran: PUSCH Processor Benchmark, Throughput Thread srsran: PUSCH Processor Benchmark, Throughput Total stockfish: Chess Benchmark a aa b c 558.569 43.097 37.591 39.249 39.268 94.273 31.665 29.603 14099.8 27.237 74.682 56.897 265.743 364.399 74.896 24.945 2.652 8.914 175.8 46.7 1602.1 59028775 523.019 40.279 37.415 37.895 38.921 10100 20.9255 2678.2382 23.5039 182.8789 345.1080 92.760 676.59 31.121 47.28 217.95 146.71 9.0819 109.9523 26.0871 38.3162 108.97 293.47 2.911 29.238 204.69 156.22 315.7450 3.1508 13936.1 7.5520 132.1849 132.9525 474.8976 2.413553 2.2602 4.84065 27.152 333.15 95.98 74.469 33.4187 1462.94 21.86 202.6359 2156.87 21332.8931 2.79626 14.73 310.5129 348.018 57.135 1460.94 147.76 1844.1246 216.18 26.2531 38.0726 264.978 132.5425 12.9298 77.3026 143.42 363.354 222.86 476.3557 74.900 7351 55.0261 1149.4724 30.5961 32.6597 33.5337 133.5301 7.4741 2.74 42.305 4.29470 2150.30 2.15582 11232.43 24.927 34.90 438.7131 357.86 2.644 1337.5918 112.5334 8.8709 1840.3677 14.77 50.5976 19.7459 794.06 40.11 143.8317 46.6120 913.41 7.4691 133.6218 89.35 65.60 486.11 10877.53 8.925 194.88 163.95 22.80 1402.51 175.7 193.84 164.82 142.60 224.22 3738.39 25.199 2.84 59449725 564.893 41.309 37.79 39.669 37.766 9847 20.4308 2688.9567 23.4116 181.8956 346.6699 94.426 664.78 31.621 48.1 221.47 144.38 9.1198 109.4784 25.9453 38.5246 107.5 297.48 2.872 29.494 205.33 155.74 312.4633 3.1835 13999.6 7.5061 132.9867 132.7356 475.8212 2.439338 2.2754 4.88015 27.417 329.97 
96.9 75.167 33.7125 1473.23 21.71 202.1471 2140.2 21231.5641 2.78238 14.84 311.1676 350.294 57.027 1461 147.08 1834.8257 217.16 26.3468 37.9374 265.435 131.9529 12.8977 77.4941 142.79 365.102 223.85 478.6418 74.958 7320 55.2555 1144.8012 30.7211 32.5272 33.5843 134.0016 7.4484 2.75 42.441 4.28036 2151.85 2.15178 11206.13 25.006 34.79 439.6023 358.5 2.65 1335.3456 112.8291 8.8483 1835.2572 14.77 50.7328 19.6933 793.31 40.15 143.4837 46.715 915.17 7.4692 133.6253 89.19 65.49 486.9 10891.93 8.921 194.84 163.98 22.79 1402.97 193.93 164.75 142.58 224.22 3737.15 25.205 2.84 51901853 542.103 41.354 35.843 39.251 39.315 9848 20.8925 2630.334 23.9126 185.7196 339.9 94.496 670.19 31.624 47.72 219.27 145.82 8.982 111.1578 26.3315 37.9597 108.56 294.58 2.893 29.544 207.24 154.3 316.347 3.1449 7.5924 131.4797 131.5237 479.9901 2.438631 2.2836 4.88858 27.396 331.77 96.38 75.015 33.6797 1460.72 21.89 201.0244 2146.07 21169.2953 2.80386 14.8 312.7909 349.915 56.789 1469.65 146.9 1833.4487 217.41 26.2002 38.1495 264.28 131.8594 12.9605 77.119 143.48 363.612 222.78 478.3732 74.604 7332 55.2435 1144.7727 30.6675 32.5835 33.6663 133.4866 7.4767 2.75 42.294 4.28461 2157.45 2.14878 11196.54 24.952 34.88 438.2501 357.41 2.65 1333.7177 112.8521 8.8461 1836.793 14.73 50.649 19.7262 792 40.21 143.6025 46.6799 913.21 7.454 133.8853 89.3 65.6 486.11 10876.7 8.926 194.65 164.13 22.78 1403.65 193.87 164.79 142.54 224.31 3738.53 25.2 2.84 53514996 OpenBenchmarking.org
JPEG-XL Decoding libjxl 0.10.1 | CPU Threads: All | MP/s, More Is Better | b: 564.89, a: 558.57, c: 542.10, aa: 523.02 | SE +/- 1.96, N = 3
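All four runs (a, aa, b, c) share the single system configuration listed above, so the gap between the best and worst run gives a rough feel for run-to-run variation. The following is a minimal sketch of that arithmetic using the JPEG-XL decoding figures; the function name `relative_spread` is a hypothetical helper, not part of the Phoronix Test Suite.

```python
def relative_spread(results):
    """Return (best_label, worst_label, percent gap) for a more-is-better metric."""
    best = max(results, key=results.get)
    worst = min(results, key=results.get)
    gap = (results[best] - results[worst]) / results[worst] * 100.0
    return best, worst, gap


# MP/s values copied from the JPEG-XL Decoding, CPU Threads: All result.
jxl_decode_all = {"b": 564.89, "a": 558.57, "c": 542.10, "aa": 523.02}

best, worst, gap = relative_spread(jxl_decode_all)
print(f"{best} vs {worst}: {gap:.2f}% higher")
```

For a fewer-is-better metric (ms, seconds) the roles of `max` and `min` would need to be swapped.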
JPEG-XL libjxl 0.10.1 | Input: PNG - Quality: 80 | MP/s, More Is Better | a: 43.10, c: 41.35, b: 41.31, aa: 40.28 | SE +/- 0.30, N = 3 | (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
JPEG-XL libjxl 0.10.1 | Input: JPEG - Quality: 90 | MP/s, More Is Better | b: 37.79, a: 37.59, aa: 37.42, c: 35.84 | SE +/- 0.45, N = 15 | (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
JPEG-XL libjxl 0.10.1 | Input: PNG - Quality: 90 | MP/s, More Is Better | b: 39.67, c: 39.25, a: 39.25, aa: 37.90 | SE +/- 0.55, N = 15 | (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
JPEG-XL libjxl 0.10.1 | Input: JPEG - Quality: 80 | MP/s, More Is Better | c: 39.32, a: 39.27, aa: 38.92, b: 37.77 | SE +/- 0.12, N = 3 | (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
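Each result carries an "SE +/- x, N = n" annotation, which reads as the standard error of the mean over n runs. The sketch below shows the conventional calculation (sample standard deviation divided by the square root of N). Whether the Phoronix Test Suite uses exactly this formula is an assumption, and the per-run samples are hypothetical because the export lists only the averaged value and the SE.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(len(samples))


# Hypothetical per-run MP/s samples; these are NOT taken from the export.
hypothetical_runs = [39.1, 39.3, 39.5]

se = standard_error(hypothetical_runs)
mean = statistics.mean(hypothetical_runs)
print(f"mean = {mean:.2f}, SE +/- {se:.2f}, N = {len(hypothetical_runs)}")
```

A larger N (15 for two of the JPEG-XL results) usually means the suite ran extra trials because the early runs varied more than its threshold allowed.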
Google Draco 1.5.6 | Model: Church Facade | ms, Fewer Is Better | b: 9847, c: 9848, aa: 10100 | SE +/- 6.24, N = 3 | (CXX) g++ options: -O3
oneDNN 3.4 | Harness: Deconvolution Batch shapes_1d - Engine: CPU | ms, Fewer Is Better | b: 20.43, c: 20.89, aa: 20.93 | SE +/- 0.20, N = 3 | MIN: 19.32; MIN: 19.81; MIN: 19.34 | (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
Neural Magic DeepSparse 1.7 | Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream | items/sec, More Is Better | b: 2688.96, aa: 2678.24, c: 2630.33 | SE +/- 6.53, N = 3
Neural Magic DeepSparse 1.7 | Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream | ms/batch, Fewer Is Better | b: 23.41, aa: 23.50, c: 23.91 | SE +/- 0.06, N = 3
Neural Magic DeepSparse 1.7 | Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream | ms/batch, Fewer Is Better | b: 181.90, aa: 182.88, c: 185.72 | SE +/- 0.08, N = 3
Neural Magic DeepSparse 1.7 | Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream | items/sec, More Is Better | b: 346.67, aa: 345.11, c: 339.90 | SE +/- 0.25, N = 3
Timed Linux Kernel Compilation 6.8 | Build: defconfig | Seconds, Fewer Is Better | aa: 92.76, a: 94.27, b: 94.43, c: 94.50 | SE +/- 0.90, N = 3
OpenVINO 2024.0 | Model: Face Detection Retail FP16 - Device: CPU | FPS, More Is Better | aa: 676.59, c: 670.19, b: 664.78 | SE +/- 8.52, N = 3 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
JPEG-XL libjxl 0.10.1 | Input: JPEG - Quality: 100 | MP/s, More Is Better | a: 31.67, c: 31.62, b: 31.62, aa: 31.12 | SE +/- 0.00, N = 3 | (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
OpenVINO 2024.0 | Model: Face Detection Retail FP16 - Device: CPU | ms, Fewer Is Better | aa: 47.28, c: 47.72, b: 48.10 | SE +/- 0.60, N = 3 | MIN: 10.17 / MAX: 121.04; MIN: 9.97 / MAX: 99.86; MIN: 9.92 / MAX: 115.12 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 | Model: Weld Porosity Detection FP16-INT8 - Device: CPU | FPS, More Is Better | b: 221.47, c: 219.27, aa: 217.95 | SE +/- 0.18, N = 3 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 | Model: Weld Porosity Detection FP16-INT8 - Device: CPU | ms, Fewer Is Better | b: 144.38, c: 145.82, aa: 146.71 | SE +/- 0.12, N = 3 | MIN: 96.65 / MAX: 1566.66; MIN: 96.38 / MAX: 1563.28; MIN: 96.02 / MAX: 1572.43 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 | Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream | ms/batch, Fewer Is Better | c: 8.9820, aa: 9.0819, b: 9.1198 | SE +/- 0.0713, N = 3
Neural Magic DeepSparse 1.7 | Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream | items/sec, More Is Better | c: 111.16, aa: 109.95, b: 109.48 | SE +/- 0.86, N = 3
Neural Magic DeepSparse 1.7 | Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream | items/sec, More Is Better | c: 26.33, aa: 26.09, b: 25.95 | SE +/- 0.11, N = 3
Neural Magic DeepSparse 1.7 | Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream | ms/batch, Fewer Is Better | c: 37.96, aa: 38.32, b: 38.52 | SE +/- 0.16, N = 3
OpenVINO 2024.0 | Model: Weld Porosity Detection FP16 - Device: CPU | ms, Fewer Is Better | b: 107.50, c: 108.56, aa: 108.97 | SE +/- 0.11, N = 3 | MIN: 57.15 / MAX: 1202.08; MIN: 17.21 / MAX: 1188.34; MIN: 17.48 / MAX: 1207.62 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 | Model: Weld Porosity Detection FP16 - Device: CPU | FPS, More Is Better | b: 297.48, c: 294.58, aa: 293.47 | SE +/- 0.30, N = 3 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Primesieve 12.1 | Length: 1e12 | Seconds, Fewer Is Better | b: 2.872, c: 2.893, aa: 2.911 | SE +/- 0.003, N = 3 | (CXX) g++ options: -O3
JPEG-XL libjxl 0.10.1 | Input: PNG - Quality: 100 | MP/s, More Is Better | a: 29.60, c: 29.54, b: 29.49, aa: 29.24 | SE +/- 0.04, N = 3 | (CXX) g++ options: -fno-rtti -O3 -fPIE -pie -lm
OpenVINO 2024.0 | Model: Person Vehicle Bike Detection FP16 - Device: CPU | FPS, More Is Better | c: 207.24, b: 205.33, aa: 204.69 | SE +/- 0.66, N = 3 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 | Model: Person Vehicle Bike Detection FP16 - Device: CPU | ms, Fewer Is Better | c: 154.30, b: 155.74, aa: 156.22 | SE +/- 0.49, N = 3 | MIN: 44.57 / MAX: 239.56; MIN: 48.23 / MAX: 240.13; MIN: 44.3 / MAX: 240.55 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 | Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream | items/sec, More Is Better | c: 316.35, aa: 315.75, b: 312.46 | SE +/- 0.75, N = 3
Neural Magic DeepSparse 1.7 | Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream | ms/batch, Fewer Is Better | c: 3.1449, aa: 3.1508, b: 3.1835 | SE +/- 0.0074, N = 3
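The DeepSparse results come in pairs, one in items/sec and one in ms/batch. For the synchronous single-stream scenario the two figures are close to reciprocal. The sketch below checks this for the ResNet-50 Sparse INT8 pair from run c; the batch size of 1 is an assumption that the export does not state, and `throughput_from_latency` is a hypothetical helper.

```python
def throughput_from_latency(ms_per_batch, batch_size=1):
    """Items per second implied by a per-batch latency in milliseconds."""
    return batch_size * 1000.0 / ms_per_batch


# Run c, ResNet-50 Sparse INT8, Synchronous Single-Stream (values from the export).
latency_ms = 3.1449
reported_items_per_sec = 316.35

implied = throughput_from_latency(latency_ms)  # batch_size=1 is assumed
deviation_pct = (implied - reported_items_per_sec) / reported_items_per_sec * 100.0
print(f"implied {implied:.2f} items/sec vs reported {reported_items_per_sec} ({deviation_pct:+.2f}%)")
```

The implied and reported values agree to within about one percent here. The asynchronous multi-stream results should not be expected to follow the same rule, since several streams run concurrently.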
srsRAN Project 23.10.1-20240219 | Test: PDSCH Processor Benchmark, Throughput Total | Mbps, More Is Better | a: 14099.8, b: 13999.6, aa: 13936.1 | SE +/- 42.60, N = 3 | (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
Neural Magic DeepSparse 1.7 | Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream | ms/batch, Fewer Is Better | b: 7.5061, aa: 7.5520, c: 7.5924 | SE +/- 0.0154, N = 3
Neural Magic DeepSparse 1.7 | Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream | items/sec, More Is Better | b: 132.99, aa: 132.18, c: 131.48 | SE +/- 0.27, N = 3
Neural Magic DeepSparse 1.7 | Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream | ms/batch, Fewer Is Better | c: 131.52, b: 132.74, aa: 132.95 | SE +/- 0.39, N = 3
Neural Magic DeepSparse 1.7 | Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream | items/sec, More Is Better | c: 479.99, b: 475.82, aa: 474.90 | SE +/- 1.33, N = 3
Parallel BZIP2 Compression 1.1.13 | FreeBSD-13.0-RELEASE-amd64-memstick.img Compression | Seconds, Fewer Is Better | aa: 2.413553, c: 2.438631, b: 2.439338 | SE +/- 0.001512, N = 3 | (CXX) g++ options: -O2 -pthread -lbz2 -lpthread
Neural Magic DeepSparse 1.7 | Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream | items/sec, More Is Better | c: 2.2836, b: 2.2754, aa: 2.2602 | SE +/- 0.0074, N = 3
oneDNN 3.4 | Harness: IP Shapes 1D - Engine: CPU | ms, Fewer Is Better | aa: 4.84065, b: 4.88015, c: 4.88858 | SE +/- 0.01022, N = 3 | MIN: 4.25; MIN: 4.23; MIN: 4.3 | (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
JPEG-XL Decoding libjxl 0.10.1 | CPU Threads: 1 | MP/s, More Is Better | b: 27.42, c: 27.40, a: 27.24, aa: 27.15 | SE +/- 0.01, N = 3
OpenVINO 2024.0 | Model: Face Detection Retail FP16-INT8 - Device: CPU | FPS, More Is Better | aa: 333.15, c: 331.77, b: 329.97 | SE +/- 0.61, N = 3 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 | Model: Face Detection Retail FP16-INT8 - Device: CPU | ms, Fewer Is Better | aa: 95.98, c: 96.38, b: 96.90 | SE +/- 0.18, N = 3 | MIN: 71.43 / MAX: 140.32; MIN: 69.36 / MAX: 140.93; MIN: 70.14 / MAX: 141.32 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
SVT-AV1 2.0 | Encoder Mode: Preset 12 - Input: Bosphorus 4K | Frames Per Second, More Is Better | b: 75.17, c: 75.02, a: 74.68, aa: 74.47 | SE +/- 0.28, N = 3 | (CXX) g++ options: -march=native
Neural Magic DeepSparse 1.7 | Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream | items/sec, More Is Better | b: 33.71, c: 33.68, aa: 33.42 | SE +/- 0.02, N = 3
OpenVINO 2024.0 | Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU | FPS, More Is Better | b: 1473.23, aa: 1462.94, c: 1460.72 | SE +/- 1.48, N = 3 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 | Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU | ms, Fewer Is Better | b: 21.71, aa: 21.86, c: 21.89 | SE +/- 0.02, N = 3 | MIN: 2.05 / MAX: 156.88; MIN: 2 / MAX: 157.1; MIN: 2.07 / MAX: 156.71 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 | Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream | items/sec, More Is Better | aa: 202.64, b: 202.15, c: 201.02 | SE +/- 0.34, N = 3
OpenVINO 2024.0 | Model: Person Detection FP32 - Device: CPU | ms, Fewer Is Better | b: 2140.20, c: 2146.07, aa: 2156.87 | SE +/- 2.39, N = 3 | MIN: 527.18 / MAX: 2951.37; MIN: 439.17 / MAX: 2969.83; MIN: 504.09 / MAX: 2990 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 | Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream | ms/batch, Fewer Is Better | c: 21169.30, b: 21231.56, aa: 21332.89 | SE +/- 55.68, N = 3
oneDNN 3.4 | Harness: Deconvolution Batch shapes_3d - Engine: CPU | ms, Fewer Is Better | b: 2.78238, aa: 2.79626, c: 2.80386 | SE +/- 0.01912, N = 12 | MIN: 2.72; MIN: 2.68; MIN: 2.7 | (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
OpenVINO 2024.0 | Model: Person Detection FP32 - Device: CPU | FPS, More Is Better | b: 14.84, c: 14.80, aa: 14.73 | SE +/- 0.02, N = 3 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 | Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream | ms/batch, Fewer Is Better | aa: 310.51, b: 311.17, c: 312.79 | SE +/- 0.51, N = 3
Timed Linux Kernel Compilation 6.8 | Build: allmodconfig | Seconds, Fewer Is Better | aa: 348.02, c: 349.92, b: 350.29 | SE +/- 0.68, N = 3
SVT-AV1 2.0 | Encoder Mode: Preset 8 - Input: Bosphorus 1080p | Frames Per Second, More Is Better | aa: 57.14, b: 57.03, a: 56.90, c: 56.79 | SE +/- 0.06, N = 3 | (CXX) g++ options: -march=native
oneDNN 3.4 | Harness: Recurrent Neural Network Inference - Engine: CPU | ms, Fewer Is Better | aa: 1460.94, b: 1461.00, c: 1469.65 | SE +/- 3.72, N = 3 | MIN: 1436.36; MIN: 1442.49; MIN: 1448.43 | (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
OpenVINO 2024.0 | Model: Handwritten English Recognition FP16-INT8 - Device: CPU | FPS, More Is Better | aa: 147.76, b: 147.08, c: 146.90 | SE +/- 0.83, N = 3 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 | Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream | ms/batch, Fewer Is Better | c: 1833.45, b: 1834.83, aa: 1844.12 | SE +/- 1.78, N = 3
OpenVINO 2024.0 | Model: Handwritten English Recognition FP16-INT8 - Device: CPU | ms, Fewer Is Better | aa: 216.18, b: 217.16, c: 217.41 | SE +/- 1.21, N = 3 | MIN: 206.9 / MAX: 376.9; MIN: 208.82 / MAX: 374.93; MIN: 210.44 / MAX: 372.96 | (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 | Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream | items/sec, More Is Better | b: 26.35, aa: 26.25, c: 26.20 | SE +/- 0.02, N = 3
Neural Magic DeepSparse 1.7 | Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream | ms/batch, Fewer Is Better | b: 37.94, aa: 38.07, c: 38.15 | SE +/- 0.04, N = 3
SVT-AV1 Encoder Mode: Preset 12 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 2.0 Encoder Mode: Preset 12 - Input: Bosphorus 1080p a b aa c 60 120 180 240 300 SE +/- 0.05, N = 3 265.74 265.44 264.98 264.28 1. (CXX) g++ options: -march=native
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream c b aa 30 60 90 120 150 SE +/- 0.34, N = 3 131.86 131.95 132.54
Neural Magic DeepSparse Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.7 Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream c aa b 3 6 9 12 15 SE +/- 0.02, N = 3 12.96 12.93 12.90
Neural Magic DeepSparse Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.7 Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream c aa b 20 40 60 80 100 SE +/- 0.12, N = 3 77.12 77.30 77.49
OpenVINO Model: Vehicle Detection FP16 - Device: CPU OpenBenchmarking.org ms, Fewer Is Better OpenVINO 2024.0 Model: Vehicle Detection FP16 - Device: CPU b aa c 30 60 90 120 150 SE +/- 0.06, N = 3 142.79 143.42 143.48 MIN: 60 / MAX: 245.21 MIN: 62.82 / MAX: 295.2 MIN: 44.55 / MAX: 252.93 1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
SVT-AV1 2.0 - Encoder Mode: Preset 13 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): b 365.10, a 364.40, c 363.61, aa 363.35. SE +/- 0.57, N = 3. (CXX) g++ options: -march=native
OpenVINO 2024.0 - Model: Vehicle Detection FP16 - Device: CPU (FPS, More Is Better): b 223.85, aa 222.86, c 222.78. SE +/- 0.10, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b 478.64, c 478.37, aa 476.36. SE +/- 1.18, N = 3.
SVT-AV1 2.0 - Encoder Mode: Preset 13 - Input: Bosphorus 4K (Frames Per Second, More Is Better): b 74.96, aa 74.90, a 74.90, c 74.60. SE +/- 0.19, N = 3. (CXX) g++ options: -march=native
Google Draco 1.5.6 - Model: Lion (ms, Fewer Is Better): b 7320, c 7332, aa 7351. SE +/- 1.86, N = 3. (CXX) g++ options: -O3
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): aa 55.03, c 55.24, b 55.26. SE +/- 0.11, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): aa 1149.47, b 1144.80, c 1144.77. SE +/- 2.84, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (items/sec, More Is Better): b 30.72, c 30.67, aa 30.60. SE +/- 0.01, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): b 32.53, c 32.58, aa 32.66. SE +/- 0.01, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): c 33.67, b 33.58, aa 33.53. SE +/- 0.04, N = 3.
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream (items/sec, More Is Better): b 134.00, aa 133.53, c 133.49. SE +/- 0.11, N = 3.
Neural Magic DeepSparse 1.7 - Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): b 7.4484, aa 7.4741, c 7.4767. SE +/- 0.0064, N = 3.
OpenVINO 2024.0 - Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better): c 2.75, b 2.75, aa 2.74. SE +/- 0.00, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Primesieve 12.1 - Length: 1e13 (Seconds, Fewer Is Better): c 42.29, aa 42.31, b 42.44. SE +/- 0.07, N = 3. (CXX) g++ options: -O3
oneDNN 3.4 - Harness: Convolution Batch Shapes Auto - Engine: CPU (ms, Fewer Is Better): b 4.28036 (MIN: 4.17), c 4.28461 (MIN: 4.14), aa 4.29470 (MIN: 4.16). SE +/- 0.01638, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
OpenVINO 2024.0 - Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better): aa 2150.30 (MIN: 491.1 / MAX: 2996.72), b 2151.85 (MIN: 500.93 / MAX: 2975.2), c 2157.45 (MIN: 644.54 / MAX: 2962.51). SE +/- 1.19, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
oneDNN 3.4 - Harness: IP Shapes 3D - Engine: CPU (ms, Fewer Is Better): c 2.14878 (MIN: 2.06), b 2.15178 (MIN: 2.06), aa 2.15582 (MIN: 2.06). SE +/- 0.00137, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
OpenVINO 2024.0 - Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): c 11196.54 (MIN: 7222.84 / MAX: 20603.63), b 11206.13 (MIN: 7011.32 / MAX: 20429.17), aa 11232.43 (MIN: 6926.76 / MAX: 21113.44). SE +/- 9.32, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
SVT-AV1 2.0 - Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better): b 25.01, c 24.95, a 24.95, aa 24.93. SE +/- 0.01, N = 3. (CXX) g++ options: -march=native
OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (FPS, More Is Better): aa 34.90, c 34.88, b 34.79. SE +/- 0.02, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b 439.60, aa 438.71, c 438.25. SE +/- 0.42, N = 3.
OpenVINO 2024.0 - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): c 357.41 (MIN: 204.13 / MAX: 519.56), aa 357.86 (MIN: 301.59 / MAX: 522.85), b 358.50 (MIN: 300.19 / MAX: 528.83). SE +/- 0.11, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
SVT-AV1 2.0 - Encoder Mode: Preset 4 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a 2.652, c 2.650, b 2.650, aa 2.644. SE +/- 0.004, N = 3. (CXX) g++ options: -march=native
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): c 1333.72, b 1335.35, aa 1337.59. SE +/- 3.19, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better): c 112.85, b 112.83, aa 112.53. SE +/- 0.16, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): c 8.8461, b 8.8483, aa 8.8709. SE +/- 0.0129, N = 3.
Neural Magic DeepSparse 1.7 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b 1835.26, c 1836.79, aa 1840.37. SE +/- 1.00, N = 3.
OpenVINO 2024.0 - Model: Person Detection FP16 - Device: CPU (FPS, More Is Better): b 14.77, aa 14.77, c 14.73. SE +/- 0.01, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better): b 50.73, c 50.65, aa 50.60. SE +/- 0.08, N = 3.
Neural Magic DeepSparse 1.7 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): b 19.69, c 19.73, aa 19.75. SE +/- 0.03, N = 3.
OpenVINO 2024.0 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better): c 792.00 (MIN: 568.74 / MAX: 1657.2), b 793.31 (MIN: 559.01 / MAX: 1581.54), aa 794.06 (MIN: 604.52 / MAX: 1620.5). SE +/- 0.93, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better): c 40.21, b 40.15, aa 40.11. SE +/- 0.05, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): b 143.48, c 143.60, aa 143.83. SE +/- 0.07, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): b 46.72, c 46.68, aa 46.61. SE +/- 0.11, N = 3.
OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16-INT8 - Device: CPU (ms, Fewer Is Better): c 913.21 (MIN: 718.49 / MAX: 1350.67), aa 913.41 (MIN: 742.17 / MAX: 1356.42), b 915.17 (MIN: 711.5 / MAX: 1350.07). SE +/- 0.49, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
Neural Magic DeepSparse 1.7 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): c 7.4540, aa 7.4691, b 7.4692. SE +/- 0.0095, N = 3.
Neural Magic DeepSparse 1.7 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (items/sec, More Is Better): c 133.89, b 133.63, aa 133.62. SE +/- 0.17, N = 3.
OpenVINO 2024.0 - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better): aa 89.35, c 89.30, b 89.19. SE +/- 0.03, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16 - Device: CPU (FPS, More Is Better): c 65.60, aa 65.60, b 65.49. SE +/- 0.12, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Road Segmentation ADAS FP16 - Device: CPU (ms, Fewer Is Better): aa 486.11 (MIN: 118.22 / MAX: 849.31), c 486.11 (MIN: 171.7 / MAX: 813.73), b 486.90 (MIN: 119.18 / MAX: 852.49). SE +/- 0.88, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Face Detection FP16 - Device: CPU (ms, Fewer Is Better): c 10876.70 (MIN: 3255.92 / MAX: 18738.42), aa 10877.53 (MIN: 4104.89 / MAX: 18949.05), b 10891.93 (MIN: 3821.31 / MAX: 19031.99). SE +/- 17.40, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
SVT-AV1 2.0 - Encoder Mode: Preset 4 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): c 8.926, aa 8.925, b 8.921, a 8.914. SE +/- 0.010, N = 3. (CXX) g++ options: -march=native
OpenVINO 2024.0 - Model: Handwritten English Recognition FP16 - Device: CPU (ms, Fewer Is Better): c 194.65 (MIN: 185.45 / MAX: 358.03), b 194.84 (MIN: 185.09 / MAX: 355.83), aa 194.88 (MIN: 185.7 / MAX: 356.13). SE +/- 0.08, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Handwritten English Recognition FP16 - Device: CPU (FPS, More Is Better): c 164.13, b 163.98, aa 163.95. SE +/- 0.06, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better): c 22.78 (MIN: 1.63 / MAX: 162.11), b 22.79 (MIN: 1.59 / MAX: 165.35), aa 22.80 (MIN: 1.57 / MAX: 164.42). SE +/- 0.05, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better): c 1403.65, b 1402.97, aa 1402.51. SE +/- 3.07, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
srsRAN Project 23.10.1-20240219 - Test: PDSCH Processor Benchmark, Throughput Thread (Mbps, More Is Better): a 175.8, aa 175.7. SE +/- 0.03, N = 3. (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
OpenVINO 2024.0 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (ms, Fewer Is Better): aa 193.84 (MIN: 183.19 / MAX: 407.14), c 193.87 (MIN: 182.85 / MAX: 406.51), b 193.93 (MIN: 182.93 / MAX: 402.18). SE +/- 0.04, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Noise Suppression Poconet-Like FP16 - Device: CPU (FPS, More Is Better): aa 164.82, c 164.79, b 164.75. SE +/- 0.03, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Person Re-Identification Retail FP16 - Device: CPU (FPS, More Is Better): aa 142.60, b 142.58, c 142.54. SE +/- 0.34, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
OpenVINO 2024.0 - Model: Person Re-Identification Retail FP16 - Device: CPU (ms, Fewer Is Better): aa 224.22 (MIN: 29.21 / MAX: 400.61), b 224.22 (MIN: 36.4 / MAX: 368.76), c 224.31 (MIN: 31.77 / MAX: 351.21). SE +/- 0.53, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
oneDNN 3.4 - Harness: Recurrent Neural Network Training - Engine: CPU (ms, Fewer Is Better): b 3737.15 (MIN: 3730.87), aa 3738.39 (MIN: 3728.79), c 3738.53 (MIN: 3730.99). SE +/- 2.30, N = 3. (CXX) g++ options: -O3 -march=native -fopenmp -mcpu=generic -fPIC -pie -ldl -lpthread
WavPack Audio Encoding 5.7 - WAV To WavPack (Seconds, Fewer Is Better): aa 25.20, c 25.20, b 25.21. SE +/- 0.00, N = 5.
OpenVINO 2024.0 - Model: Face Detection FP16 - Device: CPU (FPS, More Is Better): c 2.84, b 2.84, aa 2.84. SE +/- 0.01, N = 3. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
srsRAN Project 23.10.1-20240219 - Test: PUSCH Processor Benchmark, Throughput Thread (Mbps, More Is Better): a 46.7 (MIN: 28.9). (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
srsRAN Project 23.10.1-20240219 - Test: PUSCH Processor Benchmark, Throughput Total (Mbps, More Is Better): a 1602.1 (MIN: 947.2). (CXX) g++ options: -O3 -fno-trapping-math -fno-math-errno -ldl
Stockfish 16.1 - Chess Benchmark (Nodes Per Second, More Is Better): aa 59449725, a 59028775, c 53514996, b 51901853. SE +/- 1497045.19, N = 12. (CXX) g++ options: -lgcov -lpthread -fno-exceptions -std=c++17 -fno-peel-loops -fno-tracer -pedantic -O3 -funroll-loops -flto -flto-partition=one -flto=jobserver
Phoronix Test Suite v10.8.5