a Benchmarks for a future article.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2312143-NE-A8154652071
a
Kernel Notes: Transparent Huge Pages: madvise
Processor Notes: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x21000161
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
b c d
Processor: 2 x INTEL XEON PLATINUM 8592+ @ 3.90GHz (128 Cores / 256 Threads), Motherboard: Quanta Cloud S6Q-MB-MPS (3B05.TEL4P1 BIOS), Chipset: Intel Device 1bce, Memory: 1008GB, Disk: 3201GB Micron_7450_MTFDKCB3T2TFS, Graphics: ASPEED, Network: 2 x Intel X710 for 10GBASE-T
OS: Ubuntu 23.10, Kernel: 6.5.0-13-generic (x86_64), Compiler: GCC 13.2.0, File-System: ext4, Screen Resolution: 1920x1080
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x21000161
Python Notes: Python 3.11.6
Security Notes: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
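The Security Notes field above packs one status per CPU vulnerability into a single string, with entries joined by " + " and each entry written as "name: status". A minimal sketch of splitting it into a lookup table follows; the function name `parse_security_notes` is hypothetical and not part of the Phoronix Test Suite, and the separator convention is inferred from the string as it appears in this result file.

```python
# Hypothetical helper: split the "Security Notes" string from this result
# file into {vulnerability: status}. The " + " and ": " separators are
# inferred from the text above, not from any documented format.
def parse_security_notes(notes: str) -> dict[str, str]:
    status = {}
    for entry in notes.split(" + "):
        # partition() splits on the FIRST ": " only, so statuses that
        # themselves contain ": " (e.g. spectre_v2) stay intact.
        name, _, state = entry.partition(": ")
        status[name.strip()] = state.strip()
    return status


SECURITY_NOTES = (
    "gather_data_sampling: Not affected + itlb_multihit: Not affected + "
    "l1tf: Not affected + mds: Not affected + meltdown: Not affected + "
    "mmio_stale_data: Not affected + retbleed: Not affected + "
    "spec_rstack_overflow: Not affected + "
    "spec_store_bypass: Mitigation of SSB disabled via prctl + "
    "spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + "
    "spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling "
    "PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected"
)

status = parse_security_notes(SECURITY_NOTES)
# Entries whose status starts with "Mitigation" are the ones with an active mitigation.
mitigated = sorted(name for name, state in status.items() if state.startswith("Mitigation"))
print(len(status), mitigated)
```

Splitting on the first ": " only is a deliberate choice here, since the spectre_v2 status contains further colons.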
Result Overview (chart): relative performance of runs a, b, c, and d for NWChem, Neural Magic DeepSparse, and WRF; axis from 100% to 102%.
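An overview of this kind can be approximated by normalizing each test against a baseline run and combining the ratios with a geometric mean. The sketch below assumes that approach; the source does not state how the overview chart is computed, the helper names are hypothetical, and only three hand-picked tests from this result file are used, so the figure it produces is not expected to reproduce the chart.

```python
import math


def relative(value: float, baseline: float, more_is_better: bool) -> float:
    """Ratio > 1 means `value` is better than `baseline`."""
    return value / baseline if more_is_better else baseline / value


def geometric_mean(ratios: list[float]) -> float:
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# (run a, run d, more_is_better), values taken from this result file.
samples = [
    (1744.0, 1748.0, False),      # NWChem C240 Buckyball, seconds
    (5566.729, 5617.197, False),  # WRF conus 2.5km, seconds
    (31.1945, 32.8936, True),     # DeepSparse oBERT IMDB, single-stream, items/sec
]

# Run d relative to run a, one ratio per test, then one combined figure.
ratios = [relative(d, a, better) for a, d, better in samples]
overall = geometric_mean(ratios)
print(f"run d vs run a over {len(ratios)} tests: {overall:.4f}")
```

The geometric mean is used rather than the arithmetic mean so that tests with very different units and magnitudes contribute equally.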
Result summary. Each row lists the results for runs a / b / c / d; units are taken from the individual result listings.
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream [items/sec]: 31.1945 / 31.2336 / 31.6300 / 32.8936
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream [ms/batch]: 32.0613 / 32.0368 / 31.6501 / 30.4222
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream [items/sec]: 31.6954 / 31.5332 / 31.7202 / 33.0540
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream [ms/batch]: 31.5821 / 31.7303 / 31.5618 / 30.2891
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream [items/sec]: 183.4966 / 180.3157 / 177.1512 / 181.2029
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream [ms/batch]: 347.6637 / 353.6889 / 359.8919 / 352.2235
deepsparse: BERT-Large, NLP Question Answering - Synchronous Single-Stream [items/sec]: 29.9421 / 30.3891 / 30.2944 / 29.5231
deepsparse: BERT-Large, NLP Question Answering - Synchronous Single-Stream [ms/batch]: 33.4022 / 32.8989 / 32.9984 / 33.8635
deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream [ms/batch]: 0.9681 / 0.9706 / 0.9556 / 0.9667
nwchem: C240 Buckyball [Seconds]: 1744 / 1730.7 / 1757.3 / 1748
deepsparse: ResNet-50, Sparse INT8 - Synchronous Single-Stream [items/sec]: 1030.3181 / 1027.5117 / 1043.2804 / 1032.2410
deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream [items/sec]: 199.6930 / 199.3380 / 199.3323 / 196.7874
deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream [ms/batch]: 5.0073 / 5.0196 / 5.0179 / 5.0808
deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream [ms/batch]: 10.3634 / 10.3695 / 10.3507 / 10.4501
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream [ms/batch]: 34.8681 / 35.1423 / 34.8217 / 35.1013
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream [items/sec]: 1832.7927 / 1817.9939 / 1834.4771 / 1819.6861
wrf: conus 2.5km [Seconds]: 5566.729 / 5600.976 / 5583.112 / 5617.197
deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Synchronous Single-Stream [items/sec]: 96.4300 / 96.3790 / 96.5474 / 95.6981
deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream [items/sec]: 1880.7688 / 1875.9035 / 1879.8388 / 1865.0193
deepsparse: BERT-Large, NLP Question Answering, Sparse INT8 - Asynchronous Multi-Stream [ms/batch]: 33.9808 / 34.0815 / 33.9939 / 34.2508
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream [ms/batch]: 475.6964 / 474.1852 / 475.6804 / 477.6108
deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream [items/sec]: 1829.9644 / 1824.3759 / 1827.9780 / 1837.4977
deepsparse: ResNet-50, Baseline - Asynchronous Multi-Stream [ms/batch]: 34.9169 / 35.0063 / 34.9683 / 34.7567
deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream [ms/batch]: 5.1485 / 5.1659 / 5.1802 / 5.1614
deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Synchronous Single-Stream [items/sec]: 194.2476 / 193.5972 / 193.0872 / 193.8186
deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream [items/sec]: 11228.6625 / 11169.8517 / 11163.6593 / 11221.1357
deepsparse: ResNet-50, Sparse INT8 - Asynchronous Multi-Stream [ms/batch]: 5.6852 / 5.7140 / 5.7165 / 5.6908
deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream [items/sec]: 190.8420 / 190.0399 / 189.8175 / 190.1376
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream [items/sec]: 133.5159 / 133.7189 / 133.4843 / 133.0079
deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream [ms/batch]: 5.2382 / 5.2617 / 5.2651 / 5.2589
deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream [ms/batch]: 77.1758 / 77.0924 / 77.1235 / 77.4573
deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream [items/sec]: 828.0384 / 828.6532 / 828.4462 / 825.0841
deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream [ms/batch]: 407.2777 / 408.6263 / 408.9699 / 408.1638
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream [items/sec]: 34.9620 / 35.1065 / 35.0057 / 34.9987
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream [ms/batch]: 28.5876 / 28.4701 / 28.5521 / 28.5586
deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream [items/sec]: 854.0525 / 852.9455 / 854.2521 / 850.8926
deepsparse: CV Detection, YOLOv5s COCO, Sparse INT8 - Asynchronous Multi-Stream [ms/batch]: 74.8177 / 74.9435 / 74.7945 / 75.0848
deepsparse: ResNet-50, Baseline - Synchronous Single-Stream [items/sec]: 338.9562 / 337.6799 / 338.5536 / 338.1466
deepsparse: ResNet-50, Baseline - Synchronous Single-Stream [ms/batch]: 2.9492 / 2.9602 / 2.9530 / 2.9573
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream [items/sec]: 1231.2049 / 1231.2939 / 1231.2736 / 1226.7960
deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream [ms/batch]: 2.9484 / 2.9500 / 2.9555 / 2.9455
deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream [items/sec]: 339.0416 / 338.8583 / 338.2192 / 339.3671
deepsparse: BERT-Large, NLP Question Answering - Asynchronous Multi-Stream [items/sec]: 156.0563 / 155.7222 / 155.7677 / 155.6020
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream [ms/batch]: 51.9301 / 51.9273 / 51.9192 / 52.0706
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream [ms/batch]: 475.1265 / 474.3687 / 475.3497 / 475.4464
deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream [ms/batch]: 14.0362 / 14.0081 / 14.0090 / 14.0315
deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Asynchronous Multi-Stream [items/sec]: 4554.9796 / 4563.5097 / 4563.4278 / 4556.0233
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream [items/sec]: 133.6023 / 133.8239 / 133.6004 / 133.5969
deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream [ms/batch]: 4.2083 / 4.2295 / 4.3436 / 4.1604
deepsparse: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Synchronous Single-Stream [items/sec]: 238.6523 / 237.0460 / 230.7640 / 240.2336
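One way to read the four-run summary is to look at the run-to-run spread of each test, taken here as the gap between the best and worst run as a percentage of the smallest value. The sketch below applies that to two tests from this result file; the function name `spread_percent` and the choice of this particular spread measure are assumptions made for illustration.

```python
# Hypothetical helper: spread between the largest and smallest result of a
# test across runs, as a percentage of the smallest result.
def spread_percent(results: dict[str, float]) -> float:
    low, high = min(results.values()), max(results.values())
    return (high - low) / low * 100.0


# Values copied from this result file (runs a, b, c, d).
nwchem_seconds = {"a": 1744.0, "b": 1730.7, "c": 1757.3, "d": 1748.0}
sst2_single_stream_items = {"a": 238.6523, "b": 237.0460, "c": 230.7640, "d": 240.2336}

for name, results in [
    ("nwchem: C240 Buckyball", nwchem_seconds),
    ("deepsparse: SST2 Sparse INT8, single-stream", sst2_single_stream_items),
]:
    print(f"{name}: {spread_percent(results):.2f}% spread across runs")
```

Because the measure is symmetric in the runs, it works the same way for "more is better" and "fewer is better" tests.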
Neural Magic DeepSparse This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.
Neural Magic DeepSparse 1.6 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
d: 32.89 (SE +/- 0.26, N = 15), c: 31.63 (SE +/- 0.32, N = 15), b: 31.23 (SE +/- 0.25, N = 15), a: 31.19 (SE +/- 0.38, N = 4)

Neural Magic DeepSparse 1.6 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
d: 30.42 (SE +/- 0.25, N = 15), c: 31.65 (SE +/- 0.32, N = 15), b: 32.04 (SE +/- 0.27, N = 15), a: 32.06 (SE +/- 0.40, N = 4)

Neural Magic DeepSparse 1.6 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
d: 33.05 (SE +/- 0.33, N = 15), c: 31.72 (SE +/- 0.33, N = 15), b: 31.53 (SE +/- 0.25, N = 15), a: 31.70 (SE +/- 0.31, N = 15)

Neural Magic DeepSparse 1.6 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
d: 30.29 (SE +/- 0.31, N = 15), c: 31.56 (SE +/- 0.32, N = 15), b: 31.73 (SE +/- 0.25, N = 15), a: 31.58 (SE +/- 0.31, N = 15)

Neural Magic DeepSparse 1.6 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
d: 181.20 (SE +/- 1.48, N = 9), c: 177.15 (SE +/- 1.83, N = 3), b: 180.32 (SE +/- 2.09, N = 3), a: 183.50 (SE +/- 1.57, N = 8)

Neural Magic DeepSparse 1.6 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
d: 352.22 (SE +/- 2.84, N = 9), c: 359.89 (SE +/- 3.47, N = 3), b: 353.69 (SE +/- 4.19, N = 3), a: 347.66 (SE +/- 3.00, N = 8)

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
d: 29.52 (SE +/- 0.10, N = 3), c: 30.29 (SE +/- 0.15, N = 3), b: 30.39 (SE +/- 0.27, N = 3), a: 29.94 (SE +/- 0.30, N = 6)

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
d: 33.86 (SE +/- 0.12, N = 3), c: 33.00 (SE +/- 0.16, N = 3), b: 32.90 (SE +/- 0.30, N = 3), a: 33.40 (SE +/- 0.34, N = 6)

Neural Magic DeepSparse 1.6 - Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
d: 0.9667 (SE +/- 0.0094, N = 7), c: 0.9556 (SE +/- 0.0048, N = 3), b: 0.9706 (SE +/- 0.0090, N = 3), a: 0.9681 (SE +/- 0.0105, N = 4)
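In the Synchronous Single-Stream results above, the items/sec and ms/batch figures for the same model look close to reciprocal: one item is processed at a time, so throughput is roughly 1000 divided by the latency in milliseconds. The sketch below checks that against run a of the oBERT IMDB test. The batch size of 1 is an assumption (the result file does not state it), and the helper name is hypothetical.

```python
# Hypothetical helper: estimate throughput from per-batch latency.
# batch_size=1 is an assumption; the result file does not state it.
def items_per_second(ms_per_batch: float, batch_size: int = 1) -> float:
    return batch_size * 1000.0 / ms_per_batch


# Run a, NLP Document Classification, oBERT base uncased on IMDB,
# Synchronous Single-Stream, as reported in this result file.
reported = {"ms_per_batch": 32.0613, "items_per_sec": 31.1945}

estimate = items_per_second(reported["ms_per_batch"])
gap = abs(estimate - reported["items_per_sec"])
print(f"estimated {estimate:.4f} items/sec, reported {reported['items_per_sec']}, gap {gap:.4f}")
```

The small remaining gap is expected if the two figures are averaged separately over trials. The same shortcut does not apply to the Asynchronous Multi-Stream scenario, where many requests are in flight at once (for example, 347.66 ms/batch alongside 183.50 items/sec for YOLACT in run a).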
NWChem NWChem is an open-source high performance computational chemistry package. Per NWChem's documentation, "NWChem aims to provide its users with computational chemistry tools that are scalable both in their ability to treat large scientific computational chemistry problems efficiently, and in their use of available parallel computing resources from high-performance parallel supercomputers to conventional workstation clusters." Learn more via the OpenBenchmarking.org test page.
NWChem 7.0.2 - Input: C240 Buckyball (Seconds, Fewer Is Better)
d: 1748.0, c: 1757.3, b: 1730.7, a: 1744.0
1. (F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz -lcomex -m64 -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2
Neural Magic DeepSparse 1.6 - Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
d: 1032.24 (SE +/- 9.64, N = 7), c: 1043.28 (SE +/- 5.37, N = 3), b: 1027.51 (SE +/- 9.47, N = 3), a: 1030.32 (SE +/- 11.12, N = 4)

Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
d: 196.79 (SE +/- 1.69, N = 8), c: 199.33 (SE +/- 1.76, N = 12), b: 199.34 (SE +/- 2.05, N = 12), a: 199.69 (SE +/- 1.45, N = 12)

Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
d: 5.0808 (SE +/- 0.0457, N = 8), c: 5.0179 (SE +/- 0.0484, N = 12), b: 5.0196 (SE +/- 0.0573, N = 12), a: 5.0073 (SE +/- 0.0391, N = 12)

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
d: 10.45 (SE +/- 0.08, N = 10), c: 10.35 (SE +/- 0.02, N = 3), b: 10.37 (SE +/- 0.03, N = 3), a: 10.36 (SE +/- 0.01, N = 3)

Neural Magic DeepSparse 1.6 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
d: 35.10 (SE +/- 0.39, N = 3), c: 34.82 (SE +/- 0.11, N = 3), b: 35.14 (SE +/- 0.14, N = 3), a: 34.87 (SE +/- 0.43, N = 3)

Neural Magic DeepSparse 1.6 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
d: 1819.69 (SE +/- 21.20, N = 3), c: 1834.48 (SE +/- 5.61, N = 3), b: 1817.99 (SE +/- 7.71, N = 3), a: 1832.79 (SE +/- 22.80, N = 3)
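The "SE +/-" figures reported with each result are standard errors of the mean over N trials. A rough way to judge whether two runs really differ is to express their difference in units of the combined standard error. The sketch below does that for runs c and d of the ResNet-50 ImageNet Asynchronous Multi-Stream test. It assumes the SE values are listed in the same order as the runs (d, c, b, a) and that the runs are independent; the function name is hypothetical.

```python
import math


# Hypothetical helper: absolute difference of two means, in units of the
# combined standard error sqrt(se_x^2 + se_y^2) (assumes independent runs).
def difference_in_se_units(mean_x: float, se_x: float, mean_y: float, se_y: float) -> float:
    combined_se = math.sqrt(se_x ** 2 + se_y ** 2)
    return abs(mean_x - mean_y) / combined_se


# CV Classification, ResNet-50 ImageNet, Asynchronous Multi-Stream, items/sec.
# Pairing of SE values with runs is an assumption (same order as the results).
run_c = (1834.48, 5.61)
run_d = (1819.69, 21.20)

ratio = difference_in_se_units(*run_c, *run_d)
print(f"runs c and d differ by {ratio:.2f} combined standard errors")
```

A ratio well below 2 suggests the gap between the two runs is within run-to-run noise rather than a meaningful difference.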
WRF WRF, the Weather Research and Forecasting Model, is a "next-generation mesoscale numerical weather prediction system designed for both atmospheric research and operational forecasting applications. It features two dynamical cores, a data assimilation system, and a software architecture supporting parallel computation and system extensibility." Learn more via the OpenBenchmarking.org test page.
WRF 4.2.2 - Input: conus 2.5km (Seconds, Fewer Is Better)
d: 5617.20, c: 5583.11, b: 5600.98, a: 5566.73
1. (F9X) gfortran options: -O2 -ftree-vectorize -funroll-loops -ffree-form -fconvert=big-endian -frecord-marker=4 -fallow-invalid-boz -lesmf_time -lwrfio_nf -lnetcdff -lnetcdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
d: 95.70 (SE +/- 0.72, N = 10), c: 96.55 (SE +/- 0.16, N = 3), b: 96.38 (SE +/- 0.30, N = 3), a: 96.43 (SE +/- 0.11, N = 3)

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
d: 1865.02 (SE +/- 21.54, N = 3), c: 1879.84 (SE +/- 21.19, N = 3), b: 1875.90 (SE +/- 26.46, N = 3), a: 1880.77 (SE +/- 21.42, N = 3)

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
d: 34.25 (SE +/- 0.40, N = 3), c: 33.99 (SE +/- 0.38, N = 3), b: 34.08 (SE +/- 0.48, N = 3), a: 33.98 (SE +/- 0.38, N = 3)

Neural Magic DeepSparse 1.6 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
d: 477.61 (SE +/- 0.44, N = 3), c: 475.68 (SE +/- 0.52, N = 3), b: 474.19 (SE +/- 0.47, N = 3), a: 475.70 (SE +/- 0.61, N = 3)

Neural Magic DeepSparse 1.6 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
d: 1837.50 (SE +/- 14.66, N = 3), c: 1827.98 (SE +/- 17.94, N = 3), b: 1824.38 (SE +/- 13.19, N = 3), a: 1829.96 (SE +/- 4.85, N = 3)

Neural Magic DeepSparse 1.6 - Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
d: 34.76 (SE +/- 0.29, N = 3), c: 34.97 (SE +/- 0.35, N = 3), b: 35.01 (SE +/- 0.25, N = 3), a: 34.92 (SE +/- 0.08, N = 3)

Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
d: 5.1614 (SE +/- 0.0471, N = 12), c: 5.1802 (SE +/- 0.0500, N = 12), b: 5.1659 (SE +/- 0.0445, N = 13), a: 5.1485 (SE +/- 0.0389, N = 12)

Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
d: 193.82 (SE +/- 1.62, N = 12), c: 193.09 (SE +/- 1.70, N = 12), b: 193.60 (SE +/- 1.53, N = 13), a: 194.25 (SE +/- 1.36, N = 12)

Neural Magic DeepSparse 1.6 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
d: 11221.14 (SE +/- 77.36, N = 12), c: 11163.66 (SE +/- 101.43, N = 7), b: 11169.85 (SE +/- 89.80, N = 9), a: 11228.66 (SE +/- 71.77, N = 13)

Neural Magic DeepSparse 1.6 - Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
d: 5.6908 (SE +/- 0.0425, N = 12), c: 5.7165 (SE +/- 0.0544, N = 7), b: 5.7140 (SE +/- 0.0484, N = 9), a: 5.6852 (SE +/- 0.0393, N = 13)
Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
d: 190.14 (SE +/- 1.52, N = 12), c: 189.82 (SE +/- 1.69, N = 7), b: 190.04 (SE +/- 1.73, N = 12), a: 190.84 (SE +/- 1.29, N = 12)

Neural Magic DeepSparse 1.6 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
d: 133.01 (SE +/- 0.11, N = 3), c: 133.48 (SE +/- 0.16, N = 3), b: 133.72 (SE +/- 0.13, N = 3), a: 133.52 (SE +/- 0.15, N = 3)

Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
d: 5.2589 (SE +/- 0.0457, N = 12), c: 5.2651 (SE +/- 0.0492, N = 7), b: 5.2617 (SE +/- 0.0526, N = 12), a: 5.2382 (SE +/- 0.0382, N = 12)

Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
d: 77.46 (SE +/- 0.87, N = 3), c: 77.12 (SE +/- 0.64, N = 3), b: 77.09 (SE +/- 0.63, N = 3), a: 77.18 (SE +/- 0.70, N = 3)

Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
d: 825.08 (SE +/- 9.22, N = 3), c: 828.45 (SE +/- 6.81, N = 3), b: 828.65 (SE +/- 6.37, N = 3), a: 828.04 (SE +/- 7.51, N = 3)

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
d: 408.16 (SE +/- 2.06, N = 3), c: 408.97 (SE +/- 2.55, N = 3), b: 408.63 (SE +/- 1.91, N = 3), a: 407.28 (SE +/- 1.62, N = 3)

Neural Magic DeepSparse 1.6 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
d: 35.00 (SE +/- 0.23, N = 12), c: 35.01 (SE +/- 0.35, N = 6), b: 35.11 (SE +/- 0.34, N = 6), a: 34.96 (SE +/- 0.33, N = 7)

Neural Magic DeepSparse 1.6 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
d: 28.56 (SE +/- 0.20, N = 12), c: 28.55 (SE +/- 0.29, N = 6), b: 28.47 (SE +/- 0.29, N = 6), a: 28.59 (SE +/- 0.28, N = 7)

Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
d: 850.89 (SE +/- 10.96, N = 3), c: 854.25 (SE +/- 9.10, N = 3), b: 852.95 (SE +/- 10.26, N = 3), a: 854.05 (SE +/- 8.70, N = 3)

Neural Magic DeepSparse 1.6 - Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
d: 75.08 (SE +/- 0.97, N = 3), c: 74.79 (SE +/- 0.80, N = 3), b: 74.94 (SE +/- 0.92, N = 3), a: 74.82 (SE +/- 0.77, N = 3)

Neural Magic DeepSparse 1.6 - Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
d: 338.15 (SE +/- 2.94, N = 12), c: 338.55 (SE +/- 2.56, N = 12), b: 337.68 (SE +/- 3.44, N = 6), a: 338.96 (SE +/- 2.76, N = 9)

Neural Magic DeepSparse 1.6 - Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
d: 2.9573 (SE +/- 0.0282, N = 12), c: 2.9530 (SE +/- 0.0242, N = 12), b: 2.9602 (SE +/- 0.0315, N = 6), a: 2.9492 (SE +/- 0.0255, N = 9)
Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
d: 1226.80 (SE +/- 17.38, N = 3), c: 1231.27 (SE +/- 13.91, N = 3), b: 1231.29 (SE +/- 14.60, N = 4), a: 1231.20 (SE +/- 14.83, N = 3)

Neural Magic DeepSparse 1.6 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
d: 2.9455 (SE +/- 0.0208, N = 12), c: 2.9555 (SE +/- 0.0300, N = 6), b: 2.9500 (SE +/- 0.0260, N = 9), a: 2.9484 (SE +/- 0.0247, N = 10)

Neural Magic DeepSparse 1.6 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
d: 339.37 (SE +/- 2.24, N = 12), c: 338.22 (SE +/- 3.30, N = 6), b: 338.86 (SE +/- 2.82, N = 9), a: 339.04 (SE +/- 2.66, N = 10)

Neural Magic DeepSparse 1.6 - Model: BERT-Large, NLP Question Answering - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
d: 155.60 (SE +/- 1.10, N = 3), c: 155.77 (SE +/- 1.19, N = 3), b: 155.72 (SE +/- 0.66, N = 3), a: 156.06 (SE +/- 0.66, N = 3)

Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
d: 52.07 (SE +/- 0.70, N = 3), c: 51.92 (SE +/- 0.62, N = 3), b: 51.93 (SE +/- 0.60, N = 4), a: 51.93 (SE +/- 0.63, N = 3)

Neural Magic DeepSparse 1.6 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
d: 475.45 (SE +/- 0.75, N = 3), c: 475.35 (SE +/- 0.55, N = 3), b: 474.37 (SE +/- 0.39, N = 3), a: 475.13 (SE +/- 0.32, N = 3)

Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better)
d: 14.03 (SE +/- 0.14, N = 6), c: 14.01 (SE +/- 0.14, N = 6), b: 14.01 (SE +/- 0.13, N = 7), a: 14.04 (SE +/- 0.16, N = 5)

Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
d: 4556.02 (SE +/- 42.83, N = 6), c: 4563.43 (SE +/- 43.60, N = 6), b: 4563.51 (SE +/- 40.39, N = 7), a: 4554.98 (SE +/- 49.40, N = 5)

Neural Magic DeepSparse 1.6 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
d: 133.60 (SE +/- 0.26, N = 3), c: 133.60 (SE +/- 0.19, N = 3), b: 133.82 (SE +/- 0.14, N = 3), a: 133.60 (SE +/- 0.13, N = 3)

Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better)
d: 4.1604 (SE +/- 0.0540, N = 3), c: 4.3436 (SE +/- 0.0661, N = 15), b: 4.2295 (SE +/- 0.0666, N = 15), a: 4.2083 (SE +/- 0.0821, N = 15)

Neural Magic DeepSparse 1.6 - Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream (items/sec, More Is Better)
d: 240.23 (SE +/- 3.16, N = 3), c: 230.76 (SE +/- 3.66, N = 15), b: 237.05 (SE +/- 3.82, N = 15), a: 238.65 (SE +/- 4.61, N = 15)
a Testing initiated at 12 December 2023 00:07 by user phoronix.
b Testing initiated at 12 December 2023 02:28 by user phoronix.
c Testing initiated at 12 December 2023 10:31 by user phoronix.
d Testing initiated at 13 December 2023 00:17 by user phoronix.
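The "Testing initiated" timestamps can be parsed to see how the four runs were spaced. The sketch below uses Python's `datetime.strptime` with a format string matched to the timestamps as they appear in this result file; it assumes an English locale for the month name and that all four timestamps share one time zone, which the source does not state.

```python
from datetime import datetime

# Start times copied from this result file, keyed by run identifier.
STARTS = {
    "a": "12 December 2023 00:07",
    "b": "12 December 2023 02:28",
    "c": "12 December 2023 10:31",
    "d": "13 December 2023 00:17",
}

# %B is the full month name; this relies on an English (or C) locale.
parsed = {run: datetime.strptime(text, "%d %B %Y %H:%M") for run, text in STARTS.items()}

# Order runs by start time and measure the gap between consecutive starts.
runs = sorted(parsed, key=parsed.get)
gaps_minutes = {
    f"{earlier}->{later}": int((parsed[later] - parsed[earlier]).total_seconds() // 60)
    for earlier, later in zip(runs, runs[1:])
}
print(gaps_minutes)
```

The gap between two start times is only an upper bound on how long the earlier run took, since the file does not record when each run finished.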