2 x AMD EPYC 9654 96-Core testing with an AMD Titanite_4G (RTI1004D BIOS) and llvmpipe on Red Hat Enterprise Linux 9.1 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2303114-NE-9654NEW5019
HTML result view exported from: https://openbenchmarking.org/result/2303114-NE-9654NEW5019&grs&export=pdf&sor
9654 new

Result identifiers: a, b, c, no smt a, no smt b, smt a, smt b, smt c, smt d

a, b, c:
  Processor: AMD EPYC 9654 96-Core @ 2.40GHz (96 Cores / 192 Threads)
  Motherboard: AMD Titanite_4G (RTI1004D BIOS)
  Chipset: AMD Device 14a4
  Memory: 768GB
  Disk: 2 x 1920GB SAMSUNG MZWLJ1T9HBJR-00007
  Graphics: ASPEED
  Monitor: VGA HDMI
  Network: Broadcom NetXtreme BCM5720 PCIe
  OS: Red Hat Enterprise Linux 9.1
  Kernel: 5.14.0-162.6.1.el9_1.x86_64 (x86_64)
  Desktop: GNOME Shell 40.10
  Display Server: X Server 1.20.11
  Compiler: GCC 11.3.1 20220421
  File-System: xfs
  Screen Resolution: 1600x1200

no smt a, no smt b (fields that differ from the above):
  Processor: 2 x AMD EPYC 9654 96-Core @ 2.40GHz (192 Cores)
  Memory: 1520GB
  Graphics: llvmpipe
  OpenGL: 4.5 Mesa 22.1.5 (LLVM 14.0.6 256 bits)
  Screen Resolution: 1024x768

smt a, smt b, smt c, smt d (fields that differ from the no smt results):
  Processor: 2 x AMD EPYC 9654 96-Core @ 2.40GHz (192 Cores / 384 Threads)

Kernel Details
  - Transparent Huge Pages: always

Compiler Details
  - --build=x86_64-redhat-linux --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-host-bind-now --enable-host-pie --enable-initfini-array --enable-languages=c,c++,fortran,lto --enable-link-serialization=1 --enable-multilib --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=x86-64 --with-arch_64=x86-64-v2 --with-build-config=bootstrap-lto --with-gcc-major-version-only --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver --without-isl

Processor Details
  - Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
  - CPU Microcode: 0xa101111

Python Details
  - Python 3.9.14

Security Details
  - a, b, c, smt a, smt b, smt c, smt d: SELinux + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
  - no smt a, no smt b: SELinux + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: disabled RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

9654 new stress-ng: MMAP onednn: IP Shapes 1D - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: IP Shapes 1D - f32 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU stress-ng: Context Switching stress-ng: NUMA onednn: Deconvolution Batch shapes_1d - f32 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU stress-ng: Pthread rocksdb: Rand Read stress-ng: Matrix Math stress-ng: CPU Cache stress-ng: CPU Stress stress-ng: Vector Math stress-ng: Mutex stress-ng: Crypto onednn: IP Shapes 3D - f32 - CPU stress-ng: SENDFILE stress-ng: Function Call onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU stress-ng: Atomic deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream stress-ng: Glibc Qsort Data Sorting deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU stress-ng: Glibc C String Functions deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU stress-ng: Hash openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU onednn: IP Shapes 3D - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Training - f32 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT 
mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream openvino: Person Detection FP32 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Person Detection FP32 - CPU deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream stress-ng: Malloc openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU rocksdb: Read While Writing deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU openvino: Machine Translation EN To DE FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU stress-ng: MEMFD stress-ng: Memory Copying stress-ng: Forking stress-ng: Futex gromacs: MPI CPU - water_GMX50_bare rocksdb: Read Rand Write Rand onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU memcached: 1:5 stress-ng: System V Message Passing onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU kvazaar: Bosphorus 4K - Slow memcached: 1:10 onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU kvazaar: Bosphorus 4K - Medium uvg266: Bosphorus 4K - Slow openvino: Age Gender Recognition Retail 0013 FP16 - CPU onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU memcached: 1:100 stress-ng: Semaphores build-linux-kernel: defconfig stress-ng: Poll openvkl: vklBenchmark Scalar deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream 
uvg266: Bosphorus 4K - Medium deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream uvg266: Bosphorus 1080p - Super Fast uvg266: Bosphorus 1080p - Ultra Fast compress-zstd: 12 - Compression Speed rocksdb: Update Rand rocksdb: Seq Fill deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream uvg266: Bosphorus 1080p - Very Fast clickhouse: 100M Rows Hits Dataset, Second Run rocksdb: Rand Fill clickhouse: 100M Rows Hits Dataset, First Run / Cold Cache deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream clickhouse: 100M Rows Hits Dataset, Third Run vpxenc: Speed 5 - Bosphorus 4K openvino: Age Gender Recognition Retail 0013 FP16 - CPU uvg266: Bosphorus 4K - Ultra Fast openvkl: vklBenchmark ISPC compress-zstd: 8 - Compression Speed compress-zstd: 3 - Compression Speed build-ffmpeg: Time To Compile kvazaar: Bosphorus 1080p - Very Fast uvg266: Bosphorus 4K - Very Fast kvazaar: Bosphorus 1080p - Slow kvazaar: Bosphorus 1080p - Super Fast uvg266: Bosphorus 4K - Super Fast deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream compress-zstd: 3, Long Mode - Compression Speed kvazaar: Bosphorus 1080p - Medium deepsparse: NLP Document Classification, oBERT base uncased on IMDB - 
Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream rocksdb: Rand Fill Sync deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream kvazaar: Bosphorus 4K - Very Fast kvazaar: Bosphorus 1080p - Ultra Fast kvazaar: Bosphorus 4K - Super Fast kvazaar: Bosphorus 4K - Ultra Fast openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream vpxenc: Speed 0 - Bosphorus 4K uvg266: Bosphorus 1080p - Slow uvg266: Bosphorus 1080p - Medium openvino: Weld Porosity Detection FP16-INT8 - CPU compress-zstd: 8, Long Mode - Compression Speed deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream vpxenc: Speed 0 - Bosphorus 1080p compress-zstd: 19, Long Mode - Compression Speed deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream vvenc: Bosphorus 1080p - Faster compress-zstd: 19 - Compression Speed deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream stress-ng: Socket Activity compress-zstd: 19 - Decompression Speed compress-zstd: 12 - Decompression Speed vvenc: Bosphorus 4K - Fast compress-zstd: 19, Long Mode - Decompression Speed vvenc: Bosphorus 4K - Faster compress-zstd: 3 - Decompression Speed compress-zstd: 8 - Decompression Speed vpxenc: Speed 5 - Bosphorus 1080p deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - 
Asynchronous Multi-Stream vvenc: Bosphorus 1080p - Fast compress-zstd: 8, Long Mode - Decompression Speed compress-zstd: 3, Long Mode - Decompression Speed embree: Pathtracer - Crown a b c no smt a no smt b smt a smt b smt c smt d 1663.08 8.74576 674.088 2.01581 671.69 675.915 3.11497 18941003.97 483.9 7.35746 16.3 2941.47 109397.78 468231434 382305.69 77.51 205134.87 556875.59 59479783.64 203073.15 1.56461 1950323.96 621015.34 0.263997 174.72 42.9581 1122.39 42.9893 0.555277 16257100.25 1009.8486 0.2902 18954936 95.76 962.1 49.72 9494.22 499.32 1.87777 907.489 4914.17 9.76 440.8533 911.245 909.691 149.7446 619.7531 314.8361 27.97 8.11 5908.89 1701.7 401.7766 312709034.05 28.47 1671.46 74495.17 9296185 1566.564 0.415446 540.41 88.71 0.357017 0.53445 0.304972 518.46 20106.51 58156.36 2794694.37 10.569 2926458 1.17004 0.693367 5750.74 8.34 4143044.21 10473084.09 1.83147 40.63 6861088.89 0.311405 41.4 29.37 112346.36 0.711588 4878860 18128283.29 25.768 12653709.64 556 5.1812 192.8806 33.03 99.1493 10.0811 5.3018 188.5378 11.362 87.9631 238.68 240.98 330.8 544384 542256 16.5828 60.2522 234.95 625.91 533927 600.70 9.5962 104.1178 625.16 17.36 0.65 70.56 1089 1233.8 3033.9 12.569 296.93 68.82 139.92 307.49 69.33 28.6233 34.9289 892.7 143.31 34.8262 28.7077 376601 1115.7602 1116.426 80.9 310.73 81.9 83.45 1.09 197.1018 5.0715 47.464 7.68 81.16 91.37 10.1 938.5 108.6775 14.69 9.22 319.9815 77.3363 30.039 19.1 152.1377 119.2067 8873.58 1472.6 1704.8 5.82 1383.7 12.336 1514.8 1669.3 29.37 30.5924 12.44 1677.9 1536.6 1664.75 9.1957 671.654 1.9234 670.202 668.952 4.27858 16313126.86 498.67 7.28243 15.54 3085.53 109356.78 466888888 382304.04 67.21 217072.27 556797.27 60031543.97 203147.15 1.56257 1913590.77 621003.92 0.263373 223.29 42.8348 1132.35 42.897 0.551966 16168655.17 1012.3494 0.291387 18955118.1 95.48 962.63 49.72 9496.95 500.83 1.72975 903.272 4915.42 9.75 439.0373 912.114 917.077 149.8207 619.9442 316.2907 28.23 8.13 5894.27 1685.26 401.2053 314418461.02 28.42 
1675.96 74353.41 8620352 1573.7496 0.578466 541.03 88.61 0.356875 0.513068 0.305731 507.74 20340.4 58664.97 2805836.52 10.609 2891962 1.16696 0.689064 5720.9 8.38 4162812.47 10471889.97 1.84172 40.86 6792746.34 0.312721 41.39 29.29 111378.39 0.712495 4876951.36 18100474.36 25.749 12661687.46 557 5.1633 193.5305 33.13 98.6175 10.136 5.498 181.8123 11.1872 89.3355 239.92 240.91 332 545556 545396 16.6745 59.9222 234.68 627.20 534681 610.79 9.5793 104.2908 628.37 17.37 0.66 70.68 1075 1239.3 3095 12.434 290.59 68.79 139.07 303.99 69 28.7137 34.8189 909.4 144.21 35.0962 28.4866 373002 1118.4383 1116.2899 81.61 305.52 80.68 84.77 1.09 196.5414 5.0859 47.3532 7.71 81.41 91.47 10.1 910.8 109.1129 14.67 9.29 319.9648 77.305 29.845 19.1 151.6003 119.3799 8851.65 1467.8 1716.7 5.809 1378.2 12.391 1516.6 1651 29.5 30.4608 12.399 1673.7 1537.5 1668.92 3.83017 669.146 1.88686 671.233 667.95 4.3972 16862185.54 478.05 7.38116 15.1 3174.04 109609.15 468069792 382328.5 97.04 217304.19 556833.5 59929401.4 203095.44 1.56742 1891202.37 621041.93 0.26292 183.33 42.9152 1125.86 42.7433 0.556851 16537965.14 1013.7357 0.285432 18961773.18 95.59 962.15 49.77 9485.65 500.3 1.6386 906.139 4910.13 9.76 438.4372 915.084 911.762 149.6905 625.5512 316.6282 27.89 8.13 5898.79 1704.02 400.0427 313768771.26 28.33 1679.03 74486.08 8316379 1571.8281 0.417188 535.78 89.46 0.353175 0.546991 0.305599 507.97 20297.91 64299.62 2794473.75 10.587 2910023 1.16919 0.692667 5724.81 8.37 4155273.46 10475486.39 1.82264 40.76 6839975.87 0.312289 41.47 29.41 112186.25 0.708563 4852421.67 18088584.67 25.68 12676101.41 549 5.1696 193.3119 33.1 98.5922 10.1384 5.5007 181.7243 11.1605 89.547 237.96 238.77 330.1 543572 544565 16.5185 60.4884 237.44 621.88 536551 614.24 8.8004 113.5057 636.18 17.46 0.66 71.13 1066 1241.1 3049.4 12.465 291.03 69.04 140.1 301.4 69.92 28.5698 34.9937 916.9 143.81 35.1188 28.4687 356922 1122.0584 1117.7323 80.31 309.64 84.55 82.7 1.09 197.6601 5.0573 47.2942 7.7 81.1 91.29 10.11 926.9 
109.2467 14.76 9.3 320.0504 76.6181 30.081 19.1 151.3899 119.7026 8876.1 1470.5 1715.6 5.808 1384 12.399 1517.4 1661.9 29.46 30.4911 12.441 1675.4 1540.9 4520.48 6.19257 930.257 2.04003 913.201 973.318 4.87883 47222624.49 20.51 9.25292 6.01 7979.05 68076.03 1209611055 932248.18 40.94 328297.04 920216.27 63581395.67 435615.35 2.19645 3284433.2 829106.63 0.316665 400.03 97.3996 1978.8 97.4442 0.664359 26755146.69 2204.7533 0.316779 27408413.1 208.82 438.29 109.08 20836.1 229.46 2.02286 1148.26 10556.93 4.54 941.2443 1151.25 1147.34 308.4294 1291.9733 649.302 57.18 3.95 12119.66 833.99 813.9832 456657338.84 57.58 828.67 127833.52 7643831 3107.2957 0.291687 1045.81 45.86 0.342836 0.556069 0.160361 303.56 11342.69 43094.96 3746361.52 19.175 2079804 0.664295 0.40359 9784.09 4.9 2444886.82 7402514.74 1.65041 47.13 4244908.83 0.254845 47.97 34.59 160545.41 0.462386 3192061.68 13141129.68 17.606 10458275.24 647 7.5856 131.7586 38.38 72.1298 13.8559 6.0118 166.2725 14.4638 69.1063 181.67 186.24 279.9 462018 465700 20.7385 48.1849 196.25 666.43 478629 635.48 9.8899 101.0095 665.00 14.48 0.57 57.78 1098 1227.5 2804.2 10.597 269.62 57.13 155.56 288.41 59.83 31.9229 31.3187 955.6 159.63 31.9146 31.327 350673 955.1631 956.0088 70.02 278.62 75.5 76.05 1.01 191.2605 5.2262 42.805 7.04 87.15 98.21 9.18 859.9 100.2418 14.28 9.39 304.2075 73.1993 28.689 19.8 145.3548 116.1532 8968.28 1483.1 1728 5.89 1395.8 12.477 1500.4 1669.7 29.62 30.4213 12.357 1684.8 1542.5 3591.71 7.98861 901.044 1.93126 903.358 939.547 4.79302 44683906.68 19.78 9.31751 6.01 7979.82 67451.3 1213540299 925984.24 55.84 326819.72 920642.34 65500297.73 437065.75 2.12379 3282827.5 829445.38 0.293564 395.95 97.3441 1963.75 97.3838 0.670574 28009942.56 2203.1573 0.340498 27422305.53 210.14 438.99 109.04 20849.75 228.01 2.02833 1124.75 10549.24 4.54 938.9084 1119.17 1139.63 308.3033 1292.8038 647.7596 57.19 3.95 12128.38 834.17 817.6313 456651508.04 57.49 830.04 127770.39 7913568 3086.3591 0.291229 1048.74 45.73 
0.347231 0.629026 0.164845 308.67 10949.9 45685.28 3802292.6 18.837 2063831 0.651301 0.400325 9758.74 4.91 2460416.33 7372780.72 1.70057 47.43 4220522.49 0.255796 47.93 34.68 162294.47 0.461174 3216133.91 13192391.85 17.586 10393402.01 652 6.4174 155.7357 38.65 71.8878 13.9039 6.065 164.8126 14.4178 69.3266 216.15 209.85 278.2 452228 464044 20.4223 48.9257 183.75 649.52 468210 622.73 9.9395 100.509 662.01 14.5 0.58 57.23 1108 1234.2 2865.9 10.662 268.77 58.09 159.79 280.18 58.33 32.1778 31.07 1032.3 161.4 29.7544 33.6012 344040 955.2386 955.3438 69.94 271.45 74.03 74.37 1.01 175.3199 5.7015 42.8393 7.16 88.13 96.67 9.18 892 100.4198 13.72 9.76 303.7377 73.2087 28.52 19.8 145.3627 115.6369 8924.1 1483.8 1727.8 5.895 1397.6 12.31 1511.5 1667.6 29.71 30.6403 12.374 1682.8 1538.8 8360.76 15.421 3187 7.60566 3269.05 3133.65 11.4326 12895047.73 24.82 20.2695 6.02 7953.21 74978.51 1225662852 946032.88 47.58 487359.39 1291689.33 136339636.91 466292.12 3.39151 4329963.26 1414423.65 0.591738 184.34 97.2916 2564.48 97.0249 1.20049 35687958.37 2230.3099 0.621776 41966139.67 208.13 439.41 108.93 20610.23 230.21 3.38833 1912.16 10544.03 4.54 929.0246 1830.41 1888.1 314.6186 1306.6677 658.2111 57.29 3.95 12117.16 833.15 817.7536 634718750.41 57.72 827.07 148316.88 15317250 3114.6848 0.451472 1049.16 45.71 0.674455 0.975357 0.279638 464.7 15914.1 36266.02 2333781.39 18.818 1787682 0.97859 0.538539 9746.08 4.92 2520253 10103952.13 2.70807 64.63 4530948.62 0.40743 65.56 45.56 173926.92 0.672585 4383314.65 20047519.05 17.376 15359471.73 764 7.5225 132.8655 47.56 77.8827 12.8322 7.1865 139.0973 13.637 73.2924 178.49 216.15 254.1 413199 414168 21.0323 47.506 183.99 524.04 423027 500.13 8.1914 121.9232 534.82 13.9 0.53 57.96 1318 1023.9 2795.1 10.381 250.71 59.3 132.67 296.72 59.06 32.2956 30.9572 1051.4 140.78 31.7592 31.4805 388052 972.5163 970.2202 75.91 302.55 76.64 77.96 0.97 188.5826 5.3004 42.9515 7.24 80.17 89.4 9.28 853.8 103.0888 14.08 9.78 303.3272 73.306 18.8 145.3966 
117.1233 8748.99 1495.2 1723.7 1393.5 1515.6 1664.3 29.54 30.7462 1682.5 1539.5 9017.12 19.549 3034.62 7.4227 3249.28 3222.93 7.30147 12221058.34 24.77 20.572 6.03 7949.05 180407.59 1231168093 946660.88 44.38 487987.32 1291704.57 135855921.28 466609.22 3.49453 4329824.38 1413555.44 0.600199 186.64 97.3528 2520.38 96.906 1.18135 35996976.13 2245.8036 0.620907 41989522.68 208.19 439.5 108.89 20610.62 230.06 3.59643 1890.28 10549.15 4.54 927.679 1937.65 1894.07 314.2452 1306.5839 658.341 57.31 3.96 12109.64 832.7 822.1552 634995429.66 57.67 827.03 147582.87 13830689 3114.4293 0.450261 1050.37 45.66 0.675664 0.971704 0.291087 564.05 15430.18 36020.16 2396325.93 19.015 1761416 0.973555 0.551783 9680.44 4.95 2507331.29 8586858.51 2.7614 66.22 4392934.75 0.408074 65.69 46.25 175754.35 0.672768 4357453.76 19927313.52 17.017 15341564.16 810 6.8844 145.1736 46.62 77.4096 12.9106 7.3054 136.8324 14.6356 68.2924 220.79 179.56 259.1 411513 413708 21.7398 45.9642 209.55 527.88 417882 516.09 8.3506 119.5964 538.63 14.27 0.53 57.51 1243 1122.4 2513.4 10.259 274.63 57.86 136.33 267.04 59.2 32.2415 31.0093 1038.9 138.33 31.7806 31.4593 401295 975.504 969.2122 74.11 295.18 76.7 77 0.97 188.4147 5.3053 42.6536 6.97 79.95 89.7 9.28 852.7 103.1902 13.57 9.87 304.0782 73.3496 19.5 145.2745 116.4813 8750.61 1483.5 1726.3 1392.4 1519.1 1671 29.6 30.7508 1676.5 1540.4 7633.16 18.6594 3011.21 9.34981 3153.22 3185.64 12.013 12328933.65 24.79 21.0301 5.99 7993.18 77834.39 1234570512 952461.48 42.55 490300.48 1295970.28 138939958.5 468159.21 3.60194 4351871.12 1422505.51 0.593299 184.24 96.9077 2517.26 96.9975 1.20146 34602827.77 2253.1411 0.634992 41972765.93 210.19 437.51 109.36 20684.29 227.93 3.36535 1888.21 10603.84 4.52 937.992 1926.15 1925.68 315.3318 1305.7137 662.2585 57.71 3.93 12168.05 826.67 818.6037 639757070.52 58.05 821.93 148047.13 13948914 3132.3838 0.44782 1058.57 45.31 0.676124 0.99571 0.275917 413.34 15077.69 34917.39 2067499.58 19.045 1613913 0.972863 0.545118 9773.26 4.9 
2517750.47 8609357.68 2.70235 65.49 4549313.42 0.394377 65.59 46.34 173620.31 0.676798 4394194.49 19866440.86 17.053 15228320.07 793 7.355 135.8907 46.57 70.1418 14.2486 6.1057 163.7111 14.2011 70.3828 178.78 218.28 249.8 419985 414282 20.4884 48.772 192.25 524.75 416006 495.85 10.3852 96.1737 530.47 15.38 0.53 57.33 1235 1005.8 2604.6 10.407 243.36 57.44 135.8 288.09 58.34 33.9879 29.4163 1046.6 140.89 31.9277 31.3146 394852 971.8988 971.6677 73.57 303.43 78.72 76.14 0.96 189.4054 5.2773 42.5521 6.92 79.63 89.16 9.25 860.8 102.0999 13.97 9.82 303.2549 73.3407 19.2 144.4661 116.9689 8747.83 1475.2 1732.3 1389 1516.2 1664 29.54 30.5784 1683.7 1540.9 7273.19 11.7573 3349.36 8.26078 3171.15 3172.19 12.1571 12728401.45 24.71 20.9726 6 7990.47 91735.75 1231916197 951700.17 40.56 489998.75 1300693.93 137579874.56 468006 3.50563 4351920.42 1425621.44 0.602831 182.8 97.8908 2516.09 96.7169 1.2334 36111781.82 2255.422 0.627133 41965286.06 209.24 437.68 109.23 20679.68 228.94 2.97427 1957.77 10587.3 4.52 931.8223 1883.76 1912.74 316.0328 1301.1811 660.2132 57.62 3.93 12172.92 828.01 816.7562 640853365.54 58 822.5 149647.68 13098097 3116.0909 0.457554 1057.74 45.34 0.672115 0.975091 0.247825 394.49 13370.33 34685.85 2077119.76 19.077 1752904 0.973984 0.63063 9767.99 4.91 2527084.87 12418451.71 2.67303 66.08 4569476.86 0.389223 65.49 46.25 172228.34 0.680395 4403267.42 19842969.28 17.079 15403597.24 781 7.064 141.4705 46.21 75.524 13.2331 7.149 139.8241 15.126 66.083 218.91 224.31 256.2 420514 414047 21.3593 46.7815 181.83 515.11 415428 500.118340751 9.3656 106.6518 527.80 15.11 0.53 58.76 1238 1024.8 2528.7 10.373 270.76 57.52 135.56 256.45 58.66 32.1758 31.0727 1059 136.42 30.2494 33.0514 404463 974.985 966.7606 73.24 303.63 75.5 78.85 0.97 183.949 5.4341 42.4798 7.15 80 88.91 9.26 879.6 102.7646 13.68 9.81 301.9192 73.6196 19.6 144.9144 117.2219 8864.78 1479.1 1731.2 1391.7 1513.5 1664.2 29.66 30.7247 1677.6 1543.6 OpenBenchmarking.org
Stress-NG 0.15.04
Test: MMAP
Bogo Ops/s, More Is Better

  smt b       9017.12
  smt a       8360.76
  smt c       7633.16
  smt d       7273.19
  no smt a    4520.48
  no smt b    3591.71
  c           1668.92
  b           1664.75
  a           1663.08

1. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread

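The MMAP figures can be summarized per configuration group with a few lines of arithmetic. The sketch below is illustrative only: it copies the nine Bogo Ops/s values from the Stress-NG MMAP result, and the grouping of results by name prefix (`group_of`) is an assumption made for this example, not something the Phoronix Test Suite provides.

```python
from statistics import mean

# Bogo Ops/s (more is better), copied from the Stress-NG MMAP result.
mmap_bogo_ops = {
    "smt b": 9017.12,
    "smt a": 8360.76,
    "smt c": 7633.16,
    "smt d": 7273.19,
    "no smt a": 4520.48,
    "no smt b": 3591.71,
    "c": 1668.92,
    "b": 1664.75,
    "a": 1663.08,
}


def group_of(result_name: str) -> str:
    """Hypothetical helper: bucket a result identifier by its name prefix."""
    if result_name.startswith("no smt"):
        return "no smt"
    if result_name.startswith("smt"):
        return "smt"
    return "a/b/c"


def group_means(results: dict[str, float]) -> dict[str, float]:
    """Average the values of all results that fall into the same group."""
    grouped: dict[str, list[float]] = {}
    for name, value in results.items():
        grouped.setdefault(group_of(name), []).append(value)
    return {group: mean(values) for group, values in grouped.items()}


means = group_means(mmap_bogo_ops)
smt_vs_no_smt = means["smt"] / means["no smt"]

for group, value in means.items():
    print(f"{group:8s} mean = {value:10.2f} Bogo Ops/s")
print(f"smt / no smt ratio = {smt_vs_no_smt:.3f}")
```

The same dictionary layout works for any other test in this file. For "Fewer Is Better" tests such as the oneDNN timings, a lower ratio indicates the better configuration.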
oneDNN 3.0
Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better

  Result      ms          MIN
  c           3.83017     2.72
  no smt a    6.19257     3.65
  no smt b    7.98861     3.88
  a           8.74576     3.65
  b           9.19570     4.13
  smt d       11.75730    8.17
  smt a       15.42100    10.4
  smt c       18.65940    11.23
  smt b       19.54900    10.67

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN 3.0
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms, Fewer Is Better

  Result      ms         MIN
  c           669.15     661.71
  b           671.65     664.19
  a           674.09     667.01
  no smt b    901.04     864.28
  no smt a    930.26     898.89
  smt c       3011.21    2853.26
  smt b       3034.62    2821.2
  smt a       3187.00    3034.82
  smt d       3349.36    3325.75

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN 3.0
Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
ms, Fewer Is Better

  Result      ms         MIN
  c           1.88686    1.68
  b           1.92340    1.72
  no smt b    1.93126    1.77
  a           2.01581    1.81
  no smt a    2.04003    1.78
  smt b       7.42270    6.44
  smt a       7.60566    6.54
  smt d       8.26078    7.09
  smt c       9.34981    7.73

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN 3.0
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better

  Result      ms         MIN
  b           670.20     663.54
  c           671.23     662.9
  a           671.69     664.46
  no smt b    903.36     874.93
  no smt a    913.20     884.92
  smt c       3153.22    3059.88
  smt d       3171.15    3148.29
  smt b       3249.28    3226.78
  smt a       3269.05    3243.11

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN 3.0
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better

  Result      ms         MIN
  c           667.95     660.93
  b           668.95     662.29
  a           675.92     668.99
  no smt b    939.55     904.78
  no smt a    973.32     937.3
  smt a       3133.65    3037.06
  smt d       3172.19    3155.85
  smt c       3185.64    2949.58
  smt b       3222.93    2927.41

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

oneDNN 3.0
Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better

  Result      ms          MIN
  a           3.11497     2.47
  b           4.27858     3.37
  c           4.39720     3.21
  no smt b    4.79302     3.4
  no smt a    4.87883     3.53
  smt b       7.30147     5.77
  smt a       11.43260    7.59
  smt c       12.01300    8.02
  smt d       12.15710    8.52

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

Stress-NG 0.15.04
Test: Context Switching
Bogo Ops/s, More Is Better

  no smt a    47222624.49
  no smt b    44683906.68
  a           18941003.97
  c           16862185.54
  b           16313126.86
  smt a       12895047.73
  smt d       12728401.45
  smt c       12328933.65
  smt b       12221058.34

1. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread

Stress-NG 0.15.04
Test: NUMA
Bogo Ops/s, More Is Better

  b           498.67
  a           483.90
  c           478.05
  smt a       24.82
  smt c       24.79
  smt b       24.77
  smt d       24.71
  no smt a    20.51
  no smt b    19.78

1. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread

oneDNN 3.0
Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better

  Result      ms          MIN
  b           7.28243     6.58
  a           7.35746     4.85
  c           7.38116     6.77
  no smt a    9.25292     7.79
  no smt b    9.31751     8.07
  smt a       20.26950    17.68
  smt b       20.57200    18.25
  smt d       20.97260    18.21
  smt c       21.03010    17.96

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

OpenVINO 2022.3
Model: Vehicle Detection FP16 - Device: CPU
ms, Fewer Is Better

  Result      ms       MIN     MAX
  smt c       5.99     5.13    31.29
  smt d       6.00     5.21    25.51
  no smt a    6.01     5.2     37.8
  no smt b    6.01     5.02    36.88
  smt a       6.02     5.27    25.12
  smt b       6.03     5.17    38.35
  c           15.10    6.94    60.59
  b           15.54    8.28    57.72
  a           16.30    7.91    51.62

1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl

OpenVINO 2022.3
Model: Vehicle Detection FP16 - Device: CPU
FPS, More Is Better

  smt c       7993.18
  smt d       7990.47
  no smt b    7979.82
  no smt a    7979.05
  smt a       7953.21
  smt b       7949.05
  c           3174.04
  b           3085.53
  a           2941.47

1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl

Stress-NG 0.15.04 - Test: Pthread (Bogo Ops/s, More Is Better)
smt b: 180407.59; c: 109609.15; a: 109397.78; b: 109356.78; smt d: 91735.75; smt c: 77834.39; smt a: 74978.51; no smt a: 68076.03; no smt b: 67451.30
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
RocksDB 7.9.2 - Test: Random Read (Op/s, More Is Better)
smt c: 1234570512; smt d: 1231916197; smt b: 1231168093; smt a: 1225662852; no smt b: 1213540299; no smt a: 1209611055; a: 468231434; c: 468069792; b: 466888888
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Stress-NG 0.15.04 - Test: Matrix Math (Bogo Ops/s, More Is Better)
smt c: 952461.48; smt d: 951700.17; smt b: 946660.88; smt a: 946032.88; no smt a: 932248.18; no smt b: 925984.24; c: 382328.50; a: 382305.69; b: 382304.04
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: CPU Cache (Bogo Ops/s, More Is Better)
c: 97.04; a: 77.51; b: 67.21; no smt b: 55.84; smt a: 47.58; smt b: 44.38; smt c: 42.55; no smt a: 40.94; smt d: 40.56
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: CPU Stress (Bogo Ops/s, More Is Better)
smt c: 490300.48; smt d: 489998.75; smt b: 487987.32; smt a: 487359.39; no smt a: 328297.04; no smt b: 326819.72; c: 217304.19; b: 217072.27; a: 205134.87
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Vector Math (Bogo Ops/s, More Is Better)
smt d: 1300693.93; smt c: 1295970.28; smt b: 1291704.57; smt a: 1291689.33; no smt b: 920642.34; no smt a: 920216.27; a: 556875.59; c: 556833.50; b: 556797.27
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Mutex (Bogo Ops/s, More Is Better)
smt c: 138939958.50; smt d: 137579874.56; smt a: 136339636.91; smt b: 135855921.28; no smt b: 65500297.73; no smt a: 63581395.67; b: 60031543.97; c: 59929401.40; a: 59479783.64
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Crypto (Bogo Ops/s, More Is Better)
smt c: 468159.21; smt d: 468006.00; smt b: 466609.22; smt a: 466292.12; no smt b: 437065.75; no smt a: 435615.35; b: 203147.15; c: 203095.44; a: 203073.15
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
b: 1.56257 (MIN 1.41); a: 1.56461 (MIN 1.41); c: 1.56742 (MIN 1.39); no smt b: 2.12379 (MIN 1.9); no smt a: 2.19645 (MIN 2); smt a: 3.39151 (MIN 2.93); smt b: 3.49453 (MIN 3.11); smt d: 3.50563 (MIN 3.19); smt c: 3.60194 (MIN 3.08)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Stress-NG 0.15.04 - Test: SENDFILE (Bogo Ops/s, More Is Better)
smt d: 4351920.42; smt c: 4351871.12; smt a: 4329963.26; smt b: 4329824.38; no smt a: 3284433.20; no smt b: 3282827.50; a: 1950323.96; b: 1913590.77; c: 1891202.37
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Function Call (Bogo Ops/s, More Is Better)
smt d: 1425621.44; smt c: 1422505.51; smt a: 1414423.65; smt b: 1413555.44; no smt b: 829445.38; no smt a: 829106.63; c: 621041.93; a: 621015.34; b: 621003.92
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
c: 0.262920 (MIN 0.2); b: 0.263373 (MIN 0.2); a: 0.263997 (MIN 0.18); no smt b: 0.293564 (MIN 0.22); no smt a: 0.316665 (MIN 0.24); smt a: 0.591738 (MIN 0.43); smt c: 0.593299 (MIN 0.39); smt b: 0.600199 (MIN 0.38); smt d: 0.602831 (MIN 0.48)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Stress-NG 0.15.04 - Test: Atomic (Bogo Ops/s, More Is Better)
no smt a: 400.03; no smt b: 395.95; b: 223.29; smt b: 186.64; smt a: 184.34; smt c: 184.24; c: 183.33; smt d: 182.80; a: 174.72
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
smt d: 97.89; no smt a: 97.40; smt b: 97.35; no smt b: 97.34; smt a: 97.29; smt c: 96.91; a: 42.96; c: 42.92; b: 42.83
Stress-NG 0.15.04 - Test: Glibc Qsort Data Sorting (Bogo Ops/s, More Is Better)
smt a: 2564.48; smt b: 2520.38; smt c: 2517.26; smt d: 2516.09; no smt a: 1978.80; no smt b: 1963.75; b: 1132.35; c: 1125.86; a: 1122.39
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
no smt a: 97.44; no smt b: 97.38; smt a: 97.02; smt c: 97.00; smt b: 96.91; smt d: 96.72; a: 42.99; b: 42.90; c: 42.74
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
b: 0.551966 (MIN 0.49); a: 0.555277 (MIN 0.53); c: 0.556851 (MIN 0.53); no smt a: 0.664359 (MIN 0.56); no smt b: 0.670574 (MIN 0.55); smt b: 1.181350 (MIN 1.08); smt a: 1.200490 (MIN 1.04); smt c: 1.201460 (MIN 1.08); smt d: 1.233400 (MIN 1.08)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Stress-NG 0.15.04 - Test: Glibc C String Functions (Bogo Ops/s, More Is Better)
smt d: 36111781.82; smt b: 35996976.13; smt a: 35687958.37; smt c: 34602827.77; no smt b: 28009942.56; no smt a: 26755146.69; c: 16537965.14; a: 16257100.25; b: 16168655.17
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
smt d: 2255.42; smt c: 2253.14; smt b: 2245.80; smt a: 2230.31; no smt a: 2204.75; no smt b: 2203.16; c: 1013.74; b: 1012.35; a: 1009.85
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
c: 0.285432 (MIN 0.24); a: 0.290200 (MIN 0.25); b: 0.291387 (MIN 0.23); no smt a: 0.316779 (MIN 0.25); no smt b: 0.340498 (MIN 0.28); smt b: 0.620907 (MIN 0.41); smt a: 0.621776 (MIN 0.45); smt d: 0.627133 (MIN 0.4); smt c: 0.634992 (MIN 0.55)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Stress-NG 0.15.04 - Test: Hash (Bogo Ops/s, More Is Better)
smt b: 41989522.68; smt c: 41972765.93; smt a: 41966139.67; smt d: 41965286.06; no smt b: 27422305.53; no smt a: 27408413.10; c: 18961773.18; b: 18955118.10; a: 18954936.00
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
smt c: 210.19; no smt b: 210.14; smt d: 209.24; no smt a: 208.82; smt b: 208.19; smt a: 208.13; a: 95.76; c: 95.59; b: 95.48
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU (ms, Fewer Is Better)
smt c: 437.51 (MIN 400.05 / MAX 473.91); smt d: 437.68 (MIN 394.89 / MAX 478.58); no smt a: 438.29 (MIN 416.93 / MAX 496.86); no smt b: 438.99 (MIN 427.57 / MAX 484.31); smt a: 439.41 (MIN 410.81 / MAX 465.4); smt b: 439.50 (MIN 424.22 / MAX 477.66); a: 962.10 (MIN 879.24 / MAX 1018.81); c: 962.15 (MIN 888.7 / MAX 1017.92); b: 962.63 (MIN 893.43 / MAX 1015.71)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU (FPS, More Is Better)
smt c: 109.36; smt d: 109.23; no smt a: 109.08; no smt b: 109.04; smt a: 108.93; smt b: 108.89; c: 49.77; b: 49.72; a: 49.72
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
no smt b: 20849.75; no smt a: 20836.10; smt c: 20684.29; smt d: 20679.68; smt b: 20610.62; smt a: 20610.23; b: 9496.95; a: 9494.22; c: 9485.65
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
smt c: 227.93 (MIN 210.46 / MAX 252.96); no smt b: 228.01 (MIN 212.99 / MAX 267.17); smt d: 228.94 (MIN 214.92 / MAX 248.29); no smt a: 229.46 (MIN 217.93 / MAX 265.81); smt b: 230.06 (MIN 210.58 / MAX 251.42); smt a: 230.21 (MIN 211.56 / MAX 253.58); a: 499.32 (MIN 264.54 / MAX 537.26); c: 500.30 (MIN 410.04 / MAX 531.93); b: 500.83 (MIN 418.76 / MAX 546.62)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
c: 1.63860 (MIN 1.2); b: 1.72975 (MIN 1.27); a: 1.87777 (MIN 1.23); no smt a: 2.02286 (MIN 1.56); no smt b: 2.02833 (MIN 1.43); smt d: 2.97427 (MIN 2.3); smt c: 3.36535 (MIN 2.7); smt a: 3.38833 (MIN 2.76); smt b: 3.59643 (MIN 2.7)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
b: 903.27 (MIN 894.76); c: 906.14 (MIN 898.41); a: 907.49 (MIN 897.95); no smt b: 1124.75 (MIN 1091.36); no smt a: 1148.26 (MIN 1108.42); smt c: 1888.21 (MIN 1860.29); smt b: 1890.28 (MIN 1864.29); smt a: 1912.16 (MIN 1891.03); smt d: 1957.77 (MIN 1929.47)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU (FPS, More Is Better)
smt c: 10603.84; smt d: 10587.30; no smt a: 10556.93; no smt b: 10549.24; smt b: 10549.15; smt a: 10544.03; b: 4915.42; a: 4914.17; c: 4910.13
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better)
smt c: 4.52 (MIN 4.12 / MAX 33.17); smt d: 4.52 (MIN 4.03 / MAX 45.52); no smt a: 4.54 (MIN 4.12 / MAX 27.91); no smt b: 4.54 (MIN 4.09 / MAX 55.78); smt a: 4.54 (MIN 4.07 / MAX 35.16); smt b: 4.54 (MIN 4.11 / MAX 42.14); b: 9.75 (MIN 5.25 / MAX 35.53); a: 9.76 (MIN 5.03 / MAX 28.1); c: 9.76 (MIN 4.98 / MAX 36.12)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
no smt a: 941.24; no smt b: 938.91; smt c: 937.99; smt d: 931.82; smt a: 929.02; smt b: 927.68; a: 440.85; b: 439.04; c: 438.44
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
a: 911.25 (MIN 903.01); b: 912.11 (MIN 902.43); c: 915.08 (MIN 906.26); no smt b: 1119.17 (MIN 1086.08); no smt a: 1151.25 (MIN 1070.35); smt a: 1830.41 (MIN 1807.97); smt d: 1883.76 (MIN 1850.75); smt c: 1926.15 (MIN 1901.09); smt b: 1937.65 (MIN 1908.65)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
a: 909.69 (MIN 901.1); c: 911.76 (MIN 901.88); b: 917.08 (MIN 909.14); no smt b: 1139.63 (MIN 1100.83); no smt a: 1147.34 (MIN 1109.85); smt a: 1888.10 (MIN 1865.76); smt b: 1894.07 (MIN 1870.66); smt d: 1912.74 (MIN 1890.93); smt c: 1925.68 (MIN 1901.2)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
smt d: 316.03; smt c: 315.33; smt a: 314.62; smt b: 314.25; no smt a: 308.43; no smt b: 308.30; b: 149.82; a: 149.74; c: 149.69
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
smt a: 1306.67; smt b: 1306.58; smt c: 1305.71; smt d: 1301.18; no smt b: 1292.80; no smt a: 1291.97; c: 625.55; b: 619.94; a: 619.75
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
smt c: 662.26; smt d: 660.21; smt b: 658.34; smt a: 658.21; no smt a: 649.30; no smt b: 647.76; c: 316.63; b: 316.29; a: 314.84
OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU (FPS, More Is Better)
smt c: 57.71; smt d: 57.62; smt b: 57.31; smt a: 57.29; no smt b: 57.19; no smt a: 57.18; b: 28.23; a: 27.97; c: 27.89
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better)
smt c: 3.93 (MIN 3.61 / MAX 42.82); smt d: 3.93 (MIN 3.61 / MAX 23.62); no smt a: 3.95 (MIN 3.61 / MAX 38); no smt b: 3.95 (MIN 3.68 / MAX 42.53); smt a: 3.95 (MIN 3.61 / MAX 34.38); smt b: 3.96 (MIN 3.66 / MAX 32.83); a: 8.11 (MIN 5.39 / MAX 69.87); b: 8.13 (MIN 5.35 / MAX 55.84); c: 8.13 (MIN 3.83 / MAX 59.87)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better)
smt d: 12172.92; smt c: 12168.05; no smt b: 12128.38; no smt a: 12119.66; smt a: 12117.16; smt b: 12109.64; a: 5908.89; c: 5898.79; b: 5894.27
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU (ms, Fewer Is Better)
smt c: 826.67 (MIN 724.1 / MAX 1006.75); smt d: 828.01 (MIN 716.19 / MAX 1018.34); smt b: 832.70 (MIN 723.65 / MAX 1031.69); smt a: 833.15 (MIN 725.84 / MAX 1017.38); no smt a: 833.99 (MIN 723.91 / MAX 1011.94); no smt b: 834.17 (MIN 732.01 / MAX 1006.46); b: 1685.26 (MIN 891.16 / MAX 1979.37); a: 1701.70 (MIN 1395.71 / MAX 2063.97); c: 1704.02 (MIN 828.99 / MAX 1969.02)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
smt b: 822.16; smt c: 818.60; smt a: 817.75; no smt b: 817.63; smt d: 816.76; no smt a: 813.98; a: 401.78; b: 401.21; c: 400.04
Stress-NG 0.15.04 - Test: Malloc (Bogo Ops/s, More Is Better)
smt d: 640853365.54; smt c: 639757070.52; smt b: 634995429.66; smt a: 634718750.41; no smt a: 456657338.84; no smt b: 456651508.04; b: 314418461.02; c: 313768771.26; a: 312709034.05
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU (FPS, More Is Better)
smt c: 58.05; smt d: 58.00; smt a: 57.72; smt b: 57.67; no smt a: 57.58; no smt b: 57.49; a: 28.47; b: 28.42; c: 28.33
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better)
smt c: 821.93 (MIN 725.25 / MAX 1010.56); smt d: 822.50 (MIN 717.65 / MAX 997.43); smt b: 827.03 (MIN 723.77 / MAX 1003.25); smt a: 827.07 (MIN 724.84 / MAX 1037.91); no smt a: 828.67 (MIN 730.29 / MAX 1036.1); no smt b: 830.04 (MIN 722.6 / MAX 1015.67); a: 1671.46 (MIN 924.15 / MAX 1977.46); b: 1675.96 (MIN 1231.52 / MAX 1967.75); c: 1679.03 (MIN 865.58 / MAX 1995.12)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better)
smt d: 149647.68; smt a: 148316.88; smt c: 148047.13; smt b: 147582.87; no smt a: 127833.52; no smt b: 127770.39; a: 74495.17; c: 74486.08; b: 74353.41
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
RocksDB 7.9.2 - Test: Read While Writing (Op/s, More Is Better)
smt a: 15317250; smt c: 13948914; smt b: 13830689; smt d: 13098097; a: 9296185; b: 8620352; c: 8316379; no smt b: 7913568; no smt a: 7643831
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better)
smt c: 3132.38; smt d: 3116.09; smt a: 3114.68; smt b: 3114.43; no smt a: 3107.30; no smt b: 3086.36; b: 1573.75; c: 1571.83; a: 1566.56
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
no smt b: 0.291229 (MIN 0.27); no smt a: 0.291687 (MIN 0.27); a: 0.415446 (MIN 0.4); c: 0.417188 (MIN 0.4); smt c: 0.447820 (MIN 0.37); smt b: 0.450261 (MIN 0.37); smt a: 0.451472 (MIN 0.4); smt d: 0.457554 (MIN 0.34); b: 0.578466 (MIN 0.4)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better)
smt c: 1058.57; smt d: 1057.74; smt b: 1050.37; smt a: 1049.16; no smt b: 1048.74; no smt a: 1045.81; b: 541.03; a: 540.41; c: 535.78
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better)
smt c: 45.31 (MIN 39.35 / MAX 73.19); smt d: 45.34 (MIN 39.4 / MAX 73.83); smt b: 45.66 (MIN 38.82 / MAX 75.65); smt a: 45.71 (MIN 39.78 / MAX 73.33); no smt b: 45.73 (MIN 39.68 / MAX 86.83); no smt a: 45.86 (MIN 39.29 / MAX 91.13); b: 88.61 (MIN 47.57 / MAX 124.67); a: 88.71 (MIN 44.2 / MAX 132.86); c: 89.46 (MIN 42.16 / MAX 123.4)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
no smt a: 0.342836 (MIN 0.28); no smt b: 0.347231 (MIN 0.3); c: 0.353175 (MIN 0.31); b: 0.356875 (MIN 0.31); a: 0.357017 (MIN 0.31); smt d: 0.672115 (MIN 0.49); smt a: 0.674455 (MIN 0.49); smt b: 0.675664 (MIN 0.55); smt c: 0.676124 (MIN 0.49)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
b: 0.513068 (MIN 0.47); a: 0.534450 (MIN 0.49); c: 0.546991 (MIN 0.5); no smt a: 0.556069 (MIN 0.46); no smt b: 0.629026 (MIN 0.51); smt b: 0.971704 (MIN 0.82); smt d: 0.975091 (MIN 0.91); smt a: 0.975357 (MIN 0.92); smt c: 0.995710 (MIN 0.87)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
no smt a: 0.160361 (MIN 0.15); no smt b: 0.164845 (MIN 0.15); smt d: 0.247825 (MIN 0.23); smt c: 0.275917 (MIN 0.23); smt a: 0.279638 (MIN 0.23); smt b: 0.291087 (MIN 0.23); a: 0.304972 (MIN 0.28); c: 0.305599 (MIN 0.28); b: 0.305731 (MIN 0.28)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Stress-NG 0.15.04 - Test: MEMFD (Bogo Ops/s, More Is Better)
smt b: 564.05; a: 518.46; c: 507.97; b: 507.74; smt a: 464.70; smt c: 413.34; smt d: 394.49; no smt b: 308.67; no smt a: 303.56
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Memory Copying (Bogo Ops/s, More Is Better)
b: 20340.40; c: 20297.91; a: 20106.51; smt a: 15914.10; smt b: 15430.18; smt c: 15077.69; smt d: 13370.33; no smt a: 11342.69; no smt b: 10949.90
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Forking (Bogo Ops/s, More Is Better)
c: 64299.62; b: 58664.97; a: 58156.36; no smt b: 45685.28; no smt a: 43094.96; smt a: 36266.02; smt b: 36020.16; smt c: 34917.39; smt d: 34685.85
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Futex (Bogo Ops/s, More Is Better)
no smt b: 3802292.60; no smt a: 3746361.52; b: 2805836.52; a: 2794694.37; c: 2794473.75; smt b: 2396325.93; smt a: 2333781.39; smt d: 2077119.76; smt c: 2067499.58
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
GROMACS 2023 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
no smt a: 19.18; smt d: 19.08; smt c: 19.05; smt b: 19.02; no smt b: 18.84; smt a: 18.82; b: 10.61; c: 10.59; a: 10.57
(CXX) g++ options: -O3
RocksDB 7.9.2 - Test: Read Random Write Random (Op/s, More Is Better)
a: 2926458; c: 2910023; b: 2891962; no smt a: 2079804; no smt b: 2063831; smt a: 1787682; smt b: 1761416; smt d: 1752904; smt c: 1613913
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
no smt b: 0.651301 (MIN 0.62); no smt a: 0.664295 (MIN 0.63); smt c: 0.972863 (MIN 0.92); smt b: 0.973555 (MIN 0.92); smt d: 0.973984 (MIN 0.92); smt a: 0.978590 (MIN 0.93); b: 1.166960 (MIN 1.07); c: 1.169190 (MIN 1.07); a: 1.170040 (MIN 1.07)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
no smt b: 0.400325 (MIN 0.38); no smt a: 0.403590 (MIN 0.38); smt a: 0.538539 (MIN 0.48); smt c: 0.545118 (MIN 0.45); smt b: 0.551783 (MIN 0.49); smt d: 0.630630 (MIN 0.48); b: 0.689064 (MIN 0.64); c: 0.692667 (MIN 0.65); a: 0.693367 (MIN 0.64)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better)
no smt a: 9784.09; smt c: 9773.26; smt d: 9767.99; no smt b: 9758.74; smt a: 9746.08; smt b: 9680.44; a: 5750.74; c: 5724.81; b: 5720.90
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better)
no smt a: 4.90 (MIN 4.46 / MAX 56.52); smt c: 4.90 (MIN 4.5 / MAX 29.95); no smt b: 4.91 (MIN 4.49 / MAX 34.72); smt d: 4.91 (MIN 4.51 / MAX 27.27); smt a: 4.92 (MIN 4.52 / MAX 34.8); smt b: 4.95 (MIN 4.54 / MAX 24.23); a: 8.34 (MIN 6.61 / MAX 55.54); c: 8.37 (MIN 6.83 / MAX 50.24); b: 8.38 (MIN 6.4 / MAX 32.33)
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
Memcached 1.6.18 - Set To Get Ratio: 1:5 (Ops/sec, More Is Better)
b: 4162812.47; c: 4155273.46; a: 4143044.21; smt d: 2527084.87; smt a: 2520253.00; smt c: 2517750.47; smt b: 2507331.29; no smt b: 2460416.33; no smt a: 2444886.82
(CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Stress-NG 0.15.04 - Test: System V Message Passing (Bogo Ops/s, More Is Better)
smt d: 12418451.71; c: 10475486.39; a: 10473084.09; b: 10471889.97; smt a: 10103952.13; smt c: 8609357.68; smt b: 8586858.51; no smt a: 7402514.74; no smt b: 7372780.72
(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
no smt a: 1.65041 (MIN 1.41); no smt b: 1.70057 (MIN 1.5); c: 1.82264 (MIN 1.71); a: 1.83147 (MIN 1.73); b: 1.84172 (MIN 1.76); smt d: 2.67303 (MIN 2.09); smt c: 2.70235 (MIN 2.27); smt a: 2.70807 (MIN 2.09); smt b: 2.76140 (MIN 2.46)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better)
smt b: 66.22; smt d: 66.08; smt c: 65.49; smt a: 64.63; no smt b: 47.43; no smt a: 47.13; b: 40.86; c: 40.76; a: 40.63
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Memcached 1.6.18 - Set To Get Ratio: 1:10 (Ops/sec, More Is Better)
a: 6861088.89; c: 6839975.87; b: 6792746.34; smt d: 4569476.86; smt c: 4549313.42; smt a: 4530948.62; smt b: 4392934.75; no smt a: 4244908.83; no smt b: 4220522.49
(CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
no smt a: 0.254845 (MIN 0.18); no smt b: 0.255796 (MIN 0.18); a: 0.311405 (MIN 0.3); c: 0.312289 (MIN 0.28); b: 0.312721 (MIN 0.28); smt d: 0.389223 (MIN 0.29); smt c: 0.394377 (MIN 0.29); smt a: 0.407430 (MIN 0.27); smt b: 0.408074 (MIN 0.27)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better)
smt b: 65.69; smt c: 65.59; smt a: 65.56; smt d: 65.49; no smt a: 47.97; no smt b: 47.93; c: 41.47; a: 41.40; b: 41.39
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better)
smt c: 46.34; smt d: 46.25; smt b: 46.25; smt a: 45.56; no smt b: 34.68; no smt a: 34.59; c: 29.41; a: 29.37; b: 29.29
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better)
smt b: 175754.35; smt a: 173926.92; smt c: 173620.31; smt d: 172228.34; no smt b: 162294.47; no smt a: 160545.41; a: 112346.36; c: 112186.25; b: 111378.39
(CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
no smt b: 0.461174 (MIN 0.43); no smt a: 0.462386 (MIN 0.42); smt a: 0.672585 (MIN 0.52); smt b: 0.672768 (MIN 0.53); smt c: 0.676798 (MIN 0.53); smt d: 0.680395 (MIN 0.53); c: 0.708563 (MIN 0.67); a: 0.711588 (MIN 0.68); b: 0.712495 (MIN 0.68)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Memcached 1.6.18 - Set To Get Ratio: 1:100 (Ops/sec, More Is Better): a: 4878860.00; b: 4876951.36; c: 4852421.67; smt d: 4403267.42; smt c: 4394194.49; smt a: 4383314.65; smt b: 4357453.76; no smt b: 3216133.91; no smt a: 3192061.68. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Stress-NG 0.15.04 - Test: Semaphores (Bogo Ops/s, More Is Better): smt a: 20047519.05; smt b: 19927313.52; smt c: 19866440.86; smt d: 19842969.28; a: 18128283.29; b: 18100474.36; c: 18088584.67; no smt b: 13192391.85; no smt a: 13141129.68. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Timed Linux Kernel Compilation 6.1 - Build: defconfig (Seconds, Fewer Is Better): smt b: 17.02; smt c: 17.05; smt d: 17.08; smt a: 17.38; no smt b: 17.59; no smt a: 17.61; c: 25.68; b: 25.75; a: 25.77.
Stress-NG 0.15.04 - Test: Poll (Bogo Ops/s, More Is Better): smt d: 15403597.24; smt a: 15359471.73; smt b: 15341564.16; smt c: 15228320.07; c: 12676101.41; b: 12661687.46; a: 12653709.64; no smt a: 10458275.24; no smt b: 10393402.01. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
OpenVKL 1.3.1 - Benchmark: vklBenchmark Scalar (Items / Sec, More Is Better): smt b: 810 (MIN: 138 / MAX: 3650); smt c: 793 (MIN: 139 / MAX: 3583); smt d: 781 (MIN: 139 / MAX: 3776); smt a: 764 (MIN: 139 / MAX: 3808); no smt b: 652 (MIN: 102 / MAX: 3990); no smt a: 647 (MIN: 101 / MAX: 4024); b: 557 (MIN: 61 / MAX: 6019); a: 556 (MIN: 61 / MAX: 5994); c: 549 (MIN: 60 / MAX: 5475).
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): b: 5.1633; c: 5.1696; a: 5.1812; no smt b: 6.4174; smt b: 6.8844; smt d: 7.0640; smt c: 7.3550; smt a: 7.5225; no smt a: 7.5856.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (items/sec, More Is Better): b: 193.53; c: 193.31; a: 192.88; no smt b: 155.74; smt b: 145.17; smt d: 141.47; smt c: 135.89; smt a: 132.87; no smt a: 131.76.
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better): smt a: 47.56; smt b: 46.62; smt c: 46.57; smt d: 46.21; no smt b: 38.65; no smt a: 38.38; b: 33.13; c: 33.10; a: 33.03.
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a: 99.15; b: 98.62; c: 98.59; smt a: 77.88; smt b: 77.41; smt d: 75.52; no smt a: 72.13; no smt b: 71.89; smt c: 70.14.
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a: 10.08; b: 10.14; c: 10.14; smt a: 12.83; smt b: 12.91; smt d: 13.23; no smt a: 13.86; no smt b: 13.90; smt c: 14.25.
Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a: 5.3018; b: 5.4980; c: 5.5007; no smt a: 6.0118; no smt b: 6.0650; smt c: 6.1057; smt d: 7.1490; smt a: 7.1865; smt b: 7.3054.
Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a: 188.54; b: 181.81; c: 181.72; no smt a: 166.27; no smt b: 164.81; smt c: 163.71; smt d: 139.82; smt a: 139.10; smt b: 136.83.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): c: 11.16; b: 11.19; a: 11.36; smt a: 13.64; smt c: 14.20; no smt b: 14.42; no smt a: 14.46; smt b: 14.64; smt d: 15.13.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream (items/sec, More Is Better): c: 89.55; b: 89.34; a: 87.96; smt a: 73.29; smt c: 70.38; no smt b: 69.33; no smt a: 69.11; smt b: 68.29; smt d: 66.08.
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Super Fast (Frames Per Second, More Is Better): b: 239.92; a: 238.68; c: 237.96; smt b: 220.79; smt d: 218.91; no smt b: 216.15; no smt a: 181.67; smt c: 178.78; smt a: 178.49.
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, More Is Better): a: 240.98; b: 240.91; c: 238.77; smt d: 224.31; smt c: 218.28; smt a: 216.15; no smt b: 209.85; no smt a: 186.24; smt b: 179.56.
Zstd Compression 1.5.4 - Compression Level: 12 - Compression Speed (MB/s, More Is Better): b: 332.0; a: 330.8; c: 330.1; no smt a: 279.9; no smt b: 278.2; smt b: 259.1; smt d: 256.2; smt a: 254.1; smt c: 249.8. (CC) gcc options: -O3 -pthread -lz
RocksDB 7.9.2 - Test: Update Random (Op/s, More Is Better): b: 545556; a: 544384; c: 543572; no smt a: 462018; no smt b: 452228; smt d: 420514; smt c: 419985; smt a: 413199; smt b: 411513. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
RocksDB 7.9.2 - Test: Sequential Fill (Op/s, More Is Better): b: 545396; c: 544565; a: 542256; no smt a: 465700; no smt b: 464044; smt c: 414282; smt a: 414168; smt d: 414047; smt b: 413708. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): c: 16.52; a: 16.58; b: 16.67; no smt b: 20.42; smt c: 20.49; no smt a: 20.74; smt a: 21.03; smt d: 21.36; smt b: 21.74.
Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream (items/sec, More Is Better): c: 60.49; a: 60.25; b: 59.92; no smt b: 48.93; smt c: 48.77; no smt a: 48.18; smt a: 47.51; smt d: 46.78; smt b: 45.96.
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, More Is Better): c: 237.44; a: 234.95; b: 234.68; smt b: 209.55; no smt a: 196.25; smt c: 192.25; smt a: 183.99; no smt b: 183.75; smt d: 181.83.
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, Second Run (Queries Per Minute, Geo Mean, More Is Better): no smt a: 666.43 (MIN: 85.11 / MAX: 6666.67); no smt b: 649.52 (MIN: 86.46 / MAX: 6000); b: 627.20 (MIN: 58.54 / MAX: 6000); a: 625.91 (MIN: 56.13 / MAX: 7500); c: 621.88 (MIN: 58.54 / MAX: 6666.67); smt b: 527.88 (MIN: 90.63 / MAX: 6666.67); smt c: 524.75 (MIN: 90.23 / MAX: 6000); smt a: 524.04 (MIN: 90.5 / MAX: 6666.67); smt d: 515.11 (MIN: 88.5 / MAX: 6000).
RocksDB 7.9.2 - Test: Random Fill (Op/s, More Is Better): c: 536551; b: 534681; a: 533927; no smt a: 478629; no smt b: 468210; smt a: 423027; smt b: 417882; smt c: 416006; smt d: 415428. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, First Run / Cold Cache (Queries Per Minute, Geo Mean, More Is Better): no smt a: 635.48 (MIN: 85.35 / MAX: 6000); no smt b: 622.73 (MIN: 83.22 / MAX: 5454.55); c: 614.24 (MIN: 57.14 / MAX: 6666.67); b: 610.79 (MIN: 58.14 / MAX: 6666.67); a: 600.70 (MIN: 57.75 / MAX: 6000); smt b: 516.09 (MIN: 89.55 / MAX: 6000); smt a: 500.13 (MIN: 65.15 / MAX: 6000); smt d: 500.12 (MIN: 69.61 / MAX: 6000); smt c: 495.85 (MIN: 51.06 / MAX: 6000).
Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): smt a: 8.1914; smt b: 8.3506; c: 8.8004; smt d: 9.3656; b: 9.5793; a: 9.5962; no smt a: 9.8899; no smt b: 9.9395; smt c: 10.3852.
Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream (items/sec, More Is Better): smt a: 121.92; smt b: 119.60; c: 113.51; smt d: 106.65; b: 104.29; a: 104.12; no smt a: 101.01; no smt b: 100.51; smt c: 96.17.
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, Third Run (Queries Per Minute, Geo Mean, More Is Better): no smt a: 665.00 (MIN: 85.84 / MAX: 6000); no smt b: 662.01 (MIN: 86.33 / MAX: 6000); c: 636.18 (MIN: 58.03 / MAX: 6666.67); b: 628.37 (MIN: 58.37 / MAX: 6666.67); a: 625.16 (MIN: 57.69 / MAX: 6000); smt b: 538.63 (MIN: 89.82 / MAX: 6000); smt a: 534.82 (MIN: 91.32 / MAX: 6000); smt c: 530.47 (MIN: 92.02 / MAX: 5454.55); smt d: 527.80 (MIN: 81.52 / MAX: 6000).
VP9 libvpx Encoding 1.13 - Speed: Speed 5 - Input: Bosphorus 4K (Frames Per Second, More Is Better): c: 17.46; b: 17.37; a: 17.36; smt c: 15.38; smt d: 15.11; no smt b: 14.50; no smt a: 14.48; smt b: 14.27; smt a: 13.90. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better): smt a: 0.53 (MIN: 0.5 / MAX: 24.2); smt b: 0.53 (MIN: 0.5 / MAX: 9.61); smt c: 0.53 (MIN: 0.5 / MAX: 9.15); smt d: 0.53 (MIN: 0.5 / MAX: 8.86); no smt a: 0.57 (MIN: 0.5 / MAX: 13.02); no smt b: 0.58 (MIN: 0.5 / MAX: 12.8); a: 0.65 (MIN: 0.32 / MAX: 20.68); b: 0.66 (MIN: 0.32 / MAX: 19.99); c: 0.66 (MIN: 0.31 / MAX: 22.67). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better): c: 71.13; b: 70.68; a: 70.56; smt d: 58.76; smt a: 57.96; no smt a: 57.78; smt b: 57.51; smt c: 57.33; no smt b: 57.23.
OpenVKL 1.3.1 - Benchmark: vklBenchmark ISPC (Items / Sec, More Is Better): smt a: 1318 (MIN: 391 / MAX: 3855); smt b: 1243 (MIN: 390 / MAX: 3501); smt d: 1238 (MIN: 392 / MAX: 3738); smt c: 1235 (MIN: 392 / MAX: 4332); no smt b: 1108 (MIN: 297 / MAX: 4208); no smt a: 1098 (MIN: 297 / MAX: 4284); a: 1089 (MIN: 179 / MAX: 7194); b: 1075 (MIN: 179 / MAX: 6442); c: 1066 (MIN: 180 / MAX: 6427).
Zstd Compression 1.5.4 - Compression Level: 8 - Compression Speed (MB/s, More Is Better): c: 1241.1; b: 1239.3; no smt b: 1234.2; a: 1233.8; no smt a: 1227.5; smt b: 1122.4; smt d: 1024.8; smt a: 1023.9; smt c: 1005.8. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 3 - Compression Speed (MB/s, More Is Better): b: 3095.0; c: 3049.4; a: 3033.9; no smt b: 2865.9; no smt a: 2804.2; smt a: 2795.1; smt c: 2604.6; smt d: 2528.7; smt b: 2513.4. (CC) gcc options: -O3 -pthread -lz
Timed FFmpeg Compilation 6.0 - Time To Compile (Seconds, Fewer Is Better): smt b: 10.26; smt d: 10.37; smt a: 10.38; smt c: 10.41; no smt a: 10.60; no smt b: 10.66; b: 12.43; c: 12.47; a: 12.57.
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, More Is Better): a: 296.93; c: 291.03; b: 290.59; smt b: 274.63; smt d: 270.76; no smt a: 269.62; no smt b: 268.77; smt a: 250.71; smt c: 243.36. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better): c: 69.04; a: 68.82; b: 68.79; smt a: 59.30; no smt b: 58.09; smt b: 57.86; smt d: 57.52; smt c: 57.44; no smt a: 57.13.
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Slow (Frames Per Second, More Is Better): no smt b: 159.79; no smt a: 155.56; c: 140.10; a: 139.92; b: 139.07; smt b: 136.33; smt c: 135.80; smt d: 135.56; smt a: 132.67. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Super Fast (Frames Per Second, More Is Better): a: 307.49; b: 303.99; c: 301.40; smt a: 296.72; no smt a: 288.41; smt c: 288.09; no smt b: 280.18; smt b: 267.04; smt d: 256.45. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Super Fast (Frames Per Second, More Is Better): c: 69.92; a: 69.33; b: 69.00; no smt a: 59.83; smt b: 59.20; smt a: 59.06; smt d: 58.66; smt c: 58.34; no smt b: 58.33.
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): c: 28.57; a: 28.62; b: 28.71; no smt a: 31.92; smt d: 32.18; no smt b: 32.18; smt b: 32.24; smt a: 32.30; smt c: 33.99.
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (items/sec, More Is Better): c: 34.99; a: 34.93; b: 34.82; no smt a: 31.32; smt d: 31.07; no smt b: 31.07; smt b: 31.01; smt a: 30.96; smt c: 29.42.
Zstd Compression 1.5.4 - Compression Level: 3, Long Mode - Compression Speed (MB/s, More Is Better): smt d: 1059.0; smt a: 1051.4; smt c: 1046.6; smt b: 1038.9; no smt b: 1032.3; no smt a: 955.6; c: 916.9; b: 909.4; a: 892.7. (CC) gcc options: -O3 -pthread -lz
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, More Is Better): no smt b: 161.40; no smt a: 159.63; b: 144.21; c: 143.81; a: 143.31; smt c: 140.89; smt a: 140.78; smt b: 138.33; smt d: 136.42. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (items/sec, More Is Better): c: 35.12; b: 35.10; a: 34.83; smt c: 31.93; no smt a: 31.91; smt b: 31.78; smt a: 31.76; smt d: 30.25; no smt b: 29.75.
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): c: 28.47; b: 28.49; a: 28.71; smt c: 31.31; no smt a: 31.33; smt b: 31.46; smt a: 31.48; smt d: 33.05; no smt b: 33.60.
RocksDB 7.9.2 - Test: Random Fill Sync (Op/s, More Is Better): smt d: 404463; smt b: 401295; smt c: 394852; smt a: 388052; a: 376601; b: 373002; c: 356922; no smt a: 350673; no smt b: 344040. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): no smt a: 955.16; no smt b: 955.24; smt c: 971.90; smt a: 972.52; smt d: 974.99; smt b: 975.50; a: 1115.76; b: 1118.44; c: 1122.06.
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): no smt b: 955.34; no smt a: 956.01; smt d: 966.76; smt b: 969.21; smt a: 970.22; smt c: 971.67; b: 1116.29; a: 1116.43; c: 1117.73.
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better): b: 81.61; a: 80.90; c: 80.31; smt a: 75.91; smt b: 74.11; smt c: 73.57; smt d: 73.24; no smt a: 70.02; no smt b: 69.94. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, More Is Better): a: 310.73; c: 309.64; b: 305.52; smt d: 303.63; smt c: 303.43; smt a: 302.55; smt b: 295.18; no smt a: 278.62; no smt b: 271.45. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Super Fast (Frames Per Second, More Is Better): c: 84.55; a: 81.90; b: 80.68; smt c: 78.72; smt b: 76.70; smt a: 76.64; smt d: 75.50; no smt a: 75.50; no smt b: 74.03. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better): b: 84.77; a: 83.45; c: 82.70; smt d: 78.85; smt a: 77.96; smt b: 77.00; smt c: 76.14; no smt a: 76.05; no smt b: 74.37. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better): smt c: 0.96 (MIN: 0.87 / MAX: 9.72); smt a: 0.97 (MIN: 0.87 / MAX: 9.78); smt b: 0.97 (MIN: 0.87 / MAX: 10.96); smt d: 0.97 (MIN: 0.86 / MAX: 13.31); no smt a: 1.01 (MIN: 0.86 / MAX: 8.97); no smt b: 1.01 (MIN: 0.86 / MAX: 8.72); a: 1.09 (MIN: 0.49 / MAX: 19.01); b: 1.09 (MIN: 0.48 / MAX: 25.82); c: 1.09 (MIN: 0.49 / MAX: 22.06). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream (items/sec, More Is Better): c: 197.66; a: 197.10; b: 196.54; no smt a: 191.26; smt c: 189.41; smt a: 188.58; smt b: 188.41; smt d: 183.95; no smt b: 175.32.
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): c: 5.0573; a: 5.0715; b: 5.0859; no smt a: 5.2262; smt c: 5.2773; smt a: 5.3004; smt b: 5.3053; smt d: 5.4341; no smt b: 5.7015.
Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): smt d: 42.48; smt c: 42.55; smt b: 42.65; no smt a: 42.81; no smt b: 42.84; smt a: 42.95; c: 47.29; b: 47.35; a: 47.46.
VP9 libvpx Encoding 1.13 - Speed: Speed 0 - Input: Bosphorus 4K (Frames Per Second, More Is Better): b: 7.71; c: 7.70; a: 7.68; smt a: 7.24; no smt b: 7.16; smt d: 7.15; no smt a: 7.04; smt b: 6.97; smt c: 6.92. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Slow (Frames Per Second, More Is Better): no smt b: 88.13; no smt a: 87.15; b: 81.41; a: 81.16; c: 81.10; smt a: 80.17; smt d: 80.00; smt b: 79.95; smt c: 79.63.
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, More Is Better): no smt a: 98.21; no smt b: 96.67; b: 91.47; a: 91.37; c: 91.29; smt b: 89.70; smt a: 89.40; smt c: 89.16; smt d: 88.91.
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): no smt a: 9.18 (MIN: 8.17 / MAX: 31.1); no smt b: 9.18 (MIN: 8.21 / MAX: 24.31); smt c: 9.25 (MIN: 8.12 / MAX: 19.04); smt d: 9.26 (MIN: 8.12 / MAX: 30.29); smt a: 9.28 (MIN: 8.1 / MAX: 22.87); smt b: 9.28 (MIN: 8.13 / MAX: 24.28); a: 10.10 (MIN: 5.46 / MAX: 35.41); b: 10.10 (MIN: 5.47 / MAX: 28.16); c: 10.11 (MIN: 5.43 / MAX: 36.27). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
Zstd Compression 1.5.4 - Compression Level: 8, Long Mode - Compression Speed (MB/s, More Is Better): a: 938.5; c: 926.9; b: 910.8; no smt b: 892.0; smt d: 879.6; smt c: 860.8; no smt a: 859.9; smt a: 853.8; smt b: 852.7. (CC) gcc options: -O3 -pthread -lz
Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): no smt a: 100.24; no smt b: 100.42; smt c: 102.10; smt d: 102.76; smt a: 103.09; smt b: 103.19; a: 108.68; b: 109.11; c: 109.25.
VP9 libvpx Encoding 1.13 - Speed: Speed 0 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): c: 14.76; a: 14.69; b: 14.67; no smt a: 14.28; smt a: 14.08; smt c: 13.97; no smt b: 13.72; smt d: 13.68; smt b: 13.57. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better): smt b: 9.87; smt c: 9.82; smt d: 9.81; smt a: 9.78; no smt b: 9.76; no smt a: 9.39; c: 9.30; b: 9.29; a: 9.22. (CC) gcc options: -O3 -pthread -lz
Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): smt d: 301.92; smt c: 303.25; smt a: 303.33; no smt b: 303.74; smt b: 304.08; no smt a: 304.21; b: 319.96; a: 319.98; c: 320.05.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): no smt a: 73.20; no smt b: 73.21; smt a: 73.31; smt c: 73.34; smt b: 73.35; smt d: 73.62; c: 76.62; b: 77.31; a: 77.34.
VVenC 1.7 - Video Input: Bosphorus 1080p - Video Preset: Faster (Frames Per Second, More Is Better): c: 30.08; a: 30.04; b: 29.85; no smt a: 28.69; no smt b: 28.52. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
Zstd Compression 1.5.4 - Compression Level: 19 - Compression Speed (MB/s, More Is Better): no smt b: 19.8; no smt a: 19.8; smt d: 19.6; smt b: 19.5; smt c: 19.2; c: 19.1; b: 19.1; a: 19.1; smt a: 18.8. (CC) gcc options: -O3 -pthread -lz
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): smt c: 144.47; smt d: 144.91; smt b: 145.27; no smt a: 145.35; no smt b: 145.36; smt a: 145.40; c: 151.39; b: 151.60; a: 152.14.
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): no smt b: 115.64; no smt a: 116.15; smt b: 116.48; smt c: 116.97; smt a: 117.12; smt d: 117.22; a: 119.21; b: 119.38; c: 119.70.
Stress-NG 0.15.04 - Test: Socket Activity (Bogo Ops/s, More Is Better): no smt a: 8968.28; no smt b: 8924.10; c: 8876.10; a: 8873.58; smt d: 8864.78; b: 8851.65; smt b: 8750.61; smt a: 8748.99; smt c: 8747.83. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Zstd Compression 1.5.4 - Compression Level: 19 - Decompression Speed (MB/s, More Is Better): smt a: 1495.2; no smt b: 1483.8; smt b: 1483.5; no smt a: 1483.1; smt d: 1479.1; smt c: 1475.2; a: 1472.6; c: 1470.5; b: 1467.8. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 12 - Decompression Speed (MB/s, More Is Better): smt c: 1732.3; smt d: 1731.2; no smt a: 1728.0; no smt b: 1727.8; smt b: 1726.3; smt a: 1723.7; b: 1716.7; c: 1715.6; a: 1704.8. (CC) gcc options: -O3 -pthread -lz
VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Fast (Frames Per Second, More Is Better): no smt b: 5.895; no smt a: 5.890; a: 5.820; b: 5.809; c: 5.808. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better): no smt b: 1397.6; no smt a: 1395.8; smt a: 1393.5; smt b: 1392.4; smt d: 1391.7; smt c: 1389.0; c: 1384.0; a: 1383.7; b: 1378.2. (CC) gcc options: -O3 -pthread -lz
VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Faster (Frames Per Second, More Is Better): no smt a: 12.48; c: 12.40; b: 12.39; a: 12.34; no smt b: 12.31. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
Zstd Compression 1.5.4 - Compression Level: 3 - Decompression Speed (MB/s, More Is Better): smt b: 1519.1; c: 1517.4; b: 1516.6; smt c: 1516.2; smt a: 1515.6; a: 1514.8; smt d: 1513.5; no smt b: 1511.5; no smt a: 1500.4. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 8 - Decompression Speed (MB/s, More Is Better): smt b: 1671.0; no smt a: 1669.7; a: 1669.3; no smt b: 1667.6; smt a: 1664.3; smt d: 1664.2; smt c: 1664.0; c: 1661.9; b: 1651.0. (CC) gcc options: -O3 -pthread -lz
VP9 libvpx Encoding 1.13 - Speed: Speed 5 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): no smt b: 29.71; smt d: 29.66; no smt a: 29.62; smt b: 29.60; smt c: 29.54; smt a: 29.54; b: 29.50; c: 29.46; a: 29.37. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): no smt a: 30.42; b: 30.46; c: 30.49; smt c: 30.58; a: 30.59; no smt b: 30.64; smt d: 30.72; smt a: 30.75; smt b: 30.75.
VVenC 1.7 - Video Input: Bosphorus 1080p - Video Preset: Fast (Frames Per Second, More Is Better): c: 12.44; a: 12.44; b: 12.40; no smt b: 12.37; no smt a: 12.36. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
Zstd Compression 1.5.4 - Compression Level: 8, Long Mode - Decompression Speed (MB/s, More Is Better): no smt a: 1684.8; smt c: 1683.7; no smt b: 1682.8; smt a: 1682.5; a: 1677.9; smt d: 1677.6; smt b: 1676.5; c: 1675.4; b: 1673.7. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 3, Long Mode - Decompression Speed (MB/s, More Is Better): smt d: 1543.6; no smt a: 1542.5; smt c: 1540.9; c: 1540.9; smt b: 1540.4; smt a: 1539.5; no smt b: 1538.8; b: 1537.5; a: 1536.6. (CC) gcc options: -O3 -pthread -lz
Phoronix Test Suite v10.8.4