2 x AMD EPYC 9654 96-Core testing with an AMD Titanite_4G (RTI1004D BIOS) and llvmpipe on Red Hat Enterprise Linux 9.1 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2303114-NE-9654NEW5019
HTML result view exported from: https://openbenchmarking.org/result/2303114-NE-9654NEW5019&sor&grr .
9654 new - System Details

a, b, c:
  Processor: AMD EPYC 9654 96-Core @ 2.40GHz (96 Cores / 192 Threads)
  Motherboard: AMD Titanite_4G (RTI1004D BIOS)
  Chipset: AMD Device 14a4
  Memory: 768GB
  Disk: 2 x 1920GB SAMSUNG MZWLJ1T9HBJR-00007
  Graphics: ASPEED
  Monitor: VGA HDMI
  Network: Broadcom NetXtreme BCM5720 PCIe
  OS: Red Hat Enterprise Linux 9.1
  Kernel: 5.14.0-162.6.1.el9_1.x86_64 (x86_64)
  Desktop: GNOME Shell 40.10
  Display Server: X Server 1.20.11
  Compiler: GCC 11.3.1 20220421
  File-System: xfs
  Screen Resolution: 1600x1200

no smt a, no smt b (only the fields that differ from a, b, c):
  Processor: 2 x AMD EPYC 9654 96-Core @ 2.40GHz (192 Cores)
  Memory: 1520GB
  Graphics: llvmpipe
  OpenGL: 4.5 Mesa 22.1.5 (LLVM 14.0.6 256 bits)
  Screen Resolution: 1024x768

smt a, smt b, smt c, smt d (only the fields that differ from no smt a, no smt b):
  Processor: 2 x AMD EPYC 9654 96-Core @ 2.40GHz (192 Cores / 384 Threads)

Kernel Details
  - Transparent Huge Pages: always

Compiler Details
  - --build=x86_64-redhat-linux --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-host-bind-now --enable-host-pie --enable-initfini-array --enable-languages=c,c++,fortran,lto --enable-link-serialization=1 --enable-multilib --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=x86-64 --with-arch_64=x86-64-v2 --with-build-config=bootstrap-lto --with-gcc-major-version-only --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver --without-isl

Processor Details
  - Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
  - CPU Microcode: 0xa101111

Python Details
  - Python 3.9.14

Security Details
  - a, b, c, smt a, smt b, smt c, smt d: SELinux + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
  - no smt a, no smt b: SELinux + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: disabled RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
9654 new rocksdb: Seq Fill openvkl: vklBenchmark Scalar openvkl: vklBenchmark ISPC clickhouse: 100M Rows Hits Dataset, Third Run clickhouse: 100M Rows Hits Dataset, Second Run clickhouse: 100M Rows Hits Dataset, First Run / Cold Cache vpxenc: Speed 0 - Bosphorus 4K onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU compress-zstd: 19, Long Mode - Decompression Speed compress-zstd: 19, Long Mode - Compression Speed memcached: 1:100 memcached: 1:10 memcached: 1:5 openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP32 - CPU deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream compress-zstd: 19 - Decompression Speed compress-zstd: 19 - Compression Speed deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU compress-zstd: 8, Long Mode - Decompression Speed compress-zstd: 8, Long Mode - Compression Speed compress-zstd: 12 - Decompression Speed compress-zstd: 12 - Compression Speed compress-zstd: 3, Long Mode - Decompression Speed compress-zstd: 3, Long Mode - Compression Speed compress-zstd: 8 - Decompression Speed compress-zstd: 8 - Compression Speed openvino: Machine Translation EN To DE 
FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU compress-zstd: 3 - Decompression Speed compress-zstd: 3 - Compression Speed openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU rocksdb: Rand Fill rocksdb: Update Rand rocksdb: Rand Fill Sync rocksdb: Read Rand Write Rand rocksdb: Read While Writing rocksdb: Rand Read deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream vvenc: Bosphorus 4K - Fast deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream deepsparse: NLP Sentiment Analysis, 80% 
Pruned Quantized BERT Base Uncased - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream vpxenc: Speed 0 - Bosphorus 1080p deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream vpxenc: Speed 5 - Bosphorus 4K deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream stress-ng: Pthread stress-ng: Atomic stress-ng: NUMA stress-ng: Context Switching stress-ng: Forking stress-ng: Semaphores stress-ng: Crypto stress-ng: Poll stress-ng: CPU Cache stress-ng: Memory Copying 
stress-ng: Socket Activity stress-ng: Malloc stress-ng: MEMFD stress-ng: Futex stress-ng: Hash stress-ng: Matrix Math stress-ng: CPU Stress stress-ng: MMAP stress-ng: System V Message Passing stress-ng: Vector Math stress-ng: SENDFILE stress-ng: Function Call stress-ng: Glibc C String Functions stress-ng: Glibc Qsort Data Sorting stress-ng: Mutex vvenc: Bosphorus 4K - Faster vvenc: Bosphorus 1080p - Fast gromacs: MPI CPU - water_GMX50_bare onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU vpxenc: Speed 5 - Bosphorus 1080p build-linux-kernel: defconfig uvg266: Bosphorus 4K - Slow uvg266: Bosphorus 4K - Medium onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 1D - bf16bf16bf16 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU vvenc: Bosphorus 1080p - Faster onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU kvazaar: Bosphorus 4K - Slow kvazaar: Bosphorus 4K - Medium build-ffmpeg: Time To Compile uvg266: Bosphorus 4K - Very Fast uvg266: Bosphorus 4K - Ultra Fast uvg266: Bosphorus 4K - Super Fast onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 3D - bf16bf16bf16 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU kvazaar: Bosphorus 4K - Very Fast kvazaar: Bosphorus 4K - Super Fast kvazaar: Bosphorus 4K - Ultra Fast uvg266: Bosphorus 1080p - Slow uvg266: Bosphorus 1080p - Medium onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU kvazaar: Bosphorus 1080p - Slow kvazaar: Bosphorus 1080p - Medium onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU uvg266: Bosphorus 1080p - Very Fast uvg266: Bosphorus 1080p - Super 
Fast uvg266: Bosphorus 1080p - Ultra Fast kvazaar: Bosphorus 1080p - Very Fast kvazaar: Bosphorus 1080p - Super Fast kvazaar: Bosphorus 1080p - Ultra Fast embree: Pathtracer - Crown a b c no smt a no smt b smt a smt b smt c smt d 542256 556 1089 625.16 625.91 600.70 7.68 671.69 675.915 674.088 907.489 909.691 911.245 1383.7 9.22 4878860 6861088.89 4143044.21 962.1 49.72 1671.46 28.47 499.32 95.76 1701.7 27.97 319.9815 149.7446 1472.6 19.1 1115.7602 42.9893 8.34 5750.74 1677.9 938.5 1704.8 330.8 1536.6 892.7 1669.3 1233.8 88.71 540.41 1514.8 3033.9 10.1 9494.22 0.65 112346.36 1.09 74495.17 8.11 5908.89 16.3 2941.47 9.76 4914.17 533927 544384 376601 2926458 9296185 468231434 1116.426 42.9581 5.82 152.1377 314.8361 30.5924 1566.564 119.2067 401.7766 11.362 87.9631 5.0715 197.1018 28.7077 34.8262 77.3363 619.7531 28.6233 34.9289 14.69 16.5828 60.2522 47.464 1009.8486 108.6775 440.8533 10.0811 99.1493 17.36 5.1812 192.8806 9.5962 104.1178 5.3018 188.5378 109397.78 174.72 483.9 18941003.97 58156.36 18128283.29 203073.15 12653709.64 77.51 20106.51 8873.58 312709034.05 518.46 2794694.37 18954936 382305.69 205134.87 1663.08 10473084.09 556875.59 1950323.96 621015.34 16257100.25 1122.39 59479783.64 12.336 12.44 10.569 7.35746 1.83147 0.555277 29.37 25.768 29.37 33.03 2.01581 8.74576 3.11497 30.039 0.2902 0.357017 0.263997 40.63 41.4 12.569 68.82 70.56 69.33 1.56461 1.87777 0.53445 80.9 81.9 83.45 81.16 91.37 0.311405 0.693367 0.415446 139.92 143.31 1.17004 0.711588 0.304972 234.95 238.68 240.98 296.93 307.49 310.73 545396 557 1075 628.37 627.20 610.79 7.71 670.202 668.952 671.654 903.272 917.077 912.114 1378.2 9.29 4876951.36 6792746.34 4162812.47 962.63 49.72 1675.96 28.42 500.83 95.48 1685.26 28.23 319.9648 149.8207 1467.8 19.1 1118.4383 42.897 8.38 5720.9 1673.7 910.8 1716.7 332 1537.5 909.4 1651 1239.3 88.61 541.03 1516.6 3095 10.1 9496.95 0.66 111378.39 1.09 74353.41 8.13 5894.27 15.54 3085.53 9.75 4915.42 534681 545556 373002 2891962 8620352 466888888 1116.2899 42.8348 
5.809 151.6003 316.2907 30.4608 1573.7496 119.3799 401.2053 11.1872 89.3355 5.0859 196.5414 28.4866 35.0962 77.305 619.9442 28.7137 34.8189 14.67 16.6745 59.9222 47.3532 1012.3494 109.1129 439.0373 10.136 98.6175 17.37 5.1633 193.5305 9.5793 104.2908 5.498 181.8123 109356.78 223.29 498.67 16313126.86 58664.97 18100474.36 203147.15 12661687.46 67.21 20340.4 8851.65 314418461.02 507.74 2805836.52 18955118.1 382304.04 217072.27 1664.75 10471889.97 556797.27 1913590.77 621003.92 16168655.17 1132.35 60031543.97 12.391 12.399 10.609 7.28243 1.84172 0.551966 29.5 25.749 29.29 33.13 1.9234 9.1957 4.27858 29.845 0.291387 0.356875 0.263373 40.86 41.39 12.434 68.79 70.68 69 1.56257 1.72975 0.513068 81.61 80.68 84.77 81.41 91.47 0.312721 0.689064 0.578466 139.07 144.21 1.16696 0.712495 0.305731 234.68 239.92 240.91 290.59 303.99 305.52 544565 549 1066 636.18 621.88 614.24 7.7 671.233 667.95 669.146 906.139 911.762 915.084 1384 9.3 4852421.67 6839975.87 4155273.46 962.15 49.77 1679.03 28.33 500.3 95.59 1704.02 27.89 320.0504 149.6905 1470.5 19.1 1122.0584 42.7433 8.37 5724.81 1675.4 926.9 1715.6 330.1 1540.9 916.9 1661.9 1241.1 89.46 535.78 1517.4 3049.4 10.11 9485.65 0.66 112186.25 1.09 74486.08 8.13 5898.79 15.1 3174.04 9.76 4910.13 536551 543572 356922 2910023 8316379 468069792 1117.7323 42.9152 5.808 151.3899 316.6282 30.4911 1571.8281 119.7026 400.0427 11.1605 89.547 5.0573 197.6601 28.4687 35.1188 76.6181 625.5512 28.5698 34.9937 14.76 16.5185 60.4884 47.2942 1013.7357 109.2467 438.4372 10.1384 98.5922 17.46 5.1696 193.3119 8.8004 113.5057 5.5007 181.7243 109609.15 183.33 478.05 16862185.54 64299.62 18088584.67 203095.44 12676101.41 97.04 20297.91 8876.1 313768771.26 507.97 2794473.75 18961773.18 382328.5 217304.19 1668.92 10475486.39 556833.5 1891202.37 621041.93 16537965.14 1125.86 59929401.4 12.399 12.441 10.587 7.38116 1.82264 0.556851 29.46 25.68 29.41 33.1 1.88686 3.83017 4.3972 30.081 0.285432 0.353175 0.26292 40.76 41.47 12.465 69.04 71.13 69.92 1.56742 1.6386 
0.546991 80.31 84.55 82.7 81.1 91.29 0.312289 0.692667 0.417188 140.1 143.81 1.16919 0.708563 0.305599 237.44 237.96 238.77 291.03 301.4 309.64 465700 647 1098 665.00 666.43 635.48 7.04 913.201 973.318 930.257 1148.26 1147.34 1151.25 1395.8 9.39 3192061.68 4244908.83 2444886.82 438.29 109.08 828.67 57.58 229.46 208.82 833.99 57.18 304.2075 308.4294 1483.1 19.8 955.1631 97.4442 4.9 9784.09 1684.8 859.9 1728 279.9 1542.5 955.6 1669.7 1227.5 45.86 1045.81 1500.4 2804.2 9.18 20836.1 0.57 160545.41 1.01 127833.52 3.95 12119.66 6.01 7979.05 4.54 10556.93 478629 462018 350673 2079804 7643831 1209611055 956.0088 97.3996 5.89 145.3548 649.302 30.4213 3107.2957 116.1532 813.9832 14.4638 69.1063 5.2262 191.2605 31.327 31.9146 73.1993 1291.9733 31.9229 31.3187 14.28 20.7385 48.1849 42.805 2204.7533 100.2418 941.2443 13.8559 72.1298 14.48 7.5856 131.7586 9.8899 101.0095 6.0118 166.2725 68076.03 400.03 20.51 47222624.49 43094.96 13141129.68 435615.35 10458275.24 40.94 11342.69 8968.28 456657338.84 303.56 3746361.52 27408413.1 932248.18 328297.04 4520.48 7402514.74 920216.27 3284433.2 829106.63 26755146.69 1978.8 63581395.67 12.477 12.357 19.175 9.25292 1.65041 0.664359 29.62 17.606 34.59 38.38 2.04003 6.19257 4.87883 28.689 0.316779 0.342836 0.316665 47.13 47.97 10.597 57.13 57.78 59.83 2.19645 2.02286 0.556069 70.02 75.5 76.05 87.15 98.21 0.254845 0.40359 0.291687 155.56 159.63 0.664295 0.462386 0.160361 196.25 181.67 186.24 269.62 288.41 278.62 464044 652 1108 662.01 649.52 622.73 7.16 903.358 939.547 901.044 1124.75 1139.63 1119.17 1397.6 9.76 3216133.91 4220522.49 2460416.33 438.99 109.04 830.04 57.49 228.01 210.14 834.17 57.19 303.7377 308.3033 1483.8 19.8 955.2386 97.3838 4.91 9758.74 1682.8 892 1727.8 278.2 1538.8 1032.3 1667.6 1234.2 45.73 1048.74 1511.5 2865.9 9.18 20849.75 0.58 162294.47 1.01 127770.39 3.95 12128.38 6.01 7979.82 4.54 10549.24 468210 452228 344040 2063831 7913568 1213540299 955.3438 97.3441 5.895 145.3627 647.7596 30.6403 3086.3591 115.6369 817.6313 
14.4178 69.3266 5.7015 175.3199 33.6012 29.7544 73.2087 1292.8038 32.1778 31.07 13.72 20.4223 48.9257 42.8393 2203.1573 100.4198 938.9084 13.9039 71.8878 14.5 6.4174 155.7357 9.9395 100.509 6.065 164.8126 67451.3 395.95 19.78 44683906.68 45685.28 13192391.85 437065.75 10393402.01 55.84 10949.9 8924.1 456651508.04 308.67 3802292.6 27422305.53 925984.24 326819.72 3591.71 7372780.72 920642.34 3282827.5 829445.38 28009942.56 1963.75 65500297.73 12.31 12.374 18.837 9.31751 1.70057 0.670574 29.71 17.586 34.68 38.65 1.93126 7.98861 4.79302 28.52 0.340498 0.347231 0.293564 47.43 47.93 10.662 58.09 57.23 58.33 2.12379 2.02833 0.629026 69.94 74.03 74.37 88.13 96.67 0.255796 0.400325 0.291229 159.79 161.4 0.651301 0.461174 0.164845 183.75 216.15 209.85 268.77 280.18 271.45 414168 764 1318 534.82 524.04 500.13 7.24 3269.05 3133.65 3187 1912.16 1888.1 1830.41 1393.5 9.78 4383314.65 4530948.62 2520253 439.41 108.93 827.07 57.72 230.21 208.13 833.15 57.29 303.3272 314.6186 1495.2 18.8 972.5163 97.0249 4.92 9746.08 1682.5 853.8 1723.7 254.1 1539.5 1051.4 1664.3 1023.9 45.71 1049.16 1515.6 2795.1 9.28 20610.23 0.53 173926.92 0.97 148316.88 3.95 12117.16 6.02 7953.21 4.54 10544.03 423027 413199 388052 1787682 15317250 1225662852 970.2202 97.2916 145.3966 658.2111 30.7462 3114.6848 117.1233 817.7536 13.637 73.2924 5.3004 188.5826 31.4805 31.7592 73.306 1306.6677 32.2956 30.9572 14.08 21.0323 47.506 42.9515 2230.3099 103.0888 929.0246 12.8322 77.8827 13.9 7.5225 132.8655 8.1914 121.9232 7.1865 139.0973 74978.51 184.34 24.82 12895047.73 36266.02 20047519.05 466292.12 15359471.73 47.58 15914.1 8748.99 634718750.41 464.7 2333781.39 41966139.67 946032.88 487359.39 8360.76 10103952.13 1291689.33 4329963.26 1414423.65 35687958.37 2564.48 136339636.91 18.818 20.2695 2.70807 1.20049 29.54 17.376 45.56 47.56 7.60566 15.421 11.4326 0.621776 0.674455 0.591738 64.63 65.56 10.381 59.3 57.96 59.06 3.39151 3.38833 0.975357 75.91 76.64 77.96 80.17 89.4 0.40743 0.538539 0.451472 132.67 140.78 0.97859 
0.672585 0.279638 183.99 178.49 216.15 250.71 296.72 302.55 413708 810 1243 538.63 527.88 516.09 6.97 3249.28 3222.93 3034.62 1890.28 1894.07 1937.65 1392.4 9.87 4357453.76 4392934.75 2507331.29 439.5 108.89 827.03 57.67 230.06 208.19 832.7 57.31 304.0782 314.2452 1483.5 19.5 975.504 96.906 4.95 9680.44 1676.5 852.7 1726.3 259.1 1540.4 1038.9 1671 1122.4 45.66 1050.37 1519.1 2513.4 9.28 20610.62 0.53 175754.35 0.97 147582.87 3.96 12109.64 6.03 7949.05 4.54 10549.15 417882 411513 401295 1761416 13830689 1231168093 969.2122 97.3528 145.2745 658.341 30.7508 3114.4293 116.4813 822.1552 14.6356 68.2924 5.3053 188.4147 31.4593 31.7806 73.3496 1306.5839 32.2415 31.0093 13.57 21.7398 45.9642 42.6536 2245.8036 103.1902 927.679 12.9106 77.4096 14.27 6.8844 145.1736 8.3506 119.5964 7.3054 136.8324 180407.59 186.64 24.77 12221058.34 36020.16 19927313.52 466609.22 15341564.16 44.38 15430.18 8750.61 634995429.66 564.05 2396325.93 41989522.68 946660.88 487987.32 9017.12 8586858.51 1291704.57 4329824.38 1413555.44 35996976.13 2520.38 135855921.28 19.015 20.572 2.7614 1.18135 29.6 17.017 46.25 46.62 7.4227 19.549 7.30147 0.620907 0.675664 0.600199 66.22 65.69 10.259 57.86 57.51 59.2 3.49453 3.59643 0.971704 74.11 76.7 77 79.95 89.7 0.408074 0.551783 0.450261 136.33 138.33 0.973555 0.672768 0.291087 209.55 220.79 179.56 274.63 267.04 295.18 414282 793 1235 530.47 524.75 495.85 6.92 3153.22 3185.64 3011.21 1888.21 1925.68 1926.15 1389 9.82 4394194.49 4549313.42 2517750.47 437.51 109.36 821.93 58.05 227.93 210.19 826.67 57.71 303.2549 315.3318 1475.2 19.2 971.8988 96.9975 4.9 9773.26 1683.7 860.8 1732.3 249.8 1540.9 1046.6 1664 1005.8 45.31 1058.57 1516.2 2604.6 9.25 20684.29 0.53 173620.31 0.96 148047.13 3.93 12168.05 5.99 7993.18 4.52 10603.84 416006 419985 394852 1613913 13948914 1234570512 971.6677 96.9077 144.4661 662.2585 30.5784 3132.3838 116.9689 818.6037 14.2011 70.3828 5.2773 189.4054 31.3146 31.9277 73.3407 1305.7137 33.9879 29.4163 13.97 20.4884 48.772 42.5521 2253.1411 
102.0999 937.992 14.2486 70.1418 15.38 7.355 135.8907 10.3852 96.1737 6.1057 163.7111 77834.39 184.24 24.79 12328933.65 34917.39 19866440.86 468159.21 15228320.07 42.55 15077.69 8747.83 639757070.52 413.34 2067499.58 41972765.93 952461.48 490300.48 7633.16 8609357.68 1295970.28 4351871.12 1422505.51 34602827.77 2517.26 138939958.5 19.045 21.0301 2.70235 1.20146 29.54 17.053 46.34 46.57 9.34981 18.6594 12.013 0.634992 0.676124 0.593299 65.49 65.59 10.407 57.44 57.33 58.34 3.60194 3.36535 0.99571 73.57 78.72 76.14 79.63 89.16 0.394377 0.545118 0.44782 135.8 140.89 0.972863 0.676798 0.275917 192.25 178.78 218.28 243.36 288.09 303.43 414047 781 1238 527.80 515.11 500.118340751 7.15 3171.15 3172.19 3349.36 1957.77 1912.74 1883.76 1391.7 9.81 4403267.42 4569476.86 2527084.87 437.68 109.23 822.5 58 228.94 209.24 828.01 57.62 301.9192 316.0328 1479.1 19.6 974.985 96.7169 4.91 9767.99 1677.6 879.6 1731.2 256.2 1543.6 1059 1664.2 1024.8 45.34 1057.74 1513.5 2528.7 9.26 20679.68 0.53 172228.34 0.97 149647.68 3.93 12172.92 6 7990.47 4.52 10587.3 415428 420514 404463 1752904 13098097 1231916197 966.7606 97.8908 144.9144 660.2132 30.7247 3116.0909 117.2219 816.7562 15.126 66.083 5.4341 183.949 33.0514 30.2494 73.6196 1301.1811 32.1758 31.0727 13.68 21.3593 46.7815 42.4798 2255.422 102.7646 931.8223 13.2331 75.524 15.11 7.064 141.4705 9.3656 106.6518 7.149 139.8241 91735.75 182.8 24.71 12728401.45 34685.85 19842969.28 468006 15403597.24 40.56 13370.33 8864.78 640853365.54 394.49 2077119.76 41965286.06 951700.17 489998.75 7273.19 12418451.71 1300693.93 4351920.42 1425621.44 36111781.82 2516.09 137579874.56 19.077 20.9726 2.67303 1.2334 29.66 17.079 46.25 46.21 8.26078 11.7573 12.1571 0.627133 0.672115 0.602831 66.08 65.49 10.373 57.52 58.76 58.66 3.50563 2.97427 0.975091 73.24 75.5 78.85 80 88.91 0.389223 0.63063 0.457554 135.56 136.42 0.973984 0.680395 0.247825 181.83 218.91 224.31 270.76 256.45 303.63 OpenBenchmarking.org
RocksDB 7.9.2 - Test: Sequential Fill
Op/s, More Is Better
  b: 545396
  c: 544565
  a: 542256
  no smt a: 465700
  no smt b: 464044
  smt c: 414282
  smt a: 414168
  smt d: 414047
  smt b: 413708
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
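The repeated runs of each configuration can be condensed into one figure per group. The following Python sketch is only an illustration: it copies the RocksDB Sequential Fill values from this result and assumes that runs sharing a label prefix ("no smt", "smt", or the bare letters a/b/c) belong to the same configuration. The group name `base` and the helper functions are hypothetical and are not part of the Phoronix Test Suite.

```python
from statistics import mean

# RocksDB 7.9.2 Sequential Fill results in Op/s, copied from this result file.
results = {
    "a": 542256, "b": 545396, "c": 544565,
    "no smt a": 465700, "no smt b": 464044,
    "smt a": 414168, "smt b": 413708, "smt c": 414282, "smt d": 414047,
}


def group_of(label):
    """Drop the trailing run letter; bare 'a'/'b'/'c' map to the
    hypothetical group name 'base' (an assumption made for this sketch)."""
    return label.rsplit(" ", 1)[0] if " " in label else "base"


def group_means(values):
    """Average the runs that fall into the same group."""
    groups = {}
    for label, value in values.items():
        groups.setdefault(group_of(label), []).append(value)
    return {name: mean(vals) for name, vals in groups.items()}


means = group_means(results)
baseline = means["base"]
for name, value in means.items():
    delta = 100 * (value / baseline - 1)
    print(f"{name:>7}: {value:12.1f} Op/s ({delta:+.1f}% vs base)")
```

The same grouping applies to any other test in this file; for "Fewer Is Better" metrics such as the oneDNN timings, the sign of the percentage reads the other way round.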
OpenVKL 1.3.1 - Benchmark: vklBenchmark Scalar
Items / Sec, More Is Better
  smt b: 810 (MIN: 138 / MAX: 3650)
  smt c: 793 (MIN: 139 / MAX: 3583)
  smt d: 781 (MIN: 139 / MAX: 3776)
  smt a: 764 (MIN: 139 / MAX: 3808)
  no smt b: 652 (MIN: 102 / MAX: 3990)
  no smt a: 647 (MIN: 101 / MAX: 4024)
  b: 557 (MIN: 61 / MAX: 6019)
  a: 556 (MIN: 61 / MAX: 5994)
  c: 549 (MIN: 60 / MAX: 5475)
OpenVKL 1.3.1 - Benchmark: vklBenchmark ISPC
Items / Sec, More Is Better
  smt a: 1318 (MIN: 391 / MAX: 3855)
  smt b: 1243 (MIN: 390 / MAX: 3501)
  smt d: 1238 (MIN: 392 / MAX: 3738)
  smt c: 1235 (MIN: 392 / MAX: 4332)
  no smt b: 1108 (MIN: 297 / MAX: 4208)
  no smt a: 1098 (MIN: 297 / MAX: 4284)
  a: 1089 (MIN: 179 / MAX: 7194)
  b: 1075 (MIN: 179 / MAX: 6442)
  c: 1066 (MIN: 180 / MAX: 6427)
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, Third Run
Queries Per Minute, Geo Mean, More Is Better
  no smt a: 665.00 (MIN: 85.84 / MAX: 6000)
  no smt b: 662.01 (MIN: 86.33 / MAX: 6000)
  c: 636.18 (MIN: 58.03 / MAX: 6666.67)
  b: 628.37 (MIN: 58.37 / MAX: 6666.67)
  a: 625.16 (MIN: 57.69 / MAX: 6000)
  smt b: 538.63 (MIN: 89.82 / MAX: 6000)
  smt a: 534.82 (MIN: 91.32 / MAX: 6000)
  smt c: 530.47 (MIN: 92.02 / MAX: 5454.55)
  smt d: 527.80 (MIN: 81.52 / MAX: 6000)
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, Second Run
Queries Per Minute, Geo Mean, More Is Better
  no smt a: 666.43 (MIN: 85.11 / MAX: 6666.67)
  no smt b: 649.52 (MIN: 86.46 / MAX: 6000)
  b: 627.20 (MIN: 58.54 / MAX: 6000)
  a: 625.91 (MIN: 56.13 / MAX: 7500)
  c: 621.88 (MIN: 58.54 / MAX: 6666.67)
  smt b: 527.88 (MIN: 90.63 / MAX: 6666.67)
  smt c: 524.75 (MIN: 90.23 / MAX: 6000)
  smt a: 524.04 (MIN: 90.5 / MAX: 6666.67)
  smt d: 515.11 (MIN: 88.5 / MAX: 6000)
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, First Run / Cold Cache
Queries Per Minute, Geo Mean, More Is Better
  no smt a: 635.48 (MIN: 85.35 / MAX: 6000)
  no smt b: 622.73 (MIN: 83.22 / MAX: 5454.55)
  c: 614.24 (MIN: 57.14 / MAX: 6666.67)
  b: 610.79 (MIN: 58.14 / MAX: 6666.67)
  a: 600.70 (MIN: 57.75 / MAX: 6000)
  smt b: 516.09 (MIN: 89.55 / MAX: 6000)
  smt a: 500.13 (MIN: 65.15 / MAX: 6000)
  smt d: 500.12 (MIN: 69.61 / MAX: 6000)
  smt c: 495.85 (MIN: 51.06 / MAX: 6000)
VP9 libvpx Encoding 1.13 - Speed: Speed 0 - Input: Bosphorus 4K
Frames Per Second, More Is Better
  b: 7.71
  c: 7.70
  a: 7.68
  smt a: 7.24
  no smt b: 7.16
  smt d: 7.15
  no smt a: 7.04
  smt b: 6.97
  smt c: 6.92
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  b: 670.20 (MIN: 663.54)
  c: 671.23 (MIN: 662.9)
  a: 671.69 (MIN: 664.46)
  no smt b: 903.36 (MIN: 874.93)
  no smt a: 913.20 (MIN: 884.92)
  smt c: 3153.22 (MIN: 3059.88)
  smt d: 3171.15 (MIN: 3148.29)
  smt b: 3249.28 (MIN: 3226.78)
  smt a: 3269.05 (MIN: 3243.11)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  c: 667.95 (MIN: 660.93)
  b: 668.95 (MIN: 662.29)
  a: 675.92 (MIN: 668.99)
  no smt b: 939.55 (MIN: 904.78)
  no smt a: 973.32 (MIN: 937.3)
  smt a: 3133.65 (MIN: 3037.06)
  smt d: 3172.19 (MIN: 3155.85)
  smt c: 3185.64 (MIN: 2949.58)
  smt b: 3222.93 (MIN: 2927.41)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  c: 669.15 (MIN: 661.71)
  b: 671.65 (MIN: 664.19)
  a: 674.09 (MIN: 667.01)
  no smt b: 901.04 (MIN: 864.28)
  no smt a: 930.26 (MIN: 898.89)
  smt c: 3011.21 (MIN: 2853.26)
  smt b: 3034.62 (MIN: 2821.2)
  smt a: 3187.00 (MIN: 3034.82)
  smt d: 3349.36 (MIN: 3325.75)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  b: 903.27 (MIN: 894.76)
  c: 906.14 (MIN: 898.41)
  a: 907.49 (MIN: 897.95)
  no smt b: 1124.75 (MIN: 1091.36)
  no smt a: 1148.26 (MIN: 1108.42)
  smt c: 1888.21 (MIN: 1860.29)
  smt b: 1890.28 (MIN: 1864.29)
  smt a: 1912.16 (MIN: 1891.03)
  smt d: 1957.77 (MIN: 1929.47)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  a: 909.69 (MIN: 901.1)
  c: 911.76 (MIN: 901.88)
  b: 917.08 (MIN: 909.14)
  no smt b: 1139.63 (MIN: 1100.83)
  no smt a: 1147.34 (MIN: 1109.85)
  smt a: 1888.10 (MIN: 1865.76)
  smt b: 1894.07 (MIN: 1870.66)
  smt d: 1912.74 (MIN: 1890.93)
  smt c: 1925.68 (MIN: 1901.2)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.0 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU a b c no smt b no smt a smt a smt d smt c smt b 400 800 1200 1600 2000 911.25 912.11 915.08 1119.17 1151.25 1830.41 1883.76 1926.15 1937.65 MIN: 903.01 MIN: 902.43 MIN: 906.26 MIN: 1086.08 MIN: 1070.35 MIN: 1807.97 MIN: 1850.75 MIN: 1901.09 MIN: 1908.65 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better): no smt b: 1397.6; no smt a: 1395.8; smt a: 1393.5; smt b: 1392.4; smt d: 1391.7; smt c: 1389.0; c: 1384.0; a: 1383.7; b: 1378.2. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better): smt b: 9.87; smt c: 9.82; smt d: 9.81; smt a: 9.78; no smt b: 9.76; no smt a: 9.39; c: 9.30; b: 9.29; a: 9.22. (CC) gcc options: -O3 -pthread -lz
Memcached 1.6.18 - Set To Get Ratio: 1:100 (Ops/sec, More Is Better): a: 4878860.00; b: 4876951.36; c: 4852421.67; smt d: 4403267.42; smt c: 4394194.49; smt a: 4383314.65; smt b: 4357453.76; no smt b: 3216133.91; no smt a: 3192061.68. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Memcached 1.6.18 - Set To Get Ratio: 1:10 (Ops/sec, More Is Better): a: 6861088.89; c: 6839975.87; b: 6792746.34; smt d: 4569476.86; smt c: 4549313.42; smt a: 4530948.62; smt b: 4392934.75; no smt a: 4244908.83; no smt b: 4220522.49. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Memcached 1.6.18 - Set To Get Ratio: 1:5 (Ops/sec, More Is Better): b: 4162812.47; c: 4155273.46; a: 4143044.21; smt d: 2527084.87; smt a: 2520253.00; smt c: 2517750.47; smt b: 2507331.29; no smt b: 2460416.33; no smt a: 2444886.82. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU (ms, Fewer Is Better): smt c: 437.51 (MIN: 400.05 / MAX: 473.91); smt d: 437.68 (MIN: 394.89 / MAX: 478.58); no smt a: 438.29 (MIN: 416.93 / MAX: 496.86); no smt b: 438.99 (MIN: 427.57 / MAX: 484.31); smt a: 439.41 (MIN: 410.81 / MAX: 465.4); smt b: 439.50 (MIN: 424.22 / MAX: 477.66); a: 962.10 (MIN: 879.24 / MAX: 1018.81); c: 962.15 (MIN: 888.7 / MAX: 1017.92); b: 962.63 (MIN: 893.43 / MAX: 1015.71). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU (FPS, More Is Better): smt c: 109.36; smt d: 109.23; no smt a: 109.08; no smt b: 109.04; smt a: 108.93; smt b: 108.89; c: 49.77; b: 49.72; a: 49.72. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better): smt c: 821.93 (MIN: 725.25 / MAX: 1010.56); smt d: 822.50 (MIN: 717.65 / MAX: 997.43); smt b: 827.03 (MIN: 723.77 / MAX: 1003.25); smt a: 827.07 (MIN: 724.84 / MAX: 1037.91); no smt a: 828.67 (MIN: 730.29 / MAX: 1036.1); no smt b: 830.04 (MIN: 722.6 / MAX: 1015.67); a: 1671.46 (MIN: 924.15 / MAX: 1977.46); b: 1675.96 (MIN: 1231.52 / MAX: 1967.75); c: 1679.03 (MIN: 865.58 / MAX: 1995.12). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU (FPS, More Is Better): smt c: 58.05; smt d: 58.00; smt a: 57.72; smt b: 57.67; no smt a: 57.58; no smt b: 57.49; a: 28.47; b: 28.42; c: 28.33. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): smt c: 227.93 (MIN: 210.46 / MAX: 252.96); no smt b: 228.01 (MIN: 212.99 / MAX: 267.17); smt d: 228.94 (MIN: 214.92 / MAX: 248.29); no smt a: 229.46 (MIN: 217.93 / MAX: 265.81); smt b: 230.06 (MIN: 210.58 / MAX: 251.42); smt a: 230.21 (MIN: 211.56 / MAX: 253.58); a: 499.32 (MIN: 264.54 / MAX: 537.26); c: 500.30 (MIN: 410.04 / MAX: 531.93); b: 500.83 (MIN: 418.76 / MAX: 546.62). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better): smt c: 210.19; no smt b: 210.14; smt d: 209.24; no smt a: 208.82; smt b: 208.19; smt a: 208.13; a: 95.76; c: 95.59; b: 95.48. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU (ms, Fewer Is Better): smt c: 826.67 (MIN: 724.1 / MAX: 1006.75); smt d: 828.01 (MIN: 716.19 / MAX: 1018.34); smt b: 832.70 (MIN: 723.65 / MAX: 1031.69); smt a: 833.15 (MIN: 725.84 / MAX: 1017.38); no smt a: 833.99 (MIN: 723.91 / MAX: 1011.94); no smt b: 834.17 (MIN: 732.01 / MAX: 1006.46); b: 1685.26 (MIN: 891.16 / MAX: 1979.37); a: 1701.70 (MIN: 1395.71 / MAX: 2063.97); c: 1704.02 (MIN: 828.99 / MAX: 1969.02). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU (FPS, More Is Better): smt c: 57.71; smt d: 57.62; smt b: 57.31; smt a: 57.29; no smt b: 57.19; no smt a: 57.18; b: 28.23; a: 27.97; c: 27.89. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): smt d: 301.92; smt c: 303.25; smt a: 303.33; no smt b: 303.74; smt b: 304.08; no smt a: 304.21; b: 319.96; a: 319.98; c: 320.05.
Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): smt d: 316.03; smt c: 315.33; smt a: 314.62; smt b: 314.25; no smt a: 308.43; no smt b: 308.30; b: 149.82; a: 149.74; c: 149.69.
Zstd Compression 1.5.4 - Compression Level: 19 - Decompression Speed (MB/s, More Is Better): smt a: 1495.2; no smt b: 1483.8; smt b: 1483.5; no smt a: 1483.1; smt d: 1479.1; smt c: 1475.2; a: 1472.6; c: 1470.5; b: 1467.8. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 19 - Compression Speed (MB/s, More Is Better): no smt b: 19.8; no smt a: 19.8; smt d: 19.6; smt b: 19.5; smt c: 19.2; c: 19.1; b: 19.1; a: 19.1; smt a: 18.8. (CC) gcc options: -O3 -pthread -lz
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): no smt a: 955.16; no smt b: 955.24; smt c: 971.90; smt a: 972.52; smt d: 974.99; smt b: 975.50; a: 1115.76; b: 1118.44; c: 1122.06.
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): no smt a: 97.44; no smt b: 97.38; smt a: 97.02; smt c: 97.00; smt b: 96.91; smt d: 96.72; a: 42.99; b: 42.90; c: 42.74.
OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better): no smt a: 4.90 (MIN: 4.46 / MAX: 56.52); smt c: 4.90 (MIN: 4.5 / MAX: 29.95); no smt b: 4.91 (MIN: 4.49 / MAX: 34.72); smt d: 4.91 (MIN: 4.51 / MAX: 27.27); smt a: 4.92 (MIN: 4.52 / MAX: 34.8); smt b: 4.95 (MIN: 4.54 / MAX: 24.23); a: 8.34 (MIN: 6.61 / MAX: 55.54); c: 8.37 (MIN: 6.83 / MAX: 50.24); b: 8.38 (MIN: 6.4 / MAX: 32.33). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better): no smt a: 9784.09; smt c: 9773.26; smt d: 9767.99; no smt b: 9758.74; smt a: 9746.08; smt b: 9680.44; a: 5750.74; c: 5724.81; b: 5720.90. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
Zstd Compression 1.5.4 - Compression Level: 8, Long Mode - Decompression Speed (MB/s, More Is Better): no smt a: 1684.8; smt c: 1683.7; no smt b: 1682.8; smt a: 1682.5; a: 1677.9; smt d: 1677.6; smt b: 1676.5; c: 1675.4; b: 1673.7. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 8, Long Mode - Compression Speed (MB/s, More Is Better): a: 938.5; c: 926.9; b: 910.8; no smt b: 892.0; smt d: 879.6; smt c: 860.8; no smt a: 859.9; smt a: 853.8; smt b: 852.7. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 12 - Decompression Speed (MB/s, More Is Better): smt c: 1732.3; smt d: 1731.2; no smt a: 1728.0; no smt b: 1727.8; smt b: 1726.3; smt a: 1723.7; b: 1716.7; c: 1715.6; a: 1704.8. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 12 - Compression Speed (MB/s, More Is Better): b: 332.0; a: 330.8; c: 330.1; no smt a: 279.9; no smt b: 278.2; smt b: 259.1; smt d: 256.2; smt a: 254.1; smt c: 249.8. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 3, Long Mode - Decompression Speed (MB/s, More Is Better): smt d: 1543.6; no smt a: 1542.5; smt c: 1540.9; c: 1540.9; smt b: 1540.4; smt a: 1539.5; no smt b: 1538.8; b: 1537.5; a: 1536.6. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 3, Long Mode - Compression Speed (MB/s, More Is Better): smt d: 1059.0; smt a: 1051.4; smt c: 1046.6; smt b: 1038.9; no smt b: 1032.3; no smt a: 955.6; c: 916.9; b: 909.4; a: 892.7. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 8 - Decompression Speed (MB/s, More Is Better): smt b: 1671.0; no smt a: 1669.7; a: 1669.3; no smt b: 1667.6; smt a: 1664.3; smt d: 1664.2; smt c: 1664.0; c: 1661.9; b: 1651.0. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 8 - Compression Speed (MB/s, More Is Better): c: 1241.1; b: 1239.3; no smt b: 1234.2; a: 1233.8; no smt a: 1227.5; smt b: 1122.4; smt d: 1024.8; smt a: 1023.9; smt c: 1005.8. (CC) gcc options: -O3 -pthread -lz
OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better): smt c: 45.31 (MIN: 39.35 / MAX: 73.19); smt d: 45.34 (MIN: 39.4 / MAX: 73.83); smt b: 45.66 (MIN: 38.82 / MAX: 75.65); smt a: 45.71 (MIN: 39.78 / MAX: 73.33); no smt b: 45.73 (MIN: 39.68 / MAX: 86.83); no smt a: 45.86 (MIN: 39.29 / MAX: 91.13); b: 88.61 (MIN: 47.57 / MAX: 124.67); a: 88.71 (MIN: 44.2 / MAX: 132.86); c: 89.46 (MIN: 42.16 / MAX: 123.4). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better): smt c: 1058.57; smt d: 1057.74; smt b: 1050.37; smt a: 1049.16; no smt b: 1048.74; no smt a: 1045.81; b: 541.03; a: 540.41; c: 535.78. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
Zstd Compression 1.5.4 - Compression Level: 3 - Decompression Speed (MB/s, More Is Better): smt b: 1519.1; c: 1517.4; b: 1516.6; smt c: 1516.2; smt a: 1515.6; a: 1514.8; smt d: 1513.5; no smt b: 1511.5; no smt a: 1500.4. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 3 - Compression Speed (MB/s, More Is Better): b: 3095.0; c: 3049.4; a: 3033.9; no smt b: 2865.9; no smt a: 2804.2; smt a: 2795.1; smt c: 2604.6; smt d: 2528.7; smt b: 2513.4. (CC) gcc options: -O3 -pthread -lz
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): no smt a: 9.18 (MIN: 8.17 / MAX: 31.1); no smt b: 9.18 (MIN: 8.21 / MAX: 24.31); smt c: 9.25 (MIN: 8.12 / MAX: 19.04); smt d: 9.26 (MIN: 8.12 / MAX: 30.29); smt a: 9.28 (MIN: 8.1 / MAX: 22.87); smt b: 9.28 (MIN: 8.13 / MAX: 24.28); a: 10.10 (MIN: 5.46 / MAX: 35.41); b: 10.10 (MIN: 5.47 / MAX: 28.16); c: 10.11 (MIN: 5.43 / MAX: 36.27). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better): no smt b: 20849.75; no smt a: 20836.10; smt c: 20684.29; smt d: 20679.68; smt b: 20610.62; smt a: 20610.23; b: 9496.95; a: 9494.22; c: 9485.65. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better): smt a: 0.53 (MIN: 0.5 / MAX: 24.2); smt b: 0.53 (MIN: 0.5 / MAX: 9.61); smt c: 0.53 (MIN: 0.5 / MAX: 9.15); smt d: 0.53 (MIN: 0.5 / MAX: 8.86); no smt a: 0.57 (MIN: 0.5 / MAX: 13.02); no smt b: 0.58 (MIN: 0.5 / MAX: 12.8); a: 0.65 (MIN: 0.32 / MAX: 20.68); b: 0.66 (MIN: 0.32 / MAX: 19.99); c: 0.66 (MIN: 0.31 / MAX: 22.67). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better): smt b: 175754.35; smt a: 173926.92; smt c: 173620.31; smt d: 172228.34; no smt b: 162294.47; no smt a: 160545.41; a: 112346.36; c: 112186.25; b: 111378.39. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better): smt c: 0.96 (MIN: 0.87 / MAX: 9.72); smt a: 0.97 (MIN: 0.87 / MAX: 9.78); smt b: 0.97 (MIN: 0.87 / MAX: 10.96); smt d: 0.97 (MIN: 0.86 / MAX: 13.31); no smt a: 1.01 (MIN: 0.86 / MAX: 8.97); no smt b: 1.01 (MIN: 0.86 / MAX: 8.72); a: 1.09 (MIN: 0.49 / MAX: 19.01); b: 1.09 (MIN: 0.48 / MAX: 25.82); c: 1.09 (MIN: 0.49 / MAX: 22.06). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better): smt d: 149647.68; smt a: 148316.88; smt c: 148047.13; smt b: 147582.87; no smt a: 127833.52; no smt b: 127770.39; a: 74495.17; c: 74486.08; b: 74353.41. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better): smt c: 3.93 (MIN: 3.61 / MAX: 42.82); smt d: 3.93 (MIN: 3.61 / MAX: 23.62); no smt a: 3.95 (MIN: 3.61 / MAX: 38); no smt b: 3.95 (MIN: 3.68 / MAX: 42.53); smt a: 3.95 (MIN: 3.61 / MAX: 34.38); smt b: 3.96 (MIN: 3.66 / MAX: 32.83); a: 8.11 (MIN: 5.39 / MAX: 69.87); b: 8.13 (MIN: 5.35 / MAX: 55.84); c: 8.13 (MIN: 3.83 / MAX: 59.87). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better): smt d: 12172.92; smt c: 12168.05; no smt b: 12128.38; no smt a: 12119.66; smt a: 12117.16; smt b: 12109.64; a: 5908.89; c: 5898.79; b: 5894.27. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Vehicle Detection FP16 - Device: CPU (ms, Fewer Is Better): smt c: 5.99 (MIN: 5.13 / MAX: 31.29); smt d: 6.00 (MIN: 5.21 / MAX: 25.51); no smt a: 6.01 (MIN: 5.2 / MAX: 37.8); no smt b: 6.01 (MIN: 5.02 / MAX: 36.88); smt a: 6.02 (MIN: 5.27 / MAX: 25.12); smt b: 6.03 (MIN: 5.17 / MAX: 38.35); c: 15.10 (MIN: 6.94 / MAX: 60.59); b: 15.54 (MIN: 8.28 / MAX: 57.72); a: 16.30 (MIN: 7.91 / MAX: 51.62). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Vehicle Detection FP16 - Device: CPU (FPS, More Is Better): smt c: 7993.18; smt d: 7990.47; no smt b: 7979.82; no smt a: 7979.05; smt a: 7953.21; smt b: 7949.05; c: 3174.04; b: 3085.53; a: 2941.47. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better): smt c: 4.52 (MIN: 4.12 / MAX: 33.17); smt d: 4.52 (MIN: 4.03 / MAX: 45.52); no smt a: 4.54 (MIN: 4.12 / MAX: 27.91); no smt b: 4.54 (MIN: 4.09 / MAX: 55.78); smt a: 4.54 (MIN: 4.07 / MAX: 35.16); smt b: 4.54 (MIN: 4.11 / MAX: 42.14); b: 9.75 (MIN: 5.25 / MAX: 35.53); a: 9.76 (MIN: 5.03 / MAX: 28.1); c: 9.76 (MIN: 4.98 / MAX: 36.12). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU (FPS, More Is Better): smt c: 10603.84; smt d: 10587.30; no smt a: 10556.93; no smt b: 10549.24; smt b: 10549.15; smt a: 10544.03; b: 4915.42; a: 4914.17; c: 4910.13. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
RocksDB 7.9.2 - Test: Random Fill (Op/s, More Is Better): c: 536551; b: 534681; a: 533927; no smt a: 478629; no smt b: 468210; smt a: 423027; smt b: 417882; smt c: 416006; smt d: 415428. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
RocksDB 7.9.2 - Test: Update Random (Op/s, More Is Better): b: 545556; a: 544384; c: 543572; no smt a: 462018; no smt b: 452228; smt d: 420514; smt c: 419985; smt a: 413199; smt b: 411513. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
RocksDB 7.9.2 - Test: Random Fill Sync (Op/s, More Is Better): smt d: 404463; smt b: 401295; smt c: 394852; smt a: 388052; a: 376601; b: 373002; c: 356922; no smt a: 350673; no smt b: 344040. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
RocksDB 7.9.2 - Test: Read Random Write Random (Op/s, More Is Better): a: 2926458; c: 2910023; b: 2891962; no smt a: 2079804; no smt b: 2063831; smt a: 1787682; smt b: 1761416; smt d: 1752904; smt c: 1613913. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
RocksDB 7.9.2 - Test: Read While Writing (Op/s, More Is Better): smt a: 15317250; smt c: 13948914; smt b: 13830689; smt d: 13098097; a: 9296185; b: 8620352; c: 8316379; no smt b: 7913568; no smt a: 7643831. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
RocksDB 7.9.2 - Test: Random Read (Op/s, More Is Better): smt c: 1234570512; smt d: 1231916197; smt b: 1231168093; smt a: 1225662852; no smt b: 1213540299; no smt a: 1209611055; a: 468231434; c: 468069792; b: 466888888. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): no smt b: 955.34; no smt a: 956.01; smt d: 966.76; smt b: 969.21; smt a: 970.22; smt c: 971.67; b: 1116.29; a: 1116.43; c: 1117.73.
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): smt d: 97.89; no smt a: 97.40; smt b: 97.35; no smt b: 97.34; smt a: 97.29; smt c: 96.91; a: 42.96; c: 42.92; b: 42.83.
VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Fast (Frames Per Second, More Is Better): no smt b: 5.895; no smt a: 5.890; a: 5.820; b: 5.809; c: 5.808. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): smt c: 144.47; smt d: 144.91; smt b: 145.27; no smt a: 145.35; no smt b: 145.36; smt a: 145.40; c: 151.39; b: 151.60; a: 152.14.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): smt c: 662.26; smt d: 660.21; smt b: 658.34; smt a: 658.21; no smt a: 649.30; no smt b: 647.76; c: 316.63; b: 316.29; a: 314.84.
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): no smt a: 30.42; b: 30.46; c: 30.49; smt c: 30.58; a: 30.59; no smt b: 30.64; smt d: 30.72; smt a: 30.75; smt b: 30.75.
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): smt c: 3132.38; smt d: 3116.09; smt a: 3114.68; smt b: 3114.43; no smt a: 3107.30; no smt b: 3086.36; b: 1573.75; c: 1571.83; a: 1566.56.
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): no smt b: 115.64; no smt a: 116.15; smt b: 116.48; smt c: 116.97; smt a: 117.12; smt d: 117.22; a: 119.21; b: 119.38; c: 119.70.
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): smt b: 822.16; smt c: 818.60; smt a: 817.75; no smt b: 817.63; smt d: 816.76; no smt a: 813.98; a: 401.78; b: 401.21; c: 400.04.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): c: 11.16; b: 11.19; a: 11.36; smt a: 13.64; smt c: 14.20; no smt b: 14.42; no smt a: 14.46; smt b: 14.64; smt d: 15.13.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream (items/sec, More Is Better): c: 89.55; b: 89.34; a: 87.96; smt a: 73.29; smt c: 70.38; no smt b: 69.33; no smt a: 69.11; smt b: 68.29; smt d: 66.08.
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): c: 5.0573; a: 5.0715; b: 5.0859; no smt a: 5.2262; smt c: 5.2773; smt a: 5.3004; smt b: 5.3053; smt d: 5.4341; no smt b: 5.7015.
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream (items/sec, More Is Better): c: 197.66; a: 197.10; b: 196.54; no smt a: 191.26; smt c: 189.41; smt a: 188.58; smt b: 188.41; smt d: 183.95; no smt b: 175.32.
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): c: 28.47; b: 28.49; a: 28.71; smt c: 31.31; no smt a: 31.33; smt b: 31.46; smt a: 31.48; smt d: 33.05; no smt b: 33.60.
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (items/sec, More Is Better): c: 35.12; b: 35.10; a: 34.83; smt c: 31.93; no smt a: 31.91; smt b: 31.78; smt a: 31.76; smt d: 30.25; no smt b: 29.75.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): no smt a: 73.20; no smt b: 73.21; smt a: 73.31; smt c: 73.34; smt b: 73.35; smt d: 73.62; c: 76.62; b: 77.31; a: 77.34.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): smt a: 1306.67; smt b: 1306.58; smt c: 1305.71; smt d: 1301.18; no smt b: 1292.80; no smt a: 1291.97; c: 625.55; b: 619.94; a: 619.75.
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): c: 28.57; a: 28.62; b: 28.71; no smt a: 31.92; smt d: 32.18; no smt b: 32.18; smt b: 32.24; smt a: 32.30; smt c: 33.99.
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (items/sec, More Is Better): c: 34.99; a: 34.93; b: 34.82; no smt a: 31.32; smt d: 31.07; no smt b: 31.07; smt b: 31.01; smt a: 30.96; smt c: 29.42.
VP9 libvpx Encoding 1.13 - Speed: Speed 0 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): c: 14.76; a: 14.69; b: 14.67; no smt a: 14.28; smt a: 14.08; smt c: 13.97; no smt b: 13.72; smt d: 13.68; smt b: 13.57. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream c a b no smt b smt c no smt a smt a smt d smt b 5 10 15 20 25 16.52 16.58 16.67 20.42 20.49 20.74 21.03 21.36 21.74
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream c a b no smt b smt c no smt a smt a smt d smt b 14 28 42 56 70 60.49 60.25 59.92 48.93 48.77 48.18 47.51 46.78 45.96
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream smt d smt c smt b no smt a no smt b smt a c b a 11 22 33 44 55 42.48 42.55 42.65 42.81 42.84 42.95 47.29 47.35 47.46
Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream [items/sec, More Is Better]: smt d = 2255.42; smt c = 2253.14; smt b = 2245.80; smt a = 2230.31; no smt a = 2204.75; no smt b = 2203.16; c = 1013.74; b = 1012.35; a = 1009.85
Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream [ms/batch, Fewer Is Better]: no smt a = 100.24; no smt b = 100.42; smt c = 102.10; smt d = 102.76; smt a = 103.09; smt b = 103.19; a = 108.68; b = 109.11; c = 109.25
Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream [items/sec, More Is Better]: no smt a = 941.24; no smt b = 938.91; smt c = 937.99; smt d = 931.82; smt a = 929.02; smt b = 927.68; a = 440.85; b = 439.04; c = 438.44
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream [ms/batch, Fewer Is Better]: a = 10.08; b = 10.14; c = 10.14; smt a = 12.83; smt b = 12.91; smt d = 13.23; no smt a = 13.86; no smt b = 13.90; smt c = 14.25
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream [items/sec, More Is Better]: a = 99.15; b = 98.62; c = 98.59; smt a = 77.88; smt b = 77.41; smt d = 75.52; no smt a = 72.13; no smt b = 71.89; smt c = 70.14
VP9 libvpx Encoding 1.13 - Speed: Speed 5 - Input: Bosphorus 4K [Frames Per Second, More Is Better]: c = 17.46; b = 17.37; a = 17.36; smt c = 15.38; smt d = 15.11; no smt b = 14.50; no smt a = 14.48; smt b = 14.27; smt a = 13.90. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream [ms/batch, Fewer Is Better]: b = 5.1633; c = 5.1696; a = 5.1812; no smt b = 6.4174; smt b = 6.8844; smt d = 7.0640; smt c = 7.3550; smt a = 7.5225; no smt a = 7.5856
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream [items/sec, More Is Better]: b = 193.53; c = 193.31; a = 192.88; no smt b = 155.74; smt b = 145.17; smt d = 141.47; smt c = 135.89; smt a = 132.87; no smt a = 131.76
Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream [ms/batch, Fewer Is Better]: smt a = 8.1914; smt b = 8.3506; c = 8.8004; smt d = 9.3656; b = 9.5793; a = 9.5962; no smt a = 9.8899; no smt b = 9.9395; smt c = 10.3852
Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream [items/sec, More Is Better]: smt a = 121.92; smt b = 119.60; c = 113.51; smt d = 106.65; b = 104.29; a = 104.12; no smt a = 101.01; no smt b = 100.51; smt c = 96.17
Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream [ms/batch, Fewer Is Better]: a = 5.3018; b = 5.4980; c = 5.5007; no smt a = 6.0118; no smt b = 6.0650; smt c = 6.1057; smt d = 7.1490; smt a = 7.1865; smt b = 7.3054
Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream [items/sec, More Is Better]: a = 188.54; b = 181.81; c = 181.72; no smt a = 166.27; no smt b = 164.81; smt c = 163.71; smt d = 139.82; smt a = 139.10; smt b = 136.83
Stress-NG 0.15.04 - Test: Pthread [Bogo Ops/s, More Is Better]: smt b = 180407.59; c = 109609.15; a = 109397.78; b = 109356.78; smt d = 91735.75; smt c = 77834.39; smt a = 74978.51; no smt a = 68076.03; no smt b = 67451.30. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Atomic [Bogo Ops/s, More Is Better]: no smt a = 400.03; no smt b = 395.95; b = 223.29; smt b = 186.64; smt a = 184.34; smt c = 184.24; c = 183.33; smt d = 182.80; a = 174.72. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: NUMA [Bogo Ops/s, More Is Better]: b = 498.67; a = 483.90; c = 478.05; smt a = 24.82; smt c = 24.79; smt b = 24.77; smt d = 24.71; no smt a = 20.51; no smt b = 19.78. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Context Switching [Bogo Ops/s, More Is Better]: no smt a = 47222624.49; no smt b = 44683906.68; a = 18941003.97; c = 16862185.54; b = 16313126.86; smt a = 12895047.73; smt d = 12728401.45; smt c = 12328933.65; smt b = 12221058.34. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Forking [Bogo Ops/s, More Is Better]: c = 64299.62; b = 58664.97; a = 58156.36; no smt b = 45685.28; no smt a = 43094.96; smt a = 36266.02; smt b = 36020.16; smt c = 34917.39; smt d = 34685.85. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Semaphores [Bogo Ops/s, More Is Better]: smt a = 20047519.05; smt b = 19927313.52; smt c = 19866440.86; smt d = 19842969.28; a = 18128283.29; b = 18100474.36; c = 18088584.67; no smt b = 13192391.85; no smt a = 13141129.68. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Crypto [Bogo Ops/s, More Is Better]: smt c = 468159.21; smt d = 468006.00; smt b = 466609.22; smt a = 466292.12; no smt b = 437065.75; no smt a = 435615.35; b = 203147.15; c = 203095.44; a = 203073.15. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Poll [Bogo Ops/s, More Is Better]: smt d = 15403597.24; smt a = 15359471.73; smt b = 15341564.16; smt c = 15228320.07; c = 12676101.41; b = 12661687.46; a = 12653709.64; no smt a = 10458275.24; no smt b = 10393402.01. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: CPU Cache [Bogo Ops/s, More Is Better]: c = 97.04; a = 77.51; b = 67.21; no smt b = 55.84; smt a = 47.58; smt b = 44.38; smt c = 42.55; no smt a = 40.94; smt d = 40.56. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Memory Copying [Bogo Ops/s, More Is Better]: b = 20340.40; c = 20297.91; a = 20106.51; smt a = 15914.10; smt b = 15430.18; smt c = 15077.69; smt d = 13370.33; no smt a = 11342.69; no smt b = 10949.90. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Socket Activity [Bogo Ops/s, More Is Better]: no smt a = 8968.28; no smt b = 8924.10; c = 8876.10; a = 8873.58; smt d = 8864.78; b = 8851.65; smt b = 8750.61; smt a = 8748.99; smt c = 8747.83. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Malloc [Bogo Ops/s, More Is Better]: smt d = 640853365.54; smt c = 639757070.52; smt b = 634995429.66; smt a = 634718750.41; no smt a = 456657338.84; no smt b = 456651508.04; b = 314418461.02; c = 313768771.26; a = 312709034.05. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: MEMFD [Bogo Ops/s, More Is Better]: smt b = 564.05; a = 518.46; c = 507.97; b = 507.74; smt a = 464.70; smt c = 413.34; smt d = 394.49; no smt b = 308.67; no smt a = 303.56. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Futex [Bogo Ops/s, More Is Better]: no smt b = 3802292.60; no smt a = 3746361.52; b = 2805836.52; a = 2794694.37; c = 2794473.75; smt b = 2396325.93; smt a = 2333781.39; smt d = 2077119.76; smt c = 2067499.58. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Hash [Bogo Ops/s, More Is Better]: smt b = 41989522.68; smt c = 41972765.93; smt a = 41966139.67; smt d = 41965286.06; no smt b = 27422305.53; no smt a = 27408413.10; c = 18961773.18; b = 18955118.10; a = 18954936.00. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Matrix Math [Bogo Ops/s, More Is Better]: smt c = 952461.48; smt d = 951700.17; smt b = 946660.88; smt a = 946032.88; no smt a = 932248.18; no smt b = 925984.24; c = 382328.50; a = 382305.69; b = 382304.04. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: CPU Stress [Bogo Ops/s, More Is Better]: smt c = 490300.48; smt d = 489998.75; smt b = 487987.32; smt a = 487359.39; no smt a = 328297.04; no smt b = 326819.72; c = 217304.19; b = 217072.27; a = 205134.87. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: MMAP [Bogo Ops/s, More Is Better]: smt b = 9017.12; smt a = 8360.76; smt c = 7633.16; smt d = 7273.19; no smt a = 4520.48; no smt b = 3591.71; c = 1668.92; b = 1664.75; a = 1663.08. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: System V Message Passing [Bogo Ops/s, More Is Better]: smt d = 12418451.71; c = 10475486.39; a = 10473084.09; b = 10471889.97; smt a = 10103952.13; smt c = 8609357.68; smt b = 8586858.51; no smt a = 7402514.74; no smt b = 7372780.72. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Vector Math [Bogo Ops/s, More Is Better]: smt d = 1300693.93; smt c = 1295970.28; smt b = 1291704.57; smt a = 1291689.33; no smt b = 920642.34; no smt a = 920216.27; a = 556875.59; c = 556833.50; b = 556797.27. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: SENDFILE [Bogo Ops/s, More Is Better]: smt d = 4351920.42; smt c = 4351871.12; smt a = 4329963.26; smt b = 4329824.38; no smt a = 3284433.20; no smt b = 3282827.50; a = 1950323.96; b = 1913590.77; c = 1891202.37. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Function Call [Bogo Ops/s, More Is Better]: smt d = 1425621.44; smt c = 1422505.51; smt a = 1414423.65; smt b = 1413555.44; no smt b = 829445.38; no smt a = 829106.63; c = 621041.93; a = 621015.34; b = 621003.92. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Glibc C String Functions [Bogo Ops/s, More Is Better]: smt d = 36111781.82; smt b = 35996976.13; smt a = 35687958.37; smt c = 34602827.77; no smt b = 28009942.56; no smt a = 26755146.69; c = 16537965.14; a = 16257100.25; b = 16168655.17. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Glibc Qsort Data Sorting [Bogo Ops/s, More Is Better]: smt a = 2564.48; smt b = 2520.38; smt c = 2517.26; smt d = 2516.09; no smt a = 1978.80; no smt b = 1963.75; b = 1132.35; c = 1125.86; a = 1122.39. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
Stress-NG 0.15.04 - Test: Mutex [Bogo Ops/s, More Is Better]: smt c = 138939958.50; smt d = 137579874.56; smt a = 136339636.91; smt b = 135855921.28; no smt b = 65500297.73; no smt a = 63581395.67; b = 60031543.97; c = 59929401.40; a = 59479783.64. (CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread
VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Faster [Frames Per Second, More Is Better]: no smt a = 12.48; c = 12.40; b = 12.39; a = 12.34; no smt b = 12.31. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
VVenC 1.7 - Video Input: Bosphorus 1080p - Video Preset: Fast [Frames Per Second, More Is Better]: c = 12.44; a = 12.44; b = 12.40; no smt b = 12.37; no smt a = 12.36. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
GROMACS 2023 - Implementation: MPI CPU - Input: water_GMX50_bare [Ns Per Day, More Is Better]: no smt a = 19.18; smt d = 19.08; smt c = 19.05; smt b = 19.02; no smt b = 18.84; smt a = 18.82; b = 10.61; c = 10.59; a = 10.57. (CXX) g++ options: -O3
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU [ms, Fewer Is Better]: b = 7.28243 (MIN: 6.58); a = 7.35746 (MIN: 4.85); c = 7.38116 (MIN: 6.77); no smt a = 9.25292 (MIN: 7.79); no smt b = 9.31751 (MIN: 8.07); smt a = 20.26950 (MIN: 17.68); smt b = 20.57200 (MIN: 18.25); smt d = 20.97260 (MIN: 18.21); smt c = 21.03010 (MIN: 17.96). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU [ms, Fewer Is Better]: no smt a = 1.65041 (MIN: 1.41); no smt b = 1.70057 (MIN: 1.5); c = 1.82264 (MIN: 1.71); a = 1.83147 (MIN: 1.73); b = 1.84172 (MIN: 1.76); smt d = 2.67303 (MIN: 2.09); smt c = 2.70235 (MIN: 2.27); smt a = 2.70807 (MIN: 2.09); smt b = 2.76140 (MIN: 2.46). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU [ms, Fewer Is Better]: b = 0.551966 (MIN: 0.49); a = 0.555277 (MIN: 0.53); c = 0.556851 (MIN: 0.53); no smt a = 0.664359 (MIN: 0.56); no smt b = 0.670574 (MIN: 0.55); smt b = 1.181350 (MIN: 1.08); smt a = 1.200490 (MIN: 1.04); smt c = 1.201460 (MIN: 1.08); smt d = 1.233400 (MIN: 1.08). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
VP9 libvpx Encoding 1.13 - Speed: Speed 5 - Input: Bosphorus 1080p [Frames Per Second, More Is Better]: no smt b = 29.71; smt d = 29.66; no smt a = 29.62; smt b = 29.60; smt c = 29.54; smt a = 29.54; b = 29.50; c = 29.46; a = 29.37. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
Timed Linux Kernel Compilation 6.1 - Build: defconfig [Seconds, Fewer Is Better]: smt b = 17.02; smt c = 17.05; smt d = 17.08; smt a = 17.38; no smt b = 17.59; no smt a = 17.61; c = 25.68; b = 25.75; a = 25.77
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Slow [Frames Per Second, More Is Better]: smt c = 46.34; smt d = 46.25; smt b = 46.25; smt a = 45.56; no smt b = 34.68; no smt a = 34.59; c = 29.41; a = 29.37; b = 29.29
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Medium [Frames Per Second, More Is Better]: smt a = 47.56; smt b = 46.62; smt c = 46.57; smt d = 46.21; no smt b = 38.65; no smt a = 38.38; b = 33.13; c = 33.10; a = 33.03
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU [ms, Fewer Is Better]: c = 1.88686 (MIN: 1.68); b = 1.92340 (MIN: 1.72); no smt b = 1.93126 (MIN: 1.77); a = 2.01581 (MIN: 1.81); no smt a = 2.04003 (MIN: 1.78); smt b = 7.42270 (MIN: 6.44); smt a = 7.60566 (MIN: 6.54); smt d = 8.26078 (MIN: 7.09); smt c = 9.34981 (MIN: 7.73). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU [ms, Fewer Is Better]: c = 3.83017 (MIN: 2.72); no smt a = 6.19257 (MIN: 3.65); no smt b = 7.98861 (MIN: 3.88); a = 8.74576 (MIN: 3.65); b = 9.19570 (MIN: 4.13); smt d = 11.75730 (MIN: 8.17); smt a = 15.42100 (MIN: 10.4); smt c = 18.65940 (MIN: 11.23); smt b = 19.54900 (MIN: 10.67). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU [ms, Fewer Is Better]: a = 3.11497 (MIN: 2.47); b = 4.27858 (MIN: 3.37); c = 4.39720 (MIN: 3.21); no smt b = 4.79302 (MIN: 3.4); no smt a = 4.87883 (MIN: 3.53); smt b = 7.30147 (MIN: 5.77); smt a = 11.43260 (MIN: 7.59); smt c = 12.01300 (MIN: 8.02); smt d = 12.15710 (MIN: 8.52). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
VVenC 1.7 - Video Input: Bosphorus 1080p - Video Preset: Faster [Frames Per Second, More Is Better]: c = 30.08; a = 30.04; b = 29.85; no smt a = 28.69; no smt b = 28.52. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU [ms, Fewer Is Better]: c = 0.285432 (MIN: 0.24); a = 0.290200 (MIN: 0.25); b = 0.291387 (MIN: 0.23); no smt a = 0.316779 (MIN: 0.25); no smt b = 0.340498 (MIN: 0.28); smt b = 0.620907 (MIN: 0.41); smt a = 0.621776 (MIN: 0.45); smt d = 0.627133 (MIN: 0.4); smt c = 0.634992 (MIN: 0.55). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU [ms, Fewer Is Better]: no smt a = 0.342836 (MIN: 0.28); no smt b = 0.347231 (MIN: 0.3); c = 0.353175 (MIN: 0.31); b = 0.356875 (MIN: 0.31); a = 0.357017 (MIN: 0.31); smt d = 0.672115 (MIN: 0.49); smt a = 0.674455 (MIN: 0.49); smt b = 0.675664 (MIN: 0.55); smt c = 0.676124 (MIN: 0.49). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU [ms, Fewer Is Better]: c = 0.262920 (MIN: 0.2); b = 0.263373 (MIN: 0.2); a = 0.263997 (MIN: 0.18); no smt b = 0.293564 (MIN: 0.22); no smt a = 0.316665 (MIN: 0.24); smt a = 0.591738 (MIN: 0.43); smt c = 0.593299 (MIN: 0.39); smt b = 0.600199 (MIN: 0.38); smt d = 0.602831 (MIN: 0.48). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Slow [Frames Per Second, More Is Better]: smt b = 66.22; smt d = 66.08; smt c = 65.49; smt a = 64.63; no smt b = 47.43; no smt a = 47.13; b = 40.86; c = 40.76; a = 40.63. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Medium [Frames Per Second, More Is Better]: smt b = 65.69; smt c = 65.59; smt a = 65.56; smt d = 65.49; no smt a = 47.97; no smt b = 47.93; c = 41.47; a = 41.40; b = 41.39. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Timed FFmpeg Compilation 6.0 - Time To Compile [Seconds, Fewer Is Better]: smt b = 10.26; smt d = 10.37; smt a = 10.38; smt c = 10.41; no smt a = 10.60; no smt b = 10.66; b = 12.43; c = 12.47; a = 12.57
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Very Fast [Frames Per Second, More Is Better]: c = 69.04; a = 68.82; b = 68.79; smt a = 59.30; no smt b = 58.09; smt b = 57.86; smt d = 57.52; smt c = 57.44; no smt a = 57.13
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast [Frames Per Second, More Is Better]: c = 71.13; b = 70.68; a = 70.56; smt d = 58.76; smt a = 57.96; no smt a = 57.78; smt b = 57.51; smt c = 57.33; no smt b = 57.23
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Super Fast [Frames Per Second, More Is Better]: c = 69.92; a = 69.33; b = 69.00; no smt a = 59.83; smt b = 59.20; smt a = 59.06; smt d = 58.66; smt c = 58.34; no smt b = 58.33
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU [ms, Fewer Is Better]: b = 1.56257 (MIN: 1.41); a = 1.56461 (MIN: 1.41); c = 1.56742 (MIN: 1.39); no smt b = 2.12379 (MIN: 1.9); no smt a = 2.19645 (MIN: 2); smt a = 3.39151 (MIN: 2.93); smt b = 3.49453 (MIN: 3.11); smt d = 3.50563 (MIN: 3.19); smt c = 3.60194 (MIN: 3.08). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU [ms, Fewer Is Better]: c = 1.63860 (MIN: 1.2); b = 1.72975 (MIN: 1.27); a = 1.87777 (MIN: 1.23); no smt a = 2.02286 (MIN: 1.56); no smt b = 2.02833 (MIN: 1.43); smt d = 2.97427 (MIN: 2.3); smt c = 3.36535 (MIN: 2.7); smt a = 3.38833 (MIN: 2.76); smt b = 3.59643 (MIN: 2.7). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU [ms, Fewer Is Better]: b = 0.513068 (MIN: 0.47); a = 0.534450 (MIN: 0.49); c = 0.546991 (MIN: 0.5); no smt a = 0.556069 (MIN: 0.46); no smt b = 0.629026 (MIN: 0.51); smt b = 0.971704 (MIN: 0.82); smt d = 0.975091 (MIN: 0.91); smt a = 0.975357 (MIN: 0.92); smt c = 0.995710 (MIN: 0.87). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Very Fast [Frames Per Second, More Is Better]: b = 81.61; a = 80.90; c = 80.31; smt a = 75.91; smt b = 74.11; smt c = 73.57; smt d = 73.24; no smt a = 70.02; no smt b = 69.94. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Super Fast [Frames Per Second, More Is Better]: c = 84.55; a = 81.90; b = 80.68; smt c = 78.72; smt b = 76.70; smt a = 76.64; smt d = 75.50; no smt a = 75.50; no smt b = 74.03. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast [Frames Per Second, More Is Better]: b = 84.77; a = 83.45; c = 82.70; smt d = 78.85; smt a = 77.96; smt b = 77.00; smt c = 76.14; no smt a = 76.05; no smt b = 74.37. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Slow [Frames Per Second, More Is Better]: no smt b = 88.13; no smt a = 87.15; b = 81.41; a = 81.16; c = 81.10; smt a = 80.17; smt d = 80.00; smt b = 79.95; smt c = 79.63
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Medium [Frames Per Second, More Is Better]: no smt a = 98.21; no smt b = 96.67; b = 91.47; a = 91.37; c = 91.29; smt b = 89.70; smt a = 89.40; smt c = 89.16; smt d = 88.91
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU [ms, Fewer Is Better]: no smt a = 0.254845 (MIN: 0.18); no smt b = 0.255796 (MIN: 0.18); a = 0.311405 (MIN: 0.3); c = 0.312289 (MIN: 0.28); b = 0.312721 (MIN: 0.28); smt d = 0.389223 (MIN: 0.29); smt c = 0.394377 (MIN: 0.29); smt a = 0.407430 (MIN: 0.27); smt b = 0.408074 (MIN: 0.27). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU [ms, Fewer Is Better]: no smt b = 0.400325 (MIN: 0.38); no smt a = 0.403590 (MIN: 0.38); smt a = 0.538539 (MIN: 0.48); smt c = 0.545118 (MIN: 0.45); smt b = 0.551783 (MIN: 0.49); smt d = 0.630630 (MIN: 0.48); b = 0.689064 (MIN: 0.64); c = 0.692667 (MIN: 0.65); a = 0.693367 (MIN: 0.64). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU [ms, Fewer Is Better]: no smt b = 0.291229 (MIN: 0.27); no smt a = 0.291687 (MIN: 0.27); a = 0.415446 (MIN: 0.4); c = 0.417188 (MIN: 0.4); smt c = 0.447820 (MIN: 0.37); smt b = 0.450261 (MIN: 0.37); smt a = 0.451472 (MIN: 0.4); smt d = 0.457554 (MIN: 0.34); b = 0.578466 (MIN: 0.4). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Slow [Frames Per Second, More Is Better]: no smt b = 159.79; no smt a = 155.56; c = 140.10; a = 139.92; b = 139.07; smt b = 136.33; smt c = 135.80; smt d = 135.56; smt a = 132.67. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Medium [Frames Per Second, More Is Better]: no smt b = 161.40; no smt a = 159.63; b = 144.21; c = 143.81; a = 143.31; smt c = 140.89; smt a = 140.78; smt b = 138.33; smt d = 136.42. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU [ms, Fewer Is Better]: no smt b = 0.651301 (MIN: 0.62); no smt a = 0.664295 (MIN: 0.63); smt c = 0.972863 (MIN: 0.92); smt b = 0.973555 (MIN: 0.92); smt d = 0.973984 (MIN: 0.92); smt a = 0.978590 (MIN: 0.93); b = 1.166960 (MIN: 1.07); c = 1.169190 (MIN: 1.07); a = 1.170040 (MIN: 1.07). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU [ms, Fewer Is Better]: no smt b = 0.461174 (MIN: 0.43); no smt a = 0.462386 (MIN: 0.42); smt a = 0.672585 (MIN: 0.52); smt b = 0.672768 (MIN: 0.53); smt c = 0.676798 (MIN: 0.53); smt d = 0.680395 (MIN: 0.53); c = 0.708563 (MIN: 0.67); a = 0.711588 (MIN: 0.68); b = 0.712495 (MIN: 0.68). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU [ms, Fewer Is Better]: no smt a = 0.160361 (MIN: 0.15); no smt b = 0.164845 (MIN: 0.15); smt d = 0.247825 (MIN: 0.23); smt c = 0.275917 (MIN: 0.23); smt a = 0.279638 (MIN: 0.23); smt b = 0.291087 (MIN: 0.23); a = 0.304972 (MIN: 0.28); c = 0.305599 (MIN: 0.28); b = 0.305731 (MIN: 0.28). (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Very Fast [Frames Per Second, More Is Better]: c = 237.44; a = 234.95; b = 234.68; smt b = 209.55; no smt a = 196.25; smt c = 192.25; smt a = 183.99; no smt b = 183.75; smt d = 181.83
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Super Fast [Frames Per Second, More Is Better]: b = 239.92; a = 238.68; c = 237.96; smt b = 220.79; smt d = 218.91; no smt b = 216.15; no smt a = 181.67; smt c = 178.78; smt a = 178.49
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast [Frames Per Second, More Is Better]: a = 240.98; b = 240.91; c = 238.77; smt d = 224.31; smt c = 218.28; smt a = 216.15; no smt b = 209.85; no smt a = 186.24; smt b = 179.56
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Very Fast [Frames Per Second, More Is Better]: a = 296.93; c = 291.03; b = 290.59; smt b = 274.63; smt d = 270.76; no smt a = 269.62; no smt b = 268.77; smt a = 250.71; smt c = 243.36. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Super Fast [Frames Per Second, More Is Better]: a = 307.49; b = 303.99; c = 301.40; smt a = 296.72; no smt a = 288.41; smt c = 288.09; no smt b = 280.18; smt b = 267.04; smt d = 256.45. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast [Frames Per Second, More Is Better]: a = 310.73; c = 309.64; b = 305.52; smt d = 303.63; smt c = 303.43; smt a = 302.55; smt b = 295.18; no smt a = 278.62; no smt b = 271.45. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Phoronix Test Suite v10.8.4