2 x AMD EPYC 9654 96-Core testing with an AMD Titanite_4G (RTI1004D BIOS) and llvmpipe on Red Hat Enterprise Linux 9.1 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2303114-NE-9654NEW5019
HTML result view exported from: https://openbenchmarking.org/result/2303114-NE-9654NEW5019&sro&grr .
9654 new
Configurations: a, b, c, no smt a, no smt b, smt a, smt b, smt c, smt d
System details (a, b, c):
  Processor:          AMD EPYC 9654 96-Core @ 2.40GHz (96 Cores / 192 Threads)
  Motherboard:        AMD Titanite_4G (RTI1004D BIOS)
  Chipset:            AMD Device 14a4
  Memory:             768GB
  Disk:               2 x 1920GB SAMSUNG MZWLJ1T9HBJR-00007
  Graphics:           ASPEED
  Monitor:            VGA HDMI
  Network:            Broadcom NetXtreme BCM5720 PCIe
  OS:                 Red Hat Enterprise Linux 9.1
  Kernel:             5.14.0-162.6.1.el9_1.x86_64 (x86_64)
  Desktop:            GNOME Shell 40.10
  Display Server:     X Server 1.20.11
  Compiler:           GCC 11.3.1 20220421
  File-System:        xfs
  Screen Resolution:  1600x1200
Differences for no smt a, no smt b:
  Processor:          2 x AMD EPYC 9654 96-Core @ 2.40GHz (192 Cores)
  Memory:             1520GB
  Graphics:           llvmpipe
  OpenGL:             4.5 Mesa 22.1.5 (LLVM 14.0.6 256 bits)
  Screen Resolution:  1024x768
Differences for smt a, smt b, smt c, smt d:
  Processor:          2 x AMD EPYC 9654 96-Core @ 2.40GHz (192 Cores / 384 Threads)
Kernel Details
  - Transparent Huge Pages: always
Compiler Details
  - --build=x86_64-redhat-linux --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-host-bind-now --enable-host-pie --enable-initfini-array --enable-languages=c,c++,fortran,lto --enable-link-serialization=1 --enable-multilib --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=x86-64 --with-arch_64=x86-64-v2 --with-build-config=bootstrap-lto --with-gcc-major-version-only --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver --without-isl
Processor Details
  - Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
  - CPU Microcode: 0xa101111
Python Details
  - Python 3.9.14
Security Details
  - a, b, c, smt a, smt b, smt c, smt d: SELinux + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
  - no smt a, no smt b: SELinux + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: disabled RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
9654 new rocksdb: Seq Fill openvkl: vklBenchmark Scalar openvkl: vklBenchmark ISPC clickhouse: 100M Rows Hits Dataset, Third Run clickhouse: 100M Rows Hits Dataset, Second Run clickhouse: 100M Rows Hits Dataset, First Run / Cold Cache vpxenc: Speed 0 - Bosphorus 4K onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU compress-zstd: 19, Long Mode - Decompression Speed compress-zstd: 19, Long Mode - Compression Speed memcached: 1:100 memcached: 1:10 memcached: 1:5 openvino: Face Detection FP16 - CPU openvino: Face Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Person Detection FP16 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Face Detection FP16-INT8 - CPU openvino: Person Detection FP32 - CPU openvino: Person Detection FP32 - CPU deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream compress-zstd: 19 - Decompression Speed compress-zstd: 19 - Compression Speed deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream openvino: Person Vehicle Bike Detection FP16 - CPU openvino: Person Vehicle Bike Detection FP16 - CPU compress-zstd: 8, Long Mode - Decompression Speed compress-zstd: 8, Long Mode - Compression Speed compress-zstd: 12 - Decompression Speed compress-zstd: 12 - Compression Speed compress-zstd: 3, Long Mode - Decompression Speed compress-zstd: 3, Long Mode - Compression Speed compress-zstd: 8 - Decompression Speed compress-zstd: 8 - Compression Speed openvino: Machine Translation EN To DE 
FP16 - CPU openvino: Machine Translation EN To DE FP16 - CPU compress-zstd: 3 - Decompression Speed compress-zstd: 3 - Compression Speed openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Weld Porosity Detection FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16-INT8 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Vehicle Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU openvino: Weld Porosity Detection FP16 - CPU rocksdb: Rand Fill rocksdb: Update Rand rocksdb: Rand Fill Sync rocksdb: Read Rand Write Rand rocksdb: Read While Writing rocksdb: Rand Read deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream vvenc: Bosphorus 4K - Fast deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream deepsparse: NLP Sentiment Analysis, 80% 
Pruned Quantized BERT Base Uncased - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream vpxenc: Speed 0 - Bosphorus 1080p deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream vpxenc: Speed 5 - Bosphorus 4K deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream stress-ng: Pthread stress-ng: Atomic stress-ng: NUMA stress-ng: Context Switching stress-ng: Forking stress-ng: Semaphores stress-ng: Crypto stress-ng: Poll stress-ng: CPU Cache stress-ng: Memory Copying 
stress-ng: Socket Activity stress-ng: Malloc stress-ng: MEMFD stress-ng: Futex stress-ng: Hash stress-ng: Matrix Math stress-ng: CPU Stress stress-ng: MMAP stress-ng: System V Message Passing stress-ng: Vector Math stress-ng: SENDFILE stress-ng: Function Call stress-ng: Glibc C String Functions stress-ng: Glibc Qsort Data Sorting stress-ng: Mutex vvenc: Bosphorus 4K - Faster vvenc: Bosphorus 1080p - Fast gromacs: MPI CPU - water_GMX50_bare onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU vpxenc: Speed 5 - Bosphorus 1080p build-linux-kernel: defconfig uvg266: Bosphorus 4K - Slow uvg266: Bosphorus 4K - Medium onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 1D - bf16bf16bf16 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU vvenc: Bosphorus 1080p - Faster onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU kvazaar: Bosphorus 4K - Slow kvazaar: Bosphorus 4K - Medium build-ffmpeg: Time To Compile uvg266: Bosphorus 4K - Very Fast uvg266: Bosphorus 4K - Ultra Fast uvg266: Bosphorus 4K - Super Fast onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 3D - bf16bf16bf16 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU kvazaar: Bosphorus 4K - Very Fast kvazaar: Bosphorus 4K - Super Fast kvazaar: Bosphorus 4K - Ultra Fast uvg266: Bosphorus 1080p - Slow uvg266: Bosphorus 1080p - Medium onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU kvazaar: Bosphorus 1080p - Slow kvazaar: Bosphorus 1080p - Medium onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU uvg266: Bosphorus 1080p - Very Fast uvg266: Bosphorus 1080p - Super 
Fast uvg266: Bosphorus 1080p - Ultra Fast kvazaar: Bosphorus 1080p - Very Fast kvazaar: Bosphorus 1080p - Super Fast kvazaar: Bosphorus 1080p - Ultra Fast embree: Pathtracer - Crown a b c no smt a no smt b smt a smt b smt c smt d 542256 556 1089 625.16 625.91 600.70 7.68 671.69 675.915 674.088 907.489 909.691 911.245 1383.7 9.22 4878860 6861088.89 4143044.21 962.1 49.72 1671.46 28.47 499.32 95.76 1701.7 27.97 319.9815 149.7446 1472.6 19.1 1115.7602 42.9893 8.34 5750.74 1677.9 938.5 1704.8 330.8 1536.6 892.7 1669.3 1233.8 88.71 540.41 1514.8 3033.9 10.1 9494.22 0.65 112346.36 1.09 74495.17 8.11 5908.89 16.3 2941.47 9.76 4914.17 533927 544384 376601 2926458 9296185 468231434 1116.426 42.9581 5.82 152.1377 314.8361 30.5924 1566.564 119.2067 401.7766 11.362 87.9631 5.0715 197.1018 28.7077 34.8262 77.3363 619.7531 28.6233 34.9289 14.69 16.5828 60.2522 47.464 1009.8486 108.6775 440.8533 10.0811 99.1493 17.36 5.1812 192.8806 9.5962 104.1178 5.3018 188.5378 109397.78 174.72 483.9 18941003.97 58156.36 18128283.29 203073.15 12653709.64 77.51 20106.51 8873.58 312709034.05 518.46 2794694.37 18954936 382305.69 205134.87 1663.08 10473084.09 556875.59 1950323.96 621015.34 16257100.25 1122.39 59479783.64 12.336 12.44 10.569 7.35746 1.83147 0.555277 29.37 25.768 29.37 33.03 2.01581 8.74576 3.11497 30.039 0.2902 0.357017 0.263997 40.63 41.4 12.569 68.82 70.56 69.33 1.56461 1.87777 0.53445 80.9 81.9 83.45 81.16 91.37 0.311405 0.693367 0.415446 139.92 143.31 1.17004 0.711588 0.304972 234.95 238.68 240.98 296.93 307.49 310.73 545396 557 1075 628.37 627.20 610.79 7.71 670.202 668.952 671.654 903.272 917.077 912.114 1378.2 9.29 4876951.36 6792746.34 4162812.47 962.63 49.72 1675.96 28.42 500.83 95.48 1685.26 28.23 319.9648 149.8207 1467.8 19.1 1118.4383 42.897 8.38 5720.9 1673.7 910.8 1716.7 332 1537.5 909.4 1651 1239.3 88.61 541.03 1516.6 3095 10.1 9496.95 0.66 111378.39 1.09 74353.41 8.13 5894.27 15.54 3085.53 9.75 4915.42 534681 545556 373002 2891962 8620352 466888888 1116.2899 42.8348 
5.809 151.6003 316.2907 30.4608 1573.7496 119.3799 401.2053 11.1872 89.3355 5.0859 196.5414 28.4866 35.0962 77.305 619.9442 28.7137 34.8189 14.67 16.6745 59.9222 47.3532 1012.3494 109.1129 439.0373 10.136 98.6175 17.37 5.1633 193.5305 9.5793 104.2908 5.498 181.8123 109356.78 223.29 498.67 16313126.86 58664.97 18100474.36 203147.15 12661687.46 67.21 20340.4 8851.65 314418461.02 507.74 2805836.52 18955118.1 382304.04 217072.27 1664.75 10471889.97 556797.27 1913590.77 621003.92 16168655.17 1132.35 60031543.97 12.391 12.399 10.609 7.28243 1.84172 0.551966 29.5 25.749 29.29 33.13 1.9234 9.1957 4.27858 29.845 0.291387 0.356875 0.263373 40.86 41.39 12.434 68.79 70.68 69 1.56257 1.72975 0.513068 81.61 80.68 84.77 81.41 91.47 0.312721 0.689064 0.578466 139.07 144.21 1.16696 0.712495 0.305731 234.68 239.92 240.91 290.59 303.99 305.52 544565 549 1066 636.18 621.88 614.24 7.7 671.233 667.95 669.146 906.139 911.762 915.084 1384 9.3 4852421.67 6839975.87 4155273.46 962.15 49.77 1679.03 28.33 500.3 95.59 1704.02 27.89 320.0504 149.6905 1470.5 19.1 1122.0584 42.7433 8.37 5724.81 1675.4 926.9 1715.6 330.1 1540.9 916.9 1661.9 1241.1 89.46 535.78 1517.4 3049.4 10.11 9485.65 0.66 112186.25 1.09 74486.08 8.13 5898.79 15.1 3174.04 9.76 4910.13 536551 543572 356922 2910023 8316379 468069792 1117.7323 42.9152 5.808 151.3899 316.6282 30.4911 1571.8281 119.7026 400.0427 11.1605 89.547 5.0573 197.6601 28.4687 35.1188 76.6181 625.5512 28.5698 34.9937 14.76 16.5185 60.4884 47.2942 1013.7357 109.2467 438.4372 10.1384 98.5922 17.46 5.1696 193.3119 8.8004 113.5057 5.5007 181.7243 109609.15 183.33 478.05 16862185.54 64299.62 18088584.67 203095.44 12676101.41 97.04 20297.91 8876.1 313768771.26 507.97 2794473.75 18961773.18 382328.5 217304.19 1668.92 10475486.39 556833.5 1891202.37 621041.93 16537965.14 1125.86 59929401.4 12.399 12.441 10.587 7.38116 1.82264 0.556851 29.46 25.68 29.41 33.1 1.88686 3.83017 4.3972 30.081 0.285432 0.353175 0.26292 40.76 41.47 12.465 69.04 71.13 69.92 1.56742 1.6386 
0.546991 80.31 84.55 82.7 81.1 91.29 0.312289 0.692667 0.417188 140.1 143.81 1.16919 0.708563 0.305599 237.44 237.96 238.77 291.03 301.4 309.64 465700 647 1098 665.00 666.43 635.48 7.04 913.201 973.318 930.257 1148.26 1147.34 1151.25 1395.8 9.39 3192061.68 4244908.83 2444886.82 438.29 109.08 828.67 57.58 229.46 208.82 833.99 57.18 304.2075 308.4294 1483.1 19.8 955.1631 97.4442 4.9 9784.09 1684.8 859.9 1728 279.9 1542.5 955.6 1669.7 1227.5 45.86 1045.81 1500.4 2804.2 9.18 20836.1 0.57 160545.41 1.01 127833.52 3.95 12119.66 6.01 7979.05 4.54 10556.93 478629 462018 350673 2079804 7643831 1209611055 956.0088 97.3996 5.89 145.3548 649.302 30.4213 3107.2957 116.1532 813.9832 14.4638 69.1063 5.2262 191.2605 31.327 31.9146 73.1993 1291.9733 31.9229 31.3187 14.28 20.7385 48.1849 42.805 2204.7533 100.2418 941.2443 13.8559 72.1298 14.48 7.5856 131.7586 9.8899 101.0095 6.0118 166.2725 68076.03 400.03 20.51 47222624.49 43094.96 13141129.68 435615.35 10458275.24 40.94 11342.69 8968.28 456657338.84 303.56 3746361.52 27408413.1 932248.18 328297.04 4520.48 7402514.74 920216.27 3284433.2 829106.63 26755146.69 1978.8 63581395.67 12.477 12.357 19.175 9.25292 1.65041 0.664359 29.62 17.606 34.59 38.38 2.04003 6.19257 4.87883 28.689 0.316779 0.342836 0.316665 47.13 47.97 10.597 57.13 57.78 59.83 2.19645 2.02286 0.556069 70.02 75.5 76.05 87.15 98.21 0.254845 0.40359 0.291687 155.56 159.63 0.664295 0.462386 0.160361 196.25 181.67 186.24 269.62 288.41 278.62 464044 652 1108 662.01 649.52 622.73 7.16 903.358 939.547 901.044 1124.75 1139.63 1119.17 1397.6 9.76 3216133.91 4220522.49 2460416.33 438.99 109.04 830.04 57.49 228.01 210.14 834.17 57.19 303.7377 308.3033 1483.8 19.8 955.2386 97.3838 4.91 9758.74 1682.8 892 1727.8 278.2 1538.8 1032.3 1667.6 1234.2 45.73 1048.74 1511.5 2865.9 9.18 20849.75 0.58 162294.47 1.01 127770.39 3.95 12128.38 6.01 7979.82 4.54 10549.24 468210 452228 344040 2063831 7913568 1213540299 955.3438 97.3441 5.895 145.3627 647.7596 30.6403 3086.3591 115.6369 817.6313 
14.4178 69.3266 5.7015 175.3199 33.6012 29.7544 73.2087 1292.8038 32.1778 31.07 13.72 20.4223 48.9257 42.8393 2203.1573 100.4198 938.9084 13.9039 71.8878 14.5 6.4174 155.7357 9.9395 100.509 6.065 164.8126 67451.3 395.95 19.78 44683906.68 45685.28 13192391.85 437065.75 10393402.01 55.84 10949.9 8924.1 456651508.04 308.67 3802292.6 27422305.53 925984.24 326819.72 3591.71 7372780.72 920642.34 3282827.5 829445.38 28009942.56 1963.75 65500297.73 12.31 12.374 18.837 9.31751 1.70057 0.670574 29.71 17.586 34.68 38.65 1.93126 7.98861 4.79302 28.52 0.340498 0.347231 0.293564 47.43 47.93 10.662 58.09 57.23 58.33 2.12379 2.02833 0.629026 69.94 74.03 74.37 88.13 96.67 0.255796 0.400325 0.291229 159.79 161.4 0.651301 0.461174 0.164845 183.75 216.15 209.85 268.77 280.18 271.45 414168 764 1318 534.82 524.04 500.13 7.24 3269.05 3133.65 3187 1912.16 1888.1 1830.41 1393.5 9.78 4383314.65 4530948.62 2520253 439.41 108.93 827.07 57.72 230.21 208.13 833.15 57.29 303.3272 314.6186 1495.2 18.8 972.5163 97.0249 4.92 9746.08 1682.5 853.8 1723.7 254.1 1539.5 1051.4 1664.3 1023.9 45.71 1049.16 1515.6 2795.1 9.28 20610.23 0.53 173926.92 0.97 148316.88 3.95 12117.16 6.02 7953.21 4.54 10544.03 423027 413199 388052 1787682 15317250 1225662852 970.2202 97.2916 145.3966 658.2111 30.7462 3114.6848 117.1233 817.7536 13.637 73.2924 5.3004 188.5826 31.4805 31.7592 73.306 1306.6677 32.2956 30.9572 14.08 21.0323 47.506 42.9515 2230.3099 103.0888 929.0246 12.8322 77.8827 13.9 7.5225 132.8655 8.1914 121.9232 7.1865 139.0973 74978.51 184.34 24.82 12895047.73 36266.02 20047519.05 466292.12 15359471.73 47.58 15914.1 8748.99 634718750.41 464.7 2333781.39 41966139.67 946032.88 487359.39 8360.76 10103952.13 1291689.33 4329963.26 1414423.65 35687958.37 2564.48 136339636.91 18.818 20.2695 2.70807 1.20049 29.54 17.376 45.56 47.56 7.60566 15.421 11.4326 0.621776 0.674455 0.591738 64.63 65.56 10.381 59.3 57.96 59.06 3.39151 3.38833 0.975357 75.91 76.64 77.96 80.17 89.4 0.40743 0.538539 0.451472 132.67 140.78 0.97859 
0.672585 0.279638 183.99 178.49 216.15 250.71 296.72 302.55 413708 810 1243 538.63 527.88 516.09 6.97 3249.28 3222.93 3034.62 1890.28 1894.07 1937.65 1392.4 9.87 4357453.76 4392934.75 2507331.29 439.5 108.89 827.03 57.67 230.06 208.19 832.7 57.31 304.0782 314.2452 1483.5 19.5 975.504 96.906 4.95 9680.44 1676.5 852.7 1726.3 259.1 1540.4 1038.9 1671 1122.4 45.66 1050.37 1519.1 2513.4 9.28 20610.62 0.53 175754.35 0.97 147582.87 3.96 12109.64 6.03 7949.05 4.54 10549.15 417882 411513 401295 1761416 13830689 1231168093 969.2122 97.3528 145.2745 658.341 30.7508 3114.4293 116.4813 822.1552 14.6356 68.2924 5.3053 188.4147 31.4593 31.7806 73.3496 1306.5839 32.2415 31.0093 13.57 21.7398 45.9642 42.6536 2245.8036 103.1902 927.679 12.9106 77.4096 14.27 6.8844 145.1736 8.3506 119.5964 7.3054 136.8324 180407.59 186.64 24.77 12221058.34 36020.16 19927313.52 466609.22 15341564.16 44.38 15430.18 8750.61 634995429.66 564.05 2396325.93 41989522.68 946660.88 487987.32 9017.12 8586858.51 1291704.57 4329824.38 1413555.44 35996976.13 2520.38 135855921.28 19.015 20.572 2.7614 1.18135 29.6 17.017 46.25 46.62 7.4227 19.549 7.30147 0.620907 0.675664 0.600199 66.22 65.69 10.259 57.86 57.51 59.2 3.49453 3.59643 0.971704 74.11 76.7 77 79.95 89.7 0.408074 0.551783 0.450261 136.33 138.33 0.973555 0.672768 0.291087 209.55 220.79 179.56 274.63 267.04 295.18 414282 793 1235 530.47 524.75 495.85 6.92 3153.22 3185.64 3011.21 1888.21 1925.68 1926.15 1389 9.82 4394194.49 4549313.42 2517750.47 437.51 109.36 821.93 58.05 227.93 210.19 826.67 57.71 303.2549 315.3318 1475.2 19.2 971.8988 96.9975 4.9 9773.26 1683.7 860.8 1732.3 249.8 1540.9 1046.6 1664 1005.8 45.31 1058.57 1516.2 2604.6 9.25 20684.29 0.53 173620.31 0.96 148047.13 3.93 12168.05 5.99 7993.18 4.52 10603.84 416006 419985 394852 1613913 13948914 1234570512 971.6677 96.9077 144.4661 662.2585 30.5784 3132.3838 116.9689 818.6037 14.2011 70.3828 5.2773 189.4054 31.3146 31.9277 73.3407 1305.7137 33.9879 29.4163 13.97 20.4884 48.772 42.5521 2253.1411 
102.0999 937.992 14.2486 70.1418 15.38 7.355 135.8907 10.3852 96.1737 6.1057 163.7111 77834.39 184.24 24.79 12328933.65 34917.39 19866440.86 468159.21 15228320.07 42.55 15077.69 8747.83 639757070.52 413.34 2067499.58 41972765.93 952461.48 490300.48 7633.16 8609357.68 1295970.28 4351871.12 1422505.51 34602827.77 2517.26 138939958.5 19.045 21.0301 2.70235 1.20146 29.54 17.053 46.34 46.57 9.34981 18.6594 12.013 0.634992 0.676124 0.593299 65.49 65.59 10.407 57.44 57.33 58.34 3.60194 3.36535 0.99571 73.57 78.72 76.14 79.63 89.16 0.394377 0.545118 0.44782 135.8 140.89 0.972863 0.676798 0.275917 192.25 178.78 218.28 243.36 288.09 303.43 414047 781 1238 527.80 515.11 500.118340751 7.15 3171.15 3172.19 3349.36 1957.77 1912.74 1883.76 1391.7 9.81 4403267.42 4569476.86 2527084.87 437.68 109.23 822.5 58 228.94 209.24 828.01 57.62 301.9192 316.0328 1479.1 19.6 974.985 96.7169 4.91 9767.99 1677.6 879.6 1731.2 256.2 1543.6 1059 1664.2 1024.8 45.34 1057.74 1513.5 2528.7 9.26 20679.68 0.53 172228.34 0.97 149647.68 3.93 12172.92 6 7990.47 4.52 10587.3 415428 420514 404463 1752904 13098097 1231916197 966.7606 97.8908 144.9144 660.2132 30.7247 3116.0909 117.2219 816.7562 15.126 66.083 5.4341 183.949 33.0514 30.2494 73.6196 1301.1811 32.1758 31.0727 13.68 21.3593 46.7815 42.4798 2255.422 102.7646 931.8223 13.2331 75.524 15.11 7.064 141.4705 9.3656 106.6518 7.149 139.8241 91735.75 182.8 24.71 12728401.45 34685.85 19842969.28 468006 15403597.24 40.56 13370.33 8864.78 640853365.54 394.49 2077119.76 41965286.06 951700.17 489998.75 7273.19 12418451.71 1300693.93 4351920.42 1425621.44 36111781.82 2516.09 137579874.56 19.077 20.9726 2.67303 1.2334 29.66 17.079 46.25 46.21 8.26078 11.7573 12.1571 0.627133 0.672115 0.602831 66.08 65.49 10.373 57.52 58.76 58.66 3.50563 2.97427 0.975091 73.24 75.5 78.85 80 88.91 0.389223 0.63063 0.457554 135.56 136.42 0.973984 0.680395 0.247825 181.83 218.91 224.31 270.76 256.45 303.63 OpenBenchmarking.org
RocksDB 7.9.2
Test: Sequential Fill
Op/s, More Is Better
  a          542256
  b          545396
  c          544565
  no smt a   465700
  no smt b   464044
  smt a      414168
  smt b      413708
  smt c      414282
  smt d      414047
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
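The repeated runs in the Sequential Fill result can be collapsed into one figure per configuration group before comparing them. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper names `group_of` and `group_means` are hypothetical, and grouping runs by stripping the trailing run letter from each label is an assumption about how the labels are meant to be read.

```python
from statistics import mean

# RocksDB 7.9.2 Sequential Fill (Op/s, more is better), copied from the result above.
results = {
    "a": 542256, "b": 545396, "c": 544565,
    "no smt a": 465700, "no smt b": 464044,
    "smt a": 414168, "smt b": 413708, "smt c": 414282, "smt d": 414047,
}


def group_of(label: str) -> str:
    """Hypothetical grouping rule: drop the trailing run letter from a label.

    Labels without a prefix ("a", "b", "c") fall into the group "a/b/c".
    """
    head, _, _ = label.rpartition(" ")
    return head or "a/b/c"


def group_means(values: dict) -> dict:
    """Average all runs that belong to the same configuration group."""
    groups = {}
    for label, value in values.items():
        groups.setdefault(group_of(label), []).append(value)
    return {name: mean(runs) for name, runs in groups.items()}


means = group_means(results)
baseline = means["a/b/c"]
for name, value in means.items():
    # Throughput of each group relative to the a/b/c group.
    print(f"{name:8s} {value:12.1f} {100 * value / baseline:6.1f}% of a/b/c")
```

Because this metric is "More Is Better", a percentage below 100 means the group is slower than the a/b/c runs.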
OpenVKL 1.3.1
Benchmark: vklBenchmark Scalar
Items / Sec, More Is Better
  a          556    MIN: 61 / MAX: 5994
  b          557    MIN: 61 / MAX: 6019
  c          549    MIN: 60 / MAX: 5475
  no smt a   647    MIN: 101 / MAX: 4024
  no smt b   652    MIN: 102 / MAX: 3990
  smt a      764    MIN: 139 / MAX: 3808
  smt b      810    MIN: 138 / MAX: 3650
  smt c      793    MIN: 139 / MAX: 3583
  smt d      781    MIN: 139 / MAX: 3776
OpenVKL 1.3.1
Benchmark: vklBenchmark ISPC
Items / Sec, More Is Better
  a          1089   MIN: 179 / MAX: 7194
  b          1075   MIN: 179 / MAX: 6442
  c          1066   MIN: 180 / MAX: 6427
  no smt a   1098   MIN: 297 / MAX: 4284
  no smt b   1108   MIN: 297 / MAX: 4208
  smt a      1318   MIN: 391 / MAX: 3855
  smt b      1243   MIN: 390 / MAX: 3501
  smt c      1235   MIN: 392 / MAX: 4332
  smt d      1238   MIN: 392 / MAX: 3738
ClickHouse 22.12.3.5
100M Rows Hits Dataset, Third Run
Queries Per Minute, Geo Mean, More Is Better
  a          625.16   MIN: 57.69 / MAX: 6000
  b          628.37   MIN: 58.37 / MAX: 6666.67
  c          636.18   MIN: 58.03 / MAX: 6666.67
  no smt a   665.00   MIN: 85.84 / MAX: 6000
  no smt b   662.01   MIN: 86.33 / MAX: 6000
  smt a      534.82   MIN: 91.32 / MAX: 6000
  smt b      538.63   MIN: 89.82 / MAX: 6000
  smt c      530.47   MIN: 92.02 / MAX: 5454.55
  smt d      527.80   MIN: 81.52 / MAX: 6000
ClickHouse 22.12.3.5
100M Rows Hits Dataset, Second Run
Queries Per Minute, Geo Mean, More Is Better
  a          625.91   MIN: 56.13 / MAX: 7500
  b          627.20   MIN: 58.54 / MAX: 6000
  c          621.88   MIN: 58.54 / MAX: 6666.67
  no smt a   666.43   MIN: 85.11 / MAX: 6666.67
  no smt b   649.52   MIN: 86.46 / MAX: 6000
  smt a      524.04   MIN: 90.5 / MAX: 6666.67
  smt b      527.88   MIN: 90.63 / MAX: 6666.67
  smt c      524.75   MIN: 90.23 / MAX: 6000
  smt d      515.11   MIN: 88.5 / MAX: 6000
ClickHouse 22.12.3.5
100M Rows Hits Dataset, First Run / Cold Cache
Queries Per Minute, Geo Mean, More Is Better
  a          600.70   MIN: 57.75 / MAX: 6000
  b          610.79   MIN: 58.14 / MAX: 6666.67
  c          614.24   MIN: 57.14 / MAX: 6666.67
  no smt a   635.48   MIN: 85.35 / MAX: 6000
  no smt b   622.73   MIN: 83.22 / MAX: 5454.55
  smt a      500.13   MIN: 65.15 / MAX: 6000
  smt b      516.09   MIN: 89.55 / MAX: 6000
  smt c      495.85   MIN: 51.06 / MAX: 6000
  smt d      500.12   MIN: 69.61 / MAX: 6000
VP9 libvpx Encoding 1.13
Speed: Speed 0 - Input: Bosphorus 4K
Frames Per Second, More Is Better
  a          7.68
  b          7.71
  c          7.70
  no smt a   7.04
  no smt b   7.16
  smt a      7.24
  smt b      6.97
  smt c      6.92
  smt d      7.15
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
oneDNN 3.0
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  a          671.69    MIN: 664.46
  b          670.20    MIN: 663.54
  c          671.23    MIN: 662.9
  no smt a   913.20    MIN: 884.92
  no smt b   903.36    MIN: 874.93
  smt a      3269.05   MIN: 3243.11
  smt b      3249.28   MIN: 3226.78
  smt c      3153.22   MIN: 3059.88
  smt d      3171.15   MIN: 3148.29
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0
Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  a          675.92    MIN: 668.99
  b          668.95    MIN: 662.29
  c          667.95    MIN: 660.93
  no smt a   973.32    MIN: 937.3
  no smt b   939.55    MIN: 904.78
  smt a      3133.65   MIN: 3037.06
  smt b      3222.93   MIN: 2927.41
  smt c      3185.64   MIN: 2949.58
  smt d      3172.19   MIN: 3155.85
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0
Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  a          674.09    MIN: 667.01
  b          671.65    MIN: 664.19
  c          669.15    MIN: 661.71
  no smt a   930.26    MIN: 898.89
  no smt b   901.04    MIN: 864.28
  smt a      3187.00   MIN: 3034.82
  smt b      3034.62   MIN: 2821.2
  smt c      3011.21   MIN: 2853.26
  smt d      3349.36   MIN: 3325.75
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0
Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  a          907.49    MIN: 897.95
  b          903.27    MIN: 894.76
  c          906.14    MIN: 898.41
  no smt a   1148.26   MIN: 1108.42
  no smt b   1124.75   MIN: 1091.36
  smt a      1912.16   MIN: 1891.03
  smt b      1890.28   MIN: 1864.29
  smt c      1888.21   MIN: 1860.29
  smt d      1957.77   MIN: 1929.47
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN 3.0
Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  a          909.69    MIN: 901.1
  b          917.08    MIN: 909.14
  c          911.76    MIN: 901.88
  no smt a   1147.34   MIN: 1109.85
  no smt b   1139.63   MIN: 1100.83
  smt a      1888.10   MIN: 1865.76
  smt b      1894.07   MIN: 1870.66
  smt c      1925.68   MIN: 1901.2
  smt d      1912.74   MIN: 1890.93
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.0 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU a b c no smt a no smt b smt a smt b smt c smt d 400 800 1200 1600 2000 911.25 912.11 915.08 1151.25 1119.17 1830.41 1937.65 1926.15 1883.76 MIN: 903.01 MIN: 902.43 MIN: 906.26 MIN: 1070.35 MIN: 1086.08 MIN: 1807.97 MIN: 1908.65 MIN: 1901.09 MIN: 1850.75 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread
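The nine runs in each result fall into three label families (a/b/c, "no smt", "smt"). As a rough sketch of how one might summarize a result per family, the Python below averages the oneDNN Recurrent Neural Network Inference bf16bf16bf16 values. The grouping by label prefix and the helper names `group_of` and `group_means` are assumptions made for illustration; they are not part of the Phoronix Test Suite.

```python
from statistics import mean

# Values copied from the oneDNN 3.0 RNN Inference bf16bf16bf16 result (ms, fewer is better).
results_ms = {
    "a": 675.92, "b": 668.95, "c": 667.95,
    "no smt a": 973.32, "no smt b": 939.55,
    "smt a": 3133.65, "smt b": 3222.93, "smt c": 3185.64, "smt d": 3172.19,
}


def group_of(label):
    """Map a run label to its family. Grouping by prefix is an assumption."""
    if label.startswith("no smt"):
        return "no smt"
    if label.startswith("smt"):
        return "smt"
    return "a/b/c"


def group_means(results):
    """Average the reported values within each label family."""
    groups = {}
    for label, value in results.items():
        groups.setdefault(group_of(label), []).append(value)
    return {name: mean(values) for name, values in groups.items()}


means = group_means(results_ms)
baseline = means["a/b/c"]
for name, value in means.items():
    # Ratio above 1.0 means slower than the a/b/c family for this "fewer is better" metric.
    print(f"{name:8s} {value:9.2f} ms  ratio to a/b/c: {value / baseline:.2f}")
```

The same helper can be pointed at any other result line by swapping the dictionary; for "More Is Better" metrics the ratio reads the other way around.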
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better). a: 1383.7; b: 1378.2; c: 1384.0; no smt a: 1395.8; no smt b: 1397.6; smt a: 1393.5; smt b: 1392.4; smt c: 1389.0; smt d: 1391.7. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better). a: 9.22; b: 9.29; c: 9.30; no smt a: 9.39; no smt b: 9.76; smt a: 9.78; smt b: 9.87; smt c: 9.82; smt d: 9.81. (CC) gcc options: -O3 -pthread -lz
Memcached 1.6.18 - Set To Get Ratio: 1:100 (Ops/sec, More Is Better). a: 4878860.00; b: 4876951.36; c: 4852421.67; no smt a: 3192061.68; no smt b: 3216133.91; smt a: 4383314.65; smt b: 4357453.76; smt c: 4394194.49; smt d: 4403267.42. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Memcached 1.6.18 - Set To Get Ratio: 1:10 (Ops/sec, More Is Better). a: 6861088.89; b: 6792746.34; c: 6839975.87; no smt a: 4244908.83; no smt b: 4220522.49; smt a: 4530948.62; smt b: 4392934.75; smt c: 4549313.42; smt d: 4569476.86. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Memcached 1.6.18 - Set To Get Ratio: 1:5 (Ops/sec, More Is Better). a: 4143044.21; b: 4162812.47; c: 4155273.46; no smt a: 2444886.82; no smt b: 2460416.33; smt a: 2520253.00; smt b: 2507331.29; smt c: 2517750.47; smt d: 2527084.87. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU (ms, Fewer Is Better). a: 962.10 (MIN: 879.24 / MAX: 1018.81); b: 962.63 (MIN: 893.43 / MAX: 1015.71); c: 962.15 (MIN: 888.7 / MAX: 1017.92); no smt a: 438.29 (MIN: 416.93 / MAX: 496.86); no smt b: 438.99 (MIN: 427.57 / MAX: 484.31); smt a: 439.41 (MIN: 410.81 / MAX: 465.4); smt b: 439.50 (MIN: 424.22 / MAX: 477.66); smt c: 437.51 (MIN: 400.05 / MAX: 473.91); smt d: 437.68 (MIN: 394.89 / MAX: 478.58). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU (FPS, More Is Better). a: 49.72; b: 49.72; c: 49.77; no smt a: 109.08; no smt b: 109.04; smt a: 108.93; smt b: 108.89; smt c: 109.36; smt d: 109.23. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
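The OpenVINO results come in latency/throughput pairs, and Little's law (average concurrency = throughput x average latency) relates the two. The sketch below applies it to the Face Detection FP16 pair, assuming the reported ms figure is a mean per-request latency and the FPS figure is steady-state throughput. That reading of the metrics, and the helper name `requests_in_flight`, are assumptions rather than anything stated in the result file.

```python
# Values copied from the OpenVINO 2022.3 Face Detection FP16 results.
latency_ms = {
    "a": 962.10, "b": 962.63, "c": 962.15,
    "no smt a": 438.29, "no smt b": 438.99,
    "smt a": 439.41, "smt b": 439.50, "smt c": 437.51, "smt d": 437.68,
}
fps = {
    "a": 49.72, "b": 49.72, "c": 49.77,
    "no smt a": 109.08, "no smt b": 109.04,
    "smt a": 108.93, "smt b": 108.89, "smt c": 109.36, "smt d": 109.23,
}


def requests_in_flight(latency_in_ms, throughput_fps):
    """Little's law: average concurrency = throughput * average latency (in seconds)."""
    return throughput_fps * (latency_in_ms / 1000.0)


in_flight = {
    label: requests_in_flight(latency_ms[label], fps[label])
    for label in latency_ms
}
for label, value in in_flight.items():
    print(f"{label:9s} implied requests in flight: {value:.2f}")
```

If the implied concurrency comes out similar across runs, the latency difference between runs tracks the throughput difference rather than a change in how many requests are outstanding.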
OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU (ms, Fewer Is Better). a: 1671.46 (MIN: 924.15 / MAX: 1977.46); b: 1675.96 (MIN: 1231.52 / MAX: 1967.75); c: 1679.03 (MIN: 865.58 / MAX: 1995.12); no smt a: 828.67 (MIN: 730.29 / MAX: 1036.1); no smt b: 830.04 (MIN: 722.6 / MAX: 1015.67); smt a: 827.07 (MIN: 724.84 / MAX: 1037.91); smt b: 827.03 (MIN: 723.77 / MAX: 1003.25); smt c: 821.93 (MIN: 725.25 / MAX: 1010.56); smt d: 822.50 (MIN: 717.65 / MAX: 997.43). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU (FPS, More Is Better). a: 28.47; b: 28.42; c: 28.33; no smt a: 57.58; no smt b: 57.49; smt a: 57.72; smt b: 57.67; smt c: 58.05; smt d: 58.00. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better). a: 499.32 (MIN: 264.54 / MAX: 537.26); b: 500.83 (MIN: 418.76 / MAX: 546.62); c: 500.30 (MIN: 410.04 / MAX: 531.93); no smt a: 229.46 (MIN: 217.93 / MAX: 265.81); no smt b: 228.01 (MIN: 212.99 / MAX: 267.17); smt a: 230.21 (MIN: 211.56 / MAX: 253.58); smt b: 230.06 (MIN: 210.58 / MAX: 251.42); smt c: 227.93 (MIN: 210.46 / MAX: 252.96); smt d: 228.94 (MIN: 214.92 / MAX: 248.29). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU (FPS, More Is Better). a: 95.76; b: 95.48; c: 95.59; no smt a: 208.82; no smt b: 210.14; smt a: 208.13; smt b: 208.19; smt c: 210.19; smt d: 209.24. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU (ms, Fewer Is Better). a: 1701.70 (MIN: 1395.71 / MAX: 2063.97); b: 1685.26 (MIN: 891.16 / MAX: 1979.37); c: 1704.02 (MIN: 828.99 / MAX: 1969.02); no smt a: 833.99 (MIN: 723.91 / MAX: 1011.94); no smt b: 834.17 (MIN: 732.01 / MAX: 1006.46); smt a: 833.15 (MIN: 725.84 / MAX: 1017.38); smt b: 832.70 (MIN: 723.65 / MAX: 1031.69); smt c: 826.67 (MIN: 724.1 / MAX: 1006.75); smt d: 828.01 (MIN: 716.19 / MAX: 1018.34). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU (FPS, More Is Better). a: 27.97; b: 28.23; c: 27.89; no smt a: 57.18; no smt b: 57.19; smt a: 57.29; smt b: 57.31; smt c: 57.71; smt d: 57.62. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). a: 319.98; b: 319.96; c: 320.05; no smt a: 304.21; no smt b: 303.74; smt a: 303.33; smt b: 304.08; smt c: 303.25; smt d: 301.92.
Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). a: 149.74; b: 149.82; c: 149.69; no smt a: 308.43; no smt b: 308.30; smt a: 314.62; smt b: 314.25; smt c: 315.33; smt d: 316.03.
Zstd Compression 1.5.4 - Compression Level: 19 - Decompression Speed (MB/s, More Is Better). a: 1472.6; b: 1467.8; c: 1470.5; no smt a: 1483.1; no smt b: 1483.8; smt a: 1495.2; smt b: 1483.5; smt c: 1475.2; smt d: 1479.1. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 19 - Compression Speed (MB/s, More Is Better). a: 19.1; b: 19.1; c: 19.1; no smt a: 19.8; no smt b: 19.8; smt a: 18.8; smt b: 19.5; smt c: 19.2; smt d: 19.6. (CC) gcc options: -O3 -pthread -lz
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). a: 1115.76; b: 1118.44; c: 1122.06; no smt a: 955.16; no smt b: 955.24; smt a: 972.52; smt b: 975.50; smt c: 971.90; smt d: 974.99.
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). a: 42.99; b: 42.90; c: 42.74; no smt a: 97.44; no smt b: 97.38; smt a: 97.02; smt b: 96.91; smt c: 97.00; smt d: 96.72.
OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, Fewer Is Better). a: 8.34 (MIN: 6.61 / MAX: 55.54); b: 8.38 (MIN: 6.4 / MAX: 32.33); c: 8.37 (MIN: 6.83 / MAX: 50.24); no smt a: 4.90 (MIN: 4.46 / MAX: 56.52); no smt b: 4.91 (MIN: 4.49 / MAX: 34.72); smt a: 4.92 (MIN: 4.52 / MAX: 34.8); smt b: 4.95 (MIN: 4.54 / MAX: 24.23); smt c: 4.90 (MIN: 4.5 / MAX: 29.95); smt d: 4.91 (MIN: 4.51 / MAX: 27.27). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, More Is Better). a: 5750.74; b: 5720.90; c: 5724.81; no smt a: 9784.09; no smt b: 9758.74; smt a: 9746.08; smt b: 9680.44; smt c: 9773.26; smt d: 9767.99. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
Zstd Compression 1.5.4 - Compression Level: 8, Long Mode - Decompression Speed (MB/s, More Is Better). a: 1677.9; b: 1673.7; c: 1675.4; no smt a: 1684.8; no smt b: 1682.8; smt a: 1682.5; smt b: 1676.5; smt c: 1683.7; smt d: 1677.6. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 8, Long Mode - Compression Speed (MB/s, More Is Better). a: 938.5; b: 910.8; c: 926.9; no smt a: 859.9; no smt b: 892.0; smt a: 853.8; smt b: 852.7; smt c: 860.8; smt d: 879.6. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 12 - Decompression Speed (MB/s, More Is Better). a: 1704.8; b: 1716.7; c: 1715.6; no smt a: 1728.0; no smt b: 1727.8; smt a: 1723.7; smt b: 1726.3; smt c: 1732.3; smt d: 1731.2. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 12 - Compression Speed (MB/s, More Is Better). a: 330.8; b: 332.0; c: 330.1; no smt a: 279.9; no smt b: 278.2; smt a: 254.1; smt b: 259.1; smt c: 249.8; smt d: 256.2. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 3, Long Mode - Decompression Speed (MB/s, More Is Better). a: 1536.6; b: 1537.5; c: 1540.9; no smt a: 1542.5; no smt b: 1538.8; smt a: 1539.5; smt b: 1540.4; smt c: 1540.9; smt d: 1543.6. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 3, Long Mode - Compression Speed (MB/s, More Is Better). a: 892.7; b: 909.4; c: 916.9; no smt a: 955.6; no smt b: 1032.3; smt a: 1051.4; smt b: 1038.9; smt c: 1046.6; smt d: 1059.0. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 8 - Decompression Speed (MB/s, More Is Better). a: 1669.3; b: 1651.0; c: 1661.9; no smt a: 1669.7; no smt b: 1667.6; smt a: 1664.3; smt b: 1671.0; smt c: 1664.0; smt d: 1664.2. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 8 - Compression Speed (MB/s, More Is Better). a: 1233.8; b: 1239.3; c: 1241.1; no smt a: 1227.5; no smt b: 1234.2; smt a: 1023.9; smt b: 1122.4; smt c: 1005.8; smt d: 1024.8. (CC) gcc options: -O3 -pthread -lz
OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, Fewer Is Better). a: 88.71 (MIN: 44.2 / MAX: 132.86); b: 88.61 (MIN: 47.57 / MAX: 124.67); c: 89.46 (MIN: 42.16 / MAX: 123.4); no smt a: 45.86 (MIN: 39.29 / MAX: 91.13); no smt b: 45.73 (MIN: 39.68 / MAX: 86.83); smt a: 45.71 (MIN: 39.78 / MAX: 73.33); smt b: 45.66 (MIN: 38.82 / MAX: 75.65); smt c: 45.31 (MIN: 39.35 / MAX: 73.19); smt d: 45.34 (MIN: 39.4 / MAX: 73.83). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, More Is Better). a: 540.41; b: 541.03; c: 535.78; no smt a: 1045.81; no smt b: 1048.74; smt a: 1049.16; smt b: 1050.37; smt c: 1058.57; smt d: 1057.74. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
Zstd Compression 1.5.4 - Compression Level: 3 - Decompression Speed (MB/s, More Is Better). a: 1514.8; b: 1516.6; c: 1517.4; no smt a: 1500.4; no smt b: 1511.5; smt a: 1515.6; smt b: 1519.1; smt c: 1516.2; smt d: 1513.5. (CC) gcc options: -O3 -pthread -lz
Zstd Compression 1.5.4 - Compression Level: 3 - Compression Speed (MB/s, More Is Better). a: 3033.9; b: 3095.0; c: 3049.4; no smt a: 2804.2; no smt b: 2865.9; smt a: 2795.1; smt b: 2513.4; smt c: 2604.6; smt d: 2528.7. (CC) gcc options: -O3 -pthread -lz
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better). a: 10.10 (MIN: 5.46 / MAX: 35.41); b: 10.10 (MIN: 5.47 / MAX: 28.16); c: 10.11 (MIN: 5.43 / MAX: 36.27); no smt a: 9.18 (MIN: 8.17 / MAX: 31.1); no smt b: 9.18 (MIN: 8.21 / MAX: 24.31); smt a: 9.28 (MIN: 8.1 / MAX: 22.87); smt b: 9.28 (MIN: 8.13 / MAX: 24.28); smt c: 9.25 (MIN: 8.12 / MAX: 19.04); smt d: 9.26 (MIN: 8.12 / MAX: 30.29). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, More Is Better). a: 9494.22; b: 9496.95; c: 9485.65; no smt a: 20836.10; no smt b: 20849.75; smt a: 20610.23; smt b: 20610.62; smt c: 20684.29; smt d: 20679.68. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better). a: 0.65 (MIN: 0.32 / MAX: 20.68); b: 0.66 (MIN: 0.32 / MAX: 19.99); c: 0.66 (MIN: 0.31 / MAX: 22.67); no smt a: 0.57 (MIN: 0.5 / MAX: 13.02); no smt b: 0.58 (MIN: 0.5 / MAX: 12.8); smt a: 0.53 (MIN: 0.5 / MAX: 24.2); smt b: 0.53 (MIN: 0.5 / MAX: 9.61); smt c: 0.53 (MIN: 0.5 / MAX: 9.15); smt d: 0.53 (MIN: 0.5 / MAX: 8.86). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better). a: 112346.36; b: 111378.39; c: 112186.25; no smt a: 160545.41; no smt b: 162294.47; smt a: 173926.92; smt b: 175754.35; smt c: 173620.31; smt d: 172228.34. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better). a: 1.09 (MIN: 0.49 / MAX: 19.01); b: 1.09 (MIN: 0.48 / MAX: 25.82); c: 1.09 (MIN: 0.49 / MAX: 22.06); no smt a: 1.01 (MIN: 0.86 / MAX: 8.97); no smt b: 1.01 (MIN: 0.86 / MAX: 8.72); smt a: 0.97 (MIN: 0.87 / MAX: 9.78); smt b: 0.97 (MIN: 0.87 / MAX: 10.96); smt c: 0.96 (MIN: 0.87 / MAX: 9.72); smt d: 0.97 (MIN: 0.86 / MAX: 13.31). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better). a: 74495.17; b: 74353.41; c: 74486.08; no smt a: 127833.52; no smt b: 127770.39; smt a: 148316.88; smt b: 147582.87; smt c: 148047.13; smt d: 149647.68. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, Fewer Is Better). a: 8.11 (MIN: 5.39 / MAX: 69.87); b: 8.13 (MIN: 5.35 / MAX: 55.84); c: 8.13 (MIN: 3.83 / MAX: 59.87); no smt a: 3.95 (MIN: 3.61 / MAX: 38); no smt b: 3.95 (MIN: 3.68 / MAX: 42.53); smt a: 3.95 (MIN: 3.61 / MAX: 34.38); smt b: 3.96 (MIN: 3.66 / MAX: 32.83); smt c: 3.93 (MIN: 3.61 / MAX: 42.82); smt d: 3.93 (MIN: 3.61 / MAX: 23.62). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, More Is Better). a: 5908.89; b: 5894.27; c: 5898.79; no smt a: 12119.66; no smt b: 12128.38; smt a: 12117.16; smt b: 12109.64; smt c: 12168.05; smt d: 12172.92. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Vehicle Detection FP16 - Device: CPU (ms, Fewer Is Better). a: 16.30 (MIN: 7.91 / MAX: 51.62); b: 15.54 (MIN: 8.28 / MAX: 57.72); c: 15.10 (MIN: 6.94 / MAX: 60.59); no smt a: 6.01 (MIN: 5.2 / MAX: 37.8); no smt b: 6.01 (MIN: 5.02 / MAX: 36.88); smt a: 6.02 (MIN: 5.27 / MAX: 25.12); smt b: 6.03 (MIN: 5.17 / MAX: 38.35); smt c: 5.99 (MIN: 5.13 / MAX: 31.29); smt d: 6.00 (MIN: 5.21 / MAX: 25.51). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Vehicle Detection FP16 - Device: CPU (FPS, More Is Better). a: 2941.47; b: 3085.53; c: 3174.04; no smt a: 7979.05; no smt b: 7979.82; smt a: 7953.21; smt b: 7949.05; smt c: 7993.18; smt d: 7990.47. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU (ms, Fewer Is Better). a: 9.76 (MIN: 5.03 / MAX: 28.1); b: 9.75 (MIN: 5.25 / MAX: 35.53); c: 9.76 (MIN: 4.98 / MAX: 36.12); no smt a: 4.54 (MIN: 4.12 / MAX: 27.91); no smt b: 4.54 (MIN: 4.09 / MAX: 55.78); smt a: 4.54 (MIN: 4.07 / MAX: 35.16); smt b: 4.54 (MIN: 4.11 / MAX: 42.14); smt c: 4.52 (MIN: 4.12 / MAX: 33.17); smt d: 4.52 (MIN: 4.03 / MAX: 45.52). (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU (FPS, More Is Better). a: 4914.17; b: 4915.42; c: 4910.13; no smt a: 10556.93; no smt b: 10549.24; smt a: 10544.03; smt b: 10549.15; smt c: 10603.84; smt d: 10587.30. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -pie -ldl
RocksDB 7.9.2 - Test: Random Fill (Op/s, More Is Better). a: 533927; b: 534681; c: 536551; no smt a: 478629; no smt b: 468210; smt a: 423027; smt b: 417882; smt c: 416006; smt d: 415428. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
RocksDB 7.9.2 - Test: Update Random (Op/s, More Is Better). a: 544384; b: 545556; c: 543572; no smt a: 462018; no smt b: 452228; smt a: 413199; smt b: 411513; smt c: 419985; smt d: 420514. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
RocksDB 7.9.2 - Test: Random Fill Sync (Op/s, More Is Better). a: 376601; b: 373002; c: 356922; no smt a: 350673; no smt b: 344040; smt a: 388052; smt b: 401295; smt c: 394852; smt d: 404463. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
RocksDB 7.9.2 - Test: Read Random Write Random (Op/s, More Is Better). a: 2926458; b: 2891962; c: 2910023; no smt a: 2079804; no smt b: 2063831; smt a: 1787682; smt b: 1761416; smt c: 1613913; smt d: 1752904. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
RocksDB 7.9.2 - Test: Read While Writing (Op/s, More Is Better). a: 9296185; b: 8620352; c: 8316379; no smt a: 7643831; no smt b: 7913568; smt a: 15317250; smt b: 13830689; smt c: 13948914; smt d: 13098097. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
RocksDB 7.9.2 - Test: Random Read (Op/s, More Is Better). a: 468231434; b: 466888888; c: 468069792; no smt a: 1209611055; no smt b: 1213540299; smt a: 1225662852; smt b: 1231168093; smt c: 1234570512; smt d: 1231916197. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). a: 1116.43; b: 1116.29; c: 1117.73; no smt a: 956.01; no smt b: 955.34; smt a: 970.22; smt b: 969.21; smt c: 971.67; smt d: 966.76.
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). a: 42.96; b: 42.83; c: 42.92; no smt a: 97.40; no smt b: 97.34; smt a: 97.29; smt b: 97.35; smt c: 96.91; smt d: 97.89.
VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Fast (Frames Per Second, More Is Better). a: 5.820; b: 5.809; c: 5.808; no smt a: 5.890; no smt b: 5.895. (CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). a: 152.14; b: 151.60; c: 151.39; no smt a: 145.35; no smt b: 145.36; smt a: 145.40; smt b: 145.27; smt c: 144.47; smt d: 144.91.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). a: 314.84; b: 316.29; c: 316.63; no smt a: 649.30; no smt b: 647.76; smt a: 658.21; smt b: 658.34; smt c: 662.26; smt d: 660.21.
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). a: 30.59; b: 30.46; c: 30.49; no smt a: 30.42; no smt b: 30.64; smt a: 30.75; smt b: 30.75; smt c: 30.58; smt d: 30.72.
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). a: 1566.56; b: 1573.75; c: 1571.83; no smt a: 3107.30; no smt b: 3086.36; smt a: 3114.68; smt b: 3114.43; smt c: 3132.38; smt d: 3116.09.
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). a: 119.21; b: 119.38; c: 119.70; no smt a: 116.15; no smt b: 115.64; smt a: 117.12; smt b: 116.48; smt c: 116.97; smt d: 117.22.
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). a: 401.78; b: 401.21; c: 400.04; no smt a: 813.98; no smt b: 817.63; smt a: 817.75; smt b: 822.16; smt c: 818.60; smt d: 816.76.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better). a: 11.36; b: 11.19; c: 11.16; no smt a: 14.46; no smt b: 14.42; smt a: 13.64; smt b: 14.64; smt c: 14.20; smt d: 15.13.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream (items/sec, More Is Better). a: 87.96; b: 89.34; c: 89.55; no smt a: 69.11; no smt b: 69.33; smt a: 73.29; smt b: 68.29; smt c: 70.38; smt d: 66.08.
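In the Synchronous Single-Stream scenario, the ms/batch and items/sec figures look like two views of one measurement. A quick consistency check, assuming one item per batch and a single request in flight (neither is stated in this result file), is that items/sec should be close to 1000 divided by ms/batch. The sketch below runs that check on the BERT base uncased SST2 pair; `relative_gap` is a hypothetical helper name.

```python
# Values copied from the DeepSparse 1.3.2 NLP Text Classification, BERT base uncased SST2,
# Synchronous Single-Stream results.
ms_per_batch = {
    "a": 11.36, "b": 11.19, "c": 11.16,
    "no smt a": 14.46, "no smt b": 14.42,
    "smt a": 13.64, "smt b": 14.64, "smt c": 14.20, "smt d": 15.13,
}
items_per_sec = {
    "a": 87.96, "b": 89.34, "c": 89.55,
    "no smt a": 69.11, "no smt b": 69.33,
    "smt a": 73.29, "smt b": 68.29, "smt c": 70.38, "smt d": 66.08,
}


def relative_gap(ms, reported_rate):
    """Relative difference between the rate implied by latency and the reported rate.

    Assumes one item per batch and one request in flight at a time.
    """
    implied_rate = 1000.0 / ms
    return abs(implied_rate - reported_rate) / reported_rate


gaps = {
    label: relative_gap(ms_per_batch[label], items_per_sec[label])
    for label in ms_per_batch
}
for label, gap in gaps.items():
    print(f"{label:9s} implied {1000.0 / ms_per_batch[label]:6.2f} items/sec, "
          f"reported {items_per_sec[label]:6.2f}, gap {gap:.3%}")
```

The same check does not carry over to the Asynchronous Multi-Stream results, where several requests are outstanding at once.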
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better). a: 5.0715; b: 5.0859; c: 5.0573; no smt a: 5.2262; no smt b: 5.7015; smt a: 5.3004; smt b: 5.3053; smt c: 5.2773; smt d: 5.4341.
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream (items/sec, More Is Better). a: 197.10; b: 196.54; c: 197.66; no smt a: 191.26; no smt b: 175.32; smt a: 188.58; smt b: 188.41; smt c: 189.41; smt d: 183.95.
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better). a: 28.71; b: 28.49; c: 28.47; no smt a: 31.33; no smt b: 33.60; smt a: 31.48; smt b: 31.46; smt c: 31.31; smt d: 33.05.
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (items/sec, More Is Better). a: 34.83; b: 35.10; c: 35.12; no smt a: 31.91; no smt b: 29.75; smt a: 31.76; smt b: 31.78; smt c: 31.93; smt d: 30.25.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better). a: 77.34; b: 77.31; c: 76.62; no smt a: 73.20; no smt b: 73.21; smt a: 73.31; smt b: 73.35; smt c: 73.34; smt d: 73.62.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better). a: 619.75; b: 619.94; c: 625.55; no smt a: 1291.97; no smt b: 1292.80; smt a: 1306.67; smt b: 1306.58; smt c: 1305.71; smt d: 1301.18.
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better). a: 28.62; b: 28.71; c: 28.57; no smt a: 31.92; no smt b: 32.18; smt a: 32.30; smt b: 32.24; smt c: 33.99; smt d: 32.18.
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (items/sec, More Is Better). a: 34.93; b: 34.82; c: 34.99; no smt a: 31.32; no smt b: 31.07; smt a: 30.96; smt b: 31.01; smt c: 29.42; smt d: 31.07.
VP9 libvpx Encoding 1.13 - Speed: Speed 0 - Input: Bosphorus 1080p (Frames Per Second, More Is Better). a: 14.69; b: 14.67; c: 14.76; no smt a: 14.28; no smt b: 13.72; smt a: 14.08; smt b: 13.57; smt c: 13.97; smt d: 13.68. (CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream a b c no smt a no smt b smt a smt b smt c smt d 5 10 15 20 25 16.58 16.67 16.52 20.74 20.42 21.03 21.74 20.49 21.36
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream a b c no smt a no smt b smt a smt b smt c smt d 14 28 42 56 70 60.25 59.92 60.49 48.18 48.93 47.51 45.96 48.77 46.78
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream a b c no smt a no smt b smt a smt b smt c smt d 11 22 33 44 55 47.46 47.35 47.29 42.81 42.84 42.95 42.65 42.55 42.48
Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a = 1009.85; b = 1012.35; c = 1013.74; no smt a = 2204.75; no smt b = 2203.16; smt a = 2230.31; smt b = 2245.80; smt c = 2253.14; smt d = 2255.42
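In the asynchronous multi-stream scenario the throughput is far higher than 1000 divided by the per-batch latency, so several requests must be in flight at once. As a rough sketch, Little's law (average items in flight = throughput × latency) can estimate that concurrency from the two ResNet-50 charts above. Applying Little's law here is an assumption of this example, not something the result file states, and the helper name is hypothetical.

```python
# (items/sec, ms/batch) pairs copied from the two ResNet-50 multi-stream charts above.
resnet50_multi_stream = {
    "a": (1009.85, 47.46),
    "no smt a": (2204.75, 42.81),
    "smt d": (2255.42, 42.48),
}


def implied_concurrency(items_per_sec: float, ms_per_batch: float) -> float:
    """Little's law: average number of in-flight items = throughput * latency."""
    return items_per_sec * (ms_per_batch / 1000.0)


# Estimated number of concurrently processed items per configuration.
estimates = {
    label: implied_concurrency(rate, latency)
    for label, (rate, latency) in resnet50_multi_stream.items()
}

for label, value in estimates.items():
    print(f"{label}: roughly {value:.1f} items in flight")
```

The estimate is only as good as the assumption that the reported latency and throughput describe the same steady state.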
Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (ms/batch, Fewer Is Better): a = 108.68; b = 109.11; c = 109.25; no smt a = 100.24; no smt b = 100.42; smt a = 103.09; smt b = 103.19; smt c = 102.10; smt d = 102.76
Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (items/sec, More Is Better): a = 440.85; b = 439.04; c = 438.44; no smt a = 941.24; no smt b = 938.91; smt a = 929.02; smt b = 927.68; smt c = 937.99; smt d = 931.82
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a = 10.08; b = 10.14; c = 10.14; no smt a = 13.86; no smt b = 13.90; smt a = 12.83; smt b = 12.91; smt c = 14.25; smt d = 13.23
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a = 99.15; b = 98.62; c = 98.59; no smt a = 72.13; no smt b = 71.89; smt a = 77.88; smt b = 77.41; smt c = 70.14; smt d = 75.52
VP9 libvpx Encoding 1.13 - Speed: Speed 5 - Input: Bosphorus 4K (Frames Per Second, More Is Better): a = 17.36; b = 17.37; c = 17.46; no smt a = 14.48; no smt b = 14.50; smt a = 13.90; smt b = 14.27; smt c = 15.38; smt d = 15.11 [(CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11]
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a = 5.1812; b = 5.1633; c = 5.1696; no smt a = 7.5856; no smt b = 6.4174; smt a = 7.5225; smt b = 6.8844; smt c = 7.3550; smt d = 7.0640
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a = 192.88; b = 193.53; c = 193.31; no smt a = 131.76; no smt b = 155.74; smt a = 132.87; smt b = 145.17; smt c = 135.89; smt d = 141.47
Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a = 9.5962; b = 9.5793; c = 8.8004; no smt a = 9.8899; no smt b = 9.9395; smt a = 8.1914; smt b = 8.3506; smt c = 10.3852; smt d = 9.3656
Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a = 104.12; b = 104.29; c = 113.51; no smt a = 101.01; no smt b = 100.51; smt a = 121.92; smt b = 119.60; smt c = 96.17; smt d = 106.65
Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (ms/batch, Fewer Is Better): a = 5.3018; b = 5.4980; c = 5.5007; no smt a = 6.0118; no smt b = 6.0650; smt a = 7.1865; smt b = 7.3054; smt c = 6.1057; smt d = 7.1490
Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream (items/sec, More Is Better): a = 188.54; b = 181.81; c = 181.72; no smt a = 166.27; no smt b = 164.81; smt a = 139.10; smt b = 136.83; smt c = 163.71; smt d = 139.82
Stress-NG 0.15.04 - Test: Pthread (Bogo Ops/s, More Is Better): a = 109397.78; b = 109356.78; c = 109609.15; no smt a = 68076.03; no smt b = 67451.30; smt a = 74978.51; smt b = 180407.59; smt c = 77834.39; smt d = 91735.75 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Atomic (Bogo Ops/s, More Is Better): a = 174.72; b = 223.29; c = 183.33; no smt a = 400.03; no smt b = 395.95; smt a = 184.34; smt b = 186.64; smt c = 184.24; smt d = 182.80 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: NUMA (Bogo Ops/s, More Is Better): a = 483.90; b = 498.67; c = 478.05; no smt a = 20.51; no smt b = 19.78; smt a = 24.82; smt b = 24.77; smt c = 24.79; smt d = 24.71 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Context Switching (Bogo Ops/s, More Is Better): a = 18941003.97; b = 16313126.86; c = 16862185.54; no smt a = 47222624.49; no smt b = 44683906.68; smt a = 12895047.73; smt b = 12221058.34; smt c = 12328933.65; smt d = 12728401.45 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Forking (Bogo Ops/s, More Is Better): a = 58156.36; b = 58664.97; c = 64299.62; no smt a = 43094.96; no smt b = 45685.28; smt a = 36266.02; smt b = 36020.16; smt c = 34917.39; smt d = 34685.85 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Semaphores (Bogo Ops/s, More Is Better): a = 18128283.29; b = 18100474.36; c = 18088584.67; no smt a = 13141129.68; no smt b = 13192391.85; smt a = 20047519.05; smt b = 19927313.52; smt c = 19866440.86; smt d = 19842969.28 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Crypto (Bogo Ops/s, More Is Better): a = 203073.15; b = 203147.15; c = 203095.44; no smt a = 435615.35; no smt b = 437065.75; smt a = 466292.12; smt b = 466609.22; smt c = 468159.21; smt d = 468006.00 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Poll (Bogo Ops/s, More Is Better): a = 12653709.64; b = 12661687.46; c = 12676101.41; no smt a = 10458275.24; no smt b = 10393402.01; smt a = 15359471.73; smt b = 15341564.16; smt c = 15228320.07; smt d = 15403597.24 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: CPU Cache (Bogo Ops/s, More Is Better): a = 77.51; b = 67.21; c = 97.04; no smt a = 40.94; no smt b = 55.84; smt a = 47.58; smt b = 44.38; smt c = 42.55; smt d = 40.56 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Memory Copying (Bogo Ops/s, More Is Better): a = 20106.51; b = 20340.40; c = 20297.91; no smt a = 11342.69; no smt b = 10949.90; smt a = 15914.10; smt b = 15430.18; smt c = 15077.69; smt d = 13370.33 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Socket Activity (Bogo Ops/s, More Is Better): a = 8873.58; b = 8851.65; c = 8876.10; no smt a = 8968.28; no smt b = 8924.10; smt a = 8748.99; smt b = 8750.61; smt c = 8747.83; smt d = 8864.78 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Malloc (Bogo Ops/s, More Is Better): a = 312709034.05; b = 314418461.02; c = 313768771.26; no smt a = 456657338.84; no smt b = 456651508.04; smt a = 634718750.41; smt b = 634995429.66; smt c = 639757070.52; smt d = 640853365.54 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: MEMFD (Bogo Ops/s, More Is Better): a = 518.46; b = 507.74; c = 507.97; no smt a = 303.56; no smt b = 308.67; smt a = 464.70; smt b = 564.05; smt c = 413.34; smt d = 394.49 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Futex (Bogo Ops/s, More Is Better): a = 2794694.37; b = 2805836.52; c = 2794473.75; no smt a = 3746361.52; no smt b = 3802292.60; smt a = 2333781.39; smt b = 2396325.93; smt c = 2067499.58; smt d = 2077119.76 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Hash (Bogo Ops/s, More Is Better): a = 18954936.00; b = 18955118.10; c = 18961773.18; no smt a = 27408413.10; no smt b = 27422305.53; smt a = 41966139.67; smt b = 41989522.68; smt c = 41972765.93; smt d = 41965286.06 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Matrix Math (Bogo Ops/s, More Is Better): a = 382305.69; b = 382304.04; c = 382328.50; no smt a = 932248.18; no smt b = 925984.24; smt a = 946032.88; smt b = 946660.88; smt c = 952461.48; smt d = 951700.17 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: CPU Stress (Bogo Ops/s, More Is Better): a = 205134.87; b = 217072.27; c = 217304.19; no smt a = 328297.04; no smt b = 326819.72; smt a = 487359.39; smt b = 487987.32; smt c = 490300.48; smt d = 489998.75 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: MMAP (Bogo Ops/s, More Is Better): a = 1663.08; b = 1664.75; c = 1668.92; no smt a = 4520.48; no smt b = 3591.71; smt a = 8360.76; smt b = 9017.12; smt c = 7633.16; smt d = 7273.19 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: System V Message Passing (Bogo Ops/s, More Is Better): a = 10473084.09; b = 10471889.97; c = 10475486.39; no smt a = 7402514.74; no smt b = 7372780.72; smt a = 10103952.13; smt b = 8586858.51; smt c = 8609357.68; smt d = 12418451.71 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Vector Math (Bogo Ops/s, More Is Better): a = 556875.59; b = 556797.27; c = 556833.50; no smt a = 920216.27; no smt b = 920642.34; smt a = 1291689.33; smt b = 1291704.57; smt c = 1295970.28; smt d = 1300693.93 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: SENDFILE (Bogo Ops/s, More Is Better): a = 1950323.96; b = 1913590.77; c = 1891202.37; no smt a = 3284433.20; no smt b = 3282827.50; smt a = 4329963.26; smt b = 4329824.38; smt c = 4351871.12; smt d = 4351920.42 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Function Call (Bogo Ops/s, More Is Better): a = 621015.34; b = 621003.92; c = 621041.93; no smt a = 829106.63; no smt b = 829445.38; smt a = 1414423.65; smt b = 1413555.44; smt c = 1422505.51; smt d = 1425621.44 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Glibc C String Functions (Bogo Ops/s, More Is Better): a = 16257100.25; b = 16168655.17; c = 16537965.14; no smt a = 26755146.69; no smt b = 28009942.56; smt a = 35687958.37; smt b = 35996976.13; smt c = 34602827.77; smt d = 36111781.82 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Glibc Qsort Data Sorting (Bogo Ops/s, More Is Better): a = 1122.39; b = 1132.35; c = 1125.86; no smt a = 1978.80; no smt b = 1963.75; smt a = 2564.48; smt b = 2520.38; smt c = 2517.26; smt d = 2516.09 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
Stress-NG 0.15.04 - Test: Mutex (Bogo Ops/s, More Is Better): a = 59479783.64; b = 60031543.97; c = 59929401.40; no smt a = 63581395.67; no smt b = 65500297.73; smt a = 136339636.91; smt b = 135855921.28; smt c = 138939958.50; smt d = 137579874.56 [(CC) gcc options: -std=gnu99 -O2 -lm -lc -lcrypt -ldl -lrt -pthread]
VVenC 1.7 - Video Input: Bosphorus 4K - Video Preset: Faster (Frames Per Second, More Is Better): a = 12.34; b = 12.39; c = 12.40; no smt a = 12.48; no smt b = 12.31 [(CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto]
VVenC 1.7 - Video Input: Bosphorus 1080p - Video Preset: Fast (Frames Per Second, More Is Better): a = 12.44; b = 12.40; c = 12.44; no smt a = 12.36; no smt b = 12.37 [(CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto]
GROMACS 2023 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better): a = 10.57; b = 10.61; c = 10.59; no smt a = 19.18; no smt b = 18.84; smt a = 18.82; smt b = 19.02; smt c = 19.05; smt d = 19.08 [(CXX) g++ options: -O3]
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): a = 7.35746 (MIN: 4.85); b = 7.28243 (MIN: 6.58); c = 7.38116 (MIN: 6.77); no smt a = 9.25292 (MIN: 7.79); no smt b = 9.31751 (MIN: 8.07); smt a = 20.26950 (MIN: 17.68); smt b = 20.57200 (MIN: 18.25); smt c = 21.03010 (MIN: 17.96); smt d = 20.97260 (MIN: 18.21) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): a = 1.83147 (MIN: 1.73); b = 1.84172 (MIN: 1.76); c = 1.82264 (MIN: 1.71); no smt a = 1.65041 (MIN: 1.41); no smt b = 1.70057 (MIN: 1.5); smt a = 2.70807 (MIN: 2.09); smt b = 2.76140 (MIN: 2.46); smt c = 2.70235 (MIN: 2.27); smt d = 2.67303 (MIN: 2.09) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
oneDNN 3.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): a = 0.555277 (MIN: 0.53); b = 0.551966 (MIN: 0.49); c = 0.556851 (MIN: 0.53); no smt a = 0.664359 (MIN: 0.56); no smt b = 0.670574 (MIN: 0.55); smt a = 1.200490 (MIN: 1.04); smt b = 1.181350 (MIN: 1.08); smt c = 1.201460 (MIN: 1.08); smt d = 1.233400 (MIN: 1.08) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
VP9 libvpx Encoding 1.13 - Speed: Speed 5 - Input: Bosphorus 1080p (Frames Per Second, More Is Better): a = 29.37; b = 29.50; c = 29.46; no smt a = 29.62; no smt b = 29.71; smt a = 29.54; smt b = 29.60; smt c = 29.54; smt d = 29.66 [(CXX) g++ options: -m64 -lm -lpthread -O3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11]
Timed Linux Kernel Compilation 6.1 - Build: defconfig (Seconds, Fewer Is Better): a = 25.77; b = 25.75; c = 25.68; no smt a = 17.61; no smt b = 17.59; smt a = 17.38; smt b = 17.02; smt c = 17.05; smt d = 17.08
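Each chart lists several runs per configuration, so a per-group average makes the configurations easier to compare. The sketch below averages the kernel build times above by label prefix and reports the relative difference between the "smt" and "no smt" groups. The grouping rule and the group name "base" for runs a, b and c are assumptions made for this illustration; the result file does not define them.

```python
from statistics import mean

# Kernel defconfig build times in seconds, copied from the chart above.
build_seconds = {
    "a": 25.77, "b": 25.75, "c": 25.68,
    "no smt a": 17.61, "no smt b": 17.59,
    "smt a": 17.38, "smt b": 17.02, "smt c": 17.05, "smt d": 17.08,
}


def group_of(label: str) -> str:
    """Map a run label to a group name (hypothetical grouping by label prefix)."""
    if label.startswith("no smt"):
        return "no smt"
    if label.startswith("smt"):
        return "smt"
    return "base"


# Collect the runs of each group, then average them.
runs_by_group: dict[str, list[float]] = {}
for label, seconds in build_seconds.items():
    runs_by_group.setdefault(group_of(label), []).append(seconds)

group_means = {group: mean(values) for group, values in runs_by_group.items()}

# Relative change of the "smt" mean against the "no smt" mean, in percent.
smt_vs_no_smt_pct = 100.0 * (group_means["smt"] - group_means["no smt"]) / group_means["no smt"]

for group, value in group_means.items():
    print(f"{group}: mean {value:.2f} s over {len(runs_by_group[group])} runs")
print(f"smt vs no smt: {smt_vs_no_smt_pct:+.2f}%")
```

Because this metric is "Fewer Is Better", a negative percentage means the "smt" group built the kernel faster on average. With only two to four runs per group, the difference should be read as indicative rather than statistically established.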
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better): a = 29.37; b = 29.29; c = 29.41; no smt a = 34.59; no smt b = 34.68; smt a = 45.56; smt b = 46.25; smt c = 46.34; smt d = 46.25
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better): a = 33.03; b = 33.13; c = 33.10; no smt a = 38.38; no smt b = 38.65; smt a = 47.56; smt b = 46.62; smt c = 46.57; smt d = 46.21
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): a = 2.01581 (MIN: 1.81); b = 1.92340 (MIN: 1.72); c = 1.88686 (MIN: 1.68); no smt a = 2.04003 (MIN: 1.78); no smt b = 1.93126 (MIN: 1.77); smt a = 7.60566 (MIN: 6.54); smt b = 7.42270 (MIN: 6.44); smt c = 9.34981 (MIN: 7.73); smt d = 8.26078 (MIN: 7.09) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): a = 8.74576 (MIN: 3.65); b = 9.19570 (MIN: 4.13); c = 3.83017 (MIN: 2.72); no smt a = 6.19257 (MIN: 3.65); no smt b = 7.98861 (MIN: 3.88); smt a = 15.42100 (MIN: 10.4); smt b = 19.54900 (MIN: 10.67); smt c = 18.65940 (MIN: 11.23); smt d = 11.75730 (MIN: 8.17) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
oneDNN 3.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): a = 3.11497 (MIN: 2.47); b = 4.27858 (MIN: 3.37); c = 4.39720 (MIN: 3.21); no smt a = 4.87883 (MIN: 3.53); no smt b = 4.79302 (MIN: 3.4); smt a = 11.43260 (MIN: 7.59); smt b = 7.30147 (MIN: 5.77); smt c = 12.01300 (MIN: 8.02); smt d = 12.15710 (MIN: 8.52) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
VVenC 1.7 - Video Input: Bosphorus 1080p - Video Preset: Faster (Frames Per Second, More Is Better): a = 30.04; b = 29.85; c = 30.08; no smt a = 28.69; no smt b = 28.52 [(CXX) g++ options: -O3 -flto -fno-fat-lto-objects -flto=auto]
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): a = 0.290200 (MIN: 0.25); b = 0.291387 (MIN: 0.23); c = 0.285432 (MIN: 0.24); no smt a = 0.316779 (MIN: 0.25); no smt b = 0.340498 (MIN: 0.28); smt a = 0.621776 (MIN: 0.45); smt b = 0.620907 (MIN: 0.41); smt c = 0.634992 (MIN: 0.55); smt d = 0.627133 (MIN: 0.4) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): a = 0.357017 (MIN: 0.31); b = 0.356875 (MIN: 0.31); c = 0.353175 (MIN: 0.31); no smt a = 0.342836 (MIN: 0.28); no smt b = 0.347231 (MIN: 0.3); smt a = 0.674455 (MIN: 0.49); smt b = 0.675664 (MIN: 0.55); smt c = 0.676124 (MIN: 0.49); smt d = 0.672115 (MIN: 0.49) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
oneDNN 3.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): a = 0.263997 (MIN: 0.18); b = 0.263373 (MIN: 0.2); c = 0.262920 (MIN: 0.2); no smt a = 0.316665 (MIN: 0.24); no smt b = 0.293564 (MIN: 0.22); smt a = 0.591738 (MIN: 0.43); smt b = 0.600199 (MIN: 0.38); smt c = 0.593299 (MIN: 0.39); smt d = 0.602831 (MIN: 0.48) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Slow (Frames Per Second, More Is Better): a = 40.63; b = 40.86; c = 40.76; no smt a = 47.13; no smt b = 47.43; smt a = 64.63; smt b = 66.22; smt c = 65.49; smt d = 66.08 [(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt]
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better): a = 41.40; b = 41.39; c = 41.47; no smt a = 47.97; no smt b = 47.93; smt a = 65.56; smt b = 65.69; smt c = 65.59; smt d = 65.49 [(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt]
Timed FFmpeg Compilation 6.0 - Time To Compile (Seconds, Fewer Is Better): a = 12.57; b = 12.43; c = 12.47; no smt a = 10.60; no smt b = 10.66; smt a = 10.38; smt b = 10.26; smt c = 10.41; smt d = 10.37
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better): a = 68.82; b = 68.79; c = 69.04; no smt a = 57.13; no smt b = 58.09; smt a = 59.30; smt b = 57.86; smt c = 57.44; smt d = 57.52
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better): a = 70.56; b = 70.68; c = 71.13; no smt a = 57.78; no smt b = 57.23; smt a = 57.96; smt b = 57.51; smt c = 57.33; smt d = 58.76
uvg266 0.4.1 - Video Input: Bosphorus 4K - Video Preset: Super Fast (Frames Per Second, More Is Better): a = 69.33; b = 69.00; c = 69.92; no smt a = 59.83; no smt b = 58.33; smt a = 59.06; smt b = 59.20; smt c = 58.34; smt d = 58.66
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): a = 1.56461 (MIN: 1.41); b = 1.56257 (MIN: 1.41); c = 1.56742 (MIN: 1.39); no smt a = 2.19645 (MIN: 2); no smt b = 2.12379 (MIN: 1.9); smt a = 3.39151 (MIN: 2.93); smt b = 3.49453 (MIN: 3.11); smt c = 3.60194 (MIN: 3.08); smt d = 3.50563 (MIN: 3.19) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): a = 1.87777 (MIN: 1.23); b = 1.72975 (MIN: 1.27); c = 1.63860 (MIN: 1.2); no smt a = 2.02286 (MIN: 1.56); no smt b = 2.02833 (MIN: 1.43); smt a = 3.38833 (MIN: 2.76); smt b = 3.59643 (MIN: 2.7); smt c = 3.36535 (MIN: 2.7); smt d = 2.97427 (MIN: 2.3) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
oneDNN 3.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): a = 0.534450 (MIN: 0.49); b = 0.513068 (MIN: 0.47); c = 0.546991 (MIN: 0.5); no smt a = 0.556069 (MIN: 0.46); no smt b = 0.629026 (MIN: 0.51); smt a = 0.975357 (MIN: 0.92); smt b = 0.971704 (MIN: 0.82); smt c = 0.995710 (MIN: 0.87); smt d = 0.975091 (MIN: 0.91) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better): a = 80.90; b = 81.61; c = 80.31; no smt a = 70.02; no smt b = 69.94; smt a = 75.91; smt b = 74.11; smt c = 73.57; smt d = 73.24 [(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt]
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Super Fast (Frames Per Second, More Is Better): a = 81.90; b = 80.68; c = 84.55; no smt a = 75.50; no smt b = 74.03; smt a = 76.64; smt b = 76.70; smt c = 78.72; smt d = 75.50 [(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt]
Kvazaar 2.2 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better): a = 83.45; b = 84.77; c = 82.70; no smt a = 76.05; no smt b = 74.37; smt a = 77.96; smt b = 77.00; smt c = 76.14; smt d = 78.85 [(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt]
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Slow (Frames Per Second, More Is Better): a = 81.16; b = 81.41; c = 81.10; no smt a = 87.15; no smt b = 88.13; smt a = 80.17; smt b = 79.95; smt c = 79.63; smt d = 80.00
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, More Is Better): a = 91.37; b = 91.47; c = 91.29; no smt a = 98.21; no smt b = 96.67; smt a = 89.40; smt b = 89.70; smt c = 89.16; smt d = 88.91
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): a = 0.311405 (MIN: 0.3); b = 0.312721 (MIN: 0.28); c = 0.312289 (MIN: 0.28); no smt a = 0.254845 (MIN: 0.18); no smt b = 0.255796 (MIN: 0.18); smt a = 0.407430 (MIN: 0.27); smt b = 0.408074 (MIN: 0.27); smt c = 0.394377 (MIN: 0.29); smt d = 0.389223 (MIN: 0.29) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): a = 0.693367 (MIN: 0.64); b = 0.689064 (MIN: 0.64); c = 0.692667 (MIN: 0.65); no smt a = 0.403590 (MIN: 0.38); no smt b = 0.400325 (MIN: 0.38); smt a = 0.538539 (MIN: 0.48); smt b = 0.551783 (MIN: 0.49); smt c = 0.545118 (MIN: 0.45); smt d = 0.630630 (MIN: 0.48) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
oneDNN 3.0 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): a = 0.415446 (MIN: 0.4); b = 0.578466 (MIN: 0.4); c = 0.417188 (MIN: 0.4); no smt a = 0.291687 (MIN: 0.27); no smt b = 0.291229 (MIN: 0.27); smt a = 0.451472 (MIN: 0.4); smt b = 0.450261 (MIN: 0.37); smt c = 0.447820 (MIN: 0.37); smt d = 0.457554 (MIN: 0.34) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Slow (Frames Per Second, More Is Better): a = 139.92; b = 139.07; c = 140.10; no smt a = 155.56; no smt b = 159.79; smt a = 132.67; smt b = 136.33; smt c = 135.80; smt d = 135.56 [(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt]
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Medium (Frames Per Second, More Is Better): a = 143.31; b = 144.21; c = 143.81; no smt a = 159.63; no smt b = 161.40; smt a = 140.78; smt b = 138.33; smt c = 140.89; smt d = 136.42 [(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt]
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): a = 1.170040 (MIN: 1.07); b = 1.166960 (MIN: 1.07); c = 1.169190 (MIN: 1.07); no smt a = 0.664295 (MIN: 0.63); no smt b = 0.651301 (MIN: 0.62); smt a = 0.978590 (MIN: 0.93); smt b = 0.973555 (MIN: 0.92); smt c = 0.972863 (MIN: 0.92); smt d = 0.973984 (MIN: 0.92) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): a = 0.711588 (MIN: 0.68); b = 0.712495 (MIN: 0.68); c = 0.708563 (MIN: 0.67); no smt a = 0.462386 (MIN: 0.42); no smt b = 0.461174 (MIN: 0.43); smt a = 0.672585 (MIN: 0.52); smt b = 0.672768 (MIN: 0.53); smt c = 0.676798 (MIN: 0.53); smt d = 0.680395 (MIN: 0.53) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
oneDNN 3.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): a = 0.304972 (MIN: 0.28); b = 0.305731 (MIN: 0.28); c = 0.305599 (MIN: 0.28); no smt a = 0.160361 (MIN: 0.15); no smt b = 0.164845 (MIN: 0.15); smt a = 0.279638 (MIN: 0.23); smt b = 0.291087 (MIN: 0.23); smt c = 0.275917 (MIN: 0.23); smt d = 0.247825 (MIN: 0.23) [(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread]
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, More Is Better): a = 234.95; b = 234.68; c = 237.44; no smt a = 196.25; no smt b = 183.75; smt a = 183.99; smt b = 209.55; smt c = 192.25; smt d = 181.83
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Super Fast (Frames Per Second, More Is Better): a = 238.68; b = 239.92; c = 237.96; no smt a = 181.67; no smt b = 216.15; smt a = 178.49; smt b = 220.79; smt c = 178.78; smt d = 218.91
uvg266 0.4.1 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, More Is Better): a = 240.98; b = 240.91; c = 238.77; no smt a = 186.24; no smt b = 209.85; smt a = 216.15; smt b = 179.56; smt c = 218.28; smt d = 224.31
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Very Fast (Frames Per Second, More Is Better): a = 296.93; b = 290.59; c = 291.03; no smt a = 269.62; no smt b = 268.77; smt a = 250.71; smt b = 274.63; smt c = 243.36; smt d = 270.76 [(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt]
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Super Fast (Frames Per Second, More Is Better): a = 307.49; b = 303.99; c = 301.40; no smt a = 288.41; no smt b = 280.18; smt a = 296.72; smt b = 267.04; smt c = 288.09; smt d = 256.45 [(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt]
Kvazaar 2.2 - Video Input: Bosphorus 1080p - Video Preset: Ultra Fast (Frames Per Second, More Is Better): a = 310.73; b = 305.52; c = 309.64; no smt a = 278.62; no smt b = 271.45; smt a = 302.55; smt b = 295.18; smt c = 303.43; smt d = 303.63 [(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt]
Phoronix Test Suite v10.8.4