epyc 9654 AMD March Tests for a future article. 2 x AMD EPYC 9654 96-Core testing with a AMD Titanite_4G (RTI1004D BIOS) and ASPEED on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2303299-NE-EPYC9654A14&sro&grw .
epyc 9654 AMD March Processor Motherboard Chipset Memory Disk Graphics Monitor Network OS Kernel Desktop Display Server Vulkan Compiler File-System Screen Resolution a b c d e AMD EPYC 9654 96-Core @ 3.71GHz (96 Cores / 192 Threads) AMD Titanite_4G (RTI1004D BIOS) AMD Device 14a4 768GB 800GB INTEL SSDPF21Q800GB ASPEED VGA HDMI Broadcom NetXtreme BCM5720 PCIe Ubuntu 23.04 5.19.0-21-generic (x86_64) GNOME Shell 43.1 X Server 1.21.1.4 1.3.224 GCC 12.2.0 ext4 1920x1080 2 x AMD EPYC 9654 96-Core @ 3.71GHz (192 Cores / 384 Threads) 1520GB OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-l0Aoyl/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-l0Aoyl/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: amd-pstate performance (Boost: Enabled) - CPU Microcode: 0xa101111 Python Details - Python 3.10.9 Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
epyc 9654 AMD March draco: Church Facade specfem3d: Water-layered Halfspace draco: Lion specfem3d: Homogeneous Halfspace specfem3d: Tomographic Model daphne: OpenMP - NDT Mapping daphne: OpenMP - Points2Image daphne: OpenMP - Euclidean Cluster tensorflow: CPU - 16 - AlexNet tensorflow: CPU - 32 - AlexNet tensorflow: CPU - 64 - AlexNet tensorflow: CPU - 256 - AlexNet tensorflow: CPU - 512 - AlexNet tensorflow: CPU - 16 - GoogLeNet tensorflow: CPU - 16 - ResNet-50 tensorflow: CPU - 32 - GoogLeNet tensorflow: CPU - 32 - ResNet-50 tensorflow: CPU - 64 - GoogLeNet tensorflow: CPU - 64 - ResNet-50 tensorflow: CPU - 256 - GoogLeNet tensorflow: CPU - 256 - ResNet-50 tensorflow: CPU - 512 - GoogLeNet tensorflow: CPU - 512 - ResNet-50 deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Synchronous Single-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream deepsparse: CV Detection, YOLOv5s COCO - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream deepsparse: NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream deepsparse: NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream onnx: GPT-2 - CPU - Parallel onnx: GPT-2 - CPU - Standard onnx: yolov4 - CPU - Parallel onnx: yolov4 - CPU - Standard onnx: bertsquad-12 - CPU - Parallel onnx: bertsquad-12 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Parallel onnx: CaffeNet 12-int8 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Parallel onnx: ArcFace ResNet-100 - CPU - Standard specfem3d: Layered Halfspace onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: super-resolution-10 - CPU - Parallel onnx: super-resolution-10 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard opencv: Core opencv: Video opencv: Graph API opencv: Stitching opencv: Features 2D opencv: Image Processing opencv: Object Detection opencv: DNN - Deep Neural Network gromacs: MPI CPU - water_GMX50_bare build-ffmpeg: Time To Compile john-the-ripper: bcrypt john-the-ripper: WPA PSK john-the-ripper: Blowfish john-the-ripper: HMAC-SHA512 john-the-ripper: MD5 build-llvm: Ninja build-llvm: Unix Makefiles compress-zstd: 8 - Compression Speed compress-zstd: 8 - Decompression Speed compress-zstd: 12 - Compression Speed compress-zstd: 12 - Decompression Speed compress-zstd: 19 - Compression Speed compress-zstd: 19 - Decompression Speed compress-zstd: 8, Long Mode - Compression Speed compress-zstd: 8, Long Mode - Decompression Speed compress-zstd: 19, Long Mode - Compression Speed compress-zstd: 19, Long Mode - Decompression Speed dav1d: Chimera 1080p dav1d: Summer Nature 4K dav1d: Summer Nature 1080p dav1d: Chimera 1080p 10-bit ffmpeg: libx264 - Live ffmpeg: libx264 - Live ffmpeg: libx265 - Live ffmpeg: libx265 - Live ffmpeg: libx264 - Upload ffmpeg: libx264 - Upload ffmpeg: libx265 - Upload ffmpeg: libx265 - Upload ffmpeg: libx264 - Platform ffmpeg: libx264 - Platform ffmpeg: libx265 - Platform ffmpeg: libx265 - Platform ffmpeg: libx264 - Video On Demand ffmpeg: libx264 - Video On Demand ffmpeg: libx265 - Video On Demand ffmpeg: libx265 - Video On Demand build-godot: Time To Compile embree: Pathtracer - Crown embree: Pathtracer ISPC - Crown embree: Pathtracer - Asian Dragon embree: Pathtracer - Asian Dragon Obj embree: Pathtracer ISPC - Asian Dragon embree: Pathtracer ISPC - Asian Dragon Obj build2: Time To Compile nginx: 200 nginx: 500 apache: 200 apache: 500 openssl: SHA256 openssl: SHA512 openssl: RSA4096 openssl: RSA4096 openssl: ChaCha20 openssl: AES-128-GCM openssl: AES-256-GCM openssl: ChaCha20-Poly1305 clickhouse: 100M Rows Hits Dataset, First Run / Cold Cache clickhouse: 100M Rows Hits Dataset, Second Run clickhouse: 100M Rows Hits Dataset, Third Run memcached: 1:5 memcached: 1:10 memcached: 1:100 rocksdb: Rand Fill rocksdb: Rand Read rocksdb: Update Rand rocksdb: Seq Fill rocksdb: Rand Fill Sync rocksdb: Read While Writing rocksdb: Read Rand Write Rand pgbench: 1 - 800 - Read Only pgbench: 1 - 800 - Read Only - Average Latency pgbench: 1 - 1000 - Read Only pgbench: 1 - 1000 - Read Only - Average Latency pgbench: 1 - 800 - Read Write pgbench: 1 - 800 - Read Write - Average Latency pgbench: 1 - 1000 - Read Write pgbench: 1 - 1000 - Read Write - Average Latency pgbench: 100 - 800 - Read Only pgbench: 100 - 800 - Read Only - Average Latency pgbench: 100 - 1000 - Read Only pgbench: 100 - 1000 - Read Only - Average Latency pgbench: 100 - 800 - Read Write pgbench: 100 - 800 - Read Write - Average Latency pgbench: 100 - 1000 - Read Write pgbench: 100 - 1000 - Read Write - Average Latency mysqlslap: 512 mysqlslap: 1024 mysqlslap: 2048 mysqlslap: 4096 mysqlslap: 8192 specfem3d: Mount St. Helens build-nodejs: Time To Compile onnx: GPT-2 - CPU - Parallel onnx: GPT-2 - CPU - Standard onnx: yolov4 - CPU - Parallel onnx: yolov4 - CPU - Standard onnx: bertsquad-12 - CPU - Parallel onnx: bertsquad-12 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Parallel onnx: CaffeNet 12-int8 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Parallel onnx: ArcFace ResNet-100 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: super-resolution-10 - CPU - Parallel onnx: super-resolution-10 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard a b c d e 6872 20.426995567 5321 10.661133476 8.695806704 954.82 18078.769197393 1637.357959197 355.72 593.9 856.98 1276.22 1375.44 142.32 57.39 241.43 80.72 316.04 104.12 459.97 146.79 516.87 163.85 42.3271 1109.1052 35.6789 28.0217 1567.7533 30.5695 194.537 5.1385 400.507 119.5961 101.7889 9.8202 440.4626 108.8046 206.9391 4.8287 1003.5666 47.7743 195.128 5.1229 624.9625 76.7052 195.1995 5.1198 149.0497 321.4749 62.3453 16.0249 316.6097 151.3746 88.7046 11.2681 43.0198 1108.7024 35.4293 28.2192 159.109 128.485 6.40254 6.61035 12.1742 9.3652 599.496 524.041 1.25331 5.11448 24.1547 29.4284 19.841829247 194.561 207.156 111.698 123.879 30.7765 37.0363 65772 41999 230494 190987 71850 119907 24950 22944 11.248 12.811 163238 653913 163353 309175000 15556000 125.877 217.187 1225.5 1580.9 316.8 1633.7 17.4 1395.1 903.2 1619.8 8.5 1329.8 657.5 379.84 807.16 602.64 23.17 217.98 37.07 136.22 202.82901313 12.45 89.614916611 28.18 156.979756208 48.25 132.597004196 57.13 157.534841326 48.08 132.842087496 57.02 107.66 104.2447 110.7402 121.1083 106.5442 132.7141 113.1673 63.167 257954.01 240111.29 143188.9 173757.38 129947980460 40028926290 35951.3 1462850.1 510745602460 908982494280 780271471000 356999237630 584.98 602.50 612.78 3870015.6 3203112.6 2851726.85 644356 432927777 645530 662613 445786 9939924 2792738 3718995 0.215 3738972 0.267 676 1184.082 382 2619.509 3833822 0.209 3741941 0.267 58262 13.731 54169 18.461 915 912 860 654 446 8.549248083 133.701 6.27808 7.78072 156.183 151.274 82.137 106.776 1.66637 1.90737 797.88 195.521 41.3976 33.9783 5.13811 4.82662 8.95095 8.07185 32.4891 26.998 6788 20.448751145 5296 10.386901707 8.699354709 949.71 17717.930545712 1637.00 353.36 594.27 853.06 1272.13 1375.77 157.65 57.85 239.45 81.8 310.53 105.04 454.67 146.9 515.11 163.76 42.0961 1110.6738 35.4512 28.2018 1567.6088 30.5827 199.0439 5.0222 400.3873 119.6426 101.3977 9.8577 439.9247 108.8016 206.6159 4.8365 1001.9837 47.8736 193.3983 5.1686 626.0654 76.5973 195.7729 5.1045 149.2181 320.9851 61.9263 16.1331 320.1841 149.8129 88.5975 11.2807 42.2783 1108.9774 35.3934 28.2485 159.988 128.979 6.36316 6.48661 12.2692 12.0125 563.298 552.793 1.258 4.89263 24.5996 32.4597 19.458733603 194.057 233.812 111.351 113.016 30.6798 37.3945 65256 38143 204945 190687 73789 119436 24394 23755 11.237 13.012 163353 654104 163241 308492000 15608000 126.288 217.451 1217.1 1593 317.4 1629 17.4 1399.2 900.7 1625.7 8.56 1336.1 656.31 381.16 806.08 603.05 23.15 218.14 36.891181239 136.89 203.29 12.42 89.730906933 28.14 156.98 48.26 132.947013563 56.98 157.201576479 48.19 132.48 57.18 107.352 105.1822 111.2807 121.2691 107.1054 132.7068 113.4477 63.014 258099.68 237868.2 164665.51 208703.78 129484883100 40018641540 35946.3 1462987.4 506631603000 910814515750 776495266470 356991460690 582.44 603.93 606.58 3833862.64 3163183.27 2813977.52 640977 435267657 644787 662046 446883 9108882 2785663 3803554 0.21 3707315 0.27 552 1449.389 476 2099.011 3816666 0.21 3730123 0.268 61635 12.98 54579 18.322 898 874 839 693 438 8.433500494 132.796 6.24337 7.75078 157.15 154.16 81.5006 83.2442 1.77352 1.80855 794.909 204.386 40.6488 30.8057 5.15191 4.27622 8.97883 8.84768 32.5919 26.739 6888 19.898906002 5270 10.665913818 8.463161346 937.41 17373.125512145 1636.09 354.96 591.78 857.87 1276.62 1378.85 158.76 57.9 239.48 80.41 295.75 105.29 455.45 146.99 518.52 163.66 42.8173 1108.4187 35.5865 28.0947 1569.2552 30.5443 197.0861 5.0721 399.6568 119.8062 100.9338 9.9034 439.0234 109.2253 206.6742 4.8353 1005.7215 47.6861 196.1478 5.0961 625.2598 76.6635 194.0788 5.1492 149.1959 321.1526 62.2933 16.0386 317.4936 150.7648 88.831 11.251 42.1065 1112.3295 35.4677 28.1884 159.596 126.839 6.3862 6.96155 12.1735 9.35231 598.432 536.343 1.17121 5.10242 24.648 32.5181 19.778071377 183.218 207.345 112.031 123.129 29.0291 37.0657 68743 37173 207090 191634 75180 122137 23509 23144 11.246 13.161 163353 653913 163299 309621000 15608000 126.228 214.021 1220.5 1577.1 314.8 1641.7 17.4 1393.8 893.2 1613.8 8.38 1338 657.22 383.95 809.86 603.51 23.203361631 217.64 36.98 136.56 202.88 12.45 89.78 28.13 157.00 48.25 132.872175892 57.01 156.93 48.27 132.626633247 57.12 107.075 105.124 111.3003 120.9085 106.9305 132.5315 113.3183 63.222 255419.28 241662.73 165838.18 185857.48 130061831940 40002428390 35968.3 1462827.1 510959296510 909186377320 779191711330 356961832960 578.76 592.33 623.39 3858723.95 3154822.34 2821181.21 641338 434781404 647442 660783 451734 10000158 2802641 3804637 0.21 3672471 0.272 711 1125.104 424 2357.907 3785876 0.211 3776352 0.265 65740 12.169 54120 18.477 894 873 852 678 439 8.26604604 133.106 6.25916 7.88188 156.583 143.643 82.1422 106.923 1.66912 1.86383 853.816 195.983 40.5691 30.7501 5.45677 4.82211 8.92427 8.12097 34.4452 26.9766 6721 12.810395283 5218 6.276873254 5.078447644 802.27 13064.153704123 1506.75 184.36 330.97 588.27 1347.52 1843.74 67.8 25.41 120.06 45.4 191.95 68.59 382.29 131.65 530.51 166.89 84.5516 1133.2605 33.947 29.4515 2845.8208 33.6465 196.6158 5.0842 754.4392 126.9401 99.65 10.0304 840.7314 113.905 203.7038 4.9053 1953.9329 49.0273 191.6421 5.2159 1209.9217 79.1804 188.2083 5.3096 281.229 339.5778 61.0447 16.3645 620.4124 154.2824 86.1837 11.5966 85.0229 1127.5771 34.2025 29.2312 111.962 126.738 4.09797 4.84593 8.85423 10.5465 326.983 417.118 0.922595 4.78422 13.2688 19.7663 11.916989228 101.311 137.868 88.9539 123.932 24.7447 36.2449 267066 126947 390167 268869 110697 333961 71477 34502 18.413 10.857 315340 1263000 315110 286156000 27276000 97.849 199.261 1217.7 1611.3 317.8 1636 17.3 1406.1 865.5 1615.1 8.34 1334.7 23.087381477 218.73 39.276270815 128.58 198.824978146 12.70 89.69 28.15 155.276512617 48.78 134.37552551 56.37 155.359546761 48.76 131.931538366 57.42 97.311 172.2769 180.4766 194.9019 174.2633 211.9803 181.4535 60.474 196034.9 141164.84 258641794620 79615585770 72050.2 2936562.9 1017352168790 1810537338580 1552708662150 710753283550 525.86 536.53 551.92 2575013.52 3068513.78 2571874.26 438023 863491650 436861 438833 357298 16955493 1689260 3638802 0.22 3700926 0.27 565 1415.162 455 2198.896 3643496 0.22 3381479 0.296 46860 17.072 44818 22.312 624 561 650 578 384 4.709617691 106.05 8.92113 7.88812 244.017 206.355 112.935 94.8158 3.0553 2.39687 1083.89 209.018 75.3612 50.5887 9.86863 7.25262 11.2389 8.06837 40.4083 27.5873 6784 10.759381397 5300 6.233892837 5.322492803 760.15 11981.045985251 1494.17 184.99 319.03 597.46 1386.72 1775.18 67.07 24.48 106.1 45.9 177.73 68.53 377.56 132.39 521.87 171.18 84.2779 1136.2993 34.372 29.0871 2819.9596 33.9411 196.3169 5.0918 754.7591 126.8569 100.2364 9.9712 838.0345 114.3093 202.464 4.9353 1964.2605 48.8074 189.0071 5.2886 1204.6601 79.542 190.6621 5.2412 279.9601 340.4654 61.6961 16.1904 617.5781 155.0272 86.2895 11.5825 84.4009 1135.447 34.5339 28.9508 112.505 102.872 4.06505 4.70693 8.98441 8.85844 323.431 482.694 0.880761 4.46687 13.307 19.2692 12.57728528 102.184 139.612 89.8277 123.282 26.622 35.5324 182548 122021 382454 241229 119368 312082 33386 47834 19.134 11.19 314928 1255000 314188 292569000 27169000 97.521 200.02 1213.4 1599.2 315.2 1611.2 17.3 1385 943.2 1606.4 8.46 1330.7 23.09 218.71 37.18 135.83 199.473346601 12.66 89.59 28.18 155.684439476 48.66 137.04 55.28 155.854204067 48.60 134.34 56.39 98.009 173.5786 180.8434 195.2731 173.9641 213.2995 182.3542 60.348 194753.92 142512.83 258600679990 79537665830 72086.6 2937037.2 1017084218150 1804869906900 1551466680320 710739693380 527.95 536.90 568.25 2550554.27 3063463.55 2595069.86 431217 859555542 435840 438667 174168 14344054 1713878 3669550 0.218 3716270 0.269 523 1529.108 534 1871.558 3274920 0.244 3461133 0.289 17020 47.002 14053 71.159 334 336 327 351 293 4.677201807 104.355 8.8782 9.71824 245.992 212.449 111.298 112.883 3.08913 2.07112 1135.38 223.867 75.1448 51.8944 9.78363 7.16208 11.1294 8.11091 37.5588 28.1403 OpenBenchmarking.org
Google Draco Model: Church Facade OpenBenchmarking.org ms, Fewer Is Better Google Draco 1.5.6 Model: Church Facade a b c d e 1500 3000 4500 6000 7500 6872 6788 6888 6721 6784 1. (CXX) g++ options: -O3
SPECFEM3D Model: Water-layered Halfspace OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.0 Model: Water-layered Halfspace a b c d e 5 10 15 20 25 20.43 20.45 19.90 12.81 10.76 1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Google Draco Model: Lion OpenBenchmarking.org ms, Fewer Is Better Google Draco 1.5.6 Model: Lion a b c d e 1100 2200 3300 4400 5500 5321 5296 5270 5218 5300 1. (CXX) g++ options: -O3
SPECFEM3D Model: Homogeneous Halfspace OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.0 Model: Homogeneous Halfspace a b c d e 3 6 9 12 15 10.661133476 10.386901707 10.665913818 6.276873254 6.233892837 1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
SPECFEM3D Model: Tomographic Model OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.0 Model: Tomographic Model a b c d e 2 4 6 8 10 8.695806704 8.699354709 8.463161346 5.078447644 5.322492803 1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Darmstadt Automotive Parallel Heterogeneous Suite Backend: OpenMP - Kernel: NDT Mapping OpenBenchmarking.org Test Cases Per Minute, More Is Better Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02 Backend: OpenMP - Kernel: NDT Mapping a b c d e 200 400 600 800 1000 954.82 949.71 937.41 802.27 760.15 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp
Darmstadt Automotive Parallel Heterogeneous Suite Backend: OpenMP - Kernel: Points2Image OpenBenchmarking.org Test Cases Per Minute, More Is Better Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02 Backend: OpenMP - Kernel: Points2Image a b c d e 4K 8K 12K 16K 20K 18078.77 17717.93 17373.13 13064.15 11981.05 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp
Darmstadt Automotive Parallel Heterogeneous Suite Backend: OpenMP - Kernel: Euclidean Cluster OpenBenchmarking.org Test Cases Per Minute, More Is Better Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02 Backend: OpenMP - Kernel: Euclidean Cluster a b c d e 400 800 1200 1600 2000 1637.36 1637.00 1636.09 1506.75 1494.17 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp
TensorFlow Device: CPU - Batch Size: 16 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 16 - Model: AlexNet a b c d e 80 160 240 320 400 355.72 353.36 354.96 184.36 184.99
TensorFlow Device: CPU - Batch Size: 32 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 32 - Model: AlexNet a b c d e 130 260 390 520 650 593.90 594.27 591.78 330.97 319.03
TensorFlow Device: CPU - Batch Size: 64 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 64 - Model: AlexNet a b c d e 200 400 600 800 1000 856.98 853.06 857.87 588.27 597.46
TensorFlow Device: CPU - Batch Size: 256 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 256 - Model: AlexNet a b c d e 300 600 900 1200 1500 1276.22 1272.13 1276.62 1347.52 1386.72
TensorFlow Device: CPU - Batch Size: 512 - Model: AlexNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 512 - Model: AlexNet a b c d e 400 800 1200 1600 2000 1375.44 1375.77 1378.85 1843.74 1775.18
TensorFlow Device: CPU - Batch Size: 16 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 16 - Model: GoogLeNet a b c d e 40 80 120 160 200 142.32 157.65 158.76 67.80 67.07
TensorFlow Device: CPU - Batch Size: 16 - Model: ResNet-50 OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 16 - Model: ResNet-50 a b c d e 13 26 39 52 65 57.39 57.85 57.90 25.41 24.48
TensorFlow Device: CPU - Batch Size: 32 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 32 - Model: GoogLeNet a b c d e 50 100 150 200 250 241.43 239.45 239.48 120.06 106.10
TensorFlow Device: CPU - Batch Size: 32 - Model: ResNet-50 OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 32 - Model: ResNet-50 a b c d e 20 40 60 80 100 80.72 81.80 80.41 45.40 45.90
TensorFlow Device: CPU - Batch Size: 64 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 64 - Model: GoogLeNet a b c d e 70 140 210 280 350 316.04 310.53 295.75 191.95 177.73
TensorFlow Device: CPU - Batch Size: 64 - Model: ResNet-50 OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 64 - Model: ResNet-50 a b c d e 20 40 60 80 100 104.12 105.04 105.29 68.59 68.53
TensorFlow Device: CPU - Batch Size: 256 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 256 - Model: GoogLeNet a b c d e 100 200 300 400 500 459.97 454.67 455.45 382.29 377.56
TensorFlow Device: CPU - Batch Size: 256 - Model: ResNet-50 OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 256 - Model: ResNet-50 a b c d e 30 60 90 120 150 146.79 146.90 146.99 131.65 132.39
TensorFlow Device: CPU - Batch Size: 512 - Model: GoogLeNet OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 512 - Model: GoogLeNet a b c d e 110 220 330 440 550 516.87 515.11 518.52 530.51 521.87
TensorFlow Device: CPU - Batch Size: 512 - Model: ResNet-50 OpenBenchmarking.org images/sec, More Is Better TensorFlow 2.12 Device: CPU - Batch Size: 512 - Model: ResNet-50 a b c d e 40 80 120 160 200 163.85 163.76 163.66 166.89 171.18
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream a b c d e 20 40 60 80 100 42.33 42.10 42.82 84.55 84.28
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream a b c d e 200 400 600 800 1000 1109.11 1110.67 1108.42 1133.26 1136.30
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream a b c d e 8 16 24 32 40 35.68 35.45 35.59 33.95 34.37
Neural Magic DeepSparse Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream a b c d e 7 14 21 28 35 28.02 28.20 28.09 29.45 29.09
Neural Magic DeepSparse Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream a b c d e 600 1200 1800 2400 3000 1567.75 1567.61 1569.26 2845.82 2819.96
Neural Magic DeepSparse Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream a b c d e 8 16 24 32 40 30.57 30.58 30.54 33.65 33.94
Neural Magic DeepSparse Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream a b c d e 40 80 120 160 200 194.54 199.04 197.09 196.62 196.32
Neural Magic DeepSparse Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream a b c d e 1.1562 2.3124 3.4686 4.6248 5.781 5.1385 5.0222 5.0721 5.0842 5.0918
Neural Magic DeepSparse Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream a b c d e 160 320 480 640 800 400.51 400.39 399.66 754.44 754.76
Neural Magic DeepSparse Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream a b c d e 30 60 90 120 150 119.60 119.64 119.81 126.94 126.86
Neural Magic DeepSparse Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream a b c d e 20 40 60 80 100 101.79 101.40 100.93 99.65 100.24
Neural Magic DeepSparse Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream a b c d e 3 6 9 12 15 9.8202 9.8577 9.9034 10.0304 9.9712
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream a b c d e 200 400 600 800 1000 440.46 439.92 439.02 840.73 838.03
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream a b c d e 30 60 90 120 150 108.80 108.80 109.23 113.91 114.31
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream a b c d e 50 100 150 200 250 206.94 206.62 206.67 203.70 202.46
Neural Magic DeepSparse Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream a b c d e 1.1104 2.2208 3.3312 4.4416 5.552 4.8287 4.8365 4.8353 4.9053 4.9353
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream a b c d e 400 800 1200 1600 2000 1003.57 1001.98 1005.72 1953.93 1964.26
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream a b c d e 11 22 33 44 55 47.77 47.87 47.69 49.03 48.81
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream a b c d e 40 80 120 160 200 195.13 193.40 196.15 191.64 189.01
Neural Magic DeepSparse Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream a b c d e 1.1899 2.3798 3.5697 4.7596 5.9495 5.1229 5.1686 5.0961 5.2159 5.2886
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream a b c d e 300 600 900 1200 1500 624.96 626.07 625.26 1209.92 1204.66
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream a b c d e 20 40 60 80 100 76.71 76.60 76.66 79.18 79.54
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream a b c d e 40 80 120 160 200 195.20 195.77 194.08 188.21 190.66
Neural Magic DeepSparse Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream a b c d e 1.1947 2.3894 3.5841 4.7788 5.9735 5.1198 5.1045 5.1492 5.3096 5.2412
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream a b c d e 60 120 180 240 300 149.05 149.22 149.20 281.23 279.96
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream a b c d e 70 140 210 280 350 321.47 320.99 321.15 339.58 340.47
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream a b c d e 14 28 42 56 70 62.35 61.93 62.29 61.04 61.70
Neural Magic DeepSparse Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream a b c d e 4 8 12 16 20 16.02 16.13 16.04 16.36 16.19
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream a b c d e 130 260 390 520 650 316.61 320.18 317.49 620.41 617.58
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream a b c d e 30 60 90 120 150 151.37 149.81 150.76 154.28 155.03
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream a b c d e 20 40 60 80 100 88.70 88.60 88.83 86.18 86.29
Neural Magic DeepSparse Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream a b c d e 3 6 9 12 15 11.27 11.28 11.25 11.60 11.58
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream a b c d e 20 40 60 80 100 43.02 42.28 42.11 85.02 84.40
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream a b c d e 200 400 600 800 1000 1108.70 1108.98 1112.33 1127.58 1135.45
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream OpenBenchmarking.org items/sec, More Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream a b c d e 8 16 24 32 40 35.43 35.39 35.47 34.20 34.53
Neural Magic DeepSparse Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream OpenBenchmarking.org ms/batch, Fewer Is Better Neural Magic DeepSparse 1.3.2 Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream a b c d e 7 14 21 28 35 28.22 28.25 28.19 29.23 28.95
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: GPT-2 - Device: CPU - Executor: Parallel a b c d e 40 80 120 160 200 159.11 159.99 159.60 111.96 112.51 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: GPT-2 - Device: CPU - Executor: Standard a b c d e 30 60 90 120 150 128.49 128.98 126.84 126.74 102.87 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: yolov4 - Device: CPU - Executor: Parallel a b c d e 2 4 6 8 10 6.40254 6.36316 6.38620 4.09797 4.06505 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: yolov4 - Device: CPU - Executor: Standard a b c d e 2 4 6 8 10 6.61035 6.48661 6.96155 4.84593 4.70693 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: bertsquad-12 - Device: CPU - Executor: Parallel a b c d e 3 6 9 12 15 12.17420 12.26920 12.17350 8.85423 8.98441 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: bertsquad-12 - Device: CPU - Executor: Standard a b c d e 3 6 9 12 15 9.36520 12.01250 9.35231 10.54650 8.85844 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel a b c d e 130 260 390 520 650 599.50 563.30 598.43 326.98 323.43 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard a b c d e 120 240 360 480 600 524.04 552.79 536.34 417.12 482.69 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel a b c d e 0.2831 0.5662 0.8493 1.1324 1.4155 1.253310 1.258000 1.171210 0.922595 0.880761 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: fcn-resnet101-11 - Device: CPU - Executor: Standard a b c d e 1.1508 2.3016 3.4524 4.6032 5.754 5.11448 4.89263 5.10242 4.78422 4.46687 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel a b c d e 6 12 18 24 30 24.15 24.60 24.65 13.27 13.31 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard a b c d e 8 16 24 32 40 29.43 32.46 32.52 19.77 19.27 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
SPECFEM3D Model: Layered Halfspace OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.0 Model: Layered Halfspace a b c d e 5 10 15 20 25 19.84 19.46 19.78 11.92 12.58 1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel a b c d e 40 80 120 160 200 194.56 194.06 183.22 101.31 102.18 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard a b c d e 50 100 150 200 250 207.16 233.81 207.35 137.87 139.61 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: super-resolution-10 - Device: CPU - Executor: Parallel a b c d e 30 60 90 120 150 111.70 111.35 112.03 88.95 89.83 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: super-resolution-10 - Device: CPU - Executor: Standard a b c d e 30 60 90 120 150 123.88 113.02 123.13 123.93 123.28 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel a b c d e 7 14 21 28 35 30.78 30.68 29.03 24.74 26.62 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.14 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard a b c d e 9 18 27 36 45 37.04 37.39 37.07 36.24 35.53 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
OpenCV Test: Core OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Core a b c d e 60K 120K 180K 240K 300K 65772 65256 68743 267066 182548 1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
OpenCV Test: Video OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Video a b c d e 30K 60K 90K 120K 150K 41999 38143 37173 126947 122021 1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
OpenCV Test: Graph API OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Graph API a b c d e 80K 160K 240K 320K 400K 230494 204945 207090 390167 382454 1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
OpenCV Test: Stitching OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Stitching a b c d e 60K 120K 180K 240K 300K 190987 190687 191634 268869 241229 1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
OpenCV Test: Features 2D OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Features 2D a b c d e 30K 60K 90K 120K 150K 71850 73789 75180 110697 119368 1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
OpenCV Test: Image Processing OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Image Processing a b c d e 70K 140K 210K 280K 350K 119907 119436 122137 333961 312082 1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
OpenCV Test: Object Detection OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: Object Detection a b c d e 15K 30K 45K 60K 75K 24950 24394 23509 71477 33386 1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
OpenCV Test: DNN - Deep Neural Network OpenBenchmarking.org ms, Fewer Is Better OpenCV 4.7 Test: DNN - Deep Neural Network a b c d e 10K 20K 30K 40K 50K 22944 23755 23144 34502 47834 1. (CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared
GROMACS Implementation: MPI CPU - Input: water_GMX50_bare OpenBenchmarking.org Ns Per Day, More Is Better GROMACS 2023 Implementation: MPI CPU - Input: water_GMX50_bare a b c d e 5 10 15 20 25 11.25 11.24 11.25 18.41 19.13 1. (CXX) g++ options: -O3
Timed FFmpeg Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed FFmpeg Compilation 6.0 Time To Compile a b c d e 3 6 9 12 15 12.81 13.01 13.16 10.86 11.19
John The Ripper Test: bcrypt OpenBenchmarking.org Real C/S, More Is Better John The Ripper 2023.03.14 Test: bcrypt a b c d e 70K 140K 210K 280K 350K 163238 163353 163353 315340 314928 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt
John The Ripper Test: WPA PSK OpenBenchmarking.org Real C/S, More Is Better John The Ripper 2023.03.14 Test: WPA PSK a b c d e 300K 600K 900K 1200K 1500K 653913 654104 653913 1263000 1255000 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt
John The Ripper Test: Blowfish OpenBenchmarking.org Real C/S, More Is Better John The Ripper 2023.03.14 Test: Blowfish a b c d e 70K 140K 210K 280K 350K 163353 163241 163299 315110 314188 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt
John The Ripper Test: HMAC-SHA512 OpenBenchmarking.org Real C/S, More Is Better John The Ripper 2023.03.14 Test: HMAC-SHA512 a b c d e 70M 140M 210M 280M 350M 309175000 308492000 309621000 286156000 292569000 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt
John The Ripper Test: MD5 OpenBenchmarking.org Real C/S, More Is Better John The Ripper 2023.03.14 Test: MD5 a b c d e 6M 12M 18M 24M 30M 15556000 15608000 15608000 27276000 27169000 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt
Timed LLVM Compilation Build System: Ninja OpenBenchmarking.org Seconds, Fewer Is Better Timed LLVM Compilation 16.0 Build System: Ninja a b c d e 30 60 90 120 150 125.88 126.29 126.23 97.85 97.52
Timed LLVM Compilation Build System: Unix Makefiles OpenBenchmarking.org Seconds, Fewer Is Better Timed LLVM Compilation 16.0 Build System: Unix Makefiles a b c d e 50 100 150 200 250 217.19 217.45 214.02 199.26 200.02
Zstd Compression Compression Level: 8 - Compression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 8 - Compression Speed a b c d e 300 600 900 1200 1500 1225.5 1217.1 1220.5 1217.7 1213.4 1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression Compression Level: 8 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 8 - Decompression Speed a b c d e 300 600 900 1200 1500 1580.9 1593.0 1577.1 1611.3 1599.2 1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression Compression Level: 12 - Compression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 12 - Compression Speed a b c d e 70 140 210 280 350 316.8 317.4 314.8 317.8 315.2 1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression Compression Level: 12 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 12 - Decompression Speed a b c d e 400 800 1200 1600 2000 1633.7 1629.0 1641.7 1636.0 1611.2 1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression Compression Level: 19 - Compression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 19 - Compression Speed a b c d e 4 8 12 16 20 17.4 17.4 17.4 17.3 17.3 1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression Compression Level: 19 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 19 - Decompression Speed a b c d e 300 600 900 1200 1500 1395.1 1399.2 1393.8 1406.1 1385.0 1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression Compression Level: 8, Long Mode - Compression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 8, Long Mode - Compression Speed a b c d e 200 400 600 800 1000 903.2 900.7 893.2 865.5 943.2 1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression Compression Level: 8, Long Mode - Decompression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 8, Long Mode - Decompression Speed a b c d e 300 600 900 1200 1500 1619.8 1625.7 1613.8 1615.1 1606.4 1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression Compression Level: 19, Long Mode - Compression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 19, Long Mode - Compression Speed a b c d e 2 4 6 8 10 8.50 8.56 8.38 8.34 8.46 1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression Compression Level: 19, Long Mode - Decompression Speed OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.5.4 Compression Level: 19, Long Mode - Decompression Speed a b c d e 300 600 900 1200 1500 1329.8 1336.1 1338.0 1334.7 1330.7 1. (CC) gcc options: -O3 -pthread -lz -llzma
dav1d Video Input: Chimera 1080p OpenBenchmarking.org FPS, More Is Better dav1d 1.1 Video Input: Chimera 1080p a b c 140 280 420 560 700 657.50 656.31 657.22 1. (CC) gcc options: -pthread -lm
dav1d Video Input: Summer Nature 4K OpenBenchmarking.org FPS, More Is Better dav1d 1.1 Video Input: Summer Nature 4K a b c 80 160 240 320 400 379.84 381.16 383.95 1. (CC) gcc options: -pthread -lm
dav1d Video Input: Summer Nature 1080p OpenBenchmarking.org FPS, More Is Better dav1d 1.1 Video Input: Summer Nature 1080p a b c 200 400 600 800 1000 807.16 806.08 809.86 1. (CC) gcc options: -pthread -lm
dav1d Video Input: Chimera 1080p 10-bit OpenBenchmarking.org FPS, More Is Better dav1d 1.1 Video Input: Chimera 1080p 10-bit a b c 130 260 390 520 650 602.64 603.05 603.51 1. (CC) gcc options: -pthread -lm
FFmpeg Encoder: libx264 - Scenario: Live OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Live a b c d e 6 12 18 24 30 23.17 23.15 23.20 23.09 23.09 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx264 - Scenario: Live OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Live a b c d e 50 100 150 200 250 217.98 218.14 217.64 218.73 218.71 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Live OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Live a b c d e 9 18 27 36 45 37.07 36.89 36.98 39.28 37.18 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Live OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Live a b c d e 30 60 90 120 150 136.22 136.89 136.56 128.58 135.83 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx264 - Scenario: Upload OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Upload a b c d e 40 80 120 160 200 202.83 203.29 202.88 198.82 199.47 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx264 - Scenario: Upload OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Upload a b c d e 3 6 9 12 15 12.45 12.42 12.45 12.70 12.66 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Upload OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Upload a b c d e 20 40 60 80 100 89.61 89.73 89.78 89.69 89.59 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Upload OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Upload a b c d e 7 14 21 28 35 28.18 28.14 28.13 28.15 28.18 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx264 - Scenario: Platform OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Platform a b c d e 30 60 90 120 150 156.98 156.98 157.00 155.28 155.68 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx264 - Scenario: Platform OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Platform a b c d e 11 22 33 44 55 48.25 48.26 48.25 48.78 48.66 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Platform OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Platform a b c d e 30 60 90 120 150 132.60 132.95 132.87 134.38 137.04 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Platform OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Platform a b c d e 13 26 39 52 65 57.13 56.98 57.01 56.37 55.28 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx264 - Scenario: Video On Demand OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Video On Demand a b c d e 30 60 90 120 150 157.53 157.20 156.93 155.36 155.85 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx264 - Scenario: Video On Demand OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx264 - Scenario: Video On Demand a b c d e 11 22 33 44 55 48.08 48.19 48.27 48.76 48.60 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Video On Demand OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Video On Demand a b c d e 30 60 90 120 150 132.84 132.48 132.63 131.93 134.34 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Video On Demand OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Video On Demand a b c d e 13 26 39 52 65 57.02 57.18 57.12 57.42 56.39 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Timed Godot Game Engine Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Godot Game Engine Compilation 4.0 Time To Compile a b c d e 20 40 60 80 100 107.66 107.35 107.08 97.31 98.01
Embree Binary: Pathtracer - Model: Crown OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.0.1 Binary: Pathtracer - Model: Crown a b c d e 40 80 120 160 200 104.24 105.18 105.12 172.28 173.58 MIN: 102.16 / MAX: 107.49 MIN: 103.34 / MAX: 107.96 MIN: 102.85 / MAX: 108.38 MIN: 167.64 / MAX: 181.13 MIN: 168.69 / MAX: 180.7
Embree Binary: Pathtracer ISPC - Model: Crown OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.0.1 Binary: Pathtracer ISPC - Model: Crown a b c d e 40 80 120 160 200 110.74 111.28 111.30 180.48 180.84 MIN: 108.24 / MAX: 114.35 MIN: 108.92 / MAX: 115.18 MIN: 108.84 / MAX: 114.92 MIN: 174.62 / MAX: 189.72 MIN: 174.89 / MAX: 190.36
Embree Binary: Pathtracer - Model: Asian Dragon OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.0.1 Binary: Pathtracer - Model: Asian Dragon a b c d e 40 80 120 160 200 121.11 121.27 120.91 194.90 195.27 MIN: 118.84 / MAX: 124.01 MIN: 119.43 / MAX: 123.26 MIN: 119.01 / MAX: 122.81 MIN: 190.69 / MAX: 207.25 MIN: 191.17 / MAX: 206.18
Embree Binary: Pathtracer - Model: Asian Dragon Obj OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.0.1 Binary: Pathtracer - Model: Asian Dragon Obj a b c d e 40 80 120 160 200 106.54 107.11 106.93 174.26 173.96 MIN: 104.86 / MAX: 109.02 MIN: 105.52 / MAX: 109.66 MIN: 105.38 / MAX: 108.8 MIN: 170.44 / MAX: 179.73 MIN: 169.75 / MAX: 180.18
Embree Binary: Pathtracer ISPC - Model: Asian Dragon OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.0.1 Binary: Pathtracer ISPC - Model: Asian Dragon a b c d e 50 100 150 200 250 132.71 132.71 132.53 211.98 213.30 MIN: 131.03 / MAX: 135.12 MIN: 130.87 / MAX: 135.14 MIN: 130.91 / MAX: 135.37 MIN: 207.56 / MAX: 231.22 MIN: 208.86 / MAX: 228.78
Embree Binary: Pathtracer ISPC - Model: Asian Dragon Obj OpenBenchmarking.org Frames Per Second, More Is Better Embree 4.0.1 Binary: Pathtracer ISPC - Model: Asian Dragon Obj a b c d e 40 80 120 160 200 113.17 113.45 113.32 181.45 182.35 MIN: 111.68 / MAX: 116.14 MIN: 111.85 / MAX: 115.79 MIN: 111.66 / MAX: 115.92 MIN: 177.49 / MAX: 186.64 MIN: 178.38 / MAX: 188.72
Build2 Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Build2 0.15 Time To Compile a b c d e 14 28 42 56 70 63.17 63.01 63.22 60.47 60.35
nginx Connections: 200 OpenBenchmarking.org Requests Per Second, More Is Better nginx 1.23.2 Connections: 200 a b c 60K 120K 180K 240K 300K 257954.01 258099.68 255419.28 1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
nginx Connections: 500 OpenBenchmarking.org Requests Per Second, More Is Better nginx 1.23.2 Connections: 500 a b c d e 50K 100K 150K 200K 250K 240111.29 237868.20 241662.73 196034.90 194753.92 1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
Apache HTTP Server Concurrent Requests: 200 OpenBenchmarking.org Requests Per Second, More Is Better Apache HTTP Server 2.4.56 Concurrent Requests: 200 a b c 40K 80K 120K 160K 200K 143188.90 164665.51 165838.18 1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
Apache HTTP Server Concurrent Requests: 500 OpenBenchmarking.org Requests Per Second, More Is Better Apache HTTP Server 2.4.56 Concurrent Requests: 500 a b c d e 40K 80K 120K 160K 200K 173757.38 208703.78 185857.48 141164.84 142512.83 1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
OpenSSL Algorithm: SHA256 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.1 Algorithm: SHA256 a b c d e 60000M 120000M 180000M 240000M 300000M 129947980460 129484883100 130061831940 258641794620 258600679990 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: SHA512 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.1 Algorithm: SHA512 a b c d e 20000M 40000M 60000M 80000M 100000M 40028926290 40018641540 40002428390 79615585770 79537665830 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: RSA4096 OpenBenchmarking.org sign/s, More Is Better OpenSSL 3.1 Algorithm: RSA4096 a b c d e 15K 30K 45K 60K 75K 35951.3 35946.3 35968.3 72050.2 72086.6 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: RSA4096 OpenBenchmarking.org verify/s, More Is Better OpenSSL 3.1 Algorithm: RSA4096 a b c d e 600K 1200K 1800K 2400K 3000K 1462850.1 1462987.4 1462827.1 2936562.9 2937037.2 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: ChaCha20 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.1 Algorithm: ChaCha20 a b c d e 200000M 400000M 600000M 800000M 1000000M 510745602460 506631603000 510959296510 1017352168790 1017084218150 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: AES-128-GCM OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.1 Algorithm: AES-128-GCM a b c d e 400000M 800000M 1200000M 1600000M 2000000M 908982494280 910814515750 909186377320 1810537338580 1804869906900 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: AES-256-GCM OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.1 Algorithm: AES-256-GCM a b c d e 300000M 600000M 900000M 1200000M 1500000M 780271471000 776495266470 779191711330 1552708662150 1551466680320 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
OpenSSL Algorithm: ChaCha20-Poly1305 OpenBenchmarking.org byte/s, More Is Better OpenSSL 3.1 Algorithm: ChaCha20-Poly1305 a b c d e 150000M 300000M 450000M 600000M 750000M 356999237630 356991460690 356961832960 710753283550 710739693380 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
ClickHouse 100M Rows Hits Dataset, First Run / Cold Cache OpenBenchmarking.org Queries Per Minute, Geo Mean, More Is Better ClickHouse 22.12.3.5 100M Rows Hits Dataset, First Run / Cold Cache a b c d e 130 260 390 520 650 584.98 582.44 578.76 525.86 527.95 MIN: 58.37 / MAX: 6000 MIN: 56.98 / MAX: 6000 MIN: 57.75 / MAX: 6000 MIN: 60.3 / MAX: 5454.55 MIN: 61.35 / MAX: 5454.55
ClickHouse 100M Rows Hits Dataset, Second Run OpenBenchmarking.org Queries Per Minute, Geo Mean, More Is Better ClickHouse 22.12.3.5 100M Rows Hits Dataset, Second Run a b c d e 130 260 390 520 650 602.50 603.93 592.33 536.53 536.90 MIN: 58.14 / MAX: 6666.67 MIN: 59 / MAX: 7500 MIN: 58.2 / MAX: 5000 MIN: 74.17 / MAX: 6666.67 MIN: 75.09 / MAX: 6000
ClickHouse 100M Rows Hits Dataset, Third Run OpenBenchmarking.org Queries Per Minute, Geo Mean, More Is Better ClickHouse 22.12.3.5 100M Rows Hits Dataset, Third Run a b c d e 130 260 390 520 650 612.78 606.58 623.39 551.92 568.25 MIN: 59.52 / MAX: 5454.55 MIN: 58.71 / MAX: 5454.55 MIN: 57.97 / MAX: 7500 MIN: 87.98 / MAX: 6000 MIN: 90.09 / MAX: 6666.67
Memcached Set To Get Ratio: 1:5 OpenBenchmarking.org Ops/sec, More Is Better Memcached 1.6.19 Set To Get Ratio: 1:5 a b c d e 800K 1600K 2400K 3200K 4000K 3870015.60 3833862.64 3858723.95 2575013.52 2550554.27 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Memcached Set To Get Ratio: 1:10 OpenBenchmarking.org Ops/sec, More Is Better Memcached 1.6.19 Set To Get Ratio: 1:10 a b c d e 700K 1400K 2100K 2800K 3500K 3203112.60 3163183.27 3154822.34 3068513.78 3063463.55 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Memcached Set To Get Ratio: 1:100 OpenBenchmarking.org Ops/sec, More Is Better Memcached 1.6.19 Set To Get Ratio: 1:100 a b c d e 600K 1200K 1800K 2400K 3000K 2851726.85 2813977.52 2821181.21 2571874.26 2595069.86 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
RocksDB Test: Random Fill OpenBenchmarking.org Op/s, More Is Better RocksDB 8.0 Test: Random Fill a b c d e 140K 280K 420K 560K 700K 644356 640977 641338 438023 431217 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB Test: Random Read OpenBenchmarking.org Op/s, More Is Better RocksDB 8.0 Test: Random Read a b c d e 200M 400M 600M 800M 1000M 432927777 435267657 434781404 863491650 859555542 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB Test: Update Random OpenBenchmarking.org Op/s, More Is Better RocksDB 8.0 Test: Update Random a b c d e 140K 280K 420K 560K 700K 645530 644787 647442 436861 435840 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB Test: Sequential Fill OpenBenchmarking.org Op/s, More Is Better RocksDB 8.0 Test: Sequential Fill a b c d e 140K 280K 420K 560K 700K 662613 662046 660783 438833 438667 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB Test: Random Fill Sync OpenBenchmarking.org Op/s, More Is Better RocksDB 8.0 Test: Random Fill Sync a b c d e 100K 200K 300K 400K 500K 445786 446883 451734 357298 174168 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB Test: Read While Writing OpenBenchmarking.org Op/s, More Is Better RocksDB 8.0 Test: Read While Writing a b c d e 4M 8M 12M 16M 20M 9939924 9108882 10000158 16955493 14344054 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB Test: Read Random Write Random OpenBenchmarking.org Op/s, More Is Better RocksDB 8.0 Test: Read Random Write Random a b c d e 600K 1200K 1800K 2400K 3000K 2792738 2785663 2802641 1689260 1713878 1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
PostgreSQL Scaling Factor: 1 - Clients: 800 - Mode: Read Only OpenBenchmarking.org TPS, More Is Better PostgreSQL 15 Scaling Factor: 1 - Clients: 800 - Mode: Read Only a b c d e 800K 1600K 2400K 3200K 4000K 3718995 3803554 3804637 3638802 3669550 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 1 - Clients: 800 - Mode: Read Only - Average Latency OpenBenchmarking.org ms, Fewer Is Better PostgreSQL 15 Scaling Factor: 1 - Clients: 800 - Mode: Read Only - Average Latency a b c d e 0.0495 0.099 0.1485 0.198 0.2475 0.215 0.210 0.210 0.220 0.218 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 1 - Clients: 1000 - Mode: Read Only OpenBenchmarking.org TPS, More Is Better PostgreSQL 15 Scaling Factor: 1 - Clients: 1000 - Mode: Read Only a b c d e 800K 1600K 2400K 3200K 4000K 3738972 3707315 3672471 3700926 3716270 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 1 - Clients: 1000 - Mode: Read Only - Average Latency OpenBenchmarking.org ms, Fewer Is Better PostgreSQL 15 Scaling Factor: 1 - Clients: 1000 - Mode: Read Only - Average Latency a b c d e 0.0612 0.1224 0.1836 0.2448 0.306 0.267 0.270 0.272 0.270 0.269 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 1 - Clients: 800 - Mode: Read Write OpenBenchmarking.org TPS, More Is Better PostgreSQL 15 Scaling Factor: 1 - Clients: 800 - Mode: Read Write a b c d e 150 300 450 600 750 676 552 711 565 523 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 1 - Clients: 800 - Mode: Read Write - Average Latency OpenBenchmarking.org ms, Fewer Is Better PostgreSQL 15 Scaling Factor: 1 - Clients: 800 - Mode: Read Write - Average Latency a b c d e 300 600 900 1200 1500 1184.08 1449.39 1125.10 1415.16 1529.11 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 1 - Clients: 1000 - Mode: Read Write OpenBenchmarking.org TPS, More Is Better PostgreSQL 15 Scaling Factor: 1 - Clients: 1000 - Mode: Read Write a b c d e 120 240 360 480 600 382 476 424 455 534 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 1 - Clients: 1000 - Mode: Read Write - Average Latency OpenBenchmarking.org ms, Fewer Is Better PostgreSQL 15 Scaling Factor: 1 - Clients: 1000 - Mode: Read Write - Average Latency a b c d e 600 1200 1800 2400 3000 2619.51 2099.01 2357.91 2198.90 1871.56 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 100 - Clients: 800 - Mode: Read Only OpenBenchmarking.org TPS, More Is Better PostgreSQL 15 Scaling Factor: 100 - Clients: 800 - Mode: Read Only a b c d e 800K 1600K 2400K 3200K 4000K 3833822 3816666 3785876 3643496 3274920 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 100 - Clients: 800 - Mode: Read Only - Average Latency OpenBenchmarking.org ms, Fewer Is Better PostgreSQL 15 Scaling Factor: 100 - Clients: 800 - Mode: Read Only - Average Latency a b c d e 0.0549 0.1098 0.1647 0.2196 0.2745 0.209 0.210 0.211 0.220 0.244 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 100 - Clients: 1000 - Mode: Read Only OpenBenchmarking.org TPS, More Is Better PostgreSQL 15 Scaling Factor: 100 - Clients: 1000 - Mode: Read Only a b c d e 800K 1600K 2400K 3200K 4000K 3741941 3730123 3776352 3381479 3461133 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency OpenBenchmarking.org ms, Fewer Is Better PostgreSQL 15 Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency a b c d e 0.0666 0.1332 0.1998 0.2664 0.333 0.267 0.268 0.265 0.296 0.289 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 100 - Clients: 800 - Mode: Read Write OpenBenchmarking.org TPS, More Is Better PostgreSQL 15 Scaling Factor: 100 - Clients: 800 - Mode: Read Write a b c d e 14K 28K 42K 56K 70K 58262 61635 65740 46860 17020 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 100 - Clients: 800 - Mode: Read Write - Average Latency OpenBenchmarking.org ms, Fewer Is Better PostgreSQL 15 Scaling Factor: 100 - Clients: 800 - Mode: Read Write - Average Latency a b c d e 11 22 33 44 55 13.73 12.98 12.17 17.07 47.00 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 100 - Clients: 1000 - Mode: Read Write OpenBenchmarking.org TPS, More Is Better PostgreSQL 15 Scaling Factor: 100 - Clients: 1000 - Mode: Read Write a b c d e 12K 24K 36K 48K 60K 54169 54579 54120 44818 14053 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
PostgreSQL Scaling Factor: 100 - Clients: 1000 - Mode: Read Write - Average Latency OpenBenchmarking.org ms, Fewer Is Better PostgreSQL 15 Scaling Factor: 100 - Clients: 1000 - Mode: Read Write - Average Latency a b c d e 16 32 48 64 80 18.46 18.32 18.48 22.31 71.16 1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
MariaDB Clients: 512 OpenBenchmarking.org Queries Per Second, More Is Better MariaDB 11.0.1 Clients: 512 a b c d e 200 400 600 800 1000 915 898 894 624 334 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++
MariaDB Clients: 1024 OpenBenchmarking.org Queries Per Second, More Is Better MariaDB 11.0.1 Clients: 1024 a b c d e 200 400 600 800 1000 912 874 873 561 336 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++
MariaDB Clients: 2048 OpenBenchmarking.org Queries Per Second, More Is Better MariaDB 11.0.1 Clients: 2048 a b c d e 200 400 600 800 1000 860 839 852 650 327 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++
MariaDB Clients: 4096 OpenBenchmarking.org Queries Per Second, More Is Better MariaDB 11.0.1 Clients: 4096 a b c d e 150 300 450 600 750 654 693 678 578 351 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++
MariaDB Clients: 8192 OpenBenchmarking.org Queries Per Second, More Is Better MariaDB 11.0.1 Clients: 8192 a b c d e 100 200 300 400 500 446 438 439 384 293 1. (CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++
SPECFEM3D Model: Mount St. Helens OpenBenchmarking.org Seconds, Fewer Is Better SPECFEM3D 4.0 Model: Mount St. Helens a b c d e 2 4 6 8 10 8.549248083 8.433500494 8.266046040 4.709617691 4.677201807 1. (F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
Timed Node.js Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Node.js Compilation 19.8.1 Time To Compile a b c d e 30 60 90 120 150 133.70 132.80 133.11 106.05 104.36
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: GPT-2 - Device: CPU - Executor: Parallel a b c d e 2 4 6 8 10 6.27808 6.24337 6.25916 8.92113 8.87820 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: GPT-2 - Device: CPU - Executor: Standard a b c d e 3 6 9 12 15 7.78072 7.75078 7.88188 7.88812 9.71824 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: yolov4 - Device: CPU - Executor: Parallel a b c d e 50 100 150 200 250 156.18 157.15 156.58 244.02 245.99 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: yolov4 - Device: CPU - Executor: Standard a b c d e 50 100 150 200 250 151.27 154.16 143.64 206.36 212.45 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: bertsquad-12 - Device: CPU - Executor: Parallel a b c d e 30 60 90 120 150 82.14 81.50 82.14 112.94 111.30 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: bertsquad-12 - Device: CPU - Executor: Standard a b c d e 30 60 90 120 150 106.78 83.24 106.92 94.82 112.88 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel a b c d e 0.6951 1.3902 2.0853 2.7804 3.4755 1.66637 1.77352 1.66912 3.05530 3.08913 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard a b c d e 0.5393 1.0786 1.6179 2.1572 2.6965 1.90737 1.80855 1.86383 2.39687 2.07112 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel a b c d e 200 400 600 800 1000 797.88 794.91 853.82 1083.89 1135.38 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: fcn-resnet101-11 - Device: CPU - Executor: Standard a b c d e 50 100 150 200 250 195.52 204.39 195.98 209.02 223.87 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel a b c d e 20 40 60 80 100 41.40 40.65 40.57 75.36 75.14 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard a b c d e 12 24 36 48 60 33.98 30.81 30.75 50.59 51.89 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel a b c d e 3 6 9 12 15 5.13811 5.15191 5.45677 9.86863 9.78363 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard a b c d e 2 4 6 8 10 4.82662 4.27622 4.82211 7.25262 7.16208 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: super-resolution-10 - Device: CPU - Executor: Parallel a b c d e 3 6 9 12 15 8.95095 8.97883 8.92427 11.23890 11.12940 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: super-resolution-10 - Device: CPU - Executor: Standard a b c d e 2 4 6 8 10 8.07185 8.84768 8.12097 8.06837 8.11091 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel a b c d e 9 18 27 36 45 32.49 32.59 34.45 40.41 37.56 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.14 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard a b c d e 7 14 21 28 35 27.00 26.74 26.98 27.59 28.14 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt
Phoronix Test Suite v10.8.5