epyc 9654 AMD March: tests for a future article. 2 x AMD EPYC 9654 96-Core testing with an AMD Titanite_4G (RTI1004D BIOS) motherboard and ASPEED graphics on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2303299-NE-EPYC9654A14&grs&sor
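This result file can be reproduced locally through the Phoronix Test Suite. A minimal sketch, assuming the suite is installed and the result ID above is still published on OpenBenchmarking.org:

    # Fetch the public result file and run the same tests on this machine for comparison
    phoronix-test-suite benchmark 2303299-NE-EPYC9654A14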
epyc 9654 AMD March - System Details
Processor / Memory: configurations a, b, and c ran a single AMD EPYC 9654 96-Core @ 3.71GHz (96 Cores / 192 Threads) with 768GB of memory; configurations d and e ran 2 x AMD EPYC 9654 96-Core @ 3.71GHz (192 Cores / 384 Threads) with 1520GB of memory.
Motherboard: AMD Titanite_4G (RTI1004D BIOS)
Chipset: AMD Device 14a4
Disk: 800GB INTEL SSDPF21Q800GB
Graphics: ASPEED
Monitor: VGA HDMI
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.04
Kernel: 5.19.0-21-generic (x86_64)
Desktop: GNOME Shell 43.1
Display Server: X Server 1.21.1.4
Vulkan: 1.3.224
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 1920x1080
Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-l0Aoyl/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-l0Aoyl/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: amd-pstate performance (Boost: Enabled); CPU Microcode: 0xa101111
Python Details - Python 3.10.9
Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
epyc 9654 AMD March - Result Overview: consolidated raw values for configurations a through e across the full test set (OpenCV, PostgreSQL pgbench, MariaDB mysqlslap, RocksDB, TensorFlow, Neural Magic DeepSparse, OpenSSL, John The Ripper, ONNX Runtime, SPECFEM3D, GROMACS, Embree, Memcached, DAPHNE, Apache HTTP Server, nginx, ClickHouse, Zstd compression, FFmpeg, dav1d, Draco, and the timed LLVM, Node.js, FFmpeg, Godot, and Build2 compilations). The per-test results follow below.
OpenCV 4.7 - Test: Core (ms, fewer is better): b 65256, a 65772, c 68743, e 182548, d 267066. [(CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared]
PostgreSQL 15 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Write - Average Latency (ms, fewer is better): b 18.32, a 18.46, c 18.48, d 22.31, e 71.16. [(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm]
PostgreSQL 15 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Write (TPS, more is better): b 54579, a 54169, c 54120, d 44818, e 14053. [(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm]
PostgreSQL 15 - Scaling Factor: 100 - Clients: 800 - Mode: Read Write (TPS, more is better): c 65740, b 61635, a 58262, d 46860, e 17020. [(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm]
PostgreSQL 15 - Scaling Factor: 100 - Clients: 800 - Mode: Read Write - Average Latency (ms, fewer is better): c 12.17, b 12.98, a 13.73, d 17.07, e 47.00. [(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm]
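For reference, the parameters in the PostgreSQL result titles map directly onto the standard pgbench command line. A rough sketch, assuming a local server and a database named bench (the database name and 60-second duration are illustrative, not from the result file):

    pgbench -i -s 100 bench              # initialize at scaling factor 100
    pgbench -c 800 -j 8 -T 60 bench      # 800 clients, default read/write (TPC-B-like) script
    pgbench -c 800 -j 8 -T 60 -S bench   # -S selects the read-only (select-only) script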
OpenCV 4.7 - Test: Video (ms, fewer is better): c 37173, b 38143, a 41999, e 122021, d 126947. [(CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared]
OpenCV 4.7 - Test: Object Detection (ms, fewer is better): c 23509, b 24394, a 24950, e 33386, d 71477. [(CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared]
OpenCV 4.7 - Test: Image Processing (ms, fewer is better): b 119436, a 119907, c 122137, e 312082, d 333961. [(CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared]
MariaDB 11.0.1 - Clients: 512 (Queries Per Second, more is better): a 915, b 898, c 894, d 624, e 334. [(CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++]
MariaDB 11.0.1 - Clients: 1024 (Queries Per Second, more is better): a 912, b 874, c 873, d 561, e 336. [(CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++]
MariaDB 11.0.1 - Clients: 2048 (Queries Per Second, more is better): a 860, c 852, b 839, d 650, e 327. [(CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++]
RocksDB 8.0 - Test: Random Fill Sync (Op/s, more is better): c 451734, b 446883, a 445786, d 357298, e 174168. [(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread]
TensorFlow 2.12 - Device: CPU - Batch Size: 16 - Model: GoogLeNet (images/sec, more is better): c 158.76, b 157.65, a 142.32, d 67.80, e 67.07.
TensorFlow 2.12 - Device: CPU - Batch Size: 16 - Model: ResNet-50 (images/sec, more is better): c 57.90, b 57.85, a 57.39, d 25.41, e 24.48.
TensorFlow 2.12 - Device: CPU - Batch Size: 32 - Model: GoogLeNet (images/sec, more is better): a 241.43, c 239.48, b 239.45, d 120.06, e 106.10.
OpenCV 4.7 - Test: DNN - Deep Neural Network (ms, fewer is better): a 22944, c 23144, b 23755, d 34502, e 47834. [(CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared]
Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec, more is better): d 85.02, e 84.40, a 43.02, b 42.28, c 42.11.
Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec, more is better): d 84.55, e 84.28, c 42.82, a 42.33, b 42.10.
OpenSSL 3.1 - Algorithm: ChaCha20 (byte/s, more is better): d 1017352168790, e 1017084218150, c 510959296510, a 510745602460, b 506631603000. [(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl]
OpenSSL 3.1 - Algorithm: RSA4096 (verify/s, more is better): e 2937037.2, d 2936562.9, b 1462987.4, a 1462850.1, c 1462827.1. [(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl]
OpenSSL 3.1 - Algorithm: RSA4096 (sign/s, more is better): e 72086.6, d 72050.2, c 35968.3, a 35951.3, b 35946.3. [(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl]
OpenSSL 3.1 - Algorithm: AES-256-GCM (byte/s, more is better): d 1552708662150, e 1551466680320, a 780271471000, c 779191711330, b 776495266470. [(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl]
OpenSSL 3.1 - Algorithm: SHA256 (byte/s, more is better): d 258641794620, e 258600679990, c 130061831940, a 129947980460, b 129484883100. [(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl]
RocksDB 8.0 - Test: Random Read (Op/s, more is better): d 863491650, e 859555542, b 435267657, c 434781404, a 432927777. [(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread]
OpenSSL 3.1 - Algorithm: AES-128-GCM (byte/s, more is better): d 1810537338580, e 1804869906900, b 910814515750, c 909186377320, a 908982494280. [(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl]
OpenSSL 3.1 - Algorithm: ChaCha20-Poly1305 (byte/s, more is better): d 710753283550, e 710739693380, a 356999237630, b 356991460690, c 356961832960. [(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl]
OpenSSL 3.1 - Algorithm: SHA512 (byte/s, more is better): d 79615585770, e 79537665830, a 40028926290, b 40018641540, c 40002428390. [(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl]
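The OpenSSL results above come from OpenSSL's built-in speed benchmark. A comparable by-hand run, assuming the same OpenSSL 3.1 binary (the -multi count simply spawns one worker per core):

    openssl speed -multi $(nproc) -evp aes-256-gcm   # bulk cipher throughput in byte/s
    openssl speed -multi $(nproc) -evp sha256        # hash throughput
    openssl speed -multi $(nproc) rsa4096            # reports sign/s and verify/s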
MariaDB 11.0.1 - Clients: 4096 (Queries Per Second, more is better): b 693, c 678, a 654, d 578, e 351. [(CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++]
Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream (items/sec, more is better): e 1964.26, d 1953.93, c 1005.72, a 1003.57, b 1001.98.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (items/sec, more is better): d 620.41, e 617.58, b 320.18, c 317.49, a 316.61.
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec, more is better): d 1209.92, e 1204.66, b 626.07, c 625.26, a 624.96.
John The Ripper 2023.03.14 - Test: bcrypt (Real C/S, more is better): d 315340, e 314928, c 163353, b 163353, a 163238. [(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt]
John The Ripper 2023.03.14 - Test: WPA PSK (Real C/S, more is better): d 1263000, e 1255000, b 654104, c 653913, a 653913. [(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt]
John The Ripper 2023.03.14 - Test: Blowfish (Real C/S, more is better): d 315110, e 314188, a 163353, c 163299, b 163241. [(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt]
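John The Ripper's figures are its built-in benchmark throughput. A similar standalone check, assuming a stock john build (format names are the usual JtR spellings and can vary between builds):

    john --test --format=bcrypt
    john --test --format=wpapsk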
TensorFlow 2.12 - Device: CPU - Batch Size: 16 - Model: AlexNet (images/sec, more is better): a 355.72, c 354.96, b 353.36, e 184.99, d 184.36.
ONNX Runtime 1.14 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel (Inferences Per Second, more is better): a 194.56, b 194.06, c 183.22, e 102.18, d 101.31. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (items/sec, more is better): d 840.73, e 838.03, a 440.46, b 439.92, c 439.02.
OpenCV 4.7 - Test: Graph API (ms, fewer is better): b 204945, c 207090, a 230494, e 382454, d 390167. [(CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared]
SPECFEM3D 4.0 - Model: Water-layered Halfspace (Seconds, fewer is better): e 10.76, d 12.81, c 19.90, a 20.43, b 20.45. [(F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz]
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream (items/sec, more is better): e 754.76, d 754.44, a 400.51, b 400.39, c 399.66.
Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream (items/sec, more is better): d 281.23, e 279.96, b 149.22, c 149.20, a 149.05.
TensorFlow 2.12 - Device: CPU - Batch Size: 32 - Model: AlexNet (images/sec, more is better): b 594.27, a 593.90, c 591.78, d 330.97, e 319.03.
RocksDB 8.0 - Test: Read While Writing (Op/s, more is better): d 16955493, e 14344054, c 10000158, a 9939924, b 9108882. [(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread]
ONNX Runtime 1.14 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel (Inferences Per Second, more is better): c 24.65, b 24.60, a 24.15, e 13.31, d 13.27. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
ONNX Runtime 1.14 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel (Inferences Per Second, more is better): a 599.50, c 598.43, b 563.30, d 326.98, e 323.43. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
SPECFEM3D 4.0 - Model: Mount St. Helens (Seconds, fewer is better): e 4.677201807, d 4.709617691, c 8.266046040, b 8.433500494, a 8.549248083. [(F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz]
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (items/sec, more is better): d 2845.82, e 2819.96, c 1569.26, a 1567.75, b 1567.61.
TensorFlow 2.12 - Device: CPU - Batch Size: 32 - Model: ResNet-50 (images/sec, more is better): b 81.80, a 80.72, c 80.41, e 45.90, d 45.40.
TensorFlow 2.12 - Device: CPU - Batch Size: 64 - Model: GoogLeNet (images/sec, more is better): a 316.04, b 310.53, c 295.75, d 191.95, e 177.73.
John The Ripper 2023.03.14 - Test: MD5 (Real C/S, more is better): d 27276000, e 27169000, c 15608000, b 15608000, a 15556000. [(CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt]
SPECFEM3D 4.0 - Model: Tomographic Model (Seconds, fewer is better): d 5.078447644, e 5.322492803, c 8.463161346, a 8.695806704, b 8.699354709. [(F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz]
SPECFEM3D 4.0 - Model: Homogeneous Halfspace (Seconds, fewer is better): e 6.233892837, d 6.276873254, b 10.386901707, a 10.661133476, c 10.665913818. [(F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz]
GROMACS 2023 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, more is better): e 19.13, d 18.41, a 11.25, c 11.25, b 11.24. [(CXX) g++ options: -O3]
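GROMACS reports the Ns Per Day figure itself at the end of a run. A hand-run equivalent, assuming the water_GMX50_bare input has already been pre-processed into a run file (all file names here are illustrative):

    gmx grompp -f pme.mdp -c conf.gro -p topol.top -o water.tpr   # build the run input
    gmx mdrun -s water.tpr -ntomp $(nproc) -nsteps 1000           # short benchmark run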
ONNX Runtime 1.14 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inferences Per Second, more is better): b 233.81, c 207.35, a 207.16, e 139.61, d 137.87. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
ONNX Runtime 1.14 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inferences Per Second, more is better): c 32.52, b 32.46, a 29.43, d 19.77, e 19.27. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
Embree 4.0.1 - Binary: Pathtracer - Model: Crown (Frames Per Second, more is better): e 173.58 (min 168.69 / max 180.7), d 172.28 (min 167.64 / max 181.13), b 105.18 (min 103.34 / max 107.96), c 105.12 (min 102.85 / max 108.38), a 104.24 (min 102.16 / max 107.49).
SPECFEM3D 4.0 - Model: Layered Halfspace (Seconds, fewer is better): d 11.92, e 12.58, b 19.46, c 19.78, a 19.84. [(F9X) gfortran options: -O2 -fopenmp -std=f2003 -fimplicit-none -fmax-errors=10 -pedantic -pedantic-errors -O3 -finline-functions -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz]
OpenCV 4.7 - Test: Features 2D (ms, fewer is better): a 71850, b 73789, c 75180, d 110697, e 119368. [(CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared]
RocksDB 8.0 - Test: Read Random Write Random (Op/s, more is better): c 2802641, a 2792738, b 2785663, e 1713878, d 1689260. [(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread]
Embree 4.0.1 - Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, more is better): d 174.26 (min 170.44 / max 179.73), e 173.96 (min 169.75 / max 180.18), b 107.11 (min 105.52 / max 109.66), c 106.93 (min 105.38 / max 108.8), a 106.54 (min 104.86 / max 109.02).
Embree 4.0.1 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, more is better): e 180.84 (min 174.89 / max 190.36), d 180.48 (min 174.62 / max 189.72), c 111.30 (min 108.84 / max 114.92), b 111.28 (min 108.92 / max 115.18), a 110.74 (min 108.24 / max 114.35).
Embree 4.0.1 - Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, more is better): e 195.27 (min 191.17 / max 206.18), d 194.90 (min 190.69 / max 207.25), b 121.27 (min 119.43 / max 123.26), a 121.11 (min 118.84 / max 124.01), c 120.91 (min 119.01 / max 122.81).
Embree 4.0.1 - Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, more is better): e 182.35 (min 178.38 / max 188.72), d 181.45 (min 177.49 / max 186.64), b 113.45 (min 111.85 / max 115.79), c 113.32 (min 111.66 / max 115.92), a 113.17 (min 111.68 / max 116.14).
Embree 4.0.1 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, more is better): e 213.30 (min 208.86 / max 228.78), d 211.98 (min 207.56 / max 231.22), a 132.71 (min 131.03 / max 135.12), b 132.71 (min 130.87 / max 135.14), c 132.53 (min 130.91 / max 135.37).
ONNX Runtime 1.14 - Model: yolov4 - Device: CPU - Executor: Parallel (Inferences Per Second, more is better): a 6.40254, c 6.38620, b 6.36316, d 4.09797, e 4.06505. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
TensorFlow 2.12 - Device: CPU - Batch Size: 64 - Model: ResNet-50 (images/sec, more is better): c 105.29, b 105.04, a 104.12, d 68.59, e 68.53.
MariaDB 11.0.1 - Clients: 8192 (Queries Per Second, more is better): a 446, c 439, b 438, d 384, e 293. [(CXX) g++ options: -fPIC -pie -fstack-protector -O3 -shared -lrt -lpthread -lz -ldl -lm -lstdc++]
Memcached 1.6.19 - Set To Get Ratio: 1:5 (Ops/sec, more is better): a 3870015.60, c 3858723.95, b 3833862.64, d 2575013.52, e 2550554.27. [(CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre]
RocksDB 8.0 - Test: Sequential Fill (Op/s, more is better): a 662613, b 662046, c 660783, d 438833, e 438667. [(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread]
Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02 - Backend: OpenMP - Kernel: Points2Image (Test Cases Per Minute, more is better): a 18078.77, b 17717.93, c 17373.13, d 13064.15, e 11981.05. [(CXX) g++ options: -O3 -std=c++11 -fopenmp]
RocksDB 8.0 - Test: Random Fill (Op/s, more is better): a 644356, c 641338, b 640977, d 438023, e 431217. [(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread]
RocksDB 8.0 - Test: Update Random (Op/s, more is better): c 647442, a 645530, b 644787, d 436861, e 435840. [(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread]
ONNX Runtime 1.14 - Model: yolov4 - Device: CPU - Executor: Standard (Inferences Per Second, more is better): c 6.96155, a 6.61035, b 6.48661, d 4.84593, e 4.70693. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
Apache HTTP Server 2.4.56 - Concurrent Requests: 500 (Requests Per Second, more is better): b 208703.78, c 185857.48, a 173757.38, e 142512.83, d 141164.84. [(CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2]
TensorFlow 2.12 - Device: CPU - Batch Size: 64 - Model: AlexNet (images/sec, more is better): c 857.87, a 856.98, b 853.06, e 597.46, d 588.27.
ONNX Runtime 1.14 - Model: GPT-2 - Device: CPU - Executor: Parallel (Inferences Per Second, more is better): b 159.99, c 159.60, a 159.11, e 112.51, d 111.96. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
ONNX Runtime 1.14 - Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel (Inferences Per Second, more is better): b 1.258000, a 1.253310, c 1.171210, d 0.922595, e 0.880761. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
OpenCV 4.7 - Test: Stitching (ms, fewer is better): b 190687, a 190987, c 191634, e 241229, d 268869. [(CXX) g++ options: -fPIC -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -shared]
PostgreSQL 15 - Scaling Factor: 1 - Clients: 1000 - Mode: Read Write - Average Latency (ms, fewer is better): e 1871.56, b 2099.01, d 2198.90, c 2357.91, a 2619.51. [(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm]
PostgreSQL 15 - Scaling Factor: 1 - Clients: 1000 - Mode: Read Write (TPS, more is better): e 534, b 476, d 455, c 424, a 382. [(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm]
ONNX Runtime 1.14 - Model: bertsquad-12 - Device: CPU - Executor: Parallel (Inferences Per Second, more is better): b 12.26920, a 12.17420, c 12.17350, e 8.98441, d 8.85423. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
PostgreSQL 15 - Scaling Factor: 1 - Clients: 800 - Mode: Read Write (TPS, more is better): c 711, a 676, d 565, b 552, e 523. [(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm]
PostgreSQL 15 - Scaling Factor: 1 - Clients: 800 - Mode: Read Write - Average Latency (ms, fewer is better): c 1125.10, a 1184.08, d 1415.16, b 1449.39, e 1529.11. [(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm]
ONNX Runtime 1.14 - Model: bertsquad-12 - Device: CPU - Executor: Standard (Inferences Per Second, more is better): b 12.01250, d 10.54650, a 9.36520, c 9.35231, e 8.85844. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
TensorFlow 2.12 - Device: CPU - Batch Size: 512 - Model: AlexNet (images/sec, more is better): d 1843.74, e 1775.18, c 1378.85, b 1375.77, a 1375.44.
ONNX Runtime 1.14 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard (Inferences Per Second, more is better): b 552.79, c 536.34, a 524.04, e 482.69, d 417.12. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
Timed LLVM Compilation 16.0 - Build System: Ninja (Seconds, fewer is better): e 97.52, d 97.85, a 125.88, c 126.23, b 126.29.
Timed Node.js Compilation 19.8.1 - Time To Compile (Seconds, fewer is better): e 104.36, d 106.05, b 132.80, c 133.11, a 133.70.
ONNX Runtime 1.14 - Model: super-resolution-10 - Device: CPU - Executor: Parallel (Inferences Per Second, more is better): c 112.03, a 111.70, b 111.35, e 89.83, d 88.95. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02 - Backend: OpenMP - Kernel: NDT Mapping (Test Cases Per Minute, more is better): a 954.82, b 949.71, c 937.41, d 802.27, e 760.15. [(CXX) g++ options: -O3 -std=c++11 -fopenmp]
ONNX Runtime 1.14 - Model: GPT-2 - Device: CPU - Executor: Standard (Inferences Per Second, more is better): b 128.98, a 128.49, c 126.84, d 126.74, e 102.87. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
ONNX Runtime 1.14 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel (Inferences Per Second, more is better): a 30.78, b 30.68, c 29.03, e 26.62, d 24.74. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
nginx 1.23.2 - Connections: 500 (Requests Per Second, more is better): c 241662.73, a 240111.29, b 237868.20, d 196034.90, e 194753.92. [(CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2]
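The build flags shown for the Apache and nginx entries (-lluajit-5.1 and friends) belong to the load generator rather than the servers, which suggests wrk is driving the requests. A comparable manual run, assuming a server already listening locally (URL, port, and duration are illustrative):

    wrk -t $(nproc) -c 500 -d 30s http://127.0.0.1:8080/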
TensorFlow 2.12 - Device: CPU - Batch Size: 256 - Model: GoogLeNet (images/sec, more is better): a 459.97, c 455.45, b 454.67, d 382.29, e 377.56.
Timed FFmpeg Compilation 6.0 - Time To Compile (Seconds, fewer is better): d 10.86, e 11.19, a 12.81, b 13.01, c 13.16.
PostgreSQL 15 - Scaling Factor: 100 - Clients: 800 - Mode: Read Only (TPS, more is better): a 3833822, b 3816666, c 3785876, d 3643496, e 3274920. [(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm]
PostgreSQL 15 - Scaling Factor: 100 - Clients: 800 - Mode: Read Only - Average Latency (ms, fewer is better): a 0.209, b 0.210, c 0.211, d 0.220, e 0.244. [(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm]
Apache HTTP Server 2.4.56 - Concurrent Requests: 200 (Requests Per Second, more is better): c 165838.18, b 164665.51, a 143188.90. [(CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2]
ONNX Runtime 1.14 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inferences Per Second, more is better): a 5.11448, c 5.10242, b 4.89263, d 4.78422, e 4.46687. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, Third Run (Queries Per Minute, Geo Mean, more is better): c 623.39 (min 57.97 / max 7500), a 612.78 (min 59.52 / max 5454.55), b 606.58 (min 58.71 / max 5454.55), e 568.25 (min 90.09 / max 6666.67), d 551.92 (min 87.98 / max 6000).
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, Second Run (Queries Per Minute, Geo Mean, more is better): b 603.93 (min 59 / max 7500), a 602.50 (min 58.14 / max 6666.67), c 592.33 (min 58.2 / max 5000), e 536.90 (min 75.09 / max 6000), d 536.53 (min 74.17 / max 6666.67).
PostgreSQL 15 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only - Average Latency (ms, fewer is better): c 0.265, a 0.267, b 0.268, e 0.289, d 0.296. [(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm]
PostgreSQL 15 - Scaling Factor: 100 - Clients: 1000 - Mode: Read Only (TPS, more is better): c 3776352, a 3741941, b 3730123, e 3461133, d 3381479. [(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm]
TensorFlow 2.12 - Device: CPU - Batch Size: 256 - Model: ResNet-50 (images/sec, more is better): c 146.99, b 146.90, a 146.79, e 132.39, d 131.65.
ClickHouse 22.12.3.5 - 100M Rows Hits Dataset, First Run / Cold Cache (Queries Per Minute, Geo Mean, more is better): a 584.98 (min 58.37 / max 6000), b 582.44 (min 56.98 / max 6000), c 578.76 (min 57.75 / max 6000), e 527.95 (min 61.35 / max 5454.55), d 525.86 (min 60.3 / max 5454.55).
Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better): c 30.54, a 30.57, b 30.58, d 33.65, e 33.94.
Memcached 1.6.19 - Set To Get Ratio: 1:100 (Ops/sec, more is better): a 2851726.85, c 2821181.21, b 2813977.52, e 2595069.86, d 2571874.26. [(CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre]
Timed Godot Game Engine Compilation 4.0 - Time To Compile (Seconds, fewer is better): d 97.31, e 98.01, c 107.08, b 107.35, a 107.66.
ONNX Runtime 1.14 - Model: super-resolution-10 - Device: CPU - Executor: Standard (Inferences Per Second, more is better): d 123.93, a 123.88, e 123.28, c 123.13, b 113.02. [(CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt]
Darmstadt Automotive Parallel Heterogeneous Suite 2021.11.02 - Backend: OpenMP - Kernel: Euclidean Cluster (Test Cases Per Minute, more is better): a 1637.36, b 1637.00, c 1636.09, d 1506.75, e 1494.17. [(CXX) g++ options: -O3 -std=c++11 -fopenmp]
Timed LLVM Compilation 16.0 - Build System: Unix Makefiles (Seconds, fewer is better): d 199.26, e 200.02, c 214.02, a 217.19, b 217.45.
TensorFlow 2.12 - Device: CPU - Batch Size: 256 - Model: AlexNet (images/sec, more is better): e 1386.72, d 1347.52, c 1276.62, a 1276.22, b 1272.13.
Zstd Compression 1.5.4 - Compression Level: 8, Long Mode - Compression Speed (MB/s, more is better): e 943.2, a 903.2, b 900.7, c 893.2, d 865.5. [(CC) gcc options: -O3 -pthread -lz -llzma]
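Zstd's compression-speed numbers can be approximated with the tool's built-in benchmark mode. A sketch, assuming zstd 1.5.4 and any suitably large input file (the file name is illustrative):

    zstd -b8 --long -T0 silesia.tar   # benchmark level 8 with long-distance matching on all threads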
John The Ripper Test: HMAC-SHA512 OpenBenchmarking.org Real C/S, More Is Better John The Ripper 2023.03.14 Test: HMAC-SHA512 c a b e d 70M 140M 210M 280M 350M 309621000 309175000 308492000 292569000 286156000 1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -lm -lrt -lz -ldl -lcrypt
FFmpeg Encoder: libx265 - Scenario: Live OpenBenchmarking.org Seconds, Fewer Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Live b c a e d 9 18 27 36 45 36.89 36.98 37.07 37.18 39.28 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
FFmpeg Encoder: libx265 - Scenario: Live OpenBenchmarking.org FPS, More Is Better FFmpeg 6.0 Encoder: libx265 - Scenario: Live b c a e d 30 60 90 120 150 136.89 136.56 136.22 135.83 128.58 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better: a: 119.60 | b: 119.64 | c: 119.81 | e: 126.86 | d: 126.94

Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better: b: 320.99 | c: 321.15 | a: 321.47 | d: 339.58 | e: 340.47

ONNX Runtime 1.14 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard
Inferences Per Second, More Is Better: b: 37.39 | c: 37.07 | a: 37.04 | d: 36.24 | e: 35.53
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt

Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better: a: 28.02 | c: 28.09 | b: 28.20 | e: 29.09 | d: 29.45

Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream
items/sec, More Is Better: a: 35.68 | c: 35.59 | b: 35.45 | e: 34.37 | d: 33.95

Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better: b: 108.80 | a: 108.80 | c: 109.23 | d: 113.91 | e: 114.31

Build2 0.15 - Time To Compile
Seconds, Fewer Is Better: e: 60.35 | d: 60.47 | b: 63.01 | a: 63.17 | c: 63.22

PostgreSQL 15 - Scaling Factor: 1 - Clients: 800 - Mode: Read Only - Average Latency
ms, Fewer Is Better: b: 0.210 | c: 0.210 | a: 0.215 | e: 0.218 | d: 0.220
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

TensorFlow 2.12 - Device: CPU - Batch Size: 512 - Model: ResNet-50
images/sec, More Is Better: e: 171.18 | d: 166.89 | a: 163.85 | b: 163.76 | c: 163.66

Memcached 1.6.19 - Set To Get Ratio: 1:10
Ops/sec, More Is Better: a: 3203112.60 | b: 3163183.27 | c: 3154822.34 | d: 3068513.78 | e: 3063463.55
1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
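The Memcached result is client-driven throughput at a fixed set:get mix. A minimal sketch using the memtier_benchmark load generator against a local server (server address is an assumption; the ratio and protocol flags are standard memtier options, other tuning omitted):
  $ memtier_benchmark --protocol=memcache_text --server=127.0.0.1 --port=11211 --ratio=1:10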
PostgreSQL 15 - Scaling Factor: 1 - Clients: 800 - Mode: Read Only
TPS, More Is Better: c: 3804637 | b: 3803554 | a: 3718995 | e: 3669550 | d: 3638802
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm
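The PostgreSQL rows are pgbench results. A minimal sketch of a read-only run at this scaling factor and client count (database name and duration are illustrative):
  $ pgbench -i -s 1 benchdb
  $ pgbench -S -c 800 -j 800 -T 60 benchdb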
Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream
items/sec, More Is Better: b: 195.77 | a: 195.20 | c: 194.08 | e: 190.66 | d: 188.21

Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better: b: 5.1045 | a: 5.1198 | c: 5.1492 | e: 5.2412 | d: 5.3096

Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better: b: 76.60 | c: 76.66 | a: 76.71 | d: 79.18 | e: 79.54

Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream
items/sec, More Is Better: c: 196.15 | a: 195.13 | b: 193.40 | d: 191.64 | e: 189.01

Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better: c: 5.0961 | a: 5.1229 | b: 5.1686 | d: 5.2159 | e: 5.2886

Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better: c: 28.19 | a: 28.22 | b: 28.25 | e: 28.95 | d: 29.23

Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream
items/sec, More Is Better: c: 35.47 | a: 35.43 | b: 35.39 | e: 34.53 | d: 34.20

Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better: b: 149.81 | c: 150.76 | a: 151.37 | d: 154.28 | e: 155.03

FFmpeg 6.0 - Encoder: libx265 - Scenario: Platform
Seconds, Fewer Is Better: a: 132.60 | c: 132.87 | b: 132.95 | d: 134.38 | e: 137.04
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

FFmpeg 6.0 - Encoder: libx265 - Scenario: Platform
FPS, More Is Better: a: 57.13 | c: 57.01 | b: 56.98 | d: 56.37 | e: 55.28
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better: c: 11.25 | a: 11.27 | b: 11.28 | e: 11.58 | d: 11.60

Neural Magic DeepSparse 1.3.2 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream
items/sec, More Is Better: c: 88.83 | a: 88.70 | b: 88.60 | e: 86.29 | d: 86.18

TensorFlow 2.12 - Device: CPU - Batch Size: 512 - Model: GoogLeNet
images/sec, More Is Better: d: 530.51 | e: 521.87 | c: 518.52 | a: 516.87 | b: 515.11

Neural Magic DeepSparse 1.3.2 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better: c: 47.69 | a: 47.77 | b: 47.87 | e: 48.81 | d: 49.03

Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Compression Speed
MB/s, More Is Better: b: 8.56 | a: 8.50 | e: 8.46 | c: 8.38 | d: 8.34
1. (CC) gcc options: -O3 -pthread -lz -llzma

Neural Magic DeepSparse 1.3.2 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better: c: 1108.42 | a: 1109.11 | b: 1110.67 | d: 1133.26 | e: 1136.30
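The Neural Magic DeepSparse entries come from its own benchmarking harness; a minimal sketch of an asynchronous multi-stream run via the deepsparse.benchmark entry point (the model argument below is a placeholder SparseZoo stub, not the exact model benchmarked, and the scenario flag is assumed from the deepsparse CLI):
  $ deepsparse.benchmark zoo:some/obert-model/stub -s async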
Google Draco 1.5.6 - Model: Church Facade
ms, Fewer Is Better: d: 6721 | e: 6784 | b: 6788 | a: 6872 | c: 6888
1. (CXX) g++ options: -O3

Neural Magic DeepSparse 1.3.2 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream
ms/batch, Fewer Is Better: a: 1108.70 | b: 1108.98 | c: 1112.33 | d: 1127.58 | e: 1135.45

Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream
items/sec, More Is Better: b: 199.04 | c: 197.09 | d: 196.62 | e: 196.32 | a: 194.54

Neural Magic DeepSparse 1.3.2 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better: b: 5.0222 | c: 5.0721 | d: 5.0842 | e: 5.0918 | a: 5.1385

FFmpeg 6.0 - Encoder: libx264 - Scenario: Upload
FPS, More Is Better: d: 12.70 | e: 12.66 | c: 12.45 | a: 12.45 | b: 12.42
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

FFmpeg 6.0 - Encoder: libx264 - Scenario: Upload
Seconds, Fewer Is Better: d: 198.82 | e: 199.47 | a: 202.83 | c: 202.88 | b: 203.29
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream
items/sec, More Is Better: a: 206.94 | c: 206.67 | b: 206.62 | d: 203.70 | e: 202.46

Neural Magic DeepSparse 1.3.2 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better: a: 4.8287 | c: 4.8353 | b: 4.8365 | d: 4.9053 | e: 4.9353

Zstd Compression 1.5.4 - Compression Level: 8 - Decompression Speed
MB/s, More Is Better: d: 1611.3 | e: 1599.2 | b: 1593.0 | a: 1580.9 | c: 1577.1
1. (CC) gcc options: -O3 -pthread -lz -llzma

Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream
items/sec, More Is Better: a: 101.79 | b: 101.40 | c: 100.93 | e: 100.24 | d: 99.65

Neural Magic DeepSparse 1.3.2 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better: a: 9.8202 | b: 9.8577 | c: 9.9034 | e: 9.9712 | d: 10.0304

Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream
items/sec, More Is Better: a: 62.35 | c: 62.29 | b: 61.93 | e: 61.70 | d: 61.04

Neural Magic DeepSparse 1.3.2 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream
ms/batch, Fewer Is Better: a: 16.02 | c: 16.04 | b: 16.13 | e: 16.19 | d: 16.36

Google Draco 1.5.6 - Model: Lion
ms, Fewer Is Better: d: 5218 | c: 5270 | b: 5296 | e: 5300 | a: 5321
1. (CXX) g++ options: -O3
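The Draco rows time mesh compression. A minimal sketch with the draco_encoder tool that ships with the library (input mesh name is illustrative):
  $ draco_encoder -i lion.ply -o lion.drc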
Zstd Compression 1.5.4 - Compression Level: 12 - Decompression Speed
MB/s, More Is Better: c: 1641.7 | d: 1636.0 | a: 1633.7 | b: 1629.0 | e: 1611.2
1. (CC) gcc options: -O3 -pthread -lz -llzma

PostgreSQL 15 - Scaling Factor: 1 - Clients: 1000 - Mode: Read Only - Average Latency
ms, Fewer Is Better: a: 0.267 | e: 0.269 | b: 0.270 | d: 0.270 | c: 0.272
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

FFmpeg 6.0 - Encoder: libx265 - Scenario: Video On Demand
FPS, More Is Better: d: 57.42 | b: 57.18 | c: 57.12 | a: 57.02 | e: 56.39
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

FFmpeg 6.0 - Encoder: libx265 - Scenario: Video On Demand
Seconds, Fewer Is Better: d: 131.93 | b: 132.48 | c: 132.63 | a: 132.84 | e: 134.34
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

PostgreSQL 15 - Scaling Factor: 1 - Clients: 1000 - Mode: Read Only
TPS, More Is Better: a: 3738972 | e: 3716270 | b: 3707315 | d: 3700926 | c: 3672471
1. (CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lm

Zstd Compression 1.5.4 - Compression Level: 19 - Decompression Speed
MB/s, More Is Better: d: 1406.1 | b: 1399.2 | a: 1395.1 | c: 1393.8 | e: 1385.0
1. (CC) gcc options: -O3 -pthread -lz -llzma

FFmpeg 6.0 - Encoder: libx264 - Scenario: Video On Demand
FPS, More Is Better: d: 48.76 | e: 48.60 | c: 48.27 | b: 48.19 | a: 48.08
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

FFmpeg 6.0 - Encoder: libx264 - Scenario: Video On Demand
Seconds, Fewer Is Better: d: 155.36 | e: 155.85 | c: 156.93 | b: 157.20 | a: 157.53
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

Zstd Compression 1.5.4 - Compression Level: 8, Long Mode - Decompression Speed
MB/s, More Is Better: b: 1625.7 | a: 1619.8 | d: 1615.1 | c: 1613.8 | e: 1606.4
1. (CC) gcc options: -O3 -pthread -lz -llzma

FFmpeg 6.0 - Encoder: libx264 - Scenario: Platform
Seconds, Fewer Is Better: d: 155.28 | e: 155.68 | a: 156.98 | b: 156.98 | c: 157.00
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

FFmpeg 6.0 - Encoder: libx264 - Scenario: Platform
FPS, More Is Better: d: 48.78 | e: 48.66 | b: 48.26 | c: 48.25 | a: 48.25
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

dav1d 1.1 - Video Input: Summer Nature 4K
FPS, More Is Better: c: 383.95 | b: 381.16 | a: 379.84
1. (CC) gcc options: -pthread -lm

nginx 1.23.2 - Connections: 200
Requests Per Second, More Is Better: b: 258099.68 | a: 257954.01 | c: 255419.28
1. (CC) gcc options: -lluajit-5.1 -lm -lssl -lcrypto -lpthread -ldl -std=c99 -O2
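The LuaJIT and OpenSSL linkage in the nginx entry's compiler flags points to the wrk load generator driving the server. A minimal sketch of an equivalent run (URL, port, thread count, and duration are assumptions):
  $ wrk -t 96 -c 200 -d 60s http://localhost:8089/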
Zstd Compression 1.5.4 - Compression Level: 8 - Compression Speed
MB/s, More Is Better: a: 1225.5 | c: 1220.5 | d: 1217.7 | b: 1217.1 | e: 1213.4

Zstd Compression 1.5.4 - Compression Level: 12 - Compression Speed
MB/s, More Is Better: d: 317.8 | b: 317.4 | a: 316.8 | e: 315.2 | c: 314.8

Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Decompression Speed
MB/s, More Is Better: c: 1338.0 | b: 1336.1 | d: 1334.7 | e: 1330.7 | a: 1329.8

Zstd Compression 1.5.4 - Compression Level: 19 - Compression Speed
MB/s, More Is Better: c: 17.4 | b: 17.4 | a: 17.4 | e: 17.3 | d: 17.3
1. (CC) gcc options: -O3 -pthread -lz -llzma (common to the four Zstd results above)
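The Zstd rows map directly onto the reference CLI's built-in benchmark mode. A minimal sketch of timing level 19 with and without long-distance matching (input file is illustrative):
  $ zstd -b19 input.bin
  $ zstd -b19 --long input.bin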
FFmpeg 6.0 - Encoder: libx264 - Scenario: Live
Seconds, Fewer Is Better: d: 23.09 | e: 23.09 | b: 23.15 | a: 23.17 | c: 23.20
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

FFmpeg 6.0 - Encoder: libx264 - Scenario: Live
FPS, More Is Better: d: 218.73 | e: 218.71 | b: 218.14 | a: 217.98 | c: 217.64
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

dav1d 1.1 - Video Input: Summer Nature 1080p
FPS, More Is Better: c: 809.86 | a: 807.16 | b: 806.08
1. (CC) gcc options: -pthread -lm

FFmpeg 6.0 - Encoder: libx265 - Scenario: Upload
Seconds, Fewer Is Better: e: 89.59 | a: 89.61 | d: 89.69 | b: 89.73 | c: 89.78
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

dav1d 1.1 - Video Input: Chimera 1080p
FPS, More Is Better: a: 657.50 | c: 657.22 | b: 656.31
1. (CC) gcc options: -pthread -lm

FFmpeg 6.0 - Encoder: libx265 - Scenario: Upload
FPS, More Is Better: e: 28.18 | a: 28.18 | d: 28.15 | b: 28.14 | c: 28.13
1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

dav1d 1.1 - Video Input: Chimera 1080p 10-bit
FPS, More Is Better: c: 603.51 | b: 603.05 | a: 602.64
1. (CC) gcc options: -pthread -lm
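The dav1d figures are decode-only AV1 throughput. A minimal sketch with dav1d's command-line tool, assuming its null muxer is available to discard decoded output (input bitstream name is illustrative):
  $ dav1d -i chimera_1080p_10bit.ivf --muxer null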
ONNX Runtime 1.14 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better: b: 26.74 | c: 26.98 | a: 27.00 | d: 27.59 | e: 28.14

ONNX Runtime 1.14 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better: a: 32.49 | b: 32.59 | c: 34.45 | e: 37.56 | d: 40.41

ONNX Runtime 1.14 - Model: super-resolution-10 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better: d: 8.06837 | a: 8.07185 | e: 8.11091 | c: 8.12097 | b: 8.84768

ONNX Runtime 1.14 - Model: super-resolution-10 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better: c: 8.92427 | a: 8.95095 | b: 8.97883 | e: 11.12940 | d: 11.23890

ONNX Runtime 1.14 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better: b: 4.27622 | c: 4.82211 | a: 4.82662 | e: 7.16208 | d: 7.25262

ONNX Runtime 1.14 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better: a: 5.13811 | b: 5.15191 | c: 5.45677 | e: 9.78363 | d: 9.86863

ONNX Runtime 1.14 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better: c: 30.75 | b: 30.81 | a: 33.98 | d: 50.59 | e: 51.89

ONNX Runtime 1.14 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better: c: 40.57 | b: 40.65 | a: 41.40 | e: 75.14 | d: 75.36

ONNX Runtime 1.14 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better: a: 195.52 | c: 195.98 | b: 204.39 | d: 209.02 | e: 223.87

ONNX Runtime 1.14 - Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better: b: 794.91 | a: 797.88 | c: 853.82 | d: 1083.89 | e: 1135.38

ONNX Runtime 1.14 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better: b: 1.80855 | c: 1.86383 | a: 1.90737 | e: 2.07112 | d: 2.39687

ONNX Runtime 1.14 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better: a: 1.66637 | c: 1.66912 | b: 1.77352 | d: 3.05530 | e: 3.08913

ONNX Runtime 1.14 - Model: bertsquad-12 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better: b: 83.24 | d: 94.82 | a: 106.78 | c: 106.92 | e: 112.88

ONNX Runtime 1.14 - Model: bertsquad-12 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better: b: 81.50 | a: 82.14 | c: 82.14 | e: 111.30 | d: 112.94

ONNX Runtime 1.14 - Model: yolov4 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better: c: 143.64 | a: 151.27 | b: 154.16 | d: 206.36 | e: 212.45

ONNX Runtime 1.14 - Model: yolov4 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better: a: 156.18 | c: 156.58 | b: 157.15 | d: 244.02 | e: 245.99

ONNX Runtime 1.14 - Model: GPT-2 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better: b: 7.75078 | a: 7.78072 | c: 7.88188 | d: 7.88812 | e: 9.71824

ONNX Runtime 1.14 - Model: GPT-2 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better: b: 6.24337 | c: 6.25916 | a: 6.27808 | e: 8.87820 | d: 8.92113
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt (common to all ONNX Runtime results above)
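The ONNX Runtime Standard/Parallel split corresponds to its sequential versus parallel executor selection. A minimal sketch of a CPU run with the parallel executor, assuming the onnxruntime_perf_test tool built alongside the library (model path and repetition count are illustrative):
  $ onnxruntime_perf_test -e cpu -r 1000 -P gpt2.onnx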
Phoronix Test Suite v10.8.5