intel onnx Intel Core i5-14500 testing with a ASUS PRIME Z790-P WIFI (1402 BIOS) and ASUS Intel UHD 770 ADL-S GT1 31GB on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2402049-NE-INTELONNX80&grt&sor .
intel onnx Processor Motherboard Chipset Memory Disk Graphics Audio Monitor OS Kernel Desktop Display Server OpenGL Compiler File-System Screen Resolution a b c d Intel Core i5-14600K @ 5.30GHz (14 Cores / 20 Threads) ASUS PRIME Z790-P WIFI (1402 BIOS) Intel Device 7a27 2 x 16GB DRAM-6000MT/s Corsair CMK32GX5M2B6000C36 1024GB SOLIDIGM SSDPFKKW010X7 ASUS Intel RPL-S 31GB (1550MHz) Realtek ALC897 ASUS VP28U Ubuntu 23.10 6.7.0-060700-generic (x86_64) GNOME Shell 45.0 X Server + Wayland 4.6 Mesa 23.2.1-1ubuntu3.1 GCC 13.2.0 ext4 1920x1080 Intel Core i5-14500 @ 5.00GHz (14 Cores / 20 Threads) ASUS Intel UHD 770 ADL-S GT1 31GB (1550MHz) OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - a: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x11d - Thermald 2.5.4 - b: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x11d - Thermald 2.5.4 - c: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x32 - Thermald 2.5.4 - d: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x32 - Thermald 2.5.4 Python Details - Python 3.11.6 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
intel onnx onnx: GPT-2 - CPU - Parallel onnx: GPT-2 - CPU - Parallel onnx: GPT-2 - CPU - Standard onnx: GPT-2 - CPU - Standard onnx: yolov4 - CPU - Parallel onnx: yolov4 - CPU - Parallel onnx: yolov4 - CPU - Standard onnx: yolov4 - CPU - Standard onnx: T5 Encoder - CPU - Parallel onnx: T5 Encoder - CPU - Parallel onnx: T5 Encoder - CPU - Standard onnx: T5 Encoder - CPU - Standard onnx: bertsquad-12 - CPU - Parallel onnx: bertsquad-12 - CPU - Parallel onnx: bertsquad-12 - CPU - Standard onnx: bertsquad-12 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Parallel onnx: CaffeNet 12-int8 - CPU - Parallel onnx: CaffeNet 12-int8 - CPU - Standard onnx: CaffeNet 12-int8 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Parallel onnx: fcn-resnet101-11 - CPU - Standard onnx: fcn-resnet101-11 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Parallel onnx: ArcFace ResNet-100 - CPU - Parallel onnx: ArcFace ResNet-100 - CPU - Standard onnx: ArcFace ResNet-100 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ResNet50 v1-12-int8 - CPU - Parallel onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: ResNet50 v1-12-int8 - CPU - Standard onnx: super-resolution-10 - CPU - Parallel onnx: super-resolution-10 - CPU - Parallel onnx: super-resolution-10 - CPU - Standard onnx: super-resolution-10 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard a b c d 109.53 9.12514 173.505 5.76013 10.0398 99.6017 12.8473 77.8358 134.589 7.42879 201.51 4.96161 16.07 62.2255 18.4604 54.1682 266.064 3.75732 722.924 1.38253 1.58044 632.734 2.11386 473.066 8.25464 121.143 8.51173 117.484 135.109 7.40035 239.339 4.1776 64.2721 15.5579 82.6432 12.0998 39.9747 25.0142 53.1498 18.8133 113.197 8.83005 174.028 5.74282 10.5773 94.5408 12.8285 77.9497 126.438 7.90741 201.77 4.95524 13.7285 72.8399 18.5426 53.9283 263.017 3.80074 710.938 1.40589 1.52026 657.781 2.09046 478.362 8.23229 121.472 8.49818 117.671 132.06 7.57099 239.941 4.16707 64.7914 15.4334 82.6685 12.096 39.4603 25.3401 52.4488 19.0643 109.357 9.13993 176.035 5.67713 10.1779 98.2498 11.8415 84.4468 133.14 7.50952 208.882 4.78645 12.5893 79.431 17.3278 57.7089 247.001 4.04715 649.33 1.53907 1.35743 736.684 1.78177 561.237 7.63408 130.99 7.78147 128.509 124.204 8.04979 216.778 4.61229 63.6816 15.702 79.1907 12.6271 36.2335 27.5969 46.1761 21.6546 107.328 9.31307 176.575 5.65923 9.48916 105.381 11.8566 84.3393 129.339 7.73021 210.031 4.75974 14.4327 69.2846 17.3196 57.7365 249.431 4.00771 654.423 1.52723 1.44994 689.68 1.78007 561.774 7.53407 132.729 7.78073 128.522 125.592 7.9609 217.526 4.59654 61.3126 16.3088 79.1887 12.6275 36.1624 27.6513 45.9831 21.7455 OpenBenchmarking.org
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: GPT-2 - Device: CPU - Executor: Parallel b a c d 30 60 90 120 150 113.20 109.53 109.36 107.33 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: GPT-2 - Device: CPU - Executor: Parallel b a c d 3 6 9 12 15 8.83005 9.12514 9.13993 9.31307 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: GPT-2 - Device: CPU - Executor: Standard d c b a 40 80 120 160 200 176.58 176.04 174.03 173.51 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: GPT-2 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: GPT-2 - Device: CPU - Executor: Standard d c b a 1.296 2.592 3.888 5.184 6.48 5.65923 5.67713 5.74282 5.76013 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: yolov4 - Device: CPU - Executor: Parallel b c a d 3 6 9 12 15 10.57730 10.17790 10.03980 9.48916 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: yolov4 - Device: CPU - Executor: Parallel b c a d 20 40 60 80 100 94.54 98.25 99.60 105.38 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: yolov4 - Device: CPU - Executor: Standard a b d c 3 6 9 12 15 12.85 12.83 11.86 11.84 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: yolov4 - Device: CPU - Executor: Standard a b d c 20 40 60 80 100 77.84 77.95 84.34 84.45 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: T5 Encoder - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: T5 Encoder - Device: CPU - Executor: Parallel a c d b 30 60 90 120 150 134.59 133.14 129.34 126.44 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: T5 Encoder - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: T5 Encoder - Device: CPU - Executor: Parallel a c d b 2 4 6 8 10 7.42879 7.50952 7.73021 7.90741 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: T5 Encoder - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: T5 Encoder - Device: CPU - Executor: Standard d c b a 50 100 150 200 250 210.03 208.88 201.77 201.51 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: T5 Encoder - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: T5 Encoder - Device: CPU - Executor: Standard d c b a 1.1164 2.2328 3.3492 4.4656 5.582 4.75974 4.78645 4.95524 4.96161 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: bertsquad-12 - Device: CPU - Executor: Parallel a d b c 4 8 12 16 20 16.07 14.43 13.73 12.59 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: bertsquad-12 - Device: CPU - Executor: Parallel a d b c 20 40 60 80 100 62.23 69.28 72.84 79.43 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: bertsquad-12 - Device: CPU - Executor: Standard b a c d 5 10 15 20 25 18.54 18.46 17.33 17.32 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: bertsquad-12 - Device: CPU - Executor: Standard b a c d 13 26 39 52 65 53.93 54.17 57.71 57.74 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel a b d c 60 120 180 240 300 266.06 263.02 249.43 247.00 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel a b d c 0.9106 1.8212 2.7318 3.6424 4.553 3.75732 3.80074 4.00771 4.04715 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard a b d c 160 320 480 640 800 722.92 710.94 654.42 649.33 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard a b d c 0.3463 0.6926 1.0389 1.3852 1.7315 1.38253 1.40589 1.52723 1.53907 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel a b d c 0.3556 0.7112 1.0668 1.4224 1.778 1.58044 1.52026 1.44994 1.35743 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel a b d c 160 320 480 640 800 632.73 657.78 689.68 736.68 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: fcn-resnet101-11 - Device: CPU - Executor: Standard a b c d 0.4756 0.9512 1.4268 1.9024 2.378 2.11386 2.09046 1.78177 1.78007 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: fcn-resnet101-11 - Device: CPU - Executor: Standard a b c d 120 240 360 480 600 473.07 478.36 561.24 561.77 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel a b c d 2 4 6 8 10 8.25464 8.23229 7.63408 7.53407 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel a b c d 30 60 90 120 150 121.14 121.47 130.99 132.73 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard a b c d 2 4 6 8 10 8.51173 8.49818 7.78147 7.78073 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard a b c d 30 60 90 120 150 117.48 117.67 128.51 128.52 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel a b d c 30 60 90 120 150 135.11 132.06 125.59 124.20 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel a b d c 2 4 6 8 10 7.40035 7.57099 7.96090 8.04979 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard b a d c 50 100 150 200 250 239.94 239.34 217.53 216.78 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard b a d c 1.0378 2.0756 3.1134 4.1512 5.189 4.16707 4.17760 4.59654 4.61229 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: super-resolution-10 - Device: CPU - Executor: Parallel b a c d 14 28 42 56 70 64.79 64.27 63.68 61.31 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: super-resolution-10 - Device: CPU - Executor: Parallel b a c d 4 8 12 16 20 15.43 15.56 15.70 16.31 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: super-resolution-10 - Device: CPU - Executor: Standard b a c d 20 40 60 80 100 82.67 82.64 79.19 79.19 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: super-resolution-10 - Device: CPU - Executor: Standard b a c d 3 6 9 12 15 12.10 12.10 12.63 12.63 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel a b c d 9 18 27 36 45 39.97 39.46 36.23 36.16 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel a b c d 7 14 21 28 35 25.01 25.34 27.60 27.65 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inferences Per Second, More Is Better ONNX Runtime 1.17 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard a b c d 12 24 36 48 60 53.15 52.45 46.18 45.98 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard OpenBenchmarking.org Inference Time Cost (ms), Fewer Is Better ONNX Runtime 1.17 Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard a b c d 5 10 15 20 25 18.81 19.06 21.65 21.75 1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
Phoronix Test Suite v10.8.5