AMD Ryzen 7 PRO 6850U testing with a LENOVO 21CM0001US (R22ET51W 1.21 BIOS) and AMD Radeon 680M 1GB on Ubuntu 22.10 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:

    phoronix-test-suite benchmark 2302110-NE-ONNXRUNTI70
HTML result view exported from: https://openbenchmarking.org/result/2302110-NE-ONNXRUNTI70&grs&sor&rro.
onnx runtime zstd

All four runs (a, b, c, d) were performed on the same system:

Processor: AMD Ryzen 7 PRO 6850U @ 4.77GHz (8 Cores / 16 Threads)
Motherboard: LENOVO 21CM0001US (R22ET51W 1.21 BIOS)
Chipset: AMD Device 14b5
Memory: 16GB
Disk: 512GB Micron MTFDKBA512TFK
Graphics: AMD Radeon 680M 1GB (2200/400MHz)
Audio: AMD Rembrandt Radeon HD Audio
Network: Qualcomm QCNFA765
OS: Ubuntu 22.10
Kernel: 6.1.0-060100rc2daily20221028-generic (x86_64)
Desktop: GNOME Shell 43.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 22.2.1 (LLVM 15.0.2 DRM 3.49)
Vulkan: 1.3.224
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 1920x1200

Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: amd-pstate schedutil (Boost: Enabled); Platform Profile: performance; CPU Microcode: 0xa404102; ACPI Profile: performance
Python Details: Python 3.10.7
Security Details: itlb_multihit: Not affected; l1tf: Not affected; mds: Not affected; meltdown: Not affected; mmio_stale_data: Not affected; retbleed: Not affected; spec_store_bypass: Mitigation of SSB disabled via prctl; spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization; spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected; srbds: Not affected; tsx_async_abort: Not affected
Summary of all results for runs a, b, c, d. ONNX Runtime values are Inferences Per Second (higher is better) unless marked (ms), where they are Inference Time Cost in milliseconds (lower is better); Zstd values are MB/s (higher is better).

Test | a | b | c | d
onnx: fcn-resnet101-11 - CPU - Standard | 1.06759 | 0.692005 | 0.689648 | 1.07163
onnx: bertsquad-12 - CPU - Standard | 10.8926 | 11.0216 | 10.8695 | 7.28019
onnx: yolov4 - CPU - Standard | 6.66706 | 6.33386 | 4.6377 | 4.65036
onnx: super-resolution-10 - CPU - Standard | 72.6 | 72.8737 | 72.7577 | 50.8368
compress-zstd: 3, Long Mode - Compression Speed | 448.2 | 448.6 | 515.3 | 455.3
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard | 30.1729 | 26.439 | 29.9516 | 27.8055
onnx: GPT-2 - CPU - Standard | 82.1847 | 89.421 | 92.3389 | 80.9612
onnx: CaffeNet 12-int8 - CPU - Standard | 336.841 | 297.641 | 335.135 | 336.714
onnx: ResNet50 v1-12-int8 - CPU - Standard | 120.228 | 112.269 | 112.244 | 112.078
onnx: fcn-resnet101-11 - CPU - Parallel | 0.652903 | 0.660573 | 0.688063 | 0.687937
onnx: ArcFace ResNet-100 - CPU - Parallel | 14.707 | 14.5215 | 15.2791 | 14.9103
onnx: bertsquad-12 - CPU - Parallel | 7.30784 | 7.68535 | 7.46141 | 7.44275
onnx: yolov4 - CPU - Parallel | 4.81183 | 4.89053 | 4.85635 | 4.71553
onnx: CaffeNet 12-int8 - CPU - Parallel | 283.465 | 291.299 | 289.293 | 289.149
compress-zstd: 12 - Decompression Speed | 1782.6 | 1790.2 | 1763.8 | 1753.1
compress-zstd: 19 - Compression Speed | 10.3 | 10.4 | 10.4 | 10.2
compress-zstd: 8 - Decompression Speed | 1723.7 | 1738.6 | 1749.9 | 1754.5
compress-zstd: 3, Long Mode - Decompression Speed | 1629.9 | 1645.9 | 1658.4 | 1648.9
compress-zstd: 19, Long Mode - Compression Speed | 6.05 | 5.98 | 5.96 | 6.06
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel | 25.2513 | 25.2707 | 25.6741 | 25.6501
compress-zstd: 8, Long Mode - Decompression Speed | 1780.2 | 1778.5 | 1754.7 | 1764
onnx: super-resolution-10 - CPU - Parallel | 51.8882 | 52.3344 | 51.7394 | 52.0531
onnx: GPT-2 - CPU - Parallel | 79.8976 | 80.0631 | 79.8964 | 79.1687
compress-zstd: 3 - Compression Speed | 1003.7 | 1006.6 | 1008.4 | 999.8
compress-zstd: 3 - Decompression Speed | 1627.1 | 1631.2 | 1632 | 1640.9
compress-zstd: 12 - Compression Speed | 99 | 99.6 | 98.8 | 99.3
compress-zstd: 8 - Compression Speed | 275.6 | 276.1 | 276.8 | 275
compress-zstd: 19 - Decompression Speed | 1503.9 | 1501.7 | 1503.7 | 1510.8
compress-zstd: 8, Long Mode - Compression Speed | 281.3 | 282.3 | 282.9 | 281.9
onnx: ArcFace ResNet-100 - CPU - Standard | 21.8453 | 21.9682 | 21.9416 | 21.9199
onnx: ResNet50 v1-12-int8 - CPU - Parallel | 110.183 | 110.717 | 110.564 | 110.538
compress-zstd: 19, Long Mode - Decompression Speed | 1439 | 1438.8 | 1433.9 | 1436
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard (ms) | 33.1361 | 37.8157 | 33.3819 | 35.9574
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel (ms) | 39.5979 | 39.5676 | 38.946 | 38.9827
onnx: super-resolution-10 - CPU - Standard (ms) | 13.7702 | 13.7184 | 13.7405 | 19.6643
onnx: super-resolution-10 - CPU - Parallel (ms) | 19.2696 | 19.105 | 19.3254 | 19.2088
onnx: ResNet50 v1-12-int8 - CPU - Standard (ms) | 8.31409 | 8.90396 | 8.90587 | 8.91846
onnx: ResNet50 v1-12-int8 - CPU - Parallel (ms) | 9.07349 | 9.02988 | 9.04234 | 9.04455
onnx: ArcFace ResNet-100 - CPU - Standard (ms) | 45.7728 | 45.5162 | 45.57 | 45.6169
onnx: ArcFace ResNet-100 - CPU - Parallel (ms) | 67.9915 | 68.8604 | 65.4462 | 67.065
onnx: fcn-resnet101-11 - CPU - Standard (ms) | 936.682 | 1445.07 | 1450.01 | 933.153
onnx: fcn-resnet101-11 - CPU - Parallel (ms) | 1531.61 | 1513.83 | 1453.35 | 1453.61
onnx: CaffeNet 12-int8 - CPU - Standard (ms) | 2.96616 | 3.35715 | 2.98143 | 2.96729
onnx: CaffeNet 12-int8 - CPU - Parallel (ms) | 3.52574 | 3.4307 | 3.4545 | 3.45622
onnx: bertsquad-12 - CPU - Standard (ms) | 91.7992 | 90.7239 | 91.9938 | 137.353
onnx: bertsquad-12 - CPU - Parallel (ms) | 136.834 | 130.113 | 134.019 | 134.355
onnx: yolov4 - CPU - Standard (ms) | 149.984 | 157.874 | 215.616 | 215.03
onnx: yolov4 - CPU - Parallel (ms) | 207.815 | 204.472 | 205.911 | 212.061
onnx: GPT-2 - CPU - Standard (ms) | 12.158 | 11.1748 | 10.8221 | 12.3418
onnx: GPT-2 - CPU - Parallel (ms) | 12.5089 | 12.4834 | 12.5091 | 12.6243
ONNX Runtime 1.14 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inferences Per Second; more is better): c: 0.689648, b: 0.692005, a: 1.067590, d: 1.071630
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto=auto -fno-fat-lto-objects -ldl -lrt (these options apply to all ONNX Runtime results below)
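The numbers above come from the test profile's own harness, but the metric itself, inferences per second on the CPU execution provider, is easy to approximate with the onnxruntime Python API. A minimal sketch, assuming the onnxruntime and numpy packages are installed and that "fcn-resnet101-11.onnx" is a hypothetical local copy of the model:

    import time
    import numpy as np
    import onnxruntime as ort

    opts = ort.SessionOptions()
    # The "Standard" executor in these results corresponds to sequential execution.
    opts.execution_mode = ort.ExecutionMode.ORT_SEQUENTIAL
    sess = ort.InferenceSession("fcn-resnet101-11.onnx",   # hypothetical model path
                                sess_options=opts,
                                providers=["CPUExecutionProvider"])

    inp = sess.get_inputs()[0]
    # Replace any symbolic batch/spatial dimensions with 1 to build a dummy input.
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    x = np.random.rand(*shape).astype(np.float32)

    runs = 20
    start = time.perf_counter()
    for _ in range(runs):
        sess.run(None, {inp.name: x})
    elapsed = time.perf_counter() - start
    print(f"{runs / elapsed:.5f} inferences/sec ({1000 * elapsed / runs:.2f} ms per inference)")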
ONNX Runtime 1.14 - Model: bertsquad-12 - Device: CPU - Executor: Standard (Inferences Per Second; more is better): d: 7.28019, c: 10.86950, a: 10.89260, b: 11.02160
ONNX Runtime 1.14 - Model: yolov4 - Device: CPU - Executor: Standard (Inferences Per Second; more is better): c: 4.63770, d: 4.65036, b: 6.33386, a: 6.66706
ONNX Runtime 1.14 - Model: super-resolution-10 - Device: CPU - Executor: Standard (Inferences Per Second; more is better): d: 50.84, a: 72.60, c: 72.76, b: 72.87
Zstd Compression 1.5.4 - Compression Level: 3, Long Mode - Compression Speed (MB/s; more is better): a: 448.2, b: 448.6, d: 455.3, c: 515.3
1. (CC) gcc options: -O3 -pthread -lz -llzma -llz4 (these options apply to all Zstd results below)
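The zstd side of this comparison measures single-buffer compression throughput at a given level. A rough Python equivalent using the python-zstandard bindings (the benchmark itself uses zstd's bundled C harness; "silesia.tar" is a hypothetical stand-in corpus, and enable_ldm is assumed to be the bindings' long-distance-matching switch, the counterpart of the CLI's --long "long mode"):

    import time
    import zstandard

    data = open("silesia.tar", "rb").read()            # hypothetical test corpus
    params = zstandard.ZstdCompressionParameters.from_level(
        3, enable_ldm=True)                            # level 3 + long mode (assumed kwarg)
    cctx = zstandard.ZstdCompressor(compression_params=params)

    start = time.perf_counter()
    compressed = cctx.compress(data)
    elapsed = time.perf_counter() - start
    print(f"{len(data) / elapsed / 1e6:.1f} MB/s, "
          f"ratio {len(data) / len(compressed):.2f}")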
ONNX Runtime 1.14 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard (Inferences Per Second; more is better): b: 26.44, d: 27.81, c: 29.95, a: 30.17
ONNX Runtime 1.14 - Model: GPT-2 - Device: CPU - Executor: Standard (Inferences Per Second; more is better): d: 80.96, a: 82.18, b: 89.42, c: 92.34
ONNX Runtime 1.14 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard (Inferences Per Second; more is better): b: 297.64, c: 335.14, d: 336.71, a: 336.84
ONNX Runtime 1.14 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inferences Per Second; more is better): d: 112.08, c: 112.24, b: 112.27, a: 120.23
ONNX Runtime 1.14 - Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel (Inferences Per Second; more is better): a: 0.652903, b: 0.660573, d: 0.687937, c: 0.688063
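The Parallel rows differ from the Standard rows only in executor configuration: in onnxruntime terms this is the parallel execution mode, which lets independent subgraphs run concurrently. A sketch of the session options involved (the thread counts are illustrative assumptions, not values taken from this result file):

    import onnxruntime as ort

    opts = ort.SessionOptions()
    opts.execution_mode = ort.ExecutionMode.ORT_PARALLEL  # the "Parallel" executor
    opts.inter_op_num_threads = 8  # threads scheduling independent graph nodes (assumed value)
    opts.intra_op_num_threads = 8  # threads inside individual operators (assumed value)
    # The session is then created exactly as in the Standard sketch above.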
ONNX Runtime 1.14 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel (Inferences Per Second; more is better): b: 14.52, a: 14.71, d: 14.91, c: 15.28
ONNX Runtime 1.14 - Model: bertsquad-12 - Device: CPU - Executor: Parallel (Inferences Per Second; more is better): a: 7.30784, d: 7.44275, c: 7.46141, b: 7.68535
ONNX Runtime 1.14 - Model: yolov4 - Device: CPU - Executor: Parallel (Inferences Per Second; more is better): d: 4.71553, a: 4.81183, c: 4.85635, b: 4.89053
ONNX Runtime 1.14 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel (Inferences Per Second; more is better): a: 283.47, d: 289.15, c: 289.29, b: 291.30
Zstd Compression 1.5.4 - Compression Level: 12 - Decompression Speed (MB/s; more is better): d: 1753.1, c: 1763.8, a: 1782.6, b: 1790.2
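Decompression speed is measured the other way around: a buffer is pre-compressed at the stated level and only the decompression step is timed. Zstd decompression is largely level-independent, which is consistent with how tightly these numbers cluster across levels. A minimal sketch under the same assumptions as the compression example above:

    import time
    import zstandard

    raw = open("silesia.tar", "rb").read()                 # hypothetical test corpus
    blob = zstandard.ZstdCompressor(level=12).compress(raw)

    dctx = zstandard.ZstdDecompressor()
    start = time.perf_counter()
    out = dctx.decompress(blob, max_output_size=len(raw))
    elapsed = time.perf_counter() - start
    assert out == raw
    print(f"{len(raw) / elapsed / 1e6:.1f} MB/s decompression")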
Zstd Compression 1.5.4 - Compression Level: 19 - Compression Speed (MB/s; more is better): d: 10.2, a: 10.3, b: 10.4, c: 10.4
Zstd Compression 1.5.4 - Compression Level: 8 - Decompression Speed (MB/s; more is better): a: 1723.7, b: 1738.6, c: 1749.9, d: 1754.5
Zstd Compression 1.5.4 - Compression Level: 3, Long Mode - Decompression Speed (MB/s; more is better): a: 1629.9, b: 1645.9, d: 1648.9, c: 1658.4
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Compression Speed (MB/s; more is better): c: 5.96, b: 5.98, a: 6.05, d: 6.06
ONNX Runtime 1.14 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel (Inferences Per Second; more is better): a: 25.25, b: 25.27, d: 25.65, c: 25.67
Zstd Compression 1.5.4 - Compression Level: 8, Long Mode - Decompression Speed (MB/s; more is better): c: 1754.7, d: 1764.0, b: 1778.5, a: 1780.2
ONNX Runtime 1.14 - Model: super-resolution-10 - Device: CPU - Executor: Parallel (Inferences Per Second; more is better): c: 51.74, a: 51.89, d: 52.05, b: 52.33
ONNX Runtime 1.14 - Model: GPT-2 - Device: CPU - Executor: Parallel (Inferences Per Second; more is better): d: 79.17, c: 79.90, a: 79.90, b: 80.06
Zstd Compression 1.5.4 - Compression Level: 3 - Compression Speed (MB/s; more is better): d: 999.8, a: 1003.7, b: 1006.6, c: 1008.4
Zstd Compression 1.5.4 - Compression Level: 3 - Decompression Speed (MB/s; more is better): a: 1627.1, b: 1631.2, c: 1632.0, d: 1640.9
Zstd Compression 1.5.4 - Compression Level: 12 - Compression Speed (MB/s; more is better): c: 98.8, a: 99.0, d: 99.3, b: 99.6
Zstd Compression 1.5.4 - Compression Level: 8 - Compression Speed (MB/s; more is better): d: 275.0, a: 275.6, b: 276.1, c: 276.8
Zstd Compression 1.5.4 - Compression Level: 19 - Decompression Speed (MB/s; more is better): b: 1501.7, c: 1503.7, a: 1503.9, d: 1510.8
Zstd Compression 1.5.4 - Compression Level: 8, Long Mode - Compression Speed (MB/s; more is better): a: 281.3, d: 281.9, b: 282.3, c: 282.9
ONNX Runtime 1.14 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inferences Per Second; more is better): a: 21.85, d: 21.92, c: 21.94, b: 21.97
ONNX Runtime 1.14 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel (Inferences Per Second; more is better): a: 110.18, d: 110.54, c: 110.56, b: 110.72
Zstd Compression 1.5.4 - Compression Level: 19, Long Mode - Decompression Speed (MB/s; more is better): c: 1433.9, d: 1436.0, b: 1438.8, a: 1439.0
ONNX Runtime 1.14 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard (Inference Time Cost, ms; fewer is better): b: 37.82, d: 35.96, c: 33.38, a: 33.14
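The Inference Time Cost results from here on are the latency view of the same runs charted above in Inferences Per Second, and the two metrics are approximately reciprocal. For example, run a's 30.1729 inferences per second for this model works out to 1000 / 30.1729, or roughly 33.14 ms, matching the 33.14 ms reported here; small discrepancies between the two views reflect run-to-run measurement variance.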
ONNX Runtime 1.14 - Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel (Inference Time Cost, ms; fewer is better): a: 39.60, b: 39.57, d: 38.98, c: 38.95
ONNX Runtime 1.14 - Model: super-resolution-10 - Device: CPU - Executor: Standard (Inference Time Cost, ms; fewer is better): d: 19.66, a: 13.77, c: 13.74, b: 13.72
ONNX Runtime 1.14 - Model: super-resolution-10 - Device: CPU - Executor: Parallel (Inference Time Cost, ms; fewer is better): c: 19.33, a: 19.27, d: 19.21, b: 19.11
ONNX Runtime 1.14 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard (Inference Time Cost, ms; fewer is better): d: 8.91846, c: 8.90587, b: 8.90396, a: 8.31409
ONNX Runtime 1.14 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel (Inference Time Cost, ms; fewer is better): a: 9.07349, d: 9.04455, c: 9.04234, b: 9.02988
ONNX Runtime 1.14 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inference Time Cost, ms; fewer is better): a: 45.77, d: 45.62, c: 45.57, b: 45.52
ONNX Runtime 1.14 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel (Inference Time Cost, ms; fewer is better): b: 68.86, a: 67.99, d: 67.07, c: 65.45
ONNX Runtime 1.14 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inference Time Cost, ms; fewer is better): c: 1450.01, b: 1445.07, a: 936.68, d: 933.15
ONNX Runtime 1.14 - Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel (Inference Time Cost, ms; fewer is better): a: 1531.61, b: 1513.83, d: 1453.61, c: 1453.35
ONNX Runtime 1.14 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard (Inference Time Cost, ms; fewer is better): b: 3.35715, c: 2.98143, d: 2.96729, a: 2.96616
ONNX Runtime 1.14 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel (Inference Time Cost, ms; fewer is better): a: 3.52574, d: 3.45622, c: 3.45450, b: 3.43070
ONNX Runtime 1.14 - Model: bertsquad-12 - Device: CPU - Executor: Standard (Inference Time Cost, ms; fewer is better): d: 137.35, c: 91.99, a: 91.80, b: 90.72
ONNX Runtime 1.14 - Model: bertsquad-12 - Device: CPU - Executor: Parallel (Inference Time Cost, ms; fewer is better): a: 136.83, d: 134.36, c: 134.02, b: 130.11
ONNX Runtime 1.14 - Model: yolov4 - Device: CPU - Executor: Standard (Inference Time Cost, ms; fewer is better): c: 215.62, d: 215.03, b: 157.87, a: 149.98
ONNX Runtime 1.14 - Model: yolov4 - Device: CPU - Executor: Parallel (Inference Time Cost, ms; fewer is better): d: 212.06, a: 207.82, c: 205.91, b: 204.47
ONNX Runtime 1.14 - Model: GPT-2 - Device: CPU - Executor: Standard (Inference Time Cost, ms; fewer is better): d: 12.34, a: 12.16, b: 11.17, c: 10.82
ONNX Runtime 1.14 - Model: GPT-2 - Device: CPU - Executor: Parallel (Inference Time Cost, ms; fewer is better): d: 12.62, c: 12.51, a: 12.51, b: 12.48
Phoronix Test Suite v10.8.4