onnx 119

AMD Ryzen Threadripper 7980X 64-Cores testing with a System76 Thelio Major (FA Z5 BIOS) and AMD Radeon Pro W7900 on Ubuntu 24.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2408227-PTS-ONNX119192&grr&sor&rro.

onnx 119ProcessorMotherboardChipsetMemoryDiskGraphicsAudioMonitorNetworkOSKernelDesktopDisplay ServerOpenGLCompilerFile-SystemScreen ResolutionabcdeAMD Ryzen Threadripper 7980X 64-Cores @ 7.79GHz (64 Cores / 128 Threads)System76 Thelio Major (FA Z5 BIOS)AMD Device 14a44 x 32GB DDR5-4800MT/s Micron MTC20F1045S1RC48BA21000GB CT1000T700SSD5AMD Radeon Pro W7900AMD Device 14ccDELL P2415QAquantia AQC113C NBase-T/IEEE + Realtek RTL8125 2.5GbE + Intel Wi-Fi 6EUbuntu 24.106.8.0-31-generic (x86_64)GNOME ShellX Server + Wayland4.6 Mesa 24.0.9-0ubuntu2 (LLVM 17.0.6 DRM 3.57)GCC 14.2.0ext41920x1200OpenBenchmarking.orgKernel Details- Transparent Huge Pages: madviseCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-F5tscv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-F5tscv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa108105Python Details- Python 3.12.5Security Details- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

onnx 119svt-av1: Preset 3 - Beauty 4K 10-bitonnx: fcn-resnet101-11 - CPU - Parallelonnx: fcn-resnet101-11 - CPU - Parallelonnx: fcn-resnet101-11 - CPU - Standardonnx: fcn-resnet101-11 - CPU - Standardsvt-av1: Preset 3 - Bosphorus 4Ksvt-av1: Preset 5 - Beauty 4K 10-bitonnx: ZFNet-512 - CPU - Standardonnx: ZFNet-512 - CPU - Standardonnx: yolov4 - CPU - Parallelonnx: yolov4 - CPU - Parallelonnx: bertsquad-12 - CPU - Standardonnx: bertsquad-12 - CPU - Standardonnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallelonnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallelonnx: CaffeNet 12-int8 - CPU - Parallelonnx: CaffeNet 12-int8 - CPU - Parallelonnx: bertsquad-12 - CPU - Parallelonnx: bertsquad-12 - CPU - Parallelsvt-av1: Preset 8 - Beauty 4K 10-bitsvt-av1: Preset 3 - Bosphorus 1080ponnx: super-resolution-10 - CPU - Standardonnx: super-resolution-10 - CPU - Standardonnx: yolov4 - CPU - Standardonnx: yolov4 - CPU - Standardsvt-av1: Preset 13 - Beauty 4K 10-bitonnx: ResNet101_DUC_HDC-12 - CPU - Parallelonnx: ResNet101_DUC_HDC-12 - CPU - Parallelonnx: ResNet101_DUC_HDC-12 - CPU - Standardonnx: ResNet101_DUC_HDC-12 - CPU - Standardonnx: GPT-2 - CPU - Parallelonnx: GPT-2 - CPU - Parallelonnx: ZFNet-512 - CPU - Parallelonnx: ZFNet-512 - CPU - Parallelonnx: GPT-2 - CPU - Standardonnx: GPT-2 - CPU - Standardonnx: T5 Encoder - CPU - Parallelonnx: T5 Encoder - CPU - Parallelonnx: T5 Encoder - CPU - Standardonnx: T5 Encoder - CPU - Standardonnx: ArcFace ResNet-100 - CPU - Parallelonnx: ArcFace ResNet-100 - CPU - Parallelonnx: ArcFace ResNet-100 - CPU - Standardonnx: ArcFace ResNet-100 - CPU - Standardonnx: Faster R-CNN R-50-FPN-int8 - CPU - Standardonnx: Faster R-CNN R-50-FPN-int8 - CPU - Standardonnx: CaffeNet 12-int8 - CPU - Standardonnx: CaffeNet 12-int8 - CPU - Standardonnx: ResNet50 v1-12-int8 - CPU - Parallelonnx: ResNet50 v1-12-int8 - CPU - Parallelonnx: ResNet50 v1-12-int8 - CPU - Standardonnx: ResNet50 v1-12-int8 - CPU - Standardonnx: super-resolution-10 - CPU - Parallelonnx: super-resolution-10 - CPU - Parallelsvt-av1: Preset 5 - Bosphorus 4Ksvt-av1: Preset 5 - Bosphorus 1080psvt-av1: Preset 8 - Bosphorus 4Ksvt-av1: Preset 13 - Bosphorus 4Ksvt-av1: Preset 8 - Bosphorus 1080psvt-av1: Preset 13 - Bosphorus 1080pabcde1.864908.2961.10358257.9903.9268613.3987.77512.226881.7752189.6775.2724678.920912.673034.828828.72594.88925204.479147.8146.7685410.84436.6059.94497100.5927114.1528.7600719.520621.2261.60987372.7892.682486.21628160.70618.729753.40689.51832105.0392.97938335.5037.09442140.95879.340012.603926.080338.347622.441544.56002.19841454.72313.474874.20614.97187201.1037.61417131.31345.614119.22196.210231.111271.676740.0011.857913.5391.09825252.3354.0040413.3507.80112.293981.4158195.1285.1278978.572112.737734.645028.87664.99490200.840145.7846.8626210.82736.5539.93654100.6830114.8278.7125119.417625.0601.59999370.1772.701456.18911161.40718.326654.57799.45128105.7912.94936338.9287.03694142.11480.159412.475627.058236.965822.491844.45662.17339460.01613.448274.36244.96437201.4247.66242130.48945.343119.22996.423233.248269.879740.7901.867913.4941.09469291.4443.4311513.537.81112.118682.4953188.55.3048675.871613.179636.160527.65265.61236178.126144.5596.917410.85236.6649.91284100.876110.4189.0561919.396617.61.61916371.5992.691056.15989162.17519.705450.74219.46853105.592.9469339.1757.08004141.23180.702112.390825.783638.781622.703244.04192.26126442.11713.674673.12085.14489194.3437.6459130.76645.662119.76896.06232.396266.006764.0031.874873.4671.14486288.1243.4706913.3817.80912.242681.6673199.7075.0071776.936512.997136.211127.61394.69742212.81140.7247.1059310.87336.7079.52165105.021113.7268.7927919.442630.8051.58526374.5772.669666.17185161.86218.412854.3049.50891105.1462.97608335.8597.23734138.16479.291612.611225.501239.210122.078345.28832.38355419.42113.608873.47494.90503203.8357.5618132.22145.165118.79897.046228.834269.428783.1221.867889.2311.12456276.9233.6110813.4137.76512.792378.1552195.3665.1184278.471712.742836.024227.75724.74627210.622142.4227.0212110.96536.45610.059299.4083110.0339.0878819.441621.0761.6101371.1752.694136.15756162.24518.375154.41559.28112107.7232.95124338.7037.21415138.60577.848312.84527.260436.6822.66844.10992.27019440.35413.452274.33024.93285202.6927.61659131.26945.41119.92396.775228.97268.76749.257OpenBenchmarking.org

SVT-AV1

Encoder Mode: Preset 3 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 3 - Input: Beauty 4K 10-bitbaced0.42170.84341.26511.68682.1085SE +/- 0.004, N = 3SE +/- 0.002, N = 31.8571.8641.8671.8671.8741. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: fcn-resnet101-11 - Device: CPU - Executor: Parallelbcaed2004006008001000SE +/- 13.79, N = 15SE +/- 11.68, N = 15913.54913.49908.30889.23873.471. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: fcn-resnet101-11 - Device: CPU - Executor: Parallelcbaed0.25760.51520.77281.03041.288SE +/- 0.01709, N = 15SE +/- 0.01462, N = 151.094691.098251.103581.124561.144861. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: fcn-resnet101-11 - Device: CPU - Executor: Standardcdeab60120180240300SE +/- 7.70, N = 15SE +/- 6.89, N = 15291.44288.12276.92257.99252.341. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: fcn-resnet101-11 - Device: CPU - Executor: Standardcdeab0.90091.80182.70273.60364.5045SE +/- 0.12199, N = 15SE +/- 0.10773, N = 153.431153.470693.611083.926864.004041. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 3 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 3 - Input: Bosphorus 4Kbdaec3691215SE +/- 0.04, N = 3SE +/- 0.02, N = 313.3513.3813.4013.4113.531. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 5 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 5 - Input: Beauty 4K 10-biteabdc246810SE +/- 0.029, N = 3SE +/- 0.008, N = 37.7657.7757.8017.8097.8111. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ZFNet-512 - Device: CPU - Executor: Standardebdac3691215SE +/- 0.11, N = 15SE +/- 0.06, N = 312.7912.2912.2412.2312.121. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ZFNet-512 - Device: CPU - Executor: Standardebdac20406080100SE +/- 0.72, N = 15SE +/- 0.40, N = 378.1681.4281.6781.7882.501. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: yolov4 - Device: CPU - Executor: Paralleldebac4080120160200SE +/- 1.30, N = 15SE +/- 1.34, N = 3199.71195.37195.13189.68188.501. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: yolov4 - Device: CPU - Executor: Paralleldebac1.19362.38723.58084.77445.968SE +/- 0.03454, N = 15SE +/- 0.03706, N = 35.007175.118425.127895.272465.304861. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: bertsquad-12 - Device: CPU - Executor: Standardabedc20406080100SE +/- 0.83, N = 3SE +/- 0.63, N = 1578.9278.5778.4776.9475.871. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: bertsquad-12 - Device: CPU - Executor: Standardabedc3691215SE +/- 0.13, N = 3SE +/- 0.10, N = 1512.6712.7412.7413.0013.181. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Paralleldceab816243240SE +/- 0.26, N = 11SE +/- 0.39, N = 536.2136.1636.0234.8334.651. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Paralleldceab714212835SE +/- 0.21, N = 11SE +/- 0.32, N = 527.6127.6527.7628.7328.881. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallelcbaed1.26282.52563.78845.05126.314SE +/- 0.09195, N = 12SE +/- 0.03038, N = 35.612364.994904.889254.746274.697421. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallelcbaed50100150200250SE +/- 3.47, N = 12SE +/- 1.28, N = 3178.13200.84204.48210.62212.811. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: bertsquad-12 - Device: CPU - Executor: Parallelabced306090120150SE +/- 1.16, N = 9SE +/- 1.59, N = 5147.81145.78144.56142.42140.721. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: bertsquad-12 - Device: CPU - Executor: Parallelabced246810SE +/- 0.05485, N = 9SE +/- 0.07644, N = 56.768546.862626.917407.021217.105931. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 8 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 8 - Input: Beauty 4K 10-bitbacde3691215SE +/- 0.05, N = 3SE +/- 0.04, N = 310.8310.8410.8510.8710.971. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 3 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 3 - Input: Bosphorus 1080pebacd816243240SE +/- 0.01, N = 3SE +/- 0.03, N = 336.4636.5536.6136.6636.711. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: super-resolution-10 - Device: CPU - Executor: Standardeabcd3691215SE +/- 0.08968, N = 6SE +/- 0.10599, N = 510.059209.944979.936549.912849.521651. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: super-resolution-10 - Device: CPU - Executor: Standardeabcd20406080100SE +/- 0.94, N = 6SE +/- 1.11, N = 599.41100.59100.68100.88105.021. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: yolov4 - Device: CPU - Executor: Standardbadce306090120150SE +/- 1.43, N = 4SE +/- 0.25, N = 3114.83114.15113.73110.42110.031. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: yolov4 - Device: CPU - Executor: Standardbadce3691215SE +/- 0.10839, N = 4SE +/- 0.01899, N = 38.712518.760078.792799.056199.087881. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 13 - Input: Beauty 4K 10-bit

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 13 - Input: Beauty 4K 10-bitcbeda510152025SE +/- 0.04, N = 3SE +/- 0.03, N = 319.4019.4219.4419.4419.521. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Paralleldbaec140280420560700SE +/- 4.33, N = 3SE +/- 4.43, N = 3630.81625.06621.23621.08617.601. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Paralleldbaec0.36430.72861.09291.45721.8215SE +/- 0.01116, N = 3SE +/- 0.01149, N = 31.585261.599991.609871.610101.619161. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standarddaceb80160240320400SE +/- 0.46, N = 3SE +/- 1.22, N = 3374.58372.79371.60371.18370.181. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standarddaceb0.60781.21561.82342.43123.039SE +/- 0.00331, N = 3SE +/- 0.00886, N = 32.669662.682482.691052.694132.701451. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: GPT-2 - Device: CPU - Executor: Parallelabdce246810SE +/- 0.00960, N = 3SE +/- 0.01421, N = 36.216286.189116.171856.159896.157561. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: GPT-2 - Device: CPU - Executor: Parallelabdce4080120160200SE +/- 0.25, N = 3SE +/- 0.37, N = 3160.71161.41161.86162.18162.251. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ZFNet-512 - Device: CPU - Executor: Parallelcadeb510152025SE +/- 0.27, N = 3SE +/- 0.25, N = 319.7118.7318.4118.3818.331. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ZFNet-512 - Device: CPU - Executor: Parallelcadeb1224364860SE +/- 0.75, N = 3SE +/- 0.73, N = 350.7453.4154.3054.4254.581. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: GPT-2 - Device: CPU - Executor: Standardadcbe3691215SE +/- 0.01924, N = 3SE +/- 0.05188, N = 39.518329.508919.468539.451289.281121. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: GPT-2 - Device: CPU - Executor: Standardadcbe20406080100SE +/- 0.21, N = 3SE +/- 0.58, N = 3105.04105.15105.59105.79107.721. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: T5 Encoder - Device: CPU - Executor: Paralleladebc0.67041.34082.01122.68163.352SE +/- 0.00618, N = 3SE +/- 0.01351, N = 32.979382.976082.951242.949362.946901. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: T5 Encoder - Device: CPU - Executor: Paralleladebc70140210280350SE +/- 0.69, N = 3SE +/- 1.56, N = 3335.50335.86338.70338.93339.181. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: T5 Encoder - Device: CPU - Executor: Standarddeacb246810SE +/- 0.04465, N = 3SE +/- 0.05397, N = 37.237347.214157.094427.080047.036941. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: T5 Encoder - Device: CPU - Executor: Standarddeacb306090120150SE +/- 0.88, N = 3SE +/- 1.08, N = 3138.16138.61140.96141.23142.111. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallelcbade20406080100SE +/- 0.51, N = 3SE +/- 0.34, N = 380.7080.1679.3479.2977.851. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallelcbade3691215SE +/- 0.08, N = 3SE +/- 0.05, N = 312.3912.4812.6012.6112.851. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ArcFace ResNet-100 - Device: CPU - Executor: Standardebacd612182430SE +/- 0.33, N = 3SE +/- 0.27, N = 327.2627.0626.0825.7825.501. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ArcFace ResNet-100 - Device: CPU - Executor: Standardebacd918273645SE +/- 0.45, N = 3SE +/- 0.40, N = 336.6836.9738.3538.7839.211. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standardcebad510152025SE +/- 0.07, N = 3SE +/- 0.16, N = 322.7022.6722.4922.4422.081. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standardcebad1020304050SE +/- 0.14, N = 3SE +/- 0.33, N = 344.0444.1144.4644.5645.291. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: CaffeNet 12-int8 - Device: CPU - Executor: Standarddecab0.53631.07261.60892.14522.6815SE +/- 0.00686, N = 3SE +/- 0.01799, N = 32.383552.270192.261262.198412.173391. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: CaffeNet 12-int8 - Device: CPU - Executor: Standarddecab100200300400500SE +/- 1.43, N = 3SE +/- 3.83, N = 3419.42440.35442.12454.72460.021. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallelcdaeb48121620SE +/- 0.04, N = 3SE +/- 0.13, N = 313.6713.6113.4713.4513.451. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallelcdaeb20406080100SE +/- 0.25, N = 3SE +/- 0.72, N = 373.1273.4774.2174.3374.361. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standardcabed1.15762.31523.47284.63045.788SE +/- 0.02497, N = 3SE +/- 0.03731, N = 35.144894.971874.964374.932854.905031. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standardcabed4080120160200SE +/- 1.01, N = 3SE +/- 1.52, N = 3194.34201.10201.42202.69203.841. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.19Model: super-resolution-10 - Device: CPU - Executor: Parallelbcead246810SE +/- 0.02614, N = 3SE +/- 0.01157, N = 37.662427.645907.616597.614177.561801. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.19Model: super-resolution-10 - Device: CPU - Executor: Parallelbcead306090120150SE +/- 0.45, N = 3SE +/- 0.20, N = 3130.49130.77131.27131.31132.221. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 5 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 5 - Input: Bosphorus 4Kdbeac1020304050SE +/- 0.22, N = 3SE +/- 0.18, N = 345.1745.3445.4145.6145.661. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 5 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 5 - Input: Bosphorus 1080pdabce306090120150SE +/- 0.52, N = 3SE +/- 0.35, N = 3118.80119.22119.23119.77119.921. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 8 - Input: Bosphorus 4Kcabed20406080100SE +/- 0.26, N = 3SE +/- 0.79, N = 396.0696.2196.4296.7897.051. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 4K

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 13 - Input: Bosphorus 4Kdeacb50100150200250SE +/- 1.99, N = 8SE +/- 2.31, N = 3228.83228.97231.11232.40233.251. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 8 - Input: Bosphorus 1080pcedba60120180240300SE +/- 0.65, N = 3SE +/- 0.46, N = 3266.01268.76269.43269.88271.681. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 1080p

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 2.2Encoder Mode: Preset 13 - Input: Bosphorus 1080pabecd2004006008001000SE +/- 7.99, N = 4SE +/- 8.40, N = 4740.00740.79749.26764.00783.121. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq


Phoronix Test Suite v10.8.5