9950X onnx svt

AMD Ryzen 9 9950X 16-Core testing with an ASUS ROG STRIX X670E-E GAMING WIFI (2204 BIOS) and AMD Radeon RX 7900 GRE 16GB on Ubuntu 24.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2408225-NE-9950XONNX44&grr&sor.

9950X onnx svt: system details

Runs: a, b, c, d, e

Processor: AMD Ryzen 9 9950X 16-Core @ 5.75GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG STRIX X670E-E GAMING WIFI (2204 BIOS)
Chipset: AMD Device 14d8
Memory: 2 x 32GB DDR5-6400MT/s Corsair CMK64GX5M2B6400C32
Disk: 2000GB Corsair MP700 PRO
Graphics: AMD Radeon RX 7900 GRE 16GB
Audio: AMD Navi 31 HDMI/DP
Monitor: DELL U2723QE
Network: Intel I225-V + Intel Wi-Fi 6E
OS: Ubuntu 24.04
Kernel: 6.10.0-phx (x86_64)
Desktop: GNOME Shell 46.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 24.2~git2406040600.8112d4~oibaf~n (git-8112d44 2024-06-04 noble-oibaf-ppa) (LLVM 17.0.6 DRM 3.57)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance)
- CPU Microcode: 0xb40401a

Python Details
- Python 3.12.3

Security Details
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

9950X onnx svt: result overview

Each ONNX Runtime test is listed twice: first as inference time cost (ms), then as inferences per second. SVT-AV1 results are in frames per second; Whisperfile results are in seconds.

Test; a; b; c; d; e
svt-av1: Preset 3 - Beauty 4K 10-bit; 1.760; 1.758; 1.76; 1.759; 1.759
whisperfile: Medium; 328.97184; 327.12841; 329.23828; 328.28944; 328.01325
svt-av1: Preset 3 - Bosphorus 4K; 11.814; 11.821; 11.794; 11.818; 11.744
svt-av1: Preset 5 - Beauty 4K 10-bit; 7.574; 7.588; 7.583; 7.54; 7.541
svt-av1: Preset 8 - Beauty 4K 10-bit; 10.593; 10.609; 10.527; 10.525; 10.737
onnx: yolov4 - CPU - Parallel (ms); 129.644; 130.124; 124.41; 115.894; 127.801
onnx: yolov4 - CPU - Parallel (inferences/sec); 7.71368; 7.68899; 8.03773; 8.62837; 7.82444
whisperfile: Small; 127.97316; 128.23627; 129.62575; 129.79648; 129.38091
svt-av1: Preset 3 - Bosphorus 1080p; 37.313; 37.209; 37.198; 37.151; 37.197
onnx: T5 Encoder - CPU - Standard (ms); 5.24802; 5.18948; 5.18564; 5.46322; 5.17395
onnx: T5 Encoder - CPU - Standard (inferences/sec); 190.624; 192.676; 192.823; 183.018; 193.253
svt-av1: Preset 13 - Beauty 4K 10-bit; 19.400; 19.404; 19.419; 19.488; 19.487
onnx: GPT-2 - CPU - Parallel (ms); 6.62484; 6.60878; 6.6827; 6.65046; 6.61099
onnx: GPT-2 - CPU - Parallel (inferences/sec); 150.865; 151.245; 149.566; 150.289; 151.197
onnx: GPT-2 - CPU - Standard (ms); 6.19550; 6.37911; 6.34248; 6.33978; 6.34139
onnx: GPT-2 - CPU - Standard (inferences/sec); 161.414; 156.717; 157.628; 157.689; 157.65
onnx: bertsquad-12 - CPU - Parallel (ms); 84.9286; 86.1980; 83.4468; 85.4013; 85.0842
onnx: bertsquad-12 - CPU - Parallel (inferences/sec); 11.7750; 11.6016; 11.9832; 11.7088; 11.7526
onnx: ZFNet-512 - CPU - Parallel (ms); 14.0673; 14.1239; 13.9657; 14.1704; 13.9577
onnx: ZFNet-512 - CPU - Parallel (inferences/sec); 71.0823; 70.7958; 71.5953; 70.5538; 71.6359
onnx: ZFNet-512 - CPU - Standard (ms); 7.86110; 7.95942; 7.93481; 7.86194; 7.93399
onnx: ZFNet-512 - CPU - Standard (inferences/sec); 127.185; 125.625; 125.997; 127.169; 126.016
onnx: yolov4 - CPU - Standard (ms); 72.5475; 72.0326; 71.8677; 71.6925; 72.9637
onnx: yolov4 - CPU - Standard (inferences/sec); 13.7840; 13.8825; 13.914; 13.948; 13.705
onnx: T5 Encoder - CPU - Parallel (ms); 5.24748; 5.24914; 5.32595; 5.2459; 5.22498
onnx: T5 Encoder - CPU - Parallel (inferences/sec); 190.549; 190.482; 187.735; 190.604; 191.361
svt-av1: Preset 5 - Bosphorus 4K; 44.104; 43.974; 43.769; 43.874; 43.739
onnx: ResNet101_DUC_HDC-12 - CPU - Parallel (ms); 705.209; 708.821; 703.891; 709.79; 695.937
onnx: ResNet101_DUC_HDC-12 - CPU - Parallel (inferences/sec); 1.41808; 1.41079; 1.42067; 1.40886; 1.43691
onnx: ResNet101_DUC_HDC-12 - CPU - Standard (ms); 390.73; 391.384; 387.873; 390.94; 392.057
onnx: ResNet101_DUC_HDC-12 - CPU - Standard (inferences/sec); 2.55959; 2.55502; 2.57815; 2.55793; 2.55063
onnx: fcn-resnet101-11 - CPU - Parallel (ms); 449.481; 444.436; 444.847; 447.063; 446.891
onnx: fcn-resnet101-11 - CPU - Parallel (inferences/sec); 2.22498; 2.25003; 2.24795; 2.2368; 2.23767
onnx: fcn-resnet101-11 - CPU - Standard (ms); 201.866; 202.881; 200.78; 202.724; 202.807
onnx: fcn-resnet101-11 - CPU - Standard (inferences/sec); 4.95375; 4.92894; 4.98053; 4.93277; 4.93075
onnx: bertsquad-12 - CPU - Standard (ms); 39.8348; 39.9139; 39.924; 42.7142; 39.9589
onnx: bertsquad-12 - CPU - Standard (inferences/sec); 25.1022; 25.0524; 25.0458; 23.4098; 25.0243
onnx: ArcFace ResNet-100 - CPU - Parallel (ms); 40.8495; 40.9041; 41.1806; 40.3648; 41.2192
onnx: ArcFace ResNet-100 - CPU - Parallel (inferences/sec); 24.4794; 24.4461; 24.2813; 24.7729; 24.2594
onnx: ArcFace ResNet-100 - CPU - Standard (ms); 17.9386; 17.8392; 17.8517; 17.8447; 17.8852
onnx: ArcFace ResNet-100 - CPU - Standard (inferences/sec); 55.7411; 56.0511; 56.0119; 56.034; 55.9068
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard (ms); 16.2609; 16.5315; 16.076; 16.0884; 16.578
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Standard (inferences/sec); 61.4985; 60.4839; 62.198; 62.1495; 60.315
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel (ms); 24.1918; 24.2099; 24.4512; 24.3388; 24.4935
onnx: Faster R-CNN R-50-FPN-int8 - CPU - Parallel (inferences/sec); 41.3330; 41.3018; 40.8941; 41.0832; 40.8234
onnx: CaffeNet 12-int8 - CPU - Parallel (ms); 4.07620; 4.1004; 4.12702; 4.06632; 4.1821
onnx: CaffeNet 12-int8 - CPU - Parallel (inferences/sec); 245.278; 243.727; 242.238; 245.867; 239.032
onnx: CaffeNet 12-int8 - CPU - Standard (ms); 1.17405; 1.17235; 1.16879; 1.17003; 1.15972
onnx: CaffeNet 12-int8 - CPU - Standard (inferences/sec); 851.244; 852.418; 855.082; 854.139; 861.654
onnx: ResNet50 v1-12-int8 - CPU - Parallel (ms); 6.55079; 6.54126; 6.68702; 6.69723; 6.66991
onnx: ResNet50 v1-12-int8 - CPU - Parallel (inferences/sec); 152.641; 152.853; 149.511; 149.296; 149.907
onnx: ResNet50 v1-12-int8 - CPU - Standard (ms); 1.72670; 1.7433; 1.74974; 1.74031; 1.7372
onnx: ResNet50 v1-12-int8 - CPU - Standard (inferences/sec); 578.940; 573.425; 571.193; 574.317; 575.415
onnx: super-resolution-10 - CPU - Parallel (ms); 4.50538; 4.53509; 4.66209; 4.4796; 4.58833
onnx: super-resolution-10 - CPU - Parallel (inferences/sec); 221.936; 220.47; 214.475; 223.209; 217.917
onnx: super-resolution-10 - CPU - Standard (ms); 4.44116; 4.4613; 4.46838; 4.45929; 4.54187
onnx: super-resolution-10 - CPU - Standard (inferences/sec); 225.156; 224.14; 223.785; 224.24; 220.163
svt-av1: Preset 5 - Bosphorus 1080p; 128.707; 127.961; 128.368; 128.103; 127.605
svt-av1: Preset 8 - Bosphorus 4K; 98.657; 99.255; 99.222; 98.242; 98.06
whisperfile: Tiny; 29.63488; 30.31093; 30.03969; 30.03582; 30.37277
svt-av1: Preset 8 - Bosphorus 1080p; 302.419; 304.327; 303.057; 303.122; 301.639
svt-av1: Preset 13 - Bosphorus 4K; 257.201; 258.647; 257.418; 255.705; 256.152
svt-av1: Preset 13 - Bosphorus 1080p; 1009.640; 1011.236; 1012.768; 1011.433; 1010.763
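
To judge how far the five runs differ on a given test, one option is to express the gap between the best and worst run as a percentage. The sketch below does that for three rows copied from the overview above. `spread_percent` is a hypothetical helper written for this illustration, not part of the Phoronix Test Suite, and the choice of rows is arbitrary.

```python
# Values for runs a, b, c, d, e, copied from the result overview.
results = {
    "svt-av1: Preset 3 - Beauty 4K 10-bit (FPS)": [1.760, 1.758, 1.76, 1.759, 1.759],
    "onnx: yolov4 - CPU - Parallel (ms)": [129.644, 130.124, 124.41, 115.894, 127.801],
    "whisperfile: Medium (s)": [328.97184, 327.12841, 329.23828, 328.28944, 328.01325],
}


def spread_percent(values):
    """Return the gap between the largest and smallest value, as a percentage of the smallest."""
    lowest, highest = min(values), max(values)
    return (highest - lowest) / lowest * 100.0


if __name__ == "__main__":
    for name, values in results.items():
        print(f"{name}: {spread_percent(values):.2f}% spread across runs")
```

A spread well above the reported standard error for a test suggests that at least one run behaved differently from the others on that workload; a spread of a fraction of a percent is more likely run-to-run noise.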

SVT-AV1

Encoder Mode: Preset 3 - Input: Beauty 4K 10-bit

SVT-AV1 2.2 - Frames Per Second, More Is Better
c: 1.760, a: 1.760, e: 1.759, d: 1.759, b: 1.758
SE +/- 0.005, N = 3; SE +/- 0.003, N = 3
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Whisperfile

Model Size: Medium

Whisperfile 20Aug24 - Seconds, Fewer Is Better
b: 327.13, e: 328.01, d: 328.29, a: 328.97, c: 329.24
SE +/- 0.20, N = 3

SVT-AV1

Encoder Mode: Preset 3 - Input: Bosphorus 4K

SVT-AV1 2.2 - Frames Per Second, More Is Better
b: 11.82, d: 11.82, a: 11.81, c: 11.79, e: 11.74
SE +/- 0.00, N = 3; SE +/- 0.04, N = 3
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 5 - Input: Beauty 4K 10-bit

SVT-AV1 2.2 - Frames Per Second, More Is Better
b: 7.588, c: 7.583, a: 7.574, e: 7.541, d: 7.540
SE +/- 0.052, N = 3; SE +/- 0.022, N = 3
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 8 - Input: Beauty 4K 10-bit

SVT-AV1 2.2 - Frames Per Second, More Is Better
e: 10.74, b: 10.61, a: 10.59, c: 10.53, d: 10.53
SE +/- 0.04, N = 3; SE +/- 0.03, N = 3
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
d: 115.89, c: 124.41, e: 127.80, a: 129.64, b: 130.12
SE +/- 0.73, N = 3; SE +/- 1.08, N = 9
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
d: 8.62837, c: 8.03773, e: 7.82444, a: 7.71368, b: 7.68899
SE +/- 0.04367, N = 3; SE +/- 0.06383, N = 9
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Whisperfile

Model Size: Small

Whisperfile 20Aug24 - Seconds, Fewer Is Better
a: 127.97, b: 128.24, e: 129.38, c: 129.63, d: 129.80
SE +/- 0.21, N = 3

SVT-AV1

Encoder Mode: Preset 3 - Input: Bosphorus 1080p

SVT-AV1 2.2 - Frames Per Second, More Is Better
a: 37.31, b: 37.21, c: 37.20, e: 37.20, d: 37.15
SE +/- 0.01, N = 3; SE +/- 0.01, N = 3
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
e: 5.17395, c: 5.18564, b: 5.18948, a: 5.24802, d: 5.46322
SE +/- 0.00692, N = 3; SE +/- 0.05529, N = 6
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
e: 193.25, c: 192.82, b: 192.68, a: 190.62, d: 183.02
SE +/- 0.25, N = 3; SE +/- 1.93, N = 6
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 13 - Input: Beauty 4K 10-bit

SVT-AV1 2.2 - Frames Per Second, More Is Better
d: 19.49, e: 19.49, c: 19.42, b: 19.40, a: 19.40
SE +/- 0.01, N = 3; SE +/- 0.01, N = 3
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
b: 6.60878, e: 6.61099, a: 6.62484, d: 6.65046, c: 6.68270
SE +/- 0.02289, N = 3; SE +/- 0.00827, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
b: 151.25, e: 151.20, a: 150.87, d: 150.29, c: 149.57
SE +/- 0.53, N = 3; SE +/- 0.19, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
a: 6.19550, d: 6.33978, e: 6.34139, c: 6.34248, b: 6.37911
SE +/- 0.07694, N = 3; SE +/- 0.01653, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
a: 161.41, d: 157.69, e: 157.65, c: 157.63, b: 156.72
SE +/- 1.98, N = 3; SE +/- 0.41, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
c: 83.45, a: 84.93, e: 85.08, d: 85.40, b: 86.20
SE +/- 0.50, N = 3; SE +/- 0.50, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
c: 11.98, a: 11.78, e: 11.75, d: 11.71, b: 11.60
SE +/- 0.07, N = 3; SE +/- 0.07, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
e: 13.96, c: 13.97, a: 14.07, b: 14.12, d: 14.17
SE +/- 0.08, N = 3; SE +/- 0.07, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
e: 71.64, c: 71.60, a: 71.08, b: 70.80, d: 70.55
SE +/- 0.42, N = 3; SE +/- 0.36, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
a: 7.86110, d: 7.86194, e: 7.93399, c: 7.93481, b: 7.95942
SE +/- 0.03518, N = 3; SE +/- 0.06114, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
a: 127.19, d: 127.17, e: 126.02, c: 126.00, b: 125.63
SE +/- 0.57, N = 3; SE +/- 0.96, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
d: 71.69, c: 71.87, b: 72.03, a: 72.55, e: 72.96
SE +/- 0.27, N = 3; SE +/- 0.26, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
d: 13.95, c: 13.91, b: 13.88, a: 13.78, e: 13.71
SE +/- 0.05, N = 3; SE +/- 0.05, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
e: 5.22498, d: 5.24590, a: 5.24748, b: 5.24914, c: 5.32595
SE +/- 0.02621, N = 3; SE +/- 0.01022, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
e: 191.36, d: 190.60, a: 190.55, b: 190.48, c: 187.74
SE +/- 0.96, N = 3; SE +/- 0.37, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 5 - Input: Bosphorus 4K

SVT-AV1 2.2 - Frames Per Second, More Is Better
a: 44.10, b: 43.97, d: 43.87, c: 43.77, e: 43.74
SE +/- 0.06, N = 3; SE +/- 0.13, N = 3
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
e: 695.94, c: 703.89, a: 705.21, b: 708.82, d: 709.79
SE +/- 3.47, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
e: 1.43691, c: 1.42067, a: 1.41808, b: 1.41079, d: 1.40886
SE +/- 0.00694, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
c: 387.87, a: 390.73, d: 390.94, b: 391.38, e: 392.06
SE +/- 3.03, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
c: 2.57815, a: 2.55959, d: 2.55793, b: 2.55502, e: 2.55063
SE +/- 0.02001, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
b: 444.44, c: 444.85, e: 446.89, d: 447.06, a: 449.48
SE +/- 3.07, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
b: 2.25003, c: 2.24795, e: 2.23767, d: 2.23680, a: 2.22498
SE +/- 0.01511, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
c: 200.78, a: 201.87, d: 202.72, e: 202.81, b: 202.88
SE +/- 0.28, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
c: 4.98053, a: 4.95375, d: 4.93277, e: 4.93075, b: 4.92894
SE +/- 0.00675, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
a: 39.83, b: 39.91, c: 39.92, e: 39.96, d: 42.71
SE +/- 0.07, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
a: 25.10, b: 25.05, c: 25.05, e: 25.02, d: 23.41
SE +/- 0.04, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
d: 40.36, a: 40.85, b: 40.90, c: 41.18, e: 41.22
SE +/- 0.15, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
d: 24.77, a: 24.48, b: 24.45, c: 24.28, e: 24.26
SE +/- 0.09, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
b: 17.84, d: 17.84, c: 17.85, e: 17.89, a: 17.94
SE +/- 0.04, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
b: 56.05, d: 56.03, c: 56.01, e: 55.91, a: 55.74
SE +/- 0.11, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
c: 16.08, d: 16.09, a: 16.26, b: 16.53, e: 16.58
SE +/- 0.13, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
c: 62.20, d: 62.15, a: 61.50, b: 60.48, e: 60.32
SE +/- 0.49, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
a: 24.19, b: 24.21, d: 24.34, c: 24.45, e: 24.49
SE +/- 0.06, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
a: 41.33, b: 41.30, d: 41.08, c: 40.89, e: 40.82
SE +/- 0.10, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
d: 4.06632, a: 4.07620, b: 4.10040, c: 4.12702, e: 4.18210
SE +/- 0.03127, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
d: 245.87, a: 245.28, b: 243.73, c: 242.24, e: 239.03
SE +/- 1.89, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
e: 1.15972, c: 1.16879, d: 1.17003, b: 1.17235, a: 1.17405
SE +/- 0.00364, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
e: 861.65, c: 855.08, d: 854.14, b: 852.42, a: 851.24
SE +/- 2.64, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
b: 6.54126, a: 6.55079, e: 6.66991, c: 6.68702, d: 6.69723
SE +/- 0.03664, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
b: 152.85, a: 152.64, e: 149.91, c: 149.51, d: 149.30
SE +/- 0.85, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
a: 1.72670, e: 1.73720, d: 1.74031, b: 1.74330, c: 1.74974
SE +/- 0.00641, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
a: 578.94, e: 575.42, d: 574.32, b: 573.43, c: 571.19
SE +/- 2.14, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
d: 4.47960, a: 4.50538, b: 4.53509, e: 4.58833, c: 4.66209
SE +/- 0.02467, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
d: 223.21, a: 221.94, b: 220.47, e: 217.92, c: 214.48
SE +/- 1.21, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
a: 4.44116, d: 4.45929, b: 4.46130, c: 4.46838, e: 4.54187
SE +/- 0.00178, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
a: 225.16, d: 224.24, b: 224.14, c: 223.79, e: 220.16
SE +/- 0.09, N = 3
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

SVT-AV1

Encoder Mode: Preset 5 - Input: Bosphorus 1080p

SVT-AV1 2.2 - Frames Per Second, More Is Better
a: 128.71, c: 128.37, d: 128.10, b: 127.96, e: 127.61
SE +/- 0.11, N = 3; SE +/- 0.17, N = 3
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

SVT-AV1 2.2 - Frames Per Second, More Is Better
b: 99.26, c: 99.22, a: 98.66, d: 98.24, e: 98.06
SE +/- 0.23, N = 3; SE +/- 0.16, N = 3
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

Whisperfile

Model Size: Tiny

Whisperfile 20Aug24 - Seconds, Fewer Is Better
a: 29.63, d: 30.04, c: 30.04, b: 30.31, e: 30.37
SE +/- 0.12, N = 3

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 1080p

SVT-AV1 2.2 - Frames Per Second, More Is Better
b: 304.33, d: 303.12, c: 303.06, a: 302.42, e: 301.64
SE +/- 0.56, N = 3; SE +/- 0.64, N = 3
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 4K

SVT-AV1 2.2 - Frames Per Second, More Is Better
b: 258.65, c: 257.42, a: 257.20, e: 256.15, d: 255.71
SE +/- 0.18, N = 3; SE +/- 0.27, N = 3
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq

SVT-AV1

Encoder Mode: Preset 13 - Input: Bosphorus 1080p

SVT-AV1 2.2 - Frames Per Second, More Is Better
c: 1012.77, d: 1011.43, b: 1011.24, e: 1010.76, a: 1009.64
SE +/- 1.10, N = 3; SE +/- 2.29, N = 3
1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq


Phoronix Test Suite v10.8.5