phoronix-machine-learning.txt

AMD Ryzen Threadripper 7960X 24-Cores testing with a Gigabyte TRX50 AERO D (FA BIOS) and Sapphire AMD Radeon RX 7900 XTX 24GB on Ubuntu 24.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2411137-NE-PHORONIXM28.

phoronix-machine-learning.txt

phoronix-ml.txt
Processor: AMD Ryzen Threadripper 7960X 24-Cores @ 7.79GHz (24 Cores / 48 Threads)
Motherboard: Gigabyte TRX50 AERO D (FA BIOS)
Chipset: AMD Device 14a4
Memory: 4 x 32GB DDR5-5200MT/s Micron MTC20F1045S1RC56BG1
Disk: 1000GB GIGABYTE AG512K1TB
Graphics: Sapphire AMD Radeon RX 7900 XTX 24GB
Audio: AMD Device 14cc
Monitor: HP E273
Network: Aquantia AQC113C NBase-T/IEEE + Realtek RTL8125 2.5GbE + Qualcomm WCN785x Wi-Fi 7
OS: Ubuntu 24.04
Kernel: 6.8.0-48-generic (x86_64)
Desktop: GNOME Shell 46.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 24.2.0-devel (LLVM 18.1.7 DRM 3.58)
OpenCL: OpenCL 2.1 AMD-APP (3625.0)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 1920x1080

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa108105
- BAR1 / Visible vRAM Size: 24560 MB
- Python 3.12.3
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected
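
The security entries above (gather_data_sampling, spectre_v2, and so on) match the per-vulnerability files that the Linux kernel exposes under /sys/devices/system/cpu/vulnerabilities. As a minimal sketch, assuming that directory layout, a similar status list could be collected as follows. The helper names read_mitigations and format_mitigations are hypothetical and chosen for illustration; this is not Phoronix Test Suite code.

```python
from pathlib import Path

# Standard sysfs location of the kernel's vulnerability status files on Linux.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigations(directory=VULN_DIR):
    """Return {vulnerability_name: status} for every readable file in `directory`."""
    statuses = {}
    if not directory.is_dir():
        return statuses  # non-Linux host, or sysfs not mounted
    for entry in sorted(directory.iterdir()):
        if entry.is_file():
            try:
                statuses[entry.name] = entry.read_text().strip()
            except OSError:
                continue  # skip entries that cannot be read
    return statuses


def format_mitigations(statuses):
    """Join entries as 'name: status + name: status' (assumed to mirror the listing above)."""
    return " + ".join(f"{name}: {status}" for name, status in statuses.items())


if __name__ == "__main__":
    print(format_mitigations(read_mitigations()))
```

On a machine without that directory the sketch prints an empty line, so it is safe to try on any host.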

phoronix-machine-learning.txtshoc: OpenCL - S3Dshoc: OpenCL - Triadshoc: OpenCL - FFT SPshoc: OpenCL - MD5 Hashshoc: OpenCL - Reductionshoc: OpenCL - GEMM SGEMM_Nshoc: OpenCL - Max SP Flopsshoc: OpenCL - Bus Speed Downloadshoc: OpenCL - Bus Speed Readbackshoc: OpenCL - Texture Read Bandwidthlczero: BLASonednn: IP Shapes 1D - CPUonednn: IP Shapes 3D - CPUonednn: Convolution Batch Shapes Auto - CPUonednn: Deconvolution Batch shapes_1d - CPUonednn: Deconvolution Batch shapes_3d - CPUonednn: Recurrent Neural Network Training - CPUonednn: Recurrent Neural Network Inference - CPUnumpy: deepspeech: CPUrbenchmark: rnnoise: 26 Minute Long Talking Sampletensorflow-lite: SqueezeNettensorflow-lite: Inception V4tensorflow-lite: NASNet Mobiletensorflow-lite: Mobilenet Floattensorflow-lite: Mobilenet Quanttensorflow-lite: Inception ResNet V2pytorch: CPU - 1 - ResNet-50pytorch: CPU - 1 - ResNet-152pytorch: CPU - 16 - ResNet-50pytorch: CPU - 32 - ResNet-50pytorch: CPU - 64 - ResNet-50pytorch: CPU - 16 - ResNet-152pytorch: CPU - 256 - ResNet-50pytorch: CPU - 32 - ResNet-152pytorch: CPU - 512 - ResNet-50pytorch: CPU - 64 - ResNet-152pytorch: CPU - 256 - ResNet-152pytorch: CPU - 512 - ResNet-152pytorch: CPU - 1 - Efficientnet_v2_lpytorch: CPU - 16 - Efficientnet_v2_lpytorch: CPU - 32 - Efficientnet_v2_lpytorch: CPU - 64 - Efficientnet_v2_lpytorch: CPU - 256 - Efficientnet_v2_lpytorch: CPU - 512 - Efficientnet_v2_ltensorflow: CPU - 1 - VGG-16tensorflow: GPU - 1 - VGG-16tensorflow: CPU - 1 - AlexNettensorflow: CPU - 16 - VGG-16tensorflow: CPU - 32 - VGG-16tensorflow: CPU - 64 - VGG-16tensorflow: GPU - 1 - AlexNettensorflow: GPU - 16 - VGG-16tensorflow: GPU - 32 - VGG-16tensorflow: GPU - 64 - VGG-16tensorflow: CPU - 16 - AlexNettensorflow: CPU - 256 - VGG-16tensorflow: CPU - 32 - AlexNettensorflow: CPU - 512 - VGG-16tensorflow: CPU - 64 - AlexNettensorflow: GPU - 16 - AlexNettensorflow: GPU - 256 - VGG-16tensorflow: GPU - 32 - AlexNettensorflow: GPU - 512 - VGG-16tensorflow: GPU - 64 - 
AlexNettensorflow: CPU - 1 - GoogLeNettensorflow: CPU - 1 - ResNet-50tensorflow: CPU - 256 - AlexNettensorflow: CPU - 512 - AlexNettensorflow: GPU - 1 - GoogLeNettensorflow: GPU - 1 - ResNet-50tensorflow: GPU - 256 - AlexNettensorflow: GPU - 512 - AlexNettensorflow: CPU - 16 - GoogLeNettensorflow: CPU - 16 - ResNet-50tensorflow: CPU - 32 - GoogLeNettensorflow: CPU - 32 - ResNet-50tensorflow: CPU - 64 - GoogLeNettensorflow: CPU - 64 - ResNet-50tensorflow: GPU - 16 - GoogLeNettensorflow: GPU - 16 - ResNet-50tensorflow: GPU - 32 - GoogLeNettensorflow: GPU - 32 - ResNet-50tensorflow: GPU - 64 - GoogLeNettensorflow: GPU - 64 - ResNet-50tensorflow: CPU - 256 - GoogLeNettensorflow: CPU - 256 - ResNet-50tensorflow: CPU - 512 - GoogLeNettensorflow: CPU - 512 - ResNet-50tensorflow: GPU - 256 - GoogLeNettensorflow: GPU - 256 - ResNet-50tensorflow: GPU - 512 - GoogLeNettensorflow: GPU - 512 - ResNet-50mnn: nasnetmnn: mobilenetV3mnn: squeezenetv1.1mnn: resnet-v2-50mnn: SqueezeNetV1.0mnn: MobileNetV2_224mnn: mobilenet-v1-1.0mnn: inception-v3ncnn: CPU - mobilenetncnn: CPU-v2-v2 - mobilenet-v2ncnn: CPU-v3-v3 - mobilenet-v3ncnn: CPU - shufflenet-v2ncnn: CPU - mnasnetncnn: CPU - efficientnet-b0ncnn: CPU - blazefacencnn: CPU - googlenetncnn: CPU - vgg16ncnn: CPU - resnet18ncnn: CPU - alexnetncnn: CPU - resnet50ncnn: CPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3ncnn: CPU - yolov4-tinyncnn: CPU - squeezenet_ssdncnn: CPU - regnety_400mncnn: CPU - vision_transformerncnn: CPU - FastestDetncnn: Vulkan GPU - mobilenetncnn: Vulkan GPU-v2-v2 - mobilenet-v2ncnn: Vulkan GPU-v3-v3 - mobilenet-v3ncnn: Vulkan GPU - shufflenet-v2ncnn: Vulkan GPU - mnasnetncnn: Vulkan GPU - efficientnet-b0ncnn: Vulkan GPU - blazefacencnn: Vulkan GPU - googlenetncnn: Vulkan GPU - vgg16ncnn: Vulkan GPU - resnet18ncnn: Vulkan GPU - alexnetncnn: Vulkan GPU - resnet50ncnn: Vulkan GPUv2-yolov3v2-yolov3 - mobilenetv2-yolov3ncnn: Vulkan GPU - yolov4-tinyncnn: Vulkan GPU - squeezenet_ssdncnn: Vulkan GPU - regnety_400mncnn: 
Vulkan GPU - vision_transformerncnn: Vulkan GPU - FastestDetxnnpack: FP32MobileNetV1xnnpack: FP32MobileNetV2xnnpack: FP32MobileNetV3Largexnnpack: FP32MobileNetV3Smallxnnpack: FP16MobileNetV1xnnpack: FP16MobileNetV2xnnpack: FP16MobileNetV3Largexnnpack: FP16MobileNetV3Smallxnnpack: QS8MobileNetV2openvino: Face Detection FP16 - CPUopenvino: Face Detection FP16 - CPUopenvino: Person Detection FP16 - CPUopenvino: Person Detection FP16 - CPUopenvino: Person Detection FP32 - CPUopenvino: Person Detection FP32 - CPUopenvino: Vehicle Detection FP16 - CPUopenvino: Vehicle Detection FP16 - CPUopenvino: Face Detection FP16-INT8 - CPUopenvino: Face Detection FP16-INT8 - CPUopenvino: Face Detection Retail FP16 - CPUopenvino: Face Detection Retail FP16 - CPUopenvino: Road Segmentation ADAS FP16 - CPUopenvino: Road Segmentation ADAS FP16 - CPUopenvino: Vehicle Detection FP16-INT8 - CPUopenvino: Vehicle Detection FP16-INT8 - CPUopenvino: Weld Porosity Detection FP16 - CPUopenvino: Weld Porosity Detection FP16 - CPUopenvino: Face Detection Retail FP16-INT8 - CPUopenvino: Face Detection Retail FP16-INT8 - CPUopenvino: Road Segmentation ADAS FP16-INT8 - CPUopenvino: Road Segmentation ADAS FP16-INT8 - CPUopenvino: Machine Translation EN To DE FP16 - CPUopenvino: Machine Translation EN To DE FP16 - CPUopenvino: Weld Porosity Detection FP16-INT8 - CPUopenvino: Weld Porosity Detection FP16-INT8 - CPUopenvino: Person Vehicle Bike Detection FP16 - CPUopenvino: Person Vehicle Bike Detection FP16 - CPUopenvino: Noise Suppression Poconet-Like FP16 - CPUopenvino: Noise Suppression Poconet-Like FP16 - CPUopenvino: Handwritten English Recognition FP16 - CPUopenvino: Handwritten English Recognition FP16 - CPUopenvino: Person Re-Identification Retail FP16 - CPUopenvino: Person Re-Identification Retail FP16 - CPUopenvino: Age Gender Recognition Retail 0013 FP16 - CPUopenvino: Age Gender Recognition Retail 0013 FP16 - CPUopenvino: Handwritten English Recognition FP16-INT8 - CPUopenvino: Handwritten 
English Recognition FP16-INT8 - CPUopenvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPUopenvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPUonnx: yolov4 - CPU - Parallelonnx: yolov4 - CPU - Parallelonnx: yolov4 - CPU - Standardonnx: yolov4 - CPU - Standardonnx: ZFNet-512 - CPU - Parallelonnx: ZFNet-512 - CPU - Parallelonnx: ZFNet-512 - CPU - Standardonnx: ZFNet-512 - CPU - Standardonnx: T5 Encoder - CPU - Parallelonnx: T5 Encoder - CPU - Parallelonnx: T5 Encoder - CPU - Standardonnx: T5 Encoder - CPU - Standardonnx: CaffeNet 12-int8 - CPU - Parallelonnx: CaffeNet 12-int8 - CPU - Parallelonnx: CaffeNet 12-int8 - CPU - Standardonnx: CaffeNet 12-int8 - CPU - Standardonnx: fcn-resnet101-11 - CPU - Parallelonnx: fcn-resnet101-11 - CPU - Parallelonnx: fcn-resnet101-11 - CPU - Standardonnx: fcn-resnet101-11 - CPU - Standardonnx: ResNet50 v1-12-int8 - CPU - Parallelonnx: ResNet50 v1-12-int8 - CPU - Parallelonnx: ResNet50 v1-12-int8 - CPU - Standardonnx: ResNet50 v1-12-int8 - CPU - Standardonnx: super-resolution-10 - CPU - Parallelonnx: super-resolution-10 - CPU - Parallelonnx: super-resolution-10 - CPU - Standardonnx: super-resolution-10 - CPU - Standardonnx: ResNet101_DUC_HDC-12 - CPU - Parallelonnx: ResNet101_DUC_HDC-12 - CPU - Parallelonnx: ResNet101_DUC_HDC-12 - CPU - Standardonnx: ResNet101_DUC_HDC-12 - CPU - Standardscikit-learn: GLMscikit-learn: SAGAscikit-learn: Treescikit-learn: Lassoscikit-learn: Sparsifyscikit-learn: Plot Wardscikit-learn: MNIST Datasetscikit-learn: Plot Neighborsscikit-learn: SGD Regressionscikit-learn: SGDOneClassSVMscikit-learn: Isolation Forestscikit-learn: Text Vectorizersscikit-learn: Plot Hierarchicalscikit-learn: Plot OMP vs. 
LARSscikit-learn: Feature Expansionsscikit-learn: LocalOutlierFactorscikit-learn: TSNE MNIST Datasetscikit-learn: Isotonic / Logisticscikit-learn: Plot Incremental PCAscikit-learn: Hist Gradient Boostingscikit-learn: Plot Parallel Pairwisescikit-learn: Isotonic / Pathologicalscikit-learn: Sample Without Replacementscikit-learn: Covertype Dataset Benchmarkscikit-learn: Hist Gradient Boosting Adultscikit-learn: Isotonic / Perturbed Logarithmscikit-learn: Hist Gradient Boosting Threadingscikit-learn: Hist Gradient Boosting Higgs Bosonscikit-learn: 20 Newsgroups / Logistic Regressionscikit-learn: Plot Polynomial Kernel Approximationscikit-learn: Hist Gradient Boosting Categorical Onlyscikit-learn: Kernel PCA Solvers / Time vs. N Samplesscikit-learn: Kernel PCA Solvers / Time vs. N Componentsscikit-learn: Sparse Rand Projections / 100 Iterationswhisper-cpp: ggml-base.en - 2016 State of the Unionwhisper-cpp: ggml-small.en - 2016 State of the Unionwhisper-cpp: ggml-medium.en - 2016 State of the Unionopencv: DNN - Deep Neural 
Networkphoronix-ml.txt289.54313.8158752.83746.508442.94497615.3593757.324.989326.25251003.3211841.136571.395912.363173.775671.852061261.40736.400715.5046.234750.12527.8521836.2820372.633662.51381.252501.2133356.360.1723.1945.5946.4246.6917.9746.0617.7845.7517.6417.9217.9914.189.909.859.9810.119.949.702.2030.6627.3428.3229.0515.382.512.542.55288.7130.16409.5630.43516.1842.432.5546.142.5547.8560.9218.38627.50643.4421.036.6948.9249.28198.1162.29218.0767.94225.1670.8926.918.9427.919.1428.539.24227.0659.70185.2558.5029.039.3628.739.3415.2972.5364.32718.5346.4293.2683.78436.45213.856.306.458.156.118.283.1116.4225.718.025.5613.3313.8524.1414.3418.5840.599.8013.796.316.498.065.998.683.1416.0125.137.865.2814.6513.7923.6416.0418.6341.059.8212331873246515031144149521281464139819.69607.48129.4593.29119.41100.461077.8711.1237.36320.424273.092.76465.8625.832160.575.511947.6712.256458.383.58679.6617.62187.8863.993742.776.311930.086.182052.0211.531035.2923.122523.594.7248433.920.431108.6521.5867537.770.35.39684185.3609.65442103.57957.861217.2875109.8639.10376272.1843.67307172.0375.81489203.0594.92768639.9051.562531.20339834.5724.11676242.908108.8759.18292328.2933.04551123.7368.0814997.317710.27541.29718770.9062.21086452.314168.333669.11846.970308.225108.45542.10452.736114.83964.368233.407176.28745.340141.46341.476100.35321.616247.5561406.45231.209166.776167.9973843.25890.640320.394153.3381528.96652.72965.73610.450104.70030.18861.61131.037504.82992.75261218.18523579.1707733080OpenBenchmarking.org

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: S3D

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GFLOPS, More Is Better)
phoronix-ml.txt: 298.49 (SE +/- 3.81, N = 15)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi
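
Each result reports a mean alongside "SE +/-" and "N". Reading SE as the standard error of the mean over N runs (an assumption; the export does not spell out the formula), the sketch below shows how such a figure can be computed from raw samples and how the S3D numbers above translate into a relative error. The three sample values passed to standard_error are made up for illustration and do not come from the export.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(len(samples))


def relative_se_percent(mean, se):
    """Express a reported SE as a percentage of the reported mean."""
    return 100.0 * se / mean


# Reported S3D figures from the result above: mean 298.49 GFLOPS, SE +/- 3.81.
s3d_rel = relative_se_percent(298.49, 3.81)
print(f"S3D relative SE: {s3d_rel:.2f}%")

# Hypothetical run values (not from the export) to exercise standard_error().
print(standard_error([10.0, 12.0, 14.0]))
```

A relative SE near one percent, as here, is a quick way to judge whether two results differ by more than run-to-run noise.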

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, More Is Better)
phoronix-ml.txt: 23.05 (SE +/- 0.23, N = 6)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GFLOPS, More Is Better)
phoronix-ml.txt: 2703.37 (SE +/- 2.81, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GHash/s, More Is Better)
phoronix-ml.txt: 49.64 (SE +/- 0.68, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Reduction

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, More Is Better)
phoronix-ml.txt: 595.05 (SE +/- 0.41, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: GEMM SGEMM_N

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GFLOPS, More Is Better)
phoronix-ml.txt: 8470.02 (SE +/- 23.01, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GFLOPS, More Is Better)
phoronix-ml.txt: 93757.3 (SE +/- 230.38, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, More Is Better)
phoronix-ml.txt: 24.99 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, More Is Better)
phoronix-ml.txt: 26.25 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2020-04-17 (GB/s, More Is Better)
phoronix-ml.txt: 1003.32 (SE +/- 5.65, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

LeelaChessZero

Backend: BLAS

LeelaChessZero 0.31.1 (Nodes Per Second, More Is Better)
phoronix-ml.txt: 184 (SE +/- 12.39, N = 9)
1. (CXX) g++ options: -flto -pthread

oneDNN

Harness: IP Shapes 1D - Engine: CPU

oneDNN 3.6 (ms, Fewer Is Better)
phoronix-ml.txt: 1.13657 (SE +/- 0.00979, N = 15; MIN: 1.01)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: IP Shapes 3D - Engine: CPU

oneDNN 3.6 (ms, Fewer Is Better)
phoronix-ml.txt: 1.39591 (SE +/- 0.01618, N = 15; MIN: 1.16)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Engine: CPU

oneDNN 3.6 (ms, Fewer Is Better)
phoronix-ml.txt: 2.36317 (SE +/- 0.02737, N = 4; MIN: 1.97)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Engine: CPU

oneDNN 3.6 (ms, Fewer Is Better)
phoronix-ml.txt: 3.77567 (SE +/- 0.00801, N = 3; MIN: 2.81)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Engine: CPU

oneDNN 3.6 (ms, Fewer Is Better)
phoronix-ml.txt: 1.85206 (SE +/- 0.01905, N = 4; MIN: 1.73)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Engine: CPU

oneDNN 3.6 (ms, Fewer Is Better)
phoronix-ml.txt: 1261.40 (SE +/- 9.63, N = 3; MIN: 1196.78)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Engine: CPU

oneDNN 3.6 (ms, Fewer Is Better)
phoronix-ml.txt: 736.40 (SE +/- 8.66, N = 3; MIN: 639.28)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

Numpy Benchmark

Numpy Benchmark (Score, More Is Better)
phoronix-ml.txt: 715.50 (SE +/- 5.45, N = 3)

DeepSpeech

Acceleration: CPU

DeepSpeech 0.6 (Seconds, Fewer Is Better)
phoronix-ml.txt: 46.23 (SE +/- 0.15, N = 3)

R Benchmark

R Benchmark (Seconds, Fewer Is Better)
phoronix-ml.txt: 0.1252 (SE +/- 0.0007, N = 3)

RNNoise

Input: 26 Minute Long Talking Sample

RNNoise 0.2 (Seconds, Fewer Is Better)
phoronix-ml.txt: 7.852 (SE +/- 0.019, N = 3)
1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden

TensorFlow Lite

Model: SqueezeNet

TensorFlow Lite 2022-05-18 (Microseconds, Fewer Is Better)
phoronix-ml.txt: 1836.28 (SE +/- 17.45, N = 15)

TensorFlow Lite

Model: Inception V4

TensorFlow Lite 2022-05-18 (Microseconds, Fewer Is Better)
phoronix-ml.txt: 20372.6 (SE +/- 520.83, N = 15)

TensorFlow Lite

Model: NASNet Mobile

TensorFlow Lite 2022-05-18 (Microseconds, Fewer Is Better)
phoronix-ml.txt: 33662.5 (SE +/- 419.25, N = 15)

TensorFlow Lite

Model: Mobilenet Float

TensorFlow Lite 2022-05-18 (Microseconds, Fewer Is Better)
phoronix-ml.txt: 1381.25 (SE +/- 4.96, N = 3)

TensorFlow Lite

Model: Mobilenet Quant

TensorFlow Lite 2022-05-18 (Microseconds, Fewer Is Better)
phoronix-ml.txt: 2501.21 (SE +/- 12.96, N = 3)

TensorFlow Lite

Model: Inception ResNet V2

TensorFlow Lite 2022-05-18 (Microseconds, Fewer Is Better)
phoronix-ml.txt: 33356.3 (SE +/- 434.39, N = 3)
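
The TensorFlow Lite results above are reported in microseconds, where fewer is better. Assuming each figure is the mean latency of a single inference (the export does not state this explicitly), the sketch below converts the six reported values into an approximate inferences-per-second rate so they can be compared with the throughput-style results elsewhere in this file.

```python
# Reported TensorFlow Lite 2022-05-18 results from this file, in microseconds.
TFLITE_MICROSECONDS = {
    "SqueezeNet": 1836.28,
    "Inception V4": 20372.6,
    "NASNet Mobile": 33662.5,
    "Mobilenet Float": 1381.25,
    "Mobilenet Quant": 2501.21,
    "Inception ResNet V2": 33356.3,
}


def to_per_second(microseconds):
    """Convert a per-inference latency in microseconds to inferences per second."""
    return 1_000_000.0 / microseconds


# Derived rates (an interpretation of the data, not values taken from the export).
rates = {model: to_per_second(us) for model, us in TFLITE_MICROSECONDS.items()}

# Print fastest model first.
for model, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{model:<20} {rate:8.1f} inferences/sec")
```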

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-50

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 60.17 (SE +/- 0.03, N = 3; MIN: 49.64 / MAX: 62.8)

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-152

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 23.19 (SE +/- 0.14, N = 3; MIN: 19.02 / MAX: 24.35)

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-50

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 45.59 (SE +/- 0.20, N = 3; MIN: 38.82 / MAX: 46.87)

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-50

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 46.42 (SE +/- 0.47, N = 3; MIN: 41.66 / MAX: 47.49)

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-50

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 46.69 (SE +/- 0.10, N = 3; MIN: 42.52 / MAX: 47.36)

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-152

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 17.97 (SE +/- 0.07, N = 3; MIN: 14.56 / MAX: 18.28)

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-50

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 46.06 (SE +/- 0.23, N = 3; MIN: 38.74 / MAX: 46.92)

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-152

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 17.78 (SE +/- 0.03, N = 3; MIN: 15.02 / MAX: 18.13)

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-50

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 45.75 (SE +/- 0.29, N = 3; MIN: 41.27 / MAX: 46.82)

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-152

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 17.64 (SE +/- 0.08, N = 3; MIN: 14.41 / MAX: 18.1)

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-152

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 17.92 (SE +/- 0.06, N = 3; MIN: 14.68 / MAX: 18.36)

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-152

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 17.99 (SE +/- 0.07, N = 3; MIN: 14.67 / MAX: 18.38)

PyTorch

Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 14.18 (SE +/- 0.19, N = 3; MIN: 12.11 / MAX: 14.81)

PyTorch

Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 9.90 (SE +/- 0.09, N = 3; MIN: 8.07 / MAX: 10.35)

PyTorch

Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 9.85 (SE +/- 0.07, N = 3; MIN: 8.27 / MAX: 10.3)

PyTorch

Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 9.98 (SE +/- 0.05, N = 3; MIN: 7.98 / MAX: 10.29)

PyTorch

Device: CPU - Batch Size: 256 - Model: Efficientnet_v2_l

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 10.11 (SE +/- 0.05, N = 3; MIN: 8.31 / MAX: 10.38)

PyTorch

Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l

PyTorch 2.2.1 (batches/sec, More Is Better)
phoronix-ml.txt: 9.94 (SE +/- 0.09, N = 12; MIN: 7.81 / MAX: 10.45)
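
The PyTorch results above cover three models at six batch sizes each. As a small sketch for reading them side by side, the code below expresses every reported figure as a fraction of the same model's batch-size-1 result. It takes the reported "batches/sec" values at face value and makes no assumption about how the benchmark defines a batch; the helper name relative_to_batch_one is hypothetical.

```python
# Reported PyTorch 2.2.1 CPU results from this file, keyed by model, then batch size.
RESULTS = {
    "ResNet-50": {1: 60.17, 16: 45.59, 32: 46.42, 64: 46.69, 256: 46.06, 512: 45.75},
    "ResNet-152": {1: 23.19, 16: 17.97, 32: 17.78, 64: 17.64, 256: 17.92, 512: 17.99},
    "Efficientnet_v2_l": {1: 14.18, 16: 9.90, 32: 9.85, 64: 9.98, 256: 10.11, 512: 9.94},
}


def relative_to_batch_one(per_batch):
    """Divide each reported figure by the batch-size-1 figure of the same model."""
    base = per_batch[1]
    return {batch: value / base for batch, value in per_batch.items()}


for model, per_batch in RESULTS.items():
    rel = relative_to_batch_one(per_batch)
    cells = ", ".join(f"bs{batch}={ratio:.2f}" for batch, ratio in sorted(rel.items()))
    print(f"{model}: {cells}")
```

For all three models the reported figure drops when moving from batch size 1 to 16 and then stays roughly flat through 512.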

TensorFlow

Device: CPU - Batch Size: 1 - Model: VGG-16

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 9.70 (SE +/- 0.00, N = 3)

TensorFlow

Device: GPU - Batch Size: 1 - Model: VGG-16

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 2.20 (SE +/- 0.00, N = 3)

TensorFlow

Device: CPU - Batch Size: 1 - Model: AlexNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 30.66 (SE +/- 0.00, N = 3)

TensorFlow

Device: CPU - Batch Size: 16 - Model: VGG-16

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 27.34 (SE +/- 0.05, N = 3)

TensorFlow

Device: CPU - Batch Size: 32 - Model: VGG-16

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 28.32 (SE +/- 0.05, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: VGG-16

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 29.05 (SE +/- 0.02, N = 3)

TensorFlow

Device: GPU - Batch Size: 1 - Model: AlexNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 15.38 (SE +/- 0.14, N = 15)

TensorFlow

Device: GPU - Batch Size: 16 - Model: VGG-16

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 2.51 (SE +/- 0.00, N = 3)

TensorFlow

Device: GPU - Batch Size: 32 - Model: VGG-16

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 2.54 (SE +/- 0.00, N = 3)

TensorFlow

Device: GPU - Batch Size: 64 - Model: VGG-16

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 2.55 (SE +/- 0.00, N = 3)

TensorFlow

Device: CPU - Batch Size: 16 - Model: AlexNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 288.71 (SE +/- 0.39, N = 3)

TensorFlow

Device: CPU - Batch Size: 256 - Model: VGG-16

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 30.16 (SE +/- 0.08, N = 3)

TensorFlow

Device: CPU - Batch Size: 32 - Model: AlexNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 409.56 (SE +/- 0.28, N = 3)

TensorFlow

Device: CPU - Batch Size: 512 - Model: VGG-16

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 30.43 (SE +/- 0.01, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: AlexNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 516.18 (SE +/- 0.16, N = 3)

TensorFlow

Device: GPU - Batch Size: 16 - Model: AlexNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 42.43 (SE +/- 0.02, N = 3)

TensorFlow

Device: GPU - Batch Size: 256 - Model: VGG-16

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 2.55 (SE +/- 0.01, N = 3)

TensorFlow

Device: GPU - Batch Size: 32 - Model: AlexNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 46.14 (SE +/- 0.04, N = 3)

TensorFlow

Device: GPU - Batch Size: 512 - Model: VGG-16

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 2.55 (SE +/- 0.00, N = 3)

TensorFlow

Device: GPU - Batch Size: 64 - Model: AlexNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 47.85 (SE +/- 0.04, N = 3)

TensorFlow

Device: CPU - Batch Size: 1 - Model: GoogLeNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 60.92 (SE +/- 0.30, N = 3)

TensorFlow

Device: CPU - Batch Size: 1 - Model: ResNet-50

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 18.38 (SE +/- 0.11, N = 3)

TensorFlow

Device: CPU - Batch Size: 256 - Model: AlexNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 627.50 (SE +/- 0.20, N = 3)

TensorFlow

Device: CPU - Batch Size: 512 - Model: AlexNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 643.44 (SE +/- 1.19, N = 3)

TensorFlow

Device: GPU - Batch Size: 1 - Model: GoogLeNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 21.03 (SE +/- 0.15, N = 3)

TensorFlow

Device: GPU - Batch Size: 1 - Model: ResNet-50

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 6.69 (SE +/- 0.03, N = 3)

TensorFlow

Device: GPU - Batch Size: 256 - Model: AlexNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 48.92 (SE +/- 0.03, N = 3)

TensorFlow

Device: GPU - Batch Size: 512 - Model: AlexNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 49.28 (SE +/- 0.09, N = 3)
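
With the batch-size-512 result above, every AlexNet configuration for both the CPU and GPU device targets has been reported. The sketch below simply divides the reported CPU figure by the reported GPU figure at each batch size. It is arithmetic on the numbers in this file and does not attempt to explain why the GPU target is slower here, which the export does not say.

```python
# Reported TensorFlow 2.16.1 AlexNet results from this file (images/sec), by batch size.
ALEXNET_CPU = {1: 30.66, 16: 288.71, 32: 409.56, 64: 516.18, 256: 627.50, 512: 643.44}
ALEXNET_GPU = {1: 15.38, 16: 42.43, 32: 46.14, 64: 47.85, 256: 48.92, 512: 49.28}


def cpu_over_gpu(cpu, gpu):
    """Return {batch_size: cpu_rate / gpu_rate} for batch sizes present in both."""
    return {batch: cpu[batch] / gpu[batch] for batch in sorted(cpu) if batch in gpu}


ratios = cpu_over_gpu(ALEXNET_CPU, ALEXNET_GPU)
for batch, ratio in ratios.items():
    print(f"batch {batch:>3}: CPU is {ratio:.2f}x the GPU figure")
```

The gap widens with batch size because the CPU figures keep climbing while the GPU figures level off just under 50 images/sec.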

TensorFlow

Device: CPU - Batch Size: 16 - Model: GoogLeNet

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 198.11 (SE +/- 0.22, N = 3)

TensorFlow

Device: CPU - Batch Size: 16 - Model: ResNet-50

TensorFlow 2.16.1 (images/sec, More Is Better)
phoronix-ml.txt: 62.29 (SE +/- 0.06, N = 3)

TensorFlow

Device: CPU - Batch Size: 32 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 218.07 (SE +/- 0.23, N = 3)

TensorFlow

Device: CPU - Batch Size: 32 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 67.94 (SE +/- 0.07, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 225.16 (SE +/- 0.29, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 70.89 (SE +/- 0.33, N = 3)

TensorFlow

Device: GPU - Batch Size: 16 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 26.91 (SE +/- 0.01, N = 3)

TensorFlow

Device: GPU - Batch Size: 16 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 8.94 (SE +/- 0.01, N = 3)

TensorFlow

Device: GPU - Batch Size: 32 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 27.91 (SE +/- 0.04, N = 3)

TensorFlow

Device: GPU - Batch Size: 32 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 9.14 (SE +/- 0.01, N = 3)

TensorFlow

Device: GPU - Batch Size: 64 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 28.53 (SE +/- 0.02, N = 3)

TensorFlow

Device: GPU - Batch Size: 64 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 9.24 (SE +/- 0.00, N = 3)

TensorFlow

Device: CPU - Batch Size: 256 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 227.06 (SE +/- 0.25, N = 3)

TensorFlow

Device: CPU - Batch Size: 256 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 59.70 (SE +/- 0.91, N = 9)

TensorFlow

Device: CPU - Batch Size: 512 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 185.25 (SE +/- 1.69, N = 7)

TensorFlow

Device: CPU - Batch Size: 512 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 58.50 (SE +/- 0.33, N = 3)

TensorFlow

Device: GPU - Batch Size: 256 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 29.03 (SE +/- 0.01, N = 3)

TensorFlow

Device: GPU - Batch Size: 256 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 9.36 (SE +/- 0.01, N = 3)

TensorFlow

Device: GPU - Batch Size: 512 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 28.73 (SE +/- 0.10, N = 3)

TensorFlow

Device: GPU - Batch Size: 512 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better - phoronix-ml.txt: 9.34 (SE +/- 0.03, N = 3)
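
As a reading aid for the TensorFlow figures above, the following minimal Python sketch expresses each standard error as a percentage of its mean and computes a CPU-to-GPU throughput ratio. The four data points are copied from the batch size 512 results; the dictionary layout and the helper names `relative_se_percent` and `cpu_gpu_ratio` are illustrative assumptions, not part of the Phoronix Test Suite.

```python
# (mean images/sec, standard error) copied from the TensorFlow 2.16.1 results above.
# The keys (device, batch size, model) are an assumed layout for this sketch only.
results = {
    ("CPU", 512, "AlexNet"): (643.44, 1.19),
    ("GPU", 512, "AlexNet"): (49.28, 0.09),
    ("CPU", 512, "ResNet-50"): (58.50, 0.33),
    ("GPU", 512, "ResNet-50"): (9.34, 0.03),
}


def relative_se_percent(mean, se):
    """Standard error expressed as a percentage of the mean."""
    return 100.0 * se / mean


def cpu_gpu_ratio(table, batch, model):
    """Ratio of CPU to GPU throughput for one batch size and model."""
    cpu_mean, _ = table[("CPU", batch, model)]
    gpu_mean, _ = table[("GPU", batch, model)]
    return cpu_mean / gpu_mean


for key, (mean, se) in results.items():
    print(key, f"{relative_se_percent(mean, se):.2f}% relative SE")

for model in ("AlexNet", "ResNet-50"):
    print(f"{model}, batch 512: CPU/GPU = {cpu_gpu_ratio(results, 512, model):.2f}x")
```

The ratio only compares the two device configurations as they were run here; it says nothing about why the GPU runs are slower.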

Mobile Neural Network

Model: nasnet

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better - phoronix-ml.txt: 15.30 (SE +/- 0.02, N = 3; MIN: 14.65 / MAX: 21.59)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Mobile Neural Network

Model: mobilenetV3

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better - phoronix-ml.txt: 2.536 (SE +/- 0.008, N = 3; MIN: 2.4 / MAX: 3.47)

Mobile Neural Network

Model: squeezenetv1.1

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better - phoronix-ml.txt: 4.327 (SE +/- 0.117, N = 3; MIN: 3.96 / MAX: 6.65)

Mobile Neural Network

Model: resnet-v2-50

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better - phoronix-ml.txt: 18.53 (SE +/- 0.11, N = 3; MIN: 18.26 / MAX: 28.9)

Mobile Neural Network

Model: SqueezeNetV1.0

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better - phoronix-ml.txt: 6.429 (SE +/- 0.200, N = 3; MIN: 5.97 / MAX: 7.14)

Mobile Neural Network

Model: MobileNetV2_224

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better - phoronix-ml.txt: 3.268 (SE +/- 0.042, N = 3; MIN: 3.15 / MAX: 5.05)

Mobile Neural Network

Model: mobilenet-v1-1.0

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better - phoronix-ml.txt: 3.784 (SE +/- 0.007, N = 3; MIN: 3.71 / MAX: 6.58)

Mobile Neural Network

Model: inception-v3

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better - phoronix-ml.txt: 36.45 (SE +/- 0.04, N = 3; MIN: 36.16 / MAX: 50.97)

NCNN

Target: CPU - Model: mobilenet

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 13.85 (SE +/- 0.12, N = 15; MIN: 12.83 / MAX: 247.13)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: mobilenet-v2

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 6.30 (SE +/- 0.05, N = 15; MIN: 5.63 / MAX: 33.15)

NCNN

Target: CPU - Model: mobilenet-v3

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 6.45 (SE +/- 0.04, N = 15; MIN: 5.96 / MAX: 63.77)

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 8.15 (SE +/- 0.09, N = 15; MIN: 7.52 / MAX: 291.18)

NCNN

Target: CPU - Model: mnasnet

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 6.11 (SE +/- 0.11, N = 15; MIN: 5.26 / MAX: 321.29)

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 8.28 (SE +/- 0.09, N = 15; MIN: 7.57 / MAX: 296.15)

NCNN

Target: CPU - Model: blazeface

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 3.11 (SE +/- 0.02, N = 15; MIN: 2.85 / MAX: 11.53)

NCNN

Target: CPU - Model: googlenet

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 16.42 (SE +/- 0.15, N = 15; MIN: 15.38 / MAX: 271.34)

NCNN

Target: CPU - Model: vgg16

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 25.71 (SE +/- 0.34, N = 15; MIN: 22.56 / MAX: 344.2)

NCNN

Target: CPU - Model: resnet18

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 8.02 (SE +/- 0.07, N = 15; MIN: 7.54 / MAX: 17.96)

NCNN

Target: CPU - Model: alexnet

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 5.56 (SE +/- 0.07, N = 15; MIN: 5.07 / MAX: 35.35)

NCNN

Target: CPU - Model: resnet50

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 13.33 (SE +/- 0.13, N = 15; MIN: 11.94 / MAX: 281.3)

NCNN

Target: CPU - Model: mobilenetv2-yolov3

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 13.85 (SE +/- 0.12, N = 15; MIN: 12.83 / MAX: 247.13)

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 24.14 (SE +/- 0.11, N = 15; MIN: 21.58 / MAX: 105.12)

NCNN

Target: CPU - Model: squeezenet_ssd

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 14.34 (SE +/- 0.12, N = 15; MIN: 13.13 / MAX: 263.27)

NCNN

Target: CPU - Model: regnety_400m

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 18.58 (SE +/- 0.12, N = 15; MIN: 17.32 / MAX: 295.21)

NCNN

Target: CPU - Model: vision_transformer

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 40.59 (SE +/- 0.18, N = 15; MIN: 37.83 / MAX: 299.43)

NCNN

Target: CPU - Model: FastestDet

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 9.80 (SE +/- 0.30, N = 15; MIN: 6.73 / MAX: 273.49)

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 13.79 (SE +/- 0.10, N = 3; MIN: 13.15 / MAX: 23.26)

NCNN

Target: Vulkan GPU - Model: mobilenet-v2

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 6.31 (SE +/- 0.02, N = 3; MIN: 5.98 / MAX: 14.56)

NCNN

Target: Vulkan GPU - Model: mobilenet-v3

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 6.49 (SE +/- 0.03, N = 3; MIN: 6.21 / MAX: 15.75)

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 8.06 (SE +/- 0.05, N = 3; MIN: 7.83 / MAX: 15.08)

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 5.99 (SE +/- 0.01, N = 3; MIN: 5.68 / MAX: 14.25)

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 8.68 (SE +/- 0.37, N = 3; MIN: 7.76 / MAX: 202.36)

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 3.14 (SE +/- 0.00, N = 3; MIN: 3.01 / MAX: 8.5)

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 16.01 (SE +/- 0.04, N = 3; MIN: 15.51 / MAX: 26.58)

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 25.13 (SE +/- 0.31, N = 3; MIN: 23.1 / MAX: 139.05)

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 7.86 (SE +/- 0.08, N = 3; MIN: 7.54 / MAX: 15.18)

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 5.28 (SE +/- 0.02, N = 3; MIN: 5.1 / MAX: 15.43)

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 14.65 (SE +/- 1.16, N = 3; MIN: 12.41 / MAX: 377.47)

NCNN

Target: Vulkan GPU - Model: mobilenetv2-yolov3

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 13.79 (SE +/- 0.10, N = 3; MIN: 13.15 / MAX: 23.26)

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 23.64 (SE +/- 0.41, N = 3; MIN: 22.17 / MAX: 39.94)

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 16.04 (SE +/- 1.72, N = 3; MIN: 13.26 / MAX: 580.79)

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 18.63 (SE +/- 0.09, N = 3; MIN: 18.07 / MAX: 108.4)

NCNN

Target: Vulkan GPU - Model: vision_transformer

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 41.05 (SE +/- 0.27, N = 3; MIN: 38.99 / MAX: 101.63)

NCNN

Target: Vulkan GPU - Model: FastestDet

NCNN 20230517 - ms, Fewer Is Better - phoronix-ml.txt: 9.82 (SE +/- 0.39, N = 3; MIN: 8.74 / MAX: 19.33)

XNNPACK

Model: FP32MobileNetV1

XNNPACK b7b048 - us, Fewer Is Better - phoronix-ml.txt: 1233 (SE +/- 2.52, N = 3)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV2

XNNPACK b7b048 - us, Fewer Is Better - phoronix-ml.txt: 1873 (SE +/- 14.40, N = 3)

XNNPACK

Model: FP32MobileNetV3Large

XNNPACK b7b048 - us, Fewer Is Better - phoronix-ml.txt: 2465 (SE +/- 12.67, N = 3)

XNNPACK

Model: FP32MobileNetV3Small

XNNPACK b7b048 - us, Fewer Is Better - phoronix-ml.txt: 1503 (SE +/- 3.61, N = 3)

XNNPACK

Model: FP16MobileNetV1

XNNPACK b7b048 - us, Fewer Is Better - phoronix-ml.txt: 1144 (SE +/- 6.56, N = 3)

XNNPACK

Model: FP16MobileNetV2

XNNPACK b7b048 - us, Fewer Is Better - phoronix-ml.txt: 1495 (SE +/- 15.14, N = 3)

XNNPACK

Model: FP16MobileNetV3Large

XNNPACK b7b048 - us, Fewer Is Better - phoronix-ml.txt: 2128 (SE +/- 6.66, N = 3)

XNNPACK

Model: FP16MobileNetV3Small

XNNPACK b7b048 - us, Fewer Is Better - phoronix-ml.txt: 1464 (SE +/- 5.55, N = 3)

XNNPACK

Model: QS8MobileNetV2

XNNPACK b7b048 - us, Fewer Is Better - phoronix-ml.txt: 1398 (SE +/- 7.84, N = 3)

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 19.69 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 607.48 (SE +/- 2.89, N = 3; MIN: 575.29 / MAX: 657.06)

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 129.45 (SE +/- 3.20, N = 15)

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 93.29 (SE +/- 2.03, N = 15; MIN: 31.12 / MAX: 185.66)

OpenVINO

Model: Person Detection FP32 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 119.41 (SE +/- 1.07, N = 15)

OpenVINO

Model: Person Detection FP32 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 100.46 (SE +/- 0.82, N = 15; MIN: 32.5 / MAX: 161.81)

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 1077.87 (SE +/- 15.55, N = 15)

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 11.12 (SE +/- 0.14, N = 15; MIN: 4.52 / MAX: 34.07)

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 37.36 (SE +/- 0.02, N = 3)

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 320.42 (SE +/- 0.15, N = 3; MIN: 299.06 / MAX: 380.18)

OpenVINO

Model: Face Detection Retail FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 4273.09 (SE +/- 13.60, N = 3)

OpenVINO

Model: Face Detection Retail FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 2.76 (SE +/- 0.01, N = 3; MIN: 1.41 / MAX: 15.61)

OpenVINO

Model: Road Segmentation ADAS FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 465.86 (SE +/- 9.05, N = 15)

OpenVINO

Model: Road Segmentation ADAS FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 25.83 (SE +/- 0.43, N = 15; MIN: 10.2 / MAX: 57.07)

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 2160.57 (SE +/- 3.34, N = 3)

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 5.51 (SE +/- 0.01, N = 3; MIN: 2.97 / MAX: 19.15)

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 1947.67 (SE +/- 1.95, N = 3)

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 12.25 (SE +/- 0.01, N = 3; MIN: 6.32 / MAX: 26.6)

OpenVINO

Model: Face Detection Retail FP16-INT8 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 6458.38 (SE +/- 8.36, N = 3)

OpenVINO

Model: Face Detection Retail FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 3.58 (SE +/- 0.00, N = 3; MIN: 1.95 / MAX: 16.91)

OpenVINO

Model: Road Segmentation ADAS FP16-INT8 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 679.66 (SE +/- 2.47, N = 3)

OpenVINO

Model: Road Segmentation ADAS FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 17.62 (SE +/- 0.06, N = 3; MIN: 9.01 / MAX: 33.69)

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 187.88 (SE +/- 3.37, N = 12)

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 63.99 (SE +/- 1.08, N = 12; MIN: 29.61 / MAX: 110.09)

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 3742.77 (SE +/- 3.08, N = 3)

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 6.31 (SE +/- 0.01, N = 3; MIN: 3.31 / MAX: 21.37)

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 1930.08 (SE +/- 3.67, N = 3)

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 6.18 (SE +/- 0.01, N = 3; MIN: 3.57 / MAX: 20.46)

OpenVINO

Model: Noise Suppression Poconet-Like FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 2052.02 (SE +/- 35.59, N = 15)

OpenVINO

Model: Noise Suppression Poconet-Like FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 11.53 (SE +/- 0.18, N = 15; MIN: 5.76 / MAX: 42.76)

OpenVINO

Model: Handwritten English Recognition FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 1035.29 (SE +/- 4.12, N = 3)

OpenVINO

Model: Handwritten English Recognition FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 23.12 (SE +/- 0.09, N = 3; MIN: 14.9 / MAX: 38.92)

OpenVINO

Model: Person Re-Identification Retail FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 2523.59 (SE +/- 3.45, N = 3)

OpenVINO

Model: Person Re-Identification Retail FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 4.72 (SE +/- 0.01, N = 3; MIN: 2.72 / MAX: 17.21)

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 48433.92 (SE +/- 17.34, N = 3)

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 0.43 (SE +/- 0.00, N = 3; MIN: 0.23 / MAX: 11.96)

OpenVINO

Model: Handwritten English Recognition FP16-INT8 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better - phoronix-ml.txt: 1108.65 (SE +/- 4.28, N = 3)

OpenVINO

Model: Handwritten English Recognition FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better - phoronix-ml.txt: 21.58 (SE +/- 0.08, N = 3; MIN: 16.4 / MAX: 43.47)

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU
FPS, More Is Better
phoronix-ml.txt: 67537.77 (SE +/- 38.55, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO 2024.0 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU
ms, Fewer Is Better
phoronix-ml.txt: 0.3 (SE +/- 0.00, N = 3; MIN: 0.17 / MAX: 9.44)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: yolov4 - Device: CPU - Executor: Parallel
Inferences Per Second, More Is Better
phoronix-ml.txt: 5.39684 (SE +/- 0.07557, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: yolov4 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 185.36 (SE +/- 2.57, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: yolov4 - Device: CPU - Executor: Standard
Inferences Per Second, More Is Better
phoronix-ml.txt: 9.65442 (SE +/- 0.03544, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: yolov4 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 103.58 (SE +/- 0.38, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: ZFNet-512 - Device: CPU - Executor: Parallel
Inferences Per Second, More Is Better
phoronix-ml.txt: 57.86 (SE +/- 0.67, N = 4)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: ZFNet-512 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 17.29 (SE +/- 0.20, N = 4)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: ZFNet-512 - Device: CPU - Executor: Standard
Inferences Per Second, More Is Better
phoronix-ml.txt: 109.86 (SE +/- 1.32, N = 4)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: ZFNet-512 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 9.10376 (SE +/- 0.10702, N = 4)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: T5 Encoder - Device: CPU - Executor: Parallel
Inferences Per Second, More Is Better
phoronix-ml.txt: 272.18 (SE +/- 1.41, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: T5 Encoder - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 3.67307 (SE +/- 0.01909, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: T5 Encoder - Device: CPU - Executor: Standard
Inferences Per Second, More Is Better
phoronix-ml.txt: 172.04 (SE +/- 2.15, N = 4)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: T5 Encoder - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 5.81489 (SE +/- 0.07434, N = 4)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel
Inferences Per Second, More Is Better
phoronix-ml.txt: 203.06 (SE +/- 1.63, N = 15)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 4.92768 (SE +/- 0.04054, N = 15)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard
Inferences Per Second, More Is Better
phoronix-ml.txt: 639.91 (SE +/- 7.24, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 1.56253 (SE +/- 0.01747, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel
Inferences Per Second, More Is Better
phoronix-ml.txt: 1.20339 (SE +/- 0.02346, N = 12)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 834.57 (SE +/- 16.79, N = 12)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
Inferences Per Second, More Is Better
phoronix-ml.txt: 4.11676 (SE +/- 0.00482, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 242.91 (SE +/- 0.28, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel
Inferences Per Second, More Is Better
phoronix-ml.txt: 108.88 (SE +/- 0.36, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 9.18292 (SE +/- 0.03096, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard
Inferences Per Second, More Is Better
phoronix-ml.txt: 328.29 (SE +/- 0.68, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 3.04551 (SE +/- 0.00627, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: super-resolution-10 - Device: CPU - Executor: Parallel
Inferences Per Second, More Is Better
phoronix-ml.txt: 123.74 (SE +/- 0.93, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: super-resolution-10 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 8.08149 (SE +/- 0.06021, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: super-resolution-10 - Device: CPU - Executor: Standard
Inferences Per Second, More Is Better
phoronix-ml.txt: 97.32 (SE +/- 0.21, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: super-resolution-10 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 10.28 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel
Inferences Per Second, More Is Better
phoronix-ml.txt: 1.29718 (SE +/- 0.00341, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 770.91 (SE +/- 2.03, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard
Inferences Per Second, More Is Better
phoronix-ml.txt: 2.21086 (SE +/- 0.00518, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard
Inference Time Cost (ms), Fewer Is Better
phoronix-ml.txt: 452.31 (SE +/- 1.06, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

Scikit-Learn

Benchmark: GLM

Scikit-Learn 1.2.2 - Benchmark: GLM
Seconds, Fewer Is Better
phoronix-ml.txt: 168.33 (SE +/- 0.81, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: SAGA

Scikit-Learn 1.2.2 - Benchmark: SAGA
Seconds, Fewer Is Better
phoronix-ml.txt: 669.12 (SE +/- 3.66, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Tree

Scikit-Learn 1.2.2 - Benchmark: Tree
Seconds, Fewer Is Better
phoronix-ml.txt: 46.97 (SE +/- 0.48, N = 5)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Lasso

Scikit-Learn 1.2.2 - Benchmark: Lasso
Seconds, Fewer Is Better
phoronix-ml.txt: 308.23 (SE +/- 0.07, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Sparsify

Scikit-Learn 1.2.2 - Benchmark: Sparsify
Seconds, Fewer Is Better
phoronix-ml.txt: 108.46 (SE +/- 0.30, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Plot Ward

Scikit-Learn 1.2.2 - Benchmark: Plot Ward
Seconds, Fewer Is Better
phoronix-ml.txt: 42.10 (SE +/- 0.35, N = 8)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: MNIST Dataset

Scikit-Learn 1.2.2 - Benchmark: MNIST Dataset
Seconds, Fewer Is Better
phoronix-ml.txt: 52.74 (SE +/- 0.42, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Plot Neighbors

Scikit-Learn 1.2.2 - Benchmark: Plot Neighbors
Seconds, Fewer Is Better
phoronix-ml.txt: 114.84 (SE +/- 0.47, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: SGD Regression

Scikit-Learn 1.2.2 - Benchmark: SGD Regression
Seconds, Fewer Is Better
phoronix-ml.txt: 64.37 (SE +/- 0.08, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: SGDOneClassSVM

Scikit-Learn 1.2.2 - Benchmark: SGDOneClassSVM
Seconds, Fewer Is Better
phoronix-ml.txt: 233.41 (SE +/- 0.33, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Isolation Forest

Scikit-Learn 1.2.2 - Benchmark: Isolation Forest
Seconds, Fewer Is Better
phoronix-ml.txt: 176.29 (SE +/- 0.54, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Text Vectorizers

Scikit-Learn 1.2.2 - Benchmark: Text Vectorizers
Seconds, Fewer Is Better
phoronix-ml.txt: 45.34 (SE +/- 0.18, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Plot Hierarchical

Scikit-Learn 1.2.2 - Benchmark: Plot Hierarchical
Seconds, Fewer Is Better
phoronix-ml.txt: 141.46 (SE +/- 0.41, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Plot OMP vs. LARS

Scikit-Learn 1.2.2 - Benchmark: Plot OMP vs. LARS
Seconds, Fewer Is Better
phoronix-ml.txt: 41.48 (SE +/- 0.09, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Feature Expansions

Scikit-Learn 1.2.2 - Benchmark: Feature Expansions
Seconds, Fewer Is Better
phoronix-ml.txt: 100.35 (SE +/- 0.56, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: LocalOutlierFactor

Scikit-Learn 1.2.2 - Benchmark: LocalOutlierFactor
Seconds, Fewer Is Better
phoronix-ml.txt: 21.62 (SE +/- 0.13, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: TSNE MNIST Dataset

Scikit-Learn 1.2.2 - Benchmark: TSNE MNIST Dataset
Seconds, Fewer Is Better
phoronix-ml.txt: 247.56 (SE +/- 0.66, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Isotonic / Logistic

Scikit-Learn 1.2.2 - Benchmark: Isotonic / Logistic
Seconds, Fewer Is Better
phoronix-ml.txt: 1406.45 (SE +/- 0.82, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Plot Incremental PCA

Scikit-Learn 1.2.2 - Benchmark: Plot Incremental PCA
Seconds, Fewer Is Better
phoronix-ml.txt: 31.21 (SE +/- 0.08, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Hist Gradient Boosting

Scikit-Learn 1.2.2 - Benchmark: Hist Gradient Boosting
Seconds, Fewer Is Better
phoronix-ml.txt: 166.78 (SE +/- 0.90, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Plot Parallel Pairwise

Scikit-Learn 1.2.2 - Benchmark: Plot Parallel Pairwise
Seconds, Fewer Is Better
phoronix-ml.txt: 168.00 (SE +/- 4.47, N = 9)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Isotonic / Pathological

Scikit-Learn 1.2.2 - Benchmark: Isotonic / Pathological
Seconds, Fewer Is Better
phoronix-ml.txt: 3843.26 (SE +/- 9.06, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Sample Without Replacement

Scikit-Learn 1.2.2 - Benchmark: Sample Without Replacement
Seconds, Fewer Is Better
phoronix-ml.txt: 90.64 (SE +/- 0.76, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Covertype Dataset Benchmark

Scikit-Learn 1.2.2 - Benchmark: Covertype Dataset Benchmark
Seconds, Fewer Is Better
phoronix-ml.txt: 320.39 (SE +/- 0.61, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Hist Gradient Boosting Adult

Scikit-Learn 1.2.2 - Benchmark: Hist Gradient Boosting Adult
Seconds, Fewer Is Better
phoronix-ml.txt: 153.34 (SE +/- 1.23, N = 12)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Isotonic / Perturbed Logarithm

Scikit-Learn 1.2.2 - Benchmark: Isotonic / Perturbed Logarithm
Seconds, Fewer Is Better
phoronix-ml.txt: 1528.97 (SE +/- 2.36, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Hist Gradient Boosting Threading

Scikit-Learn 1.2.2 - Benchmark: Hist Gradient Boosting Threading
Seconds, Fewer Is Better
phoronix-ml.txt: 52.73 (SE +/- 0.61, N = 4)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Hist Gradient Boosting Higgs Boson

Scikit-Learn 1.2.2 - Benchmark: Hist Gradient Boosting Higgs Boson
Seconds, Fewer Is Better
phoronix-ml.txt: 65.74 (SE +/- 0.83, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: 20 Newsgroups / Logistic Regression

Scikit-Learn 1.2.2 - Benchmark: 20 Newsgroups / Logistic Regression
Seconds, Fewer Is Better
phoronix-ml.txt: 10.45 (SE +/- 0.06, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Plot Polynomial Kernel Approximation

Scikit-Learn 1.2.2 - Benchmark: Plot Polynomial Kernel Approximation
Seconds, Fewer Is Better
phoronix-ml.txt: 104.70 (SE +/- 0.04, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Hist Gradient Boosting Categorical Only

Scikit-Learn 1.2.2 - Benchmark: Hist Gradient Boosting Categorical Only
Seconds, Fewer Is Better
phoronix-ml.txt: 30.19 (SE +/- 0.30, N = 15)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Kernel PCA Solvers / Time vs. N Samples

Scikit-Learn 1.2.2 - Benchmark: Kernel PCA Solvers / Time vs. N Samples
Seconds, Fewer Is Better
phoronix-ml.txt: 61.61 (SE +/- 0.24, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Kernel PCA Solvers / Time vs. N Components

Scikit-Learn 1.2.2 - Benchmark: Kernel PCA Solvers / Time vs. N Components
Seconds, Fewer Is Better
phoronix-ml.txt: 31.04 (SE +/- 0.33, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Sparse Random Projections / 100 Iterations

Scikit-Learn 1.2.2 - Benchmark: Sparse Random Projections / 100 Iterations
Seconds, Fewer Is Better
phoronix-ml.txt: 504.83 (SE +/- 2.62, N = 3)
1. (F9X) gfortran options: -O0

Whisper.cpp

Model: ggml-base.en - Input: 2016 State of the Union

Whisper.cpp 1.6.2 - Model: ggml-base.en - Input: 2016 State of the Union
Seconds, Fewer Is Better
phoronix-ml.txt: 92.75 (SE +/- 0.44, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni

Whisper.cpp

Model: ggml-small.en - Input: 2016 State of the Union

Whisper.cpp 1.6.2 - Model: ggml-small.en - Input: 2016 State of the Union
Seconds, Fewer Is Better
phoronix-ml.txt: 218.19 (SE +/- 0.41, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni

Whisper.cpp

Model: ggml-medium.en - Input: 2016 State of the Union

Whisper.cpp 1.6.2 - Model: ggml-medium.en - Input: 2016 State of the Union
Seconds, Fewer Is Better
phoronix-ml.txt: 579.17 (SE +/- 1.41, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2 -mavx512f -mavx512cd -mavx512vl -mavx512dq -mavx512bw -mavx512vbmi -mavx512vnni

OpenCV

Test: DNN - Deep Neural Network

OpenCV 4.7 - Test: DNN - Deep Neural Network
ms, Fewer Is Better
phoronix-ml.txt: 33080 (SE +/- 1066.17, N = 15)
1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt


Phoronix Test Suite v10.8.5