102424machinelearningtest

Intel Core i9-12900K testing with an ASUS PRIME Z790-V AX (1802 BIOS) and ASUS NVIDIA w Dual GeForce RTX 3090 24GB w NVLink on Ubuntu 24.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2410281-NE-102424MAC72&gru.

102424machinelearningtest: ASUS NVIDIA GeForce RTX 3090

Processor: Intel Core i9-12900K @ 5.10GHz (16 Cores / 24 Threads)
Motherboard: ASUS PRIME Z790-V AX (1802 BIOS)
Chipset: Intel Raptor Lake-S PCH
Memory: 96GB
Disk: 2000GB Samsung SSD 970 EVO Plus 2TB
Graphics: ASUS NVIDIA GeForce RTX 3090 24GB
Audio: Intel Raptor Lake HD Audio
Monitor: S24F350
Network: Realtek RTL8111/8168/8211/8411 + Realtek Device b851
OS: Ubuntu 24.04
Kernel: 6.8.0-47-generic (x86_64)
Desktop: GNOME Shell 46.0
Display Server: X Server + Wayland
Display Driver: NVIDIA 560.35.03
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.6.65
Compiler: GCC 13.2.0 + CUDA 12.5
File-System: ext4
Screen Resolution: 1920x1080

- Transparent Huge Pages: madvise
- PRIMUS_libGLa=/usr/lib/nvidia-current/libGL.so.1:/usr/lib32/nvidia-current/libGL.so.1:/usr/lib/x86_64-linux-gnu/libGL.so.1:/usr/lib/i386-linux-gnu/libGL.so.1 PRIMUS_libGLd=/usr/$LIB/libGL.so.1:/usr/lib/$LIB/libGL.so.1:/usr/$LIB/mesa/libGL.so.1:/usr/lib/$LIB/mesa/libGL.so.1
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x37 - Thermald 2.5.6
- BAR1 / Visible vRAM Size: 32768 MiB - vBIOS Version: 94.02.4b.00.0b
- GPU Compute Cores: 10496
- Python 3.12.3
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Mitigation of Clear Register File + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence; BHI: BHI_DIS_S + srbds: Not affected + tsx_async_abort: Not affected

102424machinelearningtest result overview (ASUS NVIDIA GeForce RTX 3090): pytorch, openvino, shoc, tensorflow, onnx, lczero, numpy, tensorflow-lite, onednn, mnn, ncnn, opencv, deepspeech, rbenchmark, rnnoise, scikit-learn, whisper-cpp, xnnpack. Individual results follow.

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-50

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 49.49 (SE +/- 0.18, N = 3; MIN: 30.42 / MAX: 50.09)

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-152

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 18.80 (SE +/- 0.44, N = 15; MIN: 15.08 / MAX: 19.89)

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-50

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 30.07 (SE +/- 0.12, N = 3; MIN: 28.1 / MAX: 30.36)

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-50

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 30.27 (SE +/- 0.11, N = 3; MIN: 28.61 / MAX: 30.57)

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-50

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 29.52 (SE +/- 0.32, N = 15; MIN: 25.74 / MAX: 30.64)

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-152

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 11.49 (SE +/- 0.20, N = 12; MIN: 10.07 / MAX: 12.18)

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-50

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 29.33 (SE +/- 0.38, N = 15; MIN: 25.91 / MAX: 30.51)

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-152

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 10.16 (SE +/- 0.01, N = 3; MIN: 10.01 / MAX: 10.25)

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-50

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 30.14 (SE +/- 0.24, N = 3; MIN: 18.07 / MAX: 30.74)

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-152

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 11.66 (SE +/- 0.05, N = 3; MIN: 11.02 / MAX: 11.79)

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-152

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 11.45 (SE +/- 0.19, N = 12; MIN: 10.05 / MAX: 12.11)

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-152

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 11.46 (SE +/- 0.14, N = 12; MIN: 10.06 / MAX: 11.81)

PyTorch

Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 10.55 (SE +/- 0.12, N = 3; MIN: 6.85 / MAX: 13.27)

PyTorch

Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 6.97 (SE +/- 0.22, N = 9; MIN: 6.34 / MAX: 8.11)

PyTorch

Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 7.92 (SE +/- 0.05, N = 3; MIN: 7.36 / MAX: 8.02)

PyTorch

Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 7.82 (SE +/- 0.07, N = 3; MIN: 6.39 / MAX: 7.98)

PyTorch

Device: CPU - Batch Size: 256 - Model: Efficientnet_v2_l

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 6.48 (SE +/- 0.01, N = 3; MIN: 6.44 / MAX: 6.62)

PyTorch

Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 7.26 (SE +/- 0.24, N = 9; MIN: 6.42 / MAX: 8.05)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-50

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 469.71 (SE +/- 1.86, N = 3; MIN: 339.62 / MAX: 480.89)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-152

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 172.05 (SE +/- 0.36, N = 3; MIN: 143.74 / MAX: 175.36)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-50

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 414.29 (SE +/- 0.07, N = 3; MIN: 339.59 / MAX: 422.13)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-50

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 413.19 (SE +/- 0.17, N = 3; MIN: 321.33 / MAX: 421.07)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-50

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 413.29 (SE +/- 0.26, N = 3; MIN: 337.68 / MAX: 420.31)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-152

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 162.59 (SE +/- 0.56, N = 3; MIN: 114.4 / MAX: 167.22)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-50

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 413.12 (SE +/- 0.66, N = 3; MIN: 161.19 / MAX: 421.75)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-152

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 162.45 (SE +/- 0.21, N = 3; MIN: 87.19 / MAX: 165.45)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-50

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 414.51 (SE +/- 0.17, N = 3; MIN: 337.19 / MAX: 422.34)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-152

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 162.69 (SE +/- 0.16, N = 3; MIN: 143.93 / MAX: 165.52)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-152

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 162.68 (SE +/- 0.10, N = 3; MIN: 145.92 / MAX: 165.55)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-152

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 162.35 (SE +/- 0.22, N = 3; MIN: 99.28 / MAX: 165.46)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: Efficientnet_v2_l

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 88.41 (SE +/- 0.70, N = 3; MIN: 76.75 / MAX: 90.99)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: Efficientnet_v2_l

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 85.74 (SE +/- 0.47, N = 3; MIN: 75.44 / MAX: 87.56)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: Efficientnet_v2_l

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 84.09 (SE +/- 0.95, N = 3; MIN: 70.71 / MAX: 86.82)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: Efficientnet_v2_l

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 85.37 (SE +/- 0.50, N = 3; MIN: 74.95 / MAX: 88.01)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: Efficientnet_v2_l

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 85.57 (SE +/- 0.11, N = 3; MIN: 73.68 / MAX: 87.24)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: Efficientnet_v2_l

PyTorch 2.2.1 - batches/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 86.27 (SE +/- 0.12, N = 3; MIN: 76.41 / MAX: 88.03)

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 4.33 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 40.98 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Person Detection FP32 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 41.47 (SE +/- 0.20, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 296.62 (SE +/- 1.40, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 15.61 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Face Detection Retail FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 1180.46 (SE +/- 1.52, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Road Segmentation ADAS FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 86.71 (SE +/- 0.37, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 712.75 (SE +/- 0.27, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 450.44 (SE +/- 0.70, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Face Detection Retail FP16-INT8 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 1948.78 (SE +/- 3.09, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Road Segmentation ADAS FP16-INT8 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 283.34 (SE +/- 0.63, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 51.69 (SE +/- 0.18, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 1544.54 (SE +/- 0.90, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 528.13 (SE +/- 0.48, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Noise Suppression Poconet-Like FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 656.86 (SE +/- 0.48, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Handwritten English Recognition FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 225.32 (SE +/- 0.26, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Person Re-Identification Retail FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 659.72 (SE +/- 0.35, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 11244.68 (SE +/- 7.15, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Handwritten English Recognition FP16-INT8 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 266.87 (SE +/- 0.68, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO 2024.0 - FPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 29697.31 (SE +/- 14.75, N = 3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Triad

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
ASUS NVIDIA GeForce RTX 3090: 6.6057 (SE +/- 0.0011, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Reduction

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
ASUS NVIDIA GeForce RTX 3090: 406.41 (SE +/- 0.22, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Download

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
ASUS NVIDIA GeForce RTX 3090: 6.6413 (SE +/- 0.0002, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Bus Speed Readback

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
ASUS NVIDIA GeForce RTX 3090: 6.7652 (SE +/- 0.0000, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Texture Read Bandwidth

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GB/s, More Is Better
ASUS NVIDIA GeForce RTX 3090: 2178.63 (SE +/- 2.20, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: S3D

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GFLOPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 455.96 (SE +/- 0.72, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: FFT SP

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GFLOPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 2510.98 (SE +/- 35.21, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: GEMM SGEMM_N

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GFLOPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 8456.99 (SE +/- 38.88, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: Max SP Flops

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GFLOPS, More Is Better
ASUS NVIDIA GeForce RTX 3090: 41982.2 (SE +/- 208.39, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

SHOC Scalable HeterOgeneous Computing

Target: OpenCL - Benchmark: MD5 Hash

SHOC Scalable HeterOgeneous Computing 2020-04-17 - GHash/s, More Is Better
ASUS NVIDIA GeForce RTX 3090: 45.45 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

TensorFlow

Device: CPU - Batch Size: 1 - Model: VGG-16

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 4.12 (SE +/- 0.00, N = 3)

TensorFlow

Device: GPU - Batch Size: 1 - Model: VGG-16

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 1.57 (SE +/- 0.02, N = 15)

TensorFlow

Device: CPU - Batch Size: 1 - Model: AlexNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 15.83 (SE +/- 0.02, N = 3)

TensorFlow

Device: CPU - Batch Size: 16 - Model: VGG-16

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 8.97 (SE +/- 0.06, N = 3)

TensorFlow

Device: CPU - Batch Size: 32 - Model: VGG-16

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 9.22 (SE +/- 0.02, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: VGG-16

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 9.31 (SE +/- 0.01, N = 3)

TensorFlow

Device: GPU - Batch Size: 1 - Model: AlexNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 14.14 (SE +/- 0.07, N = 3)

TensorFlow

Device: GPU - Batch Size: 16 - Model: VGG-16

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 2.25 (SE +/- 0.00, N = 3)

TensorFlow

Device: GPU - Batch Size: 32 - Model: VGG-16

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 2.27 (SE +/- 0.00, N = 3)

TensorFlow

Device: GPU - Batch Size: 64 - Model: VGG-16

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 2.27 (SE +/- 0.00, N = 3)

TensorFlow

Device: CPU - Batch Size: 16 - Model: AlexNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 129.70 (SE +/- 0.01, N = 3)

TensorFlow

Device: CPU - Batch Size: 256 - Model: VGG-16

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 9.11 (SE +/- 0.01, N = 3)

TensorFlow

Device: CPU - Batch Size: 32 - Model: AlexNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 167.17 (SE +/- 0.14, N = 3)

TensorFlow

Device: CPU - Batch Size: 512 - Model: VGG-16

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 8.95 (SE +/- 0.06, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: AlexNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 192.13 (SE +/- 0.32, N = 3)

TensorFlow

Device: GPU - Batch Size: 16 - Model: AlexNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 38.93 (SE +/- 0.43, N = 3)

TensorFlow

Device: GPU - Batch Size: 256 - Model: VGG-16

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 2.28 (SE +/- 0.00, N = 3)

TensorFlow

Device: GPU - Batch Size: 32 - Model: AlexNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 41.45 (SE +/- 0.40, N = 3)

TensorFlow

Device: GPU - Batch Size: 512 - Model: VGG-16

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 2.29 (SE +/- 0.00, N = 3)

TensorFlow

Device: GPU - Batch Size: 64 - Model: AlexNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 42.49 (SE +/- 0.02, N = 3)

TensorFlow

Device: CPU - Batch Size: 1 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 47.92 (SE +/- 0.60, N = 3)

TensorFlow

Device: CPU - Batch Size: 1 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 14.04 (SE +/- 0.07, N = 3)

TensorFlow

Device: CPU - Batch Size: 256 - Model: AlexNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 227.40 (SE +/- 0.07, N = 3)

TensorFlow

Device: CPU - Batch Size: 512 - Model: AlexNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 245.67 (SE +/- 0.02, N = 3)

TensorFlow

Device: GPU - Batch Size: 1 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 12.00 (SE +/- 0.21, N = 15)

TensorFlow

Device: GPU - Batch Size: 1 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 5.62 (SE +/- 0.05, N = 3)

TensorFlow

Device: GPU - Batch Size: 256 - Model: AlexNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 44.05 (SE +/- 0.40, N = 3)

TensorFlow

Device: GPU - Batch Size: 512 - Model: AlexNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 44.64 (SE +/- 0.42, N = 3)

TensorFlow

Device: CPU - Batch Size: 16 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 101.50 (SE +/- 0.15, N = 3)

TensorFlow

Device: CPU - Batch Size: 16 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 27.80 (SE +/- 0.02, N = 3)

TensorFlow

Device: CPU - Batch Size: 32 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 96.69 (SE +/- 0.06, N = 3)

TensorFlow

Device: CPU - Batch Size: 32 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 26.82 (SE +/- 0.01, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 94.21 (SE +/- 0.26, N = 3)

TensorFlow

Device: CPU - Batch Size: 64 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 26.52 (SE +/- 0.01, N = 3)

TensorFlow

Device: GPU - Batch Size: 16 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 25.99 (SE +/- 0.06, N = 3)

TensorFlow

Device: GPU - Batch Size: 16 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 7.93 (SE +/- 0.01, N = 3)

TensorFlow

Device: GPU - Batch Size: 32 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 26.63 (SE +/- 0.04, N = 3)

TensorFlow

Device: GPU - Batch Size: 32 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 7.98 (SE +/- 0.01, N = 3)

TensorFlow

Device: GPU - Batch Size: 64 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 26.91 (SE +/- 0.01, N = 3)

TensorFlow

Device: GPU - Batch Size: 64 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 8.00 (SE +/- 0.00, N = 3)

TensorFlow

Device: CPU - Batch Size: 256 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 93.45 (SE +/- 0.03, N = 3)

TensorFlow

Device: CPU - Batch Size: 256 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 27.49 (SE +/- 0.18, N = 3)

TensorFlow

Device: CPU - Batch Size: 512 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 93.78 (SE +/- 0.03, N = 3)

TensorFlow

Device: CPU - Batch Size: 512 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 28.13 (SE +/- 0.01, N = 3)

TensorFlow

Device: GPU - Batch Size: 256 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 27.04 (SE +/- 0.05, N = 3)

TensorFlow

Device: GPU - Batch Size: 256 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 8.04 (SE +/- 0.00, N = 3)

TensorFlow

Device: GPU - Batch Size: 512 - Model: GoogLeNet

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 27.08 (SE +/- 0.04, N = 3)

TensorFlow

Device: GPU - Batch Size: 512 - Model: ResNet-50

TensorFlow 2.16.1 - images/sec, More Is Better
ASUS NVIDIA GeForce RTX 3090: 8.04 (SE +/- 0.01, N = 3)

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 10.49 (SE +/- 0.14, N = 15)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 11.15 (SE +/- 0.13, N = 4)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 79.43 (SE +/- 1.14, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 92.47 (SE +/- 0.61, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 115.17 (SE +/- 0.72, N = 15)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 152.80 (SE +/- 0.46, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 307.37 (SE +/- 0.56, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 646.11 (SE +/- 0.70, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 1.65652 (SE +/- 0.01666, N = 15)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 2.49779 (SE +/- 0.00202, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 162.89 (SE +/- 1.41, N = 15)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 313.24 (SE +/- 0.91, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 71.81 (SE +/- 0.62, N = 15)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 81.78 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 0.743436 (SE +/- 0.010713, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inferences Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 0.943867 (SE +/- 0.002603, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

LeelaChessZero

Backend: BLAS

LeelaChessZero 0.31.1 - Nodes Per Second, More Is Better
ASUS NVIDIA GeForce RTX 3090: 166 (SE +/- 1.83, N = 5)
1. (CXX) g++ options: -flto -pthread

Numpy Benchmark

Numpy Benchmark - Score, More Is Better
ASUS NVIDIA GeForce RTX 3090: 644.54 (SE +/- 0.55, N = 3)

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 95.56 (SE +/- 1.36, N = 15)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 89.76 (SE +/- 1.09, N = 4)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 12.59 (SE +/- 0.18, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ZFNet-512 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 10.81 (SE +/- 0.07, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 8.68614 (SE +/- 0.05490, N = 15)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 6.54304 (SE +/- 0.01981, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 3.25179 (SE +/- 0.00596, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1.54657 (SE +/- 0.00165, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 604.55 (SE +/- 6.22, N = 15)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 400.35 (SE +/- 0.32, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 6.14425 (SE +/- 0.05367, N = 15)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 3.19151 (SE +/- 0.00923, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 13.94 (SE +/- 0.12, N = 15)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 12.23 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1345.66 (SE +/- 19.21, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard

ONNX Runtime 1.19 - Inference Time Cost (ms), Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1059.49 (SE +/- 2.93, N = 3)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
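
The ONNX Runtime inference-time figures above are reported separately from the inferences-per-second figures earlier in this result file. As a rough cross-check, assuming each latency is the mean time per inference for the same run, 1000 / ms should land close to the reported throughput. The sketch below uses a handful of the ONNX Runtime 1.19 values copied from this page; the names `reported`, `relative_gap` and the 1% `TOLERANCE` are illustrative choices, not anything the Phoronix Test Suite defines.

```python
# (model, executor): (reported inferences per second, reported inference time in ms)
# Values are copied from the ONNX Runtime 1.19 results in this file.
reported = {
    ("yolov4", "Parallel"): (10.49, 95.56),
    ("yolov4", "Standard"): (11.15, 89.76),
    ("T5 Encoder", "Standard"): (152.80, 6.54304),
    ("CaffeNet 12-int8", "Standard"): (646.11, 1.54657),
    ("fcn-resnet101-11", "Parallel"): (1.65652, 604.55),
    ("ResNet101_DUC_HDC-12", "Standard"): (0.943867, 1059.49),
}

TOLERANCE = 0.01  # assumed threshold, chosen only for this illustration


def relative_gap(ips, ms):
    """Relative difference between reported throughput and 1000 / latency."""
    implied = 1000.0 / ms
    return abs(implied - ips) / ips


gaps = {key: relative_gap(ips, ms) for key, (ips, ms) in reported.items()}

for (model, executor), gap in gaps.items():
    status = "ok" if gap <= TOLERANCE else "check"
    print(f"{model:22s} {executor:9s} gap = {gap:.3%} [{status}]")
```

A gap well above the tolerance would suggest a mis-read value or a throughput figure that is not simply the inverse of the latency; none of the pairs listed here show that.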

TensorFlow Lite

Model: SqueezeNet

TensorFlow Lite 2022-05-18 - Microseconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1904.01 (SE +/- 15.29, N = 15)

TensorFlow Lite

Model: Inception V4

TensorFlow Lite 2022-05-18 - Microseconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 26961.0 (SE +/- 395.13, N = 15)

TensorFlow Lite

Model: NASNet Mobile

TensorFlow Lite 2022-05-18 - Microseconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 311542 (SE +/- 11554.76, N = 15)

TensorFlow Lite

Model: Mobilenet Float

TensorFlow Lite 2022-05-18 - Microseconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1383.02 (SE +/- 17.51, N = 15)

TensorFlow Lite

Model: Mobilenet Quant

TensorFlow Lite 2022-05-18 - Microseconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 2317.67 (SE +/- 20.94, N = 3)

TensorFlow Lite

Model: Inception ResNet V2

TensorFlow Lite 2022-05-18 - Microseconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 128179 (SE +/- 4137.26, N = 15)
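
The TensorFlow Lite results above differ widely in magnitude, so the raw SE values are hard to compare directly. One simple way to put them on a common footing is the relative standard error, SE divided by the mean and expressed as a percentage. The sketch below applies that to the six TensorFlow Lite figures from this page; the dictionary layout and the function name `relative_se_percent` are illustrative assumptions, not part of the Phoronix Test Suite output.

```python
# model: (mean in microseconds, standard error, number of runs N)
# Values are copied from the TensorFlow Lite 2022-05-18 results in this file.
tflite = {
    "SqueezeNet": (1904.01, 15.29, 15),
    "Inception V4": (26961.0, 395.13, 15),
    "NASNet Mobile": (311542.0, 11554.76, 15),
    "Mobilenet Float": (1383.02, 17.51, 15),
    "Mobilenet Quant": (2317.67, 20.94, 3),
    "Inception ResNet V2": (128179.0, 4137.26, 15),
}


def relative_se_percent(mean, se):
    """Standard error as a percentage of the mean."""
    return 100.0 * se / mean


rel = {model: relative_se_percent(mean, se) for model, (mean, se, _n) in tflite.items()}
noisiest = max(rel, key=rel.get)

# List the models from noisiest to most stable.
for model, pct in sorted(rel.items(), key=lambda kv: kv[1], reverse=True):
    n_runs = tflite[model][2]
    print(f"{model:20s} relative SE = {pct:5.2f}%  (N = {n_runs})")
```

Read this way, the two largest relative errors belong to NASNet Mobile and Inception ResNet V2, both of which are N = 15 results.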

oneDNN

Harness: IP Shapes 1D - Engine: CPU

oneDNN 3.6 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 2.89253 (SE +/- 0.00897, N = 3; MIN: 2.72)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: IP Shapes 3D - Engine: CPU

oneDNN 3.6 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 9.32676 (SE +/- 0.00039, N = 3; MIN: 9.1)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Engine: CPU

oneDNN 3.6 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 8.24915 (SE +/- 0.00332, N = 3; MIN: 7.76)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_1d - Engine: CPU

oneDNN 3.6 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 4.47259 (SE +/- 0.00491, N = 3; MIN: 4.02)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Engine: CPU

oneDNN 3.6 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 5.57785 (SE +/- 0.00815, N = 3; MIN: 5.37)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Recurrent Neural Network Training - Engine: CPU

oneDNN 3.6 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 2702.00 (SE +/- 1.60, N = 3; MIN: 2689.78)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Engine: CPU

oneDNN 3.6 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1452.76 (SE +/- 2.94, N = 3; MIN: 1432.32)
1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

Mobile Neural Network

Model: nasnet

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 6.956 (SE +/- 0.004, N = 3; MIN: 6.61 / MAX: 53.04)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Mobile Neural Network

Model: mobilenetV3

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1.012 (SE +/- 0.003, N = 3; MIN: 0.93 / MAX: 24.1)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Mobile Neural Network

Model: squeezenetv1.1

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1.836 (SE +/- 0.006, N = 3; MIN: 1.71 / MAX: 31.7)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Mobile Neural Network

Model: resnet-v2-50

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 14.36 (SE +/- 0.08, N = 3; MIN: 13.6 / MAX: 59.49)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Mobile Neural Network

Model: SqueezeNetV1.0

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 2.935 (SE +/- 0.049, N = 3; MIN: 2.7 / MAX: 34.14)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Mobile Neural Network

Model: MobileNetV2_224

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1.882 (SE +/- 0.011, N = 3; MIN: 1.79 / MAX: 33.11)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Mobile Neural Network

Model: mobilenet-v1-1.0

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 2.084 (SE +/- 0.015, N = 3; MIN: 1.99 / MAX: 26.25)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

Mobile Neural Network

Model: inception-v3

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 21.37 (SE +/- 0.04, N = 3; MIN: 20.43 / MAX: 62.25)
1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

NCNN

Target: CPU - Model: mobilenet

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 16.46 (SE +/- 0.47, N = 9; MIN: 9.33 / MAX: 40.48)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 5.53 (SE +/- 0.07, N = 9; MIN: 4.11 / MAX: 12.43)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 8.28 (SE +/- 0.34, N = 9; MIN: 4.71 / MAX: 21.8)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 11.05 (SE +/- 0.58, N = 9; MIN: 4.48 / MAX: 28.4)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 5.46 (SE +/- 0.05, N = 9; MIN: 4.13 / MAX: 13.57)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 19.35 (SE +/- 0.78, N = 9; MIN: 7.5 / MAX: 39.35)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: blazeface

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 5.95 (SE +/- 0.11, N = 9; MIN: 2.83 / MAX: 16.59)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: googlenet

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 19.65 (SE +/- 0.62, N = 9; MIN: 9.41 / MAX: 46.01)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vgg16

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 28.60 (SE +/- 0.01, N = 9; MIN: 27.69 / MAX: 31.26)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet18

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 7.50 (SE +/- 0.04, N = 9; MIN: 5.96 / MAX: 14.47)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: alexnet

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 5.58 (SE +/- 0.05, N = 9; MIN: 5.23 / MAX: 13.31)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: resnet50

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 22.09 (SE +/- 0.34, N = 9; MIN: 13.37 / MAX: 48.04)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 18.40 (SE +/- 0.31, N = 9; MIN: 14.15 / MAX: 42.77)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: squeezenet_ssd

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 13.62 (SE +/- 0.29, N = 9; MIN: 8.06 / MAX: 30.15)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: regnety_400m

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 82.63 (SE +/- 0.75, N = 9; MIN: 22.37 / MAX: 120.26)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: vision_transformer

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 70.73 (SE +/- 0.06, N = 9; MIN: 67.43 / MAX: 80.71)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: CPU - Model: FastestDet

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 6.47 (SE +/- 0.13, N = 9; MIN: 4.73 / MAX: 17.1)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mobilenet

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 17.34 (SE +/- 0.37, N = 9; MIN: 9.39 / MAX: 40.25)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 5.72 (SE +/- 0.13, N = 9; MIN: 4.2 / MAX: 19.44)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 7.93 (SE +/- 0.60, N = 9; MIN: 4.48 / MAX: 21.46)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: shufflenet-v2

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 10.51 (SE +/- 0.50, N = 9; MIN: 4.55 / MAX: 29.19)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: mnasnet

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 5.43 (SE +/- 0.03, N = 9; MIN: 4.16 / MAX: 9.81)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: efficientnet-b0

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 18.73 (SE +/- 0.83, N = 9; MIN: 7.31 / MAX: 37.64)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: blazeface

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 6.00 (SE +/- 0.11, N = 8; MIN: 2.81 / MAX: 17.23)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: googlenet

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 19.43 (SE +/- 0.55, N = 9; MIN: 9.05 / MAX: 43.99)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vgg16

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 28.61 (SE +/- 0.01, N = 9; MIN: 27.79 / MAX: 31.77)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet18

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 7.29 (SE +/- 0.07, N = 9; MIN: 5.96 / MAX: 11.88)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: alexnet

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 5.49 (SE +/- 0.01, N = 8; MIN: 5.21 / MAX: 6.7)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: resnet50

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 22.13 (SE +/- 0.46, N = 9; MIN: 13.15 / MAX: 49.98)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: yolov4-tiny

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 18.17 (SE +/- 0.19, N = 9; MIN: 14.35 / MAX: 48.07)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: squeezenet_ssd

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 14.25 (SE +/- 0.48, N = 9; MIN: 7.96 / MAX: 30.38)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: regnety_400m

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 84.85 (SE +/- 0.69, N = 9; MIN: 22.37 / MAX: 121)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: vision_transformer

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 70.75 (SE +/- 0.03, N = 9; MIN: 67.65 / MAX: 81.35)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN

Target: Vulkan GPU - Model: FastestDet

NCNN 20230517 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 6.56 (SE +/- 0.07, N = 9; MIN: 5.01 / MAX: 14.9)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
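
The NCNN "Target: CPU" and "Target: Vulkan GPU" results above are close to each other for most models. A minimal Python sketch of one way to quantify that, using a handful of the mean latencies reported above; the dictionary and function names are illustrative assumptions, not part of the Phoronix Test Suite, and the numbers alone do not establish why the two targets perform so similarly.

```python
# Mean latencies in ms, copied from the NCNN 20230517 results above.
cpu_ms = {
    "mobilenet": 16.46,
    "vgg16": 28.60,
    "resnet50": 22.09,
    "regnety_400m": 82.63,
    "vision_transformer": 70.73,
}
vulkan_ms = {
    "mobilenet": 17.34,
    "vgg16": 28.61,
    "resnet50": 22.13,
    "regnety_400m": 84.85,
    "vision_transformer": 70.75,
}


def relative_difference(cpu, vulkan):
    """Return (vulkan - cpu) / cpu per model; positive means Vulkan was slower."""
    return {model: (vulkan[model] - cpu[model]) / cpu[model] for model in cpu}


diffs = relative_difference(cpu_ms, vulkan_ms)
for model, diff in sorted(diffs.items()):
    print(f"{model:20s} {diff:+.1%}")
```

Any difference smaller than the reported SE of either run is best treated as noise rather than as a real gap between the two targets.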

OpenVINO

Model: Face Detection FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1375.14 (SE +/- 6.34, N = 3; MIN: 1104.54 / MAX: 1761.35)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Person Detection FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 146.25 (SE +/- 0.46, N = 3; MIN: 120.28 / MAX: 197.97)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Person Detection FP32 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 144.47 (SE +/- 0.71, N = 3; MIN: 72.51 / MAX: 211.86)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Vehicle Detection FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 20.16 (SE +/- 0.10, N = 3; MIN: 7.21 / MAX: 40.97)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Face Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 382.74 (SE +/- 0.21, N = 3; MIN: 223.52 / MAX: 881.7)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Face Detection Retail FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 5.02 (SE +/- 0.01, N = 3; MIN: 2.23 / MAX: 27.19)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Road Segmentation ADAS FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 69.09 (SE +/- 0.30, N = 3; MIN: 27.97 / MAX: 92.16)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Vehicle Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 8.35 (SE +/- 0.00, N = 3; MIN: 4.12 / MAX: 34.48)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Weld Porosity Detection FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 44.26 (SE +/- 0.07, N = 3; MIN: 25.84 / MAX: 77.3)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Face Detection Retail FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 2.97 (SE +/- 0.00, N = 3; MIN: 1.4 / MAX: 22.5)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Road Segmentation ADAS FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 21.09 (SE +/- 0.05, N = 3; MIN: 9.8 / MAX: 54.63)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Machine Translation EN To DE FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 115.87 (SE +/- 0.40, N = 3; MIN: 60.69 / MAX: 196.8)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Weld Porosity Detection FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 12.81 (SE +/- 0.00, N = 3; MIN: 7.26 / MAX: 52.97)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Person Vehicle Bike Detection FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 11.29 (SE +/- 0.01, N = 3; MIN: 5.5 / MAX: 26.39)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Noise Suppression Poconet-Like FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 9.02 (SE +/- 0.01, N = 3; MIN: 6.21 / MAX: 25.89)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Handwritten English Recognition FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 88.65 (SE +/- 0.10, N = 3; MIN: 56.64 / MAX: 150.06)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Person Re-Identification Retail FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 9.01 (SE +/- 0.01, N = 3; MIN: 4.38 / MAX: 37.43)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1.66 (SE +/- 0.00, N = 3; MIN: 0.78 / MAX: 17.39)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Handwritten English Recognition FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 74.85 (SE +/- 0.19, N = 3; MIN: 56.4 / MAX: 150.26)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

OpenVINO

Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

OpenVINO 2024.0 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 0.62 (SE +/- 0.00, N = 3; MIN: 0.31 / MAX: 13.6)
1. (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
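
Several OpenVINO models above are reported in both FP16 and FP16-INT8 form. A rough Python sketch of how the latency ratio between the two precisions could be tabulated from the means listed above; the names `latency_ms` and `int8_speedup` are hypothetical, and the ratio covers latency on this one system only, not accuracy or throughput.

```python
# (FP16 ms, FP16-INT8 ms) mean latencies from the OpenVINO 2024.0 CPU results above.
latency_ms = {
    "Face Detection": (1375.14, 382.74),
    "Vehicle Detection": (20.16, 8.35),
    "Face Detection Retail": (5.02, 2.97),
    "Road Segmentation ADAS": (69.09, 21.09),
    "Weld Porosity Detection": (44.26, 12.81),
    "Handwritten English Recognition": (88.65, 74.85),
    "Age Gender Recognition Retail 0013": (1.66, 0.62),
}


def int8_speedup(pairs):
    """Return FP16 latency divided by FP16-INT8 latency for each model."""
    return {name: fp16 / int8 for name, (fp16, int8) in pairs.items()}


speedups = int8_speedup(latency_ms)
# Print models from the largest to the smallest latency ratio.
for name, factor in sorted(speedups.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:36s} {factor:.2f}x")
```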

OpenCV

Test: DNN - Deep Neural Network

OpenCV 4.7 - ms, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 25552 (SE +/- 635.50, N = 13)
1. (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

DeepSpeech

Acceleration: CPU

DeepSpeech 0.6 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 54.63 (SE +/- 0.08, N = 3)

R Benchmark

R Benchmark - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 0.0970 (SE +/- 0.0004, N = 3)

RNNoise

Input: 26 Minute Long Talking Sample

RNNoise 0.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 7.245 (SE +/- 0.014, N = 3)
1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden

Scikit-Learn

Benchmark: GLM

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 183.48 (SE +/- 0.31, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: SAGA

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 521.31 (SE +/- 0.32, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Tree

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 42.77 (SE +/- 0.38, N = 15)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Lasso

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 258.72 (SE +/- 1.16, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Sparsify

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 85.98 (SE +/- 0.25, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Plot Ward

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 43.62 (SE +/- 0.60, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: MNIST Dataset

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 53.54 (SE +/- 0.03, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Plot Neighbors

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 96.76 (SE +/- 0.18, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: SGD Regression

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 63.79 (SE +/- 0.12, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: SGDOneClassSVM

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 196.47 (SE +/- 1.90, N = 6)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Isolation Forest

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 147.52 (SE +/- 0.04, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Text Vectorizers

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 38.91 (SE +/- 0.03, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Plot Hierarchical

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 153.06 (SE +/- 0.14, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Plot OMP vs. LARS

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 45.62 (SE +/- 0.29, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Feature Expansions

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 120.33 (SE +/- 0.09, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: LocalOutlierFactor

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 35.65 (SE +/- 0.32, N = 15)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: TSNE MNIST Dataset

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 188.33 (SE +/- 0.44, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Isotonic / Logistic

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1447.69 (SE +/- 0.80, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Plot Incremental PCA

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 21.31 (SE +/- 0.18, N = 15)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Hist Gradient Boosting

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1120.04 (SE +/- 3.91, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Plot Parallel Pairwise

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 332.19 (SE +/- 1.13, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Sample Without Replacement

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 84.27 (SE +/- 0.15, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Covertype Dataset Benchmark

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 317.46 (SE +/- 0.13, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Hist Gradient Boosting Adult

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1173.89 (SE +/- 4.93, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Isotonic / Perturbed Logarithm

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1469.79 (SE +/- 0.23, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Hist Gradient Boosting Threading

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 140.75 (SE +/- 1.02, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Hist Gradient Boosting Higgs Boson

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 190.58 (SE +/- 2.11, N = 5)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: 20 Newsgroups / Logistic Regression

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 9.614 (SE +/- 0.072, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Plot Polynomial Kernel Approximation

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 133.12 (SE +/- 0.20, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Hist Gradient Boosting Categorical Only

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 195.30 (SE +/- 0.28, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Kernel PCA Solvers / Time vs. N Samples

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 69.63 (SE +/- 0.35, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Kernel PCA Solvers / Time vs. N Components

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 25.59 (SE +/- 0.23, N = 3)
1. (F9X) gfortran options: -O0

Scikit-Learn

Benchmark: Sparse Random Projections / 100 Iterations

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 587.84 (SE +/- 1.77, N = 3)
1. (F9X) gfortran options: -O0

Whisper.cpp

Model: ggml-base.en - Input: 2016 State of the Union

Whisper.cpp 1.6.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 209.02 (SE +/- 0.74, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2

Whisper.cpp

Model: ggml-small.en - Input: 2016 State of the Union

Whisper.cpp 1.6.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 638.01 (SE +/- 0.42, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2

Whisper.cpp

Model: ggml-medium.en - Input: 2016 State of the Union

Whisper.cpp 1.6.2 - Seconds, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 2000.10 (SE +/- 2.12, N = 3)
1. (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2

XNNPACK

Model: FP32MobileNetV1

XNNPACK b7b048 - us, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1394 (SE +/- 27.87, N = 12)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV2

XNNPACK b7b048 - us, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1125 (SE +/- 17.99, N = 12)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV3Large

XNNPACK b7b048 - us, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1585 (SE +/- 66.95, N = 12)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV3Small

XNNPACK b7b048 - us, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 820 (SE +/- 29.73, N = 12)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV1

XNNPACK b7b048 - us, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 2197 (SE +/- 74.47, N = 12)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV2

XNNPACK b7b048 - us, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1646 (SE +/- 50.43, N = 12)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV3Large

XNNPACK b7b048 - us, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 1834 (SE +/- 29.31, N = 12)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV3Small

XNNPACK b7b048 - us, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 928 (SE +/- 16.32, N = 12)
1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: QS8MobileNetV2

XNNPACK b7b048 - us, Fewer Is Better
ASUS NVIDIA GeForce RTX 3090: 904 (SE +/- 35.09, N = 12)
1. (CXX) g++ options: -O3 -lrt -lm
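
Each result above carries an "SE +/-" figure and a run count N. A small Python sketch of how those two numbers could be turned into a relative error and an implied run-to-run standard deviation, assuming the SE figure is the standard error of the mean (sample standard deviation divided by the square root of N); that interpretation and the helper name `spread` are assumptions, not something stated in this export.

```python
import math

# (mean in us, SE, N) copied from a few of the XNNPACK b7b048 results above.
results = {
    "FP32MobileNetV1": (1394, 27.87, 12),
    "FP32MobileNetV3Large": (1585, 66.95, 12),
    "FP16MobileNetV1": (2197, 74.47, 12),
    "QS8MobileNetV2": (904, 35.09, 12),
}


def spread(mean, se, n):
    """Return (relative SE, implied sample standard deviation).

    Assumes se is the standard error of the mean, i.e. stddev / sqrt(n).
    """
    return se / mean, se * math.sqrt(n)


summary = {name: spread(*values) for name, values in results.items()}
for name, (rel_se, stddev) in summary.items():
    print(f"{name:22s} relative SE {rel_se:.1%}, implied stddev ~{stddev:.0f} us")
```

Results with a relative SE of several percent, as in the XNNPACK runs here, leave little room for reading small differences between models or configurations as significant.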


Phoronix Test Suite v10.8.5