102424machinelearningtest

Intel Core i9-12900K testing with an ASUS PRIME Z790-V AX (1802 BIOS) and dual ASUS NVIDIA GeForce RTX 3090 24GB with NVLink on Ubuntu 24.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2410281-NE-102424MAC72
Result Identifier: ASUS NVIDIA GeForce RTX 3090
Date: October 25
Run Test Duration: 4 Days, 17 Hours, 7 Minutes


102424machinelearningtest - OpenBenchmarking.org - Phoronix Test Suite

System Details:
  Processor: Intel Core i9-12900K @ 5.10GHz (16 Cores / 24 Threads)
  Motherboard: ASUS PRIME Z790-V AX (1802 BIOS)
  Chipset: Intel Raptor Lake-S PCH
  Memory: 96GB
  Disk: 2000GB Samsung SSD 970 EVO Plus 2TB
  Graphics: ASUS NVIDIA GeForce RTX 3090 24GB
  Audio: Intel Raptor Lake HD Audio
  Monitor: S24F350
  Network: Realtek RTL8111/8168/8211/8411 + Realtek Device b851
  OS: Ubuntu 24.04
  Kernel: 6.8.0-47-generic (x86_64)
  Desktop: GNOME Shell 46.0
  Display Server: X Server + Wayland
  Display Driver: NVIDIA 560.35.03
  OpenGL: 4.6.0
  OpenCL: OpenCL 3.0 CUDA 12.6.65
  Compiler: GCC 13.2.0 + CUDA 12.5
  File-System: ext4
  Screen Resolution: 1920x1080

System Logs:
  - Transparent Huge Pages: madvise
  - PRIMUS_libGLa=/usr/lib/nvidia-current/libGL.so.1:/usr/lib32/nvidia-current/libGL.so.1:/usr/lib/x86_64-linux-gnu/libGL.so.1:/usr/lib/i386-linux-gnu/libGL.so.1 PRIMUS_libGLd=/usr/$LIB/libGL.so.1:/usr/lib/$LIB/libGL.so.1:/usr/$LIB/mesa/libGL.so.1:/usr/lib/$LIB/mesa/libGL.so.1
  - GCC configured with: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
  - Scaling Governor: intel_pstate powersave (EPP: balance_performance)
  - CPU Microcode: 0x37
  - Thermald 2.5.6
  - BAR1 / Visible vRAM Size: 32768 MiB
  - vBIOS Version: 94.02.4b.00.0b
  - GPU Compute Cores: 10496
  - Python 3.12.3
  - Security mitigations:
      gather_data_sampling: Not affected
      itlb_multihit: Not affected
      l1tf: Not affected
      mds: Not affected
      meltdown: Not affected
      mmio_stale_data: Not affected
      reg_file_data_sampling: Mitigation of Clear Register File
      retbleed: Not affected
      spec_rstack_overflow: Not affected
      spec_store_bypass: Mitigation of SSB disabled via prctl
      spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
      spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence; BHI: BHI_DIS_S
      srbds: Not affected
      tsx_async_abort: Not affected

Result overview table: per-test values for the Whisper.cpp, Scikit-Learn, ONNX Runtime, OpenVINO, XNNPACK, NCNN, Mobile Neural Network, TensorFlow, TensorFlow Lite, PyTorch, oneDNN, NumPy, RNNoise, R Benchmark, DeepSpeech, LCZero, SHOC, and OpenCV benchmarks on the ASUS NVIDIA GeForce RTX 3090 configuration; the individual results are shown below.

Whisper.cpp

Whisper.cpp 1.6.2 - Seconds, Fewer Is Better - ASUS NVIDIA GeForce RTX 3090
  Model: ggml-medium.en - Input: 2016 State of the Union: 2000.10 (SE +/- 2.12, N = 3)
  Model: ggml-small.en - Input: 2016 State of the Union: 638.01 (SE +/- 0.42, N = 3)
  Model: ggml-base.en - Input: 2016 State of the Union: 209.02 (SE +/- 0.74, N = 3)
  (CXX) g++ options: -O3 -std=c++11 -fPIC -pthread -msse3 -mssse3 -mavx -mf16c -mfma -mavx2
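For context on what this workload measures, below is a minimal sketch of timing a comparable transcription from Python. It uses the openai-whisper package rather than whisper.cpp itself, and the model name and audio path are placeholders, so its absolute times will not match the whisper.cpp figures above.

```python
# Hypothetical sketch: time one long transcription in wall-clock seconds
# (fewer is better), analogous to the whisper.cpp benchmark above.
# Uses the Python "openai-whisper" package, NOT whisper.cpp; the audio
# path is a placeholder.
import time
import whisper  # pip install openai-whisper

model = whisper.load_model("base.en")  # roughly analogous to ggml-base.en
start = time.perf_counter()
result = model.transcribe("state_of_the_union_2016.wav")  # placeholder path
elapsed = time.perf_counter() - start

print(f"Transcribed {len(result['text'])} characters in {elapsed:.2f} s")
```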

Scikit-Learn

Scikit-learn is a BSD-licensed Python module for machine learning built on top of NumPy and SciPy. Learn more via the OpenBenchmarking.org test page.
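As an illustration of the kind of workload these timings represent, here is a minimal sketch of timing one of the cases listed below (20 Newsgroups / Logistic Regression) directly with scikit-learn. The solver and iteration settings are assumptions rather than the exact parameters of the scikit-learn benchmark suite, so it will not reproduce the 9.614-second figure exactly.

```python
# Rough sketch of a "20 Newsgroups / Logistic Regression" style workload:
# fit a logistic regression on the vectorized 20 Newsgroups dataset and
# report wall-clock seconds (fewer is better). Hyperparameters are assumed.
import time
from sklearn.datasets import fetch_20newsgroups_vectorized
from sklearn.linear_model import LogisticRegression

X, y = fetch_20newsgroups_vectorized(subset="train", return_X_y=True)

start = time.perf_counter()
clf = LogisticRegression(max_iter=1000)  # assumed settings
clf.fit(X, y)
elapsed = time.perf_counter() - start

print(f"Fit on {X.shape[0]} documents in {elapsed:.2f} s")
```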

Scikit-Learn 1.2.2 - Seconds, Fewer Is Better - ASUS NVIDIA GeForce RTX 3090
  Benchmark: Sparse Random Projections / 100 Iterations: 587.84 (SE +/- 1.77, N = 3)
  Benchmark: Kernel PCA Solvers / Time vs. N Components: 25.59 (SE +/- 0.23, N = 3)
  Benchmark: Kernel PCA Solvers / Time vs. N Samples: 69.63 (SE +/- 0.35, N = 3)
  Benchmark: Hist Gradient Boosting Categorical Only: 195.30 (SE +/- 0.28, N = 3)
  Benchmark: Plot Polynomial Kernel Approximation: 133.12 (SE +/- 0.20, N = 3)
  Benchmark: 20 Newsgroups / Logistic Regression: 9.614 (SE +/- 0.072, N = 3)
  Benchmark: Hist Gradient Boosting Higgs Boson: 190.58 (SE +/- 2.11, N = 5)
  Benchmark: Hist Gradient Boosting Threading: 140.75 (SE +/- 1.02, N = 3)
  Benchmark: Isotonic / Perturbed Logarithm: 1469.79 (SE +/- 0.23, N = 3)
  Benchmark: Hist Gradient Boosting Adult: 1173.89 (SE +/- 4.93, N = 3)
  Benchmark: Covertype Dataset Benchmark: 317.46 (SE +/- 0.13, N = 3)
  Benchmark: Sample Without Replacement: 84.27 (SE +/- 0.15, N = 3)
  Benchmark: Plot Parallel Pairwise: 332.19 (SE +/- 1.13, N = 3)
  Benchmark: Hist Gradient Boosting: 1120.04 (SE +/- 3.91, N = 3)
  Benchmark: Plot Incremental PCA: 21.31 (SE +/- 0.18, N = 15)
  Benchmark: Isotonic / Logistic: 1447.69 (SE +/- 0.80, N = 3)
  Benchmark: TSNE MNIST Dataset: 188.33 (SE +/- 0.44, N = 3)
  Benchmark: LocalOutlierFactor: 35.65 (SE +/- 0.32, N = 15)
  Benchmark: Feature Expansions: 120.33 (SE +/- 0.09, N = 3)
  Benchmark: Plot OMP vs. LARS: 45.62 (SE +/- 0.29, N = 3)
  Benchmark: Plot Hierarchical: 153.06 (SE +/- 0.14, N = 3)
  Benchmark: Text Vectorizers: 38.91 (SE +/- 0.03, N = 3)
  Benchmark: Isolation Forest: 147.52 (SE +/- 0.04, N = 3)
  Benchmark: SGDOneClassSVM: 196.47 (SE +/- 1.90, N = 6)
  Benchmark: SGD Regression: 63.79 (SE +/- 0.12, N = 3)
  Benchmark: Plot Neighbors: 96.76 (SE +/- 0.18, N = 3)
  Benchmark: MNIST Dataset: 53.54 (SE +/- 0.03, N = 3)
  Benchmark: Plot Ward: 43.62 (SE +/- 0.60, N = 3)
  Benchmark: Sparsify: 85.98 (SE +/- 0.25, N = 3)
  Benchmark: Lasso: 258.72 (SE +/- 1.16, N = 3)
  Benchmark: Tree: 42.77 (SE +/- 0.38, N = 15)
  Benchmark: SAGA: 521.31 (SE +/- 0.32, N = 3)
  Benchmark: GLM: 183.48 (SE +/- 0.31, N = 3)
  (F9X) gfortran options: -O0

ONNX Runtime

ONNX Runtime 1.19 - Inferences Per Second, More Is Better - ASUS NVIDIA GeForce RTX 3090
  Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Standard: 0.943867 (SE +/- 0.002603, N = 3)
  Model: ResNet101_DUC_HDC-12 - Device: CPU - Executor: Parallel: 0.743436 (SE +/- 0.010713, N = 3)
  Model: super-resolution-10 - Device: CPU - Executor: Standard: 81.78 (SE +/- 0.01, N = 3)
  Model: super-resolution-10 - Device: CPU - Executor: Parallel: 71.81 (SE +/- 0.62, N = 15)
  Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard: 313.24 (SE +/- 0.91, N = 3)
  Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Parallel: 162.89 (SE +/- 1.41, N = 15)
  Model: fcn-resnet101-11 - Device: CPU - Executor: Standard: 2.49779 (SE +/- 0.00202, N = 3)
  Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel: 1.65652 (SE +/- 0.01666, N = 15)
  Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard: 646.11 (SE +/- 0.70, N = 3)
  Model: CaffeNet 12-int8 - Device: CPU - Executor: Parallel: 307.37 (SE +/- 0.56, N = 3)
  Model: T5 Encoder - Device: CPU - Executor: Standard: 152.80 (SE +/- 0.46, N = 3)
  Model: T5 Encoder - Device: CPU - Executor: Parallel: 115.17 (SE +/- 0.72, N = 15)
  Model: ZFNet-512 - Device: CPU - Executor: Standard: 92.47 (SE +/- 0.61, N = 3)
  Model: ZFNet-512 - Device: CPU - Executor: Parallel: 79.43 (SE +/- 1.14, N = 3)
  Model: yolov4 - Device: CPU - Executor: Standard: 11.15 (SE +/- 0.13, N = 4)
  Model: yolov4 - Device: CPU - Executor: Parallel: 10.49 (SE +/- 0.14, N = 15)
  (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt
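The inferences-per-second figures above come from the ONNX Runtime test profile's own harness; a minimal sketch of measuring the same metric by hand with the onnxruntime Python API follows. The model path, input name, and input shape are placeholders, not the actual models used above.

```python
# Sketch: measure single-stream inferences/sec (more is better) with
# ONNX Runtime on CPU. Model path and input shape are placeholders.
import time
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
inp = sess.get_inputs()[0]
data = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed shape

for _ in range(5):                       # warm-up runs
    sess.run(None, {inp.name: data})

runs = 100
start = time.perf_counter()
for _ in range(runs):
    sess.run(None, {inp.name: data})
elapsed = time.perf_counter() - start

print(f"{runs / elapsed:.2f} inferences/sec")
```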

OpenVINO

This is a test of Intel OpenVINO, a toolkit for neural network inference, using its built-in benchmarking support to analyze throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.
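The built-in benchmarking reports both per-request latency (ms) and throughput (FPS), as in the results below. For orientation, here is a minimal sketch of collecting those two numbers through the OpenVINO Python API; the model path and input shape are placeholders, and the official results come from OpenVINO's own benchmark harness rather than this script.

```python
# Sketch: measure latency (ms, fewer is better) and throughput (FPS, more is
# better) for an OpenVINO IR model on CPU. Model path and shape are placeholders.
import time
import numpy as np
import openvino as ov

core = ov.Core()
compiled = core.compile_model("model.xml", "CPU")          # placeholder IR model
data = np.random.rand(1, 3, 224, 224).astype(np.float32)   # assumed input shape

latencies = []
for _ in range(100):
    start = time.perf_counter()
    compiled([data])                                        # synchronous inference
    latencies.append(time.perf_counter() - start)

avg = sum(latencies) / len(latencies)
print(f"latency: {avg * 1000:.2f} ms, throughput: {1 / avg:.2f} FPS")
```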

OpenVINO 2024.0 - Device: CPU - ASUS NVIDIA GeForce RTX 3090 (latency in ms, fewer is better; throughput in FPS, more is better; N = 3 for all results)
  Model: Age Gender Recognition Retail 0013 FP16-INT8: 0.62 ms (SE +/- 0.00, MIN 0.31 / MAX 13.6); 29697.31 FPS (SE +/- 14.75)
  Model: Handwritten English Recognition FP16-INT8: 74.85 ms (SE +/- 0.19, MIN 56.4 / MAX 150.26); 266.87 FPS (SE +/- 0.68)
  Model: Age Gender Recognition Retail 0013 FP16: 1.66 ms (SE +/- 0.00, MIN 0.78 / MAX 17.39); 11244.68 FPS (SE +/- 7.15)
  Model: Person Re-Identification Retail FP16: 9.01 ms (SE +/- 0.01, MIN 4.38 / MAX 37.43); 659.72 FPS (SE +/- 0.35)
  Model: Handwritten English Recognition FP16: 88.65 ms (SE +/- 0.10, MIN 56.64 / MAX 150.06); 225.32 FPS (SE +/- 0.26)
  Model: Noise Suppression Poconet-Like FP16: 9.02 ms (SE +/- 0.01, MIN 6.21 / MAX 25.89); 656.86 FPS (SE +/- 0.48)
  Model: Person Vehicle Bike Detection FP16: 11.29 ms (SE +/- 0.01, MIN 5.5 / MAX 26.39); 528.13 FPS (SE +/- 0.48)
  Model: Weld Porosity Detection FP16-INT8: 12.81 ms (SE +/- 0.00, MIN 7.26 / MAX 52.97); 1544.54 FPS (SE +/- 0.90)
  Model: Machine Translation EN To DE FP16: 115.87 ms (SE +/- 0.40, MIN 60.69 / MAX 196.8); 51.69 FPS (SE +/- 0.18)
  Model: Road Segmentation ADAS FP16-INT8: 21.09 ms (SE +/- 0.05, MIN 9.8 / MAX 54.63); 283.34 FPS (SE +/- 0.63)
  Model: Face Detection Retail FP16-INT8: 2.97 ms (SE +/- 0.00, MIN 1.4 / MAX 22.5); 1948.78 FPS (SE +/- 3.09)
  Model: Weld Porosity Detection FP16: 44.26 ms (SE +/- 0.07, MIN 25.84 / MAX 77.3); 450.44 FPS (SE +/- 0.70)
  Model: Vehicle Detection FP16-INT8: 8.35 ms (SE +/- 0.00, MIN 4.12 / MAX 34.48); 712.75 FPS (SE +/- 0.27)
  Model: Road Segmentation ADAS FP16: 69.09 ms (SE +/- 0.30, MIN 27.97 / MAX 92.16); 86.71 FPS (SE +/- 0.37)
  Model: Face Detection Retail FP16: 5.02 ms (SE +/- 0.01, MIN 2.23 / MAX 27.19); 1180.46 FPS (SE +/- 1.52)
  Model: Face Detection FP16-INT8: 382.74 ms (SE +/- 0.21, MIN 223.52 / MAX 881.7); 15.61 FPS (SE +/- 0.05)
  Model: Vehicle Detection FP16: 20.16 ms (SE +/- 0.10, MIN 7.21 / MAX 40.97); 296.62 FPS (SE +/- 1.40)
  Model: Person Detection FP32: 144.47 ms (SE +/- 0.71, MIN 72.51 / MAX 211.86); 41.47 FPS (SE +/- 0.20)
  Model: Person Detection FP16: 146.25 ms (SE +/- 0.46, MIN 120.28 / MAX 197.97); 40.98 FPS (SE +/- 0.13)
  Model: Face Detection FP16: 1375.14 ms (SE +/- 6.34, MIN 1104.54 / MAX 1761.35); 4.33 FPS (SE +/- 0.03)
  (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl

XNNPACK

XNNPACK b7b048 - us, Fewer Is Better - ASUS NVIDIA GeForce RTX 3090
  Model: FP16MobileNetV3Large: 1834 (SE +/- 29.31, N = 12)
  Model: FP32MobileNetV2: 1125 (SE +/- 17.99, N = 12)
  (CXX) g++ options: -O3 -lrt -lm

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent, optimized for mobile and other platforms. Learn more via the OpenBenchmarking.org test page.
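For reference, a minimal sketch of measuring per-inference latency with the ncnn Python binding is shown below. The model files and blob names are placeholders, and the array-to-Mat handling is an assumption about the binding; the results below come from ncnn's own benchmark harness, not this script.

```python
# Sketch: time per-inference latency in ms (fewer is better) with the ncnn
# Python binding. Param/bin files and blob names are placeholders; Mat
# construction from a NumPy array is assumed to be supported by the binding.
import time
import numpy as np
import ncnn  # pip install ncnn

net = ncnn.Net()
net.opt.use_vulkan_compute = False           # True for the "Vulkan GPU" target
net.load_param("mobilenet_v2.param")         # placeholder model files
net.load_model("mobilenet_v2.bin")

img = np.random.rand(224, 224, 3).astype(np.float32)
mat = ncnn.Mat(img)                          # assumed NumPy -> Mat conversion

times = []
for _ in range(50):
    ex = net.create_extractor()
    start = time.perf_counter()
    ex.input("data", mat)                    # placeholder input blob name
    _, out = ex.extract("output")            # placeholder output blob name
    times.append(time.perf_counter() - start)

print(f"mean latency: {1000 * sum(times) / len(times):.2f} ms")
```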

NCNN 20230517 - ms, Fewer Is Better - ASUS NVIDIA GeForce RTX 3090
  Target: Vulkan GPU - Model: FastestDet: 6.56 (SE +/- 0.07, N = 9, MIN 5.01 / MAX 14.9)
  Target: Vulkan GPU - Model: vision_transformer: 70.75 (SE +/- 0.03, N = 9, MIN 67.65 / MAX 81.35)
  Target: Vulkan GPU - Model: regnety_400m: 84.85 (SE +/- 0.69, N = 9, MIN 22.37 / MAX 121)
  Target: Vulkan GPU - Model: yolov4-tiny: 18.17 (SE +/- 0.19, N = 9, MIN 14.35 / MAX 48.07)
  Target: Vulkan GPU - Model: alexnet: 5.49 (SE +/- 0.01, N = 8, MIN 5.21 / MAX 6.7)
  Target: Vulkan GPU - Model: resnet18: 7.29 (SE +/- 0.07, N = 9, MIN 5.96 / MAX 11.88)
  Target: Vulkan GPU - Model: vgg16: 28.61 (SE +/- 0.01, N = 9, MIN 27.79 / MAX 31.77)
  Target: Vulkan GPU - Model: blazeface: 6.00 (SE +/- 0.11, N = 8, MIN 2.81 / MAX 17.23)
  Target: Vulkan GPU - Model: mnasnet: 5.43 (SE +/- 0.03, N = 9, MIN 4.16 / MAX 9.81)
  Target: CPU - Model: FastestDet: 6.47 (SE +/- 0.13, N = 9, MIN 4.73 / MAX 17.1)
  Target: CPU - Model: vision_transformer: 70.73 (SE +/- 0.06, N = 9, MIN 67.43 / MAX 80.71)
  Target: CPU - Model: regnety_400m: 82.63 (SE +/- 0.75, N = 9, MIN 22.37 / MAX 120.26)
  Target: CPU - Model: yolov4-tiny: 18.40 (SE +/- 0.31, N = 9, MIN 14.15 / MAX 42.77)
  Target: CPU - Model: resnet50: 22.09 (SE +/- 0.34, N = 9, MIN 13.37 / MAX 48.04)
  Target: CPU - Model: alexnet: 5.58 (SE +/- 0.05, N = 9, MIN 5.23 / MAX 13.31)
  Target: CPU - Model: resnet18: 7.50 (SE +/- 0.04, N = 9, MIN 5.96 / MAX 14.47)
  Target: CPU - Model: vgg16: 28.60 (SE +/- 0.01, N = 9, MIN 27.69 / MAX 31.26)
  Target: CPU - Model: blazeface: 5.95 (SE +/- 0.11, N = 9, MIN 2.83 / MAX 16.59)
  Target: CPU - Model: mnasnet: 5.46 (SE +/- 0.05, N = 9, MIN 4.13 / MAX 13.57)
  Target: CPU-v2-v2 - Model: mobilenet-v2: 5.53 (SE +/- 0.07, N = 9, MIN 4.11 / MAX 12.43)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Mobile Neural Network

Mobile Neural Network 2.9.b11b7037d - ms, Fewer Is Better - ASUS NVIDIA GeForce RTX 3090
  Model: inception-v3: 21.37 (SE +/- 0.04, N = 3, MIN 20.43 / MAX 62.25)
  Model: mobilenet-v1-1.0: 2.084 (SE +/- 0.015, N = 3, MIN 1.99 / MAX 26.25)
  Model: MobileNetV2_224: 1.882 (SE +/- 0.011, N = 3, MIN 1.79 / MAX 33.11)
  Model: SqueezeNetV1.0: 2.935 (SE +/- 0.049, N = 3, MIN 2.7 / MAX 34.14)
  Model: resnet-v2-50: 14.36 (SE +/- 0.08, N = 3, MIN 13.6 / MAX 59.49)
  Model: squeezenetv1.1: 1.836 (SE +/- 0.006, N = 3, MIN 1.71 / MAX 31.7)
  Model: mobilenetV3: 1.012 (SE +/- 0.003, N = 3, MIN 0.93 / MAX 24.1)
  Model: nasnet: 6.956 (SE +/- 0.004, N = 3, MIN 6.61 / MAX 53.04)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that the Phoronix Test Suite also offers pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if complementary metrics are desired. Learn more via the OpenBenchmarking.org test page.
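For orientation, here is a minimal sketch of how an images-per-second figure like those below can be measured with the Keras API. The reference results actually come from tf_cnn_benchmarks.py, so treat this only as an illustration; the batch size and step count are assumptions.

```python
# Sketch: measure training-style throughput in images/sec (more is better)
# for ResNet-50 with Keras. The official results below come from
# tf_cnn_benchmarks.py; batch size and step count here are assumptions.
import time
import numpy as np
import tensorflow as tf

batch = 32
model = tf.keras.applications.ResNet50(weights=None)
model.compile(optimizer="sgd", loss="sparse_categorical_crossentropy")

images = np.random.rand(batch, 224, 224, 3).astype(np.float32)
labels = np.random.randint(0, 1000, size=(batch,))

model.train_on_batch(images, labels)   # warm-up / graph build

steps = 10
start = time.perf_counter()
for _ in range(steps):
    model.train_on_batch(images, labels)
elapsed = time.perf_counter() - start

print(f"{steps * batch / elapsed:.2f} images/sec")
```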

TensorFlow 2.16.1 - images/sec, More Is Better - ASUS NVIDIA GeForce RTX 3090
  Device: GPU - Batch Size: 512 - Model: ResNet-50: 8.04 (SE +/- 0.01, N = 3)
  Device: GPU - Batch Size: 512 - Model: GoogLeNet: 27.08 (SE +/- 0.04, N = 3)
  Device: GPU - Batch Size: 256 - Model: ResNet-50: 8.04 (SE +/- 0.00, N = 3)
  Device: GPU - Batch Size: 256 - Model: GoogLeNet: 27.04 (SE +/- 0.05, N = 3)
  Device: CPU - Batch Size: 512 - Model: ResNet-50: 28.13 (SE +/- 0.01, N = 3)
  Device: CPU - Batch Size: 512 - Model: GoogLeNet: 93.78 (SE +/- 0.03, N = 3)
  Device: CPU - Batch Size: 256 - Model: ResNet-50: 27.49 (SE +/- 0.18, N = 3)
  Device: CPU - Batch Size: 256 - Model: GoogLeNet: 93.45 (SE +/- 0.03, N = 3)
  Device: GPU - Batch Size: 64 - Model: ResNet-50: 8.00 (SE +/- 0.00, N = 3)
  Device: GPU - Batch Size: 64 - Model: GoogLeNet: 26.91 (SE +/- 0.01, N = 3)
  Device: GPU - Batch Size: 32 - Model: ResNet-50: 7.98 (SE +/- 0.01, N = 3)
  Device: GPU - Batch Size: 32 - Model: GoogLeNet: 26.63 (SE +/- 0.04, N = 3)
  Device: GPU - Batch Size: 16 - Model: ResNet-50: 7.93 (SE +/- 0.01, N = 3)
  Device: GPU - Batch Size: 16 - Model: GoogLeNet: 25.99 (SE +/- 0.06, N = 3)
  Device: CPU - Batch Size: 64 - Model: ResNet-50: 26.52 (SE +/- 0.01, N = 3)
  Device: CPU - Batch Size: 64 - Model: GoogLeNet: 94.21 (SE +/- 0.26, N = 3)
  Device: CPU - Batch Size: 32 - Model: ResNet-50: 26.82 (SE +/- 0.01, N = 3)
  Device: CPU - Batch Size: 32 - Model: GoogLeNet: 96.69 (SE +/- 0.06, N = 3)
  Device: CPU - Batch Size: 16 - Model: ResNet-50: 27.80 (SE +/- 0.02, N = 3)
  Device: CPU - Batch Size: 16 - Model: GoogLeNet: 101.50 (SE +/- 0.15, N = 3)
  Device: GPU - Batch Size: 512 - Model: AlexNet: 44.64 (SE +/- 0.42, N = 3)
  Device: GPU - Batch Size: 256 - Model: AlexNet: 44.05 (SE +/- 0.40, N = 3)
  Device: GPU - Batch Size: 1 - Model: ResNet-50: 5.62 (SE +/- 0.05, N = 3)
  Device: CPU - Batch Size: 512 - Model: AlexNet: 245.67 (SE +/- 0.02, N = 3)
  Device: CPU - Batch Size: 256 - Model: AlexNet: 227.40 (SE +/- 0.07, N = 3)
  Device: CPU - Batch Size: 1 - Model: ResNet-50: 14.04 (SE +/- 0.07, N = 3)
  Device: CPU - Batch Size: 1 - Model: GoogLeNet: 47.92 (SE +/- 0.60, N = 3)
  Device: GPU - Batch Size: 64 - Model: AlexNet: 42.49 (SE +/- 0.02, N = 3)
  Device: GPU - Batch Size: 512 - Model: VGG-16: 2.29 (SE +/- 0.00, N = 3)
  Device: GPU - Batch Size: 32 - Model: AlexNet: 41.45 (SE +/- 0.40, N = 3)
  Device: GPU - Batch Size: 256 - Model: VGG-16: 2.28 (SE +/- 0.00, N = 3)
  Device: GPU - Batch Size: 16 - Model: AlexNet: 38.93 (SE +/- 0.43, N = 3)
  Device: CPU - Batch Size: 64 - Model: AlexNet: 192.13 (SE +/- 0.32, N = 3)
  Device: CPU - Batch Size: 512 - Model: VGG-16: 8.95 (SE +/- 0.06, N = 3)
  Device: CPU - Batch Size: 32 - Model: AlexNet: 167.17 (SE +/- 0.14, N = 3)
  Device: CPU - Batch Size: 256 - Model: VGG-16: 9.11 (SE +/- 0.01, N = 3)
  Device: CPU - Batch Size: 16 - Model: AlexNet: 129.70 (SE +/- 0.01, N = 3)
  Device: GPU - Batch Size: 64 - Model: VGG-16: 2.27 (SE +/- 0.00, N = 3)
  Device: GPU - Batch Size: 32 - Model: VGG-16: 2.27 (SE +/- 0.00, N = 3)
  Device: GPU - Batch Size: 16 - Model: VGG-16: 2.25 (SE +/- 0.00, N = 3)
  Device: GPU - Batch Size: 1 - Model: AlexNet: 14.14 (SE +/- 0.07, N = 3)
  Device: CPU - Batch Size: 64 - Model: VGG-16: 9.31 (SE +/- 0.01, N = 3)
  Device: CPU - Batch Size: 32 - Model: VGG-16: 9.22 (SE +/- 0.02, N = 3)
  Device: CPU - Batch Size: 16 - Model: VGG-16: 8.97 (SE +/- 0.06, N = 3)
  Device: CPU - Batch Size: 1 - Model: AlexNet: 15.83 (SE +/- 0.02, N = 3)
  Device: GPU - Batch Size: 1 - Model: VGG-16: 1.57 (SE +/- 0.02, N = 15)
  Device: CPU - Batch Size: 1 - Model: VGG-16: 4.12 (SE +/- 0.00, N = 3)

PyTorch

This is a benchmark of PyTorch making use of pytorch-benchmark [https://github.com/LukasHedegaard/pytorch-benchmark]. Learn more via the OpenBenchmarking.org test page.
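The batches-per-second values below are produced by pytorch-benchmark; a minimal hand-rolled sketch of measuring the same metric for ResNet-50 inference on the CUDA device follows, with the batch size and iteration count as assumptions.

```python
# Sketch: measure inference throughput in batches/sec (more is better) for
# ResNet-50 on the CUDA device (falls back to CPU if CUDA is unavailable).
# The results below come from pytorch-benchmark; batch size and iteration
# count here are assumptions.
import time
import torch
from torchvision.models import resnet50

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = resnet50().eval().to(device)
batch = torch.randn(16, 3, 224, 224, device=device)

with torch.no_grad():
    for _ in range(10):                  # warm-up iterations
        model(batch)
    if device.type == "cuda":
        torch.cuda.synchronize()

    iters = 100
    start = time.perf_counter()
    for _ in range(iters):
        model(batch)
    if device.type == "cuda":
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

print(f"{iters / elapsed:.2f} batches/sec")
```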

PyTorch 2.2.1 (batches/sec, more is better) - ASUS NVIDIA GeForce RTX 3090:

  Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: Efficientnet_v2_l: 86.27 (SE +/- 0.12, N = 3, MIN: 76.41 / MAX: 88.03)
  Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: Efficientnet_v2_l: 85.57 (SE +/- 0.11, N = 3, MIN: 73.68 / MAX: 87.24)
  Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: Efficientnet_v2_l: 85.37 (SE +/- 0.50, N = 3, MIN: 74.95 / MAX: 88.01)
  Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: Efficientnet_v2_l: 84.09 (SE +/- 0.95, N = 3, MIN: 70.71 / MAX: 86.82)
  Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: Efficientnet_v2_l: 85.74 (SE +/- 0.47, N = 3, MIN: 75.44 / MAX: 87.56)
  Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: Efficientnet_v2_l: 88.41 (SE +/- 0.70, N = 3, MIN: 76.75 / MAX: 90.99)
  Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-152: 162.35 (SE +/- 0.22, N = 3, MIN: 99.28 / MAX: 165.46)
  Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-152: 162.68 (SE +/- 0.10, N = 3, MIN: 145.92 / MAX: 165.55)
  Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-152: 162.69 (SE +/- 0.16, N = 3, MIN: 143.93 / MAX: 165.52)
  Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-50: 414.51 (SE +/- 0.17, N = 3, MIN: 337.19 / MAX: 422.34)
  Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-152: 162.45 (SE +/- 0.21, N = 3, MIN: 87.19 / MAX: 165.45)
  Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-50: 413.12 (SE +/- 0.66, N = 3, MIN: 161.19 / MAX: 421.75)
  Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-152: 162.59 (SE +/- 0.56, N = 3, MIN: 114.4 / MAX: 167.22)
  Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-50: 413.29 (SE +/- 0.26, N = 3, MIN: 337.68 / MAX: 420.31)
  Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-50: 413.19 (SE +/- 0.17, N = 3, MIN: 321.33 / MAX: 421.07)
  Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-50: 414.29 (SE +/- 0.07, N = 3, MIN: 339.59 / MAX: 422.13)
  Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-152: 172.05 (SE +/- 0.36, N = 3, MIN: 143.74 / MAX: 175.36)
  Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-50: 469.71 (SE +/- 1.86, N = 3, MIN: 339.62 / MAX: 480.89)
  Device: CPU - Batch Size: 256 - Model: Efficientnet_v2_l: 6.48 (SE +/- 0.01, N = 3, MIN: 6.44 / MAX: 6.62)
  Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l: 7.82 (SE +/- 0.07, N = 3, MIN: 6.39 / MAX: 7.98)
  Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l: 7.92 (SE +/- 0.05, N = 3, MIN: 7.36 / MAX: 8.02)
  Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l: 10.55 (SE +/- 0.12, N = 3, MIN: 6.85 / MAX: 13.27)
  Device: CPU - Batch Size: 512 - Model: ResNet-152: 11.46 (SE +/- 0.14, N = 12, MIN: 10.06 / MAX: 11.81)
  Device: CPU - Batch Size: 256 - Model: ResNet-152: 11.45 (SE +/- 0.19, N = 12, MIN: 10.05 / MAX: 12.11)
  Device: CPU - Batch Size: 64 - Model: ResNet-152: 11.66 (SE +/- 0.05, N = 3, MIN: 11.02 / MAX: 11.79)
  Device: CPU - Batch Size: 512 - Model: ResNet-50: 30.14 (SE +/- 0.24, N = 3, MIN: 18.07 / MAX: 30.74)
  Device: CPU - Batch Size: 32 - Model: ResNet-152: 10.16 (SE +/- 0.01, N = 3, MIN: 10.01 / MAX: 10.25)
  Device: CPU - Batch Size: 256 - Model: ResNet-50: 29.33 (SE +/- 0.38, N = 15, MIN: 25.91 / MAX: 30.51)
  Device: CPU - Batch Size: 16 - Model: ResNet-152: 11.49 (SE +/- 0.20, N = 12, MIN: 10.07 / MAX: 12.18)
  Device: CPU - Batch Size: 64 - Model: ResNet-50: 29.52 (SE +/- 0.32, N = 15, MIN: 25.74 / MAX: 30.64)
  Device: CPU - Batch Size: 32 - Model: ResNet-50: 30.27 (SE +/- 0.11, N = 3, MIN: 28.61 / MAX: 30.57)
  Device: CPU - Batch Size: 16 - Model: ResNet-50: 30.07 (SE +/- 0.12, N = 3, MIN: 28.1 / MAX: 30.36)
  Device: CPU - Batch Size: 1 - Model: ResNet-50: 49.49 (SE +/- 0.18, N = 3, MIN: 30.42 / MAX: 50.09)

TensorFlow Lite

This is a benchmark of the TensorFlow Lite implementation focused on TensorFlow machine learning for mobile, IoT, edge, and other cases. The current Linux support is limited to running on CPUs. This test profile is measuring the average inference time. Learn more via the OpenBenchmarking.org test page.
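
As a hedged sketch of what an average-inference-time measurement of this kind looks like with the standard TensorFlow Lite interpreter API (the .tflite file name and iteration count are assumptions, not the test profile's actual inputs):

    import time
    import numpy as np
    import tensorflow as tf

    # Hypothetical model file; the test profile ships its own .tflite models.
    interpreter = tf.lite.Interpreter(model_path="mobilenet_v1_1.0_224.tflite")
    interpreter.allocate_tensors()
    inp = interpreter.get_input_details()[0]
    out = interpreter.get_output_details()[0]

    data = np.random.random_sample(tuple(inp["shape"])).astype(inp["dtype"])
    times = []
    for _ in range(100):                           # repeated single-image inference
        interpreter.set_tensor(inp["index"], data)
        start = time.perf_counter()
        interpreter.invoke()
        times.append((time.perf_counter() - start) * 1e6)
        _ = interpreter.get_tensor(out["index"])

    print(f"average inference time: {sum(times) / len(times):.1f} microseconds")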

TensorFlow Lite 2022-05-18 (Microseconds, fewer is better) - ASUS NVIDIA GeForce RTX 3090:

  Model: Mobilenet Quant: 2317.67 (SE +/- 20.94, N = 3)
  Model: Mobilenet Float: 1383.02 (SE +/- 17.51, N = 15)
  Model: Inception V4: 26961.0 (SE +/- 395.13, N = 15)
  Model: SqueezeNet: 1904.01 (SE +/- 15.29, N = 15)

RNNoise

RNNoise is a recurrent neural network for audio noise reduction developed by Mozilla and Xiph.Org. This test profile is a single-threaded test measuring the time to denoise a sample 26-minute-long 16-bit RAW audio file using this recurrent neural network noise suppression library. Learn more via the OpenBenchmarking.org test page.

RNNoise 0.2 - Input: 26 Minute Long Talking Sample (Seconds, fewer is better) - ASUS NVIDIA GeForce RTX 3090: 7.245 (SE +/- 0.014, N = 3)
  Build flags: (CC) gcc options: -O2 -pedantic -fvisibility=hidden

R Benchmark

This test is a quick-running survey of general R performance. Learn more via the OpenBenchmarking.org test page.

R Benchmark (Seconds, fewer is better) - ASUS NVIDIA GeForce RTX 3090: 0.0970 (SE +/- 0.0004, N = 3)

DeepSpeech

Mozilla DeepSpeech is a speech-to-text engine powered by TensorFlow for machine learning and derived from Baidu's Deep Speech research paper. This test profile times the speech-to-text process for a roughly three-minute audio recording. Learn more via the OpenBenchmarking.org test page.
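
For orientation, a hedged sketch of timing a transcription through the DeepSpeech Python API; the model path and audio file are hypothetical, and the Model constructor signature differs between DeepSpeech releases (some take an additional beam-width argument), so treat this as an approximation of the workload rather than the test profile itself:

    import time
    import wave
    import numpy as np
    from deepspeech import Model

    # Hypothetical file names; DeepSpeech 0.6-era acoustic models ship as .pbmm graphs.
    ds = Model("deepspeech-0.6.0-models/output_graph.pbmm")   # some releases also require a beam-width argument

    with wave.open("recording.wav", "rb") as w:               # expects 16 kHz, 16-bit mono audio
        audio = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

    start = time.perf_counter()
    text = ds.stt(audio)
    print(f"transcribed in {time.perf_counter() - start:.2f} s: {text!r}")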

DeepSpeech 0.6 - Acceleration: CPU (Seconds, fewer is better) - ASUS NVIDIA GeForce RTX 3090: 54.63 (SE +/- 0.08, N = 3)

Numpy Benchmark

This is a test to measure general NumPy performance. Learn more via the OpenBenchmarking.org test page.
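
A score of this kind typically aggregates dense linear-algebra and FFT kernels. The sketch below shows the general flavour of such a measurement; the specific operations, array sizes, and repeat counts are assumptions, not the benchmark's actual workload:

    import time
    import numpy as np

    rng = np.random.default_rng(0)
    a = rng.standard_normal((2048, 2048))
    b = rng.standard_normal((2048, 2048))

    def timeit_once(fn):
        start = time.perf_counter()
        fn()
        return time.perf_counter() - start

    def timed(label, fn, repeats=5):
        best = min(timeit_once(fn) for _ in range(repeats))
        print(f"{label}: {best * 1e3:.1f} ms (best of {repeats})")

    timed("matmul 2048x2048", lambda: a @ b)
    timed("SVD 2048x2048", lambda: np.linalg.svd(a, compute_uv=False))
    timed("FFT 2048x2048", lambda: np.fft.fft2(a))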

Numpy Benchmark (Score, more is better) - ASUS NVIDIA GeForce RTX 3090: 644.54 (SE +/- 0.55, N = 3)

oneDNN

oneDNN 3.6 - Engine: CPU (ms, fewer is better) - ASUS NVIDIA GeForce RTX 3090:

  Harness: Recurrent Neural Network Inference: 1452.76 (SE +/- 2.94, N = 3, MIN: 1432.32)
  Harness: Recurrent Neural Network Training: 2702.00 (SE +/- 1.60, N = 3, MIN: 2689.78)
  Harness: Deconvolution Batch shapes_3d: 5.57785 (SE +/- 0.00815, N = 3, MIN: 5.37)
  Harness: Deconvolution Batch shapes_1d: 4.47259 (SE +/- 0.00491, N = 3, MIN: 4.02)
  Harness: Convolution Batch Shapes Auto: 8.24915 (SE +/- 0.00332, N = 3, MIN: 7.76)
  Harness: IP Shapes 3D: 9.32676 (SE +/- 0.00039, N = 3, MIN: 9.1)
  Harness: IP Shapes 1D: 2.89253 (SE +/- 0.00897, N = 3, MIN: 2.72)
  Build flags: (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

LeelaChessZero

LeelaChessZero 0.31.1 - Backend: BLAS (Nodes Per Second, more is better) - ASUS NVIDIA GeForce RTX 3090: 166 (SE +/- 1.83, N = 5)
  Build flags: (CXX) g++ options: -flto -pthread

SHOC Scalable HeterOgeneous Computing

The CUDA and OpenCL version of Vetter's Scalable HeterOgeneous Computing benchmark suite. SHOC provides a number of different benchmark programs for evaluating the performance and stability of compute devices. Learn more via the OpenBenchmarking.org test page.
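
SHOC itself is a C++ suite, but as a loose analogue of its "Bus Speed Download" host-to-device transfer measurement, the hedged sketch below times OpenCL buffer copies from Python via pyopencl; the transfer size and repeat count are assumptions:

    import time
    import numpy as np
    import pyopencl as cl

    ctx = cl.create_some_context()                # picks an available OpenCL device
    queue = cl.CommandQueue(ctx)

    nbytes = 256 * 1024 * 1024                    # 256 MiB transfer
    host = np.zeros(nbytes, dtype=np.uint8)
    dev = cl.Buffer(ctx, cl.mem_flags.READ_WRITE, size=nbytes)

    cl.enqueue_copy(queue, dev, host).wait()      # warm-up copy

    repeats = 10
    start = time.perf_counter()
    for _ in range(repeats):
        cl.enqueue_copy(queue, dev, host)
    queue.finish()
    elapsed = time.perf_counter() - start

    print(f"host-to-device bandwidth: {repeats * nbytes / elapsed / 1e9:.2f} GB/s")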

SHOC Scalable HeterOgeneous Computing 2020-04-17 - Target: OpenCL (more is better) - ASUS NVIDIA GeForce RTX 3090:

  Benchmark: Texture Read Bandwidth: 2178.63 GB/s (SE +/- 2.20, N = 3)
  Benchmark: Bus Speed Readback: 6.7652 GB/s (SE +/- 0.0000, N = 3)
  Benchmark: Bus Speed Download: 6.6413 GB/s (SE +/- 0.0002, N = 3)
  Benchmark: Max SP Flops: 41982.2 GFLOPS (SE +/- 208.39, N = 3)
  Benchmark: GEMM SGEMM_N: 8456.99 GFLOPS (SE +/- 38.88, N = 3)
  Benchmark: Reduction: 406.41 GB/s (SE +/- 0.22, N = 3)
  Benchmark: MD5 Hash: 45.45 GHash/s (SE +/- 0.01, N = 3)
  Benchmark: FFT SP: 2510.98 GFLOPS (SE +/- 35.21, N = 3)
  Benchmark: Triad: 6.6057 GB/s (SE +/- 0.0011, N = 3)
  Benchmark: S3D: 455.96 GFLOPS (SE +/- 0.72, N = 3)
  Build flags: (CXX) g++ options: -O2 -lSHOCCommonMPI -lSHOCCommonOpenCL -lSHOCCommon -lOpenCL -lrt -lmpi_cxx -lmpi

OpenCV

This is a benchmark of the OpenCV (Computer Vision) library's built-in performance tests. Learn more via the OpenBenchmarking.org test page.
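
The DNN result below exercises OpenCV's dnn module. As a hedged sketch of the general shape of such a measurement (the ONNX model file, input size, and CPU backend are assumptions, not the test's actual configuration):

    import time
    import numpy as np
    import cv2

    net = cv2.dnn.readNetFromONNX("resnet50.onnx")            # hypothetical model file
    net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)      # plain CPU backend
    net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

    image = np.random.randint(0, 255, (224, 224, 3), dtype=np.uint8)
    blob = cv2.dnn.blobFromImage(image, scalefactor=1.0 / 255, size=(224, 224))

    net.setInput(blob)
    net.forward()                                             # warm-up pass

    start = time.perf_counter()
    for _ in range(50):
        net.setInput(blob)
        net.forward()
    print(f"mean forward time: {(time.perf_counter() - start) / 50 * 1e3:.1f} ms")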

OpenCV 4.7 - Test: DNN - Deep Neural Network (ms, fewer is better) - ASUS NVIDIA GeForce RTX 3090: 25552 (SE +/- 635.50, N = 13)
  Build flags: (CXX) g++ options: -fsigned-char -pthread -fomit-frame-pointer -ffunction-sections -fdata-sections -msse -msse2 -msse3 -fvisibility=hidden -O3 -ldl -lm -lpthread -lrt

Llamafile

All three Llamafile configurations failed to run on the ASUS NVIDIA GeForce RTX 3090 result; each test quit with a non-zero exit status on all three attempts because the expected llamafile binary was missing:

Test: wizardcoder-python-34b-v1.0.Q6_K - Acceleration: CPU
  E: ./run-wizardcoder: line 2: ./wizardcoder-python-34b-v1.0.Q6_K.llamafile.86: No such file or directory

Test: mistral-7b-instruct-v0.2.Q8_0 - Acceleration: CPU
  E: ./run-mistral: line 2: ./mistral-7b-instruct-v0.2.Q5_K_M.llamafile.86: No such file or directory

Test: llava-v1.5-7b-q4 - Acceleration: CPU
  E: ./run-llava: line 2: ./llava-v1.6-mistral-7b.Q8_0.llamafile.86: No such file or directory

Llama.cpp

All three Llama.cpp models failed to run on the ASUS NVIDIA GeForce RTX 3090 result; each test quit with a non-zero exit status on all three attempts with the same error (E: main: error: unable to load model):

  Model: llama-2-70b-chat.Q5_0.gguf
  Model: llama-2-13b.Q4_0.gguf
  Model: llama-2-7b.Q4_0.gguf

Scikit-Learn

Scikit-learn is a BSD-licensed Python module for machine learning built on top of NumPy and SciPy. Learn more via the OpenBenchmarking.org test page.
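
The failed sub-tests listed below wrap scikit-learn's own benchmark scripts. As a hedged, simplified illustration of the kind of workload they time (the synthetic dataset size and solver are assumptions), fitting and timing a logistic regression looks like this:

    import time
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    # Synthetic stand-in for datasets such as 20 Newsgroups or Covertype.
    X, y = make_classification(n_samples=50_000, n_features=100, random_state=0)

    clf = LogisticRegression(max_iter=200, solver="lbfgs")
    start = time.perf_counter()
    clf.fit(X, y)
    print(f"fit time: {time.perf_counter() - start:.2f} s, "
          f"train accuracy: {clf.score(X, y):.3f}")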

The following scikit-learn sub-tests failed to run on the ASUS NVIDIA GeForce RTX 3090 result; each quit with a non-zero exit status on all three attempts:

Benchmark: Plot Non-Negative Matrix Factorization
  E: KeyError:

Benchmark: Plot Singular Value Decomposition
  E: ModuleNotFoundError: No module named 'matplotlib.tri.triangulation'

Benchmark: RCV1 Logreg Convergencet
  E: IndexError: list index out of range

Benchmark: Isotonic / Pathological
  (no additional error output was recorded)

Benchmark: Plot Fast KMeans
  E: ModuleNotFoundError: No module named 'matplotlib.tri.triangulation'

Benchmark: Plot Lasso Path
  E: ModuleNotFoundError: No module named 'matplotlib.tri.triangulation'

Benchmark: Glmnet
  E: ModuleNotFoundError: No module named 'glmnet'

Mlpack Benchmark

Mlpack benchmark scripts for machine learning libraries. Learn more via the OpenBenchmarking.org test page.

All four Mlpack benchmarks failed to run on the ASUS NVIDIA GeForce RTX 3090 result; each quit with a non-zero exit status on all three attempts with the same error (E: ModuleNotFoundError: No module named 'imp'):

  Benchmark: scikit_linearridgeregression
  Benchmark: scikit_svm
  Benchmark: scikit_qda
  Benchmark: scikit_ica

AI Benchmark Alpha

AI Benchmark Alpha is a Python library for evaluating artificial intelligence (AI) performance on diverse hardware platforms and relies upon the TensorFlow machine learning library. Learn more via the OpenBenchmarking.org test page.
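
The run below failed because TensorFlow was not installed. For reference, the ai_benchmark package is normally driven along these lines; this follows its published usage, but treat the exact API as an assumption rather than a guarantee:

    # Requires a working TensorFlow install, which is what was missing in this run.
    from ai_benchmark import AIBenchmark

    benchmark = AIBenchmark()
    results = benchmark.run()      # runs the device inference and training score tests
    print(results)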

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'tensorflow'

ONNX Runtime

Every ONNX Runtime configuration failed to run on the ASUS NVIDIA GeForce RTX 3090 result; each quit with a non-zero exit status on all three attempts because the required ONNX model file could not be opened (E: onnxruntime/onnxruntime/test/onnx/onnx_model_info.cc:45 void OnnxModelInfo::InitOnnxModelInfo(const std::filesystem::__cxx11::path&) open file "<model file>" failed: No such file or directory). The affected configurations and missing files:

  Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard (missing FasterRCNN-12-int8/FasterRCNN-12-int8.onnx)
  Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Parallel (missing FasterRCNN-12-int8/FasterRCNN-12-int8.onnx)
  Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (missing resnet100/resnet100.onnx)
  Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel (missing resnet100/resnet100.onnx)
  Model: bertsquad-12 - Device: CPU - Executor: Standard (missing bertsquad-12/bertsquad-12.onnx)
  Model: bertsquad-12 - Device: CPU - Executor: Parallel (missing bertsquad-12/bertsquad-12.onnx)
  Model: GPT-2 - Device: CPU - Executor: Standard (missing GPT2/model.onnx)
  Model: GPT-2 - Device: CPU - Executor: Parallel (missing GPT2/model.onnx)

Numenta Anomaly Benchmark

Numenta Anomaly Benchmark (NAB) is a benchmark for evaluating algorithms for anomaly detection in streaming, real-time applications. It comprises over 50 labeled real-world and artificial time-series data files plus a novel scoring mechanism designed for real-time applications. This test profile currently measures the time to run various detectors. Learn more via the OpenBenchmarking.org test page.

All six Numenta Anomaly Benchmark detectors failed to run on the ASUS NVIDIA GeForce RTX 3090 result; each quit with a non-zero exit status on all three attempts with the same error (E: ModuleNotFoundError: No module named 'pandas'):

  Detector: Contextual Anomaly Detector OSE
  Detector: Bayesian Changepoint
  Detector: Earthgecko Skyline
  Detector: Windowed Gaussian
  Detector: Relative Entropy
  Detector: KNN CAD

ECP-CANDLE

The CANDLE benchmark codes implement deep learning architectures relevant to problems in cancer. These architectures address problems at different biological scales, specifically problems at the molecular, cellular and population scales. Learn more via the OpenBenchmarking.org test page.

Benchmark: P3B2

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'tensorflow'

Benchmark: P3B1

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'tensorflow'

Benchmark: P1B2

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status. E: ModuleNotFoundError: No module named 'tensorflow'

PlaidML

This test profile uses the PlaidML deep learning framework, developed by Intel, to run various benchmarks. Learn more via the OpenBenchmarking.org test page.

Both PlaidML configurations failed to run on the ASUS NVIDIA GeForce RTX 3090 result; each quit with a non-zero exit status on all three attempts with the same error (E: ModuleNotFoundError: No module named 'tensorflow'):

  FP16: No - Mode: Inference - Network: ResNet 50 - Device: CPU
  FP16: No - Mode: Inference - Network: VGG16 - Device: CPU

XNNPACK

XNNPACK b7b048 (us, fewer is better) - ASUS NVIDIA GeForce RTX 3090:

  Model: QS8MobileNetV2: 904 (SE +/- 35.09, N = 12)
  Model: FP16MobileNetV3Small: 928 (SE +/- 16.32, N = 12)
  Model: FP16MobileNetV2: 1646 (SE +/- 50.43, N = 12)
  Model: FP16MobileNetV1: 2197 (SE +/- 74.47, N = 12)
  Model: FP32MobileNetV3Small: 820 (SE +/- 29.73, N = 12)
  Model: FP32MobileNetV3Large: 1585 (SE +/- 66.95, N = 12)
  Model: FP32MobileNetV1: 1394 (SE +/- 27.87, N = 12)
  Build flags: (CXX) g++ options: -O3 -lrt -lm

TNN

TNN is an open-source deep learning inference framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

All four TNN CPU models failed to run on the ASUS NVIDIA GeForce RTX 3090 result; each quit with a non-zero exit status on all three attempts with the same error (E: ./tnn: 3: ./test/TNNTest: not found):

  Target: CPU - Model: SqueezeNet v1.1
  Target: CPU - Model: SqueezeNet v2
  Target: CPU - Model: MobileNet v2
  Target: CPU - Model: DenseNet

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. Learn more via the OpenBenchmarking.org test page.
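
The test profile drives NCNN's bundled benchncnn tool. As a hedged sketch of invoking it from Python (the binary path and the commonly documented positional argument order are assumptions about a local NCNN build, not part of this result file):

    import subprocess

    # Assumed local build of NCNN's benchmark tool; arguments follow the commonly
    # documented order: loop count, threads, powersave, gpu_device, cooling_down.
    # gpu_device = -1 selects the CPU path, 0 the first Vulkan device.
    result = subprocess.run(
        ["./benchncnn", "8", "4", "0", "-1", "0"],
        capture_output=True, text=True, check=True)
    print(result.stdout)   # per-model min/max/avg latencies in ms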

NCNN 20230517 (ms, fewer is better) - ASUS NVIDIA GeForce RTX 3090:

  Target: Vulkan GPU - Model: squeezenet_ssd: 14.25 (SE +/- 0.48, N = 9, MIN: 7.96 / MAX: 30.38)
  Target: Vulkan GPU - Model: resnet50: 22.13 (SE +/- 0.46, N = 9, MIN: 13.15 / MAX: 49.98)
  Target: Vulkan GPU - Model: googlenet: 19.43 (SE +/- 0.55, N = 9, MIN: 9.05 / MAX: 43.99)
  Target: Vulkan GPU - Model: efficientnet-b0: 18.73 (SE +/- 0.83, N = 9, MIN: 7.31 / MAX: 37.64)
  Target: Vulkan GPU - Model: shufflenet-v2: 10.51 (SE +/- 0.50, N = 9, MIN: 4.55 / MAX: 29.19)
  Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3: 7.93 (SE +/- 0.60, N = 9, MIN: 4.48 / MAX: 21.46)
  Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2: 5.72 (SE +/- 0.13, N = 9, MIN: 4.2 / MAX: 19.44)
  Target: Vulkan GPU - Model: mobilenet: 17.34 (SE +/- 0.37, N = 9, MIN: 9.39 / MAX: 40.25)
  Target: CPU - Model: squeezenet_ssd: 13.62 (SE +/- 0.29, N = 9, MIN: 8.06 / MAX: 30.15)
  Target: CPU - Model: googlenet: 19.65 (SE +/- 0.62, N = 9, MIN: 9.41 / MAX: 46.01)
  Target: CPU - Model: efficientnet-b0: 19.35 (SE +/- 0.78, N = 9, MIN: 7.5 / MAX: 39.35)
  Target: CPU - Model: shufflenet-v2: 11.05 (SE +/- 0.58, N = 9, MIN: 4.48 / MAX: 28.4)
  Target: CPU-v3-v3 - Model: mobilenet-v3: 8.28 (SE +/- 0.34, N = 9, MIN: 4.71 / MAX: 21.8)
  Target: CPU - Model: mobilenet: 16.46 (SE +/- 0.47, N = 9, MIN: 9.33 / MAX: 40.48)
  Build flags: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Caffe

This is a benchmark of the Caffe deep learning framework and currently supports the AlexNet and GoogLeNet models with execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.
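
The test profile drives the upstream caffe binary, which was missing in this run. As a rough sketch of the underlying measurement, Caffe's time subcommand benchmarks forward/backward passes over a network description; the prototxt path and iteration count below are assumptions:

    import subprocess

    # 'caffe time' is Caffe's built-in layer-by-layer benchmarking mode.
    # The deploy.prototxt path is a hypothetical model definition.
    subprocess.run(
        ["caffe", "time",
         "-model", "models/bvlc_alexnet/deploy.prototxt",
         "-iterations", "100"],      # add "-gpu", "0" to benchmark the first NVIDIA GPU instead
        check=True)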

All six Caffe configurations failed to run on the ASUS NVIDIA GeForce RTX 3090 result; each quit with a non-zero exit status on all three attempts with the same error (E: ./caffe: 3: ./tools/caffe: not found):

  Model: GoogleNet - Acceleration: CPU - Iterations: 1000
  Model: GoogleNet - Acceleration: CPU - Iterations: 200
  Model: GoogleNet - Acceleration: CPU - Iterations: 100
  Model: AlexNet - Acceleration: CPU - Iterations: 1000
  Model: AlexNet - Acceleration: CPU - Iterations: 200
  Model: AlexNet - Acceleration: CPU - Iterations: 100

spaCy

The spaCy library is an open-source Python solution for advanced natural language processing (NLP). This test profile times spaCy's CPU performance with various models. Learn more via the OpenBenchmarking.org test page.
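
As a hedged sketch of measuring spaCy's CPU throughput (the model name, document text, and counts are assumptions, not the test profile's actual workload):

    import time
    import spacy

    # en_core_web_md is one commonly used model; it must be downloaded first, e.g.
    #   python -m spacy download en_core_web_md
    nlp = spacy.load("en_core_web_md")
    docs = ["The quick brown fox jumps over the lazy dog."] * 2000

    start = time.perf_counter()
    for _ in nlp.pipe(docs, batch_size=128):
        pass
    elapsed = time.perf_counter() - start
    print(f"{len(docs) / elapsed:.0f} docs/sec")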

ASUS NVIDIA GeForce RTX 3090: The test quit with a non-zero exit status on all three attempts. E: ModuleNotFoundError: No module named 'tqdm'

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.
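
Every DeepSparse run below failed because the deepsparse.benchmark entry point was not found on the expected path. For reference, the utility is normally invoked roughly as follows; the SparseZoo stub and flag values are illustrative assumptions rather than this test profile's exact command line:

    import subprocess

    # deepsparse.benchmark is installed with `pip install deepsparse`.
    # The SparseZoo model stub is illustrative; a local ONNX path also works.
    subprocess.run(
        ["deepsparse.benchmark",
         "zoo:cv/classification/resnet_v1-50/pytorch/sparseml/imagenet/pruned95_quant-none",
         "--batch_size", "64",
         "--scenario", "async"],     # 'sync' corresponds to the single-stream scenarios listed below
        check=True)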

Every Neural Magic DeepSparse configuration failed to run on the ASUS NVIDIA GeForce RTX 3090 result; each quit with a non-zero exit status on all three attempts with the same error (E: ./deepsparse: 2: /.local/bin/deepsparse.benchmark: not found). The affected model/scenario combinations:

  Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream
  Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream
  Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Synchronous Single-Stream
  Model: BERT-Large, NLP Question Answering, Sparse INT8 - Scenario: Asynchronous Multi-Stream
  Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Synchronous Single-Stream
  Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream
  Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream
  Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream
  Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Synchronous Single-Stream
  Model: CV Detection, YOLOv5s COCO, Sparse INT8 - Scenario: Asynchronous Multi-Stream
  Model: CV Classification, ResNet-50 ImageNet - Scenario: Synchronous Single-Stream
  Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream
  Model: Llama2 Chat 7b Quantized - Scenario: Synchronous Single-Stream
  Model: Llama2 Chat 7b Quantized - Scenario: Asynchronous Multi-Stream
  Model: ResNet-50, Sparse INT8 - Scenario: Synchronous Single-Stream
  Model: ResNet-50, Sparse INT8 - Scenario: Asynchronous Multi-Stream
  Model: ResNet-50, Baseline - Scenario: Synchronous Single-Stream
  Model: ResNet-50, Baseline - Scenario: Asynchronous Multi-Stream
  Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Synchronous Single-Stream
  Model: NLP Text Classification, BERT base uncased SST2, Sparse INT8 - Scenario: Asynchronous Multi-Stream
  Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream
  Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.
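
The reference benchmarks referred to here are driven through tf_cnn_benchmarks.py. A hedged sketch of a typical invocation (the checkout location and flag values are assumptions, not this run's exact command line):

    import subprocess

    # tf_cnn_benchmarks.py lives in the tensorflow/benchmarks repository
    # (scripts/tf_cnn_benchmarks/); the flags below describe a typical CPU GoogLeNet run.
    subprocess.run(
        ["python", "tf_cnn_benchmarks.py",
         "--device=cpu",
         "--model=googlenet",
         "--batch_size=16",
         "--num_batches=100"],
        check=True)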

TensorFlow 2.16.1 - Device: GPU - Batch Size: 1 - Model: GoogLeNet (images/sec, more is better) - ASUS NVIDIA GeForce RTX 3090: 12.00 (SE +/- 0.21, N = 15)

PyTorch

This is a benchmark of PyTorch making use of pytorch-benchmark [https://github.com/LukasHedegaard/pytorch-benchmark]. Learn more via the OpenBenchmarking.org test page.

PyTorch 2.2.1 (batches/sec, more is better) - ASUS NVIDIA GeForce RTX 3090:

  Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l: 7.26 (SE +/- 0.24, N = 9, MIN: 6.42 / MAX: 8.05)
  Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l: 6.97 (SE +/- 0.22, N = 9, MIN: 6.34 / MAX: 8.11)
  Device: CPU - Batch Size: 1 - Model: ResNet-152: 18.80 (SE +/- 0.44, N = 15, MIN: 15.08 / MAX: 19.89)

TensorFlow Lite

This is a benchmark of the TensorFlow Lite implementation focused on TensorFlow machine learning for mobile, IoT, edge, and other cases. The current Linux support is limited to running on CPUs. This test profile is measuring the average inference time. Learn more via the OpenBenchmarking.org test page.

TensorFlow Lite 2022-05-18 (Microseconds, fewer is better) - ASUS NVIDIA GeForce RTX 3090:

  Model: Inception ResNet V2: 128179 (SE +/- 4137.26, N = 15)
  Model: NASNet Mobile: 311542 (SE +/- 11554.76, N = 15)

256 Results Shown

Whisper.cpp:
  ggml-medium.en - 2016 State of the Union
  ggml-small.en - 2016 State of the Union
  ggml-base.en - 2016 State of the Union
Scikit-Learn:
  Sparse Rand Projections / 100 Iterations
  Kernel PCA Solvers / Time vs. N Components
  Kernel PCA Solvers / Time vs. N Samples
  Hist Gradient Boosting Categorical Only
  Plot Polynomial Kernel Approximation
  20 Newsgroups / Logistic Regression
  Hist Gradient Boosting Higgs Boson
  Hist Gradient Boosting Threading
  Isotonic / Perturbed Logarithm
  Hist Gradient Boosting Adult
  Covertype Dataset Benchmark
  Sample Without Replacement
  Plot Parallel Pairwise
  Hist Gradient Boosting
  Plot Incremental PCA
  Isotonic / Logistic
  TSNE MNIST Dataset
  LocalOutlierFactor
  Feature Expansions
  Plot OMP vs. LARS
  Plot Hierarchical
  Text Vectorizers
  Isolation Forest
  SGDOneClassSVM
  SGD Regression
  Plot Neighbors
  MNIST Dataset
  Plot Ward
  Sparsify
  Lasso
  Tree
  SAGA
  GLM
ONNX Runtime:
  ResNet101_DUC_HDC-12 - CPU - Standard
  ResNet101_DUC_HDC-12 - CPU - Parallel
  super-resolution-10 - CPU - Standard
  super-resolution-10 - CPU - Parallel
  ResNet50 v1-12-int8 - CPU - Standard
  ResNet50 v1-12-int8 - CPU - Parallel
  fcn-resnet101-11 - CPU - Standard
  fcn-resnet101-11 - CPU - Parallel
  CaffeNet 12-int8 - CPU - Standard
  CaffeNet 12-int8 - CPU - Parallel
  T5 Encoder - CPU - Standard
  T5 Encoder - CPU - Parallel
  ZFNet-512 - CPU - Standard
  ZFNet-512 - CPU - Parallel
  yolov4 - CPU - Standard
  yolov4 - CPU - Parallel
OpenVINO:
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU:
    ms
    FPS
  Handwritten English Recognition FP16-INT8 - CPU:
    ms
    FPS
  Age Gender Recognition Retail 0013 FP16 - CPU:
    ms
    FPS
  Person Re-Identification Retail FP16 - CPU:
    ms
    FPS
  Handwritten English Recognition FP16 - CPU:
    ms
    FPS
  Noise Suppression Poconet-Like FP16 - CPU:
    ms
    FPS
  Person Vehicle Bike Detection FP16 - CPU:
    ms
    FPS
  Weld Porosity Detection FP16-INT8 - CPU:
    ms
    FPS
  Machine Translation EN To DE FP16 - CPU:
    ms
    FPS
  Road Segmentation ADAS FP16-INT8 - CPU:
    ms
    FPS
  Face Detection Retail FP16-INT8 - CPU:
    ms
    FPS
  Weld Porosity Detection FP16 - CPU:
    ms
    FPS
  Vehicle Detection FP16-INT8 - CPU:
    ms
    FPS
  Road Segmentation ADAS FP16 - CPU:
    ms
    FPS
  Face Detection Retail FP16 - CPU:
    ms
    FPS
  Face Detection FP16-INT8 - CPU:
    ms
    FPS
  Vehicle Detection FP16 - CPU:
    ms
    FPS
  Person Detection FP32 - CPU:
    ms
    FPS
  Person Detection FP16 - CPU:
    ms
    FPS
  Face Detection FP16 - CPU:
    ms
    FPS
XNNPACK:
  FP16MobileNetV3Large
  FP32MobileNetV2
NCNN:
  Vulkan GPU - FastestDet
  Vulkan GPU - vision_transformer
  Vulkan GPU - regnety_400m
  Vulkan GPU - yolov4-tiny
  Vulkan GPU - alexnet
  Vulkan GPU - resnet18
  Vulkan GPU - vgg16
  Vulkan GPU - blazeface
  Vulkan GPU - mnasnet
  CPU - FastestDet
  CPU - vision_transformer
  CPU - regnety_400m
  CPU - yolov4-tiny
  CPU - resnet50
  CPU - alexnet
  CPU - resnet18
  CPU - vgg16
  CPU - blazeface
  CPU - mnasnet
  CPU-v2-v2 - mobilenet-v2
Mobile Neural Network:
  inception-v3
  mobilenet-v1-1.0
  MobileNetV2_224
  SqueezeNetV1.0
  resnet-v2-50
  squeezenetv1.1
  mobilenetV3
  nasnet
TensorFlow:
  GPU - 512 - ResNet-50
  GPU - 512 - GoogLeNet
  GPU - 256 - ResNet-50
  GPU - 256 - GoogLeNet
  CPU - 512 - ResNet-50
  CPU - 512 - GoogLeNet
  CPU - 256 - ResNet-50
  CPU - 256 - GoogLeNet
  GPU - 64 - ResNet-50
  GPU - 64 - GoogLeNet
  GPU - 32 - ResNet-50
  GPU - 32 - GoogLeNet
  GPU - 16 - ResNet-50
  GPU - 16 - GoogLeNet
  CPU - 64 - ResNet-50
  CPU - 64 - GoogLeNet
  CPU - 32 - ResNet-50
  CPU - 32 - GoogLeNet
  CPU - 16 - ResNet-50
  CPU - 16 - GoogLeNet
  GPU - 512 - AlexNet
  GPU - 256 - AlexNet
  GPU - 1 - ResNet-50
  CPU - 512 - AlexNet
  CPU - 256 - AlexNet
  CPU - 1 - ResNet-50
  CPU - 1 - GoogLeNet
  GPU - 64 - AlexNet
  GPU - 512 - VGG-16
  GPU - 32 - AlexNet
  GPU - 256 - VGG-16
  GPU - 16 - AlexNet
  CPU - 64 - AlexNet
  CPU - 512 - VGG-16
  CPU - 32 - AlexNet
  CPU - 256 - VGG-16
  CPU - 16 - AlexNet
  GPU - 64 - VGG-16
  GPU - 32 - VGG-16
  GPU - 16 - VGG-16
  GPU - 1 - AlexNet
  CPU - 64 - VGG-16
  CPU - 32 - VGG-16
  CPU - 16 - VGG-16
  CPU - 1 - AlexNet
  GPU - 1 - VGG-16
  CPU - 1 - VGG-16
PyTorch:
  NVIDIA CUDA GPU - 512 - Efficientnet_v2_l
  NVIDIA CUDA GPU - 256 - Efficientnet_v2_l
  NVIDIA CUDA GPU - 64 - Efficientnet_v2_l
  NVIDIA CUDA GPU - 32 - Efficientnet_v2_l
  NVIDIA CUDA GPU - 16 - Efficientnet_v2_l
  NVIDIA CUDA GPU - 1 - Efficientnet_v2_l
  NVIDIA CUDA GPU - 512 - ResNet-152
  NVIDIA CUDA GPU - 256 - ResNet-152
  NVIDIA CUDA GPU - 64 - ResNet-152
  NVIDIA CUDA GPU - 512 - ResNet-50
  NVIDIA CUDA GPU - 32 - ResNet-152
  NVIDIA CUDA GPU - 256 - ResNet-50
  NVIDIA CUDA GPU - 16 - ResNet-152
  NVIDIA CUDA GPU - 64 - ResNet-50
  NVIDIA CUDA GPU - 32 - ResNet-50
  NVIDIA CUDA GPU - 16 - ResNet-50
  NVIDIA CUDA GPU - 1 - ResNet-152
  NVIDIA CUDA GPU - 1 - ResNet-50
  CPU - 256 - Efficientnet_v2_l
  CPU - 64 - Efficientnet_v2_l
  CPU - 32 - Efficientnet_v2_l
  CPU - 1 - Efficientnet_v2_l
  CPU - 512 - ResNet-152
  CPU - 256 - ResNet-152
  CPU - 64 - ResNet-152
  CPU - 512 - ResNet-50
  CPU - 32 - ResNet-152
  CPU - 256 - ResNet-50
  CPU - 16 - ResNet-152
  CPU - 64 - ResNet-50
  CPU - 32 - ResNet-50
  CPU - 16 - ResNet-50
  CPU - 1 - ResNet-50
TensorFlow Lite:
  Mobilenet Quant
  Mobilenet Float
  Inception V4
  SqueezeNet
RNNoise
R Benchmark
DeepSpeech
Numpy Benchmark
oneDNN:
  Recurrent Neural Network Inference - CPU
  Recurrent Neural Network Training - CPU
  Deconvolution Batch shapes_3d - CPU
  Deconvolution Batch shapes_1d - CPU
  Convolution Batch Shapes Auto - CPU
  IP Shapes 3D - CPU
  IP Shapes 1D - CPU
LeelaChessZero
SHOC Scalable HeterOgeneous Computing:
  OpenCL - Texture Read Bandwidth
  OpenCL - Bus Speed Readback
  OpenCL - Bus Speed Download
  OpenCL - Max SP Flops
  OpenCL - GEMM SGEMM_N
  OpenCL - Reduction
  OpenCL - MD5 Hash
  OpenCL - FFT SP
  OpenCL - Triad
  OpenCL - S3D
OpenCV
XNNPACK:
  QS8MobileNetV2
  FP16MobileNetV3Small
  FP16MobileNetV2
  FP16MobileNetV1
  FP32MobileNetV3Small
  FP32MobileNetV3Large
  FP32MobileNetV1
NCNN:
  Vulkan GPU - squeezenet_ssd
  Vulkan GPU - resnet50
  Vulkan GPU - googlenet
  Vulkan GPU - efficientnet-b0
  Vulkan GPU - shufflenet-v2
  Vulkan GPU-v3-v3 - mobilenet-v3
  Vulkan GPU-v2-v2 - mobilenet-v2
  Vulkan GPU - mobilenet
  CPU - squeezenet_ssd
  CPU - googlenet
  CPU - efficientnet-b0
  CPU - shufflenet-v2
  CPU-v3-v3 - mobilenet-v3
  CPU - mobilenet
TensorFlow
PyTorch:
  CPU - 512 - Efficientnet_v2_l
  CPU - 16 - Efficientnet_v2_l
  CPU - 1 - ResNet-152
TensorFlow Lite:
  Inception ResNet V2
  NASNet Mobile