mltestresults2

wsl testing on Ubuntu 20.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2208042-NE-MLTESTRES66
Jump To Table - Results

Statistics

Remove Outliers Before Calculating Averages

Graph Settings

Prefer Vertical Bar Graphs

Multi-Way Comparison

Condense Multi-Option Tests Into Single Result Graphs

Table

Show Detailed System Result Table

Run Management

Result
Identifier
Performance Per
Dollar
Date
Run
  Test
  Duration
rtx3080_1290k_2
August 04 2022
  1 Day, 22 Hours, 48 Minutes
Only show results matching title/arguments (delimit multiple options with a comma):
Do not show results matching title/arguments (delimit multiple options with a comma):


mltestresults2OpenBenchmarking.orgPhoronix Test SuiteIntel Core i9-12900K (12 Cores / 24 Threads)16GB4 x 275GB Virtual DiskNVIDIA GeForce RTX 3080 10GBUbuntu 20.045.10.16.3-microsoft-standard-WSL2 (x86_64)Wayland3.3 Mesa 21.2.61.1.182GCC 9.4.0 + CUDA 11.7ext41920x1080wslProcessorMemoryDiskGraphicsOSKernelDisplay ServerOpenGLVulkanCompilerFile-SystemScreen ResolutionSystem LayerMltestresults2 BenchmarksSystem Logs- Transparent Huge Pages: always- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-Av3uEd/gcc-9-9.4.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - CPU Microcode: 0xffffffff- Python 3.9.12- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

mltestresults2ncnn: Vulkan GPU - regnety_400mncnn: Vulkan GPU - squeezenet_ssdncnn: Vulkan GPU - yolov4-tinyncnn: Vulkan GPU - resnet50ncnn: Vulkan GPU - alexnetncnn: Vulkan GPU - resnet18ncnn: Vulkan GPU - vgg16ncnn: Vulkan GPU - googlenetncnn: Vulkan GPU - blazefacencnn: Vulkan GPU - efficientnet-b0ncnn: Vulkan GPU - mnasnetncnn: Vulkan GPU - shufflenet-v2ncnn: Vulkan GPU-v3-v3 - mobilenet-v3ncnn: Vulkan GPU-v2-v2 - mobilenet-v2ncnn: Vulkan GPU - mobilenetonnx: fcn-resnet101-11 - CPU - Standardmnn: inception-v3mnn: mobilenet-v1-1.0mnn: MobileNetV2_224mnn: SqueezeNetV1.0mnn: resnet-v2-50mnn: squeezenetv1.1mnn: mobilenetV3lczero: BLASai-benchmark: Device AI Scoreai-benchmark: Device Training Scoreai-benchmark: Device Inference Scoretensorflow-lite: SqueezeNetplaidml: No - Inference - ResNet 50 - CPUncnn: CPU - regnety_400mncnn: CPU - squeezenet_ssdncnn: CPU - yolov4-tinyncnn: CPU - resnet50ncnn: CPU - alexnetncnn: CPU - resnet18ncnn: CPU - vgg16ncnn: CPU - googlenetncnn: CPU - blazefacencnn: CPU - efficientnet-b0ncnn: CPU - mnasnetncnn: CPU - shufflenet-v2ncnn: CPU-v3-v3 - mobilenet-v3ncnn: CPU-v2-v2 - mobilenet-v2ncnn: CPU - mobilenetnumenta-nab: EXPoSEecp-candle: P3B2tnn: CPU - DenseNetecp-candle: P3B1onnx: fcn-resnet101-11 - CPU - Parallelonnx: ArcFace ResNet-100 - CPU - Parallelonnx: bertsquad-12 - CPU - Parallelonnx: GPT-2 - CPU - Parallelonnx: ArcFace ResNet-100 - CPU - Standardonnx: GPT-2 - CPU - Standardonnx: bertsquad-12 - CPU - Standardonnx: super-resolution-10 - CPU - Parallelonnx: super-resolution-10 - CPU - Standardnumpy: plaidml: No - Inference - VGG16 - CPUonednn: Recurrent Neural Network Training - f32 - CPUonednn: Recurrent Neural Network Training - u8s8f32 - CPUonednn: Recurrent Neural Network Training - bf16bf16bf16 - CPUonednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPUonednn: Recurrent Neural Network Inference - u8s8f32 - CPUonednn: Recurrent Neural Network Inference - f32 - CPUonednn: IP Shapes 1D - f32 - CPUnumenta-nab: Earthgecko Skylineopenvino: Face Detection 0106 FP16 - CPUopenvino: Face Detection 0106 FP16 - CPUopenvino: Person Detection 0106 FP16 - CPUopenvino: Person Detection 0106 FP16 - CPUopenvino: Face Detection 0106 FP32 - CPUopenvino: Face Detection 0106 FP32 - CPUopenvino: Person Detection 0106 FP32 - CPUopenvino: Person Detection 0106 FP32 - CPUtensorflow-lite: Inception V4tensorflow-lite: Inception ResNet V2tensorflow-lite: NASNet Mobiletensorflow-lite: Mobilenet Floattensorflow-lite: Mobilenet Quantopenvino: Age Gender Recognition Retail 0013 FP16 - CPUopenvino: Age Gender Recognition Retail 0013 FP16 - CPUopenvino: Age Gender Recognition Retail 0013 FP32 - CPUopenvino: Age Gender Recognition Retail 0013 FP32 - CPUnumenta-nab: Relative Entropydeepspeech: CPUonednn: Deconvolution Batch shapes_1d - f32 - CPUonednn: Deconvolution Batch shapes_1d - u8s8f32 - CPUrbenchmark: numenta-nab: Bayesian Changepointonednn: Matrix Multiply Batch Shapes Transformer - f32 - CPUonednn: IP Shapes 1D - u8s8f32 - CPUrnnoise: tnn: CPU - MobileNet v2onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPUtnn: CPU - SqueezeNet v1.1onednn: IP Shapes 3D - f32 - CPUonednn: IP Shapes 3D - u8s8f32 - CPUecp-candle: P1B2scikit-learn: onednn: Convolution Batch Shapes Auto - f32 - CPUonednn: Convolution Batch Shapes Auto - u8s8f32 - CPUnumenta-nab: Windowed Gaussianonednn: Deconvolution Batch shapes_3d - f32 - CPUonednn: Deconvolution Batch shapes_3d - u8s8f32 - CPUtnn: CPU - SqueezeNet v2caffe: GoogleNet - CPU - 1000rtx3080_1290k_2213.29693.06887.281312.42406.35522.912450.49490.7555.99218.24142.91115.07136.16137.72431.728728.6593.9353.0224.99927.2973.2881.6589054406258618202188.998.1111.3919.2621.4623.3410.3613.7537.9413.751.846.304.094.053.734.5013.46213.975471.092000.037399.2787995363863311311880180346445981628.5725.963360.373374.313344.941925.581940.491953.393.3753674.4991587.803.732314.362.581599.463.722304.082.6028429.328494.010891.61508.673584.510.728475.430.728436.699.47653.5190110.57331.785610.170117.1951.770541.2503313.736179.0601.14390140.9714.696221.2189920.7386.1537.837157.937885.0546.049592.4683339.023OpenBenchmarking.org

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU - Model: regnety_400mrtx3080_1290k_250100150200250SE +/- 0.03, N = 3213.29MIN: 209.62 / MAX: 216.791. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU - Model: squeezenet_ssdrtx3080_1290k_2150300450600750SE +/- 0.27, N = 3693.06MIN: 681.49 / MAX: 7071. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU - Model: yolov4-tinyrtx3080_1290k_22004006008001000SE +/- 0.29, N = 3887.28MIN: 873.59 / MAX: 917.961. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU - Model: resnet50rtx3080_1290k_230060090012001500SE +/- 3.95, N = 31312.42MIN: -1603.4 / MAX: 1332.841. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU - Model: alexnetrtx3080_1290k_290180270360450SE +/- 0.07, N = 3406.35MIN: 395.84 / MAX: 416.031. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU - Model: resnet18rtx3080_1290k_2110220330440550SE +/- 0.17, N = 3522.91MIN: 513.93 / MAX: 531.851. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU - Model: vgg16rtx3080_1290k_25001000150020002500SE +/- 0.28, N = 32450.49MIN: 2424.77 / MAX: 2483.931. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU - Model: googlenetrtx3080_1290k_2110220330440550SE +/- 0.05, N = 3490.75MIN: 483.19 / MAX: 503.531. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU - Model: blazefacertx3080_1290k_21326395265SE +/- 0.04, N = 355.99MIN: 50.78 / MAX: 72.481. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU - Model: efficientnet-b0rtx3080_1290k_250100150200250SE +/- 0.04, N = 3218.24MIN: 215.84 / MAX: 221.51. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU - Model: mnasnetrtx3080_1290k_2306090120150SE +/- 0.02, N = 3142.91MIN: 140.66 / MAX: 145.791. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU - Model: shufflenet-v2rtx3080_1290k_2306090120150SE +/- 0.03, N = 3115.07MIN: 113.32 / MAX: 117.211. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3rtx3080_1290k_2306090120150SE +/- 0.01, N = 3136.16MIN: 134.03 / MAX: 139.041. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2rtx3080_1290k_2306090120150SE +/- 0.01, N = 3137.72MIN: 134.88 / MAX: 141.551. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: Vulkan GPU - Model: mobilenetrtx3080_1290k_290180270360450SE +/- 0.14, N = 3431.72MIN: 420.49 / MAX: 462.311. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

ONNX Runtime

ONNX Runtime is developed by Microsoft and partners as a open-source, cross-platform, high performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Zoo. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgInferences Per Minute, More Is BetterONNX Runtime 1.11Model: fcn-resnet101-11 - Device: CPU - Executor: Standardrtx3080_1290k_220406080100SE +/- 1.85, N = 12871. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -pthread -lpthread

Mobile Neural Network

MNN is the Mobile Neural Network as a highly efficient, lightweight deep learning framework developed by Alibaba. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetterMobile Neural Network 1.2Model: inception-v3rtx3080_1290k_2714212835SE +/- 0.42, N = 1528.66MIN: 20.12 / MAX: 103.251. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

OpenBenchmarking.orgms, Fewer Is BetterMobile Neural Network 1.2Model: mobilenet-v1-1.0rtx3080_1290k_20.88541.77082.65623.54164.427SE +/- 0.108, N = 153.935MIN: 2.89 / MAX: 39.451. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

OpenBenchmarking.orgms, Fewer Is BetterMobile Neural Network 1.2Model: MobileNetV2_224rtx3080_1290k_20.681.362.042.723.4SE +/- 0.040, N = 153.022MIN: 2.06 / MAX: 72.391. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

OpenBenchmarking.orgms, Fewer Is BetterMobile Neural Network 1.2Model: SqueezeNetV1.0rtx3080_1290k_21.12482.24963.37444.49925.624SE +/- 0.049, N = 154.999MIN: 3.21 / MAX: 55.71. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

OpenBenchmarking.orgms, Fewer Is BetterMobile Neural Network 1.2Model: resnet-v2-50rtx3080_1290k_2612182430SE +/- 0.72, N = 1527.30MIN: 18.35 / MAX: 116.051. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

OpenBenchmarking.orgms, Fewer Is BetterMobile Neural Network 1.2Model: squeezenetv1.1rtx3080_1290k_20.73981.47962.21942.95923.699SE +/- 0.052, N = 153.288MIN: 2.12 / MAX: 54.881. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

OpenBenchmarking.orgms, Fewer Is BetterMobile Neural Network 1.2Model: mobilenetV3rtx3080_1290k_20.37310.74621.11931.49241.8655SE +/- 0.029, N = 151.658MIN: 1.1 / MAX: 78.981. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated vian neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgNodes Per Second, More Is BetterLeelaChessZero 0.28Backend: BLASrtx3080_1290k_22004006008001000SE +/- 7.97, N = 39051. (CXX) g++ options: -flto -pthread

AI Benchmark Alpha

AI Benchmark Alpha is a Python library for evaluating artificial intelligence (AI) performance on diverse hardware platforms and relies upon the TensorFlow machine learning library. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgScore, More Is BetterAI Benchmark Alpha 0.1.2Device AI Scorertx3080_1290k_290018002700360045004406

OpenBenchmarking.orgScore, More Is BetterAI Benchmark Alpha 0.1.2Device Training Scorertx3080_1290k_260012001800240030002586

OpenBenchmarking.orgScore, More Is BetterAI Benchmark Alpha 0.1.2Device Inference Scorertx3080_1290k_24008001200160020001820

TensorFlow Lite

This is a benchmark of the TensorFlow Lite implementation focused on TensorFlow machine learning for mobile, IoT, edge, and other cases. The current Linux support is limited to running on CPUs. This test profile is measuring the average inference time. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgMicroseconds, Fewer Is BetterTensorFlow Lite 2022-05-18Model: SqueezeNetrtx3080_1290k_25001000150020002500SE +/- 17.10, N = 152188.99

PlaidML

This test profile uses PlaidML deep learning framework developed by Intel for offering up various benchmarks. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgFPS, More Is BetterPlaidMLFP16: No - Mode: Inference - Network: ResNet 50 - Device: CPUrtx3080_1290k_2246810SE +/- 0.02, N = 38.11

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU - Model: regnety_400mrtx3080_1290k_23691215SE +/- 0.30, N = 1211.39MIN: 6.68 / MAX: 108.161. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU - Model: squeezenet_ssdrtx3080_1290k_2510152025SE +/- 0.21, N = 1219.26MIN: 13.35 / MAX: 145.581. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU - Model: yolov4-tinyrtx3080_1290k_2510152025SE +/- 0.48, N = 1221.46MIN: 14.54 / MAX: 117.781. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU - Model: resnet50rtx3080_1290k_2612182430SE +/- 0.18, N = 1223.34MIN: 14.8 / MAX: 156.351. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU - Model: alexnetrtx3080_1290k_23691215SE +/- 0.09, N = 1210.36MIN: 8.18 / MAX: 79.291. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU - Model: resnet18rtx3080_1290k_248121620SE +/- 0.35, N = 1213.75MIN: 8.53 / MAX: 371.771. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU - Model: vgg16rtx3080_1290k_2918273645SE +/- 0.31, N = 1237.94MIN: 27.47 / MAX: 170.831. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU - Model: googlenetrtx3080_1290k_248121620SE +/- 0.06, N = 1213.75MIN: 8.58 / MAX: 94.621. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU - Model: blazefacertx3080_1290k_20.4140.8281.2421.6562.07SE +/- 0.07, N = 121.84MIN: 1.15 / MAX: 45.811. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU - Model: efficientnet-b0rtx3080_1290k_2246810SE +/- 0.13, N = 126.30MIN: 4.05 / MAX: 821. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU - Model: mnasnetrtx3080_1290k_20.92031.84062.76093.68124.6015SE +/- 0.09, N = 124.09MIN: 2.6 / MAX: 65.381. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU - Model: shufflenet-v2rtx3080_1290k_20.91131.82262.73393.64524.5565SE +/- 0.10, N = 124.05MIN: 2.61 / MAX: 76.131. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU-v3-v3 - Model: mobilenet-v3rtx3080_1290k_20.83931.67862.51793.35724.1965SE +/- 0.09, N = 123.73MIN: 2.52 / MAX: 90.841. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU-v2-v2 - Model: mobilenet-v2rtx3080_1290k_21.01252.0253.03754.055.0625SE +/- 0.10, N = 124.50MIN: 2.76 / MAX: 69.051. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

OpenBenchmarking.orgms, Fewer Is BetterNCNN 20210720Target: CPU - Model: mobilenetrtx3080_1290k_23691215SE +/- 0.28, N = 1213.46MIN: 9.08 / MAX: 111.981. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread -pthread

Numenta Anomaly Benchmark

Numenta Anomaly Benchmark (NAB) is a benchmark for evaluating algorithms for anomaly detection in streaming, real-time applications. It is comprised of over 50 labeled real-world and artificial timeseries data files plus a novel scoring mechanism designed for real-time applications. This test profile currently measures the time to run various detectors. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterNumenta Anomaly Benchmark 1.1Detector: EXPoSErtx3080_1290k_250100150200250SE +/- 1.01, N = 3213.98

ECP-CANDLE

The CANDLE benchmark codes implement deep learning architectures relevant to problems in cancer. These architectures address problems at different biological scales, specifically problems at the molecular, cellular and population scales. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterECP-CANDLE 0.4Benchmark: P3B2rtx3080_1290k_2100200300400500471.09

TNN

TNN is an open-source deep learning reasoning framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetterTNN 0.3Target: CPU - Model: DenseNetrtx3080_1290k_2400800120016002000SE +/- 1.34, N = 32000.04MIN: 1953.13 / MAX: 2068.991. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

ECP-CANDLE

The CANDLE benchmark codes implement deep learning architectures relevant to problems in cancer. These architectures address problems at different biological scales, specifically problems at the molecular, cellular and population scales. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterECP-CANDLE 0.4Benchmark: P3B1rtx3080_1290k_290180270360450399.28

ONNX Runtime

ONNX Runtime is developed by Microsoft and partners as a open-source, cross-platform, high performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Zoo. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgInferences Per Minute, More Is BetterONNX Runtime 1.11Model: fcn-resnet101-11 - Device: CPU - Executor: Parallelrtx3080_1290k_220406080100SE +/- 0.76, N = 3791. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -pthread -lpthread

OpenBenchmarking.orgInferences Per Minute, More Is BetterONNX Runtime 1.11Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallelrtx3080_1290k_22004006008001000SE +/- 5.95, N = 39531. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -pthread -lpthread

OpenBenchmarking.orgInferences Per Minute, More Is BetterONNX Runtime 1.11Model: bertsquad-12 - Device: CPU - Executor: Parallelrtx3080_1290k_2140280420560700SE +/- 1.30, N = 36381. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -pthread -lpthread

OpenBenchmarking.orgInferences Per Minute, More Is BetterONNX Runtime 1.11Model: GPT-2 - Device: CPU - Executor: Parallelrtx3080_1290k_214002800420056007000SE +/- 13.31, N = 363311. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -pthread -lpthread

OpenBenchmarking.orgInferences Per Minute, More Is BetterONNX Runtime 1.11Model: ArcFace ResNet-100 - Device: CPU - Executor: Standardrtx3080_1290k_230060090012001500SE +/- 18.87, N = 313111. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -pthread -lpthread

OpenBenchmarking.orgInferences Per Minute, More Is BetterONNX Runtime 1.11Model: GPT-2 - Device: CPU - Executor: Standardrtx3080_1290k_22K4K6K8K10KSE +/- 8.98, N = 388011. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -pthread -lpthread

OpenBenchmarking.orgInferences Per Minute, More Is BetterONNX Runtime 1.11Model: bertsquad-12 - Device: CPU - Executor: Standardrtx3080_1290k_22004006008001000SE +/- 4.75, N = 38031. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -pthread -lpthread

OpenBenchmarking.orgInferences Per Minute, More Is BetterONNX Runtime 1.11Model: super-resolution-10 - Device: CPU - Executor: Parallelrtx3080_1290k_210002000300040005000SE +/- 39.92, N = 346441. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -pthread -lpthread

OpenBenchmarking.orgInferences Per Minute, More Is BetterONNX Runtime 1.11Model: super-resolution-10 - Device: CPU - Executor: Standardrtx3080_1290k_213002600390052006500SE +/- 16.23, N = 359811. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt -pthread -lpthread

Numpy Benchmark

This is a test to obtain the general Numpy performance. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgScore, More Is BetterNumpy Benchmarkrtx3080_1290k_2140280420560700SE +/- 1.66, N = 3628.57

PlaidML

This test profile uses PlaidML deep learning framework developed by Intel for offering up various benchmarks. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgFPS, More Is BetterPlaidMLFP16: No - Mode: Inference - Network: VGG16 - Device: CPUrtx3080_1290k_2612182430SE +/- 0.12, N = 325.96

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of Intel oneAPI. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPUrtx3080_1290k_27001400210028003500SE +/- 25.28, N = 33360.37MIN: 3028.831. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPUrtx3080_1290k_27001400210028003500SE +/- 29.88, N = 33374.31MIN: 3069.81. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPUrtx3080_1290k_27001400210028003500SE +/- 6.78, N = 33344.94MIN: 3056.431. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPUrtx3080_1290k_2400800120016002000SE +/- 4.93, N = 31925.58MIN: 1675.991. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPUrtx3080_1290k_2400800120016002000SE +/- 21.03, N = 31940.49MIN: 1684.31. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPUrtx3080_1290k_2400800120016002000SE +/- 7.72, N = 31953.39MIN: 1688.141. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: IP Shapes 1D - Data Type: f32 - Engine: CPUrtx3080_1290k_20.75951.5192.27853.0383.7975SE +/- 0.20109, N = 153.37536MIN: 2.551. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

Numenta Anomaly Benchmark

Numenta Anomaly Benchmark (NAB) is a benchmark for evaluating algorithms for anomaly detection in streaming, real-time applications. It is comprised of over 50 labeled real-world and artificial timeseries data files plus a novel scoring mechanism designed for real-time applications. This test profile currently measures the time to run various detectors. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterNumenta Anomaly Benchmark 1.1Detector: Earthgecko Skylinertx3080_1290k_220406080100SE +/- 0.24, N = 374.50

OpenVINO

This is a test of the Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2021.1Model: Face Detection 0106 FP16 - Device: CPUrtx3080_1290k_230060090012001500SE +/- 5.67, N = 31587.801. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2021.1Model: Face Detection 0106 FP16 - Device: CPUrtx3080_1290k_20.83931.67862.51793.35724.1965SE +/- 0.00, N = 33.731. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2021.1Model: Person Detection 0106 FP16 - Device: CPUrtx3080_1290k_25001000150020002500SE +/- 2.71, N = 32314.361. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2021.1Model: Person Detection 0106 FP16 - Device: CPUrtx3080_1290k_20.58051.1611.74152.3222.9025SE +/- 0.00, N = 32.581. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2021.1Model: Face Detection 0106 FP32 - Device: CPUrtx3080_1290k_230060090012001500SE +/- 3.85, N = 31599.461. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2021.1Model: Face Detection 0106 FP32 - Device: CPUrtx3080_1290k_20.8371.6742.5113.3484.185SE +/- 0.00, N = 33.721. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2021.1Model: Person Detection 0106 FP32 - Device: CPUrtx3080_1290k_25001000150020002500SE +/- 4.66, N = 32304.081. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2021.1Model: Person Detection 0106 FP32 - Device: CPUrtx3080_1290k_20.5851.171.7552.342.925SE +/- 0.00, N = 32.601. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

TensorFlow Lite

This is a benchmark of the TensorFlow Lite implementation focused on TensorFlow machine learning for mobile, IoT, edge, and other cases. The current Linux support is limited to running on CPUs. This test profile is measuring the average inference time. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgMicroseconds, Fewer Is BetterTensorFlow Lite 2022-05-18Model: Inception V4rtx3080_1290k_26K12K18K24K30KSE +/- 322.34, N = 328429.3

OpenBenchmarking.orgMicroseconds, Fewer Is BetterTensorFlow Lite 2022-05-18Model: Inception ResNet V2rtx3080_1290k_26K12K18K24K30KSE +/- 66.98, N = 328494.0

OpenBenchmarking.orgMicroseconds, Fewer Is BetterTensorFlow Lite 2022-05-18Model: NASNet Mobilertx3080_1290k_22K4K6K8K10KSE +/- 48.23, N = 310891.6

OpenBenchmarking.orgMicroseconds, Fewer Is BetterTensorFlow Lite 2022-05-18Model: Mobilenet Floatrtx3080_1290k_230060090012001500SE +/- 8.77, N = 31508.67

OpenBenchmarking.orgMicroseconds, Fewer Is BetterTensorFlow Lite 2022-05-18Model: Mobilenet Quantrtx3080_1290k_28001600240032004000SE +/- 13.83, N = 33584.51

OpenVINO

This is a test of the Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2021.1Model: Age Gender Recognition Retail 0013 FP16 - Device: CPUrtx3080_1290k_20.1620.3240.4860.6480.81SE +/- 0.00, N = 30.721. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2021.1Model: Age Gender Recognition Retail 0013 FP16 - Device: CPUrtx3080_1290k_22K4K6K8K10KSE +/- 4.39, N = 38475.431. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenBenchmarking.orgms, Fewer Is BetterOpenVINO 2021.1Model: Age Gender Recognition Retail 0013 FP32 - Device: CPUrtx3080_1290k_20.1620.3240.4860.6480.81SE +/- 0.00, N = 30.721. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenBenchmarking.orgFPS, More Is BetterOpenVINO 2021.1Model: Age Gender Recognition Retail 0013 FP32 - Device: CPUrtx3080_1290k_22K4K6K8K10KSE +/- 24.56, N = 38436.691. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

Numenta Anomaly Benchmark

Numenta Anomaly Benchmark (NAB) is a benchmark for evaluating algorithms for anomaly detection in streaming, real-time applications. It is comprised of over 50 labeled real-world and artificial timeseries data files plus a novel scoring mechanism designed for real-time applications. This test profile currently measures the time to run various detectors. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterNumenta Anomaly Benchmark 1.1Detector: Relative Entropyrtx3080_1290k_23691215SE +/- 0.083, N = 159.476

DeepSpeech

Mozilla DeepSpeech is a speech-to-text engine powered by TensorFlow for machine learning and derived from Baidu's Deep Speech research paper. This test profile times the speech-to-text process for a roughly three minute audio recording. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterDeepSpeech 0.6Acceleration: CPUrtx3080_1290k_21224364860SE +/- 0.08, N = 353.52

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of Intel oneAPI. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPUrtx3080_1290k_23691215SE +/- 0.04, N = 310.57MIN: 4.511. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPUrtx3080_1290k_20.40180.80361.20541.60722.009SE +/- 0.00420, N = 31.78561MIN: -3.21. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

R Benchmark

This test is a quick-running survey of general R performance Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterR Benchmarkrtx3080_1290k_20.03830.07660.11490.15320.1915SE +/- 0.0004, N = 30.17011. R scripting front-end version 3.6.3 (2020-02-29)

Numenta Anomaly Benchmark

Numenta Anomaly Benchmark (NAB) is a benchmark for evaluating algorithms for anomaly detection in streaming, real-time applications. It is comprised of over 50 labeled real-world and artificial timeseries data files plus a novel scoring mechanism designed for real-time applications. This test profile currently measures the time to run various detectors. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterNumenta Anomaly Benchmark 1.1Detector: Bayesian Changepointrtx3080_1290k_248121620SE +/- 0.13, N = 317.20

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of Intel oneAPI. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPUrtx3080_1290k_20.39840.79681.19521.59361.992SE +/- 0.02077, N = 41.77054MIN: 1.31. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPUrtx3080_1290k_20.28130.56260.84391.12521.4065SE +/- 0.00272, N = 31.25033MIN: 1.011. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

RNNoise

RNNoise is a recurrent neural network for audio noise reduction developed by Mozilla and Xiph.Org. This test profile is a single-threaded test measuring the time to denoise a sample 26 minute long 16-bit RAW audio file using this recurrent neural network noise suppression library. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterRNNoise 2020-06-28rtx3080_1290k_248121620SE +/- 0.00, N = 313.741. (CC) gcc options: -O2 -pedantic -fvisibility=hidden

TNN

TNN is an open-source deep learning reasoning framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetterTNN 0.3Target: CPU - Model: MobileNet v2rtx3080_1290k_24080120160200SE +/- 0.54, N = 3179.06MIN: 173.12 / MAX: 190.841. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of Intel oneAPI. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPUrtx3080_1290k_20.25740.51480.77221.02961.287SE +/- 0.01262, N = 31.14390MIN: 0.81. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

TNN

TNN is an open-source deep learning reasoning framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetterTNN 0.3Target: CPU - Model: SqueezeNet v1.1rtx3080_1290k_2306090120150SE +/- 0.16, N = 3140.97MIN: 137.94 / MAX: 152.031. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of Intel oneAPI. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: IP Shapes 3D - Data Type: f32 - Engine: CPUrtx3080_1290k_21.05662.11323.16984.22645.283SE +/- 0.01580, N = 34.69622MIN: 4.111. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPUrtx3080_1290k_20.27430.54860.82291.09721.3715SE +/- 0.00504, N = 31.21899MIN: 0.91. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

ECP-CANDLE

The CANDLE benchmark codes implement deep learning architectures relevant to problems in cancer. These architectures address problems at different biological scales, specifically problems at the molecular, cellular and population scales. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterECP-CANDLE 0.4Benchmark: P1B2rtx3080_1290k_251015202520.74

Scikit-Learn

Scikit-learn is a Python module for machine learning Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterScikit-Learn 0.22.1rtx3080_1290k_2246810SE +/- 0.084, N = 36.153

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of Intel oneAPI. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPUrtx3080_1290k_2246810SE +/- 0.02261, N = 37.83715MIN: 6.971. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPUrtx3080_1290k_2246810SE +/- 0.03234, N = 37.93788MIN: 7.041. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

Numenta Anomaly Benchmark

Numenta Anomaly Benchmark (NAB) is a benchmark for evaluating algorithms for anomaly detection in streaming, real-time applications. It is comprised of over 50 labeled real-world and artificial timeseries data files plus a novel scoring mechanism designed for real-time applications. This test profile currently measures the time to run various detectors. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterNumenta Anomaly Benchmark 1.1Detector: Windowed Gaussianrtx3080_1290k_21.13722.27443.41164.54885.686SE +/- 0.044, N = 35.054

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of Intel oneAPI. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPUrtx3080_1290k_2246810SE +/- 0.08358, N = 36.04959MIN: 4.221. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.6Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPUrtx3080_1290k_20.55541.11081.66622.22162.777SE +/- 0.02051, N = 32.46833MIN: 1.861. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -lpthread -ldl

TNN

TNN is an open-source deep learning reasoning framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetterTNN 0.3Target: CPU - Model: SqueezeNet v2rtx3080_1290k_2918273645SE +/- 0.03, N = 339.02MIN: 38.04 / MAX: 44.21. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

Tensorflow

This is a benchmark of the Tensorflow deep learning framework using the CIFAR10 data set. Learn more via the OpenBenchmarking.org test page.

Build: Cifar10

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: AttributeError: module 'tensorflow' has no attribute 'app'

Mlpack Benchmark

Mlpack benchmark scripts for machine learning libraries Learn more via the OpenBenchmarking.org test page.

Benchmark: scikit_qda

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: TypeError: load_all() missing 1 required positional argument: 'Loader'

Benchmark: scikit_ica

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: TypeError: load_all() missing 1 required positional argument: 'Loader'

Benchmark: scikit_linearridgeregression

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: TypeError: load_all() missing 1 required positional argument: 'Loader'

Benchmark: scikit_svm

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: TypeError: load_all() missing 1 required positional argument: 'Loader'

OpenVINO

This is a test of the Intel OpenVINO, a toolkit around neural networks, using its built-in benchmarking support and analyzing the throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.

Model: Person Detection 0106 FP32 - Device: Intel GPU

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status.

Model: Face Detection 0106 FP16 - Device: Intel GPU

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status.

Model: Age Gender Recognition Retail 0013 FP32 - Device: Intel GPU

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status.

Model: Person Detection 0106 FP16 - Device: Intel GPU

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status.

Model: Age Gender Recognition Retail 0013 FP16 - Device: Intel GPU

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status.

Model: Face Detection 0106 FP32 - Device: Intel GPU

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status.

ONNX Runtime

ONNX Runtime is developed by Microsoft and partners as a open-source, cross-platform, high performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Zoo. Learn more via the OpenBenchmarking.org test page.

Model: yolov4 - Device: CPU - Executor: Parallel

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: onnxruntime/onnxruntime/test/onnx/onnx_model_info.cc:45 void OnnxModelInfo::InitOnnxModelInfo(const PATH_CHAR_TYPE*) open file "yolov4/yolov4.onnx" failed: No such file or directory

Model: yolov4 - Device: CPU - Executor: Standard

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: onnxruntime/onnxruntime/test/onnx/onnx_model_info.cc:45 void OnnxModelInfo::InitOnnxModelInfo(const PATH_CHAR_TYPE*) open file "yolov4/yolov4.onnx" failed: No such file or directory

Caffe

This is a benchmark of the Caffe deep learning framework and currently supports the AlexNet and Googlenet model and execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.

Model: AlexNet - Acceleration: CPU - Iterations: 1000

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: ./caffe: 3: ./tools/caffe: not found

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of Intel oneAPI. Learn more via the OpenBenchmarking.org test page.

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

rtx3080_1290k_2: The test run did not produce a result. The test run did not produce a result. The test run did not produce a result.

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

rtx3080_1290k_2: The test run did not produce a result. The test run did not produce a result. The test run did not produce a result.

Caffe

This is a benchmark of the Caffe deep learning framework and currently supports the AlexNet and Googlenet model and execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.

Model: AlexNet - Acceleration: CPU - Iterations: 200

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: ./caffe: 3: ./tools/caffe: not found

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of Intel oneAPI. Learn more via the OpenBenchmarking.org test page.

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

rtx3080_1290k_2: The test run did not produce a result. The test run did not produce a result. The test run did not produce a result.

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

rtx3080_1290k_2: The test run did not produce a result. The test run did not produce a result. The test run did not produce a result.

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

rtx3080_1290k_2: The test run did not produce a result. The test run did not produce a result. The test run did not produce a result.

Caffe

This is a benchmark of the Caffe deep learning framework and currently supports the AlexNet and Googlenet model and execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.

Model: GoogleNet - Acceleration: CPU - Iterations: 100

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: ./caffe: 3: ./tools/caffe: not found

Model: GoogleNet - Acceleration: CPU - Iterations: 200

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: ./caffe: 3: ./tools/caffe: not found

Model: AlexNet - Acceleration: CPU - Iterations: 100

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: ./caffe: 3: ./tools/caffe: not found

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of Intel oneAPI. Learn more via the OpenBenchmarking.org test page.

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

rtx3080_1290k_2: The test run did not produce a result. The test run did not produce a result. The test run did not produce a result.

Caffe

This is a benchmark of the Caffe deep learning framework and currently supports the AlexNet and Googlenet model and execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.

Model: GoogleNet - Acceleration: CPU - Iterations: 1000

rtx3080_1290k_2: The test quit with a non-zero exit status. The test quit with a non-zero exit status. The test quit with a non-zero exit status. E: ./caffe: 3: ./tools/caffe: not found

106 Results Shown

NCNN:
  Vulkan GPU - regnety_400m
  Vulkan GPU - squeezenet_ssd
  Vulkan GPU - yolov4-tiny
  Vulkan GPU - resnet50
  Vulkan GPU - alexnet
  Vulkan GPU - resnet18
  Vulkan GPU - vgg16
  Vulkan GPU - googlenet
  Vulkan GPU - blazeface
  Vulkan GPU - efficientnet-b0
  Vulkan GPU - mnasnet
  Vulkan GPU - shufflenet-v2
  Vulkan GPU-v3-v3 - mobilenet-v3
  Vulkan GPU-v2-v2 - mobilenet-v2
  Vulkan GPU - mobilenet
ONNX Runtime
Mobile Neural Network:
  inception-v3
  mobilenet-v1-1.0
  MobileNetV2_224
  SqueezeNetV1.0
  resnet-v2-50
  squeezenetv1.1
  mobilenetV3
LeelaChessZero
AI Benchmark Alpha:
  Device AI Score
  Device Training Score
  Device Inference Score
TensorFlow Lite
PlaidML
NCNN:
  CPU - regnety_400m
  CPU - squeezenet_ssd
  CPU - yolov4-tiny
  CPU - resnet50
  CPU - alexnet
  CPU - resnet18
  CPU - vgg16
  CPU - googlenet
  CPU - blazeface
  CPU - efficientnet-b0
  CPU - mnasnet
  CPU - shufflenet-v2
  CPU-v3-v3 - mobilenet-v3
  CPU-v2-v2 - mobilenet-v2
  CPU - mobilenet
Numenta Anomaly Benchmark
ECP-CANDLE
TNN
ECP-CANDLE
ONNX Runtime:
  fcn-resnet101-11 - CPU - Parallel
  ArcFace ResNet-100 - CPU - Parallel
  bertsquad-12 - CPU - Parallel
  GPT-2 - CPU - Parallel
  ArcFace ResNet-100 - CPU - Standard
  GPT-2 - CPU - Standard
  bertsquad-12 - CPU - Standard
  super-resolution-10 - CPU - Parallel
  super-resolution-10 - CPU - Standard
Numpy Benchmark
PlaidML
oneDNN:
  Recurrent Neural Network Training - f32 - CPU
  Recurrent Neural Network Training - u8s8f32 - CPU
  Recurrent Neural Network Training - bf16bf16bf16 - CPU
  Recurrent Neural Network Inference - bf16bf16bf16 - CPU
  Recurrent Neural Network Inference - u8s8f32 - CPU
  Recurrent Neural Network Inference - f32 - CPU
  IP Shapes 1D - f32 - CPU
Numenta Anomaly Benchmark
OpenVINO:
  Face Detection 0106 FP16 - CPU:
    ms
    FPS
  Person Detection 0106 FP16 - CPU:
    ms
    FPS
  Face Detection 0106 FP32 - CPU:
    ms
    FPS
  Person Detection 0106 FP32 - CPU:
    ms
    FPS
TensorFlow Lite:
  Inception V4
  Inception ResNet V2
  NASNet Mobile
  Mobilenet Float
  Mobilenet Quant
OpenVINO:
  Age Gender Recognition Retail 0013 FP16 - CPU:
    ms
    FPS
  Age Gender Recognition Retail 0013 FP32 - CPU:
    ms
    FPS
Numenta Anomaly Benchmark
DeepSpeech
oneDNN:
  Deconvolution Batch shapes_1d - f32 - CPU
  Deconvolution Batch shapes_1d - u8s8f32 - CPU
R Benchmark
Numenta Anomaly Benchmark
oneDNN:
  Matrix Multiply Batch Shapes Transformer - f32 - CPU
  IP Shapes 1D - u8s8f32 - CPU
RNNoise
TNN
oneDNN
TNN
oneDNN:
  IP Shapes 3D - f32 - CPU
  IP Shapes 3D - u8s8f32 - CPU
ECP-CANDLE
Scikit-Learn
oneDNN:
  Convolution Batch Shapes Auto - f32 - CPU
  Convolution Batch Shapes Auto - u8s8f32 - CPU
Numenta Anomaly Benchmark
oneDNN:
  Deconvolution Batch shapes_3d - f32 - CPU
  Deconvolution Batch shapes_3d - u8s8f32 - CPU
TNN