hpc-xeon

2 x Intel Xeon Platinum 8380 testing with an Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED graphics on Ubuntu 20.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2105039-IB-HPCXEON8261

This result file includes tests within the following OpenBenchmarking.org categories:

CPU Massive (3 tests)
Creator Workloads (2 tests)
Fortran Tests (3 tests)
HPC - High Performance Computing (15 tests)
Machine Learning (7 tests)
MPI Benchmarks (3 tests)
Multi-Core (4 tests)
NVIDIA GPU Compute (2 tests)
OpenMPI Tests (5 tests)
Python Tests (6 tests)
Scientific Computing (5 tests)


Runs

Run  Date         Test Duration
1    May 02 2021  18 Minutes
1a   May 02 2021  13 Hours, 38 Minutes
2    May 02 2021  7 Hours, 50 Minutes
2a   May 02 2021  8 Hours, 49 Minutes
4    May 03 2021  14 Hours, 22 Minutes
Average Test Duration: 9 Hours



hpc-xeon System Configuration (identical for runs 1, 1a, 2, 2a, 4)

Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads)
Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS)
Chipset: Intel Device 0998
Memory: 16 x 32 GB DDR4-3200MT/s Hynix HMA84GR7CJR4N-XN
Disk: 2 x 7682GB INTEL SSDPF2KX076TZ + 2 x 800GB INTEL SSDPF21Q800GB + 3841GB Micron_9300_MTFDHAL3T8TDP + 960GB INTEL SSDSC2KG96
Graphics: ASPEED
Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 20.04
Kernel: 5.11.0-051100-generic (x86_64)
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1024x768

Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Details: 1, 2, 2a, 4: NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Details: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd000270
Python Details: Python 2.7.18 + Python 3.8.5
Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected


IOR

IOR is a parallel I/O storage benchmark making use of MPI with a particular focus on HPC (High Performance Computing) systems. IOR is developed at the Lawrence Livermore National Laboratory (LLNL). Learn more via the OpenBenchmarking.org test page.
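In an IOR run, each MPI rank reads and writes its own blocks of a target and the aggregate bandwidth is reported. Purely as a hedged sketch of that access pattern in mpi4py (not IOR itself; the file name is a placeholder and the 2 MB block mirrors the first test below):

    # Launch under MPI, e.g.: mpirun -np 4 python ior_sketch.py
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    BLOCK = 2 * 1024 * 1024          # 2 MB per rank, mirroring the 2MB test
    buf = bytearray(BLOCK)           # zero-filled payload for this rank

    fh = MPI.File.Open(comm, "ior_scratch.dat",
                       MPI.MODE_CREATE | MPI.MODE_RDWR)
    comm.Barrier()
    start = MPI.Wtime()
    fh.Write_at(rank * BLOCK, buf)   # each rank writes at its own offset
    comm.Barrier()
    elapsed = MPI.Wtime() - start
    fh.Close()

    if rank == 0:
        total_mb = comm.Get_size() * BLOCK / 1e6
        print(f"aggregate write: {total_mb / elapsed:.1f} MB/s")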

IOR 3.3.0 - Block Size: 2MB - Disk Target: Default Test Directory (MB/s, more is better)
  1:  356.14  (SE +/- 3.31, N = 15; MIN 304.39 / MAX 641.59)
  2:  358.60  (SE +/- 1.88, N = 3; MIN 316.53 / MAX 627.12)
  2a: 363.19  (SE +/- 4.82, N = 3; MIN 314.55 / MAX 619.81)
  4:  359.89  (SE +/- 2.90, N = 3; MIN 313.46 / MAX 610.7)
  (CC) gcc options: -O2 -lm -pthread -lmpi

IOR 3.3.0 - Block Size: 4MB - Disk Target: Default Test Directory (MB/s, more is better)
  1:  343.81  (SE +/- 2.36, N = 3; MIN 284.82 / MAX 637.37)
  2:  362.28  (SE +/- 2.35, N = 14; MIN 299.19 / MAX 670.93)
  2a: 357.00  (SE +/- 1.97, N = 3; MIN 303.98 / MAX 645.01)
  4:  365.79  (SE +/- 2.06, N = 3; MIN 305.34 / MAX 640.45)
  (CC) gcc options: -O2 -lm -pthread -lmpi

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient benchmark, a newer scientific benchmark from Sandia National Labs aimed at supercomputer testing with more modern, real-world workloads than HPCC. Learn more via the OpenBenchmarking.org test page.
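The kernel HPCG times is the conjugate gradient iteration: sparse matrix-vector products, dot products, and vector updates. For orientation only, a plain unpreconditioned CG in numpy (far simpler than HPCG's multigrid-preconditioned solver) looks like this:

    import numpy as np

    def conjugate_gradient(A, b, tol=1e-8, max_iter=1000):
        # Solve A x = b for a symmetric positive-definite A.
        x = np.zeros_like(b)
        r = b - A @ x                  # initial residual
        p = r.copy()                   # initial search direction
        rs = r @ r
        for _ in range(max_iter):
            Ap = A @ p                 # the matrix-vector product dominating HPCG
            alpha = rs / (p @ Ap)
            x += alpha * p
            r -= alpha * Ap
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:
                break
            p = r + (rs_new / rs) * p  # conjugate direction update
            rs = rs_new
        return x

    # Tiny SPD example system
    A = np.array([[4.0, -1.0], [-1.0, 4.0]])
    b = np.array([1.0, 2.0])
    print(conjugate_gradient(A, b))    # -> approximately [0.4, 0.6]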

High Performance Conjugate Gradient 3.1 (GFLOP/s, more is better)
  1a: 24.27  (SE +/- 0.05, N = 3)
  2:  24.25  (SE +/- 0.10, N = 3)
  2a: 24.22  (SE +/- 0.07, N = 3)
  4:  24.22  (SE +/- 0.06, N = 3)
  (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi

Parboil

The Parboil Benchmarks from the IMPACT Research Group at the University of Illinois are a set of throughput computing applications for studying computing architectures and compilers. Parboil test cases support OpenMP, OpenCL, and CUDA multi-processing environments; at this time, however, the test profile makes use of just the OpenMP and OpenCL test workloads. Learn more via the OpenBenchmarking.org test page.

Parboil 2.5 - Test: OpenMP LBM (Seconds, fewer is better)
  1a: 13.41  (SE +/- 0.11, N = 3)
  2:  13.13  (SE +/- 0.13, N = 12)
  2a: 13.39  (SE +/- 0.04, N = 3)
  4:  12.95  (SE +/- 0.12, N = 15)
  (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Parboil 2.5 - Test: OpenMP CUTCP (Seconds, fewer is better)
  1a: 1.409518  (SE +/- 0.015811, N = 4)
  2:  1.430858  (SE +/- 0.015699, N = 5)
  2a: 1.438047  (SE +/- 0.011700, N = 9)
  4:  1.407187  (SE +/- 0.016684, N = 3)
  (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Parboil 2.5 - Test: OpenMP Stencil (Seconds, fewer is better)
  1a: 1.653598  (SE +/- 0.011350, N = 3)
  2:  1.644848  (SE +/- 0.010033, N = 3)
  2a: 1.671065  (SE +/- 0.006902, N = 3)
  4:  1.664530  (SE +/- 0.001988, N = 3)
  (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

Parboil 2.5 - Test: OpenMP MRI Gridding (Seconds, fewer is better)
  1a: 415.58  (SE +/- 20.24, N = 6)
  2:  458.38  (SE +/- 17.63, N = 6)
  2a: 443.17  (SE +/- 16.35, N = 9)
  4:  408.71  (SE +/- 8.08, N = 9)
  (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

miniFE

miniFE is a finite element mini-application serving as a proxy for unstructured implicit finite element codes. Learn more via the OpenBenchmarking.org test page.

miniFE 2.2 - Problem Size: Small (CG Mflops, more is better)
  1a: 16991.1  (SE +/- 154.34, N = 15)
  2:  17878.6  (SE +/- 343.64, N = 15)
  2a: 17815.5  (SE +/- 313.32, N = 15)
  4:  17495.6  (SE +/- 298.66, N = 15)
  (CXX) g++ options: -O3 -fopenmp -pthread -lmpi_cxx -lmpi

Nebular Empirical Analysis Tool

NEAT is the Nebular Empirical Analysis Tool for empirical analysis of ionised nebulae, with uncertainty propagation. Learn more via the OpenBenchmarking.org test page.

Nebular Empirical Analysis Tool 2020-02-29 (Seconds, fewer is better)
  1a: 22.92  (SE +/- 0.21, N = 15)
  2:  22.37  (SE +/- 0.12, N = 3)
  2a: 22.91  (SE +/- 0.25, N = 15)
  4:  23.15  (SE +/- 0.30, N = 15)
  (F9X) gfortran options: -cpp -ffree-line-length-0 -Jsource/ -fopenmp -O3 -fno-backtrace

Monte Carlo Simulations of Ionised Nebulae

MOCASSIN (Monte Carlo Simulations of Ionised Nebulae) is a fully 3D or 2D photoionisation and dust radiative transfer code which employs a Monte Carlo approach to the transfer of radiation through media of arbitrary geometry and density distribution. Learn more via the OpenBenchmarking.org test page.

Monte Carlo Simulations of Ionised Nebulae 2019-03-24 - Input: Dust 2D tau100.0 (Seconds, fewer is better)
  1a: 193
  2:  194
  2a: 193
  4:  194
  SE +/- 0.58, N = 3
  (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O3 -O2 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

ArrayFire

ArrayFire is a GPU and CPU numeric processing library; this test uses the built-in CPU and OpenCL ArrayFire benchmarks. Learn more via the OpenBenchmarking.org test page.
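The BLAS CPU figure below is dense matrix-multiply throughput. A hedged sketch of measuring the same thing through ArrayFire's Python bindings, assuming the arrayfire package with its CPU backend is installed (the 4096 matrix size is an arbitrary choice, not necessarily the test profile's):

    import time
    import arrayfire as af

    af.set_backend("cpu")            # match the "BLAS CPU" configuration

    n = 4096                         # placeholder problem size
    a = af.randu(n, n)               # random single-precision matrices
    b = af.randu(n, n)

    af.sync()                        # drain any pending work before timing
    start = time.time()
    c = af.matmul(a, b)
    af.sync()                        # ArrayFire evaluates asynchronously
    elapsed = time.time() - start

    print(f"{2.0 * n**3 / elapsed / 1e9:.1f} GFLOPS")  # ~2*n^3 flops per GEMM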

ArrayFire 3.7 - Test: BLAS CPU (GFLOPS, more is better)
  1a: 4730.45  (SE +/- 13.93, N = 3)
  2:  4673.13  (SE +/- 9.91, N = 3)
  2a: 4678.03  (SE +/- 13.38, N = 3)
  4:  4684.87  (SE +/- 19.25, N = 3)
  (CXX) g++ options: -rdynamic

ArrayFire 3.7 - Test: Conjugate Gradient CPU (ms, fewer is better)
  1a: 4.101  (SE +/- 0.035, N = 3)
  2:  4.696  (SE +/- 0.122, N = 15)
  2a: 4.759  (SE +/- 0.132, N = 15)
  4:  4.370  (SE +/- 0.148, N = 12)
  (CXX) g++ options: -rdynamic

DeepSpeech

Mozilla DeepSpeech is a speech-to-text engine powered by TensorFlow for machine learning and derived from Baidu's Deep Speech research paper. This test profile times the speech-to-text process for a roughly three-minute audio recording. Learn more via the OpenBenchmarking.org test page.

DeepSpeech 0.6 - Acceleration: CPU (Seconds, fewer is better)
  1a: 240.08  (SE +/- 9.43, N = 12)
  2:  179.72  (SE +/- 0.46, N = 3)
  2a: 182.43  (SE +/- 1.38, N = 12)
  4:  186.67  (SE +/- 4.30, N = 12)

Darmstadt Automotive Parallel Heterogeneous Suite

DAPHNE is the Darmstadt Automotive Parallel HeterogeNEous Benchmark Suite, with OpenCL / CUDA / OpenMP test cases for evaluating programming models in the context of vehicle autonomous driving capabilities. Learn more via the OpenBenchmarking.org test page.

Darmstadt Automotive Parallel Heterogeneous Suite - Backend: OpenMP - Kernel: NDT Mapping (Test Cases Per Minute, more is better)
  1a: 433.03  (SE +/- 5.41, N = 3)
  2:  421.38  (SE +/- 5.68, N = 15)
  2a: 405.38  (SE +/- 7.34, N = 15)
  4:  440.88  (SE +/- 4.85, N = 4)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp

Darmstadt Automotive Parallel Heterogeneous Suite - Backend: OpenMP - Kernel: Points2Image (Test Cases Per Minute, more is better)
  1a: 6515.63  (SE +/- 66.31, N = 12)
  2:  6616.65  (SE +/- 71.05, N = 12)
  2a: 6393.69  (SE +/- 76.80, N = 4)
  4:  6495.99  (SE +/- 71.06, N = 12)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp

Darmstadt Automotive Parallel Heterogeneous Suite - Backend: OpenMP - Kernel: Euclidean Cluster (Test Cases Per Minute, more is better)
  1a: 442.63  (SE +/- 8.55, N = 12)
  2:  441.37  (SE +/- 8.80, N = 15)
  2a: 432.65  (SE +/- 11.68, N = 12)
  4:  430.14  (SE +/- 3.31, N = 3)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp

GNU Octave Benchmark

This test profile measures how long it takes to complete several reference GNU Octave files via octave-benchmark. GNU Octave is used for numerical computations and is an open-source alternative to MATLAB. Learn more via the OpenBenchmarking.org test page.

GNU Octave Benchmark 5.2.0 (Seconds, fewer is better)
  1a: 13.13  (SE +/- 0.10, N = 25)
  2:  12.93  (SE +/- 0.14, N = 25)
  2a: 12.82  (SE +/- 0.14, N = 20)
  4:  13.26  (SE +/- 0.11, N = 25)

NCNN

NCNN is a high-performance neural network inference framework, developed by Tencent, that is optimized for mobile and other platforms. Learn more via the OpenBenchmarking.org test page.

NCNN 20201218 - Target: CPU - Model: mobilenet (ms, fewer is better)
  1a: 41.39  (SE +/- 1.21, N = 12; MIN 37.92 / MAX 77.52)
  2:  43.17  (SE +/- 1.67, N = 9; MIN 37.53 / MAX 168.48)
  2a: 39.28  (SE +/- 0.04, N = 3; MIN 38.42 / MAX 68.14)
  4:  41.05  (SE +/- 0.95, N = 13; MIN 37.92 / MAX 117.92)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, fewer is better)
  1a: 18.22  (SE +/- 0.56, N = 12; MIN 16.61 / MAX 93.24)
  2:  18.17  (SE +/- 0.60, N = 9; MIN 16.66 / MAX 255.58)
  2a: 17.11  (SE +/- 0.07, N = 3; MIN 16.72 / MAX 38.78)
  4:  18.00  (SE +/- 0.44, N = 13; MIN 16.69 / MAX 240.12)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, fewer is better)
  1a: 14.42  (SE +/- 0.39, N = 12; MIN 13.27 / MAX 61.57)
  2:  14.33  (SE +/- 0.38, N = 9; MIN 13.43 / MAX 22.53)
  2a: 13.93  (SE +/- 0.06, N = 3; MIN 13.42 / MAX 42.9)
  4:  14.68  (SE +/- 0.43, N = 13; MIN 13.4 / MAX 403.86)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: shufflenet-v2 (ms, fewer is better)
  1a: 10.34  (SE +/- 0.22, N = 12; MIN 9.78 / MAX 30.48)
  2:  10.82  (SE +/- 0.39, N = 9; MIN 9.82 / MAX 284.11)
  2a: 10.01  (SE +/- 0.03, N = 3; MIN 9.84 / MAX 11.17)
  4:  10.27  (SE +/- 0.15, N = 13; MIN 9.74 / MAX 37.86)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: mnasnet (ms, fewer is better)
  1a: 16.08  (SE +/- 0.43, N = 12; MIN 14.77 / MAX 63.14)
  2:  15.71  (SE +/- 0.16, N = 9; MIN 14.8 / MAX 118.47)
  2a: 15.18  (SE +/- 0.06, N = 3; MIN 14.91 / MAX 42.82)
  4:  15.74  (SE +/- 0.27, N = 13; MIN 14.84 / MAX 49.53)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: efficientnet-b0 (ms, fewer is better)
  1a: 19.28  (SE +/- 0.53, N = 12; MIN 17.82 / MAX 54.47)
  2:  20.15  (SE +/- 0.83, N = 9; MIN 17.74 / MAX 131.21)
  2a: 18.22  (SE +/- 0.03, N = 3; MIN 17.86 / MAX 36.46)
  4:  19.14  (SE +/- 0.38, N = 13; MIN 17.72 / MAX 51.28)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: blazeface (ms, fewer is better)
  1a: 5.96  (SE +/- 0.16, N = 12; MIN 5.47 / MAX 8.13)
  2:  6.11  (SE +/- 0.21, N = 9; MIN 5.43 / MAX 33.62)
  2a: 5.79  (SE +/- 0.21, N = 3; MIN 5.5 / MAX 7.01)
  4:  5.77  (SE +/- 0.11, N = 13; MIN 5.48 / MAX 33.8)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: googlenet (ms, fewer is better)
  1a: 30.60  (SE +/- 1.15, N = 12; MIN 25.89 / MAX 363.03)
  2:  30.38  (SE +/- 1.67, N = 9; MIN 26.25 / MAX 69.1)
  2a: 28.40  (SE +/- 1.36, N = 3; MIN 26.44 / MAX 59.77)
  4:  29.62  (SE +/- 1.00, N = 13; MIN 26.59 / MAX 97.53)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: vgg16 (ms, fewer is better)
  1a: 50.93  (SE +/- 0.90, N = 12; MIN 44.34 / MAX 470.99)
  2:  51.92  (SE +/- 1.35, N = 9; MIN 45.29 / MAX 485.69)
  2a: 50.10  (SE +/- 0.86, N = 3; MIN 46.01 / MAX 368.47)
  4:  52.95  (SE +/- 0.97, N = 13; MIN 44.1 / MAX 1134.63)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: resnet18 (ms, fewer is better)
  1a: 18.18  (SE +/- 0.60, N = 12; MIN 15.63 / MAX 50.57)
  2:  18.02  (SE +/- 0.79, N = 9; MIN 15.61 / MAX 52.43)
  2a: 16.78  (SE +/- 0.61, N = 3; MIN 15.83 / MAX 38.93)
  4:  17.42  (SE +/- 0.52, N = 13; MIN 15.8 / MAX 53.79)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: alexnet (ms, fewer is better)
  1a: 8.65  (SE +/- 0.24, N = 12; MIN 7.58 / MAX 29.2)
  2:  8.62  (SE +/- 0.34, N = 9; MIN 7.59 / MAX 41.13)
  2a: 8.27  (SE +/- 0.61, N = 3; MIN 7.56 / MAX 10.65)
  4:  8.89  (SE +/- 0.27, N = 13; MIN 7.58 / MAX 47.92)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: resnet50 (ms, fewer is better)
  1a: 47.38  (SE +/- 0.97, N = 12; MIN 40.86 / MAX 304.03)
  2:  48.24  (SE +/- 1.43, N = 9; MIN 40.1 / MAX 658.14)
  2a: 41.65  (SE +/- 0.06, N = 3; MIN 40.83 / MAX 76.33)
  4:  46.98  (SE +/- 1.26, N = 13; MIN 40.68 / MAX 309.92)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: yolov4-tiny (ms, fewer is better)
  1a: 41.38  (SE +/- 1.35, N = 12; MIN 35.51 / MAX 1384.91)
  2:  42.91  (SE +/- 1.43, N = 9; MIN 35.84 / MAX 953.59)
  2a: 40.14  (SE +/- 3.34, N = 3; MIN 35.96 / MAX 676.26)
  4:  40.37  (SE +/- 1.29, N = 13; MIN 36.01 / MAX 1384.64)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: squeezenet_ssd (ms, fewer is better)
  1a: 32.96  (SE +/- 0.91, N = 12; MIN 30.31 / MAX 200.45)
  2:  34.73  (SE +/- 1.38, N = 9; MIN 29.98 / MAX 260.21)
  2a: 30.90  (SE +/- 0.05, N = 3; MIN 30.39 / MAX 61.66)
  4:  33.16  (SE +/- 0.94, N = 13; MIN 30.14 / MAX 141.88)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: regnety_400m (ms, fewer is better)
  1a: 90.90  (SE +/- 0.56, N = 12; MIN 87.44 / MAX 198.82)
  2:  91.14  (SE +/- 0.96, N = 9; MIN 86.2 / MAX 160.68)
  2a: 89.61  (SE +/- 0.70, N = 3; MIN 86.83 / MAX 232.99)
  4:  90.84  (SE +/- 0.70, N = 13; MIN 85.68 / MAX 182.76)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

OpenVINO

This is a test of Intel's OpenVINO, a toolkit around neural networks, using its built-in benchmarking support to analyze the throughput and latency of various models. Learn more via the OpenBenchmarking.org test page.
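The results come from OpenVINO's built-in benchmark tooling; for orientation, a minimal inference call through the 2021-era Inference Engine Python API might look roughly like the following (the IR file names and dummy input are placeholders, and newer OpenVINO releases use a different API):

    import numpy as np
    from openvino.inference_engine import IECore   # OpenVINO 2021.x API

    ie = IECore()
    net = ie.read_network(model="face-detection-0106.xml",     # placeholder IR
                          weights="face-detection-0106.bin")
    exec_net = ie.load_network(network=net, device_name="CPU")

    input_name = next(iter(net.input_info))
    shape = net.input_info[input_name].input_data.shape        # e.g. [1, 3, H, W]
    dummy = np.zeros(shape, dtype=np.float32)                  # stand-in image

    result = exec_net.infer({input_name: dummy})
    print({name: out.shape for name, out in result.items()})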

OpenVINO 2021.1 - Model: Face Detection 0106 FP16 - Device: CPU (FPS, more is better)
  1a: 18.76  (SE +/- 0.09, N = 3)
  2:  18.79  (SE +/- 0.11, N = 3)
  2a: 18.78  (SE +/- 0.12, N = 3)
  4:  18.83  (SE +/- 0.03, N = 3)
  (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenVINO 2021.1 - Model: Face Detection 0106 FP16 - Device: CPU (ms, fewer is better)
  1a: 2100.58  (SE +/- 4.10, N = 3)
  2:  2097.44  (SE +/- 8.24, N = 3)
  2a: 2102.66  (SE +/- 8.74, N = 3)
  4:  2100.78  (SE +/- 1.04, N = 3)
  (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenVINO 2021.1 - Model: Face Detection 0106 FP32 - Device: CPU (FPS, more is better)
  1a: 18.46  (SE +/- 0.10, N = 3)
  2:  18.44  (SE +/- 0.07, N = 3)
  2a: 18.45  (SE +/- 0.10, N = 3)
  4:  18.40  (SE +/- 0.02, N = 3)
  (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenVINO 2021.1 - Model: Face Detection 0106 FP32 - Device: CPU (ms, fewer is better)
  1a: 2131.30  (SE +/- 5.79, N = 3)
  2:  2138.44  (SE +/- 2.13, N = 3)
  2a: 2136.83  (SE +/- 5.92, N = 3)
  4:  2138.20  (SE +/- 1.18, N = 3)
  (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenVINO 2021.1 - Model: Person Detection 0106 FP16 - Device: CPU (FPS, more is better)
  1a: 10.89  (SE +/- 0.05, N = 3)
  2:  11.02  (SE +/- 0.02, N = 3)
  2a: 11.02  (SE +/- 0.04, N = 3)
  4:  11.00  (SE +/- 0.03, N = 3)
  (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenVINO 2021.1 - Model: Person Detection 0106 FP16 - Device: CPU (ms, fewer is better)
  1a: 3567.46  (SE +/- 15.45, N = 3)
  2:  3524.64  (SE +/- 6.11, N = 3)
  2a: 3538.09  (SE +/- 12.83, N = 3)
  4:  3533.40  (SE +/- 12.37, N = 3)
  (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenVINO 2021.1 - Model: Person Detection 0106 FP32 - Device: CPU (FPS, more is better)
  1a: 10.97  (SE +/- 0.05, N = 3)
  2a: 10.91  (SE +/- 0.03, N = 3)
  4:  10.84  (SE +/- 0.03, N = 3)
  (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenVINO 2021.1 - Model: Person Detection 0106 FP32 - Device: CPU (ms, fewer is better)
  1a: 3555.17  (SE +/- 9.50, N = 3)
  2a: 3563.66  (SE +/- 1.74, N = 3)
  4:  3585.49  (SE +/- 5.28, N = 3)
  (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenVINO 2021.1 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, more is better)
  1a: 43169.99  (SE +/- 109.04, N = 3)
  2a: 30539.47  (SE +/- 274.91, N = 3)
  4:  30845.88  (SE +/- 320.36, N = 3)
  (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenVINO 2021.1 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, fewer is better)
  1a: 0.79  (SE +/- 0.00, N = 3)
  2a: 0.96  (SE +/- 0.01, N = 3)
  4:  0.96  (SE +/- 0.01, N = 3)
  (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenVINO 2021.1 - Model: Age Gender Recognition Retail 0013 FP32 - Device: CPU (FPS, more is better)
  1a: 42475.84  (SE +/- 114.87, N = 3)
  2a: 42412.47  (SE +/- 56.80, N = 3)
  4:  42558.09  (SE +/- 63.17, N = 3)
  (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

OpenVINO 2021.1 - Model: Age Gender Recognition Retail 0013 FP32 - Device: CPU (ms, fewer is better)
  1a: 0.8  (SE +/- 0.00, N = 3)
  2a: 0.8  (SE +/- 0.00, N = 3)
  4:  0.8  (SE +/- 0.00, N = 3)
  (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread

ONNX Runtime

ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Model Zoo. Learn more via the OpenBenchmarking.org test page.
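For reference, loading one of these models through the onnxruntime Python API uses the same session-and-run pattern the benchmark exercises; the model file below is a placeholder:

    import numpy as np
    import onnxruntime as ort

    # Model path is a placeholder for any ONNX Model Zoo download.
    sess = ort.InferenceSession("super-resolution-10.onnx")

    inp = sess.get_inputs()[0]
    # Substitute 1 for any symbolic/dynamic dimensions in the declared shape.
    shape = [d if isinstance(d, int) else 1 for d in inp.shape]
    x = np.random.rand(*shape).astype(np.float32)

    outputs = sess.run(None, {inp.name: x})   # None = fetch all outputs
    print([o.shape for o in outputs])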

ONNX Runtime 1.6 - Model: yolov4 - Device: OpenMP CPU (Inferences Per Minute, more is better)
  1a: 305  (SE +/- 1.44, N = 3)
  2a: 301  (SE +/- 3.79, N = 3)
  4:  300  (SE +/- 3.91, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

ONNX Runtime 1.6 - Model: bertsquad-10 - Device: OpenMP CPU (Inferences Per Minute, more is better)
  1a: 441  (SE +/- 3.33, N = 3)
  2a: 443  (SE +/- 5.39, N = 3)
  4:  434  (SE +/- 7.23, N = 12)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

ONNX Runtime 1.6 - Model: fcn-resnet101-11 - Device: OpenMP CPU (Inferences Per Minute, more is better)
  1a: 114  (SE +/- 0.29, N = 3)
  2a: 113  (SE +/- 0.76, N = 3)
  4:  113  (SE +/- 0.44, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

ONNX Runtime 1.6 - Model: shufflenet-v2-10 - Device: OpenMP CPU (Inferences Per Minute, more is better)
  1a: 7399  (SE +/- 28.82, N = 3)
  2a: 7328  (SE +/- 86.98, N = 3)
  4:  7393  (SE +/- 16.54, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

ONNX Runtime 1.6 - Model: super-resolution-10 - Device: OpenMP CPU (Inferences Per Minute, more is better)
  1a: 6153  (SE +/- 234.15, N = 12)
  2a: 6771  (SE +/- 45.37, N = 3)
  4:  6810  (SE +/- 22.93, N = 3)
  (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

ECP-CANDLE

The CANDLE benchmark codes implement deep learning architectures relevant to problems in cancer. These architectures address problems at different biological scales, specifically problems at the molecular, cellular and population scales. Learn more via the OpenBenchmarking.org test page.

ECP-CANDLE 0.3 - Benchmark: P1B2 (Seconds, fewer is better)
  1a: 47.09
  2a: 44.98
  4:  45.52

ECP-CANDLE 0.3 - Benchmark: P3B1 (Seconds, fewer is better)
  1a: 1282.47
  2a: 1292.22
  4:  1281.03

ECP-CANDLE 0.3 - Benchmark: P3B2 (Seconds, fewer is better)
  1a: 3201.12
  2a: 3185.07
  4:  3188.63

AI Benchmark Alpha

AI Benchmark Alpha is a Python library for evaluating artificial intelligence (AI) performance on diverse hardware platforms and relies upon the TensorFlow machine learning library. Learn more via the OpenBenchmarking.org test page.
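Assuming the ai-benchmark pip package is installed alongside TensorFlow, a run reduces to the snippet below; the library itself prints the inference, training, and combined AI scores as it executes:

    from ai_benchmark import AIBenchmark   # pip package "ai-benchmark"

    # run() executes the inference and training suites and prints the
    # Device Inference / Device Training / Device AI scores as it goes.
    results = AIBenchmark().run()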

AI Benchmark Alpha 0.1.2 - Device Inference Score (Score, more is better)
  1a: 1139
  2a: 1146
  4:  1135

AI Benchmark Alpha 0.1.2 - Device Training Score (Score, more is better)
  1a: 572
  2a: 582
  4:  577

AI Benchmark Alpha 0.1.2 - Device AI Score (Score, more is better)
  1a: 1711
  2a: 1728
  4:  1712

Mlpack Benchmark

Mlpack benchmark scripts for timing machine learning libraries. Learn more via the OpenBenchmarking.org test page.
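The scikit_* cases appear to time scikit-learn implementations (ICA, QDA, SVM, linear ridge regression). As a hedged sketch of what the ICA case measures, here is a FastICA fit timed on synthetic data with arbitrary placeholder sizes:

    import time
    import numpy as np
    from sklearn.decomposition import FastICA

    rng = np.random.RandomState(0)
    X = rng.standard_normal((10000, 32))       # placeholder mixed-signal matrix

    start = time.time()
    FastICA(n_components=16, random_state=0).fit(X)
    print(f"scikit_ica-style fit time: {time.time() - start:.2f} s")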

Mlpack Benchmark - Benchmark: scikit_ica (Seconds, fewer is better)
  1a: 65.57  (SE +/- 0.50, N = 15)
  2a: 69.04  (SE +/- 0.81, N = 4)
  4:  66.04  (SE +/- 0.88, N = 15)

Mlpack Benchmark - Benchmark: scikit_qda (Seconds, fewer is better)
  1a: 30.66  (SE +/- 0.02, N = 3)
  2a: 33.88  (SE +/- 0.42, N = 3)
  4:  32.57  (SE +/- 0.30, N = 3)

Mlpack Benchmark - Benchmark: scikit_svm (Seconds, fewer is better)
  1a: 31.47  (SE +/- 0.32, N = 5)
  2a: 31.13  (SE +/- 0.23, N = 15)
  4:  32.12  (SE +/- 0.22, N = 13)

Mlpack Benchmark - Benchmark: scikit_linearridgeregression (Seconds, fewer is better)
  1a: 3.50  (SE +/- 0.04, N = 15)
  2a: 3.50  (SE +/- 0.07, N = 15)
  4:  3.41  (SE +/- 0.05, N = 12)

59 Results Shown

IOR:
  2MB - Default Test Directory
  4MB - Default Test Directory
High Performance Conjugate Gradient
Parboil:
  OpenMP LBM
  OpenMP CUTCP
  OpenMP Stencil
  OpenMP MRI Gridding
miniFE
Nebular Empirical Analysis Tool
Monte Carlo Simulations of Ionised Nebulae
ArrayFire:
  BLAS CPU
  Conjugate Gradient CPU
DeepSpeech
Darmstadt Automotive Parallel Heterogeneous Suite:
  OpenMP - NDT Mapping
  OpenMP - Points2Image
  OpenMP - Euclidean Cluster
GNU Octave Benchmark
NCNN:
  CPU - mobilenet
  CPU-v2-v2 - mobilenet-v2
  CPU-v3-v3 - mobilenet-v3
  CPU - shufflenet-v2
  CPU - mnasnet
  CPU - efficientnet-b0
  CPU - blazeface
  CPU - googlenet
  CPU - vgg16
  CPU - resnet18
  CPU - alexnet
  CPU - resnet50
  CPU - yolov4-tiny
  CPU - squeezenet_ssd
  CPU - regnety_400m
OpenVINO:
  Face Detection 0106 FP16 - CPU:
    FPS
    ms
  Face Detection 0106 FP32 - CPU:
    FPS
    ms
  Person Detection 0106 FP16 - CPU:
    FPS
    ms
  Person Detection 0106 FP32 - CPU:
    FPS
    ms
  Age Gender Recognition Retail 0013 FP16 - CPU:
    FPS
    ms
  Age Gender Recognition Retail 0013 FP32 - CPU:
    FPS
    ms
ONNX Runtime:
  yolov4 - OpenMP CPU
  bertsquad-10 - OpenMP CPU
  fcn-resnet101-11 - OpenMP CPU
  shufflenet-v2-10 - OpenMP CPU
  super-resolution-10 - OpenMP CPU
ECP-CANDLE:
  P1B2
  P3B1
  P3B2
AI Benchmark Alpha:
  Device Inference Score
  Device Training Score
  Device AI Score
Mlpack Benchmark:
  scikit_ica
  scikit_qda
  scikit_svm
  scikit_linearridgeregression