AMD EPYC 9754 Bergamo AVX-512

AMD EPYC 9754 1P benchmarks with AVX-512 enabled and then with AVX-512 disabled. Tests by Michael Larabel for a future article.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2307197-NE-AMDBERGAM43

Run Management

Result Identifier; Date; Test Run Duration
AVX512 On; July 16 2023; 7 Hours, 54 Minutes
AVX512 Off; July 16 2023; 11 Hours, 27 Minutes
Average test run duration: 9 Hours, 40 Minutes


AMD EPYC 9754 Bergamo AVX-512 Benchmarks

Processor: AMD EPYC 9754 128-Core @ 2.25GHz (128 Cores / 256 Threads)
Motherboard: AMD Titanite_4G (RTI1007B BIOS)
Chipset: AMD Device 14a4
Memory: 768GB
Disk: 2 x 1920GB SAMSUNG MZWLJ1T9HBJR-00007
Graphics: ASPEED
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 22.04
Kernel: 5.19.0-41-generic (x86_64)
Desktop: GNOME Shell 42.5
Display Server: X Server 1.21.1.4
Vulkan: 1.3.224
Compiler: GCC 11.3.0
File-System: ext4
Screen Resolution: 1024x768

System Logs
- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
- CPU Microcode: 0xaa0010b
- Python 3.10.6
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

AVX512 On vs. AVX512 Off Comparison (percent difference from baseline, in chart order)

TensorFlow: CPU - 512 - GoogLeNet: 789.3%
TensorFlow: CPU - 64 - AlexNet: 778.7%
TensorFlow: CPU - 32 - AlexNet: 562.1%
TensorFlow: CPU - 256 - ResNet-50: 499.9%
TensorFlow: CPU - 64 - GoogLeNet: 499%
TensorFlow: CPU - 512 - ResNet-50: 491%
TensorFlow: CPU - 64 - ResNet-50: 423.7%
TensorFlow: CPU - 16 - AlexNet: 389.1%
TensorFlow: CPU - 32 - ResNet-50: 309.7%
TensorFlow: CPU - 32 - GoogLeNet: 306.6%
TensorFlow: CPU - 16 - ResNet-50: 171%
TensorFlow: CPU - 16 - GoogLeNet: 164.3%
OpenVINO: W.P.D.F - CPU: 138.2%
OpenVINO: W.P.D.F - CPU: 138%
OpenVINO: F.D.F - CPU: 132%
OpenVINO: F.D.F - CPU: 131.2%
OpenVINO: M.T.E.T.D.F - CPU: 130.3%
OpenVINO: M.T.E.T.D.F - CPU: 130.2%
OpenVINO: W.P.D.F.I - CPU: 107.7%
OpenVINO: W.P.D.F.I - CPU: 107.6%
OpenVINO: F.D.F.I - CPU: 103.6%
OpenVINO: F.D.F.I - CPU: 102.6%
OpenVINO: P.V.B.D.F - CPU: 100.2%
OpenVINO: P.V.B.D.F - CPU: 100.1%
TensorFlow: CPU - 512 - AlexNet: 1385.6%
TensorFlow: CPU - 256 - AlexNet: 1238.3%
TensorFlow: CPU - 256 - GoogLeNet: 986.6%
Cpuminer-Opt: LBC, LBRY Credits: 98.6%
OpenVINO: A.G.R.R.0.F - CPU: 92.9%
Neural Magic DeepSparse: N.S.A.8.P.Q.B.B.U - A.M.S: 92.3%
Neural Magic DeepSparse: N.S.A.8.P.Q.B.B.U - A.M.S: 92.2%
Neural Magic DeepSparse: C.S.9.P.Y.P - A.M.S: 83.3%
Neural Magic DeepSparse: C.S.9.P.Y.P - A.M.S: 81.9%
OpenVINO: P.D.F - CPU: 80.2%
OpenVINO: P.D.F - CPU: 79.8%
OpenVINO: P.D.F - CPU: 78.2%
OpenVINO: P.D.F - CPU: 77.9%
OpenVINO: A.G.R.R.0.F - CPU: 76.2%
OSPRay: gravity_spheres_volume/dim_512/scivis/real_time: 75.8%
Cpuminer-Opt: Q.S.2.P: 71.1%
OSPRay: gravity_spheres_volume/dim_512/ao/real_time: 70.8%
Cpuminer-Opt: x25x: 65.4%
Cpuminer-Opt: Blake-2 S: 58.9%
Cpuminer-Opt: scrypt: 47.2%
OpenVINO: V.D.F.I - CPU: 43.9%
OpenVINO: V.D.F.I - CPU: 43.6%
OSPRay: gravity_spheres_volume/dim_512/pathtracer/real_time: 43.1%
libxsmm: 256: 38.4%
Cpuminer-Opt: Garlicoin: 34.5%
miniBUDE: OpenMP - BM2: 31.9%
miniBUDE: OpenMP - BM2: 31.9%
miniBUDE: OpenMP - BM1: 26.7%
miniBUDE: OpenMP - BM1: 26.7%
Cpuminer-Opt: Skeincoin: 25.3%
Neural Magic DeepSparse: N.T.C.B.b.u.S - A.M.S: 21.2%
Neural Magic DeepSparse: N.T.C.B.b.u.S - A.M.S: 21.1%
Cpuminer-Opt: Myriad-Groestl: 20.7%
OpenVINO: V.D.F - CPU: 20.1%
OpenVINO: V.D.F - CPU: 20%
Neural Magic DeepSparse: N.D.C.o.b.u.o.I - A.M.S: 19.8%
Neural Magic DeepSparse: N.T.C.D.m - A.M.S: 19.7%
Neural Magic DeepSparse: N.T.C.D.m - A.M.S: 19.6%
Neural Magic DeepSparse: N.T.C.B.b.u.c - A.M.S: 19.6%
Neural Magic DeepSparse: N.T.C.B.b.u.c - A.M.S: 19.3%
Neural Magic DeepSparse: N.D.C.o.b.u.o.I - A.M.S: 19.2%
OpenVKL: vklBenchmark ISPC: 18.6%
Embree: Pathtracer ISPC - Asian Dragon: 18.6%
Embree: Pathtracer ISPC - Asian Dragon Obj: 17%
Neural Magic DeepSparse: N.Q.A.B.b.u.S.1.P - A.M.S: 16.1%
Neural Magic DeepSparse: N.Q.A.B.b.u.S.1.P - A.M.S: 14.8%
oneDNN: R.N.N.T - bf16bf16bf16 - CPU: 12.2%
Embree: Pathtracer ISPC - Crown: 11.9%
OpenVINO: A.G.R.R.0.F.I - CPU: 11.4%
Neural Magic DeepSparse: C.C.R.5.I - A.M.S: 11.4%
Neural Magic DeepSparse: C.C.R.5.I - A.M.S: 11.3%
OpenVINO: A.G.R.R.0.F.I - CPU: 10.6%
libxsmm: 128: 4.6%
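
The percentages in the comparison above appear consistent with a simple ratio of the two runs, inverted for tests where fewer is better. The following is a minimal sketch of that arithmetic, not the Phoronix Test Suite's own code; the function name `percent_advantage` is hypothetical, and the input values are the TensorFlow (CPU - 512 - AlexNet) and oneDNN averages reported on this page.

```python
def percent_advantage(avx512_on: float, avx512_off: float,
                      more_is_better: bool = True) -> float:
    """Percent by which the AVX512 On result beats the AVX512 Off result.

    Hypothetical helper for illustration: for "more is better" metrics the
    ratio is on/off, for "fewer is better" metrics it is off/on.
    """
    ratio = avx512_on / avx512_off if more_is_better else avx512_off / avx512_on
    return (ratio - 1.0) * 100.0


# TensorFlow, CPU - Batch Size 512 - AlexNet (images/sec, more is better)
tf_alexnet_512 = percent_advantage(1632.40, 109.88)

# oneDNN, Recurrent Neural Network Training bf16bf16bf16 (ms, fewer is better)
onednn_rnn = percent_advantage(1174.75, 1317.57, more_is_better=False)

print(f"TensorFlow CPU - 512 - AlexNet: {tf_alexnet_512:.1f}%")
print(f"oneDNN R.N.N.T - bf16bf16bf16:  {onednn_rnn:.1f}%")
```

Under these assumptions the two values round to the 1385.6% and 12.2% entries listed in the comparison.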

AMD EPYC 9754 Bergamo AVX-512

Test; AVX512 On; AVX512 Off
minibude: OpenMP - BM1; 5925.670; 4677.682
minibude: OpenMP - BM1; 237.027; 187.107
minibude: OpenMP - BM2; 5972.187; 4528.305
minibude: OpenMP - BM2; 238.887; 181.132
libxsmm: 128; 2690.7; 2573.3
libxsmm: 256; 3342.5; 2415.3
embree: Pathtracer ISPC - Crown; 125.5414; 112.2274
embree: Pathtracer ISPC - Asian Dragon; 157.6450; 132.9504
embree: Pathtracer ISPC - Asian Dragon Obj; 134.8396; 115.2682
openvkl: vklBenchmark ISPC; 1398; 1179
ospray: gravity_spheres_volume/dim_512/ao/real_time; 32.7557; 19.1731
ospray: gravity_spheres_volume/dim_512/scivis/real_time; 31.6753; 18.0203
ospray: gravity_spheres_volume/dim_512/pathtracer/real_time; 27.9715; 19.5463
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU; 1174.75; 1317.57
cpuminer-opt: x25x; 4977.89; 3010.37
cpuminer-opt: scrypt; 2993.21; 2033.15
cpuminer-opt: Blake-2 S; 7238650; 4555580
cpuminer-opt: Garlicoin; 53090; 39473
cpuminer-opt: Skeincoin; 1174953; 937977
cpuminer-opt: Myriad-Groestl; 8628.76; 7149.95
cpuminer-opt: LBC, LBRY Credits; 660660; 332667
cpuminer-opt: Quad SHA-256, Pyrite; 1498937; 876137
tensorflow: CPU - 16 - AlexNet; 342.88; 70.10
tensorflow: CPU - 32 - AlexNet; 562.48; 84.96
tensorflow: CPU - 64 - AlexNet; 857.55; 97.59
tensorflow: CPU - 256 - AlexNet; 1422.36; 106.28
tensorflow: CPU - 512 - AlexNet; 1632.40; 109.88
tensorflow: CPU - 16 - GoogLeNet; 104.77; 39.64
tensorflow: CPU - 16 - ResNet-50; 43.11; 15.91
tensorflow: CPU - 32 - GoogLeNet; 180.77; 44.46
tensorflow: CPU - 32 - ResNet-50; 71.62; 17.48
tensorflow: CPU - 64 - GoogLeNet; 277.24; 46.28
tensorflow: CPU - 64 - ResNet-50; 96.63; 18.45
tensorflow: CPU - 256 - GoogLeNet; 501.67; 46.17
tensorflow: CPU - 256 - ResNet-50; 119.32; 19.89
tensorflow: CPU - 512 - GoogLeNet; 417.25; 46.92
tensorflow: CPU - 512 - ResNet-50; 122.81; 20.78
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream; 73.5393; 61.3689
deepsparse: NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream; 858.4029; 1023.1690
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream; 1381.5669; 718.3032
deepsparse: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream; 46.2608; 88.9093
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream; 247.9653; 213.6393
deepsparse: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream; 259.6468; 298.0628
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream; 970.0568; 870.9541
deepsparse: CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream; 65.8909; 73.3502
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream; 624.7370; 522.1104
deepsparse: NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream; 102.2238; 122.2295
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream; 127.0611; 69.3341
deepsparse: CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream; 498.4779; 906.9414
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream; 316.1679; 260.7671
deepsparse: NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream; 201.5964; 244.1513
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream; 73.1459; 61.1755
deepsparse: NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream; 859.7119; 1025.4806
openvino: Face Detection FP16 - CPU; 60.73; 26.18
openvino: Face Detection FP16 - CPU; 1048.37; 2423.39
openvino: Person Detection FP16 - CPU; 27.08; 15.06
openvino: Person Detection FP16 - CPU; 2334.29; 4153.59
openvino: Person Detection FP32 - CPU; 27.01; 14.99
openvino: Person Detection FP32 - CPU; 2339.81; 4170.49
openvino: Vehicle Detection FP16 - CPU; 1430.45; 1190.91
openvino: Vehicle Detection FP16 - CPU; 44.84; 53.79
openvino: Face Detection FP16-INT8 - CPU; 118.00; 57.95
openvino: Face Detection FP16-INT8 - CPU; 540.35; 1094.52
openvino: Vehicle Detection FP16-INT8 - CPU; 5690.34; 3954.90
openvino: Vehicle Detection FP16-INT8 - CPU; 11.26; 16.17
openvino: Weld Porosity Detection FP16 - CPU; 6073.22; 2551.70
openvino: Weld Porosity Detection FP16 - CPU; 10.52; 25.06
openvino: Machine Translation EN To DE FP16 - CPU; 580.40; 251.97
openvino: Machine Translation EN To DE FP16 - CPU; 110.35; 254.01
openvino: Weld Porosity Detection FP16-INT8 - CPU; 11818.33; 5692.99
openvino: Weld Porosity Detection FP16-INT8 - CPU; 10.82; 22.47
openvino: Person Vehicle Bike Detection FP16 - CPU; 6638.71; 3317.34
openvino: Person Vehicle Bike Detection FP16 - CPU; 9.63; 19.28
openvino: Age Gender Recognition Retail 0013 FP16 - CPU; 110240.89; 62564.16
openvino: Age Gender Recognition Retail 0013 FP16 - CPU; 0.99; 1.91
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU; 73970.12; 66895.49
openvino: Age Gender Recognition Retail 0013 FP16-INT8 - CPU; 1.58; 1.76

miniBUDE

MiniBUDE is a mini application for the core computation of the Bristol University Docking Engine (BUDE). This test profile currently makes use of the OpenMP implementation of miniBUDE for CPU benchmarking. Learn more via the OpenBenchmarking.org test page.

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1 (GFInst/s, More Is Better)
AVX512 On: 5925.67 (SE +/- 2.27, N = 9; Min: 5912.5 / Avg: 5925.67 / Max: 5933.29)
AVX512 Off: 4677.68 (SE +/- 18.03, N = 8; Min: 4556.82 / Avg: 4677.68 / Max: 4722.36)

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM1 (Billion Interactions/s, More Is Better)
AVX512 On: 237.03 (SE +/- 0.09, N = 9; Min: 236.5 / Avg: 237.03 / Max: 237.33)
AVX512 Off: 187.11 (SE +/- 0.72, N = 8; Min: 182.27 / Avg: 187.11 / Max: 188.89)

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (GFInst/s, More Is Better)
AVX512 On: 5972.19 (SE +/- 0.44, N = 3; Min: 5971.3 / Avg: 5972.19 / Max: 5972.69)
AVX512 Off: 4528.31 (SE +/- 15.49, N = 3; Min: 4499.54 / Avg: 4528.31 / Max: 4552.66)

miniBUDE 20210901, Implementation: OpenMP - Input Deck: BM2 (Billion Interactions/s, More Is Better)
AVX512 On: 238.89 (SE +/- 0.02, N = 3; Min: 238.85 / Avg: 238.89 / Max: 238.91)
AVX512 Off: 181.13 (SE +/- 0.62, N = 3; Min: 179.98 / Avg: 181.13 / Max: 182.11)

1. (CC) gcc options: -std=c99 -Ofast -ffast-math -fopenmp -march=native -lm

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

libxsmm 2-1.17-3645, M N K: 128 (GFLOPS/s, More Is Better)
AVX512 On: 2690.7 (SE +/- 12.20, N = 3; Min: 2666.3 / Avg: 2690.7 / Max: 2703.2)
AVX512 Off: 2573.3 (SE +/- 2.70, N = 3; Min: 2570.1 / Avg: 2573.33 / Max: 2578.7)

libxsmm 2-1.17-3645, M N K: 256 (GFLOPS/s, More Is Better)
AVX512 On: 3342.5 (SE +/- 5.78, N = 3; Min: 3333.4 / Avg: 3342.47 / Max: 3353.2)
AVX512 Off: 2415.3 (SE +/- 6.53, N = 3; Min: 2407.5 / Avg: 2415.33 / Max: 2428.3)

1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

Embree

Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs (and GPUs via SYCL) and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.

Embree 4.1, Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better)
AVX512 On: 125.54 (SE +/- 0.09, N = 7; MIN: 122.35 / MAX: 131.95; runs Min: 125.03 / Avg: 125.54 / Max: 125.71)
AVX512 Off: 112.23 (SE +/- 0.11, N = 7; MIN: 109.49 / MAX: 117.02; runs Min: 111.88 / Avg: 112.23 / Max: 112.72)

Embree 4.1, Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better)
AVX512 On: 157.65 (SE +/- 0.09, N = 8; MIN: 155.33 / MAX: 162.92; runs Min: 157.35 / Avg: 157.65 / Max: 158.1)
AVX512 Off: 132.95 (SE +/- 0.11, N = 7; MIN: 130.4 / MAX: 138.16; runs Min: 132.34 / Avg: 132.95 / Max: 133.27)

Embree 4.1, Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better)
AVX512 On: 134.84 (SE +/- 0.16, N = 4; MIN: 132.49 / MAX: 139; runs Min: 134.52 / Avg: 134.84 / Max: 135.2)
AVX512 Off: 115.27 (SE +/- 0.08, N = 4; MIN: 113.48 / MAX: 119.15; runs Min: 115.06 / Avg: 115.27 / Max: 115.47)

OpenVKL

OpenVKL is the Intel Open Volume Kernel Library that offers high-performance volume computation kernels and is part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenVKL 1.3.1, Benchmark: vklBenchmark ISPC (Items / Sec, More Is Better)
AVX512 On: 1398 (SE +/- 2.65, N = 3; MIN: 229 / MAX: 11779; runs Min: 1393 / Avg: 1398 / Max: 1402)
AVX512 Off: 1179 (SE +/- 0.33, N = 3; MIN: 178 / MAX: 10473; runs Min: 1178 / Avg: 1178.67 / Max: 1179)

OSPRay

Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay 2.12, Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items Per Second, More Is Better)
AVX512 On: 32.76 (SE +/- 0.01, N = 3; Min: 32.74 / Avg: 32.76 / Max: 32.77)
AVX512 Off: 19.17 (SE +/- 0.02, N = 3; Min: 19.14 / Avg: 19.17 / Max: 19.21)

OSPRay 2.12, Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items Per Second, More Is Better)
AVX512 On: 31.68 (SE +/- 0.02, N = 3; Min: 31.65 / Avg: 31.68 / Max: 31.72)
AVX512 Off: 18.02 (SE +/- 0.00, N = 3; Min: 18.02 / Avg: 18.02 / Max: 18.03)

OSPRay 2.12, Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second, More Is Better)
AVX512 On: 27.97 (SE +/- 0.01, N = 3; Min: 27.95 / Avg: 27.97 / Max: 27.99)
AVX512 Off: 19.55 (SE +/- 0.01, N = 3; Min: 19.52 / Avg: 19.55 / Max: 19.56)

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.

oneDNN 3.1, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
AVX512 On: 1174.75 (SE +/- 12.60, N = 4; MIN: 1143.88; runs Min: 1156.9 / Avg: 1174.75 / Max: 1212.12)
AVX512 Off: 1317.57 (SE +/- 1.80, N = 3; MIN: 1299.38; runs Min: 1314.49 / Avg: 1317.57 / Max: 1320.73)

1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl -lpthread

Cpuminer-Opt

Cpuminer-Opt is a fork of cpuminer-multi that carries a wide range of CPU performance optimizations for measuring the potential cryptocurrency mining performance of the CPU/processor with a wide variety of cryptocurrencies. The benchmark reports the hash speed for the CPU mining performance for the selected cryptocurrency. Learn more via the OpenBenchmarking.org test page.

Cpuminer-Opt 3.20.3, Algorithm: x25x (kH/s, More Is Better)
AVX512 On: 4977.89 (SE +/- 15.71, N = 3; Min: 4952.23 / Avg: 4977.89 / Max: 5006.41)
AVX512 Off: 3010.37 (SE +/- 35.48, N = 4; Min: 2951.37 / Avg: 3010.37 / Max: 3113.53)

Cpuminer-Opt 3.20.3, Algorithm: scrypt (kH/s, More Is Better)
AVX512 On: 2993.21 (SE +/- 1.66, N = 3; Min: 2990.21 / Avg: 2993.21 / Max: 2995.94)
AVX512 Off: 2033.15 (SE +/- 0.24, N = 3; Min: 2032.77 / Avg: 2033.15 / Max: 2033.58)

Cpuminer-Opt 3.20.3, Algorithm: Blake-2 S (kH/s, More Is Better)
AVX512 On: 7238650 (SE +/- 3854.05, N = 3; Min: 7233240 / Avg: 7238650 / Max: 7246110)
AVX512 Off: 4555580 (SE +/- 37514.23, N = 15; Min: 4440420 / Avg: 4555580 / Max: 4781430)

Cpuminer-Opt 3.20.3, Algorithm: Garlicoin (kH/s, More Is Better)
AVX512 On: 53090 (SE +/- 110.15, N = 3; Min: 52870 / Avg: 53090 / Max: 53210)
AVX512 Off: 39473 (SE +/- 102.69, N = 3; Min: 39280 / Avg: 39473.33 / Max: 39630)

Cpuminer-Opt 3.20.3, Algorithm: Skeincoin (kH/s, More Is Better)
AVX512 On: 1174953 (SE +/- 4577.90, N = 3; Min: 1169160 / Avg: 1174953.33 / Max: 1183990)
AVX512 Off: 937977 (SE +/- 1811.04, N = 3; Min: 934780 / Avg: 937976.67 / Max: 941050)

Cpuminer-Opt 3.20.3, Algorithm: Myriad-Groestl (kH/s, More Is Better)
AVX512 On: 8628.76 (SE +/- 340.56, N = 15; Min: 5934.4 / Avg: 8628.76 / Max: 10730)
AVX512 Off: 7149.95 (SE +/- 21.55, N = 3; Min: 7108.52 / Avg: 7149.95 / Max: 7180.94)

Cpuminer-Opt 3.20.3, Algorithm: LBC, LBRY Credits (kH/s, More Is Better)
AVX512 On: 660660 (SE +/- 76.38, N = 3; Min: 660560 / Avg: 660660 / Max: 660810)
AVX512 Off: 332667 (SE +/- 141.93, N = 3; Min: 332510 / Avg: 332666.67 / Max: 332950)

Cpuminer-Opt 3.20.3, Algorithm: Quad SHA-256, Pyrite (kH/s, More Is Better)
AVX512 On: 1498937 (SE +/- 3455.26, N = 3; Min: 1494880 / Avg: 1498936.67 / Max: 1505810)
AVX512 Off: 876137 (SE +/- 2198.34, N = 3; Min: 872250 / Avg: 876136.67 / Max: 879860)

1. (CXX) g++ options: -O2 -lcurl -lz -lpthread -lssl -lcrypto -lgmp
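
Each result above reports a standard error (SE) alongside the average, which gives a rough sense of run-to-run noise. The sketch below computes the SE as a percentage of the average for a few of the Cpuminer-Opt results and flags the noisier ones. The 2% cutoff and the helper name `relative_se_percent` are assumptions made for illustration only; they are not the criteria OpenBenchmarking.org uses to classify noisy results.

```python
# (algorithm, run, average kH/s, standard error) copied from the results above
results = [
    ("x25x", "AVX512 On", 4977.89, 15.71),
    ("x25x", "AVX512 Off", 3010.37, 35.48),
    ("Blake-2 S", "AVX512 On", 7238650.0, 3854.05),
    ("Blake-2 S", "AVX512 Off", 4555580.0, 37514.23),
    ("Myriad-Groestl", "AVX512 On", 8628.76, 340.56),
    ("Myriad-Groestl", "AVX512 Off", 7149.95, 21.55),
]

# Hypothetical cutoff chosen for this example only.
NOISY_THRESHOLD_PERCENT = 2.0


def relative_se_percent(average: float, standard_error: float) -> float:
    """Standard error expressed as a percentage of the average."""
    return standard_error / average * 100.0


noisy = [
    (algorithm, run)
    for algorithm, run, average, se in results
    if relative_se_percent(average, se) > NOISY_THRESHOLD_PERCENT
]

for algorithm, run, average, se in results:
    print(f"{algorithm:15s} {run:10s} {relative_se_percent(average, se):5.2f}%")
print("Flagged as noisy:", noisy)
```

With this cutoff only the Myriad-Groestl AVX512 On run is flagged, which matches its wide Min/Max spread and N = 15.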

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note with the Phoronix Test Suite there is also pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries if desired for complementary metrics. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.12, Device: CPU - Batch Size: 16 - Model: AlexNet (images/sec, More Is Better)
AVX512 On: 342.88 (SE +/- 0.70, N = 6; Min: 341.01 / Avg: 342.88 / Max: 345.16)
AVX512 Off: 70.10 (SE +/- 0.29, N = 3; Min: 69.52 / Avg: 70.1 / Max: 70.46)

TensorFlow 2.12, Device: CPU - Batch Size: 32 - Model: AlexNet (images/sec, More Is Better)
AVX512 On: 562.48 (SE +/- 2.06, N = 6; Min: 557.18 / Avg: 562.48 / Max: 569.03)
AVX512 Off: 84.96 (SE +/- 0.16, N = 3; Min: 84.67 / Avg: 84.96 / Max: 85.24)

TensorFlow 2.12, Device: CPU - Batch Size: 64 - Model: AlexNet (images/sec, More Is Better)
AVX512 On: 857.55 (SE +/- 1.99, N = 5; Min: 852.57 / Avg: 857.55 / Max: 863.4)
AVX512 Off: 97.59 (SE +/- 0.08, N = 3; Min: 97.44 / Avg: 97.59 / Max: 97.68)

TensorFlow 2.12, Device: CPU - Batch Size: 256 - Model: AlexNet (images/sec, More Is Better)
AVX512 On: 1422.36 (SE +/- 6.55, N = 3; Min: 1414.72 / Avg: 1422.36 / Max: 1435.4)
AVX512 Off: 106.28 (SE +/- 0.24, N = 3; Min: 105.81 / Avg: 106.28 / Max: 106.59)

TensorFlow 2.12, Device: CPU - Batch Size: 512 - Model: AlexNet (images/sec, More Is Better)
AVX512 On: 1632.40 (SE +/- 1.53, N = 3; Min: 1630.85 / Avg: 1632.4 / Max: 1635.47)
AVX512 Off: 109.88 (SE +/- 0.06, N = 3; Min: 109.77 / Avg: 109.88 / Max: 109.95)

TensorFlow 2.12, Device: CPU - Batch Size: 16 - Model: GoogLeNet (images/sec, More Is Better)
AVX512 On: 104.77 (SE +/- 1.96, N = 15; Min: 98.44 / Avg: 104.77 / Max: 114.49)
AVX512 Off: 39.64 (SE +/- 0.17, N = 3; Min: 39.39 / Avg: 39.64 / Max: 39.96)

TensorFlow 2.12, Device: CPU - Batch Size: 16 - Model: ResNet-50 (images/sec, More Is Better)
AVX512 On: 43.11 (SE +/- 0.03, N = 3; Min: 43.06 / Avg: 43.11 / Max: 43.15)
AVX512 Off: 15.91 (SE +/- 0.05, N = 3; Min: 15.84 / Avg: 15.91 / Max: 16.01)

TensorFlow 2.12, Device: CPU - Batch Size: 32 - Model: GoogLeNet (images/sec, More Is Better)
AVX512 On: 180.77 (SE +/- 3.09, N = 15; Min: 164.85 / Avg: 180.77 / Max: 192.76)
AVX512 Off: 44.46 (SE +/- 0.10, N = 3; Min: 44.25 / Avg: 44.46 / Max: 44.58)

TensorFlow 2.12, Device: CPU - Batch Size: 32 - Model: ResNet-50 (images/sec, More Is Better)
AVX512 On: 71.62 (SE +/- 0.23, N = 3; Min: 71.17 / Avg: 71.62 / Max: 71.9)
AVX512 Off: 17.48 (SE +/- 0.04, N = 3; Min: 17.4 / Avg: 17.48 / Max: 17.53)

TensorFlow 2.12, Device: CPU - Batch Size: 64 - Model: GoogLeNet (images/sec, More Is Better)
AVX512 On: 277.24 (SE +/- 2.62, N = 15; Min: 260.94 / Avg: 277.24 / Max: 286.16)
AVX512 Off: 46.28 (SE +/- 0.18, N = 3; Min: 46.08 / Avg: 46.28 / Max: 46.63)

TensorFlow 2.12, Device: CPU - Batch Size: 64 - Model: ResNet-50 (images/sec, More Is Better)
AVX512 On: 96.63 (SE +/- 0.06, N = 3; Min: 96.55 / Avg: 96.63 / Max: 96.74)
AVX512 Off: 18.45 (SE +/- 0.03, N = 3; Min: 18.42 / Avg: 18.45 / Max: 18.5)

TensorFlow 2.12, Device: CPU - Batch Size: 256 - Model: GoogLeNet (images/sec, More Is Better)
AVX512 On: 501.67 (SE +/- 4.92, N = 3; Min: 496.6 / Avg: 501.67 / Max: 511.5)
AVX512 Off: 46.17 (SE +/- 0.06, N = 3; Min: 46.09 / Avg: 46.17 / Max: 46.29)

TensorFlow 2.12, Device: CPU - Batch Size: 256 - Model: ResNet-50 (images/sec, More Is Better)
AVX512 On: 119.32 (SE +/- 1.01, N = 12; Min: 117.47 / Avg: 119.32 / Max: 130.38)
AVX512 Off: 19.89 (SE +/- 0.02, N = 3; Min: 19.85 / Avg: 19.89 / Max: 19.92)

TensorFlow 2.12, Device: CPU - Batch Size: 512 - Model: GoogLeNet (images/sec, More Is Better)
AVX512 On: 417.25 (SE +/- 5.30, N = 12; Min: 408.31 / Avg: 417.25 / Max: 475.24)
AVX512 Off: 46.92 (SE +/- 0.05, N = 3; Min: 46.83 / Avg: 46.92 / Max: 47)

TensorFlow 2.12, Device: CPU - Batch Size: 512 - Model: ResNet-50 (images/sec, More Is Better)
AVX512 On: 122.81 (SE +/- 0.99, N = 3; Min: 121.82 / Avg: 122.81 / Max: 124.8)
AVX512 Off: 20.78 (SE +/- 0.04, N = 3; Min: 20.71 / Avg: 20.78 / Max: 20.86)
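
Because the TensorFlow speedups above span a wide range, a geometric mean is a common way to summarize them as a single ratio. The following is a minimal sketch using the fifteen AVX512 On / AVX512 Off averages from this section; the dictionary layout and the `geometric_mean` helper are illustrative assumptions, not the Phoronix Test Suite's implementation of its geometric mean view.

```python
import math

# (model, batch size) -> (AVX512 On images/sec, AVX512 Off images/sec),
# copied from the TensorFlow results above.
tensorflow_results = {
    ("AlexNet", 16): (342.88, 70.10),
    ("AlexNet", 32): (562.48, 84.96),
    ("AlexNet", 64): (857.55, 97.59),
    ("AlexNet", 256): (1422.36, 106.28),
    ("AlexNet", 512): (1632.40, 109.88),
    ("GoogLeNet", 16): (104.77, 39.64),
    ("GoogLeNet", 32): (180.77, 44.46),
    ("GoogLeNet", 64): (277.24, 46.28),
    ("GoogLeNet", 256): (501.67, 46.17),
    ("GoogLeNet", 512): (417.25, 46.92),
    ("ResNet-50", 16): (43.11, 15.91),
    ("ResNet-50", 32): (71.62, 17.48),
    ("ResNet-50", 64): (96.63, 18.45),
    ("ResNet-50", 256): (119.32, 19.89),
    ("ResNet-50", 512): (122.81, 20.78),
}


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Images/sec is "more is better", so the speedup is simply On / Off.
speedups = [on / off for on, off in tensorflow_results.values()]
overall = geometric_mean(speedups)

print(f"Smallest speedup: {min(speedups):.2f}x")
print(f"Largest speedup:  {max(speedups):.2f}x")
print(f"Geometric mean:   {overall:.2f}x")
```

The geometric mean is used here rather than the arithmetic mean so that the very large AlexNet ratios at high batch sizes do not dominate the summary.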

Neural Magic DeepSparse

This is a benchmark of Neural Magic's DeepSparse using its built-in deepsparse.benchmark utility and various models from their SparseZoo (https://sparsezoo.neuralmagic.com/). Learn more via the OpenBenchmarking.org test page.

Neural Magic DeepSparse 1.5 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream

items/sec, More Is Better
AVX512 On: 73.54 (SE +/- 0.14, N = 3; Min: 73.26 / Avg: 73.54 / Max: 73.73)
AVX512 Off: 61.37 (SE +/- 0.02, N = 3; Min: 61.34 / Avg: 61.37 / Max: 61.41)

ms/batch, Fewer Is Better
AVX512 On: 858.40 (SE +/- 0.33, N = 3; Min: 857.91 / Avg: 858.4 / Max: 859.03)
AVX512 Off: 1023.17 (SE +/- 1.40, N = 3; Min: 1020.38 / Avg: 1023.17 / Max: 1024.83)

Neural Magic DeepSparse 1.5 - Model: NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Scenario: Asynchronous Multi-Stream

items/sec, More Is Better
AVX512 On: 1381.57 (SE +/- 1.58, N = 3; Min: 1378.44 / Avg: 1381.57 / Max: 1383.55)
AVX512 Off: 718.30 (SE +/- 5.33, N = 3; Min: 708.7 / Avg: 718.3 / Max: 727.1)

ms/batch, Fewer Is Better
AVX512 On: 46.26 (SE +/- 0.05, N = 3; Min: 46.19 / Avg: 46.26 / Max: 46.37)
AVX512 Off: 88.91 (SE +/- 0.67, N = 3; Min: 87.82 / Avg: 88.91 / Max: 90.12)

Neural Magic DeepSparse 1.5 - Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Scenario: Asynchronous Multi-Stream

items/sec, More Is Better
AVX512 On: 247.97 (SE +/- 7.49, N = 15; Min: 233.36 / Avg: 247.97 / Max: 345.13)
AVX512 Off: 213.64 (SE +/- 0.14, N = 3; Min: 213.37 / Avg: 213.64 / Max: 213.83)

ms/batch, Fewer Is Better
AVX512 On: 259.65 (SE +/- 6.02, N = 15; Min: 184.64 / Avg: 259.65 / Max: 273.24)
AVX512 Off: 298.06 (SE +/- 0.24, N = 3; Min: 297.77 / Avg: 298.06 / Max: 298.55)

Neural Magic DeepSparse 1.5 - Model: CV Classification, ResNet-50 ImageNet - Scenario: Asynchronous Multi-Stream

items/sec, More Is Better
AVX512 On: 970.06 (SE +/- 0.42, N = 3; Min: 969.35 / Avg: 970.06 / Max: 970.82)
AVX512 Off: 870.95 (SE +/- 0.45, N = 3; Min: 870.46 / Avg: 870.95 / Max: 871.86)

ms/batch, Fewer Is Better
AVX512 On: 65.89 (SE +/- 0.03, N = 3; Min: 65.84 / Avg: 65.89 / Max: 65.94)
AVX512 Off: 73.35 (SE +/- 0.04, N = 3; Min: 73.27 / Avg: 73.35 / Max: 73.4)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream

items/sec, More Is Better
AVX512 On: 624.74 (SE +/- 0.52, N = 3; Min: 623.88 / Avg: 624.74 / Max: 625.69)
AVX512 Off: 522.11 (SE +/- 0.58, N = 3; Min: 521.46 / Avg: 522.11 / Max: 523.26)

ms/batch, Fewer Is Better
AVX512 On: 102.22 (SE +/- 0.07, N = 3; Min: 102.11 / Avg: 102.22 / Max: 102.36)
AVX512 Off: 122.23 (SE +/- 0.15, N = 3; Min: 121.93 / Avg: 122.23 / Max: 122.42)

Neural Magic DeepSparse 1.5 - Model: CV Segmentation, 90% Pruned YOLACT Pruned - Scenario: Asynchronous Multi-Stream

items/sec, More Is Better
AVX512 On: 127.06 (SE +/- 0.01, N = 3; Min: 127.05 / Avg: 127.06 / Max: 127.07)
AVX512 Off: 69.33 (SE +/- 0.03, N = 3; Min: 69.3 / Avg: 69.33 / Max: 69.4)

ms/batch, Fewer Is Better
AVX512 On: 498.48 (SE +/- 0.16, N = 3; Min: 498.26 / Avg: 498.48 / Max: 498.79)
AVX512 Off: 906.94 (SE +/- 0.94, N = 3; Min: 905.06 / Avg: 906.94 / Max: 908.01)

Neural Magic DeepSparse 1.5 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream

items/sec, More Is Better
AVX512 On: 316.17 (SE +/- 0.78, N = 3; Min: 315.22 / Avg: 316.17 / Max: 317.71)
AVX512 Off: 260.77 (SE +/- 0.63, N = 3; Min: 259.73 / Avg: 260.77 / Max: 261.91)

ms/batch, Fewer Is Better
AVX512 On: 201.60 (SE +/- 0.47, N = 3; Min: 200.66 / Avg: 201.6 / Max: 202.17)
AVX512 Off: 244.15 (SE +/- 0.64, N = 3; Min: 242.94 / Avg: 244.15 / Max: 245.15)

Neural Magic DeepSparse 1.5 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream

items/sec, More Is Better
AVX512 On: 73.15 (SE +/- 0.19, N = 3; Min: 72.88 / Avg: 73.15 / Max: 73.52)
AVX512 Off: 61.18 (SE +/- 0.13, N = 3; Min: 60.93 / Avg: 61.18 / Max: 61.38)

ms/batch, Fewer Is Better
AVX512 On: 859.71 (SE +/- 0.45, N = 3; Min: 858.81 / Avg: 859.71 / Max: 860.2)
AVX512 Off: 1025.48 (SE +/- 1.31, N = 3; Min: 1023.7 / Avg: 1025.48 / Max: 1028.03)
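
The DeepSparse results mix a more-is-better metric (items/sec) with a fewer-is-better metric (ms/batch), so the two need to be compared in opposite directions. A rough sketch of one way to put both on the same scale is shown here. The helper `percent_advantage` and the list name `deepsparse` are hypothetical; the figures are the averages listed above, with model names shortened.

```python
def percent_advantage(on: float, off: float, higher_is_better: bool) -> float:
    """Percent by which AVX512 On beats AVX512 Off.

    For a more-is-better metric this is On / Off - 1; for a fewer-is-better
    metric the ratio is inverted (Off / On - 1) so that a positive number
    always means the AVX512 On run did better.
    """
    ratio = on / off if higher_is_better else off / on
    return (ratio - 1.0) * 100.0


# (model, items/sec On, items/sec Off, ms/batch On, ms/batch Off)
deepsparse = [
    ("oBERT base uncased on IMDB", 73.54, 61.37, 858.40, 1023.17),
    ("80% Pruned Quantized BERT Base Uncased", 1381.57, 718.30, 46.26, 88.91),
    ("BERT base uncased SQuaD 12layer Pruned90", 247.97, 213.64, 259.65, 298.06),
    ("ResNet-50 ImageNet", 970.06, 870.95, 65.89, 73.35),
    ("DistilBERT mnli", 624.74, 522.11, 102.22, 122.23),
    ("90% Pruned YOLACT Pruned", 127.06, 69.33, 498.48, 906.94),
    ("BERT base uncased SST2", 316.17, 260.77, 201.60, 244.15),
    ("BERT base uncased conll2003", 73.15, 61.18, 859.71, 1025.48),
]

gains = {
    name: (
        percent_advantage(tput_on, tput_off, higher_is_better=True),
        percent_advantage(lat_on, lat_off, higher_is_better=False),
    )
    for name, tput_on, tput_off, lat_on, lat_off in deepsparse
}

for name, (throughput_gain, latency_gain) in gains.items():
    print(f"{name}: throughput {throughput_gain:+.1f}%, latency {latency_gain:+.1f}%")
```

For each model the two percentages come out close to one another, which is a quick sanity check that the throughput and latency graphs describe the same run.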

OpenVINO

This is a test of Intel OpenVINO, a toolkit for neural networks, using its built-in benchmarking support to measure throughput and latency for various models. Learn more via the OpenBenchmarking.org test page.

OpenVINO 2022.3 - Model: Face Detection FP16 - Device: CPU

FPS, More Is Better
AVX512 On: 60.73 (SE +/- 0.06, N = 3; Min: 60.64 / Avg: 60.73 / Max: 60.83)
AVX512 Off: 26.18 (SE +/- 0.36, N = 3; Min: 25.46 / Avg: 26.18 / Max: 26.63)

ms, Fewer Is Better
AVX512 On: 1048.37 (SE +/- 0.42, N = 3; MIN: 508.1 / MAX: 1159.4; Min: 1047.78 / Avg: 1048.37 / Max: 1049.19)
AVX512 Off: 2423.39 (SE +/- 26.84, N = 3; MIN: 1130.42 / MAX: 2937.22; Min: 2389.17 / Avg: 2423.39 / Max: 2476.33)

1. (CXX) g++ options: -isystem -fsigned-char -ffunction-sections -fdata-sections -msse4.1 -msse4.2 -O3 -fno-strict-overflow -fwrapv -fPIC -fvisibility=hidden -Os -std=c++11 -MD -MT -MF

OpenVINO 2022.3 - Model: Person Detection FP16 - Device: CPU

FPS, More Is Better
AVX512 On: 27.08 (SE +/- 0.30, N = 12; Min: 26.47 / Avg: 27.08 / Max: 30.33)
AVX512 Off: 15.06 (SE +/- 0.21, N = 3; Min: 14.83 / Avg: 15.06 / Max: 15.48)

ms, Fewer Is Better
AVX512 On: 2334.29 (SE +/- 22.79, N = 12; MIN: 1017.83 / MAX: 3101.05; Min: 2086.76 / Avg: 2334.29 / Max: 2383.73)
AVX512 Off: 4153.59 (SE +/- 59.43, N = 3; MIN: 1975.28 / MAX: 5152.66; Min: 4034.82 / Avg: 4153.59 / Max: 4217.08)

OpenVINO 2022.3 - Model: Person Detection FP32 - Device: CPU

FPS, More Is Better
AVX512 On: 27.01 (SE +/- 0.18, N = 12; Min: 26.7 / Avg: 27.01 / Max: 29.02)
AVX512 Off: 14.99 (SE +/- 0.16, N = 5; Min: 14.77 / Avg: 14.99 / Max: 15.64)

ms, Fewer Is Better
AVX512 On: 2339.81 (SE +/- 14.83, N = 12; MIN: 1045.35 / MAX: 3232.69; Min: 2178.07 / Avg: 2339.81 / Max: 2362.39)
AVX512 Off: 4170.49 (SE +/- 41.79, N = 5; MIN: 1759.65 / MAX: 4969.5; Min: 4005.67 / Avg: 4170.49 / Max: 4233.02)

OpenVINO 2022.3 - Model: Vehicle Detection FP16 - Device: CPU

FPS, More Is Better
AVX512 On: 1430.45 (SE +/- 22.93, N = 15; Min: 1379.86 / Avg: 1430.45 / Max: 1745.86)
AVX512 Off: 1190.91 (SE +/- 15.59, N = 14; Min: 1147.23 / Avg: 1190.91 / Max: 1387.49)

ms, Fewer Is Better
AVX512 On: 44.84 (SE +/- 0.60, N = 15; MIN: 8.1 / MAX: 137.01; Min: 36.62 / Avg: 44.84 / Max: 46.34)
AVX512 Off: 53.79 (SE +/- 0.62, N = 14; MIN: 13.85 / MAX: 136.83; Min: 46.08 / Avg: 53.79 / Max: 55.73)

OpenVINO 2022.3 - Model: Face Detection FP16-INT8 - Device: CPU

FPS, More Is Better
AVX512 On: 118.00 (SE +/- 0.02, N = 3; Min: 117.96 / Avg: 118 / Max: 118.04)
AVX512 Off: 57.95 (SE +/- 0.03, N = 3; Min: 57.92 / Avg: 57.95 / Max: 58.01)

ms, Fewer Is Better
AVX512 On: 540.35 (SE +/- 0.07, N = 3; MIN: 257.81 / MAX: 586.56; Min: 540.21 / Avg: 540.35 / Max: 540.44)
AVX512 Off: 1094.52 (SE +/- 0.34, N = 3; MIN: 509.48 / MAX: 1179.63; Min: 1093.84 / Avg: 1094.52 / Max: 1094.93)

OpenVINO 2022.3 - Model: Vehicle Detection FP16-INT8 - Device: CPU

FPS, More Is Better
AVX512 On: 5690.34 (SE +/- 89.26, N = 15; Min: 5492.9 / Avg: 5690.34 / Max: 6783.82)
AVX512 Off: 3954.90 (SE +/- 2.12, N = 3; Min: 3951.56 / Avg: 3954.9 / Max: 3958.84)

ms, Fewer Is Better
AVX512 On: 11.26 (SE +/- 0.15, N = 15; MIN: 4.37 / MAX: 45.06; Min: 9.42 / Avg: 11.26 / Max: 11.63)
AVX512 Off: 16.17 (SE +/- 0.01, N = 3; MIN: 8.44 / MAX: 56.88; Min: 16.15 / Avg: 16.17 / Max: 16.18)

OpenVINO 2022.3 - Model: Weld Porosity Detection FP16 - Device: CPU

FPS, More Is Better
AVX512 On: 6073.22 (SE +/- 1.32, N = 3; Min: 6071.06 / Avg: 6073.22 / Max: 6075.62)
AVX512 Off: 2551.70 (SE +/- 0.56, N = 3; Min: 2550.65 / Avg: 2551.7 / Max: 2552.55)

ms, Fewer Is Better
AVX512 On: 10.52 (SE +/- 0.00, N = 3; MIN: 5.11 / MAX: 34.37; Min: 10.52 / Avg: 10.52 / Max: 10.53)
AVX512 Off: 25.06 (SE +/- 0.01, N = 3; MIN: 13.08 / MAX: 57.02; Min: 25.05 / Avg: 25.06 / Max: 25.07)

OpenVINO 2022.3 - Model: Machine Translation EN To DE FP16 - Device: CPU

FPS, More Is Better
AVX512 On: 580.40 (SE +/- 7.10, N = 15; Min: 561.87 / Avg: 580.4 / Max: 661.39)
AVX512 Off: 251.97 (SE +/- 3.15, N = 15; Min: 242.48 / Avg: 251.97 / Max: 292.93)

ms, Fewer Is Better
AVX512 On: 110.35 (SE +/- 1.22, N = 15; MIN: 49.7 / MAX: 183.59; Min: 96.66 / Avg: 110.35 / Max: 113.77)
AVX512 Off: 254.01 (SE +/- 2.83, N = 15; MIN: 116.71 / MAX: 398.88; Min: 218.07 / Avg: 254.01 / Max: 263.49)

OpenVINO 2022.3 - Model: Weld Porosity Detection FP16-INT8 - Device: CPU

FPS, More Is Better
AVX512 On: 11818.33 (SE +/- 1.17, N = 3; Min: 11816.15 / Avg: 11818.33 / Max: 11820.17)
AVX512 Off: 5692.99 (SE +/- 1.65, N = 3; Min: 5691.32 / Avg: 5692.99 / Max: 5696.3)

ms, Fewer Is Better
AVX512 On: 10.82 (SE +/- 0.00, N = 3; MIN: 5 / MAX: 31.44; Min: 10.82 / Avg: 10.82 / Max: 10.82)
AVX512 Off: 22.47 (SE +/- 0.01, N = 3; MIN: 10.87 / MAX: 43.5; Min: 22.46 / Avg: 22.47 / Max: 22.48)

OpenVINO 2022.3 - Model: Person Vehicle Bike Detection FP16 - Device: CPU

FPS, More Is Better
AVX512 On: 6638.71 (SE +/- 13.51, N = 3; Min: 6619.92 / Avg: 6638.71 / Max: 6664.93)
AVX512 Off: 3317.34 (SE +/- 26.07, N = 15; Min: 3249.25 / Avg: 3317.34 / Max: 3581.19)

ms, Fewer Is Better
AVX512 On: 9.63 (SE +/- 0.02, N = 3; MIN: 6.4 / MAX: 33.26; Min: 9.59 / Avg: 9.63 / Max: 9.65)
AVX512 Off: 19.28 (SE +/- 0.14, N = 15; MIN: 10.31 / MAX: 50.61; Min: 17.85 / Avg: 19.28 / Max: 19.67)

OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU

FPS, More Is Better
AVX512 On: 110240.89 (SE +/- 314.35, N = 3; Min: 109612.19 / Avg: 110240.89 / Max: 110557.53)
AVX512 Off: 62564.16 (SE +/- 278.13, N = 3; Min: 62007.9 / Avg: 62564.16 / Max: 62843.91)

ms, Fewer Is Better
AVX512 On: 0.99 (SE +/- 0.00, N = 3; MIN: 0.35 / MAX: 19.47; Min: 0.99 / Avg: 0.99 / Max: 1)
AVX512 Off: 1.91 (SE +/- 0.01, N = 3; MIN: 0.67 / MAX: 19.48; Min: 1.9 / Avg: 1.91 / Max: 1.92)

OpenVINO 2022.3 - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU

FPS, More Is Better
AVX512 On: 73970.12 (SE +/- 95.74, N = 3; Min: 73778.69 / Avg: 73970.12 / Max: 74069.3)
AVX512 Off: 66895.49 (SE +/- 20.10, N = 3; Min: 66863.26 / Avg: 66895.49 / Max: 66932.4)

ms, Fewer Is Better
AVX512 On: 1.58 (SE +/- 0.00, N = 3; MIN: 0.55 / MAX: 17.78; Min: 1.58 / Avg: 1.58 / Max: 1.59)
AVX512 Off: 1.76 (SE +/- 0.00, N = 3; MIN: 0.64 / MAX: 19.6; Min: 1.76 / Avg: 1.76 / Max: 1.76)
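
With twelve OpenVINO models, a single summary figure is convenient. Benchmark ratios are usually summarized with a geometric mean rather than an arithmetic one, so that one large ratio does not dominate. The sketch below is illustrative only and is not the calculation the result viewer itself performs: the dictionary name `openvino_fps` is hypothetical, the values are the FPS averages listed above, and `statistics.geometric_mean` is the Python standard library function (Python 3.8 or newer).

```python
from statistics import geometric_mean

# FPS averages for OpenVINO 2022.3 on CPU, copied from the results above.
# Each tuple is (AVX512 On, AVX512 Off).
openvino_fps = {
    "Face Detection FP16": (60.73, 26.18),
    "Person Detection FP16": (27.08, 15.06),
    "Person Detection FP32": (27.01, 14.99),
    "Vehicle Detection FP16": (1430.45, 1190.91),
    "Face Detection FP16-INT8": (118.00, 57.95),
    "Vehicle Detection FP16-INT8": (5690.34, 3954.90),
    "Weld Porosity Detection FP16": (6073.22, 2551.70),
    "Machine Translation EN To DE FP16": (580.40, 251.97),
    "Weld Porosity Detection FP16-INT8": (11818.33, 5692.99),
    "Person Vehicle Bike Detection FP16": (6638.71, 3317.34),
    "Age Gender Recognition Retail 0013 FP16": (110240.89, 62564.16),
    "Age Gender Recognition Retail 0013 FP16-INT8": (73970.12, 66895.49),
}

# Per-model On/Off throughput ratio, then one geometric mean over all models.
ratios = {model: on / off for model, (on, off) in openvino_fps.items()}
overall = geometric_mean(list(ratios.values()))

smallest = min(ratios, key=ratios.get)
largest = max(ratios, key=ratios.get)

print(f"Geometric mean On/Off ratio: {overall:.2f}x")
print(f"Smallest gain: {smallest} ({ratios[smallest]:.2f}x)")
print(f"Largest gain: {largest} ({ratios[largest]:.2f}x)")
```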

CPU Peak Freq (Highest CPU Core Frequency) Monitor

Phoronix Test Suite System Monitoring, Megahertz
AVX512 On: Min: 2250 / Avg: 2918.06 / Max: 3532
AVX512 Off: Min: 2203 / Avg: 2979.69 / Max: 3559

CPU Power Consumption Monitor

Phoronix Test Suite System Monitoring, Watts
AVX512 On: Min: 10.25 / Avg: 231.36 / Max: 398.39
AVX512 Off: Min: 10.15 / Avg: 179.15 / Max: 378.14
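
The AVX512 On run drew more power on average but also finished sooner, so average watts alone do not show which run used more energy. A back-of-the-envelope estimate multiplies average power by run time. This sketch rests on two assumptions that the result file does not confirm: that the power monitor's average covers the whole run, and that the test durations in the run management table at the top of the page (7 hours 54 minutes with AVX-512 on, 11 hours 27 minutes with it off) are the matching time spans. The helper `energy_wh` is hypothetical.

```python
def energy_wh(avg_watts: float, hours: int, minutes: int) -> float:
    """Estimated energy in watt-hours: average power times elapsed time."""
    return avg_watts * (hours + minutes / 60.0)


# Average CPU power from the monitor above; durations from the run table.
# ASSUMPTION: the monitor average spans the full duration of each run.
energy_on = energy_wh(231.36, 7, 54)    # AVX512 On
energy_off = energy_wh(179.15, 11, 27)  # AVX512 Off

print(f"AVX512 On:  {energy_on:.0f} Wh")
print(f"AVX512 Off: {energy_off:.0f} Wh")
print(f"Off / On energy ratio: {energy_off / energy_on:.2f}")
```

Under those assumptions the lower average power of the AVX512 Off run is outweighed by its longer run time.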

CPU Temperature Monitor

Phoronix Test Suite System Monitoring, Celsius
AVX512 On: Min: 23.25 / Avg: 51.4 / Max: 74.25
AVX512 Off: Min: 20.75 / Avg: 44.22 / Max: 76.13

80 Results Shown

miniBUDE:
  OpenMP - BM1:
    GFInst/s
    Billion Interactions/s
  OpenMP - BM2:
    GFInst/s
    Billion Interactions/s
libxsmm:
  128
  256
Embree:
  Pathtracer ISPC - Crown
  Pathtracer ISPC - Asian Dragon
  Pathtracer ISPC - Asian Dragon Obj
OpenVKL
OSPRay:
  gravity_spheres_volume/dim_512/ao/real_time
  gravity_spheres_volume/dim_512/scivis/real_time
  gravity_spheres_volume/dim_512/pathtracer/real_time
oneDNN
Cpuminer-Opt:
  x25x
  scrypt
  Blake-2 S
  Garlicoin
  Skeincoin
  Myriad-Groestl
  LBC, LBRY Credits
  Quad SHA-256, Pyrite
TensorFlow:
  CPU - 16 - AlexNet
  CPU - 32 - AlexNet
  CPU - 64 - AlexNet
  CPU - 256 - AlexNet
  CPU - 512 - AlexNet
  CPU - 16 - GoogLeNet
  CPU - 16 - ResNet-50
  CPU - 32 - GoogLeNet
  CPU - 32 - ResNet-50
  CPU - 64 - GoogLeNet
  CPU - 64 - ResNet-50
  CPU - 256 - GoogLeNet
  CPU - 256 - ResNet-50
  CPU - 512 - GoogLeNet
  CPU - 512 - ResNet-50
Neural Magic DeepSparse:
  NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Sentiment Analysis, 80% Pruned Quantized BERT Base Uncased - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  CV Segmentation, 90% Pruned YOLACT Pruned - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
  NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
OpenVINO:
  Face Detection FP16 - CPU:
    FPS
    ms
  Person Detection FP16 - CPU:
    FPS
    ms
  Person Detection FP32 - CPU:
    FPS
    ms
  Vehicle Detection FP16 - CPU:
    FPS
    ms
  Face Detection FP16-INT8 - CPU:
    FPS
    ms
  Vehicle Detection FP16-INT8 - CPU:
    FPS
    ms
  Weld Porosity Detection FP16 - CPU:
    FPS
    ms
  Machine Translation EN To DE FP16 - CPU:
    FPS
    ms
  Weld Porosity Detection FP16-INT8 - CPU:
    FPS
    ms
  Person Vehicle Bike Detection FP16 - CPU:
    FPS
    ms
  Age Gender Recognition Retail 0013 FP16 - CPU:
    FPS
    ms
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU:
    FPS
    ms
CPU Peak Freq (Highest CPU Core Frequency) Monitor:
  Phoronix Test Suite System Monitoring:
    Megahertz
    Watts
    Celsius