AVX-512 Core i9 Intel Rocket Lake

Benchmarks for a future article. Intel Core i9-11900K testing with an ASUS ROG MAXIMUS XIII HERO (1402 BIOS) and ASUS Intel RKL GT1 31GB graphics on Ubuntu 22.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2210210-NE-AVX512ROC09
The tests in this result file fall within the following categories:

C/C++ Compiler Tests 2 Tests
CPU Massive 6 Tests
Creator Workloads 8 Tests
Cryptocurrency Benchmarks, CPU Mining Tests 2 Tests
Cryptography 2 Tests
Game Development 2 Tests
HPC - High Performance Computing 14 Tests
Machine Learning 11 Tests
Molecular Dynamics 2 Tests
Multi-Core 10 Tests
NVIDIA GPU Compute 3 Tests
Intel oneAPI 7 Tests
OpenMPI Tests 3 Tests
Python Tests 6 Tests
Raytracing 2 Tests
Renderers 2 Tests
Scientific Computing 2 Tests
Server CPU Tests 4 Tests


Result Runs

    Result Identifier            Run Date           Test Duration
    i9-11900K: AVX-512 On        October 18 2022    1 Day, 1 Hour, 48 Minutes
    i9-11900K: AVX-512 Off       October 19 2022    1 Day, 2 Hours, 28 Minutes
    i9-11900K: AVX-512 On 512    October 20 2022    1 Day, 1 Hour, 45 Minutes

AVX-512 Core i9 Intel Rocket Lake Benchmarks

System Details:
    Processor: Intel Core i9-11900K @ 5.10GHz (8 Cores / 16 Threads)
    Motherboard: ASUS ROG MAXIMUS XIII HERO (1402 BIOS)
    Chipset: Intel Tiger Lake-H
    Memory: 32GB
    Disk: 2000GB Corsair Force MP600 + 32GB Flash Drive
    Graphics: ASUS Intel RKL GT1 31GB (1300MHz)
    Audio: Intel Tiger Lake-H HD Audio
    Monitor: ASUS MG28U
    Network: 2 x Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
    OS: Ubuntu 22.10
    Kernel: 5.19.0-21-generic (x86_64)
    Desktop: GNOME Shell 43.0
    Display Server: X Server + Wayland
    OpenGL: 4.6 Mesa 22.2.1
    Vulkan: 1.3.224
    Compiler: GCC 12.2.0
    File-System: ext4
    Screen Resolution: 3840x2160

System Logs:
    - Transparent Huge Pages: madvise
    - Compiler flags per run:
        i9-11900K: AVX-512 On: CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"
        i9-11900K: AVX-512 Off: CXXFLAGS="-O3 -march=native -mno-avx512f" CFLAGS="-O3 -march=native -mno-avx512f"
        i9-11900K: AVX-512 On 512: CXXFLAGS="-O3 -march=native -mprefer-vector-width=512" CFLAGS="-O3 -march=native -mprefer-vector-width=512"
    - GCC configured with: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-U8K4Qv/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
    - Scaling Governor: intel_pstate performance (EPP: performance)
    - CPU Microcode: 0x54
    - Thermald 2.5.1
    - Python 3.10.7
    - Security mitigations: itlb_multihit: Not affected; l1tf: Not affected; mds: Not affected; meltdown: Not affected; mmio_stale_data: Mitigation of Clear buffers, SMT vulnerable; retbleed: Mitigation of Enhanced IBRS; spec_store_bypass: Mitigation of SSB disabled via prctl; spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization; spectre_v2: Mitigation of Enhanced IBRS, IBPB: conditional, RSB filling, PBRSB-eIBRS: SW sequence; srbds: Not affected; tsx_async_abort: Not affected
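
To make the three configurations concrete, here is a minimal sketch (hypothetical file name; typical GCC 12 behavior assumed) of the kind of loop these flags steer:

    // saxpy.cpp: a loop GCC will auto-vectorize at -O3
    void saxpy(float *__restrict y, const float *__restrict x, float a, int n)
    {
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

Built with g++ -O3 -march=native on this CPU, AVX-512 instructions are available but GCC generally still prefers 256-bit vectors (the AVX-512 On run); adding -mprefer-vector-width=512 asks the vectorizer to use the full 512-bit zmm registers (the On 512 run); adding -mno-avx512f removes the AVX-512 Foundation ISA entirely, so the loop falls back to AVX2 (the Off run).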

[Condensed results-index table from OpenBenchmarking.org omitted: it listed every test in this comparison followed by the raw values for the three runs, which were unreadable in this export. The readable per-test results follow below.]

Cpuminer-Opt

Cpuminer-Opt is a fork of cpuminer-multi that carries a wide range of CPU performance optimizations for measuring the potential cryptocurrency mining performance of the processor across a variety of cryptocurrencies. The benchmark reports the hash speed achieved by the CPU for the selected algorithm. Learn more via the OpenBenchmarking.org test page.
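
As a rough standalone equivalent, and assuming cpuminer-opt's documented flags, the miner can be run in offline benchmark mode with, e.g., ./cpuminer --algo=sha256t --benchmark --threads=16, which prints the sustained hash rate without connecting to a pool.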

Cpuminer-Opt 3.18, Algorithm: LBC, LBRY Credits (kH/s, more is better):
    AVX-512 On:     77083 (SE +/- 84.13, N = 3; min/avg/max 76920 / 77083.33 / 77200)
    AVX-512 Off:    27270 (SE +/- 20.00, N = 3; min/avg/max 27230 / 27270 / 27290)
    AVX-512 On 512: 76393 (SE +/- 116.81, N = 3; min/avg/max 76160 / 76393.33 / 76520)
    1. (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp (the AVX-512 Off build adds -mno-avx512f)

Cpuminer-Opt 3.18, Algorithm: Quad SHA-256, Pyrite (kH/s, more is better):
    AVX-512 On:     167500 (SE +/- 280.42, N = 3; min/avg/max 167030 / 167500 / 168000)
    AVX-512 Off:    64663 (SE +/- 42.56, N = 3; min/avg/max 64580 / 64663.33 / 64720)
    AVX-512 On 512: 164790 (SE +/- 346.55, N = 3; min/avg/max 164180 / 164790 / 165380)
    1. (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp (the AVX-512 Off build adds -mno-avx512f)

Cpuminer-Opt 3.18, Algorithm: Myriad-Groestl (kH/s, more is better):
    AVX-512 On:     40573 (SE +/- 58.12, N = 3; min/avg/max 40480 / 40573.33 / 40680)
    AVX-512 Off:    16150 (SE +/- 17.32, N = 3; min/avg/max 16120 / 16150 / 16180)
    AVX-512 On 512: 40663 (SE +/- 164.55, N = 3; min/avg/max 40380 / 40663.33 / 40950)
    1. (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp (the AVX-512 Off build adds -mno-avx512f)

OpenVINO

This is a test of Intel OpenVINO, a toolkit for optimizing and deploying neural networks, using its built-in benchmarking support to analyze the throughput and latency of various models. Learn more via the OpenBenchmarking.org test page.
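
These figures come from OpenVINO's bundled benchmark_app; a comparable standalone invocation (model path hypothetical) would be benchmark_app -m model.xml -d CPU, which reports both throughput (FPS) and average latency (ms) for the chosen model on the CPU plugin.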

OpenVINO 2022.2.dev, Model: Face Detection FP16-INT8, Device: CPU (FPS, more is better):
    AVX-512 On:     12.23 (SE +/- 0.01, N = 3; min/avg/max 12.21 / 12.23 / 12.25)
    AVX-512 Off:    4.88 (SE +/- 0.02, N = 3; min/avg/max 4.86 / 4.88 / 4.91)
    AVX-512 On 512: 12.17 (SE +/- 0.01, N = 3; min/avg/max 12.15 / 12.17 / 12.2)
    1. (CXX) g++ options: -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -pie (the AVX-512 On builds also link -ldl; the AVX-512 Off build adds -mno-avx512f)

OpenVINO 2022.2.dev, Model: Face Detection FP16-INT8, Device: CPU (ms, fewer is better):
    AVX-512 On:     326.76 (SE +/- 0.34, N = 3; min/avg/max 326.09 / 326.76 / 327.18; sample MIN 183.35 / MAX 422.57)
    AVX-512 Off:    817.17 (SE +/- 3.19, N = 3; min/avg/max 810.9 / 817.17 / 821.33; sample MIN 559.45 / MAX 1507.03)
    AVX-512 On 512: 328.08 (SE +/- 0.43, N = 3; min/avg/max 327.33 / 328.08 / 328.83; sample MIN 193.4 / MAX 422.42)
    1. (CXX) g++ options: -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -pie (the AVX-512 On builds also link -ldl; the AVX-512 Off build adds -mno-avx512f)

OpenVINO 2022.2.dev, Model: Weld Porosity Detection FP16-INT8, Device: CPU (ms, fewer is better):
    AVX-512 On:     6.43 (SE +/- 0.03, N = 3; min/avg/max 6.38 / 6.43 / 6.48; sample MIN 3.29 / MAX 13.76)
    AVX-512 Off:    14.98 (SE +/- 0.06, N = 3; min/avg/max 14.86 / 14.98 / 15.05; sample MIN 8.89 / MAX 192.81)
    AVX-512 On 512: 6.42 (SE +/- 0.02, N = 3; min/avg/max 6.38 / 6.42 / 6.45; sample MIN 3.32 / MAX 13.51)
    1. (CXX) g++ options: -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -pie (the AVX-512 On builds also link -ldl; the AVX-512 Off build adds -mno-avx512f)

OpenVINO 2022.2.dev, Model: Weld Porosity Detection FP16-INT8, Device: CPU (FPS, more is better):
    AVX-512 On:     1240.56 (SE +/- 5.38, N = 3; min/avg/max 1232.3 / 1240.56 / 1250.65)
    AVX-512 Off:    533.79 (SE +/- 2.18, N = 3; min/avg/max 531.3 / 533.79 / 538.13)
    AVX-512 On 512: 1242.45 (SE +/- 4.58, N = 3; min/avg/max 1237.1 / 1242.45 / 1251.57)
    1. (CXX) g++ options: -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -pie (the AVX-512 On builds also link -ldl; the AVX-512 Off build adds -mno-avx512f)

Cpuminer-Opt

Cpuminer-Opt is a fork of cpuminer-multi that carries a wide range of CPU performance optimizations for measuring the potential cryptocurrency mining performance of the processor across a variety of cryptocurrencies. The benchmark reports the hash speed achieved by the CPU for the selected algorithm. Learn more via the OpenBenchmarking.org test page.

Cpuminer-Opt 3.18, Algorithm: Blake-2 S (kH/s, more is better):
    AVX-512 On:     862110 (SE +/- 9836.24, N = 3; min/avg/max 850670 / 862110 / 881690)
    AVX-512 Off:    393213 (SE +/- 382.81, N = 3; min/avg/max 392590 / 393213.33 / 393910)
    AVX-512 On 512: 895513 (SE +/- 7851.09, N = 15; min/avg/max 871390 / 895512.67 / 999020)
    1. (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp (the AVX-512 Off build adds -mno-avx512f)

Cpuminer-Opt 3.18, Algorithm: Triple SHA-256, Onecoin (kH/s, more is better):
    AVX-512 On:     231700 (SE +/- 1895.81, N = 3; min/avg/max 227960 / 231700 / 234110)
    AVX-512 Off:    106137 (SE +/- 627.84, N = 3; min/avg/max 104890 / 106136.67 / 106890)
    AVX-512 On 512: 230687 (SE +/- 1461.35, N = 3; min/avg/max 228060 / 230686.67 / 233110)
    1. (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp (the AVX-512 Off build adds -mno-avx512f)

oneDNN
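
oneDNN is the Intel oneAPI Deep Neural Network Library; these harnesses time individual primitive shapes on the CPU engine. Because oneDNN JIT-generates its kernels at runtime, the AVX-512 Off behavior can also be approximated without recompiling by capping the dispatched ISA. A minimal sketch, assuming oneDNN's documented CPU ISA controls are enabled in the build:

    #include <iostream>
    #include "dnnl.hpp"  // oneDNN C++ API

    int main() {
        // Cap JIT dispatch at AVX2 before any primitive is created,
        // mimicking the -mno-avx512f run for oneDNN's generated kernels.
        dnnl::set_max_cpu_isa(dnnl::cpu_isa::avx2);

        // Confirm which ISA the library will actually use.
        std::cout << "effective ISA enum: "
                  << static_cast<int>(dnnl::get_effective_cpu_isa()) << "\n";
        return 0;
    }

The same cap is available as an environment variable (ONEDNN_MAX_CPU_ISA=AVX2) when the library is built with that feature enabled.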

oneDNN 2.7, Harness: Deconvolution Batch shapes_1d, Data Type: u8s8f32, Engine: CPU (ms, fewer is better):
    AVX-512 On:     0.840660 (SE +/- 0.003322, N = 3; min/avg/max 0.83 / 0.84 / 0.85; MIN 0.82)
    AVX-512 Off:    1.778320 (SE +/- 0.003095, N = 3; min/avg/max 1.77 / 1.78 / 1.78; MIN 1.73)
    AVX-512 On 512: 0.846376 (SE +/- 0.003829, N = 3; min/avg/max 0.84 / 0.85 / 0.85; MIN 0.81)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl (the AVX-512 Off build adds -mno-avx512f)

OpenVINO

This is a test of Intel OpenVINO, a toolkit for optimizing and deploying neural networks, using its built-in benchmarking support to analyze the throughput and latency of various models. Learn more via the OpenBenchmarking.org test page.

OpenVINO 2022.2.dev, Model: Vehicle Detection FP16-INT8, Device: CPU (ms, fewer is better):
    AVX-512 On:     6.41 (SE +/- 0.04, N = 3; min/avg/max 6.35 / 6.41 / 6.48; sample MIN 3.21 / MAX 14.83)
    AVX-512 Off:    12.88 (SE +/- 0.06, N = 3; min/avg/max 12.75 / 12.88 / 12.96; sample MIN 7.49 / MAX 109.79)
    AVX-512 On 512: 6.33 (SE +/- 0.02, N = 3; min/avg/max 6.3 / 6.33 / 6.36; sample MIN 4.22 / MAX 15.17)
    1. (CXX) g++ options: -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -pie (the AVX-512 On builds also link -ldl; the AVX-512 Off build adds -mno-avx512f)

OpenVINO 2022.2.dev, Model: Vehicle Detection FP16-INT8, Device: CPU (FPS, more is better):
    AVX-512 On:     623.17 (SE +/- 3.77, N = 3; min/avg/max 616.09 / 623.17 / 628.96)
    AVX-512 Off:    310.38 (SE +/- 1.58, N = 3; min/avg/max 308.4 / 310.38 / 313.51)
    AVX-512 On 512: 630.58 (SE +/- 1.77, N = 3; min/avg/max 627.3 / 630.58 / 633.36)
    1. (CXX) g++ options: -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -pie (the AVX-512 On builds also link -ldl; the AVX-512 Off build adds -mno-avx512f)

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that the Phoronix Test Suite also offers pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries. Learn more via the OpenBenchmarking.org test page.
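
For reference, the underlying harness can be driven directly; a typical invocation for the CPU results here (flags per the tf_cnn_benchmarks documentation, with model and batch size varying per graph) is python tf_cnn_benchmarks.py --device=cpu --model=googlenet --batch_size=512.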

TensorFlow 2.10, Device: CPU, Batch Size: 512, Model: GoogLeNet (images/sec, more is better):
    AVX-512 On:     67.68 (SE +/- 0.02, N = 3; min/avg/max 67.65 / 67.68 / 67.71)
    AVX-512 Off:    33.92 (SE +/- 0.06, N = 3; min/avg/max 33.85 / 33.92 / 34.05)
    AVX-512 On 512: 67.67 (SE +/- 0.03, N = 3; min/avg/max 67.6 / 67.67 / 67.71)

Cpuminer-Opt

Cpuminer-Opt is a fork of cpuminer-multi that carries a wide range of CPU performance optimizations for measuring the potential cryptocurrency mining performance of the processor across a variety of cryptocurrencies. The benchmark reports the hash speed achieved by the CPU for the selected algorithm. Learn more via the OpenBenchmarking.org test page.

Cpuminer-Opt 3.18, Algorithm: Garlicoin (kH/s, more is better):
    AVX-512 On:     4384.38 (SE +/- 25.90, N = 3; min/avg/max 4332.81 / 4384.38 / 4414.29)
    AVX-512 Off:    2381.45 (SE +/- 6.13, N = 3; min/avg/max 2370.02 / 2381.45 / 2391)
    AVX-512 On 512: 4392.29 (SE +/- 10.30, N = 3; min/avg/max 4375.47 / 4392.29 / 4411.01)
    1. (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp (the AVX-512 Off build adds -mno-avx512f)

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that the Phoronix Test Suite also offers pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.10, Device: CPU, Batch Size: 256, Model: GoogLeNet (images/sec, more is better):
    AVX-512 On:     66.63 (SE +/- 0.11, N = 3; min/avg/max 66.42 / 66.63 / 66.81)
    AVX-512 Off:    37.37 (SE +/- 0.02, N = 3; min/avg/max 37.35 / 37.37 / 37.4)
    AVX-512 On 512: 66.49 (SE +/- 0.06, N = 3; min/avg/max 66.41 / 66.49 / 66.62)

TensorFlow 2.10, Device: CPU, Batch Size: 256, Model: ResNet-50 (images/sec, more is better):
    AVX-512 On:     22.20 (SE +/- 0.01, N = 3; min/avg/max 22.19 / 22.2 / 22.21)
    AVX-512 Off:    12.48 (SE +/- 0.00, N = 3; min/avg/max 12.48 / 12.48 / 12.49)
    AVX-512 On 512: 22.13 (SE +/- 0.00, N = 3; min/avg/max 22.12 / 22.13 / 22.13)

TensorFlow 2.10, Device: CPU, Batch Size: 64, Model: ResNet-50 (images/sec, more is better):
    AVX-512 On:     21.86 (SE +/- 0.01, N = 3; min/avg/max 21.84 / 21.86 / 21.88)
    AVX-512 Off:    12.37 (SE +/- 0.01, N = 3; min/avg/max 12.36 / 12.37 / 12.38)
    AVX-512 On 512: 21.88 (SE +/- 0.01, N = 3; min/avg/max 21.87 / 21.88 / 21.89)

TensorFlow 2.10, Device: CPU, Batch Size: 32, Model: GoogLeNet (images/sec, more is better):
    AVX-512 On:     64.22 (SE +/- 0.01, N = 3; min/avg/max 64.21 / 64.22 / 64.24)
    AVX-512 Off:    36.49 (SE +/- 0.04, N = 3; min/avg/max 36.43 / 36.49 / 36.56)
    AVX-512 On 512: 64.42 (SE +/- 0.06, N = 3; min/avg/max 64.34 / 64.42 / 64.54)

TensorFlow 2.10, Device: CPU, Batch Size: 64, Model: GoogLeNet (images/sec, more is better):
    AVX-512 On:     65.02 (SE +/- 0.05, N = 3; min/avg/max 64.94 / 65.02 / 65.11)
    AVX-512 Off:    36.94 (SE +/- 0.01, N = 3; min/avg/max 36.92 / 36.94 / 36.96)
    AVX-512 On 512: 65.15 (SE +/- 0.07, N = 3; min/avg/max 65.08 / 65.15 / 65.29)

TensorFlow 2.10, Device: CPU, Batch Size: 16, Model: GoogLeNet (images/sec, more is better):
    AVX-512 On:     63.34 (SE +/- 0.03, N = 3; min/avg/max 63.29 / 63.34 / 63.38)
    AVX-512 Off:    36.22 (SE +/- 0.02, N = 3; min/avg/max 36.18 / 36.22 / 36.25)
    AVX-512 On 512: 63.83 (SE +/- 0.16, N = 3; min/avg/max 63.56 / 63.83 / 64.1)

Mobile Neural Network

MNN is the Mobile Neural Network, a highly efficient and lightweight deep learning framework developed by Alibaba. This MNN test profile builds the OpenMP / CPU-threaded version for processor benchmarking rather than any GPU-accelerated path. MNN can make use of AVX-512 extensions. Learn more via the OpenBenchmarking.org test page.
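
As an illustration of what each MNN result times, here is a minimal C++ inference sketch (model file name hypothetical; MNN models are converted ahead of time):

    #include <MNN/Interpreter.hpp>

    int main() {
        // Load a converted model, e.g. resnet-v2-50 as benchmarked below.
        auto* net = MNN::Interpreter::createFromFile("resnet-v2-50.mnn");
        MNN::ScheduleConfig conf;
        conf.numThread = 16;                 // CPU-threaded, as in this profile
        auto* session = net->createSession(conf);
        net->runSession(session);            // one timed inference pass
        return 0;                            // cleanup omitted in this sketch
    }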

Mobile Neural Network 2.1, Model: resnet-v2-50 (ms, fewer is better):
    AVX-512 On:     10.30 (SE +/- 0.05, N = 15; min/avg/max 9.92 / 10.3 / 10.6; MIN 9.74 / MAX 33.88)
    AVX-512 Off:    18.04 (SE +/- 0.08, N = 3; min/avg/max 17.9 / 18.04 / 18.18; MIN 17.67 / MAX 24.46)
    AVX-512 On 512: 10.37 (SE +/- 0.04, N = 15; min/avg/max 10.23 / 10.37 / 10.61; MIN 10.05 / MAX 22.64)
    1. (CXX) g++ options: -O3 -march=native -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl (the AVX-512 Off build adds -mno-avx512f)

oneDNN

oneDNN 2.7, Harness: IP Shapes 1D, Data Type: u8s8f32, Engine: CPU (ms, fewer is better):
    AVX-512 On:     0.720702 (SE +/- 0.001991, N = 4; min/avg/max 0.72 / 0.72 / 0.73; MIN 0.67)
    AVX-512 Off:    1.243960 (SE +/- 0.000314, N = 4; min/avg/max 1.24 / 1.24 / 1.24; MIN 1.22)
    AVX-512 On 512: 0.715925 (SE +/- 0.001828, N = 4; min/avg/max 0.71 / 0.72 / 0.72; MIN 0.66)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl (the AVX-512 Off build adds -mno-avx512f)

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that the Phoronix Test Suite also offers pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.10, Device: CPU, Batch Size: 32, Model: ResNet-50 (images/sec, more is better):
    AVX-512 On:     21.24 (SE +/- 0.02, N = 3; min/avg/max 21.21 / 21.24 / 21.28)
    AVX-512 Off:    12.32 (SE +/- 0.01, N = 3; min/avg/max 12.31 / 12.32 / 12.33)
    AVX-512 On 512: 21.29 (SE +/- 0.00, N = 3; min/avg/max 21.28 / 21.29 / 21.29)

Neural Magic DeepSparse
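
Neural Magic DeepSparse is a sparsity-aware CPU inference runtime for ONNX models that leans heavily on AVX-512 (including VNNI) kernels, which is part of why the AVX-512 Off deltas are so large here. Similar numbers can be gathered outside the test suite with its bundled CLI, e.g. deepsparse.benchmark model.onnx --scenario sync (model path hypothetical; flags per the DeepSparse documentation).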

Neural Magic DeepSparse 1.1, Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90, Scenario: Synchronous Single-Stream (ms/batch, fewer is better):
    AVX-512 On:     22.89 (SE +/- 0.10, N = 3; min/avg/max 22.71 / 22.89 / 23.07)
    AVX-512 Off:    39.14 (SE +/- 0.08, N = 3; min/avg/max 39.03 / 39.14 / 39.3)
    AVX-512 On 512: 23.01 (SE +/- 0.10, N = 3; min/avg/max 22.83 / 23.01 / 23.17)

Neural Magic DeepSparse 1.1, Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90, Scenario: Synchronous Single-Stream (items/sec, more is better):
    AVX-512 On:     43.68 (SE +/- 0.20, N = 3; min/avg/max 43.34 / 43.68 / 44.02)
    AVX-512 Off:    25.54 (SE +/- 0.05, N = 3; min/avg/max 25.44 / 25.54 / 25.62)
    AVX-512 On 512: 43.44 (SE +/- 0.19, N = 3; min/avg/max 43.14 / 43.44 / 43.79)

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that the Phoronix Test Suite also offers pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.10, Device: CPU, Batch Size: 16, Model: ResNet-50 (images/sec, more is better):
    AVX-512 On:     20.46 (SE +/- 0.01, N = 3; min/avg/max 20.44 / 20.46 / 20.47)
    AVX-512 Off:    12.29 (SE +/- 0.01, N = 3; min/avg/max 12.28 / 12.29 / 12.32)
    AVX-512 On 512: 20.55 (SE +/- 0.01, N = 3; min/avg/max 20.53 / 20.55 / 20.56)

AI Benchmark Alpha

AI Benchmark Alpha is a Python library for evaluating artificial intelligence (AI) performance on diverse hardware platforms and relies upon the TensorFlow machine learning library. Learn more via the OpenBenchmarking.org test page.
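
The library is driven from Python; per its documentation the whole suite reduces to essentially: from ai_benchmark import AIBenchmark; results = AIBenchmark().run(), which yields the inference, training, and combined AI scores reported here.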

AI Benchmark Alpha 0.1.2, Device Training Score (Score, more is better):
    AVX-512 On:     2069
    AVX-512 Off:    1282
    AVX-512 On 512: 2076

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that the Phoronix Test Suite also offers pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.10, Device: CPU, Batch Size: 512, Model: AlexNet (images/sec, more is better):
    AVX-512 On:     173.67 (SE +/- 0.14, N = 3; min/avg/max 173.47 / 173.67 / 173.93)
    AVX-512 Off:    107.72 (SE +/- 0.09, N = 3; min/avg/max 107.55 / 107.72 / 107.86)
    AVX-512 On 512: 172.66 (SE +/- 0.21, N = 3; min/avg/max 172.24 / 172.66 / 172.93)

OSPRay

Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay 2.10, Benchmark: gravity_spheres_volume/dim_512/scivis/real_time (Items Per Second, more is better):
    AVX-512 On:     4.37042 (SE +/- 0.01025, N = 3; min/avg/max 4.35 / 4.37 / 4.38)
    AVX-512 Off:    2.72378 (SE +/- 0.00481, N = 3; min/avg/max 2.72 / 2.72 / 2.73)
    AVX-512 On 512: 4.36772 (SE +/- 0.02096, N = 3; min/avg/max 4.34 / 4.37 / 4.41)

Cpuminer-Opt

Cpuminer-Opt is a fork of cpuminer-multi that carries a wide range of CPU performance optimizations for measuring the potential cryptocurrency mining performance of the processor across a variety of cryptocurrencies. The benchmark reports the hash speed achieved by the CPU for the selected algorithm. Learn more via the OpenBenchmarking.org test page.

Cpuminer-Opt 3.18, Algorithm: Skeincoin (kH/s, more is better):
    AVX-512 On:     111730 (SE +/- 394.25, N = 3; min/avg/max 111240 / 111730 / 112510)
    AVX-512 Off:    70720 (SE +/- 85.44, N = 3; min/avg/max 70620 / 70720 / 70890)
    AVX-512 On 512: 112403 (SE +/- 146.55, N = 3; min/avg/max 112120 / 112403.33 / 112610)
    1. (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp (the AVX-512 Off build adds -mno-avx512f)

OSPRay

Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay 2.10, Benchmark: gravity_spheres_volume/dim_512/ao/real_time (Items Per Second, more is better):
    AVX-512 On:     4.42072 (SE +/- 0.02867, N = 3; min/avg/max 4.39 / 4.42 / 4.48)
    AVX-512 Off:    2.82821 (SE +/- 0.01595, N = 3; min/avg/max 2.8 / 2.83 / 2.85)
    AVX-512 On 512: 4.46906 (SE +/- 0.01309, N = 3; min/avg/max 4.45 / 4.47 / 4.49)

OpenVINO

This is a test of Intel OpenVINO, a toolkit for optimizing and deploying neural networks, using its built-in benchmarking support to analyze the throughput and latency of various models. Learn more via the OpenBenchmarking.org test page.

OpenVINO 2022.2.dev, Model: Weld Porosity Detection FP16, Device: CPU (ms, fewer is better):
    AVX-512 On:     24.92 (SE +/- 0.03, N = 3; min/avg/max 24.88 / 24.92 / 24.98; sample MIN 12.72 / MAX 35.46)
    AVX-512 Off:    15.81 (SE +/- 0.00, N = 3; min/avg/max 15.81 / 15.81 / 15.82; sample MIN 7.14 / MAX 105.25)
    AVX-512 On 512: 24.86 (SE +/- 0.08, N = 3; min/avg/max 24.76 / 24.86 / 25.02; sample MIN 12.43 / MAX 42.38)
    1. (CXX) g++ options: -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -pie (the AVX-512 On builds also link -ldl; the AVX-512 Off build adds -mno-avx512f)

oneDNN

oneDNN 2.7, Harness: Deconvolution Batch shapes_3d, Data Type: u8s8f32, Engine: CPU (ms, fewer is better):
    AVX-512 On:     1.47049 (SE +/- 0.00510, N = 9; min/avg/max 1.44 / 1.47 / 1.49; MIN 1.35)
    AVX-512 Off:    2.31183 (SE +/- 0.00997, N = 9; min/avg/max 2.27 / 2.31 / 2.37; MIN 2.22)
    AVX-512 On 512: 1.46948 (SE +/- 0.00281, N = 9; min/avg/max 1.46 / 1.47 / 1.48; MIN 1.36)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl (the AVX-512 Off build adds -mno-avx512f)

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent, optimized for mobile and other platforms. Learn more via the OpenBenchmarking.org test page.
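
To show what each NCNN target measures, here is a minimal C++ sketch (model files and blob names are model-specific; the ones below are hypothetical examples from the ncnn model zoo):

    #include "net.h"  // ncnn

    int main() {
        ncnn::Net net;
        net.opt.num_threads = 16;            // CPU-threaded inference
        net.load_param("squeezenet_ssd.param");
        net.load_model("squeezenet_ssd.bin");

        ncnn::Mat in(300, 300, 3);           // dummy 300x300 3-channel input
        in.fill(0.5f);

        ncnn::Extractor ex = net.create_extractor();
        ex.input("data", in);                // input blob name per model
        ncnn::Mat out;
        ex.extract("detection_out", out);    // timed forward pass
        return 0;
    }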

NCNN 20220729, Target: CPU, Model: vision_transformer (ms, fewer is better):
    AVX-512 On:     87.29 (SE +/- 0.11, N = 3; min/avg/max 87.14 / 87.29 / 87.5; MIN 86.63 / MAX 92.92)
    AVX-512 Off:    110.38 (SE +/- 0.39, N = 3; min/avg/max 109.75 / 110.38 / 111.1; MIN 109.43 / MAX 116.1)
    AVX-512 On 512: 70.39 (SE +/- 0.12, N = 3; min/avg/max 70.24 / 70.39 / 70.63; MIN 69.62 / MAX 71.72)
    1. (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread (the AVX-512 Off build adds -mno-avx512f)

Neural Magic DeepSparse

Neural Magic DeepSparse 1.1, Model: CV Classification, ResNet-50 ImageNet, Scenario: Synchronous Single-Stream (ms/batch, fewer is better):
    AVX-512 On:     7.5173 (SE +/- 0.0274, N = 3; min/avg/max 7.48 / 7.52 / 7.57)
    AVX-512 Off:    11.6244 (SE +/- 0.0062, N = 3; min/avg/max 11.62 / 11.62 / 11.64)
    AVX-512 On 512: 7.4707 (SE +/- 0.0069, N = 3; min/avg/max 7.46 / 7.47 / 7.48)

Neural Magic DeepSparse 1.1, Model: CV Classification, ResNet-50 ImageNet, Scenario: Synchronous Single-Stream (items/sec, more is better):
    AVX-512 On:     132.91 (SE +/- 0.48, N = 3; min/avg/max 131.96 / 132.91 / 133.51)
    AVX-512 Off:    85.98 (SE +/- 0.04, N = 3; min/avg/max 85.89 / 85.98 / 86.04)
    AVX-512 On 512: 133.74 (SE +/- 0.12, N = 3; min/avg/max 133.57 / 133.74 / 133.98)

Neural Magic DeepSparse 1.1, Model: CV Classification, ResNet-50 ImageNet, Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better):
    AVX-512 On:     25.73 (SE +/- 0.15, N = 3; min/avg/max 25.5 / 25.73 / 26.01)
    AVX-512 Off:    39.35 (SE +/- 0.05, N = 3; min/avg/max 39.24 / 39.34 / 39.41)
    AVX-512 On 512: 25.49 (SE +/- 0.11, N = 3; min/avg/max 25.35 / 25.49 / 25.71)

Neural Magic DeepSparse 1.1, Model: CV Classification, ResNet-50 ImageNet, Scenario: Asynchronous Multi-Stream (items/sec, more is better):
    AVX-512 On:     155.39 (SE +/- 0.92, N = 3; min/avg/max 153.64 / 155.39 / 156.77)
    AVX-512 Off:    101.64 (SE +/- 0.14, N = 3; min/avg/max 101.48 / 101.64 / 101.92)
    AVX-512 On 512: 156.81 (SE +/- 0.68, N = 3; min/avg/max 155.49 / 156.81 / 157.74)

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that the Phoronix Test Suite also offers pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.10, Device: CPU, Batch Size: 256, Model: VGG-16 (images/sec, more is better):
    AVX-512 On:     9.42 (SE +/- 0.01, N = 3; min/avg/max 9.41 / 9.42 / 9.43)
    AVX-512 Off:    6.22 (SE +/- 0.00, N = 3; min/avg/max 6.21 / 6.22 / 6.22)
    AVX-512 On 512: 9.42 (SE +/- 0.01, N = 3; min/avg/max 9.41 / 9.42 / 9.43)

TensorFlow 2.10, Device: CPU, Batch Size: 64, Model: VGG-16 (images/sec, more is better):
    AVX-512 On:     8.73 (SE +/- 0.01, N = 3; min/avg/max 8.71 / 8.73 / 8.74)
    AVX-512 Off:    6.11 (SE +/- 0.00, N = 3; min/avg/max 6.11 / 6.11 / 6.12)
    AVX-512 On 512: 8.69 (SE +/- 0.01, N = 3; min/avg/max 8.67 / 8.69 / 8.7)

TensorFlow 2.10, Device: CPU, Batch Size: 32, Model: VGG-16 (images/sec, more is better):
    AVX-512 On:     8.32 (SE +/- 0.00, N = 3; min/avg/max 8.32 / 8.32 / 8.33)
    AVX-512 Off:    5.95 (SE +/- 0.00, N = 3; min/avg/max 5.95 / 5.95 / 5.96)
    AVX-512 On 512: 8.29 (SE +/- 0.01, N = 3; min/avg/max 8.28 / 8.29 / 8.3)

TensorFlow 2.10, Device: CPU, Batch Size: 256, Model: AlexNet (images/sec, more is better):
    AVX-512 On:     163.87 (SE +/- 0.42, N = 3; min/avg/max 163.22 / 163.87 / 164.65)
    AVX-512 Off:    117.43 (SE +/- 0.05, N = 3; min/avg/max 117.38 / 117.43 / 117.54)
    AVX-512 On 512: 163.09 (SE +/- 0.22, N = 3; min/avg/max 162.66 / 163.09 / 163.34)

Neural Magic DeepSparse

Neural Magic DeepSparse 1.1, Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90, Scenario: Asynchronous Multi-Stream (ms/batch, fewer is better):
    AVX-512 On:     109.28 (SE +/- 0.09, N = 3; min/avg/max 109.12 / 109.28 / 109.44)
    AVX-512 Off:    150.77 (SE +/- 0.85, N = 3; min/avg/max 149.8 / 150.77 / 152.45)
    AVX-512 On 512: 108.49 (SE +/- 0.07, N = 3; min/avg/max 108.38 / 108.49 / 108.61)

Neural Magic DeepSparse 1.1, Model: NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90, Scenario: Asynchronous Multi-Stream (items/sec, more is better):
    AVX-512 On:     36.60 (SE +/- 0.03, N = 3; min/avg/max 36.54 / 36.6 / 36.65)
    AVX-512 Off:    26.53 (SE +/- 0.15, N = 3; min/avg/max 26.23 / 26.53 / 26.7)
    AVX-512 On 512: 36.86 (SE +/- 0.02, N = 3; min/avg/max 36.82 / 36.86 / 36.89)

simdjson

This is a benchmark of simdjson, a high-performance JSON parser. simdjson aims to be the fastest JSON parser and is used by projects such as Microsoft FishStore, Yandex ClickHouse, Shopify, and others. Learn more via the OpenBenchmarking.org test page.
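
A minimal sketch of the On Demand API these throughput numbers exercise (input file hypothetical; an exceptions-enabled build is assumed):

    #include <cstdint>
    #include <iostream>
    #include "simdjson.h"

    int main() {
        simdjson::ondemand::parser parser;
        simdjson::padded_string json =
            simdjson::padded_string::load("twitter.json");
        simdjson::ondemand::document doc = parser.iterate(json);
        uint64_t count = doc["search_metadata"]["count"];

        // simdjson also selects a SIMD kernel at runtime; report which one.
        std::cout << simdjson::get_active_implementation()->name()
                  << ": count = " << count << "\n";
        return 0;
    }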

simdjson 2.0, Throughput Test: TopTweet (GB/s, more is better):
    AVX-512 On:     8.82 (SE +/- 0.01, N = 3; min/avg/max 8.8 / 8.82 / 8.84)
    AVX-512 Off:    6.35 (SE +/- 0.01, N = 3; min/avg/max 6.34 / 6.35 / 6.36)
    AVX-512 On 512: 8.82 (SE +/- 0.00, N = 3; min/avg/max 8.81 / 8.82 / 8.82)
    1. (CXX) g++ options: -O3 -march=native (the AVX-512 Off build adds -mno-avx512f)

simdjson 2.0, Throughput Test: DistinctUserID (GB/s, more is better):
    AVX-512 On:     8.78 (SE +/- 0.01, N = 3; min/avg/max 8.77 / 8.78 / 8.8)
    AVX-512 Off:    6.33 (SE +/- 0.00, N = 3; min/avg/max 6.33 / 6.33 / 6.34)
    AVX-512 On 512: 8.77 (SE +/- 0.00, N = 3; min/avg/max 8.77 / 8.77 / 8.78)
    1. (CXX) g++ options: -O3 -march=native (the AVX-512 Off build adds -mno-avx512f)

oneDNN

oneDNN 2.7, Harness: IP Shapes 3D, Data Type: u8s8f32, Engine: CPU (ms, fewer is better):
    AVX-512 On:     3.01377 (SE +/- 0.00593, N = 5; min/avg/max 3 / 3.01 / 3.03; MIN 2.91)
    AVX-512 Off:    2.18670 (SE +/- 0.00482, N = 5; min/avg/max 2.18 / 2.19 / 2.2; MIN 2.11)
    AVX-512 On 512: 2.75246 (SE +/- 0.01183, N = 5; min/avg/max 2.73 / 2.75 / 2.79; MIN 2.64)
    1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl (the AVX-512 Off build adds -mno-avx512f)

simdjson

This is a benchmark of simdjson, a high-performance JSON parser. simdjson aims to be the fastest JSON parser and is used by projects such as Microsoft FishStore, Yandex ClickHouse, Shopify, and others. Learn more via the OpenBenchmarking.org test page.

simdjson 2.0, Throughput Test: PartialTweets (GB/s, more is better):
    AVX-512 On:     8.52 (SE +/- 0.00, N = 3; min/avg/max 8.51 / 8.52 / 8.52)
    AVX-512 Off:    6.20 (SE +/- 0.00, N = 3; min/avg/max 6.2 / 6.2 / 6.21)
    AVX-512 On 512: 8.51 (SE +/- 0.00, N = 3; min/avg/max 8.5 / 8.51 / 8.51)
    1. (CXX) g++ options: -O3 -march=native (the AVX-512 Off build adds -mno-avx512f)

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that the Phoronix Test Suite also offers pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.10, Device: CPU, Batch Size: 16, Model: VGG-16 (images/sec, more is better):
    AVX-512 On:     7.76 (SE +/- 0.01, N = 3; min/avg/max 7.74 / 7.76 / 7.77)
    AVX-512 Off:    5.69 (SE +/- 0.00, N = 3; min/avg/max 5.69 / 5.69 / 5.69)
    AVX-512 On 512: 7.76 (SE +/- 0.00, N = 3; min/avg/max 7.75 / 7.76 / 7.76)

TensorFlow 2.10, Device: CPU, Batch Size: 64, Model: AlexNet (images/sec, more is better):
    AVX-512 On:     136.92 (SE +/- 0.11, N = 3; min/avg/max 136.78 / 136.92 / 137.13)
    AVX-512 Off:    103.31 (SE +/- 0.07, N = 3; min/avg/max 103.17 / 103.31 / 103.42)
    AVX-512 On 512: 137.75 (SE +/- 0.13, N = 3; min/avg/max 137.5 / 137.75 / 137.9)

OSPRay

Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay 2.10, Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time (Items Per Second, more is better):
    AVX-512 On:     5.36461 (SE +/- 0.01274, N = 3; min/avg/max 5.34 / 5.36 / 5.38)
    AVX-512 Off:    4.05745 (SE +/- 0.01158, N = 3; min/avg/max 4.04 / 4.06 / 4.08)
    AVX-512 On 512: 5.35301 (SE +/- 0.02156, N = 3; min/avg/max 5.32 / 5.35 / 5.39)

OpenVINO

This is a test of Intel OpenVINO, a toolkit for optimizing and deploying neural networks, using its built-in benchmarking support to analyze the throughput and latency of various models. Learn more via the OpenBenchmarking.org test page.

OpenVINO 2022.2.dev, Model: Vehicle Detection FP16, Device: CPU (FPS, more is better):
    AVX-512 On:     108.61 (SE +/- 0.12, N = 3; min/avg/max 108.45 / 108.61 / 108.84)
    AVX-512 Off:    141.30 (SE +/- 1.44, N = 3; min/avg/max 138.68 / 141.3 / 143.63)
    AVX-512 On 512: 109.09 (SE +/- 0.25, N = 3; min/avg/max 108.73 / 109.09 / 109.56)
    1. (CXX) g++ options: -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -pie (the AVX-512 On builds also link -ldl; the AVX-512 Off build adds -mno-avx512f)

OpenVINO 2022.2.dev, Model: Vehicle Detection FP16, Device: CPU (ms, fewer is better):
    AVX-512 On:     36.80 (SE +/- 0.04, N = 3; min/avg/max 36.72 / 36.8 / 36.85; sample MIN 12.79 / MAX 55.71)
    AVX-512 Off:    28.29 (SE +/- 0.29, N = 3; min/avg/max 27.82 / 28.29 / 28.82; sample MIN 11.7 / MAX 274.34)
    AVX-512 On 512: 36.64 (SE +/- 0.08, N = 3; min/avg/max 36.48 / 36.64 / 36.76; sample MIN 13.62 / MAX 58.49)
    1. (CXX) g++ options: -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -pie (the AVX-512 On builds also link -ldl; the AVX-512 Off build adds -mno-avx512f)

OpenVKL

OpenVKL is the Intel Open Volume Kernel Library, which offers high-performance volume computation kernels and is part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenVKL 1.0, Benchmark: vklBenchmark ISPC (Items / Sec, more is better):
    AVX-512 On:     113 (SE +/- 0.33, N = 3; min/avg/max 112 / 112.67 / 113; MIN 10 / MAX 1116)
    AVX-512 Off:    87 (SE +/- 0.00, N = 3; min/avg/max 87 / 87 / 87; MIN 7 / MAX 901)
    AVX-512 On 512: 112 (SE +/- 0.67, N = 3; min/avg/max 111 / 111.67 / 113; MIN 9 / MAX 1115)

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that the Phoronix Test Suite also offers pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries. Learn more via the OpenBenchmarking.org test page.

TensorFlow 2.10, Device: CPU, Batch Size: 32, Model: AlexNet (images/sec, more is better):
    AVX-512 On:     112.10 (SE +/- 0.17, N = 3; min/avg/max 111.78 / 112.1 / 112.36)
    AVX-512 Off:    87.79 (SE +/- 0.05, N = 3; min/avg/max 87.71 / 87.79 / 87.87)
    AVX-512 On 512: 113.25 (SE +/- 0.26, N = 3; min/avg/max 112.83 / 113.25 / 113.73)

Meta Performance Per Watts

Meta Performance Per Watts (Performance Per Watts; more is better)
  AVX-512 On:     170.75
  AVX-512 Off:    133.30
  AVX-512 On 512: 170.21

AI Benchmark Alpha

AI Benchmark Alpha is a Python library for evaluating artificial intelligence (AI) performance on diverse hardware platforms and relies upon the TensorFlow machine learning library. Learn more via the OpenBenchmarking.org test page.

AI Benchmark Alpha 0.1.2 - Device AI Score (Score; more is better)
  AVX-512 On:     3445
  AVX-512 Off:    2696
  AVX-512 On 512: 3439

OpenVINO

OpenVINO 2022.2.dev - Model: Weld Porosity Detection FP16 - Device: CPU (FPS; more is better)
  AVX-512 On:     320.80 (SE +/- 0.40, N = 3; Min 320.01 / Max 321.26)
  AVX-512 Off:    252.69 (SE +/- 0.05, N = 3; Min 252.59 / Max 252.76)
  AVX-512 On 512: 321.51 (SE +/- 1.00, N = 3; Min 319.55 / Max 322.81)
  Same g++ options as the other OpenVINO results; AVX-512 Off was built with -mno-avx512f.

oneDNN

oneDNN 2.7 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms; fewer is better)
  AVX-512 On:     4.03554 (SE +/- 0.00337, N = 4; Min 4.03 / Max 4.04; MIN 3.9)
  AVX-512 Off:    3.22680 (SE +/- 0.00517, N = 4; Min 3.21 / Max 3.24; MIN 3.09)
  AVX-512 On 512: 4.03294 (SE +/- 0.00488, N = 4; Min 4.02 / Max 4.04; MIN 3.89)
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl; AVX-512 Off was built with -mno-avx512f.
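oneDNN generates most of its compute kernels at run time with a JIT that dispatches on the CPU's ISA. As a hedged sketch, that dispatch can also be capped programmatically, which approximates an "AVX-512 Off" run without rebuilding; dnnl::set_max_cpu_isa and the DNNL_MAX_CPU_ISA environment variable are part of oneDNN's API, though the exact enum spellings should be checked against the installed headers:

    #include <oneapi/dnnl/dnnl.hpp>
    #include <iostream>

    int main() {
        // Must run before any primitive is created.
        // Equivalent to exporting DNNL_MAX_CPU_ISA=AVX2.
        dnnl::set_max_cpu_isa(dnnl::cpu_isa::avx2);

        dnnl::engine eng(dnnl::engine::kind::cpu, 0);
        dnnl::stream strm(eng);
        std::cout << "CPU engine ready; kernels will JIT at most AVX2 code\n";
        return 0;
    }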

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. Learn more via the OpenBenchmarking.org test page.
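A minimal sketch of how an NCNN CPU benchmark drives a model; the file names and blob names below are placeholders that depend on the model (NCNN models ship as .param/.bin pairs):

    #include "net.h"  // ncnn

    int main() {
        ncnn::Net net;
        net.opt.num_threads = 8;  // match the thread count under test

        net.load_param("squeezenet_ssd.param");  // placeholder files
        net.load_model("squeezenet_ssd.bin");

        ncnn::Mat in(300, 300, 3);  // width x height x channels
        in.fill(0.5f);              // dummy input data

        ncnn::Extractor ex = net.create_extractor();
        ex.input("data", in);              // input blob name is model-specific
        ncnn::Mat out;
        ex.extract("detection_out", out);  // output blob name is model-specific
        return 0;
    }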

NCNN 20220729 - Target: CPU - Model: squeezenet_ssd (ms; fewer is better)
  AVX-512 On:     17.19 (SE +/- 0.18, N = 3; Min 16.82 / Max 17.37; MIN 16.54 / MAX 25.09)
  AVX-512 Off:    13.89 (SE +/- 0.04, N = 3; Min 13.82 / Max 13.96; MIN 13.58 / MAX 15.32)
  AVX-512 On 512: 17.34 (SE +/- 0.05, N = 3; Min 17.25 / Max 17.42; MIN 17.03 / MAX 18.98)
  (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread; AVX-512 Off was built with -mno-avx512f.

oneDNN

oneDNN 2.7 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms; fewer is better)
  AVX-512 On:     4.33063 (SE +/- 0.00401, N = 9; Min 4.31 / Max 4.35; MIN 4.21)
  AVX-512 Off:    5.39327 (SE +/- 0.00507, N = 9; Min 5.36 / Max 5.41; MIN 5.31)
  AVX-512 On 512: 4.32564 (SE +/- 0.00516, N = 9; Min 4.31 / Max 4.35; MIN 4.21)

oneDNN 2.7 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms; fewer is better)
  AVX-512 On:     12.11 (SE +/- 0.01, N = 7; Min 12.1 / Max 12.14; MIN 11.95)
  AVX-512 Off:    14.84 (SE +/- 0.01, N = 7; Min 14.77 / Max 14.9; MIN 14.31)
  AVX-512 On 512: 12.02 (SE +/- 0.00, N = 7; Min 12.01 / Max 12.03; MIN 11.85)

Same g++ options as the other oneDNN results; AVX-512 Off was built with -mno-avx512f.

Neural Magic DeepSparse

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream (items/sec; more is better)
  AVX-512 On:     35.38 (SE +/- 0.07, N = 3; Min 35.26 / Max 35.5)
  AVX-512 Off:    28.97 (SE +/- 0.02, N = 3; Min 28.94 / Max 29)
  AVX-512 On 512: 35.42 (SE +/- 0.11, N = 3; Min 35.28 / Max 35.63)

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Synchronous Single-Stream (ms/batch; fewer is better)
  AVX-512 On:     28.26 (SE +/- 0.06, N = 3; Min 28.17 / Max 28.36)
  AVX-512 Off:    34.51 (SE +/- 0.02, N = 3; Min 34.47 / Max 34.55)
  AVX-512 On 512: 28.22 (SE +/- 0.08, N = 3; Min 28.06 / Max 28.34)

OSPRay

Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds on Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI Rendering Toolkit. Learn more via the OpenBenchmarking.org test page.
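The OSPRay scores report how many times per second each scene can be rendered. A minimal sketch of standing up the path tracer through OSPRay's C API (scene and camera setup omitted; treat the parameter names as illustrative of the 2.x API rather than definitive):

    #include <ospray/ospray.h>

    int main(int argc, const char **argv) {
        ospInit(&argc, argv);  // loads the CPU device by default

        OSPRenderer renderer = ospNewRenderer("pathtracer");
        ospSetInt(renderer, "pixelSamples", 16);
        ospCommit(renderer);

        // ... build a camera and world, then render frames in a timed loop ...

        ospRelease(renderer);
        ospShutdown();
        return 0;
    }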

OSPRay 2.10 - Benchmark: particle_volume/ao/real_time (Items Per Second; more is better)
  AVX-512 On:     4.95822 (SE +/- 0.01187, N = 3; Min 4.94 / Max 4.98)
  AVX-512 Off:    6.02446 (SE +/- 0.00177, N = 3; Min 6.02 / Max 6.03)
  AVX-512 On 512: 4.95280 (SE +/- 0.03653, N = 3; Min 4.91 / Max 5.02)

Neural Magic DeepSparse

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (ms/batch; fewer is better)
  AVX-512 On:     14.54 (SE +/- 0.01, N = 3; Min 14.52 / Max 14.56)
  AVX-512 Off:    17.66 (SE +/- 0.01, N = 3; Min 17.65 / Max 17.68)
  AVX-512 On 512: 14.54 (SE +/- 0.01, N = 3; Min 14.53 / Max 14.55)

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Synchronous Single-Stream (items/sec; more is better)
  AVX-512 On:     68.76 (SE +/- 0.05, N = 3; Min 68.66 / Max 68.85)
  AVX-512 Off:    56.59 (SE +/- 0.03, N = 3; Min 56.54 / Max 56.65)
  AVX-512 On 512: 68.77 (SE +/- 0.03, N = 3; Min 68.71 / Max 68.81)

Mobile Neural Network

MNN (Mobile Neural Network) is a highly efficient, lightweight deep learning framework developed by Alibaba. This MNN test profile builds the OpenMP / CPU-threaded version for processor benchmarking rather than any GPU-accelerated configuration. MNN can make use of AVX-512 extensions. Learn more via the OpenBenchmarking.org test page.
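A minimal sketch of MNN's C++ Interpreter/Session workflow as used for CPU inference; the .mnn file name is a placeholder and the details should be checked against MNN's headers:

    #include <MNN/Interpreter.hpp>
    #include <memory>

    int main() {
        // Placeholder model file, for illustration only.
        std::shared_ptr<MNN::Interpreter> net(
            MNN::Interpreter::createFromFile("squeezenet_v1.1.mnn"));

        MNN::ScheduleConfig config;
        config.numThread = 8;  // CPU-threaded run, as in this test profile
        MNN::Session *session = net->createSession(config);

        net->runSession(session);  // one inference; a benchmark loops and times this
        return 0;
    }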

Mobile Neural Network 2.1 - Model: squeezenetv1.1 (ms; fewer is better)
  AVX-512 On:     1.670 (SE +/- 0.013, N = 15; Min 1.61 / Max 1.75; MIN 1.54 / MAX 11.33)
  AVX-512 Off:    2.023 (SE +/- 0.014, N = 3; Min 2 / Max 2.05; MIN 1.97 / MAX 2.72)
  AVX-512 On 512: 1.675 (SE +/- 0.014, N = 15; Min 1.58 / Max 1.76; MIN 1.55 / MAX 7.76)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl; AVX-512 Off was built with -mno-avx512f.

TensorFlow

TensorFlow 2.10 - Device: CPU - Batch Size: 16 - Model: AlexNet (images/sec; more is better)
  AVX-512 On:     80.75 (SE +/- 0.06, N = 3; Min 80.64 / Max 80.81)
  AVX-512 Off:    67.80 (SE +/- 0.03, N = 3; Min 67.76 / Max 67.85)
  AVX-512 On 512: 82.10 (SE +/- 0.12, N = 3; Min 81.87 / Max 82.24)

Mobile Neural Network

Mobile Neural Network 2.1 - Model: inception-v3 (ms; fewer is better)
  AVX-512 On:     19.58 (SE +/- 0.26, N = 15; Min 18.11 / Max 20.98; MIN 17.25 / MAX 32.64)
  AVX-512 Off:    23.42 (SE +/- 0.02, N = 3; Min 23.39 / Max 23.46; MIN 22.95 / MAX 30.84)
  AVX-512 On 512: 19.40 (SE +/- 0.27, N = 15; Min 17.97 / Max 20.71; MIN 17.18 / MAX 49.2)

Mobile Neural Network 2.1 - Model: SqueezeNetV1.0 (ms; fewer is better)
  AVX-512 On:     3.073 (SE +/- 0.020, N = 15; Min 2.99 / Max 3.26; MIN 2.89 / MAX 8.65)
  AVX-512 Off:    3.694 (SE +/- 0.032, N = 3; Min 3.63 / Max 3.74; MIN 3.57 / MAX 4.37)
  AVX-512 On 512: 3.114 (SE +/- 0.022, N = 15; Min 3 / Max 3.3; MIN 2.9 / MAX 9.42)

Same g++ options as the other MNN results; AVX-512 Off was built with -mno-avx512f.

OSPRay

OSPRay 2.10 - Benchmark: particle_volume/scivis/real_time (Items Per Second; more is better)
  AVX-512 On:     4.98037 (SE +/- 0.04966, N = 3; Min 4.88 / Max 5.04)
  AVX-512 Off:    5.91636 (SE +/- 0.02016, N = 3; Min 5.88 / Max 5.94)
  AVX-512 On 512: 4.95148 (SE +/- 0.01748, N = 3; Min 4.92 / Max 4.98)

OpenVINO

OpenVINO 2022.2.dev - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms; fewer is better)
  AVX-512 On:     0.85 (SE +/- 0.01, N = 3; Min 0.84 / Max 0.86; MIN 0.45 / MAX 10.27)
  AVX-512 Off:    1.00 (SE +/- 0.00, N = 3; Min 0.99 / Max 1; MIN 0.48 / MAX 89.55)
  AVX-512 On 512: 0.84 (SE +/- 0.00, N = 3; Min 0.83 / Max 0.84; MIN 0.48 / MAX 9.99)

OpenVINO 2022.2.dev - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS; more is better)
  AVX-512 On:     9379.27 (SE +/- 94.72, N = 3; Min 9189.83 / Max 9474.39)
  AVX-512 Off:    7956.78 (SE +/- 34.03, N = 3; Min 7911 / Max 8023.29)
  AVX-512 On 512: 9460.45 (SE +/- 54.72, N = 3; Min 9379.38 / Max 9564.64)

Same g++ options as the other OpenVINO results; AVX-512 Off was built with -mno-avx512f.

Neural Magic DeepSparse

Neural Magic DeepSparse 1.1 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (items/sec; more is better)
  AVX-512 On:     50.77 (SE +/- 0.15, N = 3; Min 50.55 / Max 51.04)
  AVX-512 Off:    43.23 (SE +/- 0.08, N = 3; Min 43.14 / Max 43.4)
  AVX-512 On 512: 51.36 (SE +/- 0.37, N = 3; Min 50.99 / Max 52.09)

Neural Magic DeepSparse 1.1 - Model: CV Detection, YOLOv5s COCO - Scenario: Asynchronous Multi-Stream (ms/batch; fewer is better)
  AVX-512 On:     78.78 (SE +/- 0.23, N = 3; Min 78.35 / Max 79.11)
  AVX-512 Off:    92.51 (SE +/- 0.19, N = 3; Min 92.13 / Max 92.7)
  AVX-512 On 512: 77.87 (SE +/- 0.56, N = 3; Min 76.76 / Max 78.44)

simdjson

This is a benchmark of simdjson, a high-performance JSON parser. simdjson aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others. Learn more via the OpenBenchmarking.org test page.
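One detail worth knowing when reading these numbers: simdjson compiles several kernel variants and chooses among them by CPU feature detection at run time, and the active choice can be queried. A minimal sketch (the JSON file name is a placeholder):

    #include "simdjson.h"
    #include <iostream>

    int main() {
        // Report which SIMD implementation the runtime dispatcher picked,
        // e.g. "icelake" on AVX-512 hardware or "haswell" for AVX2.
        std::cout << "active implementation: "
                  << simdjson::get_active_implementation()->name() << "\n";

        simdjson::ondemand::parser parser;
        simdjson::padded_string json = simdjson::padded_string::load("kostya.json");
        simdjson::ondemand::document doc = parser.iterate(json);
        // ... walk the document; the benchmark measures parse throughput in GB/s ...
        return 0;
    }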

simdjson 2.0 - Throughput Test: Kostya (GB/s; more is better)
  AVX-512 On:     4.54 (SE +/- 0.01, N = 3; Min 4.53 / Max 4.55)
  AVX-512 Off:    3.86 (SE +/- 0.00, N = 3; Min 3.86 / Max 3.86)
  AVX-512 On 512: 4.53 (SE +/- 0.00, N = 3; Min 4.52 / Max 4.53)
  (CXX) g++ options: -O3 -march=native; AVX-512 Off was built with -mno-avx512f.

oneDNN

oneDNN 2.7 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms; fewer is better)
  AVX-512 On:     10.63470 (SE +/- 0.00598, N = 5; Min 10.61 / Max 10.65; MIN 10.54)
  AVX-512 Off:    11.67710 (SE +/- 0.00748, N = 5; Min 11.65 / Max 11.69; MIN 11.41)
  AVX-512 On 512:  9.95113 (SE +/- 0.01026, N = 5; Min 9.94 / Max 9.99; MIN 9.83)
  Same g++ options as the other oneDNN results; AVX-512 Off was built with -mno-avx512f.

Neural Magic DeepSparse

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (ms/batch; fewer is better)
  AVX-512 On:     49.57 (SE +/- 0.07, N = 3; Min 49.43 / Max 49.67)
  AVX-512 Off:    57.88 (SE +/- 0.03, N = 3; Min 57.83 / Max 57.93)
  AVX-512 On 512: 49.36 (SE +/- 0.03, N = 3; Min 49.3 / Max 49.41)

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, DistilBERT mnli - Scenario: Asynchronous Multi-Stream (items/sec; more is better)
  AVX-512 On:     80.67 (SE +/- 0.12, N = 3; Min 80.5 / Max 80.91)
  AVX-512 Off:    69.10 (SE +/- 0.03, N = 3; Min 69.04 / Max 69.16)
  AVX-512 On 512: 81.02 (SE +/- 0.06, N = 3; Min 80.94 / Max 81.13)

NCNN

NCNN 20220729 - Target: CPU - Model: yolov4-tiny (ms; fewer is better)
  AVX-512 On:     20.99 (SE +/- 0.41, N = 3; Min 20.57 / Max 21.81; MIN 20.36 / MAX 37.15)
  AVX-512 Off:    18.80 (SE +/- 0.06, N = 3; Min 18.72 / Max 18.91; MIN 18.51 / MAX 19.88)
  AVX-512 On 512: 22.01 (SE +/- 0.44, N = 3; Min 21.52 / Max 22.88; MIN 21.29 / MAX 26.71)
  Same g++ options as the other NCNN results; AVX-512 Off was built with -mno-avx512f.

oneDNN

oneDNN 2.7 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms; fewer is better)
  AVX-512 On:     1814.03 (SE +/- 2.62, N = 3; Min 1808.79 / Max 1816.72; MIN 1799.95)
  AVX-512 Off:    2006.02 (SE +/- 0.87, N = 3; Min 2004.54 / Max 2007.56; MIN 1997.88)
  AVX-512 On 512: 1715.75 (SE +/- 18.65, N = 5; Min 1684.17 / Max 1779.12; MIN 1674.74)
  Same g++ options as the other oneDNN results; AVX-512 Off was built with -mno-avx512f.

OpenVINO

OpenVINO 2022.2.dev - Model: Person Detection FP16 - Device: CPU (FPS; more is better)
  AVX-512 On:     1.78 (SE +/- 0.00, N = 3; Min 1.77 / Max 1.78)
  AVX-512 Off:    1.54 (SE +/- 0.01, N = 3; Min 1.53 / Max 1.55)
  AVX-512 On 512: 1.78 (SE +/- 0.00, N = 3; Min 1.77 / Max 1.78)
  Same g++ options as the other OpenVINO results; AVX-512 Off was built with -mno-avx512f.

oneDNN

oneDNN 2.7 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms; fewer is better)
  AVX-512 On:     1817.59 (SE +/- 7.10, N = 3; Min 1808.28 / Max 1831.53; MIN 1801.35)
  AVX-512 Off:    2003.54 (SE +/- 1.98, N = 3; Min 1999.63 / Max 2005.97; MIN 1993.54)
  AVX-512 On 512: 1738.00 (SE +/- 14.49, N = 3; Min 1709.2 / Max 1755.14; MIN 1697.55)
  Same g++ options as the other oneDNN results; AVX-512 Off was built with -mno-avx512f.

Neural Magic DeepSparse

Neural Magic DeepSparse 1.1 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream (ms/batch; fewer is better)
  AVX-512 On:     20.13 (SE +/- 0.06, N = 3; Min 20.04 / Max 20.24)
  AVX-512 Off:    23.04 (SE +/- 0.00, N = 3; Min 23.03 / Max 23.04)
  AVX-512 On 512: 19.99 (SE +/- 0.02, N = 3; Min 19.97 / Max 20.03)

Neural Magic DeepSparse 1.1 - Model: CV Detection, YOLOv5s COCO - Scenario: Synchronous Single-Stream (items/sec; more is better)
  AVX-512 On:     49.67 (SE +/- 0.15, N = 3; Min 49.38 / Max 49.88)
  AVX-512 Off:    43.39 (SE +/- 0.01, N = 3; Min 43.38 / Max 43.4)
  AVX-512 On 512: 49.99 (SE +/- 0.04, N = 3; Min 49.91 / Max 50.05)

Neural Magic DeepSparse 1.1 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (items/sec; more is better)
  AVX-512 On:     8.8759 (SE +/- 0.0785, N = 3; Min 8.73 / Max 9)
  AVX-512 Off:    7.8089 (SE +/- 0.0071, N = 3; Min 7.8 / Max 7.82)
  AVX-512 On 512: 8.9905 (SE +/- 0.0301, N = 3; Min 8.95 / Max 9.05)

Neural Magic DeepSparse 1.1 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Synchronous Single-Stream (ms/batch; fewer is better)
  AVX-512 On:     112.68 (SE +/- 1.00, N = 3; Min 111.12 / Max 114.54)
  AVX-512 Off:    128.05 (SE +/- 0.12, N = 3; Min 127.87 / Max 128.27)
  AVX-512 On 512: 111.22 (SE +/- 0.37, N = 3; Min 110.49 / Max 111.7)

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (ms/batch; fewer is better)
  AVX-512 On:      97.85 (SE +/- 0.31, N = 3; Min 97.29 / Max 98.35)
  AVX-512 Off:    112.60 (SE +/- 0.07, N = 3; Min 112.47 / Max 112.7)
  AVX-512 On 512:  98.03 (SE +/- 0.54, N = 3; Min 97.26 / Max 99.07)

OpenVINO

OpenVINO 2022.2.dev - Model: Person Detection FP16 - Device: CPU (ms; fewer is better)
  AVX-512 On:     2235.54 (SE +/- 0.33, N = 3; Min 2235.17 / Max 2236.21; MIN 1874.94 / MAX 2477.63)
  AVX-512 Off:    2563.45 (SE +/- 7.35, N = 3; Min 2548.75 / Max 2571.28; MIN 1711.34 / MAX 3871.27)
  AVX-512 On 512: 2227.69 (SE +/- 7.01, N = 3; Min 2218.16 / Max 2241.37; MIN 1807.91 / MAX 2513.61)
  Same g++ options as the other OpenVINO results; AVX-512 Off was built with -mno-avx512f.

Neural Magic DeepSparse

Neural Magic DeepSparse 1.1 - Model: NLP Text Classification, BERT base uncased SST2 - Scenario: Asynchronous Multi-Stream (items/sec; more is better)
  AVX-512 On:     40.87 (SE +/- 0.12, N = 3; Min 40.67 / Max 41.08)
  AVX-512 Off:    35.52 (SE +/- 0.02, N = 3; Min 35.49 / Max 35.56)
  AVX-512 On 512: 40.80 (SE +/- 0.22, N = 3; Min 40.37 / Max 41.12)

Neural Magic DeepSparse 1.1 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (items/sec; more is better)
  AVX-512 On:     8.9135 (SE +/- 0.0463, N = 3; Min 8.85 / Max 9)
  AVX-512 Off:    7.8337 (SE +/- 0.0041, N = 3; Min 7.83 / Max 7.84)
  AVX-512 On 512: 9.0118 (SE +/- 0.0265, N = 3; Min 8.98 / Max 9.06)

Neural Magic DeepSparse 1.1 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Synchronous Single-Stream (ms/batch; fewer is better)
  AVX-512 On:     112.19 (SE +/- 0.58, N = 3; Min 111.06 / Max 113)
  AVX-512 Off:    127.65 (SE +/- 0.07, N = 3; Min 127.52 / Max 127.74)
  AVX-512 On 512: 110.96 (SE +/- 0.33, N = 3; Min 110.31 / Max 111.33)

oneDNN

oneDNN 2.7 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms; fewer is better)
  AVX-512 On:     1814.65 (SE +/- 0.41, N = 3; Min 1814.23 / Max 1815.48; MIN 1805.32)
  AVX-512 Off:    2000.12 (SE +/- 3.47, N = 3; Min 1995.29 / Max 2006.85; MIN 1987.76)
  AVX-512 On 512: 1739.49 (SE +/- 20.46, N = 4; Min 1678.71 / Max 1764.29; MIN 1670.86)
  Same g++ options as the other oneDNN results; AVX-512 Off was built with -mno-avx512f.

OpenVINO

OpenVINO 2022.2.dev - Model: Person Detection FP32 - Device: CPU (FPS; more is better)
  AVX-512 On:     1.76 (SE +/- 0.01, N = 3; Min 1.74 / Max 1.77)
  AVX-512 Off:    1.54 (SE +/- 0.01, N = 3; Min 1.53 / Max 1.55)
  AVX-512 On 512: 1.77 (SE +/- 0.00, N = 3; Min 1.76 / Max 1.77)

OpenVINO 2022.2.dev - Model: Person Detection FP32 - Device: CPU (ms; fewer is better)
  AVX-512 On:     2250.02 (SE +/- 9.13, N = 3; Min 2232 / Max 2261.59; MIN 1294.64 / MAX 2438.65)
  AVX-512 Off:    2563.87 (SE +/- 9.17, N = 3; Min 2550.94 / Max 2581.59; MIN 1540.87 / MAX 3813.98)
  AVX-512 On 512: 2244.25 (SE +/- 8.02, N = 3; Min 2233.07 / Max 2259.79; MIN 1760.66 / MAX 2505.49)

OpenVINO 2022.2.dev - Model: Face Detection FP16 - Device: CPU (FPS; more is better)
  AVX-512 On:     2.97 (SE +/- 0.01, N = 3; Min 2.96 / Max 2.98)
  AVX-512 Off:    2.60 (SE +/- 0.01, N = 3; Min 2.58 / Max 2.61)
  AVX-512 On 512: 2.95 (SE +/- 0.00, N = 3; Min 2.95 / Max 2.96)

Same g++ options as the other OpenVINO results; AVX-512 Off was built with -mno-avx512f.

oneDNN

oneDNN 2.7 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms; fewer is better)
  AVX-512 On:     3.26773 (SE +/- 0.00624, N = 4; Min 3.25 / Max 3.27; MIN 3.18)
  AVX-512 Off:    3.67360 (SE +/- 0.03921, N = 4; Min 3.61 / Max 3.78; MIN 3.39)
  AVX-512 On 512: 3.21742 (SE +/- 0.00586, N = 4; Min 3.2 / Max 3.23; MIN 3.14)
  Same g++ options as the other oneDNN results; AVX-512 Off was built with -mno-avx512f.

OpenVINO

OpenVINO 2022.2.dev - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms; fewer is better)
  AVX-512 On:     13.95 (SE +/- 0.12, N = 3; Min 13.72 / Max 14.08; MIN 6.25 / MAX 52.24)
  AVX-512 Off:    12.26 (SE +/- 0.12, N = 6; Min 11.77 / Max 12.56; MIN 6.09 / MAX 23.07)
  AVX-512 On 512: 13.99 (SE +/- 0.12, N = 7; Min 13.26 / Max 14.16; MIN 6.28 / MAX 24.01)

OpenVINO 2022.2.dev - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS; more is better)
  AVX-512 On:     286.43 (SE +/- 2.42, N = 3; Min 283.84 / Max 291.27)
  AVX-512 Off:    325.95 (SE +/- 3.32, N = 6; Min 318.08 / Max 339.21)
  AVX-512 On 512: 285.75 (SE +/- 2.59, N = 7; Min 282.16 / Max 301.23)

OpenVINO 2022.2.dev - Model: Face Detection FP16 - Device: CPU (ms; fewer is better)
  AVX-512 On:     1346.99 (SE +/- 3.79, N = 3; Min 1339.67 / Max 1352.34; MIN 1224.1 / MAX 1502.9)
  AVX-512 Off:    1525.18 (SE +/- 4.24, N = 3; Min 1520.21 / Max 1533.61; MIN 825.15 / MAX 2479.54)
  AVX-512 On 512: 1352.08 (SE +/- 1.26, N = 3; Min 1350.58 / Max 1354.59; MIN 1184.99 / MAX 1541.85)

Same g++ options as the other OpenVINO results; AVX-512 Off was built with -mno-avx512f.

NCNN

NCNN 20220729 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms; fewer is better)
  AVX-512 On:     3.73 (SE +/- 0.01, N = 3; Min 3.72 / Max 3.74; MIN 3.56 / MAX 5.05)
  AVX-512 Off:    3.34 (SE +/- 0.01, N = 3; Min 3.32 / Max 3.36; MIN 3.16 / MAX 5.05)
  AVX-512 On 512: 3.77 (SE +/- 0.05, N = 3; Min 3.68 / Max 3.87; MIN 3.53 / MAX 4.4)
  Same g++ options as the other NCNN results; AVX-512 Off was built with -mno-avx512f.

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.28 - Backend: Eigen (Nodes Per Second; more is better)
  AVX-512 On:     1186 (SE +/- 11.35, N = 3; Min 1174 / Max 1209)
  AVX-512 Off:    1062 (SE +/- 10.84, N = 5; Min 1031 / Max 1087)
  AVX-512 On 512: 1194 (SE +/- 14.97, N = 3; Min 1176 / Max 1224)
  (CXX) g++ options: -flto -O3 -march=native -pthread; AVX-512 Off was built with -mno-avx512f.

OpenVINO

OpenVINO 2022.2.dev - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS; more is better)
  AVX-512 On:     35.62 (SE +/- 0.27, N = 3; Min 35.2 / Max 36.13)
  AVX-512 Off:    31.70 (SE +/- 0.04, N = 3; Min 31.65 / Max 31.78)
  AVX-512 On 512: 35.55 (SE +/- 0.18, N = 3; Min 35.21 / Max 35.83)

OpenVINO 2022.2.dev - Model: Machine Translation EN To DE FP16 - Device: CPU (ms; fewer is better)
  AVX-512 On:     112.27 (SE +/- 0.88, N = 3; Min 110.63 / Max 113.63; MIN 60.98 / MAX 140.63)
  AVX-512 Off:    126.09 (SE +/- 0.14, N = 3; Min 125.81 / Max 126.28; MIN 58.5 / MAX 136.1)
  AVX-512 On 512: 112.48 (SE +/- 0.59, N = 3; Min 111.57 / Max 113.59; MIN 53.91 / MAX 140.82)

Same g++ options as the other OpenVINO results; AVX-512 Off was built with -mno-avx512f.

Neural Magic DeepSparse

Neural Magic DeepSparse 1.1 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (items/sec; more is better)
  AVX-512 On:     8.9516 (SE +/- 0.0648, N = 3; Min 8.83 / Max 9.04)
  AVX-512 Off:    8.1213 (SE +/- 0.0060, N = 3; Min 8.11 / Max 8.13)
  AVX-512 On 512: 9.0886 (SE +/- 0.0140, N = 3; Min 9.07 / Max 9.12)

Neural Magic DeepSparse 1.1 - Model: NLP Token Classification, BERT base uncased conll2003 - Scenario: Asynchronous Multi-Stream (ms/batch; fewer is better)
  AVX-512 On:     446.88 (SE +/- 3.25, N = 3; Min 442.44 / Max 453.21)
  AVX-512 Off:    492.38 (SE +/- 0.40, N = 3; Min 491.91 / Max 493.18)
  AVX-512 On 512: 440.10 (SE +/- 0.68, N = 3; Min 438.75 / Max 440.88)

NCNN

NCNN 20220729 - Target: CPU - Model: efficientnet-b0 (ms; fewer is better)
  AVX-512 On:     4.86 (SE +/- 0.06, N = 3; Min 4.73 / Max 4.93; MIN 4.6 / MAX 6.13)
  AVX-512 Off:    4.35 (SE +/- 0.05, N = 3; Min 4.3 / Max 4.45; MIN 4.2 / MAX 5.22)
  AVX-512 On 512: 4.79 (SE +/- 0.05, N = 3; Min 4.71 / Max 4.89; MIN 4.61 / MAX 9.9)
  Same g++ options as the other NCNN results; AVX-512 Off was built with -mno-avx512f.

Neural Magic DeepSparse

Neural Magic DeepSparse 1.1 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (items/sec; more is better)
  AVX-512 On:     8.9588 (SE +/- 0.0383, N = 3; Min 8.92 / Max 9.04)
  AVX-512 Off:    8.1425 (SE +/- 0.0056, N = 3; Min 8.13 / Max 8.15)
  AVX-512 On 512: 9.0792 (SE +/- 0.0265, N = 3; Min 9.03 / Max 9.11)

Neural Magic DeepSparse 1.1 - Model: NLP Document Classification, oBERT base uncased on IMDB - Scenario: Asynchronous Multi-Stream (ms/batch; fewer is better)
  AVX-512 On:     446.49 (SE +/- 1.90, N = 3; Min 442.69 / Max 448.42)
  AVX-512 Off:    491.12 (SE +/- 0.41, N = 3; Min 490.53 / Max 491.91)
  AVX-512 On 512: 440.56 (SE +/- 1.29, N = 3; Min 439.04 / Max 443.13)

Mobile Neural Network

Mobile Neural Network 2.1 - Model: mobilenet-v1-1.0 (ms; fewer is better)
  AVX-512 On:     1.972 (SE +/- 0.005, N = 15; Min 1.95 / Max 2.02; MIN 1.8 / MAX 8.03)
  AVX-512 Off:    1.798 (SE +/- 0.007, N = 3; Min 1.79 / Max 1.81; MIN 1.74 / MAX 2.69)
  AVX-512 On 512: 1.989 (SE +/- 0.007, N = 15; Min 1.95 / Max 2.04; MIN 1.9 / MAX 8.1)
  Same g++ options as the other MNN results; AVX-512 Off was built with -mno-avx512f.

NCNN

NCNN 20220729 - Target: CPU - Model: mnasnet (ms; fewer is better)
  AVX-512 On:     2.62 (SE +/- 0.03, N = 3; Min 2.57 / Max 2.67; MIN 2.48 / MAX 3.81)
  AVX-512 Off:    2.37 (SE +/- 0.03, N = 3; Min 2.33 / Max 2.43; MIN 2.28 / MAX 3.27)
  AVX-512 On 512: 2.58 (SE +/- 0.03, N = 3; Min 2.55 / Max 2.63; MIN 2.48 / MAX 3.14)

NCNN 20220729 - Target: CPU - Model: alexnet (ms; fewer is better)
  AVX-512 On:     6.98 (SE +/- 0.02, N = 3; Min 6.95 / Max 7.02; MIN 6.82 / MAX 7.88)
  AVX-512 Off:    6.33 (SE +/- 0.01, N = 3; Min 6.3 / Max 6.35; MIN 6.2 / MAX 7.75)
  AVX-512 On 512: 6.69 (SE +/- 0.19, N = 3; Min 6.31 / Max 6.88; MIN 6.2 / MAX 7.35)

Same g++ options as the other NCNN results; AVX-512 Off was built with -mno-avx512f.

OSPRay Studio

Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds on Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI Rendering Toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay Studio 0.11 - Camera: 1 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer (ms; fewer is better)
  AVX-512 On:     65723 (SE +/- 55.47, N = 3; Min 65647 / Max 65831)
  AVX-512 Off:    72145 (SE +/- 50.78, N = 3; Min 72063 / Max 72238)
  AVX-512 On 512: 65447 (SE +/- 140.32, N = 3; Min 65171 / Max 65629)
  (CXX) g++ options: -O3 -march=native -ldl; AVX-512 Off was built with -mno-avx512f.

Mobile Neural Network

Mobile Neural Network 2.1 - Model: MobileNetV2_224 (ms; fewer is better)
  AVX-512 On:     2.131 (SE +/- 0.014, N = 15; Min 2.08 / Max 2.23; MIN 2.03 / MAX 8.46)
  AVX-512 Off:    1.934 (SE +/- 0.019, N = 3; Min 1.9 / Max 1.96; MIN 1.84 / MAX 11.59)
  AVX-512 On 512: 2.125 (SE +/- 0.014, N = 15; Min 2.07 / Max 2.21; MIN 2.03 / MAX 8.62)
  Same g++ options as the other MNN results; AVX-512 Off was built with -mno-avx512f.

OSPRay Studio

OSPRay Studio 0.11 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer (ms; fewer is better)
  AVX-512 On:     251261 (SE +/- 259.35, N = 3; Min 250920 / Max 251770)
  AVX-512 Off:    275693 (SE +/- 100.66, N = 3; Min 275573 / Max 275893)
  AVX-512 On 512: 250698 (SE +/- 784.12, N = 3; Min 249144 / Max 251656)

OSPRay Studio 0.11 - Camera: 1 - Resolution: 1080p - Samples Per Pixel: 16 - Renderer: Path Tracer (ms; fewer is better)
  AVX-512 On:     31054 (SE +/- 32.95, N = 3; Min 30999 / Max 31113)
  AVX-512 Off:    34112 (SE +/- 5.46, N = 3; Min 34101 / Max 34119)
  AVX-512 On 512: 31180 (SE +/- 206.81, N = 3; Min 30883 / Max 31578)

Same g++ options as the other OSPRay Studio results; AVX-512 Off was built with -mno-avx512f.

Numpy Benchmark

This is a test to measure general NumPy performance. Learn more via the OpenBenchmarking.org test page.

Numpy Benchmark (Score; more is better)
  AVX-512 On:     631.24 (SE +/- 1.23, N = 3; Min 628.84 / Max 632.93)
  AVX-512 Off:    575.79 (SE +/- 0.90, N = 3; Min 574.03 / Max 577.02)
  AVX-512 On 512: 631.86 (SE +/- 0.87, N = 3; Min 630.43 / Max 633.42)

ONNX Runtime

ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Model Zoo. Learn more via the OpenBenchmarking.org test page.
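A minimal sketch of creating an ONNX Runtime CPU session with its C++ API; the model file name is a placeholder, and the parallel executor used in these runs corresponds to session options along these lines:

    #include <onnxruntime_cxx_api.h>
    #include <iostream>

    int main() {
        Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "bench");

        Ort::SessionOptions opts;
        opts.SetExecutionMode(ExecutionMode::ORT_PARALLEL);  // "Executor: Parallel"
        opts.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_ALL);

        // Placeholder model path, for illustration only.
        Ort::Session session(env, "super-resolution-10.onnx", opts);
        std::cout << "model inputs: " << session.GetInputCount() << "\n";
        // ... build Ort::Value tensors and call session.Run() in a timed loop ...
        return 0;
    }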

ONNX Runtime 1.11 - Model: super-resolution-10 - Device: CPU - Executor: Parallel (Inferences Per Minute; more is better)
  AVX-512 On:     4950 (SE +/- 5.95, N = 3; Min 4940 / Max 4960.5)
  AVX-512 Off:    4515 (SE +/- 8.93, N = 3; Min 4498 / Max 4528)
  AVX-512 On 512: 4626 (SE +/- 25.62, N = 3; Min 4583.5 / Max 4672)
  (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt; AVX-512 Off was built with -mno-avx512f.

OSPRay Studio

OSPRay Studio 0.11 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer (ms; fewer is better)
  AVX-512 On:     127267 (SE +/- 834.11, N = 3; Min 125898 / Max 128777)
  AVX-512 Off:    139519 (SE +/- 39.26, N = 3; Min 139451 / Max 139587)
  AVX-512 On 512: 127540 (SE +/- 430.68, N = 3; Min 127039 / Max 128397)
  Same g++ options as the other OSPRay Studio results; AVX-512 Off was built with -mno-avx512f.

oneDNN

oneDNN 2.7 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms; fewer is better)
  AVX-512 On:     3093.58 (SE +/- 4.17, N = 3; Min 3088.75 / Max 3101.89; MIN 3075.2)
  AVX-512 Off:    3271.27 (SE +/- 2.45, N = 3; Min 3266.71 / Max 3275.12; MIN 3259.85)
  AVX-512 On 512: 2986.72 (SE +/- 38.98, N = 3; Min 2943.91 / Max 3064.55; MIN 2932.85)
  Same g++ options as the other oneDNN results; AVX-512 Off was built with -mno-avx512f.

OSPRay Studio

OSPRay Studio 0.11 - Camera: 1 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer (ms; fewer is better)
  AVX-512 On:     1958 (SE +/- 18.66, N = 3; Min 1934 / Max 1995)
  AVX-512 Off:    2134 (SE +/- 1.45, N = 3; Min 2132 / Max 2137)
  AVX-512 On 512: 1951 (SE +/- 3.51, N = 3; Min 1944 / Max 1955)
  Same g++ options as the other OSPRay Studio results; AVX-512 Off was built with -mno-avx512f.

oneDNN

oneDNN 2.7 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms; fewer is better)
  AVX-512 On:     14.36 (SE +/- 0.01, N = 7; Min 14.33 / Max 14.38; MIN 14.18)
  AVX-512 Off:    15.61 (SE +/- 0.01, N = 7; Min 15.56 / Max 15.65; MIN 15.43)
  AVX-512 On 512: 14.28 (SE +/- 0.01, N = 7; Min 14.25 / Max 14.3; MIN 14.09)
  Same g++ options as the other oneDNN results; AVX-512 Off was built with -mno-avx512f.

OSPRay Studio

OSPRay Studio 0.11 - Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer (ms; fewer is better)
  AVX-512 On:     7743 (SE +/- 26.87, N = 3; Min 7714 / Max 7797)
  AVX-512 Off:    8450 (SE +/- 2.40, N = 3; Min 8447 / Max 8455)
  AVX-512 On 512: 7734 (SE +/- 11.53, N = 3; Min 7711 / Max 7747)

OSPRay Studio 0.11 - Camera: 3 - Resolution: 1080p - Samples Per Pixel: 32 - Renderer: Path Tracer (ms; fewer is better)
  AVX-512 On:     78746 (SE +/- 263.45, N = 3; Min 78340 / Max 79240)
  AVX-512 Off:    85854 (SE +/- 31.80, N = 3; Min 85797 / Max 85907)
  AVX-512 On 512: 78809 (SE +/- 171.11, N = 3; Min 78534 / Max 79123)

OSPRay Studio 0.11 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer (ms; fewer is better)
  AVX-512 On:     302582 (SE +/- 646.03, N = 3; Min 301291 / Max 303273)
  AVX-512 Off:    329527 (SE +/- 178.71, N = 3; Min 329215 / Max 329834)
  AVX-512 On 512: 303966 (SE +/- 378.84, N = 3; Min 303583 / Max 304724)

OSPRay Studio 0.11 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer (ms; fewer is better)
  AVX-512 On:     152903 (SE +/- 313.74, N = 3; Min 152522 / Max 153525)
  AVX-512 Off:    166508 (SE +/- 189.77, N = 3; Min 166301 / Max 166887)
  AVX-512 On 512: 153101 (SE +/- 494.14, N = 3; Min 152120 / Max 153697)

OSPRay Studio 0.11 - Camera: 3 - Resolution: 1080p - Samples Per Pixel: 16 - Renderer: Path Tracer (ms; fewer is better)
  AVX-512 On:     37662 (SE +/- 89.45, N = 3; Min 37485 / Max 37773)
  AVX-512 Off:    40948 (SE +/- 14.77, N = 3; Min 40930 / Max 40977)
  AVX-512 On 512: 37620 (SE +/- 9.61, N = 3; Min 37601 / Max 37633)

Same g++ options as the other OSPRay Studio results; AVX-512 Off was built with -mno-avx512f.

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. Learn more via the OpenBenchmarking.org test page.
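
For orientation, NCNN's C++ API is small. The following is a minimal, illustrative sketch only; the model files mobilenet_v3.param / mobilenet_v3.bin and the blob names "data" / "prob" are hypothetical and depend on the exported network:

    #include "net.h"  // ncnn

    int main() {
        ncnn::Net net;
        net.opt.num_threads = 16;              // thread count is an assumption
        net.load_param("mobilenet_v3.param");  // hypothetical model files
        net.load_model("mobilenet_v3.bin");

        ncnn::Mat in(224, 224, 3);             // dummy 224x224 3-channel input
        in.fill(0.5f);

        ncnn::Extractor ex = net.create_extractor();
        ex.input("data", in);                  // input blob name depends on the model
        ncnn::Mat out;
        ex.extract("prob", out);               // output blob name depends on the model
        return 0;
    }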

NCNN 20220729 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, fewer is better; MIN/MAX as reported by the test itself)
  AVX-512 On:     2.68 (SE +/- 0.03, N = 3; Min: 2.64 / Avg: 2.68 / Max: 2.73; MIN: 2.56 / MAX: 3.63)
  AVX-512 Off:    2.49 (SE +/- 0.01, N = 3; Min: 2.47 / Avg: 2.49 / Max: 2.5; MIN: 2.4 / MAX: 3.36)
  AVX-512 On 512: 2.71 (SE +/- 0.05, N = 3; Min: 2.62 / Avg: 2.71 / Max: 2.79; MIN: 2.55 / MAX: 3.5)
  (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread. AVX-512 Off adds -mno-avx512f.

OSPRay Studio

OSPRay Studio 0.11 - Camera: 3 - Resolution: 1080p - Samples Per Pixel: 1 - Renderer: Path Tracer (ms, fewer is better)
  AVX-512 On:     2358 (SE +/- 3.76, N = 3; Min: 2351 / Avg: 2357.67 / Max: 2364)
  AVX-512 Off:    2561 (SE +/- 2.85, N = 3; Min: 2558 / Avg: 2561.33 / Max: 2567)
  AVX-512 On 512: 2364 (SE +/- 2.33, N = 3; Min: 2360 / Avg: 2363.67 / Max: 2368)
  (CXX) g++ options: -O3 -march=native -ldl. AVX-512 Off adds -mno-avx512f.

ONNX Runtime

ONNX Runtime is developed by Microsoft and partners as an open-source, cross-platform, high-performance machine learning inferencing and training accelerator. This test profile runs the ONNX Runtime with various models available from the ONNX Zoo. Learn more via the OpenBenchmarking.org test page.
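
The Parallel and Standard executor labels in these results map onto ONNX Runtime's execution modes. A minimal C++ sketch, with model.onnx as a hypothetical model path and the thread count as an assumption:

    #include <onnxruntime_cxx_api.h>

    int main() {
        Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "avx512-bench");
        Ort::SessionOptions opts;
        opts.SetIntraOpNumThreads(16);
        opts.SetExecutionMode(ORT_PARALLEL);  // "Parallel"; ORT_SEQUENTIAL is "Standard"
        opts.SetGraphOptimizationLevel(ORT_ENABLE_ALL);
        Ort::Session session(env, "model.onnx", opts);  // hypothetical model file
        // ... build input Ort::Value tensors and call session.Run() per inference ...
        return 0;
    }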

ONNX Runtime 1.11 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Parallel (Inferences Per Minute, more is better)
  AVX-512 On:     1388 (SE +/- 1.36, N = 3; Min: 1386.5 / Avg: 1388.33 / Max: 1391)
  AVX-512 Off:    1278 (SE +/- 1.76, N = 3; Min: 1274.5 / Avg: 1277.83 / Max: 1280.5)
  AVX-512 On 512: 1368 (SE +/- 4.33, N = 3; Min: 1360.5 / Avg: 1368 / Max: 1375.5)
  (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt. AVX-512 Off adds -mno-avx512f.

dav1d

Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.
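
For context, dav1d exposes a compact C API; a decode sketch (error handling and OBU demuxing elided, so this is illustrative only) looks roughly like:

    #include <dav1d/dav1d.h>

    void decode_sketch(const uint8_t *obu, size_t obu_size) {
        Dav1dSettings settings;
        dav1d_default_settings(&settings);
        Dav1dContext *ctx = nullptr;
        if (dav1d_open(&ctx, &settings) < 0)
            return;

        Dav1dData data;
        // wrap a caller-owned OBU buffer; the callback runs when dav1d is done with it
        dav1d_data_wrap(&data, obu, obu_size, [](const uint8_t *, void *) {}, nullptr);
        dav1d_send_data(ctx, &data);

        Dav1dPicture pic = {};
        if (dav1d_get_picture(ctx, &pic) == 0)
            dav1d_picture_unref(&pic);  // consume one decoded frame
        dav1d_close(&ctx);
    }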

dav1d 1.0 - Video Input: Summer Nature 4K (FPS, more is better)
  AVX-512 On:     207.14 (SE +/- 0.18, N = 3; Min: 206.86 / Avg: 207.14 / Max: 207.48)
  AVX-512 Off:    192.71 (SE +/- 0.10, N = 3; Min: 192.52 / Avg: 192.71 / Max: 192.81)
  AVX-512 On 512: 190.82 (SE +/- 0.11, N = 3; Min: 190.68 / Avg: 190.82 / Max: 191.05)
  (CC) gcc options: -O3 -march=native -pthread -lm. AVX-512 Off adds -mno-avx512f.

OSPRay Studio

OSPRay Studio 0.11 - Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer (ms, fewer is better)
  AVX-512 On:     9373 (SE +/- 6.67, N = 3; Min: 9366 / Avg: 9372.67 / Max: 9386)
  AVX-512 Off:    10148 (SE +/- 5.70, N = 3; Min: 10137 / Avg: 10148.33 / Max: 10155)
  AVX-512 On 512: 9376 (SE +/- 5.51, N = 3; Min: 9367 / Avg: 9376 / Max: 9386)
  (CXX) g++ options: -O3 -march=native -ldl. AVX-512 Off adds -mno-avx512f.

simdjson

This is a benchmark of SIMDJSON, a high performance JSON parser. SIMDJSON aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others. Learn more via the OpenBenchmarking.org test page.
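
As a point of reference, simdjson's On Demand API (used by simdjson 2.0) is only a few lines; twitter.json here is a hypothetical local input file:

    #include <iostream>
    #include "simdjson.h"
    using namespace simdjson;

    int main() {
        ondemand::parser parser;
        padded_string json = padded_string::load("twitter.json");  // hypothetical input
        ondemand::document doc = parser.iterate(json);
        uint64_t count = doc["search_metadata"]["count"];  // lazily parsed on access
        std::cout << count << " results." << std::endl;
    }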

simdjson 2.0 - Throughput Test: LargeRandom (GB/s, more is better)
  AVX-512 On:     1.47 (SE +/- 0.00, N = 3; Min: 1.47 / Avg: 1.47 / Max: 1.48)
  AVX-512 Off:    1.36 (SE +/- 0.00, N = 3; Min: 1.36 / Avg: 1.36 / Max: 1.36)
  AVX-512 On 512: 1.47 (SE +/- 0.00, N = 3; Min: 1.47 / Avg: 1.47 / Max: 1.47)
  (CXX) g++ options: -O3 -march=native. AVX-512 Off adds -mno-avx512f.

OpenRadioss

OpenRadioss is an open-source, AGPL-licensed finite element solver for dynamic event analysis. OpenRadioss is based on Altair Radioss, which was open-sourced in 2022. This open-source finite element solver is benchmarked with various example models available from https://www.openradioss.org/models/. This test currently uses a reference OpenRadioss binary build offered via GitHub. Learn more via the OpenBenchmarking.org test page.

OpenRadioss 2022.10.13 - Model: Cell Phone Drop Test (Seconds, fewer is better)
  AVX-512 On:     120.62 (SE +/- 1.35, N = 3; Min: 118.87 / Avg: 120.62 / Max: 123.28)
  AVX-512 Off:    128.16 (SE +/- 0.52, N = 3; Min: 127.24 / Avg: 128.16 / Max: 129.03)
  AVX-512 On 512: 118.89 (SE +/- 0.31, N = 3; Min: 118.32 / Avg: 118.89 / Max: 119.39)

ONNX Runtime

ONNX Runtime 1.11 - Model: super-resolution-10 - Device: CPU - Executor: Standard (Inferences Per Minute, more is better)
  AVX-512 On:     7356 (SE +/- 80.51, N = 4; Min: 7121.5 / Avg: 7356.25 / Max: 7471.5)
  AVX-512 Off:    7279 (SE +/- 44.00, N = 3; Min: 7191 / Avg: 7279 / Max: 7323.5)
  AVX-512 On 512: 6824 (SE +/- 64.19, N = 12; Min: 6420 / Avg: 6824.08 / Max: 7076.5)
  (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt. AVX-512 Off adds -mno-avx512f.

NCNN

NCNN 20220729 - Target: CPU - Model: mobilenet (ms, fewer is better)
  AVX-512 On:     12.28 (SE +/- 0.15, N = 3; Min: 11.98 / Avg: 12.28 / Max: 12.46; MIN: 11.79 / MAX: 13.67)
  AVX-512 Off:    11.43 (SE +/- 0.02, N = 3; Min: 11.39 / Avg: 11.43 / Max: 11.47; MIN: 11.21 / MAX: 16.87)
  AVX-512 On 512: 12.28 (SE +/- 0.13, N = 3; Min: 12.03 / Avg: 12.28 / Max: 12.44; MIN: 11.87 / MAX: 18.86)
  (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread. AVX-512 Off adds -mno-avx512f.

oneDNN

oneDNN 2.7 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better)
  AVX-512 On:     1.32134 (SE +/- 0.00334, N = 4; Min: 1.31 / Avg: 1.32 / Max: 1.33; MIN: 1.26)
  AVX-512 Off:    1.40330 (SE +/- 0.00373, N = 4; Min: 1.4 / Avg: 1.4 / Max: 1.41; MIN: 1.31)
  AVX-512 On 512: 1.30906 (SE +/- 0.00378, N = 4; Min: 1.3 / Avg: 1.31 / Max: 1.31; MIN: 1.24)
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl. AVX-512 Off adds -mno-avx512f.

dav1d

dav1d 1.0 - Video Input: Chimera 1080p 10-bit (FPS, more is better)
  AVX-512 On:     510.88 (SE +/- 0.34, N = 3; Min: 510.2 / Avg: 510.88 / Max: 511.28)
  AVX-512 Off:    487.09 (SE +/- 0.38, N = 3; Min: 486.35 / Avg: 487.09 / Max: 487.63)
  AVX-512 On 512: 477.09 (SE +/- 0.21, N = 3; Min: 476.68 / Avg: 477.09 / Max: 477.4)
  (CC) gcc options: -O3 -march=native -pthread -lm. AVX-512 Off adds -mno-avx512f.

oneDNN

oneDNN 2.7 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, fewer is better)
  AVX-512 On:     3079.96 (SE +/- 2.36, N = 3; Min: 3075.75 / Avg: 3079.96 / Max: 3083.91; MIN: 3061.76)
  AVX-512 Off:    3271.06 (SE +/- 1.15, N = 3; Min: 3268.84 / Avg: 3271.06 / Max: 3272.71; MIN: 3261.78)
  AVX-512 On 512: 3062.36 (SE +/- 2.98, N = 3; Min: 3056.47 / Avg: 3062.36 / Max: 3066.02; MIN: 3040.47)
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl. AVX-512 Off adds -mno-avx512f.

OpenRadioss

OpenRadioss 2022.10.13 - Model: INIVOL and Fluid Structure Interaction Drop Container (Seconds, fewer is better)
  AVX-512 On:     759.23 (SE +/- 7.20, N = 9; Min: 742.64 / Avg: 759.23 / Max: 791.11)
  AVX-512 Off:    713.35 (SE +/- 4.20, N = 3; Min: 706.43 / Avg: 713.35 / Max: 720.93)
  AVX-512 On 512: 755.60 (SE +/- 10.28, N = 3; Min: 743.66 / Avg: 755.6 / Max: 776.06)

NCNN

NCNN 20220729 - Target: CPU - Model: googlenet (ms, fewer is better)
  AVX-512 On:     10.16 (SE +/- 0.00, N = 3; Min: 10.15 / Avg: 10.16 / Max: 10.16; MIN: 9.92 / MAX: 11.36)
  AVX-512 Off:    9.55 (SE +/- 0.03, N = 3; Min: 9.51 / Avg: 9.55 / Max: 9.61; MIN: 9.33 / MAX: 10.86)
  AVX-512 On 512: 9.96 (SE +/- 0.13, N = 3; Min: 9.7 / Avg: 9.96 / Max: 10.1; MIN: 9.52 / MAX: 10.98)
  (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread. AVX-512 Off adds -mno-avx512f.

dav1d

dav1d 1.0 - Video Input: Chimera 1080p (FPS, more is better)
  AVX-512 On:     734.61 (SE +/- 0.58, N = 4; Min: 732.92 / Avg: 734.61 / Max: 735.61)
  AVX-512 Off:    691.43 (SE +/- 0.49, N = 4; Min: 690.5 / Avg: 691.43 / Max: 692.76)
  AVX-512 On 512: 727.76 (SE +/- 0.36, N = 4; Min: 726.71 / Avg: 727.76 / Max: 728.25)
  (CC) gcc options: -O3 -march=native -pthread -lm. AVX-512 Off adds -mno-avx512f.

NCNN

NCNN 20220729 - Target: CPU - Model: regnety_400m (ms, fewer is better)
  AVX-512 On:     7.05 (SE +/- 0.02, N = 3; Min: 7.02 / Avg: 7.05 / Max: 7.1; MIN: 6.87 / MAX: 8.31)
  AVX-512 Off:    6.70 (SE +/- 0.02, N = 3; Min: 6.67 / Avg: 6.7 / Max: 6.74; MIN: 6.54 / MAX: 7.93)
  AVX-512 On 512: 7.10 (SE +/- 0.03, N = 3; Min: 7.05 / Avg: 7.1 / Max: 7.14; MIN: 6.91 / MAX: 11.12)
  (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread. AVX-512 Off adds -mno-avx512f.

Embree

Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.
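
Intel Embree's C API is worth a glance to understand what these kernels do. A minimal, illustrative single-triangle / single-ray sketch against Embree 3 (all geometry values arbitrary):

    #include <embree3/rtcore.h>
    #include <limits>

    int main() {
        RTCDevice device = rtcNewDevice(nullptr);
        RTCScene scene = rtcNewScene(device);

        // one triangle: 3 vertices and 1 index triple
        RTCGeometry geom = rtcNewGeometry(device, RTC_GEOMETRY_TYPE_TRIANGLE);
        float *v = (float *)rtcSetNewGeometryBuffer(
            geom, RTC_BUFFER_TYPE_VERTEX, 0, RTC_FORMAT_FLOAT3, 3 * sizeof(float), 3);
        unsigned *idx = (unsigned *)rtcSetNewGeometryBuffer(
            geom, RTC_BUFFER_TYPE_INDEX, 0, RTC_FORMAT_UINT3, 3 * sizeof(unsigned), 1);
        v[0] = -1; v[1] = -1; v[2] = -1;
        v[3] =  1; v[4] = -1; v[5] = -1;
        v[6] =  0; v[7] =  1; v[8] = -1;
        idx[0] = 0; idx[1] = 1; idx[2] = 2;
        rtcCommitGeometry(geom);
        rtcAttachGeometry(scene, geom);
        rtcReleaseGeometry(geom);
        rtcCommitScene(scene);

        // trace one primary ray from the origin straight down -Z
        RTCIntersectContext ctx;
        rtcInitIntersectContext(&ctx);
        RTCRayHit rh = {};
        rh.ray.dir_z = -1.0f;
        rh.ray.tfar = std::numeric_limits<float>::infinity();
        rh.hit.geomID = RTC_INVALID_GEOMETRY_ID;
        rtcIntersect1(scene, &ctx, &rh);

        rtcReleaseScene(scene);
        rtcReleaseDevice(device);
        return rh.hit.geomID != RTC_INVALID_GEOMETRY_ID ? 0 : 1;  // 0 on hit
    }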

Embree 3.13 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, more is better)
  AVX-512 On:     14.63 (SE +/- 0.03, N = 3; Min: 14.59 / Avg: 14.63 / Max: 14.69; MIN: 14.41 / MAX: 15.16)
  AVX-512 Off:    13.86 (SE +/- 0.11, N = 3; Min: 13.64 / Avg: 13.86 / Max: 14; MIN: 13.49 / MAX: 14.34)
  AVX-512 On 512: 14.62 (SE +/- 0.03, N = 3; Min: 14.56 / Avg: 14.62 / Max: 14.66; MIN: 14.37 / MAX: 15.18)

NCNN

NCNN 20220729 - Target: CPU - Model: resnet18 (ms, fewer is better)
  AVX-512 On:     8.69 (SE +/- 0.02, N = 3; Min: 8.67 / Avg: 8.69 / Max: 8.72; MIN: 8.51 / MAX: 14.15)
  AVX-512 Off:    8.24 (SE +/- 0.00, N = 3; Min: 8.24 / Avg: 8.24 / Max: 8.24; MIN: 8.06 / MAX: 9.19)
  AVX-512 On 512: 8.64 (SE +/- 0.02, N = 3; Min: 8.6 / Avg: 8.64 / Max: 8.67; MIN: 8.45 / MAX: 9.39)

NCNN 20220729 - Target: CPU - Model: resnet50 (ms, fewer is better)
  AVX-512 On:     15.99 (SE +/- 0.42, N = 3; Min: 15.55 / Avg: 15.99 / Max: 16.84; MIN: 15.25 / MAX: 18.48)
  AVX-512 Off:    15.40 (SE +/- 0.02, N = 3; Min: 15.37 / Avg: 15.4 / Max: 15.42; MIN: 15.07 / MAX: 24.51)
  AVX-512 On 512: 16.24 (SE +/- 0.42, N = 3; Min: 15.41 / Avg: 16.24 / Max: 16.72; MIN: 15.19 / MAX: 17.46)

Both runs: (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread. AVX-512 Off adds -mno-avx512f.

Mobile Neural Network

MNN is the Mobile Neural Network, a highly efficient, lightweight deep learning framework developed by Alibaba. This MNN test profile builds the OpenMP / CPU-threaded version for processor benchmarking rather than any GPU-accelerated test. MNN does allow making use of AVX-512 extensions. Learn more via the OpenBenchmarking.org test page.
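
A rough sketch of MNN's C++ session API follows; the model file name is hypothetical and the details are illustrative rather than authoritative:

    #include <memory>
    #include <MNN/Interpreter.hpp>

    int main() {
        // hypothetical model produced by the MNN converter
        std::shared_ptr<MNN::Interpreter> net(
            MNN::Interpreter::createFromFile("mobilenet_v3.mnn"));
        MNN::ScheduleConfig config;
        config.type = MNN_FORWARD_CPU;  // the CPU-threaded build benchmarked here
        config.numThread = 16;          // thread count is an assumption
        MNN::Session *session = net->createSession(config);

        MNN::Tensor *input = net->getSessionInput(session, nullptr);
        // ... fill input->host<float>() with preprocessed image data ...
        net->runSession(session);
        MNN::Tensor *output = net->getSessionOutput(session, nullptr);
        (void)output;  // read back scores from output->host<float>()
        return 0;
    }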

Mobile Neural Network 2.1 - Model: nasnet (ms, fewer is better)
  AVX-512 On:     7.468 (SE +/- 0.077, N = 15; Min: 6.97 / Avg: 7.47 / Max: 8.22; MIN: 6.73 / MAX: 13.9)
  AVX-512 Off:    7.139 (SE +/- 0.056, N = 3; Min: 7.03 / Avg: 7.14 / Max: 7.21; MIN: 6.92 / MAX: 7.99)
  AVX-512 On 512: 7.527 (SE +/- 0.087, N = 15; Min: 7.12 / Avg: 7.53 / Max: 8.27; MIN: 6.72 / MAX: 17.92)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl. AVX-512 Off adds -mno-avx512f.

OSPRay

Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay 2.10 - Benchmark: particle_volume/pathtracer/real_time (Items Per Second, more is better)
  AVX-512 On:     173.13 (SE +/- 0.21, N = 3; Min: 172.84 / Avg: 173.13 / Max: 173.55)
  AVX-512 Off:    164.96 (SE +/- 0.37, N = 3; Min: 164.42 / Avg: 164.96 / Max: 165.68)
  AVX-512 On 512: 173.74 (SE +/- 0.56, N = 3; Min: 172.93 / Avg: 173.74 / Max: 174.82)

NCNN

NCNN 20220729 - Target: CPU - Model: blazeface (ms, fewer is better)
  AVX-512 On:     0.80 (SE +/- 0.00, N = 3; Min: 0.79 / Avg: 0.8 / Max: 0.8; MIN: 0.77 / MAX: 1.52)
  AVX-512 Off:    0.76 (SE +/- 0.00, N = 3; Min: 0.75 / Avg: 0.76 / Max: 0.76; MIN: 0.73 / MAX: 1.52)
  AVX-512 On 512: 0.80 (SE +/- 0.01, N = 3; Min: 0.79 / Avg: 0.8 / Max: 0.81; MIN: 0.77 / MAX: 1.28)

NCNN 20220729 - Target: CPU - Model: FastestDet (ms, fewer is better)
  AVX-512 On:     2.98 (SE +/- 0.08, N = 3; Min: 2.86 / Avg: 2.98 / Max: 3.14; MIN: 2.81 / MAX: 3.62)
  AVX-512 Off:    2.85 (SE +/- 0.02, N = 3; Min: 2.83 / Avg: 2.85 / Max: 2.88; MIN: 2.78 / MAX: 3.41)
  AVX-512 On 512: 2.98 (SE +/- 0.08, N = 3; Min: 2.9 / Avg: 2.98 / Max: 3.13; MIN: 2.84 / MAX: 3.6)

Both runs: (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread. AVX-512 Off adds -mno-avx512f.

oneDNN

oneDNN 2.7 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  AVX-512 On:     3.81793 (SE +/- 0.04459, N = 4; Min: 3.69 / Avg: 3.82 / Max: 3.89; MIN: 3.28)
  AVX-512 On 512: 3.66945 (SE +/- 0.03397, N = 4; Min: 3.59 / Avg: 3.67 / Max: 3.74; MIN: 3.28)
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU

i9-11900K: AVX-512 Off: The test run did not produce a result.

OpenRadioss

OpenRadioss 2022.10.13 - Model: Rubber O-Ring Seal Installation (Seconds, fewer is better)
  AVX-512 On:     250.33 (SE +/- 2.27, N = 7; Min: 237.4 / Avg: 250.33 / Max: 254.77)
  AVX-512 Off:    243.47 (SE +/- 2.54, N = 5; Min: 240.24 / Avg: 243.47 / Max: 253.62)
  AVX-512 On 512: 240.62 (SE +/- 2.53, N = 4; Min: 235.73 / Avg: 240.62 / Max: 247.01)

Embree

Embree 3.13 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, more is better)
  AVX-512 On:     16.51 (SE +/- 0.07, N = 3; Min: 16.42 / Avg: 16.51 / Max: 16.66; MIN: 16.3 / MAX: 17.02)
  AVX-512 Off:    15.90 (SE +/- 0.10, N = 3; Min: 15.78 / Avg: 15.9 / Max: 16.1; MIN: 15.66 / MAX: 16.42)
  AVX-512 On 512: 16.45 (SE +/- 0.06, N = 3; Min: 16.33 / Avg: 16.45 / Max: 16.52; MIN: 16.21 / MAX: 16.89)

AI Benchmark Alpha

AI Benchmark Alpha is a Python library for evaluating artificial intelligence (AI) performance on diverse hardware platforms and relies upon the TensorFlow machine learning library. Learn more via the OpenBenchmarking.org test page.

AI Benchmark Alpha 0.1.2 - Device Inference Score (Score, more is better)
  AVX-512 On:     1376
  AVX-512 Off:    1414
  AVX-512 On 512: 1363

oneDNN

oneDNN 2.7 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  AVX-512 On:     4.81904 (SE +/- 0.03916, N = 9; Min: 4.59 / Avg: 4.82 / Max: 4.96; MIN: 4.15)
  AVX-512 On 512: 4.64814 (SE +/- 0.02954, N = 15; Min: 4.39 / Avg: 4.65 / Max: 4.8; MIN: 4.01)
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU

i9-11900K: AVX-512 Off: The test run did not produce a result.

ONNX Runtime

ONNX Runtime 1.11 - Model: bertsquad-12 - Device: CPU - Executor: Standard (Inferences Per Minute, more is better)
  AVX-512 On:     1008 (SE +/- 2.67, N = 3; Min: 1002.5 / Avg: 1007.83 / Max: 1010.5)
  AVX-512 Off:    1042 (SE +/- 8.55, N = 9; Min: 978.5 / Avg: 1041.89 / Max: 1057)
  AVX-512 On 512: 1012 (SE +/- 5.77, N = 3; Min: 1005.5 / Avg: 1012 / Max: 1023.5)
  (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt. AVX-512 Off adds -mno-avx512f.

TNN

TNN is an open-source deep learning inference framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

TNN 0.3 - Target: CPU - Model: SqueezeNet v2 (ms, fewer is better)
  AVX-512 On:     47.19 (SE +/- 0.07, N = 9; Min: 46.97 / Avg: 47.19 / Max: 47.59; MIN: 46.52 / MAX: 48.8)
  AVX-512 Off:    46.36 (SE +/- 0.17, N = 9; Min: 45.84 / Avg: 46.36 / Max: 47.48; MIN: 45.49 / MAX: 48.72)
  AVX-512 On 512: 45.68 (SE +/- 0.11, N = 9; Min: 45.31 / Avg: 45.68 / Max: 46.07; MIN: 45.02 / MAX: 47.45)
  (CXX) g++ options: -O3 -march=native -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl. AVX-512 Off adds -mno-avx512f.

ONNX Runtime

ONNX Runtime 1.11 - Model: fcn-resnet101-11 - Device: CPU - Executor: Standard (Inferences Per Minute, more is better)
  AVX-512 On:     102 (SE +/- 0.88, N = 8; Min: 96 / Avg: 102.13 / Max: 103.5)
  AVX-512 Off:    100 (SE +/- 0.17, N = 3; Min: 100 / Avg: 100.33 / Max: 100.5)
  AVX-512 On 512: 103 (SE +/- 0.17, N = 3; Min: 103 / Avg: 103.17 / Max: 103.5)
  (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt. AVX-512 Off adds -mno-avx512f.

Embree

Embree 3.13 - Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, more is better)
  AVX-512 On:     14.43 (SE +/- 0.01, N = 3; Min: 14.41 / Avg: 14.43 / Max: 14.45; MIN: 14.31 / MAX: 14.74)
  AVX-512 Off:    14.02 (SE +/- 0.02, N = 3; Min: 13.99 / Avg: 14.02 / Max: 14.07; MIN: 13.89 / MAX: 14.33)
  AVX-512 On 512: 14.35 (SE +/- 0.03, N = 3; Min: 14.3 / Avg: 14.35 / Max: 14.38; MIN: 14.2 / MAX: 14.69)

oneDNN

oneDNN 2.7 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  AVX-512 On:     20.81 (SE +/- 0.01, N = 3; Min: 20.8 / Avg: 20.81 / Max: 20.82; MIN: 20.45)
  AVX-512 On 512: 20.26 (SE +/- 0.00, N = 3; Min: 20.26 / Avg: 20.26 / Max: 20.26; MIN: 19.82)
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU

i9-11900K: AVX-512 Off: The test run did not produce a result.

dav1d

dav1d 1.0 - Video Input: Summer Nature 1080p (FPS, more is better)
  AVX-512 On:     894.73 (SE +/- 0.63, N = 8; Min: 892.46 / Avg: 894.73 / Max: 897.24)
  AVX-512 Off:    879.37 (SE +/- 0.63, N = 8; Min: 876.53 / Avg: 879.37 / Max: 882.38)
  AVX-512 On 512: 902.92 (SE +/- 0.57, N = 8; Min: 901.34 / Avg: 902.92 / Max: 905.37)
  (CC) gcc options: -O3 -march=native -pthread -lm. AVX-512 Off adds -mno-avx512f.

Mobile Neural Network

Mobile Neural Network 2.1 - Model: mobilenetV3 (ms, fewer is better)
  AVX-512 On:     0.938 (SE +/- 0.005, N = 15; Min: 0.91 / Avg: 0.94 / Max: 0.97; MIN: 0.89 / MAX: 3.62)
  AVX-512 Off:    0.942 (SE +/- 0.003, N = 3; Min: 0.94 / Avg: 0.94 / Max: 0.95; MIN: 0.92 / MAX: 1.64)
  AVX-512 On 512: 0.963 (SE +/- 0.012, N = 15; Min: 0.91 / Avg: 0.96 / Max: 1.05; MIN: 0.89 / MAX: 10.54)
  (CXX) g++ options: -O3 -march=native -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl. AVX-512 Off adds -mno-avx512f.

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine driven by neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.28 - Backend: BLAS (Nodes Per Second, more is better)
  AVX-512 On:     1268 (SE +/- 17.53, N = 3; Min: 1243 / Avg: 1268.33 / Max: 1302)
  AVX-512 Off:    1240 (SE +/- 8.35, N = 3; Min: 1223 / Avg: 1239.67 / Max: 1249)
  AVX-512 On 512: 1273 (SE +/- 11.26, N = 9; Min: 1244 / Avg: 1273.33 / Max: 1354)
  (CXX) g++ options: -flto -O3 -march=native -pthread. AVX-512 Off adds -mno-avx512f.

OpenFOAM

OpenFOAM is the leading free, open-source software for computational fluid dynamics (CFD). This test profile currently uses the drivaerFastback test case for analyzing automotive aerodynamics or alternatively the older motorBike input. Learn more via the OpenBenchmarking.org test page.

OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Mesh Time (Seconds, fewer is better)
  AVX-512 On:     45.49
  AVX-512 Off:    46.53
  AVX-512 On 512: 45.33
  (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm

NCNN

NCNN 20220729 - Target: CPU - Model: vgg16 (ms, fewer is better)
  AVX-512 On:     44.74 (SE +/- 0.13, N = 3; Min: 44.58 / Avg: 44.74 / Max: 44.99; MIN: 44.1 / MAX: 49.06)
  AVX-512 Off:    45.68 (SE +/- 0.13, N = 3; Min: 45.42 / Avg: 45.68 / Max: 45.85; MIN: 44.98 / MAX: 50.56)
  AVX-512 On 512: 44.59 (SE +/- 0.23, N = 3; Min: 44.14 / Avg: 44.59 / Max: 44.87; MIN: 43.87 / MAX: 50.43)
  (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread. AVX-512 Off adds -mno-avx512f.

ONNX Runtime

ONNX Runtime 1.11 - Model: yolov4 - Device: CPU - Executor: Standard (Inferences Per Minute, more is better)
  AVX-512 On:     529 (SE +/- 0.60, N = 3; Min: 528 / Avg: 529.17 / Max: 530)
  AVX-512 Off:    517 (SE +/- 0.76, N = 3; Min: 515.5 / Avg: 516.5 / Max: 518)
  AVX-512 On 512: 527 (SE +/- 1.09, N = 3; Min: 524.5 / Avg: 526.67 / Max: 528)
  (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt. AVX-512 Off adds -mno-avx512f.

TNN

TNN 0.3 - Target: CPU - Model: MobileNet v2 (ms, fewer is better)
  AVX-512 On:     245.95 (SE +/- 0.25, N = 3; Min: 245.63 / Avg: 245.95 / Max: 246.45; MIN: 242.61 / MAX: 266.39)
  AVX-512 Off:    241.31 (SE +/- 0.28, N = 3; Min: 240.82 / Avg: 241.31 / Max: 241.8; MIN: 238.09 / MAX: 252.58)
  AVX-512 On 512: 240.41 (SE +/- 0.06, N = 3; Min: 240.3 / Avg: 240.41 / Max: 240.51; MIN: 237.96 / MAX: 252.77)
  (CXX) g++ options: -O3 -march=native -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl. AVX-512 Off adds -mno-avx512f.

OpenRadioss

OpenRadioss 2022.10.13 - Model: Bird Strike on Windshield (Seconds, fewer is better)
  AVX-512 On:     309.21 (SE +/- 0.35, N = 3; Min: 308.7 / Avg: 309.21 / Max: 309.89)
  AVX-512 Off:    311.82 (SE +/- 3.86, N = 4; Min: 307.76 / Avg: 311.82 / Max: 323.38)
  AVX-512 On 512: 315.86 (SE +/- 3.22, N = 5; Min: 308.59 / Avg: 315.86 / Max: 323.79)

oneDNN

oneDNN 2.7 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, fewer is better)
  AVX-512 On:     8.55193 (SE +/- 0.02566, N = 3; Min: 8.52 / Avg: 8.55 / Max: 8.6; MIN: 4.79)
  AVX-512 Off:    8.39772 (SE +/- 0.04615, N = 3; Min: 8.31 / Avg: 8.4 / Max: 8.47; MIN: 4.7)
  AVX-512 On 512: 8.50891 (SE +/- 0.02594, N = 3; Min: 8.46 / Avg: 8.51 / Max: 8.54; MIN: 4.8)
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl. AVX-512 Off adds -mno-avx512f.

ONNX Runtime

ONNX Runtime 1.11 - Model: bertsquad-12 - Device: CPU - Executor: Parallel (Inferences Per Minute, more is better)
  AVX-512 On:     519 (SE +/- 0.44, N = 3; Min: 518 / Avg: 518.67 / Max: 519.5)
  AVX-512 Off:    515 (SE +/- 2.46, N = 3; Min: 510.5 / Avg: 514.67 / Max: 519)
  AVX-512 On 512: 524 (SE +/- 0.60, N = 3; Min: 523.5 / Avg: 524.33 / Max: 525.5)
  (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt. AVX-512 Off adds -mno-avx512f.

OpenRadioss

OpenRadioss 2022.10.13 - Model: Bumper Beam (Seconds, fewer is better)
  AVX-512 On:     177.40 (SE +/- 2.45, N = 3; Min: 172.5 / Avg: 177.4 / Max: 180.01)
  AVX-512 Off:    174.60 (SE +/- 1.56, N = 7; Min: 171.08 / Avg: 174.6 / Max: 182.53)
  AVX-512 On 512: 175.69 (SE +/- 1.21, N = 3; Min: 174.01 / Avg: 175.69 / Max: 178.03)

Xmrig

Xmrig is an open-source, cross-platform CPU/GPU miner for RandomX, KawPow, CryptoNight, and AstroBWT. This test profile is set up to measure the Xmrig CPU mining performance. Learn more via the OpenBenchmarking.org test page.

Xmrig 6.12.1 - Variant: Wownero - Hash Count: 1M (H/s, more is better)
  AVX-512 On:     4015.0 (SE +/- 44.80, N = 3; Min: 3948.6 / Avg: 4015 / Max: 4100.3)
  AVX-512 Off:    4076.4 (SE +/- 5.83, N = 3; Min: 4065.8 / Avg: 4076.37 / Max: 4085.9)
  AVX-512 On 512: 4075.5 (SE +/- 21.46, N = 3; Min: 4034 / Avg: 4075.47 / Max: 4105.8)
  (CXX) g++ options: -O3 -march=native -fexceptions -fno-rtti -maes -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc. AVX-512 Off adds -mno-avx512f.

GROMACS

This test profile benchmarks the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package with the water_GMX50 data set and allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2022.1 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, more is better)
  AVX-512 On:     1.031 (SE +/- 0.004, N = 3; Min: 1.02 / Avg: 1.03 / Max: 1.04)
  AVX-512 Off:    1.020 (SE +/- 0.002, N = 3; Min: 1.02 / Avg: 1.02 / Max: 1.02)
  AVX-512 On 512: 1.035 (SE +/- 0.001, N = 3; Min: 1.03 / Avg: 1.03 / Max: 1.04)
  (CXX) g++ options: -O3 -march=native. AVX-512 Off adds -mno-avx512f.

ONNX Runtime

ONNX Runtime 1.11 - Model: fcn-resnet101-11 - Device: CPU - Executor: Parallel (Inferences Per Minute, more is better)
  AVX-512 On:     73 (SE +/- 0.00, N = 3; Min: 73 / Avg: 73 / Max: 73)
  AVX-512 Off:    72 (SE +/- 0.33, N = 3; Min: 71 / Avg: 71.67 / Max: 72)
  AVX-512 On 512: 73 (SE +/- 0.44, N = 3; Min: 72.5 / Avg: 73.33 / Max: 74)

ONNX Runtime 1.11 - Model: yolov4 - Device: CPU - Executor: Parallel (Inferences Per Minute, more is better)
  AVX-512 On:     319 (SE +/- 0.29, N = 3; Min: 318 / Avg: 318.5 / Max: 319)
  AVX-512 Off:    315 (SE +/- 0.33, N = 3; Min: 314.5 / Avg: 315.17 / Max: 315.5)
  AVX-512 On 512: 319 (SE +/- 0.33, N = 3; Min: 318 / Avg: 318.67 / Max: 319)

Both runs: (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt. AVX-512 Off adds -mno-avx512f.

TNN

TNN 0.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, fewer is better)
  AVX-512 On:     230.82 (SE +/- 0.08, N = 4; Min: 230.69 / Avg: 230.82 / Max: 231.02; MIN: 229.65 / MAX: 231.83)
  AVX-512 Off:    228.68 (SE +/- 0.07, N = 4; Min: 228.58 / Avg: 228.68 / Max: 228.9; MIN: 227.32 / MAX: 231.38)
  AVX-512 On 512: 228.42 (SE +/- 0.09, N = 4; Min: 228.26 / Avg: 228.42 / Max: 228.68; MIN: 227.09 / MAX: 230.26)
  (CXX) g++ options: -O3 -march=native -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl. AVX-512 Off adds -mno-avx512f.

Cpuminer-Opt

Cpuminer-Opt is a fork of cpuminer-multi that carries a wide range of CPU performance optimizations for measuring the potential cryptocurrency mining performance of the processor across a wide variety of cryptocurrencies. The benchmark reports the hash speed for CPU mining of the selected cryptocurrency. Learn more via the OpenBenchmarking.org test page.

Cpuminer-Opt 3.18 - Algorithm: Magi (kH/s, more is better)
  AVX-512 On:     452.82 (SE +/- 0.67, N = 3; Min: 451.64 / Avg: 452.82 / Max: 453.96)
  AVX-512 Off:    453.39 (SE +/- 0.84, N = 3; Min: 452.2 / Avg: 453.39 / Max: 455.01)
  AVX-512 On 512: 449.42 (SE +/- 3.36, N = 3; Min: 442.71 / Avg: 449.42 / Max: 453.13)
  (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp. AVX-512 Off adds -mno-avx512f.

Xmrig

Xmrig 6.12.1 - Variant: Monero - Hash Count: 1M (H/s, more is better)
  AVX-512 On:     2239.5 (SE +/- 3.33, N = 3; Min: 2233.1 / Avg: 2239.5 / Max: 2244.3)
  AVX-512 Off:    2256.1 (SE +/- 6.72, N = 3; Min: 2245 / Avg: 2256.07 / Max: 2268.2)
  AVX-512 On 512: 2252.6 (SE +/- 6.87, N = 3; Min: 2238.9 / Avg: 2252.63 / Max: 2259.8)
  (CXX) g++ options: -O3 -march=native -fexceptions -fno-rtti -maes -Ofast -static-libgcc -static-libstdc++ -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc. AVX-512 Off adds -mno-avx512f.

OpenFOAM

OpenFOAM 10 - Input: drivaerFastback, Small Mesh Size - Execution Time (Seconds, fewer is better)
  AVX-512 On:     538.98
  AVX-512 Off:    538.29
  AVX-512 On 512: 535.11
  (CXX) g++ options: -std=c++14 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm

TNN

TNN 0.3 - Target: CPU - Model: DenseNet (ms, fewer is better)
  AVX-512 On:     2704.93 (SE +/- 2.72, N = 3; Min: 2701.18 / Avg: 2704.93 / Max: 2710.21; MIN: 2642.6 / MAX: 2789.25)
  AVX-512 Off:    2703.56 (SE +/- 0.35, N = 3; Min: 2703.02 / Avg: 2703.56 / Max: 2704.22; MIN: 2643.22 / MAX: 2776.22)
  AVX-512 On 512: 2695.65 (SE +/- 1.39, N = 3; Min: 2693.02 / Avg: 2695.65 / Max: 2697.76; MIN: 2633.85 / MAX: 2768.47)
  (CXX) g++ options: -O3 -march=native -fopenmp -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl. AVX-512 Off adds -mno-avx512f.

oneDNN

oneDNN 2.7 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  AVX-512 On:     17.31 (SE +/- 0.00, N = 9; Min: 17.3 / Avg: 17.31 / Max: 17.33; MIN: 16.91)
  AVX-512 On 512: 17.27 (SE +/- 0.00, N = 9; Min: 17.25 / Avg: 17.27 / Max: 17.29; MIN: 16.85)
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU

i9-11900K: AVX-512 Off: The test run did not produce a result.

ONNX Runtime

ONNX Runtime 1.11 - Model: GPT-2 - Device: CPU - Executor: Parallel (Inferences Per Minute, more is better)
  AVX-512 On:     5548 (SE +/- 7.10, N = 3; Min: 5538.5 / Avg: 5548.17 / Max: 5562)
  AVX-512 Off:    5557 (SE +/- 1.76, N = 3; Min: 5555 / Avg: 5557 / Max: 5560.5)
  AVX-512 On 512: 5545 (SE +/- 3.18, N = 3; Min: 5541 / Avg: 5544.67 / Max: 5551)

ONNX Runtime 1.11 - Model: GPT-2 - Device: CPU - Executor: Standard (Inferences Per Minute, more is better)
  AVX-512 On:     6954 (SE +/- 4.94, N = 3; Min: 6944 / Avg: 6953.83 / Max: 6959.5)
  AVX-512 Off:    6954 (SE +/- 4.16, N = 3; Min: 6945.5 / Avg: 6953.5 / Max: 6959.5)
  AVX-512 On 512: 6962 (SE +/- 9.16, N = 3; Min: 6947.5 / Avg: 6962.17 / Max: 6979)

Both runs: (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt. AVX-512 Off adds -mno-avx512f.

oneDNN

oneDNN 2.7 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  AVX-512 On:     16.16 (SE +/- 0.00, N = 7; Min: 16.15 / Avg: 16.16 / Max: 16.17; MIN: 16.07)
  AVX-512 On 512: 16.16 (SE +/- 0.00, N = 7; Min: 16.16 / Avg: 16.16 / Max: 16.16; MIN: 16.06)
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU

i9-11900K: AVX-512 Off: The test run did not produce a result.

oneDNN 2.7 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better)
  AVX-512 On:     8.55274 (SE +/- 0.00165, N = 4; Min: 8.55 / Avg: 8.55 / Max: 8.56; MIN: 8.4)
  AVX-512 On 512: 8.55374 (SE +/- 0.00438, N = 4; Min: 8.54 / Avg: 8.55 / Max: 8.56; MIN: 8.33)
  (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU

i9-11900K: AVX-512 Off: The test run did not produce a result.

NCNN

NCNN 20220729 - Target: CPU - Model: shufflenet-v2 (ms, fewer is better)
  AVX-512 On:     2.33 (SE +/- 0.01, N = 3; Min: 2.32 / Avg: 2.33 / Max: 2.35; MIN: 2.26 / MAX: 3.21)
  AVX-512 Off:    2.33 (SE +/- 0.01, N = 3; Min: 2.32 / Avg: 2.33 / Max: 2.34; MIN: 2.27 / MAX: 3.18)
  AVX-512 On 512: 2.33 (SE +/- 0.01, N = 3; Min: 2.32 / Avg: 2.33 / Max: 2.34; MIN: 2.27 / MAX: 2.81)
  (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread. AVX-512 Off adds -mno-avx512f.

Intel Open Image Denoise

Open Image Denoise is a denoising library for ray-tracing and part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.
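
The RT and RTLightmap run names below correspond to OIDN's filter types. A minimal sketch of the OIDN 1.x C API, assuming caller-allocated float3 images of the given size:

    #include <OpenImageDenoise/oidn.h>

    void denoise_sketch(float *color, float *albedo, float *normal,
                        float *output, size_t width, size_t height) {
        OIDNDevice device = oidnNewDevice(OIDN_DEVICE_TYPE_DEFAULT);
        oidnCommitDevice(device);

        OIDNFilter filter = oidnNewFilter(device, "RT");  // or "RTLightmap"
        oidnSetSharedFilterImage(filter, "color",  color,  OIDN_FORMAT_FLOAT3,
                                 width, height, 0, 0, 0);
        oidnSetSharedFilterImage(filter, "albedo", albedo, OIDN_FORMAT_FLOAT3,
                                 width, height, 0, 0, 0);  // auxiliary feature image
        oidnSetSharedFilterImage(filter, "normal", normal, OIDN_FORMAT_FLOAT3,
                                 width, height, 0, 0, 0);  // auxiliary feature image
        oidnSetSharedFilterImage(filter, "output", output, OIDN_FORMAT_FLOAT3,
                                 width, height, 0, 0, 0);
        oidnSetFilter1b(filter, "hdr", true);  // the .hdr runs use HDR input
        oidnCommitFilter(filter);
        oidnExecuteFilter(filter);

        const char *msg;
        if (oidnGetDeviceError(device, &msg) != OIDN_ERROR_NONE) {
            // handle/log msg
        }
        oidnReleaseFilter(filter);
        oidnReleaseDevice(device);
    }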

Intel Open Image Denoise 1.4.0 - Run: RTLightmap.hdr.4096x4096 (Images / Sec, more is better)
  AVX-512 On:     0.21 (SE +/- 0.00, N = 3; Min: 0.21 / Avg: 0.21 / Max: 0.21)
  AVX-512 Off:    0.21 (SE +/- 0.00, N = 3; Min: 0.21 / Avg: 0.21 / Max: 0.21)
  AVX-512 On 512: 0.21 (SE +/- 0.00, N = 3; Min: 0.2 / Avg: 0.21 / Max: 0.21)

Intel Open Image Denoise 1.4.0 - Run: RT.ldr_alb_nrm.3840x2160 (Images / Sec, more is better)
  AVX-512 On:     0.43 (SE +/- 0.00, N = 3; Min: 0.43 / Avg: 0.43 / Max: 0.43)
  AVX-512 Off:    0.43 (SE +/- 0.00, N = 3; Min: 0.43 / Avg: 0.43 / Max: 0.43)
  AVX-512 On 512: 0.43 (SE +/- 0.00, N = 3; Min: 0.43 / Avg: 0.43 / Max: 0.43)

Intel Open Image Denoise 1.4.0 - Run: RT.hdr_alb_nrm.3840x2160 (Images / Sec, more is better)
  AVX-512 On:     0.43 (SE +/- 0.00, N = 3; Min: 0.43 / Avg: 0.43 / Max: 0.43)
  AVX-512 Off:    0.43 (SE +/- 0.00, N = 3; Min: 0.43 / Avg: 0.43 / Max: 0.43)
  AVX-512 On 512: 0.43 (SE +/- 0.00, N = 3; Min: 0.43 / Avg: 0.43 / Max: 0.43)

CPU Temperature Monitor

Phoronix Test Suite System Monitoring (Celsius)
  AVX-512 On:     Min: 29 / Avg: 78.17 / Max: 100
  AVX-512 Off:    Min: 27 / Avg: 72.98 / Max: 95
  AVX-512 On 512: Min: 31 / Avg: 79.47 / Max: 100

CPU Power Consumption Monitor

Phoronix Test Suite System Monitoring (Watts)
  AVX-512 On:     Min: 6.37 / Avg: 188.45 / Max: 283.94
  AVX-512 Off:    Min: 6.36 / Avg: 173.58 / Max: 251.93
  AVX-512 On 512: Min: 6.3 / Avg: 192.29 / Max: 280.23

CPU Peak Freq (Highest CPU Core Frequency) Monitor

Phoronix Test Suite System Monitoring (Megahertz)
  AVX-512 On:     Min: 3350 / Avg: 4733.96 / Max: 5541
  AVX-512 Off:    Min: 3500 / Avg: 4781.84 / Max: 5621
  AVX-512 On 512: Min: 2700 / Avg: 4722.22 / Max: 5323

ONNX Runtime

ONNX Runtime 1.11 - CPU Temperature Monitor (Celsius, fewer is better)
  AVX-512 On:     Min: 40 / Avg: 85.67 / Max: 96
  AVX-512 Off:    Min: 38 / Avg: 80.76 / Max: 86
  AVX-512 On 512: Min: 40 / Avg: 86.37 / Max: 96

ONNX Runtime 1.11 - CPU Power Consumption Monitor (Watts, fewer is better)
  AVX-512 On:     Min: 12.25 / Avg: 215.19 / Max: 242.31
  AVX-512 Off:    Min: 12.04 / Avg: 197.45 / Max: 208.91
  AVX-512 On 512: Min: 12.17 / Avg: 215.02 / Max: 242.14

ONNX Runtime 1.11 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, more is better)
  AVX-512 On:     Min: 3500 / Avg: 4585.07 / Max: 5300
  AVX-512 Off:    Min: 4700 / Avg: 4719.81 / Max: 5300
  AVX-512 On 512: Min: 3587 / Avg: 4590.75 / Max: 5143

ONNX Runtime 1.11 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inferences Per Minute Per Watt, more is better)
  AVX-512 On:     10.451
  AVX-512 Off:    9.881
  AVX-512 On 512: 10.464

ONNX Runtime 1.11 - Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard (Inferences Per Minute, more is better)
  AVX-512 On:     2249 (SE +/- 2.02, N = 3; Min: 2244.5 / Avg: 2248.5 / Max: 2251)
  AVX-512 Off:    1951 (SE +/- 35.14, N = 12; Min: 1565 / Avg: 1951.38 / Max: 1988.5)
  AVX-512 On 512: 2250 (SE +/- 1.92, N = 3; Min: 2247 / Avg: 2249.83 / Max: 2253.5)
  (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt. AVX-512 Off adds -mno-avx512f.

OpenVINO

OpenVINO 2022.2.dev - CPU Temperature Monitor (Celsius, fewer is better)
  AVX-512 On:     Min: 40 / Avg: 87.04 / Max: 95
  AVX-512 Off:    Min: 61 / Avg: 85.95 / Max: 90
  AVX-512 On 512: Min: 41 / Avg: 86.73 / Max: 95

OpenVINO 2022.2.dev - CPU Power Consumption Monitor (Watts, fewer is better)
  AVX-512 On:     Min: 12.32 / Avg: 232.01 / Max: 274.61
  AVX-512 Off:    Min: 55.78 / Avg: 221.82 / Max: 236.18
  AVX-512 On 512: Min: 12.41 / Avg: 231.67 / Max: 274.84

OpenVINO 2022.2.dev - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, more is better)
  AVX-512 On:     Min: 3400 / Avg: 4624.71 / Max: 5302
  AVX-512 Off:    Min: 4700 / Avg: 4738.5 / Max: 5300
  AVX-512 On 512: Min: 3408 / Avg: 4610.19 / Max: 5310

OpenVINO 2022.2.dev - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, Fewer Is Better)
  AVX-512 On:     0.34 (SE +/- 0.01, N = 15)  Min: 0.31 / Avg: 0.34 / Max: 0.4   [-ldl - MIN: 0.18 / MAX: 82.47]
  AVX-512 Off:    0.53 (SE +/- 0.00, N = 3)   Min: 0.53 / Avg: 0.53 / Max: 0.53  [-mno-avx512f - MIN: 0.28 / MAX: 83.77]
  AVX-512 On 512: 0.35 (SE +/- 0.01, N = 15)  Min: 0.31 / Avg: 0.35 / Max: 0.39  [-ldl - MIN: 0.18 / MAX: 11.28]
  1. (CXX) g++ options: -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -pie

OpenVINO 2022.2.dev - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, More Is Better)
  AVX-512 On:     23074.82 (SE +/- 411.52, N = 15)  Min: 19833.99 / Avg: 23074.82 / Max: 25477.04  [-ldl]
  AVX-512 Off:    14973.88 (SE +/- 47.04, N = 3)    Min: 14923.91 / Avg: 14973.88 / Max: 15067.91  [-mno-avx512f]
  AVX-512 On 512: 22962.10 (SE +/- 387.31, N = 15)  Min: 20238.11 / Avg: 22962.1 / Max: 25365.44   [-ldl]
  1. (CXX) g++ options: -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -pie
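This model is the age-gender-recognition-retail-0013 network from Intel's Open Model Zoo, quantized to FP16-INT8. A minimal, hypothetical sketch of loading and running such an IR with the OpenVINO 2022 Python API follows; the file name and the 1x3x62x62 input shape are assumptions based on that model's published description.

# Hypothetical sketch with the OpenVINO 2022.x Python API (openvino.runtime).
# The IR file name is an assumption; age-gender-recognition-retail-0013
# takes a 1x3x62x62 face crop as input.
import numpy as np
from openvino.runtime import Core

core = Core()
model = core.read_model("age-gender-recognition-retail-0013.xml")
compiled = core.compile_model(model, "CPU")
x = np.random.rand(1, 3, 62, 62).astype(np.float32)  # dummy input
results = compiled([x])  # one synchronous inference request
for out in compiled.outputs:
    print(out.any_name, results[out].shape)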

TensorFlow

This is a benchmark of the TensorFlow deep learning framework using the TensorFlow reference benchmarks (tensorflow/benchmarks with tf_cnn_benchmarks.py). Note that the Phoronix Test Suite also offers pts/tensorflow-lite for benchmarking the TensorFlow Lite binaries. Learn more via the OpenBenchmarking.org test page.
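As a rough illustration, below is a hypothetical sketch of launching the reference script for a CPU run. The flag names come from the tensorflow/benchmarks repository; the batch size and model mirror the configuration that failed below, and check=True means the non-zero exit status reported there would surface as an exception.

# Hypothetical sketch: launching the tf_cnn_benchmarks.py reference script
# for a CPU run (flags are from the tensorflow/benchmarks repository).
import subprocess

subprocess.run([
    "python", "tf_cnn_benchmarks.py",
    "--device=cpu",
    "--data_format=NHWC",   # CPU runs generally require NHWC layout
    "--batch_size=512",
    "--model=resnet50",
], check=True)  # raises CalledProcessError on a non-zero exit status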

Device: CPU - Batch Size: 512 - Model: ResNet-50

i9-11900K: AVX-512 On: The test quit with a non-zero exit status.

i9-11900K: AVX-512 Off: The test quit with a non-zero exit status.

i9-11900K: AVX-512 On 512: The test quit with a non-zero exit status.

Device: CPU - Batch Size: 512 - Model: VGG-16

i9-11900K: AVX-512 On: The test quit with a non-zero exit status.

i9-11900K: AVX-512 Off: The test quit with a non-zero exit status. E: Fatal Python error: Aborted

i9-11900K: AVX-512 On 512: The test quit with a non-zero exit status. E: Fatal Python error: Segmentation fault

Cpuminer-Opt

Cpuminer-Opt 3.18 - CPU Temperature Monitor (Celsius, Fewer Is Better)
  AVX-512 On:     Min: 40 / Avg: 84.5 / Max: 93
  AVX-512 Off:    Min: 38 / Avg: 70.52 / Max: 77
  AVX-512 On 512: Min: 42 / Avg: 83.16 / Max: 91

Cpuminer-Opt 3.18 - CPU Power Consumption Monitor (Watts, Fewer Is Better)
  AVX-512 On:     Min: 12.44 / Avg: 219.64 / Max: 263.83
  AVX-512 Off:    Min: 12.31 / Avg: 169.09 / Max: 198.24
  AVX-512 On 512: Min: 12.34 / Avg: 217.57 / Max: 263.4

Cpuminer-Opt 3.18 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better)
  AVX-512 On:     Min: 4600 / Avg: 4719.95 / Max: 5301
  AVX-512 Off:    Min: 4700 / Avg: 4759.57 / Max: 5303
  AVX-512 On 512: Min: 3500 / Avg: 4700.75 / Max: 5100

Cpuminer-Opt 3.18 - Algorithm: Deepcoin (kH/s Per Watt, More Is Better)
  AVX-512 On:     50.76
  AVX-512 Off:    58.94
  AVX-512 On 512: 51.80

Cpuminer-Opt 3.18 - Algorithm: Deepcoin (kH/s, More Is Better)
  AVX-512 On:     11148.96 (SE +/- 226.60, N = 12)  Min: 8717.49 / Avg: 11148.96 / Max: 11780
  AVX-512 Off:    9966.10 (SE +/- 3.11, N = 3)      Min: 9960.4 / Avg: 9966.1 / Max: 9971.09  [-mno-avx512f]
  AVX-512 On 512: 11270.00 (SE +/- 25.17, N = 3)    Min: 11240 / Avg: 11270 / Max: 11320
  1. (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp
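The SE values attached to these results are standard errors of the mean across the N recorded runs, i.e. the sample standard deviation divided by sqrt(N). With N = 3 the middle run is pinned by the average, so for the AVX-512 On 512 result the three runs must have been 11240, 11250 and 11320 kH/s; the sketch below reproduces the reported SE of +/- 25.17 from them.

import math
import statistics

# For the AVX-512 On 512 run: Min 11240, Avg 11270, Max 11320 with N = 3
# implies the middle sample was 3 * 11270 - 11240 - 11320 = 11250.
samples = [11240.0, 11250.0, 11320.0]  # kH/s
se = statistics.stdev(samples) / math.sqrt(len(samples))
print(f"Avg: {statistics.mean(samples):.2f}  SE +/- {se:.2f}  N = {len(samples)}")
# Prints: Avg: 11270.00  SE +/- 25.17  N = 3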

Cpuminer-Opt 3.18 - CPU Temperature Monitor (Celsius, Fewer Is Better)
  AVX-512 On:     Min: 40 / Avg: 85.36 / Max: 94
  AVX-512 Off:    Min: 38 / Avg: 71.45 / Max: 78
  AVX-512 On 512: Min: 41 / Avg: 85.18 / Max: 96

Cpuminer-Opt 3.18 - CPU Power Consumption Monitor (Watts, Fewer Is Better)
  AVX-512 On:     Min: 12.54 / Avg: 213.04 / Max: 259.5
  AVX-512 Off:    Min: 12.48 / Avg: 166.37 / Max: 191.48
  AVX-512 On 512: Min: 12.63 / Avg: 211.76 / Max: 259.97

Cpuminer-Opt 3.18 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better)
  AVX-512 On:     Min: 3500 / Avg: 4653.7 / Max: 5308
  AVX-512 Off:    Min: 4700 / Avg: 4750.11 / Max: 5297
  AVX-512 On 512: Min: 3500 / Avg: 4689.84 / Max: 5300

Cpuminer-Opt 3.18 - Algorithm: Ringcoin (kH/s Per Watt, More Is Better)
  AVX-512 On:     10.11
  AVX-512 Off:    12.81
  AVX-512 On 512: 10.35

Cpuminer-Opt 3.18 - Algorithm: Ringcoin (kH/s, More Is Better)
  AVX-512 On:     2152.94 (SE +/- 40.17, N = 13)  Min: 1679.61 / Avg: 2152.94 / Max: 2229.9
  AVX-512 Off:    2130.49 (SE +/- 5.83, N = 3)    Min: 2120.07 / Avg: 2130.49 / Max: 2140.25  [-mno-avx512f]
  AVX-512 On 512: 2190.91 (SE +/- 16.96, N = 15)  Min: 2142.59 / Avg: 2190.91 / Max: 2364.18
  1. (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp

Cpuminer-Opt 3.18 - CPU Temperature Monitor (Celsius, Fewer Is Better)
  AVX-512 On:     Min: 38 / Avg: 82.81 / Max: 93
  AVX-512 Off:    Min: 35 / Avg: 67.7 / Max: 74
  AVX-512 On 512: Min: 39 / Avg: 80.97 / Max: 94

Cpuminer-Opt 3.18 - CPU Power Consumption Monitor (Watts, Fewer Is Better)
  AVX-512 On:     Min: 12.61 / Avg: 210.01 / Max: 243.37
  AVX-512 Off:    Min: 12.29 / Avg: 156.93 / Max: 187.81
  AVX-512 On 512: Min: 12.21 / Avg: 205.74 / Max: 250.13

Cpuminer-Opt 3.18 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better)
  AVX-512 On:     Min: 4650 / Avg: 4751.77 / Max: 5252
  AVX-512 Off:    Min: 4700 / Avg: 4771.9 / Max: 5294
  AVX-512 On 512: Min: 3500 / Avg: 4605.7 / Max: 5307

Cpuminer-Opt 3.18 - Algorithm: x25x (kH/s Per Watt, More Is Better)
  AVX-512 On:     1.846
  AVX-512 Off:    2.288
  AVX-512 On 512: 1.847

Cpuminer-Opt 3.18 - Algorithm: x25x (kH/s, More Is Better)
  AVX-512 On:     387.72 (SE +/- 1.89, N = 3)   Min: 385.59 / Avg: 387.72 / Max: 391.49
  AVX-512 Off:    359.01 (SE +/- 0.48, N = 3)   Min: 358.13 / Avg: 359.01 / Max: 359.79  [-mno-avx512f]
  AVX-512 On 512: 379.94 (SE +/- 7.53, N = 12)  Min: 298.93 / Avg: 379.94 / Max: 392.93
  1. (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp
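Taken together, the three cpuminer-opt algorithms above show the same trade-off: the AVX-512 build delivers higher raw hash rates (+11.9% on Deepcoin, +1.1% on Ringcoin, +8.0% on x25x) but draws enough extra power that the AVX-512 Off build wins on kH/s per Watt in every case. A small sketch of that arithmetic from the averages above:

# Ratio arithmetic from the kH/s and average-Watts figures reported above.
results = {
    #           (kH/s On,  kH/s Off, watts On, watts Off)
    "Deepcoin": (11148.96, 9966.10,  219.64,   169.09),
    "Ringcoin": (2152.94,  2130.49,  213.04,   166.37),
    "x25x":     (387.72,   359.01,   210.01,   156.93),
}
for algo, (on, off, w_on, w_off) in results.items():
    speedup = on / off
    eff_on, eff_off = on / w_on, off / w_off
    print(f"{algo}: AVX-512 On is {speedup:.3f}x in kH/s; "
          f"{eff_on:.2f} vs {eff_off:.2f} kH/s per watt (On vs Off)")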

oneDNN

oneDNN 2.7 - CPU Temperature Monitor (Celsius, Fewer Is Better)
  AVX-512 On:     Min: 40 / Avg: 84.18 / Max: 95
  AVX-512 Off:    Min: 39 / Avg: 80.93 / Max: 87
  AVX-512 On 512: Min: 41 / Avg: 85.38 / Max: 95

oneDNN 2.7 - CPU Power Consumption Monitor (Watts, Fewer Is Better)
  AVX-512 On:     Min: 12.72 / Avg: 223.66 / Max: 258.29
  AVX-512 Off:    Min: 12.57 / Avg: 214.65 / Max: 239.45
  AVX-512 On 512: Min: 6.51 / Avg: 222.96 / Max: 260.16

oneDNN 2.7 - CPU Peak Freq (Highest CPU Core Frequency) Monitor (Megahertz, More Is Better)
  AVX-512 On:     Min: 3500 / Avg: 4678.34 / Max: 5300
  AVX-512 Off:    Min: 4700 / Avg: 4726.9 / Max: 5300
  AVX-512 On 512: Min: 3500 / Avg: 4667.49 / Max: 5300

oneDNN 2.7 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  AVX-512 On:     3088.89 (SE +/- 3.05, N = 3)    Min: 3083.31 / Avg: 3088.89 / Max: 3093.83  [MIN: 3071.02]
  AVX-512 Off:    3270.55 (SE +/- 1.82, N = 3)    Min: 3268.67 / Avg: 3270.55 / Max: 3274.19  [-mno-avx512f - MIN: 3260.73]
  AVX-512 On 512: 3073.17 (SE +/- 89.39, N = 15)  Min: 2934.9 / Avg: 3073.17 / Max: 4315.97   [MIN: 2922.72]
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl

222 Results Shown

Cpuminer-Opt:
  LBC, LBRY Credits
  Quad SHA-256, Pyrite
  Myriad-Groestl
OpenVINO:
  Face Detection FP16-INT8 - CPU:
    FPS
    ms
  Weld Porosity Detection FP16-INT8 - CPU:
    ms
    FPS
Cpuminer-Opt:
  Blake-2 S
  Triple SHA-256, Onecoin
oneDNN
OpenVINO:
  Vehicle Detection FP16-INT8 - CPU:
    ms
    FPS
TensorFlow
Cpuminer-Opt
TensorFlow:
  CPU - 256 - GoogLeNet
  CPU - 256 - ResNet-50
  CPU - 64 - ResNet-50
  CPU - 32 - GoogLeNet
  CPU - 64 - GoogLeNet
  CPU - 16 - GoogLeNet
Mobile Neural Network
oneDNN
TensorFlow
Neural Magic DeepSparse:
  NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Synchronous Single-Stream:
    ms/batch
    items/sec
TensorFlow
AI Benchmark Alpha
TensorFlow
OSPRay
Cpuminer-Opt
OSPRay
OpenVINO
oneDNN
NCNN
Neural Magic DeepSparse:
  CV Classification, ResNet-50 ImageNet - Synchronous Single-Stream:
    ms/batch
    items/sec
  CV Classification, ResNet-50 ImageNet - Asynchronous Multi-Stream:
    ms/batch
    items/sec
TensorFlow:
  CPU - 256 - VGG-16
  CPU - 64 - VGG-16
  CPU - 32 - VGG-16
  CPU - 256 - AlexNet
Neural Magic DeepSparse:
  NLP Question Answering, BERT base uncased SQuaD 12layer Pruned90 - Asynchronous Multi-Stream:
    ms/batch
    items/sec
simdjson:
  TopTweet
  DistinctUserID
oneDNN
simdjson
TensorFlow:
  CPU - 16 - VGG-16
  CPU - 64 - AlexNet
OSPRay
OpenVINO:
  Vehicle Detection FP16 - CPU:
    FPS
    ms
OpenVKL
TensorFlow
Meta Performance Per Watts
AI Benchmark Alpha
OpenVINO
oneDNN
NCNN
oneDNN:
  Deconvolution Batch shapes_3d - f32 - CPU
  Convolution Batch Shapes Auto - u8s8f32 - CPU
Neural Magic DeepSparse:
  NLP Text Classification, BERT base uncased SST2 - Synchronous Single-Stream:
    items/sec
    ms/batch
OSPRay
Neural Magic DeepSparse:
  NLP Text Classification, DistilBERT mnli - Synchronous Single-Stream:
    ms/batch
    items/sec
Mobile Neural Network
TensorFlow
Mobile Neural Network:
  inception-v3
  SqueezeNetV1.0
OSPRay
OpenVINO:
  Age Gender Recognition Retail 0013 FP16 - CPU:
    ms
    FPS
Neural Magic DeepSparse:
  CV Detection, YOLOv5s COCO - Asynchronous Multi-Stream:
    items/sec
    ms/batch
simdjson
oneDNN
Neural Magic DeepSparse:
  NLP Text Classification, DistilBERT mnli - Asynchronous Multi-Stream:
    ms/batch
    items/sec
NCNN
oneDNN
OpenVINO
oneDNN
Neural Magic DeepSparse:
  CV Detection, YOLOv5s COCO - Synchronous Single-Stream:
    ms/batch
    items/sec
  NLP Token Classification, BERT base uncased conll2003 - Synchronous Single-Stream:
    items/sec
    ms/batch
  NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream:
    ms/batch
OpenVINO
Neural Magic DeepSparse:
  NLP Text Classification, BERT base uncased SST2 - Asynchronous Multi-Stream
  NLP Document Classification, oBERT base uncased on IMDB - Synchronous Single-Stream:
    items/sec
    ms/batch
oneDNN
OpenVINO:
  Person Detection FP32 - CPU:
    FPS
    ms
  Face Detection FP16 - CPU:
    FPS
oneDNN
OpenVINO:
  Person Vehicle Bike Detection FP16 - CPU:
    ms
    FPS
  Face Detection FP16 - CPU:
    ms
NCNN
LeelaChessZero
OpenVINO:
  Machine Translation EN To DE FP16 - CPU:
    FPS
    ms
Neural Magic DeepSparse:
  NLP Token Classification, BERT base uncased conll2003 - Asynchronous Multi-Stream:
    items/sec
    ms/batch
NCNN
Neural Magic DeepSparse:
  NLP Document Classification, oBERT base uncased on IMDB - Asynchronous Multi-Stream:
    items/sec
    ms/batch
Mobile Neural Network
NCNN:
  CPU - mnasnet
  CPU - alexnet
OSPRay Studio
Mobile Neural Network
OSPRay Studio:
  1 - 4K - 32 - Path Tracer
  1 - 1080p - 16 - Path Tracer
Numpy Benchmark
ONNX Runtime
OSPRay Studio
oneDNN
OSPRay Studio
oneDNN
OSPRay Studio:
  1 - 4K - 1 - Path Tracer
  3 - 1080p - 32 - Path Tracer
  3 - 4K - 32 - Path Tracer
  3 - 4K - 16 - Path Tracer
  3 - 1080p - 16 - Path Tracer
NCNN
OSPRay Studio
ONNX Runtime
dav1d
OSPRay Studio
simdjson
OpenRadioss
ONNX Runtime
NCNN
oneDNN
dav1d
oneDNN
OpenRadioss
NCNN
dav1d
NCNN
Embree
NCNN:
  CPU - resnet18
  CPU - resnet50
Mobile Neural Network
OSPRay
NCNN:
  CPU - blazeface
  CPU - FastestDet
oneDNN
OpenRadioss
Embree
AI Benchmark Alpha
oneDNN
ONNX Runtime
TNN
ONNX Runtime
Embree
oneDNN
dav1d
Mobile Neural Network
LeelaChessZero
OpenFOAM
NCNN
ONNX Runtime
TNN
OpenRadioss
oneDNN
ONNX Runtime
OpenRadioss
Xmrig
GROMACS
ONNX Runtime:
  fcn-resnet101-11 - CPU - Parallel
  yolov4 - CPU - Parallel
TNN
Cpuminer-Opt
Xmrig
OpenFOAM
TNN
oneDNN
ONNX Runtime:
  GPT-2 - CPU - Parallel
  GPT-2 - CPU - Standard
oneDNN:
  Convolution Batch Shapes Auto - bf16bf16bf16 - CPU
  IP Shapes 1D - bf16bf16bf16 - CPU
NCNN
Intel Open Image Denoise:
  RTLightmap.hdr.4096x4096
  RT.ldr_alb_nrm.3840x2160
  RT.hdr_alb_nrm.3840x2160
Phoronix Test Suite System Monitoring:
  CPU Temperature Monitor:
    Celsius
  CPU Power Consumption Monitor:
    Watts
  CPU Peak Freq (Highest CPU Core Frequency) Monitor:
    Megahertz
ONNX Runtime:
  CPU Temp Monitor:
    Celsius
  CPU Power Consumption Monitor:
    Watts
  CPU Peak Freq (Highest CPU Core Frequency) Monitor:
    Megahertz
  ArcFace ResNet-100 - CPU - Standard:
    Inferences Per Minute Per Watt
    Inferences Per Minute
OpenVINO:
  CPU Temp Monitor
  CPU Power Consumption Monitor
  CPU Peak Freq (Highest CPU Core Frequency) Monitor
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU:
    ms
    FPS
Cpuminer-Opt:
  CPU Temp Monitor
  CPU Power Consumption Monitor
  CPU Peak Freq (Highest CPU Core Frequency) Monitor
  Deepcoin:
    kH/s Per Watt
    kH/s
Cpuminer-Opt:
  CPU Temp Monitor
  CPU Power Consumption Monitor
  CPU Peak Freq (Highest CPU Core Frequency) Monitor
  Ringcoin:
    kH/s Per Watt
    kH/s
Cpuminer-Opt:
  CPU Temp Monitor
  CPU Power Consumption Monitor
  CPU Peak Freq (Highest CPU Core Frequency) Monitor
  x25x:
    kH/s Per Watt
    kH/s
oneDNN:
  CPU Temp Monitor
  CPU Power Consumption Monitor
  CPU Peak Freq (Highest CPU Core Frequency) Monitor
  Recurrent Neural Network Training - u8s8f32 - CPU:
    ms