Core i3 8100 December
Intel Core i3-8100 testing with an ASRock Z370M-ITX/ac (P4.10 BIOS) and Intel 8th Gen Core Gaussian Mixture Model 3GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2012270-HA-COREI381005&grs&sor .
System configuration (runs 1, 2, 3):
Processor: Intel Core i3-8100 @ 3.60GHz (4 Cores)
Motherboard: ASRock Z370M-ITX/ac (P4.10 BIOS)
Chipset: Intel 8th Gen Core 4-core Desktop
Memory: 8GB
Disk: 60GB DREVO X1 SSD
Graphics: Intel 8th Gen Core Gaussian Mixture Model 3GB (1100MHz)
Audio: Realtek ALC892
Monitor: VA2431
Network: Intel I219-V + Intel I211 + Intel Dual Band-AC 3168NGW
OS: Ubuntu 20.04
Kernel: 5.9.0-050900rc1daily20200819-generic (x86_64) 20200818
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
OpenGL: 4.6 Mesa 20.0.8
Vulkan: 1.2.131
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xde - Thermald 1.9.1
Python Details: Python 3.8.5
Security Details:
  itlb_multihit: KVM: Mitigation of VMX disabled
  l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT disabled
  mds: Mitigation of Clear buffers; SMT disabled
  meltdown: Mitigation of PTI
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: disabled RSB filling
  srbds: Mitigation of Microcode
  tsx_async_abort: Not affected
Test | 1 | 2 | 3
vkresample: 2x - Single | 478.833 | 422.227 | 421.635
onednn: IP Shapes 3D - f32 - CPU | 9.06142 | 9.52884 | 9.18889
node-web-tooling: | 10.17 | 10.39 | 10.29
openvino: Age Gender Recognition Retail 0013 FP16 - CPU | 2549.49 | 2516.51 | 2562.17
unpack-firefox: firefox-84.0.source.tar.xz | 23.256 | 23.393 | 23.669
simdjson: Kostya | 0.57 | 0.58 | 0.58
stockfish: Total Time | 6456585 | 6551982 | 6451727
openvino: Person Detection 0106 FP32 - CPU | 0.70 | 0.71 | 0.70
vkmark: 1280 x 1024 | 977 | 969 | 964
onednn: IP Shapes 3D - u8s8f32 - CPU | 3.59647 | 3.63768 | 3.59274
asmfish: 1024 Hash Memory, 26 Depth | 8988021 | 9008380 | 8912458
onednn: Convolution Batch Shapes Auto - f32 - CPU | 18.5346 | 18.7125 | 18.5430
sqlite-speedtest: Timed Time - Size 1,000 | 79.432 | 79.660 | 78.924
brl-cad: VGR Performance Metric | 36580 | 36247 | 36569
openvino: Face Detection 0106 FP16 - CPU | 1.16 | 1.17 | 1.16
compress-lz4: 9 - Compression Speed | 40.08 | 39.76 | 39.98
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 14.4610 | 14.3583 | 14.4145
ncnn: CPU-v2-v2 - mobilenet-v2 | 7.18 | 7.14 | 7.13
openvino: Age Gender Recognition Retail 0013 FP32 - CPU | 1.46 | 1.45 | 1.45
openvino: Age Gender Recognition Retail 0013 FP16 - CPU | 1.45 | 1.45 | 1.46
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 6758.84 | 6804.34 | 6764.33
mafft: Multiple Sequence Alignment - LSU RNA | 11.939 | 11.922 | 11.997
ncnn: CPU - mnasnet | 6.55 | 6.51 | 6.53
encode-ape: WAV To APE | 13.624 | 13.704 | 13.643
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 17.7086 | 17.8085 | 17.7601
build-eigen: Time To Compile | 93.508 | 93.150 | 93.090
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 11.5173 | 11.4995 | 11.4730
build2: Time To Compile | 338.864 | 337.988 | 337.576
onednn: Recurrent Neural Network Inference - f32 - CPU | 3873.75 | 3860.52 | 3874.23
ncnn: CPU - regnety_400m | 14.55 | 14.50 | 14.55
openvino: Age Gender Recognition Retail 0013 FP32 - CPU | 2543.51 | 2551.41 | 2552.27
vkresample: 2x - Double | 1013.827 | 1010.496 | 1011.005
ncnn: Vulkan GPU - shufflenet-v2 | 9.36 | 9.37 | 9.34
ncnn: Vulkan GPU - resnet50 | 47.84 | 47.79 | 47.93
compress-lz4: 1 - Compression Speed | 7608.28 | 7630.56 | 7612.81
rav1e: 1 | 0.344 | 0.344 | 0.343
ncnn: CPU - efficientnet-b0 | 10.39 | 10.36 | 10.37
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 | 7.16 | 7.15 | 7.14
ncnn: CPU - googlenet | 21.69 | 21.75 | 21.74
compress-lz4: 9 - Decompression Speed | 9077.6 | 9053.2 | 9066.9
ncnn: CPU - shufflenet-v2 | 9.34 | 9.34 | 9.36
openvino: Person Detection 0106 FP32 - CPU | 5648.20 | 5638.59 | 5650.56
ncnn: Vulkan GPU - regnety_400m | 14.55 | 14.53 | 14.56
ncnn: Vulkan GPU - squeezenet_ssd | 40.52 | 40.46 | 40.44
ncnn: Vulkan GPU - efficientnet-b0 | 10.37 | 10.37 | 10.39
ncnn: Vulkan GPU - googlenet | 21.68 | 21.72 | 21.69
astcenc: Fast | 5.76 | 5.75 | 5.75
compress-lz4: 1 - Decompression Speed | 9201.5 | 9212.4 | 9216.9
onednn: IP Shapes 1D - u8s8f32 - CPU | 5.58876 | 5.58671 | 5.57958
onednn: IP Shapes 1D - f32 - CPU | 8.47060 | 8.47988 | 8.46603
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 3855.79 | 3858.99 | 3852.80
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 4.71186 | 4.71931 | 4.71943
ncnn: CPU-v3-v3 - mobilenet-v3 | 6.24 | 6.25 | 6.24
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 | 6.26 | 6.26 | 6.25
ncnn: Vulkan GPU - alexnet | 19.28 | 19.31 | 19.30
ncnn: Vulkan GPU - mnasnet | 6.54 | 6.54 | 6.53
vkmark: 1920 x 1080 | 657 | 658 | 658
ncnn: CPU - squeezenet_ssd | 40.40 | 40.46 | 40.45
ncnn: CPU - resnet50 | 47.83 | 47.90 | 47.88
vkfft: | 1367 | 1369 | 1368
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 6758.25 | 6759.25 | 6767.39
encode-wavpack: WAV To WavPack | 17.965 | 17.942 | 17.945
compress-lz4: 3 - Compression Speed | 40.97 | 41.02 | 40.99
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 3856.20 | 3860.15 | 3855.49
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 14.5548 | 14.5390 | 14.5402
ncnn: Vulkan GPU - mobilenet | 28.16 | 28.14 | 28.13
rav1e: 10 | 2.890 | 2.887 | 2.889
ncnn: CPU - alexnet | 19.32 | 19.34 | 19.33
rav1e: 5 | 0.984 | 0.984 | 0.983
build-ffmpeg: Time To Compile | 151.630 | 151.678 | 151.784
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 11.2454 | 11.2340 | 11.2343
openvino: Face Detection 0106 FP16 - CPU | 3426.15 | 3422.79 | 3426.05
astcenc: Exhaustive | 738.37 | 737.79 | 738.44
ncnn: Vulkan GPU - yolov4-tiny | 38.97 | 38.96 | 38.94
astcenc: Thorough | 92.92 | 92.88 | 92.95
compress-lz4: 3 - Decompression Speed | 9058.0 | 9051.3 | 9057.3
ncnn: CPU - mobilenet | 28.15 | 28.15 | 28.17
openvino: Person Detection 0106 FP16 - CPU | 5641.28 | 5638.32 | 5642.32
astcenc: Medium | 14.28 | 14.29 | 14.29
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 7.50114 | 7.50639 | 7.50482
ncnn: Vulkan GPU - vgg16 | 87.28 | 87.24 | 87.23
ncnn: CPU - yolov4-tiny | 38.99 | 38.97 | 38.97
ncnn: Vulkan GPU - resnet18 | 22.52 | 22.52 | 22.53
ncnn: CPU - resnet18 | 22.55 | 22.56 | 22.56
encode-opus: WAV To Opus Encode | 10.648 | 10.648 | 10.652
hmmer: Pfam Database Search | 126.157 | 126.198 | 126.154
ncnn: CPU - vgg16 | 87.35 | 87.32 | 87.34
onednn: Recurrent Neural Network Training - f32 - CPU | 6759.55 | 6761.68 | 6761.79
openvino: Face Detection 0106 FP32 - CPU | 3436.65 | 3436.56 | 3437.37
openvino: Person Detection 0106 FP16 - CPU | 0.71 | 0.71 | 0.71
openvino: Face Detection 0106 FP32 - CPU | 1.15 | 1.15 | 1.15
ncnn: Vulkan GPU - blazeface | 2.36 | 2.36 | 2.36
ncnn: CPU - blazeface | 2.36 | 2.36 | 2.36
rav1e: 6 | 1.308 | 1.308 | 1.308
simdjson: DistinctUserID | 0.54 | 0.54 | 0.54
simdjson: PartialTweets | 0.52 | 0.52 | 0.52
simdjson: LargeRand | 0.35 | 0.35 | 0.35
clomp: Static OMP Speedup | 1.7 | 1.7 | 1.7
betsy: ETC2 RGB - Highest | 7.541 | - | -
coremark: CoreMark Size 666 - Iterations Per Second | 98117.509075 | 100468.346119 | 92000.140543
betsy: ETC1 - Highest | 10.906 | 11.082 | 10.370
VkResample 1.0, Upscale: 2x - Precision: Single (ms, Fewer Is Better)
  Run 3: 421.64 (SE +/- 0.49, N = 3)
  Run 2: 422.23 (SE +/- 0.41, N = 3)
  Run 1: 478.83 (SE +/- 0.46, N = 3)
  1. (CXX) g++ options: -O3 -pthread
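The gap between runs in this VkResample result can be expressed in relative terms. The following is a minimal Python sketch that uses only the three mean times reported above; the helper name `percent_slower` and the dictionary are illustrative and are not part of the Phoronix Test Suite export.

```python
def percent_slower(value: float, baseline: float) -> float:
    """Percent by which `value` exceeds `baseline` (for fewer-is-better metrics)."""
    return (value - baseline) / baseline * 100.0


# Mean times in ms copied from the VkResample 2x / Single result.
vkresample_single_ms = {"run 1": 478.83, "run 2": 422.23, "run 3": 421.64}

fastest = min(vkresample_single_ms.values())

# List the runs from fastest to slowest with their gap to the fastest run.
for run, ms in sorted(vkresample_single_ms.items(), key=lambda kv: kv[1]):
    gap = percent_slower(ms, fastest)
    print(f"{run}: {ms:.2f} ms, {gap:.1f}% slower than the fastest run")
```

With these figures run 1 comes out roughly 13.6% slower than run 3, a gap far larger than the reported standard errors of under 0.5 ms, while runs 2 and 3 differ by about 0.1%.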
oneDNN 2.0, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 9.06142 (SE +/- 0.01406, N = 3; MIN: 8.95)
  Run 3: 9.18889 (SE +/- 0.02174, N = 3; MIN: 9.1)
  Run 2: 9.52884 (SE +/- 0.02685, N = 3; MIN: 9.43)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Node.js V8 Web Tooling Benchmark (runs/s, More Is Better)
  Run 2: 10.39 (SE +/- 0.00, N = 3)
  Run 3: 10.29 (SE +/- 0.06, N = 3)
  Run 1: 10.17 (SE +/- 0.06, N = 3)
  1. Nodejs v10.19.0
OpenVINO 2021.1, Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better)
  Run 3: 2562.17 (SE +/- 18.26, N = 3)
  Run 1: 2549.49 (SE +/- 10.55, N = 3)
  Run 2: 2516.51 (SE +/- 32.43, N = 12)
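Run 2 of this OpenVINO test was sampled 12 times rather than 3. Assuming the "SE +/-" figure is the standard error of the mean (sample standard deviation divided by the square root of N), which the export itself does not state, the implied per-sample spread can be estimated as in the sketch below. `implied_sd` is a hypothetical helper, not a Phoronix Test Suite function.

```python
import math


def implied_sd(standard_error: float, n: int) -> float:
    """Standard deviation implied by a standard error of the mean over n samples.

    Assumes SE = SD / sqrt(n); the export does not confirm this definition.
    """
    return standard_error * math.sqrt(n)


# (SE, N) pairs copied from the OpenVINO Age Gender Recognition FP16 (FPS) result.
reported = {"run 3": (18.26, 3), "run 1": (10.55, 3), "run 2": (32.43, 12)}

for run, (se, n) in reported.items():
    sd = implied_sd(se, n)
    print(f"{run}: SE {se:.2f} over N = {n} -> implied SD of about {sd:.1f} FPS")
```

Under that assumption run 2 shows the widest per-sample spread of the three, at about 112 FPS.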
Unpacking Firefox 84.0, Extracting: firefox-84.0.source.tar.xz (Seconds, Fewer Is Better)
  Run 1: 23.26 (SE +/- 0.18, N = 4)
  Run 2: 23.39 (SE +/- 0.24, N = 4)
  Run 3: 23.67 (SE +/- 0.30, N = 4)
simdjson 0.7.1, Throughput Test: Kostya (GB/s, More Is Better)
  Run 3: 0.58 (SE +/- 0.00, N = 3)
  Run 2: 0.58 (SE +/- 0.00, N = 3)
  Run 1: 0.57 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -pthread
Stockfish 12, Total Time (Nodes Per Second, More Is Better)
  Run 2: 6551982 (SE +/- 39431.93, N = 3)
  Run 1: 6456585 (SE +/- 59897.76, N = 3)
  Run 3: 6451727 (SE +/- 73617.81, N = 3)
  1. (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver
OpenVINO 2021.1, Model: Person Detection 0106 FP32 - Device: CPU (FPS, More Is Better)
  Run 2: 0.71 (SE +/- 0.00, N = 3)
  Run 3: 0.70 (SE +/- 0.00, N = 3)
  Run 1: 0.70 (SE +/- 0.00, N = 3)
VKMark 2020-05-21, Resolution: 1280 x 1024 (VKMark Score, More Is Better)
  Run 1: 977
  Run 2: 969
  Run 3: 964
  Reported SE (two values only): +/- 1.00, N = 3; +/- 3.93, N = 3
  1. (CXX) g++ options: -pthread -ldl -pipe -std=c++14 -MD -MQ -MF
oneDNN 2.0, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 3: 3.59274 (SE +/- 0.01609, N = 3; MIN: 3.34)
  Run 1: 3.59647 (SE +/- 0.01043, N = 3; MIN: 3.39)
  Run 2: 3.63768 (SE +/- 0.03180, N = 3; MIN: 3.18)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
asmFish 2018-07-23, 1024 Hash Memory, 26 Depth (Nodes/second, More Is Better)
  Run 2: 9008380 (SE +/- 71523.89, N = 3)
  Run 1: 8988021 (SE +/- 98059.95, N = 3)
  Run 3: 8912458 (SE +/- 46897.59, N = 3)
oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 18.53 (SE +/- 0.02, N = 3; MIN: 18.42)
  Run 3: 18.54 (SE +/- 0.02, N = 3; MIN: 18.43)
  Run 2: 18.71 (SE +/- 0.03, N = 3; MIN: 18.57)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
SQLite Speedtest 3.30, Timed Time - Size 1,000 (Seconds, Fewer Is Better)
  Run 3: 78.92 (SE +/- 0.10, N = 3)
  Run 1: 79.43 (SE +/- 0.03, N = 3)
  Run 2: 79.66 (SE +/- 0.34, N = 3)
  1. (CC) gcc options: -O2 -ldl -lz -lpthread
BRL-CAD 7.30.8, VGR Performance Metric (VGR Performance Metric, More Is Better)
  Run 1: 36580
  Run 3: 36569
  Run 2: 36247
  1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
OpenVINO 2021.1, Model: Face Detection 0106 FP16 - Device: CPU (FPS, More Is Better)
  Run 2: 1.17 (SE +/- 0.00, N = 3)
  Run 3: 1.16 (SE +/- 0.00, N = 3)
  Run 1: 1.16 (SE +/- 0.00, N = 3)
LZ4 Compression 1.9.3, Compression Level: 9 - Compression Speed (MB/s, More Is Better)
  Run 1: 40.08 (SE +/- 0.06, N = 3)
  Run 3: 39.98 (SE +/- 0.01, N = 3)
  Run 2: 39.76 (SE +/- 0.21, N = 3)
  1. (CC) gcc options: -O3
oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 14.36 (SE +/- 0.01, N = 3; MIN: 14.23)
  Run 3: 14.41 (SE +/- 0.04, N = 3; MIN: 14.25)
  Run 1: 14.46 (SE +/- 0.08, N = 3; MIN: 14.22)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  Run 3: 7.13 (SE +/- 0.01, N = 3; MIN: 7.07 / MAX: 8.56)
  Run 2: 7.14 (SE +/- 0.01, N = 3; MIN: 7.07 / MAX: 8.36)
  Run 1: 7.18 (SE +/- 0.05, N = 3; MIN: 7.07 / MAX: 29.75)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenVINO 2021.1, Model: Age Gender Recognition Retail 0013 FP32 - Device: CPU (ms, Fewer Is Better)
  Run 2: 1.45 (SE +/- 0.00, N = 3)
  Run 3: 1.45 (SE +/- 0.00, N = 3)
  Run 1: 1.46 (SE +/- 0.00, N = 3)
OpenVINO 2021.1, Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better)
  Run 1: 1.45 (SE +/- 0.00, N = 3)
  Run 2: 1.45 (SE +/- 0.00, N = 12)
  Run 3: 1.46 (SE +/- 0.00, N = 3)
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 6758.84 (SE +/- 7.71, N = 3; MIN: 6740.34)
  Run 3: 6764.33 (SE +/- 4.92, N = 3; MIN: 6750.92)
  Run 2: 6804.34 (SE +/- 37.00, N = 3; MIN: 6761.06)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Timed MAFFT Alignment 7.471, Multiple Sequence Alignment - LSU RNA (Seconds, Fewer Is Better)
  Run 2: 11.92 (SE +/- 0.04, N = 3)
  Run 1: 11.94 (SE +/- 0.04, N = 3)
  Run 3: 12.00 (SE +/- 0.07, N = 3)
  1. (CC) gcc options: -std=c99 -O3 -lm -lpthread
NCNN 20201218, Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  Run 2: 6.51 (SE +/- 0.01, N = 3; MIN: 6.47 / MAX: 6.73)
  Run 3: 6.53 (SE +/- 0.01, N = 3; MIN: 6.49 / MAX: 6.66)
  Run 1: 6.55 (SE +/- 0.02, N = 3; MIN: 6.5 / MAX: 16.64)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Monkey Audio Encoding 3.99.6, WAV To APE (Seconds, Fewer Is Better)
  Run 1: 13.62 (SE +/- 0.03, N = 5)
  Run 3: 13.64 (SE +/- 0.03, N = 5)
  Run 2: 13.70 (SE +/- 0.06, N = 5)
  1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 17.71 (SE +/- 0.03, N = 3; MIN: 17.57)
  Run 3: 17.76 (SE +/- 0.01, N = 3; MIN: 17.61)
  Run 2: 17.81 (SE +/- 0.02, N = 3; MIN: 17.67)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Timed Eigen Compilation 3.3.9, Time To Compile (Seconds, Fewer Is Better)
  Run 3: 93.09 (SE +/- 0.06, N = 3)
  Run 2: 93.15 (SE +/- 0.07, N = 3)
  Run 1: 93.51 (SE +/- 0.06, N = 3)
oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 3: 11.47 (SE +/- 0.01, N = 3; MIN: 11.38)
  Run 2: 11.50 (SE +/- 0.03, N = 3; MIN: 11.4)
  Run 1: 11.52 (SE +/- 0.03, N = 3; MIN: 11.42)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Build2 0.13, Time To Compile (Seconds, Fewer Is Better)
  Run 3: 337.58 (SE +/- 0.64, N = 3)
  Run 2: 337.99 (SE +/- 0.28, N = 3)
  Run 1: 338.86 (SE +/- 0.66, N = 3)
oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 3860.52 (SE +/- 0.72, N = 3; MIN: 3856.22)
  Run 1: 3873.75 (SE +/- 20.54, N = 3; MIN: 3848.44)
  Run 3: 3874.23 (SE +/- 18.83, N = 3; MIN: 3849.24)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218, Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
  Run 2: 14.50 (SE +/- 0.03, N = 3; MIN: 14.4 / MAX: 15.22)
  Run 1: 14.55 (SE +/- 0.03, N = 3; MIN: 14.43 / MAX: 15.76)
  Run 3: 14.55 (SE +/- 0.02, N = 3; MIN: 14.46 / MAX: 15.19)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenVINO 2021.1, Model: Age Gender Recognition Retail 0013 FP32 - Device: CPU (FPS, More Is Better)
  Run 3: 2552.27 (SE +/- 6.24, N = 3)
  Run 2: 2551.41 (SE +/- 2.44, N = 3)
  Run 1: 2543.51 (SE +/- 4.46, N = 3)
VkResample 1.0, Upscale: 2x - Precision: Double (ms, Fewer Is Better)
  Run 2: 1010.50 (SE +/- 2.85, N = 3)
  Run 3: 1011.01 (SE +/- 4.21, N = 3)
  Run 1: 1013.83 (SE +/- 1.41, N = 3)
  1. (CXX) g++ options: -O3 -pthread
NCNN 20201218, Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  Run 3: 9.34 (SE +/- 0.01, N = 3; MIN: 9.29 / MAX: 10.75)
  Run 1: 9.36 (SE +/- 0.01, N = 3; MIN: 9.29 / MAX: 10.85)
  Run 2: 9.37 (SE +/- 0.02, N = 3; MIN: 9.31 / MAX: 10.64)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
  Run 2: 47.79 (SE +/- 0.03, N = 3; MIN: 47.59 / MAX: 57.75)
  Run 1: 47.84 (SE +/- 0.02, N = 3; MIN: 47.59 / MAX: 59.98)
  Run 3: 47.93 (SE +/- 0.11, N = 3; MIN: 47.56 / MAX: 120.58)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
LZ4 Compression 1.9.3, Compression Level: 1 - Compression Speed (MB/s, More Is Better)
  Run 2: 7630.56 (SE +/- 19.00, N = 3)
  Run 3: 7612.81 (SE +/- 7.93, N = 3)
  Run 1: 7608.28 (SE +/- 4.40, N = 3)
  1. (CC) gcc options: -O3
rav1e 0.4 Alpha, Speed: 1 (Frames Per Second, More Is Better)
  Run 2: 0.344 (SE +/- 0.001, N = 3)
  Run 1: 0.344 (SE +/- 0.000, N = 3)
  Run 3: 0.343 (SE +/- 0.001, N = 3)
NCNN 20201218, Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  Run 2: 10.36 (SE +/- 0.00, N = 3; MIN: 10.33 / MAX: 10.6)
  Run 3: 10.37 (SE +/- 0.01, N = 3; MIN: 10.32 / MAX: 10.49)
  Run 1: 10.39 (SE +/- 0.00, N = 3; MIN: 10.35 / MAX: 11.79)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  Run 3: 7.14 (SE +/- 0.00, N = 3; MIN: 7.08 / MAX: 8.35)
  Run 2: 7.15 (SE +/- 0.00, N = 3; MIN: 7.07 / MAX: 9.43)
  Run 1: 7.16 (SE +/- 0.01, N = 3; MIN: 7.08 / MAX: 8.7)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: CPU - Model: googlenet (ms, Fewer Is Better)
  Run 1: 21.69 (SE +/- 0.02, N = 3; MIN: 21.59 / MAX: 29.99)
  Run 3: 21.74 (SE +/- 0.06, N = 3; MIN: 21.62 / MAX: 52.65)
  Run 2: 21.75 (SE +/- 0.05, N = 3; MIN: 21.6 / MAX: 56.79)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
LZ4 Compression 1.9.3, Compression Level: 9 - Decompression Speed (MB/s, More Is Better)
  Run 1: 9077.6 (SE +/- 15.30, N = 3)
  Run 3: 9066.9 (SE +/- 14.57, N = 3)
  Run 2: 9053.2 (SE +/- 13.38, N = 3)
  1. (CC) gcc options: -O3
NCNN 20201218, Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  Run 1: 9.34 (SE +/- 0.02, N = 3; MIN: 9.28 / MAX: 10.73)
  Run 2: 9.34 (SE +/- 0.01, N = 3; MIN: 9.28 / MAX: 10.73)
  Run 3: 9.36 (SE +/- 0.01, N = 3; MIN: 9.29 / MAX: 11.04)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenVINO 2021.1, Model: Person Detection 0106 FP32 - Device: CPU (ms, Fewer Is Better)
  Run 2: 5638.59 (SE +/- 5.55, N = 3)
  Run 1: 5648.20 (SE +/- 7.38, N = 3)
  Run 3: 5650.56 (SE +/- 8.84, N = 3)
NCNN 20201218, Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
  Run 2: 14.53 (SE +/- 0.01, N = 3; MIN: 14.46 / MAX: 15.24)
  Run 1: 14.55 (SE +/- 0.02, N = 3; MIN: 14.45 / MAX: 28.03)
  Run 3: 14.56 (SE +/- 0.01, N = 3; MIN: 14.49 / MAX: 15.22)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  Run 3: 40.44 (SE +/- 0.05, N = 3; MIN: 40.32 / MAX: 50.6)
  Run 2: 40.46 (SE +/- 0.05, N = 3; MIN: 40.33 / MAX: 67.8)
  Run 1: 40.52 (SE +/- 0.08, N = 3; MIN: 40.36 / MAX: 90.92)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  Run 1: 10.37 (SE +/- 0.01, N = 3; MIN: 10.32 / MAX: 11.49)
  Run 2: 10.37 (SE +/- 0.00, N = 3; MIN: 10.34 / MAX: 11.79)
  Run 3: 10.39 (SE +/- 0.03, N = 3; MIN: 10.32 / MAX: 20.27)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
  Run 1: 21.68 (SE +/- 0.02, N = 3; MIN: 21.59 / MAX: 31.15)
  Run 3: 21.69 (SE +/- 0.02, N = 3; MIN: 21.59 / MAX: 31.58)
  Run 2: 21.72 (SE +/- 0.03, N = 3; MIN: 21.61 / MAX: 31.67)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
ASTC Encoder 2.0, Preset: Fast (Seconds, Fewer Is Better)
  Run 2: 5.75 (SE +/- 0.01, N = 3)
  Run 3: 5.75 (SE +/- 0.01, N = 3)
  Run 1: 5.76 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
LZ4 Compression 1.9.3, Compression Level: 1 - Decompression Speed (MB/s, More Is Better)
  Run 3: 9216.9 (SE +/- 21.43, N = 3)
  Run 2: 9212.4 (SE +/- 38.65, N = 3)
  Run 1: 9201.5 (SE +/- 38.89, N = 3)
  1. (CC) gcc options: -O3
oneDNN 2.0, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 3: 5.57958 (SE +/- 0.00020, N = 3; MIN: 5.56)
  Run 2: 5.58671 (SE +/- 0.00388, N = 3; MIN: 5.55)
  Run 1: 5.58876 (SE +/- 0.00294, N = 3; MIN: 5.55)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 3: 8.46603 (SE +/- 0.01958, N = 3; MIN: 8.37)
  Run 1: 8.47060 (SE +/- 0.01592, N = 3; MIN: 8.4)
  Run 2: 8.47988 (SE +/- 0.00926, N = 3; MIN: 8.39)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 3: 3852.80 (SE +/- 1.54, N = 3; MIN: 3847.92)
  Run 1: 3855.79 (SE +/- 5.10, N = 3; MIN: 3845.17)
  Run 2: 3858.99 (SE +/- 1.86, N = 3; MIN: 3851.44)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 4.71186 (SE +/- 0.00413, N = 3; MIN: 4.66)
  Run 2: 4.71931 (SE +/- 0.00367, N = 3; MIN: 4.66)
  Run 3: 4.71943 (SE +/- 0.00063, N = 3; MIN: 4.66)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  Run 1: 6.24 (SE +/- 0.01, N = 3; MIN: 6.17 / MAX: 7.41)
  Run 3: 6.24 (SE +/- 0.02, N = 3; MIN: 6.17 / MAX: 7.8)
  Run 2: 6.25 (SE +/- 0.02, N = 3; MIN: 6.17 / MAX: 15.83)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  Run 3: 6.25 (SE +/- 0.00, N = 3; MIN: 6.18 / MAX: 7.7)
  Run 1: 6.26 (SE +/- 0.00, N = 3; MIN: 6.18 / MAX: 7.85)
  Run 2: 6.26 (SE +/- 0.00, N = 3; MIN: 6.19 / MAX: 7.44)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
  Run 1: 19.28 (SE +/- 0.01, N = 3; MIN: 19.2 / MAX: 19.63)
  Run 3: 19.30 (SE +/- 0.01, N = 3; MIN: 19.2 / MAX: 21.55)
  Run 2: 19.31 (SE +/- 0.02, N = 3; MIN: 19.23 / MAX: 21.66)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
  Run 3: 6.53 (SE +/- 0.01, N = 3; MIN: 6.49 / MAX: 7.16)
  Run 1: 6.54 (SE +/- 0.01, N = 3; MIN: 6.5 / MAX: 8.09)
  Run 2: 6.54 (SE +/- 0.00, N = 3; MIN: 6.51 / MAX: 7.17)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VKMark 2020-05-21, Resolution: 1920 x 1080 (VKMark Score, More Is Better)
  Run 3: 658
  Run 2: 658
  Run 1: 657
  Reported SE (two values only): +/- 0.58, N = 3; +/- 1.76, N = 3
  1. (CXX) g++ options: -pthread -ldl -pipe -std=c++14 -MD -MQ -MF
NCNN 20201218, Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  Run 1: 40.40 (SE +/- 0.03, N = 3; MIN: 40.28 / MAX: 42.66)
  Run 3: 40.45 (SE +/- 0.02, N = 3; MIN: 40.35 / MAX: 52.52)
  Run 2: 40.46 (SE +/- 0.04, N = 3; MIN: 40.32 / MAX: 57.2)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  Run 1: 47.83 (SE +/- 0.01, N = 3; MIN: 47.62 / MAX: 57.56)
  Run 3: 47.88 (SE +/- 0.07, N = 3; MIN: 47.6 / MAX: 95.3)
  Run 2: 47.90 (SE +/- 0.06, N = 3; MIN: 47.62 / MAX: 88.54)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VkFFT 1.1.1 (Benchmark Score, More Is Better)
  Run 2: 1369 (SE +/- 1.20, N = 3)
  Run 3: 1368 (SE +/- 1.00, N = 3)
  Run 1: 1367 (SE +/- 2.33, N = 3)
  1. (CXX) g++ options: -O3 -pthread
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 6758.25 (SE +/- 7.88, N = 3; MIN: 6740.42)
  Run 2: 6759.25 (SE +/- 2.23, N = 3; MIN: 6750.8)
  Run 3: 6767.39 (SE +/- 3.49, N = 3; MIN: 6755.1)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
WavPack Audio Encoding 5.3, WAV To WavPack (Seconds, Fewer Is Better)
  Run 2: 17.94 (SE +/- 0.00, N = 5)
  Run 3: 17.95 (SE +/- 0.00, N = 5)
  Run 1: 17.97 (SE +/- 0.02, N = 5)
  1. (CXX) g++ options: -rdynamic
LZ4 Compression 1.9.3, Compression Level: 3 - Compression Speed (MB/s, More Is Better)
  Run 2: 41.02 (SE +/- 0.05, N = 3)
  Run 3: 40.99 (SE +/- 0.08, N = 3)
  Run 1: 40.97 (SE +/- 0.08, N = 3)
  1. (CC) gcc options: -O3
oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Run 3: 3855.49 (SE +/- 2.63, N = 3; MIN: 3847.91)
  Run 1: 3856.20 (SE +/- 1.72, N = 3; MIN: 3850.81)
  Run 2: 3860.15 (SE +/- 1.45, N = 3; MIN: 3854.62)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 14.54 (SE +/- 0.01, N = 3; MIN: 14.47)
  Run 3: 14.54 (SE +/- 0.03, N = 3; MIN: 14.44)
  Run 1: 14.55 (SE +/- 0.03, N = 3; MIN: 14.46)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
  Run 3: 28.13 (SE +/- 0.02, N = 3; MIN: 28 / MAX: 30.42)
  Run 2: 28.14 (SE +/- 0.04, N = 3; MIN: 27.98 / MAX: 33.11)
  Run 1: 28.16 (SE +/- 0.03, N = 3; MIN: 28 / MAX: 30.49)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
rav1e 0.4 Alpha, Speed: 10 (Frames Per Second, More Is Better)
  Run 1: 2.890 (SE +/- 0.004, N = 3)
  Run 3: 2.889 (SE +/- 0.006, N = 3)
  Run 2: 2.887 (SE +/- 0.006, N = 3)
NCNN 20201218, Target: CPU - Model: alexnet (ms, Fewer Is Better)
  Run 1: 19.32 (SE +/- 0.01, N = 3; MIN: 19.23 / MAX: 21.26)
  Run 3: 19.33 (SE +/- 0.03, N = 3; MIN: 19.23 / MAX: 28.73)
  Run 2: 19.34 (SE +/- 0.01, N = 3; MIN: 19.23 / MAX: 21.66)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
rav1e 0.4 Alpha, Speed: 5 (Frames Per Second, More Is Better)
  Run 2: 0.984 (SE +/- 0.002, N = 3)
  Run 1: 0.984 (SE +/- 0.001, N = 3)
  Run 3: 0.983 (SE +/- 0.002, N = 3)
Timed FFmpeg Compilation 4.2.2, Time To Compile (Seconds, Fewer Is Better)
  Run 1: 151.63 (SE +/- 0.07, N = 3)
  Run 2: 151.68 (SE +/- 0.05, N = 3)
  Run 3: 151.78 (SE +/- 0.12, N = 3)
oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 11.23 (SE +/- 0.00, N = 3; MIN: 11.2)
  Run 3: 11.23 (SE +/- 0.01, N = 3; MIN: 11.19)
  Run 1: 11.25 (SE +/- 0.01, N = 3; MIN: 11.19)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenVINO 2021.1, Model: Face Detection 0106 FP16 - Device: CPU (ms, Fewer Is Better)
  Run 2: 3422.79 (SE +/- 2.63, N = 3)
  Run 3: 3426.05 (SE +/- 5.13, N = 3)
  Run 1: 3426.15 (SE +/- 2.94, N = 3)
ASTC Encoder Preset: Exhaustive OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.0 Preset: Exhaustive 2 1 3 160 320 480 640 800 SE +/- 0.07, N = 3 SE +/- 0.13, N = 3 SE +/- 0.14, N = 3 737.79 738.37 738.44 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
NCNN Target: Vulkan GPU - Model: yolov4-tiny OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: yolov4-tiny 3 2 1 9 18 27 36 45 SE +/- 0.03, N = 3 SE +/- 0.04, N = 3 SE +/- 0.03, N = 3 38.94 38.96 38.97 MIN: 38.78 / MAX: 41.06 MIN: 38.78 / MAX: 48.94 MIN: 38.82 / MAX: 49.15 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
ASTC Encoder 2.0, Preset: Thorough (Seconds, Fewer Is Better). Run 2: 92.88 (SE +/- 0.02, N = 3). Run 1: 92.92 (SE +/- 0.02, N = 3). Run 3: 92.95 (SE +/- 0.03, N = 3). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
LZ4 Compression 1.9.3, Compression Level: 3 - Decompression Speed (MB/s, More Is Better). Run 1: 9058.0 (SE +/- 9.10, N = 3). Run 3: 9057.3 (SE +/- 2.77, N = 3). Run 2: 9051.3 (SE +/- 18.24, N = 3). (CC) gcc options: -O3
NCNN 20201218, Target: CPU - Model: mobilenet (ms, Fewer Is Better). Run 1: 28.15 (SE +/- 0.02, N = 3; MIN: 28.03 / MAX: 30.22). Run 2: 28.15 (SE +/- 0.01, N = 3; MIN: 28.04 / MAX: 30.44). Run 3: 28.17 (SE +/- 0.04, N = 3; MIN: 28.04 / MAX: 38.21). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenVINO 2021.1, Model: Person Detection 0106 FP16 - Device: CPU (ms, Fewer Is Better). Run 2: 5638.32 (SE +/- 4.42, N = 3). Run 1: 5641.28 (SE +/- 3.62, N = 3). Run 3: 5642.32 (SE +/- 3.88, N = 3).
ASTC Encoder 2.0, Preset: Medium (Seconds, Fewer Is Better). Run 1: 14.28 (SE +/- 0.01, N = 3). Run 2: 14.29 (SE +/- 0.01, N = 3). Run 3: 14.29 (SE +/- 0.01, N = 3). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better). Run 1: 7.50114 (SE +/- 0.00777, N = 3; MIN: 7.46). Run 3: 7.50482 (SE +/- 0.00679, N = 3; MIN: 7.46). Run 2: 7.50639 (SE +/- 0.00589, N = 3; MIN: 7.46). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better). Run 3: 87.23 (SE +/- 0.01, N = 3; MIN: 87.02 / MAX: 97.07). Run 2: 87.24 (SE +/- 0.09, N = 3; MIN: 86.94 / MAX: 96.82). Run 1: 87.28 (SE +/- 0.01, N = 3; MIN: 87.01 / MAX: 97.33). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better). Run 2: 38.97 (SE +/- 0.01, N = 3; MIN: 38.81 / MAX: 49.04). Run 3: 38.97 (SE +/- 0.02, N = 3; MIN: 38.81 / MAX: 41.21). Run 1: 38.99 (SE +/- 0.06, N = 3; MIN: 38.79 / MAX: 49.09). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better). Run 1: 22.52 (SE +/- 0.03, N = 3; MIN: 22.4 / MAX: 23.98). Run 2: 22.52 (SE +/- 0.02, N = 3; MIN: 22.41 / MAX: 23.18). Run 3: 22.53 (SE +/- 0.01, N = 3; MIN: 22.46 / MAX: 23.24). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: CPU - Model: resnet18 (ms, Fewer Is Better). Run 1: 22.55 (SE +/- 0.01, N = 3; MIN: 22.49 / MAX: 23.15). Run 2: 22.56 (SE +/- 0.02, N = 3; MIN: 22.46 / MAX: 23.21). Run 3: 22.56 (SE +/- 0.02, N = 3; MIN: 22.46 / MAX: 23.29). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Opus Codec Encoding 1.3.1, WAV To Opus Encode (Seconds, Fewer Is Better). Run 1: 10.65 (SE +/- 0.02, N = 5). Run 2: 10.65 (SE +/- 0.02, N = 5). Run 3: 10.65 (SE +/- 0.02, N = 5). (CXX) g++ options: -fvisibility=hidden -logg -lm
Timed HMMer Search 3.3.1, Pfam Database Search (Seconds, Fewer Is Better). Run 3: 126.15 (SE +/- 0.06, N = 3). Run 1: 126.16 (SE +/- 0.02, N = 3). Run 2: 126.20 (SE +/- 0.04, N = 3). (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
NCNN 20201218, Target: CPU - Model: vgg16 (ms, Fewer Is Better). Run 2: 87.32 (SE +/- 0.09, N = 3; MIN: 87 / MAX: 97.85). Run 3: 87.34 (SE +/- 0.11, N = 3; MIN: 87.03 / MAX: 96.31). Run 1: 87.35 (SE +/- 0.03, N = 3; MIN: 87.09 / MAX: 96.91). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Run 1: 6759.55 (SE +/- 1.61, N = 3; MIN: 6754.19). Run 2: 6761.68 (SE +/- 7.18, N = 3; MIN: 6742.39). Run 3: 6761.79 (SE +/- 0.71, N = 3; MIN: 6755.52). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OpenVINO 2021.1, Model: Face Detection 0106 FP32 - Device: CPU (ms, Fewer Is Better). Run 2: 3436.56 (SE +/- 6.17, N = 3). Run 1: 3436.65 (SE +/- 2.49, N = 3). Run 3: 3437.37 (SE +/- 9.57, N = 3).
OpenVINO 2021.1, Model: Person Detection 0106 FP16 - Device: CPU (FPS, More Is Better). Run 3: 0.71 (SE +/- 0.00, N = 3). Run 2: 0.71 (SE +/- 0.00, N = 3). Run 1: 0.71 (SE +/- 0.00, N = 3).
OpenVINO 2021.1, Model: Face Detection 0106 FP32 - Device: CPU (FPS, More Is Better). Run 3: 1.15 (SE +/- 0.00, N = 3). Run 2: 1.15 (SE +/- 0.00, N = 3). Run 1: 1.15 (SE +/- 0.00, N = 3).
NCNN 20201218, Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better). Run 1: 2.36 (SE +/- 0.00, N = 3; MIN: 2.34 / MAX: 3.09). Run 2: 2.36 (SE +/- 0.00, N = 3; MIN: 2.35 / MAX: 2.57). Run 3: 2.36 (SE +/- 0.00, N = 3; MIN: 2.34 / MAX: 2.49). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218, Target: CPU - Model: blazeface (ms, Fewer Is Better). Run 1: 2.36 (SE +/- 0.01, N = 3; MIN: 2.34 / MAX: 2.49). Run 2: 2.36 (SE +/- 0.00, N = 3; MIN: 2.34 / MAX: 2.38). Run 3: 2.36 (SE +/- 0.00, N = 3; MIN: 2.34 / MAX: 2.47). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
rav1e 0.4 Alpha, Speed: 6 (Frames Per Second, More Is Better). Run 3: 1.308 (SE +/- 0.001, N = 3). Run 2: 1.308 (SE +/- 0.001, N = 3). Run 1: 1.308 (SE +/- 0.001, N = 3).
simdjson 0.7.1, Throughput Test: DistinctUserID (GB/s, More Is Better). Run 3: 0.54 (SE +/- 0.00, N = 3). Run 2: 0.54 (SE +/- 0.00, N = 3). Run 1: 0.54 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -pthread
simdjson 0.7.1, Throughput Test: PartialTweets (GB/s, More Is Better). Run 3: 0.52 (SE +/- 0.00, N = 3). Run 2: 0.52 (SE +/- 0.00, N = 3). Run 1: 0.52 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -pthread
simdjson 0.7.1, Throughput Test: LargeRandom (GB/s, More Is Better). Run 3: 0.35 (SE +/- 0.00, N = 3). Run 2: 0.35 (SE +/- 0.00, N = 3). Run 1: 0.35 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -pthread
CLOMP 1.2, Static OMP Speedup (Speedup, More Is Better). Run 3: 1.7 (SE +/- 0.00, N = 3). Run 2: 1.7 (SE +/- 0.00, N = 3). Run 1: 1.7 (SE +/- 0.00, N = 3). (CC) gcc options: -fopenmp -O3 -lm
Betsy GPU Compressor 1.1 Beta, Codec: ETC2 RGB - Quality: Highest (Seconds, Fewer Is Better). Run 1: 7.541. (CXX) g++ options: -O3 -O2 -lpthread -ldl
Coremark 1.0, CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better). Run 2: 100468.35 (SE +/- 1569.03, N = 15). Run 1: 98117.51 (SE +/- 1397.84, N = 15). Run 3: 92000.14 (SE +/- 971.22, N = 8). (CC) gcc options: -O2 -lrt" -lrt
Betsy GPU Compressor 1.1 Beta, Codec: ETC1 - Quality: Highest (Seconds, Fewer Is Better). Run 3: 10.37 (SE +/- 0.63, N = 12). Run 1: 10.91 (SE +/- 0.78, N = 12). Run 2: 11.08 (SE +/- 0.08, N = 15). (CXX) g++ options: -O3 -O2 -lpthread -ldl
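Most of the results above differ between runs by far less than their reported standard errors, while Coremark shows a visibly larger spread. The following Python sketch is one rough way to quantify that, using only the Coremark means and SE values listed above. The function names are hypothetical, and combining the two standard errors in quadrature assumes the runs are independent; it is a quick screen, not a formal significance test and not something the Phoronix Test Suite itself reports.

```python
import math

# Coremark 1.0 values copied from the result above:
# run identifier -> (mean Iterations/Sec, reported standard error)
coremark = {
    "2": (100468.35, 1569.03),
    "1": (98117.51, 1397.84),
    "3": (92000.14, 971.22),
}


def relative_gap_percent(higher, lower):
    """Gap between two means, as a percentage of the lower mean."""
    return (higher - lower) / lower * 100.0


def gap_in_combined_se(a, b):
    """Absolute difference of two means, in units of their combined SE.

    Assumption: the runs are independent, so the standard errors are
    combined in quadrature (square root of the sum of squares).
    """
    (mean_a, se_a), (mean_b, se_b) = a, b
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


if __name__ == "__main__":
    best, worst = coremark["2"], coremark["3"]
    print(f"Run 2 vs run 3 gap: {relative_gap_percent(best[0], worst[0]):.1f}%")
    print(f"Run 2 vs run 3, in combined SE: {gap_in_combined_se(best, worst):.1f}")
    print(f"Run 2 vs run 1, in combined SE: {gap_in_combined_se(best, coremark['1']):.1f}")
```

A gap of only one or two combined standard errors, as between runs 2 and 1, is hard to distinguish from run-to-run noise; a gap of several, as between runs 2 and 3, suggests something differed during that run.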
Phoronix Test Suite v10.8.4