Core i3 8100 December

Intel Core i3-8100 testing with an ASRock Z370M-ITX/ac (P4.10 BIOS) and Intel 8th Gen Core Gaussian Mixture Model 3GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2012270-HA-COREI381005 .
System details (runs 1, 2, 3)

Processor: Intel Core i3-8100 @ 3.60GHz (4 Cores)
Motherboard: ASRock Z370M-ITX/ac (P4.10 BIOS)
Chipset: Intel 8th Gen Core 4-core Desktop
Memory: 8GB
Disk: 60GB DREVO X1 SSD
Graphics: Intel 8th Gen Core Gaussian Mixture Model 3GB (1100MHz)
Audio: Realtek ALC892
Monitor: VA2431
Network: Intel I219-V + Intel I211 + Intel Dual Band-AC 3168NGW
OS: Ubuntu 20.04
Kernel: 5.9.0-050900rc1daily20200819-generic (x86_64) 20200818
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
OpenGL: 4.6 Mesa 20.0.8
Vulkan: 1.2.131
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080

Compiler Details
--build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
Scaling Governor: intel_pstate powersave
CPU Microcode: 0xde
Thermald 1.9.1

Python Details
Python 3.8.5

Security Details
itlb_multihit: KVM: Mitigation of VMX disabled
l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT disabled
mds: Mitigation of Clear buffers; SMT disabled
meltdown: Mitigation of PTI
spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: disabled RSB filling
srbds: Mitigation of Microcode
tsx_async_abort: Not affected
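The security details above use the same vulnerability names that Linux exposes as files under /sys/devices/system/cpu/vulnerabilities/. As a minimal sketch for checking another machine against this list, the following reads that directory. The function name read_mitigations is hypothetical and not part of the Phoronix Test Suite, and the raw kernel strings (for example "Mitigation: PTI") are worded slightly differently from the export.

```python
from pathlib import Path

# Standard sysfs location on Linux kernels that export mitigation status.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_mitigations(vuln_dir=VULN_DIR):
    """Return {vulnerability_name: status_string} for every file in vuln_dir.

    Hypothetical helper for illustration; returns an empty dict when the
    directory does not exist (non-Linux systems, very old kernels).
    """
    vuln_dir = Path(vuln_dir)
    if not vuln_dir.is_dir():
        return {}
    status = {}
    for entry in sorted(vuln_dir.iterdir()):
        if entry.is_file():
            status[entry.name] = entry.read_text().strip()
    return status


if __name__ == "__main__":
    for name, text in read_mitigations().items():
        print(f"{name}: {text}")
```

The directory is passed as a parameter so the helper can also be pointed at a copied snapshot of the files.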
Test | 1 | 2 | 3
vkfft | 1367 | 1369 | 1368
betsy: ETC1 - Highest | 10.906 | 11.082 | 10.370
betsy: ETC2 RGB - Highest | 7.541 | |
vkresample: 2x - Double | 1013.827 | 1010.496 | 1011.005
vkresample: 2x - Single | 478.833 | 422.227 | 421.635
vkmark: 1280 x 1024 | 977 | 969 | 964
vkmark: 1920 x 1080 | 657 | 658 | 658
clomp: Static OMP Speedup | 1.7 | 1.7 | 1.7
hmmer: Pfam Database Search | 126.157 | 126.198 | 126.154
mafft: Multiple Sequence Alignment - LSU RNA | 11.939 | 11.922 | 11.997
simdjson: Kostya | 0.57 | 0.58 | 0.58
simdjson: LargeRand | 0.35 | 0.35 | 0.35
simdjson: PartialTweets | 0.52 | 0.52 | 0.52
simdjson: DistinctUserID | 0.54 | 0.54 | 0.54
compress-lz4: 1 - Compression Speed | 7608.28 | 7630.56 | 7612.81
compress-lz4: 1 - Decompression Speed | 9201.5 | 9212.4 | 9216.9
compress-lz4: 3 - Compression Speed | 40.97 | 41.02 | 40.99
compress-lz4: 3 - Decompression Speed | 9058.0 | 9051.3 | 9057.3
compress-lz4: 9 - Compression Speed | 40.08 | 39.76 | 39.98
compress-lz4: 9 - Decompression Speed | 9077.6 | 9053.2 | 9066.9
onednn: IP Shapes 1D - f32 - CPU | 8.47060 | 8.47988 | 8.46603
onednn: IP Shapes 3D - f32 - CPU | 9.06142 | 9.52884 | 9.18889
onednn: IP Shapes 1D - u8s8f32 - CPU | 5.58876 | 5.58671 | 5.57958
onednn: IP Shapes 3D - u8s8f32 - CPU | 3.59647 | 3.63768 | 3.59274
onednn: Convolution Batch Shapes Auto - f32 - CPU | 18.5346 | 18.7125 | 18.5430
onednn: Deconvolution Batch shapes_1d - f32 - CPU | 11.5173 | 11.4995 | 11.4730
onednn: Deconvolution Batch shapes_3d - f32 - CPU | 14.4610 | 14.3583 | 14.4145
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU | 17.7086 | 17.8085 | 17.7601
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU | 14.5548 | 14.5390 | 14.5402
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU | 11.2454 | 11.2340 | 11.2343
onednn: Recurrent Neural Network Training - f32 - CPU | 6759.55 | 6761.68 | 6761.79
onednn: Recurrent Neural Network Inference - f32 - CPU | 3873.75 | 3860.52 | 3874.23
onednn: Recurrent Neural Network Training - u8s8f32 - CPU | 6758.25 | 6759.25 | 6767.39
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU | 3855.79 | 3858.99 | 3852.80
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 4.71186 | 4.71931 | 4.71943
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU | 6758.84 | 6804.34 | 6764.33
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU | 3856.20 | 3860.15 | 3855.49
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU | 7.50114 | 7.50639 | 7.50482
rav1e: 1 | 0.344 | 0.344 | 0.343
rav1e: 5 | 0.984 | 0.984 | 0.983
rav1e: 6 | 1.308 | 1.308 | 1.308
rav1e: 10 | 2.890 | 2.887 | 2.889
coremark: CoreMark Size 666 - Iterations Per Second | 98117.509075 | 100468.346119 | 92000.140543
stockfish: Total Time | 6456585 | 6551982 | 6451727
asmfish: 1024 Hash Memory, 26 Depth | 8988021 | 9008380 | 8912458
build-ffmpeg: Time To Compile | 151.630 | 151.678 | 151.784
build2: Time To Compile | 338.864 | 337.988 | 337.576
build-eigen: Time To Compile | 93.508 | 93.150 | 93.090
encode-ape: WAV To APE | 13.624 | 13.704 | 13.643
encode-opus: WAV To Opus Encode | 10.648 | 10.648 | 10.652
node-web-tooling | 10.17 | 10.39 | 10.29
astcenc: Fast | 5.76 | 5.75 | 5.75
astcenc: Medium | 14.28 | 14.29 | 14.29
astcenc: Thorough | 92.92 | 92.88 | 92.95
astcenc: Exhaustive | 738.37 | 737.79 | 738.44
sqlite-speedtest: Timed Time - Size 1,000 | 79.432 | 79.660 | 78.924
ncnn: CPU - mobilenet | 28.15 | 28.15 | 28.17
ncnn: CPU-v2-v2 - mobilenet-v2 | 7.18 | 7.14 | 7.13
ncnn: CPU-v3-v3 - mobilenet-v3 | 6.24 | 6.25 | 6.24
ncnn: CPU - shufflenet-v2 | 9.34 | 9.34 | 9.36
ncnn: CPU - mnasnet | 6.55 | 6.51 | 6.53
ncnn: CPU - efficientnet-b0 | 10.39 | 10.36 | 10.37
ncnn: CPU - blazeface | 2.36 | 2.36 | 2.36
ncnn: CPU - googlenet | 21.69 | 21.75 | 21.74
ncnn: CPU - vgg16 | 87.35 | 87.32 | 87.34
ncnn: CPU - resnet18 | 22.55 | 22.56 | 22.56
ncnn: CPU - alexnet | 19.32 | 19.34 | 19.33
ncnn: CPU - resnet50 | 47.83 | 47.90 | 47.88
ncnn: CPU - yolov4-tiny | 38.99 | 38.97 | 38.97
ncnn: CPU - squeezenet_ssd | 40.40 | 40.46 | 40.45
ncnn: CPU - regnety_400m | 14.55 | 14.50 | 14.55
ncnn: Vulkan GPU - mobilenet | 28.16 | 28.14 | 28.13
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 | 7.16 | 7.15 | 7.14
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 | 6.26 | 6.26 | 6.25
ncnn: Vulkan GPU - shufflenet-v2 | 9.36 | 9.37 | 9.34
ncnn: Vulkan GPU - mnasnet | 6.54 | 6.54 | 6.53
ncnn: Vulkan GPU - efficientnet-b0 | 10.37 | 10.37 | 10.39
ncnn: Vulkan GPU - blazeface | 2.36 | 2.36 | 2.36
ncnn: Vulkan GPU - googlenet | 21.68 | 21.72 | 21.69
ncnn: Vulkan GPU - vgg16 | 87.28 | 87.24 | 87.23
ncnn: Vulkan GPU - resnet18 | 22.52 | 22.52 | 22.53
ncnn: Vulkan GPU - alexnet | 19.28 | 19.31 | 19.30
ncnn: Vulkan GPU - resnet50 | 47.84 | 47.79 | 47.93
ncnn: Vulkan GPU - yolov4-tiny | 38.97 | 38.96 | 38.94
ncnn: Vulkan GPU - squeezenet_ssd | 40.52 | 40.46 | 40.44
ncnn: Vulkan GPU - regnety_400m | 14.55 | 14.53 | 14.56
openvino: Face Detection 0106 FP16 - CPU | 1.16 | 1.17 | 1.16
openvino: Face Detection 0106 FP16 - CPU | 3426.15 | 3422.79 | 3426.05
openvino: Face Detection 0106 FP32 - CPU | 1.15 | 1.15 | 1.15
openvino: Face Detection 0106 FP32 - CPU | 3436.65 | 3436.56 | 3437.37
openvino: Person Detection 0106 FP16 - CPU | 0.71 | 0.71 | 0.71
openvino: Person Detection 0106 FP16 - CPU | 5641.28 | 5638.32 | 5642.32
openvino: Person Detection 0106 FP32 - CPU | 0.70 | 0.71 | 0.70
openvino: Person Detection 0106 FP32 - CPU | 5648.20 | 5638.59 | 5650.56
openvino: Age Gender Recognition Retail 0013 FP16 - CPU | 2549.49 | 2516.51 | 2562.17
openvino: Age Gender Recognition Retail 0013 FP16 - CPU | 1.45 | 1.45 | 1.46
openvino: Age Gender Recognition Retail 0013 FP32 - CPU | 2543.51 | 2551.41 | 2552.27
openvino: Age Gender Recognition Retail 0013 FP32 - CPU | 1.46 | 1.45 | 1.45
encode-wavpack: WAV To WavPack | 17.965 | 17.942 | 17.945
unpack-firefox: firefox-84.0.source.tar.xz | 23.256 | 23.393 | 23.669
brl-cad: VGR Performance Metric | 36580 | 36247 | 36569
VkFFT 1.1.1 (Benchmark Score, More Is Better)
  1: 1367 (SE +/- 2.33, N = 3)
  2: 1369 (SE +/- 1.20, N = 3)
  3: 1368 (SE +/- 1.00, N = 3)
  Compiler: (CXX) g++ options: -O3 -pthread

Betsy GPU Compressor 1.1 Beta - Codec: ETC1 - Quality: Highest (Seconds, Fewer Is Better)
  1: 10.91 (SE +/- 0.78, N = 12)
  2: 11.08 (SE +/- 0.08, N = 15)
  3: 10.37 (SE +/- 0.63, N = 12)
  Compiler: (CXX) g++ options: -O3 -O2 -lpthread -ldl

Betsy GPU Compressor 1.1 Beta - Codec: ETC2 RGB - Quality: Highest (Seconds, Fewer Is Better)
  1: 7.541
  Compiler: (CXX) g++ options: -O3 -O2 -lpthread -ldl

VkResample 1.0 - Upscale: 2x - Precision: Double (ms, Fewer Is Better)
  1: 1013.83 (SE +/- 1.41, N = 3)
  2: 1010.50 (SE +/- 2.85, N = 3)
  3: 1011.01 (SE +/- 4.21, N = 3)
  Compiler: (CXX) g++ options: -O3 -pthread

VkResample 1.0 - Upscale: 2x - Precision: Single (ms, Fewer Is Better)
  1: 478.83 (SE +/- 0.46, N = 3)
  2: 422.23 (SE +/- 0.41, N = 3)
  3: 421.64 (SE +/- 0.49, N = 3)
  Compiler: (CXX) g++ options: -O3 -pthread
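The single-precision VkResample result is one of the few tests where run 1 (478.83 ms) differs visibly from runs 2 and 3 (422.23 ms and 421.64 ms). A minimal sketch of how the relative gap could be computed from the numbers listed above follows. The helper name percent_change is hypothetical, and choosing run 1 as the baseline is an assumption made for illustration only.

```python
def percent_change(baseline, value):
    """Signed change of value relative to baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (value - baseline) / baseline * 100.0


# VkResample 2x single-precision times in ms, as listed above (fewer is better).
single_ms = {1: 478.83, 2: 422.23, 3: 421.64}

# Assumption: run 1 is used as the baseline.
changes = {run: percent_change(single_ms[1], ms) for run, ms in single_ms.items()}

for run, pct in changes.items():
    print(f"run {run}: {pct:+.2f}% vs run 1")
```

Because the unit is milliseconds and fewer is better, a negative percentage here means the run finished faster than run 1.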
VKMark 2020-05-21 - Resolution: 1280 x 1024 (VKMark Score, More Is Better)
  1: 977
  2: 969
  3: 964
  SE +/- 1.00, N = 3; SE +/- 3.93, N = 3 (the export lists only two SE entries for the three runs)
  Compiler: (CXX) g++ options: -pthread -ldl -pipe -std=c++14 -MD -MQ -MF

VKMark 2020-05-21 - Resolution: 1920 x 1080 (VKMark Score, More Is Better)
  1: 657
  2: 658
  3: 658
  SE +/- 1.76, N = 3; SE +/- 0.58, N = 3 (the export lists only two SE entries for the three runs)
  Compiler: (CXX) g++ options: -pthread -ldl -pipe -std=c++14 -MD -MQ -MF
CLOMP 1.2 - Static OMP Speedup (Speedup, More Is Better)
  1: 1.7 (SE +/- 0.00, N = 3)
  2: 1.7 (SE +/- 0.00, N = 3)
  3: 1.7 (SE +/- 0.00, N = 3)
  Compiler: (CC) gcc options: -fopenmp -O3 -lm

Timed HMMer Search 3.3.1 - Pfam Database Search (Seconds, Fewer Is Better)
  1: 126.16 (SE +/- 0.02, N = 3)
  2: 126.20 (SE +/- 0.04, N = 3)
  3: 126.15 (SE +/- 0.06, N = 3)
  Compiler: (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm

Timed MAFFT Alignment 7.471 - Multiple Sequence Alignment - LSU RNA (Seconds, Fewer Is Better)
  1: 11.94 (SE +/- 0.04, N = 3)
  2: 11.92 (SE +/- 0.04, N = 3)
  3: 12.00 (SE +/- 0.07, N = 3)
  Compiler: (CC) gcc options: -std=c99 -O3 -lm -lpthread
simdjson 0.7.1 (GB/s, More Is Better)
Compiler: (CXX) g++ options: -O3 -pthread

Throughput Test: Kostya
  1: 0.57 (SE +/- 0.00, N = 3)
  2: 0.58 (SE +/- 0.00, N = 3)
  3: 0.58 (SE +/- 0.00, N = 3)

Throughput Test: LargeRandom
  1: 0.35 (SE +/- 0.00, N = 3)
  2: 0.35 (SE +/- 0.00, N = 3)
  3: 0.35 (SE +/- 0.00, N = 3)

Throughput Test: PartialTweets
  1: 0.52 (SE +/- 0.00, N = 3)
  2: 0.52 (SE +/- 0.00, N = 3)
  3: 0.52 (SE +/- 0.00, N = 3)

Throughput Test: DistinctUserID
  1: 0.54 (SE +/- 0.00, N = 3)
  2: 0.54 (SE +/- 0.00, N = 3)
  3: 0.54 (SE +/- 0.00, N = 3)
LZ4 Compression 1.9.3 (MB/s, More Is Better)
Compiler: (CC) gcc options: -O3

Compression Level: 1 - Compression Speed
  1: 7608.28 (SE +/- 4.40, N = 3)
  2: 7630.56 (SE +/- 19.00, N = 3)
  3: 7612.81 (SE +/- 7.93, N = 3)

Compression Level: 1 - Decompression Speed
  1: 9201.5 (SE +/- 38.89, N = 3)
  2: 9212.4 (SE +/- 38.65, N = 3)
  3: 9216.9 (SE +/- 21.43, N = 3)

Compression Level: 3 - Compression Speed
  1: 40.97 (SE +/- 0.08, N = 3)
  2: 41.02 (SE +/- 0.05, N = 3)
  3: 40.99 (SE +/- 0.08, N = 3)

Compression Level: 3 - Decompression Speed
  1: 9058.0 (SE +/- 9.10, N = 3)
  2: 9051.3 (SE +/- 18.24, N = 3)
  3: 9057.3 (SE +/- 2.77, N = 3)

Compression Level: 9 - Compression Speed
  1: 40.08 (SE +/- 0.06, N = 3)
  2: 39.76 (SE +/- 0.21, N = 3)
  3: 39.98 (SE +/- 0.01, N = 3)

Compression Level: 9 - Decompression Speed
  1: 9077.6 (SE +/- 15.30, N = 3)
  2: 9053.2 (SE +/- 13.38, N = 3)
  3: 9066.9 (SE +/- 14.57, N = 3)
oneDNN 2.0 - Engine: CPU (ms, Fewer Is Better)
Compiler: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Harness: IP Shapes 1D - Data Type: f32
  1: 8.47060 (SE +/- 0.01592, N = 3; MIN: 8.4)
  2: 8.47988 (SE +/- 0.00926, N = 3; MIN: 8.39)
  3: 8.46603 (SE +/- 0.01958, N = 3; MIN: 8.37)

Harness: IP Shapes 3D - Data Type: f32
  1: 9.06142 (SE +/- 0.01406, N = 3; MIN: 8.95)
  2: 9.52884 (SE +/- 0.02685, N = 3; MIN: 9.43)
  3: 9.18889 (SE +/- 0.02174, N = 3; MIN: 9.1)

Harness: IP Shapes 1D - Data Type: u8s8f32
  1: 5.58876 (SE +/- 0.00294, N = 3; MIN: 5.55)
  2: 5.58671 (SE +/- 0.00388, N = 3; MIN: 5.55)
  3: 5.57958 (SE +/- 0.00020, N = 3; MIN: 5.56)

Harness: IP Shapes 3D - Data Type: u8s8f32
  1: 3.59647 (SE +/- 0.01043, N = 3; MIN: 3.39)
  2: 3.63768 (SE +/- 0.03180, N = 3; MIN: 3.18)
  3: 3.59274 (SE +/- 0.01609, N = 3; MIN: 3.34)

Harness: Convolution Batch Shapes Auto - Data Type: f32
  1: 18.53 (SE +/- 0.02, N = 3; MIN: 18.42)
  2: 18.71 (SE +/- 0.03, N = 3; MIN: 18.57)
  3: 18.54 (SE +/- 0.02, N = 3; MIN: 18.43)

Harness: Deconvolution Batch shapes_1d - Data Type: f32
  1: 11.52 (SE +/- 0.03, N = 3; MIN: 11.42)
  2: 11.50 (SE +/- 0.03, N = 3; MIN: 11.4)
  3: 11.47 (SE +/- 0.01, N = 3; MIN: 11.38)

Harness: Deconvolution Batch shapes_3d - Data Type: f32
  1: 14.46 (SE +/- 0.08, N = 3; MIN: 14.22)
  2: 14.36 (SE +/- 0.01, N = 3; MIN: 14.23)
  3: 14.41 (SE +/- 0.04, N = 3; MIN: 14.25)

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32
  1: 17.71 (SE +/- 0.03, N = 3; MIN: 17.57)
  2: 17.81 (SE +/- 0.02, N = 3; MIN: 17.67)
  3: 17.76 (SE +/- 0.01, N = 3; MIN: 17.61)

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32
  1: 14.55 (SE +/- 0.03, N = 3; MIN: 14.46)
  2: 14.54 (SE +/- 0.01, N = 3; MIN: 14.47)
  3: 14.54 (SE +/- 0.03, N = 3; MIN: 14.44)

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32
  1: 11.25 (SE +/- 0.01, N = 3; MIN: 11.19)
  2: 11.23 (SE +/- 0.00, N = 3; MIN: 11.2)
  3: 11.23 (SE +/- 0.01, N = 3; MIN: 11.19)

Harness: Recurrent Neural Network Training - Data Type: f32
  1: 6759.55 (SE +/- 1.61, N = 3; MIN: 6754.19)
  2: 6761.68 (SE +/- 7.18, N = 3; MIN: 6742.39)
  3: 6761.79 (SE +/- 0.71, N = 3; MIN: 6755.52)

Harness: Recurrent Neural Network Inference - Data Type: f32
  1: 3873.75 (SE +/- 20.54, N = 3; MIN: 3848.44)
  2: 3860.52 (SE +/- 0.72, N = 3; MIN: 3856.22)
  3: 3874.23 (SE +/- 18.83, N = 3; MIN: 3849.24)

Harness: Recurrent Neural Network Training - Data Type: u8s8f32
  1: 6758.25 (SE +/- 7.88, N = 3; MIN: 6740.42)
  2: 6759.25 (SE +/- 2.23, N = 3; MIN: 6750.8)
  3: 6767.39 (SE +/- 3.49, N = 3; MIN: 6755.1)

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32
  1: 3855.79 (SE +/- 5.10, N = 3; MIN: 3845.17)
  2: 3858.99 (SE +/- 1.86, N = 3; MIN: 3851.44)
  3: 3852.80 (SE +/- 1.54, N = 3; MIN: 3847.92)

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32
  1: 4.71186 (SE +/- 0.00413, N = 3; MIN: 4.66)
  2: 4.71931 (SE +/- 0.00367, N = 3; MIN: 4.66)
  3: 4.71943 (SE +/- 0.00063, N = 3; MIN: 4.66)

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16
  1: 6758.84 (SE +/- 7.71, N = 3; MIN: 6740.34)
  2: 6804.34 (SE +/- 37.00, N = 3; MIN: 6761.06)
  3: 6764.33 (SE +/- 4.92, N = 3; MIN: 6750.92)

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16
  1: 3856.20 (SE +/- 1.72, N = 3; MIN: 3850.81)
  2: 3860.15 (SE +/- 1.45, N = 3; MIN: 3854.62)
  3: 3855.49 (SE +/- 2.63, N = 3; MIN: 3847.91)

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32
  1: 7.50114 (SE +/- 0.00777, N = 3; MIN: 7.46)
  2: 7.50639 (SE +/- 0.00589, N = 3; MIN: 7.46)
  3: 7.50482 (SE +/- 0.00679, N = 3; MIN: 7.46)
rav1e 0.4 Alpha (Frames Per Second, More Is Better)

Speed: 1
  1: 0.344 (SE +/- 0.000, N = 3)
  2: 0.344 (SE +/- 0.001, N = 3)
  3: 0.343 (SE +/- 0.001, N = 3)

Speed: 5
  1: 0.984 (SE +/- 0.001, N = 3)
  2: 0.984 (SE +/- 0.002, N = 3)
  3: 0.983 (SE +/- 0.002, N = 3)

Speed: 6
  1: 1.308 (SE +/- 0.001, N = 3)
  2: 1.308 (SE +/- 0.001, N = 3)
  3: 1.308 (SE +/- 0.001, N = 3)

Speed: 10
  1: 2.890 (SE +/- 0.004, N = 3)
  2: 2.887 (SE +/- 0.006, N = 3)
  3: 2.889 (SE +/- 0.006, N = 3)
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
  1: 98117.51 (SE +/- 1397.84, N = 15)
  2: 100468.35 (SE +/- 1569.03, N = 15)
  3: 92000.14 (SE +/- 971.22, N = 8)
  Compiler: (CC) gcc options: -O2 -lrt" -lrt
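The CoreMark runs report standard errors over different sample counts (N = 15, 15 and 8), so the raw SE figures are not directly comparable. The sketch below, under stated assumptions, converts each SE back to a sample standard deviation via SD = SE * sqrt(N) and expresses the gap between two runs in units of their combined standard error. Both helper names are hypothetical. The gap measure is a rough heuristic that treats the runs as independent; it is not a formal significance test and is not something the Phoronix Test Suite reports.

```python
import math


def sd_from_se(se, n):
    """Recover the sample standard deviation from a standard error and sample count."""
    return se * math.sqrt(n)


def se_gap(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means in units of their combined standard error.

    Heuristic only: assumes the two runs are independent.
    """
    return abs(mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# CoreMark results as listed above: run -> (iterations/sec, SE, N).
coremark = {
    1: (98117.51, 1397.84, 15),
    2: (100468.35, 1569.03, 15),
    3: (92000.14, 971.22, 8),
}

for run, (mean, se, n) in coremark.items():
    sd = sd_from_se(se, n)
    print(f"run {run}: SD ~ {sd:.0f} iterations/sec ({sd / mean * 100:.1f}% of the mean)")

gap_1_2 = se_gap(coremark[1][0], coremark[1][1], coremark[2][0], coremark[2][1])
gap_2_3 = se_gap(coremark[2][0], coremark[2][1], coremark[3][0], coremark[3][1])
print(f"run 1 vs run 2: {gap_1_2:.2f} combined SEs apart")
print(f"run 2 vs run 3: {gap_2_3:.2f} combined SEs apart")
```

A gap of around one combined SE is what run-to-run noise alone would be expected to produce, while a gap several times larger suggests that something other than noise differed between the runs.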
Stockfish 12 - Total Time (Nodes Per Second, More Is Better)
  1: 6456585 (SE +/- 59897.76, N = 3)
  2: 6551982 (SE +/- 39431.93, N = 3)
  3: 6451727 (SE +/- 73617.81, N = 3)
  Compiler: (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver

asmFish 2018-07-23 - 1024 Hash Memory, 26 Depth (Nodes/second, More Is Better)
  1: 8988021 (SE +/- 98059.95, N = 3)
  2: 9008380 (SE +/- 71523.89, N = 3)
  3: 8912458 (SE +/- 46897.59, N = 3)
Timed FFmpeg Compilation 4.2.2 - Time To Compile (Seconds, Fewer Is Better)
  1: 151.63 (SE +/- 0.07, N = 3)
  2: 151.68 (SE +/- 0.05, N = 3)
  3: 151.78 (SE +/- 0.12, N = 3)

Build2 0.13 - Time To Compile (Seconds, Fewer Is Better)
  1: 338.86 (SE +/- 0.66, N = 3)
  2: 337.99 (SE +/- 0.28, N = 3)
  3: 337.58 (SE +/- 0.64, N = 3)

Timed Eigen Compilation 3.3.9 - Time To Compile (Seconds, Fewer Is Better)
  1: 93.51 (SE +/- 0.06, N = 3)
  2: 93.15 (SE +/- 0.07, N = 3)
  3: 93.09 (SE +/- 0.06, N = 3)
Monkey Audio Encoding 3.99.6 - WAV To APE (Seconds, Fewer Is Better)
  1: 13.62 (SE +/- 0.03, N = 5)
  2: 13.70 (SE +/- 0.06, N = 5)
  3: 13.64 (SE +/- 0.03, N = 5)
  Compiler: (CXX) g++ options: -O3 -pedantic -rdynamic -lrt

Opus Codec Encoding 1.3.1 - WAV To Opus Encode (Seconds, Fewer Is Better)
  1: 10.65 (SE +/- 0.02, N = 5)
  2: 10.65 (SE +/- 0.02, N = 5)
  3: 10.65 (SE +/- 0.02, N = 5)
  Compiler: (CXX) g++ options: -fvisibility=hidden -logg -lm

Node.js V8 Web Tooling Benchmark (runs/s, More Is Better)
  1: 10.17 (SE +/- 0.06, N = 3)
  2: 10.39 (SE +/- 0.00, N = 3)
  3: 10.29 (SE +/- 0.06, N = 3)
  Nodejs v10.19.0
ASTC Encoder 2.0 (Seconds, Fewer Is Better)
Compiler: (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread

Preset: Fast
  1: 5.76 (SE +/- 0.01, N = 3)
  2: 5.75 (SE +/- 0.01, N = 3)
  3: 5.75 (SE +/- 0.01, N = 3)

Preset: Medium
  1: 14.28 (SE +/- 0.01, N = 3)
  2: 14.29 (SE +/- 0.01, N = 3)
  3: 14.29 (SE +/- 0.01, N = 3)

Preset: Thorough
  1: 92.92 (SE +/- 0.02, N = 3)
  2: 92.88 (SE +/- 0.02, N = 3)
  3: 92.95 (SE +/- 0.03, N = 3)

Preset: Exhaustive
  1: 738.37 (SE +/- 0.13, N = 3)
  2: 737.79 (SE +/- 0.07, N = 3)
  3: 738.44 (SE +/- 0.14, N = 3)
SQLite Speedtest 3.30 - Timed Time - Size 1,000 (Seconds, Fewer Is Better)
  1: 79.43 (SE +/- 0.03, N = 3)
  2: 79.66 (SE +/- 0.34, N = 3)
  3: 78.92 (SE +/- 0.10, N = 3)
  Compiler: (CC) gcc options: -O2 -ldl -lz -lpthread
NCNN 20201218 - CPU targets (ms, Fewer Is Better)
Compiler: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Target: CPU - Model: mobilenet
  1: 28.15 (SE +/- 0.02, N = 3; MIN: 28.03 / MAX: 30.22)
  2: 28.15 (SE +/- 0.01, N = 3; MIN: 28.04 / MAX: 30.44)
  3: 28.17 (SE +/- 0.04, N = 3; MIN: 28.04 / MAX: 38.21)

Target: CPU-v2-v2 - Model: mobilenet-v2
  1: 7.18 (SE +/- 0.05, N = 3; MIN: 7.07 / MAX: 29.75)
  2: 7.14 (SE +/- 0.01, N = 3; MIN: 7.07 / MAX: 8.36)
  3: 7.13 (SE +/- 0.01, N = 3; MIN: 7.07 / MAX: 8.56)

Target: CPU-v3-v3 - Model: mobilenet-v3
  1: 6.24 (SE +/- 0.01, N = 3; MIN: 6.17 / MAX: 7.41)
  2: 6.25 (SE +/- 0.02, N = 3; MIN: 6.17 / MAX: 15.83)
  3: 6.24 (SE +/- 0.02, N = 3; MIN: 6.17 / MAX: 7.8)

Target: CPU - Model: shufflenet-v2
  1: 9.34 (SE +/- 0.02, N = 3; MIN: 9.28 / MAX: 10.73)
  2: 9.34 (SE +/- 0.01, N = 3; MIN: 9.28 / MAX: 10.73)
  3: 9.36 (SE +/- 0.01, N = 3; MIN: 9.29 / MAX: 11.04)

Target: CPU - Model: mnasnet
  1: 6.55 (SE +/- 0.02, N = 3; MIN: 6.5 / MAX: 16.64)
  2: 6.51 (SE +/- 0.01, N = 3; MIN: 6.47 / MAX: 6.73)
  3: 6.53 (SE +/- 0.01, N = 3; MIN: 6.49 / MAX: 6.66)

Target: CPU - Model: efficientnet-b0
  1: 10.39 (SE +/- 0.00, N = 3; MIN: 10.35 / MAX: 11.79)
  2: 10.36 (SE +/- 0.00, N = 3; MIN: 10.33 / MAX: 10.6)
  3: 10.37 (SE +/- 0.01, N = 3; MIN: 10.32 / MAX: 10.49)

Target: CPU - Model: blazeface
  1: 2.36 (SE +/- 0.01, N = 3; MIN: 2.34 / MAX: 2.49)
  2: 2.36 (SE +/- 0.00, N = 3; MIN: 2.34 / MAX: 2.38)
  3: 2.36 (SE +/- 0.00, N = 3; MIN: 2.34 / MAX: 2.47)

Target: CPU - Model: googlenet
  1: 21.69 (SE +/- 0.02, N = 3; MIN: 21.59 / MAX: 29.99)
  2: 21.75 (SE +/- 0.05, N = 3; MIN: 21.6 / MAX: 56.79)
  3: 21.74 (SE +/- 0.06, N = 3; MIN: 21.62 / MAX: 52.65)

Target: CPU - Model: vgg16
  1: 87.35 (SE +/- 0.03, N = 3; MIN: 87.09 / MAX: 96.91)
  2: 87.32 (SE +/- 0.09, N = 3; MIN: 87 / MAX: 97.85)
  3: 87.34 (SE +/- 0.11, N = 3; MIN: 87.03 / MAX: 96.31)

Target: CPU - Model: resnet18
  1: 22.55 (SE +/- 0.01, N = 3; MIN: 22.49 / MAX: 23.15)
  2: 22.56 (SE +/- 0.02, N = 3; MIN: 22.46 / MAX: 23.21)
  3: 22.56 (SE +/- 0.02, N = 3; MIN: 22.46 / MAX: 23.29)

Target: CPU - Model: alexnet
  1: 19.32 (SE +/- 0.01, N = 3; MIN: 19.23 / MAX: 21.26)
  2: 19.34 (SE +/- 0.01, N = 3; MIN: 19.23 / MAX: 21.66)
  3: 19.33 (SE +/- 0.03, N = 3; MIN: 19.23 / MAX: 28.73)

Target: CPU - Model: resnet50
  1: 47.83 (SE +/- 0.01, N = 3; MIN: 47.62 / MAX: 57.56)
  2: 47.90 (SE +/- 0.06, N = 3; MIN: 47.62 / MAX: 88.54)
  3: 47.88 (SE +/- 0.07, N = 3; MIN: 47.6 / MAX: 95.3)

Target: CPU - Model: yolov4-tiny
  1: 38.99 (SE +/- 0.06, N = 3; MIN: 38.79 / MAX: 49.09)
  2: 38.97 (SE +/- 0.01, N = 3; MIN: 38.81 / MAX: 49.04)
  3: 38.97 (SE +/- 0.02, N = 3; MIN: 38.81 / MAX: 41.21)

Target: CPU - Model: squeezenet_ssd
  1: 40.40 (SE +/- 0.03, N = 3; MIN: 40.28 / MAX: 42.66)
  2: 40.46 (SE +/- 0.04, N = 3; MIN: 40.32 / MAX: 57.2)
  3: 40.45 (SE +/- 0.02, N = 3; MIN: 40.35 / MAX: 52.52)

Target: CPU - Model: regnety_400m
  1: 14.55 (SE +/- 0.03, N = 3; MIN: 14.43 / MAX: 15.76)
  2: 14.50 (SE +/- 0.03, N = 3; MIN: 14.4 / MAX: 15.22)
  3: 14.55 (SE +/- 0.02, N = 3; MIN: 14.46 / MAX: 15.19)
NCNN Target: Vulkan GPU - Model: mobilenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: mobilenet 1 2 3 7 14 21 28 35 SE +/- 0.03, N = 3 SE +/- 0.04, N = 3 SE +/- 0.02, N = 3 28.16 28.14 28.13 MIN: 28 / MAX: 30.49 MIN: 27.98 / MAX: 33.11 MIN: 28 / MAX: 30.42 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 1 2 3 2 4 6 8 10 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 7.16 7.15 7.14 MIN: 7.08 / MAX: 8.7 MIN: 7.07 / MAX: 9.43 MIN: 7.08 / MAX: 8.35 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 1 2 3 2 4 6 8 10 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 6.26 6.26 6.25 MIN: 6.18 / MAX: 7.85 MIN: 6.19 / MAX: 7.44 MIN: 6.18 / MAX: 7.7 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: shufflenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: Vulkan GPU - Model: shufflenet-v2 1 2 3 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 9.36 9.37 9.34 MIN: 9.29 / MAX: 10.85 MIN: 9.31 / MAX: 10.64 MIN: 9.29 / MAX: 10.75 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
  1: 6.54 (SE +/- 0.01, N = 3; MIN: 6.5 / MAX: 8.09)
  2: 6.54 (SE +/- 0.00, N = 3; MIN: 6.51 / MAX: 7.17)
  3: 6.53 (SE +/- 0.01, N = 3; MIN: 6.49 / MAX: 7.16)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  1: 10.37 (SE +/- 0.01, N = 3; MIN: 10.32 / MAX: 11.49)
  2: 10.37 (SE +/- 0.00, N = 3; MIN: 10.34 / MAX: 11.79)
  3: 10.39 (SE +/- 0.03, N = 3; MIN: 10.32 / MAX: 20.27)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
  1: 2.36 (SE +/- 0.00, N = 3; MIN: 2.34 / MAX: 3.09)
  2: 2.36 (SE +/- 0.00, N = 3; MIN: 2.35 / MAX: 2.57)
  3: 2.36 (SE +/- 0.00, N = 3; MIN: 2.34 / MAX: 2.49)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
  1: 21.68 (SE +/- 0.02, N = 3; MIN: 21.59 / MAX: 31.15)
  2: 21.72 (SE +/- 0.03, N = 3; MIN: 21.61 / MAX: 31.67)
  3: 21.69 (SE +/- 0.02, N = 3; MIN: 21.59 / MAX: 31.58)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
  1: 87.28 (SE +/- 0.01, N = 3; MIN: 87.01 / MAX: 97.33)
  2: 87.24 (SE +/- 0.09, N = 3; MIN: 86.94 / MAX: 96.82)
  3: 87.23 (SE +/- 0.01, N = 3; MIN: 87.02 / MAX: 97.07)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
  1: 22.52 (SE +/- 0.03, N = 3; MIN: 22.4 / MAX: 23.98)
  2: 22.52 (SE +/- 0.02, N = 3; MIN: 22.41 / MAX: 23.18)
  3: 22.53 (SE +/- 0.01, N = 3; MIN: 22.46 / MAX: 23.24)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
  1: 19.28 (SE +/- 0.01, N = 3; MIN: 19.2 / MAX: 19.63)
  2: 19.31 (SE +/- 0.02, N = 3; MIN: 19.23 / MAX: 21.66)
  3: 19.30 (SE +/- 0.01, N = 3; MIN: 19.2 / MAX: 21.55)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
  1: 47.84 (SE +/- 0.02, N = 3; MIN: 47.59 / MAX: 59.98)
  2: 47.79 (SE +/- 0.03, N = 3; MIN: 47.59 / MAX: 57.75)
  3: 47.93 (SE +/- 0.11, N = 3; MIN: 47.56 / MAX: 120.58)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
  1: 38.97 (SE +/- 0.03, N = 3; MIN: 38.82 / MAX: 49.15)
  2: 38.96 (SE +/- 0.04, N = 3; MIN: 38.78 / MAX: 48.94)
  3: 38.94 (SE +/- 0.03, N = 3; MIN: 38.78 / MAX: 41.06)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  1: 40.52 (SE +/- 0.08, N = 3; MIN: 40.36 / MAX: 90.92)
  2: 40.46 (SE +/- 0.05, N = 3; MIN: 40.33 / MAX: 67.8)
  3: 40.44 (SE +/- 0.05, N = 3; MIN: 40.32 / MAX: 50.6)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
  1: 14.55 (SE +/- 0.02, N = 3; MIN: 14.45 / MAX: 28.03)
  2: 14.53 (SE +/- 0.01, N = 3; MIN: 14.46 / MAX: 15.24)
  3: 14.56 (SE +/- 0.01, N = 3; MIN: 14.49 / MAX: 15.22)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
OpenVINO 2021.1 - Model: Face Detection 0106 FP16 - Device: CPU (FPS, More Is Better)
  1: 1.16 (SE +/- 0.00, N = 3)
  2: 1.17 (SE +/- 0.00, N = 3)
  3: 1.16 (SE +/- 0.00, N = 3)
OpenVINO 2021.1 - Model: Face Detection 0106 FP16 - Device: CPU (ms, Fewer Is Better)
  1: 3426.15 (SE +/- 2.94, N = 3)
  2: 3422.79 (SE +/- 2.63, N = 3)
  3: 3426.05 (SE +/- 5.13, N = 3)
OpenVINO 2021.1 - Model: Face Detection 0106 FP32 - Device: CPU (FPS, More Is Better)
  1: 1.15 (SE +/- 0.00, N = 3)
  2: 1.15 (SE +/- 0.00, N = 3)
  3: 1.15 (SE +/- 0.00, N = 3)
OpenVINO 2021.1 - Model: Face Detection 0106 FP32 - Device: CPU (ms, Fewer Is Better)
  1: 3436.65 (SE +/- 2.49, N = 3)
  2: 3436.56 (SE +/- 6.17, N = 3)
  3: 3437.37 (SE +/- 9.57, N = 3)
OpenVINO 2021.1 - Model: Person Detection 0106 FP16 - Device: CPU (FPS, More Is Better)
  1: 0.71 (SE +/- 0.00, N = 3)
  2: 0.71 (SE +/- 0.00, N = 3)
  3: 0.71 (SE +/- 0.00, N = 3)
OpenVINO 2021.1 - Model: Person Detection 0106 FP16 - Device: CPU (ms, Fewer Is Better)
  1: 5641.28 (SE +/- 3.62, N = 3)
  2: 5638.32 (SE +/- 4.42, N = 3)
  3: 5642.32 (SE +/- 3.88, N = 3)
OpenVINO 2021.1 - Model: Person Detection 0106 FP32 - Device: CPU (FPS, More Is Better)
  1: 0.70 (SE +/- 0.00, N = 3)
  2: 0.71 (SE +/- 0.00, N = 3)
  3: 0.70 (SE +/- 0.00, N = 3)
OpenVINO 2021.1 - Model: Person Detection 0106 FP32 - Device: CPU (ms, Fewer Is Better)
  1: 5648.20 (SE +/- 7.38, N = 3)
  2: 5638.59 (SE +/- 5.55, N = 3)
  3: 5650.56 (SE +/- 8.84, N = 3)
OpenVINO 2021.1 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better)
  1: 2549.49 (SE +/- 10.55, N = 3)
  2: 2516.51 (SE +/- 32.43, N = 12)
  3: 2562.17 (SE +/- 18.26, N = 3)
OpenVINO 2021.1 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better)
  1: 1.45 (SE +/- 0.00, N = 3)
  2: 1.45 (SE +/- 0.00, N = 12)
  3: 1.46 (SE +/- 0.00, N = 3)
OpenVINO 2021.1 - Model: Age Gender Recognition Retail 0013 FP32 - Device: CPU (FPS, More Is Better)
  1: 2543.51 (SE +/- 4.46, N = 3)
  2: 2551.41 (SE +/- 2.44, N = 3)
  3: 2552.27 (SE +/- 6.24, N = 3)
OpenVINO 2021.1 - Model: Age Gender Recognition Retail 0013 FP32 - Device: CPU (ms, Fewer Is Better)
  1: 1.46 (SE +/- 0.00, N = 3)
  2: 1.45 (SE +/- 0.00, N = 3)
  3: 1.45 (SE +/- 0.00, N = 3)
WavPack Audio Encoding 5.3 - WAV To WavPack (Seconds, Fewer Is Better)
  1: 17.97 (SE +/- 0.02, N = 5)
  2: 17.94 (SE +/- 0.00, N = 5)
  3: 17.95 (SE +/- 0.00, N = 5)
  1. (CXX) g++ options: -rdynamic
Unpacking Firefox 84.0 - Extracting: firefox-84.0.source.tar.xz (Seconds, Fewer Is Better)
  1: 23.26 (SE +/- 0.18, N = 4)
  2: 23.39 (SE +/- 0.24, N = 4)
  3: 23.67 (SE +/- 0.30, N = 4)
BRL-CAD 7.30.8 - VGR Performance Metric (VGR Performance Metric, More Is Better)
  1: 36580
  2: 36247
  3: 36569
  1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
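The three runs (1, 2, 3) land very close to each other in most tests. One rough way to quantify that is the run-to-run spread, taken here as (max - min) divided by the mean of the three reported averages. The Python sketch below is only an illustration and is not part of the Phoronix Test Suite output: the helper name spread_percent and the choice of the three tests are assumptions made for this example, while the numbers are copied from the results above.

```python
from statistics import mean

# Per-run averages copied from the results above, in run order 1, 2, 3.
results = {
    "NCNN Vulkan GPU resnet50 (ms)": [47.84, 47.79, 47.93],
    "OpenVINO Age Gender Recognition Retail 0013 FP16 (FPS)": [2549.49, 2516.51, 2562.17],
    "BRL-CAD VGR Performance Metric": [36580, 36247, 36569],
}


def spread_percent(values):
    """Return (max - min) as a percentage of the mean of the values.

    Hypothetical helper for this example; not a Phoronix Test Suite function.
    """
    return (max(values) - min(values)) / mean(values) * 100.0


for name, values in results.items():
    print(f"{name}: spread {spread_percent(values):.2f}% of mean")
```

A spread this small is comparable to the reported standard errors, so differences between the three runs should probably be read as noise rather than as a real performance change.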
Phoronix Test Suite v10.8.4