CompuLab Airtop 3 Intel Xeon E-2288G testing with a Compulab SBC-ATCFL v1.2 (ATOP3.PRD.0.29.2 BIOS) and NVIDIA Quadro RTX 4000 8GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2011040-FI-COMPULABA81&grt&rdt .
CompuLab Airtop 3 - System Details (runs 1, 2, 3)
Processor: Intel Xeon E-2288G @ 5.00GHz (8 Cores / 16 Threads)
Motherboard: Compulab SBC-ATCFL v1.2 (ATOP3.PRD.0.29.2 BIOS)
Chipset: Intel Cannon Lake PCH
Memory: 64GB
Disk: Samsung SSD 970 EVO Plus 250GB
Graphics: NVIDIA Quadro RTX 4000 8GB (1005/6500MHz)
Graphics (differing entry for a later run; the export does not identify which): NVIDIA Quadro RTX 4000 8GB (300/405MHz)
Audio: Intel Cannon Lake PCH cAVS
Monitor: VE228
Network: Intel I219-LM + Intel I210
OS: Ubuntu 20.10
Kernel: 5.8.0-26-generic (x86_64)
Desktop: GNOME Shell 3.38.1
Display Server: X Server 1.20.9
Display Driver: NVIDIA 455.28
OpenGL: 4.6.0
OpenCL: OpenCL 1.2 CUDA 11.1.96
Vulkan: 1.2.142
Compiler: GCC 10.2.0
File-System: ext4
Screen Resolution: 1920x1080
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_pstate powersave - CPU Microcode: 0xd6 - Thermald 2.3
OpenCL Details: GPU Compute Cores: 2304
Java Details: OpenJDK Runtime Environment (build 11.0.9+11-Ubuntu-0ubuntu1)
Python Details: Python 3.8.6
Security Details: itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Mitigation of TSX disabled + tsx_async_abort: Mitigation of TSX disabled
CompuLab Airtop 3 aom-av1: Speed 0 Two-Pass aom-av1: Speed 4 Two-Pass aom-av1: Speed 6 Realtime aom-av1: Speed 6 Two-Pass aom-av1: Speed 8 Realtime astcenc: Fast astcenc: Medium astcenc: Thorough astcenc: Exhaustive blender: BMW27 - CPU-Only blender: BMW27 - NVIDIA OptiX blender: Classroom - CPU-Only blender: Fishy Cat - CPU-Only blender: Barbershop - CPU-Only blender: Classroom - NVIDIA OptiX blender: Fishy Cat - NVIDIA OptiX blender: Barbershop - NVIDIA OptiX blender: Pabellon Barcelona - CPU-Only blender: Pabellon Barcelona - NVIDIA OptiX brl-cad: VGR Performance Metric byte: Dhrystone 2 caffe: AlexNet - CPU - 100 caffe: AlexNet - CPU - 200 caffe: GoogleNet - CPU - 100 caffe: GoogleNet - CPU - 200 dacapobench: H2 dacapobench: Jython dacapobench: Tradesoap dacapobench: Tradebeans dav1d: Chimera 1080p dav1d: Summer Nature 4K dav1d: Summer Nature 1080p dav1d: Chimera 1080p 10-bit dolfyn: Computational Fluid Dynamics ffte: N=256, 3D Complex FFT Routine gromacs: Water Benchmark hint: FLOAT incompact3d: Cylinder influxdb: 4 - 10000 - 2,5000,1 - 10000 influxdb: 64 - 10000 - 2,5000,1 - 10000 influxdb: 1024 - 10000 - 2,5000,1 - 10000 java-gradle-perf: Reactor keydb: kvazaar: Bosphorus 4K - Slow kvazaar: Bosphorus 4K - Medium kvazaar: Bosphorus 1080p - Slow kvazaar: Bosphorus 1080p - Medium kvazaar: Bosphorus 4K - Very Fast kvazaar: Bosphorus 4K - Ultra Fast kvazaar: Bosphorus 1080p - Very Fast kvazaar: Bosphorus 1080p - Ultra Fast lammps: 20k Atoms lammps: Rhodopsin Protein lczero: BLAS lczero: Eigen avifenc: 0 avifenc: 2 avifenc: 8 avifenc: 10 libraw: Post-Processing Benchmark namd: ATPase Simulation - 327,506 Atoms ncnn: CPU - squeezenet ncnn: CPU - mobilenet ncnn: CPU-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: CPU - shufflenet-v2 ncnn: CPU - mnasnet ncnn: CPU - efficientnet-b0 ncnn: CPU - blazeface ncnn: CPU - googlenet ncnn: CPU - vgg16 ncnn: CPU - resnet18 ncnn: CPU - alexnet ncnn: CPU - resnet50 ncnn: CPU - yolov4-tiny ncnn: Vulkan GPU - squeezenet 
ncnn: Vulkan GPU - mobilenet ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU - shufflenet-v2 ncnn: Vulkan GPU - mnasnet ncnn: Vulkan GPU - efficientnet-b0 ncnn: Vulkan GPU - blazeface ncnn: Vulkan GPU - googlenet ncnn: Vulkan GPU - vgg16 ncnn: Vulkan GPU - resnet18 ncnn: Vulkan GPU - alexnet ncnn: Vulkan GPU - resnet50 ncnn: Vulkan GPU - yolov4-tiny neatbench: CPU neatbench: GPU ocrmypdf: Processing 60 Page PDF Document onednn: IP Batch 1D - f32 - CPU onednn: IP Batch All - f32 - CPU onednn: IP Batch 1D - u8s8f32 - CPU onednn: IP Batch All - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Deconvolution Batch deconv_1d - f32 - CPU onednn: Deconvolution Batch deconv_3d - f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Deconvolution Batch deconv_1d - u8s8f32 - CPU onednn: Deconvolution Batch deconv_3d - u8s8f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU openssl: RSA 4096-bit Performance pgbench: 1 - 1 - Read Only pgbench: 1 - 1 - Read Only - Average Latency pgbench: 1 - 1 - Read Write pgbench: 1 - 1 - Read Write - Average Latency pgbench: 1 - 50 - Read Only pgbench: 1 - 50 - Read Only - Average Latency pgbench: 1 - 100 - Read Only pgbench: 1 - 100 - Read Only - Average Latency pgbench: 1 - 250 - Read Only pgbench: 1 - 250 - Read Only - Average Latency pgbench: 1 - 50 - Read Write pgbench: 1 - 50 - Read Write - Average Latency pgbench: 1 - 100 - Read Write pgbench: 1 - 100 - Read Write - Average Latency pgbench: 1 - 250 - Read Write pgbench: 1 - 250 - Read Write - Average Latency pyperformance: go pyperformance: 2to3 pyperformance: chaos pyperformance: float pyperformance: nbody pyperformance: pathlib pyperformance: raytrace pyperformance: json_loads pyperformance: crypto_pyaes 
pyperformance: regex_compile pyperformance: python_startup pyperformance: django_template pyperformance: pickle_pure_python rawtherapee: Total Benchmark Time rnnoise: rodinia: OpenMP LavaMD rodinia: OpenCL Myocyte rodinia: OpenMP HotSpot3D rodinia: OpenMP Leukocyte rodinia: OpenMP CFD Solver rodinia: OpenMP Streamcluster sunflow: Global Illumination + Image Synthesis tensorflow-lite: SqueezeNet tensorflow-lite: Inception V4 tensorflow-lite: NASNet Mobile tensorflow-lite: Mobilenet Float tensorflow-lite: Mobilenet Quant tensorflow-lite: Inception ResNet V2 tesseract-ocr: Time To OCR 7 Images hmmer: Pfam Database Search build-linux-kernel: Time To Compile build-llvm: Time To Compile tnn: CPU - MobileNet v2 tnn: CPU - SqueezeNet v1.1 webp: Default webp: Quality 100 webp: Quality 100, Lossless webp: Quality 100, Highest Compression webp: Quality 100, Lossless, Highest Compression wireguard: x265: Bosphorus 4K x265: Bosphorus 1080p 1 2 3 0.33 2.72 23.54 4.30 45.92 5.53 8.48 33.33 274.72 186.64 27.90 584.98 271.16 787.85 104.11 54.93 1312.28 651.27 145.47 99977 48134384.7 40095 80273 103786 207902 2864 3709 3652 2690 639.97 156.40 581.34 114.55 16.833 31760.023038312 0.820 472824990.62325 378.478658 1507808.8 1534293.6 1535930.8 197.762 737463.31 4.15 4.21 18.24 18.69 11.95 21.87 46.73 87.16 5.851 6.600 844 751 99.669 58.664 5.022 4.757 33.31 1.94453 16.10 19.30 5.11 4.06 3.21 3.92 6.69 1.46 14.99 67.22 15.20 14.38 28.80 28.70 3.66 4.63 1.45 1.67 1.30 1.48 2.67 0.61 3.53 8.36 1.68 2.10 3.77 8.67 10.9 29.4 22.024 4.49627 68.0525 1.80103 27.6364 17.1183 4.89651 6.64261 16.4418 5.70594 3.58701 321.803 149.618 3.99911 2.58081 2431.8 33414 0.03 480 2.084 289401 0.173 274463 0.365 254678 0.982 611 81.865 568 176.126 599 419.725 200 264 85.7 90.0 103 14.4 374 21.3 85.8 138 6.80 38.6 338 61.749 21.587 276.215 34.786 92.668 140.313 22.649 19.563 1.138 227852 3269353 201015 153919 157250 2954177 20.134 100.123 98.182 771.207 286.035 269.712 1.313 2.077 15.613 6.385 34.700 160.513 
12.37 54.49 0.33 2.72 23.44 4.29 46.26 5.52 8.44 33.24 273.46 186.19 27.76 587.97 269.67 783.83 103.42 54.71 1264.86 650.83 144.87 99832 48808319.1 39759 79922 103349 207889 2925 3698 3550 2645 647.74 160.47 582.13 116.36 16.711 31509.634098231 0.815 474322870.49016 377.557241 1526391.3 1543312.5 1557794.2 199.454 736916.33 4.17 4.23 18.58 18.92 12.05 22.45 48.24 92.53 5.909 6.805 837 762 99.273 58.399 4.995 4.738 33.66 1.93847 16.10 19.22 5.13 4.07 3.17 3.91 6.68 1.45 15.09 66.60 15.49 14.33 28.91 29.58 3.65 4.58 1.42 1.68 1.31 1.48 2.65 0.62 3.22 8.32 1.72 2.08 3.74 8.32 10.8 30.5 21.067 4.13153 68.9994 1.80571 27.3707 17.1397 4.67392 6.47349 16.7441 5.59357 3.24189 326.960 139.871 4.06970 2.57926 2467.9 33267 0.030 429 2.412 296088 0.169 281291 0.356 251429 0.995 596 83.886 578 173.120 591 424.257 199 261 85.8 90.0 103 14.3 375 21.2 85.8 137 6.74 38.9 339 61.854 21.593 273.155 34.514 89.729 138.389 22.740 19.631 1.132 226239 3260710 199504 153221 156299 2946580 20.315 100.236 98.147 762.884 287.559 269.656 1.315 2.076 15.603 6.386 34.844 158.360 12.39 56.14 0.33 2.72 23.36 4.27 45.78 5.55 8.55 33.34 275.59 186.87 27.86 584.40 270.29 787.49 104.69 55.23 1325.38 650.06 145.79 99792 48679363.3 40315 80507 103903 208264 2775 3718 3704 2801 641.98 156.84 579.67 113.41 17.027 19217.721194045 0.821 471834576.34719 605.889669 1508093.5 1530241.0 1539617.5 197.054 734176.59 4.15 4.20 18.11 18.69 11.91 21.92 46.79 87.18 5.905 6.322 809 671 99.341 58.664 5.017 4.733 33.87 2.62968 16.38 19.23 5.31 4.20 3.26 4.17 6.85 1.50 15.38 67.82 15.61 14.72 29.20 29.07 3.66 4.77 1.46 1.68 1.31 1.50 2.65 0.63 3.23 8.42 1.66 2.09 3.76 8.70 10.9 30.6 22.094 4.13129 68.5948 1.86706 27.3473 17.1967 4.70756 6.61172 16.6046 5.70690 3.30402 324.335 136.151 3.87614 2.58232 2474.6 32579 0.031 442 2.312 291995 0.171 278019 0.360 249348 1.004 604 82.733 578 173.239 557 449.114 200 261 86.1 90.1 103 14.5 377 21.2 85.9 139 6.75 39.2 339 62.124 21.597 296.269 35.760 213.515 214.308 265.624 69.374 
1.127 227599 3270433 201265 154362 157719 2957670 20.224 100.832 284.025 269.435 1.318 2.078 15.694 6.382 34.612 159.753 12.34 54.42 OpenBenchmarking.org
AOM AV1 2.0, Encoder Mode: Speed 0 Two-Pass. Frames Per Second, More Is Better. 1: 0.33 (SE +/- 0.00, N = 3); 2: 0.33 (SE +/- 0.00, N = 3); 3: 0.33 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
AOM AV1 2.0, Encoder Mode: Speed 4 Two-Pass. Frames Per Second, More Is Better. 1: 2.72 (SE +/- 0.00, N = 3); 2: 2.72 (SE +/- 0.00, N = 3); 3: 2.72 (SE +/- 0.01, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
AOM AV1 2.0, Encoder Mode: Speed 6 Realtime. Frames Per Second, More Is Better. 1: 23.54 (SE +/- 0.09, N = 3); 2: 23.44 (SE +/- 0.10, N = 3); 3: 23.36 (SE +/- 0.09, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
AOM AV1 2.0, Encoder Mode: Speed 6 Two-Pass. Frames Per Second, More Is Better. 1: 4.30 (SE +/- 0.01, N = 3); 2: 4.29 (SE +/- 0.01, N = 3); 3: 4.27 (SE +/- 0.01, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
AOM AV1 2.0, Encoder Mode: Speed 8 Realtime. Frames Per Second, More Is Better. 1: 45.92 (SE +/- 0.13, N = 3); 2: 46.26 (SE +/- 0.09, N = 3); 3: 45.78 (SE +/- 0.23, N = 3). (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
ASTC Encoder 2.0, Preset: Fast. Seconds, Fewer Is Better. 1: 5.53 (SE +/- 0.01, N = 3); 2: 5.52 (SE +/- 0.01, N = 3); 3: 5.55 (SE +/- 0.01, N = 3). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0, Preset: Medium. Seconds, Fewer Is Better. 1: 8.48 (SE +/- 0.01, N = 3); 2: 8.44 (SE +/- 0.02, N = 3); 3: 8.55 (SE +/- 0.01, N = 3). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0, Preset: Thorough. Seconds, Fewer Is Better. 1: 33.33 (SE +/- 0.40, N = 3); 2: 33.24 (SE +/- 0.42, N = 3); 3: 33.34 (SE +/- 0.40, N = 6). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0, Preset: Exhaustive. Seconds, Fewer Is Better. 1: 274.72 (SE +/- 0.80, N = 3); 2: 273.46 (SE +/- 0.56, N = 3); 3: 275.59 (SE +/- 0.84, N = 3). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Blender 2.90, Blend File: BMW27 - Compute: CPU-Only. Seconds, Fewer Is Better. 1: 186.64 (SE +/- 0.68, N = 3); 2: 186.19 (SE +/- 1.07, N = 3); 3: 186.87 (SE +/- 1.31, N = 3).
Blender 2.90, Blend File: BMW27 - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. 1: 27.90 (SE +/- 0.01, N = 3); 2: 27.76 (SE +/- 0.04, N = 3); 3: 27.86 (SE +/- 0.02, N = 3).
Blender 2.90, Blend File: Classroom - Compute: CPU-Only. Seconds, Fewer Is Better. 1: 584.98 (SE +/- 0.63, N = 3); 2: 587.97 (SE +/- 1.27, N = 3); 3: 584.40 (SE +/- 0.47, N = 3).
Blender 2.90, Blend File: Fishy Cat - Compute: CPU-Only. Seconds, Fewer Is Better. 1: 271.16 (SE +/- 0.29, N = 3); 2: 269.67 (SE +/- 0.26, N = 3); 3: 270.29 (SE +/- 0.29, N = 3).
Blender 2.90, Blend File: Barbershop - Compute: CPU-Only. Seconds, Fewer Is Better. 1: 787.85 (SE +/- 1.21, N = 3); 2: 783.83 (SE +/- 0.37, N = 3); 3: 787.49 (SE +/- 0.65, N = 3).
Blender 2.90, Blend File: Classroom - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. 1: 104.11 (SE +/- 0.79, N = 3); 2: 103.42 (SE +/- 0.75, N = 3); 3: 104.69 (SE +/- 0.86, N = 3).
Blender 2.90, Blend File: Fishy Cat - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. 1: 54.93 (SE +/- 0.14, N = 3); 2: 54.71 (SE +/- 0.17, N = 3); 3: 55.23 (SE +/- 0.12, N = 3).
Blender 2.90, Blend File: Barbershop - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. 1: 1312.28 (SE +/- 19.68, N = 4); 2: 1264.86 (SE +/- 20.32, N = 3); 3: 1325.38 (SE +/- 22.81, N = 3).
Blender 2.90, Blend File: Pabellon Barcelona - Compute: CPU-Only. Seconds, Fewer Is Better. 1: 651.27 (SE +/- 1.32, N = 3); 2: 650.83 (SE +/- 1.12, N = 3); 3: 650.06 (SE +/- 0.68, N = 3).
Blender 2.90, Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX. Seconds, Fewer Is Better. 1: 145.47 (SE +/- 0.92, N = 3); 2: 144.87 (SE +/- 0.89, N = 3); 3: 145.79 (SE +/- 0.98, N = 3).
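The Blender entries above pair each scene's CPU-only render time with its NVIDIA OptiX render time. As a rough illustration of how the two can be compared, the sketch below divides the run 1 CPU time by the run 1 OptiX time for each scene. The function name `speedup` and the dictionary layout are illustrative choices, not part of the Phoronix Test Suite.

```python
# Run 1 render times in seconds, copied from the Blender 2.90 entries above.
cpu_seconds = {
    "BMW27": 186.64,
    "Classroom": 584.98,
    "Fishy Cat": 271.16,
    "Barbershop": 787.85,
    "Pabellon Barcelona": 651.27,
}
optix_seconds = {
    "BMW27": 27.90,
    "Classroom": 104.11,
    "Fishy Cat": 54.93,
    "Barbershop": 1312.28,
    "Pabellon Barcelona": 145.47,
}


def speedup(cpu, gpu):
    """Ratio of CPU time to GPU time; above 1.0 means the GPU render was faster."""
    return {scene: cpu[scene] / gpu[scene] for scene in cpu}


ratios = speedup(cpu_seconds, optix_seconds)
for scene, ratio in ratios.items():
    print(f"{scene}: {ratio:.2f}x")
```

With these figures every scene except Barbershop comes out well above 1.0, while Barbershop falls below 1.0, meaning the OptiX render took longer than the CPU-only render in this result set.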
BRL-CAD 7.30.8, VGR Performance Metric. VGR Performance Metric, More Is Better. 1: 99977; 2: 99832; 3: 99792. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lGLU -lGL -lGLdispatch -lX11 -lXext -lpthread -ldl -luuid -lm
BYTE Unix Benchmark 3.6, Computational Test: Dhrystone 2. LPS, More Is Better. 1: 48134384.7 (SE +/- 543289.97, N = 3); 2: 48808319.1 (SE +/- 22614.50, N = 3); 3: 48679363.3 (SE +/- 80326.86, N = 3).
Caffe 2020-02-13, Model: AlexNet - Acceleration: CPU - Iterations: 100. Milli-Seconds, Fewer Is Better. 1: 40095 (SE +/- 134.25, N = 3); 2: 39759 (SE +/- 121.90, N = 3); 3: 40315 (SE +/- 115.70, N = 3). (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
Caffe 2020-02-13, Model: AlexNet - Acceleration: CPU - Iterations: 200. Milli-Seconds, Fewer Is Better. 1: 80273 (SE +/- 78.77, N = 3); 2: 79922 (SE +/- 143.26, N = 3); 3: 80507 (SE +/- 112.10, N = 3). (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
Caffe 2020-02-13, Model: GoogleNet - Acceleration: CPU - Iterations: 100. Milli-Seconds, Fewer Is Better. 1: 103786 (SE +/- 216.65, N = 3); 2: 103349 (SE +/- 79.24, N = 3); 3: 103903 (SE +/- 179.36, N = 3). (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
Caffe 2020-02-13, Model: GoogleNet - Acceleration: CPU - Iterations: 200. Milli-Seconds, Fewer Is Better. 1: 207902 (SE +/- 124.50, N = 3); 2: 207889 (SE +/- 420.35, N = 3); 3: 208264 (SE +/- 147.41, N = 3). (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
DaCapo Benchmark 9.12-MR1, Java Test: H2. msec, Fewer Is Better. 1: 2864 (SE +/- 46.69, N = 19); 2: 2925 (SE +/- 39.90, N = 20); 3: 2775 (SE +/- 38.16, N = 4).
DaCapo Benchmark 9.12-MR1, Java Test: Jython. msec, Fewer Is Better. 1: 3709 (SE +/- 52.97, N = 4); 2: 3698 (SE +/- 52.89, N = 4); 3: 3718 (SE +/- 27.24, N = 4).
DaCapo Benchmark 9.12-MR1, Java Test: Tradesoap. msec, Fewer Is Better. 1: 3652 (SE +/- 18.80, N = 4); 2: 3550 (SE +/- 25.69, N = 4); 3: 3704 (SE +/- 42.18, N = 4).
DaCapo Benchmark 9.12-MR1, Java Test: Tradebeans. msec, Fewer Is Better. 1: 2690 (SE +/- 35.35, N = 5); 2: 2645 (SE +/- 27.27, N = 4); 3: 2801 (SE +/- 21.94, N = 4).
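The DaCapo entries above report a standard error (SE) alongside each mean. One common rule of thumb, which is an assumption here and not something the result export itself applies, is to treat two runs as distinguishable only when their means differ by more than about two combined standard errors. The sketch below applies that heuristic to the H2 figures; the function name `separation` and the threshold of 2.0 are illustrative.

```python
import math

# (mean in msec, standard error) per run, copied from the DaCapo H2 entry above.
h2 = {1: (2864, 46.69), 2: (2925, 39.90), 3: (2775, 38.16)}


def separation(a, b):
    """Difference of two means in units of their combined standard error."""
    (mean_a, se_a), (mean_b, se_b) = a, b
    # math.hypot gives sqrt(se_a**2 + se_b**2), the combined standard error
    # under the assumption that the two runs are independent.
    return abs(mean_a - mean_b) / math.hypot(se_a, se_b)


for i, j in [(1, 2), (1, 3), (2, 3)]:
    z = separation(h2[i], h2[j])
    verdict = "likely distinct" if z > 2.0 else "within noise"
    print(f"run {i} vs run {j}: {z:.2f} combined SEs ({verdict})")
```

Under this heuristic only runs 2 and 3 are separated by more than two combined standard errors. The sample counts differ considerably between runs (N = 19, 20 and 4), so the comparison is indicative at best.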
dav1d 0.7.0, Video Input: Chimera 1080p. FPS, More Is Better. 1: 639.97 (SE +/- 2.78, N = 3; MIN: 466.44 / MAX: 934.79); 2: 647.74 (SE +/- 1.12, N = 3; MIN: 471.76 / MAX: 969.53); 3: 641.98 (SE +/- 3.61, N = 3; MIN: 460.65 / MAX: 940.69). (CC) gcc options: -pthread
dav1d 0.7.0, Video Input: Summer Nature 4K. FPS, More Is Better. 1: 156.40 (SE +/- 1.17, N = 3; MIN: 125.77 / MAX: 171.79); 2: 160.47 (SE +/- 1.07, N = 3; MIN: 149.07 / MAX: 177.15); 3: 156.84 (SE +/- 0.52, N = 3; MIN: 148.02 / MAX: 171.09). (CC) gcc options: -pthread
dav1d 0.7.0, Video Input: Summer Nature 1080p. FPS, More Is Better. 1: 581.34 (SE +/- 1.17, N = 3; MIN: 498.37 / MAX: 635.39); 2: 582.13 (SE +/- 0.67, N = 3; MIN: 511.78 / MAX: 635.13); 3: 579.67 (SE +/- 1.29, N = 3; MIN: 479.84 / MAX: 634.42). (CC) gcc options: -pthread
dav1d 0.7.0, Video Input: Chimera 1080p 10-bit. FPS, More Is Better. 1: 114.55 (SE +/- 0.12, N = 3; MIN: 72.75 / MAX: 275.84); 2: 116.36 (SE +/- 0.22, N = 3; MIN: 73.44 / MAX: 275.45); 3: 113.41 (SE +/- 0.06, N = 3; MIN: 72.47 / MAX: 269.27). (CC) gcc options: -pthread
Dolfyn 0.527, Computational Fluid Dynamics. Seconds, Fewer Is Better. 1: 16.83 (SE +/- 0.12, N = 3); 2: 16.71 (SE +/- 0.03, N = 3); 3: 17.03 (SE +/- 0.08, N = 3).
FFTE 7.0, N=256, 3D Complex FFT Routine. MFLOPS, More Is Better. 1: 31760.02 (SE +/- 39.81, N = 3); 2: 31509.63 (SE +/- 21.47, N = 3); 3: 19217.72 (SE +/- 122.08, N = 3). (F9X) gfortran options: -O3 -fomit-frame-pointer -fopenmp
GROMACS 2020.3, Water Benchmark. Ns Per Day, More Is Better. 1: 0.820 (SE +/- 0.001, N = 3); 2: 0.815 (SE +/- 0.003, N = 3); 3: 0.821 (SE +/- 0.001, N = 3). (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm
Hierarchical INTegration 1.0, Test: FLOAT. QUIPs, More Is Better. 1: 472824990.62 (SE +/- 493184.05, N = 3); 2: 474322870.49 (SE +/- 93008.17, N = 3); 3: 471834576.35 (SE +/- 1375357.19, N = 3). (CC) gcc options: -O3 -march=native -lm
Incompact3D 2020-09-17, Input: Cylinder. Seconds, Fewer Is Better. 1: 378.48 (SE +/- 1.16, N = 3); 2: 377.56 (SE +/- 1.24, N = 3); 3: 605.89 (SE +/- 3.73, N = 3). (F9X) gfortran options: -cpp -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
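Run 3 departs sharply from runs 1 and 2 in the FFTE and Incompact3D entries above. As a rough illustration, the sketch below computes how far each run sits from the median of the three, using the values reported above. The function name `deviation_from_median` and the dictionary layout are illustrative choices, not part of the Phoronix Test Suite.

```python
from statistics import median

# Per-run results copied from the FFTE and Incompact3D entries above.
results = {
    "FFTE N=256 (MFLOPS, more is better)": [31760.02, 31509.63, 19217.72],
    "Incompact3D Cylinder (seconds, fewer is better)": [378.48, 377.56, 605.89],
}


def deviation_from_median(values):
    """Return each value's percentage difference from the median of the list."""
    mid = median(values)
    return [(v - mid) / mid * 100.0 for v in values]


for name, values in results.items():
    devs = deviation_from_median(values)
    formatted = ", ".join(f"run {i}: {d:+.1f}%" for i, d in enumerate(devs, start=1))
    print(f"{name}: {formatted}")
```

For these two tests run 3 lands roughly 39% below the median FFTE throughput and roughly 60% above the median Incompact3D time, while runs 1 and 2 stay within about 1% of the median. The result file gives no explanation for the difference.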
InfluxDB 1.8.2, Concurrent Streams: 4 - Batch Size: 10000 - Tags: 2,5000,1 - Points Per Series: 10000. val/sec, More Is Better. 1: 1507808.8 (SE +/- 2205.46, N = 3); 2: 1526391.3 (SE +/- 2194.00, N = 3); 3: 1508093.5 (SE +/- 2673.51, N = 3).
InfluxDB 1.8.2, Concurrent Streams: 64 - Batch Size: 10000 - Tags: 2,5000,1 - Points Per Series: 10000. val/sec, More Is Better. 1: 1534293.6 (SE +/- 5130.21, N = 3); 2: 1543312.5 (SE +/- 4759.58, N = 3); 3: 1530241.0 (SE +/- 4617.21, N = 3).
InfluxDB 1.8.2, Concurrent Streams: 1024 - Batch Size: 10000 - Tags: 2,5000,1 - Points Per Series: 10000. val/sec, More Is Better. 1: 1535930.8 (SE +/- 4676.23, N = 3); 2: 1557794.2 (SE +/- 6187.13, N = 3); 3: 1539617.5 (SE +/- 1033.19, N = 3).
Java Gradle Build, Gradle Build: Reactor. Seconds, Fewer Is Better. 1: 197.76 (SE +/- 2.21, N = 12); 2: 199.45 (SE +/- 2.22, N = 12); 3: 197.05 (SE +/- 2.36, N = 12).
KeyDB 6.0.16. Ops/sec, More Is Better. 1: 737463.31 (SE +/- 1649.46, N = 3); 2: 736916.33 (SE +/- 3580.61, N = 3); 3: 734176.59 (SE +/- 673.14, N = 3). (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Slow. Frames Per Second, More Is Better. 1: 4.15 (SE +/- 0.02, N = 3); 2: 4.17 (SE +/- 0.02, N = 3); 3: 4.15 (SE +/- 0.02, N = 3). (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Medium. Frames Per Second, More Is Better. 1: 4.21 (SE +/- 0.01, N = 3); 2: 4.23 (SE +/- 0.01, N = 3); 3: 4.20 (SE +/- 0.00, N = 3). (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Slow. Frames Per Second, More Is Better. 1: 18.24 (SE +/- 0.01, N = 3); 2: 18.58 (SE +/- 0.15, N = 3); 3: 18.11 (SE +/- 0.02, N = 3). (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Medium. Frames Per Second, More Is Better. 1: 18.69 (SE +/- 0.08, N = 3); 2: 18.92 (SE +/- 0.02, N = 3); 3: 18.69 (SE +/- 0.06, N = 3). (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Very Fast. Frames Per Second, More Is Better. 1: 11.95 (SE +/- 0.03, N = 3); 2: 12.05 (SE +/- 0.04, N = 3); 3: 11.91 (SE +/- 0.04, N = 3). (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0, Video Input: Bosphorus 4K - Video Preset: Ultra Fast. Frames Per Second, More Is Better. 1: 21.87 (SE +/- 0.10, N = 3); 2: 22.45 (SE +/- 0.23, N = 3); 3: 21.92 (SE +/- 0.11, N = 3). (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Very Fast. Frames Per Second, More Is Better. 1: 46.73 (SE +/- 0.52, N = 3); 2: 48.24 (SE +/- 0.64, N = 5); 3: 46.79 (SE +/- 0.02, N = 3). (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0, Video Input: Bosphorus 1080p - Video Preset: Ultra Fast. Frames Per Second, More Is Better. 1: 87.16 (SE +/- 1.07, N = 3); 2: 92.53 (SE +/- 0.90, N = 9); 3: 87.18 (SE +/- 1.01, N = 3). (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
LAMMPS Molecular Dynamics Simulator 29Oct2020, Model: 20k Atoms. ns/day, More Is Better. 1: 5.851 (SE +/- 0.013, N = 3); 2: 5.909 (SE +/- 0.005, N = 3); 3: 5.905 (SE +/- 0.046, N = 3). (CXX) g++ options: -O3 -pthread -lm
LAMMPS Molecular Dynamics Simulator 29Oct2020, Model: Rhodopsin Protein. ns/day, More Is Better. 1: 6.600 (SE +/- 0.086, N = 15); 2: 6.805 (SE +/- 0.019, N = 3); 3: 6.322 (SE +/- 0.197, N = 12). (CXX) g++ options: -O3 -pthread -lm
LeelaChessZero 0.26, Backend: BLAS. Nodes Per Second, More Is Better. 1: 844; 2: 837; 3: 809. Only one standard error is listed (SE +/- 10.82, N = 3), with no indication of the run it belongs to. (CXX) g++ options: -flto -pthread
LeelaChessZero 0.26, Backend: Eigen. Nodes Per Second, More Is Better. 1: 751 (SE +/- 8.19, N = 3); 2: 762 (SE +/- 4.63, N = 3); 3: 671 (SE +/- 13.55, N = 9). (CXX) g++ options: -flto -pthread
libavif avifenc 0.7.3, Encoder Speed: 0. Seconds, Fewer Is Better. 1: 99.67 (SE +/- 0.38, N = 3); 2: 99.27 (SE +/- 0.39, N = 3); 3: 99.34 (SE +/- 0.38, N = 3). (CXX) g++ options: -O3 -fPIC
libavif avifenc 0.7.3, Encoder Speed: 2. Seconds, Fewer Is Better. 1: 58.66 (SE +/- 0.10, N = 3); 2: 58.40 (SE +/- 0.27, N = 3); 3: 58.66 (SE +/- 0.17, N = 3). (CXX) g++ options: -O3 -fPIC
libavif avifenc 0.7.3, Encoder Speed: 8. Seconds, Fewer Is Better. 1: 5.022 (SE +/- 0.013, N = 3); 2: 4.995 (SE +/- 0.019, N = 3); 3: 5.017 (SE +/- 0.017, N = 3). (CXX) g++ options: -O3 -fPIC
libavif avifenc 0.7.3, Encoder Speed: 10. Seconds, Fewer Is Better. 1: 4.757 (SE +/- 0.009, N = 3); 2: 4.738 (SE +/- 0.014, N = 3); 3: 4.733 (SE +/- 0.010, N = 3). (CXX) g++ options: -O3 -fPIC
LibRaw 0.20, Post-Processing Benchmark. Mpix/sec, More Is Better. 1: 33.31 (SE +/- 0.13, N = 3); 2: 33.66 (SE +/- 0.13, N = 3); 3: 33.87 (SE +/- 0.08, N = 3). (CXX) g++ options: -O2 -fopenmp -ljpeg -lz -lm
NAMD 2.14, ATPase Simulation - 327,506 Atoms. days/ns, Fewer Is Better. 1: 1.94453 (SE +/- 0.00650, N = 3); 2: 1.93847 (SE +/- 0.00297, N = 3); 3: 2.62968 (SE +/- 0.04166, N = 3).
NCNN 20200916, Target: CPU - Model: squeezenet. ms, Fewer Is Better. 1: 16.10 (SE +/- 0.03, N = 3; MIN: 15.74 / MAX: 17.79); 2: 16.10 (SE +/- 0.03, N = 3; MIN: 15.94 / MAX: 17.11); 3: 16.38 (SE +/- 0.16, N = 3; MIN: 15.97 / MAX: 24.3). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: mobilenet. ms, Fewer Is Better. 1: 19.30 (SE +/- 0.05, N = 3; MIN: 18.89 / MAX: 19.7); 2: 19.22 (SE +/- 0.04, N = 3; MIN: 18.82 / MAX: 26.38); 3: 19.23 (SE +/- 0.09, N = 3; MIN: 18.97 / MAX: 19.68). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU-v2-v2 - Model: mobilenet-v2. ms, Fewer Is Better. 1: 5.11 (SE +/- 0.01, N = 3; MIN: 5.01 / MAX: 5.37); 2: 5.13 (SE +/- 0.00, N = 3; MIN: 5.01 / MAX: 5.35); 3: 5.31 (SE +/- 0.14, N = 3; MIN: 5.02 / MAX: 8.3). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU-v3-v3 - Model: mobilenet-v3. ms, Fewer Is Better. 1: 4.06 (SE +/- 0.01, N = 3; MIN: 4.01 / MAX: 4.32); 2: 4.07 (SE +/- 0.01, N = 3; MIN: 4.03 / MAX: 4.44); 3: 4.20 (SE +/- 0.15, N = 3; MIN: 4.01 / MAX: 4.64). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: shufflenet-v2. ms, Fewer Is Better. 1: 3.21 (SE +/- 0.04, N = 3; MIN: 2.98 / MAX: 3.47); 2: 3.17 (SE +/- 0.03, N = 3; MIN: 2.99 / MAX: 3.47); 3: 3.26 (SE +/- 0.06, N = 3; MIN: 2.99 / MAX: 3.84). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: mnasnet. ms, Fewer Is Better. 1: 3.92 (SE +/- 0.01, N = 3; MIN: 3.87 / MAX: 4.28); 2: 3.91 (SE +/- 0.01, N = 3; MIN: 3.86 / MAX: 4.25); 3: 4.17 (SE +/- 0.13, N = 3; MIN: 3.88 / MAX: 4.56). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: efficientnet-b0. ms, Fewer Is Better. 1: 6.69 (SE +/- 0.14, N = 3; MIN: 6.36 / MAX: 7.32); 2: 6.68 (SE +/- 0.15, N = 3; MIN: 6.34 / MAX: 7.11); 3: 6.85 (SE +/- 0.01, N = 3; MIN: 6.72 / MAX: 7.36). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: blazeface. ms, Fewer Is Better. 1: 1.46 (SE +/- 0.03, N = 3; MIN: 1.38 / MAX: 1.57); 2: 1.45 (SE +/- 0.04, N = 3; MIN: 1.34 / MAX: 1.66); 3: 1.50 (SE +/- 0.00, N = 3; MIN: 1.42 / MAX: 1.63). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: googlenet. ms, Fewer Is Better. 1: 14.99 (SE +/- 0.31, N = 3; MIN: 14.17 / MAX: 16.1); 2: 15.09 (SE +/- 0.26, N = 3; MIN: 14.25 / MAX: 15.64); 3: 15.38 (SE +/- 0.01, N = 3; MIN: 15.04 / MAX: 15.73). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: vgg16. ms, Fewer Is Better. 1: 67.22 (SE +/- 0.58, N = 3; MIN: 65.96 / MAX: 186.53); 2: 66.60 (SE +/- 0.00, N = 3; MIN: 66.46 / MAX: 67.69); 3: 67.82 (SE +/- 0.37, N = 3; MIN: 66.24 / MAX: 187.16). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: resnet18. ms, Fewer Is Better. 1: 15.20 (SE +/- 0.20, N = 3; MIN: 14.69 / MAX: 15.68); 2: 15.49 (SE +/- 0.04, N = 3; MIN: 14.84 / MAX: 15.88); 3: 15.61 (SE +/- 0.25, N = 3; MIN: 15.01 / MAX: 51.06). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: alexnet. ms, Fewer Is Better. 1: 14.38 (SE +/- 0.11, N = 3; MIN: 14.11 / MAX: 17.2); 2: 14.33 (SE +/- 0.01, N = 3; MIN: 14.26 / MAX: 14.55); 3: 14.72 (SE +/- 0.16, N = 3; MIN: 14.22 / MAX: 122.36). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: resnet50. ms, Fewer Is Better. 1: 28.80 (SE +/- 0.17, N = 3; MIN: 27.99 / MAX: 145.54); 2: 28.91 (SE +/- 0.43, N = 3; MIN: 27.49 / MAX: 158.03); 3: 29.20 (SE +/- 0.53, N = 3; MIN: 27.61 / MAX: 140.55). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: yolov4-tiny. ms, Fewer Is Better. 1: 28.70 (SE +/- 0.04, N = 3; MIN: 28.5 / MAX: 29.09); 2: 29.58 (SE +/- 0.35, N = 3; MIN: 28.17 / MAX: 140.34); 3: 29.07 (SE +/- 0.12, N = 3; MIN: 28.8 / MAX: 29.52). (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Vulkan GPU targets (ms, Fewer Is Better)
(CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Target: Vulkan GPU - Model: squeezenet
  1: 3.66 (SE +/- 0.01, N = 3; MIN: 3.6 / MAX: 3.77)
  2: 3.65 (SE +/- 0.01, N = 3; MIN: 3.6 / MAX: 3.7)
  3: 3.66 (SE +/- 0.01, N = 3; MIN: 3.61 / MAX: 3.73)

Target: Vulkan GPU - Model: mobilenet
  1: 4.63 (SE +/- 0.02, N = 3; MIN: 4.56 / MAX: 5.02)
  2: 4.58 (SE +/- 0.00, N = 3; MIN: 4.54 / MAX: 4.65)
  3: 4.77 (SE +/- 0.12, N = 3; MIN: 4.56 / MAX: 34.32)

Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
  1: 1.45 (SE +/- 0.03, N = 3; MIN: 1.41 / MAX: 20.49)
  2: 1.42 (SE +/- 0.00, N = 3; MIN: 1.41 / MAX: 1.44)
  3: 1.46 (SE +/- 0.03, N = 3; MIN: 1.41 / MAX: 20.24)

Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
  1: 1.67 (SE +/- 0.00, N = 3; MIN: 1.66 / MAX: 1.7)
  2: 1.68 (SE +/- 0.00, N = 3; MIN: 1.66 / MAX: 1.93)
  3: 1.68 (SE +/- 0.00, N = 3; MIN: 1.66 / MAX: 1.9)

Target: Vulkan GPU - Model: shufflenet-v2
  1: 1.30 (SE +/- 0.00, N = 3; MIN: 1.29 / MAX: 1.32)
  2: 1.31 (SE +/- 0.00, N = 3; MIN: 1.29 / MAX: 1.35)
  3: 1.31 (SE +/- 0.00, N = 3; MIN: 1.3 / MAX: 1.53)

Target: Vulkan GPU - Model: mnasnet
  1: 1.48 (SE +/- 0.00, N = 3; MIN: 1.47 / MAX: 1.72)
  2: 1.48 (SE +/- 0.00, N = 3; MIN: 1.47 / MAX: 1.53)
  3: 1.50 (SE +/- 0.01, N = 3; MIN: 1.47 / MAX: 6.94)

Target: Vulkan GPU - Model: efficientnet-b0
  1: 2.67 (SE +/- 0.02, N = 3; MIN: 2.64 / MAX: 12.58)
  2: 2.65 (SE +/- 0.00, N = 3; MIN: 2.63 / MAX: 3.85)
  3: 2.65 (SE +/- 0.00, N = 3; MIN: 2.63 / MAX: 2.77)

Target: Vulkan GPU - Model: blazeface
  1: 0.61 (SE +/- 0.00, N = 3; MIN: 0.6 / MAX: 0.63)
  2: 0.62 (SE +/- 0.00, N = 3; MIN: 0.6 / MAX: 0.66)
  3: 0.63 (SE +/- 0.01, N = 3; MIN: 0.61 / MAX: 2.14)

Target: Vulkan GPU - Model: googlenet
  1: 3.53 (SE +/- 0.31, N = 3; MIN: 3.2 / MAX: 21.92)
  2: 3.22 (SE +/- 0.00, N = 3; MIN: 3.21 / MAX: 3.25)
  3: 3.23 (SE +/- 0.00, N = 3; MIN: 3.22 / MAX: 3.35)

Target: Vulkan GPU - Model: vgg16
  1: 8.36 (SE +/- 0.01, N = 3; MIN: 7.77 / MAX: 22.44)
  2: 8.32 (SE +/- 0.04, N = 3; MIN: 7.77 / MAX: 22.07)
  3: 8.42 (SE +/- 0.00, N = 3; MIN: 7.88 / MAX: 18.71)

Target: Vulkan GPU - Model: resnet18
  1: 1.68 (SE +/- 0.03, N = 3; MIN: 1.64 / MAX: 17.75)
  2: 1.72 (SE +/- 0.04, N = 3; MIN: 1.64 / MAX: 23.58)
  3: 1.66 (SE +/- 0.00, N = 3; MIN: 1.64 / MAX: 1.7)

Target: Vulkan GPU - Model: alexnet
  1: 2.10 (SE +/- 0.01, N = 3; MIN: 1.85 / MAX: 2.59)
  2: 2.08 (SE +/- 0.00, N = 3; MIN: 2.04 / MAX: 2.57)
  3: 2.09 (SE +/- 0.00, N = 3; MIN: 1.82 / MAX: 2.6)

Target: Vulkan GPU - Model: resnet50
  1: 3.77 (SE +/- 0.01, N = 3; MIN: 3.72 / MAX: 13.18)
  2: 3.74 (SE +/- 0.01, N = 3; MIN: 3.72 / MAX: 3.76)
  3: 3.76 (SE +/- 0.01, N = 3; MIN: 3.74 / MAX: 3.9)

Target: Vulkan GPU - Model: yolov4-tiny
  1: 8.67 (SE +/- 0.18, N = 3; MIN: 8.03 / MAX: 53)
  2: 8.32 (SE +/- 0.02, N = 3; MIN: 8.01 / MAX: 8.7)
  3: 8.70 (SE +/- 0.34, N = 3; MIN: 8.04 / MAX: 83.54)
NeatBench 5 (FPS, More Is Better)

Acceleration: CPU
  1: 10.9 (SE +/- 1.26, N = 16)
  2: 10.8 (SE +/- 1.23, N = 16)
  3: 10.9 (SE +/- 1.26, N = 16)

Acceleration: GPU
  1: 29.4 (SE +/- 0.06, N = 3)
  2: 30.5 (SE +/- 0.50, N = 15)
  3: 30.6 (SE +/- 0.50, N = 15)
OCRMyPDF 10.3.1+dfsg (Seconds, Fewer Is Better)

Processing 60 Page PDF Document
  1: 22.02 (SE +/- 0.12, N = 3)
  2: 21.07 (SE +/- 0.22, N = 3)
  3: 22.09 (SE +/- 0.21, N = 3)
oneDNN 1.5 (ms, Fewer Is Better)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Harness: IP Batch 1D - Data Type: f32 - Engine: CPU
  1: 4.49627 (SE +/- 0.04422, N = 15; MIN: 3.83)
  2: 4.13153 (SE +/- 0.03723, N = 13; MIN: 3.59)
  3: 4.13129 (SE +/- 0.04076, N = 12; MIN: 3.59)

Harness: IP Batch All - Data Type: f32 - Engine: CPU
  1: 68.05 (SE +/- 0.38, N = 3; MIN: 64.46)
  2: 69.00 (SE +/- 0.04, N = 3; MIN: 63.06)
  3: 68.59 (SE +/- 0.54, N = 3; MIN: 62.18)

Harness: IP Batch 1D - Data Type: u8s8f32 - Engine: CPU
  1: 1.80103 (SE +/- 0.01882, N = 8; MIN: 1.51)
  2: 1.80571 (SE +/- 0.01584, N = 15; MIN: 1.49)
  3: 1.86706 (SE +/- 0.03006, N = 3; MIN: 1.53)

Harness: IP Batch All - Data Type: u8s8f32 - Engine: CPU
  1: 27.64 (SE +/- 0.07, N = 3; MIN: 25.19)
  2: 27.37 (SE +/- 0.24, N = 3; MIN: 24.77)
  3: 27.35 (SE +/- 0.04, N = 3; MIN: 25.36)

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
  1: 17.12 (SE +/- 0.13, N = 3; MIN: 16.89)
  2: 17.14 (SE +/- 0.15, N = 3; MIN: 16.92)
  3: 17.20 (SE +/- 0.24, N = 4; MIN: 16.87)

Harness: Deconvolution Batch deconv_1d - Data Type: f32 - Engine: CPU
  1: 4.89651 (SE +/- 0.04290, N = 15; MIN: 3.92)
  2: 4.67392 (SE +/- 0.05834, N = 12; MIN: 3.78)
  3: 4.70756 (SE +/- 0.04582, N = 3; MIN: 4.21)

Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU
  1: 6.64261 (SE +/- 0.02212, N = 3; MIN: 6.44)
  2: 6.47349 (SE +/- 0.00067, N = 3; MIN: 6.31)
  3: 6.61172 (SE +/- 0.09853, N = 4; MIN: 6.32)

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
  1: 16.44 (SE +/- 0.01, N = 3; MIN: 16.35)
  2: 16.74 (SE +/- 0.27, N = 3; MIN: 16.38)
  3: 16.60 (SE +/- 0.16, N = 3; MIN: 16.35)

Harness: Deconvolution Batch deconv_1d - Data Type: u8s8f32 - Engine: CPU
  1: 5.70594 (SE +/- 0.03901, N = 3; MIN: 4.96)
  2: 5.59357 (SE +/- 0.01208, N = 3; MIN: 4.91)
  3: 5.70690 (SE +/- 0.05691, N = 3; MIN: 4.91)

Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32 - Engine: CPU
  1: 3.58701 (SE +/- 0.03967, N = 3; MIN: 3.47)
  2: 3.24189 (SE +/- 0.05394, N = 3; MIN: 3.1)
  3: 3.30402 (SE +/- 0.03320, N = 15; MIN: 3.1)

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
  1: 321.80 (SE +/- 3.32, N = 3; MIN: 304.56)
  2: 326.96 (SE +/- 2.20, N = 3; MIN: 315.72)
  3: 324.34 (SE +/- 1.53, N = 3; MIN: 307.8)

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
  1: 149.62 (SE +/- 1.81, N = 3; MIN: 143.31)
  2: 139.87 (SE +/- 0.87, N = 3; MIN: 136.06)
  3: 136.15 (SE +/- 1.35, N = 3; MIN: 130.37)

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
  1: 3.99911 (SE +/- 0.00919, N = 3; MIN: 3.93)
  2: 4.06970 (SE +/- 0.04507, N = 3; MIN: 3.86)
  3: 3.87614 (SE +/- 0.00529, N = 3; MIN: 3.82)

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
  1: 2.58081 (SE +/- 0.02285, N = 3; MIN: 2.21)
  2: 2.57926 (SE +/- 0.03056, N = 3; MIN: 2.16)
  3: 2.58232 (SE +/- 0.02621, N = 3; MIN: 2.21)
OpenSSL 1.1.1 (Signs Per Second, More Is Better)
(CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl

RSA 4096-bit Performance
  1: 2431.8 (SE +/- 27.67, N = 3)
  2: 2467.9 (SE +/- 18.02, N = 3)
  3: 2474.6 (SE +/- 14.41, N = 3)
PostgreSQL pgbench 13.0
(CC) gcc options: -fno-strict-aliasing -fwrapv -O2 -lpgcommon -lpgport -lpq -lpthread -lrt -ldl -lm

Scaling Factor: 1 - Clients: 1 - Mode: Read Only (TPS, More Is Better)
  1: 33414 (SE +/- 77.57, N = 3)
  2: 33267 (SE +/- 373.98, N = 3)
  3: 32579 (SE +/- 92.20, N = 3)

Scaling Factor: 1 - Clients: 1 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  1: 0.030 (SE +/- 0.000, N = 3)
  2: 0.030 (SE +/- 0.000, N = 3)
  3: 0.031 (SE +/- 0.000, N = 3)

Scaling Factor: 1 - Clients: 1 - Mode: Read Write (TPS, More Is Better)
  1: 480 (SE +/- 5.79, N = 3)
  2: 429 (SE +/- 22.30, N = 12)
  3: 442 (SE +/- 17.55, N = 12)

Scaling Factor: 1 - Clients: 1 - Mode: Read Write - Average Latency (ms, Fewer Is Better)
  1: 2.084 (SE +/- 0.025, N = 3)
  2: 2.412 (SE +/- 0.138, N = 12)
  3: 2.312 (SE +/- 0.108, N = 12)

Scaling Factor: 1 - Clients: 50 - Mode: Read Only (TPS, More Is Better)
  1: 289401 (SE +/- 1001.63, N = 3)
  2: 296088 (SE +/- 4333.65, N = 3)
  3: 291995 (SE +/- 3723.51, N = 3)

Scaling Factor: 1 - Clients: 50 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  1: 0.173 (SE +/- 0.001, N = 3)
  2: 0.169 (SE +/- 0.003, N = 3)
  3: 0.171 (SE +/- 0.002, N = 3)

Scaling Factor: 1 - Clients: 100 - Mode: Read Only (TPS, More Is Better)
  1: 274463 (SE +/- 1999.95, N = 3)
  2: 281291 (SE +/- 3007.11, N = 3)
  3: 278019 (SE +/- 753.18, N = 3)

Scaling Factor: 1 - Clients: 100 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  1: 0.365 (SE +/- 0.003, N = 3)
  2: 0.356 (SE +/- 0.004, N = 3)
  3: 0.360 (SE +/- 0.001, N = 3)

Scaling Factor: 1 - Clients: 250 - Mode: Read Only (TPS, More Is Better)
  1: 254678 (SE +/- 853.38, N = 3)
  2: 251429 (SE +/- 3217.19, N = 5)
  3: 249348 (SE +/- 3216.57, N = 5)

Scaling Factor: 1 - Clients: 250 - Mode: Read Only - Average Latency (ms, Fewer Is Better)
  1: 0.982 (SE +/- 0.003, N = 3)
  2: 0.995 (SE +/- 0.013, N = 5)
  3: 1.004 (SE +/- 0.013, N = 5)

Scaling Factor: 1 - Clients: 50 - Mode: Read Write (TPS, More Is Better)
  1: 611 (SE +/- 8.58, N = 3)
  2: 596 (SE +/- 0.22, N = 3)
  3: 604 (SE +/- 4.93, N = 3)

Scaling Factor: 1 - Clients: 50 - Mode: Read Write - Average Latency (ms, Fewer Is Better)
  1: 81.87 (SE +/- 1.15, N = 3)
  2: 83.89 (SE +/- 0.03, N = 3)
  3: 82.73 (SE +/- 0.67, N = 3)

Scaling Factor: 1 - Clients: 100 - Mode: Read Write (TPS, More Is Better)
  1: 568 (SE +/- 1.34, N = 3)
  2: 578 (SE +/- 5.30, N = 3)
  3: 578 (SE +/- 3.40, N = 3)

Scaling Factor: 1 - Clients: 100 - Mode: Read Write - Average Latency (ms, Fewer Is Better)
  1: 176.13 (SE +/- 0.42, N = 3)
  2: 173.12 (SE +/- 1.60, N = 3)
  3: 173.24 (SE +/- 1.01, N = 3)

Scaling Factor: 1 - Clients: 250 - Mode: Read Write (TPS, More Is Better)
  1: 599 (SE +/- 11.83, N = 15)
  2: 591 (SE +/- 6.84, N = 15)
  3: 557 (SE +/- 5.55, N = 3)

Scaling Factor: 1 - Clients: 250 - Mode: Read Write - Average Latency (ms, Fewer Is Better)
  1: 419.73 (SE +/- 7.83, N = 15)
  2: 424.26 (SE +/- 4.74, N = 15)
  3: 449.11 (SE +/- 4.44, N = 3)
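
The pgbench TPS and Average Latency figures above look like two views of the same measurement: with a fixed number of clients, the average latency in milliseconds is roughly 1000 x clients / TPS. The Python sketch below checks a few of the reported run 1 pairs against that relationship. The relationship is an assumption (it is not stated in this result file), and `expected_latency_ms` and `relative_error` are hypothetical helper names introduced only for illustration.

```python
def expected_latency_ms(clients: int, tps: float) -> float:
    """Latency implied by throughput, ASSUMING each client always has
    exactly one transaction in flight (not stated in the result file)."""
    return 1000.0 * clients / tps


def relative_error(implied: float, reported: float) -> float:
    """Absolute difference between implied and reported, relative to reported."""
    return abs(implied - reported) / reported


# (clients, reported TPS, reported average latency in ms), run 1 values
reported_pairs = [
    (1, 33414, 0.030),    # Read Only
    (50, 289401, 0.173),  # Read Only
    (50, 611, 81.87),     # Read Write
    (250, 599, 419.73),   # Read Write
]

for clients, tps, latency in reported_pairs:
    implied = expected_latency_ms(clients, tps)
    err = relative_error(implied, latency)
    print(f"clients={clients:>3} tps={tps:>6} "
          f"implied={implied:.3f} ms reported={latency} ms "
          f"rel.err={err:.2%}")
```

Small deviations are expected, since the reported values are rounded and averaged over N runs.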
PyPerformance 1.0.0 (Milliseconds, Fewer Is Better)

Benchmark: go
  1: 200
  2: 199
  3: 200

Benchmark: 2to3
  1: 264
  2: 261
  3: 261

Benchmark: chaos
  1: 85.7 (SE +/- 0.07, N = 3)
  2: 85.8 (SE +/- 0.09, N = 3)
  3: 86.1 (SE +/- 0.07, N = 3)

Benchmark: float
  1: 90.0 (SE +/- 0.03, N = 3)
  2: 90.0 (SE +/- 0.06, N = 3)
  3: 90.1 (SE +/- 0.36, N = 3)

Benchmark: nbody
  1: 103
  2: 103
  3: 103

Benchmark: pathlib
  1: 14.4 (SE +/- 0.00, N = 3)
  2: 14.3 (SE +/- 0.00, N = 3)
  3: 14.5 (SE +/- 0.09, N = 3)

Benchmark: raytrace
  1: 374
  2: 375
  3: 377

Benchmark: json_loads
  1: 21.3 (SE +/- 0.09, N = 3)
  2: 21.2 (SE +/- 0.00, N = 3)
  3: 21.2 (SE +/- 0.03, N = 3)

Benchmark: crypto_pyaes
  1: 85.8 (SE +/- 0.03, N = 3)
  2: 85.8 (SE +/- 0.06, N = 3)
  3: 85.9 (SE +/- 0.03, N = 3)

Benchmark: regex_compile
  1: 138
  2: 137
  3: 139

Benchmark: python_startup
  1: 6.80 (SE +/- 0.04, N = 3)
  2: 6.74 (SE +/- 0.00, N = 3)
  3: 6.75 (SE +/- 0.00, N = 3)

Benchmark: django_template
  1: 38.6 (SE +/- 0.09, N = 3)
  2: 38.9 (SE +/- 0.06, N = 3)
  3: 39.2 (SE +/- 0.09, N = 3)

Benchmark: pickle_pure_python
  1: 338
  2: 339
  3: 339
RawTherapee (Seconds, Fewer Is Better)
RawTherapee, version 5.8, command line.

Total Benchmark Time
  1: 61.75 (SE +/- 0.13, N = 3)
  2: 61.85 (SE +/- 0.28, N = 3)
  3: 62.12 (SE +/- 0.05, N = 3)

RNNoise 2020-06-28 (Seconds, Fewer Is Better)
(CC) gcc options: -O2 -pedantic -fvisibility=hidden
  1: 21.59 (SE +/- 0.02, N = 3)
  2: 21.59 (SE +/- 0.01, N = 3)
  3: 21.60 (SE +/- 0.04, N = 3)
Rodinia 3.1 (Seconds, Fewer Is Better)
(CXX) g++ options: -O2 -lOpenCL

Test: OpenMP LavaMD
  1: 276.22 (SE +/- 1.21, N = 3)
  2: 273.16 (SE +/- 0.19, N = 3)
  3: 296.27 (SE +/- 3.61, N = 3)

Test: OpenCL Myocyte
  1: 34.79 (SE +/- 0.13, N = 3)
  2: 34.51 (SE +/- 0.05, N = 3)
  3: 35.76 (SE +/- 0.36, N = 8)

Test: OpenMP HotSpot3D
  1: 92.67 (SE +/- 1.38, N = 3)
  2: 89.73 (SE +/- 1.37, N = 3)
  3: 213.52 (SE +/- 13.20, N = 12)

Test: OpenMP Leukocyte
  1: 140.31 (SE +/- 1.02, N = 3)
  2: 138.39 (SE +/- 0.74, N = 3)
  3: 214.31 (SE +/- 0.84, N = 3)

Test: OpenMP CFD Solver
  1: 22.65 (SE +/- 0.23, N = 3)
  2: 22.74 (SE +/- 0.15, N = 3)
  3: 265.62 (SE +/- 37.41, N = 7)

Test: OpenMP Streamcluster
  1: 19.56 (SE +/- 0.06, N = 3)
  2: 19.63 (SE +/- 0.08, N = 3)
  3: 69.37 (SE +/- 7.66, N = 12)
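
In the Rodinia results above, run 3 is far slower than runs 1 and 2 on several OpenMP tests. The Python sketch below puts a number on that gap using only the reported means: it divides the run 3 time by the average of runs 1 and 2. The `slowdown` helper is a hypothetical name introduced for illustration, and the ratio says nothing about the cause of the difference.

```python
def slowdown(run1: float, run2: float, run3: float) -> float:
    """How many times slower run 3 is than the mean of runs 1 and 2."""
    baseline = (run1 + run2) / 2.0
    return run3 / baseline


# Reported mean times in seconds for runs 1, 2 and 3 (Fewer Is Better)
rodinia_seconds = {
    "OpenMP LavaMD": (276.22, 273.16, 296.27),
    "OpenCL Myocyte": (34.79, 34.51, 35.76),
    "OpenMP HotSpot3D": (92.67, 89.73, 213.52),
    "OpenMP Leukocyte": (140.31, 138.39, 214.31),
    "OpenMP CFD Solver": (22.65, 22.74, 265.62),
    "OpenMP Streamcluster": (19.56, 19.63, 69.37),
}

ratios = {name: slowdown(*times) for name, times in rodinia_seconds.items()}

# List the tests from the largest to the smallest run 3 slowdown.
for name, ratio in sorted(ratios.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<22} run 3 is {ratio:.2f}x the mean of runs 1 and 2")
```

The large standard errors and higher run counts reported for run 3 on these tests (for example SE +/- 37.41 with N = 7 for the CFD Solver) suggest that the run 3 means should be read with caution.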
Sunflow Rendering System 0.07.2 (Seconds, Fewer Is Better)

Global Illumination + Image Synthesis
  1: 1.138 (SE +/- 0.010, N = 15; MIN: 0.97 / MAX: 1.48)
  2: 1.132 (SE +/- 0.011, N = 15; MIN: 0.94 / MAX: 1.51)
  3: 1.127 (SE +/- 0.011, N = 15; MIN: 0.95 / MAX: 1.48)
TensorFlow Lite 2020-08-23 (Microseconds, Fewer Is Better)

Model: SqueezeNet
  1: 227852 (SE +/- 345.32, N = 3)
  2: 226239 (SE +/- 496.71, N = 3)
  3: 227599 (SE +/- 627.42, N = 3)

Model: Inception V4
  1: 3269353 (SE +/- 1176.00, N = 3)
  2: 3260710 (SE +/- 1202.26, N = 3)
  3: 3270433 (SE +/- 369.56, N = 3)

Model: NASNet Mobile
  1: 201015 (SE +/- 363.56, N = 3)
  2: 199504 (SE +/- 120.29, N = 3)
  3: 201265 (SE +/- 1000.76, N = 3)

Model: Mobilenet Float
  1: 153919 (SE +/- 387.33, N = 3)
  2: 153221 (SE +/- 330.84, N = 3)
  3: 154362 (SE +/- 188.88, N = 3)

Model: Mobilenet Quant
  1: 157250 (SE +/- 112.51, N = 3)
  2: 156299 (SE +/- 261.09, N = 3)
  3: 157719 (SE +/- 197.44, N = 3)

Model: Inception ResNet V2
  1: 2954177 (SE +/- 93.33, N = 3)
  2: 2946580 (SE +/- 2042.95, N = 3)
  3: 2957670 (SE +/- 867.58, N = 3)
Tesseract OCR 4.1.1 (Seconds, Fewer Is Better)

Time To OCR 7 Images
  1: 20.13 (SE +/- 0.01, N = 3)
  2: 20.32 (SE +/- 0.04, N = 3)
  3: 20.22 (SE +/- 0.02, N = 3)

Timed HMMer Search 3.3.1 (Seconds, Fewer Is Better)
(CC) gcc options: -O3 -pthread -lhmmer -leasel -lm

Pfam Database Search
  1: 100.12 (SE +/- 0.05, N = 3)
  2: 100.24 (SE +/- 0.11, N = 3)
  3: 100.83 (SE +/- 0.30, N = 3)

Timed Linux Kernel Compilation 5.4 (Seconds, Fewer Is Better)

Time To Compile
  1: 98.18 (SE +/- 0.05, N = 3)
  2: 98.15 (SE +/- 0.19, N = 3)

Timed LLVM Compilation 10.0 (Seconds, Fewer Is Better)

Time To Compile
  1: 771.21 (SE +/- 2.53, N = 3)
  2: 762.88 (SE +/- 1.50, N = 3)
TNN 0.2.3 (ms, Fewer Is Better)
(CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl

Target: CPU - Model: MobileNet v2
  1: 286.04 (SE +/- 1.30, N = 3; MIN: 283.63 / MAX: 356.82)
  2: 287.56 (SE +/- 0.68, N = 3; MIN: 285.93 / MAX: 307.5)
  3: 284.03 (SE +/- 0.57, N = 3; MIN: 282.82 / MAX: 325.11)

Target: CPU - Model: SqueezeNet v1.1
  1: 269.71 (SE +/- 0.03, N = 3; MIN: 268.38 / MAX: 284.03)
  2: 269.66 (SE +/- 0.07, N = 3; MIN: 268.59 / MAX: 282.48)
  3: 269.44 (SE +/- 0.13, N = 3; MIN: 268.04 / MAX: 281.47)
WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
(CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff

Encode Settings: Default
  1: 1.313 (SE +/- 0.000, N = 3)
  2: 1.315 (SE +/- 0.003, N = 3)
  3: 1.318 (SE +/- 0.006, N = 3)

Encode Settings: Quality 100
  1: 2.077 (SE +/- 0.002, N = 3)
  2: 2.076 (SE +/- 0.000, N = 3)
  3: 2.078 (SE +/- 0.001, N = 3)

Encode Settings: Quality 100, Lossless
  1: 15.61 (SE +/- 0.01, N = 3)
  2: 15.60 (SE +/- 0.01, N = 3)
  3: 15.69 (SE +/- 0.06, N = 3)

Encode Settings: Quality 100, Highest Compression
  1: 6.385 (SE +/- 0.014, N = 3)
  2: 6.386 (SE +/- 0.012, N = 3)
  3: 6.382 (SE +/- 0.013, N = 3)

Encode Settings: Quality 100, Lossless, Highest Compression
  1: 34.70 (SE +/- 0.23, N = 3)
  2: 34.84 (SE +/- 0.14, N = 3)
  3: 34.61 (SE +/- 0.07, N = 3)
WireGuard + Linux Networking Stack Stress Test (Seconds, Fewer Is Better)
  1: 160.51 (SE +/- 0.83, N = 3)
  2: 158.36 (SE +/- 0.95, N = 3)
  3: 159.75 (SE +/- 0.79, N = 3)
x265 3.4 (Frames Per Second, More Is Better)
(CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma

Video Input: Bosphorus 4K
  1: 12.37 (SE +/- 0.07, N = 3)
  2: 12.39 (SE +/- 0.09, N = 3)
  3: 12.34 (SE +/- 0.02, N = 3)

Video Input: Bosphorus 1080p
  1: 54.49 (SE +/- 0.06, N = 3)
  2: 56.14 (SE +/- 0.96, N = 3)
  3: 54.42 (SE +/- 0.78, N = 3)
Phoronix Test Suite v10.8.4