3100 compare: AMD Ryzen 3 3100 4-Core testing with an ASUS ROG CROSSHAIR VIII HERO (2702 BIOS) and AMD Radeon RX 56/64 8GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2011296-HA-3100COMPA40&rdt&grr .
3100 compare: system details (configurations 1, 2, 3)
Processor: AMD Ryzen 3 3100 4-Core @ 3.60GHz (4 Cores / 8 Threads)
Motherboard: ASUS ROG CROSSHAIR VIII HERO (2702 BIOS)
Chipset: AMD Starship/Matisse
Memory: 16GB
Disk: 1000GB Sabrent Rocket 4.0 1TB
Graphics: AMD Radeon RX 56/64 8GB (1590/800MHz)
Audio: AMD Vega 10 HDMI Audio
Monitor: LG Ultra HD
Network: Realtek RTL8125 2.5GbE + Intel I211
OS: Ubuntu 20.10
Kernel: 5.8.0-29-generic (x86_64)
Desktop: GNOME Shell 3.38.1
Display Server: X Server 1.20.9
Display Driver: amdgpu 19.1.0
OpenGL: 4.6 Mesa 20.2.1 (LLVM 11.0.0)
Vulkan: 1.2.131
Compiler: GCC 10.2.0
File-System: ext4
Screen Resolution: 1920x1080
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8701021
Graphics Details - GLAMOR
Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Java Details - 2, 3: OpenJDK Runtime Environment (build 11.0.9.1+1-Ubuntu-0ubuntu1.20.10)
Python Details - 2, 3: Python 3.8.6
3100 compare
Test | 1 | 2 | 3
kripke | 6926419 | 6434795 |
build-llvm: Time To Compile | 1215.501 | 1214.944 | 1216.085
basis: UASTC Level 2 + RDO Post-Processing | 702.907 | 701.853 | 704.133
lczero: Eigen | 730 | 688 | 638
lczero: BLAS | 775 | 730 | 649
ai-benchmark: Device AI Score | 1543 | 1549 | 1553
ai-benchmark: Device Training Score | 788 | 788 | 790
ai-benchmark: Device Inference Score | 755 | 761 | 763
astcenc: Exhaustive | 425.23 | 425.58 | 425.39
hpcc: G-HPL | 120.90700 | |
blender: BMW27 - CPU-Only | 323.46 | 324.80 | 325.17
gromacs: Water Benchmark | 0.542 | 0.550 | 0.552
ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.3 - Default - RaiNyMore2 | 236.14 | 214.08 | 203.50
mocassin: Dust 2D tau100.0 | 288 | 288 | 288
brl-cad: VGR Performance Metric | 57419 | 56730 | 56960
mlpack: scikit_qda | 74.11 | 76.91 | 75.29
numpy | 321.42 | 320.60 | 321.61
hint: FLOAT | 340004918.53832 | 339696862.29595 | 339643088.46449
namd: ATPase Simulation - 327,506 Atoms | 4.49500 | 4.48209 | 4.48308
kvazaar: Bosphorus 4K - Medium | 2.91 | 2.92 | 2.92
asmfish: 1024 Hash Memory, 26 Depth | 13797562 | 13866946 | 13730556
build-linux-kernel: Time To Compile | 159.301 | 160.558 | 160.149
tensorflow-lite: Inception ResNet V2 | 5048647 | 5046933 | 5048433
tensorflow-lite: Inception V4 | 5572917 | 5574923 | 5571717
avifenc: 0 | 146.262 | 149.263 | 145.653
onednn: IP Batch All - f32 - CPU | 96.7410 | 99.7379 | 99.9683
caffe: GoogleNet - CPU - 100 | 120634 | 120679 | 120291
embree: Pathtracer ISPC - Crown | 5.1754 | 5.1563 | 5.1464
compress-zstd: 19 | 21.1 | 21.2 | 21.2
basis: UASTC Level 3 | 114.270 | 114.186 | 114.261
embree: Pathtracer - Crown | 5.3673 | 5.3310 | 5.3162
hmmer: Pfam Database Search | 111.354 | 111.303 | 111.494
glmark2: 1920 x 1080 | 6530 | 6547 | 6559
mnn: inception-v3 | 42.573 | 42.444 | 42.545
mnn: mobilenet-v1-1.0 | 5.819 | 5.824 | 5.811
mnn: MobileNetV2_224 | 4.937 | 4.936 | 4.910
mnn: resnet-v2-50 | 34.931 | 34.995 | 35.167
mnn: SqueezeNetV1.0 | 8.853 | 8.848 | 8.874
embree: Pathtracer ISPC - Asian Dragon | 6.1185 | 6.1354 | 6.1401
embree: Pathtracer - Asian Dragon | 6.2294 | 6.1430 | 6.2800
stockfish: Total Time | 9840561 | 10050125 | 9864045
avifenc: 2 | 86.460 | 88.209 | 86.293
pyperformance: raytrace | 480 | 477 | 476
rawtherapee: Total Benchmark Time | 79.955 | 79.418 | 80.349
ncnn: CPU - yolov4-tiny | 32.96 | 32.84 | 33.20
ncnn: CPU - resnet50 | 37.32 | 37.05 | 37.31
ncnn: CPU - alexnet | 18.95 | 18.57 | 18.81
ncnn: CPU - resnet18 | 17.83 | 17.75 | 18.00
ncnn: CPU - vgg16 | 72.57 | 72.10 | 72.72
ncnn: CPU - googlenet | 19.42 | 19.30 | 19.51
ncnn: CPU - blazeface | 1.95 | 1.92 | 1.92
ncnn: CPU - efficientnet-b0 | 9.21 | 9.1 | 9.16
ncnn: CPU - mnasnet | 6.32 | 6.27 | 6.30
ncnn: CPU - shufflenet-v2 | 4.79 | 4.82 | 4.84
ncnn: CPU-v3-v3 - mobilenet-v3 | 6.18 | 6.11 | 6.25
ncnn: CPU-v2-v2 - mobilenet-v2 | 6.66 | 6.64 | 6.68
ncnn: CPU - mobilenet | 23.82 | 23.71 | 23.71
ncnn: CPU - squeezenet | 21.87 | 21.67 | 21.83
x265: Bosphorus 4K | 7.77 | 7.78 | 7.75
kvazaar: Bosphorus 4K - Very Fast | 8.03 | 8.03 | 8.01
pyperformance: python_startup | 8.02 | 8.08 | 8.13
geekbench: CPU Multi Core | 5227 | 5248 | 5256
geekbench: GPU Vulkan | 36246 | 36793 | 36985
basis: ETC1S | 62.795 | 62.694 | 62.769
hugin: Panorama Photo Assistant + Stitching Time | 61.999 | 61.972 | 62.706
indigobench: CPU - Bedroom | 0.930 | 0.935 | 0.931
indigobench: CPU - Supercar | 1.954 | 1.958 | 1.958
tensorflow-lite: SqueezeNet | 387017 | 386977 | 387009
tensorflow-lite: Mobilenet Float | 259621 | 259600 | 259575
tensorflow-lite: Mobilenet Quant | 266426 | 266319 | 266346
tensorflow-lite: NASNet Mobile | 280959 | 279167 | 279158
ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.3 - Default - Multeasymap | 334.97 | 331.87 | 333.51
mlpack: scikit_linearridgeregression | 3.19 | 3.20 | 3.22
compress-lz4: 9 - Decompression Speed | 10575.6 | 10711.5 | 10653.0
compress-lz4: 9 - Compression Speed | 52.50 | 51.20 | 51.45
pyperformance: 2to3 | 320 | 321 | 321
basis: UASTC Level 2 | 58.589 | 58.628 | 58.621
compress-lz4: 3 - Decompression Speed | 10496.6 | 10635.3 | 10588.0
compress-lz4: 3 - Compression Speed | 53.56 | 53.60 | 52.55
astcenc: Thorough | 52.39 | 52.46 | 52.40
rav1e: 5 | 1.050 | 1.047 | 1.058
rav1e: 1 | 0.372 | 0.365 | 0.361
geekbench: CPU Single Core | 1223 | 1228 | 1228
redis: GET | 2310604.62 | 2095708.19 | 2135888.90
mlpack: scikit_ica | 49.12 | 49.12 | 49.14
caffe: AlexNet - CPU - 100 | 49082 | 49047 | 48962
onednn: Deconvolution Batch deconv_1d - f32 - CPU | 8.05221 | 9.74749 | 9.82517
kvazaar: Bosphorus 1080p - Medium | 12.72 | 12.75 | 12.76
pyperformance: go | 249 | 248 | 248
rav1e: 6 | 1.404 | 1.370 | 1.393
espeak: Text-To-Speech Synthesis | 30.991 | 30.941 | 30.956
kvazaar: Bosphorus 4K - Ultra Fast | 14.55 | 14.63 | 14.60
aom-av1: Speed 0 Two-Pass | 0.26 | 0.25 | 0.25
mpv: Big Buck Bunny Sunflower 4K - Software Only | 454.91 | 457.51 | 456.91
redis: SADD | 1951987.09 | 1947660.53 | 2004338.25
pyperformance: django_template | 46.9 | 46.7 | 46.8
onednn: Recurrent Neural Network Training - f32 - CPU | 503.700 | 491.110 | 462.801
onednn: Recurrent Neural Network Inference - f32 - CPU | 249.371 | 262.368 | 247.156
aom-av1: Speed 6 Realtime | 16.93 | 16.66 | 16.91
onednn: IP Batch 1D - f32 - CPU | 7.43267 | 8.42072 | 8.81065
redis: LPOP | 2624535.48 | 2159676.18 | 1559670.96
pyperformance: regex_compile | 169 | 170 | 168
phpbench: PHP Benchmark Suite | 604079 | 604512 | 604322
vkfft | 18432 | 18479 | 18478
ocrmypdf: Processing 60 Page PDF Document | 32.164 | 32.233 | 32.264
compress-lz4: 1 - Decompression Speed | 11116.3 | 11146.3 | 11202.6
compress-lz4: 1 - Compression Speed | 9763.10 | 10001.81 | 9959.06
aom-av1: Speed 6 Two-Pass | 3.43 | 3.42 | 3.43
pyperformance: pathlib | 17.7 | 17.7 | 17.8
rav1e: 10 | 3.041 | 3.068 | 3.087
compress-zstd: 3 | 3687.7 | 3632.0 | 3687.7
pyperformance: pickle_pure_python | 440 | 439 | 438
crafty: Elapsed Time | 7738146 | 7573497 | 7394699
pyperformance: json_loads | 27.2 | 27.3 | 27.3
mlpack: scikit_svm | 21.43 | 21.44 | 21.52
redis: LPUSH | 1437805.00 | 1413366.45 | 1459398.53
redis: SET | 1725875.75 | 1754885.38 | 1695298.03
tesseract-ocr: Time To OCR 7 Images | 24.567 | 24.505 | 24.509
pybench: Total For Average Test Times | 1020 | 1017 | 1021
ncnn: Vulkan GPU - yolov4-tiny | 11.26 | 11.29 | 11.30
ncnn: Vulkan GPU - resnet50 | 6.10 | 6.13 | 6.09
ncnn: Vulkan GPU - alexnet | 4.09 | 4.10 | 4.09
ncnn: Vulkan GPU - resnet18 | 2.16 | 2.17 | 2.17
ncnn: Vulkan GPU - vgg16 | 10.82 | 10.71 | 10.61
ncnn: Vulkan GPU - googlenet | 5.74 | 5.70 | 5.69
ncnn: Vulkan GPU - blazeface | 0.91 | 0.88 | 0.87
ncnn: Vulkan GPU - efficientnet-b0 | 9.33 | 9.34 | 9.36
ncnn: Vulkan GPU - mnasnet | 2.78 | 2.76 | 2.76
ncnn: Vulkan GPU - shufflenet-v2 | 2.32 | 2.33 | 2.32
ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 | 3.57 | 3.58 | 3.58
ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 | 2.57 | 2.58 | 2.58
ncnn: Vulkan GPU - mobilenet | 8.28 | 8.25 | 8.27
ncnn: Vulkan GPU - squeezenet | 4.84 | 4.85 | 4.85
aom-av1: Speed 4 Two-Pass | 2.18 | 2.17 | 2.18
sunflow: Global Illumination + Image Synthesis | 1.948 | 1.965 | 1.996
pyperformance: chaos | 111 | 111 | 111
pyperformance: float | 117 | 116 | 117
pyperformance: nbody | 117 | 117 | 117
pyperformance: crypto_pyaes | 116 | 116 | 116
openssl: RSA 4096-bit Performance | 1145.0 | 1144.3 | 1144.7
rnnoise | 20.022 | 20.015 | 20.028
kvazaar: Bosphorus 1080p - Very Fast | 32.34 | 32.38 | 32.38
tnn: CPU - MobileNet v2 | 263.815 | 264.303 | 264.123
tnn: CPU - SqueezeNet v1.1 | 260.260 | 260.451 | 260.441
dolfyn: Computational Fluid Dynamics | 17.756 | 17.704 | 17.680
x265: Bosphorus 1080p | 34.71 | 35.12 | 34.66
aom-av1: Speed 8 Realtime | 37.09 | 36.18 | 37.07
betsy: ETC2 RGB - Highest | 6.688 | 6.476 | 6.470
mpv: Big Buck Bunny Sunflower 1080p - Software Only | 1300.12 | 1305.29 | 1304.01
lammps: Rhodopsin Protein | 3.249 | 3.329 | 3.340
betsy: ETC1 - Highest | 5.799 | 5.608 | 5.602
darktable: Boat - CPU-only | 12.620 | 12.477 | 12.509
astcenc: Medium | 8.30 | 8.3 | 8.30
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU | 4.73057 | 4.03910 | 4.06739
kvazaar: Bosphorus 1080p - Ultra Fast | 57.37 | 57.57 | 57.47
basis: UASTC Level 0 | 9.297 | 9.267 | 9.267
darktable: Masskrug - CPU-only | 8.019 | 7.963 | 8.026
darktable: Server Room - CPU-only | 6.318 | 6.274 | 6.292
astcenc: Fast | 7.39 | 7.39 | 7.37
waifu2x-ncnn: 2x - 3 - Yes | 7.158 | 7.184 | 7.183
avifenc: 8 | 6.631 | 6.575 | 6.584
onednn: Convolution Batch Shapes Auto - f32 - CPU | 20.6452 | 19.1978 | 19.4144
yquake2: Software CPU - 1920 x 1080 | 110.9 | 109.5 | 110.4
avifenc: 10 | 6.100 | 6.037 | 6.064
onednn: Deconvolution Batch deconv_3d - f32 - CPU | 12.0299 | 13.1798 | 12.8921
ffte: N=256, 3D Complex FFT Routine | 25405.988379614 | 25808.658192384 | 25835.646713364
yquake2: OpenGL 1.x - 1920 x 1080 | 737.4 | 747.0 | 730.4
yquake2: OpenGL 3.x - 1920 x 1080 | 977.8 | 977.8 | 975.8
darktable: Server Rack - CPU-only | 0.230 | 0.229 | 0.234
hpcc: Max Ping Pong Bandwidth | 14400.766 | |
hpcc: Rand Ring Bandwidth | 4.96131 | |
hpcc: Rand Ring Latency | 0.39400 | |
hpcc: G-Rand Access | 0.02004 | |
hpcc: EP-STREAM Triad | 6.74179 | |
hpcc: G-Ptrans | 0.65366 | |
hpcc: EP-DGEMM | 55.50950 | |
hpcc: G-Ffte | 3.30686 | |
Kripke 1.2.4 (Throughput FoM, More Is Better)
  Run 1: 6926419
  Run 2: 6434795
  SE +/- 717516.50, N = 2 (single SE entry; the export does not identify the run)
  1. (CXX) g++ options: -O3 -fopenmp
Timed LLVM Compilation 10.0 - Time To Compile (Seconds, Fewer Is Better)
  Run 1: 1215.50 (SE +/- 1.95, N = 3)
  Run 2: 1214.94 (SE +/- 1.32, N = 3)
  Run 3: 1216.09 (SE +/- 0.35, N = 3)
Basis Universal 1.12 - Settings: UASTC Level 2 + RDO Post-Processing (Seconds, Fewer Is Better)
  Run 1: 702.91 (SE +/- 1.79, N = 3)
  Run 2: 701.85 (SE +/- 0.97, N = 3)
  Run 3: 704.13 (SE +/- 1.97, N = 3)
  1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
LeelaChessZero 0.26 - Backend: Eigen (Nodes Per Second, More Is Better)
  Run 1: 730 (SE +/- 4.41, N = 3)
  Run 2: 688 (SE +/- 9.06, N = 5)
  Run 3: 638 (SE +/- 7.37, N = 9)
  1. (CXX) g++ options: -flto -pthread
LeelaChessZero 0.26 - Backend: BLAS (Nodes Per Second, More Is Better)
  Run 1: 775 (SE +/- 7.81, N = 3)
  Run 2: 730 (SE +/- 10.34, N = 4)
  Run 3: 649 (SE +/- 7.68, N = 9)
  1. (CXX) g++ options: -flto -pthread
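The LeelaChessZero figures differ between runs by more than their reported standard errors. A minimal Python sketch of one way to quantify that follows. It is an illustration, not part of the Phoronix Test Suite: the helper name `z_score` is hypothetical, the SE entries are assumed to be listed in run order 1, 2, 3, and the runs are assumed to be independent so their standard errors combine in quadrature.

```python
import math


def z_score(mean_a, se_a, mean_b, se_b):
    """Difference of two means in units of their combined standard error.

    Hypothetical helper; assumes the two runs are independent.
    """
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# LeelaChessZero 0.26, nodes per second, as (mean, SE) per run.
# Assumption: the SE entries in the export follow run order 1, 2, 3.
eigen = {"1": (730, 4.41), "2": (688, 9.06), "3": (638, 7.37)}
blas = {"1": (775, 7.81), "2": (730, 10.34), "3": (649, 7.68)}

for backend, runs in (("Eigen", eigen), ("BLAS", blas)):
    base_mean, base_se = runs["1"]
    for label in ("2", "3"):
        mean, se = runs[label]
        pct = 100.0 * (mean - base_mean) / base_mean
        z = z_score(mean, se, base_mean, base_se)
        print(f"{backend}: run {label} vs run 1: {pct:+.1f}% ({z:+.1f} combined SE)")
```

A magnitude well above 2 suggests the gap is unlikely to be run-to-run noise alone, although with N between 3 and 9 samples this is only a rough guide.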
AI Benchmark Alpha 0.1.2 - Device AI Score (Score, More Is Better)
  Run 1: 1543
  Run 2: 1549
  Run 3: 1553
AI Benchmark Alpha 0.1.2 - Device Training Score (Score, More Is Better)
  Run 1: 788
  Run 2: 788
  Run 3: 790
AI Benchmark Alpha 0.1.2 - Device Inference Score (Score, More Is Better)
  Run 1: 755
  Run 2: 761
  Run 3: 763
ASTC Encoder 2.0 - Preset: Exhaustive (Seconds, Fewer Is Better)
  Run 1: 425.23 (SE +/- 0.13, N = 3)
  Run 2: 425.58 (SE +/- 0.16, N = 3)
  Run 3: 425.39 (SE +/- 0.06, N = 3)
  1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
HPC Challenge 1.5.0 - Test / Class: G-HPL (GFLOPS, More Is Better)
  Run 1: 120.91
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 4.0.3
Blender 2.90 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better)
  Run 1: 323.46 (SE +/- 0.35, N = 3)
  Run 2: 324.80 (SE +/- 0.44, N = 3)
  Run 3: 325.17 (SE +/- 0.71, N = 3)
GROMACS 2020.3 - Water Benchmark (Ns Per Day, More Is Better)
  Run 1: 0.542 (SE +/- 0.000, N = 3)
  Run 2: 0.550 (SE +/- 0.002, N = 3)
  Run 3: 0.552 (SE +/- 0.000, N = 3)
  1. (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: RaiNyMore2 (Frames Per Second, More Is Better)
  Run 1: 236.14 (SE +/- 9.66, N = 15; MIN: 35.16 / MAX: 498.75)
  Run 2: 214.08 (SE +/- 13.98, N = 15; MIN: 33.12 / MAX: 476.64)
  Run 3: 203.50 (SE +/- 13.89, N = 12; MIN: 27.45 / MAX: 451.06)
  1. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
Monte Carlo Simulations of Ionised Nebulae 2019-03-24 - Input: Dust 2D tau100.0 (Seconds, Fewer Is Better)
  Run 1: 288 (SE +/- 0.58, N = 3)
  Run 2: 288 (SE +/- 0.67, N = 3)
  Run 3: 288 (SE +/- 0.88, N = 3)
  1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O3 -O2 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lrt -lz
BRL-CAD 7.30.8 - VGR Performance Metric (VGR Performance Metric, More Is Better)
  Run 1: 57419
  Run 2: 56730
  Run 3: 56960
  1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
Mlpack Benchmark - Benchmark: scikit_qda (Seconds, Fewer Is Better)
  Run 1: 74.11 (SE +/- 0.39, N = 3)
  Run 2: 76.91 (SE +/- 1.17, N = 12)
  Run 3: 75.29 (SE +/- 1.16, N = 3)
Numpy Benchmark (Score, More Is Better)
  Run 1: 321.42 (SE +/- 0.41, N = 3)
  Run 2: 320.60 (SE +/- 0.23, N = 3)
  Run 3: 321.61 (SE +/- 0.63, N = 3)
Hierarchical INTegration 1.0 - Test: FLOAT (QUIPs, More Is Better)
  Run 1: 340004918.54 (SE +/- 215594.66, N = 3)
  Run 2: 339696862.30 (SE +/- 156026.35, N = 3)
  Run 3: 339643088.46 (SE +/- 106732.80, N = 3)
  1. (CC) gcc options: -O3 -march=native -lm
NAMD 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
  Run 1: 4.49500 (SE +/- 0.00243, N = 3)
  Run 2: 4.48209 (SE +/- 0.00085, N = 3)
  Run 3: 4.48308 (SE +/- 0.00219, N = 3)
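NAMD reports days/ns (fewer is better), whereas the GROMACS result in this file reports ns per day (more is better). The two units are reciprocals, so a short Python sketch can put the NAMD figures on the same footing. The function name `days_per_ns_to_ns_per_day` is hypothetical; the input values are the NAMD results listed above. The workloads differ, so the converted numbers are not directly comparable with the GROMACS ones; only the direction of "better" lines up.

```python
def days_per_ns_to_ns_per_day(days_per_ns: float) -> float:
    """Convert a days/ns figure into ns/day (simple reciprocal)."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# NAMD 2.14, ATPase Simulation, days/ns per run as listed above.
namd_days_per_ns = {"1": 4.49500, "2": 4.48209, "3": 4.48308}

for run, value in namd_days_per_ns.items():
    print(f"run {run}: {days_per_ns_to_ns_per_day(value):.4f} ns/day")
```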
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better)
  Run 1: 2.91 (SE +/- 0.00, N = 3)
  Run 2: 2.92 (SE +/- 0.00, N = 3)
  Run 3: 2.92 (SE +/- 0.00, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
asmFish 2018-07-23 - 1024 Hash Memory, 26 Depth (Nodes/second, More Is Better)
  Run 1: 13797562 (SE +/- 23080.42, N = 3)
  Run 2: 13866946 (SE +/- 160984.48, N = 3)
  Run 3: 13730556 (SE +/- 105834.72, N = 3)
Timed Linux Kernel Compilation 5.4 - Time To Compile (Seconds, Fewer Is Better)
  Run 1: 159.30 (SE +/- 0.52, N = 3)
  Run 2: 160.56 (SE +/- 1.04, N = 3)
  Run 3: 160.15 (SE +/- 0.59, N = 3)
TensorFlow Lite 2020-08-23 - Model: Inception ResNet V2 (Microseconds, Fewer Is Better)
  Run 1: 5048647 (SE +/- 308.45, N = 3)
  Run 2: 5046933 (SE +/- 255.76, N = 3)
  Run 3: 5048433 (SE +/- 2537.61, N = 3)
TensorFlow Lite 2020-08-23 - Model: Inception V4 (Microseconds, Fewer Is Better)
  Run 1: 5572917 (SE +/- 421.91, N = 3)
  Run 2: 5574923 (SE +/- 2695.68, N = 3)
  Run 3: 5571717 (SE +/- 230.96, N = 3)
libavif avifenc 0.7.3 - Encoder Speed: 0 (Seconds, Fewer Is Better)
  Run 1: 146.26 (SE +/- 0.63, N = 3)
  Run 2: 149.26 (SE +/- 0.61, N = 3)
  Run 3: 145.65 (SE +/- 0.42, N = 3)
  1. (CXX) g++ options: -O3 -fPIC
oneDNN 1.5 - Harness: IP Batch All - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 96.74 (SE +/- 0.07, N = 3; MIN: 95.29)
  Run 2: 99.74 (SE +/- 0.13, N = 3; MIN: 97.89)
  Run 3: 99.97 (SE +/- 0.89, N = 15; MIN: 95.99)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Caffe 2020-02-13 - Model: GoogleNet - Acceleration: CPU - Iterations: 100 (Milli-Seconds, Fewer Is Better)
  Run 1: 120634 (SE +/- 118.13, N = 3)
  Run 2: 120679 (SE +/- 32.70, N = 3)
  Run 3: 120291 (SE +/- 73.69, N = 3)
  1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
Embree 3.9.0 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better)
  Run 1: 5.1754 (SE +/- 0.0016, N = 3; MIN: 5.15 / MAX: 5.25)
  Run 2: 5.1563 (SE +/- 0.0124, N = 3; MIN: 5.11 / MAX: 5.25)
  Run 3: 5.1464 (SE +/- 0.0320, N = 3; MIN: 5.05 / MAX: 5.25)
Zstd Compression 1.4.5 - Compression Level: 19 (MB/s, More Is Better)
  Run 1: 21.1 (SE +/- 0.00, N = 3)
  Run 2: 21.2 (SE +/- 0.00, N = 3)
  Run 3: 21.2 (SE +/- 0.00, N = 3)
  1. (CC) gcc options: -O3 -pthread -lz -llzma
Basis Universal 1.12 - Settings: UASTC Level 3 (Seconds, Fewer Is Better)
  Run 1: 114.27 (SE +/- 0.02, N = 3)
  Run 2: 114.19 (SE +/- 0.01, N = 3)
  Run 3: 114.26 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Embree 3.9.0 - Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better)
  Run 1: 5.3673 (SE +/- 0.0401, N = 3; MIN: 4.89 / MAX: 5.49)
  Run 2: 5.3310 (SE +/- 0.0092, N = 3; MIN: 5.3 / MAX: 5.4)
  Run 3: 5.3162 (SE +/- 0.0526, N = 3; MIN: 4.89 / MAX: 5.45)
Timed HMMer Search 3.3.1 - Pfam Database Search (Seconds, Fewer Is Better)
  Run 1: 111.35 (SE +/- 0.15, N = 3)
  Run 2: 111.30 (SE +/- 0.11, N = 3)
  Run 3: 111.49 (SE +/- 0.23, N = 3)
  1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
GLmark2 2020.04 - Resolution: 1920 x 1080 (Score, More Is Better)
  Run 1: 6530
  Run 2: 6547
  Run 3: 6559
Mobile Neural Network 2020-09-17 - Model: inception-v3 (ms, Fewer Is Better)
  Run 1: 42.57 (SE +/- 0.08, N = 3; MIN: 42.21 / MAX: 51.92)
  Run 2: 42.44 (SE +/- 0.08, N = 3; MIN: 42.09 / MAX: 51.33)
  Run 3: 42.55 (SE +/- 0.17, N = 3; MIN: 42.02 / MAX: 58.27)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2020-09-17 - Model: mobilenet-v1-1.0 (ms, Fewer Is Better)
  Run 1: 5.819 (SE +/- 0.016, N = 3; MIN: 5.74 / MAX: 6.34)
  Run 2: 5.824 (SE +/- 0.028, N = 3; MIN: 5.72 / MAX: 25.28)
  Run 3: 5.811 (SE +/- 0.015, N = 3; MIN: 5.73 / MAX: 15.06)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2020-09-17 - Model: MobileNetV2_224 (ms, Fewer Is Better)
  Run 1: 4.937 (SE +/- 0.018, N = 3; MIN: 4.86 / MAX: 5.44)
  Run 2: 4.936 (SE +/- 0.020, N = 3; MIN: 4.86 / MAX: 5.66)
  Run 3: 4.910 (SE +/- 0.030, N = 3; MIN: 4.82 / MAX: 6.27)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2020-09-17 - Model: resnet-v2-50 (ms, Fewer Is Better)
  Run 1: 34.93 (SE +/- 0.05, N = 3; MIN: 34.38 / MAX: 58.39)
  Run 2: 35.00 (SE +/- 0.04, N = 3; MIN: 34.7 / MAX: 43.77)
  Run 3: 35.17 (SE +/- 0.25, N = 3; MIN: 34.65 / MAX: 80.67)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2020-09-17 - Model: SqueezeNetV1.0 (ms, Fewer Is Better)
  Run 1: 8.853 (SE +/- 0.003, N = 3; MIN: 8.76 / MAX: 11.24)
  Run 2: 8.848 (SE +/- 0.005, N = 3; MIN: 8.76 / MAX: 9.86)
  Run 3: 8.874 (SE +/- 0.018, N = 3; MIN: 8.77 / MAX: 11.46)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Embree 3.9.0 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better)
  Run 1: 6.1185 (SE +/- 0.0222, N = 3; MIN: 6.05 / MAX: 6.22)
  Run 2: 6.1354 (SE +/- 0.0098, N = 3; MIN: 6.09 / MAX: 6.22)
  Run 3: 6.1401 (SE +/- 0.0025, N = 3; MIN: 6.11 / MAX: 6.21)
Embree 3.9.0 - Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better)
  Run 1: 6.2294 (SE +/- 0.0315, N = 3; MIN: 6.16 / MAX: 6.36)
  Run 2: 6.1430 (SE +/- 0.0103, N = 3; MIN: 6.1 / MAX: 6.24)
  Run 3: 6.2800 (SE +/- 0.0267, N = 3; MIN: 6.21 / MAX: 6.4)
Stockfish 12 - Total Time (Nodes Per Second, More Is Better)
  Run 1: 9840561 (SE +/- 111801.84, N = 6)
  Run 2: 10050125 (SE +/- 45923.84, N = 3)
  Run 3: 9864045 (SE +/- 124228.11, N = 3)
  1. (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver
libavif avifenc 0.7.3 - Encoder Speed: 2 (Seconds, Fewer Is Better)
  Run 1: 86.46 (SE +/- 0.23, N = 3)
  Run 2: 88.21 (SE +/- 0.07, N = 3)
  Run 3: 86.29 (SE +/- 0.06, N = 3)
  1. (CXX) g++ options: -O3 -fPIC
PyPerformance 1.0.0 - Benchmark: raytrace (Milliseconds, Fewer Is Better)
  Run 1: 480
  Run 2: 477
  Run 3: 476
  SE +/- 0.33, N = 3 (single SE entry; the export does not identify the run)
RawTherapee - Total Benchmark Time (Seconds, Fewer Is Better)
  Run 1: 79.96 (SE +/- 0.02, N = 3)
  Run 2: 79.42 (SE +/- 0.02, N = 3)
  Run 3: 80.35 (SE +/- 0.06, N = 3)
  1. RawTherapee, version 5.8, command line.
NCNN 20200916 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  Run 1: 32.96 (SE +/- 0.04, N = 3; MIN: 32.78 / MAX: 34.14)
  Run 2: 32.84 (SE +/- 0.09, N = 3; MIN: 32.57 / MAX: 55.87)
  Run 3: 33.20 (SE +/- 0.07, N = 3; MIN: 32.95 / MAX: 34.47)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  Run 1: 37.32 (SE +/- 0.03, N = 3; MIN: 37.15 / MAX: 39.69)
  Run 2: 37.05 (SE +/- 0.07, N = 3; MIN: 36.81 / MAX: 38.07)
  Run 3: 37.31 (SE +/- 0.07, N = 3; MIN: 37.14 / MAX: 39.49)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
  Run 1: 18.95 (SE +/- 0.03, N = 3; MIN: 18.85 / MAX: 19.3)
  Run 2: 18.57 (SE +/- 0.02, N = 3; MIN: 18.49 / MAX: 19.27)
  Run 3: 18.81 (SE +/- 0.01, N = 3; MIN: 18.72 / MAX: 19.3)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  Run 1: 17.83 (SE +/- 0.05, N = 3; MIN: 17.65 / MAX: 21.58)
  Run 2: 17.75 (SE +/- 0.05, N = 3; MIN: 17.57 / MAX: 18.32)
  Run 3: 18.00 (SE +/- 0.01, N = 3; MIN: 17.88 / MAX: 18.52)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  Run 1: 72.57 (SE +/- 0.13, N = 3; MIN: 72.08 / MAX: 107.37)
  Run 2: 72.10 (SE +/- 0.22, N = 3; MIN: 71.61 / MAX: 88.56)
  Run 3: 72.72 (SE +/- 0.02, N = 3; MIN: 72.42 / MAX: 80.73)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: googlenet (ms, Fewer Is Better)
  Run 1: 19.42 (SE +/- 0.01, N = 3; MIN: 19.3 / MAX: 24.48)
  Run 2: 19.30 (SE +/- 0.02, N = 3; MIN: 19.11 / MAX: 38.78)
  Run 3: 19.51 (SE +/- 0.02, N = 3; MIN: 19.37 / MAX: 20.62)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: blazeface (ms, Fewer Is Better)
  Run 1: 1.95 (SE +/- 0.03, N = 3; MIN: 1.9 / MAX: 2.15)
  Run 2: 1.92 (SE +/- 0.01, N = 3; MIN: 1.88 / MAX: 2.17)
  Run 3: 1.92 (SE +/- 0.00, N = 3; MIN: 1.89 / MAX: 2.52)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  Run 1: 9.21 (SE +/- 0.05, N = 3; MIN: 9.07 / MAX: 14.29)
  Run 2: 9.10 (SE +/- 0.00, N = 3; MIN: 9.02 / MAX: 9.48)
  Run 3: 9.16 (SE +/- 0.01, N = 3; MIN: 9.06 / MAX: 9.6)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  Run 1: 6.32 (SE +/- 0.01, N = 3; MIN: 6.26 / MAX: 7.07)
  Run 2: 6.27 (SE +/- 0.01, N = 3; MIN: 6.2 / MAX: 7.42)
  Run 3: 6.30 (SE +/- 0.01, N = 3; MIN: 6.22 / MAX: 7.04)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  Run 1: 4.79 (SE +/- 0.04, N = 3; MIN: 4.65 / MAX: 5.62)
  Run 2: 4.82 (SE +/- 0.00, N = 3; MIN: 4.76 / MAX: 6)
  Run 3: 4.84 (SE +/- 0.02, N = 3; MIN: 4.76 / MAX: 5.65)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  Run 1: 6.18 (SE +/- 0.01, N = 3; MIN: 6.11 / MAX: 7.82)
  Run 2: 6.11 (SE +/- 0.01, N = 3; MIN: 6.03 / MAX: 7.3)
  Run 3: 6.25 (SE +/- 0.04, N = 3; MIN: 6.11 / MAX: 37.7)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  Run 1: 6.66 (SE +/- 0.00, N = 3; MIN: 6.56 / MAX: 7.93)
  Run 2: 6.64 (SE +/- 0.01, N = 3; MIN: 6.54 / MAX: 7.9)
  Run 3: 6.68 (SE +/- 0.01, N = 3; MIN: 6.59 / MAX: 8.01)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: mobilenet (ms, Fewer Is Better)
  Run 1: 23.82 (SE +/- 0.18, N = 3; MIN: 23.36 / MAX: 24.99)
  Run 2: 23.71 (SE +/- 0.15, N = 3; MIN: 23.3 / MAX: 24.65)
  Run 3: 23.71 (SE +/- 0.11, N = 3; MIN: 23.46 / MAX: 24.96)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: squeezenet (ms, Fewer Is Better)
  Run 1: 21.87 (SE +/- 0.09, N = 3; MIN: 21.59 / MAX: 23.26)
  Run 2: 21.67 (SE +/- 0.06, N = 3; MIN: 21.39 / MAX: 37.01)
  Run 3: 21.83 (SE +/- 0.12, N = 3; MIN: 21.56 / MAX: 22.96)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
x265 3.4 - Video Input: Bosphorus 4K (Frames Per Second, More Is Better)
  Run 1: 7.77 (SE +/- 0.05, N = 3)
  Run 2: 7.78 (SE +/- 0.08, N = 3)
  Run 3: 7.75 (SE +/- 0.05, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
  Run 1: 8.03 (SE +/- 0.01, N = 3)
  Run 2: 8.03 (SE +/- 0.01, N = 3)
  Run 3: 8.01 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
PyPerformance 1.0.0 - Benchmark: python_startup (Milliseconds, Fewer Is Better)
  Run 1: 8.02 (SE +/- 0.01, N = 3)
  Run 2: 8.08 (SE +/- 0.01, N = 3)
  Run 3: 8.13 (SE +/- 0.03, N = 3)
Geekbench 5 - Test: CPU Multi Core (Score, More Is Better)
  Run 1: 5227 (SE +/- 4.98, N = 3)
  Run 2: 5248 (SE +/- 7.09, N = 3)
  Run 3: 5256 (SE +/- 6.17, N = 3)
Geekbench 5 - Test: GPU Vulkan (Score, More Is Better)
  Run 1: 36246 (SE +/- 459.84, N = 3)
  Run 2: 36793 (SE +/- 115.14, N = 3)
  Run 3: 36985 (SE +/- 56.79, N = 3)
Basis Universal 1.12 - Settings: ETC1S (Seconds, Fewer Is Better)
  Run 1: 62.80 (SE +/- 0.17, N = 3)
  Run 2: 62.69 (SE +/- 0.11, N = 3)
  Run 3: 62.77 (SE +/- 0.15, N = 3)
  1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Hugin - Panorama Photo Assistant + Stitching Time (Seconds, Fewer Is Better)
  Run 1: 62.00 (SE +/- 0.29, N = 3)
  Run 2: 61.97 (SE +/- 0.43, N = 3)
  Run 3: 62.71 (SE +/- 0.37, N = 3)
IndigoBench 4.4 - Acceleration: CPU - Scene: Bedroom (M samples/s, More Is Better)
  Run 1: 0.930 (SE +/- 0.001, N = 3)
  Run 2: 0.935 (SE +/- 0.001, N = 3)
  Run 3: 0.931 (SE +/- 0.001, N = 3)
IndigoBench 4.4 - Acceleration: CPU - Scene: Supercar (M samples/s, More Is Better)
  Run 1: 1.954 (SE +/- 0.005, N = 3)
  Run 2: 1.958 (SE +/- 0.004, N = 3)
  Run 3: 1.958 (SE +/- 0.001, N = 3)
TensorFlow Lite 2020-08-23 - Model: SqueezeNet (Microseconds, Fewer Is Better)
  Run 1: 387017 (SE +/- 60.36, N = 3)
  Run 2: 386977 (SE +/- 56.90, N = 3)
  Run 3: 387009 (SE +/- 10.21, N = 3)
TensorFlow Lite 2020-08-23 - Model: Mobilenet Float (Microseconds, Fewer Is Better)
  Run 1: 259621 (SE +/- 20.61, N = 3)
  Run 2: 259600 (SE +/- 44.31, N = 3)
  Run 3: 259575 (SE +/- 21.18, N = 3)
TensorFlow Lite Model: Mobilenet Quant OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2020-08-23 Model: Mobilenet Quant 1 2 3 60K 120K 180K 240K 300K SE +/- 51.91, N = 3 SE +/- 131.03, N = 3 SE +/- 60.84, N = 3 266426 266319 266346
TensorFlow Lite Model: NASNet Mobile OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2020-08-23 Model: NASNet Mobile 1 2 3 60K 120K 180K 240K 300K SE +/- 1226.74, N = 3 SE +/- 79.18, N = 3 SE +/- 151.01, N = 3 280959 279167 279158
DDraceNetwork Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: Multeasymap - Total Frame Time OpenBenchmarking.org Milliseconds, Fewer Is Better DDraceNetwork 15.2.3 Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: Multeasymap - Total Frame Time 1 2 3 3 6 9 12 15 Min: 2.07 / Avg: 2.98 / Max: 7.54 Min: 2.09 / Avg: 3.02 / Max: 7.16 Min: 2.02 / Avg: 3 / Max: 10.7 1. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
DDraceNetwork Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: Multeasymap OpenBenchmarking.org Frames Per Second, More Is Better DDraceNetwork 15.2.3 Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: Multeasymap 1 2 3 70 140 210 280 350 SE +/- 1.03, N = 3 SE +/- 1.65, N = 3 SE +/- 0.66, N = 3 334.97 331.87 333.51 MIN: 110.11 / MAX: 493.34 MIN: 104.44 / MAX: 494.32 MIN: 109.02 / MAX: 495.29 1. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
Mlpack Benchmark Benchmark: scikit_linearridgeregression OpenBenchmarking.org Seconds, Fewer Is Better Mlpack Benchmark Benchmark: scikit_linearridgeregression 1 2 3 0.7245 1.449 2.1735 2.898 3.6225 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 3.19 3.20 3.22
LZ4 Compression Compression Level: 9 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.9.3 Compression Level: 9 - Decompression Speed 1 2 3 2K 4K 6K 8K 10K SE +/- 23.62, N = 3 SE +/- 8.30, N = 3 SE +/- 26.11, N = 3 10575.6 10711.5 10653.0 1. (CC) gcc options: -O3
LZ4 Compression Compression Level: 9 - Compression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.9.3 Compression Level: 9 - Compression Speed 1 2 3 12 24 36 48 60 SE +/- 0.72, N = 3 SE +/- 0.27, N = 3 SE +/- 0.03, N = 3 52.50 51.20 51.45 1. (CC) gcc options: -O3
PyPerformance Benchmark: 2to3 OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: 2to3 1 2 3 70 140 210 280 350 SE +/- 0.33, N = 3 320 321 321
Basis Universal Settings: UASTC Level 2 OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.12 Settings: UASTC Level 2 1 2 3 13 26 39 52 65 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 58.59 58.63 58.62 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
LZ4 Compression Compression Level: 3 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.9.3 Compression Level: 3 - Decompression Speed 1 2 3 2K 4K 6K 8K 10K SE +/- 12.91, N = 3 SE +/- 18.42, N = 3 SE +/- 22.07, N = 3 10496.6 10635.3 10588.0 1. (CC) gcc options: -O3
LZ4 Compression Compression Level: 3 - Compression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.9.3 Compression Level: 3 - Compression Speed 1 2 3 12 24 36 48 60 SE +/- 0.84, N = 3 SE +/- 0.85, N = 3 SE +/- 0.28, N = 3 53.56 53.60 52.55 1. (CC) gcc options: -O3
ASTC Encoder Preset: Thorough OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.0 Preset: Thorough 1 2 3 12 24 36 48 60 SE +/- 0.01, N = 3 SE +/- 0.05, N = 3 SE +/- 0.01, N = 3 52.39 52.46 52.40 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
rav1e Speed: 5 OpenBenchmarking.org Frames Per Second, More Is Better rav1e 0.4 Alpha Speed: 5 1 2 3 0.2381 0.4762 0.7143 0.9524 1.1905 SE +/- 0.007, N = 3 SE +/- 0.004, N = 3 SE +/- 0.000, N = 3 1.050 1.047 1.058
rav1e Speed: 1 OpenBenchmarking.org Frames Per Second, More Is Better rav1e 0.4 Alpha Speed: 1 1 2 3 0.0837 0.1674 0.2511 0.3348 0.4185 SE +/- 0.004, N = 3 SE +/- 0.005, N = 3 SE +/- 0.006, N = 3 0.372 0.365 0.361
Geekbench Test: CPU Single Core OpenBenchmarking.org Score, More Is Better Geekbench 5 Test: CPU Single Core 1 2 3 300 600 900 1200 1500 SE +/- 0.58, N = 3 SE +/- 1.33, N = 3 SE +/- 0.88, N = 3 1223 1228 1228
Redis Test: GET OpenBenchmarking.org Requests Per Second, More Is Better Redis 6.0.9 Test: GET 1 2 3 500K 1000K 1500K 2000K 2500K SE +/- 40313.16, N = 15 SE +/- 26352.66, N = 15 SE +/- 21009.40, N = 15 2310604.62 2095708.19 2135888.90 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Mlpack Benchmark Benchmark: scikit_ica OpenBenchmarking.org Seconds, Fewer Is Better Mlpack Benchmark Benchmark: scikit_ica 1 2 3 11 22 33 44 55 SE +/- 0.07, N = 3 SE +/- 0.05, N = 3 SE +/- 0.04, N = 3 49.12 49.12 49.14
Caffe Model: AlexNet - Acceleration: CPU - Iterations: 100 OpenBenchmarking.org Milli-Seconds, Fewer Is Better Caffe 2020-02-13 Model: AlexNet - Acceleration: CPU - Iterations: 100 1 2 3 11K 22K 33K 44K 55K SE +/- 14.01, N = 3 SE +/- 8.08, N = 3 SE +/- 27.06, N = 3 49082 49047 48962 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
oneDNN Harness: Deconvolution Batch deconv_1d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Deconvolution Batch deconv_1d - Data Type: f32 - Engine: CPU 1 2 3 3 6 9 12 15 SE +/- 0.01249, N = 3 SE +/- 0.12846, N = 3 SE +/- 0.17039, N = 15 8.05221 9.74749 9.82517 MIN: 7.86 MIN: 9.46 MIN: 8.65 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
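Each result above carries an "SE +/- x, N = n" entry, which is the standard error of the mean over n trial runs. The sketch below is a hedged illustration of turning such an entry back into a per-trial spread. It assumes the usual definition SE = s / sqrt(N), which the export itself does not state, and it assumes the SE entries are listed in the same order as runs 1 to 3. The function names are illustrative only and are not part of the Phoronix Test Suite.

```python
import math


def stdev_from_se(se, n):
    """Recover the per-trial standard deviation, assuming SE = s / sqrt(n)."""
    return se * math.sqrt(n)


def relative_se_percent(se, mean):
    """Express the standard error as a percentage of the reported mean."""
    return 100.0 * se / mean


# Third oneDNN deconv_1d entry above: mean 9.82517 ms, SE 0.17039, N = 15
# (assumed to belong to run 3 because it is the third SE listed).
mean, se, n = 9.82517, 0.17039, 15
print(f"approx. stdev: {stdev_from_se(se, n):.3f} ms")
print(f"relative SE:   {relative_se_percent(se, mean):.2f}%")
```

A relative SE of a few percent, together with a trial count raised from 3 to 15, suggests the run was noisy enough that the suite collected extra samples, so small differences on that test deserve caution.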
Kvazaar Video Input: Bosphorus 1080p - Video Preset: Medium OpenBenchmarking.org Frames Per Second, More Is Better Kvazaar 2.0 Video Input: Bosphorus 1080p - Video Preset: Medium 1 2 3 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 12.72 12.75 12.76 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
PyPerformance Benchmark: go OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: go 1 2 3 50 100 150 200 250 SE +/- 0.33, N = 3 249 248 248
rav1e Speed: 6 OpenBenchmarking.org Frames Per Second, More Is Better rav1e 0.4 Alpha Speed: 6 1 2 3 0.3159 0.6318 0.9477 1.2636 1.5795 SE +/- 0.007, N = 3 SE +/- 0.006, N = 3 SE +/- 0.011, N = 3 1.404 1.370 1.393
eSpeak-NG Speech Engine Text-To-Speech Synthesis OpenBenchmarking.org Seconds, Fewer Is Better eSpeak-NG Speech Engine 20200907 Text-To-Speech Synthesis 1 2 3 7 14 21 28 35 SE +/- 0.15, N = 4 SE +/- 0.01, N = 4 SE +/- 0.02, N = 4 30.99 30.94 30.96 1. (CC) gcc options: -O2 -std=c99
Kvazaar Video Input: Bosphorus 4K - Video Preset: Ultra Fast OpenBenchmarking.org Frames Per Second, More Is Better Kvazaar 2.0 Video Input: Bosphorus 4K - Video Preset: Ultra Fast 1 2 3 4 8 12 16 20 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 14.55 14.63 14.60 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
AOM AV1 Encoder Mode: Speed 0 Two-Pass OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.0 Encoder Mode: Speed 0 Two-Pass 1 2 3 0.0585 0.117 0.1755 0.234 0.2925 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 0.26 0.25 0.25 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
MPV Video Input: Big Buck Bunny Sunflower 4K - Decode: Software Only OpenBenchmarking.org FPS, More Is Better MPV Video Input: Big Buck Bunny Sunflower 4K - Decode: Software Only 1 2 3 100 200 300 400 500 SE +/- 0.40, N = 3 SE +/- 0.40, N = 3 SE +/- 0.38, N = 3 454.91 457.51 456.91 MIN: 299.99 / MAX: 631.56 MIN: 299.99 / MAX: 666.65 MIN: 292.67 / MAX: 631.56 1. mpv 0.32.0
Redis Test: SADD OpenBenchmarking.org Requests Per Second, More Is Better Redis 6.0.9 Test: SADD 1 2 3 400K 800K 1200K 1600K 2000K SE +/- 22720.13, N = 15 SE +/- 25708.45, N = 15 SE +/- 18274.51, N = 3 1951987.09 1947660.53 2004338.25 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
PyPerformance Benchmark: django_template OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: django_template 1 2 3 11 22 33 44 55 SE +/- 0.10, N = 3 SE +/- 0.03, N = 3 SE +/- 0.06, N = 3 46.9 46.7 46.8
oneDNN Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU 1 2 3 110 220 330 440 550 SE +/- 2.86, N = 3 SE +/- 2.10, N = 3 SE +/- 3.41, N = 3 503.70 491.11 462.80 MIN: 495.04 MIN: 474.1 MIN: 453.6 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU 1 2 3 60 120 180 240 300 SE +/- 0.60, N = 3 SE +/- 3.42, N = 3 SE +/- 1.11, N = 3 249.37 262.37 247.16 MIN: 247.16 MIN: 254.47 MIN: 243.72 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
AOM AV1 Encoder Mode: Speed 6 Realtime OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.0 Encoder Mode: Speed 6 Realtime 1 2 3 4 8 12 16 20 SE +/- 0.02, N = 3 SE +/- 0.07, N = 3 SE +/- 0.03, N = 3 16.93 16.66 16.91 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
oneDNN Harness: IP Batch 1D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: IP Batch 1D - Data Type: f32 - Engine: CPU 1 2 3 2 4 6 8 10 SE +/- 0.03162, N = 3 SE +/- 0.12292, N = 3 SE +/- 0.12077, N = 15 7.43267 8.42072 8.81065 MIN: 7.23 MIN: 8.01 MIN: 8.21 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Redis Test: LPOP OpenBenchmarking.org Requests Per Second, More Is Better Redis 6.0.9 Test: LPOP 1 2 3 600K 1200K 1800K 2400K 3000K SE +/- 39136.83, N = 15 SE +/- 123032.71, N = 12 SE +/- 13136.89, N = 3 2624535.48 2159676.18 1559670.96 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
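Because most results in this comparison differ by well under 1% while a few (such as Redis LPOP above) differ widely, it can help to express each gap both as a percentage and in units of the combined standard error. The following is a minimal sketch under stated assumptions: the SE entries are taken to map to runs 1, 2 and 3 in listed order, the combined error is approximated as the root sum of squares of the two SEs, and `compare_runs` is a hypothetical helper, not a Phoronix Test Suite function.

```python
import math


def compare_runs(mean_a, se_a, mean_b, se_b):
    """Return (percent change from a to b, difference in combined-SE units)."""
    diff = mean_b - mean_a
    pct = 100.0 * diff / mean_a
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return pct, diff / combined_se


# Redis LPOP (requests per second, more is better): (mean, SE) per run,
# copied from the result above; the SE-to-run mapping is an assumption.
runs = {
    1: (2624535.48, 39136.83),
    2: (2159676.18, 123032.71),
    3: (1559670.96, 13136.89),
}

base_mean, base_se = runs[1]
for label in (2, 3):
    mean, se = runs[label]
    pct, z = compare_runs(base_mean, base_se, mean, se)
    print(f"run {label} vs run 1: {pct:+.1f}% ({z:+.1f} combined SEs)")
```

A gap of only one or two combined SEs is hard to distinguish from run-to-run noise, whereas a gap many times larger points to a real difference between the runs or to an unstable test.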
PyPerformance Benchmark: regex_compile OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: regex_compile 1 2 3 40 80 120 160 200 169 170 168
PHPBench PHP Benchmark Suite OpenBenchmarking.org Score, More Is Better PHPBench 0.8.1 PHP Benchmark Suite 1 2 3 130K 260K 390K 520K 650K SE +/- 5320.80, N = 3 SE +/- 3538.13, N = 3 SE +/- 3258.38, N = 3 604079 604512 604322
VkFFT OpenBenchmarking.org Benchmark Score, More Is Better VkFFT 2020-09-29 1 2 3 4K 8K 12K 16K 20K SE +/- 80.13, N = 3 SE +/- 18.75, N = 3 SE +/- 15.18, N = 3 18432 18479 18478
OCRMyPDF Processing 60 Page PDF Document OpenBenchmarking.org Seconds, Fewer Is Better OCRMyPDF 10.3.1+dfsg Processing 60 Page PDF Document 1 2 3 7 14 21 28 35 SE +/- 0.12, N = 3 SE +/- 0.07, N = 3 SE +/- 0.03, N = 3 32.16 32.23 32.26
LZ4 Compression Compression Level: 1 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.9.3 Compression Level: 1 - Decompression Speed 1 2 3 2K 4K 6K 8K 10K SE +/- 46.13, N = 3 SE +/- 12.36, N = 3 SE +/- 11.80, N = 3 11116.3 11146.3 11202.6 1. (CC) gcc options: -O3
LZ4 Compression Compression Level: 1 - Compression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.9.3 Compression Level: 1 - Compression Speed 1 2 3 2K 4K 6K 8K 10K SE +/- 45.71, N = 3 SE +/- 99.34, N = 3 SE +/- 105.59, N = 3 9763.10 10001.81 9959.06 1. (CC) gcc options: -O3
AOM AV1 Encoder Mode: Speed 6 Two-Pass OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.0 Encoder Mode: Speed 6 Two-Pass 1 2 3 0.7718 1.5436 2.3154 3.0872 3.859 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 3.43 3.42 3.43 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
PyPerformance Benchmark: pathlib OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: pathlib 1 2 3 4 8 12 16 20 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 17.7 17.7 17.8
rav1e Speed: 10 OpenBenchmarking.org Frames Per Second, More Is Better rav1e 0.4 Alpha Speed: 10 1 2 3 0.6946 1.3892 2.0838 2.7784 3.473 SE +/- 0.003, N = 3 SE +/- 0.030, N = 3 SE +/- 0.026, N = 3 3.041 3.068 3.087
Zstd Compression Compression Level: 3 OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.4.5 Compression Level: 3 1 2 3 800 1600 2400 3200 4000 SE +/- 19.56, N = 3 SE +/- 37.93, N = 3 SE +/- 19.08, N = 3 3687.7 3632.0 3687.7 1. (CC) gcc options: -O3 -pthread -lz -llzma
PyPerformance Benchmark: pickle_pure_python OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: pickle_pure_python 1 2 3 100 200 300 400 500 440 439 438
Crafty Elapsed Time OpenBenchmarking.org Nodes Per Second, More Is Better Crafty 25.2 Elapsed Time 1 2 3 1.7M 3.4M 5.1M 6.8M 8.5M SE +/- 11851.34, N = 3 SE +/- 103135.62, N = 3 SE +/- 94283.61, N = 3 7738146 7573497 7394699 1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm
PyPerformance Benchmark: json_loads OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: json_loads 1 2 3 6 12 18 24 30 SE +/- 0.03, N = 3 SE +/- 0.00, N = 3 SE +/- 0.03, N = 3 27.2 27.3 27.3
Mlpack Benchmark Benchmark: scikit_svm OpenBenchmarking.org Seconds, Fewer Is Better Mlpack Benchmark Benchmark: scikit_svm 1 2 3 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 21.43 21.44 21.52
Redis Test: LPUSH OpenBenchmarking.org Requests Per Second, More Is Better Redis 6.0.9 Test: LPUSH 1 2 3 300K 600K 900K 1200K 1500K SE +/- 14444.21, N = 3 SE +/- 12228.65, N = 3 SE +/- 13914.34, N = 15 1437805.00 1413366.45 1459398.53 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis Test: SET OpenBenchmarking.org Requests Per Second, More Is Better Redis 6.0.9 Test: SET 1 2 3 400K 800K 1200K 1600K 2000K SE +/- 18920.71, N = 3 SE +/- 29276.53, N = 3 SE +/- 19775.05, N = 15 1725875.75 1754885.38 1695298.03 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Tesseract OCR Time To OCR 7 Images OpenBenchmarking.org Seconds, Fewer Is Better Tesseract OCR 4.1.1 Time To OCR 7 Images 1 2 3 6 12 18 24 30 SE +/- 0.03, N = 3 SE +/- 0.06, N = 3 SE +/- 0.02, N = 3 24.57 24.51 24.51
PyBench Total For Average Test Times OpenBenchmarking.org Milliseconds, Fewer Is Better PyBench 2018-02-16 Total For Average Test Times 1 2 3 200 400 600 800 1000 SE +/- 2.31, N = 3 SE +/- 1.76, N = 3 1020 1017 1021
NCNN Target: Vulkan GPU - Model: yolov4-tiny OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: yolov4-tiny 1 2 3 3 6 9 12 15 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 11.26 11.29 11.30 MIN: 11.09 / MAX: 11.79 MIN: 11.1 / MAX: 11.73 MIN: 11.15 / MAX: 11.62 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: resnet50 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: resnet50 1 2 3 2 4 6 8 10 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 SE +/- 0.00, N = 3 6.10 6.13 6.09 MIN: 6.06 / MAX: 10.11 MIN: 6.04 / MAX: 11.31 MIN: 6.06 / MAX: 6.39 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: alexnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: alexnet 1 2 3 0.9225 1.845 2.7675 3.69 4.6125 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 4.09 4.10 4.09 MIN: 3.96 / MAX: 4.79 MIN: 3.99 / MAX: 4.76 MIN: 3.97 / MAX: 4.66 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: resnet18 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: resnet18 1 2 3 0.4883 0.9766 1.4649 1.9532 2.4415 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 2.16 2.17 2.17 MIN: 2.09 / MAX: 2.76 MIN: 2.1 / MAX: 2.77 MIN: 2.1 / MAX: 2.76 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: vgg16 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: vgg16 1 2 3 3 6 9 12 15 SE +/- 0.16, N = 3 SE +/- 0.05, N = 3 SE +/- 0.07, N = 3 10.82 10.71 10.61 MIN: 10.18 / MAX: 24.02 MIN: 10.19 / MAX: 20.01 MIN: 10.2 / MAX: 24 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: googlenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: googlenet 1 2 3 1.2915 2.583 3.8745 5.166 6.4575 SE +/- 0.05, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 5.74 5.70 5.69 MIN: 5.64 / MAX: 15.26 MIN: 5.65 / MAX: 8.52 MIN: 5.63 / MAX: 10.18 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: blazeface OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: blazeface 1 2 3 0.2048 0.4096 0.6144 0.8192 1.024 SE +/- 0.03, N = 3 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 0.91 0.88 0.87 MIN: 0.85 / MAX: 1.18 MIN: 0.86 / MAX: 1.45 MIN: 0.85 / MAX: 1.15 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: efficientnet-b0 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: efficientnet-b0 1 2 3 3 6 9 12 15 SE +/- 0.12, N = 3 SE +/- 0.17, N = 3 SE +/- 0.10, N = 3 9.33 9.34 9.36 MIN: 8.97 / MAX: 19.5 MIN: 8.93 / MAX: 17.56 MIN: 8.96 / MAX: 20 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: mnasnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: mnasnet 1 2 3 0.6255 1.251 1.8765 2.502 3.1275 SE +/- 0.02, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 2.78 2.76 2.76 MIN: 2.71 / MAX: 15.58 MIN: 2.72 / MAX: 3.3 MIN: 2.71 / MAX: 3.28 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: shufflenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: shufflenet-v2 1 2 3 0.5243 1.0486 1.5729 2.0972 2.6215 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 2.32 2.33 2.32 MIN: 2.28 / MAX: 3.46 MIN: 2.3 / MAX: 2.98 MIN: 2.29 / MAX: 2.74 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: mobilenet-v3 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: mobilenet-v3 1 2 3 0.8055 1.611 2.4165 3.222 4.0275 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 3.57 3.58 3.58 MIN: 3.54 / MAX: 4.28 MIN: 3.53 / MAX: 4.28 MIN: 3.54 / MAX: 4.3 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: mobilenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: mobilenet-v2 1 2 3 0.5805 1.161 1.7415 2.322 2.9025 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 2.57 2.58 2.58 MIN: 2.53 / MAX: 3.17 MIN: 2.53 / MAX: 5.2 MIN: 2.53 / MAX: 3.72 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: mobilenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: mobilenet 1 2 3 2 4 6 8 10 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 8.28 8.25 8.27 MIN: 7.59 / MAX: 11.65 MIN: 7.57 / MAX: 11.24 MIN: 7.6 / MAX: 11.3 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: squeezenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: squeezenet 1 2 3 1.0913 2.1826 3.2739 4.3652 5.4565 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 4.84 4.85 4.85 MIN: 4.71 / MAX: 5.84 MIN: 4.69 / MAX: 5.86 MIN: 4.71 / MAX: 5.84 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
AOM AV1 Encoder Mode: Speed 4 Two-Pass OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.0 Encoder Mode: Speed 4 Two-Pass 1 2 3 0.4905 0.981 1.4715 1.962 2.4525 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 2.18 2.17 2.18 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Sunflow Rendering System Global Illumination + Image Synthesis OpenBenchmarking.org Seconds, Fewer Is Better Sunflow Rendering System 0.07.2 Global Illumination + Image Synthesis 1 2 3 0.4491 0.8982 1.3473 1.7964 2.2455 SE +/- 0.029, N = 4 SE +/- 0.024, N = 3 SE +/- 0.023, N = 3 1.948 1.965 1.996 MIN: 1.73 / MAX: 2.71 MIN: 1.78 / MAX: 2.54 MIN: 1.81 / MAX: 2.69
PyPerformance Benchmark: chaos OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: chaos 1 2 3 20 40 60 80 100 111 111 111
PyPerformance Benchmark: float OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: float 1 2 3 30 60 90 120 150 SE +/- 0.33, N = 3 117 116 117
PyPerformance Benchmark: nbody OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: nbody 1 2 3 30 60 90 120 150 117 117 117
PyPerformance Benchmark: crypto_pyaes OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: crypto_pyaes 1 2 3 30 60 90 120 150 116 116 116
OpenSSL RSA 4096-bit Performance OpenBenchmarking.org Signs Per Second, More Is Better OpenSSL 1.1.1 RSA 4096-bit Performance 1 2 3 200 400 600 800 1000 SE +/- 0.15, N = 3 SE +/- 0.27, N = 3 SE +/- 0.07, N = 3 1145.0 1144.3 1144.7 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
RNNoise OpenBenchmarking.org Seconds, Fewer Is Better RNNoise 2020-06-28 1 2 3 5 10 15 20 25 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 20.02 20.02 20.03 1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
Kvazaar Video Input: Bosphorus 1080p - Video Preset: Very Fast OpenBenchmarking.org Frames Per Second, More Is Better Kvazaar 2.0 Video Input: Bosphorus 1080p - Video Preset: Very Fast 1 2 3 8 16 24 32 40 SE +/- 0.05, N = 3 SE +/- 0.02, N = 3 SE +/- 0.04, N = 3 32.34 32.38 32.38 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
TNN Target: CPU - Model: MobileNet v2 OpenBenchmarking.org ms, Fewer Is Better TNN 0.2.3 Target: CPU - Model: MobileNet v2 1 2 3 60 120 180 240 300 SE +/- 0.00, N = 3 SE +/- 0.19, N = 3 SE +/- 0.03, N = 3 263.82 264.30 264.12 MIN: 263.16 / MAX: 270.27 MIN: 263.28 / MAX: 314.67 MIN: 263.32 / MAX: 271.64 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
TNN Target: CPU - Model: SqueezeNet v1.1 OpenBenchmarking.org ms, Fewer Is Better TNN 0.2.3 Target: CPU - Model: SqueezeNet v1.1 1 2 3 60 120 180 240 300 SE +/- 0.22, N = 3 SE +/- 0.15, N = 3 SE +/- 0.05, N = 3 260.26 260.45 260.44 MIN: 259.28 / MAX: 261.34 MIN: 259.62 / MAX: 261.24 MIN: 259.68 / MAX: 261.29 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
Dolfyn Computational Fluid Dynamics OpenBenchmarking.org Seconds, Fewer Is Better Dolfyn 0.527 Computational Fluid Dynamics 1 2 3 4 8 12 16 20 SE +/- 0.01, N = 3 SE +/- 0.05, N = 3 SE +/- 0.02, N = 3 17.76 17.70 17.68
x265 Video Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better x265 3.4 Video Input: Bosphorus 1080p 1 2 3 8 16 24 32 40 SE +/- 0.05, N = 3 SE +/- 0.20, N = 3 SE +/- 0.22, N = 3 34.71 35.12 34.66 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
AOM AV1 Encoder Mode: Speed 8 Realtime OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.0 Encoder Mode: Speed 8 Realtime 1 2 3 9 18 27 36 45 SE +/- 0.08, N = 3 SE +/- 0.53, N = 3 SE +/- 0.08, N = 3 37.09 36.18 37.07 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Betsy GPU Compressor Codec: ETC2 RGB - Quality: Highest OpenBenchmarking.org Seconds, Fewer Is Better Betsy GPU Compressor 1.1 Beta Codec: ETC2 RGB - Quality: Highest 1 2 3 2 4 6 8 10 SE +/- 0.213, N = 15 SE +/- 0.003, N = 3 SE +/- 0.003, N = 3 6.688 6.476 6.470 1. (CXX) g++ options: -O3 -O2 -lpthread -ldl
MPV Video Input: Big Buck Bunny Sunflower 1080p - Decode: Software Only OpenBenchmarking.org FPS, More Is Better MPV Video Input: Big Buck Bunny Sunflower 1080p - Decode: Software Only 1 2 3 300 600 900 1200 1500 SE +/- 2.29, N = 3 SE +/- 2.23, N = 3 SE +/- 7.87, N = 3 1300.12 1305.29 1304.01 MIN: 749.97 / MAX: 2399.95 MIN: 799.97 / MAX: 2399.92 MIN: 799.97 / MAX: 2399.92 1. mpv 0.32.0
LAMMPS Molecular Dynamics Simulator Model: Rhodopsin Protein OpenBenchmarking.org ns/day, More Is Better LAMMPS Molecular Dynamics Simulator 29Oct2020 Model: Rhodopsin Protein 1 2 3 0.7515 1.503 2.2545 3.006 3.7575 SE +/- 0.052, N = 15 SE +/- 0.029, N = 3 SE +/- 0.021, N = 3 3.249 3.329 3.340 1. (CXX) g++ options: -O3 -pthread -lm
Betsy GPU Compressor Codec: ETC1 - Quality: Highest OpenBenchmarking.org Seconds, Fewer Is Better Betsy GPU Compressor 1.1 Beta Codec: ETC1 - Quality: Highest 1 2 3 1.3048 2.6096 3.9144 5.2192 6.524 SE +/- 0.224, N = 15 SE +/- 0.062, N = 3 SE +/- 0.064, N = 3 5.799 5.608 5.602 1. (CXX) g++ options: -O3 -O2 -lpthread -ldl
Darktable Test: Boat - Acceleration: CPU-only OpenBenchmarking.org Seconds, Fewer Is Better Darktable 3.2.1 Test: Boat - Acceleration: CPU-only 1 2 3 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 12.62 12.48 12.51
ASTC Encoder Preset: Medium OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.0 Preset: Medium 1 2 3 2 4 6 8 10 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 8.30 8.30 8.30 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU 1 2 3 1.0644 2.1288 3.1932 4.2576 5.322 SE +/- 0.00490, N = 3 SE +/- 0.01879, N = 3 SE +/- 0.00436, N = 3 4.73057 4.03910 4.06739 MIN: 4.66 MIN: 3.96 MIN: 3.99 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Kvazaar Video Input: Bosphorus 1080p - Video Preset: Ultra Fast OpenBenchmarking.org Frames Per Second, More Is Better Kvazaar 2.0 Video Input: Bosphorus 1080p - Video Preset: Ultra Fast 1 2 3 13 26 39 52 65 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 SE +/- 0.11, N = 3 57.37 57.57 57.47 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Basis Universal Settings: UASTC Level 0 OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.12 Settings: UASTC Level 0 1 2 3 3 6 9 12 15 SE +/- 0.015, N = 3 SE +/- 0.018, N = 3 SE +/- 0.014, N = 3 9.297 9.267 9.267 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Darktable Test: Masskrug - Acceleration: CPU-only OpenBenchmarking.org Seconds, Fewer Is Better Darktable 3.2.1 Test: Masskrug - Acceleration: CPU-only 1 2 3 2 4 6 8 10 SE +/- 0.016, N = 3 SE +/- 0.006, N = 3 SE +/- 0.018, N = 3 8.019 7.963 8.026
Darktable Test: Server Room - Acceleration: CPU-only OpenBenchmarking.org Seconds, Fewer Is Better Darktable 3.2.1 Test: Server Room - Acceleration: CPU-only 1 2 3 2 4 6 8 10 SE +/- 0.003, N = 3 SE +/- 0.010, N = 3 SE +/- 0.002, N = 3 6.318 6.274 6.292
ASTC Encoder Preset: Fast OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.0 Preset: Fast 1 2 3 2 4 6 8 10 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 7.39 7.39 7.37 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Waifu2x-NCNN Vulkan Scale: 2x - Denoise: 3 - TAA: Yes OpenBenchmarking.org Seconds, Fewer Is Better Waifu2x-NCNN Vulkan 20200818 Scale: 2x - Denoise: 3 - TAA: Yes 1 2 3 2 4 6 8 10 SE +/- 0.002, N = 3 SE +/- 0.007, N = 3 SE +/- 0.008, N = 3 7.158 7.184 7.183
libavif avifenc Encoder Speed: 8 OpenBenchmarking.org Seconds, Fewer Is Better libavif avifenc 0.7.3 Encoder Speed: 8 1 2 3 2 4 6 8 10 SE +/- 0.004, N = 3 SE +/- 0.005, N = 3 SE +/- 0.001, N = 3 6.631 6.575 6.584 1. (CXX) g++ options: -O3 -fPIC
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU 1 2 3 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 20.65 19.20 19.41 MIN: 20.28 MIN: 19.01 MIN: 19.18 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
yquake2 Renderer: Software CPU - Resolution: 1920 x 1080 OpenBenchmarking.org Frames Per Second, More Is Better yquake2 7.45 Renderer: Software CPU - Resolution: 1920 x 1080 1 2 3 20 40 60 80 100 SE +/- 0.48, N = 3 SE +/- 0.30, N = 3 SE +/- 0.31, N = 3 110.9 109.5 110.4 1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
libavif avifenc Encoder Speed: 10 OpenBenchmarking.org Seconds, Fewer Is Better libavif avifenc 0.7.3 Encoder Speed: 10 1 2 3 2 4 6 8 10 SE +/- 0.015, N = 3 SE +/- 0.019, N = 3 SE +/- 0.017, N = 3 6.100 6.037 6.064 1. (CXX) g++ options: -O3 -fPIC
oneDNN Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 1.5 Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU 1 2 3 3 6 9 12 15 SE +/- 0.02, N = 3 SE +/- 0.17, N = 3 SE +/- 0.02, N = 3 12.03 13.18 12.89 MIN: 11.78 MIN: 12.66 MIN: 12.66 1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
FFTE N=256, 3D Complex FFT Routine OpenBenchmarking.org MFLOPS, More Is Better FFTE 7.0 N=256, 3D Complex FFT Routine 1 2 3 6K 12K 18K 24K 30K SE +/- 31.78, N = 3 SE +/- 19.99, N = 3 SE +/- 17.94, N = 3 25405.99 25808.66 25835.65 1. (F9X) gfortran options: -O3 -fomit-frame-pointer -fopenmp
yquake2 Renderer: OpenGL 1.x - Resolution: 1920 x 1080 OpenBenchmarking.org Frames Per Second, More Is Better yquake2 7.45 Renderer: OpenGL 1.x - Resolution: 1920 x 1080 1 2 3 160 320 480 640 800 SE +/- 8.88, N = 3 SE +/- 9.93, N = 3 SE +/- 1.13, N = 3 737.4 747.0 730.4 1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
yquake2 Renderer: OpenGL 3.x - Resolution: 1920 x 1080 OpenBenchmarking.org Frames Per Second, More Is Better yquake2 7.45 Renderer: OpenGL 3.x - Resolution: 1920 x 1080 1 2 3 200 400 600 800 1000 SE +/- 1.84, N = 3 SE +/- 0.50, N = 3 SE +/- 2.82, N = 3 977.8 977.8 975.8 1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
Darktable Test: Server Rack - Acceleration: CPU-only OpenBenchmarking.org Seconds, Fewer Is Better Darktable 3.2.1 Test: Server Rack - Acceleration: CPU-only 1 2 3 0.0527 0.1054 0.1581 0.2108 0.2635 SE +/- 0.000, N = 3 SE +/- 0.000, N = 3 SE +/- 0.004, N = 3 0.230 0.229 0.234
HPC Challenge Test / Class: Max Ping Pong Bandwidth OpenBenchmarking.org MB/s, More Is Better HPC Challenge 1.5.0 Test / Class: Max Ping Pong Bandwidth 1 3K 6K 9K 12K 15K 14400.77 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge Test / Class: Random Ring Bandwidth OpenBenchmarking.org GB/s, More Is Better HPC Challenge 1.5.0 Test / Class: Random Ring Bandwidth 1 1.1163 2.2326 3.3489 4.4652 5.5815 4.96131 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge Test / Class: Random Ring Latency OpenBenchmarking.org usecs, Fewer Is Better HPC Challenge 1.5.0 Test / Class: Random Ring Latency 1 0.0887 0.1774 0.2661 0.3548 0.4435 0.39400 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge Test / Class: G-Random Access OpenBenchmarking.org GUP/s, More Is Better HPC Challenge 1.5.0 Test / Class: G-Random Access 1 0.0045 0.009 0.0135 0.018 0.0225 0.02004 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge Test / Class: EP-STREAM Triad OpenBenchmarking.org GB/s, More Is Better HPC Challenge 1.5.0 Test / Class: EP-STREAM Triad 1 2 4 6 8 10 6.74179 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge Test / Class: G-Ptrans OpenBenchmarking.org GB/s, More Is Better HPC Challenge 1.5.0 Test / Class: G-Ptrans 1 0.1471 0.2942 0.4413 0.5884 0.7355 0.65366 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge Test / Class: EP-DGEMM OpenBenchmarking.org GFLOPS, More Is Better HPC Challenge 1.5.0 Test / Class: EP-DGEMM 1 12 24 36 48 60 55.51 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
HPC Challenge Test / Class: G-Ffte OpenBenchmarking.org GFLOPS, More Is Better HPC Challenge 1.5.0 Test / Class: G-Ffte 1 0.744 1.488 2.232 2.976 3.72 3.30686 1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops 2. ATLAS + Open MPI 4.0.3
Phoronix Test Suite v10.8.5