3100 compare
AMD Ryzen 3 3100 4-Core testing with an ASUS ROG CROSSHAIR VIII HERO (2702 BIOS) and AMD Radeon RX 56/64 8GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2011296-HA-3100COMPA40&rdt&gru&export=txt .
3100 compare: system details (configurations 1, 2, 3)
  Processor: AMD Ryzen 3 3100 4-Core @ 3.60GHz (4 Cores / 8 Threads)
  Motherboard: ASUS ROG CROSSHAIR VIII HERO (2702 BIOS)
  Chipset: AMD Starship/Matisse
  Memory: 16GB
  Disk: 1000GB Sabrent Rocket 4.0 1TB
  Graphics: AMD Radeon RX 56/64 8GB (1590/800MHz)
  Audio: AMD Vega 10 HDMI Audio
  Monitor: LG Ultra HD
  Network: Realtek RTL8125 2.5GbE + Intel I211
  OS: Ubuntu 20.10
  Kernel: 5.8.0-29-generic (x86_64)
  Desktop: GNOME Shell 3.38.1
  Display Server: X Server 1.20.9
  Display Driver: amdgpu 19.1.0
  OpenGL: 4.6 Mesa 20.2.1 (LLVM 11.0.0)
  Vulkan: 1.2.131
  Compiler: GCC 10.2.0
  File-System: ext4
  Screen Resolution: 1920x1080
Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  Scaling Governor: acpi-cpufreq ondemand
  CPU Microcode: 0x8701021
Graphics Details
  GLAMOR
Security Details
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling
  srbds: Not affected
  tsx_async_abort: Not affected
Java Details
  2, 3: OpenJDK Runtime Environment (build 11.0.9.1+1-Ubuntu-0ubuntu1.20.10)
Python Details
  2, 3: Python 3.8.6
3100 compare vkfft: mpv: Big Buck Bunny Sunflower 4K - Software Only mpv: Big Buck Bunny Sunflower 1080p - Software Only ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.3 - Default - RaiNyMore2 ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.3 - Default - Multeasymap yquake2: OpenGL 1.x - 1920 x 1080 yquake2: OpenGL 3.x - 1920 x 1080 yquake2: Software CPU - 1920 x 1080 aom-av1: Speed 0 Two-Pass aom-av1: Speed 4 Two-Pass aom-av1: Speed 6 Realtime aom-av1: Speed 6 Two-Pass aom-av1: Speed 8 Realtime embree: Pathtracer - Crown embree: Pathtracer ISPC - Crown embree: Pathtracer - Asian Dragon embree: Pathtracer ISPC - Asian Dragon kvazaar: Bosphorus 4K - Medium kvazaar: Bosphorus 1080p - Medium kvazaar: Bosphorus 4K - Very Fast kvazaar: Bosphorus 4K - Ultra Fast kvazaar: Bosphorus 1080p - Very Fast kvazaar: Bosphorus 1080p - Ultra Fast rav1e: 1 rav1e: 5 rav1e: 6 rav1e: 10 x265: Bosphorus 4K x265: Bosphorus 1080p hpcc: G-Ptrans hpcc: EP-STREAM Triad hpcc: Rand Ring Bandwidth hpcc: G-HPL hpcc: G-Ffte hpcc: EP-DGEMM hpcc: G-Rand Access indigobench: CPU - Bedroom indigobench: CPU - Supercar hpcc: Max Ping Pong Bandwidth compress-lz4: 1 - Compression Speed compress-lz4: 1 - Decompression Speed compress-lz4: 3 - Compression Speed compress-lz4: 3 - Decompression Speed compress-lz4: 9 - Compression Speed compress-lz4: 9 - Decompression Speed compress-zstd: 3 compress-zstd: 19 ffte: N=256, 3D Complex FFT Routine lczero: BLAS lczero: Eigen crafty: Elapsed Time stockfish: Total Time asmfish: 1024 Hash Memory, 26 Depth gromacs: Water Benchmark lammps: Rhodopsin Protein hint: FLOAT redis: LPOP redis: SADD redis: LPUSH redis: GET redis: SET glmark2: 1920 x 1080 numpy: ai-benchmark: Device Inference Score ai-benchmark: Device Training Score ai-benchmark: Device AI Score geekbench: GPU Vulkan geekbench: CPU Multi Core geekbench: CPU Single Core phpbench: PHP Benchmark Suite openssl: RSA 4096-bit Performance kripke: brl-cad: VGR Performance Metric namd: ATPase Simulation - 327,506 Atoms 
tensorflow-lite: SqueezeNet tensorflow-lite: Inception V4 tensorflow-lite: NASNet Mobile tensorflow-lite: Mobilenet Float tensorflow-lite: Mobilenet Quant tensorflow-lite: Inception ResNet V2 caffe: AlexNet - CPU - 100 caffe: GoogleNet - CPU - 100 pybench: Total For Average Test Times pyperformance: go pyperformance: 2to3 pyperformance: chaos pyperformance: float pyperformance: nbody pyperformance: pathlib pyperformance: raytrace pyperformance: json_loads pyperformance: crypto_pyaes pyperformance: regex_compile pyperformance: python_startup pyperformance: django_template pyperformance: pickle_pure_python onednn: IP Batch 1D - f32 - CPU onednn: IP Batch All - f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Deconvolution Batch deconv_1d - f32 - CPU onednn: Deconvolution Batch deconv_3d - f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU mnn: SqueezeNetV1.0 mnn: resnet-v2-50 mnn: MobileNetV2_224 mnn: mobilenet-v1-1.0 mnn: inception-v3 ncnn: CPU - squeezenet ncnn: CPU - mobilenet ncnn: CPU-v2-v2 - mobilenet-v2 ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: CPU - shufflenet-v2 ncnn: CPU - mnasnet ncnn: CPU - efficientnet-b0 ncnn: CPU - blazeface ncnn: CPU - googlenet ncnn: CPU - vgg16 ncnn: CPU - resnet18 ncnn: CPU - alexnet ncnn: CPU - resnet50 ncnn: CPU - yolov4-tiny ncnn: Vulkan GPU - squeezenet ncnn: Vulkan GPU - mobilenet ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU - shufflenet-v2 ncnn: Vulkan GPU - mnasnet ncnn: Vulkan GPU - efficientnet-b0 ncnn: Vulkan GPU - blazeface ncnn: Vulkan GPU - googlenet ncnn: Vulkan GPU - vgg16 ncnn: Vulkan GPU - resnet18 ncnn: Vulkan GPU - alexnet ncnn: Vulkan GPU - resnet50 ncnn: Vulkan GPU - yolov4-tiny tnn: CPU - MobileNet v2 tnn: CPU - SqueezeNet v1.1 betsy: ETC1 - Highest betsy: ETC2 RGB - Highest waifu2x-ncnn: 2x - 3 - Yes dolfyn: Computational 
Fluid Dynamics hmmer: Pfam Database Search mocassin: Dust 2D tau100.0 avifenc: 0 avifenc: 2 avifenc: 8 avifenc: 10 build-linux-kernel: Time To Compile build-llvm: Time To Compile espeak: Text-To-Speech Synthesis rnnoise: astcenc: Fast astcenc: Medium astcenc: Thorough astcenc: Exhaustive basis: ETC1S basis: UASTC Level 0 basis: UASTC Level 2 basis: UASTC Level 3 basis: UASTC Level 2 + RDO Post-Processing darktable: Boat - CPU-only darktable: Masskrug - CPU-only darktable: Server Rack - CPU-only darktable: Server Room - CPU-only hugin: Panorama Photo Assistant + Stitching Time ocrmypdf: Processing 60 Page PDF Document rawtherapee: Total Benchmark Time blender: BMW27 - CPU-Only mlpack: scikit_ica mlpack: scikit_qda mlpack: scikit_svm mlpack: scikit_linearridgeregression sunflow: Global Illumination + Image Synthesis tesseract-ocr: Time To OCR 7 Images hpcc: Rand Ring Latency 1 2 3 18432 454.91 1300.12 236.14 334.97 737.4 977.8 110.9 0.26 2.18 16.93 3.43 37.09 5.3673 5.1754 6.2294 6.1185 2.91 12.72 8.03 14.55 32.34 57.37 0.372 1.050 1.404 3.041 7.77 34.71 0.65366 6.74179 4.96131 120.90700 3.30686 55.50950 0.02004 0.930 1.954 14400.766 9763.10 11116.3 53.56 10496.6 52.50 10575.6 3687.7 21.1 25405.988379614 775 730 7738146 9840561 13797562 0.542 3.249 340004918.53832 2624535.48 1951987.09 1437805.00 2310604.62 1725875.75 6530 321.42 755 788 1543 36246 5227 1223 604079 1145.0 6926419 57419 4.49500 387017 5572917 280959 259621 266426 5048647 49082 120634 1020 249 320 111 117 117 17.7 480 27.2 116 169 8.02 46.9 440 7.43267 96.7410 20.6452 8.05221 12.0299 503.700 249.371 4.73057 8.853 34.931 4.937 5.819 42.573 21.87 23.82 6.66 6.18 4.79 6.32 9.21 1.95 19.42 72.57 17.83 18.95 37.32 32.96 4.84 8.28 2.57 3.57 2.32 2.78 9.33 0.91 5.74 10.82 2.16 4.09 6.10 11.26 263.815 260.260 5.799 6.688 7.158 17.756 111.354 288 146.262 86.460 6.631 6.100 159.301 1215.501 30.991 20.022 7.39 8.30 52.39 425.23 62.795 9.297 58.589 114.270 702.907 12.620 8.019 0.230 6.318 61.999 32.164 79.955 
323.46 49.12 74.11 21.43 3.19 1.948 24.567 0.39400 18479 457.51 1305.29 214.08 331.87 747.0 977.8 109.5 0.25 2.17 16.66 3.42 36.18 5.3310 5.1563 6.1430 6.1354 2.92 12.75 8.03 14.63 32.38 57.57 0.365 1.047 1.370 3.068 7.78 35.12 0.935 1.958 10001.81 11146.3 53.60 10635.3 51.20 10711.5 3632.0 21.2 25808.658192384 730 688 7573497 10050125 13866946 0.550 3.329 339696862.29595 2159676.18 1947660.53 1413366.45 2095708.19 1754885.38 6547 320.60 761 788 1549 36793 5248 1228 604512 1144.3 6434795 56730 4.48209 386977 5574923 279167 259600 266319 5046933 49047 120679 1017 248 321 111 116 117 17.7 477 27.3 116 170 8.08 46.7 439 8.42072 99.7379 19.1978 9.74749 13.1798 491.110 262.368 4.03910 8.848 34.995 4.936 5.824 42.444 21.67 23.71 6.64 6.11 4.82 6.27 9.1 1.92 19.30 72.10 17.75 18.57 37.05 32.84 4.85 8.25 2.58 3.58 2.33 2.76 9.34 0.88 5.70 10.71 2.17 4.10 6.13 11.29 264.303 260.451 5.608 6.476 7.184 17.704 111.303 288 149.263 88.209 6.575 6.037 160.558 1214.944 30.941 20.015 7.39 8.3 52.46 425.58 62.694 9.267 58.628 114.186 701.853 12.477 7.963 0.229 6.274 61.972 32.233 79.418 324.80 49.12 76.91 21.44 3.20 1.965 24.505 18478 456.91 1304.01 203.50 333.51 730.4 975.8 110.4 0.25 2.18 16.91 3.43 37.07 5.3162 5.1464 6.2800 6.1401 2.92 12.76 8.01 14.60 32.38 57.47 0.361 1.058 1.393 3.087 7.75 34.66 0.931 1.958 9959.06 11202.6 52.55 10588.0 51.45 10653.0 3687.7 21.2 25835.646713364 649 638 7394699 9864045 13730556 0.552 3.340 339643088.46449 1559670.96 2004338.25 1459398.53 2135888.90 1695298.03 6559 321.61 763 790 1553 36985 5256 1228 604322 1144.7 56960 4.48308 387009 5571717 279158 259575 266346 5048433 48962 120291 1021 248 321 111 117 117 17.8 476 27.3 116 168 8.13 46.8 438 8.81065 99.9683 19.4144 9.82517 12.8921 462.801 247.156 4.06739 8.874 35.167 4.910 5.811 42.545 21.83 23.71 6.68 6.25 4.84 6.30 9.16 1.92 19.51 72.72 18.00 18.81 37.31 33.20 4.85 8.27 2.58 3.58 2.32 2.76 9.36 0.87 5.69 10.61 2.17 4.09 6.09 11.30 264.123 260.441 5.602 6.470 7.183 17.680 111.494 288 145.653 
86.293 6.584 6.064 160.149 1216.085 30.956 20.028 7.37 8.30 52.40 425.39 62.769 9.267 58.621 114.261 704.133 12.509 8.026 0.234 6.292 62.706 32.264 80.349 325.17 49.14 75.29 21.52 3.22 1.996 24.509 OpenBenchmarking.org
VkFFT 2020-09-29
  Benchmark Score, More Is Better
  1: 18432 (SE +/- 80.13, N = 3)
  2: 18479 (SE +/- 18.75, N = 3)
  3: 18478 (SE +/- 15.18, N = 3)
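The "SE +/- x, N = n" figures attached to each result are easier to compare across tests when expressed relative to the mean. The sketch below is illustrative only: it assumes SE is the usual standard error of the mean (sample standard deviation divided by sqrt(N)), which this export does not state, and the function names are hypothetical helpers rather than anything provided by the Phoronix Test Suite.

```python
import math


def relative_se_percent(mean, se):
    """Standard error expressed as a percentage of the reported mean."""
    return 100.0 * se / mean


def implied_stddev(se, n):
    """Sample standard deviation implied by SE = s / sqrt(N) (assumed definition)."""
    return se * math.sqrt(n)


# VkFFT values copied from the result above: run -> (mean, SE, N)
vkfft = {1: (18432, 80.13, 3), 2: (18479, 18.75, 3), 3: (18478, 15.18, 3)}

for run, (mean, se, n) in vkfft.items():
    print(
        f"run {run}: SE = {relative_se_percent(mean, se):.2f}% of mean, "
        f"implied stddev = {implied_stddev(se, n):.1f}"
    )
```

For VkFFT every run stays well under 1% relative SE, so the three scores are effectively indistinguishable.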
MPV
  Video Input: Big Buck Bunny Sunflower 4K - Decode: Software Only
  FPS, More Is Better
  1: 454.91 (SE +/- 0.40, N = 3; MIN: 299.99 / MAX: 631.56)
  2: 457.51 (SE +/- 0.40, N = 3; MIN: 299.99 / MAX: 666.65)
  3: 456.91 (SE +/- 0.38, N = 3; MIN: 292.67 / MAX: 631.56)
  1. mpv 0.32.0
MPV
  Video Input: Big Buck Bunny Sunflower 1080p - Decode: Software Only
  FPS, More Is Better
  1: 1300.12 (SE +/- 2.29, N = 3; MIN: 749.97 / MAX: 2399.95)
  2: 1305.29 (SE +/- 2.23, N = 3; MIN: 799.97 / MAX: 2399.92)
  3: 1304.01 (SE +/- 7.87, N = 3; MIN: 799.97 / MAX: 2399.92)
  1. mpv 0.32.0
DDraceNetwork 15.2.3
  Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: RaiNyMore2
  Frames Per Second, More Is Better
  1: 236.14 (SE +/- 9.66, N = 15; MIN: 35.16 / MAX: 498.75)
  2: 214.08 (SE +/- 13.98, N = 15; MIN: 33.12 / MAX: 476.64)
  3: 203.50 (SE +/- 13.89, N = 12; MIN: 27.45 / MAX: 451.06)
  1. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
DDraceNetwork 15.2.3
  Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: Multeasymap
  Frames Per Second, More Is Better
  1: 334.97 (SE +/- 1.03, N = 3; MIN: 110.11 / MAX: 493.34)
  2: 331.87 (SE +/- 1.65, N = 3; MIN: 104.44 / MAX: 494.32)
  3: 333.51 (SE +/- 0.66, N = 3; MIN: 109.02 / MAX: 495.29)
  1. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
yquake2 7.45
  Renderer: OpenGL 1.x - Resolution: 1920 x 1080
  Frames Per Second, More Is Better
  1: 737.4 (SE +/- 8.88, N = 3)
  2: 747.0 (SE +/- 9.93, N = 3)
  3: 730.4 (SE +/- 1.13, N = 3)
  1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
yquake2 7.45
  Renderer: OpenGL 3.x - Resolution: 1920 x 1080
  Frames Per Second, More Is Better
  1: 977.8 (SE +/- 1.84, N = 3)
  2: 977.8 (SE +/- 0.50, N = 3)
  3: 975.8 (SE +/- 2.82, N = 3)
  1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
yquake2 7.45
  Renderer: Software CPU - Resolution: 1920 x 1080
  Frames Per Second, More Is Better
  1: 110.9 (SE +/- 0.48, N = 3)
  2: 109.5 (SE +/- 0.30, N = 3)
  3: 110.4 (SE +/- 0.31, N = 3)
  1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
AOM AV1 2.0
  Encoder Mode: Speed 0 Two-Pass
  Frames Per Second, More Is Better
  1: 0.26 (SE +/- 0.00, N = 3)
  2: 0.25 (SE +/- 0.00, N = 3)
  3: 0.25 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
AOM AV1 2.0
  Encoder Mode: Speed 4 Two-Pass
  Frames Per Second, More Is Better
  1: 2.18 (SE +/- 0.00, N = 3)
  2: 2.17 (SE +/- 0.00, N = 3)
  3: 2.18 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
AOM AV1 2.0
  Encoder Mode: Speed 6 Realtime
  Frames Per Second, More Is Better
  1: 16.93 (SE +/- 0.02, N = 3)
  2: 16.66 (SE +/- 0.07, N = 3)
  3: 16.91 (SE +/- 0.03, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
AOM AV1 2.0
  Encoder Mode: Speed 6 Two-Pass
  Frames Per Second, More Is Better
  1: 3.43 (SE +/- 0.00, N = 3)
  2: 3.42 (SE +/- 0.00, N = 3)
  3: 3.43 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
AOM AV1 2.0
  Encoder Mode: Speed 8 Realtime
  Frames Per Second, More Is Better
  1: 37.09 (SE +/- 0.08, N = 3)
  2: 36.18 (SE +/- 0.53, N = 3)
  3: 37.07 (SE +/- 0.08, N = 3)
  1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Embree 3.9.0
  Binary: Pathtracer - Model: Crown
  Frames Per Second, More Is Better
  1: 5.3673 (SE +/- 0.0401, N = 3; MIN: 4.89 / MAX: 5.49)
  2: 5.3310 (SE +/- 0.0092, N = 3; MIN: 5.3 / MAX: 5.4)
  3: 5.3162 (SE +/- 0.0526, N = 3; MIN: 4.89 / MAX: 5.45)
Embree 3.9.0
  Binary: Pathtracer ISPC - Model: Crown
  Frames Per Second, More Is Better
  1: 5.1754 (SE +/- 0.0016, N = 3; MIN: 5.15 / MAX: 5.25)
  2: 5.1563 (SE +/- 0.0124, N = 3; MIN: 5.11 / MAX: 5.25)
  3: 5.1464 (SE +/- 0.0320, N = 3; MIN: 5.05 / MAX: 5.25)
Embree 3.9.0
  Binary: Pathtracer - Model: Asian Dragon
  Frames Per Second, More Is Better
  1: 6.2294 (SE +/- 0.0315, N = 3; MIN: 6.16 / MAX: 6.36)
  2: 6.1430 (SE +/- 0.0103, N = 3; MIN: 6.1 / MAX: 6.24)
  3: 6.2800 (SE +/- 0.0267, N = 3; MIN: 6.21 / MAX: 6.4)
Embree 3.9.0
  Binary: Pathtracer ISPC - Model: Asian Dragon
  Frames Per Second, More Is Better
  1: 6.1185 (SE +/- 0.0222, N = 3; MIN: 6.05 / MAX: 6.22)
  2: 6.1354 (SE +/- 0.0098, N = 3; MIN: 6.09 / MAX: 6.22)
  3: 6.1401 (SE +/- 0.0025, N = 3; MIN: 6.11 / MAX: 6.21)
Kvazaar 2.0
  Video Input: Bosphorus 4K - Video Preset: Medium
  Frames Per Second, More Is Better
  1: 2.91 (SE +/- 0.00, N = 3)
  2: 2.92 (SE +/- 0.00, N = 3)
  3: 2.92 (SE +/- 0.00, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0
  Video Input: Bosphorus 1080p - Video Preset: Medium
  Frames Per Second, More Is Better
  1: 12.72 (SE +/- 0.01, N = 3)
  2: 12.75 (SE +/- 0.01, N = 3)
  3: 12.76 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0
  Video Input: Bosphorus 4K - Video Preset: Very Fast
  Frames Per Second, More Is Better
  1: 8.03 (SE +/- 0.01, N = 3)
  2: 8.03 (SE +/- 0.01, N = 3)
  3: 8.01 (SE +/- 0.01, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0
  Video Input: Bosphorus 4K - Video Preset: Ultra Fast
  Frames Per Second, More Is Better
  1: 14.55 (SE +/- 0.02, N = 3)
  2: 14.63 (SE +/- 0.01, N = 3)
  3: 14.60 (SE +/- 0.02, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0
  Video Input: Bosphorus 1080p - Video Preset: Very Fast
  Frames Per Second, More Is Better
  1: 32.34 (SE +/- 0.05, N = 3)
  2: 32.38 (SE +/- 0.02, N = 3)
  3: 32.38 (SE +/- 0.04, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar 2.0
  Video Input: Bosphorus 1080p - Video Preset: Ultra Fast
  Frames Per Second, More Is Better
  1: 57.37 (SE +/- 0.03, N = 3)
  2: 57.57 (SE +/- 0.03, N = 3)
  3: 57.47 (SE +/- 0.11, N = 3)
  1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
rav1e 0.4 Alpha
  Speed: 1
  Frames Per Second, More Is Better
  1: 0.372 (SE +/- 0.004, N = 3)
  2: 0.365 (SE +/- 0.005, N = 3)
  3: 0.361 (SE +/- 0.006, N = 3)
rav1e 0.4 Alpha
  Speed: 5
  Frames Per Second, More Is Better
  1: 1.050 (SE +/- 0.007, N = 3)
  2: 1.047 (SE +/- 0.004, N = 3)
  3: 1.058 (SE +/- 0.000, N = 3)
rav1e 0.4 Alpha
  Speed: 6
  Frames Per Second, More Is Better
  1: 1.404 (SE +/- 0.007, N = 3)
  2: 1.370 (SE +/- 0.006, N = 3)
  3: 1.393 (SE +/- 0.011, N = 3)
rav1e 0.4 Alpha
  Speed: 10
  Frames Per Second, More Is Better
  1: 3.041 (SE +/- 0.003, N = 3)
  2: 3.068 (SE +/- 0.030, N = 3)
  3: 3.087 (SE +/- 0.026, N = 3)
x265 3.4
  Video Input: Bosphorus 4K
  Frames Per Second, More Is Better
  1: 7.77 (SE +/- 0.05, N = 3)
  2: 7.78 (SE +/- 0.08, N = 3)
  3: 7.75 (SE +/- 0.05, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
x265 3.4
  Video Input: Bosphorus 1080p
  Frames Per Second, More Is Better
  1: 34.71 (SE +/- 0.05, N = 3)
  2: 35.12 (SE +/- 0.20, N = 3)
  3: 34.66 (SE +/- 0.22, N = 3)
  1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
HPC Challenge 1.5.0
  Test / Class: G-Ptrans
  GB/s, More Is Better
  1: 0.65366
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0
  Test / Class: EP-STREAM Triad
  GB/s, More Is Better
  1: 6.74179
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0
  Test / Class: Random Ring Bandwidth
  GB/s, More Is Better
  1: 4.96131
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0
  Test / Class: G-HPL
  GFLOPS, More Is Better
  1: 120.91
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0
  Test / Class: G-Ffte
  GFLOPS, More Is Better
  1: 3.30686
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0
  Test / Class: EP-DGEMM
  GFLOPS, More Is Better
  1: 55.51
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 4.0.3
HPC Challenge 1.5.0
  Test / Class: G-Random Access
  GUP/s, More Is Better
  1: 0.02004
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 4.0.3
IndigoBench 4.4
  Acceleration: CPU - Scene: Bedroom
  M samples/s, More Is Better
  1: 0.930 (SE +/- 0.001, N = 3)
  2: 0.935 (SE +/- 0.001, N = 3)
  3: 0.931 (SE +/- 0.001, N = 3)
IndigoBench 4.4
  Acceleration: CPU - Scene: Supercar
  M samples/s, More Is Better
  1: 1.954 (SE +/- 0.005, N = 3)
  2: 1.958 (SE +/- 0.004, N = 3)
  3: 1.958 (SE +/- 0.001, N = 3)
HPC Challenge 1.5.0
  Test / Class: Max Ping Pong Bandwidth
  MB/s, More Is Better
  1: 14400.77
  1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
  2. ATLAS + Open MPI 4.0.3
LZ4 Compression 1.9.3
  Compression Level: 1 - Compression Speed
  MB/s, More Is Better
  1: 9763.10 (SE +/- 45.71, N = 3)
  2: 10001.81 (SE +/- 99.34, N = 3)
  3: 9959.06 (SE +/- 105.59, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3
  Compression Level: 1 - Decompression Speed
  MB/s, More Is Better
  1: 11116.3 (SE +/- 46.13, N = 3)
  2: 11146.3 (SE +/- 12.36, N = 3)
  3: 11202.6 (SE +/- 11.80, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3
  Compression Level: 3 - Compression Speed
  MB/s, More Is Better
  1: 53.56 (SE +/- 0.84, N = 3)
  2: 53.60 (SE +/- 0.85, N = 3)
  3: 52.55 (SE +/- 0.28, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3
  Compression Level: 3 - Decompression Speed
  MB/s, More Is Better
  1: 10496.6 (SE +/- 12.91, N = 3)
  2: 10635.3 (SE +/- 18.42, N = 3)
  3: 10588.0 (SE +/- 22.07, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3
  Compression Level: 9 - Compression Speed
  MB/s, More Is Better
  1: 52.50 (SE +/- 0.72, N = 3)
  2: 51.20 (SE +/- 0.27, N = 3)
  3: 51.45 (SE +/- 0.03, N = 3)
  1. (CC) gcc options: -O3
LZ4 Compression 1.9.3
  Compression Level: 9 - Decompression Speed
  MB/s, More Is Better
  1: 10575.6 (SE +/- 23.62, N = 3)
  2: 10711.5 (SE +/- 8.30, N = 3)
  3: 10653.0 (SE +/- 26.11, N = 3)
  1. (CC) gcc options: -O3
Zstd Compression 1.4.5
  Compression Level: 3
  MB/s, More Is Better
  1: 3687.7 (SE +/- 19.56, N = 3)
  2: 3632.0 (SE +/- 37.93, N = 3)
  3: 3687.7 (SE +/- 19.08, N = 3)
  1. (CC) gcc options: -O3 -pthread -lz -llzma
Zstd Compression 1.4.5
  Compression Level: 19
  MB/s, More Is Better
  1: 21.1 (SE +/- 0.00, N = 3)
  2: 21.2 (SE +/- 0.00, N = 3)
  3: 21.2 (SE +/- 0.00, N = 3)
  1. (CC) gcc options: -O3 -pthread -lz -llzma
FFTE 7.0
  N=256, 3D Complex FFT Routine
  MFLOPS, More Is Better
  1: 25405.99 (SE +/- 31.78, N = 3)
  2: 25808.66 (SE +/- 19.99, N = 3)
  3: 25835.65 (SE +/- 17.94, N = 3)
  1. (F9X) gfortran options: -O3 -fomit-frame-pointer -fopenmp
LeelaChessZero 0.26
  Backend: BLAS
  Nodes Per Second, More Is Better
  1: 775 (SE +/- 7.81, N = 3)
  2: 730 (SE +/- 10.34, N = 4)
  3: 649 (SE +/- 7.68, N = 9)
  1. (CXX) g++ options: -flto -pthread
LeelaChessZero 0.26
  Backend: Eigen
  Nodes Per Second, More Is Better
  1: 730 (SE +/- 4.41, N = 3)
  2: 688 (SE +/- 9.06, N = 5)
  3: 638 (SE +/- 7.37, N = 9)
  1. (CXX) g++ options: -flto -pthread
Crafty 25.2
  Elapsed Time
  Nodes Per Second, More Is Better
  1: 7738146 (SE +/- 11851.34, N = 3)
  2: 7573497 (SE +/- 103135.62, N = 3)
  3: 7394699 (SE +/- 94283.61, N = 3)
  1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm
Stockfish 12
  Total Time
  Nodes Per Second, More Is Better
  1: 9840561 (SE +/- 111801.84, N = 6)
  2: 10050125 (SE +/- 45923.84, N = 3)
  3: 9864045 (SE +/- 124228.11, N = 3)
  1. (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver
asmFish 2018-07-23
  1024 Hash Memory, 26 Depth
  Nodes/second, More Is Better
  1: 13797562 (SE +/- 23080.42, N = 3)
  2: 13866946 (SE +/- 160984.48, N = 3)
  3: 13730556 (SE +/- 105834.72, N = 3)
GROMACS 2020.3
  Water Benchmark
  Ns Per Day, More Is Better
  1: 0.542 (SE +/- 0.000, N = 3)
  2: 0.550 (SE +/- 0.002, N = 3)
  3: 0.552 (SE +/- 0.000, N = 3)
  1. (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm
LAMMPS Molecular Dynamics Simulator 29Oct2020
  Model: Rhodopsin Protein
  ns/day, More Is Better
  1: 3.249 (SE +/- 0.052, N = 15)
  2: 3.329 (SE +/- 0.029, N = 3)
  3: 3.340 (SE +/- 0.021, N = 3)
  1. (CXX) g++ options: -O3 -pthread -lm
Hierarchical INTegration 1.0
  Test: FLOAT
  QUIPs, More Is Better
  1: 340004918.54 (SE +/- 215594.66, N = 3)
  2: 339696862.30 (SE +/- 156026.35, N = 3)
  3: 339643088.46 (SE +/- 106732.80, N = 3)
  1. (CC) gcc options: -O3 -march=native -lm
Redis 6.0.9
  Test: LPOP
  Requests Per Second, More Is Better
  1: 2624535.48 (SE +/- 39136.83, N = 15)
  2: 2159676.18 (SE +/- 123032.71, N = 12)
  3: 1559670.96 (SE +/- 13136.89, N = 3)
  1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9
  Test: SADD
  Requests Per Second, More Is Better
  1: 1951987.09 (SE +/- 22720.13, N = 15)
  2: 1947660.53 (SE +/- 25708.45, N = 15)
  3: 2004338.25 (SE +/- 18274.51, N = 3)
  1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9
  Test: LPUSH
  Requests Per Second, More Is Better
  1: 1437805.00 (SE +/- 14444.21, N = 3)
  2: 1413366.45 (SE +/- 12228.65, N = 3)
  3: 1459398.53 (SE +/- 13914.34, N = 15)
  1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9
  Test: GET
  Requests Per Second, More Is Better
  1: 2310604.62 (SE +/- 40313.16, N = 15)
  2: 2095708.19 (SE +/- 26352.66, N = 15)
  3: 2135888.90 (SE +/- 21009.40, N = 15)
  1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9
  Test: SET
  Requests Per Second, More Is Better
  1: 1725875.75 (SE +/- 18920.71, N = 3)
  2: 1754885.38 (SE +/- 29276.53, N = 3)
  3: 1695298.03 (SE +/- 19775.05, N = 15)
  1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
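The Redis results vary far more between the three otherwise identical configurations than most other tests in this file. One rough way to judge whether a gap is larger than run-to-run noise is to divide the difference of two means by their combined standard error. The sketch below is only an illustration: `se_gap` is a hypothetical helper, the calculation assumes the two runs are independent and that the reported SE values are standard errors of the mean, and it is not something the Phoronix Test Suite reports.

```python
import math


def se_gap(mean_a, se_a, mean_b, se_b):
    """Difference of two means in units of their combined standard error.

    Assumes independent runs; values of a few units or more suggest the
    gap is unlikely to be measurement noise alone.
    """
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Redis LPOP, configuration 1 vs 3 (values copied from the results above)
lpop_1_vs_3 = se_gap(2624535.48, 39136.83, 1559670.96, 13136.89)

# Redis SADD, configuration 1 vs 2
sadd_1_vs_2 = se_gap(1951987.09, 22720.13, 1947660.53, 25708.45)

print(f"LPOP 1 vs 3: {lpop_1_vs_3:.1f} combined SEs apart")
print(f"SADD 1 vs 2: {sadd_1_vs_2:.2f} combined SEs apart")
```

By this measure the LPOP gap between configurations 1 and 3 is far outside the noise, while the SADD gap between configurations 1 and 2 is not.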
GLmark2 2020.04
  Resolution: 1920 x 1080
  Score, More Is Better
  1: 6530
  2: 6547
  3: 6559
Numpy Benchmark
  Score, More Is Better
  1: 321.42 (SE +/- 0.41, N = 3)
  2: 320.60 (SE +/- 0.23, N = 3)
  3: 321.61 (SE +/- 0.63, N = 3)
AI Benchmark Alpha 0.1.2
  Device Inference Score
  Score, More Is Better
  1: 755
  2: 761
  3: 763
AI Benchmark Alpha 0.1.2
  Device Training Score
  Score, More Is Better
  1: 788
  2: 788
  3: 790
AI Benchmark Alpha 0.1.2
  Device AI Score
  Score, More Is Better
  1: 1543
  2: 1549
  3: 1553
Geekbench 5
  Test: GPU Vulkan
  Score, More Is Better
  1: 36246 (SE +/- 459.84, N = 3)
  2: 36793 (SE +/- 115.14, N = 3)
  3: 36985 (SE +/- 56.79, N = 3)
Geekbench 5
  Test: CPU Multi Core
  Score, More Is Better
  1: 5227 (SE +/- 4.98, N = 3)
  2: 5248 (SE +/- 7.09, N = 3)
  3: 5256 (SE +/- 6.17, N = 3)
Geekbench 5
  Test: CPU Single Core
  Score, More Is Better
  1: 1223 (SE +/- 0.58, N = 3)
  2: 1228 (SE +/- 1.33, N = 3)
  3: 1228 (SE +/- 0.88, N = 3)
PHPBench 0.8.1
  PHP Benchmark Suite
  Score, More Is Better
  1: 604079 (SE +/- 5320.80, N = 3)
  2: 604512 (SE +/- 3538.13, N = 3)
  3: 604322 (SE +/- 3258.38, N = 3)
OpenSSL 1.1.1
  RSA 4096-bit Performance
  Signs Per Second, More Is Better
  1: 1145.0 (SE +/- 0.15, N = 3)
  2: 1144.3 (SE +/- 0.27, N = 3)
  3: 1144.7 (SE +/- 0.07, N = 3)
  1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
Kripke 1.2.4
  Throughput FoM, More Is Better
  1: 6926419
  2: 6434795
  SE +/- 717516.50, N = 2 (the export lists a single SE for the two results)
  1. (CXX) g++ options: -O3 -fopenmp
BRL-CAD VGR Performance Metric OpenBenchmarking.org VGR Performance Metric, More Is Better BRL-CAD 7.30.8 VGR Performance Metric 1 2 3 12K 24K 36K 48K 60K 57419 56730 56960 1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
NAMD ATPase Simulation - 327,506 Atoms OpenBenchmarking.org days/ns, Fewer Is Better NAMD 2.14 ATPase Simulation - 327,506 Atoms 1 2 3 1.0114 2.0228 3.0342 4.0456 5.057 SE +/- 0.00243, N = 3 SE +/- 0.00085, N = 3 SE +/- 0.00219, N = 3 4.49500 4.48209 4.48308
TensorFlow Lite 2020-08-23, Model: SqueezeNet (Microseconds, Fewer Is Better). Runs 1 / 2 / 3: 387017 / 386977 / 387009. SE +/- 60.36, N = 3; SE +/- 56.90, N = 3; SE +/- 10.21, N = 3.
TensorFlow Lite 2020-08-23, Model: Inception V4 (Microseconds, Fewer Is Better). Runs 1 / 2 / 3: 5572917 / 5574923 / 5571717. SE +/- 421.91, N = 3; SE +/- 2695.68, N = 3; SE +/- 230.96, N = 3.
TensorFlow Lite 2020-08-23, Model: NASNet Mobile (Microseconds, Fewer Is Better). Runs 1 / 2 / 3: 280959 / 279167 / 279158. SE +/- 1226.74, N = 3; SE +/- 79.18, N = 3; SE +/- 151.01, N = 3.
TensorFlow Lite 2020-08-23, Model: Mobilenet Float (Microseconds, Fewer Is Better). Runs 1 / 2 / 3: 259621 / 259600 / 259575. SE +/- 20.61, N = 3; SE +/- 44.31, N = 3; SE +/- 21.18, N = 3.
TensorFlow Lite 2020-08-23, Model: Mobilenet Quant (Microseconds, Fewer Is Better). Runs 1 / 2 / 3: 266426 / 266319 / 266346. SE +/- 51.91, N = 3; SE +/- 131.03, N = 3; SE +/- 60.84, N = 3.
TensorFlow Lite 2020-08-23, Model: Inception ResNet V2 (Microseconds, Fewer Is Better). Runs 1 / 2 / 3: 5048647 / 5046933 / 5048433. SE +/- 308.45, N = 3; SE +/- 255.76, N = 3; SE +/- 2537.61, N = 3.
Caffe 2020-02-13, Model: AlexNet - Acceleration: CPU - Iterations: 100 (Milli-Seconds, Fewer Is Better). Runs 1 / 2 / 3: 49082 / 49047 / 48962. SE +/- 14.01, N = 3; SE +/- 8.08, N = 3; SE +/- 27.06, N = 3. Note: (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
Caffe 2020-02-13, Model: GoogleNet - Acceleration: CPU - Iterations: 100 (Milli-Seconds, Fewer Is Better). Runs 1 / 2 / 3: 120634 / 120679 / 120291. SE +/- 118.13, N = 3; SE +/- 32.70, N = 3; SE +/- 73.69, N = 3. Note: (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
DDraceNetwork 15.2.3, Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: Multeasymap - Total Frame Time (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: Min: 2.07 / Avg: 2.98 / Max: 7.54; Min: 2.09 / Avg: 3.02 / Max: 7.16; Min: 2.02 / Avg: 3 / Max: 10.7. Note: (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
PyBench 2018-02-16, Total For Average Test Times (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: 1020 / 1017 / 1021. SE +/- 2.31, N = 3; SE +/- 1.76, N = 3.
PyPerformance 1.0.0, Benchmark: go (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: 249 / 248 / 248. SE +/- 0.33, N = 3.
PyPerformance 1.0.0, Benchmark: 2to3 (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: 320 / 321 / 321. SE +/- 0.33, N = 3.
PyPerformance 1.0.0, Benchmark: chaos (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: 111 / 111 / 111.
PyPerformance 1.0.0, Benchmark: float (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: 117 / 116 / 117. SE +/- 0.33, N = 3.
PyPerformance 1.0.0, Benchmark: nbody (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: 117 / 117 / 117.
PyPerformance 1.0.0, Benchmark: pathlib (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: 17.7 / 17.7 / 17.8. SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3.
PyPerformance 1.0.0, Benchmark: raytrace (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: 480 / 477 / 476. SE +/- 0.33, N = 3.
PyPerformance 1.0.0, Benchmark: json_loads (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: 27.2 / 27.3 / 27.3. SE +/- 0.03, N = 3; SE +/- 0.00, N = 3; SE +/- 0.03, N = 3.
PyPerformance 1.0.0, Benchmark: crypto_pyaes (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: 116 / 116 / 116.
PyPerformance 1.0.0, Benchmark: regex_compile (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: 169 / 170 / 168.
PyPerformance 1.0.0, Benchmark: python_startup (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: 8.02 / 8.08 / 8.13. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.03, N = 3.
PyPerformance 1.0.0, Benchmark: django_template (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: 46.9 / 46.7 / 46.8. SE +/- 0.10, N = 3; SE +/- 0.03, N = 3; SE +/- 0.06, N = 3.
PyPerformance 1.0.0, Benchmark: pickle_pure_python (Milliseconds, Fewer Is Better). Runs 1 / 2 / 3: 440 / 439 / 438.
oneDNN 1.5, Harness: IP Batch 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Runs 1 / 2 / 3: 7.43267 / 8.42072 / 8.81065. SE +/- 0.03162, N = 3; SE +/- 0.12292, N = 3; SE +/- 0.12077, N = 15. MIN: 7.23; MIN: 8.01; MIN: 8.21. Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5, Harness: IP Batch All - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Runs 1 / 2 / 3: 96.74 / 99.74 / 99.97. SE +/- 0.07, N = 3; SE +/- 0.13, N = 3; SE +/- 0.89, N = 15. MIN: 95.29; MIN: 97.89; MIN: 95.99. Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Runs 1 / 2 / 3: 20.65 / 19.20 / 19.41. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.02, N = 3. MIN: 20.28; MIN: 19.01; MIN: 19.18. Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5, Harness: Deconvolution Batch deconv_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Runs 1 / 2 / 3: 8.05221 / 9.74749 / 9.82517. SE +/- 0.01249, N = 3; SE +/- 0.12846, N = 3; SE +/- 0.17039, N = 15. MIN: 7.86; MIN: 9.46; MIN: 8.65. Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5, Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Runs 1 / 2 / 3: 12.03 / 13.18 / 12.89. SE +/- 0.02, N = 3; SE +/- 0.17, N = 3; SE +/- 0.02, N = 3. MIN: 11.78; MIN: 12.66; MIN: 12.66. Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Runs 1 / 2 / 3: 503.70 / 491.11 / 462.80. SE +/- 2.86, N = 3; SE +/- 2.10, N = 3; SE +/- 3.41, N = 3. MIN: 495.04; MIN: 474.1; MIN: 453.6. Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5, Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Runs 1 / 2 / 3: 249.37 / 262.37 / 247.16. SE +/- 0.60, N = 3; SE +/- 3.42, N = 3; SE +/- 1.11, N = 3. MIN: 247.16; MIN: 254.47; MIN: 243.72. Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better). Runs 1 / 2 / 3: 4.73057 / 4.03910 / 4.06739. SE +/- 0.00490, N = 3; SE +/- 0.01879, N = 3; SE +/- 0.00436, N = 3. MIN: 4.66; MIN: 3.96; MIN: 3.99. Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Mobile Neural Network 2020-09-17, Model: SqueezeNetV1.0 (ms, Fewer Is Better). Runs 1 / 2 / 3: 8.853 / 8.848 / 8.874. SE +/- 0.003, N = 3; SE +/- 0.005, N = 3; SE +/- 0.018, N = 3. MIN: 8.76 / MAX: 11.24; MIN: 8.76 / MAX: 9.86; MIN: 8.77 / MAX: 11.46. Note: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2020-09-17, Model: resnet-v2-50 (ms, Fewer Is Better). Runs 1 / 2 / 3: 34.93 / 35.00 / 35.17. SE +/- 0.05, N = 3; SE +/- 0.04, N = 3; SE +/- 0.25, N = 3. MIN: 34.38 / MAX: 58.39; MIN: 34.7 / MAX: 43.77; MIN: 34.65 / MAX: 80.67. Note: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2020-09-17, Model: MobileNetV2_224 (ms, Fewer Is Better). Runs 1 / 2 / 3: 4.937 / 4.936 / 4.910. SE +/- 0.018, N = 3; SE +/- 0.020, N = 3; SE +/- 0.030, N = 3. MIN: 4.86 / MAX: 5.44; MIN: 4.86 / MAX: 5.66; MIN: 4.82 / MAX: 6.27. Note: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2020-09-17, Model: mobilenet-v1-1.0 (ms, Fewer Is Better). Runs 1 / 2 / 3: 5.819 / 5.824 / 5.811. SE +/- 0.016, N = 3; SE +/- 0.028, N = 3; SE +/- 0.015, N = 3. MIN: 5.74 / MAX: 6.34; MIN: 5.72 / MAX: 25.28; MIN: 5.73 / MAX: 15.06. Note: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2020-09-17, Model: inception-v3 (ms, Fewer Is Better). Runs 1 / 2 / 3: 42.57 / 42.44 / 42.55. SE +/- 0.08, N = 3; SE +/- 0.08, N = 3; SE +/- 0.17, N = 3. MIN: 42.21 / MAX: 51.92; MIN: 42.09 / MAX: 51.33; MIN: 42.02 / MAX: 58.27. Note: (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN 20200916, Target: CPU - Model: squeezenet (ms, Fewer Is Better). Runs 1 / 2 / 3: 21.87 / 21.67 / 21.83. SE +/- 0.09, N = 3; SE +/- 0.06, N = 3; SE +/- 0.12, N = 3. MIN: 21.59 / MAX: 23.26; MIN: 21.39 / MAX: 37.01; MIN: 21.56 / MAX: 22.96. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: mobilenet (ms, Fewer Is Better). Runs 1 / 2 / 3: 23.82 / 23.71 / 23.71. SE +/- 0.18, N = 3; SE +/- 0.15, N = 3; SE +/- 0.11, N = 3. MIN: 23.36 / MAX: 24.99; MIN: 23.3 / MAX: 24.65; MIN: 23.46 / MAX: 24.96. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better). Runs 1 / 2 / 3: 6.66 / 6.64 / 6.68. SE +/- 0.00, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3. MIN: 6.56 / MAX: 7.93; MIN: 6.54 / MAX: 7.9; MIN: 6.59 / MAX: 8.01. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better). Runs 1 / 2 / 3: 6.18 / 6.11 / 6.25. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.04, N = 3. MIN: 6.11 / MAX: 7.82; MIN: 6.03 / MAX: 7.3; MIN: 6.11 / MAX: 37.7. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better). Runs 1 / 2 / 3: 4.79 / 4.82 / 4.84. SE +/- 0.04, N = 3; SE +/- 0.00, N = 3; SE +/- 0.02, N = 3. MIN: 4.65 / MAX: 5.62; MIN: 4.76 / MAX: 6; MIN: 4.76 / MAX: 5.65. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: mnasnet (ms, Fewer Is Better). Runs 1 / 2 / 3: 6.32 / 6.27 / 6.30. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3. MIN: 6.26 / MAX: 7.07; MIN: 6.2 / MAX: 7.42; MIN: 6.22 / MAX: 7.04. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better). Runs 1 / 2 / 3: 9.21 / 9.10 / 9.16. SE +/- 0.05, N = 3; SE +/- 0.00, N = 3; SE +/- 0.01, N = 3. MIN: 9.07 / MAX: 14.29; MIN: 9.02 / MAX: 9.48; MIN: 9.06 / MAX: 9.6. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: blazeface (ms, Fewer Is Better). Runs 1 / 2 / 3: 1.95 / 1.92 / 1.92. SE +/- 0.03, N = 3; SE +/- 0.01, N = 3; SE +/- 0.00, N = 3. MIN: 1.9 / MAX: 2.15; MIN: 1.88 / MAX: 2.17; MIN: 1.89 / MAX: 2.52. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: googlenet (ms, Fewer Is Better). Runs 1 / 2 / 3: 19.42 / 19.30 / 19.51. SE +/- 0.01, N = 3; SE +/- 0.02, N = 3; SE +/- 0.02, N = 3. MIN: 19.3 / MAX: 24.48; MIN: 19.11 / MAX: 38.78; MIN: 19.37 / MAX: 20.62. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: vgg16 (ms, Fewer Is Better). Runs 1 / 2 / 3: 72.57 / 72.10 / 72.72. SE +/- 0.13, N = 3; SE +/- 0.22, N = 3; SE +/- 0.02, N = 3. MIN: 72.08 / MAX: 107.37; MIN: 71.61 / MAX: 88.56; MIN: 72.42 / MAX: 80.73. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: resnet18 (ms, Fewer Is Better). Runs 1 / 2 / 3: 17.83 / 17.75 / 18.00. SE +/- 0.05, N = 3; SE +/- 0.05, N = 3; SE +/- 0.01, N = 3. MIN: 17.65 / MAX: 21.58; MIN: 17.57 / MAX: 18.32; MIN: 17.88 / MAX: 18.52. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: alexnet (ms, Fewer Is Better). Runs 1 / 2 / 3: 18.95 / 18.57 / 18.81. SE +/- 0.03, N = 3; SE +/- 0.02, N = 3; SE +/- 0.01, N = 3. MIN: 18.85 / MAX: 19.3; MIN: 18.49 / MAX: 19.27; MIN: 18.72 / MAX: 19.3. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: resnet50 (ms, Fewer Is Better). Runs 1 / 2 / 3: 37.32 / 37.05 / 37.31. SE +/- 0.03, N = 3; SE +/- 0.07, N = 3; SE +/- 0.07, N = 3. MIN: 37.15 / MAX: 39.69; MIN: 36.81 / MAX: 38.07; MIN: 37.14 / MAX: 39.49. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better). Runs 1 / 2 / 3: 32.96 / 32.84 / 33.20. SE +/- 0.04, N = 3; SE +/- 0.09, N = 3; SE +/- 0.07, N = 3. MIN: 32.78 / MAX: 34.14; MIN: 32.57 / MAX: 55.87; MIN: 32.95 / MAX: 34.47. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: Vulkan GPU - Model: squeezenet (ms, Fewer Is Better). Runs 1 / 2 / 3: 4.84 / 4.85 / 4.85. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.02, N = 3. MIN: 4.71 / MAX: 5.84; MIN: 4.69 / MAX: 5.86; MIN: 4.71 / MAX: 5.84. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better). Runs 1 / 2 / 3: 8.28 / 8.25 / 8.27. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3. MIN: 7.59 / MAX: 11.65; MIN: 7.57 / MAX: 11.24; MIN: 7.6 / MAX: 11.3. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better). Runs 1 / 2 / 3: 2.57 / 2.58 / 2.58. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3. MIN: 2.53 / MAX: 3.17; MIN: 2.53 / MAX: 5.2; MIN: 2.53 / MAX: 3.72. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better). Runs 1 / 2 / 3: 3.57 / 3.58 / 3.58. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.00, N = 3. MIN: 3.54 / MAX: 4.28; MIN: 3.53 / MAX: 4.28; MIN: 3.54 / MAX: 4.3. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better). Runs 1 / 2 / 3: 2.32 / 2.33 / 2.32. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3. MIN: 2.28 / MAX: 3.46; MIN: 2.3 / MAX: 2.98; MIN: 2.29 / MAX: 2.74. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better). Runs 1 / 2 / 3: 2.78 / 2.76 / 2.76. SE +/- 0.02, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3. MIN: 2.71 / MAX: 15.58; MIN: 2.72 / MAX: 3.3; MIN: 2.71 / MAX: 3.28. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better). Runs 1 / 2 / 3: 9.33 / 9.34 / 9.36. SE +/- 0.12, N = 3; SE +/- 0.17, N = 3; SE +/- 0.10, N = 3. MIN: 8.97 / MAX: 19.5; MIN: 8.93 / MAX: 17.56; MIN: 8.96 / MAX: 20. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better). Runs 1 / 2 / 3: 0.91 / 0.88 / 0.87. SE +/- 0.03, N = 3; SE +/- 0.01, N = 3; SE +/- 0.00, N = 3. MIN: 0.85 / MAX: 1.18; MIN: 0.86 / MAX: 1.45; MIN: 0.85 / MAX: 1.15. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better). Runs 1 / 2 / 3: 5.74 / 5.70 / 5.69. SE +/- 0.05, N = 3; SE +/- 0.01, N = 3; SE +/- 0.02, N = 3. MIN: 5.64 / MAX: 15.26; MIN: 5.65 / MAX: 8.52; MIN: 5.63 / MAX: 10.18. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better). Runs 1 / 2 / 3: 10.82 / 10.71 / 10.61. SE +/- 0.16, N = 3; SE +/- 0.05, N = 3; SE +/- 0.07, N = 3. MIN: 10.18 / MAX: 24.02; MIN: 10.19 / MAX: 20.01; MIN: 10.2 / MAX: 24. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better). Runs 1 / 2 / 3: 2.16 / 2.17 / 2.17. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3. MIN: 2.09 / MAX: 2.76; MIN: 2.1 / MAX: 2.77; MIN: 2.1 / MAX: 2.76. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better). Runs 1 / 2 / 3: 4.09 / 4.10 / 4.09. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3. MIN: 3.96 / MAX: 4.79; MIN: 3.99 / MAX: 4.76; MIN: 3.97 / MAX: 4.66. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better). Runs 1 / 2 / 3: 6.10 / 6.13 / 6.09. SE +/- 0.01, N = 3; SE +/- 0.03, N = 3; SE +/- 0.00, N = 3. MIN: 6.06 / MAX: 10.11; MIN: 6.04 / MAX: 11.31; MIN: 6.06 / MAX: 6.39. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916, Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better). Runs 1 / 2 / 3: 11.26 / 11.29 / 11.30. SE +/- 0.02, N = 3; SE +/- 0.02, N = 3; SE +/- 0.01, N = 3. MIN: 11.09 / MAX: 11.79; MIN: 11.1 / MAX: 11.73; MIN: 11.15 / MAX: 11.62. Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
TNN 0.2.3, Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better). Runs 1 / 2 / 3: 263.82 / 264.30 / 264.12. SE +/- 0.00, N = 3; SE +/- 0.19, N = 3; SE +/- 0.03, N = 3. MIN: 263.16 / MAX: 270.27; MIN: 263.28 / MAX: 314.67; MIN: 263.32 / MAX: 271.64. Note: (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
TNN 0.2.3, Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better). Runs 1 / 2 / 3: 260.26 / 260.45 / 260.44. SE +/- 0.22, N = 3; SE +/- 0.15, N = 3; SE +/- 0.05, N = 3. MIN: 259.28 / MAX: 261.34; MIN: 259.62 / MAX: 261.24; MIN: 259.68 / MAX: 261.29. Note: (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
Betsy GPU Compressor 1.1 Beta, Codec: ETC1 - Quality: Highest (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 5.799 / 5.608 / 5.602. SE +/- 0.224, N = 15; SE +/- 0.062, N = 3; SE +/- 0.064, N = 3. Note: (CXX) g++ options: -O3 -O2 -lpthread -ldl
Betsy GPU Compressor 1.1 Beta, Codec: ETC2 RGB - Quality: Highest (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 6.688 / 6.476 / 6.470. SE +/- 0.213, N = 15; SE +/- 0.003, N = 3; SE +/- 0.003, N = 3. Note: (CXX) g++ options: -O3 -O2 -lpthread -ldl
Waifu2x-NCNN Vulkan 20200818, Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 7.158 / 7.184 / 7.183. SE +/- 0.002, N = 3; SE +/- 0.007, N = 3; SE +/- 0.008, N = 3.
Dolfyn 0.527, Computational Fluid Dynamics (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 17.76 / 17.70 / 17.68. SE +/- 0.01, N = 3; SE +/- 0.05, N = 3; SE +/- 0.02, N = 3.
Timed HMMer Search 3.3.1, Pfam Database Search (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 111.35 / 111.30 / 111.49. SE +/- 0.15, N = 3; SE +/- 0.11, N = 3; SE +/- 0.23, N = 3. Note: (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
Monte Carlo Simulations of Ionised Nebulae 2019-03-24, Input: Dust 2D tau100.0 (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 288 / 288 / 288. SE +/- 0.58, N = 3; SE +/- 0.67, N = 3; SE +/- 0.88, N = 3. Note: (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O3 -O2 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lrt -lz
libavif avifenc 0.7.3, Encoder Speed: 0 (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 146.26 / 149.26 / 145.65. SE +/- 0.63, N = 3; SE +/- 0.61, N = 3; SE +/- 0.42, N = 3. Note: (CXX) g++ options: -O3 -fPIC
libavif avifenc 0.7.3, Encoder Speed: 2 (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 86.46 / 88.21 / 86.29. SE +/- 0.23, N = 3; SE +/- 0.07, N = 3; SE +/- 0.06, N = 3. Note: (CXX) g++ options: -O3 -fPIC
libavif avifenc 0.7.3, Encoder Speed: 8 (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 6.631 / 6.575 / 6.584. SE +/- 0.004, N = 3; SE +/- 0.005, N = 3; SE +/- 0.001, N = 3. Note: (CXX) g++ options: -O3 -fPIC
libavif avifenc 0.7.3, Encoder Speed: 10 (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 6.100 / 6.037 / 6.064. SE +/- 0.015, N = 3; SE +/- 0.019, N = 3; SE +/- 0.017, N = 3. Note: (CXX) g++ options: -O3 -fPIC
Timed Linux Kernel Compilation 5.4, Time To Compile (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 159.30 / 160.56 / 160.15. SE +/- 0.52, N = 3; SE +/- 1.04, N = 3; SE +/- 0.59, N = 3.
Timed LLVM Compilation 10.0, Time To Compile (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 1215.50 / 1214.94 / 1216.09. SE +/- 1.95, N = 3; SE +/- 1.32, N = 3; SE +/- 0.35, N = 3.
eSpeak-NG Speech Engine 20200907, Text-To-Speech Synthesis (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 30.99 / 30.94 / 30.96. SE +/- 0.15, N = 4; SE +/- 0.01, N = 4; SE +/- 0.02, N = 4. Note: (CC) gcc options: -O2 -std=c99
RNNoise 2020-06-28 (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 20.02 / 20.02 / 20.03. SE +/- 0.02, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3. Note: (CC) gcc options: -O2 -pedantic -fvisibility=hidden
ASTC Encoder 2.0, Preset: Fast (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 7.39 / 7.39 / 7.37. SE +/- 0.01, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3. Note: (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0, Preset: Medium (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 8.30 / 8.30 / 8.30. SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.01, N = 3. Note: (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0, Preset: Thorough (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 52.39 / 52.46 / 52.40. SE +/- 0.01, N = 3; SE +/- 0.05, N = 3; SE +/- 0.01, N = 3. Note: (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
ASTC Encoder 2.0, Preset: Exhaustive (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 425.23 / 425.58 / 425.39. SE +/- 0.13, N = 3; SE +/- 0.16, N = 3; SE +/- 0.06, N = 3. Note: (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Basis Universal 1.12, Settings: ETC1S (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 62.80 / 62.69 / 62.77. SE +/- 0.17, N = 3; SE +/- 0.11, N = 3; SE +/- 0.15, N = 3. Note: (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12, Settings: UASTC Level 0 (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 9.297 / 9.267 / 9.267. SE +/- 0.015, N = 3; SE +/- 0.018, N = 3; SE +/- 0.014, N = 3. Note: (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12, Settings: UASTC Level 2 (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 58.59 / 58.63 / 58.62. SE +/- 0.01, N = 3; SE +/- 0.03, N = 3; SE +/- 0.03, N = 3. Note: (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12, Settings: UASTC Level 3 (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 114.27 / 114.19 / 114.26. SE +/- 0.02, N = 3; SE +/- 0.01, N = 3; SE +/- 0.01, N = 3. Note: (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal 1.12, Settings: UASTC Level 2 + RDO Post-Processing (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 702.91 / 701.85 / 704.13. SE +/- 1.79, N = 3; SE +/- 0.97, N = 3; SE +/- 1.97, N = 3. Note: (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Darktable 3.2.1, Test: Boat - Acceleration: CPU-only (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 12.62 / 12.48 / 12.51. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.00, N = 3.
Darktable 3.2.1, Test: Masskrug - Acceleration: CPU-only (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 8.019 / 7.963 / 8.026. SE +/- 0.016, N = 3; SE +/- 0.006, N = 3; SE +/- 0.018, N = 3.
Darktable 3.2.1, Test: Server Rack - Acceleration: CPU-only (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 0.230 / 0.229 / 0.234. SE +/- 0.000, N = 3; SE +/- 0.000, N = 3; SE +/- 0.004, N = 3.
Darktable 3.2.1, Test: Server Room - Acceleration: CPU-only (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 6.318 / 6.274 / 6.292. SE +/- 0.003, N = 3; SE +/- 0.010, N = 3; SE +/- 0.002, N = 3.
Hugin, Panorama Photo Assistant + Stitching Time (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 62.00 / 61.97 / 62.71. SE +/- 0.29, N = 3; SE +/- 0.43, N = 3; SE +/- 0.37, N = 3.
OCRMyPDF 10.3.1+dfsg, Processing 60 Page PDF Document (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 32.16 / 32.23 / 32.26. SE +/- 0.12, N = 3; SE +/- 0.07, N = 3; SE +/- 0.03, N = 3.
RawTherapee, Total Benchmark Time (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 79.96 / 79.42 / 80.35. SE +/- 0.02, N = 3; SE +/- 0.02, N = 3; SE +/- 0.06, N = 3. Note: RawTherapee, version 5.8, command line.
Blender 2.90, Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 323.46 / 324.80 / 325.17. SE +/- 0.35, N = 3; SE +/- 0.44, N = 3; SE +/- 0.71, N = 3.
Mlpack Benchmark, Benchmark: scikit_ica (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 49.12 / 49.12 / 49.14. SE +/- 0.07, N = 3; SE +/- 0.05, N = 3; SE +/- 0.04, N = 3.
Mlpack Benchmark, Benchmark: scikit_qda (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 74.11 / 76.91 / 75.29. SE +/- 0.39, N = 3; SE +/- 1.17, N = 12; SE +/- 1.16, N = 3.
Mlpack Benchmark, Benchmark: scikit_svm (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 21.43 / 21.44 / 21.52. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3; SE +/- 0.03, N = 3.
Mlpack Benchmark, Benchmark: scikit_linearridgeregression (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 3.19 / 3.20 / 3.22. SE +/- 0.01, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3.
Sunflow Rendering System 0.07.2, Global Illumination + Image Synthesis (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 1.948 / 1.965 / 1.996. SE +/- 0.029, N = 4; SE +/- 0.024, N = 3; SE +/- 0.023, N = 3. MIN: 1.73 / MAX: 2.71; MIN: 1.78 / MAX: 2.54; MIN: 1.81 / MAX: 2.69.
Tesseract OCR 4.1.1, Time To OCR 7 Images (Seconds, Fewer Is Better). Runs 1 / 2 / 3: 24.57 / 24.51 / 24.51. SE +/- 0.03, N = 3; SE +/- 0.06, N = 3; SE +/- 0.02, N = 3.
HPC Challenge 1.5.0, Test / Class: Random Ring Latency (usecs, Fewer Is Better). Run 1: 0.39400. Notes: (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops; ATLAS + Open MPI 4.0.3
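A minimal sketch of how two runs of a single test in this listing could be compared, using the oneDNN 1.5 IP Batch 1D figures reported here for runs 1 and 2 (7.43267 ms with SE 0.03162, and 8.42072 ms with SE 0.12292). The function names are hypothetical, and dividing the difference by the combined standard error is a rough rule of thumb assumed for illustration; it is not something the Phoronix Test Suite itself reports.

```python
import math


def percent_change(baseline, other):
    """Relative change of `other` against `baseline`, in percent."""
    return (other - baseline) / baseline * 100.0


def combined_se(se_a, se_b):
    """Standard error of a difference of two independent means."""
    return math.sqrt(se_a ** 2 + se_b ** 2)


# oneDNN 1.5, IP Batch 1D, f32, CPU (ms, fewer is better), runs 1 and 2.
run1_ms, run1_se = 7.43267, 0.03162
run2_ms, run2_se = 8.42072, 0.12292

change = percent_change(run1_ms, run2_ms)
# Assumption: a difference several times larger than the combined SE is
# unlikely to be run-to-run noise alone.
ratio = abs(run2_ms - run1_ms) / combined_se(run1_se, run2_se)

print(f"run 2 vs run 1: {change:+.1f}% (difference / combined SE = {ratio:.1f})")
```

For a "fewer is better" test such as this one, a positive percentage means the second run was slower than the first.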
Phoronix Test Suite v10.8.5