3100 compare
AMD Ryzen 3 3100 4-Core testing with an ASUS ROG CROSSHAIR VIII HERO (2702 BIOS) and AMD Radeon RX 56/64 8GB on Ubuntu 20.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2011296-HA-3100COMPA40&rdt&grs .
3100 compare - System Details (1, 2, 3)
Processor: AMD Ryzen 3 3100 4-Core @ 3.60GHz (4 Cores / 8 Threads)
Motherboard: ASUS ROG CROSSHAIR VIII HERO (2702 BIOS)
Chipset: AMD Starship/Matisse
Memory: 16GB
Disk: 1000GB Sabrent Rocket 4.0 1TB
Graphics: AMD Radeon RX 56/64 8GB (1590/800MHz)
Audio: AMD Vega 10 HDMI Audio
Monitor: LG Ultra HD
Network: Realtek RTL8125 2.5GbE + Intel I211
OS: Ubuntu 20.10
Kernel: 5.8.0-29-generic (x86_64)
Desktop: GNOME Shell 3.38.1
Display Server: X Server 1.20.9
Display Driver: amdgpu 19.1.0
OpenGL: 4.6 Mesa 20.2.1 (LLVM 11.0.0)
Vulkan: 1.2.131
Compiler: GCC 10.2.0
File-System: ext4
Screen Resolution: 1920x1080
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8701021
Graphics Details: GLAMOR
Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Java Details (2, 3): OpenJDK Runtime Environment (build 11.0.9.1+1-Ubuntu-0ubuntu1.20.10)
Python Details (2, 3): Python 3.8.6
3100 compare lczero: BLAS onednn: IP Batch 1D - f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU lczero: Eigen onednn: Deconvolution Batch deconv_3d - f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU kripke: onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU crafty: Elapsed Time ncnn: Vulkan GPU - blazeface aom-av1: Speed 0 Two-Pass mlpack: scikit_qda redis: SET onednn: IP Batch All - f32 - CPU redis: LPUSH rav1e: 1 redis: SADD compress-lz4: 9 - Compression Speed aom-av1: Speed 8 Realtime rav1e: 6 avifenc: 0 sunflow: Global Illumination + Image Synthesis compress-lz4: 1 - Compression Speed ncnn: CPU-v3-v3 - mobilenet-v3 yquake2: OpenGL 1.x - 1920 x 1080 embree: Pathtracer - Asian Dragon avifenc: 2 darktable: Server Rack - CPU-only stockfish: Total Time ncnn: CPU - alexnet geekbench: GPU Vulkan compress-lz4: 3 - Compression Speed ncnn: Vulkan GPU - vgg16 gromacs: Water Benchmark ffte: N=256, 3D Complex FFT Routine aom-av1: Speed 6 Realtime ncnn: CPU - blazeface compress-zstd: 3 rav1e: 10 ncnn: CPU - resnet18 pyperformance: python_startup x265: Bosphorus 1080p compress-lz4: 3 - Decompression Speed compress-lz4: 9 - Decompression Speed yquake2: Software CPU - 1920 x 1080 brl-cad: VGR Performance Metric ncnn: CPU - efficientnet-b0 pyperformance: regex_compile hugin: Panorama Photo Assistant + Stitching Time rawtherapee: Total Benchmark Time darktable: Boat - CPU-only ncnn: CPU - yolov4-tiny ncnn: CPU - googlenet ai-benchmark: Device Inference Score rav1e: 5 ncnn: CPU - shufflenet-v2 avifenc: 10 asmfish: 1024 Hash Memory, 26 Depth embree: Pathtracer - Crown mlpack: scikit_linearridgeregression ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.3 - Default - Multeasymap ncnn: CPU - squeezenet ncnn: Vulkan GPU - googlenet pyperformance: float ncnn: CPU - vgg16 avifenc: 8 pyperformance: raytrace ncnn: CPU - mnasnet darktable: Masskrug - CPU-only build-linux-kernel: Time To Compile 
compress-lz4: 1 - Decompression Speed ncnn: CPU - resnet50 ncnn: Vulkan GPU - mnasnet darktable: Server Room - CPU-only mnn: resnet-v2-50 ncnn: Vulkan GPU - resnet50 ai-benchmark: Device AI Score tensorflow-lite: NASNet Mobile ncnn: CPU-v2-v2 - mobilenet-v2 mpv: Big Buck Bunny Sunflower 4K - Software Only pyperformance: pathlib embree: Pathtracer ISPC - Crown geekbench: CPU Multi Core mnn: MobileNetV2_224 kvazaar: Bosphorus 4K - Ultra Fast indigobench: CPU - Bedroom blender: BMW27 - CPU-Only compress-zstd: 19 ncnn: CPU - mobilenet ncnn: Vulkan GPU - resnet18 aom-av1: Speed 4 Two-Pass pyperformance: pickle_pure_python glmark2: 1920 x 1080 ncnn: Vulkan GPU - shufflenet-v2 dolfyn: Computational Fluid Dynamics pyperformance: django_template mlpack: scikit_svm geekbench: CPU Single Core pyperformance: go mpv: Big Buck Bunny Sunflower 1080p - Software Only pybench: Total For Average Test Times ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 x265: Bosphorus 4K pyperformance: json_loads ncnn: Vulkan GPU - mobilenet waifu2x-ncnn: 2x - 3 - Yes ncnn: Vulkan GPU - yolov4-tiny embree: Pathtracer ISPC - Asian Dragon kvazaar: Bosphorus 1080p - Ultra Fast kvazaar: Bosphorus 4K - Medium basis: UASTC Level 2 + RDO Post-Processing basis: UASTC Level 0 caffe: GoogleNet - CPU - 100 ncnn: Vulkan GPU - efficientnet-b0 numpy: kvazaar: Bosphorus 1080p - Medium pyperformance: 2to3 ocrmypdf: Processing 60 Page PDF Document mnn: inception-v3 mnn: SqueezeNetV1.0 aom-av1: Speed 6 Two-Pass namd: ATPase Simulation - 327,506 Atoms ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 astcenc: Fast vkfft: ai-benchmark: Device Training Score tesseract-ocr: Time To OCR 7 Images kvazaar: Bosphorus 4K - Very Fast caffe: AlexNet - CPU - 100 ncnn: Vulkan GPU - alexnet mnn: mobilenet-v1-1.0 ncnn: Vulkan GPU - squeezenet yquake2: OpenGL 3.x - 1920 x 1080 indigobench: CPU - Supercar tnn: CPU - MobileNet v2 hmmer: Pfam Database Search espeak: Text-To-Speech Synthesis basis: ETC1S astcenc: Thorough kvazaar: Bosphorus 1080p - Very Fast 
hint: FLOAT build-llvm: Time To Compile astcenc: Exhaustive basis: UASTC Level 3 tnn: CPU - SqueezeNet v1.1 phpbench: PHP Benchmark Suite basis: UASTC Level 2 rnnoise: openssl: RSA 4096-bit Performance tensorflow-lite: Inception V4 mlpack: scikit_ica tensorflow-lite: Mobilenet Quant tensorflow-lite: Inception ResNet V2 tensorflow-lite: Mobilenet Float tensorflow-lite: SqueezeNet pyperformance: crypto_pyaes pyperformance: nbody pyperformance: chaos astcenc: Medium mocassin: Dust 2D tau100.0 hpcc: Max Ping Pong Bandwidth hpcc: Rand Ring Bandwidth hpcc: Rand Ring Latency hpcc: G-Rand Access hpcc: EP-STREAM Triad hpcc: G-Ptrans hpcc: EP-DGEMM hpcc: G-Ffte hpcc: G-HPL redis: GET redis: LPOP onednn: Deconvolution Batch deconv_1d - f32 - CPU lammps: Rhodopsin Protein ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.3 - Default - RaiNyMore2 betsy: ETC2 RGB - Highest betsy: ETC1 - Highest 1 2 3 775 7.43267 4.73057 730 12.0299 503.700 6926419 20.6452 249.371 7738146 0.91 0.26 74.11 1725875.75 96.7410 1437805.00 0.372 1951987.09 52.50 37.09 1.404 146.262 1.948 9763.10 6.18 737.4 6.2294 86.460 0.230 9840561 18.95 36246 53.56 10.82 0.542 25405.988379614 16.93 1.95 3687.7 3.041 17.83 8.02 34.71 10496.6 10575.6 110.9 57419 9.21 169 61.999 79.955 12.620 32.96 19.42 755 1.050 4.79 6.100 13797562 5.3673 3.19 334.97 21.87 5.74 117 72.57 6.631 480 6.32 8.019 159.301 11116.3 37.32 2.78 6.318 34.931 6.10 1543 280959 6.66 454.91 17.7 5.1754 5227 4.937 14.55 0.930 323.46 21.1 23.82 2.16 2.18 440 6530 2.32 17.756 46.9 21.43 1223 249 1300.12 1020 2.57 7.77 27.2 8.28 7.158 11.26 6.1185 57.37 2.91 702.907 9.297 120634 9.33 321.42 12.72 320 32.164 42.573 8.853 3.43 4.49500 3.57 7.39 18432 788 24.567 8.03 49082 4.09 5.819 4.84 977.8 1.954 263.815 111.354 30.991 62.795 52.39 32.34 340004918.53832 1215.501 425.23 114.270 260.260 604079 58.589 20.022 1145.0 5572917 49.12 266426 5048647 259621 387017 116 117 111 8.30 288 14400.766 4.96131 0.39400 0.02004 6.74179 0.65366 55.50950 3.30686 120.90700 
2310604.62 2624535.48 8.05221 3.249 236.14 6.688 5.799 730 8.42072 4.03910 688 13.1798 491.110 6434795 19.1978 262.368 7573497 0.88 0.25 76.91 1754885.38 99.7379 1413366.45 0.365 1947660.53 51.20 36.18 1.370 149.263 1.965 10001.81 6.11 747.0 6.1430 88.209 0.229 10050125 18.57 36793 53.60 10.71 0.550 25808.658192384 16.66 1.92 3632.0 3.068 17.75 8.08 35.12 10635.3 10711.5 109.5 56730 9.1 170 61.972 79.418 12.477 32.84 19.30 761 1.047 4.82 6.037 13866946 5.3310 3.20 331.87 21.67 5.70 116 72.10 6.575 477 6.27 7.963 160.558 11146.3 37.05 2.76 6.274 34.995 6.13 1549 279167 6.64 457.51 17.7 5.1563 5248 4.936 14.63 0.935 324.80 21.2 23.71 2.17 2.17 439 6547 2.33 17.704 46.7 21.44 1228 248 1305.29 1017 2.58 7.78 27.3 8.25 7.184 11.29 6.1354 57.57 2.92 701.853 9.267 120679 9.34 320.60 12.75 321 32.233 42.444 8.848 3.42 4.48209 3.58 7.39 18479 788 24.505 8.03 49047 4.10 5.824 4.85 977.8 1.958 264.303 111.303 30.941 62.694 52.46 32.38 339696862.29595 1214.944 425.58 114.186 260.451 604512 58.628 20.015 1144.3 5574923 49.12 266319 5046933 259600 386977 116 117 111 8.3 288 2095708.19 2159676.18 9.74749 3.329 214.08 6.476 5.608 649 8.81065 4.06739 638 12.8921 462.801 19.4144 247.156 7394699 0.87 0.25 75.29 1695298.03 99.9683 1459398.53 0.361 2004338.25 51.45 37.07 1.393 145.653 1.996 9959.06 6.25 730.4 6.2800 86.293 0.234 9864045 18.81 36985 52.55 10.61 0.552 25835.646713364 16.91 1.92 3687.7 3.087 18.00 8.13 34.66 10588.0 10653.0 110.4 56960 9.16 168 62.706 80.349 12.509 33.20 19.51 763 1.058 4.84 6.064 13730556 5.3162 3.22 333.51 21.83 5.69 117 72.72 6.584 476 6.30 8.026 160.149 11202.6 37.31 2.76 6.292 35.167 6.09 1553 279158 6.68 456.91 17.8 5.1464 5256 4.910 14.60 0.931 325.17 21.2 23.71 2.17 2.18 438 6559 2.32 17.680 46.8 21.52 1228 248 1304.01 1021 2.58 7.75 27.3 8.27 7.183 11.30 6.1401 57.47 2.92 704.133 9.267 120291 9.36 321.61 12.76 321 32.264 42.545 8.874 3.43 4.48308 3.58 7.37 18478 790 24.509 8.01 48962 4.09 5.811 4.85 975.8 1.958 264.123 111.494 30.956 62.769 52.40 
32.38 339643088.46449 1216.085 425.39 114.261 260.441 604322 58.621 20.028 1144.7 5571717 49.14 266346 5048433 259575 387009 116 117 111 8.30 288 2135888.90 1559670.96 9.82517 3.340 203.50 6.470 5.602 OpenBenchmarking.org
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: Multeasymap - Total Frame Time (Milliseconds, Fewer Is Better)
  1: Min: 2.07 / Avg: 2.98 / Max: 7.54
  2: Min: 2.09 / Avg: 3.02 / Max: 7.16
  3: Min: 2.02 / Avg: 3 / Max: 10.7
  Note: (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
LeelaChessZero 0.26 - Backend: BLAS (Nodes Per Second, More Is Better)
  1: 775 (SE +/- 7.81, N = 3)
  2: 730 (SE +/- 10.34, N = 4)
  3: 649 (SE +/- 7.68, N = 9)
  Note: (CXX) g++ options: -flto -pthread
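To put the gap between the three configurations in relative terms, the sketch below computes each result's percent change against configuration 1, using the BLAS figures above. The helper name percent_change and the choice of configuration 1 as the baseline are assumptions made for illustration; neither is part of the Phoronix Test Suite output.

```python
def percent_change(baseline: float, value: float) -> float:
    """Percent change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


# LeelaChessZero 0.26, Backend: BLAS (Nodes Per Second, More Is Better).
# Values are copied from the result above; keys are the configuration labels.
blas_nps = {"1": 775, "2": 730, "3": 649}

# Configuration 1 is used as the baseline (an arbitrary, illustrative choice).
changes = {run: percent_change(blas_nps["1"], nps) for run, nps in blas_nps.items()}

for run, pct in changes.items():
    print(f"configuration {run}: {pct:+.1f}% relative to configuration 1")
```

Because this metric is More Is Better, a negative percentage means a slower result; with these inputs configuration 3 lands roughly 16% below configuration 1.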
oneDNN 1.5 - Harness: IP Batch 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 7.43267 (SE +/- 0.03162, N = 3; MIN: 7.23)
  2: 8.42072 (SE +/- 0.12292, N = 3; MIN: 8.01)
  3: 8.81065 (SE +/- 0.12077, N = 15; MIN: 8.21)
  Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
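One rough way to judge whether a gap such as the one in this oneDNN result is larger than run-to-run noise is to divide the difference of two means by their combined standard error. The sketch below does this for configurations 1 and 3. It assumes the reported SE values are standard errors of the mean and that the two configurations were measured independently; se_ratio is a hypothetical helper, not a Phoronix Test Suite function.

```python
import math


def se_ratio(mean_a: float, se_a: float, mean_b: float, se_b: float) -> float:
    """Absolute difference of two means divided by their combined standard error.

    Assumes both inputs are standard errors of the mean from independent runs.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined_se


# oneDNN 1.5, IP Batch 1D, f32, CPU (ms, Fewer Is Better): (mean, SE) pairs
config_1 = (7.43267, 0.03162)
config_3 = (8.81065, 0.12077)

ratio = se_ratio(*config_1, *config_3)
print(f"difference is {ratio:.1f} combined standard errors")
```

A ratio well above 2 or 3 suggests the difference is unlikely to be measurement noise alone, though this is only a heuristic and not a formal test.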
oneDNN 1.5 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 4.73057 (SE +/- 0.00490, N = 3; MIN: 4.66)
  2: 4.03910 (SE +/- 0.01879, N = 3; MIN: 3.96)
  3: 4.06739 (SE +/- 0.00436, N = 3; MIN: 3.99)
  Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
LeelaChessZero 0.26 - Backend: Eigen (Nodes Per Second, More Is Better)
  1: 730 (SE +/- 4.41, N = 3)
  2: 688 (SE +/- 9.06, N = 5)
  3: 638 (SE +/- 7.37, N = 9)
  Note: (CXX) g++ options: -flto -pthread
oneDNN 1.5 - Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 12.03 (SE +/- 0.02, N = 3; MIN: 11.78)
  2: 13.18 (SE +/- 0.17, N = 3; MIN: 12.66)
  3: 12.89 (SE +/- 0.02, N = 3; MIN: 12.66)
  Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 503.70 (SE +/- 2.86, N = 3; MIN: 495.04)
  2: 491.11 (SE +/- 2.10, N = 3; MIN: 474.1)
  3: 462.80 (SE +/- 3.41, N = 3; MIN: 453.6)
  Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Kripke 1.2.4 (Throughput FoM, More Is Better)
  1: 6926419
  2: 6434795
  SE +/- 717516.50, N = 2 (single value in the export, not attributed to a specific configuration)
  Note: (CXX) g++ options: -O3 -fopenmp
oneDNN 1.5 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 20.65 (SE +/- 0.01, N = 3; MIN: 20.28)
  2: 19.20 (SE +/- 0.01, N = 3; MIN: 19.01)
  3: 19.41 (SE +/- 0.02, N = 3; MIN: 19.18)
  Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
oneDNN 1.5 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 249.37 (SE +/- 0.60, N = 3; MIN: 247.16)
  2: 262.37 (SE +/- 3.42, N = 3; MIN: 254.47)
  3: 247.16 (SE +/- 1.11, N = 3; MIN: 243.72)
  Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Crafty 25.2 - Elapsed Time (Nodes Per Second, More Is Better)
  1: 7738146 (SE +/- 11851.34, N = 3)
  2: 7573497 (SE +/- 103135.62, N = 3)
  3: 7394699 (SE +/- 94283.61, N = 3)
  Note: (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm
NCNN 20200916 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
  1: 0.91 (SE +/- 0.03, N = 3; MIN: 0.85 / MAX: 1.18)
  2: 0.88 (SE +/- 0.01, N = 3; MIN: 0.86 / MAX: 1.45)
  3: 0.87 (SE +/- 0.00, N = 3; MIN: 0.85 / MAX: 1.15)
  Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
AOM AV1 2.0 - Encoder Mode: Speed 0 Two-Pass (Frames Per Second, More Is Better)
  1: 0.26 (SE +/- 0.00, N = 3)
  2: 0.25 (SE +/- 0.00, N = 3)
  3: 0.25 (SE +/- 0.00, N = 3)
  Note: (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Mlpack Benchmark - Benchmark: scikit_qda (Seconds, Fewer Is Better)
  1: 74.11 (SE +/- 0.39, N = 3)
  2: 76.91 (SE +/- 1.17, N = 12)
  3: 75.29 (SE +/- 1.16, N = 3)
Redis 6.0.9 - Test: SET (Requests Per Second, More Is Better)
  1: 1725875.75 (SE +/- 18920.71, N = 3)
  2: 1754885.38 (SE +/- 29276.53, N = 3)
  3: 1695298.03 (SE +/- 19775.05, N = 15)
  Note: (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
oneDNN 1.5 - Harness: IP Batch All - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 96.74 (SE +/- 0.07, N = 3; MIN: 95.29)
  2: 99.74 (SE +/- 0.13, N = 3; MIN: 97.89)
  3: 99.97 (SE +/- 0.89, N = 15; MIN: 95.99)
  Note: (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
Redis 6.0.9 - Test: LPUSH (Requests Per Second, More Is Better)
  1: 1437805.00 (SE +/- 14444.21, N = 3)
  2: 1413366.45 (SE +/- 12228.65, N = 3)
  3: 1459398.53 (SE +/- 13914.34, N = 15)
  Note: (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
rav1e 0.4 Alpha - Speed: 1 (Frames Per Second, More Is Better)
  1: 0.372 (SE +/- 0.004, N = 3)
  2: 0.365 (SE +/- 0.005, N = 3)
  3: 0.361 (SE +/- 0.006, N = 3)
Redis 6.0.9 - Test: SADD (Requests Per Second, More Is Better)
  1: 1951987.09 (SE +/- 22720.13, N = 15)
  2: 1947660.53 (SE +/- 25708.45, N = 15)
  3: 2004338.25 (SE +/- 18274.51, N = 3)
  Note: (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
LZ4 Compression 1.9.3 - Compression Level: 9 - Compression Speed (MB/s, More Is Better)
  1: 52.50 (SE +/- 0.72, N = 3)
  2: 51.20 (SE +/- 0.27, N = 3)
  3: 51.45 (SE +/- 0.03, N = 3)
  Note: (CC) gcc options: -O3
AOM AV1 2.0 - Encoder Mode: Speed 8 Realtime (Frames Per Second, More Is Better)
  1: 37.09 (SE +/- 0.08, N = 3)
  2: 36.18 (SE +/- 0.53, N = 3)
  3: 37.07 (SE +/- 0.08, N = 3)
  Note: (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
rav1e 0.4 Alpha - Speed: 6 (Frames Per Second, More Is Better)
  1: 1.404 (SE +/- 0.007, N = 3)
  2: 1.370 (SE +/- 0.006, N = 3)
  3: 1.393 (SE +/- 0.011, N = 3)
libavif avifenc 0.7.3 - Encoder Speed: 0 (Seconds, Fewer Is Better)
  1: 146.26 (SE +/- 0.63, N = 3)
  2: 149.26 (SE +/- 0.61, N = 3)
  3: 145.65 (SE +/- 0.42, N = 3)
  Note: (CXX) g++ options: -O3 -fPIC
Sunflow Rendering System 0.07.2 - Global Illumination + Image Synthesis (Seconds, Fewer Is Better)
  1: 1.948 (SE +/- 0.029, N = 4; MIN: 1.73 / MAX: 2.71)
  2: 1.965 (SE +/- 0.024, N = 3; MIN: 1.78 / MAX: 2.54)
  3: 1.996 (SE +/- 0.023, N = 3; MIN: 1.81 / MAX: 2.69)
LZ4 Compression 1.9.3 - Compression Level: 1 - Compression Speed (MB/s, More Is Better)
  1: 9763.10 (SE +/- 45.71, N = 3)
  2: 10001.81 (SE +/- 99.34, N = 3)
  3: 9959.06 (SE +/- 105.59, N = 3)
  Note: (CC) gcc options: -O3
NCNN 20200916 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  1: 6.18 (SE +/- 0.01, N = 3; MIN: 6.11 / MAX: 7.82)
  2: 6.11 (SE +/- 0.01, N = 3; MIN: 6.03 / MAX: 7.3)
  3: 6.25 (SE +/- 0.04, N = 3; MIN: 6.11 / MAX: 37.7)
  Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
yquake2 7.45 - Renderer: OpenGL 1.x - Resolution: 1920 x 1080 (Frames Per Second, More Is Better)
  1: 737.4 (SE +/- 8.88, N = 3)
  2: 747.0 (SE +/- 9.93, N = 3)
  3: 730.4 (SE +/- 1.13, N = 3)
  Note: (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
Embree 3.9.0 - Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better)
  1: 6.2294 (SE +/- 0.0315, N = 3; MIN: 6.16 / MAX: 6.36)
  2: 6.1430 (SE +/- 0.0103, N = 3; MIN: 6.1 / MAX: 6.24)
  3: 6.2800 (SE +/- 0.0267, N = 3; MIN: 6.21 / MAX: 6.4)
libavif avifenc 0.7.3 - Encoder Speed: 2 (Seconds, Fewer Is Better)
  1: 86.46 (SE +/- 0.23, N = 3)
  2: 88.21 (SE +/- 0.07, N = 3)
  3: 86.29 (SE +/- 0.06, N = 3)
  Note: (CXX) g++ options: -O3 -fPIC
Darktable 3.2.1 - Test: Server Rack - Acceleration: CPU-only (Seconds, Fewer Is Better)
  1: 0.230 (SE +/- 0.000, N = 3)
  2: 0.229 (SE +/- 0.000, N = 3)
  3: 0.234 (SE +/- 0.004, N = 3)
Stockfish 12 - Total Time (Nodes Per Second, More Is Better)
  1: 9840561 (SE +/- 111801.84, N = 6)
  2: 10050125 (SE +/- 45923.84, N = 3)
  3: 9864045 (SE +/- 124228.11, N = 3)
  Note: (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver
NCNN 20200916 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
  1: 18.95 (SE +/- 0.03, N = 3; MIN: 18.85 / MAX: 19.3)
  2: 18.57 (SE +/- 0.02, N = 3; MIN: 18.49 / MAX: 19.27)
  3: 18.81 (SE +/- 0.01, N = 3; MIN: 18.72 / MAX: 19.3)
  Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Geekbench 5 - Test: GPU Vulkan (Score, More Is Better)
  1: 36246 (SE +/- 459.84, N = 3)
  2: 36793 (SE +/- 115.14, N = 3)
  3: 36985 (SE +/- 56.79, N = 3)
LZ4 Compression 1.9.3 - Compression Level: 3 - Compression Speed (MB/s, More Is Better)
  1: 53.56 (SE +/- 0.84, N = 3)
  2: 53.60 (SE +/- 0.85, N = 3)
  3: 52.55 (SE +/- 0.28, N = 3)
  Note: (CC) gcc options: -O3
NCNN 20200916 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
  1: 10.82 (SE +/- 0.16, N = 3; MIN: 10.18 / MAX: 24.02)
  2: 10.71 (SE +/- 0.05, N = 3; MIN: 10.19 / MAX: 20.01)
  3: 10.61 (SE +/- 0.07, N = 3; MIN: 10.2 / MAX: 24)
  Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
GROMACS 2020.3 - Water Benchmark (Ns Per Day, More Is Better)
  1: 0.542 (SE +/- 0.000, N = 3)
  2: 0.550 (SE +/- 0.002, N = 3)
  3: 0.552 (SE +/- 0.000, N = 3)
  Note: (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm
FFTE 7.0 - N=256, 3D Complex FFT Routine (MFLOPS, More Is Better)
  1: 25405.99 (SE +/- 31.78, N = 3)
  2: 25808.66 (SE +/- 19.99, N = 3)
  3: 25835.65 (SE +/- 17.94, N = 3)
  Note: (F9X) gfortran options: -O3 -fomit-frame-pointer -fopenmp
AOM AV1 2.0 - Encoder Mode: Speed 6 Realtime (Frames Per Second, More Is Better)
  1: 16.93 (SE +/- 0.02, N = 3)
  2: 16.66 (SE +/- 0.07, N = 3)
  3: 16.91 (SE +/- 0.03, N = 3)
  Note: (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
NCNN 20200916 - Target: CPU - Model: blazeface (ms, Fewer Is Better)
  1: 1.95 (SE +/- 0.03, N = 3; MIN: 1.9 / MAX: 2.15)
  2: 1.92 (SE +/- 0.01, N = 3; MIN: 1.88 / MAX: 2.17)
  3: 1.92 (SE +/- 0.00, N = 3; MIN: 1.89 / MAX: 2.52)
  Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Zstd Compression 1.4.5 - Compression Level: 3 (MB/s, More Is Better)
  1: 3687.7 (SE +/- 19.56, N = 3)
  2: 3632.0 (SE +/- 37.93, N = 3)
  3: 3687.7 (SE +/- 19.08, N = 3)
  Note: (CC) gcc options: -O3 -pthread -lz -llzma
rav1e 0.4 Alpha - Speed: 10 (Frames Per Second, More Is Better)
  1: 3.041 (SE +/- 0.003, N = 3)
  2: 3.068 (SE +/- 0.030, N = 3)
  3: 3.087 (SE +/- 0.026, N = 3)
NCNN 20200916 - Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  1: 17.83 (SE +/- 0.05, N = 3; MIN: 17.65 / MAX: 21.58)
  2: 17.75 (SE +/- 0.05, N = 3; MIN: 17.57 / MAX: 18.32)
  3: 18.00 (SE +/- 0.01, N = 3; MIN: 17.88 / MAX: 18.52)
  Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
PyPerformance 1.0.0 - Benchmark: python_startup (Milliseconds, Fewer Is Better)
  1: 8.02 (SE +/- 0.01, N = 3)
  2: 8.08 (SE +/- 0.01, N = 3)
  3: 8.13 (SE +/- 0.03, N = 3)
x265 3.4 - Video Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  1: 34.71 (SE +/- 0.05, N = 3)
  2: 35.12 (SE +/- 0.20, N = 3)
  3: 34.66 (SE +/- 0.22, N = 3)
  Note: (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
LZ4 Compression 1.9.3 - Compression Level: 3 - Decompression Speed (MB/s, More Is Better)
  1: 10496.6 (SE +/- 12.91, N = 3)
  2: 10635.3 (SE +/- 18.42, N = 3)
  3: 10588.0 (SE +/- 22.07, N = 3)
  Note: (CC) gcc options: -O3
LZ4 Compression 1.9.3 - Compression Level: 9 - Decompression Speed (MB/s, More Is Better)
  1: 10575.6 (SE +/- 23.62, N = 3)
  2: 10711.5 (SE +/- 8.30, N = 3)
  3: 10653.0 (SE +/- 26.11, N = 3)
  Note: (CC) gcc options: -O3
yquake2 7.45 - Renderer: Software CPU - Resolution: 1920 x 1080 (Frames Per Second, More Is Better)
  1: 110.9 (SE +/- 0.48, N = 3)
  2: 109.5 (SE +/- 0.30, N = 3)
  3: 110.4 (SE +/- 0.31, N = 3)
  Note: (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
BRL-CAD 7.30.8 - VGR Performance Metric (VGR Performance Metric, More Is Better)
  1: 57419
  2: 56730
  3: 56960
  Note: (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
NCNN 20200916 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  1: 9.21 (SE +/- 0.05, N = 3; MIN: 9.07 / MAX: 14.29)
  2: 9.10 (SE +/- 0.00, N = 3; MIN: 9.02 / MAX: 9.48)
  3: 9.16 (SE +/- 0.01, N = 3; MIN: 9.06 / MAX: 9.6)
  Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
PyPerformance 1.0.0 - Benchmark: regex_compile (Milliseconds, Fewer Is Better)
  1: 169
  2: 170
  3: 168
Hugin - Panorama Photo Assistant + Stitching Time (Seconds, Fewer Is Better)
  1: 62.00 (SE +/- 0.29, N = 3)
  2: 61.97 (SE +/- 0.43, N = 3)
  3: 62.71 (SE +/- 0.37, N = 3)
RawTherapee - Total Benchmark Time (Seconds, Fewer Is Better)
  1: 79.96 (SE +/- 0.02, N = 3)
  2: 79.42 (SE +/- 0.02, N = 3)
  3: 80.35 (SE +/- 0.06, N = 3)
  Note: RawTherapee, version 5.8, command line.
Darktable 3.2.1 - Test: Boat - Acceleration: CPU-only (Seconds, Fewer Is Better)
  1: 12.62 (SE +/- 0.01, N = 3)
  2: 12.48 (SE +/- 0.01, N = 3)
  3: 12.51 (SE +/- 0.00, N = 3)
NCNN 20200916 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  1: 32.96 (SE +/- 0.04, N = 3; MIN: 32.78 / MAX: 34.14)
  2: 32.84 (SE +/- 0.09, N = 3; MIN: 32.57 / MAX: 55.87)
  3: 33.20 (SE +/- 0.07, N = 3; MIN: 32.95 / MAX: 34.47)
  Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: CPU - Model: googlenet (ms, Fewer Is Better)
  1: 19.42 (SE +/- 0.01, N = 3; MIN: 19.3 / MAX: 24.48)
  2: 19.30 (SE +/- 0.02, N = 3; MIN: 19.11 / MAX: 38.78)
  3: 19.51 (SE +/- 0.02, N = 3; MIN: 19.37 / MAX: 20.62)
  Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
AI Benchmark Alpha 0.1.2 - Device Inference Score (Score, More Is Better)
  1: 755
  2: 761
  3: 763
rav1e 0.4 Alpha - Speed: 5 (Frames Per Second, More Is Better)
  1: 1.050 (SE +/- 0.007, N = 3)
  2: 1.047 (SE +/- 0.004, N = 3)
  3: 1.058 (SE +/- 0.000, N = 3)
NCNN 20200916 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  1: 4.79 (SE +/- 0.04, N = 3; MIN: 4.65 / MAX: 5.62)
  2: 4.82 (SE +/- 0.00, N = 3; MIN: 4.76 / MAX: 6)
  3: 4.84 (SE +/- 0.02, N = 3; MIN: 4.76 / MAX: 5.65)
  Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
libavif avifenc 0.7.3 - Encoder Speed: 10 (Seconds, Fewer Is Better)
  1: 6.100 (SE +/- 0.015, N = 3)
  2: 6.037 (SE +/- 0.019, N = 3)
  3: 6.064 (SE +/- 0.017, N = 3)
  Note: (CXX) g++ options: -O3 -fPIC
asmFish 2018-07-23 - 1024 Hash Memory, 26 Depth (Nodes/second, More Is Better)
  1: 13797562 (SE +/- 23080.42, N = 3)
  2: 13866946 (SE +/- 160984.48, N = 3)
  3: 13730556 (SE +/- 105834.72, N = 3)
Embree 3.9.0 - Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better)
  1: 5.3673 (SE +/- 0.0401, N = 3; MIN: 4.89 / MAX: 5.49)
  2: 5.3310 (SE +/- 0.0092, N = 3; MIN: 5.3 / MAX: 5.4)
  3: 5.3162 (SE +/- 0.0526, N = 3; MIN: 4.89 / MAX: 5.45)
Mlpack Benchmark - Benchmark: scikit_linearridgeregression (Seconds, Fewer Is Better)
  1: 3.19 (SE +/- 0.01, N = 3)
  2: 3.20 (SE +/- 0.00, N = 3)
  3: 3.22 (SE +/- 0.00, N = 3)
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: Multeasymap (Frames Per Second, More Is Better)
  1: 334.97 (SE +/- 1.03, N = 3; MIN: 110.11 / MAX: 493.34)
  2: 331.87 (SE +/- 1.65, N = 3; MIN: 104.44 / MAX: 494.32)
  3: 333.51 (SE +/- 0.66, N = 3; MIN: 109.02 / MAX: 495.29)
  Note: (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
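The frame rates here can be cross-checked against the Total Frame Time averages reported for the same Multeasymap demo near the top of these results, since an average frame time in milliseconds corresponds to roughly 1000 / ms frames per second. The sketch below is only a sanity check under that assumption; the function and variable names are illustrative, and the two metrics are not guaranteed to be derived from one another by the test profile.

```python
def fps_from_frame_time_ms(frame_time_ms: float) -> float:
    """Convert an average frame time in milliseconds to frames per second."""
    return 1000.0 / frame_time_ms


# DDraceNetwork 15.2.3, Multeasymap demo, values copied from the two results.
avg_frame_time_ms = {"1": 2.98, "2": 3.02, "3": 3.0}
reported_fps = {"1": 334.97, "2": 331.87, "3": 333.51}

derived_fps = {run: fps_from_frame_time_ms(ms) for run, ms in avg_frame_time_ms.items()}

for run in avg_frame_time_ms:
    delta = derived_fps[run] - reported_fps[run]
    print(f"configuration {run}: derived {derived_fps[run]:.2f} FPS, "
          f"reported {reported_fps[run]:.2f} FPS, delta {delta:+.2f}")
```

For all three configurations the derived figure stays within 1 FPS of the reported one, which is about what the two-decimal rounding of the frame time averages would allow.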
NCNN 20200916 - Target: CPU - Model: squeezenet (ms, Fewer Is Better)
  1: 21.87 (SE +/- 0.09, N = 3; MIN: 21.59 / MAX: 23.26)
  2: 21.67 (SE +/- 0.06, N = 3; MIN: 21.39 / MAX: 37.01)
  3: 21.83 (SE +/- 0.12, N = 3; MIN: 21.56 / MAX: 22.96)
  Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20200916 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
  1: 5.74 (SE +/- 0.05, N = 3; MIN: 5.64 / MAX: 15.26)
  2: 5.70 (SE +/- 0.01, N = 3; MIN: 5.65 / MAX: 8.52)
  3: 5.69 (SE +/- 0.02, N = 3; MIN: 5.63 / MAX: 10.18)
  Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
PyPerformance 1.0.0 - Benchmark: float (Milliseconds, Fewer Is Better)
  1: 117
  2: 116
  3: 117
  SE +/- 0.33, N = 3 (single value in the export, not attributed to a specific configuration)
NCNN 20200916 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  1: 72.57 (SE +/- 0.13, N = 3; MIN: 72.08 / MAX: 107.37)
  2: 72.10 (SE +/- 0.22, N = 3; MIN: 71.61 / MAX: 88.56)
  3: 72.72 (SE +/- 0.02, N = 3; MIN: 72.42 / MAX: 80.73)
  Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
libavif avifenc 0.7.3 - Encoder Speed: 8 (Seconds, Fewer Is Better)
  1: 6.631 (SE +/- 0.004, N = 3)
  2: 6.575 (SE +/- 0.005, N = 3)
  3: 6.584 (SE +/- 0.001, N = 3)
  Note: (CXX) g++ options: -O3 -fPIC
PyPerformance Benchmark: raytrace OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: raytrace 1 2 3 100 200 300 400 500 SE +/- 0.33, N = 3 480 477 476
NCNN Target: CPU - Model: mnasnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: CPU - Model: mnasnet 1 2 3 2 4 6 8 10 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 6.32 6.27 6.30 MIN: 6.26 / MAX: 7.07 MIN: 6.2 / MAX: 7.42 MIN: 6.22 / MAX: 7.04 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Darktable Test: Masskrug - Acceleration: CPU-only OpenBenchmarking.org Seconds, Fewer Is Better Darktable 3.2.1 Test: Masskrug - Acceleration: CPU-only 1 2 3 2 4 6 8 10 SE +/- 0.016, N = 3 SE +/- 0.006, N = 3 SE +/- 0.018, N = 3 8.019 7.963 8.026
Timed Linux Kernel Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Linux Kernel Compilation 5.4 Time To Compile 1 2 3 40 80 120 160 200 SE +/- 0.52, N = 3 SE +/- 1.04, N = 3 SE +/- 0.59, N = 3 159.30 160.56 160.15
LZ4 Compression Compression Level: 1 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.9.3 Compression Level: 1 - Decompression Speed 1 2 3 2K 4K 6K 8K 10K SE +/- 46.13, N = 3 SE +/- 12.36, N = 3 SE +/- 11.80, N = 3 11116.3 11146.3 11202.6 1. (CC) gcc options: -O3
NCNN Target: CPU - Model: resnet50 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: CPU - Model: resnet50 1 2 3 9 18 27 36 45 SE +/- 0.03, N = 3 SE +/- 0.07, N = 3 SE +/- 0.07, N = 3 37.32 37.05 37.31 MIN: 37.15 / MAX: 39.69 MIN: 36.81 / MAX: 38.07 MIN: 37.14 / MAX: 39.49 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: mnasnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: mnasnet 1 2 3 0.6255 1.251 1.8765 2.502 3.1275 SE +/- 0.02, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 2.78 2.76 2.76 MIN: 2.71 / MAX: 15.58 MIN: 2.72 / MAX: 3.3 MIN: 2.71 / MAX: 3.28 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Darktable Test: Server Room - Acceleration: CPU-only OpenBenchmarking.org Seconds, Fewer Is Better Darktable 3.2.1 Test: Server Room - Acceleration: CPU-only 1 2 3 2 4 6 8 10 SE +/- 0.003, N = 3 SE +/- 0.010, N = 3 SE +/- 0.002, N = 3 6.318 6.274 6.292
Mobile Neural Network Model: resnet-v2-50 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2020-09-17 Model: resnet-v2-50 1 2 3 8 16 24 32 40 SE +/- 0.05, N = 3 SE +/- 0.04, N = 3 SE +/- 0.25, N = 3 34.93 35.00 35.17 MIN: 34.38 / MAX: 58.39 MIN: 34.7 / MAX: 43.77 MIN: 34.65 / MAX: 80.67 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN Target: Vulkan GPU - Model: resnet50 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: resnet50 1 2 3 2 4 6 8 10 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 SE +/- 0.00, N = 3 6.10 6.13 6.09 MIN: 6.06 / MAX: 10.11 MIN: 6.04 / MAX: 11.31 MIN: 6.06 / MAX: 6.39 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
AI Benchmark Alpha Device AI Score OpenBenchmarking.org Score, More Is Better AI Benchmark Alpha 0.1.2 Device AI Score 1 2 3 300 600 900 1200 1500 1543 1549 1553
TensorFlow Lite Model: NASNet Mobile OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2020-08-23 Model: NASNet Mobile 1 2 3 60K 120K 180K 240K 300K SE +/- 1226.74, N = 3 SE +/- 79.18, N = 3 SE +/- 151.01, N = 3 280959 279167 279158
NCNN Target: CPU-v2-v2 - Model: mobilenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: CPU-v2-v2 - Model: mobilenet-v2 1 2 3 2 4 6 8 10 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 6.66 6.64 6.68 MIN: 6.56 / MAX: 7.93 MIN: 6.54 / MAX: 7.9 MIN: 6.59 / MAX: 8.01 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
MPV Video Input: Big Buck Bunny Sunflower 4K - Decode: Software Only OpenBenchmarking.org FPS, More Is Better MPV Video Input: Big Buck Bunny Sunflower 4K - Decode: Software Only 1 2 3 100 200 300 400 500 SE +/- 0.40, N = 3 SE +/- 0.40, N = 3 SE +/- 0.38, N = 3 454.91 457.51 456.91 MIN: 299.99 / MAX: 631.56 MIN: 299.99 / MAX: 666.65 MIN: 292.67 / MAX: 631.56 1. mpv 0.32.0
PyPerformance Benchmark: pathlib OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: pathlib 1 2 3 4 8 12 16 20 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 17.7 17.7 17.8
Embree Binary: Pathtracer ISPC - Model: Crown OpenBenchmarking.org Frames Per Second, More Is Better Embree 3.9.0 Binary: Pathtracer ISPC - Model: Crown 1 2 3 1.1645 2.329 3.4935 4.658 5.8225 SE +/- 0.0016, N = 3 SE +/- 0.0124, N = 3 SE +/- 0.0320, N = 3 5.1754 5.1563 5.1464 MIN: 5.15 / MAX: 5.25 MIN: 5.11 / MAX: 5.25 MIN: 5.05 / MAX: 5.25
Geekbench Test: CPU Multi Core OpenBenchmarking.org Score, More Is Better Geekbench 5 Test: CPU Multi Core 1 2 3 1100 2200 3300 4400 5500 SE +/- 4.98, N = 3 SE +/- 7.09, N = 3 SE +/- 6.17, N = 3 5227 5248 5256
Mobile Neural Network Model: MobileNetV2_224 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2020-09-17 Model: MobileNetV2_224 1 2 3 1.1108 2.2216 3.3324 4.4432 5.554 SE +/- 0.018, N = 3 SE +/- 0.020, N = 3 SE +/- 0.030, N = 3 4.937 4.936 4.910 MIN: 4.86 / MAX: 5.44 MIN: 4.86 / MAX: 5.66 MIN: 4.82 / MAX: 6.27 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Kvazaar Video Input: Bosphorus 4K - Video Preset: Ultra Fast OpenBenchmarking.org Frames Per Second, More Is Better Kvazaar 2.0 Video Input: Bosphorus 4K - Video Preset: Ultra Fast 1 2 3 4 8 12 16 20 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 14.55 14.63 14.60 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
IndigoBench Acceleration: CPU - Scene: Bedroom OpenBenchmarking.org M samples/s, More Is Better IndigoBench 4.4 Acceleration: CPU - Scene: Bedroom 1 2 3 0.2104 0.4208 0.6312 0.8416 1.052 SE +/- 0.001, N = 3 SE +/- 0.001, N = 3 SE +/- 0.001, N = 3 0.930 0.935 0.931
Blender Blend File: BMW27 - Compute: CPU-Only OpenBenchmarking.org Seconds, Fewer Is Better Blender 2.90 Blend File: BMW27 - Compute: CPU-Only 1 2 3 70 140 210 280 350 SE +/- 0.35, N = 3 SE +/- 0.44, N = 3 SE +/- 0.71, N = 3 323.46 324.80 325.17
Zstd Compression Compression Level: 19 OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.4.5 Compression Level: 19 1 2 3 5 10 15 20 25 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 21.1 21.2 21.2 1. (CC) gcc options: -O3 -pthread -lz -llzma
NCNN Target: CPU - Model: mobilenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: CPU - Model: mobilenet 1 2 3 6 12 18 24 30 SE +/- 0.18, N = 3 SE +/- 0.15, N = 3 SE +/- 0.11, N = 3 23.82 23.71 23.71 MIN: 23.36 / MAX: 24.99 MIN: 23.3 / MAX: 24.65 MIN: 23.46 / MAX: 24.96 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: Vulkan GPU - Model: resnet18 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: resnet18 1 2 3 0.4883 0.9766 1.4649 1.9532 2.4415 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 2.16 2.17 2.17 MIN: 2.09 / MAX: 2.76 MIN: 2.1 / MAX: 2.77 MIN: 2.1 / MAX: 2.76 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
AOM AV1 Encoder Mode: Speed 4 Two-Pass OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.0 Encoder Mode: Speed 4 Two-Pass 1 2 3 0.4905 0.981 1.4715 1.962 2.4525 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 2.18 2.17 2.18 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
PyPerformance Benchmark: pickle_pure_python OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: pickle_pure_python 1 2 3 100 200 300 400 500 440 439 438
GLmark2 Resolution: 1920 x 1080 OpenBenchmarking.org Score, More Is Better GLmark2 2020.04 Resolution: 1920 x 1080 1 2 3 1400 2800 4200 5600 7000 6530 6547 6559
NCNN Target: Vulkan GPU - Model: shufflenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: shufflenet-v2 1 2 3 0.5243 1.0486 1.5729 2.0972 2.6215 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 2.32 2.33 2.32 MIN: 2.28 / MAX: 3.46 MIN: 2.3 / MAX: 2.98 MIN: 2.29 / MAX: 2.74 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Dolfyn Computational Fluid Dynamics OpenBenchmarking.org Seconds, Fewer Is Better Dolfyn 0.527 Computational Fluid Dynamics 1 2 3 4 8 12 16 20 SE +/- 0.01, N = 3 SE +/- 0.05, N = 3 SE +/- 0.02, N = 3 17.76 17.70 17.68
PyPerformance Benchmark: django_template OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: django_template 1 2 3 11 22 33 44 55 SE +/- 0.10, N = 3 SE +/- 0.03, N = 3 SE +/- 0.06, N = 3 46.9 46.7 46.8
Mlpack Benchmark Benchmark: scikit_svm OpenBenchmarking.org Seconds, Fewer Is Better Mlpack Benchmark Benchmark: scikit_svm 1 2 3 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 21.43 21.44 21.52
Geekbench Test: CPU Single Core OpenBenchmarking.org Score, More Is Better Geekbench 5 Test: CPU Single Core 1 2 3 300 600 900 1200 1500 SE +/- 0.58, N = 3 SE +/- 1.33, N = 3 SE +/- 0.88, N = 3 1223 1228 1228
PyPerformance Benchmark: go OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: go 1 2 3 50 100 150 200 250 SE +/- 0.33, N = 3 249 248 248
MPV Video Input: Big Buck Bunny Sunflower 1080p - Decode: Software Only OpenBenchmarking.org FPS, More Is Better MPV Video Input: Big Buck Bunny Sunflower 1080p - Decode: Software Only 1 2 3 300 600 900 1200 1500 SE +/- 2.29, N = 3 SE +/- 2.23, N = 3 SE +/- 7.87, N = 3 1300.12 1305.29 1304.01 MIN: 749.97 / MAX: 2399.95 MIN: 799.97 / MAX: 2399.92 MIN: 799.97 / MAX: 2399.92 1. mpv 0.32.0
PyBench Total For Average Test Times OpenBenchmarking.org Milliseconds, Fewer Is Better PyBench 2018-02-16 Total For Average Test Times 1 2 3 200 400 600 800 1000 SE +/- 2.31, N = 3 SE +/- 1.76, N = 3 1020 1017 1021
NCNN Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 1 2 3 0.5805 1.161 1.7415 2.322 2.9025 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 2.57 2.58 2.58 MIN: 2.53 / MAX: 3.17 MIN: 2.53 / MAX: 5.2 MIN: 2.53 / MAX: 3.72 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
x265 Video Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better x265 3.4 Video Input: Bosphorus 4K 1 2 3 2 4 6 8 10 SE +/- 0.05, N = 3 SE +/- 0.08, N = 3 SE +/- 0.05, N = 3 7.77 7.78 7.75 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
PyPerformance Benchmark: json_loads OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: json_loads 1 2 3 6 12 18 24 30 SE +/- 0.03, N = 3 SE +/- 0.00, N = 3 SE +/- 0.03, N = 3 27.2 27.3 27.3
NCNN Target: Vulkan GPU - Model: mobilenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: mobilenet 1 2 3 2 4 6 8 10 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 8.28 8.25 8.27 MIN: 7.59 / MAX: 11.65 MIN: 7.57 / MAX: 11.24 MIN: 7.6 / MAX: 11.3 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Waifu2x-NCNN Vulkan Scale: 2x - Denoise: 3 - TAA: Yes OpenBenchmarking.org Seconds, Fewer Is Better Waifu2x-NCNN Vulkan 20200818 Scale: 2x - Denoise: 3 - TAA: Yes 1 2 3 2 4 6 8 10 SE +/- 0.002, N = 3 SE +/- 0.007, N = 3 SE +/- 0.008, N = 3 7.158 7.184 7.183
NCNN Target: Vulkan GPU - Model: yolov4-tiny OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: yolov4-tiny 1 2 3 3 6 9 12 15 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 11.26 11.29 11.30 MIN: 11.09 / MAX: 11.79 MIN: 11.1 / MAX: 11.73 MIN: 11.15 / MAX: 11.62 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Embree Binary: Pathtracer ISPC - Model: Asian Dragon OpenBenchmarking.org Frames Per Second, More Is Better Embree 3.9.0 Binary: Pathtracer ISPC - Model: Asian Dragon 1 2 3 2 4 6 8 10 SE +/- 0.0222, N = 3 SE +/- 0.0098, N = 3 SE +/- 0.0025, N = 3 6.1185 6.1354 6.1401 MIN: 6.05 / MAX: 6.22 MIN: 6.09 / MAX: 6.22 MIN: 6.11 / MAX: 6.21
Kvazaar Video Input: Bosphorus 1080p - Video Preset: Ultra Fast OpenBenchmarking.org Frames Per Second, More Is Better Kvazaar 2.0 Video Input: Bosphorus 1080p - Video Preset: Ultra Fast 1 2 3 13 26 39 52 65 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 SE +/- 0.11, N = 3 57.37 57.57 57.47 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Kvazaar Video Input: Bosphorus 4K - Video Preset: Medium OpenBenchmarking.org Frames Per Second, More Is Better Kvazaar 2.0 Video Input: Bosphorus 4K - Video Preset: Medium 1 2 3 0.657 1.314 1.971 2.628 3.285 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 2.91 2.92 2.92 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Basis Universal Settings: UASTC Level 2 + RDO Post-Processing OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.12 Settings: UASTC Level 2 + RDO Post-Processing 1 2 3 150 300 450 600 750 SE +/- 1.79, N = 3 SE +/- 0.97, N = 3 SE +/- 1.97, N = 3 702.91 701.85 704.13 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Basis Universal Settings: UASTC Level 0 OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.12 Settings: UASTC Level 0 1 2 3 3 6 9 12 15 SE +/- 0.015, N = 3 SE +/- 0.018, N = 3 SE +/- 0.014, N = 3 9.297 9.267 9.267 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Caffe Model: GoogleNet - Acceleration: CPU - Iterations: 100 OpenBenchmarking.org Milli-Seconds, Fewer Is Better Caffe 2020-02-13 Model: GoogleNet - Acceleration: CPU - Iterations: 100 1 2 3 30K 60K 90K 120K 150K SE +/- 118.13, N = 3 SE +/- 32.70, N = 3 SE +/- 73.69, N = 3 120634 120679 120291 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
NCNN Target: Vulkan GPU - Model: efficientnet-b0 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: efficientnet-b0 1 2 3 3 6 9 12 15 SE +/- 0.12, N = 3 SE +/- 0.17, N = 3 SE +/- 0.10, N = 3 9.33 9.34 9.36 MIN: 8.97 / MAX: 19.5 MIN: 8.93 / MAX: 17.56 MIN: 8.96 / MAX: 20 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Numpy Benchmark OpenBenchmarking.org Score, More Is Better Numpy Benchmark 1 2 3 70 140 210 280 350 SE +/- 0.41, N = 3 SE +/- 0.23, N = 3 SE +/- 0.63, N = 3 321.42 320.60 321.61
Kvazaar Video Input: Bosphorus 1080p - Video Preset: Medium OpenBenchmarking.org Frames Per Second, More Is Better Kvazaar 2.0 Video Input: Bosphorus 1080p - Video Preset: Medium 1 2 3 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 12.72 12.75 12.76 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
PyPerformance Benchmark: 2to3 OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: 2to3 1 2 3 70 140 210 280 350 SE +/- 0.33, N = 3 320 321 321
OCRMyPDF Processing 60 Page PDF Document OpenBenchmarking.org Seconds, Fewer Is Better OCRMyPDF 10.3.1+dfsg Processing 60 Page PDF Document 1 2 3 7 14 21 28 35 SE +/- 0.12, N = 3 SE +/- 0.07, N = 3 SE +/- 0.03, N = 3 32.16 32.23 32.26
Mobile Neural Network Model: inception-v3 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2020-09-17 Model: inception-v3 1 2 3 10 20 30 40 50 SE +/- 0.08, N = 3 SE +/- 0.08, N = 3 SE +/- 0.17, N = 3 42.57 42.44 42.55 MIN: 42.21 / MAX: 51.92 MIN: 42.09 / MAX: 51.33 MIN: 42.02 / MAX: 58.27 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network Model: SqueezeNetV1.0 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2020-09-17 Model: SqueezeNetV1.0 1 2 3 2 4 6 8 10 SE +/- 0.003, N = 3 SE +/- 0.005, N = 3 SE +/- 0.018, N = 3 8.853 8.848 8.874 MIN: 8.76 / MAX: 11.24 MIN: 8.76 / MAX: 9.86 MIN: 8.77 / MAX: 11.46 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
AOM AV1 Encoder Mode: Speed 6 Two-Pass OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.0 Encoder Mode: Speed 6 Two-Pass 1 2 3 0.7718 1.5436 2.3154 3.0872 3.859 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 3.43 3.42 3.43 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
NAMD ATPase Simulation - 327,506 Atoms OpenBenchmarking.org days/ns, Fewer Is Better NAMD 2.14 ATPase Simulation - 327,506 Atoms 1 2 3 1.0114 2.0228 3.0342 4.0456 5.057 SE +/- 0.00243, N = 3 SE +/- 0.00085, N = 3 SE +/- 0.00219, N = 3 4.49500 4.48209 4.48308
NCNN Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 1 2 3 0.8055 1.611 2.4165 3.222 4.0275 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 3.57 3.58 3.58 MIN: 3.54 / MAX: 4.28 MIN: 3.53 / MAX: 4.28 MIN: 3.54 / MAX: 4.3 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
ASTC Encoder Preset: Fast OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.0 Preset: Fast 1 2 3 2 4 6 8 10 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 7.39 7.39 7.37 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
VkFFT OpenBenchmarking.org Benchmark Score, More Is Better VkFFT 2020-09-29 1 2 3 4K 8K 12K 16K 20K SE +/- 80.13, N = 3 SE +/- 18.75, N = 3 SE +/- 15.18, N = 3 18432 18479 18478
AI Benchmark Alpha Device Training Score OpenBenchmarking.org Score, More Is Better AI Benchmark Alpha 0.1.2 Device Training Score 1 2 3 200 400 600 800 1000 788 788 790
Tesseract OCR Time To OCR 7 Images OpenBenchmarking.org Seconds, Fewer Is Better Tesseract OCR 4.1.1 Time To OCR 7 Images 1 2 3 6 12 18 24 30 SE +/- 0.03, N = 3 SE +/- 0.06, N = 3 SE +/- 0.02, N = 3 24.57 24.51 24.51
Kvazaar Video Input: Bosphorus 4K - Video Preset: Very Fast OpenBenchmarking.org Frames Per Second, More Is Better Kvazaar 2.0 Video Input: Bosphorus 4K - Video Preset: Very Fast 1 2 3 2 4 6 8 10 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 8.03 8.03 8.01 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Caffe Model: AlexNet - Acceleration: CPU - Iterations: 100 OpenBenchmarking.org Milli-Seconds, Fewer Is Better Caffe 2020-02-13 Model: AlexNet - Acceleration: CPU - Iterations: 100 1 2 3 11K 22K 33K 44K 55K SE +/- 14.01, N = 3 SE +/- 8.08, N = 3 SE +/- 27.06, N = 3 49082 49047 48962 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
NCNN Target: Vulkan GPU - Model: alexnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: alexnet 1 2 3 0.9225 1.845 2.7675 3.69 4.6125 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 4.09 4.10 4.09 MIN: 3.96 / MAX: 4.79 MIN: 3.99 / MAX: 4.76 MIN: 3.97 / MAX: 4.66 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Mobile Neural Network Model: mobilenet-v1-1.0 OpenBenchmarking.org ms, Fewer Is Better Mobile Neural Network 2020-09-17 Model: mobilenet-v1-1.0 1 2 3 1.3104 2.6208 3.9312 5.2416 6.552 SE +/- 0.016, N = 3 SE +/- 0.028, N = 3 SE +/- 0.015, N = 3 5.819 5.824 5.811 MIN: 5.74 / MAX: 6.34 MIN: 5.72 / MAX: 25.28 MIN: 5.73 / MAX: 15.06 1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
NCNN Target: Vulkan GPU - Model: squeezenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20200916 Target: Vulkan GPU - Model: squeezenet 1 2 3 1.0913 2.1826 3.2739 4.3652 5.4565 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 4.84 4.85 4.85 MIN: 4.71 / MAX: 5.84 MIN: 4.69 / MAX: 5.86 MIN: 4.71 / MAX: 5.84 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
yquake2 Renderer: OpenGL 3.x - Resolution: 1920 x 1080 OpenBenchmarking.org Frames Per Second, More Is Better yquake2 7.45 Renderer: OpenGL 3.x - Resolution: 1920 x 1080 1 2 3 200 400 600 800 1000 SE +/- 1.84, N = 3 SE +/- 0.50, N = 3 SE +/- 2.82, N = 3 977.8 977.8 975.8 1. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
IndigoBench Acceleration: CPU - Scene: Supercar OpenBenchmarking.org M samples/s, More Is Better IndigoBench 4.4 Acceleration: CPU - Scene: Supercar 1 2 3 0.4406 0.8812 1.3218 1.7624 2.203 SE +/- 0.005, N = 3 SE +/- 0.004, N = 3 SE +/- 0.001, N = 3 1.954 1.958 1.958
TNN Target: CPU - Model: MobileNet v2 OpenBenchmarking.org ms, Fewer Is Better TNN 0.2.3 Target: CPU - Model: MobileNet v2 1 2 3 60 120 180 240 300 SE +/- 0.00, N = 3 SE +/- 0.19, N = 3 SE +/- 0.03, N = 3 263.82 264.30 264.12 MIN: 263.16 / MAX: 270.27 MIN: 263.28 / MAX: 314.67 MIN: 263.32 / MAX: 271.64 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
Timed HMMer Search Pfam Database Search OpenBenchmarking.org Seconds, Fewer Is Better Timed HMMer Search 3.3.1 Pfam Database Search 1 2 3 20 40 60 80 100 SE +/- 0.15, N = 3 SE +/- 0.11, N = 3 SE +/- 0.23, N = 3 111.35 111.30 111.49 1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
eSpeak-NG Speech Engine Text-To-Speech Synthesis OpenBenchmarking.org Seconds, Fewer Is Better eSpeak-NG Speech Engine 20200907 Text-To-Speech Synthesis 1 2 3 7 14 21 28 35 SE +/- 0.15, N = 4 SE +/- 0.01, N = 4 SE +/- 0.02, N = 4 30.99 30.94 30.96 1. (CC) gcc options: -O2 -std=c99
Basis Universal Settings: ETC1S OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.12 Settings: ETC1S 1 2 3 14 28 42 56 70 SE +/- 0.17, N = 3 SE +/- 0.11, N = 3 SE +/- 0.15, N = 3 62.80 62.69 62.77 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
ASTC Encoder Preset: Thorough OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.0 Preset: Thorough 1 2 3 12 24 36 48 60 SE +/- 0.01, N = 3 SE +/- 0.05, N = 3 SE +/- 0.01, N = 3 52.39 52.46 52.40 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Kvazaar Video Input: Bosphorus 1080p - Video Preset: Very Fast OpenBenchmarking.org Frames Per Second, More Is Better Kvazaar 2.0 Video Input: Bosphorus 1080p - Video Preset: Very Fast 1 2 3 8 16 24 32 40 SE +/- 0.05, N = 3 SE +/- 0.02, N = 3 SE +/- 0.04, N = 3 32.34 32.38 32.38 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Hierarchical INTegration Test: FLOAT OpenBenchmarking.org QUIPs, More Is Better Hierarchical INTegration 1.0 Test: FLOAT 1 2 3 70M 140M 210M 280M 350M SE +/- 215594.66, N = 3 SE +/- 156026.35, N = 3 SE +/- 106732.80, N = 3 340004918.54 339696862.30 339643088.46 1. (CC) gcc options: -O3 -march=native -lm
Timed LLVM Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed LLVM Compilation 10.0 Time To Compile 1 2 3 300 600 900 1200 1500 SE +/- 1.95, N = 3 SE +/- 1.32, N = 3 SE +/- 0.35, N = 3 1215.50 1214.94 1216.09
ASTC Encoder Preset: Exhaustive OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.0 Preset: Exhaustive 1 2 3 90 180 270 360 450 SE +/- 0.13, N = 3 SE +/- 0.16, N = 3 SE +/- 0.06, N = 3 425.23 425.58 425.39 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Basis Universal Settings: UASTC Level 3 OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.12 Settings: UASTC Level 3 1 2 3 30 60 90 120 150 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 114.27 114.19 114.26 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
TNN Target: CPU - Model: SqueezeNet v1.1 OpenBenchmarking.org ms, Fewer Is Better TNN 0.2.3 Target: CPU - Model: SqueezeNet v1.1 1 2 3 60 120 180 240 300 SE +/- 0.22, N = 3 SE +/- 0.15, N = 3 SE +/- 0.05, N = 3 260.26 260.45 260.44 MIN: 259.28 / MAX: 261.34 MIN: 259.62 / MAX: 261.24 MIN: 259.68 / MAX: 261.29 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
PHPBench PHP Benchmark Suite OpenBenchmarking.org Score, More Is Better PHPBench 0.8.1 PHP Benchmark Suite 1 2 3 130K 260K 390K 520K 650K SE +/- 5320.80, N = 3 SE +/- 3538.13, N = 3 SE +/- 3258.38, N = 3 604079 604512 604322
Basis Universal Settings: UASTC Level 2 OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.12 Settings: UASTC Level 2 1 2 3 13 26 39 52 65 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 58.59 58.63 58.62 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
RNNoise OpenBenchmarking.org Seconds, Fewer Is Better RNNoise 2020-06-28 1 2 3 5 10 15 20 25 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 20.02 20.02 20.03 1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
OpenSSL RSA 4096-bit Performance OpenBenchmarking.org Signs Per Second, More Is Better OpenSSL 1.1.1 RSA 4096-bit Performance 1 2 3 200 400 600 800 1000 SE +/- 0.15, N = 3 SE +/- 0.27, N = 3 SE +/- 0.07, N = 3 1145.0 1144.3 1144.7 1. (CC) gcc options: -pthread -m64 -O3 -lssl -lcrypto -ldl
TensorFlow Lite Model: Inception V4 OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2020-08-23 Model: Inception V4 1 2 3 1.2M 2.4M 3.6M 4.8M 6M SE +/- 421.91, N = 3 SE +/- 2695.68, N = 3 SE +/- 230.96, N = 3 5572917 5574923 5571717
Mlpack Benchmark Benchmark: scikit_ica OpenBenchmarking.org Seconds, Fewer Is Better Mlpack Benchmark Benchmark: scikit_ica 1 2 3 11 22 33 44 55 SE +/- 0.07, N = 3 SE +/- 0.05, N = 3 SE +/- 0.04, N = 3 49.12 49.12 49.14
TensorFlow Lite Model: Mobilenet Quant OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2020-08-23 Model: Mobilenet Quant 1 2 3 60K 120K 180K 240K 300K SE +/- 51.91, N = 3 SE +/- 131.03, N = 3 SE +/- 60.84, N = 3 266426 266319 266346
TensorFlow Lite Model: Inception ResNet V2 OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2020-08-23 Model: Inception ResNet V2 1 2 3 1.1M 2.2M 3.3M 4.4M 5.5M SE +/- 308.45, N = 3 SE +/- 255.76, N = 3 SE +/- 2537.61, N = 3 5048647 5046933 5048433
TensorFlow Lite Model: Mobilenet Float OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2020-08-23 Model: Mobilenet Float 1 2 3 60K 120K 180K 240K 300K SE +/- 20.61, N = 3 SE +/- 44.31, N = 3 SE +/- 21.18, N = 3 259621 259600 259575
TensorFlow Lite Model: SqueezeNet OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2020-08-23 Model: SqueezeNet 1 2 3 80K 160K 240K 320K 400K SE +/- 60.36, N = 3 SE +/- 56.90, N = 3 SE +/- 10.21, N = 3 387017 386977 387009
PyPerformance Benchmark: crypto_pyaes OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: crypto_pyaes 1 2 3 30 60 90 120 150 116 116 116
PyPerformance Benchmark: nbody OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: nbody 1 2 3 30 60 90 120 150 117 117 117
PyPerformance Benchmark: chaos OpenBenchmarking.org Milliseconds, Fewer Is Better PyPerformance 1.0.0 Benchmark: chaos 1 2 3 20 40 60 80 100 111 111 111
ASTC Encoder Preset: Medium OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.0 Preset: Medium 1 2 3 2 4 6 8 10 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 8.30 8.30 8.30 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Monte Carlo Simulations of Ionised Nebulae Input: Dust 2D tau100.0 OpenBenchmarking.org Seconds, Fewer Is Better Monte Carlo Simulations of Ionised Nebulae 2019-03-24 Input: Dust 2D tau100.0 1 2 3 60 120 180 240 300 SE +/- 0.58, N = 3 SE +/- 0.67, N = 3 SE +/- 0.88, N = 3 288 288 288 1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O3 -O2 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lrt -lz
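The "SE +/-" figure printed next to each result is a standard error over N trials. The export does not state the formula, so the sketch below assumes the common definition: sample standard deviation divided by the square root of N. The function name `standard_error` and the trial values are hypothetical and are not taken from this result file.

```python
import math
import statistics


def standard_error(samples):
    """Sample standard deviation divided by sqrt(N) (assumed SE definition)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-trial times in seconds; not taken from the result file.
trials = [287.0, 288.0, 289.0]
print(
    f"mean = {statistics.mean(trials):.0f}, "
    f"SE +/- {standard_error(trials):.2f}, N = {len(trials)}"
)
```

Under that assumption, three trials spaced one unit apart give an SE of about 0.58, which is the order of magnitude reported for whole-number results with N = 3.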
HPC Challenge 1.5.0 (reported for result 1 only)
Test / Class: Max Ping Pong Bandwidth (MB/s, More Is Better) - Result 1: 14400.77
Test / Class: Random Ring Bandwidth (GB/s, More Is Better) - Result 1: 4.96131
Test / Class: Random Ring Latency (usecs, Fewer Is Better) - Result 1: 0.39400
Test / Class: G-Random Access (GUP/s, More Is Better) - Result 1: 0.02004
Test / Class: EP-STREAM Triad (GB/s, More Is Better) - Result 1: 6.74179
Test / Class: G-Ptrans (GB/s, More Is Better) - Result 1: 0.65366
Test / Class: EP-DGEMM (GFLOPS, More Is Better) - Result 1: 55.51
Test / Class: G-Ffte (GFLOPS, More Is Better) - Result 1: 3.30686
Test / Class: G-HPL (GFLOPS, More Is Better) - Result 1: 120.91
1. (CC) gcc options: -lblas -lm -pthread -lmpi -fomit-frame-pointer -funroll-loops
2. ATLAS + Open MPI 4.0.3
Redis 6.0.9 - Test: GET (Requests Per Second, More Is Better)
Result 1: 2310604.62 (SE +/- 40313.16, N = 15)
Result 2: 2095708.19 (SE +/- 26352.66, N = 15)
Result 3: 2135888.90 (SE +/- 21009.40, N = 15)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: LPOP (Requests Per Second, More Is Better)
Result 1: 2624535.48 (SE +/- 39136.83, N = 15)
Result 2: 2159676.18 (SE +/- 123032.71, N = 12)
Result 3: 1559670.96 (SE +/- 13136.89, N = 3)
1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
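The three Redis LPOP results above differ far more than most tests in this comparison. One way to put a number on that gap is the sketch below, which expresses the distance between the highest and lowest result as a percentage of the highest. The helper name `spread_percent` is hypothetical, and this is only one of several reasonable spread measures; the input values are the three LPOP results listed above.

```python
def spread_percent(results):
    """Gap between the highest and lowest result, as a percentage of the highest."""
    highest, lowest = max(results), min(results)
    return (highest - lowest) / highest * 100.0


# Redis LPOP results 1, 2 and 3 as listed above (Requests Per Second).
lpop = [2624535.48, 2159676.18, 1559670.96]
print(f"LPOP spread: {spread_percent(lpop):.1f}%")
```

A spread this large, combined with the differing trial counts (N = 15, 12 and 3), suggests treating the LPOP figures with more caution than the tests whose three results agree to within about one percent.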
oneDNN 1.5 - Harness: Deconvolution Batch deconv_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
Result 1: 8.05221 (SE +/- 0.01249, N = 3; MIN: 7.86)
Result 2: 9.74749 (SE +/- 0.12846, N = 3; MIN: 9.46)
Result 3: 9.82517 (SE +/- 0.17039, N = 15; MIN: 8.65)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: Rhodopsin Protein (ns/day, More Is Better)
Result 1: 3.249 (SE +/- 0.052, N = 15)
Result 2: 3.329 (SE +/- 0.029, N = 3)
Result 3: 3.340 (SE +/- 0.021, N = 3)
1. (CXX) g++ options: -O3 -pthread -lm
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: RaiNyMore2 (Frames Per Second, More Is Better)
Result 1: 236.14 (SE +/- 9.66, N = 15; MIN: 35.16 / MAX: 498.75)
Result 2: 214.08 (SE +/- 13.98, N = 15; MIN: 33.12 / MAX: 476.64)
Result 3: 203.50 (SE +/- 13.89, N = 12; MIN: 27.45 / MAX: 451.06)
1. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
Betsy GPU Compressor 1.1 Beta - Codec: ETC2 RGB - Quality: Highest (Seconds, Fewer Is Better)
Result 1: 6.688 (SE +/- 0.213, N = 15)
Result 2: 6.476 (SE +/- 0.003, N = 3)
Result 3: 6.470 (SE +/- 0.003, N = 3)
1. (CXX) g++ options: -O3 -O2 -lpthread -ldl
Betsy GPU Compressor 1.1 Beta - Codec: ETC1 - Quality: Highest (Seconds, Fewer Is Better)
Result 1: 5.799 (SE +/- 0.224, N = 15)
Result 2: 5.608 (SE +/- 0.062, N = 3)
Result 3: 5.602 (SE +/- 0.064, N = 3)
1. (CXX) g++ options: -O3 -O2 -lpthread -ldl
Phoronix Test Suite v10.8.5