AMD Ryzen 3 2200G testing with an ASUS PRIME B350M-E (5220 BIOS) and ASUS AMD Radeon Vega / Mobile 2GB on Ubuntu 20.10 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2101191-HA-RYZEN322022
HTML result view exported from: https://openbenchmarking.org/result/2101191-HA-RYZEN322022&grr&sro .
Ryzen 3 2200G 2021
System details (runs 1, 2, 3):
  Processor: AMD Ryzen 3 2200G @ 3.50GHz (4 Cores)
  Motherboard: ASUS PRIME B350M-E (5220 BIOS)
  Chipset: AMD Raven/Raven2
  Memory: 6GB
  Disk: Samsung SSD 970 EVO 250GB
  Graphics: ASUS AMD Radeon Vega / Mobile 2GB (1100/1600MHz)
  Audio: AMD Raven/Raven2/Fenghuang
  Monitor: G237HL
  Network: Realtek RTL8111/8168/8411
  OS: Ubuntu 20.10
  Kernel: 5.8.0-38-generic (x86_64)
  Desktop: GNOME Shell 3.38.1
  Display Server: X Server 1.20.9
  Display Driver: modesetting 1.20.9
  OpenGL: 4.6 Mesa 20.2.6 (LLVM 11.0.0)
  Vulkan: 1.2.131
  Compiler: GCC 10.2.0
  File-System: ext4
  Screen Resolution: 1920x1080
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x8101016
Graphics Details - GLAMOR
Java Details - OpenJDK Runtime Environment (build 11.0.9.1+1-Ubuntu-0ubuntu1.20.10)
Python Details - Python 3.8.6
Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Not affected
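The Security Details entry above packs one kernel mitigation status per vulnerability into a single string, with entries joined by " + ". A minimal Python sketch for splitting such a string into a lookup table follows. The function name parse_security_details is hypothetical and not part of the Phoronix Test Suite, and the sample string is abbreviated from the entry above.

```python
def parse_security_details(text):
    """Split 'name: status + name: status' into a {name: status} dict."""
    details = {}
    for entry in text.split(" + "):
        # Split on the first ": " only, so statuses that contain their own
        # colons (such as the spectre_v2 entry) stay intact.
        name, _, status = entry.strip().partition(": ")
        if name:
            details[name] = status
    return details


# Abbreviated sample copied from the Security Details entry.
sample = (
    "itlb_multihit: Not affected + l1tf: Not affected + "
    "spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + "
    "srbds: Not affected"
)
details = parse_security_details(sample)
mitigated = [name for name, status in details.items()
             if status.startswith("Mitigation")]
print(mitigated)
```

Keeping the result as a plain dict makes it straightforward to diff the mitigation state of two result files, since mitigations can affect benchmark numbers.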
Ryzen 3 2200G 2021 incompact3d: Cylinder lczero: BLAS kripke: astcenc: Exhaustive lczero: Eigen gromacs: Water Benchmark build2: Time To Compile build-godot: Time To Compile cp2k: Fayalite-FIST Data realsr-ncnn: 4x - Yes kvazaar: Bosphorus 4K - Medium namd: ATPase Simulation - 327,506 Atoms openfoam: Motorbike 30M compress-lz4: 9 - Decompression Speed compress-lz4: 9 - Compression Speed mocassin: Dust 2D tau100.0 numpy: compress-lz4: 3 - Decompression Speed compress-lz4: 3 - Compression Speed onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU dav1d: Chimera 1080p 10-bit embree: Pathtracer ISPC - Crown embree: Pathtracer ISPC - Asian Dragon Obj embree: Pathtracer - Asian Dragon Obj embree: Pathtracer - Crown asmfish: 1024 Hash Memory, 26 Depth compress-zstd: 19 cloverleaf: Lagrangian-Eulerian Hydrodynamics embree: Pathtracer ISPC - Asian Dragon embree: Pathtracer - Asian Dragon build-ffmpeg: Time To Compile tensorflow-lite: Inception V4 ncnn: Vulkan GPU - regnety_400m ncnn: Vulkan GPU - squeezenet_ssd ncnn: Vulkan GPU - yolov4-tiny ncnn: Vulkan GPU - resnet50 ncnn: Vulkan GPU - alexnet ncnn: Vulkan GPU - resnet18 ncnn: Vulkan GPU - vgg16 ncnn: Vulkan GPU - googlenet ncnn: Vulkan GPU - blazeface ncnn: Vulkan GPU - efficientnet-b0 ncnn: Vulkan GPU - mnasnet ncnn: Vulkan GPU - shufflenet-v2 ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU - mobilenet tensorflow-lite: Inception ResNet V2 kvazaar: Bosphorus 4K - Very Fast mnn: inception-v3 mnn: mobilenet-v1-1.0 mnn: MobileNetV2_224 mnn: resnet-v2-50 mnn: SqueezeNetV1.0 influxdb: 4 - 10000 - 2,5000,1 - 10000 hint: FLOAT clomp: Static OMP Speedup influxdb: 64 - 10000 - 2,5000,1 - 10000 ncnn: CPU - regnety_400m ncnn: CPU - squeezenet_ssd ncnn: CPU - yolov4-tiny ncnn: CPU - resnet50 ncnn: CPU - alexnet ncnn: CPU - resnet18 ncnn: CPU - vgg16 ncnn: CPU - googlenet ncnn: CPU - blazeface ncnn: CPU - efficientnet-b0 ncnn: CPU - mnasnet ncnn: CPU - shufflenet-v2 ncnn: 
CPU-v3-v3 - mobilenet-v3 ncnn: CPU-v2-v2 - mobilenet-v2 ncnn: CPU - mobilenet vkmark: 1920 x 1080 hmmer: Pfam Database Search node-web-tooling: rawtherapee: Total Benchmark Time x265: Bosphorus 4K byte: Dhrystone 2 build-eigen: Time To Compile caffe: GoogleNet - CPU - 100 glmark2: 1920 x 1080 onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU kvazaar: Bosphorus 1080p - Medium astcenc: Thorough kvazaar: Bosphorus 4K - Ultra Fast basis: UASTC Level 2 hugin: Panorama Photo Assistant + Stitching Time basis: ETC1S sqlite-speedtest: Timed Time - Size 1,000 warsow: 1920 x 1080 stockfish: Total Time sunflow: Global Illumination + Image Synthesis dav1d: Summer Nature 4K rav1e: 5 keydb: dav1d: Chimera 1080p compress-lz4: 1 - Decompression Speed compress-lz4: 1 - Compression Speed sockperf: Latency Under Load realsr-ncnn: 4x - No indigobench: CPU - Bedroom libraw: Post-Processing Benchmark indigobench: CPU - Supercar tensorflow-lite: SqueezeNet tensorflow-lite: NASNet Mobile tensorflow-lite: Mobilenet Float tensorflow-lite: Mobilenet Quant aom-av1: Speed 6 Realtime simdjson: LargeRand simdjson: PartialTweets simdjson: DistinctUserID webp: Quality 100, Lossless, Highest Compression darktable: Boat - CPU-only rav1e: 6 ocrmypdf: Processing 60 Page PDF Document encode-wavpack: WAV To WavPack onednn: Deconvolution Batch shapes_1d - f32 - CPU simdjson: Kostya espeak: Text-To-Speech Synthesis onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU aom-av1: Speed 6 Two-Pass compress-zstd: 3 caffe: AlexNet - CPU - 100 phpbench: PHP Benchmark Suite kvazaar: Bosphorus 1080p - Very Fast rnnoise: rav1e: 10 onednn: IP Shapes 3D - f32 - CPU aom-av1: Speed 4 Two-Pass unpack-firefox: 
firefox-84.0.source.tar.xz crafty: Elapsed Time x265: Bosphorus 1080p synthmark: VoiceMark_100 waifu2x-ncnn: 2x - 3 - Yes encode-ape: WAV To APE darktable: Masskrug - CPU-only webp: Quality 100, Lossless aom-av1: Speed 8 Realtime darktable: Server Room - CPU-only kvazaar: Bosphorus 1080p - Ultra Fast dolfyn: Computational Fluid Dynamics dav1d: Summer Nature 1080p tnn: CPU - SqueezeNet v1.1 coremark: CoreMark Size 666 - Iterations Per Second tnn: CPU - MobileNet v2 astcenc: Medium gimp: unsharp-mask redis: SET gimp: auto-levels onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU mafft: Multiple Sequence Alignment - LSU RNA encode-opus: WAV To Opus Encode gimp: rotate sockperf: Latency Ping Pong sockperf: Throughput redis: GET gimp: resize onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU basis: UASTC Level 0 amg: redis: LPUSH redis: LPOP redis: SADD astcenc: Fast onednn: IP Shapes 3D - u8s8f32 - CPU webp: Quality 100, Highest Compression yquake2: Software CPU - 1920 x 1080 lammps: Rhodopsin Protein onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU osbench: Create Files lulesh: osbench: Memory Allocations osbench: Launch Programs osbench: Create Processes osbench: Create Threads waifu2x-ncnn: 2x - 3 - No onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU webp: Quality 100 ffte: N=256, 3D Complex FFT Routine webp: Default yquake2: OpenGL 3.x - 1920 x 1080 darktable: Server Rack - CPU-only 1 2 3 810.954712 432 4811563 696.05 448 0.333 514.482 501.200 1448.589 482.609 1.49 6.75407 342.98 8565.2 42.22 342 242.34 8554.3 42.77 8195.20 52.55 2.5819 2.8199 2.9903 2.7601 7748047 14.0 191.41 3.2431 3.3140 182.189 6441017 18.88 59.28 59.00 71.68 23.46 29.17 117.46 32.56 3.25 16.98 10.38 12.63 9.59 10.88 46.40 5691310 3.94 63.415 7.395 5.424 50.216 9.732 706035.5 301333349.99382 2 721428.3 19.11 59.39 59.28 72.55 23.30 29.04 117.19 32.59 3.31 
17.07 10.50 12.65 9.67 11.20 46.83 1199 127.149 7.38 123.684 4.81 35499453.0 113.520 110084 1849 7721.13 7913.61 7746.87 36.5596 8437.12 8193.61 6.51 84.33 6.84 86.479 82.006 82.060 81.222 158.1 5718169 3.206 51.92 0.848 265074.45 184.25 8722.4 7994.30 53.775 63.028 0.494 19.36 1.107 467745 316112 309946 328733 10.13 0.35 0.45 0.46 57.670 25.206 1.082 52.675 15.082 22.5413 0.38 35.134 14.8215 2.22 2346.0 41877 508106 15.60 22.238 2.573 13.0736 1.38 23.847 6274275 19.49 596.254 26.675 15.955 24.172 24.894 27.20 20.695 27.01 21.069 182.89 287.255 102524.442176 279.339 12.77 17.332 1489969.25 15.924 16.5429 14.6322 15.042 8.936 14.487 6.927 555055 2064794.83 12.855 7.35744 11.898 213491533 1216336.46 2261210.92 1735687.33 9.75 5.81417 8.872 92.9 2.603 38.8392 23.0213 18.245888 1180.0971 81.742684 81.523260 26.190281 14.920235 4.115 29.7098 30.8073 2.595 15392.809951922 1.648 814.1 0.339 821.055725 374 3117717 697.50 380 0.330 516.516 503.977 1461.853 482.839 1.49 6.79902 339.54 8552.1 41.02 340 241.36 8547.8 42.34 8438.64 52.39 2.5670 2.8371 2.9682 2.7659 7802828 14.2 191.45 3.2504 3.3113 183.024 6389943 19.06 59.16 59.40 74.10 23.24 29.08 118.91 32.64 3.31 16.95 10.44 12.98 9.69 11.30 47.01 5697083 3.94 63.997 7.526 5.419 50.269 9.613 696009.5 301687480.18842 2.0 725224.2 18.87 59.71 59.42 73.14 23.42 29.04 119.38 32.79 3.33 16.76 10.34 12.89 9.49 10.91 46.49 1199 127.645 7.74 123.405 4.83 35791748.5 113.654 110320 1851 7794.13 7667.94 7725.63 35.8461 8277.53 8356.19 6.51 84.61 6.85 86.540 83.581 82.178 81.423 159.4 5628220 3.148 52.00 0.844 267212.95 184.17 8646.3 8015.39 52.758 63.023 0.494 19.66 1.098 461955 317213 306270 318685 10.12 0.35 0.45 0.46 57.216 25.901 1.083 52.741 15.077 22.3144 0.38 35.319 15.0825 2.22 2358.0 41672 506055 15.52 22.693 2.566 13.1685 1.38 23.799 6255015 19.60 596.615 26.685 15.994 24.518 24.957 27.18 21.008 27.07 21.139 183.50 287.063 101765.566926 279.626 12.83 17.317 1486411.63 15.853 16.5558 14.8700 15.000 8.923 14.470 6.751 559663 
1931045.20 12.866 7.36061 12.054 214232233 1213155.04 1258380.50 1758200.50 9.74 5.84129 8.871 93.3 2.586 38.9728 23.6791 18.315846 1208.3866 81.999382 81.967513 26.479562 14.823278 4.110 29.8889 31.4147 2.601 15755.585271762 1.662 807.2 0.342 820.322815 353 695.24 377 0.326 514.799 502.608 1452.469 482.551 1.50 6.83284 338.27 8562.8 41.23 341 243.26 8547.2 41.81 8426.84 53.51 2.5828 2.8151 2.9782 2.7779 7669043 14.2 191.01 3.2482 3.3432 182.601 6468567 19.07 59.50 59.28 71.69 23.48 29.28 118.04 32.59 3.33 16.90 10.43 12.75 9.60 10.68 46.32 5689070 3.95 63.269 7.313 5.398 50.489 9.867 700554.6 301185316.58909 2 723222.1 18.75 59.34 59.28 71.82 23.51 29.18 117.45 32.43 3.29 16.89 10.25 12.70 9.59 11.04 46.32 1196 127.385 7.38 123.339 4.83 35427649.2 112.964 110157 1852 7837.72 7750.49 7701.75 33.0587 8342.73 8419.82 6.50 84.49 6.83 86.340 82.188 82.076 81.933 159.4 5648589 3.302 52.09 0.839 269044.48 183.88 8690.1 8022.48 50.699 63.002 0.498 19.79 1.106 467404 315790 313216 327187 10.25 0.35 0.45 0.46 57.452 25.451 1.089 52.986 15.174 22.6619 0.38 35.312 15.5619 2.24 2324.2 41573 504159 15.62 22.580 2.639 13.3661 1.40 23.806 6322069 19.71 593.909 26.673 15.995 24.194 24.900 27.29 20.739 27.05 20.988 184.81 286.140 102339.408641 279.278 12.75 17.328 1472539.67 15.894 16.6108 15.1975 15.035 8.915 14.424 6.790 557665 1930168.38 12.826 7.31351 11.863 214072633 1223284.42 1275489.17 1734495.83 9.80 5.79119 8.869 93.3 2.613 38.5055 23.6697 18.441862 1208.0185 85.632006 82.073212 26.857058 14.909108 4.097 29.1210 30.8208 2.590 15437.468578998 1.657 807.9 0.344 OpenBenchmarking.org
Incompact3D 2020-09-17 - Input: Cylinder (Seconds, Fewer Is Better)
  Run 1: 810.95 (SE +/- 3.54, N = 3)
  Run 2: 821.06 (SE +/- 10.03, N = 3)
  Run 3: 820.32 (SE +/- 2.19, N = 3)
  (F9X) gfortran options: -cpp -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz
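Each result in this file is reported as a mean together with "SE +/- x, N = n", that is, a standard error over n trial runs. As a rough illustration, the standard error of the mean can be computed as the sample standard deviation divided by sqrt(N). The trial times below are hypothetical and not taken from this result file, and it is an assumption that the Phoronix Test Suite uses exactly this (sample rather than population) estimator.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-trial times in seconds (NOT from the result file).
trials = [804.0, 812.0, 817.0]
mean = statistics.mean(trials)
se = standard_error(trials)
print(f"{mean:.2f} SE +/- {se:.2f}, N = {len(trials)}")
```

With only three trials per test, as in most entries here, the standard error is itself a noisy estimate, so small differences between runs deserve caution.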
LeelaChessZero 0.26 - Backend: BLAS (Nodes Per Second, More Is Better)
  Run 1: 432 (SE +/- 4.54, N = 8)
  Run 2: 374 (SE +/- 6.01, N = 9)
  Run 3: 353 (SE +/- 2.52, N = 3)
  (CXX) g++ options: -flto -pthread
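Because some tests are "More Is Better" and others "Fewer Is Better", comparing runs needs a direction-aware ratio. The sketch below expresses each run relative to run 1 so that a positive percentage always means "better than run 1". The helper name relative_to_run1 is hypothetical; the inputs are the LeelaChessZero BLAS and Incompact3D Cylinder figures from this result file.

```python
def relative_to_run1(values, higher_is_better):
    """Percent change versus the first run; positive means better."""
    base = values[0]
    changes = []
    for value in values:
        # For "Fewer Is Better" metrics the ratio is inverted so that a
        # lower value still yields a positive (better) percentage.
        ratio = value / base if higher_is_better else base / value
        changes.append((ratio - 1.0) * 100.0)
    return changes


lczero_blas = relative_to_run1([432, 374, 353], higher_is_better=True)
incompact3d = relative_to_run1([810.95, 821.06, 820.32], higher_is_better=False)

for name, changes in (("lczero BLAS", lczero_blas), ("incompact3d", incompact3d)):
    print(name, [f"{c:+.2f}%" for c in changes])
```

Inverting the ratio for time-based metrics is one of several reasonable conventions; a plain percent change in seconds would give slightly different figures.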
Kripke 1.2.4 (Throughput FoM, More Is Better)
  Run 1: 4811563 (SE +/- 36406.50, N = 2)
  Run 2: 3117717 (SE +/- 35494.54, N = 3)
  (CXX) g++ options: -O3 -fopenmp

ASTC Encoder 2.0 - Preset: Exhaustive (Seconds, Fewer Is Better)
  Run 1: 696.05 (SE +/- 1.18, N = 3)
  Run 2: 697.50 (SE +/- 0.48, N = 3)
  Run 3: 695.24 (SE +/- 0.20, N = 3)
  (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
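To judge whether two runs differ by more than their run-to-run noise, one rough heuristic is to divide the difference of the means by the combined standard error sqrt(SE1^2 + SE2^2). This is a back-of-the-envelope check and not something the Phoronix Test Suite reports. The function name noise_ratio is hypothetical, and the inputs are the ASTC Encoder Exhaustive results for runs 1 and 2.

```python
import math


def noise_ratio(mean_a, se_a, mean_b, se_b):
    """Absolute difference of two means in units of their combined SE."""
    combined = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(mean_a - mean_b) / combined


# ASTC Encoder, Preset: Exhaustive, run 1 versus run 2 (seconds).
ratio = noise_ratio(696.05, 1.18, 697.50, 0.48)
print(f"{ratio:.2f}")
```

A ratio close to 1, as for this pair, suggests the two runs are hard to distinguish from noise; considerably larger ratios are more likely to reflect a real difference.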
LeelaChessZero 0.26 - Backend: Eigen (Nodes Per Second, More Is Better)
  Run 1: 448
  Run 2: 380
  Run 3: 377
  SE as exported (two entries for three runs; run assignment not recoverable): SE +/- 4.81, N = 3; SE +/- 5.13, N = 9
  (CXX) g++ options: -flto -pthread

GROMACS 2020.3 - Water Benchmark (Ns Per Day, More Is Better)
  Run 1: 0.333 (SE +/- 0.002, N = 3)
  Run 2: 0.330 (SE +/- 0.002, N = 3)
  Run 3: 0.326 (SE +/- 0.005, N = 3)
  (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm

Build2 0.13 - Time To Compile (Seconds, Fewer Is Better)
  Run 1: 514.48 (SE +/- 0.43, N = 3)
  Run 2: 516.52 (SE +/- 2.15, N = 3)
  Run 3: 514.80 (SE +/- 1.15, N = 3)

Timed Godot Game Engine Compilation 3.2.3 - Time To Compile (Seconds, Fewer Is Better)
  Run 1: 501.20 (SE +/- 0.16, N = 3)
  Run 2: 503.98 (SE +/- 0.30, N = 3)
  Run 3: 502.61 (SE +/- 0.32, N = 3)

CP2K Molecular Dynamics 8.1 - Fayalite-FIST Data (Seconds, Fewer Is Better)
  Run 1: 1448.59
  Run 2: 1461.85
  Run 3: 1452.47

RealSR-NCNN 20200818 - Scale: 4x - TAA: Yes (Seconds, Fewer Is Better)
  Run 1: 482.61 (SE +/- 0.03, N = 3)
  Run 2: 482.84 (SE +/- 0.01, N = 3)
  Run 3: 482.55 (SE +/- 0.00, N = 3)

Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Medium (Frames Per Second, More Is Better)
  Run 1: 1.49 (SE +/- 0.00, N = 3)
  Run 2: 1.49 (SE +/- 0.00, N = 3)
  Run 3: 1.50 (SE +/- 0.00, N = 3)
  (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt

NAMD 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
  Run 1: 6.75407 (SE +/- 0.01425, N = 3)
  Run 2: 6.79902 (SE +/- 0.03865, N = 3)
  Run 3: 6.83284 (SE +/- 0.08887, N = 5)

OpenFOAM 8 - Input: Motorbike 30M (Seconds, Fewer Is Better)
  Run 1: 342.98 (SE +/- 1.66, N = 3)
  Run 2: 339.54 (SE +/- 0.27, N = 3)
  Run 3: 338.27 (SE +/- 2.23, N = 3)
  (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -ldynamicMesh -lgenericPatchFields -lOpenFOAM -ldl -lm
LZ4 Compression 1.9.3 - Compression Level: 9 - Decompression Speed (MB/s, More Is Better)
  Run 1: 8565.2 (SE +/- 6.38, N = 13)
  Run 2: 8552.1 (SE +/- 5.73, N = 15)
  Run 3: 8562.8 (SE +/- 3.43, N = 15)
  (CC) gcc options: -O3

LZ4 Compression 1.9.3 - Compression Level: 9 - Compression Speed (MB/s, More Is Better)
  Run 1: 42.22 (SE +/- 0.73, N = 13)
  Run 2: 41.02 (SE +/- 0.55, N = 15)
  Run 3: 41.23 (SE +/- 0.47, N = 15)
  (CC) gcc options: -O3

Monte Carlo Simulations of Ionised Nebulae 2019-03-24 - Input: Dust 2D tau100.0 (Seconds, Fewer Is Better)
  Run 1: 342
  Run 2: 340
  Run 3: 341
  SE as exported (two entries for three runs; run assignment not recoverable): SE +/- 1.76, N = 3; SE +/- 0.67, N = 3
  (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O3 -O2 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lrt -lz

Numpy Benchmark (Score, More Is Better)
  Run 1: 242.34 (SE +/- 0.34, N = 3)
  Run 2: 241.36 (SE +/- 0.33, N = 3)
  Run 3: 243.26 (SE +/- 0.50, N = 3)

LZ4 Compression 1.9.3 - Compression Level: 3 - Decompression Speed (MB/s, More Is Better)
  Run 1: 8554.3 (SE +/- 26.74, N = 3)
  Run 2: 8547.8 (SE +/- 8.28, N = 15)
  Run 3: 8547.2 (SE +/- 5.83, N = 15)
  (CC) gcc options: -O3

LZ4 Compression 1.9.3 - Compression Level: 3 - Compression Speed (MB/s, More Is Better)
  Run 1: 42.77 (SE +/- 0.58, N = 3)
  Run 2: 42.34 (SE +/- 0.43, N = 15)
  Run 3: 41.81 (SE +/- 0.65, N = 15)
  (CC) gcc options: -O3
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 8195.20 (SE +/- 99.84, N = 5; MIN: 7505)
  Run 2: 8438.64 (SE +/- 66.22, N = 15; MIN: 7752.96)
  Run 3: 8426.84 (SE +/- 46.50, N = 3; MIN: 8003.45)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

dav1d 0.8.1 - Video Input: Chimera 1080p 10-bit (FPS, More Is Better)
  Run 1: 52.55 (SE +/- 0.17, N = 3; MIN: 35.45 / MAX: 124.71)
  Run 2: 52.39 (SE +/- 0.21, N = 3; MIN: 35.47 / MAX: 120.48)
  Run 3: 53.51 (SE +/- 0.31, N = 3; MIN: 35.6 / MAX: 125.13)
  (CC) gcc options: -pthread -ldl -lm
Embree 3.9.0 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better)
  Run 1: 2.5819 (SE +/- 0.0040, N = 3; MIN: 2.55 / MAX: 2.62)
  Run 2: 2.5670 (SE +/- 0.0105, N = 3; MIN: 2.52 / MAX: 2.63)
  Run 3: 2.5828 (SE +/- 0.0165, N = 3; MIN: 2.51 / MAX: 2.65)

Embree 3.9.0 - Binary: Pathtracer ISPC - Model: Asian Dragon Obj (Frames Per Second, More Is Better)
  Run 1: 2.8199 (SE +/- 0.0119, N = 3; MIN: 2.75 / MAX: 2.92)
  Run 2: 2.8371 (SE +/- 0.0157, N = 3; MIN: 2.77 / MAX: 2.9)
  Run 3: 2.8151 (SE +/- 0.0152, N = 3; MIN: 2.75 / MAX: 2.89)

Embree 3.9.0 - Binary: Pathtracer - Model: Asian Dragon Obj (Frames Per Second, More Is Better)
  Run 1: 2.9903 (SE +/- 0.0237, N = 3; MIN: 2.9 / MAX: 3.08)
  Run 2: 2.9682 (SE +/- 0.0135, N = 3; MIN: 2.9 / MAX: 3.07)
  Run 3: 2.9782 (SE +/- 0.0217, N = 3; MIN: 2.91 / MAX: 3.08)

Embree 3.9.0 - Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better)
  Run 1: 2.7601 (SE +/- 0.0145, N = 3; MIN: 2.71 / MAX: 2.86)
  Run 2: 2.7659 (SE +/- 0.0043, N = 3; MIN: 2.73 / MAX: 2.83)
  Run 3: 2.7779 (SE +/- 0.0087, N = 3; MIN: 2.75 / MAX: 2.87)
asmFish 2018-07-23 - 1024 Hash Memory, 26 Depth (Nodes/second, More Is Better)
  Run 1: 7748047 (SE +/- 28500.12, N = 3)
  Run 2: 7802828 (SE +/- 50254.79, N = 3)
  Run 3: 7669043 (SE +/- 29445.58, N = 3)

Zstd Compression 1.4.5 - Compression Level: 19 (MB/s, More Is Better)
  Run 1: 14.0 (SE +/- 0.18, N = 5)
  Run 2: 14.2 (SE +/- 0.03, N = 3)
  Run 3: 14.2 (SE +/- 0.06, N = 3)
  (CC) gcc options: -O3 -pthread -lz -llzma

CloverLeaf - Lagrangian-Eulerian Hydrodynamics (Seconds, Fewer Is Better)
  Run 1: 191.41 (SE +/- 0.19, N = 3)
  Run 2: 191.45 (SE +/- 0.07, N = 3)
  Run 3: 191.01 (SE +/- 0.07, N = 3)
  (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

Embree 3.9.0 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better)
  Run 1: 3.2431 (SE +/- 0.0133, N = 3; MIN: 3.18 / MAX: 3.32)
  Run 2: 3.2504 (SE +/- 0.0130, N = 3; MIN: 3.19 / MAX: 3.32)
  Run 3: 3.2482 (SE +/- 0.0025, N = 3; MIN: 3.2 / MAX: 3.32)

Embree 3.9.0 - Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better)
  Run 1: 3.3140 (SE +/- 0.0186, N = 3; MIN: 3.25 / MAX: 3.4)
  Run 2: 3.3113 (SE +/- 0.0143, N = 3; MIN: 3.26 / MAX: 3.4)
  Run 3: 3.3432 (SE +/- 0.0299, N = 3; MIN: 3.25 / MAX: 3.45)

Timed FFmpeg Compilation 4.2.2 - Time To Compile (Seconds, Fewer Is Better)
  Run 1: 182.19 (SE +/- 0.34, N = 3)
  Run 2: 183.02 (SE +/- 0.22, N = 3)
  Run 3: 182.60 (SE +/- 0.72, N = 3)

TensorFlow Lite 2020-08-23 - Model: Inception V4 (Microseconds, Fewer Is Better)
  Run 1: 6441017 (SE +/- 24424.39, N = 3)
  Run 2: 6389943 (SE +/- 7475.10, N = 3)
  Run 3: 6468567 (SE +/- 4440.22, N = 3)
NCNN 20201218 - Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better)
  Run 1: 18.88 (SE +/- 0.09, N = 3; MIN: 16.76 / MAX: 26.52)
  Run 2: 19.06 (SE +/- 0.15, N = 4; MIN: 16.61 / MAX: 34.07)
  Run 3: 19.07 (SE +/- 0.01, N = 3; MIN: 16.77 / MAX: 34.77)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  Run 1: 59.28 (SE +/- 0.08, N = 3; MIN: 53.06 / MAX: 77.54)
  Run 2: 59.16 (SE +/- 0.14, N = 4; MIN: 51.95 / MAX: 72.82)
  Run 3: 59.50 (SE +/- 0.16, N = 3; MIN: 52.65 / MAX: 71.96)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better)
  Run 1: 59.00 (SE +/- 0.04, N = 3; MIN: 55.05 / MAX: 74.18)
  Run 2: 59.40 (SE +/- 0.06, N = 4; MIN: 55.12 / MAX: 75.24)
  Run 3: 59.28 (SE +/- 0.16, N = 3; MIN: 54.59 / MAX: 74.87)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better)
  Run 1: 71.68 (SE +/- 0.20, N = 3; MIN: 65.87 / MAX: 90.56)
  Run 2: 74.10 (SE +/- 0.63, N = 4; MIN: 66.26 / MAX: 110.26)
  Run 3: 71.69 (SE +/- 0.29, N = 3; MIN: 66.47 / MAX: 91.18)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better)
  Run 1: 23.46 (SE +/- 0.08, N = 3; MIN: 21.29 / MAX: 37.36)
  Run 2: 23.24 (SE +/- 0.03, N = 4; MIN: 21.11 / MAX: 37.35)
  Run 3: 23.48 (SE +/- 0.02, N = 3; MIN: 21.25 / MAX: 36.41)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better)
  Run 1: 29.17 (SE +/- 0.26, N = 3; MIN: 26 / MAX: 40.97)
  Run 2: 29.08 (SE +/- 0.14, N = 4; MIN: 25.74 / MAX: 44.35)
  Run 3: 29.28 (SE +/- 0.02, N = 3; MIN: 25.57 / MAX: 39.41)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better)
  Run 1: 117.46 (SE +/- 0.37, N = 3; MIN: 111.97 / MAX: 149.37)
  Run 2: 118.91 (SE +/- 0.24, N = 4; MIN: 113.22 / MAX: 141.78)
  Run 3: 118.04 (SE +/- 0.16, N = 3; MIN: 112.22 / MAX: 141.41)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better)
  Run 1: 32.56 (SE +/- 0.20, N = 3; MIN: 28.77 / MAX: 47.43)
  Run 2: 32.64 (SE +/- 0.12, N = 4; MIN: 28.94 / MAX: 42.38)
  Run 3: 32.59 (SE +/- 0.13, N = 3; MIN: 28.42 / MAX: 45.73)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better)
  Run 1: 3.25 (SE +/- 0.01, N = 3; MIN: 2.61 / MAX: 14.28)
  Run 2: 3.31 (SE +/- 0.02, N = 4; MIN: 2.6 / MAX: 4.93)
  Run 3: 3.33 (SE +/- 0.03, N = 3; MIN: 2.62 / MAX: 5.7)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  Run 1: 16.98 (SE +/- 0.07, N = 3; MIN: 14.01 / MAX: 31.1)
  Run 2: 16.95 (SE +/- 0.26, N = 4; MIN: 14.04 / MAX: 27.15)
  Run 3: 16.90 (SE +/- 0.14, N = 3; MIN: 13.97 / MAX: 30.89)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better)
  Run 1: 10.38 (SE +/- 0.08, N = 3; MIN: 8.43 / MAX: 17.05)
  Run 2: 10.44 (SE +/- 0.20, N = 4; MIN: 8.36 / MAX: 26.87)
  Run 3: 10.43 (SE +/- 0.08, N = 3; MIN: 8.39 / MAX: 21.7)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  Run 1: 12.63 (SE +/- 0.09, N = 3; MIN: 10.39 / MAX: 25.47)
  Run 2: 12.98 (SE +/- 0.16, N = 4; MIN: 10.26 / MAX: 24.78)
  Run 3: 12.75 (SE +/- 0.20, N = 3; MIN: 10.36 / MAX: 21.98)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  Run 1: 9.59 (SE +/- 0.13, N = 3; MIN: 7.78 / MAX: 22.63)
  Run 2: 9.69 (SE +/- 0.14, N = 4; MIN: 7.81 / MAX: 18.31)
  Run 3: 9.60 (SE +/- 0.09, N = 3; MIN: 7.73 / MAX: 19.55)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  Run 1: 10.88 (SE +/- 0.01, N = 3; MIN: 8.93 / MAX: 27.73)
  Run 2: 11.30 (SE +/- 0.21, N = 4; MIN: 8.86 / MAX: 23.26)
  Run 3: 10.68 (SE +/- 0.38, N = 3; MIN: 8.89 / MAX: 17.41)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better)
  Run 1: 46.40 (SE +/- 0.03, N = 3; MIN: 42.8 / MAX: 59.97)
  Run 2: 47.01 (SE +/- 0.66, N = 4; MIN: 42.66 / MAX: 60.93)
  Run 3: 46.32 (SE +/- 0.07, N = 3; MIN: 42.77 / MAX: 61.8)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
TensorFlow Lite 2020-08-23 - Model: Inception ResNet V2 (Microseconds, Fewer Is Better)
  Run 1: 5691310 (SE +/- 3601.25, N = 3)
  Run 2: 5697083 (SE +/- 3137.20, N = 3)
  Run 3: 5689070 (SE +/- 6688.08, N = 3)

Kvazaar 2.0 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
  Run 1: 3.94 (SE +/- 0.00, N = 3)
  Run 2: 3.94 (SE +/- 0.00, N = 3)
  Run 3: 3.95 (SE +/- 0.01, N = 3)
  (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Mobile Neural Network 1.1.1 - Model: inception-v3 (ms, Fewer Is Better)
  Run 1: 63.42 (SE +/- 0.18, N = 3; MIN: 60.02 / MAX: 120.02)
  Run 2: 64.00 (SE +/- 0.32, N = 3; MIN: 60.33 / MAX: 93.06)
  Run 3: 63.27 (SE +/- 0.19, N = 3; MIN: 60.45 / MAX: 98.45)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 1.1.1 - Model: mobilenet-v1-1.0 (ms, Fewer Is Better)
  Run 1: 7.395 (SE +/- 0.021, N = 3; MIN: 6.57 / MAX: 16.47)
  Run 2: 7.526 (SE +/- 0.065, N = 3; MIN: 6.6 / MAX: 20.05)
  Run 3: 7.313 (SE +/- 0.030, N = 3; MIN: 6.61 / MAX: 17.32)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 1.1.1 - Model: MobileNetV2_224 (ms, Fewer Is Better)
  Run 1: 5.424 (SE +/- 0.019, N = 3; MIN: 4.8 / MAX: 14.75)
  Run 2: 5.419 (SE +/- 0.038, N = 3; MIN: 4.88 / MAX: 15.69)
  Run 3: 5.398 (SE +/- 0.037, N = 3; MIN: 4.83 / MAX: 15.64)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 1.1.1 - Model: resnet-v2-50 (ms, Fewer Is Better)
  Run 1: 50.22 (SE +/- 0.51, N = 3; MIN: 47.21 / MAX: 83.29)
  Run 2: 50.27 (SE +/- 0.33, N = 3; MIN: 47.5 / MAX: 72.85)
  Run 3: 50.49 (SE +/- 0.27, N = 3; MIN: 47.85 / MAX: 147.84)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

Mobile Neural Network 1.1.1 - Model: SqueezeNetV1.0 (ms, Fewer Is Better)
  Run 1: 9.732 (SE +/- 0.127, N = 3; MIN: 8.67 / MAX: 18.76)
  Run 2: 9.613 (SE +/- 0.039, N = 3; MIN: 8.69 / MAX: 20.7)
  Run 3: 9.867 (SE +/- 0.066, N = 3; MIN: 8.73 / MAX: 39.42)
  (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
InfluxDB 1.8.2 - Concurrent Streams: 4 - Batch Size: 10000 - Tags: 2,5000,1 - Points Per Series: 10000 (val/sec, More Is Better)
  Run 1: 706035.5 (SE +/- 8641.46, N = 3)
  Run 2: 696009.5 (SE +/- 6558.55, N = 3)
  Run 3: 700554.6 (SE +/- 5594.95, N = 3)

Hierarchical INTegration 1.0 - Test: FLOAT (QUIPs, More Is Better)
  Run 1: 301333349.99 (SE +/- 252430.64, N = 3)
  Run 2: 301687480.19 (SE +/- 109501.99, N = 3)
  Run 3: 301185316.59 (SE +/- 702868.97, N = 3)
  (CC) gcc options: -O3 -march=native -lm

CLOMP 1.2 - Static OMP Speedup (Speedup, More Is Better)
  Run 1: 2.0
  Run 2: 2.0
  Run 3: 2.0
  SE as exported (one entry for three runs; run assignment not recoverable): SE +/- 0.03, N = 3
  (CC) gcc options: -fopenmp -O3 -lm

InfluxDB 1.8.2 - Concurrent Streams: 64 - Batch Size: 10000 - Tags: 2,5000,1 - Points Per Series: 10000 (val/sec, More Is Better)
  Run 1: 721428.3 (SE +/- 1902.33, N = 3)
  Run 2: 725224.2 (SE +/- 1666.30, N = 3)
  Run 3: 723222.1 (SE +/- 3366.02, N = 3)
NCNN 20201218 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
  Run 1: 19.11 (SE +/- 0.07, N = 3; MIN: 16.69 / MAX: 35.81)
  Run 2: 18.87 (SE +/- 0.10, N = 3; MIN: 16.81 / MAX: 33.23)
  Run 3: 18.75 (SE +/- 0.18, N = 3; MIN: 16.84 / MAX: 32.85)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  Run 1: 59.39 (SE +/- 0.05, N = 3; MIN: 53.07 / MAX: 79.98)
  Run 2: 59.71 (SE +/- 0.20, N = 3; MIN: 52.9 / MAX: 78.52)
  Run 3: 59.34 (SE +/- 0.27, N = 3; MIN: 52.64 / MAX: 71.17)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  Run 1: 59.28 (SE +/- 0.05, N = 3; MIN: 55.69 / MAX: 71.74)
  Run 2: 59.42 (SE +/- 0.08, N = 3; MIN: 55.13 / MAX: 75.95)
  Run 3: 59.28 (SE +/- 0.06, N = 3; MIN: 55.36 / MAX: 72.94)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  Run 1: 72.55 (SE +/- 0.82, N = 3; MIN: 66.75 / MAX: 91.97)
  Run 2: 73.14 (SE +/- 0.39, N = 3; MIN: 66.24 / MAX: 103.74)
  Run 3: 71.82 (SE +/- 0.19, N = 3; MIN: 65.77 / MAX: 87.35)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
  Run 1: 23.30 (SE +/- 0.11, N = 3; MIN: 21.24 / MAX: 38.11)
  Run 2: 23.42 (SE +/- 0.07, N = 3; MIN: 21.27 / MAX: 37.51)
  Run 3: 23.51 (SE +/- 0.07, N = 3; MIN: 21.21 / MAX: 37.48)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  Run 1: 29.04 (SE +/- 0.16, N = 3; MIN: 25.89 / MAX: 42.02)
  Run 2: 29.04 (SE +/- 0.12, N = 3; MIN: 25.69 / MAX: 36.07)
  Run 3: 29.18 (SE +/- 0.33, N = 3; MIN: 25.64 / MAX: 44.22)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  Run 1: 117.19 (SE +/- 0.18, N = 3; MIN: 112.25 / MAX: 143.19)
  Run 2: 119.38 (SE +/- 0.23, N = 3; MIN: 113.9 / MAX: 142.52)
  Run 3: 117.45 (SE +/- 0.17, N = 3; MIN: 112.54 / MAX: 135.06)
  (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: googlenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: googlenet 1 2 3 8 16 24 32 40 SE +/- 0.18, N = 3 SE +/- 0.05, N = 3 SE +/- 0.18, N = 3 32.59 32.79 32.43 MIN: 28.72 / MAX: 51.96 MIN: 28.77 / MAX: 48.75 MIN: 28.44 / MAX: 46.34 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: blazeface OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: blazeface 1 2 3 0.7493 1.4986 2.2479 2.9972 3.7465 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 3.31 3.33 3.29 MIN: 2.64 / MAX: 9.91 MIN: 2.62 / MAX: 5.11 MIN: 2.73 / MAX: 4.72 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: efficientnet-b0 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: efficientnet-b0 1 2 3 4 8 12 16 20 SE +/- 0.15, N = 3 SE +/- 0.03, N = 3 SE +/- 0.12, N = 3 17.07 16.76 16.89 MIN: 14.1 / MAX: 30.1 MIN: 13.96 / MAX: 31.87 MIN: 14.11 / MAX: 30.24 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: mnasnet OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: mnasnet 1 2 3 3 6 9 12 15 SE +/- 0.05, N = 3 SE +/- 0.10, N = 3 SE +/- 0.06, N = 3 10.50 10.34 10.25 MIN: 8.45 / MAX: 18.4 MIN: 8.42 / MAX: 16.24 MIN: 8.43 / MAX: 24.69 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: shufflenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: shufflenet-v2 1 2 3 3 6 9 12 15 SE +/- 0.12, N = 3 SE +/- 0.14, N = 3 SE +/- 0.14, N = 3 12.65 12.89 12.70 MIN: 10.41 / MAX: 23.3 MIN: 10.42 / MAX: 26.77 MIN: 10.47 / MAX: 19.71 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU-v3-v3 - Model: mobilenet-v3 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU-v3-v3 - Model: mobilenet-v3 1 2 3 3 6 9 12 15 SE +/- 0.07, N = 3 SE +/- 0.05, N = 3 SE +/- 0.03, N = 3 9.67 9.49 9.59 MIN: 7.8 / MAX: 15.65 MIN: 7.78 / MAX: 14.76 MIN: 7.81 / MAX: 16.29 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU-v2-v2 - Model: mobilenet-v2 OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU-v2-v2 - Model: mobilenet-v2 1 2 3 3 6 9 12 15 SE +/- 0.13, N = 3 SE +/- 0.08, N = 3 SE +/- 0.05, N = 3 11.20 10.91 11.04 MIN: 8.97 / MAX: 20.5 MIN: 8.91 / MAX: 18.07 MIN: 8.96 / MAX: 21.86 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN Target: CPU - Model: mobilenet OpenBenchmarking.org ms, Fewer Is Better NCNN 20201218 Target: CPU - Model: mobilenet 1 2 3 11 22 33 44 55 SE +/- 0.46, N = 3 SE +/- 0.07, N = 3 SE +/- 0.05, N = 3 46.83 46.49 46.32 MIN: 42.34 / MAX: 64.41 MIN: 42.72 / MAX: 62.2 MIN: 43.57 / MAX: 62.16 1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
VKMark Resolution: 1920 x 1080 OpenBenchmarking.org VKMark Score, More Is Better VKMark 2020-05-21 Resolution: 1920 x 1080 1 2 3 300 600 900 1200 1500 1199 1199 1196 1. (CXX) g++ options: -pthread -ldl -pipe -std=c++14 -MD -MQ -MF
Timed HMMer Search Pfam Database Search OpenBenchmarking.org Seconds, Fewer Is Better Timed HMMer Search 3.3.1 Pfam Database Search 1 2 3 30 60 90 120 150 SE +/- 0.03, N = 3 SE +/- 0.26, N = 3 SE +/- 0.08, N = 3 127.15 127.65 127.39 1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
Node.js V8 Web Tooling Benchmark OpenBenchmarking.org runs/s, More Is Better Node.js V8 Web Tooling Benchmark 1 2 3 2 4 6 8 10 SE +/- 0.08, N = 3 SE +/- 0.03, N = 3 SE +/- 0.09, N = 4 7.38 7.74 7.38 1. Nodejs v12.18.2
RawTherapee Total Benchmark Time OpenBenchmarking.org Seconds, Fewer Is Better RawTherapee Total Benchmark Time 1 2 3 30 60 90 120 150 SE +/- 0.06, N = 3 SE +/- 0.21, N = 3 SE +/- 0.14, N = 3 123.68 123.41 123.34 1. RawTherapee, version 5.8, command line.
x265 Video Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better x265 3.4 Video Input: Bosphorus 4K 1 2 3 1.0868 2.1736 3.2604 4.3472 5.434 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 4.81 4.83 4.83 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
BYTE Unix Benchmark Computational Test: Dhrystone 2 OpenBenchmarking.org LPS, More Is Better BYTE Unix Benchmark 3.6 Computational Test: Dhrystone 2 1 2 3 8M 16M 24M 32M 40M SE +/- 289585.60, N = 3 SE +/- 208729.08, N = 3 SE +/- 503813.27, N = 3 35499453.0 35791748.5 35427649.2
Timed Eigen Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed Eigen Compilation 3.3.9 Time To Compile 1 2 3 30 60 90 120 150 SE +/- 0.54, N = 3 SE +/- 0.19, N = 3 SE +/- 0.21, N = 3 113.52 113.65 112.96
Caffe Model: GoogleNet - Acceleration: CPU - Iterations: 100 OpenBenchmarking.org Milli-Seconds, Fewer Is Better Caffe 2020-02-13 Model: GoogleNet - Acceleration: CPU - Iterations: 100 1 2 3 20K 40K 60K 80K 100K SE +/- 106.80, N = 3 SE +/- 38.96, N = 3 SE +/- 223.03, N = 3 110084 110320 110157 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
GLmark2 Resolution: 1920 x 1080 OpenBenchmarking.org Score, More Is Better GLmark2 2020.04 Resolution: 1920 x 1080 1 2 3 400 800 1200 1600 2000 1849 1851 1852
oneDNN Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU 1 2 3 2K 4K 6K 8K 10K SE +/- 11.25, N = 3 SE +/- 85.05, N = 3 SE +/- 30.28, N = 3 7721.13 7794.13 7837.72 MIN: 7547.72 MIN: 7534.49 MIN: 7613.07 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU 1 2 3 2K 4K 6K 8K 10K SE +/- 101.76, N = 3 SE +/- 22.70, N = 3 SE +/- 13.61, N = 3 7913.61 7667.94 7750.49 MIN: 7617.25 MIN: 7509.56 MIN: 7562.51 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU 1 2 3 1700 3400 5100 6800 8500 SE +/- 38.84, N = 3 SE +/- 27.04, N = 3 SE +/- 16.23, N = 3 7746.87 7725.63 7701.75 MIN: 7556.29 MIN: 7520.66 MIN: 7494.48 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU 1 2 3 8 16 24 32 40 SE +/- 1.59, N = 15 SE +/- 1.77, N = 15 SE +/- 1.44, N = 12 36.56 35.85 33.06 MIN: 27.16 MIN: 27.16 MIN: 26.99 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU 1 2 3 2K 4K 6K 8K 10K SE +/- 133.51, N = 3 SE +/- 86.65, N = 3 SE +/- 61.07, N = 3 8437.12 8277.53 8342.73 MIN: 7874.47 MIN: 7767.68 MIN: 7929.12 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU 1 2 3 2K 4K 6K 8K 10K SE +/- 128.90, N = 3 SE +/- 144.66, N = 3 SE +/- 28.77, N = 3 8193.61 8356.19 8419.82 MIN: 7671.11 MIN: 7776.27 MIN: 8053.44 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Kvazaar Video Input: Bosphorus 1080p - Video Preset: Medium OpenBenchmarking.org Frames Per Second, More Is Better Kvazaar 2.0 Video Input: Bosphorus 1080p - Video Preset: Medium 1 2 3 2 4 6 8 10 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 6.51 6.51 6.50 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
ASTC Encoder Preset: Thorough OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.0 Preset: Thorough 1 2 3 20 40 60 80 100 SE +/- 0.21, N = 3 SE +/- 0.21, N = 3 SE +/- 0.04, N = 3 84.33 84.61 84.49 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Kvazaar Video Input: Bosphorus 4K - Video Preset: Ultra Fast OpenBenchmarking.org Frames Per Second, More Is Better Kvazaar 2.0 Video Input: Bosphorus 4K - Video Preset: Ultra Fast 1 2 3 2 4 6 8 10 SE +/- 0.02, N = 3 SE +/- 0.04, N = 3 SE +/- 0.02, N = 3 6.84 6.85 6.83 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Basis Universal Settings: UASTC Level 2 OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.12 Settings: UASTC Level 2 1 2 3 20 40 60 80 100 SE +/- 0.24, N = 3 SE +/- 0.18, N = 3 SE +/- 0.16, N = 3 86.48 86.54 86.34 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Hugin Panorama Photo Assistant + Stitching Time OpenBenchmarking.org Seconds, Fewer Is Better Hugin Panorama Photo Assistant + Stitching Time 1 2 3 20 40 60 80 100 SE +/- 0.57, N = 3 SE +/- 0.26, N = 3 SE +/- 0.13, N = 3 82.01 83.58 82.19
Basis Universal Settings: ETC1S OpenBenchmarking.org Seconds, Fewer Is Better Basis Universal 1.12 Settings: ETC1S 1 2 3 20 40 60 80 100 SE +/- 0.05, N = 3 SE +/- 0.14, N = 3 SE +/- 0.07, N = 3 82.06 82.18 82.08 1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
SQLite Speedtest Timed Time - Size 1,000 OpenBenchmarking.org Seconds, Fewer Is Better SQLite Speedtest 3.30 Timed Time - Size 1,000 1 2 3 20 40 60 80 100 SE +/- 0.14, N = 3 SE +/- 0.76, N = 3 SE +/- 0.70, N = 3 81.22 81.42 81.93 1. (CC) gcc options: -O2 -ldl -lz -lpthread
Warsow Resolution: 1920 x 1080 OpenBenchmarking.org Frames Per Second, More Is Better Warsow 2.5 Beta Resolution: 1920 x 1080 1 2 3 40 80 120 160 200 SE +/- 1.30, N = 3 SE +/- 0.12, N = 3 SE +/- 0.10, N = 3 158.1 159.4 159.4
Stockfish Total Time OpenBenchmarking.org Nodes Per Second, More Is Better Stockfish 12 Total Time 1 2 3 1.2M 2.4M 3.6M 4.8M 6M SE +/- 48644.36, N = 3 SE +/- 74806.22, N = 3 SE +/- 39149.01, N = 3 5718169 5628220 5648589 1. (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver
Sunflow Rendering System Global Illumination + Image Synthesis OpenBenchmarking.org Seconds, Fewer Is Better Sunflow Rendering System 0.07.2 Global Illumination + Image Synthesis 1 2 3 0.743 1.486 2.229 2.972 3.715 SE +/- 0.041, N = 3 SE +/- 0.032, N = 3 SE +/- 0.028, N = 15 3.206 3.148 3.302 MIN: 2.88 / MAX: 3.79 MIN: 2.89 / MAX: 3.84 MIN: 2.87 / MAX: 4.18
dav1d Video Input: Summer Nature 4K OpenBenchmarking.org FPS, More Is Better dav1d 0.8.1 Video Input: Summer Nature 4K 1 2 3 12 24 36 48 60 SE +/- 0.35, N = 3 SE +/- 0.37, N = 3 SE +/- 0.44, N = 3 51.92 52.00 52.09 MIN: 48.08 / MAX: 61.73 MIN: 48.01 / MAX: 61.72 MIN: 48.07 / MAX: 61.57 1. (CC) gcc options: -pthread -ldl -lm
rav1e Speed: 5 OpenBenchmarking.org Frames Per Second, More Is Better rav1e 0.4 Speed: 5 1 2 3 0.1908 0.3816 0.5724 0.7632 0.954 SE +/- 0.001, N = 3 SE +/- 0.001, N = 3 SE +/- 0.000, N = 3 0.848 0.844 0.839
KeyDB OpenBenchmarking.org Ops/sec, More Is Better KeyDB 6.0.16 1 2 3 60K 120K 180K 240K 300K SE +/- 3138.20, N = 3 SE +/- 2051.97, N = 3 SE +/- 1852.38, N = 3 265074.45 267212.95 269044.48 1. (CXX) g++ options: -O2 -levent_openssl -levent -lcrypto -lssl -lpthread -lz -lpcre
dav1d Video Input: Chimera 1080p OpenBenchmarking.org FPS, More Is Better dav1d 0.8.1 Video Input: Chimera 1080p 1 2 3 40 80 120 160 200 SE +/- 0.04, N = 3 SE +/- 1.76, N = 3 SE +/- 1.89, N = 3 184.25 184.17 183.88 MIN: 129.28 / MAX: 331.16 MIN: 127.79 / MAX: 333.51 MIN: 127.66 / MAX: 340.88 1. (CC) gcc options: -pthread -ldl -lm
LZ4 Compression Compression Level: 1 - Decompression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.9.3 Compression Level: 1 - Decompression Speed 1 2 3 2K 4K 6K 8K 10K SE +/- 6.35, N = 3 SE +/- 8.66, N = 3 SE +/- 57.65, N = 3 8722.4 8646.3 8690.1 1. (CC) gcc options: -O3
LZ4 Compression Compression Level: 1 - Compression Speed OpenBenchmarking.org MB/s, More Is Better LZ4 Compression 1.9.3 Compression Level: 1 - Compression Speed 1 2 3 2K 4K 6K 8K 10K SE +/- 51.24, N = 3 SE +/- 41.71, N = 3 SE +/- 92.88, N = 3 7994.30 8015.39 8022.48 1. (CC) gcc options: -O3
Sockperf Test: Latency Under Load OpenBenchmarking.org usec, Fewer Is Better Sockperf 3.4 Test: Latency Under Load 1 2 3 12 24 36 48 60 SE +/- 2.25, N = 20 SE +/- 1.88, N = 25 SE +/- 1.93, N = 25 53.78 52.76 50.70 1. (CXX) g++ options: --param -O3 -rdynamic -ldl -lpthread
RealSR-NCNN Scale: 4x - TAA: No OpenBenchmarking.org Seconds, Fewer Is Better RealSR-NCNN 20200818 Scale: 4x - TAA: No 1 2 3 14 28 42 56 70 SE +/- 0.02, N = 3 SE +/- 0.03, N = 3 SE +/- 0.02, N = 3 63.03 63.02 63.00
IndigoBench Acceleration: CPU - Scene: Bedroom OpenBenchmarking.org M samples/s, More Is Better IndigoBench 4.4 Acceleration: CPU - Scene: Bedroom 1 2 3 0.1121 0.2242 0.3363 0.4484 0.5605 SE +/- 0.001, N = 3 SE +/- 0.000, N = 3 SE +/- 0.002, N = 3 0.494 0.494 0.498
LibRaw Post-Processing Benchmark OpenBenchmarking.org Mpix/sec, More Is Better LibRaw 0.20 Post-Processing Benchmark 1 2 3 5 10 15 20 25 SE +/- 0.05, N = 3 SE +/- 0.11, N = 3 SE +/- 0.12, N = 3 19.36 19.66 19.79 1. (CXX) g++ options: -O2 -fopenmp -ljpeg -lz -lm
IndigoBench Acceleration: CPU - Scene: Supercar OpenBenchmarking.org M samples/s, More Is Better IndigoBench 4.4 Acceleration: CPU - Scene: Supercar 1 2 3 0.2491 0.4982 0.7473 0.9964 1.2455 SE +/- 0.004, N = 3 SE +/- 0.009, N = 3 SE +/- 0.002, N = 3 1.107 1.098 1.106
TensorFlow Lite Model: SqueezeNet OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2020-08-23 Model: SqueezeNet 1 2 3 100K 200K 300K 400K 500K SE +/- 158.13, N = 3 SE +/- 1459.42, N = 3 SE +/- 564.75, N = 3 467745 461955 467404
TensorFlow Lite Model: NASNet Mobile OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2020-08-23 Model: NASNet Mobile 1 2 3 70K 140K 210K 280K 350K SE +/- 995.45, N = 3 SE +/- 384.90, N = 3 SE +/- 504.11, N = 3 316112 317213 315790
TensorFlow Lite Model: Mobilenet Float OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2020-08-23 Model: Mobilenet Float 1 2 3 70K 140K 210K 280K 350K SE +/- 172.14, N = 3 SE +/- 1840.34, N = 3 SE +/- 2441.23, N = 3 309946 306270 313216
TensorFlow Lite Model: Mobilenet Quant OpenBenchmarking.org Microseconds, Fewer Is Better TensorFlow Lite 2020-08-23 Model: Mobilenet Quant 1 2 3 70K 140K 210K 280K 350K SE +/- 1183.02, N = 3 SE +/- 1025.34, N = 3 SE +/- 1866.22, N = 3 328733 318685 327187
AOM AV1 Encoder Mode: Speed 6 Realtime OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.0 Encoder Mode: Speed 6 Realtime 1 2 3 3 6 9 12 15 SE +/- 0.04, N = 3 SE +/- 0.08, N = 3 SE +/- 0.13, N = 3 10.13 10.12 10.25 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
simdjson Throughput Test: LargeRandom OpenBenchmarking.org GB/s, More Is Better simdjson 0.7.1 Throughput Test: LargeRandom 1 2 3 0.0788 0.1576 0.2364 0.3152 0.394 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 0.35 0.35 0.35 1. (CXX) g++ options: -O3 -pthread
simdjson Throughput Test: PartialTweets OpenBenchmarking.org GB/s, More Is Better simdjson 0.7.1 Throughput Test: PartialTweets 1 2 3 0.1013 0.2026 0.3039 0.4052 0.5065 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 0.45 0.45 0.45 1. (CXX) g++ options: -O3 -pthread
simdjson Throughput Test: DistinctUserID OpenBenchmarking.org GB/s, More Is Better simdjson 0.7.1 Throughput Test: DistinctUserID 1 2 3 0.1035 0.207 0.3105 0.414 0.5175 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 0.46 0.46 0.46 1. (CXX) g++ options: -O3 -pthread
WebP Image Encode Encode Settings: Quality 100, Lossless, Highest Compression OpenBenchmarking.org Encode Time - Seconds, Fewer Is Better WebP Image Encode 1.1 Encode Settings: Quality 100, Lossless, Highest Compression 1 2 3 13 26 39 52 65 SE +/- 0.26, N = 3 SE +/- 0.02, N = 3 SE +/- 0.09, N = 3 57.67 57.22 57.45 1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
Darktable Test: Boat - Acceleration: CPU-only OpenBenchmarking.org Seconds, Fewer Is Better Darktable 3.2.1 Test: Boat - Acceleration: CPU-only 1 2 3 6 12 18 24 30 SE +/- 0.09, N = 3 SE +/- 0.25, N = 13 SE +/- 0.09, N = 3 25.21 25.90 25.45
rav1e Speed: 6 OpenBenchmarking.org Frames Per Second, More Is Better rav1e 0.4 Speed: 6 1 2 3 0.245 0.49 0.735 0.98 1.225 SE +/- 0.005, N = 3 SE +/- 0.001, N = 3 SE +/- 0.003, N = 3 1.082 1.083 1.089
OCRMyPDF Processing 60 Page PDF Document OpenBenchmarking.org Seconds, Fewer Is Better OCRMyPDF 10.3.1+dfsg Processing 60 Page PDF Document 1 2 3 12 24 36 48 60 SE +/- 0.13, N = 3 SE +/- 0.08, N = 3 SE +/- 0.10, N = 3 52.68 52.74 52.99
WavPack Audio Encoding WAV To WavPack OpenBenchmarking.org Seconds, Fewer Is Better WavPack Audio Encoding 5.3 WAV To WavPack 1 2 3 4 8 12 16 20 SE +/- 0.01, N = 5 SE +/- 0.01, N = 5 SE +/- 0.09, N = 21 15.08 15.08 15.17 1. (CXX) g++ options: -rdynamic
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU 1 2 3 5 10 15 20 25 SE +/- 0.26, N = 3 SE +/- 0.18, N = 15 SE +/- 0.14, N = 3 22.54 22.31 22.66 MIN: 17.75 MIN: 17.69 MIN: 17.69 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
simdjson Throughput Test: Kostya OpenBenchmarking.org GB/s, More Is Better simdjson 0.7.1 Throughput Test: Kostya 1 2 3 0.0855 0.171 0.2565 0.342 0.4275 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 0.38 0.38 0.38 1. (CXX) g++ options: -O3 -pthread
eSpeak-NG Speech Engine Text-To-Speech Synthesis OpenBenchmarking.org Seconds, Fewer Is Better eSpeak-NG Speech Engine 20200907 Text-To-Speech Synthesis 1 2 3 8 16 24 32 40 SE +/- 0.11, N = 4 SE +/- 0.12, N = 4 SE +/- 0.11, N = 4 35.13 35.32 35.31 1. (CC) gcc options: -O2 -std=c99
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU 1 2 3 4 8 12 16 20 SE +/- 0.22, N = 15 SE +/- 0.18, N = 15 SE +/- 0.09, N = 3 14.82 15.08 15.56 MIN: 11.81 MIN: 12.35 MIN: 13.12 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
AOM AV1 Encoder Mode: Speed 6 Two-Pass OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.0 Encoder Mode: Speed 6 Two-Pass 1 2 3 0.504 1.008 1.512 2.016 2.52 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 2.22 2.22 2.24 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Zstd Compression Compression Level: 3 OpenBenchmarking.org MB/s, More Is Better Zstd Compression 1.4.5 Compression Level: 3 1 2 3 500 1000 1500 2000 2500 SE +/- 27.78, N = 3 SE +/- 16.00, N = 3 SE +/- 7.82, N = 3 2346.0 2358.0 2324.2 1. (CC) gcc options: -O3 -pthread -lz -llzma
Caffe Model: AlexNet - Acceleration: CPU - Iterations: 100 OpenBenchmarking.org Milli-Seconds, Fewer Is Better Caffe 2020-02-13 Model: AlexNet - Acceleration: CPU - Iterations: 100 1 2 3 9K 18K 27K 36K 45K SE +/- 193.32, N = 3 SE +/- 90.86, N = 3 SE +/- 137.35, N = 3 41877 41672 41573 1. (CXX) g++ options: -fPIC -O3 -rdynamic -lglog -lgflags -lprotobuf -lpthread -lsz -lz -ldl -lm -llmdb -lopenblas
PHPBench PHP Benchmark Suite OpenBenchmarking.org Score, More Is Better PHPBench 0.8.1 PHP Benchmark Suite 1 2 3 110K 220K 330K 440K 550K SE +/- 423.52, N = 3 SE +/- 1952.23, N = 3 SE +/- 2233.09, N = 3 508106 506055 504159
Kvazaar Video Input: Bosphorus 1080p - Video Preset: Very Fast OpenBenchmarking.org Frames Per Second, More Is Better Kvazaar 2.0 Video Input: Bosphorus 1080p - Video Preset: Very Fast 1 2 3 4 8 12 16 20 SE +/- 0.06, N = 3 SE +/- 0.06, N = 3 SE +/- 0.03, N = 3 15.60 15.52 15.62 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
RNNoise OpenBenchmarking.org Seconds, Fewer Is Better RNNoise 2020-06-28 1 2 3 5 10 15 20 25 SE +/- 0.03, N = 3 SE +/- 0.23, N = 8 SE +/- 0.35, N = 3 22.24 22.69 22.58 1. (CC) gcc options: -O2 -pedantic -fvisibility=hidden
rav1e Speed: 10 OpenBenchmarking.org Frames Per Second, More Is Better rav1e 0.4 Speed: 10 1 2 3 0.5938 1.1876 1.7814 2.3752 2.969 SE +/- 0.015, N = 3 SE +/- 0.008, N = 3 SE +/- 0.007, N = 3 2.573 2.566 2.639
oneDNN Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU 1 2 3 3 6 9 12 15 SE +/- 0.18, N = 15 SE +/- 0.17, N = 15 SE +/- 0.06, N = 3 13.07 13.17 13.37 MIN: 10.63 MIN: 10.78 MIN: 12.28 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
AOM AV1 Encoder Mode: Speed 4 Two-Pass OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.0 Encoder Mode: Speed 4 Two-Pass 1 2 3 0.315 0.63 0.945 1.26 1.575 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 1.38 1.38 1.40 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Unpacking Firefox Extracting: firefox-84.0.source.tar.xz OpenBenchmarking.org Seconds, Fewer Is Better Unpacking Firefox 84.0 Extracting: firefox-84.0.source.tar.xz 1 2 3 6 12 18 24 30 SE +/- 0.03, N = 4 SE +/- 0.03, N = 4 SE +/- 0.08, N = 4 23.85 23.80 23.81
Crafty Elapsed Time OpenBenchmarking.org Nodes Per Second, More Is Better Crafty 25.2 Elapsed Time 1 2 3 1.4M 2.8M 4.2M 5.6M 7M SE +/- 2855.69, N = 3 SE +/- 20483.82, N = 3 SE +/- 23149.53, N = 3 6274275 6255015 6322069 1. (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm
x265 Video Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better x265 3.4 Video Input: Bosphorus 1080p 1 2 3 5 10 15 20 25 SE +/- 0.16, N = 3 SE +/- 0.07, N = 3 SE +/- 0.11, N = 3 19.49 19.60 19.71 1. (CXX) g++ options: -O3 -rdynamic -lpthread -lrt -ldl -lnuma
Google SynthMark Test: VoiceMark_100 OpenBenchmarking.org Voices, More Is Better Google SynthMark 20201109 Test: VoiceMark_100 1 2 3 130 260 390 520 650 SE +/- 1.56, N = 3 SE +/- 0.89, N = 3 SE +/- 1.20, N = 3 596.25 596.62 593.91 1. (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast
Waifu2x-NCNN Vulkan Scale: 2x - Denoise: 3 - TAA: Yes OpenBenchmarking.org Seconds, Fewer Is Better Waifu2x-NCNN Vulkan 20200818 Scale: 2x - Denoise: 3 - TAA: Yes 1 2 3 6 12 18 24 30 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 26.68 26.69 26.67
Monkey Audio Encoding WAV To APE OpenBenchmarking.org Seconds, Fewer Is Better Monkey Audio Encoding 3.99.6 WAV To APE 1 2 3 4 8 12 16 20 SE +/- 0.05, N = 5 SE +/- 0.08, N = 5 SE +/- 0.05, N = 5 15.96 15.99 16.00 1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
Darktable Test: Masskrug - Acceleration: CPU-only OpenBenchmarking.org Seconds, Fewer Is Better Darktable 3.2.1 Test: Masskrug - Acceleration: CPU-only 1 2 3 6 12 18 24 30 SE +/- 0.03, N = 3 SE +/- 0.04, N = 3 SE +/- 0.08, N = 3 24.17 24.52 24.19
WebP Image Encode Encode Settings: Quality 100, Lossless OpenBenchmarking.org Encode Time - Seconds, Fewer Is Better WebP Image Encode 1.1 Encode Settings: Quality 100, Lossless 1 2 3 6 12 18 24 30 SE +/- 0.03, N = 3 SE +/- 0.10, N = 3 SE +/- 0.21, N = 3 24.89 24.96 24.90 1. (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
AOM AV1 Encoder Mode: Speed 8 Realtime OpenBenchmarking.org Frames Per Second, More Is Better AOM AV1 2.0 Encoder Mode: Speed 8 Realtime 1 2 3 6 12 18 24 30 SE +/- 0.02, N = 3 SE +/- 0.03, N = 3 SE +/- 0.02, N = 3 27.20 27.18 27.29 1. (CXX) g++ options: -O3 -std=c++11 -U_FORTIFY_SOURCE -lm -lpthread
Darktable Test: Server Room - Acceleration: CPU-only OpenBenchmarking.org Seconds, Fewer Is Better Darktable 3.2.1 Test: Server Room - Acceleration: CPU-only 1 2 3 5 10 15 20 25 SE +/- 0.21, N = 3 SE +/- 0.12, N = 3 SE +/- 0.19, N = 3 20.70 21.01 20.74
Kvazaar Video Input: Bosphorus 1080p - Video Preset: Ultra Fast OpenBenchmarking.org Frames Per Second, More Is Better Kvazaar 2.0 Video Input: Bosphorus 1080p - Video Preset: Ultra Fast 1 2 3 6 12 18 24 30 SE +/- 0.16, N = 3 SE +/- 0.05, N = 3 SE +/- 0.02, N = 3 27.01 27.07 27.05 1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O2 -lpthread -lm -lrt
Dolfyn Computational Fluid Dynamics OpenBenchmarking.org Seconds, Fewer Is Better Dolfyn 0.527 Computational Fluid Dynamics 1 2 3 5 10 15 20 25 SE +/- 0.07, N = 3 SE +/- 0.05, N = 3 SE +/- 0.04, N = 3 21.07 21.14 20.99
dav1d Video Input: Summer Nature 1080p OpenBenchmarking.org FPS, More Is Better dav1d 0.8.1 Video Input: Summer Nature 1080p 1 2 3 40 80 120 160 200 SE +/- 0.82, N = 3 SE +/- 0.30, N = 3 SE +/- 0.36, N = 3 182.89 183.50 184.81 MIN: 167.96 / MAX: 203.4 MIN: 169.65 / MAX: 201.98 MIN: 171.93 / MAX: 203.27 1. (CC) gcc options: -pthread -ldl -lm
TNN Target: CPU - Model: SqueezeNet v1.1 OpenBenchmarking.org ms, Fewer Is Better TNN 0.2.3 Target: CPU - Model: SqueezeNet v1.1 1 2 3 60 120 180 240 300 SE +/- 0.12, N = 3 SE +/- 0.28, N = 3 SE +/- 0.02, N = 3 287.26 287.06 286.14 MIN: 286.22 / MAX: 288.27 MIN: 286.17 / MAX: 287.91 MIN: 285.41 / MAX: 286.85 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
Coremark CoreMark Size 666 - Iterations Per Second OpenBenchmarking.org Iterations/Sec, More Is Better Coremark 1.0 CoreMark Size 666 - Iterations Per Second 1 2 3 20K 40K 60K 80K 100K SE +/- 816.07, N = 3 SE +/- 290.85, N = 3 SE +/- 373.76, N = 3 102524.44 101765.57 102339.41 1. (CC) gcc options: -O2 -lrt" -lrt
TNN Target: CPU - Model: MobileNet v2 OpenBenchmarking.org ms, Fewer Is Better TNN 0.2.3 Target: CPU - Model: MobileNet v2 1 2 3 60 120 180 240 300 SE +/- 0.05, N = 3 SE +/- 0.44, N = 3 SE +/- 0.29, N = 3 279.34 279.63 279.28 MIN: 276.5 / MAX: 293.72 MIN: 276.69 / MAX: 295.47 MIN: 276.39 / MAX: 296.43 1. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
ASTC Encoder Preset: Medium OpenBenchmarking.org Seconds, Fewer Is Better ASTC Encoder 2.0 Preset: Medium 1 2 3 3 6 9 12 15 SE +/- 0.03, N = 3 SE +/- 0.04, N = 3 SE +/- 0.03, N = 3 12.77 12.83 12.75 1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
GIMP Test: unsharp-mask OpenBenchmarking.org Seconds, Fewer Is Better GIMP 2.10.18 Test: unsharp-mask 1 2 3 4 8 12 16 20 SE +/- 0.02, N = 3 SE +/- 0.04, N = 3 SE +/- 0.02, N = 3 17.33 17.32 17.33
Redis Test: SET OpenBenchmarking.org Requests Per Second, More Is Better Redis 6.0.9 Test: SET 1 2 3 300K 600K 900K 1200K 1500K SE +/- 5888.95, N = 3 SE +/- 19097.96, N = 3 SE +/- 15253.86, N = 8 1489969.25 1486411.63 1472539.67 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
GIMP Test: auto-levels OpenBenchmarking.org Seconds, Fewer Is Better GIMP 2.10.18 Test: auto-levels 1 2 3 4 8 12 16 20 SE +/- 0.01, N = 3 SE +/- 0.11, N = 3 SE +/- 0.06, N = 3 15.92 15.85 15.89
oneDNN Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU 1 2 3 4 8 12 16 20 SE +/- 0.17, N = 3 SE +/- 0.19, N = 3 SE +/- 0.24, N = 3 16.54 16.56 16.61 MIN: 13.54 MIN: 13.67 MIN: 13.5 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU 1 2 3 4 8 12 16 20 SE +/- 0.11, N = 3 SE +/- 0.12, N = 3 SE +/- 0.12, N = 3 14.63 14.87 15.20 MIN: 13.38 MIN: 13.38 MIN: 13.47 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Timed MAFFT Alignment Multiple Sequence Alignment - LSU RNA OpenBenchmarking.org Seconds, Fewer Is Better Timed MAFFT Alignment 7.471 Multiple Sequence Alignment - LSU RNA 1 2 3 4 8 12 16 20 SE +/- 0.21, N = 3 SE +/- 0.06, N = 3 SE +/- 0.10, N = 3 15.04 15.00 15.04 1. (CC) gcc options: -std=c99 -O3 -lm -lpthread
Opus Codec Encoding WAV To Opus Encode OpenBenchmarking.org Seconds, Fewer Is Better Opus Codec Encoding 1.3.1 WAV To Opus Encode 1 2 3 2 4 6 8 10 SE +/- 0.033, N = 5 SE +/- 0.030, N = 5 SE +/- 0.021, N = 5 8.936 8.923 8.915 1. (CXX) g++ options: -fvisibility=hidden -logg -lm
GIMP Test: rotate OpenBenchmarking.org Seconds, Fewer Is Better GIMP 2.10.18 Test: rotate 1 2 3 4 8 12 16 20 SE +/- 0.01, N = 3 SE +/- 0.04, N = 3 SE +/- 0.03, N = 3 14.49 14.47 14.42
Sockperf Test: Latency Ping Pong OpenBenchmarking.org usec, Fewer Is Better Sockperf 3.4 Test: Latency Ping Pong 1 2 3 2 4 6 8 10 SE +/- 0.065, N = 5 SE +/- 0.049, N = 5 SE +/- 0.074, N = 5 6.927 6.751 6.790 1. (CXX) g++ options: --param -O3 -rdynamic -ldl -lpthread
Sockperf Test: Throughput OpenBenchmarking.org Messages Per Second, More Is Better Sockperf 3.4 Test: Throughput 1 2 3 120K 240K 360K 480K 600K SE +/- 6800.99, N = 5 SE +/- 3595.18, N = 5 SE +/- 3270.45, N = 5 555055 559663 557665 1. (CXX) g++ options: --param -O3 -rdynamic -ldl -lpthread
Redis Test: GET OpenBenchmarking.org Requests Per Second, More Is Better Redis 6.0.9 Test: GET 1 2 3 400K 800K 1200K 1600K 2000K SE +/- 35016.53, N = 3 SE +/- 23617.43, N = 5 SE +/- 22292.08, N = 3 2064794.83 1931045.20 1930168.38 1. (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
GIMP Test: resize OpenBenchmarking.org Seconds, Fewer Is Better GIMP 2.10.18 Test: resize 1 2 3 3 6 9 12 15 SE +/- 0.09, N = 3 SE +/- 0.07, N = 3 SE +/- 0.10, N = 3 12.86 12.87 12.83
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.0 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU 1 2 3 2 4 6 8 10 SE +/- 0.02489, N = 3 SE +/- 0.00711, N = 3 SE +/- 0.00751, N = 3 7.35744 7.36061 7.31351 MIN: 6.35 MIN: 6.34 MIN: 6.35 1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Basis Universal 1.12 - Settings: UASTC Level 0 (Seconds, Fewer Is Better)
  1: 11.90 (SE +/- 0.02, N = 3)
  2: 12.05 (SE +/- 0.09, N = 3)
  3: 11.86 (SE +/- 0.00, N = 3)
  (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Algebraic Multi-Grid Benchmark 1.2 (Figure Of Merit, More Is Better)
  1: 213491533 (SE +/- 326563.65, N = 3)
  2: 214232233 (SE +/- 543041.48, N = 3)
  3: 214072633 (SE +/- 483314.65, N = 3)
  (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi
Redis 6.0.9 - Test: LPUSH (Requests Per Second, More Is Better)
  1: 1216336.46 (SE +/- 16215.78, N = 3)
  2: 1213155.04 (SE +/- 2985.46, N = 3)
  3: 1223284.42 (SE +/- 4396.42, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: LPOP (Requests Per Second, More Is Better)
  1: 2261210.92 (SE +/- 16948.58, N = 3)
  2: 1258380.50 (SE +/- 11741.76, N = 3)
  3: 1275489.17 (SE +/- 8952.66, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9 - Test: SADD (Requests Per Second, More Is Better)
  1: 1735687.33 (SE +/- 11595.14, N = 3)
  2: 1758200.50 (SE +/- 21162.12, N = 3)
  3: 1734495.83 (SE +/- 4337.84, N = 3)
  (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
ASTC Encoder 2.0 - Preset: Fast (Seconds, Fewer Is Better)
  1: 9.75 (SE +/- 0.07, N = 3)
  2: 9.74 (SE +/- 0.04, N = 3)
  3: 9.80 (SE +/- 0.06, N = 3)
  (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  1: 5.81417 (SE +/- 0.02077, N = 3; MIN: 5.17)
  2: 5.84129 (SE +/- 0.01362, N = 3; MIN: 5.26)
  3: 5.79119 (SE +/- 0.01488, N = 3; MIN: 5.24)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
WebP Image Encode 1.1 - Encode Settings: Quality 100, Highest Compression (Encode Time - Seconds, Fewer Is Better)
  1: 8.872 (SE +/- 0.001, N = 3)
  2: 8.871 (SE +/- 0.011, N = 3)
  3: 8.869 (SE +/- 0.013, N = 3)
  (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
yquake2 7.45 - Renderer: Software CPU - Resolution: 1920 x 1080 (Frames Per Second, More Is Better)
  1: 92.9 (SE +/- 0.38, N = 3)
  2: 93.3 (SE +/- 0.03, N = 3)
  3: 93.3 (SE +/- 0.03, N = 3)
  (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
LAMMPS Molecular Dynamics Simulator 29Oct2020 - Model: Rhodopsin Protein (ns/day, More Is Better)
  1: 2.603 (SE +/- 0.015, N = 3)
  2: 2.586 (SE +/- 0.030, N = 3)
  3: 2.613 (SE +/- 0.031, N = 3)
  (CXX) g++ options: -O3 -pthread -lm
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  1: 38.84 (SE +/- 0.18, N = 3; MIN: 35.67)
  2: 38.97 (SE +/- 0.23, N = 3; MIN: 35.93)
  3: 38.51 (SE +/- 0.51, N = 3; MIN: 35.63)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 23.02 (SE +/- 0.34, N = 3; MIN: 19.03)
  2: 23.68 (SE +/- 0.20, N = 3; MIN: 20.06)
  3: 23.67 (SE +/- 0.14, N = 3; MIN: 20.01)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
OSBench - Test: Create Files (us Per Event, Fewer Is Better)
  1: 18.25 (SE +/- 0.24, N = 3)
  2: 18.32 (SE +/- 0.21, N = 3)
  3: 18.44 (SE +/- 0.12, N = 3)
  (CC) gcc options: -lm
LULESH 2.0.3 (z/s, More Is Better)
  1: 1180.10 (SE +/- 0.53, N = 3)
  2: 1208.39 (SE +/- 0.69, N = 3)
  3: 1208.02 (SE +/- 2.03, N = 3)
  (CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi_cxx -lmpi
OSBench - Test: Memory Allocations (Ns Per Event, Fewer Is Better)
  1: 81.74 (SE +/- 0.02, N = 3)
  2: 82.00 (SE +/- 0.10, N = 3)
  3: 85.63 (SE +/- 1.45, N = 3)
  (CC) gcc options: -lm
OSBench - Test: Launch Programs (us Per Event, Fewer Is Better)
  1: 81.52 (SE +/- 0.27, N = 3)
  2: 81.97 (SE +/- 0.05, N = 3)
  3: 82.07 (SE +/- 0.09, N = 3)
  (CC) gcc options: -lm
OSBench - Test: Create Processes (us Per Event, Fewer Is Better)
  1: 26.19 (SE +/- 0.10, N = 3)
  2: 26.48 (SE +/- 0.03, N = 3)
  3: 26.86 (SE +/- 0.21, N = 3)
  (CC) gcc options: -lm
OSBench - Test: Create Threads (us Per Event, Fewer Is Better)
  1: 14.92 (SE +/- 0.03, N = 3)
  2: 14.82 (SE +/- 0.09, N = 3)
  3: 14.91 (SE +/- 0.14, N = 3)
  (CC) gcc options: -lm
Waifu2x-NCNN Vulkan 20200818 - Scale: 2x - Denoise: 3 - TAA: No (Seconds, Fewer Is Better)
  1: 4.115 (SE +/- 0.023, N = 3)
  2: 4.110 (SE +/- 0.005, N = 3)
  3: 4.097 (SE +/- 0.005, N = 3)
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  1: 29.71 (SE +/- 0.09, N = 3; MIN: 26.4)
  2: 29.89 (SE +/- 0.14, N = 3; MIN: 26.42)
  3: 29.12 (SE +/- 0.29, N = 3; MIN: 26.35)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  1: 30.81 (SE +/- 0.15, N = 3; MIN: 22.57)
  2: 31.41 (SE +/- 0.28, N = 3; MIN: 22.58)
  3: 30.82 (SE +/- 0.14, N = 3; MIN: 22.62)
  (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
WebP Image Encode 1.1 - Encode Settings: Quality 100 (Encode Time - Seconds, Fewer Is Better)
  1: 2.595 (SE +/- 0.009, N = 3)
  2: 2.601 (SE +/- 0.007, N = 3)
  3: 2.590 (SE +/- 0.004, N = 3)
  (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
FFTE 7.0 - N=256, 3D Complex FFT Routine (MFLOPS, More Is Better)
  1: 15392.81 (SE +/- 120.97, N = 3)
  2: 15755.59 (SE +/- 110.35, N = 3)
  3: 15437.47 (SE +/- 161.49, N = 3)
  (F9X) gfortran options: -O3 -fomit-frame-pointer -fopenmp
WebP Image Encode 1.1 - Encode Settings: Default (Encode Time - Seconds, Fewer Is Better)
  1: 1.648 (SE +/- 0.002, N = 3)
  2: 1.662 (SE +/- 0.003, N = 3)
  3: 1.657 (SE +/- 0.009, N = 3)
  (CC) gcc options: -fvisibility=hidden -O2 -pthread -lm -ljpeg -lpng16 -ltiff
yquake2 7.45 - Renderer: OpenGL 3.x - Resolution: 1920 x 1080 (Frames Per Second, More Is Better)
  1: 814.1 (SE +/- 4.11, N = 3)
  2: 807.2 (SE +/- 3.33, N = 3)
  3: 807.9 (SE +/- 4.89, N = 3)
  (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
Darktable 3.2.1 - Test: Server Rack - Acceleration: CPU-only (Seconds, Fewer Is Better)
  1: 0.339 (SE +/- 0.001, N = 3)
  2: 0.342 (SE +/- 0.001, N = 3)
  3: 0.344 (SE +/- 0.004, N = 3)
Phoronix Test Suite v10.8.5