HP Zbook Intel Core i9-10885H testing with a HP 8736 (S91 Ver. 01.02.01 BIOS) and NVIDIA Quadro RTX 5000 with Max-Q Design 16GB on Ubuntu 20.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2101076-HA-HPZBOOK6247&sor&grr .
HP Zbook system details
Runs: r1, r2, r3
  Processor: Intel Core i9-10885H @ 5.30GHz (8 Cores / 16 Threads)
  Motherboard: HP 8736 (S91 Ver. 01.02.01 BIOS)
  Chipset: Intel Comet Lake PCH
  Memory: 32GB
  Disk: 2048GB KXG50PNV2T04 KIOXIA
  Graphics: NVIDIA Quadro RTX 5000 with Max-Q Design 16GB (600/6000MHz)
  Audio: Intel Comet Lake PCH cAVS
  Network: Intel Wi-Fi 6 AX201
  OS: Ubuntu 20.04
  Kernel: 5.6.0-1034-oem (x86_64)
  Desktop: GNOME Shell 3.36.4
  Display Server: X Server 1.20.8
  Display Driver: NVIDIA 450.80.02
  OpenGL: 4.6.0
  OpenCL: OpenCL 1.2 CUDA 11.0.228
  Vulkan: 1.2.133
  Compiler: GCC 9.3.0 + CUDA 10.1
  File-System: ext4
  Screen Resolution: 1920x1080
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Disk Details - NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Details - Scaling Governor: intel_pstate powersave - CPU Microcode: 0xe0 - Thermald 1.9.1
OpenCL Details - GPU Compute Cores: 3072
Python Details - Python 3.8.3
Security Details - itlb_multihit: KVM: Mitigation of Split huge pages + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
HP Zbook blender: Barbershop - NVIDIA OptiX basis: UASTC Level 2 + RDO Post-Processing blender: Barbershop - CUDA blender: Pabellon Barcelona - CUDA mnn: inception-v3 mnn: mobilenet-v1-1.0 mnn: MobileNetV2_224 mnn: resnet-v2-50 mnn: SqueezeNetV1.0 ai-benchmark: Device AI Score ai-benchmark: Device Training Score ai-benchmark: Device Inference Score redshift: ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.0 - Default - RaiNyMore2 astcenc: Exhaustive lczero: OpenCL brl-cad: VGR Performance Metric vkfft: ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.3 - Default - RaiNyMore2 gromacs: Water Benchmark asmfish: 1024 Hash Memory, 26 Depth stockfish: Total Time unigine-heaven: 1920 x 1080 - Fullscreen - OpenGL blender: Classroom - CUDA dav1d: Chimera 1080p 10-bit build2: Time To Compile numpy: blender: Pabellon Barcelona - NVIDIA OptiX hpcg: unigine-super: 1920 x 1080 - Fullscreen - Ultra - OpenGL unigine-super: 1920 x 1080 - Fullscreen - High - OpenGL unigine-super: 1920 x 1080 - Fullscreen - Medium - OpenGL unigine-super: 1920 x 1080 - Fullscreen - Low - OpenGL graphics-magick: Swirl clomp: Static OMP Speedup blender: Fishy Cat - CUDA warsow: 1920 x 1080 luxcorerender-cl: LuxCore Benchmark tensorflow-lite: Inception ResNet V2 tensorflow-lite: Inception V4 luxcorerender-cl: Food build-linux-kernel: Time To Compile octanebench: Total Score openvino: Person Detection 0106 FP32 - CPU openvino: Person Detection 0106 FP32 - CPU dav1d: Chimera 1080p luxcorerender-cl: DLSC embree: Pathtracer ISPC - Crown blender: Classroom - NVIDIA OptiX basis: UASTC Level 3 fahbench: hmmer: Pfam Database Search embree: Pathtracer - Crown realsr-ncnn: 4x - Yes build-ffmpeg: Time To Compile onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU blender: BMW27 - NVIDIA OptiX blender: BMW27 - CUDA compress-zstd: 19 gegl: Cartoon openvino: Age Gender Recognition Retail 0013 FP16 - CPU 
openvino: Age Gender Recognition Retail 0013 FP16 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU ncnn: CPU - regnety_400m ncnn: CPU - squeezenet_ssd ncnn: CPU - yolov4-tiny ncnn: CPU - resnet50 ncnn: CPU - alexnet ncnn: CPU - resnet18 ncnn: CPU - vgg16 ncnn: CPU - googlenet ncnn: CPU - blazeface ncnn: CPU - efficientnet-b0 ncnn: CPU - mnasnet ncnn: CPU - shufflenet-v2 ncnn: CPU-v3-v3 - mobilenet-v3 ncnn: CPU-v2-v2 - mobilenet-v2 ncnn: CPU - mobilenet ncnn: Vulkan GPU - regnety_400m ncnn: Vulkan GPU - squeezenet_ssd ncnn: Vulkan GPU - yolov4-tiny ncnn: Vulkan GPU - resnet50 ncnn: Vulkan GPU - alexnet ncnn: Vulkan GPU - resnet18 ncnn: Vulkan GPU - vgg16 ncnn: Vulkan GPU - googlenet ncnn: Vulkan GPU - blazeface ncnn: Vulkan GPU - efficientnet-b0 ncnn: Vulkan GPU - mnasnet ncnn: Vulkan GPU - shufflenet-v2 ncnn: Vulkan GPU-v3-v3 - mobilenet-v3 ncnn: Vulkan GPU-v2-v2 - mobilenet-v2 ncnn: Vulkan GPU - mobilenet embree: Pathtracer - Asian Dragon rawtherapee: Total Benchmark Time openvino: Age Gender Recognition Retail 0013 FP32 - CPU openvino: Age Gender Recognition Retail 0013 FP32 - CPU openvino: Face Detection 0106 FP32 - CPU openvino: Face Detection 0106 FP32 - CPU dav1d: Summer Nature 4K openvino: Person Detection 0106 FP16 - CPU openvino: Person Detection 0106 FP16 - CPU build-eigen: Time To Compile compress-lz4: 9 - Decompression Speed compress-lz4: 9 - Compression Speed openvino: Face Detection 0106 FP16 - CPU openvino: Face Detection 0106 FP16 - CPU embree: Pathtracer ISPC - Asian Dragon compress-lz4: 3 - Decompression Speed compress-lz4: 3 - Compression Speed node-web-tooling: simdjson: Kostya ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.0 - Default - Multeasymap leveldb: Seek Rand blender: Fishy Cat - NVIDIA OptiX indigobench: CPU - Bedroom indigobench: CPU - Supercar tensorflow-lite: SqueezeNet tensorflow-lite: Mobilenet Quant 
tensorflow-lite: NASNet Mobile tensorflow-lite: Mobilenet Float ddnet: 1920 x 1080 - Fullscreen - OpenGL 3.3 - Default - Multeasymap graphics-magick: Sharpen graphics-magick: Enhanced graphics-magick: Noise-Gaussian graphics-magick: Resizing graphics-magick: HWB Color Space graphics-magick: Rotate astcenc: Thorough basis: ETC1S gegl: Wavelet Blur rav1e: 1 deepspeech: CPU luxcorerender-cl: Rainbow Colors and Prism rav1e: 5 basis: UASTC Level 2 astcenc: Medium gegl: Color Enhance simdjson: LargeRand simdjson: PartialTweets simdjson: DistinctUserID sqlite-speedtest: Timed Time - Size 1,000 mafft: Multiple Sequence Alignment - LSU RNA coremark: CoreMark Size 666 - Iterations Per Second leveldb: Rand Read dav1d: Summer Nature 1080p rav1e: 6 plaidml: No - Inference - DenseNet 201 - OpenCL vkresample: 2x - Double gegl: Rotate 90 Degrees gegl: Antialias espeak: Text-To-Speech Synthesis cryptsetup: Twofish-XTS 512b Encryption cryptsetup: Twofish-XTS 512b Decryption cryptsetup: Serpent-XTS 512b Decryption cryptsetup: Serpent-XTS 512b Encryption cryptsetup: AES-XTS 512b Decryption cryptsetup: AES-XTS 512b Encryption cryptsetup: Twofish-XTS 256b Decryption cryptsetup: Twofish-XTS 256b Encryption cryptsetup: Serpent-XTS 256b Decryption cryptsetup: Serpent-XTS 256b Encryption cryptsetup: AES-XTS 256b Decryption cryptsetup: AES-XTS 256b Encryption cryptsetup: PBKDF2-whirlpool cryptsetup: PBKDF2-sha512 leveldb: Seq Fill leveldb: Seq Fill leveldb: Rand Delete tnn: CPU - MobileNet v2 clpeak: Double-Precision Double darktable: Masskrug - CPU-only onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU gegl: Scale compress-lz4: 1 - Decompression Speed compress-lz4: 1 - Compression Speed compress-zstd: 3 gegl: Reflect gegl: Tile Glass gegl: Crop rav1e: 10 namd-cuda: ATPase Simulation - 327,506 Atoms redis: SADD phpbench: PHP Benchmark Suite rnnoise: unpack-firefox: firefox-84.0.source.tar.xz lammps: Rhodopsin Protein onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU 
onednn: Deconvolution Batch shapes_1d - f32 - CPU inkscape: SVG Files To PNG redis: LPOP crafty: Elapsed Time tnn: CPU - SqueezeNet v1.1 encode-ape: WAV To APE betsy: ETC2 RGB - Highest darktable: Boat - CPU-only realsr-ncnn: 4x - No hashcat: SHA-512 onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU yquake2: OpenGL 1.x - 1920 x 1080 yquake2: OpenGL 3.x - 1920 x 1080 encode-opus: WAV To Opus Encode onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU redis: LPUSH betsy: ETC1 - Highest yquake2: Software CPU - 1920 x 1080 vkresample: 2x - Single astcenc: Fast plaidml: No - Inference - IMDB LSTM - OpenCL redis: SET redis: GET onednn: IP Shapes 3D - u8s8f32 - CPU onednn: IP Shapes 3D - f32 - CPU clpeak: Integer Compute INT basis: UASTC Level 0 leveldb: Hot Read rodinia: OpenCL Particle Filter hashcat: TrueCrypt RIPEMD160 + XTS hashcat: SHA1 hashcat: MD5 onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU cl-mem: Write cl-mem: Copy cl-mem: Read plaidml: Yes - Inference - Mobilenet - OpenCL waifu2x-ncnn: 2x - 3 - Yes plaidml: No - Inference - Mobilenet - OpenCL darktable: Server Room - CPU-only neatbench: GPU leveldb: Rand Fill leveldb: Rand Fill leveldb: Overwrite leveldb: Overwrite mandelgpu: GPU clpeak: Single-Precision Float leveldb: Fill Sync leveldb: Fill Sync arrayfire: Conjugate Gradient OpenCL hashcat: 7-Zip viennacl: OpenCL LU Factorization clpeak: Global Memory Bandwidth darktable: Server Rack - CPU-only financebench: Black-Scholes OpenCL r1 r2 r3 1192.96 840.347 734.81 608.80 62.568 10.646 5.239 58.161 8.899 1546 816 730 461 170.36 447.99 13277 63909 25820 158.21 0.617 15984719 9703133 139.126 250.78 86.08 210.051 419.58 196.21 3.96177 25.1 65.9 90.4 177.7 207 3.7 168.87 955.6 2.26 4660197 5163190 1.27 151.656 189.085068 5069.44 0.79 489.84 2.70 7.0735 116.76 110.838 186.4611 105.526 
6.0806 99.812 100.257 7140.50 7155.41 7144.23 41.47 91.00 28.8 86.789 1.17 3442.78 3795.02 3795.81 3797.32 19.16 27.64 35.95 37.81 15.50 18.62 72.09 19.98 2.54 10.00 6.67 7.93 5.74 7.31 26.62 19.16 27.58 35.52 37.25 15.44 18.62 71.96 20.05 2.55 10.01 6.63 7.92 5.74 7.23 26.52 7.5555 80.586 1.21 3363.55 3202.53 1.26 112.75 4961.99 0.80 68.744 9679.8 55.72 3165.24 1.28 9.1343 9676.3 57.88 13.06 0.76 413.88 12.694 60.35 0.939 2.147 354892 236716 302594 239119 435.20 72 115 146 552 775 902 54.29 57.824 57.993 0.347 81.29517 5.30 1.069 55.499 7.68 54.114 0.5 0.86 0.89 49.547 10.497 223414.807558 9.620 460.02 1.444 110.07 256.867 37.697 36.556 26.474 483.0 482.7 871.7 878.0 3348.3 3346.8 482.5 482.0 872.3 874.1 4002.4 4005.6 816282 1919349 47.235 37.5 47.228 321.420 340.42 7.128 7.16575 3.17762 6.954 9823.2 8120.67 2833.6 28.183 28.243 8.900 3.422 0.22103 2660539.42 837911 22.084 16.028 5.198 8.96782 9.77594 20.996 3394660.2 9497414 272.907 10.512 8.016 15.914 14.734 1023100000 4.74772 9.87893 59.9 60 7.624 4.36381 4.45564 2041750.08 5.854 60.7 24.992 5.44 463.34 2375800.25 3248596.08 2.72558 12.4721 5504.35 7.288 6.946 7.115 301233 8585766667 24334866667 18.0080 21.6871 215.7 236.6 330.3 1819.24 6.020 1246.78 4.181 27.5 41.037 43.1 40.925 43.2 251986408.7 5940.64 3361.777 0.5 2.549 373667 68.2924 324.63 0.181 17.477667 1190.05 840.319 731.67 609.56 63.180 10.675 5.291 58.530 8.982 1544 814 730 460 169.30 449.37 13173 63822 25647 100.58 0.610 15974611 9839292 139.905 251.90 85.83 210.712 419.36 196.28 3.96068 25.4 66.5 90.6 178.1 207 2.5 167.96 967.9 2.31 4670567 5168183 1.32 152.208 189.101719 5079.89 0.79 486.46 2.77 6.9794 116.15 110.926 186.4777 105.572 6.0989 100.617 100.397 7159.42 7159.48 7154.66 38.07 90.82 28.7 87.319 1.19 3403.45 3797.05 3800.41 3799.45 18.91 27.51 35.59 37.30 15.46 18.71 71.91 20.01 2.60 9.05 5.96 6.95 5.81 7.22 26.63 17.15 27.52 35.51 37.34 15.53 18.33 71.82 18.20 2.29 9.02 5.86 6.98 5.73 7.22 26.53 7.5656 80.934 1.23 3307.53 3207.35 1.27 
112.03 4978.25 0.80 67.543 9664.8 56.07 3166.57 1.28 9.2596 9653.7 57.36 13.17 0.75 412.43 12.629 60.18 0.938 2.150 356034 237129 304756 239224 429.37 72 115 147 551 774 875 54.38 58.063 57.950 0.346 81.07316 5.39 1.064 55.742 7.61 54.312 0.5 0.87 0.88 50.268 10.564 223304.983286 9.692 459.61 1.440 109.98 257.062 37.541 36.557 27.178 486.4 485.7 878.1 882.1 3388.5 3381.9 486.3 487.4 876.6 881.4 4055.1 4080.5 830020 1943008 47.287 37.4 47.296 295.547 340.46 7.150 7.04404 3.16769 6.973 9839.9 8127.78 2831.0 28.496 28.242 8.839 3.404 0.22238 2628039.25 832417 21.316 16.137 5.169 9.00692 9.76468 21.048 2104092.33 9584148 264.948 10.861 7.912 15.870 14.656 1020000000 4.71457 9.77701 59.9 60 7.602 4.37852 4.47062 2094056.31 5.789 60.7 25.190 5.59 477.39 2413657.0 3012560.83 2.77670 12.4447 5519.39 7.345 7.099 7.055 301433 8544500000 24260200000 17.9035 21.6992 215.6 235.4 329.9 1823.06 6.102 1244.95 4.174 27.1 41.027 43.1 40.955 43.2 252826584.8 5858.32 3424.918 0.5 2.531 370400 64.2335 324.58 0.181 17.476 1192.80 841.228 733.02 608.62 63.563 10.658 5.285 58.786 8.944 1544 814 730 459 151.49 449.90 13416 64033 25683 130.66 0.614 16180674 9629353 139.184 251.80 85.95 210.945 417.03 196.41 3.95457 25.3 66.2 90.5 177.4 207 3.6 168.08 968.6 2.29 4677473 5178263 1.30 151.478 189.316553 5073.09 0.79 487.57 2.76 6.9976 116.26 111.040 186.6158 105.505 6.0641 100.748 100.203 7151.58 7169.03 7147.09 38.07 90.93 28.8 86.993 1.19 3405.92 3797.72 3798.12 3792.87 19.38 27.63 35.66 37.22 15.49 18.66 71.86 20.21 2.57 9.06 5.96 7.03 5.81 7.23 26.53 17.60 27.55 35.53 37.26 15.50 18.38 71.86 18.26 2.29 8.99 5.91 7.05 5.81 7.19 26.51 7.5496 80.712 1.22 3347.93 3212.10 1.27 112.65 5006.34 0.80 68.699 9695.2 57.01 3164.51 1.28 9.1967 9685.2 58.89 13.18 0.75 412.38 12.644 60.25 0.935 2.156 356258 237406 304079 239537 434.24 73 115 147 551 776 900 54.65 58.062 57.843 0.347 81.03983 5.41 1.064 55.766 7.58 54.101 0.5 0.86 0.88 50.261 10.608 223892.444726 9.573 459.71 1.443 109.99 257.615 37.691 
36.646 27.713 483.8 483.0 873.5 874.4 3362.9 3336.0 483.0 483.6 870.9 874.1 4026.9 4023.0 810352 1886103 47.415 37.3 47.388 299.396 340.59 7.155 7.14574 3.11291 7.000 9810.0 8079.18 2835.1 28.313 28.055 8.826 3.420 0.22171 2634908.83 829705 22.044 16.103 5.179 9.06628 9.73732 21.068 2809233.48 9560012 272.676 10.592 7.903 15.863 14.694 1016800000 4.73728 9.81238 59.9 60 7.616 4.38535 4.46656 2083566.29 5.792 60.6 25.225 5.63 478.73 2433543.8 3009326.75 2.74874 12.6089 5540.44 7.353 7.128 7.027 298133 8535333333 24196900000 18.0326 21.6210 214.8 235.1 329.9 1817.78 6.093 1247.93 4.178 27.6 40.981 43.2 40.760 43.4 252822614.4 5892.70 3386.084 0.5 2.548 366433 65.9180 324.78 0.181 17.476333 OpenBenchmarking.org
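The overview table above lists one value per run (r1, r2, r3) for each test. As a rough way to spot tests where the three runs disagree, the following Python sketch computes the spread between the highest and lowest run as a percentage of the mean. The helper name `relative_spread` and the choice of metric are assumptions made for illustration and are not part of the Phoronix Test Suite; the three sample rows are copied from the overview table.

```python
from statistics import mean


def relative_spread(values):
    """Return (max - min) / mean of the run values, as a percentage."""
    return (max(values) - min(values)) / mean(values) * 100.0


# Run values in r1, r2, r3 order, copied from the overview table.
runs = {
    "ddnet: OpenGL 3.3 - RaiNyMore2 (FPS)": [158.21, 100.58, 130.66],
    "clomp: Static OMP Speedup": [3.7, 2.5, 3.6],
    "blender: Barbershop - NVIDIA OptiX (s)": [1192.96, 1190.05, 1192.80],
}

for name, values in runs.items():
    print(f"{name}: {relative_spread(values):.1f}% spread across runs")
```

A large spread on its own does not say which run is closer to the truth; it only flags tests worth checking against the per-test standard errors.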
Blender 2.90 - Blend File: Barbershop - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  r2: 1190.05 (SE +/- 0.85, N = 3)
  r3: 1192.80 (SE +/- 2.01, N = 3)
  r1: 1192.96 (SE +/- 0.44, N = 3)
Basis Universal 1.12 - Settings: UASTC Level 2 + RDO Post-Processing (Seconds, Fewer Is Better)
  r2: 840.32 (SE +/- 0.35, N = 3)
  r1: 840.35 (SE +/- 0.74, N = 3)
  r3: 841.23 (SE +/- 0.62, N = 3)
  1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
Blender 2.90 - Blend File: Barbershop - Compute: CUDA (Seconds, Fewer Is Better)
  r2: 731.67 (SE +/- 0.26, N = 3)
  r3: 733.02 (SE +/- 0.41, N = 3)
  r1: 734.81 (SE +/- 0.24, N = 3)
Blender 2.90 - Blend File: Pabellon Barcelona - Compute: CUDA (Seconds, Fewer Is Better)
  r3: 608.62 (SE +/- 0.06, N = 3)
  r1: 608.80 (SE +/- 0.04, N = 3)
  r2: 609.56 (SE +/- 0.02, N = 3)
Mobile Neural Network 2020-09-17 - Model: inception-v3 (ms, Fewer Is Better)
  r1: 62.57 (SE +/- 0.15, N = 10; MIN: 60.82 / MAX: 96.05)
  r2: 63.18 (SE +/- 0.18, N = 11; MIN: 61.02 / MAX: 104.39)
  r3: 63.56 (SE +/- 0.22, N = 10; MIN: 60.92 / MAX: 102.85)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2020-09-17 - Model: mobilenet-v1-1.0 (ms, Fewer Is Better)
  r1: 10.65 (SE +/- 0.01, N = 10; MIN: 10.33 / MAX: 34.53)
  r3: 10.66 (SE +/- 0.01, N = 10; MIN: 10.33 / MAX: 32.25)
  r2: 10.68 (SE +/- 0.01, N = 11; MIN: 10.35 / MAX: 33.35)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2020-09-17 - Model: MobileNetV2_224 (ms, Fewer Is Better)
  r1: 5.239 (SE +/- 0.210, N = 10; MIN: 3.19 / MAX: 26.27)
  r3: 5.285 (SE +/- 0.209, N = 10; MIN: 3.27 / MAX: 26.82)
  r2: 5.291 (SE +/- 0.185, N = 11; MIN: 3.3 / MAX: 27.38)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2020-09-17 - Model: resnet-v2-50 (ms, Fewer Is Better)
  r1: 58.16 (SE +/- 0.40, N = 10; MIN: 36.86 / MAX: 81.73)
  r2: 58.53 (SE +/- 0.35, N = 11; MIN: 37.33 / MAX: 83.74)
  r3: 58.79 (SE +/- 0.40, N = 10; MIN: 36.87 / MAX: 85.77)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
Mobile Neural Network 2020-09-17 - Model: SqueezeNetV1.0 (ms, Fewer Is Better)
  r1: 8.899 (SE +/- 0.373, N = 10; MIN: 4.96 / MAX: 31.21)
  r3: 8.944 (SE +/- 0.373, N = 10; MIN: 5.01 / MAX: 31.89)
  r2: 8.982 (SE +/- 0.316, N = 11; MIN: 5.05 / MAX: 31.35)
  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl
AI Benchmark Alpha 0.1.2 - Device AI Score (Score, More Is Better)
  r1: 1546
  r3: 1544
  r2: 1544
AI Benchmark Alpha 0.1.2 - Device Training Score (Score, More Is Better)
  r1: 816
  r3: 814
  r2: 814
AI Benchmark Alpha 0.1.2 - Device Inference Score (Score, More Is Better)
  r3: 730
  r2: 730
  r1: 730
RedShift Demo 3.0 (Seconds, Fewer Is Better)
  r3: 459
  r2: 460
  r1: 461
  Standard errors as exported (two entries for three runs): SE +/- 0.33, N = 3; SE +/- 0.88, N = 3
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.0 - Zoom: Default - Demo: RaiNyMore2 (Frames Per Second, More Is Better)
  r1: 170.36 (SE +/- 9.09, N = 15; MIN: 2.43 / MAX: 499.5)
  r2: 169.30 (SE +/- 9.59, N = 15; MIN: 2.38 / MAX: 499.5)
  r3: 151.49 (SE +/- 11.09, N = 15; MIN: 2.37 / MAX: 499.75)
  1. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
ASTC Encoder 2.0 - Preset: Exhaustive (Seconds, Fewer Is Better)
  r1: 447.99 (SE +/- 0.52, N = 3)
  r2: 449.37 (SE +/- 0.81, N = 3)
  r3: 449.90 (SE +/- 0.54, N = 3)
  1. (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
LeelaChessZero 0.26 - Backend: OpenCL (Nodes Per Second, More Is Better)
  r3: 13416 (SE +/- 44.68, N = 3)
  r1: 13277 (SE +/- 160.45, N = 3)
  r2: 13173 (SE +/- 176.76, N = 3)
  1. (CXX) g++ options: -flto -pthread
BRL-CAD 7.30.8 - VGR Performance Metric (VGR Performance Metric, More Is Better)
  r3: 64033
  r1: 63909
  r2: 63822
  1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
VkFFT 1.1.1 (Benchmark Score, More Is Better)
  r1: 25820 (SE +/- 62.93, N = 3)
  r3: 25683 (SE +/- 108.37, N = 3)
  r2: 25647 (SE +/- 58.68, N = 3)
  1. (CXX) g++ options: -O3 -pthread
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: RaiNyMore2 (Frames Per Second, More Is Better)
  r1: 158.21 (MIN: 7.02 / MAX: 449.03)
  r3: 130.66 (MIN: 6.67 / MAX: 498.75)
  r2: 100.58 (MIN: 6.72 / MAX: 493.34)
  Standard errors as exported (two entries for three runs): SE +/- 9.86, N = 15; SE +/- 13.14, N = 12
  1. (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
GROMACS 2020.3 - Water Benchmark (Ns Per Day, More Is Better)
  r1: 0.617 (SE +/- 0.003, N = 3)
  r3: 0.614 (SE +/- 0.002, N = 3)
  r2: 0.610 (SE +/- 0.004, N = 3)
  1. (CXX) g++ options: -O3 -pthread -lrt -lpthread -lm
asmFish 2018-07-23 - 1024 Hash Memory, 26 Depth (Nodes/second, More Is Better)
  r3: 16180674 (SE +/- 142852.80, N = 3)
  r1: 15984719 (SE +/- 174263.56, N = 3)
  r2: 15974611 (SE +/- 148124.86, N = 3)
Stockfish 12 - Total Time (Nodes Per Second, More Is Better)
  r2: 9839292 (SE +/- 85742.14, N = 3)
  r1: 9703133 (SE +/- 85083.98, N = 8)
  r3: 9629353 (SE +/- 67987.28, N = 12)
  1. (CXX) g++ options: -m64 -lpthread -fno-exceptions -std=c++17 -pedantic -O3 -msse -msse3 -mpopcnt -msse4.1 -mssse3 -msse2 -flto -flto=jobserver
Unigine Heaven 4.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL (Frames Per Second, More Is Better)
  r2: 139.91 (SE +/- 0.96, N = 3)
  r3: 139.18 (SE +/- 0.56, N = 3)
  r1: 139.13 (SE +/- 0.71, N = 3)
Blender 2.90 - Blend File: Classroom - Compute: CUDA (Seconds, Fewer Is Better)
  r1: 250.78 (SE +/- 0.03, N = 3)
  r3: 251.80 (SE +/- 0.05, N = 3)
  r2: 251.90 (SE +/- 0.04, N = 3)
dav1d 0.8.1 - Video Input: Chimera 1080p 10-bit (FPS, More Is Better)
  r1: 86.08 (SE +/- 0.99, N = 4; MIN: 54.34 / MAX: 256.39)
  r3: 85.95 (SE +/- 1.03, N = 4; MIN: 54.21 / MAX: 255.72)
  r2: 85.83 (SE +/- 1.05, N = 4; MIN: 54.27 / MAX: 257.58)
  1. (CC) gcc options: -pthread
Build2 0.13 - Time To Compile (Seconds, Fewer Is Better)
  r1: 210.05 (SE +/- 0.40, N = 3)
  r2: 210.71 (SE +/- 0.49, N = 3)
  r3: 210.95 (SE +/- 0.85, N = 3)
Numpy Benchmark (Score, More Is Better)
  r1: 419.58 (SE +/- 1.54, N = 3)
  r2: 419.36 (SE +/- 0.84, N = 3)
  r3: 417.03 (SE +/- 0.70, N = 3)
Blender 2.90 - Blend File: Pabellon Barcelona - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  r1: 196.21 (SE +/- 0.02, N = 3)
  r2: 196.28 (SE +/- 0.03, N = 3)
  r3: 196.41 (SE +/- 0.08, N = 3)
High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
  r1: 3.96177 (SE +/- 0.00082, N = 3)
  r2: 3.96068 (SE +/- 0.00692, N = 3)
  r3: 3.95457 (SE +/- 0.01196, N = 3)
  1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi
Unigine Superposition 1.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: Ultra - Renderer: OpenGL (Frames Per Second, More Is Better)
  r2: 25.4 (SE +/- 0.03, N = 3; MAX: 29.4)
  r3: 25.3 (SE +/- 0.03, N = 3; MAX: 29.7)
  r1: 25.1 (SE +/- 0.06, N = 3; MAX: 29.3)
Unigine Superposition 1.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: High - Renderer: OpenGL (Frames Per Second, More Is Better)
  r2: 66.5 (SE +/- 0.12, N = 3; MAX: 80.8)
  r3: 66.2 (SE +/- 0.09, N = 3; MAX: 80.3)
  r1: 65.9 (SE +/- 0.19, N = 3; MAX: 81.6)
Unigine Superposition 1.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: Medium - Renderer: OpenGL (Frames Per Second, More Is Better)
  r2: 90.6 (SE +/- 0.15, N = 3; MAX: 114.4)
  r3: 90.5 (SE +/- 0.15, N = 3; MAX: 113)
  r1: 90.4 (SE +/- 0.15, N = 3; MAX: 114.5)
Unigine Superposition 1.0 - Resolution: 1920 x 1080 - Mode: Fullscreen - Quality: Low - Renderer: OpenGL (Frames Per Second, More Is Better)
  r2: 178.1 (SE +/- 0.71, N = 3; MAX: 259.4)
  r1: 177.7 (SE +/- 0.23, N = 3; MAX: 260.1)
  r3: 177.4 (SE +/- 0.52, N = 3; MAX: 263.9)
GraphicsMagick 1.3.33 - Operation: Swirl (Iterations Per Minute, More Is Better)
  r3: 207 (SE +/- 1.72, N = 8)
  r2: 207 (SE +/- 1.60, N = 10)
  r1: 207 (SE +/- 1.72, N = 8)
  1. (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
CLOMP 1.2 - Static OMP Speedup (Speedup, More Is Better)
  r1: 3.7 (SE +/- 0.03, N = 3)
  r3: 3.6 (SE +/- 0.03, N = 15)
  r2: 2.5 (SE +/- 0.03, N = 15)
  1. (CC) gcc options: -fopenmp -O3 -lm
Blender 2.90 - Blend File: Fishy Cat - Compute: CUDA (Seconds, Fewer Is Better)
  r2: 167.96 (SE +/- 0.11, N = 3)
  r3: 168.08 (SE +/- 0.05, N = 3)
  r1: 168.87 (SE +/- 0.10, N = 3)
Warsow 2.5 Beta - Resolution: 1920 x 1080 (Frames Per Second, More Is Better)
  r3: 968.6 (SE +/- 1.81, N = 3)
  r2: 967.9 (SE +/- 1.46, N = 3)
  r1: 955.6 (SE +/- 13.76, N = 12)
LuxCoreRender OpenCL 2.3 - Scene: LuxCore Benchmark (M samples/sec, More Is Better)
  r2: 2.31 (SE +/- 0.01, N = 3; MIN: 0.27 / MAX: 2.63)
  r3: 2.29 (SE +/- 0.01, N = 3; MIN: 0.27 / MAX: 2.64)
  r1: 2.26 (SE +/- 0.04, N = 12; MIN: 0.14 / MAX: 2.63)
TensorFlow Lite 2020-08-23 - Model: Inception ResNet V2 (Microseconds, Fewer Is Better)
  r1: 4660197 (SE +/- 8775.31, N = 3)
  r2: 4670567 (SE +/- 8796.49, N = 3)
  r3: 4677473 (SE +/- 8398.83, N = 3)
TensorFlow Lite 2020-08-23 - Model: Inception V4 (Microseconds, Fewer Is Better)
  r1: 5163190 (SE +/- 5618.75, N = 3)
  r2: 5168183 (SE +/- 7685.69, N = 3)
  r3: 5178263 (SE +/- 8609.77, N = 3)
LuxCoreRender OpenCL 2.3 - Scene: Food (M samples/sec, More Is Better)
  r2: 1.32 (SE +/- 0.01, N = 3; MIN: 0.29 / MAX: 1.57)
  r3: 1.30 (SE +/- 0.02, N = 3; MIN: 0.26 / MAX: 1.57)
  r1: 1.27 (SE +/- 0.04, N = 12; MIN: 0.13 / MAX: 1.57)
Timed Linux Kernel Compilation 5.4 - Time To Compile (Seconds, Fewer Is Better)
  r3: 151.48 (SE +/- 0.75, N = 3)
  r1: 151.66 (SE +/- 0.33, N = 3)
  r2: 152.21 (SE +/- 0.24, N = 3)
OctaneBench 2020.1 - Total Score (Score, More Is Better)
  r3: 189.32
  r2: 189.10
  r1: 189.09
OpenVINO 2021.1 - Model: Person Detection 0106 FP32 - Device: CPU (ms, Fewer Is Better)
  r1: 5069.44 (SE +/- 15.43, N = 3)
  r3: 5073.09 (SE +/- 14.45, N = 5)
  r2: 5079.89 (SE +/- 9.68, N = 9)
  1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
OpenVINO 2021.1 - Model: Person Detection 0106 FP32 - Device: CPU (FPS, More Is Better)
  r3: 0.79 (SE +/- 0.01, N = 5)
  r2: 0.79 (SE +/- 0.01, N = 9)
  r1: 0.79 (SE +/- 0.01, N = 3)
  1. (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
dav1d 0.8.1 - Video Input: Chimera 1080p (FPS, More Is Better)
  r1: 489.84 (SE +/- 5.73, N = 14; MIN: 317.1 / MAX: 898.12)
  r3: 487.57 (SE +/- 3.24, N = 13; MIN: 316.7 / MAX: 911.47)
  r2: 486.46 (SE +/- 3.02, N = 14; MIN: 316.37 / MAX: 900.57)
  1. (CC) gcc options: -pthread
LuxCoreRender OpenCL 2.3 - Scene: DLSC (M samples/sec, More Is Better)
  r2: 2.77 (SE +/- 0.00, N = 3; MIN: 2.57 / MAX: 2.84)
  r3: 2.76 (SE +/- 0.00, N = 3; MIN: 2.56 / MAX: 2.84)
  r1: 2.70 (SE +/- 0.06, N = 12; MIN: 0.69 / MAX: 2.81)
Embree 3.9.0 - Binary: Pathtracer ISPC - Model: Crown (Frames Per Second, More Is Better)
  r1: 7.0735 (SE +/- 0.0830, N = 3; MIN: 6.66 / MAX: 12.73)
  r3: 6.9976 (SE +/- 0.0756, N = 5; MIN: 6.56 / MAX: 12.56)
  r2: 6.9794 (SE +/- 0.0728, N = 4; MIN: 6.57 / MAX: 12.32)
Blender 2.90 - Blend File: Classroom - Compute: NVIDIA OptiX (Seconds, Fewer Is Better)
  r2: 116.15 (SE +/- 0.23, N = 3)
  r3: 116.26 (SE +/- 0.13, N = 3)
  r1: 116.76 (SE +/- 0.13, N = 3)
Basis Universal 1.12 - Settings: UASTC Level 3 (Seconds, Fewer Is Better)
  r1: 110.84 (SE +/- 0.55, N = 3)
  r2: 110.93 (SE +/- 0.55, N = 3)
  r3: 111.04 (SE +/- 0.53, N = 3)
  1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
FAHBench 2.3.2 (Ns Per Day, More Is Better)
  r3: 186.62 (SE +/- 0.11, N = 3)
  r2: 186.48 (SE +/- 0.14, N = 3)
  r1: 186.46 (SE +/- 0.23, N = 3)
Timed HMMer Search 3.3.1 - Pfam Database Search (Seconds, Fewer Is Better)
  r3: 105.51 (SE +/- 0.02, N = 3)
  r1: 105.53 (SE +/- 0.06, N = 3)
  r2: 105.57 (SE +/- 0.04, N = 3)
  1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
Embree 3.9.0 - Binary: Pathtracer - Model: Crown (Frames Per Second, More Is Better)
  r2: 6.0989 (SE +/- 0.0737, N = 3; MIN: 5.88 / MAX: 10.98)
  r1: 6.0806 (SE +/- 0.0766, N = 3; MIN: 5.86 / MAX: 11.02)
  r3: 6.0641 (SE +/- 0.0667, N = 3; MIN: 5.86 / MAX: 10.95)
RealSR-NCNN Scale: 4x - TAA: Yes OpenBenchmarking.org Seconds, Fewer Is Better RealSR-NCNN 20200818 Scale: 4x - TAA: Yes r1 r2 r3 20 40 60 80 100 SE +/- 0.31, N = 3 SE +/- 0.48, N = 3 SE +/- 0.35, N = 3 99.81 100.62 100.75
Timed FFmpeg Compilation Time To Compile OpenBenchmarking.org Seconds, Fewer Is Better Timed FFmpeg Compilation 4.2.2 Time To Compile r3 r1 r2 20 40 60 80 100 SE +/- 0.30, N = 3 SE +/- 0.78, N = 3 SE +/- 0.39, N = 3 100.20 100.26 100.40
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1 7140.50 (SE +/- 2.95, N = 3, MIN: 7021.68); r3 7151.58 (SE +/- 6.73, N = 3, MIN: 7027.2); r2 7159.42 (SE +/- 4.70, N = 3, MIN: 7041.4). Note: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r1 7155.41 (SE +/- 12.55, N = 3, MIN: 7025.22); r2 7159.48 (SE +/- 1.75, N = 3, MIN: 7040.61); r3 7169.03 (SE +/- 6.55, N = 3, MIN: 7046.49). Note: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): r1 7144.23 (SE +/- 3.89, N = 3, MIN: 7028.46); r3 7147.09 (SE +/- 2.23, N = 3, MIN: 7033.98); r2 7154.66 (SE +/- 0.92, N = 3, MIN: 7035.88). Note: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Blender 2.90 - Blend File: BMW27 - Compute: NVIDIA OptiX (Seconds, Fewer Is Better): r2 38.07 (SE +/- 0.02, N = 3); r3 38.07 (SE +/- 0.05, N = 3); r1 41.47 (SE +/- 3.33, N = 15).
Blender 2.90 - Blend File: BMW27 - Compute: CUDA (Seconds, Fewer Is Better): r2 90.82 (SE +/- 0.16, N = 3); r3 90.93 (SE +/- 0.10, N = 3); r1 91.00 (SE +/- 0.14, N = 3).
Zstd Compression 1.4.5 - Compression Level: 19 (MB/s, More Is Better): r3 28.8 (SE +/- 0.06, N = 3); r1 28.8 (SE +/- 0.07, N = 3); r2 28.7 (SE +/- 0.06, N = 3). Note: (CC) gcc options: -O3 -pthread -lz -llzma
GEGL - Operation: Cartoon (Seconds, Fewer Is Better): r1 86.79 (SE +/- 0.12, N = 3); r3 86.99 (SE +/- 0.09, N = 3); r2 87.32 (SE +/- 0.19, N = 3).
OpenVINO 2021.1 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, Fewer Is Better): r1 1.17 (SE +/- 0.00, N = 3); r2 1.19 (SE +/- 0.00, N = 4); r3 1.19 (SE +/- 0.00, N = 6). Note: (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
OpenVINO 2021.1 - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, More Is Better): r1 3442.78 (SE +/- 33.67, N = 3); r3 3405.92 (SE +/- 34.05, N = 6); r2 3403.45 (SE +/- 38.35, N = 4). Note: (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r1 3795.02 (SE +/- 2.45, N = 3, MIN: 3682.24); r2 3797.05 (SE +/- 2.65, N = 3, MIN: 3673.18); r3 3797.72 (SE +/- 3.77, N = 3, MIN: 3684.19). Note: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1 3795.81 (SE +/- 6.76, N = 3, MIN: 3687.23); r3 3798.12 (SE +/- 3.22, N = 3, MIN: 3685.27); r2 3800.41 (SE +/- 4.34, N = 3, MIN: 3681.23). Note: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better): r3 3792.87 (SE +/- 1.33, N = 3, MIN: 3672.83); r1 3797.32 (SE +/- 1.61, N = 3, MIN: 3686.53); r2 3799.45 (SE +/- 1.20, N = 3, MIN: 3692.97). Note: (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
NCNN 20201218 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better): r2 18.91 (SE +/- 0.24, N = 3, MIN: 13.5 / MAX: 30.63); r1 19.16 (SE +/- 0.06, N = 3, MIN: 18.07 / MAX: 22.36); r3 19.38 (SE +/- 0.10, N = 3, MIN: 14.45 / MAX: 42.2). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better): r2 27.51 (SE +/- 0.03, N = 3, MIN: 26.93 / MAX: 43.6); r3 27.63 (SE +/- 0.05, N = 3, MIN: 27.02 / MAX: 46.56); r1 27.64 (SE +/- 0.14, N = 3, MIN: 27 / MAX: 40.14). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better): r2 35.59 (SE +/- 0.05, N = 3, MIN: 34.42 / MAX: 51.24); r3 35.66 (SE +/- 0.02, N = 3, MIN: 34.45 / MAX: 49.15); r1 35.95 (SE +/- 0.48, N = 3, MIN: 34.4 / MAX: 55.63). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: resnet50 (ms, Fewer Is Better): r3 37.22 (SE +/- 0.05, N = 3, MIN: 33.9 / MAX: 52.84); r2 37.30 (SE +/- 0.03, N = 3, MIN: 33.91 / MAX: 56.28); r1 37.81 (SE +/- 0.51, N = 3, MIN: 34.04 / MAX: 52.8). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: alexnet (ms, Fewer Is Better): r2 15.46 (SE +/- 0.04, N = 3, MIN: 14.35 / MAX: 27.24); r3 15.49 (SE +/- 0.03, N = 3, MIN: 14.41 / MAX: 24.83); r1 15.50 (SE +/- 0.08, N = 3, MIN: 14.41 / MAX: 55.15). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: resnet18 (ms, Fewer Is Better): r1 18.62 (SE +/- 0.02, N = 3, MIN: 17.08 / MAX: 32.57); r3 18.66 (SE +/- 0.02, N = 3, MIN: 17.05 / MAX: 30.94); r2 18.71 (SE +/- 0.04, N = 3, MIN: 17.06 / MAX: 33.58). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: vgg16 (ms, Fewer Is Better): r3 71.86 (SE +/- 0.12, N = 3, MIN: 70.48 / MAX: 88); r2 71.91 (SE +/- 0.03, N = 3, MIN: 70.43 / MAX: 92.47); r1 72.09 (SE +/- 0.20, N = 3, MIN: 70.5 / MAX: 88.28). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: googlenet (ms, Fewer Is Better): r1 19.98 (SE +/- 0.01, N = 3, MIN: 18.95 / MAX: 23.24); r2 20.01 (SE +/- 0.08, N = 3, MIN: 18.96 / MAX: 24.67); r3 20.21 (SE +/- 0.06, N = 3, MIN: 19.11 / MAX: 32.7). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: blazeface (ms, Fewer Is Better): r1 2.54 (SE +/- 0.00, N = 3, MIN: 2.35 / MAX: 2.74); r3 2.57 (SE +/- 0.02, N = 3, MIN: 2.45 / MAX: 2.83); r2 2.60 (SE +/- 0.05, N = 3, MIN: 2.45 / MAX: 10.37). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better): r2 9.05 (SE +/- 0.96, N = 3, MIN: 6.99 / MAX: 21.76); r3 9.06 (SE +/- 0.96, N = 3, MIN: 7.04 / MAX: 12.38); r1 10.00 (SE +/- 0.05, N = 3, MIN: 9.46 / MAX: 24.32). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: mnasnet (ms, Fewer Is Better): r2 5.96 (SE +/- 0.75, N = 3, MIN: 4.32 / MAX: 14.32); r3 5.96 (SE +/- 0.74, N = 3, MIN: 4.33 / MAX: 28.21); r1 6.67 (SE +/- 0.02, N = 3, MIN: 5.99 / MAX: 21.18). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better): r2 6.95 (SE +/- 0.94, N = 3, MIN: 5.01 / MAX: 9.68); r3 7.03 (SE +/- 0.95, N = 3, MIN: 5.04 / MAX: 20.64); r1 7.93 (SE +/- 0.03, N = 3, MIN: 7.52 / MAX: 16.61). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): r1 5.74 (SE +/- 0.65, N = 3, MIN: 4.3 / MAX: 7.75); r2 5.81 (SE +/- 0.65, N = 3, MIN: 4.43 / MAX: 17.76); r3 5.81 (SE +/- 0.62, N = 3, MIN: 4.48 / MAX: 10.59). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): r2 7.22 (SE +/- 0.73, N = 3, MIN: 5.54 / MAX: 12.03); r3 7.23 (SE +/- 0.73, N = 3, MIN: 5.55 / MAX: 12.3); r1 7.31 (SE +/- 0.67, N = 3, MIN: 5.51 / MAX: 16.43). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: mobilenet (ms, Fewer Is Better): r3 26.53 (SE +/- 0.02, N = 3, MIN: 25.78 / MAX: 41.25); r1 26.62 (SE +/- 0.17, N = 3, MIN: 25.69 / MAX: 38.05); r2 26.63 (SE +/- 0.01, N = 3, MIN: 25.7 / MAX: 41.21). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: regnety_400m (ms, Fewer Is Better): r2 17.15 (SE +/- 1.83, N = 3, MIN: 13.3 / MAX: 38.12); r3 17.60 (SE +/- 1.77, N = 3, MIN: 13.79 / MAX: 32.97); r1 19.16 (SE +/- 0.09, N = 3, MIN: 17.94 / MAX: 21.24). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: squeezenet_ssd (ms, Fewer Is Better): r2 27.52 (SE +/- 0.02, N = 3, MIN: 26.95 / MAX: 42.6); r3 27.55 (SE +/- 0.02, N = 3, MIN: 26.92 / MAX: 41.99); r1 27.58 (SE +/- 0.02, N = 3, MIN: 26.94 / MAX: 43.23). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: yolov4-tiny (ms, Fewer Is Better): r2 35.51 (SE +/- 0.01, N = 3, MIN: 33.05 / MAX: 50.05); r1 35.52 (SE +/- 0.03, N = 3, MIN: 34.38 / MAX: 51.44); r3 35.53 (SE +/- 0.05, N = 3, MIN: 32.99 / MAX: 52.01). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: resnet50 (ms, Fewer Is Better): r1 37.25 (SE +/- 0.03, N = 3, MIN: 34.07 / MAX: 48.19); r3 37.26 (SE +/- 0.01, N = 3, MIN: 33.79 / MAX: 52.48); r2 37.34 (SE +/- 0.06, N = 3, MIN: 33.97 / MAX: 56.32). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: alexnet (ms, Fewer Is Better): r1 15.44 (SE +/- 0.03, N = 3, MIN: 14.41 / MAX: 26.42); r3 15.50 (SE +/- 0.05, N = 3, MIN: 14.41 / MAX: 26.23); r2 15.53 (SE +/- 0.04, N = 3, MIN: 14.41 / MAX: 25.62). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: resnet18 (ms, Fewer Is Better): r2 18.33 (SE +/- 0.34, N = 3, MIN: 14.43 / MAX: 32.39); r3 18.38 (SE +/- 0.27, N = 3, MIN: 14.4 / MAX: 32.57); r1 18.62 (SE +/- 0.00, N = 3, MIN: 17.13 / MAX: 20.97). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: vgg16 (ms, Fewer Is Better): r2 71.82 (SE +/- 0.04, N = 3, MIN: 70.37 / MAX: 86.67); r3 71.86 (SE +/- 0.02, N = 3, MIN: 70.4 / MAX: 88.5); r1 71.96 (SE +/- 0.05, N = 3, MIN: 70.52 / MAX: 88.3). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: googlenet (ms, Fewer Is Better): r2 18.20 (SE +/- 1.77, N = 3, MIN: 14.26 / MAX: 31.74); r3 18.26 (SE +/- 1.84, N = 3, MIN: 14.28 / MAX: 36.09); r1 20.05 (SE +/- 0.06, N = 3, MIN: 18.94 / MAX: 32.96). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: blazeface (ms, Fewer Is Better): r2 2.29 (SE +/- 0.26, N = 3, MIN: 1.68 / MAX: 8.91); r3 2.29 (SE +/- 0.25, N = 3, MIN: 1.69 / MAX: 12.73); r1 2.55 (SE +/- 0.02, N = 3, MIN: 2.43 / MAX: 2.76). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: efficientnet-b0 (ms, Fewer Is Better): r3 8.99 (SE +/- 0.94, N = 3, MIN: 6.99 / MAX: 13.79); r2 9.02 (SE +/- 0.95, N = 3, MIN: 7 / MAX: 19.29); r1 10.01 (SE +/- 0.10, N = 3, MIN: 9.44 / MAX: 29.57). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: mnasnet (ms, Fewer Is Better): r2 5.86 (SE +/- 0.71, N = 3, MIN: 4.3 / MAX: 15.47); r3 5.91 (SE +/- 0.76, N = 3, MIN: 4.32 / MAX: 7.94); r1 6.63 (SE +/- 0.00, N = 3, MIN: 6.21 / MAX: 8.85). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: shufflenet-v2 (ms, Fewer Is Better): r2 6.98 (SE +/- 0.96, N = 3, MIN: 4.98 / MAX: 27.09); r3 7.05 (SE +/- 0.93, N = 3, MIN: 5.04 / MAX: 20.37); r1 7.92 (SE +/- 0.07, N = 3, MIN: 7.27 / MAX: 20.3). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better): r2 5.73 (SE +/- 0.65, N = 3, MIN: 4.33 / MAX: 10.47); r1 5.74 (SE +/- 0.62, N = 3, MIN: 4.43 / MAX: 9.64); r3 5.81 (SE +/- 0.64, N = 3, MIN: 4.41 / MAX: 25.12). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better): r3 7.19 (SE +/- 0.73, N = 3, MIN: 5.52 / MAX: 9.67); r2 7.22 (SE +/- 0.79, N = 3, MIN: 5.41 / MAX: 20.72); r1 7.23 (SE +/- 0.74, N = 3, MIN: 5.54 / MAX: 9.59). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: Vulkan GPU - Model: mobilenet (ms, Fewer Is Better): r3 26.51 (SE +/- 0.07, N = 3, MIN: 25.69 / MAX: 45.35); r1 26.52 (SE +/- 0.02, N = 3, MIN: 25.69 / MAX: 43.81); r2 26.53 (SE +/- 0.02, N = 3, MIN: 25.76 / MAX: 43.91). Note: (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Embree 3.9.0 - Binary: Pathtracer - Model: Asian Dragon (Frames Per Second, More Is Better): r2 7.5656 (SE +/- 0.0719, N = 3, MIN: 7.18 / MAX: 12.51); r1 7.5555 (SE +/- 0.0643, N = 3, MIN: 7.18 / MAX: 12.55); r3 7.5496 (SE +/- 0.0754, N = 3, MIN: 7.19 / MAX: 12.66).
RawTherapee - Total Benchmark Time (Seconds, Fewer Is Better): r1 80.59 (SE +/- 0.53, N = 3); r3 80.71 (SE +/- 0.45, N = 3); r2 80.93 (SE +/- 0.46, N = 3). Note: RawTherapee, version 5.8, command line.
OpenVINO 2021.1 - Model: Age Gender Recognition Retail 0013 FP32 - Device: CPU (ms, Fewer Is Better): r1 1.21 (SE +/- 0.00, N = 3); r3 1.22 (SE +/- 0.00, N = 4); r2 1.23 (SE +/- 0.00, N = 5). Note: (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
OpenVINO 2021.1 - Model: Age Gender Recognition Retail 0013 FP32 - Device: CPU (FPS, More Is Better): r1 3363.55 (SE +/- 35.01, N = 3); r3 3347.93 (SE +/- 40.89, N = 4); r2 3307.53 (SE +/- 33.23, N = 5). Note: (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
OpenVINO 2021.1 - Model: Face Detection 0106 FP32 - Device: CPU (ms, Fewer Is Better): r1 3202.53 (SE +/- 2.58, N = 3); r2 3207.35 (SE +/- 1.22, N = 4); r3 3212.10 (SE +/- 2.51, N = 3). Note: (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
OpenVINO 2021.1 - Model: Face Detection 0106 FP32 - Device: CPU (FPS, More Is Better): r3 1.27 (SE +/- 0.02, N = 3); r2 1.27 (SE +/- 0.02, N = 4); r1 1.26 (SE +/- 0.01, N = 3). Note: (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
dav1d 0.8.1 - Video Input: Summer Nature 4K (FPS, More Is Better): r1 112.75 (SE +/- 1.06, N = 6, MIN: 99.69 / MAX: 158.99); r3 112.65 (SE +/- 1.07, N = 6, MIN: 99.62 / MAX: 158.58); r2 112.03 (SE +/- 1.08, N = 6, MIN: 99.17 / MAX: 157.08). Note: (CC) gcc options: -pthread
OpenVINO 2021.1 - Model: Person Detection 0106 FP16 - Device: CPU (ms, Fewer Is Better): r1 4961.99 (SE +/- 4.97, N = 3); r2 4978.25 (SE +/- 19.24, N = 3); r3 5006.34 (SE +/- 4.20, N = 3). Note: (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
OpenVINO 2021.1 - Model: Person Detection 0106 FP16 - Device: CPU (FPS, More Is Better): r3 0.80 (SE +/- 0.01, N = 3); r2 0.80 (SE +/- 0.01, N = 3); r1 0.80 (SE +/- 0.00, N = 3). Note: (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
Timed Eigen Compilation 3.3.9 - Time To Compile (Seconds, Fewer Is Better): r2 67.54 (SE +/- 0.30, N = 3); r3 68.70 (SE +/- 0.22, N = 3); r1 68.74 (SE +/- 0.16, N = 3).
LZ4 Compression 1.9.3 - Compression Level: 9 - Decompression Speed (MB/s, More Is Better): r3 9695.2 (SE +/- 0.78, N = 3); r1 9679.8 (SE +/- 1.80, N = 5); r2 9664.8 (SE +/- 15.38, N = 3). Note: (CC) gcc options: -O3
LZ4 Compression 1.9.3 - Compression Level: 9 - Compression Speed (MB/s, More Is Better): r3 57.01 (SE +/- 0.66, N = 3); r2 56.07 (SE +/- 0.36, N = 3); r1 55.72 (SE +/- 0.59, N = 5). Note: (CC) gcc options: -O3
OpenVINO 2021.1 - Model: Face Detection 0106 FP16 - Device: CPU (ms, Fewer Is Better): r3 3164.51 (SE +/- 7.78, N = 3); r1 3165.24 (SE +/- 4.35, N = 3); r2 3166.57 (SE +/- 3.88, N = 3). Note: (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
OpenVINO 2021.1 - Model: Face Detection 0106 FP16 - Device: CPU (FPS, More Is Better): r3 1.28 (SE +/- 0.01, N = 3); r2 1.28 (SE +/- 0.01, N = 3); r1 1.28 (SE +/- 0.01, N = 3). Note: (CXX) g++ options: -fsigned-char -ffunction-sections -fdata-sections -O3 -pie -pthread -lpthread
Embree 3.9.0 - Binary: Pathtracer ISPC - Model: Asian Dragon (Frames Per Second, More Is Better): r2 9.2596 (SE +/- 0.0236, N = 3, MIN: 8.82 / MAX: 14.99); r3 9.1967 (SE +/- 0.1308, N = 3, MIN: 8.85 / MAX: 15); r1 9.1343 (SE +/- 0.0822, N = 3, MIN: 8.81 / MAX: 15.06).
LZ4 Compression 1.9.3 - Compression Level: 3 - Decompression Speed (MB/s, More Is Better): r3 9685.2 (SE +/- 0.67, N = 3); r1 9676.3 (SE +/- 1.84, N = 5); r2 9653.7 (SE +/- 16.28, N = 3). Note: (CC) gcc options: -O3
LZ4 Compression 1.9.3 - Compression Level: 3 - Compression Speed (MB/s, More Is Better): r3 58.89 (SE +/- 0.48, N = 3); r1 57.88 (SE +/- 0.61, N = 5); r2 57.36 (SE +/- 0.58, N = 3). Note: (CC) gcc options: -O3
Node.js V8 Web Tooling Benchmark (runs/s, More Is Better): r3 13.18 (SE +/- 0.11, N = 3); r2 13.17 (SE +/- 0.11, N = 3); r1 13.06 (SE +/- 0.14, N = 3). Note: Nodejs v10.19.0
simdjson 0.7.1 - Throughput Test: Kostya (GB/s, More Is Better): r1 0.76 (SE +/- 0.00, N = 3); r3 0.75 (SE +/- 0.00, N = 3); r2 0.75 (SE +/- 0.00, N = 3). Note: (CXX) g++ options: -O3 -pthread
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.0 - Zoom: Default - Demo: Multeasymap - Total Frame Time (Milliseconds, Fewer Is Better): r3 Min: 2 / Avg: 2.39 / Max: 7.28; r1 Min: 2 / Avg: 2.43 / Max: 6.55; r2 Min: 2 / Avg: 2.46 / Max: 6.5. Note: (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.0 - Zoom: Default - Demo: Multeasymap (Frames Per Second, More Is Better): r1 413.88 (SE +/- 0.79, N = 3, MIN: 119.86 / MAX: 499.75); r2 412.43 (SE +/- 2.87, N = 3, MIN: 103.17 / MAX: 499.75); r3 412.38 (SE +/- 4.35, N = 3, MIN: 127.91 / MAX: 499.75). Note: (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
LevelDB 1.22 - Benchmark: Seek Random (Microseconds Per Op, Fewer Is Better): r2 12.63 (SE +/- 0.10, N = 15); r3 12.64 (SE +/- 0.11, N = 14); r1 12.69 (SE +/- 0.11, N = 15). Note: (CXX) g++ options: -O3 -lsnappy -lpthread
Blender 2.90 - Blend File: Fishy Cat - Compute: NVIDIA OptiX (Seconds, Fewer Is Better): r2 60.18 (SE +/- 0.04, N = 3); r3 60.25 (SE +/- 0.12, N = 3); r1 60.35 (SE +/- 0.03, N = 3).
IndigoBench 4.4 - Acceleration: CPU - Scene: Bedroom (M samples/s, More Is Better): r1 0.939 (SE +/- 0.002, N = 3); r2 0.938 (SE +/- 0.000, N = 3); r3 0.935 (SE +/- 0.001, N = 3).
IndigoBench 4.4 - Acceleration: CPU - Scene: Supercar (M samples/s, More Is Better): r3 2.156 (SE +/- 0.002, N = 3); r2 2.150 (SE +/- 0.001, N = 3); r1 2.147 (SE +/- 0.002, N = 3).
TensorFlow Lite 2020-08-23 - Model: SqueezeNet (Microseconds, Fewer Is Better): r1 354892 (SE +/- 2566.21, N = 3); r2 356034 (SE +/- 2576.61, N = 3); r3 356258 (SE +/- 2539.06, N = 3).
TensorFlow Lite 2020-08-23 - Model: Mobilenet Quant (Microseconds, Fewer Is Better): r1 236716 (SE +/- 1686.36, N = 3); r2 237129 (SE +/- 1668.46, N = 3); r3 237406 (SE +/- 1810.35, N = 3).
TensorFlow Lite 2020-08-23 - Model: NASNet Mobile (Microseconds, Fewer Is Better): r1 302594 (SE +/- 3140.84, N = 3); r3 304079 (SE +/- 1284.72, N = 3); r2 304756 (SE +/- 2025.87, N = 3).
TensorFlow Lite 2020-08-23 - Model: Mobilenet Float (Microseconds, Fewer Is Better): r1 239119 (SE +/- 1996.41, N = 3); r2 239224 (SE +/- 1820.00, N = 3); r3 239537 (SE +/- 1638.46, N = 3).
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: Multeasymap - Total Frame Time (Milliseconds, Fewer Is Better): r1 Min: 2 / Avg: 2.3 / Max: 10.06; r2 Min: 2 / Avg: 2.32 / Max: 5.18; r3 Min: 2 / Avg: 2.32 / Max: 8.68. Note: (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
DDraceNetwork 15.2.3 - Resolution: 1920 x 1080 - Mode: Fullscreen - Renderer: OpenGL 3.3 - Zoom: Default - Demo: Multeasymap (Frames Per Second, More Is Better): r1 435.20 (SE +/- 0.25, N = 3, MIN: 99.45 / MAX: 499.75); r3 434.24 (SE +/- 2.45, N = 3, MIN: 115.25 / MAX: 499.75); r2 429.37 (SE +/- 2.73, N = 3, MIN: 112.88 / MAX: 499.75). Note: (CXX) g++ options: -O3 -rdynamic -lcrypto -lz -lrt -lpthread -lcurl -lfreetype -lSDL2 -lwavpack -lopusfile -lopus -logg -lGL -lX11 -lnotify -lgdk_pixbuf-2.0 -lgio-2.0 -lgobject-2.0 -lglib-2.0
GraphicsMagick 1.3.33 - Operation: Sharpen (Iterations Per Minute, More Is Better): r3 73 (SE +/- 0.67, N = 3); r2 72 (SE +/- 0.58, N = 3); r1 72 (SE +/- 0.33, N = 3). Note: (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33 - Operation: Enhanced (Iterations Per Minute, More Is Better): r3 115 (SE +/- 0.67, N = 3); r2 115 (SE +/- 0.67, N = 3); r1 115 (SE +/- 0.67, N = 3). Note: (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33 - Operation: Noise-Gaussian (Iterations Per Minute, More Is Better): r3 147 (SE +/- 1.20, N = 3); r2 147 (SE +/- 1.00, N = 3); r1 146 (SE +/- 1.33, N = 3). Note: (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33 - Operation: Resizing (Iterations Per Minute, More Is Better): r1 552 (SE +/- 2.73, N = 3); r3 551 (SE +/- 5.36, N = 3); r2 551 (SE +/- 5.00, N = 3). Note: (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33 - Operation: HWB Color Space (Iterations Per Minute, More Is Better): r3 776 (SE +/- 4.51, N = 3); r1 775 (SE +/- 5.03, N = 3); r2 774 (SE +/- 5.70, N = 3). Note: (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
GraphicsMagick 1.3.33 - Operation: Rotate (Iterations Per Minute, More Is Better): r1 902 (SE +/- 2.52, N = 3); r3 900 (SE +/- 1.86, N = 3); r2 875 (SE +/- 3.18, N = 3). Note: (CC) gcc options: -fopenmp -O2 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread
ASTC Encoder 2.0 - Preset: Thorough (Seconds, Fewer Is Better): r1 54.29 (SE +/- 0.54, N = 3); r2 54.38 (SE +/- 0.54, N = 3); r3 54.65 (SE +/- 0.42, N = 3). Note: (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
Basis Universal 1.12 - Settings: ETC1S (Seconds, Fewer Is Better): r1 57.82 (SE +/- 0.38, N = 3); r3 58.06 (SE +/- 0.56, N = 3); r2 58.06 (SE +/- 0.15, N = 3). Note: (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
GEGL - Operation: Wavelet Blur (Seconds, Fewer Is Better): r3 57.84 (SE +/- 0.25, N = 3); r2 57.95 (SE +/- 0.39, N = 3); r1 57.99 (SE +/- 0.25, N = 3).
rav1e 0.4 Alpha - Speed: 1 (Frames Per Second, More Is Better): r3 0.347 (SE +/- 0.003, N = 3); r1 0.347 (SE +/- 0.002, N = 3); r2 0.346 (SE +/- 0.002, N = 3).
DeepSpeech 0.6 - Acceleration: CPU (Seconds, Fewer Is Better): r3 81.04 (SE +/- 0.04, N = 3); r2 81.07 (SE +/- 0.04, N = 3); r1 81.30 (SE +/- 0.21, N = 3).
LuxCoreRender OpenCL 2.3 - Scene: Rainbow Colors and Prism (M samples/sec, More Is Better): r3 5.41 (SE +/- 0.02, N = 3, MIN: 4.58 / MAX: 5.7); r2 5.39 (SE +/- 0.02, N = 3, MIN: 4.6 / MAX: 5.67); r1 5.30 (SE +/- 0.12, N = 12, MIN: 1.66 / MAX: 5.7).
rav1e 0.4 Alpha - Speed: 5 (Frames Per Second, More Is Better): r1 1.069 (SE +/- 0.005, N = 3); r3 1.064 (SE +/- 0.003, N = 3); r2 1.064 (SE +/- 0.004, N = 3).
Basis Universal 1.12 - Settings: UASTC Level 2 (Seconds, Fewer Is Better): r1 55.50 (SE +/- 0.55, N = 3); r2 55.74 (SE +/- 0.41, N = 3); r3 55.77 (SE +/- 0.58, N = 3). Note: (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
ASTC Encoder 2.0 - Preset: Medium (Seconds, Fewer Is Better): r3 7.58 (SE +/- 0.16, N = 15); r2 7.61 (SE +/- 0.11, N = 15); r1 7.68 (SE +/- 0.14, N = 15). Note: (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
GEGL - Operation: Color Enhance (Seconds, Fewer Is Better): r3 54.10 (SE +/- 0.28, N = 3); r1 54.11 (SE +/- 0.22, N = 3); r2 54.31 (SE +/- 0.04, N = 3).
simdjson 0.7.1 - Throughput Test: LargeRandom (GB/s, More Is Better): r3 0.5 (SE +/- 0.00, N = 3); r2 0.5 (SE +/- 0.00, N = 3); r1 0.5 (SE +/- 0.00, N = 3). Note: (CXX) g++ options: -O3 -pthread
simdjson 0.7.1 - Throughput Test: PartialTweets (GB/s, More Is Better): r2 0.87 (SE +/- 0.01, N = 3); r3 0.86 (SE +/- 0.00, N = 3); r1 0.86 (SE +/- 0.00, N = 3). Note: (CXX) g++ options: -O3 -pthread
simdjson 0.7.1 - Throughput Test: DistinctUserID (GB/s, More Is Better): r1 0.89 (SE +/- 0.00, N = 3); r3 0.88 (SE +/- 0.00, N = 3); r2 0.88 (SE +/- 0.00, N = 3). Note: (CXX) g++ options: -O3 -pthread
SQLite Speedtest 3.30 - Timed Time - Size 1,000 (Seconds, Fewer Is Better): r1 49.55 (SE +/- 0.25, N = 3); r3 50.26 (SE +/- 0.13, N = 3); r2 50.27 (SE +/- 0.17, N = 3). Note: (CC) gcc options: -O2 -ldl -lz -lpthread
Timed MAFFT Alignment 7.471 - Multiple Sequence Alignment - LSU RNA (Seconds, Fewer Is Better): r1 10.50 (SE +/- 0.08, N = 12); r2 10.56 (SE +/- 0.10, N = 15); r3 10.61 (SE +/- 0.10, N = 14). Note: (CC) gcc options: -std=c99 -O3 -lm -lpthread
Coremark 1.0, CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better): r3 223892.44 (SE +/- 2209.16, N = 3); r1 223414.81 (SE +/- 2532.07, N = 3); r2 223304.98 (SE +/- 1894.03, N = 3). (CC) gcc options: -O2 -lrt" -lrt
LevelDB 1.22, Benchmark: Random Read (Microseconds Per Op, Fewer Is Better): r3 9.573 (SE +/- 0.214, N = 15); r1 9.620 (SE +/- 0.250, N = 12); r2 9.692 (SE +/- 0.206, N = 15). (CXX) g++ options: -O3 -lsnappy -lpthread
dav1d 0.8.1, Video Input: Summer Nature 1080p (FPS, More Is Better): r1 460.02 (SE +/- 3.60, N = 14; MIN: 375.05 / MAX: 590.01); r3 459.71 (SE +/- 3.80, N = 13; MIN: 374.63 / MAX: 587.93); r2 459.61 (SE +/- 3.46, N = 13; MIN: 374.03 / MAX: 582.97). (CC) gcc options: -pthread
rav1e 0.4 Alpha, Speed: 6 (Frames Per Second, More Is Better): r1 1.444 (SE +/- 0.010, N = 3); r3 1.443 (SE +/- 0.012, N = 3); r2 1.440 (SE +/- 0.006, N = 3)
PlaidML, FP16: No - Mode: Inference - Network: DenseNet 201 - Device: OpenCL (FPS, More Is Better): r1 110.07 (SE +/- 0.19, N = 3); r3 109.99 (SE +/- 0.40, N = 3); r2 109.98 (SE +/- 0.42, N = 3)
VkResample 1.0, Upscale: 2x - Precision: Double (ms, Fewer Is Better): r1 256.87 (SE +/- 0.20, N = 3); r2 257.06 (SE +/- 0.11, N = 3); r3 257.62 (SE +/- 0.20, N = 3). (CXX) g++ options: -O3 -pthread
GEGL, Operation: Rotate 90 Degrees (Seconds, Fewer Is Better): r2 37.54 (SE +/- 0.36, N = 3); r3 37.69 (SE +/- 0.43, N = 3); r1 37.70 (SE +/- 0.31, N = 3)
GEGL, Operation: Antialias (Seconds, Fewer Is Better): r1 36.56 (SE +/- 0.45, N = 3); r2 36.56 (SE +/- 0.35, N = 3); r3 36.65 (SE +/- 0.38, N = 3)
eSpeak-NG Speech Engine 20200907, Text-To-Speech Synthesis (Seconds, Fewer Is Better): r1 26.47 (SE +/- 0.29, N = 4); r2 27.18 (SE +/- 0.12, N = 4); r3 27.71 (SE +/- 0.04, N = 4). (CC) gcc options: -O2 -std=c99 -lpthread -lm
Cryptsetup, Twofish-XTS 512b Encryption (MiB/s, More Is Better): r2 486.4 (SE +/- 0.97, N = 3); r3 483.8 (SE +/- 2.12, N = 3); r1 483.0 (SE +/- 0.30, N = 2)
Cryptsetup, Twofish-XTS 512b Decryption (MiB/s, More Is Better): r2 485.7 (SE +/- 1.44, N = 3); r3 483.0 (SE +/- 2.34, N = 3); r1 482.7 (SE +/- 0.10, N = 3)
Cryptsetup, Serpent-XTS 512b Decryption (MiB/s, More Is Better): r2 878.1 (SE +/- 1.17, N = 3); r3 873.5 (SE +/- 4.24, N = 3); r1 871.7 (SE +/- 1.28, N = 3)
Cryptsetup, Serpent-XTS 512b Encryption (MiB/s, More Is Better): r2 882.1 (SE +/- 0.87, N = 3); r1 878.0 (SE +/- 0.83, N = 3); r3 874.4 (SE +/- 4.25, N = 3)
Cryptsetup, AES-XTS 512b Decryption (MiB/s, More Is Better): r2 3388.5 (SE +/- 10.03, N = 3); r3 3362.9 (SE +/- 13.02, N = 3); r1 3348.3 (SE +/- 1.21, N = 3)
Cryptsetup, AES-XTS 512b Encryption (MiB/s, More Is Better): r2 3381.9 (SE +/- 15.69, N = 3); r1 3346.8 (SE +/- 3.15, N = 3); r3 3336.0 (SE +/- 25.61, N = 3)
Cryptsetup, Twofish-XTS 256b Decryption (MiB/s, More Is Better): r2 486.3 (SE +/- 1.43, N = 3); r3 483.0 (SE +/- 2.21, N = 3); r1 482.5 (SE +/- 0.34, N = 3)
Cryptsetup, Twofish-XTS 256b Encryption (MiB/s, More Is Better): r2 487.4 (SE +/- 1.08, N = 3); r3 483.6 (SE +/- 2.51, N = 3); r1 482.0 (SE +/- 0.75, N = 3)
Cryptsetup, Serpent-XTS 256b Decryption (MiB/s, More Is Better): r2 876.6 (SE +/- 1.50, N = 3); r1 872.3 (SE +/- 1.62, N = 3); r3 870.9 (SE +/- 4.03, N = 3)
Cryptsetup, Serpent-XTS 256b Encryption (MiB/s, More Is Better): r2 881.4 (SE +/- 1.25, N = 3); r3 874.1 (SE +/- 2.67, N = 3); r1 874.1 (SE +/- 0.92, N = 3)
Cryptsetup, AES-XTS 256b Decryption (MiB/s, More Is Better): r2 4055.1 (SE +/- 17.20, N = 3); r3 4026.9 (SE +/- 15.07, N = 3); r1 4002.4 (SE +/- 4.92, N = 3)
Cryptsetup, AES-XTS 256b Encryption (MiB/s, More Is Better): r2 4080.5 (SE +/- 25.91, N = 3); r3 4023.0 (SE +/- 20.10, N = 3); r1 4005.6 (SE +/- 1.66, N = 3)
Cryptsetup, PBKDF2-whirlpool (Iterations Per Second, More Is Better): r2 830020 (SE +/- 2314.28, N = 3); r1 816282 (SE +/- 4903.32, N = 3); r3 810352 (SE +/- 2497.33, N = 3)
Cryptsetup, PBKDF2-sha512 (Iterations Per Second, More Is Better): r2 1943008 (SE +/- 1201.00, N = 3); r1 1919349 (SE +/- 7117.07, N = 3); r3 1886103 (SE +/- 12877.64, N = 3)
LevelDB 1.22, Benchmark: Sequential Fill (Microseconds Per Op, Fewer Is Better): r1 47.24 (SE +/- 0.54, N = 4); r2 47.29 (SE +/- 0.58, N = 4); r3 47.42 (SE +/- 0.48, N = 5). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22, Benchmark: Sequential Fill (MB/s, More Is Better): r1 37.5 (SE +/- 0.44, N = 4); r2 37.4 (SE +/- 0.46, N = 4); r3 37.3 (SE +/- 0.39, N = 5). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22, Benchmark: Random Delete (Microseconds Per Op, Fewer Is Better): r1 47.23 (SE +/- 0.49, N = 5); r2 47.30 (SE +/- 0.57, N = 4); r3 47.39 (SE +/- 0.56, N = 4). (CXX) g++ options: -O3 -lsnappy -lpthread
TNN 0.2.3, Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better): r2 295.55 (SE +/- 0.81, N = 3; MIN: 292.39 / MAX: 306.56); r3 299.40 (SE +/- 0.36, N = 3; MIN: 297.92 / MAX: 315.55); r1 321.42 (SE +/- 2.78, N = 8; MIN: 300.42 / MAX: 371.06). (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
clpeak, OpenCL Test: Double-Precision Double (GFLOPS, More Is Better): r3 340.59 (SE +/- 3.74, N = 3); r2 340.46 (SE +/- 3.68, N = 3); r1 340.42 (SE +/- 3.78, N = 3). (CXX) g++ options: -O3 -rdynamic -lOpenCL
Darktable 3.0.1, Test: Masskrug - Acceleration: CPU-only (Seconds, Fewer Is Better): r1 7.128 (SE +/- 0.097, N = 12); r2 7.150 (SE +/- 0.096, N = 12); r3 7.155 (SE +/- 0.099, N = 12)
oneDNN 2.0, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r2 7.04404 (SE +/- 0.11582, N = 12; MIN: 4.11); r3 7.14574 (SE +/- 0.02993, N = 3; MIN: 5.45); r1 7.16575 (SE +/- 0.05152, N = 3; MIN: 5.58). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r3 3.11291 (SE +/- 0.06527, N = 12; MIN: 1.86); r2 3.16769 (SE +/- 0.02081, N = 3; MIN: 2.39); r1 3.17762 (SE +/- 0.01732, N = 3; MIN: 2.58). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
GEGL, Operation: Scale (Seconds, Fewer Is Better): r1 6.954 (SE +/- 0.055, N = 12); r2 6.973 (SE +/- 0.059, N = 13); r3 7.000 (SE +/- 0.056, N = 14)
LZ4 Compression 1.9.3, Compression Level: 1 - Decompression Speed (MB/s, More Is Better): r2 9839.9 (SE +/- 2.38, N = 3); r1 9823.2 (SE +/- 3.96, N = 3); r3 9810.0 (SE +/- 10.11, N = 3). (CC) gcc options: -O3
LZ4 Compression 1.9.3, Compression Level: 1 - Compression Speed (MB/s, More Is Better): r2 8127.78 (SE +/- 4.75, N = 3); r1 8120.67 (SE +/- 6.52, N = 3); r3 8079.18 (SE +/- 11.24, N = 3). (CC) gcc options: -O3
Zstd Compression 1.4.5, Compression Level: 3 (MB/s, More Is Better): r3 2835.1 (SE +/- 4.18, N = 3); r1 2833.6 (SE +/- 7.25, N = 3); r2 2831.0 (SE +/- 8.65, N = 3). (CC) gcc options: -O3 -pthread -lz -llzma
GEGL, Operation: Reflect (Seconds, Fewer Is Better): r1 28.18 (SE +/- 0.29, N = 3); r3 28.31 (SE +/- 0.22, N = 3); r2 28.50 (SE +/- 0.30, N = 3)
GEGL, Operation: Tile Glass (Seconds, Fewer Is Better): r3 28.06 (SE +/- 0.39, N = 3); r2 28.24 (SE +/- 0.27, N = 3); r1 28.24 (SE +/- 0.36, N = 3)
GEGL, Operation: Crop (Seconds, Fewer Is Better): r3 8.826 (SE +/- 0.077, N = 8); r2 8.839 (SE +/- 0.073, N = 9); r1 8.900 (SE +/- 0.065, N = 11)
rav1e 0.4 Alpha, Speed: 10 (Frames Per Second, More Is Better): r1 3.422 (SE +/- 0.044, N = 3); r3 3.420 (SE +/- 0.027, N = 3); r2 3.404 (SE +/- 0.035, N = 3)
NAMD CUDA 2.14, ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better): r1 0.22103 (SE +/- 0.00131, N = 3); r3 0.22171 (SE +/- 0.00272, N = 4); r2 0.22238 (SE +/- 0.00245, N = 5)
Redis 6.0.9, Test: SADD (Requests Per Second, More Is Better): r1 2660539.42 (SE +/- 28020.60, N = 3); r3 2634908.83 (SE +/- 27994.25, N = 3); r2 2628039.25 (SE +/- 23332.27, N = 15). (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
PHPBench 0.8.1, PHP Benchmark Suite (Score, More Is Better): r1 837911 (SE +/- 4346.11, N = 3); r2 832417 (SE +/- 2600.83, N = 3); r3 829705 (SE +/- 587.84, N = 3)
RNNoise 2020-06-28 (Seconds, Fewer Is Better): r2 21.32 (SE +/- 0.04, N = 3); r3 22.04 (SE +/- 0.02, N = 3); r1 22.08 (SE +/- 0.06, N = 3). (CC) gcc options: -O2 -pedantic -fvisibility=hidden -lm
Unpacking Firefox 84.0, Extracting: firefox-84.0.source.tar.xz (Seconds, Fewer Is Better): r1 16.03 (SE +/- 0.08, N = 4); r3 16.10 (SE +/- 0.14, N = 4); r2 16.14 (SE +/- 0.09, N = 4)
LAMMPS Molecular Dynamics Simulator 29Oct2020, Model: Rhodopsin Protein (ns/day, More Is Better): r1 5.198 (SE +/- 0.111, N = 15); r3 5.179 (SE +/- 0.110, N = 15); r2 5.169 (SE +/- 0.109, N = 15). (CXX) g++ options: -O3 -pthread -lm
oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1 8.96782 (SE +/- 0.04374, N = 3; MIN: 8.14); r2 9.00692 (SE +/- 0.01715, N = 3; MIN: 8.15); r3 9.06628 (SE +/- 0.11418, N = 3; MIN: 8). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r3 9.73732 (SE +/- 0.03582, N = 3; MIN: 8.75); r2 9.76468 (SE +/- 0.03928, N = 3; MIN: 8.72); r1 9.77594 (SE +/- 0.04555, N = 3; MIN: 8.77). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Inkscape, Operation: SVG Files To PNG (Seconds, Fewer Is Better): r1 21.00 (SE +/- 0.04, N = 3); r2 21.05 (SE +/- 0.02, N = 3); r3 21.07 (SE +/- 0.03, N = 3). Inkscape 0.92.5 (2060ec1f9f, 2020-04-08)
Redis 6.0.9, Test: LPOP (Requests Per Second, More Is Better): r1 3394660.20 (SE +/- 36042.05, N = 3); r3 2809233.48 (SE +/- 181152.66, N = 12); r2 2104092.33 (SE +/- 3702.86, N = 3). (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
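Most results in this export differ by a percent or two between r1, r2 and r3, but Redis LPOP is a clear outlier. One rough way to quantify that is the gap between the best and worst run relative to the worst; the sketch below uses the LPOP and SADD means reported above. The function name `relative_spread` is hypothetical and is not part of the Phoronix Test Suite.

```python
def relative_spread(values):
    """Return (max - min) / min as a percentage for a set of run means."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    lo, hi = min(values), max(values)
    if lo <= 0:
        raise ValueError("values must be positive")
    return (hi - lo) / lo * 100.0


# Mean requests per second per run, copied from the Redis results above.
redis_lpop = {"r1": 3394660.20, "r3": 2809233.48, "r2": 2104092.33}
redis_sadd = {"r1": 2660539.42, "r3": 2634908.83, "r2": 2628039.25}

for name, runs in (("LPOP", redis_lpop), ("SADD", redis_sadd)):
    print(f"Redis {name}: {relative_spread(runs.values()):.1f}% best-to-worst spread")
```

A spread this large on nominally identical runs is worth reading alongside the per-run SE and N values before drawing conclusions from any single LPOP figure.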
Crafty 25.2, Elapsed Time (Nodes Per Second, More Is Better): r2 9584148 (SE +/- 7176.35, N = 3); r3 9560012 (SE +/- 16578.83, N = 3); r1 9497414 (SE +/- 45086.65, N = 3). (CC) gcc options: -pthread -lstdc++ -fprofile-use -lm
TNN 0.2.3, Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better): r2 264.95 (SE +/- 0.11, N = 3; MIN: 264.07 / MAX: 268.01); r3 272.68 (SE +/- 0.12, N = 3; MIN: 271.53 / MAX: 277.6); r1 272.91 (SE +/- 1.46, N = 3; MIN: 264.43 / MAX: 277.05). (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -O3 -rdynamic -ldl
Monkey Audio Encoding 3.99.6, WAV To APE (Seconds, Fewer Is Better): r1 10.51 (SE +/- 0.03, N = 5); r3 10.59 (SE +/- 0.01, N = 5); r2 10.86 (SE +/- 0.04, N = 5). (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
Betsy GPU Compressor 1.1 Beta, Codec: ETC2 RGB - Quality: Highest (Seconds, Fewer Is Better): r3 7.903 (SE +/- 0.023, N = 3); r2 7.912 (SE +/- 0.018, N = 3); r1 8.016 (SE +/- 0.064, N = 13). (CXX) g++ options: -O3 -O2 -lpthread -ldl
Darktable 3.0.1, Test: Boat - Acceleration: CPU-only (Seconds, Fewer Is Better): r3 15.86 (SE +/- 0.03, N = 3); r2 15.87 (SE +/- 0.04, N = 3); r1 15.91 (SE +/- 0.02, N = 3)
RealSR-NCNN 20200818, Scale: 4x - TAA: No (Seconds, Fewer Is Better): r2 14.66 (SE +/- 0.09, N = 3); r3 14.69 (SE +/- 0.11, N = 3); r1 14.73 (SE +/- 0.01, N = 3)
Hashcat 6.1.1, Benchmark: SHA-512 (H/s, More Is Better): r1 1023100000 (SE +/- 11546345.54, N = 15); r2 1020000000 (SE +/- 2594224.35, N = 3); r3 1016800000 (SE +/- 1852025.92, N = 3)
oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r2 4.71457 (SE +/- 0.06823, N = 15; MIN: 3.29); r3 4.73728 (SE +/- 0.07477, N = 15; MIN: 3.29); r1 4.74772 (SE +/- 0.10403, N = 12; MIN: 3.29). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r2 9.77701 (SE +/- 0.15643, N = 15; MIN: 6.67); r3 9.81238 (SE +/- 0.22537, N = 12; MIN: 6.65); r1 9.87893 (SE +/- 0.23621, N = 12; MIN: 6.66). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
yquake2 7.45, Renderer: OpenGL 1.x - Resolution: 1920 x 1080 (Frames Per Second, More Is Better): r3 59.9 (SE +/- 0.03, N = 3); r2 59.9 (SE +/- 0.07, N = 3); r1 59.9 (SE +/- 0.03, N = 3). (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
yquake2 7.45, Renderer: OpenGL 3.x - Resolution: 1920 x 1080 (Frames Per Second, More Is Better): r3 60; r2 60; r1 60. (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
Opus Codec Encoding 1.3.1, WAV To Opus Encode (Seconds, Fewer Is Better): r2 7.602 (SE +/- 0.004, N = 5); r3 7.616 (SE +/- 0.008, N = 5); r1 7.624 (SE +/- 0.009, N = 5). (CXX) g++ options: -fvisibility=hidden -logg -lm
oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r1 4.36381 (SE +/- 0.00310, N = 3; MIN: 4.23); r2 4.37852 (SE +/- 0.00806, N = 3; MIN: 4.25); r3 4.38535 (SE +/- 0.00559, N = 3; MIN: 4.25). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1 4.45564 (SE +/- 0.00967, N = 3; MIN: 4.02); r3 4.46656 (SE +/- 0.00726, N = 3; MIN: 4.01); r2 4.47062 (SE +/- 0.01661, N = 3; MIN: 4.02). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Redis 6.0.9, Test: LPUSH (Requests Per Second, More Is Better): r2 2094056.31 (SE +/- 21753.96, N = 4); r3 2083566.29 (SE +/- 8925.21, N = 3); r1 2041750.08 (SE +/- 25221.07, N = 3). (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Betsy GPU Compressor 1.1 Beta, Codec: ETC1 - Quality: Highest (Seconds, Fewer Is Better): r2 5.789 (SE +/- 0.008, N = 3); r3 5.792 (SE +/- 0.024, N = 3); r1 5.854 (SE +/- 0.068, N = 12). (CXX) g++ options: -O3 -O2 -lpthread -ldl
yquake2 7.45, Renderer: Software CPU - Resolution: 1920 x 1080 (Frames Per Second, More Is Better): r2 60.7 (SE +/- 0.07, N = 3); r1 60.7 (SE +/- 0.07, N = 3); r3 60.6 (SE +/- 0.09, N = 3). (CC) gcc options: -lm -ldl -rdynamic -shared -lSDL2 -O2 -pipe -fomit-frame-pointer -std=gnu99 -fno-strict-aliasing -fwrapv -fvisibility=hidden -MMD -mfpmath=sse -fPIC
VkResample 1.0, Upscale: 2x - Precision: Single (ms, Fewer Is Better): r1 24.99 (SE +/- 0.02, N = 3); r2 25.19 (SE +/- 0.05, N = 3); r3 25.23 (SE +/- 0.08, N = 3). (CXX) g++ options: -O3 -pthread
ASTC Encoder 2.0, Preset: Fast (Seconds, Fewer Is Better): r1 5.44 (SE +/- 0.05, N = 3); r2 5.59 (SE +/- 0.08, N = 3); r3 5.63 (SE +/- 0.04, N = 12). (CXX) g++ options: -std=c++14 -fvisibility=hidden -O3 -flto -mfpmath=sse -mavx2 -mpopcnt -lpthread
PlaidML, FP16: No - Mode: Inference - Network: IMDB LSTM - Device: OpenCL (FPS, More Is Better): r3 478.73 (SE +/- 2.86, N = 3); r2 477.39 (SE +/- 1.92, N = 3); r1 463.34 (SE +/- 0.36, N = 3)
Redis 6.0.9, Test: SET (Requests Per Second, More Is Better): r3 2433543.80 (SE +/- 6859.51, N = 3); r2 2413657.00 (SE +/- 3903.32, N = 3); r1 2375800.25 (SE +/- 17218.21, N = 3). (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
Redis 6.0.9, Test: GET (Requests Per Second, More Is Better): r1 3248596.08 (SE +/- 41615.25, N = 3); r2 3012560.83 (SE +/- 13828.40, N = 3); r3 3009326.75 (SE +/- 8077.93, N = 3). (CXX) g++ options: -MM -MT -g3 -fvisibility=hidden -O3
oneDNN 2.0, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r1 2.72558 (SE +/- 0.00400, N = 3; MIN: 2.54); r3 2.74874 (SE +/- 0.00352, N = 3; MIN: 2.54); r2 2.77670 (SE +/- 0.01530, N = 3; MIN: 2.56). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r2 12.44 (SE +/- 0.04, N = 3; MIN: 12.09); r1 12.47 (SE +/- 0.01, N = 3; MIN: 12.08); r3 12.61 (SE +/- 0.03, N = 3; MIN: 12.2). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
clpeak, OpenCL Test: Integer Compute INT (GIOPS, More Is Better): r3 5540.44 (SE +/- 81.16, N = 15); r2 5519.39 (SE +/- 81.08, N = 15); r1 5504.35 (SE +/- 71.93, N = 15). (CXX) g++ options: -O3 -rdynamic -lOpenCL
Basis Universal 1.12, Settings: UASTC Level 0 (Seconds, Fewer Is Better): r1 7.288 (SE +/- 0.079, N = 3); r2 7.345 (SE +/- 0.061, N = 3); r3 7.353 (SE +/- 0.095, N = 3). (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread
LevelDB 1.22, Benchmark: Hot Read (Microseconds Per Op, Fewer Is Better): r1 6.946 (SE +/- 0.013, N = 3); r2 7.099 (SE +/- 0.075, N = 3); r3 7.128 (SE +/- 0.049, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
Rodinia 3.1, Test: OpenCL Particle Filter (Seconds, Fewer Is Better): r3 7.027 (SE +/- 0.016, N = 3); r2 7.055 (SE +/- 0.013, N = 3); r1 7.115 (SE +/- 0.065, N = 3). (CXX) g++ options: -O2 -lOpenCL
Hashcat 6.1.1, Benchmark: TrueCrypt RIPEMD160 + XTS (H/s, More Is Better): r2 301433 (SE +/- 851.14, N = 3); r1 301233 (SE +/- 1322.04, N = 3); r3 298133 (SE +/- 545.69, N = 3)
Hashcat 6.1.1, Benchmark: SHA1 (H/s, More Is Better): r1 8585766667 (SE +/- 31347213.24, N = 3); r2 8544500000 (SE +/- 17380832.35, N = 3); r3 8535333333 (SE +/- 18653000.95, N = 3)
Hashcat 6.1.1, Benchmark: MD5 (H/s, More Is Better): r1 24334866667 (SE +/- 110495102.96, N = 3); r2 24260200000 (SE +/- 81107726.72, N = 3); r3 24196900000 (SE +/- 49256167.13, N = 3)
oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better): r2 17.90 (SE +/- 0.08, N = 3; MIN: 17.18); r1 18.01 (SE +/- 0.02, N = 3; MIN: 17.22); r3 18.03 (SE +/- 0.02, N = 3; MIN: 17.24). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better): r3 21.62 (SE +/- 0.02, N = 3; MIN: 21.51); r1 21.69 (SE +/- 0.06, N = 3; MIN: 21.47); r2 21.70 (SE +/- 0.06, N = 3; MIN: 21.48). (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
cl-mem 2017-01-13, Benchmark: Write (GB/s, More Is Better): r1 215.7 (SE +/- 0.47, N = 3); r2 215.6 (SE +/- 0.26, N = 3); r3 214.8 (SE +/- 0.50, N = 3). (CC) gcc options: -O2 -flto -lOpenCL
cl-mem 2017-01-13, Benchmark: Copy (GB/s, More Is Better): r1 236.6 (SE +/- 0.22, N = 3); r2 235.4 (SE +/- 0.24, N = 3); r3 235.1 (SE +/- 0.27, N = 3). (CC) gcc options: -O2 -flto -lOpenCL
cl-mem 2017-01-13, Benchmark: Read (GB/s, More Is Better): r1 330.3 (SE +/- 0.18, N = 3); r3 329.9 (SE +/- 0.03, N = 3); r2 329.9 (SE +/- 0.09, N = 3). (CC) gcc options: -O2 -flto -lOpenCL
PlaidML, FP16: Yes - Mode: Inference - Network: Mobilenet - Device: OpenCL (FPS, More Is Better): r2 1823.06 (SE +/- 3.54, N = 3); r1 1819.24 (SE +/- 7.57, N = 3); r3 1817.78 (SE +/- 8.53, N = 3)
Waifu2x-NCNN Vulkan 20200818, Scale: 2x - Denoise: 3 - TAA: Yes (Seconds, Fewer Is Better): r1 6.020 (SE +/- 0.004, N = 3); r3 6.093 (SE +/- 0.011, N = 3); r2 6.102 (SE +/- 0.007, N = 3)
PlaidML, FP16: No - Mode: Inference - Network: Mobilenet - Device: OpenCL (FPS, More Is Better): r3 1247.93 (SE +/- 4.92, N = 3); r1 1246.78 (SE +/- 3.10, N = 3); r2 1244.95 (SE +/- 2.03, N = 3)
Darktable 3.0.1, Test: Server Room - Acceleration: CPU-only (Seconds, Fewer Is Better): r2 4.174 (SE +/- 0.004, N = 3); r3 4.178 (SE +/- 0.006, N = 3); r1 4.181 (SE +/- 0.010, N = 3)
NeatBench 5, Acceleration: GPU (FPS, More Is Better): r3 27.6 (SE +/- 0.60, N = 15); r1 27.5 (SE +/- 0.57, N = 15); r2 27.1 (SE +/- 0.47, N = 15)
LevelDB 1.22, Benchmark: Random Fill (Microseconds Per Op, Fewer Is Better): r3 40.98 (SE +/- 0.07, N = 3); r2 41.03 (SE +/- 0.20, N = 3); r1 41.04 (SE +/- 0.19, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22, Benchmark: Random Fill (MB/s, More Is Better): r3 43.2 (SE +/- 0.07, N = 3); r2 43.1 (SE +/- 0.19, N = 3); r1 43.1 (SE +/- 0.21, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22, Benchmark: Overwrite (Microseconds Per Op, Fewer Is Better): r3 40.76 (SE +/- 0.04, N = 3); r1 40.93 (SE +/- 0.15, N = 3); r2 40.96 (SE +/- 0.08, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22, Benchmark: Overwrite (MB/s, More Is Better): r3 43.4 (SE +/- 0.03, N = 3); r2 43.2 (SE +/- 0.07, N = 3); r1 43.2 (SE +/- 0.15, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
MandelGPU 1.3pts1, OpenCL Device: GPU (Samples/sec, More Is Better): r2 252826584.8 (SE +/- 157365.45, N = 3); r3 252822614.4 (SE +/- 1449538.54, N = 3); r1 251986408.7 (SE +/- 1032565.22, N = 3). (CC) gcc options: -O3 -lm -ftree-vectorize -funroll-loops -lglut -lOpenCL -lGL
clpeak, OpenCL Test: Single-Precision Float (GFLOPS, More Is Better): r1 5940.64 (SE +/- 83.30, N = 15); r3 5892.70 (SE +/- 47.53, N = 3); r2 5858.32 (SE +/- 64.05, N = 3). (CXX) g++ options: -O3 -rdynamic -lOpenCL
LevelDB 1.22, Benchmark: Fill Sync (Microseconds Per Op, Fewer Is Better): r1 3361.78 (SE +/- 33.91, N = 3); r3 3386.08 (SE +/- 60.32, N = 3); r2 3424.92 (SE +/- 25.98, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
LevelDB 1.22, Benchmark: Fill Sync (MB/s, More Is Better): r3 0.5 (SE +/- 0.00, N = 3); r2 0.5 (SE +/- 0.00, N = 3); r1 0.5 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -lsnappy -lpthread
ArrayFire 3.7, Test: Conjugate Gradient OpenCL (ms, Fewer Is Better): r2 2.531 (SE +/- 0.022, N = 3); r3 2.548 (SE +/- 0.018, N = 3); r1 2.549 (SE +/- 0.015, N = 3). (CXX) g++ options: -rdynamic
Hashcat 6.1.1, Benchmark: 7-Zip (H/s, More Is Better): r1 373667 (SE +/- 1589.90, N = 3); r2 370400 (SE +/- 1858.31, N = 3); r3 366433 (SE +/- 3670.30, N = 3)
ViennaCL 1.4.2, OpenCL LU Factorization (GFLOPS, More Is Better): r1 68.29 (SE +/- 0.36, N = 3); r3 65.92 (SE +/- 0.44, N = 3); r2 64.23 (SE +/- 0.08, N = 3). (CXX) g++ options: -rdynamic -lOpenCL
clpeak, OpenCL Test: Global Memory Bandwidth (GBPS, More Is Better): r3 324.78 (SE +/- 0.28, N = 3); r1 324.63 (SE +/- 0.32, N = 3); r2 324.58 (SE +/- 0.28, N = 3). (CXX) g++ options: -O3 -rdynamic -lOpenCL
Darktable 3.0.1, Test: Server Rack - Acceleration: CPU-only (Seconds, Fewer Is Better): r1 0.181 (SE +/- 0.000, N = 3); r2 0.181 (SE +/- 0.000, N = 3); r3 0.181 (SE +/- 0.000, N = 3)
FinanceBench 2016-06-06, Benchmark: Black-Scholes OpenCL (ms, Fewer Is Better): r2 17.48 (SE +/- 0.00, N = 3); r3 17.48 (SE +/- 0.00, N = 3); r1 17.48 (SE +/- 0.00, N = 3). (CXX) g++ options: -O3 -lOpenCL
Phoronix Test Suite v10.8.5