Intel Core i7-4770K testing with a Gigabyte Z97-HD3 (F10c BIOS) and Gigabyte Intel HD 4600 2GB on Ubuntu 20.10 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2012256-HA-COREI747702
HTML result view exported from: https://openbenchmarking.org/result/2012256-HA-COREI747702&grt&sor&rro .
Core i7 4770K Xmas

System details (result identifiers: 1, 2, 3)

  Processor:          Intel Core i7-4770K @ 3.90GHz (4 Cores / 8 Threads)
  Motherboard:        Gigabyte Z97-HD3 (F10c BIOS)
  Chipset:            Intel 4th Gen Core DRAM
  Memory:             8GB
  Disk:               120GB ADATA SU700
  Graphics:           Gigabyte Intel HD 4600 2GB (1250MHz)
  Audio:              Intel Xeon E3-1200 v3/4th
  Monitor:            DELL S2409W
  Network:            Realtek RTL8111/8168/8411
  OS:                 Ubuntu 20.10
  Kernel:             5.8.0-31-generic (x86_64)
  Desktop:            GNOME Shell 3.38.1
  Display Server:     X Server 1.20.9
  Display Driver:     modesetting 1.20.9
  OpenGL:             4.5 Mesa 20.2.1
  Vulkan:             1.2.145
  Compiler:           GCC 10.2.0
  File-System:        ext4
  Screen Resolution:  1920x1080

Compiler Details
  --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  Scaling Governor: intel_cpufreq ondemand
  CPU Microcode: 0x28
  Thermald 2.3

Security Details
  itlb_multihit: KVM: Mitigation of VMX disabled
  l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
  mds: Mitigation of Clear buffers; SMT vulnerable
  meltdown: Mitigation of PTI
  spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Full generic retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling
  srbds: Mitigation of Microcode
  tsx_async_abort: Not affected
Core i7 4770K Xmas: results overview

Run 1           Run 2           Run 3           Test
42373           42365           42169           brl-cad: VGR Performance Metric
388.393         385.584         388.953         build2: Time To Compile
0.9             1.1             1.1             clomp: Static OMP Speedup
144430.967517   144054.759207   144381.891641   coremark: CoreMark Size 666 - Iterations Per Second
14.059          13.873          13.953          encode-ape: WAV To APE
39.30           39.29           39.77           ncnn: CPU - mobilenet
10.42           10.66           11.04           ncnn: CPU-v2-v2 - mobilenet-v2
8.74            8.66            8.91            ncnn: CPU-v3-v3 - mobilenet-v3
11.41           11.42           11.44           ncnn: CPU - shufflenet-v2
8.37            8.42            8.65            ncnn: CPU - mnasnet
14.26           14.32           14.43           ncnn: CPU - efficientnet-b0
3.29            3.17            3.28            ncnn: CPU - blazeface
29.65           30.43           30.19           ncnn: CPU - googlenet
127.46          127.55          128.02          ncnn: CPU - vgg16
30.48           31.42           30.75           ncnn: CPU - resnet18
25.10           25.16           25.17           ncnn: CPU - alexnet
61.85           63.27           64.10           ncnn: CPU - resnet50
51.58           52.98           54.23           ncnn: CPU - yolov4-tiny
40.61           40.30           41.57           ncnn: CPU - squeezenet_ssd
20.69           20.55           20.90           ncnn: CPU - regnety_400m
39.24           39.55           39.64           ncnn: Vulkan GPU - mobilenet
10.56           10.50           10.89           ncnn: Vulkan GPU-v2-v2 - mobilenet-v2
8.77            8.72            8.84            ncnn: Vulkan GPU-v3-v3 - mobilenet-v3
11.51           11.42           11.49           ncnn: Vulkan GPU - shufflenet-v2
8.53            8.41            8.43            ncnn: Vulkan GPU - mnasnet
14.40           14.28           14.26           ncnn: Vulkan GPU - efficientnet-b0
3.25            3.18            3.24            ncnn: Vulkan GPU - blazeface
29.81           30.53           30.37           ncnn: Vulkan GPU - googlenet
126.48          128.34          127.97          ncnn: Vulkan GPU - vgg16
30.47           30.64           30.69           ncnn: Vulkan GPU - resnet18
25.09           25.62           25.11           ncnn: Vulkan GPU - alexnet
63.21           63.55           63.34           ncnn: Vulkan GPU - resnet50
51.83           53.05           53.45           ncnn: Vulkan GPU - yolov4-tiny
40.27           41.27           42.10           ncnn: Vulkan GPU - squeezenet_ssd
20.70           20.75           20.88           ncnn: Vulkan GPU - regnety_400m
9.30            9.29            9.24            node-web-tooling:
11.1734         11.3617         11.5933         onednn: IP Shapes 1D - f32 - CPU
15.9142         15.6513         18.8928         onednn: IP Shapes 3D - f32 - CPU
5.97793         5.94146         5.99872         onednn: IP Shapes 1D - u8s8f32 - CPU
4.72613         4.69355         4.99786         onednn: IP Shapes 3D - u8s8f32 - CPU
32.1198         32.1261         32.5780         onednn: Convolution Batch Shapes Auto - f32 - CPU
13.5497         13.3557         13.2922         onednn: Deconvolution Batch shapes_1d - f32 - CPU
18.9265         18.9535         19.0737         onednn: Deconvolution Batch shapes_3d - f32 - CPU
31.0265         30.9250         30.9474         onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU
14.4191         14.8307         14.5672         onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU
12.9561         12.9542         12.9019         onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU
10419.2         10472.4         10683.5         onednn: Recurrent Neural Network Training - f32 - CPU
5784.04         5922.99         5971.44         onednn: Recurrent Neural Network Inference - f32 - CPU
10639.4         10420.8         10853.5         onednn: Recurrent Neural Network Training - u8s8f32 - CPU
5730.72         5954.52         6041.90         onednn: Recurrent Neural Network Inference - u8s8f32 - CPU
8.10435         8.09766         8.85443         onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU
10641.5         10647.0         10925.1         onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU
5907.30         5822.13         5981.80         onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU
7.32951         7.33567         7.23631         onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU
9.132           9.133           9.113           encode-opus: WAV To Opus Encode
0.60            0.60            0.60            simdjson: Kostya
0.4             0.4             0.4             simdjson: LargeRandom
0.66            0.66            0.66            simdjson: PartialTweets
0.68            0.68            0.68            simdjson: DistinctUserID
80.186          81.524          80.100          sqlite-speedtest: Timed Time - Size 1,000
96.656          97.289          97.299          build-eigen: Time To Compile
145.942         146.295         146.978         build-ffmpeg: Time To Compile
137.774         137.939         137.903         hmmer: Pfam Database Search
292             292             290             vkmark: 1920 x 1080
15.548          15.544          15.562          encode-wavpack: WAV To WavPack

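When comparing the three runs in the overview above, it can help to express the gap between the best and worst run as a percentage. The following is a minimal sketch of that arithmetic, not part of the Phoronix Test Suite itself; the helper name `spread_percent` and the choice of three sample rows are illustrative assumptions, while the numbers are copied from the overview.

```python
# Values copied from the results overview (runs 1, 2, 3).
# The selection of rows is arbitrary and only for illustration.
results = {
    "build2: Time To Compile (s)": (388.393, 385.584, 388.953),
    "onednn: IP Shapes 3D - f32 - CPU (ms)": (15.9142, 15.6513, 18.8928),
    "vkmark: 1920 x 1080 (score)": (292, 292, 290),
}


def spread_percent(values):
    """Return the gap between the largest and smallest value,
    as a percentage of the smallest value."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0


for name, runs in results.items():
    print(f"{name}: {spread_percent(runs):.2f}% spread across runs")
```

A small spread suggests the runs are effectively repeats of one another; a large one, as in the oneDNN row, points at a run worth inspecting in the per-test block.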
BRL-CAD 7.30.8, VGR Performance Metric
VGR Performance Metric, More Is Better
  Run  Result
  3    42169
  2    42365
  1    42373
  1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lXi -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm

Build2 0.13, Time To Compile
Seconds, Fewer Is Better
  Run  Result      SE +/-    N
  3    388.95      1.04      3
  1    388.39      2.56      3
  2    385.58      1.77      3

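The "SE +/-" and "N" figures in the result blocks are a standard error and the number of trial runs behind each average. The following is a minimal sketch of how such a figure can be derived, assuming the textbook definition (sample standard deviation divided by the square root of N); the source does not state which estimator the Phoronix Test Suite uses, and the timings below are hypothetical, not taken from this result file.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical compile times in seconds (NOT from this result file).
hypothetical_times = [386.2, 389.9, 389.1]

mean = statistics.mean(hypothetical_times)
se = standard_error(hypothetical_times)
print(f"{mean:.2f} s, SE +/- {se:.2f}, N = {len(hypothetical_times)}")
```

Read this way, two averages that differ by less than their standard errors, as with several of the compile-time results, are hard to tell apart.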
CLOMP 1.2, Static OMP Speedup
Speedup, More Is Better
  Run  Result      SE +/-    N
  1    0.9         0.04      11
  2    1.1         0.04      9
  3    1.1         0.02      12
  1. (CC) gcc options: -fopenmp -O3 -lm

Coremark 1.0, CoreMark Size 666 - Iterations Per Second
Iterations/Sec, More Is Better
  Run  Result      SE +/-    N
  2    144054.76   324.47    3
  3    144381.89   511.87    3
  1    144430.97   201.95    3
  1. (CC) gcc options: -O2 -lrt" -lrt

Monkey Audio Encoding 3.99.6, WAV To APE
Seconds, Fewer Is Better
  Run  Result      SE +/-    N
  1    14.06       0.09      5
  3    13.95       0.04      5
  2    13.87       0.01      5
  1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt

NCNN 20201218, Target: CPU - Model: mobilenet
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    39.77       0.05      3   38.23     54.99
  1    39.30       0.17      3   37.67     52.04
  2    39.29       0.17      3   37.69     53.53
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU-v2-v2 - Model: mobilenet-v2
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    11.04       0.11      3   9.49      23.36
  2    10.66       0.12      3   9.15      22.02
  1    10.42       0.01      3   9         26.06
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU-v3-v3 - Model: mobilenet-v3
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    8.91        0.08      3   7.68      24.14
  1    8.74        0.05      3   7.43      21.86
  2    8.66        0.03      3   7.66      11.93
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU - Model: shufflenet-v2
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    11.44       0.10      3   10.2      14.3
  2    11.42       0.09      3   9.78      21.73
  1    11.41       0.05      3   9.61      24.27
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU - Model: mnasnet
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    8.65        0.14      3   7.52      22.41
  2    8.42        0.06      3   7.43      10.7
  1    8.37        0.03      3   7.43      11.53
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU - Model: efficientnet-b0
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    14.43       0.07      3   12.96     26.1
  2    14.32       0.17      3   12.71     21.22
  1    14.26       0.06      3   12.32     58.63
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU - Model: blazeface
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  1    3.29        0.05      3   2.93      5.59
  3    3.28        0.07      3   2.87      6.08
  2    3.17        0.02      3   2.84      5.58
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU - Model: googlenet
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  2    30.43       0.05      3   27.97     44.24
  3    30.19       0.38      3   27.95     43.65
  1    29.65       0.12      3   27.35     44.36
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU - Model: vgg16
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    128.02      0.28      3   124.75    141.83
  2    127.55      0.12      3   124.34    142.98
  1    127.46      0.47      3   123.62    155.82
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU - Model: resnet18
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  2    31.42       0.35      3   29.38     44.46
  3    30.75       0.49      3   28.65     43.26
  1    30.48       0.48      3   28.51     48.91
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU - Model: alexnet
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    25.17       0.07      3   23.93     36.81
  2    25.16       0.03      3   23.92     36.39
  1    25.10       0.03      3   23.66     35.12
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU - Model: resnet50
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    64.10       0.63      3   60.8      78.22
  2    63.27       0.36      3   60.14     79.04
  1    61.85       0.32      3   59.67     77.54
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU - Model: yolov4-tiny
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    54.23       0.49      3   51.36     69.51
  2    52.98       0.61      3   49.76     62.27
  1    51.58       0.60      3   49.32     73.38
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU - Model: squeezenet_ssd
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    41.57       0.08      3   39.79     51.01
  1    40.61       0.38      3   38.69     57.91
  2    40.30       0.31      3   38.88     50.48
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: CPU - Model: regnety_400m
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    20.90       0.26      3   19.89     33.22
  1    20.69       0.10      3   19.76     32.47
  2    20.55       0.11      3   19.68     41.87
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU - Model: mobilenet
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    39.64       0.04      3   38.24     53.68
  2    39.55       0.08      3   37.98     52.81
  1    39.24       0.08      3   37.61     75.08
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU-v2-v2 - Model: mobilenet-v2
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    10.89       0.09      3   9.36      27.33
  1    10.56       0.07      3   9.17      22.12
  2    10.50       0.13      3   9.06      19.24
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU-v3-v3 - Model: mobilenet-v3
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    8.84        0.04      3   7.72      20.26
  1    8.77        0.13      3   7.65      16.44
  2    8.72        0.06      3   7.36      21.89
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU - Model: shufflenet-v2
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  1    11.51       0.04      3   9.89      25.37
  3    11.49       0.07      3   10.27     24.45
  2    11.42       0.09      3   10.18     14.51
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU - Model: mnasnet
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  1    8.53        0.04      3   7.67      11.27
  3    8.43        0.06      3   7.18      23.58
  2    8.41        0.08      3   7.37      18.13
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU - Model: efficientnet-b0
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  1    14.40       0.06      3   12.89     32.02
  2    14.28       0.05      3   12.77     26.63
  3    14.26       0.09      3   12.68     28.42
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU - Model: blazeface
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  1    3.25        0.04      3   2.92      5.82
  3    3.24        0.05      3   2.74      13.7
  2    3.18        0.02      3   2.85      7.23
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU - Model: googlenet
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  2    30.53       0.15      3   27.72     49.16
  3    30.37       0.30      3   27.85     41.19
  1    29.81       0.08      3   27.66     43.24
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU - Model: vgg16
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  2    128.34      0.19      3   124.8     144.28
  3    127.97      0.22      3   124.99    146.71
  1    126.48      0.12      3   123.52    147.25
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU - Model: resnet18
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    30.69       0.32      3   29.02     42.89
  2    30.64       0.25      3   29        45.72
  1    30.47       0.34      3   28.73     39.99
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU - Model: alexnet
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  2    25.62       0.30      3   24.03     37.26
  3    25.11       0.03      3   23.93     31.49
  1    25.09       0.03      3   24.07     34.98
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU - Model: resnet50
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  2    63.55       0.50      3   59.8      83.82
  3    63.34       0.70      3   59.91     77.11
  1    63.21       1.40      3   59.61     81.53
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU - Model: yolov4-tiny
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    53.45       0.37      3   50.95     68.33
  2    53.05       0.51      3   49.64     66.59
  1    51.83       0.34      3   49.56     65.5
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU - Model: squeezenet_ssd
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    42.10       0.42      3   39.99     60.7
  2    41.27       0.71      3   38.91     54.13
  1    40.27       0.38      3   38.86     55.4
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218, Target: Vulkan GPU - Model: regnety_400m
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN       MAX
  3    20.88       0.11      3   19.92     33.66
  2    20.75       0.10      3   19.66     33.43
  1    20.70       0.08      3   19.92     33.36
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

Node.js V8 Web Tooling Benchmark
runs/s, More Is Better
  Run  Result      SE +/-    N
  3    9.24        0.08      3
  2    9.29        0.01      3
  1    9.30        0.08      3
  1. Nodejs v12.18.2

oneDNN 2.0, Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  3    11.59       0.02      3   9.81
  2    11.36       0.12      3   9.64
  1    11.17       0.05      3   9.59
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  3    18.89       0.18      9   17.56
  1    15.91       0.19      3   14.84
  2    15.65       0.23      3   14.56
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  3    5.99872     0.02064   3   5.39
  1    5.97793     0.00573   3   5.37
  2    5.94146     0.01650   3   5.35
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  3    4.99786     0.00347   3   4.39
  1    4.72613     0.00568   3   4.17
  2    4.69355     0.01254   3   4.15
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  3    32.58       0.04      3   31.18
  2    32.13       0.02      3   30.76
  1    32.12       0.03      3   30.77
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  1    13.55       0.05      3   11.76
  2    13.36       0.02      3   11.47
  3    13.29       0.04      3   11.34
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  3    19.07       0.04      3   17.71
  2    18.95       0.02      3   17.8
  1    18.93       0.01      3   17.94
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  1    31.03       0.04      3   29.17
  3    30.95       0.25      3   29.08
  2    30.93       0.03      3   29.31
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  2    14.83       0.16      7   12.64
  3    14.57       0.19      3   12.66
  1    14.42       0.08      3   12.59
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  1    12.96       0.01      3   11.96
  2    12.95       0.02      3   11.96
  3    12.90       0.05      3   11.73
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  3    10683.5     103.29    3   10271.5
  2    10472.4     143.78    3   10130.6
  1    10419.2     107.95    3   10130.6
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  3    5971.44     76.96     3   5746.72
  2    5922.99     39.11     3   5689.07
  1    5784.04     67.54     3   5574.2
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  3    10853.5     89.46     3   10230.8
  1    10639.4     126.69    3   10142.8
  2    10420.8     55.37     3   10122.1
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  3    6041.90     103.03    3   5795.41
  2    5954.52     28.84     3   5716.33
  1    5730.72     34.96     3   5610.11
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  3    8.85443     0.08022   10  7.41
  1    8.10435     0.02583   3   7.33
  2    8.09766     0.00568   3   7.36
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  3    10925.1     134.60    3   10351.4
  2    10647.0     79.82     3   10136.8
  1    10641.5     84.33     3   10074.8
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  3    5981.80     45.12     3   5790.74
  1    5907.30     36.51     3   5685.76
  2    5822.13     36.61     3   5676.41
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0, Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
  Run  Result      SE +/-    N   MIN
  2    7.33567     0.02785   3   6.13
  1    7.32951     0.01240   3   6.09
  3    7.23631     0.02346   3   6.08
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

Opus Codec Encoding 1.3.1, WAV To Opus Encode
Seconds, Fewer Is Better
  Run  Result      SE +/-    N
  2    9.133       0.008     5
  1    9.132       0.019     5
  3    9.113       0.008     5
  1. (CXX) g++ options: -fvisibility=hidden -logg -lm

simdjson 0.7.1, Throughput Test: Kostya
GB/s, More Is Better
  Run  Result      SE +/-    N
  1    0.60        0.00      3
  2    0.60        0.00      3
  3    0.60        0.00      3
  1. (CXX) g++ options: -O3 -pthread

simdjson 0.7.1, Throughput Test: LargeRandom
GB/s, More Is Better
  Run  Result      SE +/-    N
  1    0.4         0.00      3
  2    0.4         0.00      3
  3    0.4         0.00      3
  1. (CXX) g++ options: -O3 -pthread

simdjson 0.7.1, Throughput Test: PartialTweets
GB/s, More Is Better
  Run  Result      SE +/-    N
  1    0.66        0.00      3
  2    0.66        0.00      3
  3    0.66        0.00      3
  1. (CXX) g++ options: -O3 -pthread

simdjson 0.7.1, Throughput Test: DistinctUserID
GB/s, More Is Better
  Run  Result      SE +/-    N
  1    0.68        0.00      3
  2    0.68        0.00      3
  3    0.68        0.00      3
  1. (CXX) g++ options: -O3 -pthread

SQLite Speedtest 3.30, Timed Time - Size 1,000
Seconds, Fewer Is Better
  Run  Result      SE +/-    N
  2    81.52       0.42      3
  1    80.19       0.66      3
  3    80.10       0.67      3
  1. (CC) gcc options: -O2 -ldl -lz -lpthread

Timed Eigen Compilation 3.3.9, Time To Compile
Seconds, Fewer Is Better
  Run  Result      SE +/-    N
  3    97.30       0.28      3
  2    97.29       0.33      3
  1    96.66       0.15      3

Timed FFmpeg Compilation 4.2.2, Time To Compile
Seconds, Fewer Is Better
  Run  Result      SE +/-    N
  3    146.98      1.80      3
  2    146.30      2.25      3
  1    145.94      2.20      3

Timed HMMer Search 3.3.1, Pfam Database Search
Seconds, Fewer Is Better
  Run  Result      SE +/-    N
  2    137.94      0.10      3
  3    137.90      0.03      3
  1    137.77      0.04      3
  1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm

VKMark 2020-05-21, Resolution: 1920 x 1080
VKMark Score, More Is Better
  Run  Result
  3    290
  1    292
  2    292
  SE +/- 0.33, N = 3; SE +/- 0.67, N = 3
  1. (CXX) g++ options: -pthread -ldl -pipe -std=c++14 -MD -MQ -MF

WavPack Audio Encoding 5.3, WAV To WavPack
Seconds, Fewer Is Better
  Run  Result      SE +/-    N
  3    15.56       0.04      5
  1    15.55       0.03      5
  2    15.54       0.03      5
  1. (CXX) g++ options: -rdynamic

Phoronix Test Suite v10.8.4