Intel Xeon Gold 6226R testing with a Supermicro X11SPL-F v1.02 (3.1 BIOS) and llvmpipe on Ubuntu 20.04 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:

    phoronix-test-suite benchmark 2012200-HA-XEONGOLD615
HTML result view exported from: https://openbenchmarking.org/result/2012200-HA-XEONGOLD615&sor&grt
Xeon Gold 6226R December - System Details (identical for runs 1, 2, 3)

Processor: Intel Xeon Gold 6226R @ 3.90GHz (16 Cores / 32 Threads)
Motherboard: Supermicro X11SPL-F v1.02 (3.1 BIOS)
Chipset: Intel Sky Lake-E DMI3 Registers
Memory: 188GB
Disk: 3841GB Micron_9300_MTFDHAL3T8TDP
Graphics: llvmpipe
Monitor: VE228
Network: 2 x Intel I210
OS: Ubuntu 20.04
Kernel: 5.9.0-050900rc6daily20200921-generic (x86_64) 20200920
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
OpenGL: 3.3 Mesa 20.0.8 (LLVM 10.0.0 256 bits)
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_cpufreq ondemand - CPU Microcode: 0x5002f01
Security Details: itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Mitigation of TSX disabled
Xeon Gold 6226R December - Result Overview (values for runs 1 | 2 | 3, in each test's native unit; see the detailed results below)

brl-cad: VGR Performance Metric: 164354 | 163713 | -
build2: Time To Compile: 95.580 | 95.686 | 95.528
clomp: Static OMP Speedup: 26.8 | 26.2 | 26.5
coremark: CoreMark Size 666 - Iterations Per Second: 537952.810837 | 535255.625888 | 534883.155085
encode-ape: WAV To APE: 17.528 | 17.535 | 17.543
ncnn: CPU - mobilenet: 17.77 | 16.96 | 17.73
ncnn: CPU-v2-v2 - mobilenet-v2: 6.07 | 5.96 | 5.89
ncnn: CPU-v3-v3 - mobilenet-v3: 5.21 | 5.14 | 5.07
ncnn: CPU - shufflenet-v2: 5.94 | 5.97 | 6.01
ncnn: CPU - mnasnet: 5.56 | 5.37 | 5.34
ncnn: CPU - efficientnet-b0: 7.70 | 7.26 | 7.20
ncnn: CPU - blazeface: 2.93 | 2.88 | 2.90
ncnn: CPU - googlenet: 14.56 | 12.99 | 13.09
ncnn: CPU - vgg16: 29.65 | 28.04 | 29.24
ncnn: CPU - resnet18: 10.58 | 9.41 | 9.49
ncnn: CPU - alexnet: 8.07 | 6.73 | 6.99
ncnn: CPU - resnet50: 20.27 | 18.93 | 20.38
ncnn: CPU - yolov4-tiny: 24.72 | 24.40 | 25.22
ncnn: CPU - squeezenet_ssd: 16.73 | 16.45 | 16.94
ncnn: CPU - regnety_400m: 27.19 | 27.13 | 27.41
node-web-tooling: 10.78 | 10.69 | 10.68
onednn: IP Shapes 1D - f32 - CPU: 2.35358 | 2.35038 | 2.34990
onednn: IP Shapes 3D - f32 - CPU: 3.14816 | 3.13707 | 3.15835
onednn: IP Shapes 1D - u8s8f32 - CPU: 0.496245 | 0.491459 | 0.495733
onednn: IP Shapes 3D - u8s8f32 - CPU: 1.24491 | 1.24781 | 1.24380
onednn: IP Shapes 1D - bf16bf16bf16 - CPU: 5.61346 | 5.60704 | 5.60477
onednn: IP Shapes 3D - bf16bf16bf16 - CPU: 2.59610 | 2.58806 | 2.61350
onednn: Convolution Batch Shapes Auto - f32 - CPU: 4.34336 | 4.34082 | 4.35080
onednn: Deconvolution Batch shapes_1d - f32 - CPU: 2.75497 | 2.76210 | 2.75619
onednn: Deconvolution Batch shapes_3d - f32 - CPU: 3.21126 | 3.20825 | 3.21278
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU: 4.16656 | 4.16237 | 4.17445
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU: 0.526692 | 0.527252 | 0.526437
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU: 0.867040 | 0.860663 | 0.862856
onednn: Recurrent Neural Network Training - f32 - CPU: 1643.22 | 1645.08 | 1647.12
onednn: Recurrent Neural Network Inference - f32 - CPU: 922.992 | 923.536 | 922.270
onednn: Recurrent Neural Network Training - u8s8f32 - CPU: 1643.59 | 1645.64 | 1645.91
onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU: 9.42174 | 9.42301 | 9.42277
onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU: 11.2912 | 11.2891 | 11.3162
onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU: 12.5274 | 12.5418 | 12.5335
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU: 922.874 | 923.369 | 923.336
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU: 0.974624 | 0.973874 | 0.977073
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU: 1645.61 | 1643.67 | 1644.83
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU: 921.674 | 923.438 | 920.863
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU: 0.486440 | 0.477951 | 0.479216
onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU: 2.05419 | 2.05772 | 2.05675
simdjson: Kostya: 0.56 | 0.56 | 0.56
simdjson: LargeRand: 0.39 | 0.39 | 0.39
simdjson: PartialTweets: 0.57 | 0.57 | 0.57
simdjson: DistinctUserID: 0.58 | 0.58 | 0.58
sqlite-speedtest: Timed Time - Size 1,000: 65.363 | 65.852 | 65.594
build-eigen: Time To Compile: 85.527 | 85.551 | 85.784
build-ffmpeg: Time To Compile: 43.497 | 43.236 | 43.454
hmmer: Pfam Database Search: 174.227 | 174.169 | 174.447
mafft: Multiple Sequence Alignment - LSU RNA: 10.637 | 10.590 | 10.549
encode-wavpack: WAV To WavPack: 16.731 | 16.753 | 16.779
BRL-CAD 7.30.8 - VGR Performance Metric (VGR Performance Metric, More Is Better)
  Run 1: 164354
  Run 2: 163713
  1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm

Build2 0.13 - Time To Compile (Seconds, Fewer Is Better)
  Run 3: 95.53 (SE +/- 0.15, N = 3)
  Run 1: 95.58 (SE +/- 0.24, N = 3)
  Run 2: 95.69 (SE +/- 0.25, N = 3)

CLOMP 1.2 - Static OMP Speedup (Speedup, More Is Better)
  Run 1: 26.8 (SE +/- 0.32, N = 3)
  Run 3: 26.5 (SE +/- 0.12, N = 3)
  Run 2: 26.2 (SE +/- 0.42, N = 3)
  1. (CC) gcc options: -fopenmp -O3 -lm

Coremark 1.0 - CoreMark Size 666 - Iterations Per Second (Iterations/Sec, More Is Better)
  Run 1: 537952.81 (SE +/- 746.08, N = 3)
  Run 2: 535255.63 (SE +/- 1073.23, N = 3)
  Run 3: 534883.16 (SE +/- 2681.32, N = 3)
  1. (CC) gcc options: -O2 -lrt

Monkey Audio Encoding 3.99.6 - WAV To APE (Seconds, Fewer Is Better)
  Run 1: 17.53 (SE +/- 0.01, N = 5)
  Run 2: 17.54 (SE +/- 0.02, N = 5)
  Run 3: 17.54 (SE +/- 0.00, N = 5)
  1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
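The "SE +/- …, N = …" annotations in these result blocks are standard errors of the mean over each test's internal trial runs. A minimal sketch of how such a figure is computed, assuming the conventional definition SE = sample standard deviation / sqrt(N) (the sample timings below are hypothetical, not taken from this result file):

```python
import math

def standard_error(samples):
    """Standard error of the mean: sample (Bessel-corrected) stddev / sqrt(N)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return math.sqrt(variance) / math.sqrt(n)

# Hypothetical three compile-time trials (seconds)
print(round(standard_error([95.4, 95.6, 95.8]), 4))  # ~0.1155
```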
NCNN 20201218 - Target: CPU - Model: mobilenet (ms, Fewer Is Better)
  Run 2: 16.96 (SE +/- 0.02, N = 3; MIN: 16.84 / MAX: 18.68)
  Run 3: 17.73 (SE +/- 0.27, N = 3; MIN: 16.96 / MAX: 19.65)
  Run 1: 17.77 (SE +/- 0.12, N = 3; MIN: 17.45 / MAX: 18.07)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU-v2-v2 - Model: mobilenet-v2 (ms, Fewer Is Better)
  Run 3: 5.89 (SE +/- 0.02, N = 3; MIN: 5.64 / MAX: 9.39)
  Run 2: 5.96 (SE +/- 0.01, N = 3; MIN: 5.64 / MAX: 9.74)
  Run 1: 6.07 (SE +/- 0.08, N = 3; MIN: 5.65 / MAX: 23.59)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU-v3-v3 - Model: mobilenet-v3 (ms, Fewer Is Better)
  Run 3: 5.07 (SE +/- 0.01, N = 3; MIN: 4.93 / MAX: 8.45)
  Run 2: 5.14 (SE +/- 0.05, N = 3; MIN: 4.94 / MAX: 8.76)
  Run 1: 5.21 (SE +/- 0.06, N = 3; MIN: 4.94 / MAX: 9.16)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: shufflenet-v2 (ms, Fewer Is Better)
  Run 1: 5.94 (SE +/- 0.01, N = 3; MIN: 5.83 / MAX: 9.49)
  Run 2: 5.97 (SE +/- 0.04, N = 2; MIN: 5.86 / MAX: 9.64)
  Run 3: 6.01 (SE +/- 0.03, N = 3; MIN: 5.83 / MAX: 9.91)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: mnasnet (ms, Fewer Is Better)
  Run 3: 5.34 (SE +/- 0.02, N = 3; MIN: 5.16 / MAX: 9.17)
  Run 2: 5.37 (SE +/- 0.02, N = 3; MIN: 5.18 / MAX: 9.36)
  Run 1: 5.56 (SE +/- 0.05, N = 3; MIN: 5.21 / MAX: 9.44)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: efficientnet-b0 (ms, Fewer Is Better)
  Run 3: 7.20 (SE +/- 0.01, N = 3; MIN: 7 / MAX: 9)
  Run 2: 7.26 (SE +/- 0.01, N = 3; MIN: 7.01 / MAX: 8.01)
  Run 1: 7.70 (SE +/- 0.07, N = 3; MIN: 7.35 / MAX: 8.6)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: blazeface (ms, Fewer Is Better)
  Run 2: 2.88 (SE +/- 0.01, N = 3; MIN: 2.82 / MAX: 3.69)
  Run 3: 2.90 (SE +/- 0.01, N = 3; MIN: 2.83 / MAX: 4.6)
  Run 1: 2.93 (SE +/- 0.05, N = 3; MIN: 2.85 / MAX: 3.18)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: googlenet (ms, Fewer Is Better)
  Run 2: 12.99 (SE +/- 0.01, N = 3; MIN: 12.91 / MAX: 13.36)
  Run 3: 13.09 (SE +/- 0.01, N = 3; MIN: 12.88 / MAX: 15.13)
  Run 1: 14.56 (SE +/- 0.46, N = 3; MIN: 13.58 / MAX: 15.21)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: vgg16 (ms, Fewer Is Better)
  Run 2: 28.04 (SE +/- 0.03, N = 3; MIN: 27.76 / MAX: 46.41)
  Run 3: 29.24 (SE +/- 0.36, N = 3; MIN: 27.78 / MAX: 48.69)
  Run 1: 29.65 (SE +/- 0.50, N = 3; MIN: 28.54 / MAX: 34.02)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: resnet18 (ms, Fewer Is Better)
  Run 2: 9.41 (SE +/- 0.01, N = 3; MIN: 9.33 / MAX: 10.24)
  Run 3: 9.49 (SE +/- 0.00, N = 3; MIN: 9.34 / MAX: 11.41)
  Run 1: 10.58 (SE +/- 0.42, N = 3; MIN: 9.67 / MAX: 12.86)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: alexnet (ms, Fewer Is Better)
  Run 2: 6.73 (SE +/- 0.01, N = 3; MIN: 6.67 / MAX: 8.33)
  Run 3: 6.99 (SE +/- 0.22, N = 3; MIN: 6.66 / MAX: 23.91)
  Run 1: 8.07 (SE +/- 0.01, N = 3; MIN: 8.01 / MAX: 9.86)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: resnet50 (ms, Fewer Is Better)
  Run 2: 18.93 (SE +/- 0.08, N = 3; MIN: 18.61 / MAX: 38.46)
  Run 1: 20.27 (SE +/- 0.51, N = 3; MIN: 19.1 / MAX: 21.77)
  Run 3: 20.38 (SE +/- 0.52, N = 3; MIN: 19.05 / MAX: 23.49)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: yolov4-tiny (ms, Fewer Is Better)
  Run 2: 24.40 (SE +/- 0.38, N = 3; MIN: 23.06 / MAX: 27.46)
  Run 1: 24.72 (SE +/- 0.51, N = 3; MIN: 23.36 / MAX: 28.82)
  Run 3: 25.22 (SE +/- 0.35, N = 3; MIN: 23.44 / MAX: 28.59)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: squeezenet_ssd (ms, Fewer Is Better)
  Run 2: 16.45 (SE +/- 0.00, N = 3; MIN: 16.33 / MAX: 19.11)
  Run 1: 16.73 (SE +/- 0.11, N = 3; MIN: 16.43 / MAX: 17.04)
  Run 3: 16.94 (SE +/- 0.15, N = 3; MIN: 16.44 / MAX: 19)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread

NCNN 20201218 - Target: CPU - Model: regnety_400m (ms, Fewer Is Better)
  Run 2: 27.13 (SE +/- 0.11, N = 3; MIN: 26.7 / MAX: 28.12)
  Run 1: 27.19 (SE +/- 0.06, N = 3; MIN: 26.8 / MAX: 29.86)
  Run 3: 27.41 (SE +/- 0.11, N = 3; MIN: 26.75 / MAX: 31.13)
  1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
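NCNN reports mean inference latency in milliseconds; a per-second throughput figure is sometimes easier to reason about. A small sketch of the ms-to-inferences/second conversion (plain arithmetic, not a figure NCNN itself reports):

```python
def throughput_per_sec(latency_ms):
    """Convert a mean inference latency in milliseconds to inferences/second."""
    return 1000.0 / latency_ms

# e.g. a ~16.96 ms mobilenet latency corresponds to roughly 59 inferences/s
print(round(throughput_per_sec(16.96), 1))
```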
Node.js V8 Web Tooling Benchmark (runs/s, More Is Better)
  Run 1: 10.78 (SE +/- 0.04, N = 3)
  Run 2: 10.69 (SE +/- 0.02, N = 3)
  Run 3: 10.68 (SE +/- 0.05, N = 3)
  1. Nodejs v10.19.0
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 3: 2.34990 (SE +/- 0.00268, N = 3; MIN: 2.25)
  Run 2: 2.35038 (SE +/- 0.00411, N = 3; MIN: 2.22)
  Run 1: 2.35358 (SE +/- 0.00656, N = 3; MIN: 2.26)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 3.13707 (SE +/- 0.00567, N = 3; MIN: 3.09)
  Run 1: 3.14816 (SE +/- 0.01117, N = 3; MIN: 3.1)
  Run 3: 3.15835 (SE +/- 0.00910, N = 3; MIN: 3.11)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 0.491459 (SE +/- 0.000604, N = 3; MIN: 0.47)
  Run 3: 0.495733 (SE +/- 0.000320, N = 3; MIN: 0.47)
  Run 1: 0.496245 (SE +/- 0.000595, N = 3; MIN: 0.47)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 3: 1.24380 (SE +/- 0.00126, N = 3; MIN: 1.2)
  Run 1: 1.24491 (SE +/- 0.00268, N = 3; MIN: 1.2)
  Run 2: 1.24781 (SE +/- 0.00242, N = 3; MIN: 1.2)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Run 3: 5.60477 (SE +/- 0.00888, N = 3; MIN: 5.48)
  Run 2: 5.60704 (SE +/- 0.00948, N = 3; MIN: 5.44)
  Run 1: 5.61346 (SE +/- 0.01299, N = 3; MIN: 5.5)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 2.58806 (SE +/- 0.00862, N = 3; MIN: 2.53)
  Run 1: 2.59610 (SE +/- 0.00587, N = 3; MIN: 2.51)
  Run 3: 2.61350 (SE +/- 0.00522, N = 3; MIN: 2.56)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 4.34082 (SE +/- 0.01378, N = 3; MIN: 4.27)
  Run 1: 4.34336 (SE +/- 0.01713, N = 3; MIN: 4.27)
  Run 3: 4.35080 (SE +/- 0.01473, N = 3; MIN: 4.28)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 2.75497 (SE +/- 0.00248, N = 3; MIN: 2.69)
  Run 3: 2.75619 (SE +/- 0.00159, N = 3; MIN: 2.68)
  Run 2: 2.76210 (SE +/- 0.00638, N = 3; MIN: 2.69)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 3.20825 (SE +/- 0.00574, N = 3; MIN: 3.16)
  Run 1: 3.21126 (SE +/- 0.00363, N = 3; MIN: 3.17)
  Run 3: 3.21278 (SE +/- 0.01283, N = 3; MIN: 3.16)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 4.16237 (SE +/- 0.01546, N = 3; MIN: 4.1)
  Run 1: 4.16656 (SE +/- 0.01549, N = 3; MIN: 4.09)
  Run 3: 4.17445 (SE +/- 0.01564, N = 3; MIN: 4.1)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 3: 0.526437 (SE +/- 0.000172, N = 3; MIN: 0.51)
  Run 1: 0.526692 (SE +/- 0.000759, N = 3; MIN: 0.51)
  Run 2: 0.527252 (SE +/- 0.000712, N = 3; MIN: 0.51)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 0.860663 (SE +/- 0.011319, N = 5; MIN: 0.82)
  Run 3: 0.862856 (SE +/- 0.008326, N = 3; MIN: 0.83)
  Run 1: 0.867040 (SE +/- 0.007447, N = 3; MIN: 0.83)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 1643.22 (SE +/- 0.66, N = 3; MIN: 1637.46)
  Run 2: 1645.08 (SE +/- 0.79, N = 3; MIN: 1639.15)
  Run 3: 1647.12 (SE +/- 4.18, N = 3; MIN: 1640.3)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 3: 922.27 (SE +/- 1.52, N = 3; MIN: 917.55)
  Run 1: 922.99 (SE +/- 0.36, N = 3; MIN: 920.22)
  Run 2: 923.54 (SE +/- 1.28, N = 3; MIN: 919.76)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 1643.59 (SE +/- 2.60, N = 3; MIN: 1637.17)
  Run 2: 1645.64 (SE +/- 4.86, N = 3; MIN: 1635.99)
  Run 3: 1645.91 (SE +/- 1.22, N = 3; MIN: 1639.71)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 9.42174 (SE +/- 0.01674, N = 3; MIN: 9.07)
  Run 3: 9.42277 (SE +/- 0.00825, N = 3; MIN: 9.06)
  Run 2: 9.42301 (SE +/- 0.00522, N = 3; MIN: 9.08)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 11.29 (SE +/- 0.01, N = 3; MIN: 11.17)
  Run 1: 11.29 (SE +/- 0.00, N = 3; MIN: 11.16)
  Run 3: 11.32 (SE +/- 0.01, N = 3; MIN: 11.17)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 12.53 (SE +/- 0.01, N = 3; MIN: 12.41)
  Run 3: 12.53 (SE +/- 0.01, N = 3; MIN: 12.41)
  Run 2: 12.54 (SE +/- 0.01, N = 3; MIN: 12.41)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 922.87 (SE +/- 0.33, N = 3; MIN: 919.98)
  Run 3: 923.34 (SE +/- 0.27, N = 3; MIN: 920.26)
  Run 2: 923.37 (SE +/- 0.50, N = 3; MIN: 919.79)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 0.973874 (SE +/- 0.001190, N = 3; MIN: 0.94)
  Run 1: 0.974624 (SE +/- 0.001208, N = 3; MIN: 0.94)
  Run 3: 0.977073 (SE +/- 0.001996, N = 3; MIN: 0.94)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 1643.67 (SE +/- 1.59, N = 3; MIN: 1637.61)
  Run 3: 1644.83 (SE +/- 0.87, N = 3; MIN: 1640.5)
  Run 1: 1645.61 (SE +/- 0.69, N = 3; MIN: 1641.21)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Run 3: 920.86 (SE +/- 0.78, N = 3; MIN: 916.71)
  Run 1: 921.67 (SE +/- 0.31, N = 3; MIN: 918.58)
  Run 2: 923.44 (SE +/- 1.61, N = 3; MIN: 918.55)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
  Run 2: 0.477951 (SE +/- 0.005102, N = 3; MIN: 0.46)
  Run 3: 0.479216 (SE +/- 0.004438, N = 3; MIN: 0.46)
  Run 1: 0.486440 (SE +/- 0.001972, N = 3; MIN: 0.47)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
  Run 1: 2.05419 (SE +/- 0.00388, N = 3; MIN: 1.97)
  Run 3: 2.05675 (SE +/- 0.00357, N = 3; MIN: 1.97)
  Run 2: 2.05772 (SE +/- 0.00591, N = 3; MIN: 1.97)
  1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
simdjson 0.7.1 - Throughput Test: Kostya (GB/s, More Is Better)
  Run 3: 0.56 (SE +/- 0.00, N = 3)
  Run 2: 0.56 (SE +/- 0.00, N = 3)
  Run 1: 0.56 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -pthread

simdjson 0.7.1 - Throughput Test: LargeRandom (GB/s, More Is Better)
  Run 3: 0.39 (SE +/- 0.00, N = 3)
  Run 2: 0.39 (SE +/- 0.00, N = 3)
  Run 1: 0.39 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -pthread

simdjson 0.7.1 - Throughput Test: PartialTweets (GB/s, More Is Better)
  Run 3: 0.57 (SE +/- 0.00, N = 3)
  Run 2: 0.57 (SE +/- 0.00, N = 3)
  Run 1: 0.57 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -pthread

simdjson 0.7.1 - Throughput Test: DistinctUserID (GB/s, More Is Better)
  Run 3: 0.58 (SE +/- 0.00, N = 3)
  Run 2: 0.58 (SE +/- 0.00, N = 3)
  Run 1: 0.58 (SE +/- 0.00, N = 3)
  1. (CXX) g++ options: -O3 -pthread
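A GB/s parsing rate like the simdjson results above translates directly into an expected per-document parse time. A sketch of that conversion using the ~0.56 GB/s Kostya figure (plain arithmetic; the 1 MB document size is hypothetical):

```python
def parse_time_ms(size_bytes, rate_gb_per_s):
    """Expected time in ms to parse a document at a given GB/s throughput."""
    return size_bytes / (rate_gb_per_s * 1e9) * 1000.0

# A hypothetical 1 MB JSON document at 0.56 GB/s takes ~1.79 ms
print(round(parse_time_ms(1_000_000, 0.56), 2))
```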
SQLite Speedtest 3.30 - Timed Time - Size 1,000 (Seconds, Fewer Is Better)
  Run 1: 65.36 (SE +/- 0.05, N = 3)
  Run 3: 65.59 (SE +/- 0.20, N = 3)
  Run 2: 65.85 (SE +/- 0.12, N = 3)
  1. (CC) gcc options: -O2 -ldl -lz -lpthread

Timed Eigen Compilation 3.3.9 - Time To Compile (Seconds, Fewer Is Better)
  Run 1: 85.53 (SE +/- 0.12, N = 3)
  Run 2: 85.55 (SE +/- 0.22, N = 3)
  Run 3: 85.78 (SE +/- 0.23, N = 3)

Timed FFmpeg Compilation 4.2.2 - Time To Compile (Seconds, Fewer Is Better)
  Run 2: 43.24 (SE +/- 0.09, N = 3)
  Run 3: 43.45 (SE +/- 0.10, N = 3)
  Run 1: 43.50 (SE +/- 0.15, N = 3)

Timed HMMer Search 3.3.1 - Pfam Database Search (Seconds, Fewer Is Better)
  Run 2: 174.17 (SE +/- 0.41, N = 3)
  Run 1: 174.23 (SE +/- 0.36, N = 3)
  Run 3: 174.45 (SE +/- 0.20, N = 3)
  1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm

Timed MAFFT Alignment 7.471 - Multiple Sequence Alignment - LSU RNA (Seconds, Fewer Is Better)
  Run 3: 10.55 (SE +/- 0.03, N = 3)
  Run 2: 10.59 (SE +/- 0.08, N = 3)
  Run 1: 10.64 (SE +/- 0.08, N = 3)
  1. (CC) gcc options: -std=c99 -O3 -lm -lpthread

WavPack Audio Encoding 5.3 - WAV To WavPack (Seconds, Fewer Is Better)
  Run 1: 16.73 (SE +/- 0.01, N = 5)
  Run 2: 16.75 (SE +/- 0.01, N = 5)
  Run 3: 16.78 (SE +/- 0.01, N = 5)
  1. (CXX) g++ options: -rdynamic
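Run-to-run variation across the three configurations is small throughout this file. One simple way to quantify that is the relative spread, (max - min) / mean; a sketch using the three WavPack encode times above (generic arithmetic, not a Phoronix Test Suite metric):

```python
def relative_spread(values):
    """Relative spread of a set of results: (max - min) / mean, in percent."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean * 100.0

# The three WavPack encode times differ by roughly 0.3%
print(round(relative_spread([16.73, 16.75, 16.78]), 2))
```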
Phoronix Test Suite v10.8.4