Intel Xeon Gold 6226R testing with a Supermicro X11SPL-F v1.02 (3.1 BIOS) and llvmpipe on Ubuntu 20.04 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the
Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2012200-HA-XEONGOLD615
HTML result view exported from: https://openbenchmarking.org/result/2012200-HA-XEONGOLD615&sro&grr.
Xeon Gold 6226R December - System Details (identical for runs 1, 2 and 3)

Processor: Intel Xeon Gold 6226R @ 3.90GHz (16 Cores / 32 Threads)
Motherboard: Supermicro X11SPL-F v1.02 (3.1 BIOS)
Chipset: Intel Sky Lake-E DMI3 Registers
Memory: 188GB
Disk: 3841GB Micron_9300_MTFDHAL3T8TDP
Graphics: llvmpipe
Monitor: VE228
Network: 2 x Intel I210
OS: Ubuntu 20.04
Kernel: 5.9.0-050900rc6daily20200921-generic (x86_64) 20200920
Desktop: GNOME Shell 3.36.4
Display Server: X Server 1.20.8
Display Driver: modesetting 1.20.8
OpenGL: 3.3 Mesa 20.0.8 (LLVM 10.0.0 256 bits)
Compiler: GCC 9.3.0
File-System: ext4
Screen Resolution: 1920x1080

OpenBenchmarking.org Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: Scaling Governor: intel_cpufreq ondemand - CPU Microcode: 0x5002f01
Security Details: itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Mitigation of TSX disabled
Xeon Gold 6226R December - Result Overview

Tests (in order): brl-cad: VGR Performance Metric; hmmer: Pfam Database Search; build2: Time To Compile; build-eigen: Time To Compile; node-web-tooling; onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU; onednn: Recurrent Neural Network Training - u8s8f32 - CPU; onednn: Recurrent Neural Network Training - f32 - CPU; onednn: Recurrent Neural Network Inference - u8s8f32 - CPU; onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU; onednn: Recurrent Neural Network Inference - f32 - CPU; simdjson: Kostya; sqlite-speedtest: Timed Time - Size 1,000; simdjson: LargeRand; simdjson: PartialTweets; simdjson: DistinctUserID; ncnn: CPU - regnety_400m; ncnn: CPU - squeezenet_ssd; ncnn: CPU - yolov4-tiny; ncnn: CPU - resnet50; ncnn: CPU - alexnet; ncnn: CPU - resnet18; ncnn: CPU - vgg16; ncnn: CPU - googlenet; ncnn: CPU - blazeface; ncnn: CPU - efficientnet-b0; ncnn: CPU - mnasnet; ncnn: CPU - shufflenet-v2; ncnn: CPU-v3-v3 - mobilenet-v3; ncnn: CPU-v2-v2 - mobilenet-v2; ncnn: CPU - mobilenet; build-ffmpeg: Time To Compile; encode-ape: WAV To APE; encode-wavpack: WAV To WavPack; coremark: CoreMark Size 666 - Iterations Per Second; clomp: Static OMP Speedup; onednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU; onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU; onednn: Deconvolution Batch shapes_1d - f32 - CPU; onednn: IP Shapes 1D - bf16bf16bf16 - CPU; onednn: IP Shapes 1D - f32 - CPU; onednn: IP Shapes 1D - u8s8f32 - CPU; onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU; onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU; onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU; mafft: Multiple Sequence Alignment - LSU RNA; onednn: IP Shapes 3D - bf16bf16bf16 - CPU; onednn: IP Shapes 3D - u8s8f32 - CPU; onednn: IP Shapes 3D - f32 - CPU; onednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPU; onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU; onednn: Convolution Batch Shapes Auto - f32 - CPU; onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU; onednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU; onednn: Deconvolution Batch shapes_3d - f32 - CPU

Run 1: 164354 174.227 95.580 85.527 10.78 1645.61 1643.59 1643.22 922.874 921.674 922.992 0.56 65.363 0.39 0.57 0.58 27.19 16.73 24.72 20.27 8.07 10.58 29.65 14.56 2.93 7.70 5.56 5.94 5.21 6.07 17.77 43.497 17.528 16.731 537952.810837 26.8 11.2912 0.526692 2.75497 5.61346 2.35358 0.496245 0.974624 2.05419 0.486440 10.637 2.59610 1.24491 3.14816 9.42174 4.16656 4.34336 0.867040 12.5274 3.21126
Run 2: 163713 174.169 95.686 85.551 10.69 1643.67 1645.64 1645.08 923.369 923.438 923.536 0.56 65.852 0.39 0.57 0.58 27.13 16.45 24.40 18.93 6.73 9.41 28.04 12.99 2.88 7.26 5.37 5.97 5.14 5.96 16.96 43.236 17.535 16.753 535255.625888 26.2 11.2891 0.527252 2.76210 5.60704 2.35038 0.491459 0.973874 2.05772 0.477951 10.590 2.58806 1.24781 3.13707 9.42301 4.16237 4.34082 0.860663 12.5418 3.20825
Run 3 (no brl-cad result): 174.447 95.528 85.784 10.68 1644.83 1645.91 1647.12 923.336 920.863 922.270 0.56 65.594 0.39 0.57 0.58 27.41 16.94 25.22 20.38 6.99 9.49 29.24 13.09 2.90 7.20 5.34 6.01 5.07 5.89 17.73 43.454 17.543 16.779 534883.155085 26.5 11.3162 0.526437 2.75619 5.60477 2.34990 0.495733 0.977073 2.05675 0.479216 10.549 2.61350 1.24380 3.15835 9.42277 4.17445 4.35080 0.862856 12.5335 3.21278
BRL-CAD 7.30.8 - VGR Performance Metric
VGR Performance Metric, More Is Better
Run 1: 164354
Run 2: 163713
1. (CXX) g++ options: -std=c++11 -pipe -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -rdynamic -lSM -lICE -lGLU -lGL -lGLdispatch -lX11 -lXext -lXrender -lpthread -ldl -luuid -lm
Timed HMMer Search 3.3.1 - Pfam Database Search
Seconds, Fewer Is Better
Run 1: 174.23 (SE +/- 0.36, N = 3)
Run 2: 174.17 (SE +/- 0.41, N = 3)
Run 3: 174.45 (SE +/- 0.20, N = 3)
1. (CC) gcc options: -O3 -pthread -lhmmer -leasel -lm
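Each result above is a mean over several trials with a standard error of the mean ("SE +/- x, N = 3"). As a minimal sketch of how such a figure is derived (the trial times below are hypothetical, not taken from this run):

```python
import math
import statistics

# Hypothetical per-trial times (seconds) for a single run of a benchmark.
# The reported figure is the mean; the SE is the sample standard
# deviation of the trials divided by sqrt(N).
trials = [174.1, 174.3, 174.2]

mean = statistics.mean(trials)
se = statistics.stdev(trials) / math.sqrt(len(trials))

print(f"{mean:.2f} (SE +/- {se:.2f}, N = {len(trials)})")
```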
Build2 0.13 - Time To Compile
Seconds, Fewer Is Better
Run 1: 95.58 (SE +/- 0.24, N = 3)
Run 2: 95.69 (SE +/- 0.25, N = 3)
Run 3: 95.53 (SE +/- 0.15, N = 3)
Timed Eigen Compilation 3.3.9 - Time To Compile
Seconds, Fewer Is Better
Run 1: 85.53 (SE +/- 0.12, N = 3)
Run 2: 85.55 (SE +/- 0.22, N = 3)
Run 3: 85.78 (SE +/- 0.23, N = 3)
Node.js V8 Web Tooling Benchmark
runs/s, More Is Better
Run 1: 10.78 (SE +/- 0.04, N = 3)
Run 2: 10.69 (SE +/- 0.02, N = 3)
Run 3: 10.68 (SE +/- 0.05, N = 3)
1. Nodejs v10.19.0
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
Run 1: 1645.61 (SE +/- 0.69, N = 3; MIN: 1641.21)
Run 2: 1643.67 (SE +/- 1.59, N = 3; MIN: 1637.61)
Run 3: 1644.83 (SE +/- 0.87, N = 3; MIN: 1640.5)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 1643.59 (SE +/- 2.60, N = 3; MIN: 1637.17)
Run 2: 1645.64 (SE +/- 4.86, N = 3; MIN: 1635.99)
Run 3: 1645.91 (SE +/- 1.22, N = 3; MIN: 1639.71)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 1643.22 (SE +/- 0.66, N = 3; MIN: 1637.46)
Run 2: 1645.08 (SE +/- 0.79, N = 3; MIN: 1639.15)
Run 3: 1647.12 (SE +/- 4.18, N = 3; MIN: 1640.3)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 922.87 (SE +/- 0.33, N = 3; MIN: 919.98)
Run 2: 923.37 (SE +/- 0.50, N = 3; MIN: 919.79)
Run 3: 923.34 (SE +/- 0.27, N = 3; MIN: 920.26)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
Run 1: 921.67 (SE +/- 0.31, N = 3; MIN: 918.58)
Run 2: 923.44 (SE +/- 1.61, N = 3; MIN: 918.55)
Run 3: 920.86 (SE +/- 0.78, N = 3; MIN: 916.71)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 922.99 (SE +/- 0.36, N = 3; MIN: 920.22)
Run 2: 923.54 (SE +/- 1.28, N = 3; MIN: 919.76)
Run 3: 922.27 (SE +/- 1.52, N = 3; MIN: 917.55)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
simdjson 0.7.1 - Throughput Test: Kostya
GB/s, More Is Better
Run 1: 0.56 (SE +/- 0.00, N = 3)
Run 2: 0.56 (SE +/- 0.00, N = 3)
Run 3: 0.56 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -pthread
SQLite Speedtest 3.30 - Timed Time - Size 1,000
Seconds, Fewer Is Better
Run 1: 65.36 (SE +/- 0.05, N = 3)
Run 2: 65.85 (SE +/- 0.12, N = 3)
Run 3: 65.59 (SE +/- 0.20, N = 3)
1. (CC) gcc options: -O2 -ldl -lz -lpthread
simdjson 0.7.1 - Throughput Test: LargeRandom
GB/s, More Is Better
Run 1: 0.39 (SE +/- 0.00, N = 3)
Run 2: 0.39 (SE +/- 0.00, N = 3)
Run 3: 0.39 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -pthread
simdjson 0.7.1 - Throughput Test: PartialTweets
GB/s, More Is Better
Run 1: 0.57 (SE +/- 0.00, N = 3)
Run 2: 0.57 (SE +/- 0.00, N = 3)
Run 3: 0.57 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -pthread
simdjson 0.7.1 - Throughput Test: DistinctUserID
GB/s, More Is Better
Run 1: 0.58 (SE +/- 0.00, N = 3)
Run 2: 0.58 (SE +/- 0.00, N = 3)
Run 3: 0.58 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -pthread
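The simdjson figures are parse throughput. As a back-of-the-envelope illustration of the metric itself (the byte count and timing below are made up, not taken from this run), GB/s is simply input size divided by parse time:

```python
# Hypothetical numbers: parsing a ~2 MB JSON document in 3.5 ms.
# Throughput in GB/s = bytes parsed / elapsed seconds / 10**9
# (decimal gigabytes, as such throughput figures are usually quoted).
doc_bytes = 2_000_000
elapsed_s = 0.0035

gbps = doc_bytes / elapsed_s / 1e9
print(f"{gbps:.2f} GB/s")
```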
NCNN 20201218 - Target: CPU - Model: regnety_400m
ms, Fewer Is Better
Run 1: 27.19 (SE +/- 0.06, N = 3; MIN: 26.8 / MAX: 29.86)
Run 2: 27.13 (SE +/- 0.11, N = 3; MIN: 26.7 / MAX: 28.12)
Run 3: 27.41 (SE +/- 0.11, N = 3; MIN: 26.75 / MAX: 31.13)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: squeezenet_ssd
ms, Fewer Is Better
Run 1: 16.73 (SE +/- 0.11, N = 3; MIN: 16.43 / MAX: 17.04)
Run 2: 16.45 (SE +/- 0.00, N = 3; MIN: 16.33 / MAX: 19.11)
Run 3: 16.94 (SE +/- 0.15, N = 3; MIN: 16.44 / MAX: 19)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: yolov4-tiny
ms, Fewer Is Better
Run 1: 24.72 (SE +/- 0.51, N = 3; MIN: 23.36 / MAX: 28.82)
Run 2: 24.40 (SE +/- 0.38, N = 3; MIN: 23.06 / MAX: 27.46)
Run 3: 25.22 (SE +/- 0.35, N = 3; MIN: 23.44 / MAX: 28.59)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: resnet50
ms, Fewer Is Better
Run 1: 20.27 (SE +/- 0.51, N = 3; MIN: 19.1 / MAX: 21.77)
Run 2: 18.93 (SE +/- 0.08, N = 3; MIN: 18.61 / MAX: 38.46)
Run 3: 20.38 (SE +/- 0.52, N = 3; MIN: 19.05 / MAX: 23.49)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: alexnet
ms, Fewer Is Better
Run 1: 8.07 (SE +/- 0.01, N = 3; MIN: 8.01 / MAX: 9.86)
Run 2: 6.73 (SE +/- 0.01, N = 3; MIN: 6.67 / MAX: 8.33)
Run 3: 6.99 (SE +/- 0.22, N = 3; MIN: 6.66 / MAX: 23.91)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: resnet18
ms, Fewer Is Better
Run 1: 10.58 (SE +/- 0.42, N = 3; MIN: 9.67 / MAX: 12.86)
Run 2: 9.41 (SE +/- 0.01, N = 3; MIN: 9.33 / MAX: 10.24)
Run 3: 9.49 (SE +/- 0.00, N = 3; MIN: 9.34 / MAX: 11.41)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: vgg16
ms, Fewer Is Better
Run 1: 29.65 (SE +/- 0.50, N = 3; MIN: 28.54 / MAX: 34.02)
Run 2: 28.04 (SE +/- 0.03, N = 3; MIN: 27.76 / MAX: 46.41)
Run 3: 29.24 (SE +/- 0.36, N = 3; MIN: 27.78 / MAX: 48.69)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: googlenet
ms, Fewer Is Better
Run 1: 14.56 (SE +/- 0.46, N = 3; MIN: 13.58 / MAX: 15.21)
Run 2: 12.99 (SE +/- 0.01, N = 3; MIN: 12.91 / MAX: 13.36)
Run 3: 13.09 (SE +/- 0.01, N = 3; MIN: 12.88 / MAX: 15.13)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: blazeface
ms, Fewer Is Better
Run 1: 2.93 (SE +/- 0.05, N = 3; MIN: 2.85 / MAX: 3.18)
Run 2: 2.88 (SE +/- 0.01, N = 3; MIN: 2.82 / MAX: 3.69)
Run 3: 2.90 (SE +/- 0.01, N = 3; MIN: 2.83 / MAX: 4.6)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: efficientnet-b0
ms, Fewer Is Better
Run 1: 7.70 (SE +/- 0.07, N = 3; MIN: 7.35 / MAX: 8.6)
Run 2: 7.26 (SE +/- 0.01, N = 3; MIN: 7.01 / MAX: 8.01)
Run 3: 7.20 (SE +/- 0.01, N = 3; MIN: 7 / MAX: 9)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: mnasnet
ms, Fewer Is Better
Run 1: 5.56 (SE +/- 0.05, N = 3; MIN: 5.21 / MAX: 9.44)
Run 2: 5.37 (SE +/- 0.02, N = 3; MIN: 5.18 / MAX: 9.36)
Run 3: 5.34 (SE +/- 0.02, N = 3; MIN: 5.16 / MAX: 9.17)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: shufflenet-v2
ms, Fewer Is Better
Run 1: 5.94 (SE +/- 0.01, N = 3; MIN: 5.83 / MAX: 9.49)
Run 2: 5.97 (SE +/- 0.04, N = 2; MIN: 5.86 / MAX: 9.64)
Run 3: 6.01 (SE +/- 0.03, N = 3; MIN: 5.83 / MAX: 9.91)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU-v3-v3 - Model: mobilenet-v3
ms, Fewer Is Better
Run 1: 5.21 (SE +/- 0.06, N = 3; MIN: 4.94 / MAX: 9.16)
Run 2: 5.14 (SE +/- 0.05, N = 3; MIN: 4.94 / MAX: 8.76)
Run 3: 5.07 (SE +/- 0.01, N = 3; MIN: 4.93 / MAX: 8.45)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU-v2-v2 - Model: mobilenet-v2
ms, Fewer Is Better
Run 1: 6.07 (SE +/- 0.08, N = 3; MIN: 5.65 / MAX: 23.59)
Run 2: 5.96 (SE +/- 0.01, N = 3; MIN: 5.64 / MAX: 9.74)
Run 3: 5.89 (SE +/- 0.02, N = 3; MIN: 5.64 / MAX: 9.39)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
NCNN 20201218 - Target: CPU - Model: mobilenet
ms, Fewer Is Better
Run 1: 17.77 (SE +/- 0.12, N = 3; MIN: 17.45 / MAX: 18.07)
Run 2: 16.96 (SE +/- 0.02, N = 3; MIN: 16.84 / MAX: 18.68)
Run 3: 17.73 (SE +/- 0.27, N = 3; MIN: 16.96 / MAX: 19.65)
1. (CXX) g++ options: -O3 -rdynamic -lgomp -lpthread
Timed FFmpeg Compilation 4.2.2 - Time To Compile
Seconds, Fewer Is Better
Run 1: 43.50 (SE +/- 0.15, N = 3)
Run 2: 43.24 (SE +/- 0.09, N = 3)
Run 3: 43.45 (SE +/- 0.10, N = 3)
Monkey Audio Encoding 3.99.6 - WAV To APE
Seconds, Fewer Is Better
Run 1: 17.53 (SE +/- 0.01, N = 5)
Run 2: 17.54 (SE +/- 0.02, N = 5)
Run 3: 17.54 (SE +/- 0.00, N = 5)
1. (CXX) g++ options: -O3 -pedantic -rdynamic -lrt
WavPack Audio Encoding 5.3 - WAV To WavPack
Seconds, Fewer Is Better
Run 1: 16.73 (SE +/- 0.01, N = 5)
Run 2: 16.75 (SE +/- 0.01, N = 5)
Run 3: 16.78 (SE +/- 0.01, N = 5)
1. (CXX) g++ options: -rdynamic
Coremark 1.0 - CoreMark Size 666 - Iterations Per Second
Iterations/Sec, More Is Better
Run 1: 537952.81 (SE +/- 746.08, N = 3)
Run 2: 535255.63 (SE +/- 1073.23, N = 3)
Run 3: 534883.16 (SE +/- 2681.32, N = 3)
1. (CC) gcc options: -O2 -lrt
CLOMP 1.2 - Static OMP Speedup
Speedup, More Is Better
Run 1: 26.8 (SE +/- 0.32, N = 3)
Run 2: 26.2 (SE +/- 0.42, N = 3)
Run 3: 26.5 (SE +/- 0.12, N = 3)
1. (CC) gcc options: -fopenmp -O3 -lm
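CLOMP reports speedup rather than raw time: the ratio of serial loop time to threaded (OpenMP) loop time. A minimal sketch of that ratio, with hypothetical timings that are not taken from this run:

```python
# Hypothetical loop timings in microseconds. The speedup figure is
# the serial reference time divided by the time for the same work
# spread across all OpenMP threads.
serial_us = 1000.0
threaded_us = 37.5

speedup = serial_us / threaded_us
print(f"Static OMP Speedup: {speedup:.1f}")
```

A speedup near the figures above (around 26x on 16 cores / 32 threads) reflects hyper-threading and memory-bound scaling rather than a per-core ceiling.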
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
Run 1: 11.29 (SE +/- 0.00, N = 3; MIN: 11.16)
Run 2: 11.29 (SE +/- 0.01, N = 3; MIN: 11.17)
Run 3: 11.32 (SE +/- 0.01, N = 3; MIN: 11.17)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 0.526692 (SE +/- 0.000759, N = 3; MIN: 0.51)
Run 2: 0.527252 (SE +/- 0.000712, N = 3; MIN: 0.51)
Run 3: 0.526437 (SE +/- 0.000172, N = 3; MIN: 0.51)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 2.75497 (SE +/- 0.00248, N = 3; MIN: 2.69)
Run 2: 2.76210 (SE +/- 0.00638, N = 3; MIN: 2.69)
Run 3: 2.75619 (SE +/- 0.00159, N = 3; MIN: 2.68)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
Run 1: 5.61346 (SE +/- 0.01299, N = 3; MIN: 5.5)
Run 2: 5.60704 (SE +/- 0.00948, N = 3; MIN: 5.44)
Run 3: 5.60477 (SE +/- 0.00888, N = 3; MIN: 5.48)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 2.35358 (SE +/- 0.00656, N = 3; MIN: 2.26)
Run 2: 2.35038 (SE +/- 0.00411, N = 3; MIN: 2.22)
Run 3: 2.34990 (SE +/- 0.00268, N = 3; MIN: 2.25)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 0.496245 (SE +/- 0.000595, N = 3; MIN: 0.47)
Run 2: 0.491459 (SE +/- 0.000604, N = 3; MIN: 0.47)
Run 3: 0.495733 (SE +/- 0.000320, N = 3; MIN: 0.47)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 0.974624 (SE +/- 0.001208, N = 3; MIN: 0.94)
Run 2: 0.973874 (SE +/- 0.001190, N = 3; MIN: 0.94)
Run 3: 0.977073 (SE +/- 0.001996, N = 3; MIN: 0.94)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
Run 1: 2.05419 (SE +/- 0.00388, N = 3; MIN: 1.97)
Run 2: 2.05772 (SE +/- 0.00591, N = 3; MIN: 1.97)
Run 3: 2.05675 (SE +/- 0.00357, N = 3; MIN: 1.97)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 0.486440 (SE +/- 0.001972, N = 3; MIN: 0.47)
Run 2: 0.477951 (SE +/- 0.005102, N = 3; MIN: 0.46)
Run 3: 0.479216 (SE +/- 0.004438, N = 3; MIN: 0.46)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Timed MAFFT Alignment 7.471 - Multiple Sequence Alignment - LSU RNA
Seconds, Fewer Is Better
Run 1: 10.64 (SE +/- 0.08, N = 3)
Run 2: 10.59 (SE +/- 0.08, N = 3)
Run 3: 10.55 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -std=c99 -O3 -lm -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
Run 1: 2.59610 (SE +/- 0.00587, N = 3; MIN: 2.51)
Run 2: 2.58806 (SE +/- 0.00862, N = 3; MIN: 2.53)
Run 3: 2.61350 (SE +/- 0.00522, N = 3; MIN: 2.56)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 1.24491 (SE +/- 0.00268, N = 3; MIN: 1.2)
Run 2: 1.24781 (SE +/- 0.00242, N = 3; MIN: 1.2)
Run 3: 1.24380 (SE +/- 0.00126, N = 3; MIN: 1.2)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 3.14816 (SE +/- 0.01117, N = 3; MIN: 3.1)
Run 2: 3.13707 (SE +/- 0.00567, N = 3; MIN: 3.09)
Run 3: 3.15835 (SE +/- 0.00910, N = 3; MIN: 3.11)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
Run 1: 9.42174 (SE +/- 0.01674, N = 3; MIN: 9.07)
Run 2: 9.42301 (SE +/- 0.00522, N = 3; MIN: 9.08)
Run 3: 9.42277 (SE +/- 0.00825, N = 3; MIN: 9.06)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 4.16656 (SE +/- 0.01549, N = 3; MIN: 4.09)
Run 2: 4.16237 (SE +/- 0.01546, N = 3; MIN: 4.1)
Run 3: 4.17445 (SE +/- 0.01564, N = 3; MIN: 4.1)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 4.34336 (SE +/- 0.01713, N = 3; MIN: 4.27)
Run 2: 4.34082 (SE +/- 0.01378, N = 3; MIN: 4.27)
Run 3: 4.35080 (SE +/- 0.01473, N = 3; MIN: 4.28)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 0.867040 (SE +/- 0.007447, N = 3; MIN: 0.83)
Run 2: 0.860663 (SE +/- 0.011319, N = 5; MIN: 0.82)
Run 3: 0.862856 (SE +/- 0.008326, N = 3; MIN: 0.83)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU
ms, Fewer Is Better
Run 1: 12.53 (SE +/- 0.01, N = 3; MIN: 12.41)
Run 2: 12.54 (SE +/- 0.01, N = 3; MIN: 12.41)
Run 3: 12.53 (SE +/- 0.01, N = 3; MIN: 12.41)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
oneDNN 2.0 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
ms, Fewer Is Better
Run 1: 3.21126 (SE +/- 0.00363, N = 3; MIN: 3.17)
Run 2: 3.20825 (SE +/- 0.00574, N = 3; MIN: 3.16)
Run 3: 3.21278 (SE +/- 0.01283, N = 3; MIN: 3.16)
1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
Phoronix Test Suite v10.8.4