lczero onnx Ice Lake

2 x Intel Xeon Platinum 8380 testing with an Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS) and ASPEED on Ubuntu 21.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2108264-TJ-LCZEROONN24&grs&sor .

lczero onnx Ice Lake: System Details

Runs: 1, 2, 3, 4

Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads)
Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS)
Chipset: Intel Device 0998
Memory: 504GB
Disk: 7682GB INTEL SSDPF2KX076TZ
Graphics: ASPEED
Monitor: VE228
Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP
OS: Ubuntu 21.04
Kernel: 5.14.0-rc1-folio (x86_64) 20210715
Desktop: GNOME Shell 3.38.4
Display Server: X Server 1.20.11
Compiler: GCC 10.3.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-mutex --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-gDeRY6/gcc-10-10.3.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: intel_pstate powersave
- CPU Microcode: 0xd0002a0

Python Details
- Python 3.9.5

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

lczero onnx Ice Lake: Results Overview

Test                                      Unit                   Run 1     Run 2     Run 3     Run 4
lczero: Eigen                             Nodes Per Second       4120      4260      3954      4148
onnx: shufflenet-v2-10 - OpenMP CPU       Inferences Per Minute  11748     11396     11597     11638
onnx: yolov4 - OpenMP CPU                 Inferences Per Minute  531       528       538       528
onnx: fcn-resnet101-11 - OpenMP CPU       Inferences Per Minute  465       463       461       460
synthmark: VoiceMark_100                  Voices                 552.043   551.034   550.938   553.201
onnx: super-resolution-10 - OpenMP CPU    Inferences Per Minute  6771      6359      7050      -
onnx: bertsquad-10 - OpenMP CPU           Inferences Per Minute  721       697       718       735
lczero: BLAS                              Nodes Per Second       882       835       910       873
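
Because all four runs use the same hardware, one quick way to read the overview figures is to look at the run-to-run spread of each test. The sketch below is not part of the Phoronix Test Suite output. It copies the overview values into a dictionary and computes (max - min) / mean as a percentage. The names `results`, `spread_percent`, and `summarize` are hypothetical, and `None` stands for the super-resolution-10 value that run 4 did not report.

```python
from statistics import mean

# Values copied from the results overview, in run order 1, 2, 3, 4.
# None marks the super-resolution-10 result missing for run 4.
results = {
    "lczero: Eigen": [4120, 4260, 3954, 4148],
    "onnx: shufflenet-v2-10": [11748, 11396, 11597, 11638],
    "onnx: yolov4": [531, 528, 538, 528],
    "onnx: fcn-resnet101-11": [465, 463, 461, 460],
    "synthmark: VoiceMark_100": [552.043, 551.034, 550.938, 553.201],
    "onnx: super-resolution-10": [6771, 6359, 7050, None],
    "onnx: bertsquad-10": [721, 697, 718, 735],
    "lczero: BLAS": [882, 835, 910, 873],
}


def spread_percent(values):
    """Return (max - min) / mean * 100 over the runs that reported a value."""
    present = [v for v in values if v is not None]
    return (max(present) - min(present)) / mean(present) * 100


def summarize(table):
    """Map each test name to its run-to-run spread, rounded to two decimals."""
    return {name: round(spread_percent(vals), 2) for name, vals in table.items()}


if __name__ == "__main__":
    for name, pct in summarize(results).items():
        print(f"{name}: {pct}% spread across runs")
```

This spread is only a rough indicator. It ignores the per-test standard errors and sample counts (N) that the individual result blocks report, so it should not be read as a significance test.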

LeelaChessZero 0.28
Backend: Eigen
Nodes Per Second, More Is Better
Run 2: 4260
Run 4: 4148
Run 1: 4120
Run 3: 3954
Reported standard errors: SE +/- 40.81, N = 3; SE +/- 3.93, N = 3; SE +/- 46.87, N = 3
1. (CXX) g++ options: -flto -pthread

ONNX Runtime 1.8.2
Model: shufflenet-v2-10 - Device: OpenMP CPU
Inferences Per Minute, More Is Better
Run 1: 11748
Run 4: 11638
Run 3: 11597
Run 2: 11396
Reported standard errors: SE +/- 68.29, N = 3; SE +/- 73.24, N = 3; SE +/- 51.60, N = 3
1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

ONNX Runtime 1.8.2
Model: yolov4 - Device: OpenMP CPU
Inferences Per Minute, More Is Better
Run 3: 538
Run 1: 531
Run 4: 528
Run 2: 528
Reported standard errors: SE +/- 4.33, N = 3; SE +/- 6.09, N = 3; SE +/- 2.75, N = 3
1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

ONNX Runtime 1.8.2
Model: fcn-resnet101-11 - Device: OpenMP CPU
Inferences Per Minute, More Is Better
Run 1: 465
Run 2: 463
Run 3: 461
Run 4: 460
Reported standard errors: SE +/- 3.47, N = 3; SE +/- 1.42, N = 3; SE +/- 1.17, N = 3
1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

Google SynthMark 20201109
Test: VoiceMark_100
Voices, More Is Better
Run 4: 553.20
Run 1: 552.04
Run 2: 551.03
Run 3: 550.94
Reported standard errors: SE +/- 2.48, N = 3; SE +/- 2.04, N = 3; SE +/- 2.57, N = 3
1. (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast

ONNX Runtime 1.8.2
Model: super-resolution-10 - Device: OpenMP CPU
Inferences Per Minute, More Is Better
Run 3: 7050
Run 1: 6771
Run 2: 6359
Reported standard errors: SE +/- 203.29, N = 12; SE +/- 233.89, N = 12
1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

ONNX Runtime 1.8.2
Model: bertsquad-10 - Device: OpenMP CPU
Inferences Per Minute, More Is Better
Run 4: 735
Run 1: 721
Run 3: 718
Run 2: 697
Reported standard errors: SE +/- 16.15, N = 12; SE +/- 10.36, N = 12; SE +/- 19.35, N = 12
1. (CXX) g++ options: -fopenmp -ffunction-sections -fdata-sections -O3 -ldl -lrt

LeelaChessZero 0.28
Backend: BLAS
Nodes Per Second, More Is Better
Run 3: 910
Run 1: 882
Run 4: 873
Run 2: 835
Reported standard errors: SE +/- 19.54, N = 6; SE +/- 31.49, N = 6; SE +/- 7.79, N = 9
1. (CXX) g++ options: -flto -pthread

Phoronix Test Suite v10.8.5