10700t comet weds Intel Core i7-10700T testing with a Logic Supply RXM-181 (Z01-0002A026 BIOS) and Intel UHD 630 CML GT2 3GB on Ubuntu 21.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2203303-NE-10700TCOM78&grr&export=txt&rdt .
10700t comet weds Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server OpenGL Vulkan Compiler File-System Screen Resolution A B C Intel Core i7-10700T @ 4.50GHz (8 Cores / 16 Threads) Logic Supply RXM-181 (Z01-0002A026 BIOS) Intel Comet Lake PCH 32GB 256GB TS256GMTS800 Intel UHD 630 CML GT2 3GB (1200MHz) Realtek ALC233 DELL P2415Q Intel I219-LM + Intel I210 Ubuntu 21.10 5.13.0-35-generic (x86_64) GNOME Shell 40.5 X Server + Wayland 4.6 Mesa 21.2.2 1.2.182 GCC 11.2.0 ext4 1920x1080 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0xec - Thermald 2.4.6 Python Details - Python 3.9.7 Security Details - itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
10700t comet weds onnx: fcn-resnet101-11 - CPU onnx: bertsquad-12 - CPU onnx: ArcFace ResNet-100 - CPU onnx: super-resolution-10 - CPU onnx: GPT-2 - CPU onnx: yolov4 - CPU perf-bench: Epoll Wait onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU perf-bench: Futex Hash perf-bench: Sched Pipe perf-bench: Futex Lock-Pi onednn: IP Shapes 3D - u8s8f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU perf-bench: Memcpy 1MB onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU perf-bench: Memset 1MB onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: IP Shapes 3D - f32 - CPU perf-bench: Syscall Basic onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: IP Shapes 1D - bf16bf16bf16 - CPU A B C 39 303 694 2167 3644 203 109366 6801.99 6792.71 6791.09 3612.75 3603.05 3597.62 14.0868 6.31917 2.80373 3559455 163024 843 2.62391 3.87824 26.836621 3.98170 2.26038 43.091819 9.22489 5.46541 9.74806 14180008 15.5284 17.0221 39 306 703 2186 3713 203 115261 6804 6780.39 6778.99 3619.68 3588.14 3575.32 11.0421 4.19472 2.2005 3644833 161604 845 2.48963 3.76516 26.803861 3.96065 2.30454 43.234222 8.17558 4.65868 9.64795 14188609 15.5771 17.0692 3711 204 112104 6790.86 6786.74 6787.52 3600.38 3594.68 3590.54 12.9019 4.18294 2.26074 3674250 161030 848 2.43947 3.76013 25.616662 3.94494 2.24351 41.195661 8.25954 4.60998 9.36624 14102641 15.4461 17.017 OpenBenchmarking.org
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU OpenBenchmarking.org Inferences Per Minute, More Is Better ONNX Runtime 1.11 Model: fcn-resnet101-11 - Device: CPU A B 9 18 27 36 45 SE +/- 0.29, N = 3 39 39 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU OpenBenchmarking.org Inferences Per Minute, More Is Better ONNX Runtime 1.11 Model: bertsquad-12 - Device: CPU A B 70 140 210 280 350 SE +/- 1.48, N = 3 303 306 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU OpenBenchmarking.org Inferences Per Minute, More Is Better ONNX Runtime 1.11 Model: ArcFace ResNet-100 - Device: CPU A B 150 300 450 600 750 SE +/- 1.89, N = 3 694 703 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU OpenBenchmarking.org Inferences Per Minute, More Is Better ONNX Runtime 1.11 Model: super-resolution-10 - Device: CPU A B 500 1000 1500 2000 2500 SE +/- 4.93, N = 3 2167 2186 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: GPT-2 - Device: CPU OpenBenchmarking.org Inferences Per Minute, More Is Better ONNX Runtime 1.11 Model: GPT-2 - Device: CPU A B C 800 1600 2400 3200 4000 SE +/- 41.41, N = 3 3644 3713 3711 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU OpenBenchmarking.org Inferences Per Minute, More Is Better ONNX Runtime 1.11 Model: yolov4 - Device: CPU A B C 40 80 120 160 200 SE +/- 0.44, N = 3 203 203 204 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
perf-bench Benchmark: Epoll Wait OpenBenchmarking.org ops/sec, More Is Better perf-bench Benchmark: Epoll Wait A B C 20K 40K 60K 80K 100K SE +/- 1498.86, N = 15 109366 115261 112104 1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
oneDNN Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU A B C 1500 3000 4500 6000 7500 SE +/- 0.78, N = 3 6801.99 6804.00 6790.86 MIN: 6727 MIN: 6736.66 MIN: 6724.86 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU A B C 1500 3000 4500 6000 7500 SE +/- 2.83, N = 3 6792.71 6780.39 6786.74 MIN: 6719.79 MIN: 6693.38 MIN: 6699.4 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU A B C 1500 3000 4500 6000 7500 SE +/- 8.07, N = 3 6791.09 6778.99 6787.52 MIN: 6710.84 MIN: 6696.67 MIN: 6720.9 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU A B C 800 1600 2400 3200 4000 SE +/- 3.11, N = 3 3612.75 3619.68 3600.38 MIN: 3553.13 MIN: 3561.16 MIN: 3534.68 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU A B C 800 1600 2400 3200 4000 SE +/- 1.86, N = 3 3603.05 3588.14 3594.68 MIN: 3543.01 MIN: 3534.18 MIN: 3540.53 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU A B C 800 1600 2400 3200 4000 SE +/- 5.20, N = 3 3597.62 3575.32 3590.54 MIN: 3533.58 MIN: 3523.98 MIN: 3533.63 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU A B C 4 8 12 16 20 SE +/- 0.12, N = 15 14.09 11.04 12.90 MIN: 9.05 MIN: 7.6 MIN: 7.8 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU A B C 2 4 6 8 10 SE +/- 0.18527, N = 12 6.31917 4.19472 4.18294 MIN: 3.82 MIN: 3.9 MIN: 3.89 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU A B C 0.6308 1.2616 1.8924 2.5232 3.154 SE +/- 0.03745, N = 12 2.80373 2.20050 2.26074 MIN: 1.77 MIN: 2.11 MIN: 2.03 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
perf-bench Benchmark: Futex Hash OpenBenchmarking.org ops/sec, More Is Better perf-bench Benchmark: Futex Hash A B C 800K 1600K 2400K 3200K 4000K SE +/- 37483.12, N = 5 3559455 3644833 3674250 1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
perf-bench Benchmark: Sched Pipe OpenBenchmarking.org ops/sec, More Is Better perf-bench Benchmark: Sched Pipe A B C 30K 60K 90K 120K 150K SE +/- 537.38, N = 3 163024 161604 161030 1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
perf-bench Benchmark: Futex Lock-Pi OpenBenchmarking.org ops/sec, More Is Better perf-bench Benchmark: Futex Lock-Pi A B C 200 400 600 800 1000 SE +/- 0.00, N = 3 843 845 848 1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
oneDNN Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU A B C 0.5904 1.1808 1.7712 2.3616 2.952 SE +/- 0.01939, N = 12 2.62391 2.48963 2.43947 MIN: 2.27 MIN: 2.31 MIN: 2.25 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU A B C 0.8726 1.7452 2.6178 3.4904 4.363 SE +/- 0.04471, N = 3 3.87824 3.76516 3.76013 MIN: 3.24 MIN: 3.27 MIN: 3.27 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
perf-bench Benchmark: Memcpy 1MB OpenBenchmarking.org GB/sec, More Is Better perf-bench Benchmark: Memcpy 1MB A B C 6 12 18 24 30 SE +/- 0.03, N = 3 26.84 26.80 25.62 1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU A B C 0.8959 1.7918 2.6877 3.5836 4.4795 SE +/- 0.00171, N = 3 3.98170 3.96065 3.94494 MIN: 3.87 MIN: 3.85 MIN: 3.84 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU A B C 0.5185 1.037 1.5555 2.074 2.5925 SE +/- 0.00803, N = 3 2.26038 2.30454 2.24351 MIN: 1.79 MIN: 1.76 MIN: 1.8 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
perf-bench Benchmark: Memset 1MB OpenBenchmarking.org GB/sec, More Is Better perf-bench Benchmark: Memset 1MB A B C 10 20 30 40 50 SE +/- 0.20, N = 3 43.09 43.23 41.20 1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU A B C 3 6 9 12 15 SE +/- 0.11489, N = 15 9.22489 8.17558 8.25954 MIN: 7.85 MIN: 7.92 MIN: 7.97 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU A B C 1.2297 2.4594 3.6891 4.9188 6.1485 SE +/- 0.07628, N = 15 5.46541 4.65868 4.60998 MIN: 4.42 MIN: 4.51 MIN: 4.46 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU A B C 3 6 9 12 15 SE +/- 0.03085, N = 3 9.74806 9.64795 9.36624 MIN: 9.5 MIN: 9.49 MIN: 9.21 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
perf-bench Benchmark: Syscall Basic OpenBenchmarking.org ops/sec, More Is Better perf-bench Benchmark: Syscall Basic A B C 3M 6M 9M 12M 15M SE +/- 15185.61, N = 3 14180008 14188609 14102641 1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU A B C 4 8 12 16 20 SE +/- 0.09, N = 3 15.53 15.58 15.45 MIN: 15.19 MIN: 15.35 MIN: 15.22 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU A B C 4 8 12 16 20 SE +/- 0.02, N = 3 17.02 17.07 17.02 MIN: 16.94 MIN: 16.94 MIN: 16.91 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
Phoronix Test Suite v10.8.5