10700t comet weds
Intel Core i7-10700T testing with a Logic Supply RXM-181 (Z01-0002A026 BIOS) and Intel UHD 630 CML GT2 3GB on Ubuntu 21.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2203303-NE-10700TCOM78&sor&grr
10700t comet weds
Runs: A, B, C
  Processor: Intel Core i7-10700T @ 4.50GHz (8 Cores / 16 Threads)
  Motherboard: Logic Supply RXM-181 (Z01-0002A026 BIOS)
  Chipset: Intel Comet Lake PCH
  Memory: 32GB
  Disk: 256GB TS256GMTS800
  Graphics: Intel UHD 630 CML GT2 3GB (1200MHz)
  Audio: Realtek ALC233
  Monitor: DELL P2415Q
  Network: Intel I219-LM + Intel I210
  OS: Ubuntu 21.10
  Kernel: 5.13.0-35-generic (x86_64)
  Desktop: GNOME Shell 40.5
  Display Server: X Server + Wayland
  OpenGL: 4.6 Mesa 21.2.2
  Vulkan: 1.2.182
  Compiler: GCC 11.2.0
  File-System: ext4
  Screen Resolution: 1920x1080
Kernel Details
  - Transparent Huge Pages: madvise
Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  - Scaling Governor: intel_pstate powersave (EPP: balance_performance)
  - CPU Microcode: 0xec
  - Thermald 2.4.6
Python Details
  - Python 3.9.7
Security Details
  - itlb_multihit: KVM: Mitigation of VMX disabled
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling
  - srbds: Not affected
  - tsx_async_abort: Not affected
10700t comet weds: result overview
Test                                                              A           B           C
onnx: fcn-resnet101-11 - CPU                                      39          39          -
onnx: bertsquad-12 - CPU                                          303         306         -
onnx: ArcFace ResNet-100 - CPU                                    694         703         -
onnx: super-resolution-10 - CPU                                   2167        2186        -
onnx: GPT-2 - CPU                                                 3644        3713        3711
onnx: yolov4 - CPU                                                203         203         204
perf-bench: Epoll Wait                                            109366      115261      112104
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU    6801.99     6804        6790.86
onednn: Recurrent Neural Network Training - u8s8f32 - CPU         6792.71     6780.39     6786.74
onednn: Recurrent Neural Network Training - f32 - CPU             6791.09     6778.99     6787.52
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU   3612.75     3619.68     3600.38
onednn: Recurrent Neural Network Inference - f32 - CPU            3603.05     3588.14     3594.68
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU        3597.62     3575.32     3590.54
onednn: Deconvolution Batch shapes_1d - f32 - CPU                 14.0868     11.0421     12.9019
onednn: IP Shapes 1D - f32 - CPU                                  6.31917     4.19472     4.18294
onednn: IP Shapes 1D - u8s8f32 - CPU                              2.80373     2.2005      2.26074
perf-bench: Futex Hash                                            3559455     3644833     3674250
perf-bench: Sched Pipe                                            163024      161604      161030
perf-bench: Futex Lock-Pi                                         843         845         848
onednn: IP Shapes 3D - u8s8f32 - CPU                              2.62391     2.48963     2.43947
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU             3.87824     3.76516     3.76013
perf-bench: Memcpy 1MB                                            26.836621   26.803861   25.616662
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU      3.98170     3.96065     3.94494
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU  2.26038     2.30454     2.24351
perf-bench: Memset 1MB                                            43.091819   43.234222   41.195661
onednn: Deconvolution Batch shapes_3d - f32 - CPU                 9.22489     8.17558     8.25954
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU             5.46541     4.65868     4.60998
onednn: IP Shapes 3D - f32 - CPU                                  9.74806     9.64795     9.36624
perf-bench: Syscall Basic                                         14180008    14188609    14102641
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU             15.5284     15.5771     15.4461
onednn: Convolution Batch Shapes Auto - f32 - CPU                 17.0221     17.0692     17.017
onednn: IP Shapes 1D - bf16bf16bf16 - CPU                         -           -           -
ONNX Runtime 1.11 - Model: fcn-resnet101-11 - Device: CPU
  Inferences Per Minute, More Is Better
  B: 39
  A: 39
  SE +/- 0.29, N = 3
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.11 - Model: bertsquad-12 - Device: CPU
  Inferences Per Minute, More Is Better
  B: 306
  A: 303
  SE +/- 1.48, N = 3
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.11 - Model: ArcFace ResNet-100 - Device: CPU
  Inferences Per Minute, More Is Better
  B: 703
  A: 694
  SE +/- 1.89, N = 3
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.11 - Model: super-resolution-10 - Device: CPU
  Inferences Per Minute, More Is Better
  B: 2186
  A: 2167
  SE +/- 4.93, N = 3
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.11 - Model: GPT-2 - Device: CPU
  Inferences Per Minute, More Is Better
  B: 3713
  C: 3711
  A: 3644
  SE +/- 41.41, N = 3
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.11 - Model: yolov4 - Device: CPU
  Inferences Per Minute, More Is Better
  C: 204
  B: 203
  A: 203
  SE +/- 0.44, N = 3
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
perf-bench - Benchmark: Epoll Wait
  ops/sec, More Is Better
  B: 115261
  C: 112104
  A: 109366
  SE +/- 1498.86, N = 15
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
oneDNN 2.6 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
  ms, Fewer Is Better
  C: 6790.86 (MIN: 6724.86)
  A: 6801.99 (MIN: 6727)
  B: 6804.00 (MIN: 6736.66)
  SE +/- 0.78, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  B: 6780.39 (MIN: 6693.38)
  C: 6786.74 (MIN: 6699.4)
  A: 6792.71 (MIN: 6719.79)
  SE +/- 2.83, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  B: 6778.99 (MIN: 6696.67)
  C: 6787.52 (MIN: 6720.9)
  A: 6791.09 (MIN: 6710.84)
  SE +/- 8.07, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
  ms, Fewer Is Better
  C: 3600.38 (MIN: 3534.68)
  A: 3612.75 (MIN: 3553.13)
  B: 3619.68 (MIN: 3561.16)
  SE +/- 3.11, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  B: 3588.14 (MIN: 3534.18)
  C: 3594.68 (MIN: 3540.53)
  A: 3603.05 (MIN: 3543.01)
  SE +/- 1.86, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  B: 3575.32 (MIN: 3523.98)
  C: 3590.54 (MIN: 3533.63)
  A: 3597.62 (MIN: 3533.58)
  SE +/- 5.20, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  B: 11.04 (MIN: 7.6)
  C: 12.90 (MIN: 7.8)
  A: 14.09 (MIN: 9.05)
  SE +/- 0.12, N = 15
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  C: 4.18294 (MIN: 3.89)
  B: 4.19472 (MIN: 3.9)
  A: 6.31917 (MIN: 3.82)
  SE +/- 0.18527, N = 12
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  B: 2.20050 (MIN: 2.11)
  C: 2.26074 (MIN: 2.03)
  A: 2.80373 (MIN: 1.77)
  SE +/- 0.03745, N = 12
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
perf-bench - Benchmark: Futex Hash
  ops/sec, More Is Better
  C: 3674250
  B: 3644833
  A: 3559455
  SE +/- 37483.12, N = 5
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
perf-bench - Benchmark: Sched Pipe
  ops/sec, More Is Better
  A: 163024
  B: 161604
  C: 161030
  SE +/- 537.38, N = 3
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
perf-bench - Benchmark: Futex Lock-Pi
  ops/sec, More Is Better
  C: 848
  B: 845
  A: 843
  SE +/- 0.00, N = 3
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
oneDNN 2.6 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  C: 2.43947 (MIN: 2.25)
  B: 2.48963 (MIN: 2.31)
  A: 2.62391 (MIN: 2.27)
  SE +/- 0.01939, N = 12
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  C: 3.76013 (MIN: 3.27)
  B: 3.76516 (MIN: 3.27)
  A: 3.87824 (MIN: 3.24)
  SE +/- 0.04471, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
perf-bench - Benchmark: Memcpy 1MB
  GB/sec, More Is Better
  A: 26.84
  B: 26.80
  C: 25.62
  SE +/- 0.03, N = 3
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
oneDNN 2.6 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  C: 3.94494 (MIN: 3.84)
  B: 3.96065 (MIN: 3.85)
  A: 3.98170 (MIN: 3.87)
  SE +/- 0.00171, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  C: 2.24351 (MIN: 1.8)
  A: 2.26038 (MIN: 1.79)
  B: 2.30454 (MIN: 1.76)
  SE +/- 0.00803, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
perf-bench - Benchmark: Memset 1MB
  GB/sec, More Is Better
  B: 43.23
  A: 43.09
  C: 41.20
  SE +/- 0.20, N = 3
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
oneDNN 2.6 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  B: 8.17558 (MIN: 7.92)
  C: 8.25954 (MIN: 7.97)
  A: 9.22489 (MIN: 7.85)
  SE +/- 0.11489, N = 15
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  C: 4.60998 (MIN: 4.46)
  B: 4.65868 (MIN: 4.51)
  A: 5.46541 (MIN: 4.42)
  SE +/- 0.07628, N = 15
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  C: 9.36624 (MIN: 9.21)
  B: 9.64795 (MIN: 9.49)
  A: 9.74806 (MIN: 9.5)
  SE +/- 0.03085, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
perf-bench - Benchmark: Syscall Basic
  ops/sec, More Is Better
  B: 14188609
  A: 14180008
  C: 14102641
  SE +/- 15185.61, N = 3
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
oneDNN 2.6 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
  ms, Fewer Is Better
  C: 15.45 (MIN: 15.22)
  A: 15.53 (MIN: 15.19)
  B: 15.58 (MIN: 15.35)
  SE +/- 0.09, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
  ms, Fewer Is Better
  C: 17.02 (MIN: 16.91)
  A: 17.02 (MIN: 16.94)
  B: 17.07 (MIN: 16.94)
  SE +/- 0.02, N = 3
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
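One way to gauge how much runs A, B and C differ on a given test is to express the gap between the largest and smallest reported average as a percentage of the smallest. The Python sketch below is illustrative only: the helper name `spread_percent` is hypothetical and not part of the Phoronix Test Suite, and the values are copied by hand from the oneDNN results reported above.

```python
def spread_percent(results):
    """Gap between the largest and smallest average, as a percentage of the smallest."""
    lo = min(results.values())
    hi = max(results.values())
    return (hi - lo) / lo * 100.0


# Averages in ms, copied by hand from the results above (fewer is better).
ip_shapes_1d_f32_ms = {"A": 6.31917, "B": 4.19472, "C": 4.18294}
rnn_training_f32_ms = {"A": 6791.09, "B": 6778.99, "C": 6787.52}

for name, runs in [
    ("oneDNN IP Shapes 1D f32", ip_shapes_1d_f32_ms),
    ("oneDNN RNN Training f32", rnn_training_f32_ms),
]:
    print(f"{name}: {spread_percent(runs):.2f}% spread across runs")
```

With these inputs, the IP Shapes 1D f32 spread is roughly 51%, while the RNN Training f32 spread stays under 0.2%. The same calculation works for the "More Is Better" tests, since it only compares the extremes of the three averages.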
Phoronix Test Suite v10.8.4