10700t comet weds: Intel Core i7-10700T testing with a Logic Supply RXM-181 (Z01-0002A026 BIOS) and Intel UHD 630 CML GT2 3GB on Ubuntu 21.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2203303-NE-10700TCOM78&sor&grs .
System configuration (identical for runs A, B, and C):

  Processor:         Intel Core i7-10700T @ 4.50GHz (8 Cores / 16 Threads)
  Motherboard:       Logic Supply RXM-181 (Z01-0002A026 BIOS)
  Chipset:           Intel Comet Lake PCH
  Memory:            32GB
  Disk:              256GB TS256GMTS800
  Graphics:          Intel UHD 630 CML GT2 3GB (1200MHz)
  Audio:             Realtek ALC233
  Monitor:           DELL P2415Q
  Network:           Intel I219-LM + Intel I210
  OS:                Ubuntu 21.10
  Kernel:            5.13.0-35-generic (x86_64)
  Desktop:           GNOME Shell 40.5
  Display Server:    X Server + Wayland
  OpenGL:            4.6 Mesa 21.2.2
  Vulkan:            1.2.182
  Compiler:          GCC 11.2.0
  File-System:       ext4
  Screen Resolution: 1920x1080

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0xec - Thermald 2.4.6

Python Details: Python 3.9.7

Security Details: itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Results summary for runs A, B, and C (onednn results in ms, lower is better; perf-bench in ops/sec, except Memset/Memcpy 1MB in GB/sec, higher is better; onnx results in Inferences Per Minute, higher is better; n/a = no result recorded for that run):

  Test                                                                  A           B           C
  onednn: Deconvolution Batch shapes_1d - f32 - CPU                     14.0868     11.0421     12.9019
  onednn: IP Shapes 1D - u8s8f32 - CPU                                  2.80373     2.2005      2.26074
  onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU                 5.46541     4.65868     4.60998
  onednn: Deconvolution Batch shapes_3d - f32 - CPU                     9.22489     8.17558     8.25954
  onednn: IP Shapes 3D - u8s8f32 - CPU                                  2.62391     2.48963     2.43947
  perf-bench: Epoll Wait                                                109366      115261      112104
  perf-bench: Memset 1MB                                                43.091819   43.234222   41.195661
  perf-bench: Memcpy 1MB                                                26.836621   26.803861   25.616662
  onednn: IP Shapes 3D - f32 - CPU                                      9.74806     9.64795     9.36624
  perf-bench: Futex Hash                                                3559455     3644833     3674250
  onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU                 3.87824     3.76516     3.76013
  onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU      2.26038     2.30454     2.24351
  onnx: GPT-2 - CPU                                                     3644        3713        3711
  onnx: ArcFace ResNet-100 - CPU                                        694         703         n/a
  perf-bench: Sched Pipe                                                163024      161604      161030
  onnx: bertsquad-12 - CPU                                              303         306         n/a
  onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU          3.98170     3.96065     3.94494
  onnx: super-resolution-10 - CPU                                       2167        2186        n/a
  onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU                 15.5284     15.5771     15.4461
  onednn: Recurrent Neural Network Inference - u8s8f32 - CPU            3597.62     3575.32     3590.54
  perf-bench: Syscall Basic                                             14180008    14188609    14102641
  perf-bench: Futex Lock-Pi                                             843         845         848
  onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU       3612.75     3619.68     3600.38
  onnx: yolov4 - CPU                                                    203         203         204
  onednn: Recurrent Neural Network Inference - f32 - CPU                3603.05     3588.14     3594.68
  onednn: Convolution Batch Shapes Auto - f32 - CPU                     17.0221     17.0692     17.017
  onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU        6801.99     6804        6790.86
  onednn: Recurrent Neural Network Training - u8s8f32 - CPU             6792.71     6780.39     6786.74
  onednn: Recurrent Neural Network Training - f32 - CPU                 6791.09     6778.99     6787.52
  onnx: fcn-resnet101-11 - CPU                                          39          39          n/a
  onednn: IP Shapes 1D - f32 - CPU                                      6.31917     4.19472     4.18294
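The linked result page is ordered by greatest result spread (the &grs flag in the URL). Purely as an illustration of that metric, the Python sketch below computes the best-to-worst spread across runs A, B, and C for a few rows of the summary table; the dictionary and helper are ours, not part of the Phoronix Test Suite.

    # Illustrative sketch: relative spread across runs A, B, and C, using
    # values copied from the summary table above.
    results = {
        "onednn: IP Shapes 1D - f32 - CPU (ms)":            (6.31917, 4.19472, 4.18294),
        "onednn: Deconvolution Batch shapes_1d - f32 (ms)":  (14.0868, 11.0421, 12.9019),
        "perf-bench: Epoll Wait (ops/sec)":                  (109366, 115261, 112104),
    }

    def spread_pct(values):
        """Spread between the best and worst run, as a percentage of the minimum."""
        return (max(values) - min(values)) / min(values) * 100.0

    for test, (a, b, c) in results.items():
        print(f"{test}: A={a} B={b} C={c} spread={spread_pct((a, b, c)):.1f}%")

For the IP Shapes 1D f32 case, for example, this works out to roughly a 51% spread between the slowest (A) and fastest (C) run, which is why it sorts near the top of the page.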
oneDNN 2.6 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, fewer is better; SE +/- 0.12, N = 15)
  B: 11.04 (MIN: 7.6)
  C: 12.90 (MIN: 7.8)
  A: 14.09 (MIN: 9.05)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
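The oneDNN numbers in this report come from oneDNN's bundled benchdnn harness. As a rough sketch only, the snippet below shows how one might drive a comparable f32 CPU deconvolution run from Python; the binary location, driver flags, and batch-file path are assumptions based on benchdnn's documented interface, not the exact command used by the Phoronix Test Suite test profile.

    # Assumed invocation of oneDNN's benchdnn deconvolution driver in
    # performance mode; paths and flags may differ from what PTS actually runs.
    import subprocess

    cmd = [
        "./benchdnn",                       # benchdnn binary built with oneDNN 2.6
        "--deconv",                         # deconvolution driver
        "--mode=P",                         # performance (timing) mode
        "--cfg=f32",                        # f32 data-type configuration
        "--batch=inputs/deconv/shapes_1d",  # assumed batch file for "shapes_1d"
    ]
    print(subprocess.run(cmd, capture_output=True, text=True).stdout)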
oneDNN 2.6 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better; SE +/- 0.03745, N = 12)
  B: 2.20050 (MIN: 2.11)
  C: 2.26074 (MIN: 2.03)
  A: 2.80373 (MIN: 1.77)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better; SE +/- 0.07628, N = 15)
  C: 4.60998 (MIN: 4.46)
  B: 4.65868 (MIN: 4.51)
  A: 5.46541 (MIN: 4.42)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, fewer is better; SE +/- 0.11489, N = 15)
  B: 8.17558 (MIN: 7.92)
  C: 8.25954 (MIN: 7.97)
  A: 9.22489 (MIN: 7.85)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better; SE +/- 0.01939, N = 12)
  C: 2.43947 (MIN: 2.25)
  B: 2.48963 (MIN: 2.31)
  A: 2.62391 (MIN: 2.27)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
perf-bench Benchmark: Epoll Wait (ops/sec, more is better; SE +/- 1498.86, N = 15)
  B: 115261
  C: 112104
  A: 109366
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
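The perf-bench results are reported by the kernel's perf bench tool. A minimal sketch of launching the same epoll-wait microbenchmark from Python follows; output parsing is omitted because the report format varies between kernel versions.

    # Minimal sketch: run the kernel's epoll-wait microbenchmark and print its
    # report (operations/sec), matching the metric charted above.
    import subprocess

    proc = subprocess.run(
        ["perf", "bench", "epoll", "wait"],
        capture_output=True, text=True, check=True,
    )
    print(proc.stdout)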
perf-bench Benchmark: Memset 1MB (GB/sec, more is better; SE +/- 0.20, N = 3)
  B: 43.23
  A: 43.09
  C: 41.20
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
perf-bench Benchmark: Memcpy 1MB (GB/sec, more is better; SE +/- 0.03, N = 3)
  A: 26.84
  B: 26.80
  C: 25.62
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
oneDNN 2.6 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, fewer is better; SE +/- 0.03085, N = 3)
  C: 9.36624 (MIN: 9.21)
  B: 9.64795 (MIN: 9.49)
  A: 9.74806 (MIN: 9.5)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
perf-bench Benchmark: Futex Hash (ops/sec, more is better; SE +/- 37483.12, N = 5)
  C: 3674250
  B: 3644833
  A: 3559455
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
oneDNN 2.6 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better; SE +/- 0.04471, N = 3)
  C: 3.76013 (MIN: 3.27)
  B: 3.76516 (MIN: 3.27)
  A: 3.87824 (MIN: 3.24)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better; SE +/- 0.00803, N = 3)
  C: 2.24351 (MIN: 1.8)
  A: 2.26038 (MIN: 1.79)
  B: 2.30454 (MIN: 1.76)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
ONNX Runtime 1.11 - Model: GPT-2 - Device: CPU (Inferences Per Minute, more is better; SE +/- 41.41, N = 3)
  B: 3713
  C: 3711
  A: 3644
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
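The Phoronix Test Suite drives ONNX Runtime 1.11 through its own benchmark harness. Purely as an illustration of the metric, the sketch below times CPU inference with the onnxruntime Python API and converts the result to inferences per minute; the model path is hypothetical and dynamic input dimensions are simply filled with 1, which will not suit every model.

    # Illustrative only: measure inferences per minute for an ONNX model on the
    # CPU execution provider. "gpt2.onnx" is a placeholder path.
    import time
    import numpy as np
    import onnxruntime as ort

    sess = ort.InferenceSession("gpt2.onnx", providers=["CPUExecutionProvider"])
    feeds = {}
    for inp in sess.get_inputs():
        shape = [d if isinstance(d, int) else 1 for d in inp.shape]  # fill dynamic dims
        dtype = np.int64 if "int64" in inp.type else np.float32
        feeds[inp.name] = np.zeros(shape, dtype=dtype)

    runs, start = 100, time.time()
    for _ in range(runs):
        sess.run(None, feeds)
    print(f"{runs / (time.time() - start) * 60:.0f} inferences per minute")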
ONNX Runtime 1.11 - Model: ArcFace ResNet-100 - Device: CPU (Inferences Per Minute, more is better; SE +/- 1.89, N = 3)
  B: 703
  A: 694
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
perf-bench Benchmark: Sched Pipe (ops/sec, more is better; SE +/- 537.38, N = 3)
  A: 163024
  B: 161604
  C: 161030
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
ONNX Runtime 1.11 - Model: bertsquad-12 - Device: CPU (Inferences Per Minute, more is better; SE +/- 1.48, N = 3)
  B: 306
  A: 303
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
oneDNN 2.6 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, fewer is better; SE +/- 0.00171, N = 3)
  C: 3.94494 (MIN: 3.84)
  B: 3.96065 (MIN: 3.85)
  A: 3.98170 (MIN: 3.87)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
ONNX Runtime 1.11 - Model: super-resolution-10 - Device: CPU (Inferences Per Minute, more is better; SE +/- 4.93, N = 3)
  B: 2186
  A: 2167
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
oneDNN 2.6 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better; SE +/- 0.09, N = 3)
  C: 15.45 (MIN: 15.22)
  A: 15.53 (MIN: 15.19)
  B: 15.58 (MIN: 15.35)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better; SE +/- 5.20, N = 3)
  B: 3575.32 (MIN: 3523.98)
  C: 3590.54 (MIN: 3533.63)
  A: 3597.62 (MIN: 3533.58)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
perf-bench Benchmark: Syscall Basic (ops/sec, more is better; SE +/- 15185.61, N = 3)
  B: 14188609
  A: 14180008
  C: 14102641
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
perf-bench Benchmark: Futex Lock-Pi (ops/sec, more is better; SE +/- 0.00, N = 3)
  C: 848
  B: 845
  A: 843
  1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
oneDNN 2.6 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better; SE +/- 3.11, N = 3)
  C: 3600.38 (MIN: 3534.68)
  A: 3612.75 (MIN: 3553.13)
  B: 3619.68 (MIN: 3561.16)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
ONNX Runtime 1.11 - Model: yolov4 - Device: CPU (Inferences Per Minute, more is better; SE +/- 0.44, N = 3)
  C: 204
  B: 203
  A: 203
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
oneDNN 2.6 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, fewer is better; SE +/- 1.86, N = 3)
  B: 3588.14 (MIN: 3534.18)
  C: 3594.68 (MIN: 3540.53)
  A: 3603.05 (MIN: 3543.01)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, fewer is better; SE +/- 0.02, N = 3)
  C: 17.02 (MIN: 16.91)
  A: 17.02 (MIN: 16.94)
  B: 17.07 (MIN: 16.94)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, fewer is better; SE +/- 0.78, N = 3)
  C: 6790.86 (MIN: 6724.86)
  A: 6801.99 (MIN: 6727)
  B: 6804.00 (MIN: 6736.66)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, fewer is better; SE +/- 2.83, N = 3)
  B: 6780.39 (MIN: 6693.38)
  C: 6786.74 (MIN: 6699.4)
  A: 6792.71 (MIN: 6719.79)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN 2.6 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, fewer is better; SE +/- 8.07, N = 3)
  B: 6778.99 (MIN: 6696.67)
  C: 6787.52 (MIN: 6720.9)
  A: 6791.09 (MIN: 6710.84)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
ONNX Runtime 1.11 - Model: fcn-resnet101-11 - Device: CPU (Inferences Per Minute, more is better; SE +/- 0.29, N = 3)
  B: 39
  A: 39
  1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
oneDNN 2.6 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, fewer is better; SE +/- 0.18527, N = 12)
  C: 4.18294 (MIN: 3.89)
  B: 4.19472 (MIN: 3.9)
  A: 6.31917 (MIN: 3.82)
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
Phoronix Test Suite v10.8.5