10700t comet weds Intel Core i7-10700T testing with a Logic Supply RXM-181 (Z01-0002A026 BIOS) and Intel UHD 630 CML GT2 3GB on Ubuntu 21.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2203303-NE-10700TCOM78&gru&export=pdf .
10700t comet weds Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server OpenGL Vulkan Compiler File-System Screen Resolution A B C Intel Core i7-10700T @ 4.50GHz (8 Cores / 16 Threads) Logic Supply RXM-181 (Z01-0002A026 BIOS) Intel Comet Lake PCH 32GB 256GB TS256GMTS800 Intel UHD 630 CML GT2 3GB (1200MHz) Realtek ALC233 DELL P2415Q Intel I219-LM + Intel I210 Ubuntu 21.10 5.13.0-35-generic (x86_64) GNOME Shell 40.5 X Server + Wayland 4.6 Mesa 21.2.2 1.2.182 GCC 11.2.0 ext4 1920x1080 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0xec - Thermald 2.4.6 Python Details - Python 3.9.7 Security Details - itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
10700t comet weds perf-bench: Memcpy 1MB perf-bench: Memset 1MB onnx: GPT-2 - CPU onnx: yolov4 - CPU onnx: bertsquad-12 - CPU onnx: fcn-resnet101-11 - CPU onnx: ArcFace ResNet-100 - CPU onnx: super-resolution-10 - CPU perf-bench: Epoll Wait perf-bench: Futex Hash perf-bench: Sched Pipe perf-bench: Futex Lock-Pi perf-bench: Syscall Basic onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU onednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU A B C 26.836621 43.091819 3644 203 303 39 694 2167 109366 3559455 163024 843 14180008 6.31917 9.74806 2.80373 2.62391 17.0221 14.0868 9.22489 15.5284 3.87824 5.46541 6791.09 3603.05 6792.71 3597.62 3.98170 6801.99 3612.75 2.26038 26.803861 43.234222 3713 203 306 39 703 2186 115261 3644833 161604 845 14188609 4.19472 9.64795 2.2005 2.48963 17.0692 11.0421 8.17558 15.5771 3.76516 4.65868 6778.99 3588.14 6780.39 3575.32 3.96065 6804 3619.68 2.30454 25.616662 41.195661 3711 204 112104 3674250 161030 848 14102641 4.18294 9.36624 2.26074 2.43947 17.017 12.9019 8.25954 15.4461 3.76013 4.60998 6787.52 3594.68 6786.74 3590.54 3.94494 6790.86 3600.38 2.24351 OpenBenchmarking.org
perf-bench Benchmark: Memcpy 1MB OpenBenchmarking.org GB/sec, More Is Better perf-bench Benchmark: Memcpy 1MB A B C 6 12 18 24 30 SE +/- 0.03, N = 3 26.84 26.80 25.62 1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
perf-bench Benchmark: Memset 1MB OpenBenchmarking.org GB/sec, More Is Better perf-bench Benchmark: Memset 1MB A B C 10 20 30 40 50 SE +/- 0.20, N = 3 43.09 43.23 41.20 1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
ONNX Runtime Model: GPT-2 - Device: CPU OpenBenchmarking.org Inferences Per Minute, More Is Better ONNX Runtime 1.11 Model: GPT-2 - Device: CPU A B C 800 1600 2400 3200 4000 SE +/- 41.41, N = 3 3644 3713 3711 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: yolov4 - Device: CPU OpenBenchmarking.org Inferences Per Minute, More Is Better ONNX Runtime 1.11 Model: yolov4 - Device: CPU A B C 40 80 120 160 200 SE +/- 0.44, N = 3 203 203 204 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: bertsquad-12 - Device: CPU OpenBenchmarking.org Inferences Per Minute, More Is Better ONNX Runtime 1.11 Model: bertsquad-12 - Device: CPU A B 70 140 210 280 350 SE +/- 1.48, N = 3 303 306 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: fcn-resnet101-11 - Device: CPU OpenBenchmarking.org Inferences Per Minute, More Is Better ONNX Runtime 1.11 Model: fcn-resnet101-11 - Device: CPU A B 9 18 27 36 45 SE +/- 0.29, N = 3 39 39 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: ArcFace ResNet-100 - Device: CPU OpenBenchmarking.org Inferences Per Minute, More Is Better ONNX Runtime 1.11 Model: ArcFace ResNet-100 - Device: CPU A B 150 300 450 600 750 SE +/- 1.89, N = 3 694 703 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime Model: super-resolution-10 - Device: CPU OpenBenchmarking.org Inferences Per Minute, More Is Better ONNX Runtime 1.11 Model: super-resolution-10 - Device: CPU A B 500 1000 1500 2000 2500 SE +/- 4.93, N = 3 2167 2186 1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
perf-bench Benchmark: Epoll Wait OpenBenchmarking.org ops/sec, More Is Better perf-bench Benchmark: Epoll Wait A B C 20K 40K 60K 80K 100K SE +/- 1498.86, N = 15 109366 115261 112104 1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
perf-bench Benchmark: Futex Hash OpenBenchmarking.org ops/sec, More Is Better perf-bench Benchmark: Futex Hash A B C 800K 1600K 2400K 3200K 4000K SE +/- 37483.12, N = 5 3559455 3644833 3674250 1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
perf-bench Benchmark: Sched Pipe OpenBenchmarking.org ops/sec, More Is Better perf-bench Benchmark: Sched Pipe A B C 30K 60K 90K 120K 150K SE +/- 537.38, N = 3 163024 161604 161030 1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
perf-bench Benchmark: Futex Lock-Pi OpenBenchmarking.org ops/sec, More Is Better perf-bench Benchmark: Futex Lock-Pi A B C 200 400 600 800 1000 SE +/- 0.00, N = 3 843 845 848 1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
perf-bench Benchmark: Syscall Basic OpenBenchmarking.org ops/sec, More Is Better perf-bench Benchmark: Syscall Basic A B C 3M 6M 9M 12M 15M SE +/- 15185.61, N = 3 14180008 14188609 14102641 1. (CC) gcc options: -O6 -ggdb3 -funwind-tables -std=gnu99 -lunwind-x86_64 -lunwind -llzma -Xlinker -lpthread -lrt -lm -ldl -lelf -lcrypto -lpython3.9 -lcrypt -lutil -lz -lnuma
oneDNN Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU A B C 2 4 6 8 10 SE +/- 0.18527, N = 12 6.31917 4.19472 4.18294 MIN: 3.82 MIN: 3.9 MIN: 3.89 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU A B C 3 6 9 12 15 SE +/- 0.03085, N = 3 9.74806 9.64795 9.36624 MIN: 9.5 MIN: 9.49 MIN: 9.21 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU A B C 0.6308 1.2616 1.8924 2.5232 3.154 SE +/- 0.03745, N = 12 2.80373 2.20050 2.26074 MIN: 1.77 MIN: 2.11 MIN: 2.03 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU A B C 0.5904 1.1808 1.7712 2.3616 2.952 SE +/- 0.01939, N = 12 2.62391 2.48963 2.43947 MIN: 2.27 MIN: 2.31 MIN: 2.25 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU A B C 4 8 12 16 20 SE +/- 0.02, N = 3 17.02 17.07 17.02 MIN: 16.94 MIN: 16.94 MIN: 16.91 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU A B C 4 8 12 16 20 SE +/- 0.12, N = 15 14.09 11.04 12.90 MIN: 9.05 MIN: 7.6 MIN: 7.8 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU A B C 3 6 9 12 15 SE +/- 0.11489, N = 15 9.22489 8.17558 8.25954 MIN: 7.85 MIN: 7.92 MIN: 7.97 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU A B C 4 8 12 16 20 SE +/- 0.09, N = 3 15.53 15.58 15.45 MIN: 15.19 MIN: 15.35 MIN: 15.22 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU A B C 0.8726 1.7452 2.6178 3.4904 4.363 SE +/- 0.04471, N = 3 3.87824 3.76516 3.76013 MIN: 3.24 MIN: 3.27 MIN: 3.27 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU A B C 1.2297 2.4594 3.6891 4.9188 6.1485 SE +/- 0.07628, N = 15 5.46541 4.65868 4.60998 MIN: 4.42 MIN: 4.51 MIN: 4.46 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU A B C 1500 3000 4500 6000 7500 SE +/- 8.07, N = 3 6791.09 6778.99 6787.52 MIN: 6710.84 MIN: 6696.67 MIN: 6720.9 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU A B C 800 1600 2400 3200 4000 SE +/- 1.86, N = 3 3603.05 3588.14 3594.68 MIN: 3543.01 MIN: 3534.18 MIN: 3540.53 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU A B C 1500 3000 4500 6000 7500 SE +/- 2.83, N = 3 6792.71 6780.39 6786.74 MIN: 6719.79 MIN: 6693.38 MIN: 6699.4 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU A B C 800 1600 2400 3200 4000 SE +/- 5.20, N = 3 3597.62 3575.32 3590.54 MIN: 3533.58 MIN: 3523.98 MIN: 3533.63 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU A B C 0.8959 1.7918 2.6877 3.5836 4.4795 SE +/- 0.00171, N = 3 3.98170 3.96065 3.94494 MIN: 3.87 MIN: 3.85 MIN: 3.84 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU A B C 1500 3000 4500 6000 7500 SE +/- 0.78, N = 3 6801.99 6804.00 6790.86 MIN: 6727 MIN: 6736.66 MIN: 6724.86 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU A B C 800 1600 2400 3200 4000 SE +/- 3.11, N = 3 3612.75 3619.68 3600.38 MIN: 3553.13 MIN: 3561.16 MIN: 3534.68 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
oneDNN Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 2.6 Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU A B C 0.5185 1.037 1.5555 2.074 2.5925 SE +/- 0.00803, N = 3 2.26038 2.30454 2.24351 MIN: 1.79 MIN: 1.76 MIN: 1.8 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread
Phoronix Test Suite v10.8.5