alder lake onnx perf bench more
Intel Core i9-12900K testing with an ASUS ROG STRIX Z690-E GAMING WIFI (1003 BIOS) and NVIDIA GeForce RTX 3090 24GB on Ubuntu 21.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2203268-NE-ALDERLAKE22&grr .
System details (runs A, B, C, D)
Processor: Intel Core i9-12900K @ 6.50GHz (16 Cores / 24 Threads)
Motherboard: ASUS ROG STRIX Z690-E GAMING WIFI (1003 BIOS)
Chipset: Intel Device 7aa7
Memory: 32GB
Disk: 1000GB Western Digital WDS100T1X0E-00AFY0 + 2000GB
Graphics: NVIDIA GeForce RTX 3090 24GB
Audio: Intel Device 7ad0
Monitor: ASUS VP28U
Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
OS: Ubuntu 21.10
Kernel: 5.13.0-35-generic (x86_64)
Desktop: GNOME Shell 40.5
Display Server: X Server 1.20.13
Display Driver: NVIDIA 510.54
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 11.6.110
Vulkan: 1.3.194
Compiler: GCC 11.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x18 - Thermald 2.4.6
Python Details - Python 3.9.7
Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
Test                                             A          B          C          D
onnx: fcn-resnet101-11 - CPU                     110        109        104        109
onnx: GPT-2 - CPU                                8129       8240       8365       8359
onnx: ArcFace ResNet-100 - CPU                   355        363        357        358
onnx: bertsquad-12 - CPU                         1014       1028       1017       1030
onnx: yolov4 - CPU                               640        628        639        631
onnx: super-resolution-10 - CPU                  4896       4825       4933       4830
fast-cli: Internet Loaded Latency (Bufferbloat)  66         73         17         58
fast-cli: Internet Latency                       13         15         12         10
fast-cli: Internet Upload Speed                  8.1        5.7        7.4        5.6
fast-cli: Internet Download Speed                360        340        110        370
perf-bench: Epoll Wait                           70630      72747      71499      76620
perf-bench: Futex Lock-Pi                        573        574        572        574
perf-bench: Futex Hash                           6042276    6044069    6056727    6045095
speedtest-cli: Internet Latency                  22.483     16.822     32.882     18.787
speedtest-cli: Internet Upload Speed             10.05      7.85       10.22      8.69
speedtest-cli: Internet Download Speed           325.86     283.52     114.62     230.95
perf-bench: Sched Pipe                           355673     291778     375581     322287
perf-bench: Memcpy 1MB                           34.125043  33.939201  33.039571  33.057891
perf-bench: Memset 1MB                           72.47884   76.26886   75.432891  76.431043
perf-bench: Syscall Basic                        20929959   20980604   20988142   21305407
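The four runs A to D summarized above can be compared by looking at how far the results of each test spread around their mean. The following is a minimal Python sketch of that arithmetic, not part of the Phoronix Test Suite output: the helper names `spread_percent` and `summarize` are hypothetical, the spread definition ((max - min) / mean) is an assumption chosen for simplicity, and only a subset of the tests is copied in.

```python
from statistics import mean

# Values copied from the result overview (runs A, B, C, D).
# Only a subset of the tests is included here for brevity.
results = {
    "onnx: fcn-resnet101-11 - CPU": [110, 109, 104, 109],
    "onnx: GPT-2 - CPU": [8129, 8240, 8365, 8359],
    "perf-bench: Futex Lock-Pi": [573, 574, 572, 574],
    "perf-bench: Sched Pipe": [355673, 291778, 375581, 322287],
    "fast-cli: Internet Download Speed": [360, 340, 110, 370],
    "speedtest-cli: Internet Download Speed": [325.86, 283.52, 114.62, 230.95],
}


def spread_percent(values):
    """Return (max - min) / mean of the values, as a percentage."""
    return 100.0 * (max(values) - min(values)) / mean(values)


def summarize(table):
    """Map each test name to a (mean, spread in percent) tuple."""
    return {name: (mean(v), spread_percent(v)) for name, v in table.items()}


if __name__ == "__main__":
    for name, (avg, spread) in summarize(results).items():
        print(f"{name:40s} mean={avg:12.2f} spread={spread:6.1f}%")
```

Under this definition, perf-bench Futex Lock-Pi spreads by well under 1% across the four runs, while the fast-cli download speed spreads by about 88%. The fast-cli and speedtest-cli tests measure the internet connection, so their variation says little about the processor under test.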
ONNX Runtime 1.11, Model: fcn-resnet101-11 - Device: CPU (Inferences Per Minute, More Is Better)
A: 110, B: 109, C: 104, D: 109
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.11, Model: GPT-2 - Device: CPU (Inferences Per Minute, More Is Better)
A: 8129, B: 8240, C: 8365, D: 8359
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.11, Model: ArcFace ResNet-100 - Device: CPU (Inferences Per Minute, More Is Better)
A: 355, B: 363, C: 357, D: 358
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.11, Model: bertsquad-12 - Device: CPU (Inferences Per Minute, More Is Better)
A: 1014, B: 1028, C: 1017, D: 1030
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.11, Model: yolov4 - Device: CPU (Inferences Per Minute, More Is Better)
A: 640, B: 628, C: 639, D: 631
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
ONNX Runtime 1.11, Model: super-resolution-10 - Device: CPU (Inferences Per Minute, More Is Better)
A: 4896, B: 4825, C: 4933, D: 4830
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
fast-cli, Internet Loaded Latency (Bufferbloat) (ms, Fewer Is Better)
A: 66, B: 73, C: 17, D: 58
fast-cli, Internet Latency (ms, Fewer Is Better)
A: 13, B: 15, C: 12, D: 10
fast-cli, Internet Upload Speed (Mbit/s, More Is Better)
A: 8.1, B: 5.7, C: 7.4, D: 5.6
fast-cli, Internet Download Speed (Mbit/s, More Is Better)
A: 360, B: 340, C: 110, D: 370
perf-bench, Benchmark: Epoll Wait (ops/sec, More Is Better)
A: 70630, B: 72747, C: 71499, D: 76620
1. (CC) gcc options: -pthread -shared -Xlinker -O6 -ggdb3 -funwind-tables -std=gnu99
perf-bench, Benchmark: Futex Lock-Pi (ops/sec, More Is Better)
A: 573, B: 574, C: 572, D: 574
1. (CC) gcc options: -pthread -shared -Xlinker -O6 -ggdb3 -funwind-tables -std=gnu99
perf-bench, Benchmark: Futex Hash (ops/sec, More Is Better)
A: 6042276, B: 6044069, C: 6056727, D: 6045095
1. (CC) gcc options: -pthread -shared -Xlinker -O6 -ggdb3 -funwind-tables -std=gnu99
speedtest-cli 2.1.3, Internet Latency (ms, Fewer Is Better)
A: 22.48, B: 16.82, C: 32.88, D: 18.79
speedtest-cli 2.1.3, Internet Upload Speed (Mbit/s, More Is Better)
A: 10.05, B: 7.85, C: 10.22, D: 8.69
speedtest-cli 2.1.3, Internet Download Speed (Mbit/s, More Is Better)
A: 325.86, B: 283.52, C: 114.62, D: 230.95
perf-bench, Benchmark: Sched Pipe (ops/sec, More Is Better)
A: 355673, B: 291778, C: 375581, D: 322287
1. (CC) gcc options: -pthread -shared -Xlinker -O6 -ggdb3 -funwind-tables -std=gnu99
perf-bench, Benchmark: Memcpy 1MB (GB/sec, More Is Better)
A: 34.13, B: 33.94, C: 33.04, D: 33.06
1. (CC) gcc options: -pthread -shared -Xlinker -O6 -ggdb3 -funwind-tables -std=gnu99
perf-bench, Benchmark: Memset 1MB (GB/sec, More Is Better)
A: 72.48, B: 76.27, C: 75.43, D: 76.43
1. (CC) gcc options: -pthread -shared -Xlinker -O6 -ggdb3 -funwind-tables -std=gnu99
perf-bench, Benchmark: Syscall Basic (ops/sec, More Is Better)
A: 20929959, B: 20980604, C: 20988142, D: 21305407
1. (CC) gcc options: -pthread -shared -Xlinker -O6 -ggdb3 -funwind-tables -std=gnu99
Phoronix Test Suite v10.8.4