alder lake onnx perf bench more
Intel Core i9-12900K testing with an ASUS ROG STRIX Z690-E GAMING WIFI (1003 BIOS) and NVIDIA GeForce RTX 3090 24GB on Ubuntu 21.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2203268-NE-ALDERLAKE22&grs&sor .
Runs: A, B, C, D

Processor:          Intel Core i9-12900K @ 6.50GHz (16 Cores / 24 Threads)
Motherboard:        ASUS ROG STRIX Z690-E GAMING WIFI (1003 BIOS)
Chipset:            Intel Device 7aa7
Memory:             32GB
Disk:               1000GB Western Digital WDS100T1X0E-00AFY0 + 2000GB
Graphics:           NVIDIA GeForce RTX 3090 24GB
Audio:              Intel Device 7ad0
Monitor:            ASUS VP28U
Network:            Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
OS:                 Ubuntu 21.10
Kernel:             5.13.0-35-generic (x86_64)
Desktop:            GNOME Shell 40.5
Display Server:     X Server 1.20.13
Display Driver:     NVIDIA 510.54
OpenGL:             4.6.0
OpenCL:             OpenCL 3.0 CUDA 11.6.110
Vulkan:             1.3.194
Compiler:           GCC 11.2.0
File-System:        ext4
Screen Resolution:  3840x2160

Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-ZPT0kp/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x18 - Thermald 2.4.6
Python Details - Python 3.9.7
Security Details:
  itlb_multihit:      Not affected
  l1tf:               Not affected
  mds:                Not affected
  meltdown:           Not affected
  spec_store_bypass:  Mitigation of SSB disabled via prctl and seccomp
  spectre_v1:         Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2:         Mitigation of Enhanced IBRS IBPB: conditional RSB filling
  srbds:              Not affected
  tsx_async_abort:    Not affected
Results summary

Test                                             A          B          C          D
fast-cli: Internet Loaded Latency (Bufferbloat)  66         73         17         58
fast-cli: Internet Download Speed                360        340        110        370
speedtest-cli: Internet Download Speed           325.86     283.52     114.62     230.95
speedtest-cli: Internet Latency                  22.483     16.822     32.882     18.787
fast-cli: Internet Latency                       13         15         12         10
fast-cli: Internet Upload Speed                  8.1        5.7        7.4        5.6
speedtest-cli: Internet Upload Speed             10.05      7.85       10.22      8.69
perf-bench: Sched Pipe                           355673     291778     375581     322287
perf-bench: Epoll Wait                           70630      72747      71499      76620
onnx: fcn-resnet101-11 - CPU                     110        109        104        109
perf-bench: Memset 1MB                           72.47884   76.26886   75.432891  76.431043
perf-bench: Memcpy 1MB                           34.125043  33.939201  33.039571  33.057891
onnx: GPT-2 - CPU                                8129       8240       8365       8359
onnx: ArcFace ResNet-100 - CPU                   355        363        357        358
onnx: super-resolution-10 - CPU                  4896       4825       4933       4830
onnx: yolov4 - CPU                               640        628        639        631
perf-bench: Syscall Basic                        20929959   20980604   20988142   21305407
onnx: bertsquad-12 - CPU                         1014       1028       1017       1030
perf-bench: Futex Lock-Pi                        573        574        572        574
perf-bench: Futex Hash                           6042276    6044069    6056727    6045095
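The four runs share one system configuration, so the gap between the best and worst run gives a rough sense of how noisy each test is. The sketch below is not part of the Phoronix Test Suite output: `relative_spread` and `spread_report` are hypothetical helper names, the spread metric (max minus min, divided by the mean) is an assumption chosen for simplicity, and only a few rows from the results summary above are copied in.

```python
from statistics import mean

# Values copied from the results summary, in run order A, B, C, D.
results = {
    "fast-cli: Internet Download Speed": [360, 340, 110, 370],
    "speedtest-cli: Internet Latency": [22.483, 16.822, 32.882, 18.787],
    "perf-bench: Sched Pipe": [355673, 291778, 375581, 322287],
    "onnx: GPT-2 - CPU": [8129, 8240, 8365, 8359],
    "perf-bench: Futex Hash": [6042276, 6044069, 6056727, 6045095],
}


def relative_spread(values):
    """Return (max - min) / mean of the runs, as a percentage."""
    return 100.0 * (max(values) - min(values)) / mean(values)


def spread_report(table):
    """Map each test name to its relative spread, rounded to one decimal."""
    return {name: round(relative_spread(vals), 1) for name, vals in table.items()}


# Print the noisiest tests first.
for name, pct in sorted(spread_report(results).items(), key=lambda kv: -kv[1]):
    print(f"{name:40s} {pct:6.1f}%")
```

On these numbers the network tests and Sched Pipe differ by tens of percent between runs, while GPT-2 and Futex Hash stay within a few percent, so small differences in the noisier tests should not be over-interpreted.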
Per-test results (runs listed best to worst)

fast-cli - Internet Loaded Latency (Bufferbloat) - ms, Fewer Is Better
  C: 17   D: 58   A: 66   B: 73
fast-cli - Internet Download Speed - Mbit/s, More Is Better
  D: 370   A: 360   B: 340   C: 110
speedtest-cli 2.1.3 - Internet Download Speed - Mbit/s, More Is Better
  A: 325.86   B: 283.52   D: 230.95   C: 114.62
speedtest-cli 2.1.3 - Internet Latency - ms, Fewer Is Better
  B: 16.82   D: 18.79   A: 22.48   C: 32.88
fast-cli - Internet Latency - ms, Fewer Is Better
  D: 10   C: 12   A: 13   B: 15
fast-cli - Internet Upload Speed - Mbit/s, More Is Better
  A: 8.1   C: 7.4   B: 5.7   D: 5.6
speedtest-cli 2.1.3 - Internet Upload Speed - Mbit/s, More Is Better
  C: 10.22   A: 10.05   D: 8.69   B: 7.85
perf-bench - Benchmark: Sched Pipe - ops/sec, More Is Better
  C: 375581   A: 355673   D: 322287   B: 291778
perf-bench - Benchmark: Epoll Wait - ops/sec, More Is Better
  D: 76620   B: 72747   C: 71499   A: 70630
ONNX Runtime 1.11 - Model: fcn-resnet101-11 - Device: CPU - Inferences Per Minute, More Is Better
  A: 110   D: 109   B: 109   C: 104
perf-bench - Benchmark: Memset 1MB - GB/sec, More Is Better
  D: 76.43   B: 76.27   C: 75.43   A: 72.48
perf-bench - Benchmark: Memcpy 1MB - GB/sec, More Is Better
  A: 34.13   B: 33.94   D: 33.06   C: 33.04
ONNX Runtime 1.11 - Model: GPT-2 - Device: CPU - Inferences Per Minute, More Is Better
  C: 8365   D: 8359   B: 8240   A: 8129
ONNX Runtime 1.11 - Model: ArcFace ResNet-100 - Device: CPU - Inferences Per Minute, More Is Better
  B: 363   D: 358   C: 357   A: 355
ONNX Runtime 1.11 - Model: super-resolution-10 - Device: CPU - Inferences Per Minute, More Is Better
  C: 4933   A: 4896   D: 4830   B: 4825
ONNX Runtime 1.11 - Model: yolov4 - Device: CPU - Inferences Per Minute, More Is Better
  A: 640   C: 639   D: 631   B: 628
perf-bench - Benchmark: Syscall Basic - ops/sec, More Is Better
  D: 21305407   C: 20988142   B: 20980604   A: 20929959
ONNX Runtime 1.11 - Model: bertsquad-12 - Device: CPU - Inferences Per Minute, More Is Better
  D: 1030   B: 1028   C: 1017   A: 1014
perf-bench - Benchmark: Futex Lock-Pi - ops/sec, More Is Better
  D: 574   B: 574   A: 573   C: 572
perf-bench - Benchmark: Futex Hash - ops/sec, More Is Better
  C: 6056727   D: 6045095   B: 6044069   A: 6042276

Build options:
  perf-bench: (CC) gcc options: -pthread -shared -Xlinker -O6 -ggdb3 -funwind-tables -std=gnu99
  ONNX Runtime 1.11: (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O3 -flto -fno-fat-lto-objects -ldl -lrt
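Each per-test listing above orders the runs from best to worst, and which end counts as best depends on whether the test is marked "More Is Better" or "Fewer Is Better". A minimal sketch of that ordering follows. The function name `rank_runs` and its `higher_is_better` parameter are hypothetical, not something the Phoronix Test Suite exposes. Runs whose displayed values tie (for example B and D on fcn-resnet101-11) may come out in a different order than in the export, which presumably sorts on unrounded values.

```python
def rank_runs(values, higher_is_better):
    """Return run labels ordered from best to worst.

    values: dict mapping a run label to its result.
    higher_is_better: True for "More Is Better" tests, False for "Fewer Is Better".
    """
    return sorted(values, key=values.get, reverse=higher_is_better)


# Values copied from the results summary.
bufferbloat_ms = {"A": 66, "B": 73, "C": 17, "D": 58}                      # Fewer Is Better
sched_pipe_ops = {"A": 355673, "B": 291778, "C": 375581, "D": 322287}      # More Is Better

print(rank_runs(bufferbloat_ms, higher_is_better=False))  # ['C', 'D', 'A', 'B']
print(rank_runs(sched_pipe_ops, higher_is_better=True))   # ['C', 'A', 'D', 'B']
```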
Phoronix Test Suite v10.8.4