atomicrulez_ubuntu_2310_mkl-dnn

Intel Core i9-10850K testing with an ASUS ROG MAXIMUS XII APEX (2701 BIOS) and NVIDIA GeForce RTX 3090 Ti 24GB on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2310017-MICH-ATOMICR47
System Details (i9-10850K):

  Processor: Intel Core i9-10850K @ 5.20GHz (10 Cores / 20 Threads)
  Motherboard: ASUS ROG MAXIMUS XII APEX (2701 BIOS)
  Chipset: Intel Comet Lake PCH
  Memory: 64GB
  Disk: 280GB INTEL SSDPE21D280GA
  Graphics: NVIDIA GeForce RTX 3090 Ti 24GB
  Audio: Realtek ALC1220
  Monitor: ROG PG259QN
  Network: Intel I225-V + Intel Comet Lake PCH CNVi WiFi
  OS: Ubuntu 23.10
  Kernel: 6.5.0-5-generic (x86_64)
  Desktop: GNOME Shell 45.0
  Display Server: X Server 1.21.1.7
  Display Driver: NVIDIA 535.104.05
  OpenGL: 4.6.0
  OpenCL: OpenCL 3.0 CUDA 12.2.138
  Compiler: GCC 13.2.0 + CUDA 12.0
  File-System: ext4
  Screen Resolution: 1920x1080

OpenBenchmarking.org notes:
  - Transparent Huge Pages: madvise
  - Compiler configure flags: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
  - Scaling Governor: intel_pstate powersave (EPP: performance)
  - CPU Microcode: 0xf8
  - Thermald: 2.5.4
  - Security mitigations: gather_data_sampling: Mitigation of Microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence + srbds: Mitigation of Microcode + tsx_async_abort: Not affected
Results Overview (i9-10850K; all results in ms, fewer is better):

  mkl-dnn: IP Batch 1D - f32: 2.93504
  mkl-dnn: IP Batch All - f32: 43.8559
  mkl-dnn: IP Batch 1D - u8s8f32: 1.23910
  mkl-dnn: IP Batch All - u8s8f32: 16.8429
  mkl-dnn: Deconvolution Batch deconv_1d - f32: 3.05229
  mkl-dnn: Deconvolution Batch deconv_3d - f32: 4.75091
  mkl-dnn: Deconvolution Batch deconv_1d - u8s8f32: 124.524
  mkl-dnn: Deconvolution Batch deconv_3d - u8s8f32: 2.18324
  mkl-dnn: Recurrent Neural Network Training - f32: 167.674
  mkl-dnn: Recurrent Neural Network Inference - f32: 25.7463
  mkl-dnn: Recurrent Neural Network Inference - bf16bf16bf16: (no result reported)
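Since every completed result above is a latency in milliseconds (fewer is better), the ten completed results can be collapsed into a single summary figure with a geometric mean. A minimal sketch in Python; the numbers are copied from the overview, and the geometric-mean summary itself is an illustration, not part of the original export:

```python
import math

# Completed oneDNN MKL-DNN 1.3 results for the i9-10850K, in ms (fewer is better),
# copied from the overview table. The bf16bf16bf16 inference harness reported no
# result and is therefore excluded.
results_ms = {
    "IP Batch 1D - f32": 2.93504,
    "IP Batch All - f32": 43.8559,
    "IP Batch 1D - u8s8f32": 1.23910,
    "IP Batch All - u8s8f32": 16.8429,
    "Deconvolution Batch deconv_1d - f32": 3.05229,
    "Deconvolution Batch deconv_3d - f32": 4.75091,
    "Deconvolution Batch deconv_1d - u8s8f32": 124.524,
    "Deconvolution Batch deconv_3d - u8s8f32": 2.18324,
    "Recurrent Neural Network Training - f32": 167.674,
    "Recurrent Neural Network Inference - f32": 25.7463,
}

# Geometric mean = exp of the arithmetic mean of the logs. It is the usual way
# to aggregate benchmark latencies spanning orders of magnitude, since a single
# large value (e.g. 167.674 ms) does not dominate the summary.
geomean = math.exp(sum(math.log(v) for v in results_ms.values()) / len(results_ms))
print(f"Geometric mean: {geomean:.2f} ms")
```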
oneDNN MKL-DNN 1.3 - Harness: IP Batch 1D - Data Type: f32 (ms, fewer is better)
  i9-10850K: 2.93504 (SE +/- 0.03799, N = 3; MIN: 2.53)
  1. (CXX) g++ options: -O3 -march=native -std=c++11 -msse4.1 -fPIC -fopenmp -pie -ldl
oneDNN MKL-DNN 1.3 - Harness: IP Batch All - Data Type: f32 (ms, fewer is better)
  i9-10850K: 43.86 (SE +/- 0.01, N = 3; MIN: 43.36)
oneDNN MKL-DNN 1.3 - Harness: IP Batch 1D - Data Type: u8s8f32 (ms, fewer is better)
  i9-10850K: 1.23910 (SE +/- 0.01401, N = 4; MIN: 1.16)
oneDNN MKL-DNN 1.3 - Harness: IP Batch All - Data Type: u8s8f32 (ms, fewer is better)
  i9-10850K: 16.84 (SE +/- 0.03, N = 3; MIN: 16.4)
oneDNN MKL-DNN 1.3 - Harness: Deconvolution Batch deconv_1d - Data Type: f32 (ms, fewer is better)
  i9-10850K: 3.05229 (SE +/- 0.00699, N = 3; MIN: 2.95)
oneDNN MKL-DNN 1.3 - Harness: Deconvolution Batch deconv_3d - Data Type: f32 (ms, fewer is better)
  i9-10850K: 4.75091 (SE +/- 0.00615, N = 3; MIN: 4.63)
oneDNN MKL-DNN 1.3 - Harness: Deconvolution Batch deconv_1d - Data Type: u8s8f32 (ms, fewer is better)
  i9-10850K: 124.52 (SE +/- 0.49, N = 3; MIN: 117.9)
oneDNN MKL-DNN 1.3 - Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32 (ms, fewer is better)
  i9-10850K: 2.18324 (SE +/- 0.01126, N = 3; MIN: 2.12)
oneDNN MKL-DNN 1.3 - Harness: Recurrent Neural Network Training - Data Type: f32 (ms, fewer is better)
  i9-10850K: 167.67 (SE +/- 0.41, N = 3; MIN: 160.27)
oneDNN MKL-DNN 1.3 - Harness: Recurrent Neural Network Inference - Data Type: f32 (ms, fewer is better)
  i9-10850K: 25.75 (SE +/- 0.08, N = 3; MIN: 24.55)
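Each harness above reports its mean over N runs together with a standard error (SE +/- ...). As an illustration of what that figure means (this is not code from the Phoronix Test Suite, and the individual run times below are hypothetical; only the per-test mean and SE appear in the export), the standard error of the mean is the sample standard deviation divided by the square root of the run count:

```python
import math

def standard_error(samples):
    """Standard error of the mean: sample std dev (N-1 denominator) / sqrt(N)."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return math.sqrt(variance) / math.sqrt(n)

# Hypothetical per-run times (ms) for three runs of one harness.
runs = [2.90, 2.93, 2.97]
print(f"mean = {sum(runs) / len(runs):.4f} ms, SE +/- {standard_error(runs):.4f}")
```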
Phoronix Test Suite v10.8.4