atomicrulez_ubuntu_2310_onednn Intel Core i9-10850K testing with a ASUS ROG MAXIMUS XII APEX (2701 BIOS) and NVIDIA GeForce RTX 3090 Ti 24GB on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2310067-MICH-ATOMICR12 .
atomicrulez_ubuntu_2310_onednn Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server Display Driver OpenGL OpenCL Compiler File-System Screen Resolution i9-10850K Intel Core i9-10850K @ 5.30GHz (10 Cores / 20 Threads) ASUS ROG MAXIMUS XII APEX (2701 BIOS) Intel Comet Lake PCH 64GB 280GB INTEL SSDPE21D280GA NVIDIA GeForce RTX 3090 Ti 24GB Realtek ALC1220 ROG PG259QN Intel I225-V + Intel Comet Lake PCH CNVi WiFi Ubuntu 23.10 6.5.0-7-generic (x86_64) GNOME Shell 45.0 X Server 1.21.1.7 NVIDIA 535.104.05 4.6.0 OpenCL 3.0 CUDA 12.2.138 GCC 13.2.0 + CUDA 12.0 ext4 1920x1080 OpenBenchmarking.org - Transparent Huge Pages: madvise - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: intel_pstate powersave (EPP: performance) - CPU Microcode: 0xf8 - Thermald 2.5.4 - gather_data_sampling: Mitigation of Microcode + itlb_multihit: KVM: Mitigation of VMX disabled + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable + retbleed: Mitigation of Enhanced IBRS + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Mitigation of Microcode + tsx_async_abort: Not affected
atomicrulez_ubuntu_2310_onednn onednn: IP Shapes 1D - f32 - CPU onednn: IP Shapes 3D - f32 - CPU onednn: IP Shapes 1D - u8s8f32 - CPU onednn: IP Shapes 3D - u8s8f32 - CPU onednn: Convolution Batch Shapes Auto - f32 - CPU onednn: Deconvolution Batch shapes_1d - f32 - CPU onednn: Deconvolution Batch shapes_3d - f32 - CPU onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU onednn: Recurrent Neural Network Training - f32 - CPU onednn: Recurrent Neural Network Inference - f32 - CPU onednn: Recurrent Neural Network Training - u8s8f32 - CPU onednn: Recurrent Neural Network Inference - u8s8f32 - CPU onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU i9-10850K 2.88763 7.57034 1.12647 1.75239 12.8900 6.24648 4.87505 11.5654 1.51759 3.79757 2711.72 1563.39 2712.30 1577.28 2706.33 1575.59 OpenBenchmarking.org
oneDNN Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU i9-10850K 0.6497 1.2994 1.9491 2.5988 3.2485 SE +/- 0.03564, N = 3 2.88763 MIN: 2.45 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU i9-10850K 2 4 6 8 10 SE +/- 0.00674, N = 3 7.57034 MIN: 7.37 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU i9-10850K 0.2535 0.507 0.7605 1.014 1.2675 SE +/- 0.00139, N = 3 1.12647 MIN: 1.07 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU i9-10850K 0.3943 0.7886 1.1829 1.5772 1.9715 SE +/- 0.00553, N = 3 1.75239 MIN: 1.65 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU i9-10850K 3 6 9 12 15 SE +/- 0.02, N = 3 12.89 MIN: 12.76 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU i9-10850K 2 4 6 8 10 SE +/- 0.06303, N = 3 6.24648 MIN: 4.05 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU i9-10850K 1.0969 2.1938 3.2907 4.3876 5.4845 SE +/- 0.01134, N = 3 4.87505 MIN: 4.78 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU i9-10850K 3 6 9 12 15 SE +/- 0.07, N = 3 11.57 MIN: 11.05 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU i9-10850K 0.3415 0.683 1.0245 1.366 1.7075 SE +/- 0.00461, N = 3 1.51759 MIN: 1.47 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU i9-10850K 0.8545 1.709 2.5635 3.418 4.2725 SE +/- 0.00309, N = 3 3.79757 MIN: 3.76 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU i9-10850K 600 1200 1800 2400 3000 SE +/- 0.39, N = 3 2711.72 MIN: 2700.5 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU i9-10850K 300 600 900 1200 1500 SE +/- 13.53, N = 3 1563.39 MIN: 1515.69 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU i9-10850K 600 1200 1800 2400 3000 SE +/- 0.39, N = 3 2712.30 MIN: 2701.89 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU i9-10850K 300 600 900 1200 1500 SE +/- 13.01, N = 3 1577.28 MIN: 1512.75 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU i9-10850K 600 1200 1800 2400 3000 SE +/- 5.21, N = 3 2706.33 MIN: 2687.24 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
oneDNN Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.1 Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU i9-10850K 300 600 900 1200 1500 SE +/- 10.41, N = 3 1575.59 MIN: 1514.3 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -ldl
Phoronix Test Suite v10.8.5