onednn_test_all

2 x AMD EPYC 7552 48-Core testing with a HPE ProLiant DL385 Gen10 Plus (A42 BIOS) and Matrox MGA G200eH3 on Ubuntu 20.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2010062-NE-ONEDNNTES64&grt.

onednn_test_allProcessorMotherboardChipsetMemoryDiskGraphicsNetworkOSKernelCompilerFile-SystemScreen Resolution2 x AMD EPYC 7552 48-Core2 x AMD EPYC 7552 48-Core @ 2.20GHz (96 Cores / 192 Threads)HPE ProLiant DL385 Gen10 Plus (A42 BIOS)AMD Starship/Matisse1008GB4 x 3201GB MO003200KWZQQ + 192010GB LOGICAL VOLUMEMatrox MGA G200eH34 x QLogic FastLinQ QL41000 10/25/40/50GbE + 4 x Intel I350Ubuntu 20.045.4.0-48-generic (x86_64)GCC 9.3.0ext41024x768OpenBenchmarking.org- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - Scaling Governor: acpi-cpufreq ondemand - CPU Microcode: 0x8301038- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

onednn_test_allonednn: IP Batch 1D - f32 - CPUonednn: IP Batch All - f32 - CPUonednn: IP Batch 1D - u8s8f32 - CPUonednn: IP Batch All - u8s8f32 - CPUonednn: Convolution Batch Shapes Auto - f32 - CPUonednn: Deconvolution Batch deconv_1d - f32 - CPUonednn: Deconvolution Batch deconv_3d - f32 - CPUonednn: Convolution Batch Shapes Auto - u8s8f32 - CPUonednn: Deconvolution Batch deconv_1d - u8s8f32 - CPUonednn: Deconvolution Batch deconv_3d - u8s8f32 - CPUonednn: Recurrent Neural Network Training - f32 - CPUonednn: Recurrent Neural Network Inference - f32 - CPUonednn: Matrix Multiply Batch Shapes Transformer - f32 - CPUonednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU2 x AMD EPYC 7552 48-Core2.2049441.33233.4022210.73050.8601583.345144.514014.002392.464711.63114932.136404.9890.8306510.868965OpenBenchmarking.org

oneDNN

Harness: IP Batch 1D - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 1.5Harness: IP Batch 1D - Data Type: f32 - Engine: CPU2 x AMD EPYC 7552 48-Core0.49610.99221.48831.98442.4805SE +/- 0.01971, N = 112.20494MIN: 1.791. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Batch All - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 1.5Harness: IP Batch All - Data Type: f32 - Engine: CPU2 x AMD EPYC 7552 48-Core918273645SE +/- 0.79, N = 1341.33MIN: 25.511. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Batch 1D - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 1.5Harness: IP Batch 1D - Data Type: u8s8f32 - Engine: CPU2 x AMD EPYC 7552 48-Core0.76551.5312.29653.0623.8275SE +/- 0.02438, N = 33.40222MIN: 2.561. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: IP Batch All - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 1.5Harness: IP Batch All - Data Type: u8s8f32 - Engine: CPU2 x AMD EPYC 7552 48-Core3691215SE +/- 0.09, N = 310.73MIN: 9.611. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 1.5Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU2 x AMD EPYC 7552 48-Core0.19350.3870.58050.7740.9675SE +/- 0.012123, N = 30.860158MIN: 0.731. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch deconv_1d - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 1.5Harness: Deconvolution Batch deconv_1d - Data Type: f32 - Engine: CPU2 x AMD EPYC 7552 48-Core0.75271.50542.25813.01083.7635SE +/- 0.01778, N = 33.34514MIN: 2.651. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 1.5Harness: Deconvolution Batch deconv_3d - Data Type: f32 - Engine: CPU2 x AMD EPYC 7552 48-Core1.01572.03143.04714.06285.0785SE +/- 0.04556, N = 34.51401MIN: 3.171. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 1.5Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU2 x AMD EPYC 7552 48-Core0.90051.8012.70153.6024.5025SE +/- 0.16329, N = 124.00239MIN: 3.151. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch deconv_1d - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 1.5Harness: Deconvolution Batch deconv_1d - Data Type: u8s8f32 - Engine: CPU2 x AMD EPYC 7552 48-Core0.55461.10921.66382.21842.773SE +/- 0.02700, N = 72.46471MIN: 2.191. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 1.5Harness: Deconvolution Batch deconv_3d - Data Type: u8s8f32 - Engine: CPU2 x AMD EPYC 7552 48-Core0.3670.7341.1011.4681.835SE +/- 0.01236, N = 31.63114MIN: 1.21. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 1.5Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU2 x AMD EPYC 7552 48-Core2004006008001000SE +/- 11.02, N = 15932.14MIN: 767.431. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 1.5Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU2 x AMD EPYC 7552 48-Core90180270360450SE +/- 15.20, N = 12404.99MIN: 316.821. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 1.5Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU2 x AMD EPYC 7552 48-Core0.18690.37380.56070.74760.9345SE +/- 0.007317, N = 150.830651MIN: 0.621. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 1.5Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU2 x AMD EPYC 7552 48-Core0.19550.3910.58650.7820.9775SE +/- 0.003342, N = 30.868965MIN: 0.751. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl


Phoronix Test Suite v10.8.4