oneDNN EPYC 7551

AMD EPYC 7551 32-Core testing with a GIGABYTE MZ31-AR0-00 v01010101 (F10 BIOS) and ASPEED on Debian 10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2012097-HA-ONEDNNEPY33.

oneDNN EPYC 7551

Runs: EPYC 7551, 2, 3

Processor:          AMD EPYC 7551 32-Core @ 2.00GHz (32 Cores / 64 Threads)
Motherboard:        GIGABYTE MZ31-AR0-00 v01010101 (F10 BIOS)
Chipset:            AMD 17h
Memory:             32GB
Disk:               Samsung SSD 960 EVO 500GB
Graphics:           ASPEED
Network:            Realtek RTL8111/8168/8411 + 2 x Broadcom NetXtreme II BCM57810 10
OS:                 Debian 10
Kernel:             4.19.0-10-amd64 (x86_64)
Compiler:           GCC 8.3.0
File-System:        ext4
Screen Resolution:  1024x768

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --enable-libmpx --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled)
- CPU Microcode: 0x8001227

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional STIBP: disabled RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

oneDNN EPYC 7551

Test                                                              EPYC 7551  2          3
onednn: IP Shapes 1D - f32 - CPU                                  6.41103    6.23842    6.45717
onednn: IP Shapes 3D - f32 - CPU                                  14.2042    14.3325    14.3956
onednn: IP Shapes 1D - u8s8f32 - CPU                              2.85496    2.89985    3.01470
onednn: IP Shapes 3D - u8s8f32 - CPU                              4.19010    4.16404    4.23443
onednn: Convolution Batch Shapes Auto - f32 - CPU                 21.5605    21.4580    21.4322
onednn: Deconvolution Batch shapes_1d - f32 - CPU                 4.23577    4.15859    4.27251
onednn: Deconvolution Batch shapes_3d - f32 - CPU                 9.35531    9.36826    9.41336
onednn: Convolution Batch Shapes Auto - u8s8f32 - CPU             27.4610    26.5417    27.3775
onednn: Deconvolution Batch shapes_1d - u8s8f32 - CPU             4.90496    4.54888    21.75556
onednn: Deconvolution Batch shapes_3d - u8s8f32 - CPU             4.80717    4.97488    25.0662
onednn: Recurrent Neural Network Training - f32 - CPU             13523.4    13904.9    27428.4
onednn: Recurrent Neural Network Inference - f32 - CPU            3888.37    3790.46    9669.20
onednn: Recurrent Neural Network Training - u8s8f32 - CPU         13881.2    13850.7
onednn: Recurrent Neural Network Inference - u8s8f32 - CPU        3939.31    3907.31
onednn: Matrix Multiply Batch Shapes Transformer - f32 - CPU      2.26455    2.14223
onednn: Recurrent Neural Network Training - bf16bf16bf16 - CPU    13715.3
onednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPU   3901.53
onednn: Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU  1.92166
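To make the run-to-run differences in the overview easier to read, the following Python sketch computes the percent change of runs 2 and 3 relative to the first run (EPYC 7551) for a few of the tests. The values are copied from the overview; the dictionary layout and the helper name `relative_change` are illustrative assumptions and not part of the Phoronix Test Suite.

```python
# Illustrative only: results in ms (fewer is better), copied from the overview.
# Tuple order is (EPYC 7551, 2, 3).
results_ms = {
    "IP Shapes 1D - f32": (6.41103, 6.23842, 6.45717),
    "Deconvolution Batch shapes_1d - u8s8f32": (4.90496, 4.54888, 21.75556),
    "Recurrent Neural Network Training - f32": (13523.4, 13904.9, 27428.4),
    "Recurrent Neural Network Inference - f32": (3888.37, 3790.46, 9669.20),
}


def relative_change(baseline, value):
    """Percent change versus baseline; positive means slower (more ms)."""
    return (value - baseline) / baseline * 100.0


for test, (first, second, third) in results_ms.items():
    print(
        f"{test}: run 2 {relative_change(first, second):+.1f}%, "
        f"run 3 {relative_change(first, third):+.1f}%"
    )
```

Since every result here is a time in milliseconds, a positive percentage means the later run was slower than the first.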

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  6.41103   0.13783  12  4.43
2          6.23842   0.15563  15  4.22
3          6.45717   0.10425  3   5.71

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
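The "SE +/-" and "N" figures can be combined to gauge how much the individual samples varied. Assuming "SE" denotes the standard error of the mean over N samples (an assumption about the reporting, which this export does not spell out), the sample standard deviation is approximately SE * sqrt(N). A minimal Python sketch using the values from this test follows; the function name `stddev_from_se` is hypothetical.

```python
import math


def stddev_from_se(se, n):
    """Approximate sample standard deviation, assuming se = sd / sqrt(n)."""
    return se * math.sqrt(n)


# (run label, mean in ms, SE, N) from the IP Shapes 1D - f32 result.
runs = [
    ("EPYC 7551", 6.41103, 0.13783, 12),
    ("2", 6.23842, 0.15563, 15),
    ("3", 6.45717, 0.10425, 3),
]

for label, mean, se, n in runs:
    sd = stddev_from_se(se, n)
    print(f"{label}: mean {mean:.5f} ms, SD ~ {sd:.3f} ms ({sd / mean:.1%} of mean)")
```

The same calculation applies to the other tests by substituting their reported mean, SE, and N.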

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  14.20     0.01     3   13.85
2          14.33     0.18     7   12.85
3          14.40     0.22     5   13.13

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  2.85496   0.01055  3   2.73
2          2.89985   0.05060  4   2.72
3          3.01470   0.06392  15  2.71

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  4.19010   0.03560  3   3.63
2          4.16404   0.03384  3   3.55
3          4.23443   0.04775  3   3.7

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  21.56     0.28     3   21.02
2          21.46     0.13     3   19.05
3          21.43     0.10     3   21.05

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  4.23577   0.05023  8   3.9
2          4.15859   0.03979  3   3.98
3          4.27251   0.03978  14  3.91

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  9.35531   0.11154  15  7.36
2          9.36826   0.16498  15  7.33
3          9.41336   0.13762  13  7.32

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  27.46     0.18     3   23.55
2          26.54     0.12     3   23.21
3          27.38     0.18     3   19.8

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  4.90496   0.10526  15  4.31
2          4.54888   0.03741  3   4.35
3          21.75556  5.46332  12  4.34

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread
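Run 3 stands out in this test, with a much higher mean and a much larger SE than the other two runs. As a rough way to put that gap in context, the Python sketch below expresses the difference between run 3 and the first run in units of their combined standard error. It assumes the two means are independent and that each "SE" is a standard error of the mean; the helper name `difference_in_se_units` is hypothetical, and this is a heuristic rather than a formal significance test.

```python
import math


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Difference of two means divided by the combined standard error.

    Assumes the two means are independent and that each SE is a
    standard error of the mean.
    """
    return (mean_b - mean_a) / math.sqrt(se_a ** 2 + se_b ** 2)


# Deconvolution Batch shapes_1d - u8s8f32, values in ms.
first = {"mean": 4.90496, "se": 0.10526, "min": 4.31}
third = {"mean": 21.75556, "se": 5.46332, "min": 4.34}

ratio = third["mean"] / first["mean"]
z = difference_in_se_units(first["mean"], first["se"], third["mean"], third["se"])

print(f"run 3 mean is {ratio:.2f}x the first run's mean")
print(f"difference is {z:.2f} combined standard errors")
print(f"minimums differ by {third['min'] - first['min']:.2f} ms")
```

Because run 3's minimum (4.34 ms) is close to the other runs' minimums, the higher mean may reflect a number of slow samples rather than a uniformly slower run; the export does not include per-sample data to confirm this.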

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  4.80717   0.01248  3   4.47
2          4.97488   0.09029  3   4.46
3          25.06620  3.80502  12  4.51

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  13523.4   242.83   12  11111.7
2          13904.9   141.40   3   13415.4
3          27428.4   303.23   10  20387.5

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  3888.37   23.01    3   3817.23
2          3790.46   54.22    3   3533.9
3          9669.20   109.66   9   7540.18

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  13881.2   278.98   3   13036.4
2          13850.7   154.35   12  12324

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  3939.31   58.63    3   3818.26
2          3907.31   26.13    3   3791.27

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  2.26455   0.14921  15  1.49
2          2.14223   0.11716  15  1.37

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  13715.3   59.91    3   13266

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  3901.53   55.17    6   3620.21

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: u8s8f32 - Engine: CPU

oneDNN 2.0, ms, Fewer Is Better

Run        Result    SE +/-   N   MIN
EPYC 7551  1.92166   0.01368  3   1.84

1. (CXX) g++ options: -O3 -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread


Phoronix Test Suite v10.8.4