onednn_centos7
2 x AMD EPYC 7642 48-Core testing with a Dell 0GK70M (1.5.5 BIOS) and Matrox G200eW3 on CentOS Linux 7 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2311161-NE-ONEDNNCEN98
gator
Processor: 2 x AMD EPYC 7642 48-Core (96 Cores), Motherboard: Dell 0GK70M (1.5.5 BIOS), Chipset: AMD Starship/Matisse, Memory: 256GB, Disk: 600GB PERC H345 Front, Graphics: Matrox G200eW3, Network: 6 x Broadcom NetXtreme BCM5720 2-port PCIe + 2 x Solarflare SFC9120 10G
OS: CentOS Linux 7, Kernel: 3.10.0-1160.24.1.el7.x86_64 (x86_64), Compiler: GCC 4.8.5 20150623, File-System: xfs, Screen Resolution: 1024x768
Kernel Notes: Transparent Huge Pages: madvise
Compiler Notes: --build=x86_64-redhat-linux --disable-libgcj --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-initfini-array --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=x86-64 --with-linker-hash-style=gnu --with-tune=generic
Processor Notes: CPU Microcode: 0x8301038
oneDNN
This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, using its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI toolkit. Learn more via the OpenBenchmarking.org test page.
oneDNN 3.3 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
gator: 1950.44 (SE +/- 26.77, N = 3, MIN: 1808.48)
oneDNN 3.3 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
gator: 3608.80 (SE +/- 44.07, N = 3, MIN: 3277.79)
oneDNN 3.3 - Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gator: 1925.07 (SE +/- 12.03, N = 3, MIN: 1821.14)
oneDNN 3.3 - Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gator: 3545.63 (SE +/- 38.91, N = 5, MIN: 3308.92)
oneDNN 3.3 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gator: 1891.39 (SE +/- 17.40, N = 3, MIN: 1791.55)
oneDNN 3.3 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gator: 3540.14 (SE +/- 36.98, N = 4, MIN: 3260.38)
(CXX) g++ options for all results: -std=c++11 -O3 -march=native -fopenmp -msse4.1 -fPIC -pie -rdynamic -lpthread -lrt -ldl
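The six Recurrent Neural Network results above can be compared by dividing each training time by the matching inference time. The following is a minimal sketch, not part of the result file: the dictionary layout and the helper name `training_to_inference_ratio` are assumptions made for illustration, and the numbers are the gator means copied from the results above.

```python
# Mean times in ms, copied from the gator RNN results above.
rnn_ms = {
    "bf16bf16bf16": {"inference": 1950.44, "training": 3608.80},
    "u8s8f32": {"inference": 1925.07, "training": 3545.63},
    "f32": {"inference": 1891.39, "training": 3540.14},
}


def training_to_inference_ratio(times):
    """Return training time divided by inference time for one data type."""
    return times["training"] / times["inference"]


# Hypothetical summary structure: data type -> ratio.
ratios = {dtype: training_to_inference_ratio(t) for dtype, t in rnn_ms.items()}

for dtype, ratio in ratios.items():
    print(f"{dtype}: training takes {ratio:.2f}x the inference time")
```

On these figures the ratio falls between 1.8 and 1.9 for every data type, so on this system the choice of data type changes the RNN timings far less than the choice between training and inference.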
oneDNN 3.3 - Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gator: 1.53893 (SE +/- 0.00407, N = 3, MIN: 1.45)
oneDNN 3.3 - Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gator: 3.04685 (SE +/- 0.03418, N = 3, MIN: 2.17)
oneDNN 3.3 - Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gator: 3.69848 (SE +/- 0.03332, N = 15, MIN: 2.21)
oneDNN 3.3 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gator: 3.42799 (SE +/- 0.02385, N = 15, MIN: 2.68)
oneDNN 3.3 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gator: 12.83 (SE +/- 0.06, N = 3, MIN: 5.83)
oneDNN 3.3 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gator: 1.62164 (SE +/- 0.00901, N = 3, MIN: 1.3)
oneDNN 3.3 - Harness: IP Shapes 3D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gator: 1.46065 (SE +/- 0.00550, N = 3, MIN: 1.14)
oneDNN 3.3 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gator: 9.71207 (SE +/- 0.01306, N = 3, MIN: 8.91)
oneDNN 3.3 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
gator: 3.43365 (SE +/- 0.00291, N = 3, MIN: 2.82)
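Four of the harnesses above report both an f32 and a u8s8f32 time, so the two data types can be compared directly. The sketch below is an illustration only: the dictionary and the helper name `f32_over_u8` are assumptions, and the values are the gator means listed above. A ratio above 1 means the u8s8f32 run finished faster than the f32 run.

```python
# Mean times in ms for harnesses that list both data types above.
pairs_ms = {
    "Deconvolution Batch shapes_3d": {"f32": 3.42799, "u8s8f32": 1.53893},
    "Deconvolution Batch shapes_1d": {"f32": 12.83, "u8s8f32": 3.04685},
    "Convolution Batch Shapes Auto": {"f32": 1.62164, "u8s8f32": 3.69848},
    "IP Shapes 3D": {"f32": 9.71207, "u8s8f32": 1.46065},
}


def f32_over_u8(times):
    """Return the f32 time divided by the u8s8f32 time for one harness."""
    return times["f32"] / times["u8s8f32"]


# Hypothetical summary structure: harness -> f32/u8s8f32 ratio.
speedups = {harness: f32_over_u8(t) for harness, t in pairs_ms.items()}

# Print the harnesses from largest to smallest ratio.
for harness, ratio in sorted(speedups.items(), key=lambda kv: kv[1], reverse=True):
    verdict = "u8s8f32 faster" if ratio > 1 else "f32 faster"
    print(f"{harness}: f32/u8s8f32 = {ratio:.2f} ({verdict})")
```

With these numbers, u8s8f32 is faster for both deconvolution harnesses and for IP Shapes 3D, while Convolution Batch Shapes Auto is the one case where f32 is faster.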
Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU
gator: The test run did not produce a result (reported 3 times).
Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU
gator: The test run did not produce a result (reported 3 times).
Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU
gator: The test run did not produce a result (reported 3 times).
Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU
gator: The test run did not produce a result (reported 3 times).
Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU
gator: The test run did not produce a result (reported 3 times).
oneDNN 3.3 - Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU (ms, Fewer Is Better)
gator: 7.03111 (SE +/- 0.63391, N = 15, MIN: 3.2)
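Each result above carries an "SE +/-" figure and a run count N, which can be turned into a rough measure of run-to-run noise. The sketch below assumes that SE is the standard error of the mean over N runs, so the sample standard deviation is approximately SE * sqrt(N). That reading, the helper names, and the selection of three results are assumptions for illustration, not something the result file states.

```python
import math

# (harness, data type) -> (mean ms, SE, N), copied from the gator results above.
results = {
    ("Recurrent Neural Network Inference", "bf16bf16bf16"): (1950.44, 26.77, 3),
    ("IP Shapes 1D", "f32"): (3.43365, 0.00291, 3),
    ("IP Shapes 1D", "u8s8f32"): (7.03111, 0.63391, 15),
}


def std_dev_from_se(se, n):
    """Estimate the standard deviation, assuming SE = SD / sqrt(N)."""
    return se * math.sqrt(n)


def relative_se_percent(mean, se):
    """Express the standard error as a percentage of the mean."""
    return 100.0 * se / mean


for (harness, dtype), (mean, se, n) in results.items():
    sd = std_dev_from_se(se, n)
    rel = relative_se_percent(mean, se)
    print(f"{harness} ({dtype}): SD ~ {sd:.3f} ms, SE = {rel:.2f}% of mean, N = {n}")
```

Under that assumption, the IP Shapes 1D u8s8f32 result stands out: its standard error is roughly 9% of the mean even after 15 runs, compared with under 0.1% for the f32 run of the same harness, so it is the least stable of the three and should be compared against other systems with more caution.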
Testing initiated at 16 November 2023 13:20.