ARM64 oneDNN 2.0 Ampere Altra ARMv8 Neoverse-N1 testing with a WIWYNN Mt.Jade (1.1.20201019 BIOS) and ASPEED on Ubuntu 20.10 via the Phoronix Test Suite.
Compare your own system(s) to this result file with the Phoronix Test Suite by running the command:
phoronix-test-suite benchmark 2012096-HA-ARM64ONED36
Result identifier: 1
Processor: Ampere Altra ARMv8 Neoverse-N1 @ 3.30GHz (160 Cores), Motherboard: WIWYNN Mt.Jade (1.1.20201019 BIOS), Chipset: Ampere Computing LLC Device e100, Memory: 502GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP + 960GB SAMSUNG MZ1LB960HAJQ-00007, Graphics: ASPEED, Monitor: VE228, Network: Mellanox MT28908 + Intel I210
OS: Ubuntu 20.10, Kernel: 5.10.0-051000rc6daily20201206-generic (aarch64) 20201206, Display Server: X Server 1.20.9, Display Driver: modesetting 1.20.9, Compiler: GCC 10.2.0, File-System: ext4, Screen Resolution: 1920x1080
Environment Notes: CXXFLAGS="-O3 -march=armv8.2-a -mtune=neoverse-n1" CFLAGS="-O3 -march=armv8.2-a -mtune=neoverse-n1"
Compiler Notes: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
Processor Notes: Scaling Governor: cppc_cpufreq performance (Boost: Enabled)
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
oneDNN 2.0 results (OpenBenchmarking.org; ms, Fewer Is Better; Engine: CPU; result identifier: 1)
Harness: IP Shapes 1D - Data Type: f32: 55.69 (SE +/- 0.12, N = 3, MIN: 49)
Harness: IP Shapes 3D - Data Type: f32: 121.39 (SE +/- 0.43, N = 3, MIN: 118.05)
Harness: IP Shapes 1D - Data Type: u8s8f32: 82.24 (SE +/- 0.13, N = 3, MIN: 73.61)
Harness: IP Shapes 3D - Data Type: u8s8f32: 179.47 (SE +/- 0.82, N = 3, MIN: 167.23)
Harness: Convolution Batch Shapes Auto - Data Type: f32: 143.52 (SE +/- 1.60, N = 3, MIN: 135.02)
Harness: Deconvolution Batch shapes_1d - Data Type: f32: 34.67 (SE +/- 0.22, N = 3, MIN: 27.26)
Harness: Deconvolution Batch shapes_3d - Data Type: f32: 34.72 (SE +/- 0.45, N = 4, MIN: 25.61)
Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32: 65.94 (SE +/- 0.37, N = 3, MIN: 38.48)
Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32: 20.74 (SE +/- 0.09, N = 3, MIN: 13.96)
Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32: 35.54 (SE +/- 0.09, N = 3, MIN: 32.93)
Harness: Recurrent Neural Network Training - Data Type: f32: 16960.1 (SE +/- 586.35, N = 10, MIN: 12083.4)
Harness: Recurrent Neural Network Inference - Data Type: f32: 16887.1 (SE +/- 490.90, N = 9, MIN: 11151.4)
Harness: Recurrent Neural Network Training - Data Type: u8s8f32: 15974.1 (SE +/- 687.41, N = 9, MIN: 11432.3)
Harness: Recurrent Neural Network Inference - Data Type: u8s8f32: 16839.0 (SE +/- 724.35, N = 12, MIN: 8880.14)
Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32: 17.23 (SE +/- 0.23, N = 3, MIN: 14.47)
1. (CXX) g++ options: -O3 -march=armv8.2-a -mtune=neoverse-n1 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread
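To judge how stable each result is, the reported standard error can be expressed as a percentage of the mean. The sketch below is only an illustration: the figures are copied from the results above, while the helper names (relative_se, noisy) and the 2% threshold are assumptions chosen for this example and are not part of the Phoronix Test Suite.

```python
# (harness, data type, mean in ms, standard error in ms, N), copied from the results above.
results = [
    ("IP Shapes 1D", "f32", 55.69, 0.12, 3),
    ("IP Shapes 3D", "f32", 121.39, 0.43, 3),
    ("IP Shapes 1D", "u8s8f32", 82.24, 0.13, 3),
    ("IP Shapes 3D", "u8s8f32", 179.47, 0.82, 3),
    ("Convolution Batch Shapes Auto", "f32", 143.52, 1.60, 3),
    ("Deconvolution Batch shapes_1d", "f32", 34.67, 0.22, 3),
    ("Deconvolution Batch shapes_3d", "f32", 34.72, 0.45, 4),
    ("Convolution Batch Shapes Auto", "u8s8f32", 65.94, 0.37, 3),
    ("Deconvolution Batch shapes_1d", "u8s8f32", 20.74, 0.09, 3),
    ("Deconvolution Batch shapes_3d", "u8s8f32", 35.54, 0.09, 3),
    ("Recurrent Neural Network Training", "f32", 16960.1, 586.35, 10),
    ("Recurrent Neural Network Inference", "f32", 16887.1, 490.90, 9),
    ("Recurrent Neural Network Training", "u8s8f32", 15974.1, 687.41, 9),
    ("Recurrent Neural Network Inference", "u8s8f32", 16839.0, 724.35, 12),
    ("Matrix Multiply Batch Shapes Transformer", "f32", 17.23, 0.23, 3),
]


def relative_se(mean_ms, se_ms):
    """Standard error as a percentage of the mean (hypothetical helper)."""
    return 100.0 * se_ms / mean_ms


def noisy(rows, threshold_pct=2.0):
    """Rows whose relative standard error exceeds the (assumed) threshold."""
    return [
        (harness, dtype, round(relative_se(mean, se), 2), n)
        for harness, dtype, mean, se, n in rows
        if relative_se(mean, se) > threshold_pct
    ]


if __name__ == "__main__":
    for harness, dtype, pct, n in noisy(results):
        print(f"{harness} ({dtype}): SE is {pct}% of the mean over N = {n} runs")
```

With this threshold, only the Recurrent Neural Network harnesses are flagged. These are also the harnesses that were run more than the usual three or four times.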
Testing initiated at 9 December 2020 23:13 by user phoronix.