ARM64 oneDNN 2.0

Ampere Altra ARMv8 Neoverse-N1 testing with a WIWYNN Mt.Jade (1.1.20201019 BIOS) and ASPEED on Ubuntu 20.10 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2012096-HA-ARM64ONED36
Result Identifier: 1
Date: December 09 2020
Run Test Duration: 1 Hour, 55 Minutes


ARM64 OneDNN 2.0 Benchmarks - System Details (OpenBenchmarking.org / Phoronix Test Suite)

Processor: Ampere Altra ARMv8 Neoverse-N1 @ 3.30GHz (160 Cores)
Motherboard: WIWYNN Mt.Jade (1.1.20201019 BIOS)
Chipset: Ampere Computing LLC Device e100
Memory: 502GB
Disk: 3841GB Micron_9300_MTFDHAL3T8TDP + 960GB SAMSUNG MZ1LB960HAJQ-00007
Graphics: ASPEED
Monitor: VE228
Network: Mellanox MT28908 + Intel I210
OS: Ubuntu 20.10
Kernel: 5.10.0-051000rc6daily20201206-generic (aarch64) 20201206
Display Server: X Server 1.20.9
Display Driver: modesetting 1.20.9
Compiler: GCC 10.2.0
File-System: ext4
Screen Resolution: 1920x1080

System Logs:
- Compiler flags: CXXFLAGS="-O3 -march=armv8.2-a -mtune=neoverse-n1" CFLAGS="-O3 -march=armv8.2-a -mtune=neoverse-n1"
- GCC configuration: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v
- Scaling Governor: cppc_cpufreq performance (Boost: Enabled)
- Security: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected

ARM64 oneDNN 2.0 - Results Summary (all values in ms; fewer is better)

Test (onednn)                                          | Result 1
-------------------------------------------------------+---------
IP Shapes 1D - f32 - CPU                               | 55.6889
IP Shapes 3D - f32 - CPU                               | 121.392
IP Shapes 1D - u8s8f32 - CPU                           | 82.2356
IP Shapes 3D - u8s8f32 - CPU                           | 179.469
Convolution Batch Shapes Auto - f32 - CPU              | 143.521
Deconvolution Batch shapes_1d - f32 - CPU              | 34.6695
Deconvolution Batch shapes_3d - f32 - CPU              | 34.7235
Convolution Batch Shapes Auto - u8s8f32 - CPU          | 65.9420
Deconvolution Batch shapes_1d - u8s8f32 - CPU          | 20.7374
Deconvolution Batch shapes_3d - u8s8f32 - CPU          | 35.5427
Recurrent Neural Network Training - f32 - CPU          | 16960.1
Recurrent Neural Network Inference - f32 - CPU         | 16887.1
Recurrent Neural Network Training - u8s8f32 - CPU      | 15974.1
Recurrent Neural Network Inference - u8s8f32 - CPU     | 16839.0
Matrix Multiply Batch Shapes Transformer - f32 - CPU   | 17.2265
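Each harness name corresponds to a oneDNN primitive exercised on the CPU engine: the "IP Shapes" tests cover inner-product (fully connected) layers, the (de)convolution tests cover the corresponding (de)convolution primitives, and u8s8f32 denotes unsigned 8-bit source, signed 8-bit weights, and f32 destination, versus plain f32. As a minimal sketch only, with hypothetical tensor shapes (these are not the shapes the benchmark harness actually runs), a single f32 inner-product dispatch on the CPU engine with the oneDNN 2.0 C++ API looks roughly like this:

    // Minimal sketch (hypothetical shapes), oneDNN 2.0 C++ API, CPU engine.
    #include "dnnl.hpp"

    int main() {
        using namespace dnnl;
        engine eng(engine::kind::cpu, 0);   // "Engine: CPU"
        stream strm(eng);

        // Hypothetical inner-product shape: 128 x 1024 -> 128 x 512, f32.
        memory::desc src_md({128, 1024}, memory::data_type::f32, memory::format_tag::nc);
        memory::desc wei_md({512, 1024}, memory::data_type::f32, memory::format_tag::oi);
        memory::desc dst_md({128, 512},  memory::data_type::f32, memory::format_tag::nc);

        // Forward-inference inner product ("IP"), no bias.
        inner_product_forward::desc ip_d(prop_kind::forward_inference, src_md, wei_md, dst_md);
        inner_product_forward::primitive_desc ip_pd(ip_d, eng);
        inner_product_forward ip(ip_pd);

        // Library-managed buffers; the harness would fill src/weights with test data.
        memory src(src_md, eng), wei(wei_md, eng), dst(dst_md, eng);
        ip.execute(strm, {{DNNL_ARG_SRC, src}, {DNNL_ARG_WEIGHTS, wei}, {DNNL_ARG_DST, dst}});
        strm.wait();
        return 0;
    }

The benchmark harness repeats such executions and reports the average execution time in milliseconds, which is what the table above shows.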

oneDNN

Detailed results for the single run (oneDNN 2.0; ms, fewer is better):

IP Shapes 1D - f32 - CPU:                              55.69    (SE +/- 0.12,   N = 3,  MIN: 49)
IP Shapes 3D - f32 - CPU:                              121.39   (SE +/- 0.43,   N = 3,  MIN: 118.05)
IP Shapes 1D - u8s8f32 - CPU:                          82.24    (SE +/- 0.13,   N = 3,  MIN: 73.61)
IP Shapes 3D - u8s8f32 - CPU:                          179.47   (SE +/- 0.82,   N = 3,  MIN: 167.23)
Convolution Batch Shapes Auto - f32 - CPU:             143.52   (SE +/- 1.60,   N = 3,  MIN: 135.02)
Deconvolution Batch shapes_1d - f32 - CPU:             34.67    (SE +/- 0.22,   N = 3,  MIN: 27.26)
Deconvolution Batch shapes_3d - f32 - CPU:             34.72    (SE +/- 0.45,   N = 4,  MIN: 25.61)
Convolution Batch Shapes Auto - u8s8f32 - CPU:         65.94    (SE +/- 0.37,   N = 3,  MIN: 38.48)
Deconvolution Batch shapes_1d - u8s8f32 - CPU:         20.74    (SE +/- 0.09,   N = 3,  MIN: 13.96)
Deconvolution Batch shapes_3d - u8s8f32 - CPU:         35.54    (SE +/- 0.09,   N = 3,  MIN: 32.93)
Recurrent Neural Network Training - f32 - CPU:         16960.1  (SE +/- 586.35, N = 10, MIN: 12083.4)
Recurrent Neural Network Inference - f32 - CPU:        16887.1  (SE +/- 490.90, N = 9,  MIN: 11151.4)
Recurrent Neural Network Training - u8s8f32 - CPU:     15974.1  (SE +/- 687.41, N = 9,  MIN: 11432.3)
Recurrent Neural Network Inference - u8s8f32 - CPU:    16839.0  (SE +/- 724.35, N = 12, MIN: 8880.14)
Matrix Multiply Batch Shapes Transformer - f32 - CPU:  17.23    (SE +/- 0.23,   N = 3,  MIN: 14.47)

All tests compiled with: (CXX) g++ options: -O3 -march=armv8.2-a -mtune=neoverse-n1 -std=c++11 -fopenmp -mcpu=native -fPIC -pie -lpthread
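Each SE figure above is the standard error of the mean over the N recorded trial runs. Assuming the Phoronix Test Suite uses the conventional definition (an assumption about its statistics, not something stated in this result file):

    \mathrm{SE} = \frac{s}{\sqrt{N}},
    \qquad
    s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}

Under that assumption, IP Shapes 1D - f32 (SE +/- 0.12, N = 3) would imply a sample standard deviation of roughly 0.12 * sqrt(3) ≈ 0.21 ms across the three runs.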