Ampere Altra Page Size

Ampere Altra ARMv8 Neoverse-N1 testing with a WIWYNN Mt.Jade (1.1.20201019 BIOS) and ASPEED on Ubuntu 20.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2102201-HA-AMPEREALT76&sro&gru.

System details (configurations: 64k, 4k)

Processor: Ampere Altra ARMv8 Neoverse-N1 @ 3.30GHz (160 Cores)
Motherboard: WIWYNN Mt.Jade (1.1.20201019 BIOS)
Chipset: Ampere Computing LLC Device e100
Memory: 510GB (64k), 502GB (4k)
Disk: 3841GB Micron_9300_MTFDHAL3T8TDP + 960GB SAMSUNG MZ1LB960HAJQ-00007
Graphics: ASPEED
Network: Mellanox MT28908 + Intel I210
OS: Ubuntu 20.10
Kernel: 5.11.0-051100-generic-64k (aarch64) (64k), 5.11.0-051100-generic (aarch64) (4k)
Compiler: GCC 10.2.0
File-System: ext4
Screen Resolution: 1024x768

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=aarch64-linux-gnu --disable-libquadmath --disable-libquadmath-support --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-fix-cortex-a53-843419 --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-nls --enable-objc-gc=auto --enable-plugin --enable-shared --enable-threads=posix --host=aarch64-linux-gnu --program-prefix=aarch64-linux-gnu- --target=aarch64-linux-gnu --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-target-system-zlib=auto -v

Processor Details: Scaling Governor: cppc_cpufreq performance (Boost: Enabled)

Python Details: Python 3.8.6

Security Details: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of __user pointer sanitization + spectre_v2: Not affected + srbds: Not affected + tsx_async_abort: Not affected
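The two configurations differ in the kernel's base page size (the 64k and 4k kernel builds), and the kernel details report Transparent Huge Pages set to madvise. A minimal sketch of how those two settings could be checked on a running Linux system follows. The function names are hypothetical, and the sysfs path is the usual Linux location for the THP setting, assumed here rather than taken from the result file.

```python
import os
import re

# Usual Linux location of the THP setting (an assumption, not from the result file).
THP_PATH = "/sys/kernel/mm/transparent_hugepage/enabled"


def page_size_bytes() -> int:
    """Return the base page size of the running kernel in bytes."""
    return os.sysconf("SC_PAGE_SIZE")


def active_thp_mode(text: str) -> str:
    """Extract the active mode from text such as 'always [madvise] never'."""
    match = re.search(r"\[(\w+)\]", text)
    if match is None:
        raise ValueError("no bracketed THP mode found in: %r" % text)
    return match.group(1)


def read_thp_mode(path: str = THP_PATH) -> str:
    """Read the THP setting from sysfs and return the active mode."""
    with open(path) as handle:
        return active_thp_mode(handle.read())
```

On a 64k kernel `page_size_bytes()` would be expected to return 65536, and on a 4k kernel 4096; `read_thp_mode()` only works where the sysfs file exists.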

Results overview

Test                                                        64k             4k
sysbench: Memory                                            1446026.8214    1525334.2419
sysbench: CPU                                               575364.4496     563242.0992
amg                                                         2287877000      2218320333
dav1d: Chimera 1080p 10-bit                                 234.17          n/a
stream: Copy                                                275825.8        271387.5
stream: Scale                                               267186.7        262820.5
stream: Triad                                               273077.7        268932.1
stream: Add                                                 272961.8        264195.7
tinymembench: Standard Memcpy                               11880.5         11138.5
tinymembench: Standard Memset                               42969.1         38102.4
mbw: Memory Copy - 8192 MiB                                 10366.24        9694.365
mbw: Memory Copy, Fixed Block Size - 8192 MiB               10279.412       9669.572
lammps: 20k Atoms                                           45.585          44.207
lammps: Rhodopsin Protein                                   42.417          40.484
npb: EP.D                                                   7364.24         7304.17
npb: LU.C                                                   55003.44        54846.19
toybrot: TBB                                                4100            4142
toybrot: OpenMP                                             4890            5638
toybrot: C++ Tasks                                          4889            4771
toybrot: C++ Threads                                        4035            4047
openfoam: Motorbike 30M                                     18.05           20.33
openfoam: Motorbike 60M                                     102.46          104.81
qe: AUSURF112                                               1775.86         n/a
build-godot: Time To Compile                                70.017          93.617
build-imagemagick: Time To Compile                          23.268          26.118
build-llvm: Time To Compile                                 254.100         280.868
build2: Time To Compile                                     83.031          87.245
ngspice: C2670                                              251.979         n/a
webp2: Quality 100, Compression Effort 5                    8.724           8.854
blender: Classroom - CPU-Only                               63.08           66.72
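To compare the two page sizes across tests with different units and directions, one option is to express each result as the relative gain of the 64k kernel over the 4k kernel. The sketch below is only an illustration: the function name `gain_64k_percent` is hypothetical, and the two sample entries reuse overview values together with the "More Is Better" / "Fewer Is Better" direction stated for each test.

```python
def gain_64k_percent(four_k: float, sixty_four_k: float, more_is_better: bool) -> float:
    """Relative gain of the 64k result over the 4k result, in percent.

    Positive means the 64k kernel did better, negative means it did worse.
    """
    if more_is_better:
        return (sixty_four_k - four_k) / four_k * 100.0
    # For time-like metrics, a smaller 64k value is the improvement.
    return (four_k - sixty_four_k) / four_k * 100.0


# (4k value, 64k value, more_is_better) taken from the overview values.
samples = {
    "sysbench: Memory (events/s)": (1525334.2419, 1446026.8214, True),
    "build-godot: Time To Compile (s)": (93.617, 70.017, False),
}

for name, (four_k, sixty_four_k, more_is_better) in samples.items():
    gain = gain_64k_percent(four_k, sixty_four_k, more_is_better)
    print(f"{name}: {gain:+.2f}%")
```

A gain near zero should be read against the standard errors reported with each test before drawing conclusions.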

Sysbench

Test: Memory

Events Per Second, More Is Better (Sysbench 2018-07-28)
4k: 1525334.24 (SE +/- 14039.57, N = 15)
64k: 1446026.82 (SE +/- 10177.62, N = 12)
1. (CC) gcc options: -pthread -O3 -funroll-loops -ggdb3 -rdynamic -ldl -laio -lm

Sysbench

Test: CPU

Events Per Second, More Is Better (Sysbench 2018-07-28)
4k: 563242.10 (SE +/- 1878.66, N = 3)
64k: 575364.45 (SE +/- 7720.77, N = 3)
1. (CC) gcc options: -pthread -O3 -funroll-loops -ggdb3 -rdynamic -ldl -laio -lm

Algebraic Multi-Grid Benchmark

Figure Of Merit, More Is Better (Algebraic Multi-Grid Benchmark 1.2)
4k: 2218320333 (SE +/- 6786373.02, N = 3)
64k: 2287877000 (SE +/- 675918.14, N = 3)
1. (CC) gcc options: -lparcsr_ls -lparcsr_mv -lseq_mv -lIJ_mv -lkrylov -lHYPRE_utilities -lm -fopenmp -pthread -lmpi

dav1d

Video Input: Chimera 1080p 10-bit

FPS, More Is Better (dav1d 0.8.1)
64k: 234.17 (SE +/- 9.30, N = 4; MIN: 196.75 / MAX: 435.08)
1. (CC) gcc options: -pthread

Stream

Type: Copy

MB/s, More Is Better (Stream 2013-01-17)
4k: 271387.5 (SE +/- 6794.21, N = 25)
64k: 275825.8 (SE +/- 5808.82, N = 25)
1. (CC) gcc options: -O3 -march=native -fopenmp

Stream

Type: Scale

MB/s, More Is Better (Stream 2013-01-17)
4k: 262820.5 (SE +/- 6965.01, N = 5)
64k: 267186.7 (SE +/- 19138.93, N = 5)
1. (CC) gcc options: -O3 -march=native -fopenmp

Stream

Type: Triad

MB/s, More Is Better (Stream 2013-01-17)
4k: 268932.1 (SE +/- 10116.80, N = 5)
64k: 273077.7 (SE +/- 24704.85, N = 5)
1. (CC) gcc options: -O3 -march=native -fopenmp

Stream

Type: Add

MB/s, More Is Better (Stream 2013-01-17)
4k: 264195.7 (SE +/- 11637.45, N = 5)
64k: 272961.8 (SE +/- 19202.03, N = 5)
1. (CC) gcc options: -O3 -march=native -fopenmp

Tinymembench

Standard Memcpy

MB/s, More Is Better (Tinymembench 2018-05-28)
4k: 11138.5 (SE +/- 16.49, N = 3)
64k: 11880.5 (SE +/- 36.86, N = 3)
1. (CC) gcc options: -O2 -lm

Tinymembench

Standard Memset

MB/s, More Is Better (Tinymembench 2018-05-28)
4k: 38102.4 (SE +/- 154.20, N = 3)
64k: 42969.1 (SE +/- 106.93, N = 3)
1. (CC) gcc options: -O2 -lm

MBW

Test: Memory Copy - Array Size: 8192 MiB

MiB/s, More Is Better (MBW 2018-09-08)
4k: 9694.37 (SE +/- 31.89, N = 3)
64k: 10366.24 (SE +/- 2.91, N = 3)
1. (CC) gcc options: -O3 -march=native

MBW

Test: Memory Copy, Fixed Block Size - Array Size: 8192 MiB

MiB/s, More Is Better (MBW 2018-09-08)
4k: 9669.57 (SE +/- 1.39, N = 3)
64k: 10279.41 (SE +/- 58.43, N = 3)
1. (CC) gcc options: -O3 -march=native

LAMMPS Molecular Dynamics Simulator

Model: 20k Atoms

ns/day, More Is Better (LAMMPS Molecular Dynamics Simulator 29Oct2020)
4k: 44.21 (SE +/- 0.05, N = 3)
64k: 45.59 (SE +/- 0.63, N = 3)
1. (CXX) g++ options: -O3 -pthread -lm

LAMMPS Molecular Dynamics Simulator

Model: Rhodopsin Protein

ns/day, More Is Better (LAMMPS Molecular Dynamics Simulator 29Oct2020)
4k: 40.48 (SE +/- 0.10, N = 3)
64k: 42.42 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3 -pthread -lm

NAS Parallel Benchmarks

Test / Class: EP.D

Total Mop/s, More Is Better (NAS Parallel Benchmarks 3.4)
4k: 7304.17 (SE +/- 24.03, N = 3)
64k: 7364.24 (SE +/- 26.51, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz

NAS Parallel Benchmarks

Test / Class: LU.C

Total Mop/s, More Is Better (NAS Parallel Benchmarks 3.4)
4k: 54846.19 (SE +/- 139.86, N = 3)
64k: 55003.44 (SE +/- 94.94, N = 3)
1. (F9X) gfortran options: -O3 -march=native -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz

toyBrot Fractal Generator

Implementation: TBB

ms, Fewer Is Better (toyBrot Fractal Generator 2020-11-18)
4k: 4142 (SE +/- 46.99, N = 15)
64k: 4100 (SE +/- 57.62, N = 3)
1. (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc

toyBrot Fractal Generator

Implementation: OpenMP

ms, Fewer Is Better (toyBrot Fractal Generator 2020-11-18)
4k: 5638 (SE +/- 135.93, N = 15)
64k: 4890 (SE +/- 84.87, N = 15)
1. (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc

toyBrot Fractal Generator

Implementation: C++ Tasks

ms, Fewer Is Better (toyBrot Fractal Generator 2020-11-18)
4k: 4771 (SE +/- 28.42, N = 3)
64k: 4889 (SE +/- 65.19, N = 3)
1. (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc

toyBrot Fractal Generator

Implementation: C++ Threads

ms, Fewer Is Better (toyBrot Fractal Generator 2020-11-18)
4k: 4047 (SE +/- 47.21, N = 3)
64k: 4035 (SE +/- 43.99, N = 4)
1. (CXX) g++ options: -O3 -lpthread -lm -lgcc -lgcc_s -lc

OpenFOAM

Input: Motorbike 30M

Seconds, Fewer Is Better (OpenFOAM 8)
4k: 20.33 (SE +/- 0.25, N = 4)
64k: 18.05 (SE +/- 0.22, N = 4)
1. (CXX) g++ options: -std=c++11 -O3 -mcpu=native -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -ldynamicMesh -ldecompose -lgenericPatchFields -lmetisDecomp -lscotchDecomp -llagrangian -lregionModels -lOpenFOAM -ldl -lm

OpenFOAM

Input: Motorbike 60M

Seconds, Fewer Is Better (OpenFOAM 8)
4k: 104.81 (SE +/- 0.07, N = 3)
64k: 102.46 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -std=c++11 -O3 -mcpu=native -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -ldynamicMesh -ldecompose -lgenericPatchFields -lmetisDecomp -lscotchDecomp -llagrangian -lregionModels -lOpenFOAM -ldl -lm

Quantum ESPRESSO

Input: AUSURF112

Seconds, Fewer Is Better (Quantum ESPRESSO 6.7)
64k: 1775.86 (SE +/- 43.80, N = 6)
1. (F9X) gfortran options: -lopenblas -lFoX_dom -lFoX_sax -lFoX_wxml -lFoX_common -lFoX_utils -lFoX_fsys -lfftw3 -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -ldl -levent -levent_pthreads -lutil -lm -lrt -lz

Timed Godot Game Engine Compilation

Time To Compile

Seconds, Fewer Is Better (Timed Godot Game Engine Compilation 3.2.3)
4k: 93.62 (SE +/- 0.28, N = 3)
64k: 70.02 (SE +/- 1.03, N = 15)

Timed ImageMagick Compilation

Time To Compile

Seconds, Fewer Is Better (Timed ImageMagick Compilation 6.9.0)
4k: 26.12 (SE +/- 0.19, N = 15)
64k: 23.27 (SE +/- 0.21, N = 15)

Timed LLVM Compilation

Time To Compile

Seconds, Fewer Is Better (Timed LLVM Compilation 10.0)
4k: 280.87 (SE +/- 2.79, N = 3)
64k: 254.10 (SE +/- 3.38, N = 9)

Build2

Time To Compile

Seconds, Fewer Is Better (Build2 0.13)
4k: 87.25 (SE +/- 1.74, N = 15)
64k: 83.03 (SE +/- 1.64, N = 15)

Ngspice

Circuit: C2670

Seconds, Fewer Is Better (Ngspice 34)
64k: 251.98 (SE +/- 4.08, N = 4)
1. (CC) gcc options: -O0 -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

WebP2 Image Encode

Encode Settings: Quality 100, Compression Effort 5

Seconds, Fewer Is Better (WebP2 Image Encode 20210126)
4k: 8.854 (SE +/- 0.002, N = 3)
64k: 8.724 (SE +/- 0.010, N = 3)
1. (CXX) g++ options: -fno-rtti -O3 -rdynamic -lpthread -ljpeg -lgif

Blender

Blend File: Classroom - Compute: CPU-Only

Seconds, Fewer Is Better (Blender 2.83.5)
4k: 66.72 (SE +/- 0.87, N = 3)
64k: 63.08 (SE +/- 0.17, N = 3)


Phoronix Test Suite v10.8.4