Amazon EC2 c2a.8xlarge

KVM testing on Ubuntu 20.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2202164-NE-AMAZONEC290.

Amazon EC2 c2a.8xlarge: system details

Processor: AMD EPYC 7R13 (16 Cores / 32 Threads)
Motherboard: Amazon EC2 c6a.8xlarge (1.0 BIOS)
Chipset: Intel 440FX 82441FX PMC
Memory: 62GB
Disk: 107GB Amazon Elastic Block Store
Network: Amazon Elastic
OS: Ubuntu 20.04
Kernel: 5.11.0-1022-aws (x86_64)
Vulkan: 1.1.182
Compiler: GCC 9.3.0
File-System: ext4
System Layer: KVM

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- CPU Microcode: 0xa001143
- Python 3.8.10
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

Amazon EC2 c2a.8xlarge: result overview

Test                                              c2a.8xlarge
hpcg                                              10.4073
namd: ATPase Simulation - 327,506 Atoms           1.24803
openfoam: Motorbike 30M                           67.65
ospray: San Miguel - SciVis                       25
ospray: NASA Streamlines - SciVis                 34.10
ospray: Magnetic Reconnection - SciVis            18.18
svt-hevc: 1 - Bosphorus 1080p                     14.22
svt-hevc: 7 - Bosphorus 1080p                     178.33
svt-hevc: 10 - Bosphorus 1080p                    366.38
mt-dgemm: Sustained Floating-Point Rate           5.159793
build-godot: Time To Compile                      95.093
build-linux-kernel: defconfig                     62.437
graph500: 26 (bfs median_TEPS)                    223366000
graph500: 26 (bfs max_TEPS)                       227385000
graph500: 26 (sssp median_TEPS)                   75849300
graph500: 26 (sssp max_TEPS)                      101589000
gromacs: MPI CPU - water_GMX50_bare               2.032
tensorflow-lite: SqueezeNet                       108293
tensorflow-lite: Inception V4                     1530597
tensorflow-lite: NASNet Mobile                    121218
tensorflow-lite: Mobilenet Float                  71160.9
tensorflow-lite: Mobilenet Quant                  81238.1
tensorflow-lite: Inception ResNet V2              1368997
tnn: CPU - DenseNet                               3257.110
tnn: CPU - MobileNet v2                           318.290
tnn: CPU - SqueezeNet v2                          78.997
tnn: CPU - SqueezeNet v1.1                        282.695

High Performance Conjugate Gradient

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
c2a.8xlarge: 10.41 (SE +/- 0.07, N = 3)
(CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi

NAMD

ATPase Simulation - 327,506 Atoms

NAMD 2.14 (days/ns, Fewer Is Better)
c2a.8xlarge: 1.24803 (SE +/- 0.01177, N = 7)
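NAMD reports days/ns (fewer is better), whereas the GROMACS result further down is given in ns/day (more is better). A minimal sketch of the unit inversion follows; the function name is illustrative and not part of the Phoronix Test Suite. The two tests simulate different systems, so the converted figure only makes the unit familiar and is not a like-for-like comparison.

```python
def days_per_ns_to_ns_per_day(days_per_ns: float) -> float:
    """Invert a days/ns figure into ns/day (hypothetical helper name)."""
    if days_per_ns <= 0:
        raise ValueError("days/ns must be positive")
    return 1.0 / days_per_ns


# NAMD 2.14, ATPase Simulation - 327,506 Atoms, as reported above.
namd_days_per_ns = 1.24803
namd_ns_per_day = days_per_ns_to_ns_per_day(namd_days_per_ns)
print(f"NAMD ATPase: {namd_ns_per_day:.3f} ns/day")
```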

OpenFOAM

Input: Motorbike 30M

OpenFOAM 8 (Seconds, Fewer Is Better)
c2a.8xlarge: 67.65 (SE +/- 0.09, N = 3)
(CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm

OSPray

Demo: San Miguel - Renderer: SciVis

OSPray 1.8.5 (FPS, More Is Better)
c2a.8xlarge: 25 (MIN: 17.86 / MAX: 27.03)

OSPray

Demo: NASA Streamlines - Renderer: SciVis

OSPray 1.8.5 (FPS, More Is Better)
c2a.8xlarge: 34.10 (SE +/- 0.38, N = 3; MIN: 13.7 / MAX: 34.48)

OSPray

Demo: Magnetic Reconnection - Renderer: SciVis

OSPray 1.8.5 (FPS, More Is Better)
c2a.8xlarge: 18.18 (SE +/- 0.00, N = 3; MIN: 8.26 / MAX: 18.52)

SVT-HEVC

Tuning: 1 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
c2a.8xlarge: 14.22 (SE +/- 0.02, N = 3)
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC

Tuning: 7 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
c2a.8xlarge: 178.33 (SE +/- 0.71, N = 3)
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC

Tuning: 10 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
c2a.8xlarge: 366.38 (SE +/- 0.66, N = 3)
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

ACES DGEMM

Sustained Floating-Point Rate

ACES DGEMM 1.0 (GFLOP/s, More Is Better)
c2a.8xlarge: 5.159793 (SE +/- 0.025529, N = 3)
(CC) gcc options: -O3 -march=native -fopenmp

Timed Godot Game Engine Compilation

Time To Compile

Timed Godot Game Engine Compilation 3.2.3 (Seconds, Fewer Is Better)
c2a.8xlarge: 95.09 (SE +/- 0.22, N = 3)

Timed Linux Kernel Compilation

Build: defconfig

Timed Linux Kernel Compilation 5.16 (Seconds, Fewer Is Better)
c2a.8xlarge: 62.44 (SE +/- 0.22, N = 3)

Graph500

Scale: 26

Graph500 3.0 (bfs median_TEPS, More Is Better)
c2a.8xlarge: 223366000
(CC) gcc options: -fcommon -O3 -lpthread -lm -pthread -lmpi

Graph500

Scale: 26

Graph500 3.0 (bfs max_TEPS, More Is Better)
c2a.8xlarge: 227385000
(CC) gcc options: -fcommon -O3 -lpthread -lm -pthread -lmpi

Graph500

Scale: 26

Graph500 3.0 (sssp median_TEPS, More Is Better)
c2a.8xlarge: 75849300
(CC) gcc options: -fcommon -O3 -lpthread -lm -pthread -lmpi

Graph500

Scale: 26

Graph500 3.0 (sssp max_TEPS, More Is Better)
c2a.8xlarge: 101589000
(CC) gcc options: -fcommon -O3 -lpthread -lm -pthread -lmpi

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2021.2 (Ns Per Day, More Is Better)
c2a.8xlarge: 2.032 (SE +/- 0.002, N = 3)
(CXX) g++ options: -O3 -pthread

TensorFlow Lite

Model: SqueezeNet

TensorFlow Lite 2020-08-23 (Microseconds, Fewer Is Better)
c2a.8xlarge: 108293 (SE +/- 203.21, N = 3)

TensorFlow Lite

Model: Inception V4

TensorFlow Lite 2020-08-23 (Microseconds, Fewer Is Better)
c2a.8xlarge: 1530597 (SE +/- 1506.68, N = 3)

TensorFlow Lite

Model: NASNet Mobile

TensorFlow Lite 2020-08-23 (Microseconds, Fewer Is Better)
c2a.8xlarge: 121218 (SE +/- 1436.57, N = 3)

TensorFlow Lite

Model: Mobilenet Float

TensorFlow Lite 2020-08-23 (Microseconds, Fewer Is Better)
c2a.8xlarge: 71160.9 (SE +/- 210.72, N = 3)

TensorFlow Lite

Model: Mobilenet Quant

TensorFlow Lite 2020-08-23 (Microseconds, Fewer Is Better)
c2a.8xlarge: 81238.1 (SE +/- 217.08, N = 3)

TensorFlow Lite

Model: Inception ResNet V2

TensorFlow Lite 2020-08-23 (Microseconds, Fewer Is Better)
c2a.8xlarge: 1368997 (SE +/- 398.43, N = 3)

TNN

Target: CPU - Model: DenseNet

TNN 0.3 (ms, Fewer Is Better)
c2a.8xlarge: 3257.11 (SE +/- 2.96, N = 3; MIN: 3092.98 / MAX: 3850.22)
(CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

TNN

Target: CPU - Model: MobileNet v2

TNN 0.3 (ms, Fewer Is Better)
c2a.8xlarge: 318.29 (SE +/- 4.12, N = 3; MIN: 308.92 / MAX: 958.39)
(CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

TNN

Target: CPU - Model: SqueezeNet v2

TNN 0.3 (ms, Fewer Is Better)
c2a.8xlarge: 79.00 (SE +/- 0.29, N = 3; MIN: 78.5 / MAX: 117.41)
(CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

TNN

Target: CPU - Model: SqueezeNet v1.1

TNN 0.3 (ms, Fewer Is Better)
c2a.8xlarge: 282.70 (SE +/- 0.23, N = 3; MIN: 280.75 / MAX: 353.26)
(CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl
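Most results above carry a standard error (SE) and a run count (N). As a rough aid for judging run-to-run noise, the sketch below expresses SE as a percentage of the mean and recovers an approximate sample standard deviation. It assumes the reported SE follows the textbook definition SE = SD / sqrt(N); this export does not state how the Phoronix Test Suite computes it. The helper names are illustrative, and the figures are copied from three of the results in this file.

```python
import math

# (mean, standard error, N) copied from the results in this file.
results = {
    "HPCG 3.1 (GFLOP/s)": (10.41, 0.07, 3),
    "NAMD 2.14 (days/ns)": (1.24803, 0.01177, 7),
    "TNN 0.3 MobileNet v2 (ms)": (318.29, 4.12, 3),
}


def relative_se_percent(mean: float, se: float) -> float:
    """Standard error as a percentage of the mean."""
    return 100.0 * se / mean


def sample_std_dev(se: float, n: int) -> float:
    """Approximate sample SD, ASSUMING SE = SD / sqrt(N)."""
    return se * math.sqrt(n)


for name, (mean, se, n) in results.items():
    rel = relative_se_percent(mean, se)
    sd = sample_std_dev(se, n)
    print(f"{name}: relative SE {rel:.2f}%, approx. SD {sd:.4g} over N = {n}")
```

A relative SE of around 1% or less, as with the HPCG and NAMD figures, suggests the runs were fairly consistent; the TNN MobileNet v2 result is somewhat noisier, which fits its wide MIN/MAX spread.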


Phoronix Test Suite v10.8.4