Amazon EC2 m6i.8xlarge

KVM testing on Ubuntu 20.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2108173-TJ-AMAZONEC299
Jump To Table - Results

Statistics

Remove Outliers Before Calculating Averages

Graph Settings

Prefer Vertical Bar Graphs

Multi-Way Comparison

Condense Multi-Option Tests Into Single Result Graphs

Table

Show Detailed System Result Table

Run Management

Result
Identifier
Performance Per
Dollar
Date
Run
  Test
  Duration
m6i.8xlarge
August 17 2021
  4 Hours, 20 Minutes
Only show results matching title/arguments (delimit multiple options with a comma):
Do not show results matching title/arguments (delimit multiple options with a comma):


Amazon EC2 m6i.8xlargeOpenBenchmarking.orgPhoronix Test SuiteIntel Xeon Platinum 8375C (16 Cores / 32 Threads)Amazon EC2 m6i.8xlarge (1.0 BIOS)Intel 440FX 82441FX PMC124GB86GB Amazon Elastic Block StoreAmazon ElasticUbuntu 20.045.4.0-1045-aws (x86_64)1.0.2GCC 9.3.0ext4KVMProcessorMotherboardChipsetMemoryDiskNetworkOSKernelVulkanCompilerFile-SystemSystem LayerAmazon EC2 M6i.8xlarge BenchmarksSystem Logs- Transparent Huge Pages: madvise- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v - CPU Microcode: 0xd0002b1- Python 3.8.10- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

Amazon EC2 m6i.8xlargeappleseed: Material Testerappleseed: Disney Materialappleseed: Emilyapache: 1000apache: 500apache: 200nginx: 1000nginx: 500nginx: 200tnn: CPU - SqueezeNet v1.1tnn: CPU - SqueezeNet v2tnn: CPU - MobileNet v2tnn: CPU - DenseNetgromacs: MPI CPU - water_GMX50_bareonednn: Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPUonednn: Recurrent Neural Network Inference - bf16bf16bf16 - CPUonednn: Recurrent Neural Network Training - bf16bf16bf16 - CPUonednn: Matrix Multiply Batch Shapes Transformer - f32 - CPUonednn: Deconvolution Batch shapes_3d - bf16bf16bf16 - CPUonednn: Deconvolution Batch shapes_1d - bf16bf16bf16 - CPUonednn: Convolution Batch Shapes Auto - bf16bf16bf16 - CPUonednn: Recurrent Neural Network Inference - f32 - CPUonednn: Recurrent Neural Network Training - f32 - CPUonednn: Deconvolution Batch shapes_3d - f32 - CPUonednn: Deconvolution Batch shapes_1d - f32 - CPUonednn: Convolution Batch Shapes Auto - f32 - CPUonednn: IP Shapes 3D - bf16bf16bf16 - CPUonednn: IP Shapes 1D - bf16bf16bf16 - CPUonednn: IP Shapes 3D - f32 - CPUonednn: IP Shapes 1D - f32 - CPUyafaray: Total Time For Sample Scenebuild-nodejs: Time To Compilebuild-llvm: Unix Makefilesbuild-linux-kernel: Time To Compilebuild-ffmpeg: Time To Compileopenvkl: vklBenchmark Scalaropenvkl: vklBenchmark ISPCoidn: RTLightmap.hdr.4096x4096oidn: RT.ldr_alb_nrm.3840x2160oidn: RT.hdr_alb_nrm.3840x2160svt-hevc: 10 - Bosphorus 1080psvt-hevc: 7 - Bosphorus 1080psvt-hevc: 1 - Bosphorus 1080psvt-av1: Preset 8 - Bosphorus 4Kospray: Magnetic Reconnection - Path Tracerospray: NASA Streamlines - Path Tracerospray: XFrog Forest - Path Tracerospray: San Miguel - Path Tracerlulesh: incompact3d: input.i3d 193 Cells Per Directionincompact3d: input.i3d 129 Cells Per Directionpennant: leblancbigpennant: sedovbignamd: ATPase Simulation - 327,506 Atomshpcg: quantlib: m6i.8xlarge132.091256133.278786239.740087169650.54162463.56162518.62401818.94414507.60440121.66357.65870.554349.6773786.2622.6761.83916681.6871259.810.59826711.541911.19228.37531683.6121262.002.360066.602452.891771.753144.340381.831331.7254790.463253.246428.62853.90639.30731810.501.061.06387.27187.9512.9519.6335007.412.692.588427.768341.352092710.383473220.3437343.295480.9130915.92342478.8OpenBenchmarking.org

Appleseed

Appleseed is an open-source production renderer focused on physically-based global illumination rendering engine primarily designed for animation and visual effects. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterAppleseed 2.0 BetaScene: Material Testerm6i.8xlarge306090120150132.09

OpenBenchmarking.orgSeconds, Fewer Is BetterAppleseed 2.0 BetaScene: Disney Materialm6i.8xlarge306090120150133.28

OpenBenchmarking.orgSeconds, Fewer Is BetterAppleseed 2.0 BetaScene: Emilym6i.8xlarge50100150200250239.74

Apache HTTP Server

This is a test of the Apache HTTPD web server. This Apache HTTPD web server benchmark test profile makes use of the Golang "Bombardier" program for facilitating the HTTP requests over a fixed period time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgRequests Per Second, More Is BetterApache HTTP Server 2.4.48Concurrent Requests: 1000m6i.8xlarge40K80K120K160K200KSE +/- 1307.55, N = 10169650.541. (CC) gcc options: -shared -fPIC -O2 -pthread

OpenBenchmarking.orgRequests Per Second, More Is BetterApache HTTP Server 2.4.48Concurrent Requests: 500m6i.8xlarge30K60K90K120K150KSE +/- 1515.80, N = 7162463.561. (CC) gcc options: -shared -fPIC -O2 -pthread

OpenBenchmarking.orgRequests Per Second, More Is BetterApache HTTP Server 2.4.48Concurrent Requests: 200m6i.8xlarge30K60K90K120K150KSE +/- 812.36, N = 3162518.621. (CC) gcc options: -shared -fPIC -O2 -pthread

nginx

This is a benchmark of the lightweight Nginx HTTP(S) web-server. This Nginx web server benchmark test profile makes use of the Golang "Bombardier" program for facilitating the HTTP requests over a fixed period time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgRequests Per Second, More Is Betternginx 1.21.1Concurrent Requests: 1000m6i.8xlarge90K180K270K360K450KSE +/- 2242.18, N = 3401818.941. (CC) gcc options: -ldl -lpthread -lcrypt -lz -O3 -march=native

OpenBenchmarking.orgRequests Per Second, More Is Betternginx 1.21.1Concurrent Requests: 500m6i.8xlarge90K180K270K360K450KSE +/- 2040.23, N = 3414507.601. (CC) gcc options: -ldl -lpthread -lcrypt -lz -O3 -march=native

OpenBenchmarking.orgRequests Per Second, More Is Betternginx 1.21.1Concurrent Requests: 200m6i.8xlarge90K180K270K360K450KSE +/- 600.47, N = 3440121.661. (CC) gcc options: -ldl -lpthread -lcrypt -lz -O3 -march=native

TNN

TNN is an open-source deep learning reasoning framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetterTNN 0.3Target: CPU - Model: SqueezeNet v1.1m6i.8xlarge80160240320400SE +/- 0.06, N = 3357.66MIN: 356.91 / MAX: 3611. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

OpenBenchmarking.orgms, Fewer Is BetterTNN 0.3Target: CPU - Model: SqueezeNet v2m6i.8xlarge1632486480SE +/- 0.08, N = 370.55MIN: 70.14 / MAX: 71.881. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

OpenBenchmarking.orgms, Fewer Is BetterTNN 0.3Target: CPU - Model: MobileNet v2m6i.8xlarge80160240320400SE +/- 0.22, N = 3349.68MIN: 348.47 / MAX: 351.381. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

OpenBenchmarking.orgms, Fewer Is BetterTNN 0.3Target: CPU - Model: DenseNetm6i.8xlarge8001600240032004000SE +/- 0.64, N = 33786.26MIN: 3733.1 / MAX: 3883.971. (CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

GROMACS

The GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package testing with the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgNs Per Day, More Is BetterGROMACS 2021.2Implementation: MPI CPU - Input: water_GMX50_barem6i.8xlarge0.60211.20421.80632.40843.0105SE +/- 0.004, N = 32.6761. (CXX) g++ options: -O3 -pthread

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPUm6i.8xlarge0.41380.82761.24141.65522.069SE +/- 0.00072, N = 31.83916MIN: 1.761. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPUm6i.8xlarge150300450600750SE +/- 0.16, N = 3681.69MIN: 667.851. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPUm6i.8xlarge30060090012001500SE +/- 0.80, N = 31259.81MIN: 1231.211. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPUm6i.8xlarge0.13460.26920.40380.53840.673SE +/- 0.000278, N = 30.598267MIN: 0.541. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPUm6i.8xlarge3691215SE +/- 0.00, N = 311.54MIN: 11.51. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPUm6i.8xlarge3691215SE +/- 0.01, N = 311.19MIN: 11.141. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPUm6i.8xlarge246810SE +/- 0.00038, N = 38.37531MIN: 8.351. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPUm6i.8xlarge150300450600750SE +/- 0.39, N = 3683.61MIN: 668.661. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPUm6i.8xlarge30060090012001500SE +/- 0.36, N = 31262.00MIN: 1237.581. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPUm6i.8xlarge0.5311.0621.5932.1242.655SE +/- 0.00094, N = 32.36006MIN: 2.181. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPUm6i.8xlarge246810SE +/- 0.01111, N = 36.60245MIN: 3.591. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPUm6i.8xlarge0.65061.30121.95182.60243.253SE +/- 0.00091, N = 32.89177MIN: 2.821. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPUm6i.8xlarge0.39450.7891.18351.5781.9725SE +/- 0.00108, N = 31.75314MIN: 1.71. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPUm6i.8xlarge0.97661.95322.92983.90644.883SE +/- 0.00222, N = 34.34038MIN: 4.131. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: IP Shapes 3D - Data Type: f32 - Engine: CPUm6i.8xlarge0.4120.8241.2361.6482.06SE +/- 0.00113, N = 31.83133MIN: 1.781. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 2.1.2Harness: IP Shapes 1D - Data Type: f32 - Engine: CPUm6i.8xlarge0.38820.77641.16461.55281.941SE +/- 0.00183, N = 31.72547MIN: 1.521. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

YafaRay

YafaRay is an open-source physically based montecarlo ray-tracing engine. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterYafaRay 3.5.1Total Time For Sample Scenem6i.8xlarge20406080100SE +/- 1.25, N = 390.461. (CXX) g++ options: -std=c++11 -pthread -O3 -ffast-math -rdynamic -ldl -lImath -lIlmImf -lIex -lHalf -lz -lIlmThread -lxml2 -lfreetype

Timed Node.js Compilation

This test profile times how long it takes to build/compile Node.js itself from source. Node.js is a JavaScript run-time built from the Chrome V8 JavaScript engine while itself is written in C/C++. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterTimed Node.js Compilation 15.11Time To Compilem6i.8xlarge60120180240300SE +/- 0.12, N = 3253.25

Timed LLVM Compilation

This test times how long it takes to build the LLVM compiler. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterTimed LLVM Compilation 12.0Build System: Unix Makefilesm6i.8xlarge90180270360450SE +/- 2.32, N = 3428.63

Timed Linux Kernel Compilation

This test times how long it takes to build the Linux kernel in a default configuration (defconfig) for the architecture being tested. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterTimed Linux Kernel Compilation 5.10.20Time To Compilem6i.8xlarge1224364860SE +/- 0.49, N = 353.91

Timed FFmpeg Compilation

This test times how long it takes to build the FFmpeg multimedia library. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterTimed FFmpeg Compilation 4.4Time To Compilem6i.8xlarge918273645SE +/- 0.15, N = 339.31

OpenVKL

OpenVKL is the Intel Open Volume Kernel Library that offers high-performance volume computation kernels and part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgItems / Sec, More Is BetterOpenVKL 1.0Benchmark: vklBenchmark Scalarm6i.8xlarge71421283531MIN: 2 / MAX: 931

OpenBenchmarking.orgItems / Sec, More Is BetterOpenVKL 1.0Benchmark: vklBenchmark ISPCm6i.8xlarge2040608010081MIN: 6 / MAX: 1934

Intel Open Image Denoise

Open Image Denoise is a denoising library for ray-tracing and part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 1.4.0Run: RTLightmap.hdr.4096x4096m6i.8xlarge0.11250.2250.33750.450.5625SE +/- 0.00, N = 30.50

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 1.4.0Run: RT.ldr_alb_nrm.3840x2160m6i.8xlarge0.23850.4770.71550.9541.1925SE +/- 0.00, N = 31.06

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 1.4.0Run: RT.hdr_alb_nrm.3840x2160m6i.8xlarge0.23850.4770.71550.9541.1925SE +/- 0.00, N = 31.06

SVT-HEVC

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC CPU-based multi-threaded video encoder for the HEVC / H.265 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-HEVC 1.5.0Tuning: 10 - Input: Bosphorus 1080pm6i.8xlarge80160240320400SE +/- 1.33, N = 3387.271. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-HEVC 1.5.0Tuning: 7 - Input: Bosphorus 1080pm6i.8xlarge4080120160200SE +/- 0.11, N = 3187.951. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-HEVC 1.5.0Tuning: 1 - Input: Bosphorus 1080pm6i.8xlarge3691215SE +/- 0.02, N = 312.951. (CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

SVT-AV1

This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format with a sample YUV video file. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgFrames Per Second, More Is BetterSVT-AV1 0.8.7Encoder Mode: Preset 8 - Input: Bosphorus 4Km6i.8xlarge510152025SE +/- 0.01, N = 319.631. (CXX) g++ options: -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie

OSPray

Intel OSPray is a portable ray-tracing engine for high-performance, high-fidenlity scientific visualizations. OSPray builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgFPS, More Is BetterOSPray 1.8.5Demo: Magnetic Reconnection - Renderer: Path Tracerm6i.8xlarge110220330440550500MAX: 1000

OpenBenchmarking.orgFPS, More Is BetterOSPray 1.8.5Demo: NASA Streamlines - Renderer: Path Tracerm6i.8xlarge246810SE +/- 0.00, N = 37.41MIN: 7.25 / MAX: 7.58

OpenBenchmarking.orgFPS, More Is BetterOSPray 1.8.5Demo: XFrog Forest - Renderer: Path Tracerm6i.8xlarge0.60531.21061.81592.42123.0265SE +/- 0.00, N = 32.69MIN: 2.6 / MAX: 2.7

OpenBenchmarking.orgFPS, More Is BetterOSPray 1.8.5Demo: San Miguel - Renderer: Path Tracerm6i.8xlarge0.58051.1611.74152.3222.9025SE +/- 0.00, N = 32.58MIN: 2.53 / MAX: 2.6

LULESH

LULESH is the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgz/s, More Is BetterLULESH 2.0.3m6i.8xlarge2K4K6K8K10KSE +/- 23.16, N = 38427.771. (CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi_cxx -lmpi

Xcompact3d Incompact3d

Xcompact3d Incompact3d is a Fortran-MPI based, finite difference high-performance code for solving the incompressible Navier-Stokes equation and as many as you need scalar transport equations. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgSeconds, Fewer Is BetterXcompact3d Incompact3d 2021-03-11Input: input.i3d 193 Cells Per Directionm6i.8xlarge918273645SE +/- 0.35, N = 341.351. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

OpenBenchmarking.orgSeconds, Fewer Is BetterXcompact3d Incompact3d 2021-03-11Input: input.i3d 129 Cells Per Directionm6i.8xlarge3691215SE +/- 0.13, N = 410.381. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

Pennant

Pennant is an application focused on hydrodynamics on general unstructured meshes in 2D. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgHydro Cycle Time - Seconds, Fewer Is BetterPennant 1.0.1Test: leblancbigm6i.8xlarge510152025SE +/- 0.01, N = 320.341. (CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi

OpenBenchmarking.orgHydro Cycle Time - Seconds, Fewer Is BetterPennant 1.0.1Test: sedovbigm6i.8xlarge1020304050SE +/- 0.06, N = 343.301. (CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi

NAMD

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgdays/ns, Fewer Is BetterNAMD 2.14ATPase Simulation - 327,506 Atomsm6i.8xlarge0.20540.41080.61620.82161.027SE +/- 0.00097, N = 30.91309

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient and is a new scientific benchmark from Sandia National Lans focused for super-computer testing with modern real-world workloads compared to HPCC. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgGFLOP/s, More Is BetterHigh Performance Conjugate Gradient 3.1m6i.8xlarge48121620SE +/- 0.01, N = 315.921. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi

QuantLib

QuantLib is an open-source library/framework around quantitative finance for modeling, trading and risk management scenarios. QuantLib is written in C++ with Boost and its built-in benchmark used reports the QuantLib Benchmark Index benchmark score. Learn more via the OpenBenchmarking.org test page.

OpenBenchmarking.orgMFLOPS, More Is BetterQuantLib 1.21m6i.8xlarge5001000150020002500SE +/- 4.97, N = 32478.81. (CXX) g++ options: -O3 -march=native -rdynamic

56 Results Shown

Appleseed:
  Material Tester
  Disney Material
  Emily
Apache HTTP Server:
  1000
  500
  200
nginx:
  1000
  500
  200
TNN:
  CPU - SqueezeNet v1.1
  CPU - SqueezeNet v2
  CPU - MobileNet v2
  CPU - DenseNet
GROMACS
oneDNN:
  Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU
  Recurrent Neural Network Inference - bf16bf16bf16 - CPU
  Recurrent Neural Network Training - bf16bf16bf16 - CPU
  Matrix Multiply Batch Shapes Transformer - f32 - CPU
  Deconvolution Batch shapes_3d - bf16bf16bf16 - CPU
  Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU
  Convolution Batch Shapes Auto - bf16bf16bf16 - CPU
  Recurrent Neural Network Inference - f32 - CPU
  Recurrent Neural Network Training - f32 - CPU
  Deconvolution Batch shapes_3d - f32 - CPU
  Deconvolution Batch shapes_1d - f32 - CPU
  Convolution Batch Shapes Auto - f32 - CPU
  IP Shapes 3D - bf16bf16bf16 - CPU
  IP Shapes 1D - bf16bf16bf16 - CPU
  IP Shapes 3D - f32 - CPU
  IP Shapes 1D - f32 - CPU
YafaRay
Timed Node.js Compilation
Timed LLVM Compilation
Timed Linux Kernel Compilation
Timed FFmpeg Compilation
OpenVKL:
  vklBenchmark Scalar
  vklBenchmark ISPC
Intel Open Image Denoise:
  RTLightmap.hdr.4096x4096
  RT.ldr_alb_nrm.3840x2160
  RT.hdr_alb_nrm.3840x2160
SVT-HEVC:
  10 - Bosphorus 1080p
  7 - Bosphorus 1080p
  1 - Bosphorus 1080p
SVT-AV1
OSPray:
  Magnetic Reconnection - Path Tracer
  NASA Streamlines - Path Tracer
  XFrog Forest - Path Tracer
  San Miguel - Path Tracer
LULESH
Xcompact3d Incompact3d:
  input.i3d 193 Cells Per Direction
  input.i3d 129 Cells Per Direction
Pennant:
  leblancbig
  sedovbig
NAMD
High Performance Conjugate Gradient
QuantLib