Amazon EC2 m6i.8xlarge

KVM testing on Ubuntu 20.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2108184-TJ-2108173TJ20
Test suites represented in this comparison:

Timed Code Compilation 4 Tests
C/C++ Compiler Tests 6 Tests
CPU Massive 9 Tests
Creator Workloads 8 Tests
Encoding 2 Tests
Fortran Tests 2 Tests
Game Development 2 Tests
Go Language Tests 2 Tests
HPC - High Performance Computing 8 Tests
Machine Learning 2 Tests
Molecular Dynamics 5 Tests
MPI Benchmarks 4 Tests
Multi-Core 16 Tests
Intel oneAPI 4 Tests
OpenMPI Tests 5 Tests
Programmer / Developer System Benchmarks 4 Tests
Python Tests 3 Tests
Raytracing 2 Tests
Renderers 3 Tests
Scientific Computing 5 Tests
Server 2 Tests
Server CPU Tests 7 Tests
Video Encoding 2 Tests


Runs

m6i.8xlarge: run August 17 2021, test duration 4 Hours, 20 Minutes
m5.8xlarge: run August 18 2021, test duration 5 Hours, 17 Minutes
Average test duration: 4 Hours, 49 Minutes


Amazon EC2 m6i.8xlarge - System Details

m6i.8xlarge:
Processor: Intel Xeon Platinum 8375C (16 Cores / 32 Threads)
Motherboard: Amazon EC2 m6i.8xlarge (1.0 BIOS)
Memory: 124GB

m5.8xlarge:
Processor: Intel Xeon Platinum 8259CL (16 Cores / 32 Threads)
Motherboard: Amazon EC2 m5.8xlarge (1.0 BIOS)
Memory: 126GB

Both systems:
Chipset: Intel 440FX 82441FX PMC
Disk: 86GB Amazon Elastic Block Store
Network: Amazon Elastic
OS: Ubuntu 20.04
Kernel: 5.4.0-1045-aws (x86_64)
Vulkan: 1.0.2
Compiler: GCC 9.3.0
File-System: ext4
System Layer: KVM

Kernel Details: Transparent Huge Pages: madvise
Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,gm2 --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-9-HskZEa/gcc-9-9.3.0/debian/tmp-nvptx/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details: m6i.8xlarge: CPU Microcode: 0xd0002b1; m5.8xlarge: CPU Microcode: 0x5003005
Python Details: Python 3.8.10
Security Details:
m6i.8xlarge: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected
m5.8xlarge: itlb_multihit: KVM: Vulnerable + l1tf: Mitigation of PTE Inversion + mds: Vulnerable: Clear buffers attempted no microcode; SMT Host state unknown + meltdown: Mitigation of PTI + spec_store_bypass: Vulnerable + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline STIBP: disabled RSB filling + srbds: Not affected + tsx_async_abort: Not affected

m6i.8xlarge vs. m5.8xlarge Comparison (Phoronix Test Suite overview chart): across every test in this file the m6i.8xlarge leads the m5.8xlarge, with per-test advantages ranging from 7.8% (oneDNN, Deconvolution Batch shapes_1d, f32) up to 66.9% (oneDNN, IP Shapes 3D, f32); the individual results follow below.
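The overview percentages can be reproduced from the raw numbers in the per-test results below. A minimal sketch in Python (the helper name is illustrative; the values are copied from two of the results in this file):

```python
# Reproduce the overview-chart percentages from the raw results below.
# For "more is better" metrics the m6i.8xlarge advantage is m6i/m5 - 1;
# for "fewer is better" metrics (times) it is m5/m6i - 1.
def advantage(m6i, m5, more_is_better):
    ratio = (m6i / m5) if more_is_better else (m5 / m6i)
    return (ratio - 1) * 100

print(f"{advantage(1.83133, 3.05679, False):.1f}%")     # oneDNN IP Shapes 3D f32: ~66.9%
print(f"{advantage(162518.62, 101689.84, True):.1f}%")  # Apache, 200 clients: ~59.8%
```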


oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

oneDNN 2.1.2 - Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 1.83133 | SE +/- 0.00113, N = 3 | Min: 1.83 / Avg: 1.83 / Max: 1.83 | MIN: 1.78
m5.8xlarge: 3.05679 | SE +/- 0.00364, N = 3 | Min: 3.05 / Avg: 3.06 / Max: 3.06 | MIN: 3
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl
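Throughout this file each result reports the mean of N runs together with its standard error. A short sketch of how those figures are derived, using hypothetical per-run samples (the individual run times are not part of the result file):

```python
import statistics

runs = [1.82987, 1.83143, 1.83269]  # hypothetical per-run times in ms
n = len(runs)
mean = statistics.mean(runs)
se = statistics.stdev(runs) / n ** 0.5  # standard error of the mean
print(f"Avg: {mean:.5f} ms, SE +/- {se:.5f}, N = {n}")
```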

Apache HTTP Server

This is a test of the Apache HTTPD web server. The benchmark test profile makes use of the Golang "Bombardier" program to facilitate HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.
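For illustration, the fixed-duration, fixed-concurrency load pattern the profile describes looks roughly like the following Python sketch. This stands in for the Golang Bombardier tool the test actually uses, and the URL, duration, and concurrency are assumptions:

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

URL = "http://localhost:8080/"  # hypothetical server under test
CONCURRENCY = 200               # matches the "Concurrent Requests: 200" option
DURATION = 30                   # seconds of sustained load

def worker(deadline):
    """Issue requests back-to-back until the deadline; return the count."""
    done = 0
    while time.time() < deadline:
        with urllib.request.urlopen(URL) as resp:
            resp.read()
        done += 1
    return done

deadline = time.time() + DURATION
with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    counts = list(pool.map(worker, [deadline] * CONCURRENCY))

print(f"{sum(counts) / DURATION:.2f} requests/sec at concurrency {CONCURRENCY}")
```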

Apache HTTP Server 2.4.48 - Concurrent Requests: 200 (Requests Per Second, More Is Better)
m6i.8xlarge: 162518.62 | SE +/- 812.36, N = 3 | Min: 160894.37 / Avg: 162518.62 / Max: 163364.47
m5.8xlarge: 101689.84 | SE +/- 267.71, N = 3 | Min: 101163.62 / Avg: 101689.84 / Max: 102038.58
(CC) gcc options: -shared -fPIC -O2 -pthread

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

oneDNN 2.1.2 - Harness: IP Shapes 3D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 1.75314 | SE +/- 0.00108, N = 3 | Min: 1.75 / Avg: 1.75 / Max: 1.75 | MIN: 1.7
m5.8xlarge: 2.71186 | SE +/- 0.00021, N = 3 | Min: 2.71 / Avg: 2.71 / Max: 2.71 | MIN: 2.66
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Apache HTTP Server

This is a test of the Apache HTTPD web server. The benchmark test profile makes use of the Golang "Bombardier" program to facilitate HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.

Apache HTTP Server 2.4.48 - Concurrent Requests: 1000 (Requests Per Second, More Is Better)
m6i.8xlarge: 169650.54 | SE +/- 1307.55, N = 10 | Min: 157955.92 / Avg: 169650.54 / Max: 171623.48
m5.8xlarge: 109815.03 | SE +/- 1046.25, N = 3 | Min: 107728.88 / Avg: 109815.03 / Max: 110999.05
(CC) gcc options: -shared -fPIC -O2 -pthread

OSPray

Intel OSPray is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPray builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPray 1.8.5 - Demo: Magnetic Reconnection - Renderer: Path Tracer (FPS, More Is Better)
m6i.8xlarge: 500.00 | MAX: 1000
m5.8xlarge: 333.33 | SE +/- 0.00, N = 3 | Min: 333.33 / Avg: 333.33 / Max: 333.33 | MIN: 250

GROMACS

This is a test of the GROMACS (GROningen MAchine for Chemical Simulations) molecular dynamics package using the water_GMX50 data. This test profile allows selecting between CPU and GPU-based GROMACS builds. Learn more via the OpenBenchmarking.org test page.

GROMACS 2021.2 - Implementation: MPI CPU - Input: water_GMX50_bare (Ns Per Day, More Is Better)
m6i.8xlarge: 2.676 | SE +/- 0.004, N = 3 | Min: 2.67 / Avg: 2.68 / Max: 2.68
m5.8xlarge: 1.817 | SE +/- 0.002, N = 3 | Min: 1.81 / Avg: 1.82 / Max: 1.82
(CXX) g++ options: -O3 -pthread

SVT-AV1

This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format; the benchmark encodes a sample YUV video file. Learn more via the OpenBenchmarking.org test page.

SVT-AV1 0.8.7 - Encoder Mode: Preset 8 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
m6i.8xlarge: 19.63 | SE +/- 0.01, N = 3 | Min: 19.62 / Avg: 19.63 / Max: 19.65
m5.8xlarge: 13.37 | SE +/- 0.01, N = 3 | Min: 13.36 / Avg: 13.37 / Max: 13.4
(CXX) g++ options: -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie

TNN

TNN is an open-source deep learning inference framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

TNN 0.3 - Target: CPU - Model: SqueezeNet v2 (ms, Fewer Is Better)
m6i.8xlarge: 70.55 | SE +/- 0.08, N = 3 | Min: 70.41 / Avg: 70.55 / Max: 70.7 | MIN: 70.14 / MAX: 71.88
m5.8xlarge: 103.47 | SE +/- 0.01, N = 3 | Min: 103.45 / Avg: 103.47 / Max: 103.49 | MIN: 103.34 / MAX: 103.68
(CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

QuantLib

QuantLib is an open-source library/framework for quantitative finance, covering modeling, trading, and risk management scenarios. QuantLib is written in C++ with Boost, and its built-in benchmark reports the QuantLib Benchmark Index score. Learn more via the OpenBenchmarking.org test page.

QuantLib 1.21 (MFLOPS, More Is Better)
m6i.8xlarge: 2478.8 | SE +/- 4.97, N = 3 | Min: 2469.9 / Avg: 2478.8 / Max: 2487.1
m5.8xlarge: 1716.6 | SE +/- 0.78, N = 3 | Min: 1715.2 / Avg: 1716.57 / Max: 1717.9
(CXX) g++ options: -O3 -march=native -rdynamic

NAMD

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems. NAMD was developed by the Theoretical and Computational Biophysics Group in the Beckman Institute for Advanced Science and Technology at the University of Illinois at Urbana-Champaign. Learn more via the OpenBenchmarking.org test page.

NAMD 2.14 - ATPase Simulation - 327,506 Atoms (days/ns, Fewer Is Better)
m6i.8xlarge: 0.91309 | SE +/- 0.00097, N = 3 | Min: 0.91 / Avg: 0.91 / Max: 0.91
m5.8xlarge: 1.31458 | SE +/- 0.00352, N = 3 | Min: 1.31 / Avg: 1.31 / Max: 1.32
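NAMD reports days/ns, i.e. wall-clock days per simulated nanosecond, which is the reciprocal of the ns-per-day figure GROMACS reports above; converting makes the two molecular dynamics results easier to read side by side:

```python
# Convert NAMD's days/ns into ns/day (reciprocal), using the m6i.8xlarge result.
namd_days_per_ns = 0.91309
print(f"{1 / namd_days_per_ns:.3f} ns/day")  # ~1.095 ns of simulation per day
```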

Pennant

Pennant is an application focused on hydrodynamics on general unstructured meshes in 2D. Learn more via the OpenBenchmarking.org test page.

Pennant 1.0.1 - Test: sedovbig (Hydro Cycle Time - Seconds, Fewer Is Better)
m6i.8xlarge: 43.30 | SE +/- 0.06, N = 3 | Min: 43.22 / Avg: 43.3 / Max: 43.41
m5.8xlarge: 61.98 | SE +/- 0.00, N = 3 | Min: 61.97 / Avg: 61.98 / Max: 61.99
(CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi

SVT-HEVC

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC, a CPU-based multi-threaded video encoder for the HEVC / H.265 video format, using a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.

SVT-HEVC 1.5.0 - Tuning: 1 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
m6i.8xlarge: 12.95 | SE +/- 0.02, N = 3 | Min: 12.93 / Avg: 12.95 / Max: 12.99
m5.8xlarge: 9.15 | SE +/- 0.02, N = 3 | Min: 9.11 / Avg: 9.15 / Max: 9.18
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 2.36006 | SE +/- 0.00094, N = 3 | Min: 2.36 / Avg: 2.36 / Max: 2.36 | MIN: 2.18
m5.8xlarge: 3.32751 | SE +/- 0.00287, N = 3 | Min: 3.32 / Avg: 3.33 / Max: 3.33 | MIN: 3.29
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Pennant

Pennant is an application focused on hydrodynamics on general unstructured meshes in 2D. Learn more via the OpenBenchmarking.org test page.

Pennant 1.0.1 - Test: leblancbig (Hydro Cycle Time - Seconds, Fewer Is Better)
m6i.8xlarge: 20.34 | SE +/- 0.01, N = 3 | Min: 20.32 / Avg: 20.34 / Max: 20.37
m5.8xlarge: 28.35 | SE +/- 0.02, N = 3 | Min: 28.32 / Avg: 28.35 / Max: 28.39
(CXX) g++ options: -fopenmp -pthread -lmpi_cxx -lmpi

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

oneDNN 2.1.2 - Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 2.89177 | SE +/- 0.00091, N = 3 | Min: 2.89 / Avg: 2.89 / Max: 2.89 | MIN: 2.82
m5.8xlarge: 4.02131 | SE +/- 0.00415, N = 3 | Min: 4.01 / Avg: 4.02 / Max: 4.03 | MIN: 3.94
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

SVT-HEVC

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC, a CPU-based multi-threaded video encoder for the HEVC / H.265 video format, using a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.

SVT-HEVC 1.5.0 - Tuning: 10 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
m6i.8xlarge: 387.27 | SE +/- 1.33, N = 3 | Min: 384.62 / Avg: 387.27 / Max: 388.85
m5.8xlarge: 278.60 | SE +/- 0.62, N = 3 | Min: 277.39 / Avg: 278.6 / Max: 279.46
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

OSPray

Intel OSPray is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPray builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPray 1.8.5 - Demo: NASA Streamlines - Renderer: Path Tracer (FPS, More Is Better)
m6i.8xlarge: 7.41 | SE +/- 0.00, N = 3 | Min: 7.41 / Avg: 7.41 / Max: 7.41 | MIN: 7.25 / MAX: 7.58
m5.8xlarge: 5.35 | SE +/- 0.00, N = 3 | Min: 5.35 / Avg: 5.35 / Max: 5.35 | MIN: 5.24 / MAX: 5.43

OSPray 1.8.5 - Demo: XFrog Forest - Renderer: Path Tracer (FPS, More Is Better)
m6i.8xlarge: 2.69 | SE +/- 0.00, N = 3 | Min: 2.68 / Avg: 2.69 / Max: 2.69 | MIN: 2.6 / MAX: 2.7
m5.8xlarge: 1.95 | SE +/- 0.00, N = 3 | Min: 1.95 / Avg: 1.95 / Max: 1.95 | MIN: 1.94 / MAX: 1.96

Apache HTTP Server

This is a test of the Apache HTTPD web server. The benchmark test profile makes use of the Golang "Bombardier" program to facilitate HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.

Apache HTTP Server 2.4.48 - Concurrent Requests: 500 (Requests Per Second, More Is Better)
m6i.8xlarge: 162463.56 | SE +/- 1515.80, N = 7 | Min: 155560.51 / Avg: 162463.56 / Max: 165727.91
m5.8xlarge: 117933.04 | SE +/- 155.99, N = 3 | Min: 117723.19 / Avg: 117933.04 / Max: 118237.88
(CC) gcc options: -shared -fPIC -O2 -pthread

SVT-HEVC

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC, a CPU-based multi-threaded video encoder for the HEVC / H.265 video format, using a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.

SVT-HEVC 1.5.0 - Tuning: 7 - Input: Bosphorus 1080p (Frames Per Second, More Is Better)
m6i.8xlarge: 187.95 | SE +/- 0.11, N = 3 | Min: 187.73 / Avg: 187.95 / Max: 188.09
m5.8xlarge: 136.96 | SE +/- 0.21, N = 3 | Min: 136.55 / Avg: 136.96 / Max: 137.21
(CC) gcc options: -fPIE -fPIC -O3 -O2 -pie -rdynamic -lpthread -lrt

Timed FFmpeg Compilation

This test times how long it takes to build the FFmpeg multimedia library. Learn more via the OpenBenchmarking.org test page.

Timed FFmpeg Compilation 4.4 - Time To Compile (Seconds, Fewer Is Better)
m6i.8xlarge: 39.31 | SE +/- 0.15, N = 3 | Min: 39.08 / Avg: 39.31 / Max: 39.6
m5.8xlarge: 53.76 | SE +/- 0.12, N = 3 | Min: 53.58 / Avg: 53.76 / Max: 54
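The timed-compilation profiles in this file all follow the same shape: run the build and report wall-clock seconds. A minimal sketch of that measurement (the source directory and job count are hypothetical; the Phoronix Test Suite drives the real profile):

```python
import subprocess
import time

start = time.perf_counter()
# Hypothetical checkout of the FFmpeg 4.4 sources; -j 32 matches the 32 threads here.
subprocess.run(["make", "-j", "32"], cwd="ffmpeg-4.4", check=True)
print(f"Time To Compile: {time.perf_counter() - start:.2f} seconds")
```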

LULESH

LULESH is the Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics. Learn more via the OpenBenchmarking.org test page.

LULESH 2.0.3 (z/s, More Is Better)
m6i.8xlarge: 8427.77 | SE +/- 23.16, N = 3 | Min: 8383.7 / Avg: 8427.77 / Max: 8462.15
m5.8xlarge: 6183.67 | SE +/- 10.80, N = 3 | Min: 6170.26 / Avg: 6183.67 / Max: 6205.05
(CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi_cxx -lmpi

Intel Open Image Denoise

Open Image Denoise is a denoising library for ray tracing and is part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

Intel Open Image Denoise 1.4.0 - Run: RT.ldr_alb_nrm.3840x2160 (Images / Sec, More Is Better)
m6i.8xlarge: 1.06 | SE +/- 0.00, N = 3 | Min: 1.06 / Avg: 1.06 / Max: 1.06
m5.8xlarge: 0.78 | SE +/- 0.00, N = 3 | Min: 0.78 / Avg: 0.78 / Max: 0.78

Intel Open Image Denoise 1.4.0 - Run: RT.hdr_alb_nrm.3840x2160 (Images / Sec, More Is Better)
m6i.8xlarge: 1.06 | SE +/- 0.00, N = 3 | Min: 1.06 / Avg: 1.06 / Max: 1.06
m5.8xlarge: 0.78 | SE +/- 0.00, N = 3 | Min: 0.78 / Avg: 0.78 / Max: 0.78

Intel Open Image Denoise 1.4.0 - Run: RTLightmap.hdr.4096x4096 (Images / Sec, More Is Better)
m6i.8xlarge: 0.50 | SE +/- 0.00, N = 3 | Min: 0.5 / Avg: 0.5 / Max: 0.5
m5.8xlarge: 0.37 | SE +/- 0.00, N = 3 | Min: 0.36 / Avg: 0.37 / Max: 0.37

OpenVKL

OpenVKL is the Intel Open Volume Kernel Library, which offers high-performance volume computation kernels and is part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenVKL 1.0 - Benchmark: vklBenchmark ISPC (Items / Sec, More Is Better)
m6i.8xlarge: 81 | MIN: 6 / MAX: 1934
m5.8xlarge: 60 | MIN: 4 / MAX: 1457

Timed LLVM Compilation

This test times how long it takes to build the LLVM compiler. Learn more via the OpenBenchmarking.org test page.

Timed LLVM Compilation 12.0 - Build System: Unix Makefiles (Seconds, Fewer Is Better)
m6i.8xlarge: 428.63 | SE +/- 2.32, N = 3 | Min: 424.24 / Avg: 428.63 / Max: 432.13
m5.8xlarge: 577.69 | SE +/- 3.92, N = 3 | Min: 572.81 / Avg: 577.69 / Max: 585.44

Timed Linux Kernel Compilation

This test times how long it takes to build the Linux kernel in a default configuration (defconfig) for the architecture being tested. Learn more via the OpenBenchmarking.org test page.

Timed Linux Kernel Compilation 5.10.20 - Time To Compile (Seconds, Fewer Is Better)
m6i.8xlarge: 53.91 | SE +/- 0.49, N = 3 | Min: 53.36 / Avg: 53.91 / Max: 54.89
m5.8xlarge: 72.46 | SE +/- 0.70, N = 3 | Min: 71.71 / Avg: 72.46 / Max: 73.85

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 4.34038 | SE +/- 0.00222, N = 3 | Min: 4.34 / Avg: 4.34 / Max: 4.34 | MIN: 4.13
m5.8xlarge: 5.82554 | SE +/- 0.00191, N = 3 | Min: 5.82 / Avg: 5.83 / Max: 5.83 | MIN: 5.74
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2 - Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 1.72547 | SE +/- 0.00183, N = 3 | Min: 1.72 / Avg: 1.73 / Max: 1.73 | MIN: 1.52
m5.8xlarge: 2.31361 | SE +/- 0.00021, N = 3 | Min: 2.31 / Avg: 2.31 / Max: 2.31 | MIN: 2.26
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Xcompact3d Incompact3d

Xcompact3d Incompact3d is a Fortran-MPI based, finite-difference high-performance code for solving the incompressible Navier-Stokes equations along with as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.
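For reference, the incompressible Navier-Stokes system such codes solve, in its standard form (u is velocity, p pressure, rho density, nu kinematic viscosity):

```latex
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u}
  = -\frac{1}{\rho}\nabla p + \nu\,\nabla^{2}\mathbf{u},
\qquad \nabla\cdot\mathbf{u} = 0
```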

Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 193 Cells Per Direction (Seconds, Fewer Is Better)
m6i.8xlarge: 41.35 | SE +/- 0.35, N = 3 | Min: 40.94 / Avg: 41.35 / Max: 42.05
m5.8xlarge: 54.98 | SE +/- 0.43, N = 3 | Min: 54.13 / Avg: 54.98 / Max: 55.47
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

TNN

TNN is an open-source deep learning inference framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

TNN 0.3 - Target: CPU - Model: MobileNet v2 (ms, Fewer Is Better)
m6i.8xlarge: 349.68 | SE +/- 0.22, N = 3 | Min: 349.36 / Avg: 349.68 / Max: 350.1 | MIN: 348.47 / MAX: 351.38
m5.8xlarge: 462.36 | SE +/- 0.09, N = 3 | Min: 462.17 / Avg: 462.36 / Max: 462.45 | MIN: 460.64 / MAX: 467.45
(CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

OSPray

Intel OSPray is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPray builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPray 1.8.5 - Demo: San Miguel - Renderer: Path Tracer (FPS, More Is Better)
m6i.8xlarge: 2.58 | SE +/- 0.00, N = 3 | Min: 2.58 / Avg: 2.58 / Max: 2.58 | MIN: 2.53 / MAX: 2.6
m5.8xlarge: 1.96 | SE +/- 0.00, N = 3 | Min: 1.96 / Avg: 1.96 / Max: 1.96 | MIN: 1.94

nginx

This is a benchmark of the lightweight Nginx HTTP(S) web server. The benchmark test profile makes use of the Golang "Bombardier" program to facilitate HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.

nginx 1.21.1 - Concurrent Requests: 200 (Requests Per Second, More Is Better)
m6i.8xlarge: 440121.66 | SE +/- 600.47, N = 3 | Min: 439121.26 / Avg: 440121.66 / Max: 441197.26
m5.8xlarge: 334665.99 | SE +/- 59.66, N = 3 | Min: 334581.84 / Avg: 334665.99 / Max: 334781.32
(CC) gcc options: -ldl -lpthread -lcrypt -lz -O3 -march=native

Timed Node.js Compilation

This test profile times how long it takes to build/compile Node.js itself from source. Node.js is a JavaScript run-time built on the Chrome V8 JavaScript engine and is itself written in C/C++. Learn more via the OpenBenchmarking.org test page.

Timed Node.js Compilation 15.11 - Time To Compile (Seconds, Fewer Is Better)
m6i.8xlarge: 253.25 | SE +/- 0.12, N = 3 | Min: 253.06 / Avg: 253.25 / Max: 253.48
m5.8xlarge: 332.86 | SE +/- 0.40, N = 3 | Min: 332.16 / Avg: 332.86 / Max: 333.54

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

oneDNN 2.1.2 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 0.598267 | SE +/- 0.000278, N = 3 | Min: 0.6 / Avg: 0.6 / Max: 0.6 | MIN: 0.54
m5.8xlarge: 0.785254 | SE +/- 0.000465, N = 3 | Min: 0.78 / Avg: 0.79 / Max: 0.79 | MIN: 0.75
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 681.69 | SE +/- 0.16, N = 3 | Min: 681.38 / Avg: 681.69 / Max: 681.86 | MIN: 667.85
m5.8xlarge: 892.72 | SE +/- 0.44, N = 3 | Min: 891.96 / Avg: 892.72 / Max: 893.47 | MIN: 889.58
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2 - Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 683.61 | SE +/- 0.39, N = 3 | Min: 682.84 / Avg: 683.61 / Max: 684.01 | MIN: 668.66
m5.8xlarge: 891.90 | SE +/- 0.58, N = 3 | Min: 890.77 / Avg: 891.9 / Max: 892.66 | MIN: 888.42
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 1259.81 | SE +/- 0.80, N = 3 | Min: 1258.28 / Avg: 1259.81 / Max: 1260.96 | MIN: 1231.21
m5.8xlarge: 1643.55 | SE +/- 3.08, N = 3 | Min: 1637.96 / Avg: 1643.55 / Max: 1648.6 | MIN: 1634.9
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2 - Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 1262.00 | SE +/- 0.36, N = 3 | Min: 1261.28 / Avg: 1262 / Max: 1262.44 | MIN: 1237.58
m5.8xlarge: 1646.37 | SE +/- 1.52, N = 3 | Min: 1644.37 / Avg: 1646.37 / Max: 1649.36 | MIN: 1641.32
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Xcompact3d Incompact3d

Xcompact3d Incompact3d is a Fortran-MPI based, finite-difference high-performance code for solving the incompressible Navier-Stokes equations along with as many scalar transport equations as needed. Learn more via the OpenBenchmarking.org test page.

Xcompact3d Incompact3d 2021-03-11 - Input: input.i3d 129 Cells Per Direction (Seconds, Fewer Is Better)
m6i.8xlarge: 10.38 | SE +/- 0.13, N = 4 | Min: 10.01 / Avg: 10.38 / Max: 10.58
m5.8xlarge: 13.45 | SE +/- 0.07, N = 3 | Min: 13.32 / Avg: 13.45 / Max: 13.52
(F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

OpenVKL

OpenVKL is the Intel Open Volume Kernel Library, which offers high-performance volume computation kernels and is part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenVKL 1.0 - Benchmark: vklBenchmark Scalar (Items / Sec, More Is Better)
m6i.8xlarge: 31 | MIN: 2 / MAX: 931
m5.8xlarge: 24 | MIN: 2 / MAX: 667

nginx

This is a benchmark of the lightweight Nginx HTTP(S) web server. The benchmark test profile makes use of the Golang "Bombardier" program to facilitate HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.

nginx 1.21.1 - Concurrent Requests: 1000 (Requests Per Second, More Is Better)
m6i.8xlarge: 401818.94 | SE +/- 2242.18, N = 3 | Min: 397587.12 / Avg: 401818.94 / Max: 405219.71
m5.8xlarge: 312237.31 | SE +/- 1189.81, N = 3 | Min: 309857.71 / Avg: 312237.31 / Max: 313432.07
(CC) gcc options: -ldl -lpthread -lcrypt -lz -O3 -march=native

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

oneDNN 2.1.2 - Harness: Matrix Multiply Batch Shapes Transformer - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 1.83916 | SE +/- 0.00072, N = 3 | Min: 1.84 / Avg: 1.84 / Max: 1.84 | MIN: 1.76
m5.8xlarge: 2.36659 | SE +/- 0.00424, N = 3 | Min: 2.36 / Avg: 2.37 / Max: 2.37 | MIN: 2.28
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

nginx

This is a benchmark of the lightweight Nginx HTTP(S) web server. The benchmark test profile makes use of the Golang "Bombardier" program to facilitate HTTP requests over a fixed period of time with a configurable number of concurrent clients. Learn more via the OpenBenchmarking.org test page.

nginx 1.21.1 - Concurrent Requests: 500 (Requests Per Second, More Is Better)
m6i.8xlarge: 414507.60 | SE +/- 2040.23, N = 3 | Min: 412275.27 / Avg: 414507.6 / Max: 418581.82
m5.8xlarge: 322639.42 | SE +/- 143.26, N = 3 | Min: 322431.88 / Avg: 322639.42 / Max: 322914.27
(CC) gcc options: -ldl -lpthread -lcrypt -lz -O3 -march=native

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

oneDNN 2.1.2 - Harness: Convolution Batch Shapes Auto - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 8.37531 | SE +/- 0.00038, N = 3 | Min: 8.37 / Avg: 8.38 / Max: 8.38 | MIN: 8.35
m5.8xlarge: 10.60660 | SE +/- 0.00721, N = 3 | Min: 10.59 / Avg: 10.61 / Max: 10.62 | MIN: 10.39
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Appleseed

Appleseed is an open-source production renderer built around a physically-based global illumination rendering engine and primarily designed for animation and visual effects. Learn more via the OpenBenchmarking.org test page.

Appleseed 2.0 Beta - Scene: Material Tester (Seconds, Fewer Is Better)
m6i.8xlarge: 132.09
m5.8xlarge: 166.45

Appleseed 2.0 Beta - Scene: Emily (Seconds, Fewer Is Better)
m6i.8xlarge: 239.74
m5.8xlarge: 298.61

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_3d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 11.54 | SE +/- 0.00, N = 3 | Min: 11.54 / Avg: 11.54 / Max: 11.54 | MIN: 11.5
m5.8xlarge: 14.37 | SE +/- 0.02, N = 3 | Min: 14.35 / Avg: 14.37 / Max: 14.4 | MIN: 14.32
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_1d - Data Type: bf16bf16bf16 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 11.19 | SE +/- 0.01, N = 3 | Min: 11.18 / Avg: 11.19 / Max: 11.2 | MIN: 11.14
m5.8xlarge: 13.85 | SE +/- 0.00, N = 3 | Min: 13.85 / Avg: 13.85 / Max: 13.86 | MIN: 13.7
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

Appleseed

Appleseed is an open-source production renderer built around a physically-based global illumination rendering engine and primarily designed for animation and visual effects. Learn more via the OpenBenchmarking.org test page.

Appleseed 2.0 Beta - Scene: Disney Material (Seconds, Fewer Is Better)
m6i.8xlarge: 133.28
m5.8xlarge: 164.39

TNN

TNN is an open-source deep learning inference framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

TNN 0.3 - Target: CPU - Model: SqueezeNet v1.1 (ms, Fewer Is Better)
m6i.8xlarge: 357.66 | SE +/- 0.06, N = 3 | Min: 357.54 / Avg: 357.66 / Max: 357.72 | MIN: 356.91 / MAX: 361
m5.8xlarge: 436.69 | SE +/- 0.03, N = 3 | Min: 436.63 / Avg: 436.69 / Max: 436.74 | MIN: 436.06 / MAX: 437.36
(CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

High Performance Conjugate Gradient

HPCG is the High Performance Conjugate Gradient benchmark, a newer scientific benchmark from Sandia National Labs aimed at supercomputer testing with modern, real-world workloads compared to HPCC. Learn more via the OpenBenchmarking.org test page.
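At HPCG's core is the conjugate gradient iteration for sparse symmetric positive-definite systems. A compact Python/NumPy sketch of the textbook algorithm follows; this is not the HPCG reference code, which adds multigrid preconditioning and MPI halo exchange:

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-8, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A."""
    x = np.zeros_like(b)
    r = b - A @ x          # residual
    p = r.copy()           # search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if rs_new ** 0.5 < tol:
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Tiny SPD example
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))  # ~[0.0909, 0.6364]
```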

High Performance Conjugate Gradient 3.1 (GFLOP/s, More Is Better)
m6i.8xlarge: 15.92 | SE +/- 0.01, N = 3 | Min: 15.91 / Avg: 15.92 / Max: 15.93
m5.8xlarge: 13.09 | SE +/- 0.02, N = 3 | Min: 13.06 / Avg: 13.09 / Max: 13.12
(CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi_cxx -lmpi

TNN

TNN is an open-source deep learning inference framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

TNN 0.3 - Target: CPU - Model: DenseNet (ms, Fewer Is Better)
m6i.8xlarge: 3786.26 | SE +/- 0.64, N = 3 | Min: 3785.49 / Avg: 3786.26 / Max: 3787.54 | MIN: 3733.1 / MAX: 3883.97
m5.8xlarge: 4457.14 | SE +/- 3.11, N = 3 | Min: 4451.83 / Avg: 4457.14 / Max: 4462.59 | MIN: 4358.21 / MAX: 4557.14
(CXX) g++ options: -fopenmp -pthread -fvisibility=hidden -fvisibility=default -O3 -rdynamic -ldl

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

oneDNN 2.1.2 - Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU (ms, Fewer Is Better)
m6i.8xlarge: 6.60245 | SE +/- 0.01111, N = 3 | Min: 6.59 / Avg: 6.6 / Max: 6.62 | MIN: 3.59
m5.8xlarge: 7.11450 | SE +/- 0.01645, N = 3 | Min: 7.09 / Avg: 7.11 / Max: 7.15 | MIN: 4.07
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp -msse4.1 -fPIC -pie -lpthread -ldl

YafaRay

YafaRay is an open-source, physically based Monte Carlo ray-tracing engine. Learn more via the OpenBenchmarking.org test page.
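Monte Carlo rendering estimates integrals by random sampling; the same principle in its simplest form, estimating pi from uniform samples (illustrative only, unrelated to YafaRay's own code):

```python
import random

# Estimate pi by sampling the unit square: the fraction of points
# falling inside the quarter circle approaches pi/4.
N = 1_000_000
inside = sum(1 for _ in range(N)
             if random.random() ** 2 + random.random() ** 2 <= 1.0)
print(f"pi ~ {4 * inside / N:.4f}")
```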

YafaRay 3.5.1 - Total Time For Sample Scene (Seconds, Fewer Is Better)
m6i.8xlarge: 90.46 | SE +/- 1.25, N = 3 | Min: 88.81 / Avg: 90.46 / Max: 92.91
m5.8xlarge: 133.74 | SE +/- 2.77, N = 12 | Min: 122.16 / Avg: 133.74 / Max: 154.19
(CXX) g++ options: -std=c++11 -pthread -O3 -ffast-math -rdynamic -ldl -lImath -lIlmImf -lIex -lHalf -lz -lIlmThread -lxml2 -lfreetype