Microsoft Azure EPYC Milan-X HBv3 Benchmarks

Microsoft Azure HBv3 (Milan) versus HBv3 (Milan-X) benchmarking by Michael Larabel for a future article on Phoronix.com, looking at the performance of AMD EPYC Milan-X in the Microsoft Azure cloud across a variety of workloads.

HTML result view exported from: https://openbenchmarking.org/result/2203201-PTS-AZUREHBV49&sro&grt.

Processor:
  HBv3 64 Cores: 2 x AMD EPYC 7V13 64-Core (64 Cores)
  HBv3 Milan-X 64 Cores: 2 x AMD EPYC 7V73X 64-Core (64 Cores)
  HBv3 120 Cores: 2 x AMD EPYC 7V13 64-Core (120 Cores)
  HBv3 Milan-X 120 Cores: 2 x AMD EPYC 7V73X 64-Core (120 Cores)
Motherboard: Microsoft Virtual Machine (Hyper-V UEFI v4.1 BIOS)
Memory: 442GB
Disk: 2 x 960GB Microsoft NVMe Direct Disk + 32GB Virtual Disk + 515GB Virtual Disk
Graphics: hyperv_fb
Network: Mellanox MT27710
OS: CentOS Linux 8
Kernel: 4.18.0-147.8.1.el8_1.x86_64 (x86_64)
Compiler: GCC 8.3.1 20190507
File-System: ext4
Screen Resolution: 1152x864
System Layer: microsoft

Kernel Details
- Transparent Huge Pages: always

Compiler Details
- --build=x86_64-redhat-linux --disable-libmpx --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-initfini-array --enable-languages=c,c++,fortran,lto --enable-multilib --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=x86-64 --with-gcc-major-version-only --with-isl --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver

Processor Details
- CPU Microcode: 0xffffffff

Python Details
- Python 3.6.8

Security Details
- SELinux + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Vulnerable + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full generic retpoline STIBP: disabled RSB filling + tsx_async_abort: Not affected

Test | HBv3 64 Cores | HBv3 Milan-X 64 Cores | HBv3 120 Cores | HBv3 Milan-X 120 Cores
askap: tConvolve MPI - Degridding | 35988.0 | 40896.3 | 41724.7 | 57881.4
askap: tConvolve MPI - Gridding | 38175.0 | 41160.1 | 41287.0 | 57042.5
brl-cad: VGR Performance Metric | 618492 | 655183 | 1044368 | 1109486
embree: Pathtracer - Crown | 40.7800 | 46.0929 | 66.5049 | 75.9409
embree: Pathtracer ISPC - Crown | 38.9051 | 44.2523 | 63.2862 | 71.9802
embree: Pathtracer - Asian Dragon | 41.8596 | 45.6833 | 64.3717 | 78.7710
embree: Pathtracer ISPC - Asian Dragon | 42.0015 | 45.6566 | 63.4384 | 76.5242
rocksdb: Rand Read | 324822447 | 330410911 | 502728808 | 522387301
rocksdb: Read Rand Write Rand | 1357349 | 1381157 | 1587743 | 1684654
graphics-magick: Noise-Gaussian | 585 | 874 | 721 | 1123
gromacs: MPI CPU - water_GMX50_bare | 7.476 | 7.977 | 9.054 | 9.705
hpcg | 40.0233 | 41.1303 | 38.7180 | 39.4368
hpcc: G-HPL | 99.56610 | 175.02700 | 89.35550 | 139.04400
hpcc: Rand Ring Bandwidth | 1.82992 | 6.20242 | 0.76414 | 3.41538
hpcc: Max Ping Pong Bandwidth | 17174.753 | 18347.866 | 15815.903 | 16082.148
john-the-ripper: MD5 | 5697467 | 5913000 | 7143267 | 8141400
kripke | 73635521 | 97373301 | 88201142 | 93936541
lammps: 20k Atoms | 31.605 | 32.373 | 36.881 | 39.535
lammps: Rhodopsin Protein | 32.958 | 33.955 | 35.409 | 38.467
lulesh | 44262.227 | 54759.689 | 40262.206 | 47341.128
namd: ATPase Simulation - 327,506 Atoms | 0.41157 | 0.40802 | 0.27619 | 0.26900
npb: CG.C | 20940.70 | 22323.23 | 20926.52 | 21914.51
nwchem: C240 Buckyball | 2256.6 | 2219.8 | 2557.1 | 2467.9
onnx: super-resolution-10 - CPU | 6107 | 6354 | 5852 | 6485
openfoam: Motorbike 60M | 89.65 | 65.50 | 80.60 | 54.03
openvkl: vklBenchmark ISPC | 120 | 126 | 166 | 177
openvkl: vklBenchmark Scalar | 72 | 74 | 106 | 111
ospray: San Miguel - SciVis | 52.63 | 55.56 | 83.33 | 85.86
ospray: XFrog Forest - SciVis | 10.75 | 11.36 | 16.95 | 18.41
ospray: San Miguel - Path Tracer | 4.32 | 4.93 | 7.39 | 8.50
ospray: NASA Streamlines - SciVis | 71.43 | 83.33 | 111.11 | 125
ospray: XFrog Forest - Path Tracer | 5.70 | 6.20 | 9.09 | 9.90
ospray: Magnetic Reconnection - SciVis | 38.46 | 40 | 62.5 | 66.67
ospray: NASA Streamlines - Path Tracer | 15.87 | 17.24 | 24.59 | 27.78
parboil: OpenMP CUTCP | 1.515548 | 1.127166 | 0.976470 | 0.847720
relion: Basic - CPU | 418.479 | 414.541 | 312.797 | 274.353
build-linux-kernel: Time To Compile | 24.159 | 23.906 | 19.065 | 18.559
build-nodejs: Time To Compile | 96.348 | 93.795 | 75.714 | 72.452
wrf: conus 2.5km | 10150.067 | 9294.703 | 8766.54 | 7804.46
incompact3d: X3D-benchmarking input.i3d | 348.114604 | 322.875112 | 287.761383 | 255.836411
incompact3d: input.i3d 193 Cells Per Direction | 13.6781092 | 10.8780505 | 12.2859945 | 9.93829823
compress-zstd: 19 - Compression Speed | 85.1 | 106.2 | 82.0 | 93.8
compress-zstd: 19, Long Mode - Compression Speed | 39.8 | 59.8 | 36.6 | 52.8
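The overview results above mix "More Is Better" and "Fewer Is Better" metrics, so a direct ratio has to be inverted for time-based tests. The following is a minimal Python sketch of that arithmetic, not part of the Phoronix Test Suite: the `speedup` helper and the `RESULTS` layout are hypothetical names chosen for illustration, and the two sample entries are copied from the 120-core results in this file.

```python
# Hypothetical helper for comparing two results from this file.
# The numbers below are the 120-core values reported in this export.
RESULTS = {
    "askap: tConvolve MPI - Degridding": {   # Mpix/sec, More Is Better
        "higher_is_better": True,
        "HBv3 120 Cores": 41724.7,
        "HBv3 Milan-X 120 Cores": 57881.4,
    },
    "openfoam: Motorbike 60M": {             # Seconds, Fewer Is Better
        "higher_is_better": False,
        "HBv3 120 Cores": 80.60,
        "HBv3 Milan-X 120 Cores": 54.03,
    },
}


def speedup(baseline, candidate, higher_is_better):
    """Return how many times faster the candidate is than the baseline.

    For throughput-style metrics the ratio is candidate / baseline;
    for time-style metrics it is inverted so that > 1.0 always means
    the candidate is faster.
    """
    if higher_is_better:
        return candidate / baseline
    return baseline / candidate


def milan_x_speedups(results):
    """Map each test name to the Milan-X vs. Milan speedup at 120 cores."""
    return {
        name: speedup(
            row["HBv3 120 Cores"],
            row["HBv3 Milan-X 120 Cores"],
            row["higher_is_better"],
        )
        for name, row in results.items()
    }


if __name__ == "__main__":
    for name, ratio in milan_x_speedups(RESULTS).items():
        print(f"{name}: {ratio:.2f}x")
```

Keeping the direction of each metric next to its values avoids reading a lower time as a regression when ratios from different tests are compared side by side.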

ASKAP

Test: tConvolve MPI - Degridding

ASKAP 1.0 - Mpix/sec, More Is Better
HBv3 Milan-X 120 Cores: 57881.4 (SE +/- 0.00, N = 3)
HBv3 Milan-X 64 Cores: 40896.3 (SE +/- 263.83, N = 3)
HBv3 120 Cores: 41724.7 (SE +/- 146.90, N = 3)
HBv3 64 Cores: 35988.0 (SE +/- 204.47, N = 3)
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

ASKAP

Test: tConvolve MPI - Gridding

ASKAP 1.0 - Mpix/sec, More Is Better
HBv3 Milan-X 120 Cores: 57042.5
HBv3 Milan-X 64 Cores: 41160.1
HBv3 120 Cores: 41287.0
HBv3 64 Cores: 38175.0
Standard errors shown for three of the four results: SE +/- 0.00, N = 3; SE +/- 143.87, N = 3; SE +/- 400.79, N = 3
1. (CXX) g++ options: -O3 -fstrict-aliasing -fopenmp

BRL-CAD

VGR Performance Metric

BRL-CAD 7.32.2 - VGR Performance Metric, More Is Better
HBv3 Milan-X 120 Cores: 1109486
HBv3 Milan-X 64 Cores: 655183
HBv3 120 Cores: 1044368
HBv3 64 Cores: 618492
1. (CXX) g++ options: -std=c++11 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -pedantic -pthread -ldl -lm

Embree

Binary: Pathtracer - Model: Crown

Embree 3.13 - Frames Per Second, More Is Better
HBv3 Milan-X 120 Cores: 75.94 (SE +/- 0.07, N = 3)
HBv3 Milan-X 64 Cores: 46.09 (SE +/- 0.15, N = 3)
HBv3 120 Cores: 66.50 (SE +/- 0.17, N = 3)
HBv3 64 Cores: 40.78 (SE +/- 0.05, N = 3)

Embree

Binary: Pathtracer ISPC - Model: Crown

Embree 3.13 - Frames Per Second, More Is Better
HBv3 Milan-X 120 Cores: 71.98 (SE +/- 0.18, N = 3)
HBv3 Milan-X 64 Cores: 44.25 (SE +/- 0.07, N = 3)
HBv3 120 Cores: 63.29 (SE +/- 0.16, N = 3)
HBv3 64 Cores: 38.91 (SE +/- 0.08, N = 3)

Embree

Binary: Pathtracer - Model: Asian Dragon

Embree 3.13 - Frames Per Second, More Is Better
HBv3 Milan-X 120 Cores: 78.77 (SE +/- 0.21, N = 3)
HBv3 Milan-X 64 Cores: 45.68 (SE +/- 0.29, N = 3)
HBv3 120 Cores: 64.37 (SE +/- 0.28, N = 3)
HBv3 64 Cores: 41.86 (SE +/- 0.23, N = 3)

Embree

Binary: Pathtracer ISPC - Model: Asian Dragon

Embree 3.13 - Frames Per Second, More Is Better
HBv3 Milan-X 120 Cores: 76.52 (SE +/- 0.12, N = 3)
HBv3 Milan-X 64 Cores: 45.66 (SE +/- 0.14, N = 3)
HBv3 120 Cores: 63.44 (SE +/- 0.14, N = 3)
HBv3 64 Cores: 42.00 (SE +/- 0.21, N = 3)

Facebook RocksDB

Test: Random Read

Facebook RocksDB 6.22.1 - Op/s, More Is Better
HBv3 Milan-X 120 Cores: 522387301 (SE +/- 1522212.52, N = 3)
HBv3 Milan-X 64 Cores: 330410911 (SE +/- 648314.38, N = 3)
HBv3 120 Cores: 502728808 (SE +/- 4680557.08, N = 7)
HBv3 64 Cores: 324822447 (SE +/- 95304.17, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -O2 -fno-rtti -lgflags

Facebook RocksDB

Test: Read Random Write Random

Facebook RocksDB 6.22.1 - Op/s, More Is Better
HBv3 Milan-X 120 Cores: 1684654 (SE +/- 6175.84, N = 3)
HBv3 Milan-X 64 Cores: 1381157 (SE +/- 6520.37, N = 3)
HBv3 120 Cores: 1587743 (SE +/- 5368.09, N = 3)
HBv3 64 Cores: 1357349 (SE +/- 10799.70, N = 3)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -O2 -fno-rtti -lgflags

GraphicsMagick

Operation: Noise-Gaussian

GraphicsMagick 1.3.33 - Iterations Per Minute, More Is Better
HBv3 Milan-X 120 Cores: 1123 (SE +/- 11.24, N = 15)
HBv3 Milan-X 64 Cores: 874 (SE +/- 4.48, N = 3)
HBv3 120 Cores: 721 (SE +/- 4.18, N = 3)
HBv3 64 Cores: 585 (SE +/- 6.24, N = 4)
1. (CC) gcc options: -fopenmp -O2 -pthread -ltiff -ljpeg -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

GROMACS 2021.2 - Ns Per Day, More Is Better
HBv3 Milan-X 120 Cores: 9.705 (SE +/- 0.061, N = 3)
HBv3 Milan-X 64 Cores: 7.977 (SE +/- 0.020, N = 3)
HBv3 120 Cores: 9.054 (SE +/- 0.051, N = 3)
HBv3 64 Cores: 7.476 (SE +/- 0.061, N = 3)
1. (CXX) g++ options: -O2 -pthread

High Performance Conjugate Gradient

High Performance Conjugate Gradient 3.1 - GFLOP/s, More Is Better
HBv3 Milan-X 120 Cores: 39.44 (SE +/- 0.03, N = 3)
HBv3 Milan-X 64 Cores: 41.13 (SE +/- 0.09, N = 3)
HBv3 120 Cores: 38.72 (SE +/- 0.04, N = 3)
HBv3 64 Cores: 40.02 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -ffast-math -ftree-vectorize -pthread -lmpi

HPC Challenge

Test / Class: G-HPL

HPC Challenge 1.5.0 - GFLOPS, More Is Better
HBv3 Milan-X 120 Cores: 139.04
HBv3 Milan-X 64 Cores: 175.03
HBv3 120 Cores: 89.36
HBv3 64 Cores: 99.57
1. (CC) gcc options: -lblas -lm -fexceptions -pthread -lmpi
2. ATLAS + Open MPI 4.0.5

HPC Challenge

Test / Class: Random Ring Bandwidth

HPC Challenge 1.5.0 - GB/s, More Is Better
HBv3 Milan-X 120 Cores: 3.41538
HBv3 Milan-X 64 Cores: 6.20242
HBv3 120 Cores: 0.76414
HBv3 64 Cores: 1.82992
1. (CC) gcc options: -lblas -lm -fexceptions -pthread -lmpi
2. ATLAS + Open MPI 4.0.5

HPC Challenge

Test / Class: Max Ping Pong Bandwidth

HPC Challenge 1.5.0 - MB/s, More Is Better
HBv3 Milan-X 120 Cores: 16082.15
HBv3 Milan-X 64 Cores: 18347.87
HBv3 120 Cores: 15815.90
HBv3 64 Cores: 17174.75
1. (CC) gcc options: -lblas -lm -fexceptions -pthread -lmpi
2. ATLAS + Open MPI 4.0.5

John The Ripper

Test: MD5

John The Ripper 1.9.0-jumbo-1 - Real C/S, More Is Better
HBv3 Milan-X 120 Cores: 8141400 (SE +/- 271831.96, N = 15)
HBv3 Milan-X 64 Cores: 5913000 (SE +/- 10969.66, N = 3)
HBv3 120 Cores: 7143267 (SE +/- 283586.65, N = 15)
HBv3 64 Cores: 5697467 (SE +/- 54210.51, N = 15)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt -lbz2

Kripke

Kripke 1.2.4 - Throughput FoM, More Is Better
HBv3 Milan-X 120 Cores: 93936541 (SE +/- 2167036.06, N = 15)
HBv3 Milan-X 64 Cores: 97373301 (SE +/- 2522711.91, N = 15)
HBv3 120 Cores: 88201142 (SE +/- 2974209.65, N = 15)
HBv3 64 Cores: 73635521 (SE +/- 1812363.94, N = 15)
1. (CXX) g++ options: -O2 -fopenmp

LAMMPS Molecular Dynamics Simulator

Model: 20k Atoms

LAMMPS Molecular Dynamics Simulator 29Oct2020 - ns/day, More Is Better
HBv3 Milan-X 120 Cores: 39.54 (SE +/- 0.01, N = 3)
HBv3 Milan-X 64 Cores: 32.37 (SE +/- 0.03, N = 3)
HBv3 120 Cores: 36.88 (SE +/- 0.22, N = 3)
HBv3 64 Cores: 31.61 (SE +/- 0.12, N = 3)
1. (CXX) g++ options: -O2 -pthread -lm

LAMMPS Molecular Dynamics Simulator

Model: Rhodopsin Protein

LAMMPS Molecular Dynamics Simulator 29Oct2020 - ns/day, More Is Better
HBv3 Milan-X 120 Cores: 38.47 (SE +/- 0.29, N = 3)
HBv3 Milan-X 64 Cores: 33.96 (SE +/- 0.11, N = 3)
HBv3 120 Cores: 35.41 (SE +/- 0.11, N = 3)
HBv3 64 Cores: 32.96 (SE +/- 0.24, N = 3)
1. (CXX) g++ options: -O2 -pthread -lm

LULESH

LULESH 2.0.3 - z/s, More Is Better
HBv3 Milan-X 120 Cores: 47341.13 (SE +/- 209.57, N = 3)
HBv3 Milan-X 64 Cores: 54759.69 (SE +/- 258.47, N = 3)
HBv3 120 Cores: 40262.21 (SE +/- 286.97, N = 3)
HBv3 64 Cores: 44262.23 (SE +/- 36.72, N = 3)
1. (CXX) g++ options: -O3 -fopenmp -lm -pthread -lmpi

NAMD

ATPase Simulation - 327,506 Atoms

NAMD 2.14 - days/ns, Fewer Is Better
HBv3 Milan-X 120 Cores: 0.26900 (SE +/- 0.00007, N = 3)
HBv3 Milan-X 64 Cores: 0.40802 (SE +/- 0.00005, N = 3)
HBv3 120 Cores: 0.27619 (SE +/- 0.00012, N = 3)
HBv3 64 Cores: 0.41157 (SE +/- 0.00048, N = 3)

NAS Parallel Benchmarks

Test / Class: CG.C

NAS Parallel Benchmarks 3.4 - Total Mop/s, More Is Better
HBv3 Milan-X 120 Cores: 21914.51 (SE +/- 70.02, N = 3)
HBv3 Milan-X 64 Cores: 22323.23 (SE +/- 34.77, N = 3)
HBv3 120 Cores: 20926.52 (SE +/- 26.14, N = 3)
HBv3 64 Cores: 20940.70 (SE +/- 51.72, N = 3)
1. (F9X) gfortran options: -O3 -march=native -fexceptions -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

NWChem

Input: C240 Buckyball

NWChem 7.0.2 - Seconds, Fewer Is Better
HBv3 Milan-X 120 Cores: 2467.9
HBv3 Milan-X 64 Cores: 2219.8
HBv3 120 Cores: 2557.1
HBv3 64 Cores: 2256.6
1. (F9X) gfortran options: -lnwctask -lccsd -lmcscf -lselci -lmp2 -lmoints -lstepper -ldriver -loptim -lnwdft -lgradients -lcphf -lesp -lddscf -ldangchang -lguess -lhessian -lvib -lnwcutil -lrimp2 -lproperty -lsolvation -lnwints -lprepar -lnwmd -lnwpw -lofpw -lpaw -lpspw -lband -lnwpwlib -lcafe -lspace -lanalyze -lqhop -lpfft -ldplot -ldrdy -lvscf -lqmmm -lqmd -letrans -ltce -lbq -lmm -lcons -lperfm -ldntmc -lccca -ldimqm -lga -larmci -lpeigs -l64to32 -lopenblas -lpthread -lrt -llapack -lnwcblas -lmpi_usempif08 -lmpi_mpifh -lmpi -lcomex -lm -m64 -ffast-math -std=legacy -fdefault-integer-8 -finline-functions -O2

ONNX Runtime

Model: super-resolution-10 - Device: CPU

ONNX Runtime 1.9.1 - Inferences Per Minute, More Is Better
HBv3 Milan-X 120 Cores: 6485 (SE +/- 117.46, N = 9)
HBv3 Milan-X 64 Cores: 6354 (SE +/- 100.53, N = 9)
HBv3 120 Cores: 5852 (SE +/- 56.15, N = 3)
HBv3 64 Cores: 6107 (SE +/- 62.83, N = 3)
1. (CXX) g++ options: -ffunction-sections -fdata-sections -march=native -mtune=native -O2 -flto -fno-fat-lto-objects -ldl -lrt -pthread -lpthread

OpenFOAM

Input: Motorbike 60M

OpenFOAM 8 - Seconds, Fewer Is Better
HBv3 Milan-X 120 Cores: 54.03 (SE +/- 0.22, N = 3)
HBv3 Milan-X 64 Cores: 65.50 (SE +/- 0.15, N = 3)
HBv3 120 Cores: 80.60 (SE +/- 0.05, N = 3)
HBv3 64 Cores: 89.65 (SE +/- 0.13, N = 3)
1. (CXX) g++ options: -std=c++11 -m64 -O3 -ftemplate-depth-100 -fPIC -fuse-ld=bfd -Xlinker --add-needed --no-as-needed -lfoamToVTK -ldynamicMesh -llagrangian -lgenericPatchFields -lfileFormats -lOpenFOAM -ldl -lm

OpenVKL

Benchmark: vklBenchmark ISPC

OpenVKL 1.0 - Items / Sec, More Is Better
HBv3 Milan-X 120 Cores: 177
HBv3 Milan-X 64 Cores: 126
HBv3 120 Cores: 166
HBv3 64 Cores: 120
Standard errors shown for two of the four results: SE +/- 1.75, N = 5; SE +/- 0.88, N = 3

OpenVKL

Benchmark: vklBenchmark Scalar

OpenVKL 1.0 - Items / Sec, More Is Better
HBv3 Milan-X 120 Cores: 111 (SE +/- 1.11, N = 6)
HBv3 Milan-X 64 Cores: 74 (SE +/- 0.67, N = 3)
HBv3 120 Cores: 106 (SE +/- 1.21, N = 9)
HBv3 64 Cores: 72 (SE +/- 0.88, N = 3)

OSPray

Demo: San Miguel - Renderer: SciVis

OSPray 1.8.5 - FPS, More Is Better
HBv3 Milan-X 120 Cores: 85.86 (SE +/- 0.95, N = 15)
HBv3 Milan-X 64 Cores: 55.56 (SE +/- 0.00, N = 3)
HBv3 120 Cores: 83.33 (SE +/- 0.00, N = 3)
HBv3 64 Cores: 52.63 (SE +/- 0.00, N = 3)

OSPray

Demo: XFrog Forest - Renderer: SciVis

OSPray 1.8.5 - FPS, More Is Better
HBv3 Milan-X 120 Cores: 18.41 (SE +/- 0.11, N = 3)
HBv3 Milan-X 64 Cores: 11.36 (SE +/- 0.00, N = 3)
HBv3 120 Cores: 16.95 (SE +/- 0.00, N = 3)
HBv3 64 Cores: 10.75 (SE +/- 0.00, N = 3)

OSPray

Demo: San Miguel - Renderer: Path Tracer

OSPray 1.8.5 - FPS, More Is Better
HBv3 Milan-X 120 Cores: 8.50 (SE +/- 0.02, N = 3)
HBv3 Milan-X 64 Cores: 4.93 (SE +/- 0.02, N = 3)
HBv3 120 Cores: 7.39 (SE +/- 0.02, N = 3)
HBv3 64 Cores: 4.32 (SE +/- 0.01, N = 3)

OSPray

Demo: NASA Streamlines - Renderer: SciVis

OSPray 1.8.5 - FPS, More Is Better
HBv3 Milan-X 120 Cores: 125.00
HBv3 Milan-X 64 Cores: 83.33
HBv3 120 Cores: 111.11
HBv3 64 Cores: 71.43
Standard errors shown for three of the four results: SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3

OSPray

Demo: XFrog Forest - Renderer: Path Tracer

OSPray 1.8.5 - FPS, More Is Better
HBv3 Milan-X 120 Cores: 9.90 (SE +/- 0.00, N = 3)
HBv3 Milan-X 64 Cores: 6.20 (SE +/- 0.01, N = 3)
HBv3 120 Cores: 9.09 (SE +/- 0.00, N = 3)
HBv3 64 Cores: 5.70 (SE +/- 0.01, N = 3)

OSPray

Demo: Magnetic Reconnection - Renderer: SciVis

OSPray 1.8.5 - FPS, More Is Better
HBv3 Milan-X 120 Cores: 66.67
HBv3 Milan-X 64 Cores: 40.00
HBv3 120 Cores: 62.50
HBv3 64 Cores: 38.46
Standard errors shown for three of the four results: SE +/- 0.00, N = 3; SE +/- 0.00, N = 3; SE +/- 0.00, N = 3

OSPray

Demo: NASA Streamlines - Renderer: Path Tracer

OSPray 1.8.5 - FPS, More Is Better
HBv3 Milan-X 120 Cores: 27.78 (SE +/- 0.00, N = 3)
HBv3 Milan-X 64 Cores: 17.24 (SE +/- 0.00, N = 3)
HBv3 120 Cores: 24.59 (SE +/- 0.20, N = 3)
HBv3 64 Cores: 15.87 (SE +/- 0.00, N = 3)

Parboil

Test: OpenMP CUTCP

Parboil 2.5 - Seconds, Fewer Is Better
HBv3 Milan-X 120 Cores: 0.847720 (SE +/- 0.022647, N = 12)
HBv3 Milan-X 64 Cores: 1.127166 (SE +/- 0.011448, N = 6)
HBv3 120 Cores: 0.976470 (SE +/- 0.006046, N = 3)
HBv3 64 Cores: 1.515548 (SE +/- 0.014450, N = 3)
1. (CXX) g++ options: -lm -lpthread -lgomp -O3 -ffast-math -fopenmp

RELION

Test: Basic - Device: CPU

RELION 3.1.1 - Seconds, Fewer Is Better
HBv3 Milan-X 120 Cores: 274.35 (SE +/- 1.22, N = 3)
HBv3 Milan-X 64 Cores: 414.54 (SE +/- 0.68, N = 3)
HBv3 120 Cores: 312.80 (SE +/- 1.58, N = 3)
HBv3 64 Cores: 418.48 (SE +/- 1.03, N = 3)
1. (CXX) g++ options: -fopenmp -std=c++0x -O2 -rdynamic -ldl -ltiff -lfftw3f -lfftw3 -lpng -fexceptions -pthread -lmpi_cxx -lmpi

Timed Linux Kernel Compilation

Time To Compile

Timed Linux Kernel Compilation 5.14 - Seconds, Fewer Is Better
HBv3 Milan-X 120 Cores: 18.56 (SE +/- 0.12, N = 13)
HBv3 Milan-X 64 Cores: 23.91 (SE +/- 0.21, N = 8)
HBv3 120 Cores: 19.07 (SE +/- 0.16, N = 8)
HBv3 64 Cores: 24.16 (SE +/- 0.22, N = 7)

Timed Node.js Compilation

Time To Compile

Timed Node.js Compilation 15.11 - Seconds, Fewer Is Better
HBv3 Milan-X 120 Cores: 72.45 (SE +/- 0.32, N = 3)
HBv3 Milan-X 64 Cores: 93.80 (SE +/- 0.31, N = 3)
HBv3 120 Cores: 75.71 (SE +/- 0.28, N = 3)
HBv3 64 Cores: 96.35 (SE +/- 0.27, N = 3)

WRF

Input: conus 2.5km

WRF 4.2.2 - Seconds, Fewer Is Better
HBv3 Milan-X 120 Cores: 7804.46
HBv3 Milan-X 64 Cores: 9294.70
HBv3 120 Cores: 8766.54
HBv3 64 Cores: 10150.07
1. (F9X) gfortran options: -O2 -ftree-vectorize -funroll-loops -ffree-form -fconvert=big-endian -frecord-marker=4 -lesmf_time -lwrfio_nf -lnetcdff -lnetcdf -fexceptions -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

Xcompact3d Incompact3d

Input: X3D-benchmarking input.i3d

Xcompact3d Incompact3d 2021-03-11 - Seconds, Fewer Is Better
HBv3 Milan-X 120 Cores: 255.84 (SE +/- 0.52, N = 3)
HBv3 Milan-X 64 Cores: 322.88 (SE +/- 0.69, N = 3)
HBv3 120 Cores: 287.76 (SE +/- 0.24, N = 3)
HBv3 64 Cores: 348.11 (SE +/- 0.83, N = 3)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

Xcompact3d Incompact3d

Input: input.i3d 193 Cells Per Direction

Xcompact3d Incompact3d 2021-03-11 - Seconds, Fewer Is Better
HBv3 Milan-X 120 Cores: 9.93829823 (SE +/- 0.02331367, N = 3)
HBv3 Milan-X 64 Cores: 10.87805050 (SE +/- 0.04593850, N = 3)
HBv3 120 Cores: 12.28599450 (SE +/- 0.02189533, N = 3)
HBv3 64 Cores: 13.67810920 (SE +/- 0.43535891, N = 15)
1. (F9X) gfortran options: -cpp -O2 -funroll-loops -floop-optimize -fcray-pointer -fbacktrace -pthread -lmpi_usempif08 -lmpi_mpifh -lmpi

Zstd Compression

Compression Level: 19 - Compression Speed

Zstd Compression 1.5.0 - MB/s, More Is Better
HBv3 Milan-X 120 Cores: 93.8 (SE +/- 1.33, N = 3)
HBv3 Milan-X 64 Cores: 106.2 (SE +/- 1.05, N = 15)
HBv3 120 Cores: 82.0 (SE +/- 0.83, N = 6)
HBv3 64 Cores: 85.1 (SE +/- 0.93, N = 15)
1. (CC) gcc options: -O3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.0 - MB/s, More Is Better
HBv3 Milan-X 120 Cores: 52.8 (SE +/- 0.50, N = 15)
HBv3 Milan-X 64 Cores: 59.8 (SE +/- 0.64, N = 3)
HBv3 120 Cores: 36.6 (SE +/- 0.29, N = 3)
HBv3 64 Cores: 39.8 (SE +/- 0.34, N = 15)
1. (CC) gcc options: -O3 -pthread -lz -llzma


Phoronix Test Suite v10.8.5