nekRS

nekRS is an open-source Navier-Stokes solver based on the spectral element method. It supports both CPU and GPU/accelerator execution, though this test profile is currently configured for CPU execution. nekRS is part of the Nek5000 project from the Mathematics and Computer Science (MCS) division at Argonne National Laboratory. This benchmark is primarily relevant to HPC servers with large core counts and may be very time-consuming on smaller systems.

To run this test with the Phoronix Test Suite, the basic command is: phoronix-test-suite benchmark nekrs.
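For scripted runs, the command above can be wrapped in a small helper. This is only a sketch: the function names `build_command` and `run_benchmark` are hypothetical, and it assumes the `phoronix-test-suite` executable is on the PATH. Nothing is executed until `run_benchmark` is called explicitly.

```python
import shutil
import subprocess
from typing import List

# Name of the Phoronix Test Suite executable, as used in the command above.
PTS_BINARY = "phoronix-test-suite"


def build_command(test_profile: str = "nekrs") -> List[str]:
    """Return the argv list for `phoronix-test-suite benchmark <profile>`."""
    return [PTS_BINARY, "benchmark", test_profile]


def run_benchmark(test_profile: str = "nekrs") -> int:
    """Run the benchmark and return the process exit code.

    Raises FileNotFoundError if the executable is not on the PATH.
    """
    if shutil.which(PTS_BINARY) is None:
        raise FileNotFoundError(f"{PTS_BINARY} was not found on the PATH")
    # check=False: hand the exit code back to the caller instead of raising.
    return subprocess.run(build_command(test_profile), check=False).returncode
```

Calling `run_benchmark()` starts the interactive benchmark run, which can take a long time on smaller systems.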

Project Site: nek5000.mcs.anl.gov
Source Repository: github.com
Test Created: 8 November 2022
Last Updated: 4 June 2023
Test Maintainer: Michael Larabel
Test Type: Processor
Average Install Time: 1 Minute, 35 Seconds
Average Run Time: 18 Minutes, 40 Seconds
Test Dependencies: OpenMPI + CMake + C/C++ Compiler Toolchain
Accolades: 20k+ Downloads

Supported Platforms

* Uploading of benchmark result data to OpenBenchmarking.org is always optional (opt-in) via the Phoronix Test Suite for users wishing to share their results publicly.
** Data based on those opting to upload their test results to OpenBenchmarking.org and users enabling the opt-in anonymous statistics reporting while running benchmarks from an Internet-connected platform.
Data updated weekly as of 29 December 2024.

Revision History

pts/nekrs-1.1.0 - Sun, 04 Jun 2023 20:11:43 GMT
Update against nekRS 23 upstream.

pts/nekrs-1.0.0 - Tue, 08 Nov 2022 18:57:06 GMT
Add nekRS scalable CFD benchmark.

Suites Using This Test

HPC - High Performance Computing

Performance Metrics

Test Configuration: nekRS 23.0 - Input: Kershaw

OpenBenchmarking.org metrics for this test profile configuration based on 245 public results since 4 June 2023 with the latest data as of 30 October 2023.

Below is an overview of the generalized performance of components for which there is sufficient statistically significant data from user-uploaded results. Keep in mind that OS configurations can differ vastly, particularly in the Linux/open-source space, so this overview is intended only as general guidance on performance expectations.

Component | Percentile Rank | # Compatible Public Results | flops/rank (Average)
2 x AMD EPYC 9554 64-Core | 100th | | 12800353333 +/- 427629288
2 x AMD EPYC 9124 16-Core | 98th | | 11455766667 +/- 123865115
2 x AMD EPYC 9254 24-Core | 97th | | 11057966667 +/- 211097355
AMD EPYC 9124 16-Core | 95th | | 10237225000 +/- 187983110
AMD EPYC 9334 32-Core | 93rd | | 9192422500 +/- 75560543
2 x AMD EPYC 9654 96-Core | 91st | | 8863695000 +/- 233702674
2 x Intel Xeon Gold 6442Y | 90th | | 8562610000
Intel Xeon Gold 6444Y | 88th | | 8285166667 +/- 70519766
2 x AMD EPYC 9684X 96-Core | 87th | | 8222346667 +/- 465390635
2 x AMD EPYC 9334 32-Core | 86th | | 8124566667 +/- 26331829
2 x Intel Xeon Platinum 8468 | 83rd | | 7491736667 +/- 43910284
Intel Core i7-1185G7 | 81st | | 7277980000 +/- 9711032
2 x Intel Xeon Max 9468 | 80th | | 6930664444 +/- 330298088
AMD Ryzen 7 7800X3D 8-Core | 79th | | 6751504167 +/- 523048334
2 x Intel Xeon Platinum 8490H | 78th | | 6724270000 +/- 242487
Mid-Tier | 75th | | < 6603000000
AMD EPYC 7F32 8-Core | 72nd | | 6387210000
2 x Intel Xeon Max 9480 | 70th | | 6304686706 +/- 325892703
Intel Xeon Platinum 8462Y | 68th | | 6084607500 +/- 80874964
AMD EPYC 9754 128-Core | 62nd | | 5739568889 +/- 42154394
Intel Xeon Platinum 8468 | 60th | | 5710816667 +/- 37396529
AMD Ryzen 5 7600X 6-Core | 59th | | 5642100000 +/- 108865336
AMD Ryzen 9 7900X3D 12-Core | 56th | | 4742400000 +/- 71948835
AMD Ryzen 7 5800X3D 8-Core | 54th | | 4653558889 +/- 180401958
Median | 50th | | 4495660000
AMD Ryzen 7 7700X 8-Core | 50th | | 4473080000 +/- 24735151
AMD Ryzen 7 7700 8-Core | 48th | | 4442480000 +/- 74928446
2 x Intel Xeon Platinum 8380 | 44th | | 4267857500 +/- 52075869
AMD Ryzen 9 7900X 12-Core | 40th | | 3634780303 +/- 80515089
AMD Ryzen 9 7950X3D 16-Core | 38th | | 3624445000 +/- 9897347
AMD Ryzen 5 5500U | 32nd | | 3289403333 +/- 22254508
Intel Core i5-12600K | 29th | | 3261400000 +/- 646632
Intel Xeon Silver 4216 | 26th | | 3122730000
Low-Tier | 25th | | < 3122730000
AMD Ryzen 7 5800X 8-Core | 24th | | 2961128889 +/- 7216717
2 x AMD EPYC 7773X 64-Core | 23rd | | 2904119524 +/- 108016945
AMD Ryzen 9 7950X 16-Core | 23rd | | 2889336667 +/- 233354810
AMD Ryzen 5 4500U | 19th | | 2667765000 +/- 548483
AMD Ryzen Threadripper 3970X 32-Core | 16th | | 2116343334 +/- 7740343
Intel Core i7-1280P | 15th | | 2103476667 +/- 1835657
Intel Core i9-13900K | 13th | | 2055712500 +/- 2086777
AMD Ryzen 9 5900HX | 11th | | 2032139167 +/- 2927330
AMD Ryzen 9 5950X 16-Core | 9th | | 1949529167 +/- 17194802
AMD Ryzen 7 4700U | 8th | | 1931180000 +/- 7111336
AMD Ryzen 9 3900XT 12-Core | 7th | | 1774375000 +/- 3423687
AMD Ryzen Threadripper 3990X 64-Core | 4th | | 1291546250 +/- 1761867
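Most flops/rank entries above pair an average with a "+/-" figure, while a few list only the average. The sketch below parses one such entry and expresses the "+/-" figure as a fraction of the average, which makes rows easier to compare. The names `Measurement` and `parse_entry` are hypothetical, and treating the "+/-" figure as a spread around the average is an assumption about how the page reports it.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Matches "12800353333", "12800353333 +/- 427629288",
# or the superscript form "12800353333 ^{+/- 427629288}".
_ENTRY = re.compile(r"^\s*(\d+)(?:\s*(?:\^\{)?\s*\+/-\s*(\d+)\s*\}?)?\s*$")


@dataclass(frozen=True)
class Measurement:
    average: float            # flops/rank average as listed
    spread: Optional[float]   # the "+/-" figure, if the row has one

    @property
    def relative_spread(self) -> Optional[float]:
        """The "+/-" figure as a fraction of the average, or None if absent."""
        if self.spread is None or self.average == 0:
            return None
        return self.spread / self.average


def parse_entry(text: str) -> Optional[Measurement]:
    """Parse one flops/rank cell; return None for rows such as "< 6603000000"."""
    match = _ENTRY.match(text)
    if match is None:
        return None
    spread = float(match.group(2)) if match.group(2) else None
    return Measurement(average=float(match.group(1)), spread=spread)
```

For the top row, `parse_entry("12800353333 +/- 427629288").relative_spread` comes out at roughly 0.033, that is, a spread of about 3.3% of the average.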

Detailed Performance Overview

Based on OpenBenchmarking.org data, the selected test / test configuration (nekRS 23.0 - Input: Kershaw) has an average run-time of 12 minutes. By default this test profile is set to run at least 3 times but may increase if the standard deviation exceeds pre-defined defaults or other calculations deem additional runs necessary for greater statistical accuracy of the result.
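The run-count behavior described above can be illustrated with a small sketch: run at least three times, then keep adding runs while the relative standard deviation stays above a limit. This is not the Phoronix Test Suite's actual logic. The function names, the 3.5% limit, and the cap of six runs are all assumptions chosen for illustration.

```python
import statistics
from typing import Callable, List


def relative_stdev(samples: List[float]) -> float:
    """Sample standard deviation divided by the mean."""
    mean = statistics.mean(samples)
    if mean == 0:
        return float("inf")
    return statistics.stdev(samples) / mean


def run_until_stable(
    run_once: Callable[[], float],
    min_runs: int = 3,             # "at least 3 times", per the text above
    max_runs: int = 6,             # hypothetical cap, not from the source
    max_rel_stdev: float = 0.035,  # hypothetical limit, not from the source
) -> List[float]:
    """Collect results, adding runs while the spread exceeds the limit."""
    if min_runs < 2:
        raise ValueError("need at least two runs to compute a deviation")
    results = [run_once() for _ in range(min_runs)]
    while len(results) < max_runs and relative_stdev(results) > max_rel_stdev:
        results.append(run_once())
    return results
```

With consistent results the loop stops at the minimum of three runs. With noisy results it keeps going until the spread settles or the cap is reached.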

Based on public OpenBenchmarking.org results, the selected test / test configuration has an average standard deviation of 0.1%.

Does It Scale Well With Increasing Cores?

Yes. Based on the automated analysis of the collected public benchmark data, this test configuration generally scales well with increasing CPU core counts. The analysis uses publicly available results for this configuration: results are separated by vendor, divided by the reference CPU clock speed, grouped by matching physical CPU core count, and normalized against the smallest core count tested from each vendor. Only CPUs with a sufficient number of test samples and statistically significant data are included.
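The normalization steps described above can be sketched as follows. This is an illustrative reading of the description, not the OpenBenchmarking.org implementation. The record fields (`vendor`, `cores`, `clock_ghz`, `result`) and the function name `scaling_curve` are hypothetical, and the sample-size and significance filtering is left out.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Mapping, Tuple


def scaling_curve(records: Iterable[Mapping]) -> Dict[str, Dict[int, float]]:
    """Return, per vendor, clock-normalized performance relative to the
    smallest core count tested for that vendor."""
    # 1. Divide each result by the reference clock speed and
    # 2. group by (vendor, physical core count).
    grouped: Dict[Tuple[str, int], List[float]] = defaultdict(list)
    for rec in records:
        key = (rec["vendor"], rec["cores"])
        grouped[key].append(rec["result"] / rec["clock_ghz"])

    # 3. Average each group, keeping vendors separate.
    by_vendor: Dict[str, Dict[int, float]] = defaultdict(dict)
    for (vendor, cores), values in grouped.items():
        by_vendor[vendor][cores] = mean(values)

    # 4. Normalize against the smallest core count seen for each vendor.
    curves: Dict[str, Dict[int, float]] = {}
    for vendor, per_core in by_vendor.items():
        baseline = per_core[min(per_core)]
        curves[vendor] = {c: v / baseline for c, v in sorted(per_core.items())}
    return curves
```

A value of 2.0 at double the baseline core count would indicate ideal scaling for that vendor. Values well below that suggest the workload is limited by something other than core count.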

Notable Instruction Set Usage

Notable instruction set extensions supported by this test, based on an automatic analysis by the Phoronix Test Suite / OpenBenchmarking.org analytics engine.

Instruction Set: Advanced Vector Extensions (AVX)
Support: Used by default on supported hardware. Found on Intel processors since Sandy Bridge (2011). Found on AMD processors since Bulldozer (2011).
Instructions Detected: VZEROUPPER VBROADCASTSD VINSERTF128 VEXTRACTF128 VBROADCASTSS VPERMILPD VPERMILPS

Instruction Set: Advanced Vector Extensions 2 (AVX2)
Support: Used by default on supported hardware. Found on Intel processors since Haswell (2013). Found on AMD processors since Excavator (2016).
Instructions Detected: VPBROADCASTD VPBROADCASTW VPERMQ VPBROADCASTQ VEXTRACTI128 VINSERTI128 VPBROADCASTB VGATHERDPD VPERM2I128 VPGATHERDQ VPGATHERDD VGATHERDPS VPSRAVD VPERMPD VPGATHERQQ VPERMD VPGATHERQD

Instruction Set: FMA (FMA)
Support: Used by default on supported hardware. Found on Intel processors since Haswell (2013). Found on AMD processors since Bulldozer (2011).
Instructions Detected: VFMADD132SD VFNMADD132SD VFMADD213SD VFMADD231SD VFMSUB231SD VFMADD132PD VFNMADD231SD VFMSUB132SD VFMADD231SS VFMADD231PS VFNMADD132PD VFMADD231PD VFMADD213PD VFNMADD231PD VFNMADD213SD VFMSUB132PD VFNMADD213PD VFNMSUB132SD VFMSUB231PD VFNMADD213PS VFNMADD132PS VFNMADD213SS VFNMADD132SS VFMSUB132SS VFMADD213PS VFMADD132PS VFMADD213SS VFMADD132SS VFNMADD231SS VFMSUB231SS VFNMSUB132SS

Instruction Set: Advanced Vector Extensions 512 (AVX512)
Support: Requires passing a supported compiler/build flag (verified with targets: cascadelake, sapphirerapids).
Instructions Detected: (ZMM REGISTER USE)

The test / benchmark does honor compiler flag changes.

Last automated analysis: 24 June 2023
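To see which of these extensions a given x86 Linux machine offers, the CPU feature flags can be read from /proc/cpuinfo. The sketch below is a minimal example under stated assumptions: the helper names are hypothetical, the flag names `avx`, `avx2`, `fma`, and `avx512f` are the usual Linux x86 flags, and `avx512f` (the AVX-512 foundation flag) stands in for AVX-512 as a whole. It does not apply to aarch64, where the file uses a different field.

```python
from typing import Dict, Set

# Extension name (as listed above) -> Linux x86 cpuinfo flag.
# Using avx512f as the AVX-512 indicator is an assumption.
EXTENSION_FLAGS = {
    "AVX": "avx",
    "AVX2": "avx2",
    "FMA": "fma",
    "AVX512": "avx512f",
}


def parse_cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the flag set from the first "flags" line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def supported_extensions(cpuinfo_text: str) -> Dict[str, bool]:
    """Map each extension name to whether its flag is present."""
    flags = parse_cpu_flags(cpuinfo_text)
    return {name: flag in flags for name, flag in EXTENSION_FLAGS.items()}
```

On a Linux x86 system the input would typically come from `open("/proc/cpuinfo").read()`.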

This test profile binary relies on the shared libraries libnekrs.so, libmpi.so.40, libm.so.6, libc.so.6, libnekrs-hypre.so, libnekrs-hypre-device.so, libocca.so, libgomp.so.1, libgfortran.so.5, libopen-pal.so.40, libopen-rte.so.40, libhwloc.so.15, libquadmath.so.0, libz.so.1, libudev.so.1.

Tested CPU Architectures

This benchmark has been successfully tested on the architectures listed below. The listed CPU architectures are those for which successful OpenBenchmarking.org result uploads occurred, which helps determine whether a given test is compatible with various alternative CPU architectures.

CPU Architecture | Kernel Identifier | Verified On
Intel / AMD x86 64-bit | x86_64 | (Many Processors)
ARMv8 64-bit | aarch64 | ARMv8 Neoverse-N1, ARMv8 Neoverse-V1

Recent Test Results

2024-10-08-0736

1 System - 461 Benchmark Results

AMD Ryzen 7 5800X 8-Core - GIGABYTE MC12-LE0-00 v01000100 - AMD Starship

Ubuntu 24.04 - 6.11.0-061100-generic - GNOME Shell 46.0