satty

Intel Xeon Platinum 8490H testing with a Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2307291-NE-SATTY133636

Run Management

| Result Identifier | Date Run | Test Duration |
| --- | --- | --- |
| a | July 29 2023 | 58 Minutes |
| b | July 29 2023 | 1 Hour, 19 Minutes |
| Average | | 1 Hour, 8 Minutes |


Satty Benchmarks

Processor: Intel Xeon Platinum 8490H @ 3.50GHz (60 Cores / 120 Threads)
Motherboard: Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS)
Chipset: Intel Device 1bce
Memory: 512GB
Disk: 3 x 3841GB Micron_9300_MTFDHAL3T8TDP
Graphics: ASPEED
Network: 4 x Intel E810-C for QSFP
OS: Ubuntu 22.04
Kernel: 5.15.0-47-generic (x86_64)
Desktop: GNOME Shell 42.4
Display Server: X Server 1.21.1.3
Vulkan: 1.2.204
Compiler: GCC 11.2.0
File-System: ext4
Screen Resolutions: 1024x768, 1600x1200

System Logs

- Transparent Huge Pages: madvise
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: intel_pstate performance (EPP: performance)
- CPU Microcode: 0x2b0000c0
- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

a vs. b Comparison (Phoronix Test Suite)

Largest differences, with a ahead of b in each case:

- libxsmm, M N K: 256: 11.3%
- HeFFTe, r2c - Stock - float - 256: 4.5%
- HeFFTe, r2c - FFTW - float - 256: 3.8%
- libxsmm, M N K: 128: 3%
- HeFFTe, r2c - Stock - double - 128: 2.1%
- HeFFTe, c2c - Stock - double - 128: 2.1%
- libxsmm, M N K: 32: 2%
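The percentages in the comparison can be reproduced from the raw results. The sketch below is not part of the Phoronix Test Suite; `percent_lead` is a hypothetical helper, and it assumes the difference is measured relative to the lower of the two results, which matches the figures shown for this result file.

```python
# Values for runs a and b, copied from the result table in this file.
results = {
    "libxsmm: 256": (883.5, 793.5),
    "heffte: r2c - Stock - float - 256": (263.233, 251.978),
    "heffte: r2c - FFTW - float - 256": (255.351, 246.109),
    "libxsmm: 128": (1741.8, 1691.1),
}


def percent_lead(a, b):
    """Percentage by which the higher result exceeds the lower one.

    Assumption: the lower result is the baseline of the comparison.
    """
    hi, lo = max(a, b), min(a, b)
    return (hi / lo - 1.0) * 100.0


for name, (a, b) in results.items():
    winner = "a" if a > b else "b"
    print(f"{name}: {winner} leads by {percent_lead(a, b):.1f}%")
```

All metrics in this file are "More Is Better", so the run with the higher value is the one reported as leading.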

satty

Higher is better for every test. Units: libxsmm in GFLOPS/s, palabos in Mega Site Updates Per Second, heffte in GFLOP/s.

| Test | a | b |
| --- | --- | --- |
| libxsmm: 256 | 883.5 | 793.5 |
| libxsmm: 128 | 1741.8 | 1691.1 |
| palabos: 100 | 314.532 | 308.948 |
| palabos: 400 | 333.855 | 334.051 |
| palabos: 1000 | 391.950 | 389.107 |
| palabos: 500 | 346.116 | 346.199 |
| heffte: c2c - Stock - double - 512 | 52.9532 | 52.7121 |
| heffte: c2c - FFTW - double - 512 | 53.1236 | 52.9945 |
| heffte: r2c - FFTW - double - 512 | 95.0895 | 94.4564 |
| heffte: c2c - Stock - float - 512 | 98.9034 | 98.1632 |
| heffte: c2c - FFTW - float - 512 | 99.5391 | 98.7880 |
| heffte: r2c - Stock - double - 512 | 101.543 | 100.722 |
| libxsmm: 64 | 978.8 | 960.3 |
| libxsmm: 32 | 496.0 | 486.4 |
| heffte: c2c - FFTW - float - 256 | 110.359 | 108.379 |
| heffte: r2c - FFTW - float - 512 | 177.478 | 174.661 |
| heffte: r2c - Stock - float - 512 | 183.404 | 180.793 |
| heffte: r2c - Stock - double - 256 | 105.980 | 106.150 |
| heffte: r2c - FFTW - float - 256 | 255.351 | 246.109 |
| heffte: c2c - Stock - double - 256 | 49.2931 | 49.3235 |
| heffte: c2c - FFTW - double - 256 | 49.0032 | 48.8145 |
| heffte: c2c - Stock - float - 256 | 109.373 | 108.231 |
| heffte: r2c - FFTW - double - 256 | 97.2883 | 96.0922 |
| heffte: r2c - Stock - float - 256 | 263.233 | 251.978 |
| heffte: c2c - Stock - double - 128 | 93.9306 | 92.0352 |
| heffte: c2c - FFTW - double - 128 | 132.727 | 130.656 |
| heffte: c2c - Stock - float - 128 | 147.840 | 148.326 |
| heffte: r2c - Stock - double - 128 | 164.544 | 161.209 |
| heffte: c2c - FFTW - float - 128 | 232.000 | 233.509 |
| heffte: r2c - FFTW - double - 128 | 232.789 | 230.649 |
| heffte: r2c - Stock - float - 128 | 232.159 | 228.742 |
| heffte: r2c - FFTW - float - 128 | 336.173 | 337.495 |

libxsmm

Libxsmm is an open-source library for specialized dense and sparse matrix operations and deep learning primitives. Libxsmm supports making use of Intel AMX, AVX-512, and other modern CPU instruction set capabilities. Learn more via the OpenBenchmarking.org test page.

libxsmm 2-1.17-3645 (GFLOPS/s, More Is Better)

| M N K | a | b |
| --- | --- | --- |
| 256 | 883.5 (SE +/- 1.79, N = 3) | 793.5 (SE +/- 7.87, N = 12) |
| 128 | 1741.8 (SE +/- 4.61, N = 3) | 1691.1 (SE +/- 6.75, N = 3) |

1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2
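Each result is reported with a standard error (SE) over N runs, which gives a rough way to judge whether a gap between a and b is larger than run-to-run noise. The following is a minimal sketch, not a Phoronix Test Suite feature: `gap_in_standard_errors` is a hypothetical helper, and it assumes the SE figures are listed in the same order as the run labels in each chart.

```python
import math

# (mean GFLOPS/s, standard error) per run, copied from the libxsmm results above.
libxsmm = {
    "M N K: 256": {"a": (883.5, 1.79), "b": (793.5, 7.87)},
    "M N K: 128": {"a": (1741.8, 4.61), "b": (1691.1, 6.75)},
}


def gap_in_standard_errors(run_a, run_b):
    """Difference of the two means divided by their combined standard error."""
    (mean_a, se_a), (mean_b, se_b) = run_a, run_b
    # Combined SE of a difference of independent means: sqrt(se_a^2 + se_b^2).
    return abs(mean_a - mean_b) / math.hypot(se_a, se_b)


for config, runs in libxsmm.items():
    gap = gap_in_standard_errors(runs["a"], runs["b"])
    print(f"{config}: gap is {gap:.1f} combined standard errors")
```

A gap of several combined standard errors suggests the difference is unlikely to be noise alone. This is a heuristic rather than a formal significance test, and it is weak when N is as small as 3.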

Palabos

The Palabos library is a framework for general purpose Computational Fluid Dynamics (CFD). Palabos uses a kernel based on the Lattice Boltzmann method. This test profile uses the Palabos MPI-based Cavity3D benchmark. Learn more via the OpenBenchmarking.org test page.

Palabos 2.3 (Mega Site Updates Per Second, More Is Better)

| Grid Size | a | b |
| --- | --- | --- |
| 100 | 314.53 (SE +/- 0.80, N = 3) | 308.95 (SE +/- 1.54, N = 3) |
| 400 | 333.86 (SE +/- 0.14, N = 3) | 334.05 (SE +/- 0.33, N = 3) |
| 1000 | 391.95 (SE +/- 0.17, N = 3) | 389.11 (SE +/- 3.39, N = 3) |
| 500 | 346.12 (SE +/- 0.12, N = 3) | 346.20 (SE +/- 0.10, N = 3) |

1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

Grid Size: 4000

a: The test quit with a non-zero exit status.

b: The test quit with a non-zero exit status.

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing. Learn more via the OpenBenchmarking.org test page.

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better)

| Test | Backend | Precision | X Y Z | a | b |
| --- | --- | --- | --- | --- | --- |
| c2c | Stock | double | 512 | 52.95 (SE +/- 0.05, N = 3) | 52.71 (SE +/- 0.03, N = 3) |
| c2c | FFTW | double | 512 | 53.12 (SE +/- 0.06, N = 3) | 52.99 (SE +/- 0.05, N = 3) |
| r2c | FFTW | double | 512 | 95.09 (SE +/- 0.37, N = 3) | 94.46 (SE +/- 0.32, N = 3) |
| c2c | Stock | float | 512 | 98.90 (SE +/- 0.02, N = 3) | 98.16 (SE +/- 0.10, N = 3) |
| c2c | FFTW | float | 512 | 99.54 (SE +/- 0.13, N = 3) | 98.79 (SE +/- 0.13, N = 3) |
| r2c | Stock | double | 512 | 101.54 (SE +/- 0.10, N = 3) | 100.72 (SE +/- 0.16, N = 3) |

1. (CXX) g++ options: -O3

libxsmm

libxsmm 2-1.17-3645 (GFLOPS/s, More Is Better)

| M N K | a | b |
| --- | --- | --- |
| 64 | 978.8 (SE +/- 1.56, N = 3) | 960.3 (SE +/- 1.29, N = 3) |
| 32 | 496.0 (SE +/- 0.67, N = 3) | 486.4 (SE +/- 0.88, N = 3) |

1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

HeFFTe - Highly Efficient FFT for Exascale

HeFFTe - Highly Efficient FFT for Exascale 2.3 (GFLOP/s, More Is Better)

| Test | Backend | Precision | X Y Z | a | b |
| --- | --- | --- | --- | --- | --- |
| c2c | FFTW | float | 256 | 110.36 (SE +/- 0.76, N = 13) | 108.38 (SE +/- 0.73, N = 13) |
| r2c | FFTW | float | 512 | 177.48 (SE +/- 0.84, N = 3) | 174.66 (SE +/- 0.61, N = 3) |
| r2c | Stock | float | 512 | 183.40 (SE +/- 0.03, N = 3) | 180.79 (SE +/- 0.64, N = 3) |
| r2c | Stock | double | 256 | 105.98 (SE +/- 0.83, N = 15) | 106.15 (SE +/- 1.32, N = 4) |
| r2c | FFTW | float | 256 | 255.35 (SE +/- 3.07, N = 4) | 246.11 (SE +/- 1.98, N = 15) |
| c2c | Stock | double | 256 | 49.29 (SE +/- 0.04, N = 3) | 49.32 (SE +/- 0.41, N = 3) |
| c2c | FFTW | double | 256 | 49.00 (SE +/- 0.08, N = 3) | 48.81 (SE +/- 0.54, N = 3) |
| c2c | Stock | float | 256 | 109.37 (SE +/- 0.85, N = 3) | 108.23 (SE +/- 1.20, N = 5) |
| r2c | FFTW | double | 256 | 97.29 (SE +/- 1.26, N = 3) | 96.09 (SE +/- 1.29, N = 3) |
| r2c | Stock | float | 256 | 263.23 (SE +/- 2.00, N = 3) | 251.98 (SE +/- 1.39, N = 3) |
| c2c | Stock | double | 128 | 93.93 (SE +/- 0.88, N = 3) | 92.04 (SE +/- 0.37, N = 3) |
| c2c | FFTW | double | 128 | 132.73 (SE +/- 0.48, N = 3) | 130.66 (SE +/- 0.55, N = 3) |
| c2c | Stock | float | 128 | 147.84 (SE +/- 0.81, N = 3) | 148.33 (SE +/- 0.19, N = 3) |
| r2c | Stock | double | 128 | 164.54 (SE +/- 0.46, N = 3) | 161.21 (SE +/- 0.50, N = 3) |
| c2c | FFTW | float | 128 | 232.00 (SE +/- 2.30, N = 3) | 233.51 (SE +/- 0.67, N = 3) |
| r2c | FFTW | double | 128 | 232.79 (SE +/- 1.50, N = 3) | 230.65 (SE +/- 1.13, N = 3) |
| r2c | Stock | float | 128 | 232.16 (SE +/- 1.82, N = 3) | 228.74 (SE +/- 0.78, N = 3) |
| r2c | FFTW | float | 128 | 336.17 (SE +/- 2.06, N = 3) | 337.50 (SE +/- 2.26, N = 3) |

1. (CXX) g++ options: -O3