satty

Intel Xeon Platinum 8490H testing with a Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS) and ASPEED on Ubuntu 22.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2307291-NE-SATTY133636&rdt&grr.

sattyProcessorMotherboardChipsetMemoryDiskGraphicsNetworkOSKernelDesktopDisplay ServerVulkanCompilerFile-SystemScreen ResolutionabIntel Xeon Platinum 8490H @ 3.50GHz (60 Cores / 120 Threads)Quanta Cloud S6Q-MB-MPS (3A10.uh BIOS)Intel Device 1bce512GB3 x 3841GB Micron_9300_MTFDHAL3T8TDPASPEED4 x Intel E810-C for QSFPUbuntu 22.045.15.0-47-generic (x86_64)GNOME Shell 42.4X Server 1.21.1.31.2.204GCC 11.2.0ext41024x7681600x1200OpenBenchmarking.orgKernel Details- Transparent Huge Pages: madviseCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-gBFGDP/gcc-11-11.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: intel_pstate performance (EPP: performance) - CPU Microcode: 0x2b0000c0 Security Details- itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

sattylibxsmm: 256libxsmm: 128palabos: 100palabos: 400palabos: 1000palabos: 500heffte: c2c - Stock - double - 512heffte: c2c - FFTW - double - 512heffte: r2c - FFTW - double - 512heffte: c2c - Stock - float - 512heffte: c2c - FFTW - float - 512heffte: r2c - Stock - double - 512libxsmm: 64libxsmm: 32heffte: c2c - FFTW - float - 256heffte: r2c - FFTW - float - 512heffte: r2c - Stock - float - 512heffte: r2c - Stock - double - 256heffte: r2c - FFTW - float - 256heffte: c2c - Stock - double - 256heffte: c2c - FFTW - double - 256heffte: c2c - Stock - float - 256heffte: r2c - FFTW - double - 256heffte: r2c - Stock - float - 256heffte: c2c - Stock - double - 128heffte: c2c - FFTW - double - 128heffte: c2c - Stock - float - 128heffte: r2c - Stock - double - 128heffte: c2c - FFTW - float - 128heffte: r2c - FFTW - double - 128heffte: r2c - Stock - float - 128heffte: r2c - FFTW - float - 128ab883.51741.8314.532333.855391.950346.11652.953253.123695.089598.903499.5391101.543978.8496.0110.359177.478183.404105.980255.35149.293149.0032109.37397.2883263.23393.9306132.727147.840164.544232.000232.789232.159336.173793.51691.1308.948334.051389.107346.19952.712152.994594.456498.163298.7880100.722960.3486.4108.379174.661180.793106.150246.10949.323548.8145108.23196.0922251.97892.0352130.656148.326161.209233.509230.649228.742337.495OpenBenchmarking.org

libxsmm

M N K: 256

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 256ab2004006008001000SE +/- 1.79, N = 3SE +/- 7.87, N = 12883.5793.51. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

libxsmm

M N K: 128

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 128ab400800120016002000SE +/- 4.61, N = 3SE +/- 6.75, N = 31741.81691.11. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

Palabos

Grid Size: 100

OpenBenchmarking.orgMega Site Updates Per Second, More Is BetterPalabos 2.3Grid Size: 100ab70140210280350SE +/- 0.80, N = 3SE +/- 1.54, N = 3314.53308.951. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

Palabos

Grid Size: 400

OpenBenchmarking.orgMega Site Updates Per Second, More Is BetterPalabos 2.3Grid Size: 400ab70140210280350SE +/- 0.14, N = 3SE +/- 0.33, N = 3333.86334.051. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

Palabos

Grid Size: 1000

OpenBenchmarking.orgMega Site Updates Per Second, More Is BetterPalabos 2.3Grid Size: 1000ab90180270360450SE +/- 0.17, N = 3SE +/- 3.39, N = 3391.95389.111. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

Palabos

Grid Size: 500

OpenBenchmarking.orgMega Site Updates Per Second, More Is BetterPalabos 2.3Grid Size: 500ab80160240320400SE +/- 0.12, N = 3SE +/- 0.10, N = 3346.12346.201. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: Stock - Precision: double - X Y Z: 512

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: double - X Y Z: 512ab1224364860SE +/- 0.05, N = 3SE +/- 0.03, N = 352.9552.711. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512ab1224364860SE +/- 0.06, N = 3SE +/- 0.05, N = 353.1252.991. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512ab20406080100SE +/- 0.37, N = 3SE +/- 0.32, N = 395.0994.461. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: Stock - Precision: float - X Y Z: 512

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: float - X Y Z: 512ab20406080100SE +/- 0.02, N = 3SE +/- 0.10, N = 398.9098.161. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512ab20406080100SE +/- 0.13, N = 3SE +/- 0.13, N = 399.5498.791. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: Stock - Precision: double - X Y Z: 512

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: double - X Y Z: 512ab20406080100SE +/- 0.10, N = 3SE +/- 0.16, N = 3101.54100.721. (CXX) g++ options: -O3

libxsmm

M N K: 64

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 64ab2004006008001000SE +/- 1.56, N = 3SE +/- 1.29, N = 3978.8960.31. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

libxsmm

M N K: 32

OpenBenchmarking.orgGFLOPS/s, More Is Betterlibxsmm 2-1.17-3645M N K: 32ab110220330440550SE +/- 0.67, N = 3SE +/- 0.88, N = 3496.0486.41. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -msse4.2

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256ab20406080100SE +/- 0.76, N = 13SE +/- 0.73, N = 13110.36108.381. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512ab4080120160200SE +/- 0.84, N = 3SE +/- 0.61, N = 3177.48174.661. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: Stock - Precision: float - X Y Z: 512

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: float - X Y Z: 512ab4080120160200SE +/- 0.03, N = 3SE +/- 0.64, N = 3183.40180.791. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: Stock - Precision: double - X Y Z: 256

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: double - X Y Z: 256ab20406080100SE +/- 0.83, N = 15SE +/- 1.32, N = 4105.98106.151. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256ab60120180240300SE +/- 3.07, N = 4SE +/- 1.98, N = 15255.35246.111. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: Stock - Precision: double - X Y Z: 256

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: double - X Y Z: 256ab1122334455SE +/- 0.04, N = 3SE +/- 0.41, N = 349.2949.321. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256ab1122334455SE +/- 0.08, N = 3SE +/- 0.54, N = 349.0048.811. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: Stock - Precision: float - X Y Z: 256

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: float - X Y Z: 256ab20406080100SE +/- 0.85, N = 3SE +/- 1.20, N = 5109.37108.231. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256ab20406080100SE +/- 1.26, N = 3SE +/- 1.29, N = 397.2996.091. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: Stock - Precision: float - X Y Z: 256

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: float - X Y Z: 256ab60120180240300SE +/- 2.00, N = 3SE +/- 1.39, N = 3263.23251.981. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: Stock - Precision: double - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: double - X Y Z: 128ab20406080100SE +/- 0.88, N = 3SE +/- 0.37, N = 393.9392.041. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128ab306090120150SE +/- 0.48, N = 3SE +/- 0.55, N = 3132.73130.661. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: Stock - Precision: float - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: Stock - Precision: float - X Y Z: 128ab306090120150SE +/- 0.81, N = 3SE +/- 0.19, N = 3147.84148.331. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: Stock - Precision: double - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: double - X Y Z: 128ab4080120160200SE +/- 0.46, N = 3SE +/- 0.50, N = 3164.54161.211. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128ab50100150200250SE +/- 2.30, N = 3SE +/- 0.67, N = 3232.00233.511. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128ab50100150200250SE +/- 1.50, N = 3SE +/- 1.13, N = 3232.79230.651. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: Stock - Precision: float - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: Stock - Precision: float - X Y Z: 128ab50100150200250SE +/- 1.82, N = 3SE +/- 0.78, N = 3232.16228.741. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128

OpenBenchmarking.orgGFLOP/s, More Is BetterHeFFTe - Highly Efficient FFT for Exascale 2.3Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128ab70140210280350SE +/- 2.06, N = 3SE +/- 2.26, N = 3336.17337.501. (CXX) g++ options: -O3


Phoronix Test Suite v10.8.5