HeFFTe - Highly Efficient FFT for Exascale

HeFFTe is the Highly Efficient FFT for Exascale software developed as part of the Exascale Computing Project. This test profile uses HeFFTe's built-in speed benchmarks under a variety of configuration options and currently caters to CPU/processor testing.

To run this test with the Phoronix Test Suite, the basic command is: phoronix-test-suite benchmark heffte.

Project Site

icl.utk.edu

Source Repository

github.com

Test Created

18 June 2023

Last Updated

27 October 2023

Test Maintainer

Michael Larabel 

Test Type

Processor

Average Install Time

49 Seconds

Average Run Time

1 Minute, 10 Seconds

Test Dependencies

C/C++ Compiler Toolchain + Fortran + OpenMPI + CMake + FFTW

Accolades

10k+ Downloads

Supported Platforms


[Chart: HeFFTe - Highly Efficient FFT for Exascale Popularity Statistics (pts/heffte), monthly from 2023.06 to 2024.05, source OpenBenchmarking.org. Y-axis: Events. Series: Public Result Uploads *, Reported Installs **, Reported Test Completions **, Test Profile Page Views.]
* Uploading of benchmark result data to OpenBenchmarking.org is always optional (opt-in) via the Phoronix Test Suite for users wishing to share their results publicly.
** Data based on those opting to upload their test results to OpenBenchmarking.org and users enabling the opt-in anonymous statistics reporting while running benchmarks from an Internet-connected platform.
Data updated weekly as of 31 May 2024.
Test Option Popularity (OpenBenchmarking.org): c2c 48.5%, r2c 51.5%
Backend Option Popularity (OpenBenchmarking.org): Stock 47.8%, FFTW 52.2%
Precision Option Popularity (OpenBenchmarking.org): double 26.0%, float 28.1%, double-long 22.6%, float-long 23.3%
X Y Z Option Popularity (OpenBenchmarking.org): 128 33.3%, 256 31.0%, 1024 6.9%, 512 28.7%

Revision History

pts/heffte-1.1.0 - Fri, 27 Oct 2023 15:16:02 GMT
Update against HeFFTe 2.4 upstream.

pts/heffte-1.0.0 - Sun, 18 Jun 2023 09:51:50 GMT
Initial commit of HeFFTe benchmark.


Performance Metrics

Analyze Test Configuration:

HeFFTe - Highly Efficient FFT for Exascale 2.4

Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128

OpenBenchmarking.org metrics for this test profile configuration based on 107 public results since 27 October 2023 with the latest data as of 23 May 2024.

Below is an overview of generalized performance for components with enough user-uploaded results to be statistically significant. Keep in mind that OS configurations can differ vastly, particularly in the Linux/open-source space, so this overview offers only general guidance on performance expectations.

Component | Percentile Rank | # Compatible Public Results | GFLOP/s (Average)
- | 87th | 8 | 188 +/- 2
- | 78th | 6 | 182 +/- 2
Mid-Tier | 75th | - | < 179
- | 52nd | 4 | 118 +/- 16
Median | 50th | - | 114
Low-Tier | 25th | - | < 76
- | 14th | 5 | 35 +/- 3
- | 7th | 4 | 17 +/- 1
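
As a rough illustration of how a percentile rank like those listed above could be derived, the sketch below assumes the rank is the share of public results strictly below a given score. OpenBenchmarking.org's exact method is not stated on this page, so that definition is an assumption, and the function name `percentile_rank` and the sample values are hypothetical.

```python
def percentile_rank(score, population):
    """Return the percentage of population values strictly below score.

    Assumption: this "strictly below" definition is one common convention,
    not necessarily the one OpenBenchmarking.org uses.
    """
    if not population:
        raise ValueError("population must not be empty")
    below = sum(1 for value in population if value < score)
    return 100.0 * below / len(population)


# Hypothetical GFLOP/s results, made up purely for illustration.
sample_results = [20.0, 40.0, 60.0, 80.0, 100.0]

# Three of the five sample values are below 80.0, so this prints 60.0.
print(percentile_rank(80.0, sample_results))
```

With real data, `population` would hold every public result for the selected configuration and `score` would be the average for one component.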
[Chart: Distribution Of Public Results - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128, source OpenBenchmarking.org. 107 results range from 11 to 328 GFLOP/s.]

Based on OpenBenchmarking.org data, the selected test / test configuration (HeFFTe - Highly Efficient FFT for Exascale 2.4 - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128) has an average run-time of 2 minutes. By default this test profile runs at least 3 times, but the run count may increase if the standard deviation exceeds pre-defined defaults or other calculations deem additional runs necessary for greater statistical accuracy of the result.
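
The run-count behavior described above can be sketched roughly as follows. This is a minimal illustration, not the Phoronix Test Suite's actual logic: the function names, the 3.5% deviation threshold, and the cap of 6 runs are all assumptions chosen for the example.

```python
import statistics


def relative_std_dev(samples):
    """Return the sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(samples) / statistics.mean(samples)


def run_until_stable(run_once, min_runs=3, max_runs=6, threshold_pct=3.5):
    """Run at least min_runs times, then keep adding runs while the
    relative standard deviation stays above threshold_pct.

    Assumption: threshold_pct and max_runs are illustrative values,
    not the test suite's real defaults.
    """
    results = [run_once() for _ in range(min_runs)]
    while len(results) < max_runs and relative_std_dev(results) > threshold_pct:
        results.append(run_once())
    return results


# Usage with a stand-in benchmark that replays fixed, made-up GFLOP/s values.
fake_scores = iter([100.0, 101.0, 99.0])
print(len(run_until_stable(lambda: next(fake_scores))))  # 3 runs suffice here
```

In practice `run_once` would launch the benchmark and return its reported GFLOP/s; the stand-in above only replays made-up numbers so the control flow is easy to follow.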

[Chart: Time Required To Complete Benchmark, in minutes - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128, source OpenBenchmarking.org. Run-Time: Min: 1 / Avg: 1 / Max: 1]

Based on public OpenBenchmarking.org results, the selected test / test configuration has an average standard deviation of 1.4%.

[Chart: Average Deviation Between Runs, in percent, fewer is better - Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128, source OpenBenchmarking.org. Deviation: Min: 0 / Avg: 1.42 / Max: 9]

Tested CPU Architectures

This benchmark has been successfully tested on the architectures listed below. The listed CPU architectures are those for which successful OpenBenchmarking.org result uploads occurred, which helps determine whether a given test is compatible with various alternative CPU architectures.

CPU Architecture | Kernel Identifier | Verified On
Intel / AMD x86 64-bit | x86_64 | (Many Processors)
Loongson LoongArch 64-bit | loongarch64 | Loongson-3A6000

Recent Test Results

OpenBenchmarking.org Results Compare

6 Systems - 87 Benchmark Results

ARMv8 Neoverse-N1 - Amazon EC2 c6g.16xlarge - Amazon Device 0200

Ubuntu 22.04 - 5.19.0-1025-aws - GCC 11.3.0

1 System - 1 Benchmark Result

2 x Intel Xeon E5-2699 v4 - Supermicro X10DRL-i v1.01 - Intel Xeon E7 v4

Ubuntu 22.04 - 6.5.0-21-generic - X Server 1.21.1.4

1 System - 494 Benchmark Results
