heffte 7950x AMD Ryzen 9 7950X 16-Core testing with a ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS) and NVIDIA GeForce RTX 3060 Ti 8GB on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2310274-PTS-HEFFTE7999 .
heffte 7950x Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server Display Driver OpenGL OpenCL Compiler File-System Screen Resolution a b c AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads) ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS) AMD Device 14d8 32GB Western Digital WD_BLACK SN850X 1000GB NVIDIA GeForce RTX 3060 Ti 8GB NVIDIA GA104 HD Audio ASUS MG28U Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411 Ubuntu 23.10 6.6.0-060600rc5-generic (x86_64) GNOME Shell 45.0 X Server 1.21.1.7 NVIDIA 545.23.06 4.6.0 OpenCL 3.0 CUDA 12.3.68 GCC 13.2.0 ext4 3840x2160 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance) - CPU Microcode: 0xa601203 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET no microcode + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
heffte 7950x heffte: c2c - FFTW - float - 128 heffte: c2c - FFTW - float - 256 heffte: c2c - FFTW - float - 512 heffte: r2c - FFTW - float - 128 heffte: r2c - FFTW - float - 256 heffte: r2c - FFTW - float - 512 heffte: c2c - FFTW - double - 128 heffte: c2c - FFTW - double - 256 heffte: c2c - FFTW - double - 512 heffte: c2c - Stock - float - 128 heffte: c2c - Stock - float - 256 heffte: c2c - Stock - float - 512 heffte: r2c - FFTW - double - 128 heffte: r2c - FFTW - double - 256 heffte: r2c - FFTW - double - 512 heffte: r2c - Stock - float - 128 heffte: r2c - Stock - float - 256 heffte: r2c - Stock - float - 512 heffte: c2c - Stock - double - 128 heffte: c2c - Stock - double - 256 heffte: c2c - Stock - double - 512 heffte: r2c - Stock - double - 128 heffte: r2c - Stock - double - 256 heffte: r2c - Stock - double - 512 heffte: c2c - FFTW - float-long - 128 heffte: c2c - FFTW - float-long - 256 heffte: c2c - FFTW - float-long - 512 heffte: r2c - FFTW - float-long - 128 heffte: r2c - FFTW - float-long - 256 heffte: r2c - FFTW - float-long - 512 heffte: c2c - FFTW - double-long - 128 heffte: c2c - FFTW - double-long - 256 heffte: c2c - FFTW - double-long - 512 heffte: c2c - Stock - float-long - 128 heffte: c2c - Stock - float-long - 256 heffte: c2c - Stock - float-long - 512 heffte: r2c - FFTW - double-long - 128 heffte: r2c - FFTW - double-long - 256 heffte: r2c - FFTW - double-long - 512 heffte: r2c - Stock - float-long - 128 heffte: r2c - Stock - float-long - 256 heffte: r2c - Stock - float-long - 512 heffte: c2c - Stock - double-long - 128 heffte: c2c - Stock - double-long - 256 heffte: c2c - Stock - double-long - 512 heffte: r2c - Stock - double-long - 128 heffte: r2c - Stock - double-long - 256 heffte: r2c - Stock - double-long - 512 a b c 77.4191 20.8780 22.3129 136.396 42.0885 40.7358 16.9742 10.2063 11.3175 62.4184 21.1409 22.4705 44.5940 19.1375 20.5913 115.306 46.0696 44.0010 17.5246 10.4724 11.5504 49.7110 21.2099 22.6468 80.7018 20.8630 22.3077 137.395 42.0270 40.6924 17.2073 10.1326 11.3033 60.5802 21.1193 22.4717 45.2281 19.0626 20.5876 118.581 46.1778 43.9465 17.5118 10.4604 11.5650 47.4095 21.1703 22.6477 76.8080 20.8919 22.3309 135.919 41.7135 40.7427 16.4968 10.1193 11.3006 61.7571 21.1237 22.4766 44.3283 19.0872 20.6314 120.987 46.1572 44.0299 18.0207 10.4680 11.5585 49.3308 21.1655 22.6361 77.9866 20.8895 22.2746 133.317 41.8702 40.6980 16.8374 10.1385 11.3054 59.8700 21.1267 22.4482 46.5240 19.1298 20.4946 116.576 46.1111 43.9320 17.3012 10.4640 11.5733 49.8414 21.1834 22.6688 77.2216 20.8252 22.2898 135.262 41.8229 40.7257 17.0937 10.1553 11.3353 61.2619 21.1643 22.4658 44.0131 19.1488 20.5596 120.153 46.2602 43.9660 17.6702 10.4641 11.5672 49.1807 21.1486 22.6527 77.6373 20.8501 22.2755 133.515 41.6494 40.6778 17.0719 10.1830 11.3213 60.8686 21.1028 22.4678 45.5619 19.1423 20.5184 120.073 46.7987 43.9385 17.3218 10.4565 11.5655 48.8921 21.2032 22.6505 OpenBenchmarking.org
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 a b c 20 40 60 80 100 SE +/- 1.05, N = 3 SE +/- 1.96, N = 15 SE +/- 2.22, N = 12 77.42 76.81 77.22 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 a b c 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 20.88 20.89 20.83 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 a b c 5 10 15 20 25 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 22.31 22.33 22.29 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128 a b c 30 60 90 120 150 SE +/- 0.97, N = 12 SE +/- 1.41, N = 3 SE +/- 1.19, N = 15 136.40 135.92 135.26 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 a b c 10 20 30 40 50 SE +/- 0.13, N = 3 SE +/- 0.08, N = 3 SE +/- 0.11, N = 3 42.09 41.71 41.82 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512 a b c 9 18 27 36 45 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 40.74 40.74 40.73 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128 a b c 4 8 12 16 20 SE +/- 0.15, N = 15 SE +/- 0.02, N = 3 SE +/- 0.10, N = 3 16.97 16.50 17.09 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256 a b c 3 6 9 12 15 SE +/- 0.04, N = 3 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 10.21 10.12 10.16 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512 a b c 3 6 9 12 15 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 SE +/- 0.00, N = 3 11.32 11.30 11.34 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: float - X Y Z: 128 a b c 14 28 42 56 70 SE +/- 0.65, N = 15 SE +/- 0.74, N = 3 SE +/- 0.56, N = 15 62.42 61.76 61.26 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: float - X Y Z: 256 a b c 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 SE +/- 0.02, N = 3 21.14 21.12 21.16 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: float - X Y Z: 512 a b c 5 10 15 20 25 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 SE +/- 0.03, N = 3 22.47 22.48 22.47 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 a b c 10 20 30 40 50 SE +/- 0.98, N = 15 SE +/- 0.46, N = 15 SE +/- 1.16, N = 12 44.59 44.33 44.01 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 a b c 5 10 15 20 25 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 19.14 19.09 19.15 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 a b c 5 10 15 20 25 SE +/- 0.07, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 20.59 20.63 20.56 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: float - X Y Z: 128 a b c 30 60 90 120 150 SE +/- 0.62, N = 3 SE +/- 1.24, N = 14 SE +/- 1.05, N = 15 115.31 120.99 120.15 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: float - X Y Z: 256 a b c 10 20 30 40 50 SE +/- 0.17, N = 3 SE +/- 0.18, N = 3 SE +/- 0.03, N = 3 46.07 46.16 46.26 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: float - X Y Z: 512 a b c 10 20 30 40 50 SE +/- 0.03, N = 3 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 44.00 44.03 43.97 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: double - X Y Z: 128 a b c 4 8 12 16 20 SE +/- 0.22, N = 4 SE +/- 0.20, N = 4 SE +/- 0.18, N = 3 17.52 18.02 17.67 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: double - X Y Z: 256 a b c 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 10.47 10.47 10.46 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: double - X Y Z: 512 a b c 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 11.55 11.56 11.57 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: double - X Y Z: 128 a b c 11 22 33 44 55 SE +/- 0.50, N = 3 SE +/- 0.36, N = 15 SE +/- 0.58, N = 3 49.71 49.33 49.18 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: double - X Y Z: 256 a b c 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 SE +/- 0.03, N = 3 21.21 21.17 21.15 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: double - X Y Z: 512 a b c 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 22.65 22.64 22.65 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 128 a b c 20 40 60 80 100 SE +/- 1.13, N = 15 SE +/- 1.24, N = 15 SE +/- 2.02, N = 15 80.70 77.99 77.64 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 256 a b c 5 10 15 20 25 SE +/- 0.04, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 20.86 20.89 20.85 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 512 a b c 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 22.31 22.27 22.28 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 128 a b c 30 60 90 120 150 SE +/- 1.34, N = 15 SE +/- 1.70, N = 3 SE +/- 1.45, N = 15 137.40 133.32 133.52 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 256 a b c 10 20 30 40 50 SE +/- 0.12, N = 3 SE +/- 0.12, N = 3 SE +/- 0.10, N = 3 42.03 41.87 41.65 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 512 a b c 9 18 27 36 45 SE +/- 0.02, N = 3 SE +/- 0.04, N = 3 SE +/- 0.04, N = 3 40.69 40.70 40.68 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128 a b c 4 8 12 16 20 SE +/- 0.24, N = 3 SE +/- 0.05, N = 3 SE +/- 0.12, N = 15 17.21 16.84 17.07 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256 a b c 3 6 9 12 15 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 10.13 10.14 10.18 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 512 a b c 3 6 9 12 15 SE +/- 0.02, N = 3 SE +/- 0.03, N = 3 SE +/- 0.01, N = 3 11.30 11.31 11.32 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 128 a b c 14 28 42 56 70 SE +/- 0.82, N = 15 SE +/- 0.80, N = 3 SE +/- 0.69, N = 15 60.58 59.87 60.87 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256 a b c 5 10 15 20 25 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 21.12 21.13 21.10 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 512 a b c 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 SE +/- 0.00, N = 3 22.47 22.45 22.47 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128 a b c 11 22 33 44 55 SE +/- 0.67, N = 15 SE +/- 0.37, N = 3 SE +/- 0.54, N = 15 45.23 46.52 45.56 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256 a b c 5 10 15 20 25 SE +/- 0.05, N = 3 SE +/- 0.03, N = 3 SE +/- 0.06, N = 3 19.06 19.13 19.14 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512 a b c 5 10 15 20 25 SE +/- 0.05, N = 3 SE +/- 0.02, N = 3 SE +/- 0.06, N = 3 20.59 20.49 20.52 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 128 a b c 30 60 90 120 150 SE +/- 0.75, N = 3 SE +/- 0.49, N = 3 SE +/- 1.65, N = 3 118.58 116.58 120.07 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 256 a b c 11 22 33 44 55 SE +/- 0.01, N = 3 SE +/- 0.23, N = 3 SE +/- 0.14, N = 3 46.18 46.11 46.80 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 512 a b c 10 20 30 40 50 SE +/- 0.02, N = 3 SE +/- 0.03, N = 3 SE +/- 0.02, N = 3 43.95 43.93 43.94 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128 a b c 4 8 12 16 20 SE +/- 0.24, N = 3 SE +/- 0.14, N = 15 SE +/- 0.13, N = 15 17.51 17.30 17.32 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256 a b c 3 6 9 12 15 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 10.46 10.46 10.46 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 512 a b c 3 6 9 12 15 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 11.57 11.57 11.57 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128 a b c 11 22 33 44 55 SE +/- 0.53, N = 3 SE +/- 0.46, N = 6 SE +/- 0.58, N = 15 47.41 49.84 48.89 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256 a b c 5 10 15 20 25 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 21.17 21.18 21.20 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 512 a b c 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 22.65 22.67 22.65 1. (CXX) g++ options: -O3
Phoronix Test Suite v10.8.4