cloverleaf threadripper

AMD Ryzen Threadripper 3990X 64-Core testing with a Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2310271-PTS-CLOVERLE91&grr&sor .

System Details (runs: a, b, c)

Processor: AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads)
Motherboard: Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS)
Chipset: AMD Starship/Matisse
Memory: 128GB
Disk: Samsung SSD 970 EVO Plus 500GB
Graphics: AMD Radeon RX 5700 8GB (1750/875MHz)
Audio: AMD Navi 10 HDMI Audio
Monitor: DELL P2415Q
Network: Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 23.04
Kernel: 6.2.0-34-generic (x86_64)
Desktop: GNOME Shell 44.3
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 23.0.2 (LLVM 15.0.7 DRM 3.49)
Compiler: GCC 12.3.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0x830107a

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection
- spec_rstack_overflow: Mitigation of safe RET
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected
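
The Security Details entries resemble the per-vulnerability status strings that Linux exposes under /sys/devices/system/cpu/vulnerabilities. The following is a minimal sketch, assuming that directory layout, for gathering a comparable list on another machine. The function names are hypothetical, this is not claimed to be how the Phoronix Test Suite collects the data, and the raw kernel strings may be worded slightly differently from the export above.

```python
from pathlib import Path


def read_vulnerabilities(base="/sys/devices/system/cpu/vulnerabilities"):
    """Return {name: status} for every file in the given directory.

    An empty dict is returned when the directory does not exist
    (for example on non-Linux systems).
    """
    root = Path(base)
    if not root.is_dir():
        return {}
    return {
        entry.name: entry.read_text().strip()
        for entry in sorted(root.iterdir())
        if entry.is_file()
    }


def format_security_details(vulns):
    """Join the entries with ' + ', similar to the flattened export layout."""
    return " + ".join(f"{name}: {status}" for name, status in vulns.items())


if __name__ == "__main__":
    print(format_security_details(read_vulnerabilities()))
```

Passing a different base directory makes the helper easy to exercise against a saved copy of the sysfs files.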

Result Overview

Units: cloverleaf results are in seconds (fewer is better); heffte results are in GFLOP/s (more is better).

Test | a | b | c
cloverleaf: clover_bm16 | 1253.12 | 1252.53 | 1252.61
cloverleaf: clover_bm64_short | 144.51 | 144.65 | 144.69
heffte: r2c - FFTW - double - 1024 | 24.9067 | 24.9146 | 24.9303
heffte: r2c - FFTW - double-long - 1024 | 24.9248 | 24.9204 | 24.9473
heffte: r2c - Stock - double-long - 1024 | 26.9875 | 26.9768 | 27.0114
heffte: r2c - Stock - double - 1024 | 26.9747 | 26.9774 | 27.0081
heffte: c2c - Stock - float - 1024 | 27.1235 | 27.1160 | 27.1292
heffte: c2c - FFTW - float - 1024 | 27.1341 | 27.1177 | 27.1493
heffte: c2c - Stock - float-long - 1024 | 27.1819 | 27.2048 | 27.2101
heffte: c2c - FFTW - float-long - 1024 | 27.1810 | 27.1785 | 27.1777
cloverleaf: clover_bm | 17.54 | 16.13 | 17.00
heffte: r2c - FFTW - float - 1024 | 48.4776 | 48.4453 | 48.5125
heffte: r2c - FFTW - float-long - 1024 | 48.5744 | 48.5647 | 48.571
heffte: r2c - Stock - float - 1024 | 52.2726 | 52.2634 | 52.3086
heffte: r2c - Stock - float-long - 1024 | 52.3467 | 52.3830 | 52.4069
heffte: c2c - FFTW - double-long - 512 | 12.3421 | 12.3489 | 12.3427
heffte: c2c - FFTW - double - 512 | 12.3492 | 12.3379 | 12.3492
heffte: c2c - Stock - double - 512 | 12.4020 | 12.4030 | 12.401
heffte: c2c - Stock - double-long - 512 | 12.4059 | 12.4035 | 12.421
heffte: r2c - FFTW - double - 512 | 22.2806 | 22.2975 | 22.3154
heffte: r2c - FFTW - double-long - 512 | 22.3108 | 22.3188 | 22.3489
heffte: r2c - Stock - double-long - 512 | 24.1192 | 24.1468 | 24.1199
heffte: r2c - Stock - double - 512 | 24.0929 | 24.0988 | 24.089
heffte: c2c - FFTW - float-long - 512 | 23.6062 | 23.6093 | 23.5665
heffte: c2c - Stock - float - 512 | 23.6449 | 23.6339 | 23.6284
heffte: c2c - FFTW - float - 512 | 23.5486 | 23.5270 | 23.5485
heffte: c2c - Stock - float-long - 512 | 23.7094 | 23.6585 | 23.689
heffte: r2c - FFTW - float-long - 512 | 44.2387 | 44.1198 | 44.2775
heffte: r2c - FFTW - float - 512 | 44.1653 | 44.1977 | 44.0628
heffte: r2c - Stock - float-long - 512 | 48.2998 | 48.0933 | 48.2783
heffte: r2c - Stock - float - 512 | 48.1639 | 47.9281 | 47.9948
heffte: c2c - FFTW - double-long - 128 | 18.8907 | 19.2424 | 27.2693
heffte: r2c - FFTW - double - 128 | 26.0046 | 27.2799 | 17.3744
heffte: c2c - FFTW - double - 128 | 20.9992 | 19.8631 | 23.9639
heffte: r2c - FFTW - double-long - 128 | 25.0364 | 28.2660 | 18.3771
heffte: c2c - FFTW - float - 128 | 55.8176 | 55.0641 | 59.611
heffte: r2c - Stock - float-long - 256 | 95.5643 | 96.2684 | 95.4689
heffte: c2c - FFTW - double-long - 256 | 12.7467 | 12.7313 | 12.6448
heffte: c2c - FFTW - double - 256 | 12.8747 | 12.6387 | 12.6566
heffte: c2c - Stock - double-long - 256 | 12.8346 | 12.8246 | 12.6667
heffte: c2c - FFTW - float-long - 128 | 54.9524 | 55.6265 | 59.4715
heffte: c2c - Stock - double - 256 | 13.0608 | 12.8450 | 12.6881
heffte: r2c - FFTW - double-long - 256 | 28.0234 | 28.0753 | 28.0944
heffte: r2c - FFTW - double - 256 | 28.3524 | 27.6720 | 28.1929
heffte: r2c - Stock - double-long - 256 | 34.1682 | 34.2844 | 34.4753
heffte: r2c - Stock - double - 256 | 34.4148 | 34.3795 | 34.5829
heffte: c2c - FFTW - float-long - 256 | 34.6871 | 34.7931 | 34.7741
heffte: c2c - FFTW - float - 256 | 35.2477 | 35.1868 | 35.0137
heffte: c2c - Stock - float-long - 256 | 37.7505 | 37.8566 | 37.808
heffte: c2c - Stock - float - 256 | 37.8262 | 37.9740 | 38.3008
heffte: r2c - FFTW - float-long - 256 | 82.3872 | 82.7044 | 81.1273
heffte: r2c - FFTW - float - 256 | 85.1009 | 83.3580 | 83.141
heffte: r2c - Stock - float - 256 | 95.7520 | 97.5981 | 93.8026
heffte: c2c - Stock - double - 128 | 29.1052 | 28.5774 | 28.3257
heffte: c2c - Stock - double-long - 128 | 28.8662 | 28.6991 | 28.7278
heffte: r2c - Stock - double - 128 | 48.2953 | 48.5824 | 48.1773
heffte: r2c - Stock - float-long - 128 | 81.1234 | 80.8669 | 80.8598
heffte: r2c - Stock - double-long - 128 | 47.5897 | 48.5144 | 48.14
heffte: c2c - Stock - float-long - 128 | 50.1664 | 50.5380 | 50.7326
heffte: r2c - Stock - float - 128 | 82.3672 | 79.8604 | 80.7837
heffte: c2c - Stock - float - 128 | 49.9233 | 49.7466 | 47.1137
heffte: r2c - FFTW - float-long - 128 | 91.3115 | 91.2275 | 89.8136
heffte: r2c - FFTW - float - 128 | 92.2391 | 91.5295 | 91.992
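
Runs a, b, and c appear to share one system configuration, so differences between the three columns can be read as run-to-run variation. As a rough illustration, the sketch below computes the spread between the best and worst run as a percentage of the mean for a few rows copied from the result overview above. The helper name and the choice of metric are assumptions made for this example, not something the export itself reports.

```python
def spread_percent(values):
    """Range (max - min) of the runs as a percentage of their mean."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean * 100.0


# (a, b, c) values copied from the result overview.
results = {
    "cloverleaf: clover_bm16": (1253.12, 1252.53, 1252.61),
    "cloverleaf: clover_bm": (17.54, 16.13, 17.00),
    "heffte: r2c - FFTW - double - 1024": (24.9067, 24.9146, 24.9303),
    "heffte: r2c - FFTW - double - 128": (26.0046, 27.2799, 17.3744),
}

for name, runs in results.items():
    print(f"{name}: {spread_percent(runs):.2f}%")
```

Rows with a large spread, such as the 128-size FFTW cases, are the ones where a single-number comparison between runs deserves the most caution.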

CloverLeaf 1.3 (Seconds, Fewer Is Better)
1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

Input | Results (best first) | SE and N (as listed)
clover_bm16 | b 1252.53, c 1252.61, a 1253.12 | SE +/- 0.09, N = 3; SE +/- 1.27, N = 3
clover_bm64_short | a 144.51, b 144.65, c 144.69 | SE +/- 0.02, N = 3; SE +/- 0.02, N = 3

HeFFTe - Highly Efficient FFT for Exascale 2.4 (GFLOP/s, More Is Better)
1. (CXX) g++ options: -O3

Test | Backend | Precision | X Y Z | Results (best first) | SE and N (as listed)
r2c | FFTW | double | 1024 | c 24.93, b 24.91, a 24.91 | SE +/- 0.00, N = 3; SE +/- 0.01, N = 3
r2c | FFTW | double-long | 1024 | c 24.95, a 24.92, b 24.92 | SE +/- 0.00, N = 3; SE +/- 0.01, N = 3
r2c | Stock | double-long | 1024 | c 27.01, a 26.99, b 26.98 | SE +/- 0.01, N = 3; SE +/- 0.02, N = 3
r2c | Stock | double | 1024 | c 27.01, b 26.98, a 26.97 | SE +/- 0.01, N = 3; SE +/- 0.01, N = 3
c2c | Stock | float | 1024 | c 27.13, a 27.12, b 27.12 | SE +/- 0.01, N = 3; SE +/- 0.01, N = 3
c2c | FFTW | float | 1024 | c 27.15, a 27.13, b 27.12 | SE +/- 0.01, N = 3; SE +/- 0.01, N = 3
c2c | Stock | float-long | 1024 | c 27.21, b 27.20, a 27.18 | SE +/- 0.01, N = 3; SE +/- 0.02, N = 3
c2c | FFTW | float-long | 1024 | a 27.18, b 27.18, c 27.18 | SE +/- 0.01, N = 3; SE +/- 0.01, N = 3

CloverLeaf 1.3 (Seconds, Fewer Is Better)
1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

Input | Results (best first) | SE and N (as listed)
clover_bm | b 16.13, c 17.00, a 17.54 | SE +/- 0.25, N = 15; SE +/- 1.44, N = 12
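
The clover_bm entry reports noticeably larger SE values and sample counts than the longer CloverLeaf inputs. Assuming "SE" denotes the standard error of the mean (sample standard deviation divided by the square root of N), the implied sample standard deviation can be recovered as sketched below. The function name is hypothetical, and the export does not say which SE belongs to which run.

```python
import math


def stddev_from_se(se, n):
    """Recover the sample standard deviation from a standard error and N,
    assuming SE = stddev / sqrt(N)."""
    return se * math.sqrt(n)


# (SE, N) pairs as listed for clover_bm.
for se, n in [(0.25, 15), (1.44, 12)]:
    print(f"SE +/- {se}, N = {n} -> implied stddev {stddev_from_se(se, n):.2f} s")
```

Under that assumption, an implied standard deviation of several seconds on a result of roughly 16 to 18 seconds suggests that individual clover_bm runs varied widely.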

HeFFTe - Highly Efficient FFT for Exascale 2.4 (GFLOP/s, More Is Better)
1. (CXX) g++ options: -O3

Test | Backend | Precision | X Y Z | Results (best first) | SE and N (as listed)
r2c | FFTW | float | 1024 | c 48.51, a 48.48, b 48.45 | SE +/- 0.01, N = 3; SE +/- 0.05, N = 3
r2c | FFTW | float-long | 1024 | a 48.57, c 48.57, b 48.56 | SE +/- 0.01, N = 3; SE +/- 0.01, N = 3
r2c | Stock | float | 1024 | c 52.31, a 52.27, b 52.26 | SE +/- 0.02, N = 3; SE +/- 0.02, N = 3
r2c | Stock | float-long | 1024 | c 52.41, b 52.38, a 52.35 | SE +/- 0.02, N = 3; SE +/- 0.02, N = 3
c2c | FFTW | double-long | 512 | b 12.35, c 12.34, a 12.34 | SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
c2c | FFTW | double | 512 | c 12.35, a 12.35, b 12.34 | SE +/- 0.01, N = 3; SE +/- 0.00, N = 3
c2c | Stock | double | 512 | b 12.40, a 12.40, c 12.40 | SE +/- 0.00, N = 3; SE +/- 0.01, N = 3
c2c | Stock | double-long | 512 | c 12.42, a 12.41, b 12.40 | SE +/- 0.01, N = 3; SE +/- 0.00, N = 3
r2c | FFTW | double | 512 | c 22.32, b 22.30, a 22.28 | SE +/- 0.02, N = 3; SE +/- 0.00, N = 3
r2c | FFTW | double-long | 512 | c 22.35, b 22.32, a 22.31 | SE +/- 0.01, N = 3; SE +/- 0.02, N = 3
r2c | Stock | double-long | 512 | b 24.15, c 24.12, a 24.12 | SE +/- 0.04, N = 3; SE +/- 0.01, N = 3
r2c | Stock | double | 512 | b 24.10, a 24.09, c 24.09 | SE +/- 0.01, N = 3; SE +/- 0.02, N = 3
c2c | FFTW | float-long | 512 | b 23.61, a 23.61, c 23.57 | SE +/- 0.00, N = 3; SE +/- 0.02, N = 3
c2c | Stock | float | 512 | a 23.64, b 23.63, c 23.63 | SE +/- 0.01, N = 3; SE +/- 0.04, N = 3
c2c | FFTW | float | 512 | a 23.55, c 23.55, b 23.53 | SE +/- 0.01, N = 3; SE +/- 0.02, N = 3
c2c | Stock | float-long | 512 | a 23.71, c 23.69, b 23.66 | SE +/- 0.02, N = 3; SE +/- 0.01, N = 3
r2c | FFTW | float-long | 512 | c 44.28, a 44.24, b 44.12 | SE +/- 0.01, N = 3; SE +/- 0.03, N = 3
r2c | FFTW | float | 512 | b 44.20, a 44.17, c 44.06 | SE +/- 0.08, N = 3; SE +/- 0.05, N = 3
r2c | Stock | float-long | 512 | a 48.30, c 48.28, b 48.09 | SE +/- 0.23, N = 3; SE +/- 0.05, N = 3
r2c | Stock | float | 512 | a 48.16, c 47.99, b 47.93 | SE +/- 0.14, N = 3; SE +/- 0.07, N = 3
c2c | FFTW | double-long | 128 | c 27.27, b 19.24, a 18.89 | SE +/- 1.41, N = 15; SE +/- 1.04, N = 15
r2c | FFTW | double | 128 | b 27.28, a 26.00, c 17.37 | SE +/- 2.33, N = 15; SE +/- 2.70, N = 15
c2c | FFTW | double | 128 | c 23.96, a 21.00, b 19.86 | SE +/- 1.55, N = 12; SE +/- 1.31, N = 15
r2c | FFTW | double-long | 128 | b 28.27, a 25.04, c 18.38 | SE +/- 2.45, N = 15; SE +/- 3.07, N = 12
c2c | FFTW | float | 128 | c 59.61, a 55.82, b 55.06 | SE +/- 0.34, N = 15; SE +/- 0.41, N = 13
r2c | Stock | float-long | 256 | b 96.27, a 95.56, c 95.47 | SE +/- 0.91, N = 3; SE +/- 0.68, N = 12
c2c | FFTW | double-long | 256 | a 12.75, b 12.73, c 12.64 | SE +/- 0.05, N = 3; SE +/- 0.10, N = 3
c2c | FFTW | double | 256 | a 12.87, c 12.66, b 12.64 | SE +/- 0.06, N = 3; SE +/- 0.03, N = 3
c2c | Stock | double-long | 256 | a 12.83, b 12.82, c 12.67 | SE +/- 0.17, N = 3; SE +/- 0.12, N = 3
c2c | FFTW | float-long | 128 | c 59.47, b 55.63, a 54.95 | SE +/- 0.45, N = 9; SE +/- 0.42, N = 10
c2c | Stock | double | 256 | a 13.06, b 12.85, c 12.69 | SE +/- 0.13, N = 3; SE +/- 0.14, N = 3
r2c | FFTW | double-long | 256 | c 28.09, b 28.08, a 28.02 | SE +/- 0.06, N = 3; SE +/- 0.09, N = 3
r2c | FFTW | double | 256 | a 28.35, c 28.19, b 27.67 | SE +/- 0.14, N = 3; SE +/- 0.25, N = 3
r2c | Stock | double-long | 256 | c 34.48, b 34.28, a 34.17 | SE +/- 0.17, N = 3; SE +/- 0.09, N = 3
r2c | Stock | double | 256 | c 34.58, a 34.41, b 34.38 | SE +/- 0.12, N = 3; SE +/- 0.11, N = 3
c2c | FFTW | float-long | 256 | b 34.79, c 34.77, a 34.69 | SE +/- 0.47, N = 3; SE +/- 0.16, N = 3
c2c | FFTW | float | 256 | a 35.25, b 35.19, c 35.01 | SE +/- 0.04, N = 3; SE +/- 0.20, N = 3
c2c | Stock | float-long | 256 | b 37.86, c 37.81, a 37.75 | SE +/- 0.04, N = 3; SE +/- 0.23, N = 3
c2c | Stock | float | 256 | c 38.30, b 37.97, a 37.83 | SE +/- 0.35, N = 3; SE +/- 0.29, N = 3
r2c | FFTW | float-long | 256 | b 82.70, a 82.39, c 81.13 | SE +/- 0.40, N = 3; SE +/- 0.20, N = 3
r2c | FFTW | float | 256 | a 85.10, b 83.36, c 83.14 | SE +/- 0.47, N = 3; SE +/- 0.64, N = 3
r2c | Stock | float | 256 | b 97.60, a 95.75, c 93.80 | SE +/- 0.29, N = 3; SE +/- 0.22, N = 3
c2c | Stock | double | 128 | a 29.11, b 28.58, c 28.33 | SE +/- 0.31, N = 3; SE +/- 0.41, N = 3
c2c | Stock | double-long | 128 | a 28.87, c 28.73, b 28.70 | SE +/- 0.12, N = 3; SE +/- 0.34, N = 3
r2c | Stock | double | 128 | b 48.58, a 48.30, c 48.18 | SE +/- 0.52, N = 3; SE +/- 0.44, N = 3
r2c | Stock | float-long | 128 | a 81.12, b 80.87, c 80.86 | SE +/- 0.29, N = 3; SE +/- 0.24, N = 3
r2c | Stock | double-long | 128 | b 48.51, c 48.14, a 47.59 | SE +/- 0.40, N = 3; SE +/- 0.69, N = 3
c2c | Stock | float-long | 128 | c 50.73, b 50.54, a 50.17 | SE +/- 0.12, N = 3; SE +/- 0.11, N = 3
r2c | Stock | float | 128 | a 82.37, c 80.78, b 79.86 | SE +/- 0.06, N = 3; SE +/- 0.45, N = 3
c2c | Stock | float | 128 | a 49.92, b 49.75, c 47.11 | SE +/- 0.22, N = 3; SE +/- 0.34, N = 3
r2c | FFTW | float-long | 128 | a 91.31, b 91.23, c 89.81 | SE +/- 0.39, N = 3; SE +/- 0.16, N = 3
r2c | FFTW | float | 128 | a 92.24, c 91.99, b 91.53 | SE +/- 0.34, N = 3; SE +/- 0.58, N = 3

Phoronix Test Suite v10.8.5