cloverleaf threadripper AMD Ryzen Threadripper 3990X 64-Core testing with a Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2310271-PTS-CLOVERLE91&sor&gru .
cloverleaf threadripper Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server OpenGL Compiler File-System Screen Resolution a b c AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads) Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS) AMD Starship/Matisse 128GB Samsung SSD 970 EVO Plus 500GB AMD Radeon RX 5700 8GB (1750/875MHz) AMD Navi 10 HDMI Audio DELL P2415Q Intel I211 + Intel Wi-Fi 6 AX200 Ubuntu 23.04 6.2.0-34-generic (x86_64) GNOME Shell 44.3 X Server + Wayland 4.6 Mesa 23.0.2 (LLVM 15.0.7 DRM 3.49) GCC 12.3.0 ext4 3840x2160 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x830107a Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
cloverleaf threadripper heffte: c2c - FFTW - float - 128 heffte: c2c - FFTW - float - 256 heffte: c2c - FFTW - float - 512 heffte: r2c - FFTW - float - 128 heffte: r2c - FFTW - float - 256 heffte: r2c - FFTW - float - 512 heffte: c2c - FFTW - double - 128 heffte: c2c - FFTW - double - 256 heffte: c2c - FFTW - double - 512 heffte: c2c - FFTW - float - 1024 heffte: c2c - Stock - float - 128 heffte: c2c - Stock - float - 256 heffte: c2c - Stock - float - 512 heffte: r2c - FFTW - double - 128 heffte: r2c - FFTW - double - 256 heffte: r2c - FFTW - double - 512 heffte: r2c - FFTW - float - 1024 heffte: r2c - Stock - float - 128 heffte: r2c - Stock - float - 256 heffte: r2c - Stock - float - 512 heffte: c2c - Stock - double - 128 heffte: c2c - Stock - double - 256 heffte: c2c - Stock - double - 512 heffte: c2c - Stock - float - 1024 heffte: r2c - FFTW - double - 1024 heffte: r2c - Stock - double - 128 heffte: r2c - Stock - double - 256 heffte: r2c - Stock - double - 512 heffte: r2c - Stock - float - 1024 heffte: r2c - Stock - double - 1024 heffte: c2c - FFTW - float-long - 128 heffte: c2c - FFTW - float-long - 256 heffte: c2c - FFTW - float-long - 512 heffte: r2c - FFTW - float-long - 128 heffte: r2c - FFTW - float-long - 256 heffte: r2c - FFTW - float-long - 512 heffte: c2c - FFTW - double-long - 128 heffte: c2c - FFTW - double-long - 256 heffte: c2c - FFTW - double-long - 512 heffte: c2c - FFTW - float-long - 1024 heffte: c2c - Stock - float-long - 128 heffte: c2c - Stock - float-long - 256 heffte: c2c - Stock - float-long - 512 heffte: r2c - FFTW - double-long - 128 heffte: r2c - FFTW - double-long - 256 heffte: r2c - FFTW - double-long - 512 heffte: r2c - FFTW - float-long - 1024 heffte: r2c - Stock - float-long - 128 heffte: r2c - Stock - float-long - 256 heffte: r2c - Stock - float-long - 512 heffte: c2c - Stock - double-long - 128 heffte: c2c - Stock - double-long - 256 heffte: c2c - Stock - double-long - 512 heffte: c2c - Stock - float-long - 1024 heffte: r2c - FFTW - double-long - 1024 heffte: r2c - Stock - double-long - 128 heffte: r2c - Stock - double-long - 256 heffte: r2c - Stock - double-long - 512 heffte: r2c - Stock - float-long - 1024 heffte: r2c - Stock - double-long - 1024 cloverleaf: clover_bm cloverleaf: clover_bm16 cloverleaf: clover_bm64_short a b c 55.8176 35.2477 23.5486 92.2391 85.1009 44.1653 20.9992 12.8747 12.3492 27.1341 49.9233 37.8262 23.6449 26.0046 28.3524 22.2806 48.4776 82.3672 95.7520 48.1639 29.1052 13.0608 12.4020 27.1235 24.9067 48.2953 34.4148 24.0929 52.2726 26.9747 54.9524 34.6871 23.6062 91.3115 82.3872 44.2387 18.8907 12.7467 12.3421 27.1810 50.1664 37.7505 23.7094 25.0364 28.0234 22.3108 48.5744 81.1234 95.5643 48.2998 28.8662 12.8346 12.4059 27.1819 24.9248 47.5897 34.1682 24.1192 52.3467 26.9875 17.54 1253.12 144.51 55.0641 35.1868 23.5270 91.5295 83.3580 44.1977 19.8631 12.6387 12.3379 27.1177 49.7466 37.9740 23.6339 27.2799 27.6720 22.2975 48.4453 79.8604 97.5981 47.9281 28.5774 12.8450 12.4030 27.1160 24.9146 48.5824 34.3795 24.0988 52.2634 26.9774 55.6265 34.7931 23.6093 91.2275 82.7044 44.1198 19.2424 12.7313 12.3489 27.1785 50.5380 37.8566 23.6585 28.2660 28.0753 22.3188 48.5647 80.8669 96.2684 48.0933 28.6991 12.8246 12.4035 27.2048 24.9204 48.5144 34.2844 24.1468 52.3830 26.9768 16.13 1252.53 144.65 59.611 35.0137 23.5485 91.992 83.141 44.0628 23.9639 12.6566 12.3492 27.1493 47.1137 38.3008 23.6284 17.3744 28.1929 22.3154 48.5125 80.7837 93.8026 47.9948 28.3257 12.6881 12.401 27.1292 24.9303 48.1773 34.5829 24.089 52.3086 27.0081 59.4715 34.7741 23.5665 89.8136 81.1273 44.2775 27.2693 12.6448 12.3427 27.1777 50.7326 37.808 23.689 18.3771 28.0944 22.3489 48.571 80.8598 95.4689 48.2783 28.7278 12.6667 12.421 27.2101 24.9473 48.14 34.4753 24.1199 52.4069 27.0114 17.00 1252.61 144.69 OpenBenchmarking.org
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 c a b 13 26 39 52 65 SE +/- 0.34, N = 15 SE +/- 0.41, N = 13 59.61 55.82 55.06 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 a b c 8 16 24 32 40 SE +/- 0.04, N = 3 SE +/- 0.20, N = 3 35.25 35.19 35.01 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512 a c b 6 12 18 24 30 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 23.55 23.55 23.53 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128 a c b 20 40 60 80 100 SE +/- 0.34, N = 3 SE +/- 0.58, N = 3 92.24 91.99 91.53 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 a b c 20 40 60 80 100 SE +/- 0.47, N = 3 SE +/- 0.64, N = 3 85.10 83.36 83.14 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512 b a c 10 20 30 40 50 SE +/- 0.08, N = 3 SE +/- 0.05, N = 3 44.20 44.17 44.06 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128 c a b 6 12 18 24 30 SE +/- 1.55, N = 12 SE +/- 1.31, N = 15 23.96 21.00 19.86 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256 a c b 3 6 9 12 15 SE +/- 0.06, N = 3 SE +/- 0.03, N = 3 12.87 12.66 12.64 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512 c a b 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 12.35 12.35 12.34 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 1024 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 1024 c a b 6 12 18 24 30 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 27.15 27.13 27.12 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: float - X Y Z: 128 a b c 11 22 33 44 55 SE +/- 0.22, N = 3 SE +/- 0.34, N = 3 49.92 49.75 47.11 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: float - X Y Z: 256 c b a 9 18 27 36 45 SE +/- 0.35, N = 3 SE +/- 0.29, N = 3 38.30 37.97 37.83 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: float - X Y Z: 512 a b c 6 12 18 24 30 SE +/- 0.01, N = 3 SE +/- 0.04, N = 3 23.64 23.63 23.63 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 b a c 6 12 18 24 30 SE +/- 2.33, N = 15 SE +/- 2.70, N = 15 27.28 26.00 17.37 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 a c b 7 14 21 28 35 SE +/- 0.14, N = 3 SE +/- 0.25, N = 3 28.35 28.19 27.67 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512 c b a 5 10 15 20 25 SE +/- 0.02, N = 3 SE +/- 0.00, N = 3 22.32 22.30 22.28 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float - X Y Z: 1024 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 1024 c a b 11 22 33 44 55 SE +/- 0.01, N = 3 SE +/- 0.05, N = 3 48.51 48.48 48.45 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: float - X Y Z: 128 a c b 20 40 60 80 100 SE +/- 0.06, N = 3 SE +/- 0.45, N = 3 82.37 80.78 79.86 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: float - X Y Z: 256 b a c 20 40 60 80 100 SE +/- 0.29, N = 3 SE +/- 0.22, N = 3 97.60 95.75 93.80 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: float - X Y Z: 512 a c b 11 22 33 44 55 SE +/- 0.14, N = 3 SE +/- 0.07, N = 3 48.16 47.99 47.93 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: double - X Y Z: 128 a b c 7 14 21 28 35 SE +/- 0.31, N = 3 SE +/- 0.41, N = 3 29.11 28.58 28.33 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: double - X Y Z: 256 a b c 3 6 9 12 15 SE +/- 0.13, N = 3 SE +/- 0.14, N = 3 13.06 12.85 12.69 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: double - X Y Z: 512 b a c 3 6 9 12 15 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 12.40 12.40 12.40 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float - X Y Z: 1024 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: float - X Y Z: 1024 c a b 6 12 18 24 30 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 27.13 27.12 27.12 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double - X Y Z: 1024 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 1024 c b a 6 12 18 24 30 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 24.93 24.91 24.91 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: double - X Y Z: 128 b a c 11 22 33 44 55 SE +/- 0.52, N = 3 SE +/- 0.44, N = 3 48.58 48.30 48.18 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: double - X Y Z: 256 c a b 8 16 24 32 40 SE +/- 0.12, N = 3 SE +/- 0.11, N = 3 34.58 34.41 34.38 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: double - X Y Z: 512 b a c 6 12 18 24 30 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 24.10 24.09 24.09 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float - X Y Z: 1024 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: float - X Y Z: 1024 c a b 12 24 36 48 60 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 52.31 52.27 52.26 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 1024 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: double - X Y Z: 1024 c b a 6 12 18 24 30 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 27.01 26.98 26.97 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 128 c b a 13 26 39 52 65 SE +/- 0.45, N = 9 SE +/- 0.42, N = 10 59.47 55.63 54.95 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 256 b c a 8 16 24 32 40 SE +/- 0.47, N = 3 SE +/- 0.16, N = 3 34.79 34.77 34.69 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 512 b a c 6 12 18 24 30 SE +/- 0.00, N = 3 SE +/- 0.02, N = 3 23.61 23.61 23.57 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 128 a b c 20 40 60 80 100 SE +/- 0.39, N = 3 SE +/- 0.16, N = 3 91.31 91.23 89.81 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 256 b a c 20 40 60 80 100 SE +/- 0.40, N = 3 SE +/- 0.20, N = 3 82.70 82.39 81.13 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 512 c a b 10 20 30 40 50 SE +/- 0.01, N = 3 SE +/- 0.03, N = 3 44.28 44.24 44.12 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128 c b a 6 12 18 24 30 SE +/- 1.41, N = 15 SE +/- 1.04, N = 15 27.27 19.24 18.89 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256 a b c 3 6 9 12 15 SE +/- 0.05, N = 3 SE +/- 0.10, N = 3 12.75 12.73 12.64 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 512 b c a 3 6 9 12 15 SE +/- 0.00, N = 3 SE +/- 0.00, N = 3 12.35 12.34 12.34 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 1024 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 1024 a b c 6 12 18 24 30 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 27.18 27.18 27.18 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 128 c b a 11 22 33 44 55 SE +/- 0.12, N = 3 SE +/- 0.11, N = 3 50.73 50.54 50.17 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256 b c a 9 18 27 36 45 SE +/- 0.04, N = 3 SE +/- 0.23, N = 3 37.86 37.81 37.75 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 512 a c b 6 12 18 24 30 SE +/- 0.02, N = 3 SE +/- 0.01, N = 3 23.71 23.69 23.66 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128 b a c 7 14 21 28 35 SE +/- 2.45, N = 15 SE +/- 3.07, N = 12 28.27 25.04 18.38 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256 c b a 7 14 21 28 35 SE +/- 0.06, N = 3 SE +/- 0.09, N = 3 28.09 28.08 28.02 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512 c b a 5 10 15 20 25 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 22.35 22.32 22.31 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 1024 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 1024 a c b 11 22 33 44 55 SE +/- 0.01, N = 3 SE +/- 0.01, N = 3 48.57 48.57 48.56 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 128 a b c 20 40 60 80 100 SE +/- 0.29, N = 3 SE +/- 0.24, N = 3 81.12 80.87 80.86 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 256 b a c 20 40 60 80 100 SE +/- 0.91, N = 3 SE +/- 0.68, N = 12 96.27 95.56 95.47 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 512 a c b 11 22 33 44 55 SE +/- 0.23, N = 3 SE +/- 0.05, N = 3 48.30 48.28 48.09 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128 a c b 7 14 21 28 35 SE +/- 0.12, N = 3 SE +/- 0.34, N = 3 28.87 28.73 28.70 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256 a b c 3 6 9 12 15 SE +/- 0.17, N = 3 SE +/- 0.12, N = 3 12.83 12.82 12.67 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 512 c a b 3 6 9 12 15 SE +/- 0.01, N = 3 SE +/- 0.00, N = 3 12.42 12.41 12.40 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 1024 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 1024 c b a 6 12 18 24 30 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 27.21 27.20 27.18 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 1024 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 1024 c a b 6 12 18 24 30 SE +/- 0.00, N = 3 SE +/- 0.01, N = 3 24.95 24.92 24.92 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128 b c a 11 22 33 44 55 SE +/- 0.40, N = 3 SE +/- 0.69, N = 3 48.51 48.14 47.59 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256 c b a 8 16 24 32 40 SE +/- 0.17, N = 3 SE +/- 0.09, N = 3 34.48 34.28 34.17 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 512 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 512 b c a 6 12 18 24 30 SE +/- 0.04, N = 3 SE +/- 0.01, N = 3 24.15 24.12 24.12 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 1024 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 1024 c b a 12 24 36 48 60 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 52.41 52.38 52.35 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 1024 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.4 Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 1024 c a b 6 12 18 24 30 SE +/- 0.01, N = 3 SE +/- 0.02, N = 3 27.01 26.99 26.98 1. (CXX) g++ options: -O3
CloverLeaf Input: clover_bm OpenBenchmarking.org Seconds, Fewer Is Better CloverLeaf 1.3 Input: clover_bm b c a 4 8 12 16 20 SE +/- 0.25, N = 15 SE +/- 1.44, N = 12 16.13 17.00 17.54 1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
CloverLeaf Input: clover_bm16 OpenBenchmarking.org Seconds, Fewer Is Better CloverLeaf 1.3 Input: clover_bm16 b c a 300 600 900 1200 1500 SE +/- 0.09, N = 3 SE +/- 1.27, N = 3 1252.53 1252.61 1253.12 1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
CloverLeaf Input: clover_bm64_short OpenBenchmarking.org Seconds, Fewer Is Better CloverLeaf 1.3 Input: clover_bm64_short a b c 30 60 90 120 150 SE +/- 0.02, N = 3 SE +/- 0.02, N = 3 144.51 144.65 144.69 1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
Phoronix Test Suite v10.8.5