cloverleaf threadripper
AMD Ryzen Threadripper 3990X 64-Core testing with a Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2310271-PTS-CLOVERLE91&grs&sro
System details (shared by configurations a, b, c)
Processor: AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads)
Motherboard: Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS)
Chipset: AMD Starship/Matisse
Memory: 128GB
Disk: Samsung SSD 970 EVO Plus 500GB
Graphics: AMD Radeon RX 5700 8GB (1750/875MHz)
Audio: AMD Navi 10 HDMI Audio
Monitor: DELL P2415Q
Network: Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 23.04
Kernel: 6.2.0-34-generic (x86_64)
Desktop: GNOME Shell 44.3
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 23.0.2 (LLVM 15.0.7 DRM 3.49)
Compiler: GCC 12.3.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0x830107a

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection
- spec_rstack_overflow: Mitigation of safe RET
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected
Result overview (heffte rows in GFLOP/s, more is better; cloverleaf rows in seconds, fewer is better)
Test, a, b, c
heffte: c2c - FFTW - float - 128, 55.8176, 55.0641, 59.611
heffte: c2c - FFTW - float-long - 128, 54.9524, 55.6265, 59.4715
heffte: c2c - Stock - float - 128, 49.9233, 49.7466, 47.1137
heffte: r2c - Stock - float - 256, 95.7520, 97.5981, 93.8026
heffte: r2c - Stock - float - 128, 82.3672, 79.8604, 80.7837
heffte: c2c - Stock - double - 256, 13.0608, 12.8450, 12.6881
heffte: c2c - Stock - double - 128, 29.1052, 28.5774, 28.3257
heffte: r2c - FFTW - double - 256, 28.3524, 27.6720, 28.1929
heffte: r2c - FFTW - float - 256, 85.1009, 83.3580, 83.141
heffte: r2c - FFTW - float-long - 256, 82.3872, 82.7044, 81.1273
heffte: r2c - Stock - double-long - 128, 47.5897, 48.5144, 48.14
heffte: c2c - FFTW - double - 256, 12.8747, 12.6387, 12.6566
heffte: r2c - FFTW - float-long - 128, 91.3115, 91.2275, 89.8136
heffte: c2c - Stock - double-long - 256, 12.8346, 12.8246, 12.6667
heffte: c2c - Stock - float - 256, 37.8262, 37.9740, 38.3008
heffte: c2c - Stock - float-long - 128, 50.1664, 50.5380, 50.7326
heffte: r2c - Stock - double-long - 256, 34.1682, 34.2844, 34.4753
heffte: r2c - Stock - double - 128, 48.2953, 48.5824, 48.1773
heffte: r2c - Stock - float-long - 256, 95.5643, 96.2684, 95.4689
heffte: c2c - FFTW - double-long - 256, 12.7467, 12.7313, 12.6448
heffte: r2c - FFTW - float - 128, 92.2391, 91.5295, 91.992
heffte: c2c - FFTW - float - 256, 35.2477, 35.1868, 35.0137
heffte: r2c - Stock - double - 256, 34.4148, 34.3795, 34.5829
heffte: c2c - Stock - double-long - 128, 28.8662, 28.6991, 28.7278
heffte: r2c - Stock - float - 512, 48.1639, 47.9281, 47.9948
heffte: r2c - Stock - float-long - 512, 48.2998, 48.0933, 48.2783
heffte: r2c - FFTW - float-long - 512, 44.2387, 44.1198, 44.2775
heffte: r2c - Stock - float-long - 128, 81.1234, 80.8669, 80.8598
heffte: r2c - FFTW - float - 512, 44.1653, 44.1977, 44.0628
heffte: c2c - FFTW - float-long - 256, 34.6871, 34.7931, 34.7741
heffte: c2c - Stock - float-long - 256, 37.7505, 37.8566, 37.808
heffte: r2c - FFTW - double-long - 256, 28.0234, 28.0753, 28.0944
heffte: c2c - Stock - float-long - 512, 23.7094, 23.6585, 23.689
heffte: c2c - FFTW - float-long - 512, 23.6062, 23.6093, 23.5665
heffte: r2c - FFTW - double-long - 512, 22.3108, 22.3188, 22.3489
heffte: r2c - FFTW - double - 512, 22.2806, 22.2975, 22.3154
heffte: c2c - Stock - double-long - 512, 12.4059, 12.4035, 12.421
heffte: r2c - FFTW - float - 1024, 48.4776, 48.4453, 48.5125
heffte: r2c - Stock - double-long - 1024, 26.9875, 26.9768, 27.0114
cloverleaf: clover_bm64_short, 144.51, 144.65, 144.69
heffte: r2c - Stock - double - 1024, 26.9747, 26.9774, 27.0081
heffte: c2c - FFTW - float - 1024, 27.1341, 27.1177, 27.1493
heffte: r2c - Stock - float-long - 1024, 52.3467, 52.3830, 52.4069
heffte: r2c - Stock - double-long - 512, 24.1192, 24.1468, 24.1199
heffte: r2c - FFTW - double-long - 1024, 24.9248, 24.9204, 24.9473
heffte: c2c - Stock - float-long - 1024, 27.1819, 27.2048, 27.2101
heffte: r2c - FFTW - double - 1024, 24.9067, 24.9146, 24.9303
heffte: c2c - FFTW - float - 512, 23.5486, 23.5270, 23.5485
heffte: c2c - FFTW - double - 512, 12.3492, 12.3379, 12.3492
heffte: r2c - Stock - float - 1024, 52.2726, 52.2634, 52.3086
heffte: c2c - Stock - float - 512, 23.6449, 23.6339, 23.6284
heffte: c2c - FFTW - double-long - 512, 12.3421, 12.3489, 12.3427
heffte: c2c - Stock - float - 1024, 27.1235, 27.1160, 27.1292
cloverleaf: clover_bm16, 1253.12, 1252.53, 1252.61
heffte: r2c - Stock - double - 512, 24.0929, 24.0988, 24.089
heffte: r2c - FFTW - float-long - 1024, 48.5744, 48.5647, 48.571
heffte: c2c - Stock - double - 512, 12.4020, 12.4030, 12.401
heffte: c2c - FFTW - float-long - 1024, 27.1810, 27.1785, 27.1777
heffte: r2c - FFTW - double-long - 128, 25.0364, 28.2660, 18.3771
heffte: c2c - FFTW - double-long - 128, 18.8907, 19.2424, 27.2693
heffte: r2c - FFTW - double - 128, 26.0046, 27.2799, 17.3744
heffte: c2c - FFTW - double - 128, 20.9992, 19.8631, 23.9639
cloverleaf: clover_bm, 17.54, 16.13, 17.00
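One way to read the overview is to ask how far the three configurations disagree on each test. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper names `spread_percent` and `noisy_tests` and the 5% threshold are assumptions chosen for illustration, and only three rows are copied from the overview above.

```python
from statistics import mean

# Rows copied from the result overview: (a, b, c) per test.
results = {
    "heffte: c2c - FFTW - float - 128": (55.8176, 55.0641, 59.611),
    "heffte: r2c - FFTW - double-long - 128": (25.0364, 28.2660, 18.3771),
    "cloverleaf: clover_bm16": (1253.12, 1252.53, 1252.61),
}


def spread_percent(values):
    """Return (max - min) as a percentage of the mean of the values."""
    return (max(values) - min(values)) / mean(values) * 100.0


def noisy_tests(table, threshold=5.0):
    """Names of tests whose spread across configurations exceeds threshold percent.

    The 5% default is an arbitrary illustrative cut-off, not a PTS setting.
    """
    return [name for name, vals in table.items() if spread_percent(vals) > threshold]


if __name__ == "__main__":
    for name, vals in results.items():
        print(f"{name}: spread {spread_percent(vals):.2f}%")
    print("Above threshold:", noisy_tests(results))
```

Because the spread is a relative measure, it works the same way for GFLOP/s rows and for seconds rows; it says nothing about which configuration is better.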
Per-test results (OpenBenchmarking.org graphs; values listed for a, b, c, followed by the SE and N figures reported with each graph)
HeFFTe - Highly Efficient FFT for Exascale 2.4: GFLOP/s, More Is Better. 1. (CXX) g++ options: -O3
CloverLeaf 1.3: Seconds, Fewer Is Better. 1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp

HeFFTe, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128. a: 55.82, b: 55.06, c: 59.61. SE +/- 0.34, N = 15; SE +/- 0.41, N = 13
HeFFTe, Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 128. a: 54.95, b: 55.63, c: 59.47. SE +/- 0.42, N = 10; SE +/- 0.45, N = 9
HeFFTe, Test: c2c - Backend: Stock - Precision: float - X Y Z: 128. a: 49.92, b: 49.75, c: 47.11. SE +/- 0.22, N = 3; SE +/- 0.34, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: float - X Y Z: 256. a: 95.75, b: 97.60, c: 93.80. SE +/- 0.22, N = 3; SE +/- 0.29, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: float - X Y Z: 128. a: 82.37, b: 79.86, c: 80.78. SE +/- 0.06, N = 3; SE +/- 0.45, N = 3
HeFFTe, Test: c2c - Backend: Stock - Precision: double - X Y Z: 256. a: 13.06, b: 12.85, c: 12.69. SE +/- 0.13, N = 3; SE +/- 0.14, N = 3
HeFFTe, Test: c2c - Backend: Stock - Precision: double - X Y Z: 128. a: 29.11, b: 28.58, c: 28.33. SE +/- 0.31, N = 3; SE +/- 0.41, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256. a: 28.35, b: 27.67, c: 28.19. SE +/- 0.14, N = 3; SE +/- 0.25, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256. a: 85.10, b: 83.36, c: 83.14. SE +/- 0.47, N = 3; SE +/- 0.64, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 256. a: 82.39, b: 82.70, c: 81.13. SE +/- 0.20, N = 3; SE +/- 0.40, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128. a: 47.59, b: 48.51, c: 48.14. SE +/- 0.69, N = 3; SE +/- 0.40, N = 3
HeFFTe, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256. a: 12.87, b: 12.64, c: 12.66. SE +/- 0.06, N = 3; SE +/- 0.03, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 128. a: 91.31, b: 91.23, c: 89.81. SE +/- 0.39, N = 3; SE +/- 0.16, N = 3
HeFFTe, Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256. a: 12.83, b: 12.82, c: 12.67. SE +/- 0.17, N = 3; SE +/- 0.12, N = 3
HeFFTe, Test: c2c - Backend: Stock - Precision: float - X Y Z: 256. a: 37.83, b: 37.97, c: 38.30. SE +/- 0.29, N = 3; SE +/- 0.35, N = 3
HeFFTe, Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 128. a: 50.17, b: 50.54, c: 50.73. SE +/- 0.11, N = 3; SE +/- 0.12, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256. a: 34.17, b: 34.28, c: 34.48. SE +/- 0.09, N = 3; SE +/- 0.17, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: double - X Y Z: 128. a: 48.30, b: 48.58, c: 48.18. SE +/- 0.44, N = 3; SE +/- 0.52, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 256. a: 95.56, b: 96.27, c: 95.47. SE +/- 0.68, N = 12; SE +/- 0.91, N = 3
HeFFTe, Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256. a: 12.75, b: 12.73, c: 12.64. SE +/- 0.05, N = 3; SE +/- 0.10, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128. a: 92.24, b: 91.53, c: 91.99. SE +/- 0.34, N = 3; SE +/- 0.58, N = 3
HeFFTe, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256. a: 35.25, b: 35.19, c: 35.01. SE +/- 0.04, N = 3; SE +/- 0.20, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: double - X Y Z: 256. a: 34.41, b: 34.38, c: 34.58. SE +/- 0.12, N = 3; SE +/- 0.11, N = 3
HeFFTe, Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128. a: 28.87, b: 28.70, c: 28.73. SE +/- 0.12, N = 3; SE +/- 0.34, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: float - X Y Z: 512. a: 48.16, b: 47.93, c: 47.99. SE +/- 0.14, N = 3; SE +/- 0.07, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 512. a: 48.30, b: 48.09, c: 48.28. SE +/- 0.23, N = 3; SE +/- 0.05, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 512. a: 44.24, b: 44.12, c: 44.28. SE +/- 0.01, N = 3; SE +/- 0.03, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 128. a: 81.12, b: 80.87, c: 80.86. SE +/- 0.29, N = 3; SE +/- 0.24, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512. a: 44.17, b: 44.20, c: 44.06. SE +/- 0.05, N = 3; SE +/- 0.08, N = 3
HeFFTe, Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 256. a: 34.69, b: 34.79, c: 34.77. SE +/- 0.16, N = 3; SE +/- 0.47, N = 3
HeFFTe, Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256. a: 37.75, b: 37.86, c: 37.81. SE +/- 0.23, N = 3; SE +/- 0.04, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256. a: 28.02, b: 28.08, c: 28.09. SE +/- 0.09, N = 3; SE +/- 0.06, N = 3
HeFFTe, Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 512. a: 23.71, b: 23.66, c: 23.69. SE +/- 0.02, N = 3; SE +/- 0.01, N = 3
HeFFTe, Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 512. a: 23.61, b: 23.61, c: 23.57. SE +/- 0.02, N = 3; SE +/- 0.00, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512. a: 22.31, b: 22.32, c: 22.35. SE +/- 0.02, N = 3; SE +/- 0.01, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512. a: 22.28, b: 22.30, c: 22.32. SE +/- 0.00, N = 3; SE +/- 0.02, N = 3
HeFFTe, Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 512. a: 12.41, b: 12.40, c: 12.42. SE +/- 0.01, N = 3; SE +/- 0.00, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: float - X Y Z: 1024. a: 48.48, b: 48.45, c: 48.51. SE +/- 0.01, N = 3; SE +/- 0.05, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 1024. a: 26.99, b: 26.98, c: 27.01. SE +/- 0.01, N = 3; SE +/- 0.02, N = 3
CloverLeaf, Input: clover_bm64_short. a: 144.51, b: 144.65, c: 144.69. SE +/- 0.02, N = 3; SE +/- 0.02, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: double - X Y Z: 1024. a: 26.97, b: 26.98, c: 27.01. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3
HeFFTe, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 1024. a: 27.13, b: 27.12, c: 27.15. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 1024. a: 52.35, b: 52.38, c: 52.41. SE +/- 0.02, N = 3; SE +/- 0.02, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 512. a: 24.12, b: 24.15, c: 24.12. SE +/- 0.01, N = 3; SE +/- 0.04, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 1024. a: 24.92, b: 24.92, c: 24.95. SE +/- 0.00, N = 3; SE +/- 0.01, N = 3
HeFFTe, Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 1024. a: 27.18, b: 27.20, c: 27.21. SE +/- 0.02, N = 3; SE +/- 0.01, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 1024. a: 24.91, b: 24.91, c: 24.93. SE +/- 0.01, N = 3; SE +/- 0.00, N = 3
HeFFTe, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512. a: 23.55, b: 23.53, c: 23.55. SE +/- 0.01, N = 3; SE +/- 0.02, N = 3
HeFFTe, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512. a: 12.35, b: 12.34, c: 12.35. SE +/- 0.01, N = 3; SE +/- 0.00, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: float - X Y Z: 1024. a: 52.27, b: 52.26, c: 52.31. SE +/- 0.02, N = 3; SE +/- 0.02, N = 3
HeFFTe, Test: c2c - Backend: Stock - Precision: float - X Y Z: 512. a: 23.64, b: 23.63, c: 23.63. SE +/- 0.01, N = 3; SE +/- 0.04, N = 3
HeFFTe, Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 512. a: 12.34, b: 12.35, c: 12.34. SE +/- 0.00, N = 3; SE +/- 0.00, N = 3
HeFFTe, Test: c2c - Backend: Stock - Precision: float - X Y Z: 1024. a: 27.12, b: 27.12, c: 27.13. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3
CloverLeaf, Input: clover_bm16. a: 1253.12, b: 1252.53, c: 1252.61. SE +/- 1.27, N = 3; SE +/- 0.09, N = 3
HeFFTe, Test: r2c - Backend: Stock - Precision: double - X Y Z: 512. a: 24.09, b: 24.10, c: 24.09. SE +/- 0.02, N = 3; SE +/- 0.01, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 1024. a: 48.57, b: 48.56, c: 48.57. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3
HeFFTe, Test: c2c - Backend: Stock - Precision: double - X Y Z: 512. a: 12.40, b: 12.40, c: 12.40. SE +/- 0.01, N = 3; SE +/- 0.00, N = 3
HeFFTe, Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 1024. a: 27.18, b: 27.18, c: 27.18. SE +/- 0.01, N = 3; SE +/- 0.01, N = 3
HeFFTe, Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128. a: 25.04, b: 28.27, c: 18.38. SE +/- 3.07, N = 12; SE +/- 2.45, N = 15
HeFFTe, Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128. a: 18.89, b: 19.24, c: 27.27. SE +/- 1.04, N = 15; SE +/- 1.41, N = 15
HeFFTe, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128. a: 26.00, b: 27.28, c: 17.37. SE +/- 2.70, N = 15; SE +/- 2.33, N = 15
HeFFTe, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128. a: 21.00, b: 19.86, c: 23.96. SE +/- 1.55, N = 12; SE +/- 1.31, N = 15
CloverLeaf, Input: clover_bm. a: 17.54, b: 16.13, c: 17.00. SE +/- 1.44, N = 12; SE +/- 0.25, N = 15
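
The "SE +/- x, N = n" figures above read as a standard error over n runs. As a rough illustration, and assuming the usual definition (sample standard deviation divided by the square root of N), the calculation can be sketched as below. The export does not include per-run samples, so the timings in the example are hypothetical, and `standard_error` is an illustrative helper rather than a Phoronix Test Suite function.

```python
import math
from statistics import stdev


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    if len(samples) < 2:
        raise ValueError("need at least two runs to estimate a standard error")
    return stdev(samples) / math.sqrt(len(samples))


if __name__ == "__main__":
    # Hypothetical per-run timings in seconds (NOT taken from this result file).
    runs = [16.0, 17.0, 18.0]
    print(f"SE +/- {standard_error(runs):.2f}, N = {len(runs)}")
```

Under this definition, a large SE at a high N, as in the FFTW double-precision 128 rows, indicates that individual runs varied widely even after many repetitions.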
Phoronix Test Suite v10.8.5