cloverleaf threadripper
AMD Ryzen Threadripper 3990X 64-Core testing with a Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2310271-PTS-CLOVERLE91&grw&sro&rro .
cloverleaf threadripper: system details (identical for runs a, b, c)
Processor: AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads)
Motherboard: Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS)
Chipset: AMD Starship/Matisse
Memory: 128GB
Disk: Samsung SSD 970 EVO Plus 500GB
Graphics: AMD Radeon RX 5700 8GB (1750/875MHz)
Audio: AMD Navi 10 HDMI Audio
Monitor: DELL P2415Q
Network: Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 23.04
Kernel: 6.2.0-34-generic (x86_64)
Desktop: GNOME Shell 44.3
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 23.0.2 (LLVM 15.0.7 DRM 3.49)
Compiler: GCC 12.3.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details - Transparent Huge Pages: madvise
Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-DAPbBt/gcc-12-12.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x830107a
Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection + spec_rstack_overflow: Mitigation of safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
cloverleaf threadripper: results overview for runs a, b, c (heffte in GFLOP/s, more is better; cloverleaf in seconds, fewer is better)
heffte: r2c - FFTW - float - 512: a 44.1653, b 44.1977, c 44.0628
cloverleaf: clover_bm16: a 1253.12, b 1252.53, c 1252.61
heffte: r2c - Stock - double-long - 512: a 24.1192, b 24.1468, c 24.1199
heffte: c2c - FFTW - double - 128: a 20.9992, b 19.8631, c 23.9639
heffte: c2c - Stock - float-long - 1024: a 27.1819, b 27.2048, c 27.2101
heffte: c2c - Stock - float-long - 512: a 23.7094, b 23.6585, c 23.689
heffte: r2c - FFTW - float-long - 1024: a 48.5744, b 48.5647, c 48.571
heffte: r2c - FFTW - float-long - 256: a 82.3872, b 82.7044, c 81.1273
heffte: c2c - FFTW - double-long - 512: a 12.3421, b 12.3489, c 12.3427
heffte: r2c - FFTW - float - 256: a 85.1009, b 83.3580, c 83.141
heffte: c2c - FFTW - float-long - 128: a 54.9524, b 55.6265, c 59.4715
heffte: r2c - FFTW - float - 128: a 92.2391, b 91.5295, c 91.992
heffte: c2c - FFTW - float - 512: a 23.5486, b 23.5270, c 23.5485
heffte: c2c - FFTW - float - 256: a 35.2477, b 35.1868, c 35.0137
heffte: c2c - FFTW - float - 128: a 55.8176, b 55.0641, c 59.611
cloverleaf: clover_bm64_short: a 144.51, b 144.65, c 144.69
heffte: c2c - FFTW - float-long - 512: a 23.6062, b 23.6093, c 23.5665
heffte: c2c - FFTW - double - 256: a 12.8747, b 12.6387, c 12.6566
heffte: c2c - FFTW - double-long - 128: a 18.8907, b 19.2424, c 27.2693
heffte: c2c - FFTW - double - 512: a 12.3492, b 12.3379, c 12.3492
heffte: c2c - Stock - float-long - 128: a 50.1664, b 50.5380, c 50.7326
heffte: c2c - FFTW - float - 1024: a 27.1341, b 27.1177, c 27.1493
heffte: r2c - FFTW - double-long - 256: a 28.0234, b 28.0753, c 28.0944
heffte: c2c - Stock - float - 128: a 49.9233, b 49.7466, c 47.1137
heffte: r2c - Stock - float-long - 256: a 95.5643, b 96.2684, c 95.4689
heffte: c2c - Stock - float - 256: a 37.8262, b 37.9740, c 38.3008
heffte: c2c - Stock - double-long - 256: a 12.8346, b 12.8246, c 12.6667
heffte: c2c - Stock - float - 512: a 23.6449, b 23.6339, c 23.6284
heffte: r2c - Stock - double-long - 128: a 47.5897, b 48.5144, c 48.14
heffte: r2c - FFTW - double - 128: a 26.0046, b 27.2799, c 17.3744
heffte: r2c - FFTW - double - 256: a 28.3524, b 27.6720, c 28.1929
heffte: r2c - Stock - double - 1024: a 26.9747, b 26.9774, c 27.0081
heffte: r2c - FFTW - double - 512: a 22.2806, b 22.2975, c 22.3154
heffte: c2c - FFTW - float-long - 256: a 34.6871, b 34.7931, c 34.7741
heffte: r2c - FFTW - float - 1024: a 48.4776, b 48.4453, c 48.5125
heffte: r2c - FFTW - float-long - 128: a 91.3115, b 91.2275, c 89.8136
heffte: r2c - Stock - float - 128: a 82.3672, b 79.8604, c 80.7837
heffte: r2c - FFTW - float-long - 512: a 44.2387, b 44.1198, c 44.2775
heffte: r2c - Stock - float - 256: a 95.7520, b 97.5981, c 93.8026
heffte: c2c - FFTW - double-long - 256: a 12.7467, b 12.7313, c 12.6448
heffte: r2c - Stock - float - 512: a 48.1639, b 47.9281, c 47.9948
heffte: c2c - FFTW - float-long - 1024: a 27.1810, b 27.1785, c 27.1777
heffte: c2c - Stock - float-long - 256: a 37.7505, b 37.8566, c 37.808
heffte: c2c - Stock - double - 128: a 29.1052, b 28.5774, c 28.3257
heffte: r2c - FFTW - double-long - 128: a 25.0364, b 28.2660, c 18.3771
heffte: c2c - Stock - double - 256: a 13.0608, b 12.8450, c 12.6881
heffte: r2c - FFTW - double-long - 512: a 22.3108, b 22.3188, c 22.3489
heffte: c2c - Stock - double - 512: a 12.4020, b 12.4030, c 12.401
heffte: r2c - Stock - float-long - 128: a 81.1234, b 80.8669, c 80.8598
heffte: c2c - Stock - float - 1024: a 27.1235, b 27.1160, c 27.1292
heffte: r2c - Stock - float-long - 512: a 48.2998, b 48.0933, c 48.2783
heffte: r2c - FFTW - double - 1024: a 24.9067, b 24.9146, c 24.9303
heffte: c2c - Stock - double-long - 128: a 28.8662, b 28.6991, c 28.7278
heffte: r2c - Stock - double - 128: a 48.2953, b 48.5824, c 48.1773
heffte: c2c - Stock - double-long - 512: a 12.4059, b 12.4035, c 12.421
heffte: r2c - Stock - double - 256: a 34.4148, b 34.3795, c 34.5829
heffte: r2c - FFTW - double-long - 1024: a 24.9248, b 24.9204, c 24.9473
heffte: r2c - Stock - double - 512: a 24.0929, b 24.0988, c 24.089
heffte: r2c - Stock - double-long - 256: a 34.1682, b 34.2844, c 34.4753
heffte: r2c - Stock - float - 1024: a 52.2726, b 52.2634, c 52.3086
heffte: r2c - Stock - float-long - 1024: a 52.3467, b 52.3830, c 52.4069
heffte: r2c - Stock - double-long - 1024: a 26.9875, b 26.9768, c 27.0114
cloverleaf: clover_bm: a 17.54, b 16.13, c 17.00
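Most tests in the overview agree closely across runs a, b and c, while a few of the 128-point FFTW cases differ by a wide margin. A minimal sketch of how that spread could be quantified follows, using four rows copied from the overview above. The helper names `spread_percent` and `noisy_tests` and the 5% threshold are illustrative assumptions; they are not part of the Phoronix Test Suite or of this result file.

```python
from statistics import mean

# Values copied from the overview above, in run order (a, b, c).
results = {
    "heffte: r2c - FFTW - float - 512": (44.1653, 44.1977, 44.0628),
    "heffte: r2c - FFTW - double - 128": (26.0046, 27.2799, 17.3744),
    "heffte: c2c - FFTW - double-long - 128": (18.8907, 19.2424, 27.2693),
    "cloverleaf: clover_bm16": (1253.12, 1252.53, 1252.61),
}


def spread_percent(values):
    """Return (max - min) as a percentage of the mean of the runs."""
    return 100.0 * (max(values) - min(values)) / mean(values)


def noisy_tests(table, threshold=5.0):
    """Return names of tests whose run-to-run spread exceeds `threshold` percent.

    The 5% default is an arbitrary illustrative cut-off.
    """
    return [name for name, vals in table.items() if spread_percent(vals) > threshold]


if __name__ == "__main__":
    for name, vals in results.items():
        print(f"{name}: spread {spread_percent(vals):.2f}%")
    print("above threshold:", noisy_tests(results))
```

With these four rows, only the two 128-point FFTW double-precision cases exceed the threshold; they are also among the tests for which the per-test results report N = 12 or N = 15 rather than N = 3.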
Per-test results (OpenBenchmarking.org)
HeFFTe - Highly Efficient FFT for Exascale 2.4: GFLOP/s, More Is Better. 1. (CXX) g++ options: -O3
CloverLeaf 1.3: Seconds, Fewer Is Better. 1. (F9X) gfortran options: -O3 -march=native -funroll-loops -fopenmp
Each entry lists runs c, b, a in chart order, followed by the SE and N figures as they appear in the export.

HeFFTe, Test: r2c - Backend: FFTW - Precision: float - X Y Z: 512: c 44.06, b 44.20, a 44.17 (SE +/- 0.08, N = 3; SE +/- 0.05, N = 3)
CloverLeaf, Input: clover_bm16: c 1252.61, b 1252.53, a 1253.12 (SE +/- 0.09, N = 3; SE +/- 1.27, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 512: c 24.12, b 24.15, a 24.12 (SE +/- 0.04, N = 3; SE +/- 0.01, N = 3)
HeFFTe, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128: c 23.96, b 19.86, a 21.00 (SE +/- 1.31, N = 15; SE +/- 1.55, N = 12)
HeFFTe, Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 1024: c 27.21, b 27.20, a 27.18 (SE +/- 0.01, N = 3; SE +/- 0.02, N = 3)
HeFFTe, Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 512: c 23.69, b 23.66, a 23.71 (SE +/- 0.01, N = 3; SE +/- 0.02, N = 3)
HeFFTe, Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 1024: c 48.57, b 48.56, a 48.57 (SE +/- 0.01, N = 3; SE +/- 0.01, N = 3)
HeFFTe, Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 256: c 81.13, b 82.70, a 82.39 (SE +/- 0.40, N = 3; SE +/- 0.20, N = 3)
HeFFTe, Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 512: c 12.34, b 12.35, a 12.34 (SE +/- 0.00, N = 3; SE +/- 0.00, N = 3)
HeFFTe, Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256: c 83.14, b 83.36, a 85.10 (SE +/- 0.64, N = 3; SE +/- 0.47, N = 3)
HeFFTe, Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 128: c 59.47, b 55.63, a 54.95 (SE +/- 0.45, N = 9; SE +/- 0.42, N = 10)
HeFFTe, Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128: c 91.99, b 91.53, a 92.24 (SE +/- 0.58, N = 3; SE +/- 0.34, N = 3)
HeFFTe, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 512: c 23.55, b 23.53, a 23.55 (SE +/- 0.02, N = 3; SE +/- 0.01, N = 3)
HeFFTe, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256: c 35.01, b 35.19, a 35.25 (SE +/- 0.20, N = 3; SE +/- 0.04, N = 3)
HeFFTe, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128: c 59.61, b 55.06, a 55.82 (SE +/- 0.41, N = 13; SE +/- 0.34, N = 15)
CloverLeaf, Input: clover_bm64_short: c 144.69, b 144.65, a 144.51 (SE +/- 0.02, N = 3; SE +/- 0.02, N = 3)
HeFFTe, Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 512: c 23.57, b 23.61, a 23.61 (SE +/- 0.00, N = 3; SE +/- 0.02, N = 3)
HeFFTe, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256: c 12.66, b 12.64, a 12.87 (SE +/- 0.03, N = 3; SE +/- 0.06, N = 3)
HeFFTe, Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128: c 27.27, b 19.24, a 18.89 (SE +/- 1.41, N = 15; SE +/- 1.04, N = 15)
HeFFTe, Test: c2c - Backend: FFTW - Precision: double - X Y Z: 512: c 12.35, b 12.34, a 12.35 (SE +/- 0.00, N = 3; SE +/- 0.01, N = 3)
HeFFTe, Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 128: c 50.73, b 50.54, a 50.17 (SE +/- 0.12, N = 3; SE +/- 0.11, N = 3)
HeFFTe, Test: c2c - Backend: FFTW - Precision: float - X Y Z: 1024: c 27.15, b 27.12, a 27.13 (SE +/- 0.01, N = 3; SE +/- 0.01, N = 3)
HeFFTe, Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256: c 28.09, b 28.08, a 28.02 (SE +/- 0.06, N = 3; SE +/- 0.09, N = 3)
HeFFTe, Test: c2c - Backend: Stock - Precision: float - X Y Z: 128: c 47.11, b 49.75, a 49.92 (SE +/- 0.34, N = 3; SE +/- 0.22, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 256: c 95.47, b 96.27, a 95.56 (SE +/- 0.91, N = 3; SE +/- 0.68, N = 12)
HeFFTe, Test: c2c - Backend: Stock - Precision: float - X Y Z: 256: c 38.30, b 37.97, a 37.83 (SE +/- 0.35, N = 3; SE +/- 0.29, N = 3)
HeFFTe, Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256: c 12.67, b 12.82, a 12.83 (SE +/- 0.12, N = 3; SE +/- 0.17, N = 3)
HeFFTe, Test: c2c - Backend: Stock - Precision: float - X Y Z: 512: c 23.63, b 23.63, a 23.64 (SE +/- 0.04, N = 3; SE +/- 0.01, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128: c 48.14, b 48.51, a 47.59 (SE +/- 0.40, N = 3; SE +/- 0.69, N = 3)
HeFFTe, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128: c 17.37, b 27.28, a 26.00 (SE +/- 2.33, N = 15; SE +/- 2.70, N = 15)
HeFFTe, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256: c 28.19, b 27.67, a 28.35 (SE +/- 0.25, N = 3; SE +/- 0.14, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: double - X Y Z: 1024: c 27.01, b 26.98, a 26.97 (SE +/- 0.01, N = 3; SE +/- 0.01, N = 3)
HeFFTe, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 512: c 22.32, b 22.30, a 22.28 (SE +/- 0.02, N = 3; SE +/- 0.00, N = 3)
HeFFTe, Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 256: c 34.77, b 34.79, a 34.69 (SE +/- 0.47, N = 3; SE +/- 0.16, N = 3)
HeFFTe, Test: r2c - Backend: FFTW - Precision: float - X Y Z: 1024: c 48.51, b 48.45, a 48.48 (SE +/- 0.05, N = 3; SE +/- 0.01, N = 3)
HeFFTe, Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 128: c 89.81, b 91.23, a 91.31 (SE +/- 0.16, N = 3; SE +/- 0.39, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: float - X Y Z: 128: c 80.78, b 79.86, a 82.37 (SE +/- 0.45, N = 3; SE +/- 0.06, N = 3)
HeFFTe, Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 512: c 44.28, b 44.12, a 44.24 (SE +/- 0.03, N = 3; SE +/- 0.01, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: float - X Y Z: 256: c 93.80, b 97.60, a 95.75 (SE +/- 0.29, N = 3; SE +/- 0.22, N = 3)
HeFFTe, Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256: c 12.64, b 12.73, a 12.75 (SE +/- 0.10, N = 3; SE +/- 0.05, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: float - X Y Z: 512: c 47.99, b 47.93, a 48.16 (SE +/- 0.07, N = 3; SE +/- 0.14, N = 3)
HeFFTe, Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 1024: c 27.18, b 27.18, a 27.18 (SE +/- 0.01, N = 3; SE +/- 0.01, N = 3)
HeFFTe, Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256: c 37.81, b 37.86, a 37.75 (SE +/- 0.04, N = 3; SE +/- 0.23, N = 3)
HeFFTe, Test: c2c - Backend: Stock - Precision: double - X Y Z: 128: c 28.33, b 28.58, a 29.11 (SE +/- 0.41, N = 3; SE +/- 0.31, N = 3)
HeFFTe, Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128: c 18.38, b 28.27, a 25.04 (SE +/- 2.45, N = 15; SE +/- 3.07, N = 12)
HeFFTe, Test: c2c - Backend: Stock - Precision: double - X Y Z: 256: c 12.69, b 12.85, a 13.06 (SE +/- 0.14, N = 3; SE +/- 0.13, N = 3)
HeFFTe, Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 512: c 22.35, b 22.32, a 22.31 (SE +/- 0.01, N = 3; SE +/- 0.02, N = 3)
HeFFTe, Test: c2c - Backend: Stock - Precision: double - X Y Z: 512: c 12.40, b 12.40, a 12.40 (SE +/- 0.00, N = 3; SE +/- 0.01, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 128: c 80.86, b 80.87, a 81.12 (SE +/- 0.24, N = 3; SE +/- 0.29, N = 3)
HeFFTe, Test: c2c - Backend: Stock - Precision: float - X Y Z: 1024: c 27.13, b 27.12, a 27.12 (SE +/- 0.01, N = 3; SE +/- 0.01, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 512: c 48.28, b 48.09, a 48.30 (SE +/- 0.05, N = 3; SE +/- 0.23, N = 3)
HeFFTe, Test: r2c - Backend: FFTW - Precision: double - X Y Z: 1024: c 24.93, b 24.91, a 24.91 (SE +/- 0.00, N = 3; SE +/- 0.01, N = 3)
HeFFTe, Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128: c 28.73, b 28.70, a 28.87 (SE +/- 0.34, N = 3; SE +/- 0.12, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: double - X Y Z: 128: c 48.18, b 48.58, a 48.30 (SE +/- 0.52, N = 3; SE +/- 0.44, N = 3)
HeFFTe, Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 512: c 12.42, b 12.40, a 12.41 (SE +/- 0.00, N = 3; SE +/- 0.01, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: double - X Y Z: 256: c 34.58, b 34.38, a 34.41 (SE +/- 0.11, N = 3; SE +/- 0.12, N = 3)
HeFFTe, Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 1024: c 24.95, b 24.92, a 24.92 (SE +/- 0.01, N = 3; SE +/- 0.00, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: double - X Y Z: 512: c 24.09, b 24.10, a 24.09 (SE +/- 0.01, N = 3; SE +/- 0.02, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256: c 34.48, b 34.28, a 34.17 (SE +/- 0.17, N = 3; SE +/- 0.09, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: float - X Y Z: 1024: c 52.31, b 52.26, a 52.27 (SE +/- 0.02, N = 3; SE +/- 0.02, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 1024: c 52.41, b 52.38, a 52.35 (SE +/- 0.02, N = 3; SE +/- 0.02, N = 3)
HeFFTe, Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 1024: c 27.01, b 26.98, a 26.99 (SE +/- 0.02, N = 3; SE +/- 0.01, N = 3)
CloverLeaf, Input: clover_bm: c 17.00, b 16.13, a 17.54 (SE +/- 0.25, N = 15; SE +/- 1.44, N = 12)
Phoronix Test Suite v10.8.5