new sun

AMD Ryzen Threadripper 3990X 64-Core testing with a Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2306183-PTS-NEWSUN6339&grr&rdt .
new sun: system details (runs a, b, c)

Processor: AMD Ryzen Threadripper 3990X 64-Core @ 2.90GHz (64 Cores / 128 Threads)
Motherboard: Gigabyte TRX40 AORUS PRO WIFI (F6 BIOS)
Chipset: AMD Starship/Matisse
Memory: 128GB
Disk: Samsung SSD 970 EVO Plus 500GB
Graphics: AMD Radeon RX 5700 8GB (1750/875MHz)
Audio: AMD Navi 10 HDMI Audio
Monitor: DELL P2415Q
Network: Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 23.04
Kernel: 6.2.0-20-generic (x86_64)
Desktop: GNOME Shell 44.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 23.0.2 (LLVM 15.0.7 DRM 3.49)
Compiler: GCC 12.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details - Transparent Huge Pages: madvise

Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x8301055

Python Details - Python 3.11.2

Security Details:
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  mmio_stale_data: Not affected
  retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection
  spec_store_bypass: Mitigation of SSB disabled via prctl
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
  srbds: Not affected
  tsx_async_abort: Not affected
new sun: results overview

Test                                       a            b            c
palabos: 400                               110.662      110.757      110.74
palabos: 500                               114.295      114.362      114.311
palabos: 100                               154.425      154.674      154.347
mocassin: Dust 2D tau100.0                 141.939      136.990      136.197
gpaw: Carbon Nanotube                      106.141      106.632      106.679
cp2k: Fayalite-FIST                        151.857      153.861      156.121
kripke                                     132875000    134555700    132703600
cp2k: H20-64                               40.67        41.509       41.822
heffte: c2c - FFTW - double - 256          12.5259      12.6053      12.5994
heffte: c2c - FFTW - double-long - 256     12.6224      12.5854      12.6235
mocassin: Gas HII40                        16.908       17.135       17.03
heffte: c2c - Stock - double-long - 256    12.7636      12.7611      12.7487
heffte: c2c - Stock - double - 256         12.8321      12.8373      12.7754
heffte: r2c - FFTW - double-long - 256     29.6979      29.8739      29.8797
heffte: r2c - FFTW - double - 256          29.8294      29.8226      29.8038
heffte: r2c - Stock - double-long - 256    34.3018      34.3575      34.5142
heffte: r2c - Stock - double - 256         34.2182      34.3706      34.3981
heffte: c2c - FFTW - float - 256           34.6666      34.3657      34.2195
heffte: c2c - FFTW - float-long - 256      34.7005      34.3100      34.1329
heffte: c2c - Stock - float-long - 256     37.3511      37.3154      37.8371
heffte: c2c - Stock - float - 256          37.4329      37.4736      37.5434
heffte: r2c - FFTW - float - 256           86.0624      87.4605      87.203
heffte: r2c - FFTW - float-long - 256      87.4568      87.1139      86.4824
heffte: r2c - Stock - float - 256          98.6876      101.591      100.265
heffte: r2c - Stock - float-long - 256     101.924      101.2671     100.399
heffte: c2c - Stock - double - 128         32.0508      31.5836      32.1516
heffte: c2c - Stock - double-long - 128    31.6274      31.6224      31.9532
heffte: c2c - FFTW - double-long - 128     33.4237      33.5672      32.7971
heffte: c2c - FFTW - double - 128          33.0833      33.7305      33.2519
heffte: r2c - Stock - double-long - 128    58.8215      58.5089      58.9169
heffte: r2c - FFTW - double - 128          60.6959      60.3691      60.3375
heffte: r2c - Stock - double - 128         58.5881      58.4202      58.239
heffte: r2c - FFTW - double-long - 128     60.5588      60.1095      60.552
heffte: c2c - Stock - float-long - 128     59.6935      59.9952      60.099
heffte: c2c - FFTW - float - 128           62.0359      61.8921      63.9041
heffte: c2c - Stock - float - 128          59.1864      59.5955      61.2339
heffte: c2c - FFTW - float-long - 128      61.2783      61.7781      61.2873
heffte: r2c - Stock - float - 128          103.581      104.373      105.452
heffte: r2c - FFTW - float - 128           112.297      113.572      112.333
heffte: r2c - Stock - float-long - 128     105.451      103.173      103.765
heffte: r2c - FFTW - float-long - 128      113.155      110.945      110.735
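One way to read the three runs against each other is to compute, per test, the spread between the highest and lowest result relative to the mean. The sketch below does this for a handful of the values listed above; the helper name `spread_percent` and the choice of tests are illustrative assumptions, not something the Phoronix Test Suite itself produces.

```python
# Results for runs a, b, c, copied from the overview above.
# Only a few tests are included to keep the example short.
results = {
    "palabos: 400": (110.662, 110.757, 110.74),
    "mocassin: Dust 2D tau100.0": (141.939, 136.990, 136.197),
    "cp2k: Fayalite-FIST": (151.857, 153.861, 156.121),
    "kripke": (132875000, 134555700, 132703600),
}


def spread_percent(values):
    """Return (max - min) / mean of the given values, as a percentage."""
    mean = sum(values) / len(values)
    return (max(values) - min(values)) / mean * 100.0


# Hypothetical helper output: test name -> relative spread in percent.
spreads = {name: spread_percent(vals) for name, vals in results.items()}

# Print tests ordered from largest to smallest run-to-run spread.
for name, pct in sorted(spreads.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:30s} {pct:6.2f}%")
```

Of these four tests, the mocassin Dust 2D tau100.0 input shows the largest spread between runs (roughly 4%), while Palabos at grid size 400 differs by well under 1%.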
Palabos 2.3 (Mega Site Updates Per Second, More Is Better)

Grid Size   a        b        c        SE
400         110.66   110.76   110.74   +/- 0.04, N = 3
500         114.30   114.36   114.31   +/- 0.10, N = 3
100         154.43   154.67   154.35   +/- 0.26, N = 3

1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 - Input: Dust 2D tau100.0 (Seconds, Fewer Is Better)

a: 141.94   b: 136.99   c: 136.20   SE +/- 0.19, N = 3

1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz

GPAW 23.6 - Input: Carbon Nanotube (Seconds, Fewer Is Better)

a: 106.14   b: 106.63   c: 106.68   SE +/- 0.13, N = 3

1. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi

CP2K Molecular Dynamics 2023.1 - Input: Fayalite-FIST (Seconds, Fewer Is Better)

a: 151.86   b: 153.86   c: 156.12

1. (F9X) gfortran options: -fopenmp -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kdbm -lcp2kgrid -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -lhdf5 -lhdf5_hl -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -lopenblas -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm

Kripke 1.2.6 (Throughput FoM, More Is Better)

a: 132875000   b: 134555700   c: 132703600   SE +/- 1359629.29, N = 3

1. (CXX) g++ options: -O3 -fopenmp -ldl

CP2K Molecular Dynamics 2023.1 - Input: H20-64 (Seconds, Fewer Is Better)

a: 40.67   b: 41.51   c: 41.82

1. (F9X) gfortran options: -fopenmp -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kdbm -lcp2kgrid -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -lhdf5 -lhdf5_hl -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -lopenblas -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm

HeFFTe - Highly Efficient FFT for Exascale 2.3 - X Y Z: 256 (GFLOP/s, More Is Better)

Test   Backend   Precision     a        b        c        SE
c2c    FFTW      double        12.53    12.61    12.60    +/- 0.05, N = 3
c2c    FFTW      double-long   12.62    12.59    12.62    +/- 0.02, N = 3

1. (CXX) g++ options: -O3

Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 - Input: Gas HII40 (Seconds, Fewer Is Better)

a: 16.91   b: 17.14   c: 17.03   SE +/- 0.05, N = 3

1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz

HeFFTe - Highly Efficient FFT for Exascale 2.3 - X Y Z: 256 (GFLOP/s, More Is Better)

Test   Backend   Precision     a        b        c        SE
c2c    Stock     double-long   12.76    12.76    12.75    +/- 0.05, N = 3
c2c    Stock     double        12.83    12.84    12.78    +/- 0.05, N = 3
r2c    FFTW      double-long   29.70    29.87    29.88    +/- 0.04, N = 3
r2c    FFTW      double        29.83    29.82    29.80    +/- 0.03, N = 3
r2c    Stock     double-long   34.30    34.36    34.51    +/- 0.03, N = 3
r2c    Stock     double        34.22    34.37    34.40    +/- 0.04, N = 3
c2c    FFTW      float         34.67    34.37    34.22    +/- 0.13, N = 3
c2c    FFTW      float-long    34.70    34.31    34.13    +/- 0.08, N = 3
c2c    Stock     float-long    37.35    37.32    37.84    +/- 0.11, N = 3
c2c    Stock     float         37.43    37.47    37.54    +/- 0.03, N = 3
r2c    FFTW      float         86.06    87.46    87.20    +/- 0.87, N = 3
r2c    FFTW      float-long    87.46    87.11    86.48    +/- 0.14, N = 3
r2c    Stock     float         98.69    101.59   100.27   +/- 0.12, N = 3
r2c    Stock     float-long    101.92   101.27   100.40   +/- 1.05, N = 3

1. (CXX) g++ options: -O3

HeFFTe - Highly Efficient FFT for Exascale 2.3 - X Y Z: 128 (GFLOP/s, More Is Better)

Test   Backend   Precision     a        b        c        SE
c2c    Stock     double        32.05    31.58    32.15    +/- 0.12, N = 3
c2c    Stock     double-long   31.63    31.62    31.95    +/- 0.03, N = 3
c2c    FFTW      double-long   33.42    33.57    32.80    +/- 0.07, N = 3
c2c    FFTW      double        33.08    33.73    33.25    +/- 0.26, N = 3
r2c    Stock     double-long   58.82    58.51    58.92    +/- 0.41, N = 3
r2c    FFTW      double        60.70    60.37    60.34    +/- 0.19, N = 3
r2c    Stock     double        58.59    58.42    58.24    +/- 0.19, N = 3
r2c    FFTW      double-long   60.56    60.11    60.55    +/- 0.20, N = 3
c2c    Stock     float-long    59.69    60.00    60.10    +/- 0.14, N = 3
c2c    FFTW      float         62.04    61.89    63.90    +/- 0.38, N = 3
c2c    Stock     float         59.19    59.60    61.23    +/- 0.04, N = 3
c2c    FFTW      float-long    61.28    61.78    61.29    +/- 0.42, N = 3
r2c    Stock     float         103.58   104.37   105.45   +/- 0.57, N = 3
r2c    FFTW      float         112.30   113.57   112.33   +/- 0.84, N = 3
r2c    Stock     float-long    105.45   103.17   103.77   +/- 1.42, N = 3
r2c    FFTW      float-long    113.16   110.95   110.74   +/- 0.65, N = 3

1. (CXX) g++ options: -O3

Phoronix Test Suite v10.8.5