ddds Tests for a future article. Intel Core i7-1280P testing with a MSI MS-14C6 (E14C6IMS.115 BIOS) and MSI Intel ADL GT2 15GB on Pop 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2306220-NE-DDDS3788647&grs&sro .
ddds Processor Motherboard Chipset Memory Disk Graphics Audio Network OS Kernel Desktop Display Server OpenGL Vulkan Compiler File-System Screen Resolution s b Intel Core i7-1280P @ 4.70GHz (14 Cores / 20 Threads) MSI MS-14C6 (E14C6IMS.115 BIOS) Intel Alder Lake PCH 16GB 1024GB Micron_3400_MTFDKBA1T0TFH MSI Intel ADL GT2 15GB (1450MHz) Realtek ALC274 Intel Alder Lake-P PCH CNVi WiFi Pop 22.04 6.2.6-76060206-generic (x86_64) GNOME Shell 42.5 X Server 1.21.1.4 4.6 Mesa 22.3.5 1.3.230 GCC 11.3.0 ext4 1920x1080 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x429 - Thermald 2.4.9 Python Details - Python 3.10.6 Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling PBRSB-eIBRS: SW sequence + srbds: Not affected + tsx_async_abort: Not affected
ddds cp2k: Fayalite-FIST palabos: 500 libxsmm: 64 laghos: Triple Point Problem svt-av1: Preset 4 - Bosphorus 1080p liquid-dsp: 16 - 256 - 512 qmcpack: FeCO6_b3lyp_gms remhos: Sample Remap Example gpaw: Carbon Nanotube heffte: c2c - Stock - double - 128 svt-av1: Preset 8 - Bosphorus 4K liquid-dsp: 8 - 256 - 512 heffte: c2c - Stock - float - 128 liquid-dsp: 1 - 256 - 57 liquid-dsp: 1 - 256 - 512 palabos: 100 liquid-dsp: 8 - 256 - 57 liquid-dsp: 8 - 256 - 32 heffte: c2c - Stock - double-long - 128 heffte: r2c - Stock - float - 128 heffte: r2c - Stock - double-long - 128 heffte: c2c - Stock - double-long - 256 cp2k: H20-64 svt-av1: Preset 13 - Bosphorus 1080p svt-av1: Preset 12 - Bosphorus 1080p liquid-dsp: 4 - 256 - 57 libxsmm: 32 qmcpack: simple-H2O heffte: c2c - FFTW - float-long - 128 liquid-dsp: 16 - 256 - 32 heffte: r2c - FFTW - float - 128 srsran: PUSCH Processor Benchmark, Throughput Thread ospray: gravity_spheres_volume/dim_512/pathtracer/real_time mocassin: Dust 2D tau100.0 mocassin: Gas HII40 liquid-dsp: 20 - 256 - 512 laghos: Sedov Blast Wave, ube_922_hex.mesh heffte: c2c - Stock - double - 256 srsran: PUSCH Processor Benchmark, Throughput Total heffte: r2c - Stock - double-long - 256 heffte: c2c - FFTW - double - 128 srsran: Downlink Processor Benchmark heffte: c2c - Stock - float-long - 256 heffte: c2c - FFTW - double-long - 256 ospray: particle_volume/pathtracer/real_time heffte: r2c - Stock - double - 128 heffte: c2c - Stock - float - 256 liquid-dsp: 4 - 256 - 32 heffte: c2c - FFTW - float-long - 256 qmcpack: Li2_STO_ae heffte: c2c - FFTW - float - 128 heffte: c2c - FFTW - double - 256 liquid-dsp: 2 - 256 - 57 heffte: r2c - FFTW - double - 128 heffte: r2c - FFTW - double-long - 256 heffte: c2c - FFTW - double-long - 128 heffte: r2c - FFTW - double - 256 svt-av1: Preset 8 - Bosphorus 1080p ospray: gravity_spheres_volume/dim_512/ao/real_time ospray: gravity_spheres_volume/dim_512/scivis/real_time svt-av1: Preset 13 - Bosphorus 4K heffte: r2c - Stock - float-long - 128 heffte: r2c - FFTW - float - 256 svt-av1: Preset 12 - Bosphorus 4K heffte: r2c - Stock - float-long - 256 palabos: 400 heffte: r2c - FFTW - double-long - 128 liquid-dsp: 20 - 256 - 57 qmcpack: FeCO6_b3lyp_gms liquid-dsp: 16 - 256 - 57 heffte: c2c - Stock - float-long - 128 liquid-dsp: 20 - 256 - 32 liquid-dsp: 2 - 256 - 32 libxsmm: 128 heffte: r2c - Stock - double - 256 liquid-dsp: 2 - 256 - 512 heffte: r2c - FFTW - float-long - 128 heffte: r2c - Stock - float - 256 ospray: particle_volume/ao/real_time svt-av1: Preset 4 - Bosphorus 4K libxsmm: 256 ospray: particle_volume/scivis/real_time heffte: r2c - FFTW - float-long - 256 liquid-dsp: 4 - 256 - 512 heffte: c2c - FFTW - float - 256 liquid-dsp: 1 - 256 - 32 cp2k: H2O-DFT-LS s b 274.838 4.01361 165.2 55.64 10.527 90490000 785.94 156.205 504.804 10.9079 25.092 76055000 21.3012 74746000 20261000 42.8666 297390000 260250000 11.1481 41.4002 21.2745 12.3561 183.282 414.315 313.928 219310000 115.9 97.233 33.497 404840000 69.2827 262.1 1.90768 347.856 43.322 103260000 54.57 12.2602 928.8 23.5563 13.1156 930.8 23.1243 12.5828 86.5083 21.6659 23.0588 173530000 24.0358 764.32 32.7562 12.4492 131530000 29.0664 22.5498 13.0911 22.4292 82.594 1.38514 1.34452 95.008 41.2863 43.6571 95.307 45.0903 71.5688 29.2543 398190000 701.74 409920000 21.4591 459560000 106970000 155 23.3075 36447000 70.3859 44.8541 2.72471 2.536 132 2.68839 43.9622 56270000 23.7337 55367000 321.562 3.59061 148.5 50.43 9.831 94755000 821.04 149.627 526.532 11.3257 24.191 73590000 21.9433 76920000 19701000 41.8292 304170000 265880000 10.9293 42.2187 21.6826 12.1257 186.76 422.121 308.174 215430000 113.9 95.648 33.0029 398890000 70.3081 265.8 1.8824 343.614 42.805 104360000 54.00 12.1383 937.4 23.3425 12.9978 939 22.9256 12.4748 85.8189 21.8393 22.8766 172210000 23.863 769.69 32.9737 12.3722 130780000 29.2268 22.4287 13.0229 22.3159 82.226 1.37949 1.33905 95.367 41.4403 43.4953 95.639 45.2388 71.7773 29.1713 399280000 700 408980000 21.4194 460350000 106820000 154.8 23.2797 36484000 70.4569 44.8106 2.72216 2.538 132.1 2.68738 43.9486 56254000 23.7399 55376000 OpenBenchmarking.org
CP2K Molecular Dynamics Input: Fayalite-FIST OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2023.1 Input: Fayalite-FIST b s 70 140 210 280 350 321.56 274.84 1. (F9X) gfortran options: -fopenmp -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kdbm -lcp2kgrid -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -lhdf5 -lhdf5_hl -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -lopenblas -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
Palabos Grid Size: 500 OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 500 b s 0.9031 1.8062 2.7093 3.6124 4.5155 3.59061 4.01361 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
libxsmm M N K: 64 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 64 b s 40 80 120 160 200 148.5 165.2 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -pedantic -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2
Laghos Test: Triple Point Problem OpenBenchmarking.org Major Kernels Total Rate, More Is Better Laghos 3.1 Test: Triple Point Problem b s 12 24 36 48 60 50.43 55.64 1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
SVT-AV1 Encoder Mode: Preset 4 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 4 - Input: Bosphorus 1080p b s 3 6 9 12 15 9.831 10.527 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Liquid-DSP Threads: 16 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 16 - Buffer Length: 256 - Filter Length: 512 b s 20M 40M 60M 80M 100M 94755000 90490000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
QMCPACK Input: FeCO6_b3lyp_gms OpenBenchmarking.org Total Execution Time - Seconds, Fewer Is Better QMCPACK 3.16 Input: FeCO6_b3lyp_gms b s 200 400 600 800 1000 821.04 785.94 1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
Remhos Test: Sample Remap Example OpenBenchmarking.org Seconds, Fewer Is Better Remhos 1.0 Test: Sample Remap Example b s 30 60 90 120 150 149.63 156.21 1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
GPAW Input: Carbon Nanotube OpenBenchmarking.org Seconds, Fewer Is Better GPAW 23.6 Input: Carbon Nanotube b s 110 220 330 440 550 526.53 504.80 1. (CC) gcc options: -shared -fwrapv -O2 -lxc -lblas -lmpi
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double - X Y Z: 128 b s 3 6 9 12 15 11.33 10.91 1. (CXX) g++ options: -O3
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 8 - Input: Bosphorus 4K b s 6 12 18 24 30 24.19 25.09 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Liquid-DSP Threads: 8 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 8 - Buffer Length: 256 - Filter Length: 512 b s 16M 32M 48M 64M 80M 73590000 76055000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float - X Y Z: 128 b s 5 10 15 20 25 21.94 21.30 1. (CXX) g++ options: -O3
Liquid-DSP Threads: 1 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 1 - Buffer Length: 256 - Filter Length: 57 b s 16M 32M 48M 64M 80M 76920000 74746000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 1 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 1 - Buffer Length: 256 - Filter Length: 512 b s 4M 8M 12M 16M 20M 19701000 20261000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Palabos Grid Size: 100 OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 100 b s 10 20 30 40 50 41.83 42.87 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
Liquid-DSP Threads: 8 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 8 - Buffer Length: 256 - Filter Length: 57 b s 70M 140M 210M 280M 350M 304170000 297390000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 8 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 8 - Buffer Length: 256 - Filter Length: 32 b s 60M 120M 180M 240M 300M 265880000 260250000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 128 b s 3 6 9 12 15 10.93 11.15 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float - X Y Z: 128 b s 10 20 30 40 50 42.22 41.40 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 128 b s 5 10 15 20 25 21.68 21.27 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double-long - X Y Z: 256 b s 3 6 9 12 15 12.13 12.36 1. (CXX) g++ options: -O3
CP2K Molecular Dynamics Input: H20-64 OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2023.1 Input: H20-64 b s 40 80 120 160 200 186.76 183.28 1. (F9X) gfortran options: -fopenmp -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kdbm -lcp2kgrid -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -lhdf5 -lhdf5_hl -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -lopenblas -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 13 - Input: Bosphorus 1080p b s 90 180 270 360 450 422.12 414.32 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
SVT-AV1 Encoder Mode: Preset 12 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 12 - Input: Bosphorus 1080p b s 70 140 210 280 350 308.17 313.93 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
Liquid-DSP Threads: 4 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 4 - Buffer Length: 256 - Filter Length: 57 b s 50M 100M 150M 200M 250M 215430000 219310000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
libxsmm M N K: 32 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 32 b s 30 60 90 120 150 113.9 115.9 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -pedantic -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2
QMCPACK Input: simple-H2O OpenBenchmarking.org Total Execution Time - Seconds, Fewer Is Better QMCPACK 3.16 Input: simple-H2O b s 20 40 60 80 100 95.65 97.23 1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 128 b s 8 16 24 32 40 33.00 33.50 1. (CXX) g++ options: -O3
Liquid-DSP Threads: 16 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 16 - Buffer Length: 256 - Filter Length: 32 b s 90M 180M 270M 360M 450M 398890000 404840000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 128 b s 16 32 48 64 80 70.31 69.28 1. (CXX) g++ options: -O3
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Thread OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.5 Test: PUSCH Processor Benchmark, Throughput Thread b s 60 120 180 240 300 265.8 262.1 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
OSPRay Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time b s 0.4292 0.8584 1.2876 1.7168 2.146 1.88240 1.90768
Monte Carlo Simulations of Ionised Nebulae Input: Dust 2D tau100.0 OpenBenchmarking.org Seconds, Fewer Is Better Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 Input: Dust 2D tau100.0 b s 80 160 240 320 400 343.61 347.86 1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz
Monte Carlo Simulations of Ionised Nebulae Input: Gas HII40 OpenBenchmarking.org Seconds, Fewer Is Better Monte Carlo Simulations of Ionised Nebulae 2.02.73.3 Input: Gas HII40 b s 10 20 30 40 50 42.81 43.32 1. (F9X) gfortran options: -cpp -Jsource/ -ffree-line-length-0 -lm -std=legacy -O2 -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lz
Liquid-DSP Threads: 20 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 20 - Buffer Length: 256 - Filter Length: 512 b s 20M 40M 60M 80M 100M 104360000 103260000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Laghos Test: Sedov Blast Wave, ube_922_hex.mesh OpenBenchmarking.org Major Kernels Total Rate, More Is Better Laghos 3.1 Test: Sedov Blast Wave, ube_922_hex.mesh b s 12 24 36 48 60 54.00 54.57 1. (CXX) g++ options: -O3 -std=c++11 -lmfem -lHYPRE -lmetis -lrt -lmpi_cxx -lmpi
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: double - X Y Z: 256 b s 3 6 9 12 15 12.14 12.26 1. (CXX) g++ options: -O3
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Total OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.5 Test: PUSCH Processor Benchmark, Throughput Total b s 200 400 600 800 1000 937.4 928.8 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double-long - X Y Z: 256 b s 6 12 18 24 30 23.34 23.56 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: double - X Y Z: 128 b s 3 6 9 12 15 13.00 13.12 1. (CXX) g++ options: -O3
srsRAN Project Test: Downlink Processor Benchmark OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.5 Test: Downlink Processor Benchmark b s 200 400 600 800 1000 939.0 930.8 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 256 b s 6 12 18 24 30 22.93 23.12 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 256 b s 3 6 9 12 15 12.47 12.58 1. (CXX) g++ options: -O3
OSPRay Benchmark: particle_volume/pathtracer/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: particle_volume/pathtracer/real_time b s 20 40 60 80 100 85.82 86.51
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double - X Y Z: 128 b s 5 10 15 20 25 21.84 21.67 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float - X Y Z: 256 b s 6 12 18 24 30 22.88 23.06 1. (CXX) g++ options: -O3
Liquid-DSP Threads: 4 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 4 - Buffer Length: 256 - Filter Length: 32 b s 40M 80M 120M 160M 200M 172210000 173530000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float-long - X Y Z: 256 b s 6 12 18 24 30 23.86 24.04 1. (CXX) g++ options: -O3
QMCPACK Input: Li2_STO_ae OpenBenchmarking.org Total Execution Time - Seconds, Fewer Is Better QMCPACK 3.16 Input: Li2_STO_ae b s 170 340 510 680 850 769.69 764.32 1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 128 b s 8 16 24 32 40 32.97 32.76 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: double - X Y Z: 256 b s 3 6 9 12 15 12.37 12.45 1. (CXX) g++ options: -O3
Liquid-DSP Threads: 2 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 2 - Buffer Length: 256 - Filter Length: 57 b s 30M 60M 90M 120M 150M 130780000 131530000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 128 b s 7 14 21 28 35 29.23 29.07 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 256 b s 5 10 15 20 25 22.43 22.55 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: double-long - X Y Z: 128 b s 3 6 9 12 15 13.02 13.09 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double - X Y Z: 256 b s 5 10 15 20 25 22.32 22.43 1. (CXX) g++ options: -O3
SVT-AV1 Encoder Mode: Preset 8 - Input: Bosphorus 1080p OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 8 - Input: Bosphorus 1080p b s 20 40 60 80 100 82.23 82.59 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
OSPRay Benchmark: gravity_spheres_volume/dim_512/ao/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: gravity_spheres_volume/dim_512/ao/real_time b s 0.3117 0.6234 0.9351 1.2468 1.5585 1.37949 1.38514
OSPRay Benchmark: gravity_spheres_volume/dim_512/scivis/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: gravity_spheres_volume/dim_512/scivis/real_time b s 0.3025 0.605 0.9075 1.21 1.5125 1.33905 1.34452
SVT-AV1 Encoder Mode: Preset 13 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 13 - Input: Bosphorus 4K b s 20 40 60 80 100 95.37 95.01 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 128 b s 9 18 27 36 45 41.44 41.29 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: float - X Y Z: 256 b s 10 20 30 40 50 43.50 43.66 1. (CXX) g++ options: -O3
SVT-AV1 Encoder Mode: Preset 12 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 12 - Input: Bosphorus 4K b s 20 40 60 80 100 95.64 95.31 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float-long - X Y Z: 256 b s 10 20 30 40 50 45.24 45.09 1. (CXX) g++ options: -O3
Palabos Grid Size: 400 OpenBenchmarking.org Mega Site Updates Per Second, More Is Better Palabos 2.3 Grid Size: 400 b s 16 32 48 64 80 71.78 71.57 1. (CXX) g++ options: -std=c++17 -pedantic -O3 -rdynamic -lcrypto -lcurl -lsz -lz -ldl -lm
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: double-long - X Y Z: 128 b s 7 14 21 28 35 29.17 29.25 1. (CXX) g++ options: -O3
Liquid-DSP Threads: 20 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 20 - Buffer Length: 256 - Filter Length: 57 b s 90M 180M 270M 360M 450M 399280000 398190000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
QMCPACK Input: FeCO6_b3lyp_gms OpenBenchmarking.org Total Execution Time - Seconds, Fewer Is Better QMCPACK 3.16 Input: FeCO6_b3lyp_gms b s 150 300 450 600 750 700.00 701.74 1. (CXX) g++ options: -fopenmp -foffload=disable -finline-limit=1000 -fstrict-aliasing -funroll-all-loops -ffast-math -march=native -O3 -lm -ldl
Liquid-DSP Threads: 16 - Buffer Length: 256 - Filter Length: 57 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 16 - Buffer Length: 256 - Filter Length: 57 b s 90M 180M 270M 360M 450M 408980000 409920000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: Stock - Precision: float-long - X Y Z: 128 b s 5 10 15 20 25 21.42 21.46 1. (CXX) g++ options: -O3
Liquid-DSP Threads: 20 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 20 - Buffer Length: 256 - Filter Length: 32 b s 100M 200M 300M 400M 500M 460350000 459560000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Liquid-DSP Threads: 2 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 2 - Buffer Length: 256 - Filter Length: 32 b s 20M 40M 60M 80M 100M 106820000 106970000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
libxsmm M N K: 128 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 128 b s 30 60 90 120 150 154.8 155.0 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -pedantic -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: double - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: double - X Y Z: 256 b s 6 12 18 24 30 23.28 23.31 1. (CXX) g++ options: -O3
Liquid-DSP Threads: 2 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 2 - Buffer Length: 256 - Filter Length: 512 b s 8M 16M 24M 32M 40M 36484000 36447000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 128 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 128 b s 16 32 48 64 80 70.46 70.39 1. (CXX) g++ options: -O3
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: Stock - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: Stock - Precision: float - X Y Z: 256 b s 10 20 30 40 50 44.81 44.85 1. (CXX) g++ options: -O3
OSPRay Benchmark: particle_volume/ao/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: particle_volume/ao/real_time b s 0.6131 1.2262 1.8393 2.4524 3.0655 2.72216 2.72471
SVT-AV1 Encoder Mode: Preset 4 - Input: Bosphorus 4K OpenBenchmarking.org Frames Per Second, More Is Better SVT-AV1 1.6 Encoder Mode: Preset 4 - Input: Bosphorus 4K b s 0.5711 1.1422 1.7133 2.2844 2.8555 2.538 2.536 1. (CXX) g++ options: -march=native -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq
libxsmm M N K: 256 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 256 b s 30 60 90 120 150 132.1 132.0 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -pedantic -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2
OSPRay Benchmark: particle_volume/scivis/real_time OpenBenchmarking.org Items Per Second, More Is Better OSPRay 2.12 Benchmark: particle_volume/scivis/real_time b s 0.6049 1.2098 1.8147 2.4196 3.0245 2.68738 2.68839
HeFFTe - Highly Efficient FFT for Exascale Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: r2c - Backend: FFTW - Precision: float-long - X Y Z: 256 b s 10 20 30 40 50 43.95 43.96 1. (CXX) g++ options: -O3
Liquid-DSP Threads: 4 - Buffer Length: 256 - Filter Length: 512 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 4 - Buffer Length: 256 - Filter Length: 512 b s 12M 24M 36M 48M 60M 56254000 56270000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
HeFFTe - Highly Efficient FFT for Exascale Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 OpenBenchmarking.org GFLOP/s, More Is Better HeFFTe - Highly Efficient FFT for Exascale 2.3 Test: c2c - Backend: FFTW - Precision: float - X Y Z: 256 b s 6 12 18 24 30 23.74 23.73 1. (CXX) g++ options: -O3
Liquid-DSP Threads: 1 - Buffer Length: 256 - Filter Length: 32 OpenBenchmarking.org samples/s, More Is Better Liquid-DSP 1.6 Threads: 1 - Buffer Length: 256 - Filter Length: 32 b s 12M 24M 36M 48M 60M 55376000 55367000 1. (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Phoronix Test Suite v10.8.5