dddd
Tests for a future article. Intel Core i7-1280P testing with an MSI Prestige 14Evo A12M MS-14C6 (E14C6IMS.115 BIOS) and MSI Intel ADL GT2 8GB on Ubuntu 24.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2410194-NE-DDDD7825639&grr
dddd: System Details (runs a, b)
  Processor: Intel Core i7-1280P @ 4.70GHz (14 Cores / 20 Threads)
  Motherboard: MSI Prestige 14Evo A12M MS-14C6 (E14C6IMS.115 BIOS)
  Chipset: Intel Alder Lake PCH
  Memory: 8 x 2GB LPDDR4-4267MT/s SK Hynix H9HCNNNCPMMLXR-
  Disk: 1024GB Micron_3400_MTFDKBA1T0TFH
  Graphics: MSI Intel ADL GT2 8GB
  Audio: Realtek ALC274
  Network: Intel Alder Lake-P PCH CNVi WiFi
  OS: Ubuntu 24.10
  Kernel: 6.11.0-rc6-phx (x86_64)
  Desktop: GNOME Shell 47.0
  Display Server: X Server + Wayland
  OpenGL: 4.6 Mesa 24.2.3-1ubuntu1
  Compiler: GCC 14.2.0
  File-System: ext4
  Screen Resolution: 1920x1080
Kernel Details
  - Transparent Huge Pages: madvise
Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  - Scaling Governor: intel_pstate powersave (EPP: balance_performance)
  - CPU Microcode: 0x434
  - Thermald 2.5.8
Python Details
  - Python 3.12.6
Security Details
  - gather_data_sampling: Not affected
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - mmio_stale_data: Not affected
  - reg_file_data_sampling: Mitigation of Clear Register File
  - retbleed: Not affected
  - spec_rstack_overflow: Not affected
  - spec_store_bypass: Mitigation of SSB disabled via prctl
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence; BHI: BHI_DIS_S
  - srbds: Not affected
  - tsx_async_abort: Not affected
dddd: Results Overview
Test                                                a             b
cp2k: H20-256                                       2776.526      2228.375
epoch: Cone                                         1073.62       941.82
namd: STMV with 1,066,628 Atoms                     0.08333       0.11113
cp2k: Fayalite-FIST                                 336.756       326.883
xnnpack: QS8MobileNetV2                             1843          1888
xnnpack: FP16MobileNetV3Small                       1519          1528
xnnpack: FP16MobileNetV3Large                       3939          3676
xnnpack: FP16MobileNetV2                            3676          3670
xnnpack: FP16MobileNetV1                            5338          5730
xnnpack: FP32MobileNetV3Small                       1325          1309
xnnpack: FP32MobileNetV3Large                       2942          2971
xnnpack: FP32MobileNetV2                            2675          2656
xnnpack: FP32MobileNetV1                            3766          3776
cp2k: H20-64                                        162.632       175.654
namd: ATPase with 327,506 Atoms                     0.32227       0.36493
warpx: Plasma Acceleration                          97.32212998   187.7052446
onednn: Recurrent Neural Network Training - CPU     6909.46       6480.37
onednn: Recurrent Neural Network Inference - CPU    3669.59       3644.4
warpx: Uniform Plasma                               60.69487664   58.37226117
litert: NASNet Mobile                               24466.6       20572.6
litert: Inception V4                                60434         59982.9
litert: Inception ResNet V2                         66739.5       60923.1
litert: DeepLab V3                                  12481.3       8785.7
litert: Mobilenet Float                             3267.47       3216.26
litert: Quantized COCO SSD MobileNet v1             14677.1       11862.2
litert: SqueezeNet                                  4297.68       4341.81
litert: Mobilenet Quant                             5772.5        5428.36
onednn: Deconvolution Batch shapes_1d - CPU         7.56991       7.71558
onednn: IP Shapes 1D - CPU                          4.6992        4.75677
onednn: IP Shapes 3D - CPU                          8.97964       8.99418
onednn: Convolution Batch Shapes Auto - CPU         8.31985       8.32147
onednn: Deconvolution Batch shapes_3d - CPU         8.83578       8.78226
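The a and b columns mix "Fewer Is Better" and "More Is Better" metrics, so raw differences are not directly comparable across tests. A minimal sketch of one way to normalize them follows. The `RESULTS` selection, the `b_vs_a_percent` helper, and the choice to express the gap as a speed ratio are assumptions made for illustration; they are not part of the Phoronix Test Suite output.

```python
# Values copied from the results overview; the direction comes from each
# chart's "Fewer Is Better" / "More Is Better" label.
RESULTS = {
    "cp2k: H20-256": (2776.526, 2228.375, "lower"),
    "namd: STMV with 1,066,628 Atoms": (0.08333, 0.11113, "higher"),
    "warpx: Plasma Acceleration": (97.32212998, 187.7052446, "lower"),
    "litert: DeepLab V3": (12481.3, 8785.7, "lower"),
}


def b_vs_a_percent(a: float, b: float, better: str) -> float:
    """Return how much better (+) or worse (-) run b is than run a, in percent.

    Hypothetical helper: the result is a speed ratio, so for "lower is
    better" metrics it is a / b, and for "higher is better" it is b / a.
    """
    if better not in ("lower", "higher"):
        raise ValueError(f"unknown direction: {better!r}")
    ratio = a / b if better == "lower" else b / a
    return (ratio - 1.0) * 100.0


if __name__ == "__main__":
    for name, (a, b, better) in RESULTS.items():
        print(f"{name}: b vs a = {b_vs_a_percent(a, b, better):+.1f}%")
```

A positive value means run b was faster than run a on that test and a negative value means it was slower, regardless of the metric's unit.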
CP2K Molecular Dynamics 2024.3, Input: H20-256 (Seconds, Fewer Is Better)
  a: 2776.53
  b: 2228.38
  1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
Epoch 4.19.4, Epoch3D Deck: Cone (Seconds, Fewer Is Better)
  a: 1073.62
  b: 941.82
  1. (F9X) gfortran options: -O3 -std=f2003 -Jobj -lsdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
NAMD 3.0, Input: STMV with 1,066,628 Atoms (ns/day, More Is Better)
  a: 0.08333
  b: 0.11113
CP2K Molecular Dynamics 2024.3, Input: Fayalite-FIST (Seconds, Fewer Is Better)
  a: 336.76
  b: 326.88
  1. (F9X) gfortran options: identical to those listed for CP2K Input: H20-256.
XNNPACK b7b048 (us, Fewer Is Better)
  Model                    a       b
  QS8MobileNetV2           1843    1888
  FP16MobileNetV3Small     1519    1528
  FP16MobileNetV3Large     3939    3676
  FP16MobileNetV2          3676    3670
  FP16MobileNetV1          5338    5730
  FP32MobileNetV3Small     1325    1309
  FP32MobileNetV3Large     2942    2971
  FP32MobileNetV2          2675    2656
  FP32MobileNetV1          3766    3776
  1. (CXX) g++ options: -O3 -lrt -lm
CP2K Molecular Dynamics 2024.3, Input: H20-64 (Seconds, Fewer Is Better)
  a: 162.63
  b: 175.65
  1. (F9X) gfortran options: identical to those listed for CP2K Input: H20-256.
NAMD 3.0, Input: ATPase with 327,506 Atoms (ns/day, More Is Better)
  a: 0.32227
  b: 0.36493
WarpX 24.10, Input: Plasma Acceleration (Seconds, Fewer Is Better)
  a: 97.32
  b: 187.71
  1. (CXX) g++ options: -O3 -lm
oneDNN 3.6, Engine: CPU (ms, Fewer Is Better)
  Harness                               a          a MIN      b          b MIN
  Recurrent Neural Network Training     6909.46    6852.68    6480.37    6423.57
  Recurrent Neural Network Inference    3669.59    3612.72    3644.40    3590.71
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
WarpX 24.10, Input: Uniform Plasma (Seconds, Fewer Is Better)
  a: 60.69
  b: 58.37
  1. (CXX) g++ options: -O3 -lm
LiteRT 2024-10-15 (Microseconds, Fewer Is Better)
  Model                              a          b
  NASNet Mobile                      24466.6    20572.6
  Inception V4                       60434.0    59982.9
  Inception ResNet V2                66739.5    60923.1
  DeepLab V3                         12481.3    8785.7
  Mobilenet Float                    3267.47    3216.26
  Quantized COCO SSD MobileNet v1    14677.1    11862.2
  SqueezeNet                         4297.68    4341.81
  Mobilenet Quant                    5772.50    5428.36
oneDNN 3.6, Engine: CPU (ms, Fewer Is Better)
  Harness                          a          a MIN    b          b MIN
  Deconvolution Batch shapes_1d    7.56991    6.51     7.71558    6.5
  IP Shapes 1D                     4.69920    4.13     4.75677    4.12
  IP Shapes 3D                     8.97964    8.79     8.99418    8.8
  Convolution Batch Shapes Auto    8.31985    8.1      8.32147    7.97
  Deconvolution Batch shapes_3d    8.83578    8.24     8.78226    8.17
  1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Phoronix Test Suite v10.8.5