dddd Tests for a future article. Intel Core i7-1280P testing with a MSI Prestige 14Evo A12M MS-14C6 (E14C6IMS.115 BIOS) and MSI Intel ADL GT2 8GB on Ubuntu 24.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2410194-NE-DDDD7825639&grs&sor .
dddd Processor Motherboard Chipset Memory Disk Graphics Audio Network OS Kernel Desktop Display Server OpenGL Compiler File-System Screen Resolution a b Intel Core i7-1280P @ 4.70GHz (14 Cores / 20 Threads) MSI Prestige 14Evo A12M MS-14C6 (E14C6IMS.115 BIOS) Intel Alder Lake PCH 8 x 2GB LPDDR4-4267MT/s SK Hynix H9HCNNNCPMMLXR- 1024GB Micron_3400_MTFDKBA1T0TFH MSI Intel ADL GT2 8GB Realtek ALC274 Intel Alder Lake-P PCH CNVi WiFi Ubuntu 24.10 6.11.0-rc6-phx (x86_64) GNOME Shell 47.0 X Server + Wayland 4.6 Mesa 24.2.3-1ubuntu1 GCC 14.2.0 ext4 1920x1080 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: intel_pstate powersave (EPP: balance_performance) - CPU Microcode: 0x434 - Thermald 2.5.8 Python Details - Python 3.12.6 Security Details - gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Mitigation of Clear Register File + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: SW sequence; BHI: BHI_DIS_S + srbds: Not affected + tsx_async_abort: Not affected
dddd warpx: Plasma Acceleration litert: DeepLab V3 namd: STMV with 1,066,628 Atoms cp2k: H20-256 litert: Quantized COCO SSD MobileNet v1 litert: NASNet Mobile epoch: Cone namd: ATPase with 327,506 Atoms litert: Inception ResNet V2 cp2k: H20-64 xnnpack: FP16MobileNetV1 xnnpack: FP16MobileNetV3Large onednn: Recurrent Neural Network Training - CPU litert: Mobilenet Quant warpx: Uniform Plasma cp2k: Fayalite-FIST xnnpack: QS8MobileNetV2 onednn: Deconvolution Batch shapes_1d - CPU litert: Mobilenet Float onednn: IP Shapes 1D - CPU xnnpack: FP32MobileNetV3Small litert: SqueezeNet xnnpack: FP32MobileNetV3Large litert: Inception V4 xnnpack: FP32MobileNetV2 onednn: Recurrent Neural Network Inference - CPU onednn: Deconvolution Batch shapes_3d - CPU xnnpack: FP16MobileNetV3Small xnnpack: FP32MobileNetV1 xnnpack: FP16MobileNetV2 onednn: IP Shapes 3D - CPU onednn: Convolution Batch Shapes Auto - CPU a b 97.32212998 12481.3 0.08333 2776.526 14677.1 24466.6 1073.62 0.32227 66739.5 162.632 5338 3939 6909.46 5772.5 60.69487664 336.756 1843 7.56991 3267.47 4.6992 1325 4297.68 2942 60434 2675 3669.59 8.83578 1519 3766 3676 8.97964 8.31985 187.7052446 8785.7 0.11113 2228.375 11862.2 20572.6 941.82 0.36493 60923.1 175.654 5730 3676 6480.37 5428.36 58.37226117 326.883 1888 7.71558 3216.26 4.75677 1309 4341.81 2971 59982.9 2656 3644.4 8.78226 1528 3776 3670 8.99418 8.32147 OpenBenchmarking.org
WarpX Input: Plasma Acceleration OpenBenchmarking.org Seconds, Fewer Is Better WarpX 24.10 Input: Plasma Acceleration a b 40 80 120 160 200 97.32 187.71 1. (CXX) g++ options: -O3 -lm
LiteRT Model: DeepLab V3 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: DeepLab V3 b a 3K 6K 9K 12K 15K 8785.7 12481.3
NAMD Input: STMV with 1,066,628 Atoms OpenBenchmarking.org ns/day, More Is Better NAMD 3.0 Input: STMV with 1,066,628 Atoms b a 0.025 0.05 0.075 0.1 0.125 0.11113 0.08333
CP2K Molecular Dynamics Input: H20-256 OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2024.3 Input: H20-256 b a 600 1200 1800 2400 3000 2228.38 2776.53 1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
LiteRT Model: Quantized COCO SSD MobileNet v1 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Quantized COCO SSD MobileNet v1 b a 3K 6K 9K 12K 15K 11862.2 14677.1
LiteRT Model: NASNet Mobile OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: NASNet Mobile b a 5K 10K 15K 20K 25K 20572.6 24466.6
Epoch Epoch3D Deck: Cone OpenBenchmarking.org Seconds, Fewer Is Better Epoch 4.19.4 Epoch3D Deck: Cone b a 200 400 600 800 1000 941.82 1073.62 1. (F9X) gfortran options: -O3 -std=f2003 -Jobj -lsdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz
NAMD Input: ATPase with 327,506 Atoms OpenBenchmarking.org ns/day, More Is Better NAMD 3.0 Input: ATPase with 327,506 Atoms b a 0.0821 0.1642 0.2463 0.3284 0.4105 0.36493 0.32227
LiteRT Model: Inception ResNet V2 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Inception ResNet V2 b a 14K 28K 42K 56K 70K 60923.1 66739.5
CP2K Molecular Dynamics Input: H20-64 OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2024.3 Input: H20-64 a b 40 80 120 160 200 162.63 175.65 1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
XNNPACK Model: FP16MobileNetV1 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV1 a b 1200 2400 3600 4800 6000 5338 5730 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV3Large OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV3Large b a 800 1600 2400 3200 4000 3676 3939 1. (CXX) g++ options: -O3 -lrt -lm
oneDNN Harness: Recurrent Neural Network Training - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Recurrent Neural Network Training - Engine: CPU b a 1500 3000 4500 6000 7500 6480.37 6909.46 MIN: 6423.57 MIN: 6852.68 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
LiteRT Model: Mobilenet Quant OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Mobilenet Quant b a 1200 2400 3600 4800 6000 5428.36 5772.50
WarpX Input: Uniform Plasma OpenBenchmarking.org Seconds, Fewer Is Better WarpX 24.10 Input: Uniform Plasma b a 14 28 42 56 70 58.37 60.69 1. (CXX) g++ options: -O3 -lm
CP2K Molecular Dynamics Input: Fayalite-FIST OpenBenchmarking.org Seconds, Fewer Is Better CP2K Molecular Dynamics 2024.3 Input: Fayalite-FIST b a 70 140 210 280 350 326.88 336.76 1. (F9X) gfortran options: -fopenmp -march=native -mtune=native -O3 -funroll-loops -fbacktrace -ffree-form -fimplicit-none -std=f2008 -lcp2kstart -lcp2kmc -lcp2kswarm -lcp2kmotion -lcp2kthermostat -lcp2kemd -lcp2ktmc -lcp2kmain -lcp2kdbt -lcp2ktas -lcp2kgrid -lcp2kgriddgemm -lcp2kgridcpu -lcp2kgridref -lcp2kgridcommon -ldbcsrarnoldi -ldbcsrx -lcp2kdbx -lcp2kdbm -lcp2kshg_int -lcp2keri_mme -lcp2kminimax -lcp2khfxbase -lcp2ksubsys -lcp2kxc -lcp2kao -lcp2kpw_env -lcp2kinput -lcp2kpw -lcp2kgpu -lcp2kfft -lcp2kfpga -lcp2kfm -lcp2kcommon -lcp2koffload -lcp2kmpiwrap -lcp2kbase -ldbcsr -lsirius -lspla -lspfft -lsymspg -lvdwxc -l:libhdf5_fortran.a -l:libhdf5.a -lz -lgsl -lelpa_openmp -lcosma -lcosta -lscalapack -lxsmmf -lxsmm -ldl -lpthread -llibgrpp -lxcf03 -lxc -lint2 -lfftw3_mpi -lfftw3 -lfftw3_omp -lmpi_cxx -lmpi -l:libopenblas.a -lvori -lstdc++ -lmpi_usempif08 -lmpi_mpifh -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm
XNNPACK Model: QS8MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: QS8MobileNetV2 a b 400 800 1200 1600 2000 1843 1888 1. (CXX) g++ options: -O3 -lrt -lm
oneDNN Harness: Deconvolution Batch shapes_1d - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Deconvolution Batch shapes_1d - Engine: CPU a b 2 4 6 8 10 7.56991 7.71558 MIN: 6.51 MIN: 6.5 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
LiteRT Model: Mobilenet Float OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Mobilenet Float b a 700 1400 2100 2800 3500 3216.26 3267.47
oneDNN Harness: IP Shapes 1D - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: IP Shapes 1D - Engine: CPU a b 1.0703 2.1406 3.2109 4.2812 5.3515 4.69920 4.75677 MIN: 4.13 MIN: 4.12 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
XNNPACK Model: FP32MobileNetV3Small OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV3Small b a 300 600 900 1200 1500 1309 1325 1. (CXX) g++ options: -O3 -lrt -lm
LiteRT Model: SqueezeNet OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: SqueezeNet a b 900 1800 2700 3600 4500 4297.68 4341.81
XNNPACK Model: FP32MobileNetV3Large OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV3Large a b 600 1200 1800 2400 3000 2942 2971 1. (CXX) g++ options: -O3 -lrt -lm
LiteRT Model: Inception V4 OpenBenchmarking.org Microseconds, Fewer Is Better LiteRT 2024-10-15 Model: Inception V4 b a 13K 26K 39K 52K 65K 59982.9 60434.0
XNNPACK Model: FP32MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV2 b a 600 1200 1800 2400 3000 2656 2675 1. (CXX) g++ options: -O3 -lrt -lm
oneDNN Harness: Recurrent Neural Network Inference - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Recurrent Neural Network Inference - Engine: CPU b a 800 1600 2400 3200 4000 3644.40 3669.59 MIN: 3590.71 MIN: 3612.72 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN Harness: Deconvolution Batch shapes_3d - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Deconvolution Batch shapes_3d - Engine: CPU b a 2 4 6 8 10 8.78226 8.83578 MIN: 8.17 MIN: 8.24 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
XNNPACK Model: FP16MobileNetV3Small OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV3Small a b 300 600 900 1200 1500 1519 1528 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP32MobileNetV1 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP32MobileNetV1 a b 800 1600 2400 3200 4000 3766 3776 1. (CXX) g++ options: -O3 -lrt -lm
XNNPACK Model: FP16MobileNetV2 OpenBenchmarking.org us, Fewer Is Better XNNPACK b7b048 Model: FP16MobileNetV2 b a 800 1600 2400 3200 4000 3670 3676 1. (CXX) g++ options: -O3 -lrt -lm
oneDNN Harness: IP Shapes 3D - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: IP Shapes 3D - Engine: CPU a b 3 6 9 12 15 8.97964 8.99418 MIN: 8.79 MIN: 8.8 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
oneDNN Harness: Convolution Batch Shapes Auto - Engine: CPU OpenBenchmarking.org ms, Fewer Is Better oneDNN 3.6 Harness: Convolution Batch Shapes Auto - Engine: CPU a b 2 4 6 8 10 8.31985 8.32147 MIN: 8.1 MIN: 7.97 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl
Phoronix Test Suite v10.8.5