new mtl framework

Intel Core Ultra 7 155H testing with a Framework Laptop 13 (Intel Core Ultra 1) FRANMECP05 (03.01 BIOS) and Intel Arc MTL 8GB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2408139-NE-NEWMTLFRA01&sor&grr .

System configuration (runs a, b, c):

  Processor:         Intel Core Ultra 7 155H @ 4.50GHz (16 Cores / 22 Threads)
  Motherboard:       Framework Laptop 13 (Intel Core Ultra 1) FRANMECP05 (03.01 BIOS)
  Chipset:           Intel Device 7e7f
  Memory:            2 x 8GB DDR5-5600MT/s A-DATA AD5S56008G-SFW
  Disk:              512GB Western Digital WD PC SN740 SDDPNQD-512G
  Graphics:          Intel Arc MTL 8GB
  Audio:             Realtek ALC285
  Network:           MEDIATEK MT7922 802.11ax PCI
  OS:                Ubuntu 24.04
  Kernel:            6.10.0-061000rc4daily20240621-generic (x86_64)
  Desktop:           GNOME Shell 46.0
  Display Server:    X Server + Wayland
  OpenGL:            4.6 Mesa 24.2~git2406250600.5cb15a~oibaf~n (git-5cb15a6 2024-06-25 noble-oibaf-ppa)
  Compiler:          GCC 13.2.0
  File-System:       ext4
  Screen Resolution: 1920x1080

Kernel Details
  - Transparent Huge Pages: madvise

Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  - Scaling Governor: intel_pstate powersave (EPP: balance_performance)
  - CPU Microcode: 0x1e
  - Thermald 2.5.6

Security Details
  - gather_data_sampling: Not affected
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - mmio_stale_data: Not affected
  - reg_file_data_sampling: Not affected
  - retbleed: Not affected
  - spec_rstack_overflow: Not affected
  - spec_store_bypass: Mitigation of SSB disabled via prctl
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; RSB filling; PBRSB-eIBRS: Not affected; BHI: BHI_DIS_S
  - srbds: Not affected
  - tsx_async_abort: Not affected

Results overview:

  Test                                      a            b            c
  mt-dgemm: Sustained Floating-Point Rate   52.027491    52.067546    52.094087
  lczero: Eigen                             25           24           25
  lczero: BLAS                              26           27           26
  mnn: inception-v3                         69.547       69.196       69.412
  mnn: mobilenet-v1-1.0                     6.216        6.253        6.152
  mnn: MobileNetV2_224                      6.692        6.701        6.757
  mnn: SqueezeNetV1.0                       9.814        9.669        9.9
  mnn: resnet-v2-50                         48.546       48.746       48.796
  mnn: squeezenetv1.1                       5.861        5.881        5.921
  mnn: mobilenetV3                          3.111        3.157        3.772
  mnn: nasnet                               23.726       23.633       23.841
  xnnpack: QU8MobileNetV3Small              2132         2264         1896
  xnnpack: QU8MobileNetV3Large              3605         3747         4439
  xnnpack: QU8MobileNetV2                   3737         3554         4238
  xnnpack: FP16MobileNetV3Small             2385         2523         2277
  xnnpack: FP16MobileNetV3Large             5247         4817         4784
  xnnpack: FP16MobileNetV2                  4251         4377         5409
  xnnpack: FP32MobileNetV3Small             2246         2362         2160
  xnnpack: FP32MobileNetV3Large             6627         5697         5356
  xnnpack: FP32MobileNetV2                  4388         4077         4397
  gromacs: water_GMX50_bare                 0.579        0.581        0.581
  build2: Time To Compile                   297.757      298.584      299.953
  stockfish: Chess Benchmark                10688485     10530329     11305323
  y-cruncher: 1B                            74.994       75.403       75.633
  simdjson: DistinctUserID                  6.81         6.83         6.83
  simdjson: PartialTweets                   6.62         6.62         6.62
  simdjson: TopTweet                        6.81         6.81         6.79
  simdjson: Kostya                          4.2          4.21         4.26
  povray: Trace Time                        57.761       58.398       57.951
  etcpak: Multi-Threaded - ETC2             220.524      223.28       222.068
  simdjson: LargeRand                       1.46         1.47         1.46
  x265: Bosphorus 4K                        12.1         12.08        12.13
  y-cruncher: 500M                          31.851       31.872       31.502
  x265: Bosphorus 1080p                     51.06        49.51        50.61

ACES DGEMM 1.0, Sustained Floating-Point Rate (GFLOP/s, More Is Better)
  c: 52.09   b: 52.07   a: 52.03
  1. (CC) gcc options: -ffast-math -mavx2 -O3 -fopenmp -lopenblas

LeelaChessZero 0.31.1 (Nodes Per Second, More Is Better)
  Backend: Eigen   c: 25   a: 25   b: 24
  Backend: BLAS    b: 27   c: 26   a: 26
  1. (CXX) g++ options: -flto -pthread

Mobile Neural Network 2.9.b11b7037d (ms, Fewer Is Better)

  Model: inception-v3
    b: 69.20  (MIN: 57.21 / MAX: 117.21)
    c: 69.41  (MIN: 55.7 / MAX: 110.33)
    a: 69.55  (MIN: 58.98 / MAX: 120.56)

  Model: mobilenet-v1-1.0
    c: 6.152  (MIN: 3.54 / MAX: 24.17)
    a: 6.216  (MIN: 3.56 / MAX: 26.49)
    b: 6.253  (MIN: 3.61 / MAX: 31.97)

  Model: MobileNetV2_224
    a: 6.692  (MIN: 5.01 / MAX: 16.16)
    b: 6.701  (MIN: 5.19 / MAX: 26.48)
    c: 6.757  (MIN: 5.05 / MAX: 29.03)

  Model: SqueezeNetV1.0
    b: 9.669  (MIN: 7.59 / MAX: 16.03)
    a: 9.814  (MIN: 7.23 / MAX: 25.89)
    c: 9.900  (MIN: 7.61 / MAX: 30.63)

  Model: resnet-v2-50
    a: 48.55  (MIN: 36.42 / MAX: 73.25)
    b: 48.75  (MIN: 37.7 / MAX: 70.01)
    c: 48.80  (MIN: 38.05 / MAX: 133.88)

  Model: squeezenetv1.1
    a: 5.861  (MIN: 4.69 / MAX: 21.36)
    b: 5.881  (MIN: 4.62 / MAX: 25.58)
    c: 5.921  (MIN: 4.44 / MAX: 25.13)

  Model: mobilenetV3
    a: 3.111  (MIN: 2.56 / MAX: 10.64)
    b: 3.157  (MIN: 2.44 / MAX: 19.36)
    c: 3.772  (MIN: 2.45 / MAX: 17.15)

  Model: nasnet
    b: 23.63  (MIN: 19.46 / MAX: 56.99)
    a: 23.73  (MIN: 20 / MAX: 52.87)
    c: 23.84  (MIN: 19.99 / MAX: 46.4)

  1. (CXX) g++ options: -std=c++11 -O3 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -pthread -ldl

XNNPACK 2cd86b (us, Fewer Is Better)
  Model: QU8MobileNetV3Small    c: 1896   a: 2132   b: 2264
  Model: QU8MobileNetV3Large    a: 3605   b: 3747   c: 4439
  Model: QU8MobileNetV2         b: 3554   a: 3737   c: 4238
  Model: FP16MobileNetV3Small   c: 2277   a: 2385   b: 2523
  Model: FP16MobileNetV3Large   c: 4784   b: 4817   a: 5247
  Model: FP16MobileNetV2        a: 4251   b: 4377   c: 5409
  Model: FP32MobileNetV3Small   c: 2160   a: 2246   b: 2362
  Model: FP32MobileNetV3Large   c: 5356   b: 5697   a: 6627
  Model: FP32MobileNetV2        b: 4077   a: 4388   c: 4397
  1. (CXX) g++ options: -O3 -lrt -lm

GROMACS, Input: water_GMX50_bare (Ns Per Day, More Is Better)
  c: 0.581   b: 0.581   a: 0.579
  1. GROMACS version: 2023.3-Ubuntu_2023.3_1ubuntu3

Build2 0.17, Time To Compile (Seconds, Fewer Is Better)
  a: 297.76   b: 298.58   c: 299.95

Stockfish, Chess Benchmark (Nodes Per Second, More Is Better)
  c: 11305323   a: 10688485   b: 10530329
  1. Stockfish 16 by the Stockfish developers (see AUTHORS file)

Y-Cruncher 0.8.5, Pi Digits To Calculate: 1B (Seconds, Fewer Is Better)
  a: 74.99   b: 75.40   c: 75.63

simdjson 3.10 (GB/s, More Is Better)
  Throughput Test: DistinctUserID   c: 6.83   b: 6.83   a: 6.81
  Throughput Test: PartialTweets    c: 6.62   b: 6.62   a: 6.62
  Throughput Test: TopTweet         b: 6.81   a: 6.81   c: 6.79
  Throughput Test: Kostya           c: 4.26   b: 4.21   a: 4.20
  1. (CXX) g++ options: -O3 -lrt

POV-Ray, Trace Time (Seconds, Fewer Is Better)
  a: 57.76   c: 57.95   b: 58.40
  1. POV-Ray 3.7.0.10.unofficial

Etcpak 2.0, Benchmark: Multi-Threaded - Configuration: ETC2 (Mpx/s, More Is Better)
  b: 223.28   c: 222.07   a: 220.52
  1. (CXX) g++ options: -flto -pthread

simdjson 3.10, Throughput Test: LargeRandom (GB/s, More Is Better)
  b: 1.47   c: 1.46   a: 1.46
  1. (CXX) g++ options: -O3 -lrt

x265, Video Input: Bosphorus 4K (Frames Per Second, More Is Better)
  c: 12.13   a: 12.10   b: 12.08
  1. x265 [info]: HEVC encoder version 3.5+1-f0c1022b6

Y-Cruncher 0.8.5, Pi Digits To Calculate: 500M (Seconds, Fewer Is Better)
  c: 31.50   a: 31.85   b: 31.87

x265, Video Input: Bosphorus 1080p (Frames Per Second, More Is Better)
  a: 51.06   c: 50.61   b: 49.51
  1. x265 [info]: HEVC encoder version 3.5+1-f0c1022b6

Phoronix Test Suite v10.8.5