new feb
AMD Ryzen AI 9 HX 370 testing with an ASUS Zenbook S 16 UM5606WA_UM5606WA UM5606WA v1.0 (UM5606WA.308 BIOS) and AMD Radeon 512MB on Ubuntu 24.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2502026-NE-NEWFEB88615&rdt&grw
new feb
Runs: a, b, c, d

Processor: AMD Ryzen AI 9 HX 370 @ 4.37GHz (12 Cores / 24 Threads)
Motherboard: ASUS Zenbook S 16 UM5606WA_UM5606WA UM5606WA v1.0 (UM5606WA.308 BIOS)
Chipset: AMD Device 1507
Memory: 4 x 8GB LPDDR5-7500MT/s Samsung K3KL9L90CM-MGCT
Disk: 1024GB MTFDKBA1T0QFM-1BD1AABGB
Graphics: AMD Radeon 512MB
Audio: AMD Rembrandt Radeon HD Audio
Network: MEDIATEK Device 7925
OS: Ubuntu 24.10
Kernel: 6.11.0-rc6-phx (x86_64)
Desktop: GNOME Shell 47.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 24.2.3-1ubuntu1 (LLVM 19.1.0 DRM 3.58)
Compiler: GCC 14.2.0
File-System: ext4
Screen Resolution: 2880x1800

Kernel Details
 - amdgpu.dcdebugmask=0x600
 - Transparent Huge Pages: madvise

Compiler Details
 - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
 - a: Scaling Governor: amd-pstate-epp powersave (Boost: Enabled EPP: balance_power) - Platform Profile: balanced - CPU Microcode: 0xb204011 - ACPI Profile: balanced
 - b: Scaling Governor: amd-pstate-epp powersave (Boost: Enabled EPP: balance_power) - Platform Profile: balanced - CPU Microcode: 0xb204011 - ACPI Profile: balanced
 - c: Scaling Governor: amd-pstate-epp powersave (Boost: Enabled EPP: power) - Platform Profile: low-power - CPU Microcode: 0xb204011 - ACPI Profile: low-power
 - d: Scaling Governor: amd-pstate-epp powersave (Boost: Enabled EPP: balance_performance) - Platform Profile: balanced - CPU Microcode: 0xb204011 - ACPI Profile: balanced

Security Details
 - gather_data_sampling: Not affected
 - itlb_multihit: Not affected
 - l1tf: Not affected
 - mds: Not affected
 - meltdown: Not affected
 - mmio_stale_data: Not affected
 - reg_file_data_sampling: Not affected
 - retbleed: Not affected
 - spec_rstack_overflow: Not affected
 - spec_store_bypass: Mitigation of SSB disabled via prctl
 - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
 - spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
 - srbds: Not affected
 - tsx_async_abort: Not affected
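The four runs differ only in their power settings (EPP and platform profile), so it can help to record those settings before repeating a run. The following is a minimal sketch, assuming a Linux system that exposes the usual cpufreq and ACPI platform-profile sysfs files; the function name `read_power_settings` and the dictionary keys are hypothetical, and any file that is missing on a given machine is reported as unavailable rather than treated as an error.

```python
from pathlib import Path

# Common Linux sysfs locations (assumed; any of them may be absent on a given
# kernel, driver, or firmware). cpu0 is used as a representative core.
SYSFS_FILES = {
    "scaling_driver": "/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver",
    "scaling_governor": "/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor",
    "epp": "/sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference",
    "platform_profile": "/sys/firmware/acpi/platform_profile",
}


def read_power_settings(files=SYSFS_FILES):
    """Return {name: value or None} for each sysfs file in `files`."""
    settings = {}
    for name, path in files.items():
        try:
            settings[name] = Path(path).read_text().strip()
        except OSError:
            # File missing or unreadable: record it as unavailable.
            settings[name] = None
    return settings


if __name__ == "__main__":
    for name, value in read_power_settings().items():
        print(f"{name}: {value if value is not None else 'unavailable'}")
```

On a system configured like run c, the `epp` and `platform_profile` entries would be expected to read `power` and `low-power`, matching the Processor Details listed for that run.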
new feb

llama-cpp (Tokens Per Second, More Is Better)
Test                                                                           a       b       c       d
CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Text Generation 128                  10.14    10.1   10.14   10.11
CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 512                39.29   38.08    30.4   31.27
CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 1024               38.36   37.67   30.83   31.89
CPU BLAS - Llama-3.1-Tulu-3-8B-Q8_0 - Prompt Processing 2048               36.83   34.19   29.66   31.43
CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Text Generation 128             10.62   10.59   10.12   10.43
CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 512            37.4   37.34   30.92    33.9
CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 1024          37.26   35.73   30.57   32.86
CPU BLAS - Mistral-7B-Instruct-v0.3-Q8_0 - Prompt Processing 2048          34.55   33.36   29.49   32.15
CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Text Generation 128        55.48   50.29   54.77   53.94
CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 512      176.3  145.77  123.08   147.4
CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 1024    171.34  142.69  119.29  153.27
CPU BLAS - granite-3.0-3b-a800m-instruct-Q8_0 - Prompt Processing 2048    146.25  128.98  113.16  147.57

liquid-dsp: Threads - Buffer Length - Filter Length (samples/s, More Is Better)
Test                      a          b          c          d
1 - 256 - 32       32814000   32809000   20243000   33348000
1 - 256 - 57       51965000   51976000   31499000   52131000
2 - 256 - 32       82974000   82734000   41054000   77682000
2 - 256 - 57      126200000  126430000   63351000  121610000
4 - 256 - 32      147090000  147980000   81463000  144880000
4 - 256 - 57      208370000  205530000  119840000  212600000
8 - 256 - 32      289220000  292860000  163770000  292960000
8 - 256 - 57      370880000  370980000  234220000  376430000
1 - 256 - 512      19340000   16375000   10019000   16680000
16 - 256 - 32     481950000  487150000  319550000  482810000
16 - 256 - 57     532610000  537800000  395490000  534610000
2 - 256 - 512      40464000   40117000   39374000   38620000
24 - 256 - 32     623040000  630610000  629110000  623460000
24 - 256 - 57     627970000  627070000  621930000  616900000
4 - 256 - 512      71434000   72413000   71542000   72638000
8 - 256 - 512     127510000  131770000  131960000  132340000
16 - 256 - 512    174070000  173140000  172230000  171480000
24 - 256 - 512    175350000  175110000  173480000  173070000

srsran (Mbps, More Is Better)
Test                                                  a        b        c        d
PDSCH Processor Benchmark, Throughput Total     13881.1  13645.2  10930.4  14074.7
PUSCH Processor Benchmark, Throughput Total      2302.7   2311.5   1962.2   2306.4
PDSCH Processor Benchmark, Throughput Thread      922.1    919.2    696.8   1176.4
PUSCH Processor Benchmark, Throughput Thread      142.2    142.2    106.4      178
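To compare the runs, the results above can be expressed as a percentage change relative to run a. The snippet below is a minimal sketch of that arithmetic using three rows copied from the overview; the short test labels, the `RESULTS` dictionary, and the function names are hypothetical and chosen only for illustration. Treating run a as the baseline is likewise an assumption, not something the export specifies.

```python
def percent_change(baseline, value):
    """Percentage change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0


# A few rows copied from the results overview: label -> (a, b, c, d).
RESULTS = {
    "llama-cpp Tulu-3-8B Prompt Processing 512": (39.29, 38.08, 30.4, 31.27),
    "liquid-dsp 1 - 256 - 32": (32814000, 32809000, 20243000, 33348000),
    "srsran PDSCH Throughput Thread": (922.1, 919.2, 696.8, 1176.4),
}


def compare_to_baseline(results, runs=("a", "b", "c", "d")):
    """Return {label: {run: percent change vs. the first run}}."""
    table = {}
    for label, values in results.items():
        baseline = values[0]
        table[label] = {
            run: percent_change(baseline, value)
            for run, value in zip(runs, values)
        }
    return table


if __name__ == "__main__":
    for label, changes in compare_to_baseline(RESULTS).items():
        cells = ", ".join(f"{run}: {pct:+.1f}%" for run, pct in changes.items())
        print(f"{label}: {cells}")
```

All metrics in this result file are "More Is Better", so a negative percentage means the run was slower than run a.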
Llama.cpp b4397 - Backend: CPU BLAS - Tokens Per Second, More Is Better - (CXX) g++ options: -O3
Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Text Generation 128 - a: 10.14, b: 10.10, c: 10.14, d: 10.11
Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 512 - a: 39.29, b: 38.08, c: 30.40, d: 31.27
Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 1024 - a: 38.36, b: 37.67, c: 30.83, d: 31.89
Model: Llama-3.1-Tulu-3-8B-Q8_0 - Test: Prompt Processing 2048 - a: 36.83, b: 34.19, c: 29.66, d: 31.43
Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Text Generation 128 - a: 10.62, b: 10.59, c: 10.12, d: 10.43
Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 512 - a: 37.40, b: 37.34, c: 30.92, d: 33.90
Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 1024 - a: 37.26, b: 35.73, c: 30.57, d: 32.86
Model: Mistral-7B-Instruct-v0.3-Q8_0 - Test: Prompt Processing 2048 - a: 34.55, b: 33.36, c: 29.49, d: 32.15
Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Text Generation 128 - a: 55.48, b: 50.29, c: 54.77, d: 53.94
Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 512 - a: 176.30, b: 145.77, c: 123.08, d: 147.40
Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 1024 - a: 171.34, b: 142.69, c: 119.29, d: 153.27
Model: granite-3.0-3b-a800m-instruct-Q8_0 - Test: Prompt Processing 2048 - a: 146.25, b: 128.98, c: 113.16, d: 147.57
Liquid-DSP 1.7 - samples/s, More Is Better - (CC) gcc options: -O3 -pthread -lm -lc -lliquid
Threads: 1 - Buffer Length: 256 - Filter Length: 32 - a: 32814000, b: 32809000, c: 20243000, d: 33348000
Threads: 1 - Buffer Length: 256 - Filter Length: 57 - a: 51965000, b: 51976000, c: 31499000, d: 52131000
Threads: 2 - Buffer Length: 256 - Filter Length: 32 - a: 82974000, b: 82734000, c: 41054000, d: 77682000
Threads: 2 - Buffer Length: 256 - Filter Length: 57 - a: 126200000, b: 126430000, c: 63351000, d: 121610000
Threads: 4 - Buffer Length: 256 - Filter Length: 32 - a: 147090000, b: 147980000, c: 81463000, d: 144880000
Threads: 4 - Buffer Length: 256 - Filter Length: 57 - a: 208370000, b: 205530000, c: 119840000, d: 212600000
Threads: 8 - Buffer Length: 256 - Filter Length: 32 - a: 289220000, b: 292860000, c: 163770000, d: 292960000
Threads: 8 - Buffer Length: 256 - Filter Length: 57 - a: 370880000, b: 370980000, c: 234220000, d: 376430000
Threads: 1 - Buffer Length: 256 - Filter Length: 512 - a: 19340000, b: 16375000, c: 10019000, d: 16680000
Threads: 16 - Buffer Length: 256 - Filter Length: 32 - a: 481950000, b: 487150000, c: 319550000, d: 482810000
Threads: 16 - Buffer Length: 256 - Filter Length: 57 - a: 532610000, b: 537800000, c: 395490000, d: 534610000
Threads: 2 - Buffer Length: 256 - Filter Length: 512 - a: 40464000, b: 40117000, c: 39374000, d: 38620000
Threads: 24 - Buffer Length: 256 - Filter Length: 32 - a: 623040000, b: 630610000, c: 629110000, d: 623460000
Threads: 24 - Buffer Length: 256 - Filter Length: 57 - a: 627970000, b: 627070000, c: 621930000, d: 616900000
Threads: 4 - Buffer Length: 256 - Filter Length: 512 - a: 71434000, b: 72413000, c: 71542000, d: 72638000
Threads: 8 - Buffer Length: 256 - Filter Length: 512 - a: 127510000, b: 131770000, c: 131960000, d: 132340000
Threads: 16 - Buffer Length: 256 - Filter Length: 512 - a: 174070000, b: 173140000, c: 172230000, d: 171480000
Threads: 24 - Buffer Length: 256 - Filter Length: 512 - a: 175350000, b: 175110000, c: 173480000, d: 173070000
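The Liquid-DSP results cover 1 to 24 threads, so one way to read them is as thread scaling: speedup over the single-thread result, and that speedup divided by the thread count. The snippet below is a rough sketch of that calculation using the run a figures for filter length 32 reported above; the names `RUN_A_FILTER_32` and `scaling_table` are hypothetical, and "efficiency" here is simply speedup per thread, not a metric reported by the Phoronix Test Suite.

```python
# Run a, Buffer Length 256, Filter Length 32: threads -> samples/s
# (values copied from the Liquid-DSP results above).
RUN_A_FILTER_32 = {
    1: 32814000,
    2: 82974000,
    4: 147090000,
    8: 289220000,
    16: 481950000,
    24: 623040000,
}


def scaling_table(throughput_by_threads):
    """Return [(threads, speedup, efficiency)] relative to the 1-thread result."""
    base = throughput_by_threads[1]
    rows = []
    for threads in sorted(throughput_by_threads):
        speedup = throughput_by_threads[threads] / base
        rows.append((threads, speedup, speedup / threads))
    return rows


if __name__ == "__main__":
    for threads, speedup, efficiency in scaling_table(RUN_A_FILTER_32):
        print(f"{threads:>2} threads: speedup {speedup:.2f}x, "
              f"efficiency {efficiency:.2f}")
```

The same function can be applied to any other run or filter length by swapping in the corresponding row of values.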
srsRAN Project 24.10 - Mbps, More Is Better - (CXX) g++ options: -O3 -march=native -mtune=generic -fno-trapping-math -fno-math-errno -ldl
Test: PDSCH Processor Benchmark, Throughput Total - a: 13881.1, b: 13645.2, c: 10930.4, d: 14074.7
Test: PUSCH Processor Benchmark, Throughput Total - a: 2302.7, b: 2311.5, c: 1962.2, d: 2306.4
Test: PDSCH Processor Benchmark, Throughput Thread - a: 922.1, b: 919.2, c: 696.8, d: 1176.4
Test: PUSCH Processor Benchmark, Throughput Thread - a: 142.2, b: 142.2, c: 106.4, d: 178.0
Phoronix Test Suite v10.8.5