ran xsmm AMD Ryzen 9 7950X 16-Core testing with a ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS) and AMD Radeon RX 7900 XTX on Ubuntu 23.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2306214-PTS-RANXSMM255&sor .
ran xsmm Processor Motherboard Chipset Memory Disk Graphics Audio Monitor Network OS Kernel Desktop Display Server OpenGL Compiler File-System Screen Resolution a b c d AMD Ryzen 9 7950X 16-Core @ 4.50GHz (16 Cores / 32 Threads) ASUS ROG STRIX X670E-E GAMING WIFI (1416 BIOS) AMD Device 14d8 32GB Western Digital WD_BLACK SN850X 1000GB AMD Radeon RX 7900 XTX (2304/1249MHz) AMD Device ab30 ASUS MG28U Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411 Ubuntu 23.04 6.2.0-20-generic (x86_64) GNOME Shell 44.1 X Server 1.21.1.7 + Wayland 4.6 Mesa 23.2.0-devel (git-926e97d 2023-06-12 lunar-oibaf-ppa) (LLVM 15.0.7 DRM 3.49) GCC 12.2.0 ext4 3840x2160 OpenBenchmarking.org Kernel Details - Transparent Huge Pages: madvise Compiler Details - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-Pa930Z/gcc-12-12.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details - Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0xa601203 Security Details - itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected
ran xsmm libxsmm: 128 libxsmm: 256 libxsmm: 32 libxsmm: 64 srsran: Downlink Processor Benchmark srsran: PUSCH Processor Benchmark, Throughput Total srsran: PUSCH Processor Benchmark, Throughput Thread a b c d 470.4 818.1 119.3 241.8 1044 5402.4 335.2 470.6 806.0 119.3 241.9 1008.7 5404.7 328.6 471 823.6 119.3 241.7 1074.3 5390.4 334.9 470.8 820 119.2 241.5 1068.4 5370.9 332.7 OpenBenchmarking.org
libxsmm M N K: 128 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 128 c d b a 100 200 300 400 500 SE +/- 0.12, N = 3 471.0 470.8 470.6 470.4 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2
libxsmm M N K: 256 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 256 c d a b 200 400 600 800 1000 SE +/- 8.02, N = 6 823.6 820.0 818.1 806.0 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2
libxsmm M N K: 32 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 32 c b a d 30 60 90 120 150 SE +/- 0.06, N = 3 119.3 119.3 119.3 119.2 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2
libxsmm M N K: 64 OpenBenchmarking.org GFLOPS/s, More Is Better libxsmm 2-1.17-3645 M N K: 64 b a c d 50 100 150 200 250 SE +/- 0.21, N = 3 241.9 241.8 241.7 241.5 1. (CXX) g++ options: -dynamic -Bstatic -static-libgcc -lgomp -lm -lrt -ldl -lquadmath -lstdc++ -pthread -fPIC -std=c++14 -O2 -fopenmp -fopenmp-simd -funroll-loops -ftree-vectorize -fdata-sections -ffunction-sections -fvisibility=hidden -march=core-avx2
srsRAN Project Test: Downlink Processor Benchmark OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.5 Test: Downlink Processor Benchmark c d a b 200 400 600 800 1000 SE +/- 43.96, N = 12 1074.3 1068.4 1044.0 1008.7 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Total OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.5 Test: PUSCH Processor Benchmark, Throughput Total b a c d 1200 2400 3600 4800 6000 SE +/- 1.42, N = 3 5404.7 5402.4 5390.4 5370.9 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
srsRAN Project Test: PUSCH Processor Benchmark, Throughput Thread OpenBenchmarking.org Mbps, More Is Better srsRAN Project 23.5 Test: PUSCH Processor Benchmark, Throughput Thread a c d b 70 140 210 280 350 SE +/- 3.20, N = 15 335.2 334.9 332.7 328.6 1. (CXX) g++ options: -march=native -mfma -O3 -fno-trapping-math -fno-math-errno -lgtest
Phoronix Test Suite v10.8.5