rt up

AMD Ryzen 9 9950X 16-Core testing with an ASUS ROG STRIX X670E-E GAMING WIFI (2401 BIOS) and AMD Radeon PRO W7900 45GB on Ubuntu 24.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2410151-PTS-RTUP561730&grs.

rt up

Runs: a, b, c, d

Processor: AMD Ryzen 9 9950X 16-Core @ 5.75GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG STRIX X670E-E GAMING WIFI (2401 BIOS)
Chipset: AMD Device 14d8
Memory: 2 x 32GB DDR5-6400MT/s Corsair CMK64GX5M2B6400C32
Disk: Western Digital WD_BLACK SN850X 2000GB + 257GB Flash Drive
Graphics: AMD Radeon PRO W7900 45GB (2200/3200MHz)
Audio: AMD Navi 31 HDMI/DP
Monitor: DELL U2723QE
Network: Intel I225-V + Intel Wi-Fi 6E
OS: Ubuntu 24.04
Kernel: 6.10.1-061001-generic (x86_64)
Desktop: GNOME Shell 46.0
Display Server: X Server 1.21.1.11 + Wayland
OpenGL: 4.6 Mesa 24.2.0-devel (LLVM 18.1.7 DRM 3.57)
OpenCL: OpenCL 2.1 AMD-APP (3625.0)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
- Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance)
- CPU Microcode: 0xb404022

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- reg_file_data_sampling: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

rt up

Result summary (all tests: fewer is better)

Test                                              Unit          a         b         c         d
litert: Mobilenet Quant                           Microseconds  593.870   599.288   585.758   738.067
litert: NASNet Mobile                             Microseconds  10524.1   10303.4   10342.5   12903.4
litert: DeepLab V3                                Microseconds  1993.85   1916.71   1971.80   2250.63
litert: Quantized COCO SSD MobileNet v1           Microseconds  1237.78   1265.69   1273.91   1448.45
litert: Inception V4                              Microseconds  15724.4   15880.3   15836.1   17160.6
xnnpack: FP32MobileNetV1                          us            1024      995       1001      1072
litert: Inception ResNet V2                       Microseconds  14335.0   14386.0   14366.4   15403.4
onednn: Recurrent Neural Network Inference - CPU  ms            401.080   401.052   399.152   422.910
onednn: IP Shapes 3D - CPU                        ms            3.17092   3.18695   3.17892   3.35833
xnnpack: QS8MobileNetV2                           us            747       759       752       791
xnnpack: FP32MobileNetV3Large                     us            1601      1622      1618      1693
litert: Mobilenet Float                           Microseconds  1002.97   1003.998  1008.93   1060.57
onednn: Recurrent Neural Network Training - CPU   ms            752.845   753.326   752.460   795.016
xnnpack: FP32MobileNetV3Small                     us            902       913       907       950
onednn: IP Shapes 1D - CPU                        ms            0.636587  0.637647  0.638661  0.669839
xnnpack: FP16MobileNetV2                          us            1070      1066      1075      1112
onednn: Convolution Batch Shapes Auto - CPU       ms            5.24008   5.23454   5.24225   5.45077
xnnpack: FP32MobileNetV2                          us            1385      1355      1355      1409
xnnpack: FP16MobileNetV3Small                     us            851       858       861       884
xnnpack: FP16MobileNetV1                          us            1008      1008      1000      1038
xnnpack: FP16MobileNetV3Large                     us            1395      1397      1398      1440
onednn: Deconvolution Batch shapes_1d - CPU       ms            1.82100   1.81784   1.81460   1.86755
onednn: Deconvolution Batch shapes_3d - CPU       ms            1.40825   1.41365   1.40659   1.44600
litert: SqueezeNet                                Microseconds  1399.04   1377.42   1390.86   1408.46
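Because the tests use different units and scales, runs are easier to compare as ratios against one run than as raw numbers. The following is a minimal sketch of that arithmetic, using three rows copied from the summary above. Treating run a as the baseline is an assumption made only for illustration (the export does not say what differs between runs a, b, c and d), and the helper names `relative_to_baseline` and `geomean_ratio` are hypothetical, not part of the Phoronix Test Suite.

```python
from statistics import geometric_mean

# Three rows copied from the result summary; all are "fewer is better".
results = {
    "litert: Mobilenet Quant": {"a": 593.870, "b": 599.288, "c": 585.758, "d": 738.067},
    "xnnpack: FP32MobileNetV1": {"a": 1024, "b": 995, "c": 1001, "d": 1072},
    "onednn: IP Shapes 3D - CPU": {"a": 3.17092, "b": 3.18695, "c": 3.17892, "d": 3.35833},
}


def relative_to_baseline(rows, baseline="a"):
    """Divide every run's time by the baseline run's time, test by test.

    A ratio above 1.0 means the run took longer than the baseline.
    The choice of baseline is an assumption, not something the export states.
    """
    return {
        test: {run: value / runs[baseline] for run, value in runs.items()}
        for test, runs in rows.items()
    }


def geomean_ratio(ratios, run):
    """Geometric mean of one run's ratios across all tests."""
    return geometric_mean([per_test[run] for per_test in ratios.values()])


ratios = relative_to_baseline(results)
for run in "abcd":
    print(run, round(geomean_ratio(ratios, run), 4))
```

The geometric mean is used rather than the arithmetic mean so that a test with large absolute values (microseconds in the thousands) does not outweigh one measured in single-digit milliseconds.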

LiteRT

Model: Mobilenet Quant

LiteRT 2024-10-15 (Microseconds, Fewer Is Better)
a: 593.87 (SE +/- 2.40, N = 3)
b: 599.29 (SE +/- 0.53, N = 3)
c: 585.76 (SE +/- 6.33, N = 3)
d: 738.07 (SE +/- 10.36, N = 3)

LiteRT

Model: NASNet Mobile

LiteRT 2024-10-15 (Microseconds, Fewer Is Better)
a: 10524.1 (SE +/- 42.72, N = 3)
b: 10303.4 (SE +/- 51.10, N = 3)
c: 10342.5 (SE +/- 86.24, N = 3)
d: 12903.4 (SE +/- 102.57, N = 3)

LiteRT

Model: DeepLab V3

LiteRT 2024-10-15 (Microseconds, Fewer Is Better)
a: 1993.85 (SE +/- 10.75, N = 3)
b: 1916.71 (SE +/- 19.56, N = 3)
c: 1971.80 (SE +/- 10.88, N = 3)
d: 2250.63 (SE +/- 30.70, N = 3)

LiteRT

Model: Quantized COCO SSD MobileNet v1

LiteRT 2024-10-15 (Microseconds, Fewer Is Better)
a: 1237.78 (SE +/- 1.69, N = 3)
b: 1265.69 (SE +/- 7.74, N = 3)
c: 1273.91 (SE +/- 12.41, N = 6)
d: 1448.45 (SE +/- 10.93, N = 3)

LiteRT

Model: Inception V4

LiteRT 2024-10-15 (Microseconds, Fewer Is Better)
a: 15724.4 (SE +/- 91.27, N = 3)
b: 15880.3 (SE +/- 106.26, N = 3)
c: 15836.1 (SE +/- 31.83, N = 3)
d: 17160.6 (SE +/- 90.13, N = 3)

XNNPACK

Model: FP32MobileNetV1

XNNPACK b7b048 (us, Fewer Is Better)
a: 1024 (SE +/- 2.33, N = 3)
b: 995 (SE +/- 9.53, N = 3)
c: 1001 (SE +/- 5.51, N = 3)
d: 1072 (SE +/- 14.74, N = 3)
(CXX) g++ options: -O3 -lrt -lm

LiteRT

Model: Inception ResNet V2

LiteRT 2024-10-15 (Microseconds, Fewer Is Better)
a: 14335.0 (SE +/- 163.02, N = 3)
b: 14386.0 (SE +/- 120.32, N = 3)
c: 14366.4 (SE +/- 120.76, N = 3)
d: 15403.4 (SE +/- 38.80, N = 3)

oneDNN

Harness: Recurrent Neural Network Inference - Engine: CPU

oneDNN 3.6 (ms, Fewer Is Better)
a: 401.08 (SE +/- 0.27, N = 3, MIN: 386.16)
b: 401.05 (SE +/- 0.66, N = 3, MIN: 385.35)
c: 399.15 (SE +/- 0.53, N = 3, MIN: 384.47)
d: 422.91 (SE +/- 0.44, N = 3, MIN: 402.96)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: IP Shapes 3D - Engine: CPU

oneDNN 3.6 (ms, Fewer Is Better)
a: 3.17092 (SE +/- 0.01148, N = 3, MIN: 3.01)
b: 3.18695 (SE +/- 0.01628, N = 3, MIN: 3.01)
c: 3.17892 (SE +/- 0.01639, N = 3, MIN: 3.01)
d: 3.35833 (SE +/- 0.01850, N = 3, MIN: 3.02)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

XNNPACK

Model: QS8MobileNetV2

XNNPACK b7b048 (us, Fewer Is Better)
a: 747 (SE +/- 6.66, N = 3)
b: 759 (SE +/- 2.19, N = 3)
c: 752 (SE +/- 4.84, N = 3)
d: 791 (SE +/- 4.00, N = 3)
(CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV3Large

XNNPACK b7b048 (us, Fewer Is Better)
a: 1601 (SE +/- 14.53, N = 3)
b: 1622 (SE +/- 2.65, N = 3)
c: 1618 (SE +/- 5.13, N = 3)
d: 1693 (SE +/- 6.69, N = 3)
(CXX) g++ options: -O3 -lrt -lm

LiteRT

Model: Mobilenet Float

LiteRT 2024-10-15 (Microseconds, Fewer Is Better)
a: 1002.97 (SE +/- 1.06, N = 3)
b: 1004.00 (SE +/- 3.97, N = 3)
c: 1008.93 (SE +/- 3.98, N = 3)
d: 1060.57 (SE +/- 3.94, N = 3)

oneDNN

Harness: Recurrent Neural Network Training - Engine: CPU

oneDNN 3.6 (ms, Fewer Is Better)
a: 752.85 (SE +/- 1.48, N = 3, MIN: 726.85)
b: 753.33 (SE +/- 0.30, N = 3, MIN: 729.83)
c: 752.46 (SE +/- 0.72, N = 3, MIN: 726.66)
d: 795.02 (SE +/- 1.11, N = 3, MIN: 763.71)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

XNNPACK

Model: FP32MobileNetV3Small

XNNPACK b7b048 (us, Fewer Is Better)
a: 902 (SE +/- 3.61, N = 3)
b: 913 (SE +/- 1.45, N = 3)
c: 907 (SE +/- 3.28, N = 3)
d: 950 (SE +/- 3.18, N = 3)
(CXX) g++ options: -O3 -lrt -lm

oneDNN

Harness: IP Shapes 1D - Engine: CPU

oneDNN 3.6 (ms, Fewer Is Better)
a: 0.636587 (SE +/- 0.002186, N = 3, MIN: 0.6)
b: 0.637647 (SE +/- 0.003981, N = 3, MIN: 0.6)
c: 0.638661 (SE +/- 0.003543, N = 3, MIN: 0.59)
d: 0.669839 (SE +/- 0.004097, N = 3, MIN: 0.59)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

XNNPACK

Model: FP16MobileNetV2

XNNPACK b7b048 (us, Fewer Is Better)
a: 1070 (SE +/- 5.49, N = 3)
b: 1066 (SE +/- 6.11, N = 3)
c: 1075 (SE +/- 0.88, N = 3)
d: 1112 (SE +/- 2.19, N = 3)
(CXX) g++ options: -O3 -lrt -lm

oneDNN

Harness: Convolution Batch Shapes Auto - Engine: CPU

oneDNN 3.6 (ms, Fewer Is Better)
a: 5.24008 (SE +/- 0.01241, N = 3, MIN: 4.93)
b: 5.23454 (SE +/- 0.01667, N = 3, MIN: 4.93)
c: 5.24225 (SE +/- 0.01347, N = 3, MIN: 4.94)
d: 5.45077 (SE +/- 0.00366, N = 3, MIN: 4.96)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

XNNPACK

Model: FP32MobileNetV2

XNNPACK b7b048 (us, Fewer Is Better)
a: 1385 (SE +/- 8.45, N = 3)
b: 1355 (SE +/- 8.33, N = 3)
c: 1355 (SE +/- 6.39, N = 3)
d: 1409 (SE +/- 30.28, N = 3)
(CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV3Small

XNNPACK b7b048 (us, Fewer Is Better)
a: 851 (SE +/- 1.76, N = 3)
b: 858 (SE +/- 4.70, N = 3)
c: 861 (SE +/- 1.67, N = 3)
d: 884 (SE +/- 4.41, N = 3)
(CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV1

XNNPACK b7b048 (us, Fewer Is Better)
a: 1008 (SE +/- 6.17, N = 3)
b: 1008 (SE +/- 5.04, N = 3)
c: 1000 (SE +/- 7.22, N = 3)
d: 1038 (SE +/- 6.08, N = 3)
(CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV3Large

XNNPACK b7b048 (us, Fewer Is Better)
a: 1395 (SE +/- 6.84, N = 3)
b: 1397 (SE +/- 2.65, N = 3)
c: 1398 (SE +/- 5.21, N = 3)
d: 1440 (SE +/- 5.51, N = 3)
(CXX) g++ options: -O3 -lrt -lm

oneDNN

Harness: Deconvolution Batch shapes_1d - Engine: CPU

oneDNN 3.6 (ms, Fewer Is Better)
a: 1.82100 (SE +/- 0.01078, N = 3, MIN: 1.4)
b: 1.81784 (SE +/- 0.00686, N = 3, MIN: 1.4)
c: 1.81460 (SE +/- 0.00929, N = 3, MIN: 1.38)
d: 1.86755 (SE +/- 0.01009, N = 3, MIN: 1.39)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN

Harness: Deconvolution Batch shapes_3d - Engine: CPU

oneDNN 3.6 (ms, Fewer Is Better)
a: 1.40825 (SE +/- 0.00511, N = 3, MIN: 1.33)
b: 1.41365 (SE +/- 0.00648, N = 3, MIN: 1.33)
c: 1.40659 (SE +/- 0.00405, N = 3, MIN: 1.33)
d: 1.44600 (SE +/- 0.00140, N = 3, MIN: 1.33)
(CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

LiteRT

Model: SqueezeNet

LiteRT 2024-10-15 (Microseconds, Fewer Is Better)
a: 1399.04 (SE +/- 2.23, N = 3)
b: 1377.42 (SE +/- 12.91, N = 6)
c: 1390.86 (SE +/- 5.30, N = 3)
d: 1408.46 (SE +/- 3.75, N = 3)


Phoronix Test Suite v10.8.5