ldld

AMD Ryzen 7 PRO 5850U testing with a LENOVO ThinkPad T14s Gen 2a 20XF004WUS (R1NET57W 1.27 BIOS) and AMD Radeon Vega / Mobile 1GB on Fedora Linux 39 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2402198-NE-LDLD6476438&rdt&grr.

ldldProcessorMotherboardChipsetMemoryDiskGraphicsAudioNetworkOSKernelDesktopDisplay ServerOpenGLCompilerFile-SystemScreen ResolutionabcAMD Ryzen 7 PRO 5850U @ 4.51GHz (8 Cores / 16 Threads)LENOVO ThinkPad T14s Gen 2a 20XF004WUS (R1NET57W 1.27 BIOS)AMD Renoir/Cezanne2 x 16GB LPDDR4-4266MT/s Micron MT53E2G32D4NQ-0461024GB SAMSUNG MZVLB1T0HBLR-000L7AMD Radeon Vega / Mobile 1GBAMD Renoir Radeon HD AudioRealtek RTL8111/8168/8411 + MEDIATEK MT7921 802.11ax PCIFedora Linux 396.5.8-300.fc39.x86_64 (x86_64)GNOME Shell 45.0X Server + Wayland4.6 Mesa 23.2.1 (LLVM 16.0.6 DRM 3.54)GCC 13.2.1 20230918btrfs3840x2160OpenBenchmarking.orgKernel Details- Transparent Huge Pages: madviseCompiler Details- --build=x86_64-redhat-linux --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-initfini-array --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,m2,lto --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-multilib --enable-offload-defaulted --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=i686 --with-build-config=bootstrap-lto --with-gcc-major-version-only --with-libstdcxx-zoneinfo=/usr/share/zoneinfo --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver Processor Details- Scaling Governor: amd-pstate-epp powersave (EPP: performance) - Platform Profile: balanced - CPU Microcode: 0xa50000d - ACPI Profile: balanced Graphics Details- BAR1 / Visible vRAM Size: 1024 MBPython Details- Python 3.12.0Security Details- SELinux + gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of safe RET no microcode + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

ldldnamd: STMV with 1,066,628 Atomsvkfft: FFT + iFFT C2C Bluestein benchmark in double precisionvkfft: FFT + iFFT C2C 1D batched in double precisionvkfft: FFT + iFFT C2C 1D batched in single precisiongromacs: MPI CPU - water_GMX50_bareoidn: RTLightmap.hdr.4096x4096 - CPU-Onlyvkfft: FFT + iFFT C2C 1D batched in single precision, no reshufflingonnx: Faster R-CNN R-50-FPN-int8 - CPU - Standardonnx: Faster R-CNN R-50-FPN-int8 - CPU - Standardvkfft: FFT + iFFT C2C Bluestein in single precisionnamd: ATPase with 327,506 Atomsoidn: RT.hdr_alb_nrm.3840x2160 - CPU-Onlyoidn: RT.ldr_alb_nrm.3840x2160 - CPU-Onlyvkfft: FFT + iFFT C2C multidimensional in single precisionvkfft: FFT + iFFT C2C 1D batched in half precisiondav1d: Chimera 1080p 10-bitvkfft: FFT + iFFT R2C / C2Ronnx: fcn-resnet101-11 - CPU - Standardonnx: fcn-resnet101-11 - CPU - Standardonnx: GPT-2 - CPU - Standardonnx: GPT-2 - CPU - Standardonnx: yolov4 - CPU - Standardonnx: yolov4 - CPU - Standardonnx: bertsquad-12 - CPU - Standardonnx: bertsquad-12 - CPU - Standardonnx: T5 Encoder - CPU - Standardonnx: T5 Encoder - CPU - Standardonnx: ArcFace ResNet-100 - CPU - Standardonnx: ArcFace ResNet-100 - CPU - Standardonnx: CaffeNet 12-int8 - CPU - Standardonnx: CaffeNet 12-int8 - CPU - Standardonnx: ResNet50 v1-12-int8 - CPU - Standardonnx: ResNet50 v1-12-int8 - CPU - Standardonnx: super-resolution-10 - CPU - Standardonnx: super-resolution-10 - CPU - Standarddav1d: Summer Nature 4Kdav1d: Chimera 1080pcompress-lz4: 3 - Decompression Speedcompress-lz4: 3 - Compression Speedcompress-lz4: 9 - Decompression Speedcompress-lz4: 9 - Compression Speedcompress-lz4: 1 - Decompression Speedcompress-lz4: 1 - Compression Speeddav1d: Summer Nature 1080pabc0.09658912268361260.5950.10645346.154721.664315400.320430.210.21330811894306.0732671274.370.78469713.51773.9314211.424.72981128.1587.8025211.225589.051757.012317.53914.28653233.210.292797.132417.440957.3281124.8373.164203.1119.474258.639.064566.3779.42529.590.09286926261060780.5830.10652556.366817.765415560.308290.200.20334412032279.4133461324.750.75485513.669873.1053220.8344.52820133.9567.4648311.265588.739157.796517.30144.47781223.24210.722593.242718.460154.1641119.84346.733836.5110.844006.137.654200.9724.19498.530.09314928267161410.5850.10646556.617317.660715790.309930.200.20336812045284.8433921324.380.75506513.718872.8414220.3844.53743133.6777.4803811.268388.715957.922917.26334.51337221.47810.631694.036618.54653.9127120.36348.763895.6110.794027.637.324218.9718.46499.42OpenBenchmarking.org

NAMD

Input: STMV with 1,066,628 Atoms

OpenBenchmarking.orgns/day, More Is BetterNAMD 3.0b6Input: STMV with 1,066,628 Atomsabc0.02170.04340.06510.08680.1085SE +/- 0.00005, N = 30.096580.092860.09314

VkFFT

Test: FFT + iFFT C2C Bluestein benchmark in double precision

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT C2C Bluestein benchmark in double precisionabc2004006008001000SE +/- 7.54, N = 39129269281. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in double precision

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT C2C 1D batched in double precisionabc6001200180024003000SE +/- 25.37, N = 62683261026711. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT C2C 1D batched in single precisionabc13002600390052006500SE +/- 73.26, N = 36126607861411. (CXX) g++ options: -O3

GROMACS

Implementation: MPI CPU - Input: water_GMX50_bare

OpenBenchmarking.orgNs Per Day, More Is BetterGROMACS 2024Implementation: MPI CPU - Input: water_GMX50_bareabc0.13390.26780.40170.53560.6695SE +/- 0.001, N = 30.5950.5830.5851. (CXX) g++ options: -O3 -lm

Intel Open Image Denoise

Run: RTLightmap.hdr.4096x4096 - Device: CPU-Only

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.2Run: RTLightmap.hdr.4096x4096 - Device: CPU-Onlyabc0.02250.0450.06750.090.1125SE +/- 0.00, N = 30.100.100.10

VkFFT

Test: FFT + iFFT C2C 1D batched in single precision, no reshuffling

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT C2C 1D batched in single precision, no reshufflingabc14002800420056007000SE +/- 17.74, N = 36453652564651. (CXX) g++ options: -O3

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.17Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standardabc1326395265SE +/- 0.55, N = 1546.1556.3756.621. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.17Model: Faster R-CNN R-50-FPN-int8 - Device: CPU - Executor: Standardabc510152025SE +/- 0.19, N = 1521.6617.7717.661. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

VkFFT

Test: FFT + iFFT C2C Bluestein in single precision

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT C2C Bluestein in single precisionabc30060090012001500SE +/- 10.48, N = 31540155615791. (CXX) g++ options: -O3

NAMD

Input: ATPase with 327,506 Atoms

OpenBenchmarking.orgns/day, More Is BetterNAMD 3.0b6Input: ATPase with 327,506 Atomsabc0.07210.14420.21630.28840.3605SE +/- 0.00004, N = 30.320430.308290.30993

Intel Open Image Denoise

Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Only

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.2Run: RT.hdr_alb_nrm.3840x2160 - Device: CPU-Onlyabc0.04730.09460.14190.18920.2365SE +/- 0.00, N = 30.210.200.20

Intel Open Image Denoise

Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Only

OpenBenchmarking.orgImages / Sec, More Is BetterIntel Open Image Denoise 2.2Run: RT.ldr_alb_nrm.3840x2160 - Device: CPU-Onlyabc0.04730.09460.14190.18920.2365SE +/- 0.00, N = 30.210.200.20

VkFFT

Test: FFT + iFFT C2C multidimensional in single precision

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT C2C multidimensional in single precisionabc7001400210028003500SE +/- 11.68, N = 33308334433681. (CXX) g++ options: -O3

VkFFT

Test: FFT + iFFT C2C 1D batched in half precision

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT C2C 1D batched in half precisionabc3K6K9K12K15KSE +/- 36.70, N = 31189412032120451. (CXX) g++ options: -O3

dav1d

Video Input: Chimera 1080p 10-bit

OpenBenchmarking.orgFPS, More Is Betterdav1d 1.4Video Input: Chimera 1080p 10-bitabc70140210280350SE +/- 2.30, N = 15306.07279.41284.841. (CC) gcc options: -pthread

VkFFT

Test: FFT + iFFT R2C / C2R

OpenBenchmarking.orgBenchmark Score, More Is BetterVkFFT 1.3.4Test: FFT + iFFT R2C / C2Rabc7001400210028003500SE +/- 2.65, N = 33267334633921. (CXX) g++ options: -O3

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.17Model: fcn-resnet101-11 - Device: CPU - Executor: Standardabc30060090012001500SE +/- 1.09, N = 31274.371324.751324.381. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: fcn-resnet101-11 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.17Model: fcn-resnet101-11 - Device: CPU - Executor: Standardabc0.17660.35320.52980.70640.883SE +/- 0.000621, N = 30.7846970.7548550.7550651. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.17Model: GPT-2 - Device: CPU - Executor: Standardabc48121620SE +/- 0.01, N = 313.5213.6713.721. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: GPT-2 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.17Model: GPT-2 - Device: CPU - Executor: Standardabc1632486480SE +/- 0.04, N = 373.9373.1172.841. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.17Model: yolov4 - Device: CPU - Executor: Standardabc50100150200250SE +/- 0.35, N = 3211.42220.83220.381. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: yolov4 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.17Model: yolov4 - Device: CPU - Executor: Standardabc1.06422.12843.19264.25685.321SE +/- 0.00718, N = 34.729814.528204.537431. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.17Model: bertsquad-12 - Device: CPU - Executor: Standardabc306090120150SE +/- 0.06, N = 3128.16133.96133.681. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: bertsquad-12 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.17Model: bertsquad-12 - Device: CPU - Executor: Standardabc246810SE +/- 0.00345, N = 37.802527.464837.480381. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.17Model: T5 Encoder - Device: CPU - Executor: Standardabc3691215SE +/- 0.02, N = 311.2311.2711.271. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: T5 Encoder - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.17Model: T5 Encoder - Device: CPU - Executor: Standardabc20406080100SE +/- 0.15, N = 389.0588.7488.721. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.17Model: ArcFace ResNet-100 - Device: CPU - Executor: Standardabc1326395265SE +/- 0.16, N = 357.0157.8057.921. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ArcFace ResNet-100 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.17Model: ArcFace ResNet-100 - Device: CPU - Executor: Standardabc48121620SE +/- 0.05, N = 317.5417.3017.261. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.17Model: CaffeNet 12-int8 - Device: CPU - Executor: Standardabc1.01552.0313.04654.0625.0775SE +/- 0.01608, N = 34.286534.477814.513371. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: CaffeNet 12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.17Model: CaffeNet 12-int8 - Device: CPU - Executor: Standardabc50100150200250SE +/- 0.80, N = 3233.20223.24221.481. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.17Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standardabc3691215SE +/- 0.02, N = 310.2910.7210.631. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.17Model: ResNet50 v1-12-int8 - Device: CPU - Executor: Standardabc20406080100SE +/- 0.21, N = 397.1393.2494.041. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInference Time Cost (ms), Fewer Is BetterONNX Runtime 1.17Model: super-resolution-10 - Device: CPU - Executor: Standardabc510152025SE +/- 0.06, N = 317.4418.4618.551. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

ONNX Runtime

Model: super-resolution-10 - Device: CPU - Executor: Standard

OpenBenchmarking.orgInferences Per Second, More Is BetterONNX Runtime 1.17Model: super-resolution-10 - Device: CPU - Executor: Standardabc1326395265SE +/- 0.17, N = 357.3354.1653.911. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=auto -fno-fat-lto-objects -ldl -lrt

dav1d

Video Input: Summer Nature 4K

OpenBenchmarking.orgFPS, More Is Betterdav1d 1.4Video Input: Summer Nature 4Kabc306090120150SE +/- 0.11, N = 3124.80119.84120.361. (CC) gcc options: -pthread

dav1d

Video Input: Chimera 1080p

OpenBenchmarking.orgFPS, More Is Betterdav1d 1.4Video Input: Chimera 1080pabc80160240320400SE +/- 2.21, N = 3373.16346.73348.761. (CC) gcc options: -pthread

LZ4 Compression

Compression Level: 3 - Decompression Speed

OpenBenchmarking.orgMB/s, More Is BetterLZ4 Compression 1.9.4Compression Level: 3 - Decompression Speedabc9001800270036004500SE +/- 19.26, N = 34203.13836.53895.61. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 3 - Compression Speed

OpenBenchmarking.orgMB/s, More Is BetterLZ4 Compression 1.9.4Compression Level: 3 - Compression Speedabc306090120150SE +/- 0.32, N = 3119.47110.84110.791. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 9 - Decompression Speed

OpenBenchmarking.orgMB/s, More Is BetterLZ4 Compression 1.9.4Compression Level: 9 - Decompression Speedabc9001800270036004500SE +/- 6.24, N = 34258.64006.14027.61. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 9 - Compression Speed

OpenBenchmarking.orgMB/s, More Is BetterLZ4 Compression 1.9.4Compression Level: 9 - Compression Speedabc918273645SE +/- 0.11, N = 339.0637.6537.321. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 1 - Decompression Speed

OpenBenchmarking.orgMB/s, More Is BetterLZ4 Compression 1.9.4Compression Level: 1 - Decompression Speedabc10002000300040005000SE +/- 39.91, N = 34566.34200.94218.91. (CC) gcc options: -O3

LZ4 Compression

Compression Level: 1 - Compression Speed

OpenBenchmarking.orgMB/s, More Is BetterLZ4 Compression 1.9.4Compression Level: 1 - Compression Speedabc2004006008001000SE +/- 6.53, N = 3779.42724.19718.461. (CC) gcc options: -O3

dav1d

Video Input: Summer Nature 1080p

OpenBenchmarking.orgFPS, More Is Betterdav1d 1.4Video Input: Summer Nature 1080pabc110220330440550SE +/- 4.79, N = 6529.59498.53499.421. (CC) gcc options: -pthread


Phoronix Test Suite v10.8.5