new rn tr

AMD Ryzen Threadripper 7980X 64-Cores testing with a System76 Thelio Major (FA Z5 BIOS) and AMD Radeon RX 6700 XT 12GB on Ubuntu 24.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2410161-PTS-NEWRNTR258
new rn tr: System Details

  Runs:              a, b, c, d
  Processor:         AMD Ryzen Threadripper 7980X 64-Cores @ 5.37GHz (64 Cores / 128 Threads)
  Motherboard:       System76 Thelio Major (FA Z5 BIOS)
  Chipset:           AMD Device 14a4
  Memory:            4 x 32GB DDR5-4800MT/s Micron MTC20F1045S1RC48BA2
  Disk:              1000GB CT1000T700SSD5
  Graphics:          AMD Radeon RX 6700 XT 12GB
  Audio:             AMD Device 14cc
  Monitor:           DELL P2415Q
  Network:           Aquantia AQC113C NBase-T/IEEE + Realtek RTL8125 2.5GbE + Intel Wi-Fi 6E
  OS:                Ubuntu 24.10
  Kernel:            6.11.0-8-generic (x86_64)
  Desktop:           GNOME Shell 47.0
  Display Server:    X Server + Wayland
  OpenGL:            4.6 Mesa 24.2.3-1ubuntu1 (LLVM 19.1.0 DRM 3.58)
  Compiler:          GCC 14.2.0
  File-System:       ext4
  Screen Resolution: 1920x1200

Kernel Details
  - Transparent Huge Pages: madvise

Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-14-zdkDXv/gcc-14-14.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details
  - Scaling Governor: amd-pstate-epp powersave (Boost: Enabled EPP: balance_performance)
  - CPU Microcode: 0xa108105

Security Details
  - gather_data_sampling: Not affected
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - mmio_stale_data: Not affected
  - reg_file_data_sampling: Not affected
  - retbleed: Not affected
  - spec_rstack_overflow: Mitigation of Safe RET
  - spec_store_bypass: Mitigation of SSB disabled via prctl
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
  - srbds: Not affected
  - tsx_async_abort: Not affected
new rn tr: Result Summary

All results: fewer is better. oneDNN values are in ms; LiteRT and XNNPACK values are in microseconds.

  Test                                               a          b          c          d
  onednn: IP Shapes 1D - CPU                         0.582029   0.573270   0.577303   0.572739
  onednn: IP Shapes 3D - CPU                         0.335141   0.337890   0.333022   0.337384
  onednn: Convolution Batch Shapes Auto - CPU        0.557676   0.554351   0.560390   0.554405
  onednn: Deconvolution Batch shapes_1d - CPU        7.46041    7.42610    7.50405    7.48971
  onednn: Deconvolution Batch shapes_3d - CPU        1.02976    1.02672    1.02743    1.02498
  onednn: Recurrent Neural Network Training - CPU    546.398    548.732    548.888    550.540
  onednn: Recurrent Neural Network Inference - CPU   326.236    326.788    325.448    325.778
  litert: DeepLab V3                                 13513.8    12117.3    11896.5    12363.4
  litert: SqueezeNet                                 3266.42    3303.96    3264.06    3282.50
  litert: Inception V4                               26612.8    26574.6    26433.2    26390.1
  litert: NASNet Mobile                              173455     156703     142541     151275
  litert: Mobilenet Float                            2102.99    2141.96    2112.35    2132.45
  litert: Mobilenet Quant                            15789.8    16068.8    16501.2    15994.0
  litert: Inception ResNet V2                        34923.9    33651.9    34821.7    34742.1
  litert: Quantized COCO SSD MobileNet v1            7924.86    7935.03    7672.13    7531.43
  xnnpack: FP32MobileNetV1                           2119       2156       2156       2159
  xnnpack: FP32MobileNetV2                           4032       4053       4055       4007
  xnnpack: FP32MobileNetV3Large                      5962       6005       6047       5932
  xnnpack: FP32MobileNetV3Small                      4238       4271       4269       4285
  xnnpack: FP16MobileNetV1                           2031       2052       2070       2052
  xnnpack: FP16MobileNetV2                           3743       3804       3825       3762
  xnnpack: FP16MobileNetV3Large                      5624       5748       5827       5806
  xnnpack: FP16MobileNetV3Small                      4164       4301       4255       4294
  xnnpack: QS8MobileNetV2                            4031       3988       4025       3985
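The four runs share one system configuration, so the summary values above are mostly useful for judging run-to-run variation. The following is a minimal Python sketch of one way to quantify that, not something the Phoronix Test Suite itself produces. The helper names (`spread_percent`, `fastest_run`, `summarize`) are hypothetical, and only four rows are copied in for brevity; the numbers themselves come straight from the summary.

```python
# Per-run results in the order (a, b, c, d), copied from the result summary.
# Every metric here is "fewer is better". Only a subset of rows is included.
RESULTS = {
    "onednn: IP Shapes 1D - CPU": (0.582029, 0.573270, 0.577303, 0.572739),
    "litert: DeepLab V3": (13513.8, 12117.3, 11896.5, 12363.4),
    "litert: NASNet Mobile": (173455, 156703, 142541, 151275),
    "xnnpack: QS8MobileNetV2": (4031, 3988, 4025, 3985),
}
RUNS = ("a", "b", "c", "d")


def spread_percent(values):
    """Gap between the slowest and fastest run, as a percentage of the fastest."""
    best = min(values)
    worst = max(values)
    return (worst - best) / best * 100.0


def fastest_run(values):
    """Label of the run with the lowest (best) value; first one wins on ties."""
    return RUNS[values.index(min(values))]


def summarize(results):
    """Map each test name to (fastest run label, spread in percent, 2 decimals)."""
    return {
        name: (fastest_run(vals), round(spread_percent(vals), 2))
        for name, vals in results.items()
    }


if __name__ == "__main__":
    for name, (run, spread) in summarize(RESULTS).items():
        print(f"{name}: fastest run {run}, spread {spread}%")
```

Using the fastest run as the baseline is just one convention; the mean or median of the four runs would serve equally well. On the rows included here, the two LiteRT entries show a markedly larger spread than the oneDNN and XNNPACK entries.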
oneDNN 3.6 (ms, Fewer Is Better)

The export lists three SE values for the four runs of each test; they are given in the exported order, not assigned to specific runs.
All oneDNN tests: 1. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl

oneDNN - Harness: IP Shapes 1D - Engine: CPU
  a: 0.582029 (MIN: 0.55)
  b: 0.573270 (MIN: 0.54)
  c: 0.577303 (MIN: 0.54)
  d: 0.572739 (MIN: 0.54)
  SE +/- 0.003697, N = 3; SE +/- 0.000734, N = 3; SE +/- 0.004087, N = 3

oneDNN - Harness: IP Shapes 3D - Engine: CPU
  a: 0.335141 (MIN: 0.31)
  b: 0.337890 (MIN: 0.31)
  c: 0.333022 (MIN: 0.31)
  d: 0.337384 (MIN: 0.31)
  SE +/- 0.001917, N = 3; SE +/- 0.002114, N = 3; SE +/- 0.001618, N = 3

oneDNN - Harness: Convolution Batch Shapes Auto - Engine: CPU
  a: 0.557676 (MIN: 0.52)
  b: 0.554351 (MIN: 0.51)
  c: 0.560390 (MIN: 0.52)
  d: 0.554405 (MIN: 0.52)
  SE +/- 0.004992, N = 3; SE +/- 0.000955, N = 3; SE +/- 0.003803, N = 3

oneDNN - Harness: Deconvolution Batch shapes_1d - Engine: CPU
  a: 7.46041 (MIN: 6.52)
  b: 7.42610 (MIN: 6.55)
  c: 7.50405 (MIN: 4.63)
  d: 7.48971 (MIN: 4.6)
  SE +/- 0.00222, N = 3; SE +/- 0.04488, N = 3; SE +/- 0.07017, N = 3

oneDNN - Harness: Deconvolution Batch shapes_3d - Engine: CPU
  a: 1.02976 (MIN: 0.96)
  b: 1.02672 (MIN: 0.97)
  c: 1.02743 (MIN: 0.96)
  d: 1.02498 (MIN: 0.96)
  SE +/- 0.00042, N = 3; SE +/- 0.00238, N = 3; SE +/- 0.00347, N = 3

oneDNN - Harness: Recurrent Neural Network Training - Engine: CPU
  a: 546.40 (MIN: 540.95)
  b: 548.73 (MIN: 542.36)
  c: 548.89 (MIN: 542.07)
  d: 550.54 (MIN: 542.26)
  SE +/- 0.35, N = 3; SE +/- 0.64, N = 3; SE +/- 1.00, N = 3

oneDNN - Harness: Recurrent Neural Network Inference - Engine: CPU
  a: 326.24 (MIN: 322.1)
  b: 326.79 (MIN: 319.81)
  c: 325.45 (MIN: 321.39)
  d: 325.78 (MIN: 320.56)
  SE +/- 1.83, N = 3; SE +/- 0.28, N = 3; SE +/- 0.91, N = 3
LiteRT 2024-10-15 (Microseconds, Fewer Is Better)

The export lists three SE values for the four runs of each test; they are given in the exported order, not assigned to specific runs.

LiteRT - Model: DeepLab V3
  a: 13513.8
  b: 12117.3
  c: 11896.5
  d: 12363.4
  SE +/- 244.20, N = 15; SE +/- 118.42, N = 3; SE +/- 50.22, N = 3

LiteRT - Model: SqueezeNet
  a: 3266.42
  b: 3303.96
  c: 3264.06
  d: 3282.50
  SE +/- 24.67, N = 3; SE +/- 34.80, N = 4; SE +/- 2.50, N = 3

LiteRT - Model: Inception V4
  a: 26612.8
  b: 26574.6
  c: 26433.2
  d: 26390.1
  SE +/- 227.17, N = 14; SE +/- 209.28, N = 3; SE +/- 113.50, N = 3

LiteRT - Model: NASNet Mobile
  a: 173455
  b: 156703
  c: 142541
  d: 151275
  SE +/- 1880.66, N = 3; SE +/- 1575.26, N = 3; SE +/- 2554.81, N = 12

LiteRT - Model: Mobilenet Float
  a: 2102.99
  b: 2141.96
  c: 2112.35
  d: 2132.45
  SE +/- 20.11, N = 3; SE +/- 12.05, N = 3; SE +/- 14.67, N = 3

LiteRT - Model: Mobilenet Quant
  a: 15789.8
  b: 16068.8
  c: 16501.2
  d: 15994.0
  SE +/- 321.23, N = 12; SE +/- 268.08, N = 13; SE +/- 187.77, N = 3

LiteRT - Model: Inception ResNet V2
  a: 34923.9
  b: 33651.9
  c: 34821.7
  d: 34742.1
  SE +/- 242.28, N = 3; SE +/- 330.78, N = 15; SE +/- 375.95, N = 3

LiteRT - Model: Quantized COCO SSD MobileNet v1
  a: 7924.86
  b: 7935.03
  c: 7672.13
  d: 7531.43
  SE +/- 199.12, N = 12; SE +/- 83.25, N = 15; SE +/- 184.40, N = 12
XNNPACK b7b048 (us, Fewer Is Better)

The export lists three SE values for the four runs of each test; they are given in the exported order, not assigned to specific runs.
All XNNPACK tests: 1. (CXX) g++ options: -O3 -lrt -lm

XNNPACK - Model: FP32MobileNetV1
  a: 2119
  b: 2156
  c: 2156
  d: 2159
  SE +/- 28.29, N = 3; SE +/- 15.37, N = 3; SE +/- 9.40, N = 3

XNNPACK - Model: FP32MobileNetV2
  a: 4032
  b: 4053
  c: 4055
  d: 4007
  SE +/- 40.70, N = 3; SE +/- 22.36, N = 3; SE +/- 4.98, N = 3

XNNPACK - Model: FP32MobileNetV3Large
  a: 5962
  b: 6005
  c: 6047
  d: 5932
  SE +/- 43.59, N = 3; SE +/- 3.71, N = 3; SE +/- 38.74, N = 3

XNNPACK - Model: FP32MobileNetV3Small
  a: 4238
  b: 4271
  c: 4269
  d: 4285
  SE +/- 28.17, N = 3; SE +/- 20.51, N = 3; SE +/- 34.96, N = 3

XNNPACK - Model: FP16MobileNetV1
  a: 2031
  b: 2052
  c: 2070
  d: 2052
  SE +/- 11.84, N = 3; SE +/- 19.34, N = 3; SE +/- 17.89, N = 3

XNNPACK - Model: FP16MobileNetV2
  a: 3743
  b: 3804
  c: 3825
  d: 3762
  SE +/- 37.65, N = 3; SE +/- 48.96, N = 3; SE +/- 23.68, N = 3

XNNPACK - Model: FP16MobileNetV3Large
  a: 5624
  b: 5748
  c: 5827
  d: 5806
  SE +/- 67.87, N = 3; SE +/- 63.97, N = 3; SE +/- 39.03, N = 3

XNNPACK - Model: FP16MobileNetV3Small
  a: 4164
  b: 4301
  c: 4255
  d: 4294
  SE +/- 51.08, N = 3; SE +/- 19.50, N = 3; SE +/- 72.75, N = 3

XNNPACK - Model: QS8MobileNetV2
  a: 4031
  b: 3988
  c: 4025
  d: 3985
  SE +/- 32.71, N = 3; SE +/- 32.13, N = 3; SE +/- 27.10, N = 3
Phoronix Test Suite v10.8.5