epyc turin

Benchmarks for a future article. 2 x AMD EPYC 9755 128-Core testing with a AMD VOLCANO (RVOT1001B BIOS) and ASPEED on Ubuntu 24.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2410191-NE-EPYCTURIN56.

epyc turinProcessorMotherboardChipsetMemoryDiskGraphicsNetworkOSKernelCompilerFile-SystemScreen Resolutionab2 x AMD EPYC 9755 128-Core @ 2.70GHz (256 Cores / 512 Threads)AMD VOLCANO (RVOT1001B BIOS)AMD Device 153a1520GB512GB SAMSUNG MZVL2512HCJQ-00B00 + 3201GB Micron_7450_MTFDKCB3T2TFSASPEEDBroadcom NetXtreme BCM5720 PCIeUbuntu 24.046.12.0-rc3-phx (x86_64)GCC 13.2.0 + Clang 18.1.3ext41024x768OpenBenchmarking.orgKernel Details- Transparent Huge Pages: madviseCompiler Details- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-backtrace --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-uJ7kn6/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v Processor Details- Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xb002116Security Details- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Not affected + spec_store_bypass: Vulnerable + spectre_v1: Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers + spectre_v2: Vulnerable; IBPB: disabled; STIBP: disabled; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

epyc turinepoch: Conewarpx: Uniform Plasmawarpx: Plasma Accelerationonednn: IP Shapes 1D - CPUonednn: IP Shapes 3D - CPUonednn: Convolution Batch Shapes Auto - CPUonednn: Deconvolution Batch shapes_1d - CPUonednn: Deconvolution Batch shapes_3d - CPUonednn: Recurrent Neural Network Training - CPUonednn: Recurrent Neural Network Inference - CPUlitert: DeepLab V3litert: SqueezeNetlitert: Inception V4litert: NASNet Mobilelitert: Mobilenet Floatlitert: Mobilenet Quantlitert: Inception ResNet V2litert: Quantized COCO SSD MobileNet v1xnnpack: FP32MobileNetV1xnnpack: FP32MobileNetV2xnnpack: FP32MobileNetV3Largexnnpack: FP32MobileNetV3Smallxnnpack: FP16MobileNetV1xnnpack: FP16MobileNetV2xnnpack: FP16MobileNetV3Largexnnpack: FP16MobileNetV3Smallxnnpack: QS8MobileNetV2ab283.7520.38302423.470436340.7973410.6521960.29830226.14460.423216761.155514.98793820.7105380422782183165075229.343601.882433563357.4126508349733322181306213121353243557278473308464202569280.7820.3832420323.426495180.7994820.6521100.30122526.56730.421823758.564516.464102742.946189.4380938125456124645.946489.236351271555.63021328302726983328616399495262462304488406223238886OpenBenchmarking.org

Epoch

Epoch3D Deck: Cone

OpenBenchmarking.orgSeconds, Fewer Is BetterEpoch 4.19.4Epoch3D Deck: Coneab60120180240300SE +/- 0.98, N = 3283.75280.781. (F9X) gfortran options: -O3 -std=f2003 -Jobj -lsdf -lmpi_usempif08 -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lm -lz

WarpX

Input: Uniform Plasma

OpenBenchmarking.orgSeconds, Fewer Is BetterWarpX 24.10Input: Uniform Plasmaab510152025SE +/- 0.19, N = 1520.3820.381. (CXX) g++ options: -O3 -lm

WarpX

Input: Plasma Acceleration

OpenBenchmarking.orgSeconds, Fewer Is BetterWarpX 24.10Input: Plasma Accelerationab612182430SE +/- 0.13, N = 323.4723.431. (CXX) g++ options: -O3 -lm

oneDNN

Harness: IP Shapes 1D - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: IP Shapes 1D - Engine: CPUab0.17990.35980.53970.71960.8995SE +/- 0.000618, N = 30.7973410.799482MIN: 0.76MIN: 0.761. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl -lpthread

oneDNN

Harness: IP Shapes 3D - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: IP Shapes 3D - Engine: CPUab0.14670.29340.44010.58680.7335SE +/- 0.003624, N = 30.6521960.652110MIN: 0.61MIN: 0.61. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Convolution Batch Shapes Auto - Engine: CPUab0.06780.13560.20340.27120.339SE +/- 0.001827, N = 30.2983020.301225MIN: 0.29MIN: 0.281. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_1d - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Deconvolution Batch shapes_1d - Engine: CPUab612182430SE +/- 0.13, N = 326.1426.57MIN: 24.29MIN: 24.161. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl -lpthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Deconvolution Batch shapes_3d - Engine: CPUab0.09520.19040.28560.38080.476SE +/- 0.000942, N = 30.4232160.421823MIN: 0.41MIN: 0.41. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Training - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Recurrent Neural Network Training - Engine: CPUab160320480640800SE +/- 0.94, N = 3761.16758.56MIN: 753.78MIN: 750.891. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl -lpthread

oneDNN

Harness: Recurrent Neural Network Inference - Engine: CPU

OpenBenchmarking.orgms, Fewer Is BetteroneDNN 3.6Harness: Recurrent Neural Network Inference - Engine: CPUab110220330440550SE +/- 1.40, N = 3514.99516.46MIN: 510.29MIN: 510.391. (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -fcf-protection=full -pie -ldl -lpthread

LiteRT

Model: DeepLab V3

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: DeepLab V3ab20K40K60K80K100KSE +/- 6031.32, N = 1293820.7102742.9

LiteRT

Model: SqueezeNet

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: SqueezeNetab20K40K60K80K100KSE +/- 759.10, N = 15105380.046189.4

LiteRT

Model: Inception V4

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Inception V4ab90K180K270K360K450KSE +/- 19162.36, N = 12422782380938

LiteRT

Model: NASNet Mobile

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: NASNet Mobileab400K800K1200K1600K2000KSE +/- 119244.97, N = 1218316501254561

LiteRT

Model: Mobilenet Float

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Mobilenet Floatab16K32K48K64K80K75229.324645.9

LiteRT

Model: Mobilenet Quant

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Mobilenet Quantab10K20K30K40K50K43601.846489.2

LiteRT

Model: Inception ResNet V2

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Inception ResNet V2ab200K400K600K800K1000K824335363512

LiteRT

Model: Quantized COCO SSD MobileNet v1

OpenBenchmarking.orgMicroseconds, Fewer Is BetterLiteRT 2024-10-15Model: Quantized COCO SSD MobileNet v1ab15K30K45K60K75K63357.471555.6

XNNPACK

Model: FP32MobileNetV1

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP32MobileNetV1ab30K60K90K120K150K126508302131. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV2

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP32MobileNetV2ab70K140K210K280K350K3497332830271. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV3Large

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP32MobileNetV3Largeab70K140K210K280K350K3221812698331. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP32MobileNetV3Small

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP32MobileNetV3Smallab70K140K210K280K350K3062132861631. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV1

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP16MobileNetV1ab30K60K90K120K150K121353994951. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV2

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP16MobileNetV2ab60K120K180K240K300K2435572624621. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV3Large

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP16MobileNetV3Largeab70K140K210K280K350K2784733044881. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: FP16MobileNetV3Small

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: FP16MobileNetV3Smallab90K180K270K360K450K3084644062231. (CXX) g++ options: -O3 -lrt -lm

XNNPACK

Model: QS8MobileNetV2

OpenBenchmarking.orgus, Fewer Is BetterXNNPACK b7b048Model: QS8MobileNetV2ab50K100K150K200K250K2025692388861. (CXX) g++ options: -O3 -lrt -lm


Phoronix Test Suite v10.8.5