9684x-march

Tests for a future article. 2 x AMD EPYC 9684X 96-Core testing with an AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2403270-NE-9684XMARC10&grr.

9684x-march: system details for runs PRE and a

Processor: 2 x AMD EPYC 9684X 96-Core @ 2.55GHz (192 Cores / 384 Threads)
Motherboard: AMD Titanite_4G (RTI1007B BIOS)
Chipset: AMD Device 14a4
Memory: 1520GB
Disk: 3201GB Micron_7450_MTFDKCB3T2TFS + 257GB Flash Drive
Graphics: ASPEED
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.10
Kernel: 6.5.0-25-generic (x86_64)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 640x480

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa10113e

Python Details: Python 3.11.6

Security Details: gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

9684x-march: results overview

Test | PRE | a
brl-cad: VGR Performance Metric | 5956612 | 5927564
pytorch: CPU - 64 - ResNet-152 | 9.21 | 8.91
pytorch: CPU - 256 - ResNet-152 | 8.92 | 9.09
pytorch: CPU - 256 - Efficientnet_v2_l | 2.29 | 2.33
pytorch: CPU - 64 - Efficientnet_v2_l | 2.32 | 2.31
pytorch: CPU - 32 - Efficientnet_v2_l | 2.33 | 2.31
pytorch: CPU - 16 - Efficientnet_v2_l | 2.33 | 2.33
pytorch: CPU - 512 - Efficientnet_v2_l | 2.31 | 2.33
pytorch: CPU - 1 - ResNet-152 | 9.97 | 10.58
pytorch: CPU - 32 - ResNet-50 | 20.19 | 20.84
pytorch: CPU - 512 - ResNet-50 | 20.43 | 21.01
pytorch: CPU - 16 - ResNet-152 | 8.93 | 9.01
pytorch: CPU - 32 - ResNet-152 | 8.72 | 9.34
pytorch: CPU - 512 - ResNet-152 | 9.47 | 9.33
tensorflow: CPU - 512 - ResNet-50 | 140.59 | 140.49
pytorch: CPU - 1 - ResNet-50 | 23.06 | 23.20
pytorch: CPU - 1 - Efficientnet_v2_l | 6.29 | 6.45
tensorflow: CPU - 256 - ResNet-50 | 119.83 | 118.88
pytorch: CPU - 256 - ResNet-50 | 21.20 | 20.77
pytorch: CPU - 16 - ResNet-50 | 20.93 | 21.53
pytorch: CPU - 64 - ResNet-50 | 21.59 | 21.08
tensorflow: CPU - 512 - GoogLeNet | 493.31 | 484.02
tensorflow: CPU - 1 - GoogLeNet | 12.58 | 13.20
tensorflow: CPU - 64 - AlexNet | 765.55 | 749.46
tensorflow: CPU - 64 - ResNet-50 | 87.72 | 88.93
tensorflow: CPU - 32 - AlexNet | 424.06 | 436.25
tensorflow: CPU - 256 - GoogLeNet | 400.03 | 399.46
tensorflow: CPU - 16 - AlexNet | 242.29 | 247.55
blender: Barbershop - CPU-Only | 67.38 | 67.66
tensorflow: CPU - 32 - ResNet-50 | 65.88 | 60.25
rocksdb: Update Rand | 421266 | 425687
rocksdb: Overwrite | 421049 | 421616
rocksdb: Read Rand Write Rand | 3619142 | 3643263
rocksdb: Read While Writing | 27130363 | 26406662
rocksdb: Rand Read | 1105306233 | 1108892776
tensorflow: CPU - 1 - AlexNet | 21.16 | 20.78
tensorflow: CPU - 16 - ResNet-50 | 39.68 | 41.26
tensorflow: CPU - 1 - ResNet-50 | 4.05 | 3.9
tensorflow: CPU - 512 - AlexNet | 1980.51 | 2010.56
tensorflow: CPU - 64 - GoogLeNet | 275.34 | 273.68
build-mesa: Time To Compile | 14.66 | 14.756
blender: Pabellon Barcelona - CPU-Only | 22.99 | 23.1
tensorflow: CPU - 32 - GoogLeNet | 185.16 | 176.36
tensorflow: CPU - 256 - AlexNet | 1652.23 | 1604.52
tensorflow: CPU - 16 - GoogLeNet | 112.64 | 114.26
blender: Classroom - CPU-Only | 18.03 | 18.08
blender: Junkshop - CPU-Only | 11.4 | 11.44
blender: Fishy Cat - CPU-Only | 9.96 | 9.85
blender: BMW27 - CPU-Only | 7.55 | 7.55

BRL-CAD

VGR Performance Metric

BRL-CAD 7.38.2, VGR Performance Metric, More Is Better
a: 5927564
PRE: 5956612
1. (CXX) g++ options: -std=c++17 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lnetpbm -lregex_brl -lz_brl -lassimp -ldl -lm -ltk8.6

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-152

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 9.21 (MIN: 4.8 / MAX: 9.43)
a: 8.91 (MIN: 4.5 / MAX: 9.7)
SE +/- 0.09, N = 12

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-152

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 8.92 (MIN: 5.04 / MAX: 9.16)
a: 9.09 (MIN: 4.84 / MAX: 10.03)
SE +/- 0.10, N = 12

PyTorch

Device: CPU - Batch Size: 256 - Model: Efficientnet_v2_l

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 2.29 (MIN: 1.79 / MAX: 2.72)
a: 2.33 (MIN: 1.59 / MAX: 2.78)
SE +/- 0.01, N = 3

PyTorch

Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 2.32 (MIN: 1.9 / MAX: 2.75)
a: 2.31 (MIN: 1.53 / MAX: 2.83)
SE +/- 0.01, N = 3

PyTorch

Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 2.33 (MIN: 1.78 / MAX: 2.8)
a: 2.31 (MIN: 1.88 / MAX: 2.74)
SE +/- 0.01, N = 3

PyTorch

Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 2.33 (MIN: 1.76 / MAX: 2.72)
a: 2.33 (MIN: 1.77 / MAX: 2.9)
SE +/- 0.01, N = 3

PyTorch

Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 2.31 (MIN: 1.7 / MAX: 2.84)
a: 2.33 (MIN: 1.58 / MAX: 2.83)
SE +/- 0.01, N = 3

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-152

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 9.97 (MIN: 4.85 / MAX: 10.69)
a: 10.58 (MIN: 4.55 / MAX: 11.67)
SE +/- 0.10, N = 15

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-50

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 20.19 (MIN: 11.95 / MAX: 21.04)
a: 20.84 (MIN: 11.24 / MAX: 22.33)
SE +/- 0.16, N = 15

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-50

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 20.43 (MIN: 13.46 / MAX: 21.1)
a: 21.01 (MIN: 11.92 / MAX: 22.65)
SE +/- 0.14, N = 15

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-152

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 8.93 (MIN: 8.8 / MAX: 9.04)
a: 9.01 (MIN: 4.81 / MAX: 9.31)
SE +/- 0.09, N = 3

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-152

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 8.72 (MIN: 5.23 / MAX: 9.06)
a: 9.34 (MIN: 4.74 / MAX: 9.74)
SE +/- 0.08, N = 3

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-152

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 9.47 (MIN: 5.17 / MAX: 9.87)
a: 9.33 (MIN: 4.69 / MAX: 9.66)
SE +/- 0.10, N = 3

TensorFlow

Device: CPU - Batch Size: 512 - Model: ResNet-50

TensorFlow 2.16.1, images/sec, More Is Better
a: 140.49
PRE: 140.59

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-50

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 23.06 (MIN: 12.95 / MAX: 24.52)
a: 23.20 (MIN: 12.21 / MAX: 25.13)
SE +/- 0.20, N = 15

PyTorch

Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 6.29 (MIN: 3.09 / MAX: 6.44)
a: 6.45 (MIN: 3.05 / MAX: 6.85)
SE +/- 0.09, N = 3

TensorFlow

Device: CPU - Batch Size: 256 - Model: ResNet-50

TensorFlow 2.16.1, images/sec, More Is Better
a: 118.88
PRE: 119.83

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-50

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 21.20 (MIN: 12.68 / MAX: 21.88)
a: 20.77 (MIN: 12.97 / MAX: 21.67)
SE +/- 0.10, N = 3

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-50

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 20.93 (MIN: 12.91 / MAX: 21.51)
a: 21.53 (MIN: 12.64 / MAX: 22.28)
SE +/- 0.16, N = 3

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-50

PyTorch 2.2.1, batches/sec, More Is Better
PRE: 21.59 (MIN: 14.02 / MAX: 22.21)
a: 21.08 (MIN: 13.2 / MAX: 22.07)
SE +/- 0.23, N = 3

TensorFlow

Device: CPU - Batch Size: 512 - Model: GoogLeNet

TensorFlow 2.16.1, images/sec, More Is Better
a: 484.02
PRE: 493.31

TensorFlow

Device: CPU - Batch Size: 1 - Model: GoogLeNet

TensorFlow 2.16.1, images/sec, More Is Better
PRE: 12.58
a: 13.20
SE +/- 0.14, N = 15

TensorFlow

Device: CPU - Batch Size: 64 - Model: AlexNet

TensorFlow 2.16.1, images/sec, More Is Better
PRE: 765.55
a: 749.46
SE +/- 5.39, N = 15

TensorFlow

Device: CPU - Batch Size: 64 - Model: ResNet-50

TensorFlow 2.16.1, images/sec, More Is Better
a: 88.93
PRE: 87.72

TensorFlow

Device: CPU - Batch Size: 32 - Model: AlexNet

TensorFlow 2.16.1, images/sec, More Is Better
PRE: 424.06
a: 436.25
SE +/- 6.62, N = 15

TensorFlow

Device: CPU - Batch Size: 256 - Model: GoogLeNet

TensorFlow 2.16.1, images/sec, More Is Better
a: 399.46
PRE: 400.03

TensorFlow

Device: CPU - Batch Size: 16 - Model: AlexNet

TensorFlow 2.16.1, images/sec, More Is Better
PRE: 242.29
a: 247.55
SE +/- 2.30, N = 15

Blender

Blend File: Barbershop - Compute: CPU-Only

Blender 4.1, Seconds, Fewer Is Better
a: 67.66
PRE: 67.38

TensorFlow

Device: CPU - Batch Size: 32 - Model: ResNet-50

TensorFlow 2.16.1, images/sec, More Is Better
a: 60.25
PRE: 65.88

RocksDB

Test: Update Random

RocksDB 9.0, Op/s, More Is Better
a: 425687
PRE: 421266
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

RocksDB

Test: Overwrite

RocksDB 9.0, Op/s, More Is Better
a: 421616
PRE: 421049
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

RocksDB

Test: Read Random Write Random

RocksDB 9.0, Op/s, More Is Better
a: 3643263
PRE: 3619142
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

RocksDB

Test: Read While Writing

RocksDB 9.0, Op/s, More Is Better
a: 26406662
PRE: 27130363
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

RocksDB

Test: Random Read

RocksDB 9.0, Op/s, More Is Better
a: 1108892776
PRE: 1105306233
1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

TensorFlow

Device: CPU - Batch Size: 1 - Model: AlexNet

TensorFlow 2.16.1, images/sec, More Is Better
PRE: 21.16
a: 20.78
SE +/- 0.16, N = 15

TensorFlow

Device: CPU - Batch Size: 16 - Model: ResNet-50

TensorFlow 2.16.1, images/sec, More Is Better
PRE: 39.68
a: 41.26

TensorFlow

Device: CPU - Batch Size: 1 - Model: ResNet-50

TensorFlow 2.16.1, images/sec, More Is Better
PRE: 4.05
a: 3.90

TensorFlow

Device: CPU - Batch Size: 512 - Model: AlexNet

TensorFlow 2.16.1, images/sec, More Is Better
PRE: 1980.51
a: 2010.56

TensorFlow

Device: CPU - Batch Size: 64 - Model: GoogLeNet

TensorFlow 2.16.1, images/sec, More Is Better
a: 273.68
PRE: 275.34

Timed Mesa Compilation

Time To Compile

Timed Mesa Compilation 24.0, Seconds, Fewer Is Better
PRE: 14.66
a: 14.76
SE +/- 0.04, N = 3

Blender

Blend File: Pabellon Barcelona - Compute: CPU-Only

Blender 4.1, Seconds, Fewer Is Better
a: 23.10
PRE: 22.99

TensorFlow

Device: CPU - Batch Size: 32 - Model: GoogLeNet

TensorFlow 2.16.1, images/sec, More Is Better
PRE: 185.16
a: 176.36

TensorFlow

Device: CPU - Batch Size: 256 - Model: AlexNet

TensorFlow 2.16.1, images/sec, More Is Better
PRE: 1652.23
a: 1604.52

TensorFlow

Device: CPU - Batch Size: 16 - Model: GoogLeNet

TensorFlow 2.16.1, images/sec, More Is Better
PRE: 112.64
a: 114.26

Blender

Blend File: Classroom - Compute: CPU-Only

Blender 4.1, Seconds, Fewer Is Better
a: 18.08
PRE: 18.03

Blender

Blend File: Junkshop - Compute: CPU-Only

Blender 4.1, Seconds, Fewer Is Better
a: 11.44
PRE: 11.40

Blender

Blend File: Fishy Cat - Compute: CPU-Only

Blender 4.1, Seconds, Fewer Is Better
a: 9.85
PRE: 9.96

Blender

Blend File: BMW27 - Compute: CPU-Only

Blender 4.1, Seconds, Fewer Is Better
a: 7.55
PRE: 7.55


Phoronix Test Suite v10.8.5