9684x-march
Tests for a future article. 2 x AMD EPYC 9684X 96-Core testing with an AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2403270-NE-9684XMARC10&gru&rdt .
9684x-march (runs: PRE, a)
Processor: 2 x AMD EPYC 9684X 96-Core @ 2.55GHz (192 Cores / 384 Threads)
Motherboard: AMD Titanite_4G (RTI1007B BIOS)
Chipset: AMD Device 14a4
Memory: 1520GB
Disk: 3201GB Micron_7450_MTFDKCB3T2TFS + 257GB Flash Drive
Graphics: ASPEED
Network: Broadcom NetXtreme BCM5720 PCIe
OS: Ubuntu 23.10
Kernel: 6.5.0-25-generic (x86_64)
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 640x480
Kernel Details
- Transparent Huge Pages: madvise
Compiler Details
- --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
- Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
- CPU Microcode: 0xa10113e
Python Details
- Python 3.11.6
Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Mitigation of Safe RET
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected
9684x-march
Test                                    PRE         a
pytorch: CPU - 1 - ResNet-50            23.06       23.20
pytorch: CPU - 1 - ResNet-152           9.97        10.58
pytorch: CPU - 16 - ResNet-50           20.93       21.53
pytorch: CPU - 32 - ResNet-50           20.19       20.84
pytorch: CPU - 64 - ResNet-50           21.59       21.08
pytorch: CPU - 16 - ResNet-152          8.93        9.01
pytorch: CPU - 256 - ResNet-50          21.20       20.77
pytorch: CPU - 32 - ResNet-152          8.72        9.34
pytorch: CPU - 512 - ResNet-50          20.43       21.01
pytorch: CPU - 64 - ResNet-152          9.21        8.91
pytorch: CPU - 256 - ResNet-152         8.92        9.09
pytorch: CPU - 512 - ResNet-152         9.47        9.33
pytorch: CPU - 1 - Efficientnet_v2_l    6.29        6.45
pytorch: CPU - 16 - Efficientnet_v2_l   2.33        2.33
pytorch: CPU - 32 - Efficientnet_v2_l   2.33        2.31
pytorch: CPU - 64 - Efficientnet_v2_l   2.32        2.31
pytorch: CPU - 256 - Efficientnet_v2_l  2.29        2.33
pytorch: CPU - 512 - Efficientnet_v2_l  2.31        2.33
tensorflow: CPU - 1 - AlexNet           21.16       20.78
tensorflow: CPU - 16 - AlexNet          242.29      247.55
tensorflow: CPU - 32 - AlexNet          424.06      436.25
tensorflow: CPU - 64 - AlexNet          765.55      749.46
tensorflow: CPU - 1 - GoogLeNet         12.58       13.20
tensorflow: CPU - 1 - ResNet-50         4.05        3.9
tensorflow: CPU - 256 - AlexNet         1652.23     1604.52
tensorflow: CPU - 512 - AlexNet         1980.51     2010.56
tensorflow: CPU - 16 - GoogLeNet        112.64      114.26
tensorflow: CPU - 16 - ResNet-50        39.68       41.26
tensorflow: CPU - 32 - GoogLeNet        185.16      176.36
tensorflow: CPU - 32 - ResNet-50        65.88       60.25
tensorflow: CPU - 64 - GoogLeNet        275.34      273.68
tensorflow: CPU - 64 - ResNet-50        87.72       88.93
tensorflow: CPU - 256 - GoogLeNet       400.03      399.46
tensorflow: CPU - 256 - ResNet-50       119.83      118.88
tensorflow: CPU - 512 - GoogLeNet       493.31      484.02
tensorflow: CPU - 512 - ResNet-50       140.59      140.49
rocksdb: Overwrite                      421049      421616
rocksdb: Rand Read                      1105306233  1108892776
rocksdb: Update Rand                    421266      425687
rocksdb: Read While Writing             27130363    26406662
rocksdb: Read Rand Write Rand           3619142     3643263
brl-cad: VGR Performance Metric         5956612     5927564
build-mesa: Time To Compile             14.66       14.756
blender: BMW27 - CPU-Only               7.55        7.55
blender: Junkshop - CPU-Only            11.4        11.44
blender: Classroom - CPU-Only           18.03       18.08
blender: Fishy Cat - CPU-Only           9.96        9.85
blender: Barbershop - CPU-Only          67.38       67.66
blender: Pabellon Barcelona - CPU-Only  22.99       23.1
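The PRE and a columns of the summary can be compared as a signed percent change. The sketch below is one way to do that, not something produced by the Phoronix Test Suite itself: the function name `percent_change` and the `higher_is_better` flag are illustrative assumptions, the four rows are copied from the summary, and the direction of each test follows the "More Is Better" / "Fewer Is Better" labels in the per-test results.

```python
# Rows copied from the summary: (test, PRE, a, higher_is_better).
# higher_is_better follows the "More Is Better" / "Fewer Is Better" labels.
RESULTS = [
    ("pytorch: CPU - 1 - ResNet-152", 9.97, 10.58, True),
    ("tensorflow: CPU - 32 - ResNet-50", 65.88, 60.25, True),
    ("rocksdb: Read While Writing", 27130363, 26406662, True),
    ("build-mesa: Time To Compile", 14.66, 14.756, False),
]


def percent_change(pre, a, higher_is_better):
    """Signed percent change of run 'a' versus run 'PRE'.

    The sign is normalized so that a positive value always means
    run 'a' did better than run 'PRE', whatever the metric direction.
    """
    change = (a - pre) / pre * 100.0
    return change if higher_is_better else -change


for name, pre, a, higher_is_better in RESULTS:
    print(f"{name}: {percent_change(pre, a, higher_is_better):+.2f}%")
```

Normalizing the sign keeps throughput tests and timed tests readable in the same column; whether a given difference is meaningful still depends on the run-to-run variation reported with each test.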
PyTorch 2.2.1 - Device: CPU - Batch Size: 1 - Model: ResNet-50 (batches/sec, More Is Better): PRE 23.06, a 23.20; SE +/- 0.20, N = 15; MIN: 12.95 / MAX: 24.52, MIN: 12.21 / MAX: 25.13
PyTorch 2.2.1 - Device: CPU - Batch Size: 1 - Model: ResNet-152 (batches/sec, More Is Better): PRE 9.97, a 10.58; SE +/- 0.10, N = 15; MIN: 4.85 / MAX: 10.69, MIN: 4.55 / MAX: 11.67
PyTorch 2.2.1 - Device: CPU - Batch Size: 16 - Model: ResNet-50 (batches/sec, More Is Better): PRE 20.93, a 21.53; SE +/- 0.16, N = 3; MIN: 12.91 / MAX: 21.51, MIN: 12.64 / MAX: 22.28
PyTorch 2.2.1 - Device: CPU - Batch Size: 32 - Model: ResNet-50 (batches/sec, More Is Better): PRE 20.19, a 20.84; SE +/- 0.16, N = 15; MIN: 11.95 / MAX: 21.04, MIN: 11.24 / MAX: 22.33
PyTorch 2.2.1 - Device: CPU - Batch Size: 64 - Model: ResNet-50 (batches/sec, More Is Better): PRE 21.59, a 21.08; SE +/- 0.23, N = 3; MIN: 14.02 / MAX: 22.21, MIN: 13.2 / MAX: 22.07
PyTorch 2.2.1 - Device: CPU - Batch Size: 16 - Model: ResNet-152 (batches/sec, More Is Better): PRE 8.93, a 9.01; SE +/- 0.09, N = 3; MIN: 8.8 / MAX: 9.04, MIN: 4.81 / MAX: 9.31
PyTorch 2.2.1 - Device: CPU - Batch Size: 256 - Model: ResNet-50 (batches/sec, More Is Better): PRE 21.20, a 20.77; SE +/- 0.10, N = 3; MIN: 12.68 / MAX: 21.88, MIN: 12.97 / MAX: 21.67
PyTorch 2.2.1 - Device: CPU - Batch Size: 32 - Model: ResNet-152 (batches/sec, More Is Better): PRE 8.72, a 9.34; SE +/- 0.08, N = 3; MIN: 5.23 / MAX: 9.06, MIN: 4.74 / MAX: 9.74
PyTorch 2.2.1 - Device: CPU - Batch Size: 512 - Model: ResNet-50 (batches/sec, More Is Better): PRE 20.43, a 21.01; SE +/- 0.14, N = 15; MIN: 13.46 / MAX: 21.1, MIN: 11.92 / MAX: 22.65
PyTorch 2.2.1 - Device: CPU - Batch Size: 64 - Model: ResNet-152 (batches/sec, More Is Better): PRE 9.21, a 8.91; SE +/- 0.09, N = 12; MIN: 4.8 / MAX: 9.43, MIN: 4.5 / MAX: 9.7
PyTorch 2.2.1 - Device: CPU - Batch Size: 256 - Model: ResNet-152 (batches/sec, More Is Better): PRE 8.92, a 9.09; SE +/- 0.10, N = 12; MIN: 5.04 / MAX: 9.16, MIN: 4.84 / MAX: 10.03
PyTorch 2.2.1 - Device: CPU - Batch Size: 512 - Model: ResNet-152 (batches/sec, More Is Better): PRE 9.47, a 9.33; SE +/- 0.10, N = 3; MIN: 5.17 / MAX: 9.87, MIN: 4.69 / MAX: 9.66
PyTorch 2.2.1 - Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l (batches/sec, More Is Better): PRE 6.29, a 6.45; SE +/- 0.09, N = 3; MIN: 3.09 / MAX: 6.44, MIN: 3.05 / MAX: 6.85
PyTorch 2.2.1 - Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l (batches/sec, More Is Better): PRE 2.33, a 2.33; SE +/- 0.01, N = 3; MIN: 1.76 / MAX: 2.72, MIN: 1.77 / MAX: 2.9
PyTorch 2.2.1 - Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l (batches/sec, More Is Better): PRE 2.33, a 2.31; SE +/- 0.01, N = 3; MIN: 1.78 / MAX: 2.8, MIN: 1.88 / MAX: 2.74
PyTorch 2.2.1 - Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l (batches/sec, More Is Better): PRE 2.32, a 2.31; SE +/- 0.01, N = 3; MIN: 1.9 / MAX: 2.75, MIN: 1.53 / MAX: 2.83
PyTorch 2.2.1 - Device: CPU - Batch Size: 256 - Model: Efficientnet_v2_l (batches/sec, More Is Better): PRE 2.29, a 2.33; SE +/- 0.01, N = 3; MIN: 1.79 / MAX: 2.72, MIN: 1.59 / MAX: 2.78
PyTorch 2.2.1 - Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l (batches/sec, More Is Better): PRE 2.31, a 2.33; SE +/- 0.01, N = 3; MIN: 1.7 / MAX: 2.84, MIN: 1.58 / MAX: 2.83
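Each PyTorch result above carries an "SE +/- x, N = n" annotation. Assuming SE here means the standard error of the mean over the N recorded runs (the export does not spell this out), it could be computed as in the sketch below. The sample values are hypothetical and are not taken from this result file; `standard_error` is an illustrative helper name.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(N)."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to estimate a standard error")
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical per-run throughput samples (batches/sec); NOT from the export.
samples = [2.31, 2.33, 2.35]
print(
    f"mean = {statistics.mean(samples):.2f}, "
    f"SE +/- {standard_error(samples):.2f}, N = {len(samples)}"
)
```

Under that assumption, a small SE relative to the gap between PRE and a suggests the gap is less likely to be run-to-run noise.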
TensorFlow 2.16.1 - Device: CPU - Batch Size: 1 - Model: AlexNet (images/sec, More Is Better): PRE 21.16, a 20.78; SE +/- 0.16, N = 15
TensorFlow 2.16.1 - Device: CPU - Batch Size: 16 - Model: AlexNet (images/sec, More Is Better): PRE 242.29, a 247.55; SE +/- 2.30, N = 15
TensorFlow 2.16.1 - Device: CPU - Batch Size: 32 - Model: AlexNet (images/sec, More Is Better): PRE 424.06, a 436.25; SE +/- 6.62, N = 15
TensorFlow 2.16.1 - Device: CPU - Batch Size: 64 - Model: AlexNet (images/sec, More Is Better): PRE 765.55, a 749.46; SE +/- 5.39, N = 15
TensorFlow 2.16.1 - Device: CPU - Batch Size: 1 - Model: GoogLeNet (images/sec, More Is Better): PRE 12.58, a 13.20; SE +/- 0.14, N = 15
TensorFlow 2.16.1 - Device: CPU - Batch Size: 1 - Model: ResNet-50 (images/sec, More Is Better): PRE 4.05, a 3.90
TensorFlow 2.16.1 - Device: CPU - Batch Size: 256 - Model: AlexNet (images/sec, More Is Better): PRE 1652.23, a 1604.52
TensorFlow 2.16.1 - Device: CPU - Batch Size: 512 - Model: AlexNet (images/sec, More Is Better): PRE 1980.51, a 2010.56
TensorFlow 2.16.1 - Device: CPU - Batch Size: 16 - Model: GoogLeNet (images/sec, More Is Better): PRE 112.64, a 114.26
TensorFlow 2.16.1 - Device: CPU - Batch Size: 16 - Model: ResNet-50 (images/sec, More Is Better): PRE 39.68, a 41.26
TensorFlow 2.16.1 - Device: CPU - Batch Size: 32 - Model: GoogLeNet (images/sec, More Is Better): PRE 185.16, a 176.36
TensorFlow 2.16.1 - Device: CPU - Batch Size: 32 - Model: ResNet-50 (images/sec, More Is Better): PRE 65.88, a 60.25
TensorFlow 2.16.1 - Device: CPU - Batch Size: 64 - Model: GoogLeNet (images/sec, More Is Better): PRE 275.34, a 273.68
TensorFlow 2.16.1 - Device: CPU - Batch Size: 64 - Model: ResNet-50 (images/sec, More Is Better): PRE 87.72, a 88.93
TensorFlow 2.16.1 - Device: CPU - Batch Size: 256 - Model: GoogLeNet (images/sec, More Is Better): PRE 400.03, a 399.46
TensorFlow 2.16.1 - Device: CPU - Batch Size: 256 - Model: ResNet-50 (images/sec, More Is Better): PRE 119.83, a 118.88
TensorFlow 2.16.1 - Device: CPU - Batch Size: 512 - Model: GoogLeNet (images/sec, More Is Better): PRE 493.31, a 484.02
TensorFlow 2.16.1 - Device: CPU - Batch Size: 512 - Model: ResNet-50 (images/sec, More Is Better): PRE 140.59, a 140.49
RocksDB 9.0 - Test: Overwrite (Op/s, More Is Better): PRE 421049, a 421616
RocksDB 9.0 - Test: Random Read (Op/s, More Is Better): PRE 1105306233, a 1108892776
RocksDB 9.0 - Test: Update Random (Op/s, More Is Better): PRE 421266, a 425687
RocksDB 9.0 - Test: Read While Writing (Op/s, More Is Better): PRE 27130363, a 26406662
RocksDB 9.0 - Test: Read Random Write Random (Op/s, More Is Better): PRE 3619142, a 3643263
1. (CXX) g++ options (all RocksDB tests): -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
BRL-CAD 7.38.2 - VGR Performance Metric (VGR Performance Metric, More Is Better): PRE 5956612, a 5927564
1. (CXX) g++ options: -std=c++17 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lnetpbm -lregex_brl -lz_brl -lassimp -ldl -lm -ltk8.6
Timed Mesa Compilation 24.0 - Time To Compile (Seconds, Fewer Is Better): PRE 14.66, a 14.76; SE +/- 0.04, N = 3
Blender 4.1 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better): PRE 7.55, a 7.55
Blender 4.1 - Blend File: Junkshop - Compute: CPU-Only (Seconds, Fewer Is Better): PRE 11.40, a 11.44
Blender 4.1 - Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better): PRE 18.03, a 18.08
Blender 4.1 - Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better): PRE 9.96, a 9.85
Blender 4.1 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better): PRE 67.38, a 67.66
Blender 4.1 - Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better): PRE 22.99, a 23.10
Phoronix Test Suite v10.8.5