9684x-march
Tests for a future article. 2 x AMD EPYC 9684X 96-Core testing with an AMD Titanite_4G (RTI1007B BIOS) and ASPEED on Ubuntu 23.10 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2403270-NE-9684XMARC10&sor&grr .
9684x-march system details (runs: PRE, a)
  Processor: 2 x AMD EPYC 9684X 96-Core @ 2.55GHz (192 Cores / 384 Threads)
  Motherboard: AMD Titanite_4G (RTI1007B BIOS)
  Chipset: AMD Device 14a4
  Memory: 1520GB
  Disk: 3201GB Micron_7450_MTFDKCB3T2TFS + 257GB Flash Drive
  Graphics: ASPEED
  Network: Broadcom NetXtreme BCM5720 PCIe
  OS: Ubuntu 23.10
  Kernel: 6.5.0-25-generic (x86_64)
  Compiler: GCC 13.2.0
  File-System: ext4
  Screen Resolution: 640x480
Kernel Details
  - Transparent Huge Pages: madvise
Compiler Details
  - --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-13-XYspKM/gcc-13-13.2.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Details
  - Scaling Governor: acpi-cpufreq performance (Boost: Enabled)
  - CPU Microcode: 0xa10113e
Python Details
  - Python 3.11.6
Security Details
  - gather_data_sampling: Not affected
  - itlb_multihit: Not affected
  - l1tf: Not affected
  - mds: Not affected
  - meltdown: Not affected
  - mmio_stale_data: Not affected
  - retbleed: Not affected
  - spec_rstack_overflow: Mitigation of Safe RET
  - spec_store_bypass: Mitigation of SSB disabled via prctl
  - spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  - spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
  - srbds: Not affected
  - tsx_async_abort: Not affected
9684x-march result overview
Test                                      PRE           a
brl-cad: VGR Performance Metric           5956612       5927564
pytorch: CPU - 64 - ResNet-152            9.21          8.91
pytorch: CPU - 256 - ResNet-152           8.92          9.09
pytorch: CPU - 256 - Efficientnet_v2_l    2.29          2.33
pytorch: CPU - 64 - Efficientnet_v2_l     2.32          2.31
pytorch: CPU - 32 - Efficientnet_v2_l     2.33          2.31
pytorch: CPU - 16 - Efficientnet_v2_l     2.33          2.33
pytorch: CPU - 512 - Efficientnet_v2_l    2.31          2.33
pytorch: CPU - 1 - ResNet-152             9.97          10.58
pytorch: CPU - 32 - ResNet-50             20.19         20.84
pytorch: CPU - 512 - ResNet-50            20.43         21.01
pytorch: CPU - 16 - ResNet-152            8.93          9.01
pytorch: CPU - 32 - ResNet-152            8.72          9.34
pytorch: CPU - 512 - ResNet-152           9.47          9.33
tensorflow: CPU - 512 - ResNet-50         140.59        140.49
pytorch: CPU - 1 - ResNet-50              23.06         23.20
pytorch: CPU - 1 - Efficientnet_v2_l      6.29          6.45
tensorflow: CPU - 256 - ResNet-50         119.83        118.88
pytorch: CPU - 256 - ResNet-50            21.20         20.77
pytorch: CPU - 16 - ResNet-50             20.93         21.53
pytorch: CPU - 64 - ResNet-50             21.59         21.08
tensorflow: CPU - 512 - GoogLeNet         493.31        484.02
tensorflow: CPU - 1 - GoogLeNet           12.58         13.20
tensorflow: CPU - 64 - AlexNet            765.55        749.46
tensorflow: CPU - 64 - ResNet-50          87.72         88.93
tensorflow: CPU - 32 - AlexNet            424.06        436.25
tensorflow: CPU - 256 - GoogLeNet         400.03        399.46
tensorflow: CPU - 16 - AlexNet            242.29        247.55
blender: Barbershop - CPU-Only            67.38         67.66
tensorflow: CPU - 32 - ResNet-50          65.88         60.25
rocksdb: Update Rand                      421266        425687
rocksdb: Overwrite                        421049        421616
rocksdb: Read Rand Write Rand             3619142       3643263
rocksdb: Read While Writing               27130363      26406662
rocksdb: Rand Read                        1105306233    1108892776
tensorflow: CPU - 1 - AlexNet             21.16         20.78
tensorflow: CPU - 16 - ResNet-50          39.68         41.26
tensorflow: CPU - 1 - ResNet-50           4.05          3.9
tensorflow: CPU - 512 - AlexNet           1980.51       2010.56
tensorflow: CPU - 64 - GoogLeNet          275.34        273.68
build-mesa: Time To Compile               14.66         14.756
blender: Pabellon Barcelona - CPU-Only    22.99         23.1
tensorflow: CPU - 32 - GoogLeNet          185.16        176.36
tensorflow: CPU - 256 - AlexNet           1652.23       1604.52
tensorflow: CPU - 16 - GoogLeNet          112.64        114.26
blender: Classroom - CPU-Only             18.03         18.08
blender: Junkshop - CPU-Only              11.4          11.44
blender: Fishy Cat - CPU-Only             9.96          9.85
blender: BMW27 - CPU-Only                 7.55          7.55
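The overview lists raw results for runs PRE and a but not the relative difference between them. A minimal Python sketch of that arithmetic follows, using four rows copied from the results above. The helper name `percent_change` and the choice of rows are illustrative assumptions, not part of the Phoronix Test Suite output. The `higher_is_better` flag mirrors the "More Is Better" / "Fewer Is Better" labels on the individual results.

```python
# Sketch: relative change of run "a" versus run "PRE" for a few results.
# The helper name and the selection of rows are illustrative only.

def percent_change(pre: float, a: float, higher_is_better: bool = True) -> float:
    """Return the improvement of `a` over `pre` in percent.

    Positive means `a` is better. For "Fewer Is Better" results (times in
    seconds) the sign is flipped so that a shorter time counts as a gain.
    """
    change = (a - pre) / pre * 100.0
    return change if higher_is_better else -change


# (test name, PRE, a, higher_is_better), values copied from the results above
rows = [
    ("brl-cad: VGR Performance Metric", 5956612, 5927564, True),
    ("pytorch: CPU - 1 - ResNet-152", 9.97, 10.58, True),
    ("tensorflow: CPU - 32 - ResNet-50", 65.88, 60.25, True),
    ("blender: Fishy Cat - CPU-Only", 9.96, 9.85, False),
]

for name, pre, a, hib in rows:
    print(f"{name}: {percent_change(pre, a, hib):+.2f}%")
```

Where a standard error is reported for a result, a difference of similar size to that error is best read as run-to-run noise rather than a real change.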
BRL-CAD 7.38.2 - VGR Performance Metric (VGR Performance Metric, More Is Better)
  PRE: 5956612
  a: 5927564
  1. (CXX) g++ options: -std=c++17 -pipe -fvisibility=hidden -fno-strict-aliasing -fno-common -fexceptions -ftemplate-depth-128 -m64 -ggdb3 -O3 -fipa-pta -fstrength-reduce -finline-functions -flto -ltcl8.6 -lnetpbm -lregex_brl -lz_brl -lassimp -ldl -lm -ltk8.6
PyTorch 2.2.1 - Device: CPU - Batch Size: 64 - Model: ResNet-152 (batches/sec, More Is Better)
  PRE: 9.21 (MIN: 4.8 / MAX: 9.43)
  a: 8.91 (MIN: 4.5 / MAX: 9.7)
  SE +/- 0.09, N = 12
PyTorch 2.2.1 - Device: CPU - Batch Size: 256 - Model: ResNet-152 (batches/sec, More Is Better)
  a: 9.09 (MIN: 4.84 / MAX: 10.03)
  PRE: 8.92 (MIN: 5.04 / MAX: 9.16)
  SE +/- 0.10, N = 12
PyTorch 2.2.1 - Device: CPU - Batch Size: 256 - Model: Efficientnet_v2_l (batches/sec, More Is Better)
  a: 2.33 (MIN: 1.59 / MAX: 2.78)
  PRE: 2.29 (MIN: 1.79 / MAX: 2.72)
  SE +/- 0.01, N = 3
PyTorch 2.2.1 - Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l (batches/sec, More Is Better)
  PRE: 2.32 (MIN: 1.9 / MAX: 2.75)
  a: 2.31 (MIN: 1.53 / MAX: 2.83)
  SE +/- 0.01, N = 3
PyTorch 2.2.1 - Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l (batches/sec, More Is Better)
  PRE: 2.33 (MIN: 1.78 / MAX: 2.8)
  a: 2.31 (MIN: 1.88 / MAX: 2.74)
  SE +/- 0.01, N = 3
PyTorch 2.2.1 - Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l (batches/sec, More Is Better)
  a: 2.33 (MIN: 1.77 / MAX: 2.9)
  PRE: 2.33 (MIN: 1.76 / MAX: 2.72)
  SE +/- 0.01, N = 3
PyTorch 2.2.1 - Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l (batches/sec, More Is Better)
  a: 2.33 (MIN: 1.58 / MAX: 2.83)
  PRE: 2.31 (MIN: 1.7 / MAX: 2.84)
  SE +/- 0.01, N = 3
PyTorch 2.2.1 - Device: CPU - Batch Size: 1 - Model: ResNet-152 (batches/sec, More Is Better)
  a: 10.58 (MIN: 4.55 / MAX: 11.67)
  PRE: 9.97 (MIN: 4.85 / MAX: 10.69)
  SE +/- 0.10, N = 15
PyTorch 2.2.1 - Device: CPU - Batch Size: 32 - Model: ResNet-50 (batches/sec, More Is Better)
  a: 20.84 (MIN: 11.24 / MAX: 22.33)
  PRE: 20.19 (MIN: 11.95 / MAX: 21.04)
  SE +/- 0.16, N = 15
PyTorch 2.2.1 - Device: CPU - Batch Size: 512 - Model: ResNet-50 (batches/sec, More Is Better)
  a: 21.01 (MIN: 11.92 / MAX: 22.65)
  PRE: 20.43 (MIN: 13.46 / MAX: 21.1)
  SE +/- 0.14, N = 15
PyTorch 2.2.1 - Device: CPU - Batch Size: 16 - Model: ResNet-152 (batches/sec, More Is Better)
  a: 9.01 (MIN: 4.81 / MAX: 9.31)
  PRE: 8.93 (MIN: 8.8 / MAX: 9.04)
  SE +/- 0.09, N = 3
PyTorch 2.2.1 - Device: CPU - Batch Size: 32 - Model: ResNet-152 (batches/sec, More Is Better)
  a: 9.34 (MIN: 4.74 / MAX: 9.74)
  PRE: 8.72 (MIN: 5.23 / MAX: 9.06)
  SE +/- 0.08, N = 3
PyTorch 2.2.1 - Device: CPU - Batch Size: 512 - Model: ResNet-152 (batches/sec, More Is Better)
  PRE: 9.47 (MIN: 5.17 / MAX: 9.87)
  a: 9.33 (MIN: 4.69 / MAX: 9.66)
  SE +/- 0.10, N = 3
TensorFlow 2.16.1 - Device: CPU - Batch Size: 512 - Model: ResNet-50 (images/sec, More Is Better)
  PRE: 140.59
  a: 140.49
PyTorch 2.2.1 - Device: CPU - Batch Size: 1 - Model: ResNet-50 (batches/sec, More Is Better)
  a: 23.20 (MIN: 12.21 / MAX: 25.13)
  PRE: 23.06 (MIN: 12.95 / MAX: 24.52)
  SE +/- 0.20, N = 15
PyTorch 2.2.1 - Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l (batches/sec, More Is Better)
  a: 6.45 (MIN: 3.05 / MAX: 6.85)
  PRE: 6.29 (MIN: 3.09 / MAX: 6.44)
  SE +/- 0.09, N = 3
TensorFlow 2.16.1 - Device: CPU - Batch Size: 256 - Model: ResNet-50 (images/sec, More Is Better)
  PRE: 119.83
  a: 118.88
PyTorch 2.2.1 - Device: CPU - Batch Size: 256 - Model: ResNet-50 (batches/sec, More Is Better)
  PRE: 21.20 (MIN: 12.68 / MAX: 21.88)
  a: 20.77 (MIN: 12.97 / MAX: 21.67)
  SE +/- 0.10, N = 3
PyTorch 2.2.1 - Device: CPU - Batch Size: 16 - Model: ResNet-50 (batches/sec, More Is Better)
  a: 21.53 (MIN: 12.64 / MAX: 22.28)
  PRE: 20.93 (MIN: 12.91 / MAX: 21.51)
  SE +/- 0.16, N = 3
PyTorch 2.2.1 - Device: CPU - Batch Size: 64 - Model: ResNet-50 (batches/sec, More Is Better)
  PRE: 21.59 (MIN: 14.02 / MAX: 22.21)
  a: 21.08 (MIN: 13.2 / MAX: 22.07)
  SE +/- 0.23, N = 3
TensorFlow 2.16.1 - Device: CPU - Batch Size: 512 - Model: GoogLeNet (images/sec, More Is Better)
  PRE: 493.31
  a: 484.02
TensorFlow 2.16.1 - Device: CPU - Batch Size: 1 - Model: GoogLeNet (images/sec, More Is Better)
  a: 13.20
  PRE: 12.58
  SE +/- 0.14, N = 15
TensorFlow 2.16.1 - Device: CPU - Batch Size: 64 - Model: AlexNet (images/sec, More Is Better)
  PRE: 765.55
  a: 749.46
  SE +/- 5.39, N = 15
TensorFlow 2.16.1 - Device: CPU - Batch Size: 64 - Model: ResNet-50 (images/sec, More Is Better)
  a: 88.93
  PRE: 87.72
TensorFlow 2.16.1 - Device: CPU - Batch Size: 32 - Model: AlexNet (images/sec, More Is Better)
  a: 436.25
  PRE: 424.06
  SE +/- 6.62, N = 15
TensorFlow 2.16.1 - Device: CPU - Batch Size: 256 - Model: GoogLeNet (images/sec, More Is Better)
  PRE: 400.03
  a: 399.46
TensorFlow 2.16.1 - Device: CPU - Batch Size: 16 - Model: AlexNet (images/sec, More Is Better)
  a: 247.55
  PRE: 242.29
  SE +/- 2.30, N = 15
Blender 4.1 - Blend File: Barbershop - Compute: CPU-Only (Seconds, Fewer Is Better)
  PRE: 67.38
  a: 67.66
TensorFlow 2.16.1 - Device: CPU - Batch Size: 32 - Model: ResNet-50 (images/sec, More Is Better)
  PRE: 65.88
  a: 60.25
RocksDB 9.0 - Test: Update Random (Op/s, More Is Better)
  a: 425687
  PRE: 421266
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 9.0 - Test: Overwrite (Op/s, More Is Better)
  a: 421616
  PRE: 421049
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 9.0 - Test: Read Random Write Random (Op/s, More Is Better)
  a: 3643263
  PRE: 3619142
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 9.0 - Test: Read While Writing (Op/s, More Is Better)
  PRE: 27130363
  a: 26406662
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
RocksDB 9.0 - Test: Random Read (Op/s, More Is Better)
  a: 1108892776
  PRE: 1105306233
  1. (CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread
TensorFlow 2.16.1 - Device: CPU - Batch Size: 1 - Model: AlexNet (images/sec, More Is Better)
  PRE: 21.16
  a: 20.78
  SE +/- 0.16, N = 15
TensorFlow 2.16.1 - Device: CPU - Batch Size: 16 - Model: ResNet-50 (images/sec, More Is Better)
  a: 41.26
  PRE: 39.68
TensorFlow 2.16.1 - Device: CPU - Batch Size: 1 - Model: ResNet-50 (images/sec, More Is Better)
  PRE: 4.05
  a: 3.90
TensorFlow 2.16.1 - Device: CPU - Batch Size: 512 - Model: AlexNet (images/sec, More Is Better)
  a: 2010.56
  PRE: 1980.51
TensorFlow 2.16.1 - Device: CPU - Batch Size: 64 - Model: GoogLeNet (images/sec, More Is Better)
  PRE: 275.34
  a: 273.68
Timed Mesa Compilation 24.0 - Time To Compile (Seconds, Fewer Is Better)
  PRE: 14.66
  a: 14.76
  SE +/- 0.04, N = 3
Blender 4.1 - Blend File: Pabellon Barcelona - Compute: CPU-Only (Seconds, Fewer Is Better)
  PRE: 22.99
  a: 23.10
TensorFlow 2.16.1 - Device: CPU - Batch Size: 32 - Model: GoogLeNet (images/sec, More Is Better)
  PRE: 185.16
  a: 176.36
TensorFlow 2.16.1 - Device: CPU - Batch Size: 256 - Model: AlexNet (images/sec, More Is Better)
  PRE: 1652.23
  a: 1604.52
TensorFlow 2.16.1 - Device: CPU - Batch Size: 16 - Model: GoogLeNet (images/sec, More Is Better)
  a: 114.26
  PRE: 112.64
Blender 4.1 - Blend File: Classroom - Compute: CPU-Only (Seconds, Fewer Is Better)
  PRE: 18.03
  a: 18.08
Blender 4.1 - Blend File: Junkshop - Compute: CPU-Only (Seconds, Fewer Is Better)
  PRE: 11.40
  a: 11.44
Blender 4.1 - Blend File: Fishy Cat - Compute: CPU-Only (Seconds, Fewer Is Better)
  a: 9.85
  PRE: 9.96
Blender 4.1 - Blend File: BMW27 - Compute: CPU-Only (Seconds, Fewer Is Better)
  a: 7.55
  PRE: 7.55
Phoronix Test Suite v10.8.5