pyt

AMD Ryzen Threadripper PRO 7995WX 96-Cores testing with an HP 8B24 (U65 Ver. 01.01.04 BIOS) and NVIDIA RTX A4000 16GB on Ubuntu 23.10 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2401079-PTS-PYT7659364.

pyt: System Details (identical for runs a, b, c, d)

Processor: AMD Ryzen Threadripper PRO 7995WX 96-Cores @ 6.44GHz (96 Cores / 192 Threads)
Motherboard: HP 8B24 (U65 Ver. 01.01.04 BIOS)
Chipset: AMD Device 14a4
Memory: 128GB
Disk: 2 x 1024GB SAMSUNG MZVL21T0HCLR-00BH1
Graphics: NVIDIA RTX A4000 16GB
Audio: NVIDIA GA104 HD Audio
Monitor: ASUS VP28U
Network: Realtek RTL8111/8168/8411
OS: Ubuntu 23.10
Kernel: 6.5.0-14-generic (x86_64)
Desktop: GNOME Shell 45.0
Display Server: X Server 1.21.1.7
Display Driver: NVIDIA 535.129.03
OpenGL: 4.6.0
OpenCL: OpenCL 3.0 CUDA 12.2.147
Compiler: GCC 13.2.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Processor Details
- Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance)
- CPU Microcode: 0xa108105

Python Details
- Python 3.11.6

Security Details
- gather_data_sampling: Not affected
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_rstack_overflow: Mitigation of safe RET
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Enhanced / Automatic IBRS IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

pyt: Result Overview (batches/sec, More Is Better)

Test                                                       a       b       c       d
pytorch: CPU - 1 - ResNet-50                           50.01   50.01   50.46   49.60
pytorch: CPU - 1 - ResNet-152                          18.70   18.57   18.79   19.15
pytorch: CPU - 16 - ResNet-50                          40.83   40.69   40.54   40.40
pytorch: CPU - 32 - ResNet-50                          40.61   40.33   40.59   40.72
pytorch: CPU - 64 - ResNet-50                          40.79   40.64   40.39   40.97
pytorch: CPU - 16 - ResNet-152                         16.18   15.75   15.87   16.32
pytorch: CPU - 256 - ResNet-50                         40.82   40.98   40.93   40.35
pytorch: CPU - 32 - ResNet-152                         16.10   16.40   16.17   16.41
pytorch: CPU - 512 - ResNet-50                         40.33   40.44   40.74   39.99
pytorch: CPU - 64 - ResNet-152                         16.05   15.88   16.16   16.31
pytorch: CPU - 256 - ResNet-152                        16.09   15.74   15.87   16.51
pytorch: CPU - 512 - ResNet-152                        16.20   16.24   16.17   15.96
pytorch: CPU - 1 - Efficientnet_v2_l                   11.16   11.38   11.21   11.42
pytorch: CPU - 16 - Efficientnet_v2_l                   6.24    6.28    6.24    6.25
pytorch: CPU - 32 - Efficientnet_v2_l                   6.28    6.23    6.28    6.25
pytorch: CPU - 64 - Efficientnet_v2_l                   6.25    6.23    6.35    6.23
pytorch: CPU - 256 - Efficientnet_v2_l                  6.27    6.19    6.19    6.36
pytorch: CPU - 512 - Efficientnet_v2_l                  6.24    6.25    6.26    6.27
pytorch: NVIDIA CUDA GPU - 1 - ResNet-50              361.97  357.01  355.60  353.30
pytorch: NVIDIA CUDA GPU - 1 - ResNet-152             132.98  134.62  132.86  133.46
pytorch: NVIDIA CUDA GPU - 16 - ResNet-50             288.42  291.63  291.60  291.81
pytorch: NVIDIA CUDA GPU - 32 - ResNet-50             287.50  291.74  291.63  292.06
pytorch: NVIDIA CUDA GPU - 64 - ResNet-50             287.15  291.97  291.52  291.76
pytorch: NVIDIA CUDA GPU - 16 - ResNet-152            127.01  125.19  128.92  129.27
pytorch: NVIDIA CUDA GPU - 256 - ResNet-50            286.04  289.02  288.89  289.30
pytorch: NVIDIA CUDA GPU - 32 - ResNet-152            126.84  129.44  129.52  125.41
pytorch: NVIDIA CUDA GPU - 512 - ResNet-50            285.74  288.50  288.57  288.51
pytorch: NVIDIA CUDA GPU - 64 - ResNet-152            126.04  129.28  126.00  127.10
pytorch: NVIDIA CUDA GPU - 256 - ResNet-152           126.55  125.39  127.26  126.60
pytorch: NVIDIA CUDA GPU - 512 - ResNet-152           126.58  126.94  123.41  125.05
pytorch: NVIDIA CUDA GPU - 1 - Efficientnet_v2_l       70.38   69.60   71.80   71.36
pytorch: NVIDIA CUDA GPU - 16 - Efficientnet_v2_l      67.48   67.86   66.79   66.63
pytorch: NVIDIA CUDA GPU - 32 - Efficientnet_v2_l      66.65   68.69   67.12   67.59
pytorch: NVIDIA CUDA GPU - 64 - Efficientnet_v2_l      67.18   68.07   67.30   67.83
pytorch: NVIDIA CUDA GPU - 256 - Efficientnet_v2_l     66.57   66.27   67.12   67.11
pytorch: NVIDIA CUDA GPU - 512 - Efficientnet_v2_l     65.94   66.63   66.20   66.20
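
As a rough aid for reading the overview numbers, the sketch below shows one way to summarize run-to-run spread and the GPU-to-CPU throughput ratio for a single configuration. The figures are copied from this result file for ResNet-50 at batch sizes 1 and 512; the names `results`, `spread_percent`, and `gpu_speedup` are hypothetical helpers and are not part of the Phoronix Test Suite.

```python
from statistics import mean

# Runs a, b, c, d in batches/sec, copied from this result file.
# The dictionary layout and helper names are illustrative assumptions.
results = {
    ("CPU", 1, "ResNet-50"): [50.01, 50.01, 50.46, 49.60],
    ("NVIDIA CUDA GPU", 1, "ResNet-50"): [361.97, 357.01, 355.60, 353.30],
    ("CPU", 512, "ResNet-50"): [40.33, 40.44, 40.74, 39.99],
    ("NVIDIA CUDA GPU", 512, "ResNet-50"): [285.74, 288.50, 288.57, 288.51],
}


def spread_percent(values):
    """Range of the runs (max minus min) as a percentage of their mean."""
    return (max(values) - min(values)) / mean(values) * 100.0


def gpu_speedup(batch_size, model):
    """Ratio of mean GPU throughput to mean CPU throughput at one batch size."""
    cpu = mean(results[("CPU", batch_size, model)])
    gpu = mean(results[("NVIDIA CUDA GPU", batch_size, model)])
    return gpu / cpu


if __name__ == "__main__":
    for (device, batch, model), values in results.items():
        print(f"{device} bs={batch} {model}: "
              f"mean={mean(values):.2f} spread={spread_percent(values):.2f}%")
    for batch in (1, 512):
        print(f"GPU/CPU ratio at batch {batch}: {gpu_speedup(batch, 'ResNet-50'):.2f}x")
```

The ratio is only computed between results at the same batch size and model, since the reported unit is batches per second.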

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-50

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.35, N = 3)
a: 50.01 (MIN: 41.31 / MAX: 52.57)
b: 50.01 (MIN: 47.29 / MAX: 51.69)
c: 50.46 (MIN: 42.91 / MAX: 52.85)
d: 49.60 (MIN: 46.83 / MAX: 51.31)

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-152

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.10, N = 3)
a: 18.70 (MIN: 16.98 / MAX: 19.52)
b: 18.57 (MIN: 18.2 / MAX: 19.14)
c: 18.79 (MIN: 18.25 / MAX: 19.38)
d: 19.15 (MIN: 18.8 / MAX: 19.48)

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-50

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.11, N = 3)
a: 40.83 (MIN: 37.32 / MAX: 42.16)
b: 40.69 (MIN: 37.2 / MAX: 41.69)
c: 40.54 (MIN: 37.73 / MAX: 41.72)
d: 40.40 (MIN: 37.83 / MAX: 41.64)

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-50

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.11, N = 3)
a: 40.61 (MIN: 36.37 / MAX: 42.03)
b: 40.33 (MIN: 37.66 / MAX: 41.57)
c: 40.59 (MIN: 37.53 / MAX: 41.98)
d: 40.72 (MIN: 38.11 / MAX: 41.8)

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-50

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.09, N = 3)
a: 40.79 (MIN: 38.02 / MAX: 42.17)
b: 40.64 (MIN: 38.74 / MAX: 42.14)
c: 40.39 (MIN: 37.31 / MAX: 41.6)
d: 40.97 (MIN: 38.49 / MAX: 42.31)

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-152

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.12, N = 3)
a: 16.18 (MIN: 15.51 / MAX: 16.68)
b: 15.75 (MIN: 15.47 / MAX: 15.96)
c: 15.87 (MIN: 15.29 / MAX: 16.21)
d: 16.32 (MIN: 16.02 / MAX: 16.63)

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-50

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.04, N = 3)
a: 40.82 (MIN: 37.5 / MAX: 42.24)
b: 40.98 (MIN: 37.68 / MAX: 42.28)
c: 40.93 (MIN: 37.58 / MAX: 42.04)
d: 40.35 (MIN: 38.37 / MAX: 41.61)

PyTorch

Device: CPU - Batch Size: 32 - Model: ResNet-152

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.12, N = 3)
a: 16.10 (MIN: 15.51 / MAX: 16.66)
b: 16.40 (MIN: 16.04 / MAX: 16.64)
c: 16.17 (MIN: 15.85 / MAX: 16.44)
d: 16.41 (MIN: 16 / MAX: 16.65)

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-50

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.03, N = 3)
a: 40.33 (MIN: 36.91 / MAX: 41.68)
b: 40.44 (MIN: 38.09 / MAX: 41.52)
c: 40.74 (MIN: 38.28 / MAX: 41.79)
d: 39.99 (MIN: 37.13 / MAX: 41.16)

PyTorch

Device: CPU - Batch Size: 64 - Model: ResNet-152

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.14, N = 3)
a: 16.05 (MIN: 15.53 / MAX: 16.58)
b: 15.88 (MIN: 15.64 / MAX: 16.11)
c: 16.16 (MIN: 15.77 / MAX: 16.39)
d: 16.31 (MIN: 15.81 / MAX: 16.6)

PyTorch

Device: CPU - Batch Size: 256 - Model: ResNet-152

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.21, N = 3)
a: 16.09 (MIN: 15.51 / MAX: 16.77)
b: 15.74 (MIN: 15.42 / MAX: 16)
c: 15.87 (MIN: 15.48 / MAX: 16.09)
d: 16.51 (MIN: 16.1 / MAX: 16.76)

PyTorch

Device: CPU - Batch Size: 512 - Model: ResNet-152

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.13, N = 3)
a: 16.20 (MIN: 15.68 / MAX: 16.65)
b: 16.24 (MIN: 16 / MAX: 16.49)
c: 16.17 (MIN: 15.8 / MAX: 16.39)
d: 15.96 (MIN: 15.52 / MAX: 16.2)

PyTorch

Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.13, N = 3)
a: 11.16 (MIN: 10.76 / MAX: 11.57)
b: 11.38 (MIN: 11.17 / MAX: 11.58)
c: 11.21 (MIN: 11.06 / MAX: 11.41)
d: 11.42 (MIN: 11.27 / MAX: 11.64)

PyTorch

Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.00, N = 3)
a: 6.24 (MIN: 5.66 / MAX: 6.57)
b: 6.28 (MIN: 5.73 / MAX: 6.59)
c: 6.24 (MIN: 5.68 / MAX: 6.54)
d: 6.25 (MIN: 5.76 / MAX: 6.54)

PyTorch

Device: CPU - Batch Size: 32 - Model: Efficientnet_v2_l

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.03, N = 3)
a: 6.28 (MIN: 5.73 / MAX: 6.61)
b: 6.23 (MIN: 5.64 / MAX: 6.54)
c: 6.28 (MIN: 5.72 / MAX: 6.6)
d: 6.25 (MIN: 5.72 / MAX: 6.55)

PyTorch

Device: CPU - Batch Size: 64 - Model: Efficientnet_v2_l

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.01, N = 3)
a: 6.25 (MIN: 5.63 / MAX: 6.58)
b: 6.23 (MIN: 5.76 / MAX: 6.58)
c: 6.35 (MIN: 5.68 / MAX: 6.64)
d: 6.23 (MIN: 5.65 / MAX: 6.55)

PyTorch

Device: CPU - Batch Size: 256 - Model: Efficientnet_v2_l

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.01, N = 3)
a: 6.27 (MIN: 5.64 / MAX: 6.59)
b: 6.19 (MIN: 5.65 / MAX: 6.52)
c: 6.19 (MIN: 5.79 / MAX: 6.52)
d: 6.36 (MIN: 5.83 / MAX: 6.65)

PyTorch

Device: CPU - Batch Size: 512 - Model: Efficientnet_v2_l

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.01, N = 3)
a: 6.24 (MIN: 5.55 / MAX: 6.61)
b: 6.25 (MIN: 5.71 / MAX: 6.55)
c: 6.26 (MIN: 5.73 / MAX: 6.65)
d: 6.27 (MIN: 5.68 / MAX: 6.58)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-50

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.86, N = 3)
a: 361.97 (MIN: 275.89 / MAX: 372.99)
b: 357.01 (MIN: 279.97 / MAX: 368.92)
c: 355.60 (MIN: 253.71 / MAX: 368.08)
d: 353.30 (MIN: 256.99 / MAX: 364.75)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: ResNet-152

PyTorch 2.1, batches/sec, More Is Better (SE +/- 1.79, N = 3)
a: 132.98 (MIN: 109 / MAX: 139.03)
b: 134.62 (MIN: 120.29 / MAX: 137.09)
c: 132.86 (MIN: 118.76 / MAX: 135.2)
d: 133.46 (MIN: 120.4 / MAX: 136)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-50

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.75, N = 3)
a: 288.42 (MIN: 161 / MAX: 307.66)
b: 291.63 (MIN: 177.88 / MAX: 309.38)
c: 291.60 (MIN: 170.99 / MAX: 308.54)
d: 291.81 (MIN: 173.47 / MAX: 308.79)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-50

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.61, N = 3)
a: 287.50 (MIN: 162.04 / MAX: 306.4)
b: 291.74 (MIN: 166.05 / MAX: 307.97)
c: 291.63 (MIN: 172.75 / MAX: 308.18)
d: 292.06 (MIN: 174.72 / MAX: 308)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-50

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.63, N = 3)
a: 287.15 (MIN: 158.4 / MAX: 306.18)
b: 291.97 (MIN: 165.71 / MAX: 307.55)
c: 291.52 (MIN: 165.73 / MAX: 307.7)
d: 291.76 (MIN: 164.85 / MAX: 307.84)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: ResNet-152

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.37, N = 3)
a: 127.01 (MIN: 84.3 / MAX: 133.47)
b: 125.19 (MIN: 85.51 / MAX: 130.24)
c: 128.92 (MIN: 85.34 / MAX: 134.89)
d: 129.27 (MIN: 85.34 / MAX: 134.51)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-50

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.24, N = 3)
a: 286.04 (MIN: 159.06 / MAX: 306.89)
b: 289.02 (MIN: 162.8 / MAX: 306.2)
c: 288.89 (MIN: 163.12 / MAX: 306.42)
d: 289.30 (MIN: 163.39 / MAX: 309.15)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: ResNet-152

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.95, N = 3)
a: 126.84 (MIN: 84.12 / MAX: 134.23)
b: 129.44 (MIN: 82.93 / MAX: 135.75)
c: 129.52 (MIN: 83.58 / MAX: 135.06)
d: 125.41 (MIN: 86.22 / MAX: 129.99)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-50

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.18, N = 3)
a: 285.74 (MIN: 155.06 / MAX: 305.6)
b: 288.50 (MIN: 158.95 / MAX: 305.84)
c: 288.57 (MIN: 160.08 / MAX: 306.4)
d: 288.51 (MIN: 160.15 / MAX: 306.96)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: ResNet-152

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.61, N = 3)
a: 126.04 (MIN: 82.85 / MAX: 132.47)
b: 129.28 (MIN: 87.37 / MAX: 135.32)
c: 126.00 (MIN: 84.46 / MAX: 131.5)
d: 127.10 (MIN: 86.66 / MAX: 132.45)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: ResNet-152

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.24, N = 3)
a: 126.55 (MIN: 83.28 / MAX: 132.07)
b: 125.39 (MIN: 86.14 / MAX: 129.58)
c: 127.26 (MIN: 83.4 / MAX: 132.61)
d: 126.60 (MIN: 85.8 / MAX: 131.88)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: ResNet-152

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.41, N = 3)
a: 126.58 (MIN: 84.07 / MAX: 132.98)
b: 126.94 (MIN: 84.98 / MAX: 132.33)
c: 123.41 (MIN: 83.78 / MAX: 128.02)
d: 125.05 (MIN: 85.26 / MAX: 130.68)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 1 - Model: Efficientnet_v2_l

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.68, N = 3)
a: 70.38 (MIN: 61.79 / MAX: 72.15)
b: 69.60 (MIN: 61.72 / MAX: 70.71)
c: 71.80 (MIN: 63.35 / MAX: 73.09)
d: 71.36 (MIN: 63.42 / MAX: 72.31)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 16 - Model: Efficientnet_v2_l

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.16, N = 3)
a: 67.48 (MIN: 56.83 / MAX: 69.22)
b: 67.86 (MIN: 57.92 / MAX: 69.28)
c: 66.79 (MIN: 57.46 / MAX: 68.25)
d: 66.63 (MIN: 56.78 / MAX: 68.44)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 32 - Model: Efficientnet_v2_l

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.50, N = 3)
a: 66.65 (MIN: 55.99 / MAX: 68.52)
b: 68.69 (MIN: 58.81 / MAX: 70.38)
c: 67.12 (MIN: 55.88 / MAX: 68.73)
d: 67.59 (MIN: 58.11 / MAX: 68.86)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 64 - Model: Efficientnet_v2_l

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.29, N = 3)
a: 67.18 (MIN: 56.83 / MAX: 69.18)
b: 68.07 (MIN: 56.37 / MAX: 69.63)
c: 67.30 (MIN: 57.99 / MAX: 68.74)
d: 67.83 (MIN: 57.39 / MAX: 69.34)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 256 - Model: Efficientnet_v2_l

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.23, N = 3)
a: 66.57 (MIN: 55.8 / MAX: 68.27)
b: 66.27 (MIN: 57.1 / MAX: 67.74)
c: 67.12 (MIN: 56.29 / MAX: 68.56)
d: 67.11 (MIN: 56.28 / MAX: 68.34)

PyTorch

Device: NVIDIA CUDA GPU - Batch Size: 512 - Model: Efficientnet_v2_l

PyTorch 2.1, batches/sec, More Is Better (SE +/- 0.87, N = 3)
a: 65.94 (MIN: 55.7 / MAX: 68.83)
b: 66.63 (MIN: 57.06 / MAX: 67.92)
c: 66.20 (MIN: 55.97 / MAX: 67.74)
d: 66.20 (MIN: 55.93 / MAX: 67.54)


Phoronix Test Suite v10.8.4