nogaAllPyTorchResults
AMD Ryzen 9 9950X 16-Core testing with an ASUS PRIME B650M-A II (3201 BIOS) and NVIDIA GeForce RTX 4060 Ti 16GB on Ubuntu 24.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2501238-NE-NOGAALLPY69&grr .
nogaAllPyTorchResults: system details for ptRun2

Processor: AMD Ryzen 9 9950X 16-Core @ 5.75GHz (16 Cores / 32 Threads)
Motherboard: ASUS PRIME B650M-A II (3201 BIOS)
Chipset: AMD Device 14d8
Memory: 4 x 48GB DDR5-3600MT/s G Skill F5-6800J3446F48G
Disk: 2000GB Samsung SSD 980 PRO 2TB
Graphics: NVIDIA GeForce RTX 4060 Ti 16GB
Audio: NVIDIA Device 22bd
Network: 2 x Intel 10-Gigabit X540-AT2 + Realtek RTL8125 2.5GbE
OS: Ubuntu 24.04
Kernel: 6.8.0-51-generic (x86_64)
Display Server: X Server 1.21.1.11
Display Driver: NVIDIA
Compiler: GCC 13.3.0 + CUDA 12.4
File-System: ext4
Screen Resolution: 1920x1080

System notes:
- Transparent Huge Pages: madvise
- Scaling Governor: amd-pstate-epp powersave (EPP: balance_performance)
- CPU Microcode: 0xb404023
- Python 3.12.3
- Security mitigations:
  + gather_data_sampling: Not affected
  + itlb_multihit: Not affected
  + l1tf: Not affected
  + mds: Not affected
  + meltdown: Not affected
  + mmio_stale_data: Not affected
  + reg_file_data_sampling: Not affected
  + retbleed: Not affected
  + spec_rstack_overflow: Not affected
  + spec_store_bypass: Mitigation of SSB disabled via prctl
  + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  + spectre_v2: Mitigation of Enhanced / Automatic IBRS; IBPB: conditional; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
  + srbds: Not affected
  + tsx_async_abort: Not affected
nogaAllPyTorchResults: result overview for ptRun2 (pytorch, batches/sec, more is better), by batch size

Device           Model              1       16      32      64      256     512
CPU              ResNet-50          73.27   53.21   52.11   52.37   52.52   52.98
CPU              ResNet-152         29.80   21.36   21.22   21.27   21.81   21.69
CPU              Efficientnet_v2_l  16.18   12.70   12.56   12.61   12.49   12.59
NVIDIA CUDA GPU  ResNet-50          546.09  401.90  402.56  402.07  400.84  400.50
NVIDIA CUDA GPU  ResNet-152         212.67  169.72  170.11  170.14  170.09  170.15
NVIDIA CUDA GPU  Efficientnet_v2_l  113.69  102.33  102.31  102.31  102.23  102.25
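The overview figures above can be turned into GPU-versus-CPU throughput ratios with simple division. The following is a minimal sketch, not part of the exported result: the names `RESULTS` and `speedup_table` are hypothetical, and the numbers are copied by hand from the ptRun2 overview.

```python
# Throughput in batches/sec, copied from the ptRun2 overview above.
# Keys are (model, batch size); values are (CPU result, NVIDIA CUDA GPU result).
# RESULTS and speedup_table are illustrative names, not part of the Phoronix Test Suite.
RESULTS = {
    ("ResNet-50", 1): (73.27, 546.09),
    ("ResNet-50", 16): (53.21, 401.90),
    ("ResNet-50", 32): (52.11, 402.56),
    ("ResNet-50", 64): (52.37, 402.07),
    ("ResNet-50", 256): (52.52, 400.84),
    ("ResNet-50", 512): (52.98, 400.50),
    ("ResNet-152", 1): (29.80, 212.67),
    ("ResNet-152", 16): (21.36, 169.72),
    ("ResNet-152", 32): (21.22, 170.11),
    ("ResNet-152", 64): (21.27, 170.14),
    ("ResNet-152", 256): (21.81, 170.09),
    ("ResNet-152", 512): (21.69, 170.15),
    ("Efficientnet_v2_l", 1): (16.18, 113.69),
    ("Efficientnet_v2_l", 16): (12.70, 102.33),
    ("Efficientnet_v2_l", 32): (12.56, 102.31),
    ("Efficientnet_v2_l", 64): (12.61, 102.31),
    ("Efficientnet_v2_l", 256): (12.49, 102.23),
    ("Efficientnet_v2_l", 512): (12.59, 102.25),
}


def speedup_table(results):
    """Return {(model, batch size): GPU throughput / CPU throughput}, rounded to 2 places."""
    return {key: round(gpu / cpu, 2) for key, (cpu, gpu) in results.items()}


if __name__ == "__main__":
    for (model, batch), ratio in sorted(speedup_table(RESULTS).items()):
        print(f"{model:<18} batch {batch:<4} GPU/CPU = {ratio}x")
```

The ratio only compares the two reported throughput numbers for the same model and batch size; it says nothing about power draw, latency, or other workloads.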
PyTorch 2.2.1: per-test results for ptRun2 (OpenBenchmarking.org; batches/sec, more is better)

Device           Batch Size  Model              Result  SE +/-  N   MIN     MAX
CPU              64          ResNet-50          52.37   0.39    15  46.86   55.47
CPU              256         Efficientnet_v2_l  12.49   0.03    3   11.15   12.99
CPU              512         Efficientnet_v2_l  12.59   0.06    3   11.43   13.06
CPU              32          Efficientnet_v2_l  12.56   0.16    3   11.17   13.14
CPU              16          Efficientnet_v2_l  12.70   0.08    3   11.27   13.33
CPU              64          Efficientnet_v2_l  12.61   0.05    3   11.4    13.2
CPU              32          ResNet-152         21.22   0.15    3   20.15   22.02
CPU              64          ResNet-152         21.27   0.26    3   20.06   21.77
CPU              16          ResNet-152         21.36   0.18    3   20.23   21.66
CPU              512         ResNet-152         21.69   0.17    3   20.36   22.06
CPU              256         ResNet-152         21.81   0.05    3   18.37   22.12
CPU              1           Efficientnet_v2_l  16.18   0.06    3   14.75   16.36
CPU              512         ResNet-50          52.98   0.58    5   45.32   55.2
CPU              1           ResNet-152         29.80   0.33    3   28.4    30.51
CPU              32          ResNet-50          52.11   0.65    3   47.37   53.95
CPU              256         ResNet-50          52.52   0.66    3   48.77   53.97
CPU              16          ResNet-50          53.21   0.55    3   47.38   54.58
NVIDIA CUDA GPU  16          Efficientnet_v2_l  102.33  0.02    3   83.07   102.9
NVIDIA CUDA GPU  512         Efficientnet_v2_l  102.25  0.01    3   82.89   102.61
NVIDIA CUDA GPU  32          Efficientnet_v2_l  102.31  0.01    3   82.12   102.84
NVIDIA CUDA GPU  256         Efficientnet_v2_l  102.23  0.01    3   82.95   102.8
NVIDIA CUDA GPU  64          Efficientnet_v2_l  102.31  0.05    3   84.21   102.86
CPU              1           ResNet-50          73.27   0.80    3   66.63   75.92
NVIDIA CUDA GPU  64          ResNet-152         170.14  0.10    3   156.17  171.11
NVIDIA CUDA GPU  256         ResNet-152         170.09  0.07    3   155.92  171.11
NVIDIA CUDA GPU  16          ResNet-152         169.72  0.07    3   134.18  170.75
NVIDIA CUDA GPU  32          ResNet-152         170.11  0.11    3   155.88  171.23
NVIDIA CUDA GPU  512         ResNet-152         170.15  0.07    3   156.38  170.8
NVIDIA CUDA GPU  1           Efficientnet_v2_l  113.69  0.42    3   85.58   115.03
NVIDIA CUDA GPU  1           ResNet-152         212.67  1.05    3   166.05  216.41
NVIDIA CUDA GPU  32          ResNet-50          402.56  0.92    3   335.33  405.72
NVIDIA CUDA GPU  512         ResNet-50          400.50  0.05    3   336.68  403.22
NVIDIA CUDA GPU  64          ResNet-50          402.07  0.58    3   335.52  405.33
NVIDIA CUDA GPU  256         ResNet-50          400.84  0.11    3   336.14  403.07
NVIDIA CUDA GPU  16          ResNet-50          401.90  0.87    3   337.81  405.52
NVIDIA CUDA GPU  1           ResNet-50          546.09  1.81    3   476.52  554.58
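Each result above reports "SE +/-" and N next to the average. As a rough aid for judging run-to-run noise, the sketch below derives a relative standard error and an approximate standard deviation. It assumes that SE is the standard error of the mean over N runs (SE = s / sqrt(N)); the export itself does not state this. The helper names are hypothetical, and only a few rows are copied in as samples.

```python
import math

# (test label, average batches/sec, SE, N), copied from the per-test results above.
SAMPLES = [
    ("CPU - 64 - ResNet-50", 52.37, 0.39, 15),
    ("CPU - 1 - ResNet-50", 73.27, 0.80, 3),
    ("NVIDIA CUDA GPU - 1 - ResNet-50", 546.09, 1.81, 3),
    ("NVIDIA CUDA GPU - 512 - Efficientnet_v2_l", 102.25, 0.01, 3),
]


def relative_se_percent(mean, se):
    """Standard error expressed as a percentage of the reported average."""
    return 100.0 * se / mean


def approx_std_dev(se, n):
    """Approximate sample standard deviation, ASSUMING se = s / sqrt(n)."""
    return se * math.sqrt(n)


if __name__ == "__main__":
    for label, mean, se, n in SAMPLES:
        rse = relative_se_percent(mean, se)
        sd = approx_std_dev(se, n)
        print(f"{label}: relative SE {rse:.2f}%, approx. std dev {sd:.2f} (N = {n})")
```

With N = 3 for most tests, these estimates are coarse and are better read as an order-of-magnitude indication than as a precise error bar.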
Phoronix Test Suite v10.8.5