test-pytorch
Intel Xeon E5-2680 v4 testing with a MACHINIST X99-MR9A PRO MAX (5.11 BIOS) and NVIDIA GeForce RTX 3060 12GB on Ubuntu 22.04 via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2404251-NE-TESTPYTOR08&grw .
test-pytorch: system details for result "teste-total"

  Processor:          Intel Xeon E5-2680 v4 @ 3.30GHz (14 Cores / 28 Threads)
  Motherboard:        MACHINIST X99-MR9A PRO MAX (5.11 BIOS)
  Chipset:            Intel Xeon E7 v4/Xeon
  Memory:             32GB
  Disk:               500GB KINGSTON SNV2S500G + 160GB MAXTOR STM316021 + 512GB P3-512
  Graphics:           NVIDIA GeForce RTX 3060 12GB
  Audio:              Realtek ALC897
  Monitor:            TV-PHILCO
  Network:            Realtek RTL8111/8168/8411
  OS:                 Ubuntu 22.04
  Kernel:             6.5.0-28-generic (x86_64)
  Desktop:            GNOME Shell 42.9
  Display Server:     X Server 1.21.1.4
  Display Driver:     NVIDIA 535.171.04
  Compiler:           GCC 11.4.0
  File-System:        ext4
  Screen Resolution:  1920x1080

Notes:
  - Transparent Huge Pages: madvise
  - Scaling Governor: intel_cpufreq schedutil
  - CPU Microcode: 0xb000040
  - Python 3.10.12
  - Security mitigations:
      gather_data_sampling: Not affected
      itlb_multihit: KVM: Mitigation of VMX disabled
      l1tf: Mitigation of PTE Inversion; VMX: conditional cache flushes SMT vulnerable
      mds: Mitigation of Clear buffers; SMT vulnerable
      meltdown: Mitigation of PTI
      mmio_stale_data: Mitigation of Clear buffers; SMT vulnerable
      retbleed: Not affected
      spec_rstack_overflow: Not affected
      spec_store_bypass: Mitigation of SSB disabled via prctl
      spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
      spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: conditional RSB filling PBRSB-eIBRS: Not affected
      srbds: Not affected
      tsx_async_abort: Mitigation of Clear buffers; SMT vulnerable
test-pytorch: result summary for "teste-total" (batches/sec)

  Test (Device - Batch Size - Model)                    teste-total
  pytorch: CPU - 1 - ResNet-50                                29.09
  pytorch: CPU - 1 - ResNet-152                               11.29
  pytorch: CPU - 16 - ResNet-50                               20.75
  pytorch: CPU - 32 - ResNet-50                               20.96
  pytorch: CPU - 64 - ResNet-50                               21.07
  pytorch: CPU - 16 - ResNet-152                               8.80
  pytorch: CPU - 256 - ResNet-50                              20.86
  pytorch: CPU - 32 - ResNet-152                               8.81
  pytorch: CPU - 512 - ResNet-50                              21.04
  pytorch: CPU - 64 - ResNet-152                               8.85
  pytorch: CPU - 256 - ResNet-152                              8.86
  pytorch: CPU - 512 - ResNet-152                              8.70
  pytorch: CPU - 1 - Efficientnet_v2_l                         6.66
  pytorch: CPU - 16 - Efficientnet_v2_l                        4.98
  pytorch: CPU - 32 - Efficientnet_v2_l                        4.96
  pytorch: CPU - 64 - Efficientnet_v2_l                        5.05
  pytorch: CPU - 256 - Efficientnet_v2_l                       5.00
  pytorch: CPU - 512 - Efficientnet_v2_l                       4.92
  pytorch: NVIDIA CUDA GPU - 1 - ResNet-50                   102.30
  pytorch: NVIDIA CUDA GPU - 1 - ResNet-152                   36.32
  pytorch: NVIDIA CUDA GPU - 16 - ResNet-50                   99.64
  pytorch: NVIDIA CUDA GPU - 32 - ResNet-50                   99.91
  pytorch: NVIDIA CUDA GPU - 64 - ResNet-50                   99.62
  pytorch: NVIDIA CUDA GPU - 16 - ResNet-152                  35.53
  pytorch: NVIDIA CUDA GPU - 256 - ResNet-50                 100.18
  pytorch: NVIDIA CUDA GPU - 32 - ResNet-152                  35.53
  pytorch: NVIDIA CUDA GPU - 512 - ResNet-50                 100.21
  pytorch: NVIDIA CUDA GPU - 64 - ResNet-152                  35.40
  pytorch: NVIDIA CUDA GPU - 256 - ResNet-152                 35.16
  pytorch: NVIDIA CUDA GPU - 512 - ResNet-152                 35.50
  pytorch: NVIDIA CUDA GPU - 1 - Efficientnet_v2_l            18.22
  pytorch: NVIDIA CUDA GPU - 16 - Efficientnet_v2_l           17.79
  pytorch: NVIDIA CUDA GPU - 32 - Efficientnet_v2_l           17.87
  pytorch: NVIDIA CUDA GPU - 64 - Efficientnet_v2_l           17.96
  pytorch: NVIDIA CUDA GPU - 256 - Efficientnet_v2_l          17.87
  pytorch: NVIDIA CUDA GPU - 512 - Efficientnet_v2_l          18.05
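The summary figures above can be compared across devices with a few lines of arithmetic. The following is a minimal sketch, not part of the Phoronix Test Suite output: the `results` dictionary and the `gpu_speedup` helper are hypothetical names introduced here, and only the batch-size-1 values are copied from the summary.

```python
# Batch-size-1 results in batches/sec, copied from the summary above.
# The dictionary layout and helper name are illustrative assumptions.
results = {
    "ResNet-50": {"cpu": 29.09, "gpu": 102.30},
    "ResNet-152": {"cpu": 11.29, "gpu": 36.32},
    "Efficientnet_v2_l": {"cpu": 6.66, "gpu": 18.22},
}


def gpu_speedup(cpu_rate: float, gpu_rate: float) -> float:
    """Return the ratio of GPU throughput to CPU throughput."""
    return gpu_rate / cpu_rate


# Ratio of NVIDIA CUDA GPU throughput to CPU throughput per model.
speedups = {
    model: gpu_speedup(rates["cpu"], rates["gpu"])
    for model, rates in results.items()
}

for model, ratio in speedups.items():
    print(f"{model}: {ratio:.2f}x")
```

The same ratio can be taken at any other batch size by swapping in the corresponding rows; the ratios describe this one system only.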
PyTorch 2.2.1, Device: CPU, per-test results for "teste-total" (batches/sec, More Is Better)

  Batch Size  Model              Result   SE +/-   N     MIN     MAX
           1  ResNet-50           29.09     0.21  12   22.21    31.7
           1  ResNet-152          11.29     0.04   3    8.37   12.03
          16  ResNet-50           20.75     0.23   3   15.96   21.78
          32  ResNet-50           20.96     0.19   3   17.26   22.23
          64  ResNet-50           21.07     0.19   3    16.2   22.19
          16  ResNet-152           8.80     0.03   3    6.41    9.14
         256  ResNet-50           20.86     0.15   3   15.67   22.29
          32  ResNet-152           8.81     0.03   3    8.42    9.02
         512  ResNet-50           21.04     0.16  10   14.62   23.22
          64  ResNet-152           8.85     0.01   3    7.34    9.06
         256  ResNet-152           8.86     0.04   3    7.22    9.09
         512  ResNet-152           8.70     0.03   3    6.36    8.91
           1  Efficientnet_v2_l    6.66     0.03   3    5.42    7.35
          16  Efficientnet_v2_l    4.98     0.00   3     4.4    5.16
          32  Efficientnet_v2_l    4.96     0.03   3    4.58    5.17
          64  Efficientnet_v2_l    5.05     0.02   3     4.2    5.22
         256  Efficientnet_v2_l    5.00     0.01   3    4.53    5.17
         512  Efficientnet_v2_l    4.92     0.05   3    4.17    5.11
PyTorch 2.2.1, Device: NVIDIA CUDA GPU, per-test results for "teste-total" (batches/sec, More Is Better)

  Batch Size  Model              Result   SE +/-   N     MIN     MAX
           1  ResNet-50          102.30     0.46   3   92.86  105.67
           1  ResNet-152          36.32     0.38   3   34.01   37.76
          16  ResNet-50           99.64     1.02   5   82.77  104.87
          32  ResNet-50           99.91     0.89   7   76.59  106.68
          64  ResNet-50           99.62     0.73   3   89.42  102.62
          16  ResNet-152          35.53     0.16   3   33.45   36.51
         256  ResNet-50          100.18     0.32   3   91.43  102.65
          32  ResNet-152          35.53     0.43   3   32.04   37.12
         512  ResNet-50          100.21     1.18   3   89.37  104.69
          64  ResNet-152          35.40     0.35   3   32.64    36.8
         256  ResNet-152          35.16     0.33   3   32.29   36.53
         512  ResNet-152          35.50     0.28   3   32.84   36.69
           1  Efficientnet_v2_l   18.22     0.21   3   17.02   18.98
          16  Efficientnet_v2_l   17.79     0.06   3   16.76   18.24
          32  Efficientnet_v2_l   17.87     0.22   3   16.93   18.63
          64  Efficientnet_v2_l   17.96     0.06   3   17.08   18.37
         256  Efficientnet_v2_l   17.87     0.03   3   16.71   18.31
         512  Efficientnet_v2_l   18.05     0.05   3   16.72   18.46
Phoronix Test Suite v10.8.5