ff

AMD Ryzen Threadripper 3970X 32-Core testing with an ASUS ROG ZENITH II EXTREME (1802 BIOS) and AMD Radeon RX 5700 8GB on Ubuntu 22.04 via the Phoronix Test Suite.

HTML result view exported from: https://openbenchmarking.org/result/2401113-NE-FF610899407&grr.

ff: system details for runs a and b

Processor: AMD Ryzen Threadripper 3970X 32-Core @ 3.70GHz (32 Cores / 64 Threads)
Motherboard: ASUS ROG ZENITH II EXTREME (1802 BIOS)
Chipset: AMD Starship/Matisse
Memory: 4 x 16 GB DRAM-3600MT/s Corsair CMT64GX4M4Z3600C16
Disk: Samsung SSD 980 PRO 500GB
Graphics: AMD Radeon RX 5700 8GB (1750/875MHz)
Audio: AMD Navi 10 HDMI Audio
Monitor: ASUS VP28U
Network: Aquantia AQC107 NBase-T/IEEE + Intel I211 + Intel Wi-Fi 6 AX200
OS: Ubuntu 22.04
Kernel: 6.2.0-39-generic (x86_64)
Desktop: GNOME Shell 42.2
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 22.0.1 (LLVM 13.0.1 DRM 3.49)
Vulkan: 1.2.204
Compiler: GCC 11.4.0
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details: Transparent Huge Pages: madvise

Compiler Details: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-bootstrap --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-link-serialization=2 --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-build-config=bootstrap-lto-lean --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v

Processor Details: Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled) - CPU Microcode: 0x830107a

Python Details: Python 3.10.12

Security Details:
  gather_data_sampling: Not affected
  itlb_multihit: Not affected
  l1tf: Not affected
  mds: Not affected
  meltdown: Not affected
  mmio_stale_data: Not affected
  retbleed: Mitigation of untrained return thunk; SMT enabled with STIBP protection
  spec_rstack_overflow: Mitigation of safe RET
  spec_store_bypass: Mitigation of SSB disabled via prctl
  spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  spectre_v2: Mitigation of Retpolines IBPB: conditional STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
  srbds: Not affected
  tsx_async_abort: Not affected

ff: result overview

Test                                    a               b
pytorch: CPU - 16 - Efficientnet_v2_l   5.99            6.02
tensorflow: CPU - 16 - VGG-16           6.90            6.81
speedb: Seq Fill                        254122          254353
quicksilver: CTS2                       20786667        20850000
llama-cpp: llama-2-70b-chat.Q5_0.gguf   1.86            1.86
pytorch: CPU - 16 - ResNet-152          11.74           11.88
quicksilver: CORAL2 P2                  28950000        28950000
tensorflow: CPU - 16 - ResNet-50        12.54           12.59
pytorch: CPU - 1 - Efficientnet_v2_l    7.90            7.89
cachebench: Read / Modify / Write       119470.658017   118721.395337
cachebench: Write                       61568.843405    61337.271973
cachebench: Read                        11066.955437    11033.02931
rav1e: 1                                0.836           0.839
pytorch: CPU - 1 - ResNet-152           13.95           14.19
rav1e: 5                                2.813           2.809
pytorch: CPU - 16 - ResNet-50           30.68           31.16
rav1e: 10                               8.630           8.869
speedb: Rand Fill Sync                  5618            5613
speedb: Rand Fill                       252455          252251
speedb: Update Rand                     241710          240957
speedb: Read Rand Write Rand            2422144         2403103
speedb: Read While Writing              7991264         8052776
speedb: Rand Read                       129495776       131475734
tensorflow: CPU - 1 - VGG-16            2.07            2.01
rav1e: 6                                3.733           3.78
quicksilver: CORAL2 P1                  21960000        21990000
llama-cpp: llama-2-13b.Q4_0.gguf        10.63           10.64
tensorflow: CPU - 16 - GoogLeNet        42.90           43.29
pytorch: CPU - 1 - ResNet-50            35.79           35.43
tensorflow: CPU - 16 - AlexNet          61.41           60.42
llama-cpp: llama-2-7b.Q4_0.gguf         19.36           19.32
tensorflow: CPU - 1 - AlexNet           5.35            5.35
tensorflow: CPU - 1 - ResNet-50         6.37            6.4
y-cruncher: 1B                          16.373          16.394
tensorflow: CPU - 1 - GoogLeNet         9.23            9.23
y-cruncher: 500M                        7.925           7.986
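The a and b figures in the overview are easier to compare as a relative difference. The following is a minimal sketch, not part of the Phoronix Test Suite: the helper name `percent_change` and the sign convention (positive means run b did better than run a) are assumptions made for illustration. The sample values are copied from the overview. Y-Cruncher reports seconds, where fewer is better, so its sign is flipped.

```python
def percent_change(a: float, b: float, higher_is_better: bool = True) -> float:
    """Return by how many percent run b is better than run a.

    Hypothetical helper: a positive result means b did better than a.
    For time-based results (fewer is better) the sign is inverted.
    """
    if a == 0:
        raise ValueError("baseline value a must be non-zero")
    change = (b - a) / a * 100.0
    return change if higher_is_better else -change


# (test name, a, b, higher_is_better), values taken from the overview
results = [
    ("pytorch: CPU - 16 - ResNet-50", 30.68, 31.16, True),
    ("rav1e: 10", 8.630, 8.869, True),
    ("tensorflow: CPU - 1 - VGG-16", 2.07, 2.01, True),
    ("y-cruncher: 500M", 7.925, 7.986, False),  # seconds, fewer is better
]

for name, a, b, higher_is_better in results:
    print(f"{name}: {percent_change(a, b, higher_is_better):+.2f}%")
```

Differences of this size should be read against the reported standard error (SE) for each test before treating them as meaningful.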

PyTorch

Device: CPU - Batch Size: 16 - Model: Efficientnet_v2_l

PyTorch 2.1 - batches/sec, More Is Better
a: 5.99 (MIN: 5.9 / MAX: 6.06)
b: 6.02 (MIN: 5.98 / MAX: 6.06)
SE +/- 0.02, N = 3

TensorFlow

Device: CPU - Batch Size: 16 - Model: VGG-16

TensorFlow 2.12 - images/sec, More Is Better
a: 6.90
b: 6.81
SE +/- 0.01, N = 3

Speedb

Test: Sequential Fill

Speedb 2.7 - Op/s, More Is Better
a: 254122
b: 254353
SE +/- 336.51, N = 3
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Quicksilver

Input: CTS2

Quicksilver 20230818 - Figure Of Merit, More Is Better
a: 20786667
b: 20850000
SE +/- 37564.76, N = 3
(CXX) g++ options: -fopenmp -O3 -march=native

Llama.cpp

Model: llama-2-70b-chat.Q5_0.gguf

Llama.cpp b1808 - Tokens Per Second, More Is Better
a: 1.86
b: 1.86
SE +/- 0.00, N = 3
(CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-152

PyTorch 2.1 - batches/sec, More Is Better
a: 11.74 (MIN: 11.43 / MAX: 11.97)
b: 11.88 (MIN: 11.8 / MAX: 11.95)
SE +/- 0.05, N = 3

Quicksilver

Input: CORAL2 P2

Quicksilver 20230818 - Figure Of Merit, More Is Better
a: 28950000
b: 28950000
SE +/- 37859.39, N = 3
(CXX) g++ options: -fopenmp -O3 -march=native

TensorFlow

Device: CPU - Batch Size: 16 - Model: ResNet-50

TensorFlow 2.12 - images/sec, More Is Better
a: 12.54
b: 12.59
SE +/- 0.08, N = 3

PyTorch

Device: CPU - Batch Size: 1 - Model: Efficientnet_v2_l

PyTorch 2.1 - batches/sec, More Is Better
a: 7.90 (MIN: 7.78 / MAX: 8.01)
b: 7.89 (MIN: 7.8 / MAX: 7.97)
SE +/- 0.01, N = 3

CacheBench

Test: Read / Modify / Write

CacheBench - MB/s, More Is Better
a: 119470.66 (MIN: 97982.3 / MAX: 130920.18)
b: 118721.40 (MIN: 99330.52 / MAX: 130732.06)
SE +/- 81.20, N = 3
(CC) gcc options: -O3 -lrt

CacheBench

Test: Write

CacheBench - MB/s, More Is Better
a: 61568.84 (MIN: 40788.42 / MAX: 66208.33)
b: 61337.27 (MIN: 41835.46 / MAX: 65734.93)
SE +/- 29.06, N = 3
(CC) gcc options: -O3 -lrt

CacheBench

Test: Read

CacheBench - MB/s, More Is Better
a: 11066.96 (MIN: 10956.53 / MAX: 11112.67)
b: 11033.03 (MIN: 11002.15 / MAX: 11058.91)
SE +/- 33.97, N = 3
(CC) gcc options: -O3 -lrt

rav1e

Speed: 1

rav1e 0.7 - Frames Per Second, More Is Better
a: 0.836
b: 0.839
SE +/- 0.001, N = 3

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-152

PyTorch 2.1 - batches/sec, More Is Better
a: 13.95 (MIN: 13.59 / MAX: 14.24)
b: 14.19 (MIN: 14.01 / MAX: 14.36)
SE +/- 0.08, N = 3

rav1e

Speed: 5

rav1e 0.7 - Frames Per Second, More Is Better
a: 2.813
b: 2.809
SE +/- 0.005, N = 3

PyTorch

Device: CPU - Batch Size: 16 - Model: ResNet-50

PyTorch 2.1 - batches/sec, More Is Better
a: 30.68 (MIN: 30 / MAX: 31.31)
b: 31.16 (MIN: 30.47 / MAX: 31.36)
SE +/- 0.17, N = 3

rav1e

Speed: 10

rav1e 0.7 - Frames Per Second, More Is Better
a: 8.630
b: 8.869
SE +/- 0.075, N = 3

Speedb

Test: Random Fill Sync

Speedb 2.7 - Op/s, More Is Better
a: 5618
b: 5613
SE +/- 49.65, N = 3
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb

Test: Random Fill

Speedb 2.7 - Op/s, More Is Better
a: 252455
b: 252251
SE +/- 91.82, N = 3
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb

Test: Update Random

Speedb 2.7 - Op/s, More Is Better
a: 241710
b: 240957
SE +/- 580.55, N = 3
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb

Test: Read Random Write Random

Speedb 2.7 - Op/s, More Is Better
a: 2422144
b: 2403103
SE +/- 3556.84, N = 3
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb

Test: Read While Writing

Speedb 2.7 - Op/s, More Is Better
a: 7991264
b: 8052776
SE +/- 69727.75, N = 3
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

Speedb

Test: Random Read

Speedb 2.7 - Op/s, More Is Better
a: 129495776
b: 131475734
SE +/- 1683044.27, N = 3
(CXX) g++ options: -O3 -march=native -pthread -fno-builtin-memcmp -fno-rtti -lpthread

TensorFlow

Device: CPU - Batch Size: 1 - Model: VGG-16

TensorFlow 2.12 - images/sec, More Is Better
a: 2.07
b: 2.01
SE +/- 0.01, N = 3

rav1e

Speed: 6

rav1e 0.7 - Frames Per Second, More Is Better
a: 3.733
b: 3.780
SE +/- 0.017, N = 3

Quicksilver

Input: CORAL2 P1

Quicksilver 20230818 - Figure Of Merit, More Is Better
a: 21960000
b: 21990000
SE +/- 0.00, N = 3
(CXX) g++ options: -fopenmp -O3 -march=native

Llama.cpp

Model: llama-2-13b.Q4_0.gguf

Llama.cpp b1808 - Tokens Per Second, More Is Better
a: 10.63
b: 10.64
SE +/- 0.00, N = 3
(CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

TensorFlow

Device: CPU - Batch Size: 16 - Model: GoogLeNet

TensorFlow 2.12 - images/sec, More Is Better
a: 42.90
b: 43.29
SE +/- 0.16, N = 3

PyTorch

Device: CPU - Batch Size: 1 - Model: ResNet-50

PyTorch 2.1 - batches/sec, More Is Better
a: 35.79 (MIN: 35 / MAX: 36.54)
b: 35.43 (MIN: 34.31 / MAX: 36.32)
SE +/- 0.14, N = 3

TensorFlow

Device: CPU - Batch Size: 16 - Model: AlexNet

TensorFlow 2.12 - images/sec, More Is Better
a: 61.41
b: 60.42
SE +/- 0.13, N = 3

Llama.cpp

Model: llama-2-7b.Q4_0.gguf

Llama.cpp b1808 - Tokens Per Second, More Is Better
a: 19.36
b: 19.32
SE +/- 0.07, N = 3
(CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -lopenblas

TensorFlow

Device: CPU - Batch Size: 1 - Model: AlexNet

TensorFlow 2.12 - images/sec, More Is Better
a: 5.35
b: 5.35
SE +/- 0.00, N = 3

TensorFlow

Device: CPU - Batch Size: 1 - Model: ResNet-50

TensorFlow 2.12 - images/sec, More Is Better
a: 6.37
b: 6.40
SE +/- 0.01, N = 3

Y-Cruncher

Pi Digits To Calculate: 1B

Y-Cruncher 0.8.3 - Seconds, Fewer Is Better
a: 16.37
b: 16.39
SE +/- 0.03, N = 3

TensorFlow

Device: CPU - Batch Size: 1 - Model: GoogLeNet

TensorFlow 2.12 - images/sec, More Is Better
a: 9.23
b: 9.23
SE +/- 0.02, N = 3

Y-Cruncher

Pi Digits To Calculate: 500M

Y-Cruncher 0.8.3 - Seconds, Fewer Is Better
a: 7.925
b: 7.986
SE +/- 0.007, N = 3


Phoronix Test Suite v10.8.5