newd
AMD EPYC 7R13 48-Core testing with a Supermicro H12SSL-I v1.02 (2.8 BIOS) and NVIDIA GeForce RTX 4090 24GB on EndeavourOS rolling via the Phoronix Test Suite.
HTML result view exported from: https://openbenchmarking.org/result/2404233-NE-NEWD5443293 .
newd: system details
System: AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB

  Processor:          AMD EPYC 7R13 48-Core @ 3.73GHz (48 Cores / 96 Threads)
  Motherboard:        Supermicro H12SSL-I v1.02 (2.8 BIOS)
  Chipset:            AMD Starship/Matisse
  Memory:             256GB
  Disk:               15363GB Micron_7450_MTFDKCC15T3TFR
  Graphics:           NVIDIA GeForce RTX 4090 24GB
  Audio:              NVIDIA AD102 HD Audio
  Monitor:            38GN950
  Network:            2 x Intel X710 for 10GbE SFP+
  OS:                 EndeavourOS rolling
  Kernel:             6.8.7-zen1-1-zen (x86_64)
  Desktop:            Xfce 4.18
  Display Server:     X Server 1.21.1.13
  Display Driver:     NVIDIA 550.76
  OpenGL:             4.6.0
  Compiler:           GCC 13.2.1 20230801 + Clang 17.0.6 + LLVM 17.0.6 + CUDA 12.4
  File-System:        btrfs
  Screen Resolution:  3840x1600

- Transparent Huge Pages: always
- NVCC_PREPEND_FLAGS="-ccbin /opt/cuda/bin"
- --disable-libssp --disable-libstdcxx-pch --disable-werror --enable-__cxa_atexit --enable-bootstrap --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=ada,c,c++,d,fortran,go,lto,m2,objc,obj-c++ --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-build-config=bootstrap-lto --with-linker-hash-style=gnu
- Scaling Governor: amd-pstate-epp performance (EPP: performance) - CPU Microcode: 0xa0011d3
- Python 3.11.8
- gather_data_sampling: Not affected
  + itlb_multihit: Not affected
  + l1tf: Not affected
  + mds: Not affected
  + meltdown: Not affected
  + mmio_stale_data: Not affected
  + reg_file_data_sampling: Not affected
  + retbleed: Not affected
  + spec_rstack_overflow: Mitigation of Safe RET
  + spec_store_bypass: Mitigation of SSB disabled via prctl
  + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
  + spectre_v2: Mitigation of Retpolines; IBPB: conditional; IBRS_FW; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected
  + srbds: Not affected
  + tsx_async_abort: Not affected
newd: result overview
System: AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB

  Test                                   Unit               Result
  openvino: Face Detection FP16 - CPU    FPS                10.31
  openvino: Face Detection FP16 - CPU    ms                 2306.36
  openvino: Person Detection FP16 - CPU  FPS                104.73
  openvino: Person Detection FP16 - CPU  ms                 228.80
  openvino: Person Detection FP32 - CPU  FPS                105.53
  openvino: Person Detection FP32 - CPU  ms                 227.07
  llama-cpp: llama-2-13b.Q4_0.gguf       Tokens Per Second  15.84
OpenVINO 2024.0
Model: Face Detection FP16 - Device: CPU
FPS, More Is Better
AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB: 10.31 (SE +/- 0.02, N = 3)

OpenVINO 2024.0
Model: Face Detection FP16 - Device: CPU
ms, Fewer Is Better
AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB: 2306.36 (SE +/- 3.56, N = 3; MIN: 1274.94 / MAX: 2469.91)

OpenVINO 2024.0
Model: Person Detection FP16 - Device: CPU
FPS, More Is Better
AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB: 104.73 (SE +/- 0.13, N = 3)

OpenVINO 2024.0
Model: Person Detection FP16 - Device: CPU
ms, Fewer Is Better
AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB: 228.80 (SE +/- 0.31, N = 3; MIN: 115.55 / MAX: 317.77)

OpenVINO 2024.0
Model: Person Detection FP32 - Device: CPU
FPS, More Is Better
AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB: 105.53 (SE +/- 0.39, N = 3)

OpenVINO 2024.0
Model: Person Detection FP32 - Device: CPU
ms, Fewer Is Better
AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB: 227.07 (SE +/- 0.81, N = 3; MIN: 130.78 / MAX: 319.21)

1. (CXX) g++ options (all OpenVINO results): -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
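
The FPS and latency figures reported for each OpenVINO model can be cross-checked against each other. The following is a minimal sketch, assuming the benchmark keeps a roughly constant number of inference requests in flight (an assumption; the result page does not state this). Under that assumption, Little's law gives requests in flight as throughput times average latency. The names `RESULTS` and `implied_in_flight` are illustrative and not part of the Phoronix Test Suite or OpenVINO.

```python
# Throughput (FPS) and average latency (ms), copied from the OpenVINO results above.
RESULTS = {
    "Face Detection FP16": (10.31, 2306.36),
    "Person Detection FP16": (104.73, 228.80),
    "Person Detection FP32": (105.53, 227.07),
}


def implied_in_flight(fps: float, latency_ms: float) -> float:
    """Little's law: average requests in flight = throughput * latency.

    Assumes a steady state with a constant level of concurrency,
    which the result page does not confirm.
    """
    return fps * (latency_ms / 1000.0)


if __name__ == "__main__":
    for model, (fps, latency_ms) in RESULTS.items():
        in_flight = implied_in_flight(fps, latency_ms)
        print(f"{model}: {in_flight:.1f} requests in flight (implied)")
```

All three models come out close to 24 under this assumption, so the FPS and latency figures are consistent with one another.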
Llama.cpp b1808
Model: llama-2-13b.Q4_0.gguf
Tokens Per Second, More Is Better
AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB: 15.84 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -fopenmp -lopenblas
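
To put the reported spread in context, the standard error can be expressed relative to the mean, and the token rate can be inverted into time per token. The following is a minimal sketch using only the Llama.cpp figures reported above (15.84 tokens per second, SE +/- 0.02, N = 3). The function names are illustrative, and the standard deviation estimate assumes that SE is the sample standard deviation divided by sqrt(N), which is the usual definition but is not stated on the result page.

```python
import math


def relative_se_percent(mean: float, se: float) -> float:
    """Standard error as a percentage of the mean."""
    return 100.0 * se / mean


def stddev_from_se(se: float, n: int) -> float:
    """Recover the standard deviation, assuming SE = stddev / sqrt(n)."""
    return se * math.sqrt(n)


def ms_per_token(tokens_per_second: float) -> float:
    """Invert a token rate into milliseconds per generated token."""
    return 1000.0 / tokens_per_second


if __name__ == "__main__":
    mean, se, n = 15.84, 0.02, 3  # Llama.cpp result reported above
    print(f"relative SE: {relative_se_percent(mean, se):.2f}%")
    print(f"estimated stddev: {stddev_from_se(se, n):.3f} tokens/s")
    print(f"time per token: {ms_per_token(mean):.1f} ms")
```

For this result the relative standard error is roughly 0.13% and the rate corresponds to about 63 ms per token.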
Phoronix Test Suite v10.8.5