newd

AMD EPYC 7R13 48-Core testing with a Supermicro H12SSL-I v1.02 (2.8 BIOS) and NVIDIA GeForce RTX 4090 24GB on EndeavourOS rolling via the Phoronix Test Suite.

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2404233-NE-NEWD5443293

Run Management

Result Identifier: AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB
Date Run: April 23
Test Duration: 21 Minutes


newd

Processor: AMD EPYC 7R13 48-Core @ 3.73GHz (48 Cores / 96 Threads)
Motherboard: Supermicro H12SSL-I v1.02 (2.8 BIOS)
Chipset: AMD Starship/Matisse
Memory: 256GB
Disk: 15363GB Micron_7450_MTFDKCC15T3TFR
Graphics: NVIDIA GeForce RTX 4090 24GB
Audio: NVIDIA AD102 HD Audio
Monitor: 38GN950
Network: 2 x Intel X710 for 10GbE SFP+
OS: EndeavourOS rolling
Kernel: 6.8.7-zen1-1-zen (x86_64)
Desktop: Xfce 4.18
Display Server: X Server 1.21.1.13
Display Driver: NVIDIA 550.76
OpenGL: 4.6.0
Compiler: GCC 13.2.1 20230801 + Clang 17.0.6 + LLVM 17.0.6 + CUDA 12.4
File-System: btrfs
Screen Resolution: 3840x1600

System Logs

- Transparent Huge Pages: always
- NVCC_PREPEND_FLAGS="-ccbin /opt/cuda/bin"
- --disable-libssp --disable-libstdcxx-pch --disable-werror --enable-__cxa_atexit --enable-bootstrap --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-languages=ada,c,c++,d,fortran,go,lto,m2,objc,obj-c++ --enable-libstdcxx-backtrace --enable-link-serialization=1 --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-build-config=bootstrap-lto --with-linker-hash-style=gnu
- Scaling Governor: amd-pstate-epp performance (EPP: performance)
- CPU Microcode: 0xa0011d3
- Python 3.11.8
- gather_data_sampling: Not affected + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + reg_file_data_sampling: Not affected + retbleed: Not affected + spec_rstack_overflow: Mitigation of Safe RET + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines; IBPB: conditional; IBRS_FW; STIBP: always-on; RSB filling; PBRSB-eIBRS: Not affected; BHI: Not affected + srbds: Not affected + tsx_async_abort: Not affected

newd results for AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB

openvino: Face Detection FP16 - CPU (FPS): 10.31
openvino: Face Detection FP16 - CPU (ms): 2306.36
openvino: Person Detection FP16 - CPU (FPS): 104.73
openvino: Person Detection FP16 - CPU (ms): 228.80
openvino: Person Detection FP32 - CPU (FPS): 105.53
openvino: Person Detection FP32 - CPU (ms): 227.07
llama-cpp: llama-2-13b.Q4_0.gguf (Tokens Per Second): 15.84

OpenVINO

OpenVINO 2024.0, Model: Face Detection FP16 - Device: CPU
FPS, More Is Better
AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB: 10.31 (SE +/- 0.02, N = 3)

OpenVINO 2024.0, Model: Face Detection FP16 - Device: CPU
ms, Fewer Is Better
AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB: 2306.36 (SE +/- 3.56, N = 3; MIN: 1274.94 / MAX: 2469.91)

OpenVINO 2024.0, Model: Person Detection FP16 - Device: CPU
FPS, More Is Better
AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB: 104.73 (SE +/- 0.13, N = 3)

OpenVINO 2024.0, Model: Person Detection FP16 - Device: CPU
ms, Fewer Is Better
AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB: 228.80 (SE +/- 0.31, N = 3; MIN: 115.55 / MAX: 317.77)

OpenVINO 2024.0, Model: Person Detection FP32 - Device: CPU
FPS, More Is Better
AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB: 105.53 (SE +/- 0.39, N = 3)

OpenVINO 2024.0, Model: Person Detection FP32 - Device: CPU
ms, Fewer Is Better
AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB: 227.07 (SE +/- 0.81, N = 3; MIN: 130.78 / MAX: 319.21)

1. All OpenVINO results: (CXX) g++ options: -fPIC -fsigned-char -ffunction-sections -fdata-sections -O3 -fno-strict-overflow -fwrapv -shared -ldl
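
The OpenVINO results report throughput (FPS) and latency (ms) separately for each model. As a rough cross-check, the two can be related through Little's law: requests in flight = throughput x time per request. The sketch below applies that to the reported figures. It assumes the reported ms value is the mean per-request latency and that the benchmark keeps a steady number of inference requests in flight; neither is stated in this result file. The function name `implied_concurrency` is made up for the illustration.

```python
# Reported OpenVINO CPU results from this result file: (FPS, latency in ms).
results = {
    "Face Detection FP16": (10.31, 2306.36),
    "Person Detection FP16": (104.73, 228.80),
    "Person Detection FP32": (105.53, 227.07),
}


def implied_concurrency(fps: float, latency_ms: float) -> float:
    """Little's law: requests in flight = throughput * time per request.

    Assumes latency_ms is the mean per-request latency (an assumption,
    not something the result file states).
    """
    return fps * (latency_ms / 1000.0)


for model, (fps, latency_ms) in results.items():
    in_flight = implied_concurrency(fps, latency_ms)
    print(f"{model}: ~{in_flight:.1f} requests in flight")
```

Under those assumptions all three models come out at roughly 24 requests in flight, which would explain why per-request latency is far higher than 1000 / FPS: throughput is achieved by processing many requests in parallel, not by finishing each one quickly.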

Llama.cpp

Llama.cpp is a port of Facebook's LLaMA model in C/C++ developed by Georgi Gerganov. Llama.cpp allows the inference of LLaMA and other supported models in C/C++. For CPU inference Llama.cpp supports AVX2/AVX-512, ARM NEON, and other modern ISAs along with features like OpenBLAS usage. Learn more via the OpenBenchmarking.org test page.

Llama.cpp b1808, Model: llama-2-13b.Q4_0.gguf
Tokens Per Second, More Is Better
AMD EPYC 7R13 48-Core - NVIDIA GeForce RTX 4090 24GB: 15.84 (SE +/- 0.02, N = 3)

1. (CXX) g++ options: -std=c++11 -fPIC -O3 -pthread -march=native -mtune=native -fopenmp -lopenblas
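
For readers who think in per-token latency instead of throughput, the Llama.cpp figure can be converted with simple arithmetic. The sketch below also expresses the reported SE as a percentage of the mean and backs out an approximate run-to-run standard deviation. It assumes "SE" is the standard error of the mean over the N = 3 runs (SE = SD / sqrt(N)); the helper names are illustrative only.

```python
import math

# Figures reported for llama-2-13b.Q4_0.gguf in this result file.
tokens_per_second = 15.84  # mean of the runs
standard_error = 0.02      # reported "SE +/-"
runs = 3                   # reported N


def ms_per_token(tps: float) -> float:
    """Average time per generated token, in milliseconds."""
    return 1000.0 / tps


def relative_se_percent(mean: float, se: float) -> float:
    """Standard error expressed as a percentage of the mean."""
    return 100.0 * se / mean


def std_dev_from_se(se: float, n: int) -> float:
    """Back out the sample standard deviation, assuming SE = SD / sqrt(N)."""
    return se * math.sqrt(n)


print(f"Time per token: {ms_per_token(tokens_per_second):.1f} ms")
print(f"Relative SE: {relative_se_percent(tokens_per_second, standard_error):.2f} %")
print(f"Approx. run-to-run SD: {std_dev_from_se(standard_error, runs):.3f} tokens/s")
```

At 15.84 tokens per second this works out to about 63 ms per token, and the standard error is well under 1 % of the mean, so the three runs agree closely.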

7 Results Shown

OpenVINO:
  Face Detection FP16 - CPU:
    FPS
    ms
  Person Detection FP16 - CPU:
    FPS
    ms
  Person Detection FP32 - CPU:
    FPS
    ms
Llama.cpp