AMD Ryzen 9 7950X AVX-512

AMD Ryzen 9 7950X AVX-512 benchmark comparison by Michael Larabel for the launch-day (embargo lift) review. The stock, out-of-the-box build was tested with AVX-512 enabled. Because the ASUS BIOS offers no AVX-512 toggle, the AVX2 / non-AVX-512 run was carried out by booting the kernel with "clearcpuid=304", which clears AVX-512 support from the kernel and from binary programs that scan /proc/cpuinfo for avx512* extensions. In addition, the open-source benchmarks were built with CFLAGS/CXXFLAGS that omit the AVX-512 extensions. See the full launch-day review at https://www.phoronix.com/review/amd-zen4-avx512
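The clearcpuid=304 approach works because many user-space programs decide whether to use AVX-512 by scanning the flags line of /proc/cpuinfo. A minimal sketch of that detection logic (run here against an embedded sample flags line rather than the live file, so it behaves the same on any machine; the sample is illustrative, not taken from the tested system):

```python
# Hypothetical sketch of the /proc/cpuinfo avx512* scan mentioned above.
def avx512_extensions(cpuinfo_text: str) -> list[str]:
    """Return the sorted avx512* flags found in a /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = line.split(":", 1)[1].split()
            return sorted(f for f in flags if f.startswith("avx512"))
    return []

# Illustrative sample; booting with clearcpuid=304 would empty this list.
sample = "flags\t\t: fpu sse2 avx avx2 avx512f avx512cd avx512vl avx512bw"
print(avx512_extensions(sample))
```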

Compare your own system(s) to this result file with the Phoronix Test Suite by running the command: phoronix-test-suite benchmark 2209253-NE-RYZEN795065

Result runs (both from September 17 2022):
- Default, AVX-512 Enabled — test duration: 4 hours, 11 minutes
- Without AVX-512 — test duration: 3 hours, 57 minutes


AMD Ryzen 9 7950X AVX-512 Benchmarks — System Under Test:
- Processor: AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads)
- Motherboard: ASUS ROG CROSSHAIR X670E HERO (0604 BIOS)
- Chipset: AMD Device 14d8
- Memory: 32GB
- Disk: 2000GB Samsung SSD 980 PRO 2TB + 2000GB
- Graphics: AMD Radeon RX 6800 XT 16GB (2575/1000MHz)
- Audio: AMD Navi 21 HDMI Audio
- Monitor: ASUS VP28U
- Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
- OS: Ubuntu 22.04
- Kernel: 6.0.0-060000rc1daily20220820-generic (x86_64)
- Desktop: GNOME Shell 42.2
- Display Server: X Server + Wayland
- OpenGL: 4.6 Mesa 22.3.0-devel (git-4685385 2022-08-23 jammy-oibaf-ppa) (LLVM 14.0.6 DRM 3.48)
- Vulkan: 1.3.224
- Compiler: GCC 12.0.1 20220319
- File-System: ext4
- Screen Resolution: 3840x2160

System Notes:
- Transparent Huge Pages: madvise
- Without AVX-512: CXXFLAGS="-O3 -march=native -mno-avx512f" CFLAGS="-O3 -march=native -mno-avx512f"
- Default, AVX-512 Enabled: CXXFLAGS="-O3 -march=native -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -mprefer-vector-width=512" CFLAGS="-O3 -march=native -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi -mprefer-vector-width=512"
- GCC configure: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-cet --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-defaulted --enable-offload-targets=nvptx-none=/build/gcc-12-OcsLtf/gcc-12-12-20220319/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-12-OcsLtf/gcc-12-12-20220319/debian/tmp-gcn/usr --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
- Scaling Governor: amd-pstate performance (Boost: Enabled)
- CPU Microcode: 0xa601203
- Python 3.10.4
- Security: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + mmio_stale_data: Not affected + retbleed: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected + srbds: Not affected + tsx_async_abort: Not affected

[Comparison chart: Without AVX-512 vs. Default, AVX-512 Enabled. The largest AVX-512 uplift was +265.7% for Cpuminer-Opt Myriad-Groestl; many OpenVINO, oneDNN, OSPRay, and Cpuminer-Opt workloads gained roughly 80~117%, while the smallest measured gains were around 2.8~3.4% for dav1d.]
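The percentages in the comparison overview are simple relative changes between the two runs. A quick sketch of that calculation, using the Myriad-Groestl hash rates from the Cpuminer-Opt results below:

```python
# Relative uplift for a higher-is-better metric, as shown in the
# comparison chart: (improved / baseline - 1) * 100.
def uplift_pct(baseline: float, improved: float) -> float:
    return (improved / baseline - 1.0) * 100.0

without_avx512 = 16393  # kH/s, Myriad-Groestl, Without AVX-512
with_avx512 = 59957     # kH/s, Myriad-Groestl, AVX-512 Enabled
print(f"{uplift_pct(without_avx512, with_avx512):.1f}%")  # matches the chart's 265.7%
```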

[Results overview table: side-by-side raw results for every test under "Without AVX-512" and "Default, AVX-512 Enabled"; the per-test listings below carry the individual values.]

LeelaChessZero

LeelaChessZero (lc0 / lczero) is a chess engine automated via neural networks. This test profile can be used for OpenCL, CUDA + cuDNN, and BLAS (CPU-based) benchmarking. Learn more via the OpenBenchmarking.org test page.

LeelaChessZero 0.28, Backend: Eigen (Nodes Per Second, more is better; (CXX) g++ options: -flto -O3 -march=native -pthread; per-run flags: -mno-avx512f vs. -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)
- Without AVX-512: 1578 (SE +/- 16.33, N = 3; Min: 1547 / Avg: 1578.33 / Max: 1602)
- Default, AVX-512 Enabled: 1735 (SE +/- 7.31, N = 3; Min: 1724 / Avg: 1735.33 / Max: 1749)
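Each result reports a standard error ("SE +/- …, N = …") alongside the average of the runs. Assuming the test suite computes the usual standard error of the mean, the calculation is:

```python
# Standard error of the mean: sample standard deviation / sqrt(N).
# The three run values below are hypothetical, for illustration only.
import math
import statistics

def standard_error(samples: list[float]) -> float:
    return statistics.stdev(samples) / math.sqrt(len(samples))

runs = [1547.0, 1586.0, 1602.0]  # hypothetical nodes/sec over N = 3 runs
print(round(standard_error(runs), 2))
```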

simdjson

This is a benchmark of SIMDJSON, a high performance JSON parser. SIMDJSON aims to be the fastest JSON parser and is used by projects like Microsoft FishStore, Yandex ClickHouse, Shopify, and others. Learn more via the OpenBenchmarking.org test page.
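The GB/s figures below are parsing throughput: bytes of JSON processed divided by elapsed wall time. A rough illustration of how such a number is derived, using Python's stdlib json parser rather than simdjson (so the absolute rate is far below the results here):

```python
# Illustration of a GB/s parsing-throughput measurement:
# bytes processed / elapsed seconds. Stdlib json, not simdjson.
import json
import time

doc = json.dumps([{"id": i, "text": "tweet " * 8} for i in range(10000)])
data = doc.encode()

start = time.perf_counter()
json.loads(data)
elapsed = time.perf_counter() - start

gbps = len(data) / elapsed / 1e9
print(f"parsed {len(data)} bytes at {gbps:.3f} GB/s")
```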

simdjson 2.0 (GB/s, more is better; (CXX) g++ options: -O3 -march=native; per-run flags: -mno-avx512f vs. -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)

Throughput Test: Kostya
- Without AVX-512: 4.95 (SE +/- 0.00, N = 3; Min: 4.95 / Max: 4.95)
- Default, AVX-512 Enabled: 6.01 (SE +/- 0.04, N = 3; Min: 5.94 / Max: 6.06)

Throughput Test: TopTweet
- Without AVX-512: 7.90 (SE +/- 0.02, N = 3; Min: 7.87 / Max: 7.93)
- Default, AVX-512 Enabled: 9.96 (SE +/- 0.09, N = 3; Min: 9.78 / Max: 10.05)

Throughput Test: LargeRandom
- Without AVX-512: 1.57 (SE +/- 0.00, N = 3)
- Default, AVX-512 Enabled: 1.79 (SE +/- 0.00, N = 3)

Throughput Test: PartialTweets
- Without AVX-512: 7.69 (SE +/- 0.01, N = 3; Min: 7.66 / Max: 7.70)
- Default, AVX-512 Enabled: 9.64 (SE +/- 0.08, N = 3; Min: 9.53 / Max: 9.79)

Throughput Test: DistinctUserID
- Without AVX-512: 7.74 (SE +/- 0.07, N = 3; Min: 7.67 / Max: 7.88)
- Default, AVX-512 Enabled: 10.26 (SE +/- 0.06, N = 3; Min: 10.18 / Max: 10.38)

dav1d

Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.

dav1d 1.0 (FPS, more is better; (CC) gcc options: -O3 -march=native -pthread -lm; per-run flags: -mno-avx512f vs. -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)

Video Input: Summer Nature 4K
- Without AVX-512: 380.32 (SE +/- 3.22, N = 5; Min: 373.13 / Max: 390.04)
- Default, AVX-512 Enabled: 393.40 (SE +/- 1.37, N = 5; Min: 388.84 / Max: 397.29)

Video Input: Summer Nature 1080p
- Without AVX-512: 1412.27 (SE +/- 2.53, N = 10; Min: 1403.92 / Max: 1425.62)
- Default, AVX-512 Enabled: 1451.94 (SE +/- 1.72, N = 10; Min: 1443.89 / Max: 1460.45)

Embree

Intel Embree is a collection of high-performance ray-tracing kernels for execution on CPUs and supporting instruction sets such as SSE, AVX, AVX2, and AVX-512. Embree also supports making use of the Intel SPMD Program Compiler (ISPC). Learn more via the OpenBenchmarking.org test page.

Embree 3.13 (Frames Per Second, more is better)

Binary: Pathtracer ISPC - Model: Crown
- Without AVX-512: 29.12 (SE +/- 0.11, N = 3; Min: 28.91 / Max: 29.29)
- Default, AVX-512 Enabled: 35.23 (SE +/- 0.08, N = 3; Min: 35.12 / Max: 35.40)

Binary: Pathtracer ISPC - Model: Asian Dragon
- Without AVX-512: 30.39 (SE +/- 0.16, N = 3; Min: 30.07 / Max: 30.61)
- Default, AVX-512 Enabled: 35.12 (SE +/- 0.08, N = 3; Min: 34.96 / Max: 35.23)

OpenVKL

OpenVKL is the Intel Open Volume Kernel Library that offers high-performance volume computation kernels and is part of the Intel oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OpenVKL 1.0, Benchmark: vklBenchmark ISPC (Items / Sec, more is better)
- Without AVX-512: 153 (SE +/- 0.00, N = 3)
- Default, AVX-512 Enabled: 199 (SE +/- 0.33, N = 3; Min: 199 / Max: 200)

OSPRay

Intel OSPRay is a portable ray-tracing engine for high-performance, high-fidelity scientific visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay 2.10 (Items Per Second, more is better)

Benchmark: particle_volume/pathtracer/real_time
- Without AVX-512: 231.38 (SE +/- 0.59, N = 3)
- Default, AVX-512 Enabled: 250.66 (SE +/- 0.91, N = 3)

Benchmark: gravity_spheres_volume/dim_512/ao/real_time
- Without AVX-512: 4.53529 (SE +/- 0.00574, N = 3)
- Default, AVX-512 Enabled: 8.05397 (SE +/- 0.00437, N = 3)

Benchmark: gravity_spheres_volume/dim_512/scivis/real_time
- Without AVX-512: 4.33398 (SE +/- 0.00159, N = 3)
- Default, AVX-512 Enabled: 7.93432 (SE +/- 0.03212, N = 3)

Benchmark: gravity_spheres_volume/dim_512/pathtracer/real_time
- Without AVX-512: 6.92227 (SE +/- 0.00119, N = 3)
- Default, AVX-512 Enabled: 9.50906 (SE +/- 0.00541, N = 3)

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for Deep Neural Networks, making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of Intel oneAPI. Learn more via the OpenBenchmarking.org test page.

oneDNN 2.6 (ms, fewer is better; (CXX) g++ options: -O3 -march=native -fopenmp -msse4.1 -fPIC -std=c++11 -pie -ldl -lpthread; per-run flags: -mno-avx512f vs. -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)

Harness: IP Shapes 1D - Data Type: u8s8f32 - Engine: CPU
- Without AVX-512: 0.625942 (SE +/- 0.006181, N = 5)
- Default, AVX-512 Enabled: 0.417196 (SE +/- 0.007969, N = 15)

Harness: Convolution Batch Shapes Auto - Data Type: u8s8f32 - Engine: CPU
- Without AVX-512: 7.44936 (SE +/- 0.00637, N = 7)
- Default, AVX-512 Enabled: 5.32127 (SE +/- 0.00255, N = 7)

Harness: Deconvolution Batch shapes_1d - Data Type: u8s8f32 - Engine: CPU
- Without AVX-512: 0.800116 (SE +/- 0.010456, N = 3)
- Default, AVX-512 Enabled: 0.415940 (SE +/- 0.000107, N = 3)

Harness: Deconvolution Batch shapes_3d - Data Type: u8s8f32 - Engine: CPU
- Without AVX-512: 1.083960 (SE +/- 0.006665, N = 9)
- Default, AVX-512 Enabled: 0.576432 (SE +/- 0.001581, N = 9)

Harness: Recurrent Neural Network Training - Data Type: u8s8f32 - Engine: CPU
- Without AVX-512: 1566.09 (SE +/- 8.37, N = 3)
- Default, AVX-512 Enabled: 1137.29 (SE +/- 1.54, N = 3)

Harness: Recurrent Neural Network Inference - Data Type: u8s8f32 - Engine: CPU
- Without AVX-512: 910.46 (SE +/- 3.36, N = 3)
- Default, AVX-512 Enabled: 583.32 (SE +/- 0.21, N = 3)

Harness: Recurrent Neural Network Training - Data Type: bf16bf16bf16 - Engine: CPU
- Without AVX-512: 1544.39 (SE +/- 19.60, N = 3)
- Default, AVX-512 Enabled: 1137.96 (SE +/- 1.66, N = 3)

Harness: Recurrent Neural Network Inference - Data Type: bf16bf16bf16 - Engine: CPU
- Without AVX-512: 902.10 (SE +/- 5.88, N = 3)
- Default, AVX-512 Enabled: 582.90 (SE +/- 1.02, N = 3)

OSPRay Studio

Intel OSPRay Studio is an open-source, interactive visualization and ray-tracing software package. OSPRay Studio makes use of Intel OSPRay, a portable ray-tracing engine for high-performance, high-fidelity visualizations. OSPRay builds off Intel's Embree and Intel SPMD Program Compiler (ISPC) components as part of the oneAPI rendering toolkit. Learn more via the OpenBenchmarking.org test page.

OSPRay Studio 0.11 (ms, fewer is better; (CXX) g++ options: -O3 -march=native -ldl; per-run flags: -mno-avx512f vs. -mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi)

Camera: 1 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer
- Without AVX-512: 4502 (SE +/- 6.51, N = 3)
- Default, AVX-512 Enabled: 3659 (SE +/- 5.21, N = 3)

Camera: 3 - Resolution: 4K - Samples Per Pixel: 1 - Renderer: Path Tracer
- Without AVX-512: 5348 (SE +/- 3.28, N = 3)
- Default, AVX-512 Enabled: 4368 (SE +/- 5.55, N = 3)

Camera: 1 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer
- Without AVX-512: 74844 (SE +/- 91.34, N = 3)
- Default, AVX-512 Enabled: 62029 (SE +/- 138.54, N = 3)

Camera: 1 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer
- Without AVX-512: 146156 (SE +/- 168.25, N = 3)
- Default, AVX-512 Enabled: 120727 (SE +/- 140.99, N = 3)

Camera: 3 - Resolution: 4K - Samples Per Pixel: 16 - Renderer: Path Tracer
- Without AVX-512: 88344 (SE +/- 7.57, N = 3)
- Default, AVX-512 Enabled: 73617 (SE +/- 149.41, N = 3)

Camera: 3 - Resolution: 4K - Samples Per Pixel: 32 - Renderer: Path Tracer
- Without AVX-512: 172872 (SE +/- 155.59, N = 3)
- Default, AVX-512 Enabled: 143981 (SE +/- 116.66, N = 3)

Cpuminer-Opt

Cpuminer-Opt is a fork of cpuminer-multi that carries a wide range of CPU performance optimizations for measuring the potential cryptocurrency mining performance of the CPU/processor with a wide variety of cryptocurrencies. The benchmark reports the hash speed for the CPU mining performance for the selected cryptocurrency. Learn more via the OpenBenchmarking.org test page.
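The hash-rate metric is simply hashes computed per second. The "Blake-2 S" algorithm corresponds to the BLAKE2s hash function, which is also available in Python's hashlib; a toy single-threaded measurement (illustrative only, and orders of magnitude below cpuminer-opt's optimized multi-threaded rates):

```python
# Toy hash-rate measurement with BLAKE2s (hashlib), reported in kH/s.
# Pure-Python loop: illustrative, not comparable to cpuminer-opt.
import hashlib
import time

def blake2s_khs(n_hashes: int = 20000) -> float:
    payload = b"\x00" * 80  # miner-style 80-byte block header (hypothetical)
    start = time.perf_counter()
    for nonce in range(n_hashes):
        hashlib.blake2s(payload + nonce.to_bytes(4, "little")).digest()
    elapsed = time.perf_counter() - start
    return n_hashes / elapsed / 1000.0  # kilohashes per second

print(f"{blake2s_khs():.1f} kH/s")
```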

OpenBenchmarking.orgkH/s, More Is BetterCpuminer-Opt 3.18Algorithm: Blake-2 SWithout AVX-512Default, AVX-512 Enabled500K1000K1500K2000K2500KSE +/- 4192.40, N = 3SE +/- 38034.65, N = 1210337002113434-mno-avx512f-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi1. (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp
OpenBenchmarking.orgkH/s, More Is BetterCpuminer-Opt 3.18Algorithm: Blake-2 SWithout AVX-512Default, AVX-512 Enabled400K800K1200K1600K2000KMin: 1025490 / Avg: 1033700 / Max: 1039280Min: 1847270 / Avg: 2113434.17 / Max: 23279301. (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp

Cpuminer-Opt 3.18 - Algorithm: Garlicoin (kH/s, more is better)
  Without AVX-512: 2597.65 (SE +/- 25.96, N = 3; Min: 2570.02 / Avg: 2597.65 / Max: 2649.54) [-mno-avx512f]
  Default, AVX-512 Enabled: 3977.80 (SE +/- 75.31, N = 12; Min: 3230.18 / Avg: 3977.8 / Max: 4305.17) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp

Cpuminer-Opt 3.18 - Algorithm: Skeincoin (kH/s, more is better)
  Without AVX-512: 204320 (SE +/- 1803.62, N = 3; Min: 200980 / Avg: 204320 / Max: 207170) [-mno-avx512f]
  Default, AVX-512 Enabled: 284667 (SE +/- 424.91, N = 3; Min: 283910 / Avg: 284666.67 / Max: 285380) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp

Cpuminer-Opt 3.18 - Algorithm: Myriad-Groestl (kH/s, more is better)
  Without AVX-512: 16393 (SE +/- 105.88, N = 3; Min: 16250 / Avg: 16393.33 / Max: 16600) [-mno-avx512f]
  Default, AVX-512 Enabled: 59957 (SE +/- 669.95, N = 15; Min: 57670 / Avg: 59956.67 / Max: 66610) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp

Cpuminer-Opt 3.18 - Algorithm: LBC, LBRY Credits (kH/s, more is better)
  Without AVX-512: 76960 (SE +/- 40.41, N = 3; Min: 76910 / Avg: 76960 / Max: 77040) [-mno-avx512f]
  Default, AVX-512 Enabled: 151720 (SE +/- 177.76, N = 3; Min: 151380 / Avg: 151720 / Max: 151980) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp

Cpuminer-Opt 3.18 - Algorithm: Quad SHA-256, Pyrite (kH/s, more is better)
  Without AVX-512: 192667 (SE +/- 177.04, N = 3; Min: 192470 / Avg: 192666.67 / Max: 193020) [-mno-avx512f]
  Default, AVX-512 Enabled: 321503 (SE +/- 1040.42, N = 3; Min: 320260 / Avg: 321503.33 / Max: 323570) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -O3 -march=native -lcurl -lz -lpthread -lssl -lcrypto -lgmp
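
As a quick cross-check of the Cpuminer-Opt deltas above, the AVX-512-versus-AVX2 ratios can be recomputed directly from the reported averages (a minimal Python sketch using only the figures on this page):

```python
# (algorithm, Without AVX-512, Default/AVX-512 Enabled) -- average kH/s from above
results = [
    ("Blake-2 S", 1033700, 2113434),
    ("Garlicoin", 2597.65, 3977.80),
    ("Skeincoin", 204320, 284667),
    ("Myriad-Groestl", 16393, 59957),
    ("LBC, LBRY Credits", 76960, 151720),
    ("Quad SHA-256, Pyrite", 192667, 321503),
]

for name, without, with_avx512 in results:
    # Higher kH/s is better, so the ratio is the AVX-512 speedup
    print(f"{name}: {with_avx512 / without:.2f}x with AVX-512")
```

Myriad-Groestl stands out at roughly 3.7x, while the other algorithms land around 1.4x to 2x.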

Mobile Neural Network

MNN (Mobile Neural Network) is a highly efficient, lightweight deep learning framework developed by Alibaba. This MNN test profile builds the OpenMP / CPU-threaded version for processor benchmarking, not any GPU-accelerated variant. MNN can make use of AVX-512 extensions. Learn more via the OpenBenchmarking.org test page.

Mobile Neural Network 2.1 - Model: SqueezeNetV1.0 (ms, fewer is better)
  Without AVX-512: 3.997 (SE +/- 0.121, N = 3; Min: 3.77 / Avg: 4 / Max: 4.18) [-mno-avx512f - MIN: 3.72 / MAX: 5.69]
  Default, AVX-512 Enabled: 3.543 (SE +/- 0.015, N = 6; Min: 3.5 / Avg: 3.54 / Max: 3.59) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 3.43 / MAX: 10.23]
  (CXX) g++ options: -O3 -march=native -std=c++11 -fvisibility=hidden -fomit-frame-pointer -fstrict-aliasing -ffunction-sections -fdata-sections -ffast-math -fno-rtti -fno-exceptions -rdynamic -pthread -ldl

NCNN

NCNN is a high-performance neural network inference framework developed by Tencent and optimized for mobile and other platforms. Learn more via the OpenBenchmarking.org test page.

NCNN 20220729 - Target: CPU - Model: vision_transformer (ms, fewer is better)
  Without AVX-512: 62.74 (SE +/- 0.91, N = 3; Min: 61.82 / Avg: 62.74 / Max: 64.56) [-mno-avx512f - MIN: 61.57 / MAX: 65.21]
  Default, AVX-512 Enabled: 39.57 (SE +/- 0.03, N = 3; Min: 39.52 / Avg: 39.57 / Max: 39.61) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 39.32 / MAX: 41.99]
  (CXX) g++ options: -O3 -march=native -rdynamic -lgomp -lpthread
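
Since MNN and NCNN report latencies rather than throughput, the AVX-512 speedup is the old time divided by the new; a small Python sketch using the average latencies above:

```python
# Average inference latencies (ms) from the MNN and NCNN results above
latencies = {
    "MNN SqueezeNetV1.0": (3.997, 3.543),
    "NCNN vision_transformer": (62.74, 39.57),
}

for model, (without, with_avx512) in latencies.items():
    # For a fewer-is-better metric, speedup = old / new
    print(f"{model}: {without / with_avx512:.2f}x faster with AVX-512")
```

The vision transformer benefits far more (~1.59x) than the small SqueezeNet model (~1.13x).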

OpenVINO

This is a test of Intel OpenVINO, a toolkit for deploying neural network models, using its built-in benchmarking support to measure the throughput and latency of various models. Learn more via the OpenBenchmarking.org test page.

OpenVINO 2022.2.dev - Model: Face Detection FP16 - Device: CPU (FPS, more is better)
  Without AVX-512: 6.74 (SE +/- 0.01, N = 3; Min: 6.73 / Avg: 6.74 / Max: 6.75) [-mno-avx512f]
  Default, AVX-512 Enabled: 14.50 (SE +/- 0.01, N = 3; Min: 14.49 / Avg: 14.5 / Max: 14.52) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Face Detection FP16 - Device: CPU (ms, fewer is better)
  Without AVX-512: 1178.81 (SE +/- 0.92, N = 3; Min: 1177.18 / Avg: 1178.81 / Max: 1180.36) [-mno-avx512f - MIN: 596.16 / MAX: 1250.72]
  Default, AVX-512 Enabled: 549.94 (SE +/- 0.48, N = 3; Min: 549.03 / Avg: 549.94 / Max: 550.67) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 282.71 / MAX: 577.01]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Person Detection FP16 - Device: CPU (FPS, more is better)
  Without AVX-512: 4.20 (SE +/- 0.00, N = 3; Min: 4.2 / Avg: 4.2 / Max: 4.21) [-mno-avx512f]
  Default, AVX-512 Enabled: 7.57 (SE +/- 0.02, N = 3; Min: 7.54 / Avg: 7.57 / Max: 7.61) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Person Detection FP16 - Device: CPU (ms, fewer is better)
  Without AVX-512: 1889.40 (SE +/- 0.47, N = 3; Min: 1888.47 / Avg: 1889.4 / Max: 1889.96) [-mno-avx512f - MIN: 1032.21 / MAX: 2215.28]
  Default, AVX-512 Enabled: 1050.37 (SE +/- 2.40, N = 3; Min: 1045.68 / Avg: 1050.37 / Max: 1053.61) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 604.91 / MAX: 1254.32]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Person Detection FP32 - Device: CPU (FPS, more is better)
  Without AVX-512: 4.20 (SE +/- 0.02, N = 3; Min: 4.17 / Avg: 4.2 / Max: 4.24) [-mno-avx512f]
  Default, AVX-512 Enabled: 7.55 (SE +/- 0.02, N = 3; Min: 7.51 / Avg: 7.55 / Max: 7.58) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Person Detection FP32 - Device: CPU (ms, fewer is better)
  Without AVX-512: 1887.13 (SE +/- 5.93, N = 3; Min: 1875.5 / Avg: 1887.13 / Max: 1895.01) [-mno-avx512f - MIN: 1054.92 / MAX: 2161.5]
  Default, AVX-512 Enabled: 1054.92 (SE +/- 3.05, N = 3; Min: 1050.91 / Avg: 1054.92 / Max: 1060.91) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 641.2 / MAX: 1276.51]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Vehicle Detection FP16 - Device: CPU (FPS, more is better)
  Without AVX-512: 502.28 (SE +/- 5.20, N = 3; Min: 492.64 / Avg: 502.28 / Max: 510.47) [-mno-avx512f]
  Default, AVX-512 Enabled: 742.90 (SE +/- 8.04, N = 3; Min: 729.83 / Avg: 742.9 / Max: 757.54) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Vehicle Detection FP16 - Device: CPU (ms, fewer is better)
  Without AVX-512: 15.92 (SE +/- 0.16, N = 3; Min: 15.66 / Avg: 15.92 / Max: 16.22) [-mno-avx512f - MIN: 6.75 / MAX: 34.35]
  Default, AVX-512 Enabled: 10.76 (SE +/- 0.12, N = 3; Min: 10.55 / Avg: 10.76 / Max: 10.95) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 4.25 / MAX: 23.39]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Face Detection FP16-INT8 - Device: CPU (FPS, more is better)
  Without AVX-512: 13.60 (SE +/- 0.02, N = 3; Min: 13.58 / Avg: 13.6 / Max: 13.63) [-mno-avx512f]
  Default, AVX-512 Enabled: 28.44 (SE +/- 0.01, N = 3; Min: 28.42 / Avg: 28.44 / Max: 28.47) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Face Detection FP16-INT8 - Device: CPU (ms, fewer is better)
  Without AVX-512: 585.81 (SE +/- 0.73, N = 3; Min: 584.38 / Avg: 585.81 / Max: 586.74) [-mno-avx512f - MIN: 556.1 / MAX: 596]
  Default, AVX-512 Enabled: 280.86 (SE +/- 0.17, N = 3; Min: 280.67 / Avg: 280.86 / Max: 281.2) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 263.49 / MAX: 294.78]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Vehicle Detection FP16-INT8 - Device: CPU (FPS, more is better)
  Without AVX-512: 953.79 (SE +/- 0.65, N = 3; Min: 952.85 / Avg: 953.79 / Max: 955.03) [-mno-avx512f]
  Default, AVX-512 Enabled: 1890.25 (SE +/- 2.10, N = 3; Min: 1886.06 / Avg: 1890.25 / Max: 1892.66) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Vehicle Detection FP16-INT8 - Device: CPU (ms, fewer is better)
  Without AVX-512: 8.38 (SE +/- 0.01, N = 3; Min: 8.37 / Avg: 8.38 / Max: 8.39) [-mno-avx512f - MIN: 4.65 / MAX: 17.13]
  Default, AVX-512 Enabled: 4.23 (SE +/- 0.01, N = 3; Min: 4.22 / Avg: 4.23 / Max: 4.24) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 2.63 / MAX: 12.45]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Weld Porosity Detection FP16 - Device: CPU (FPS, more is better)
  Without AVX-512: 681.37 (SE +/- 0.52, N = 3; Min: 680.53 / Avg: 681.37 / Max: 682.32) [-mno-avx512f]
  Default, AVX-512 Enabled: 1455.45 (SE +/- 1.06, N = 3; Min: 1454.23 / Avg: 1455.45 / Max: 1457.57) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Weld Porosity Detection FP16 - Device: CPU (ms, fewer is better)
  Without AVX-512: 11.73 (SE +/- 0.01, N = 3; Min: 11.71 / Avg: 11.73 / Max: 11.74) [-mno-avx512f - MIN: 6.24 / MAX: 20.45]
  Default, AVX-512 Enabled: 5.49 (SE +/- 0.00, N = 3; Min: 5.48 / Avg: 5.49 / Max: 5.49) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 2.89 / MAX: 13.22]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Machine Translation EN To DE FP16 - Device: CPU (FPS, more is better)
  Without AVX-512: 66.03 (SE +/- 0.12, N = 3; Min: 65.87 / Avg: 66.03 / Max: 66.26) [-mno-avx512f]
  Default, AVX-512 Enabled: 134.59 (SE +/- 0.42, N = 3; Min: 133.75 / Avg: 134.59 / Max: 135.03) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Machine Translation EN To DE FP16 - Device: CPU (ms, fewer is better)
  Without AVX-512: 121.06 (SE +/- 0.21, N = 3; Min: 120.65 / Avg: 121.06 / Max: 121.33) [-mno-avx512f - MIN: 57.76 / MAX: 141.8]
  Default, AVX-512 Enabled: 59.39 (SE +/- 0.18, N = 3; Min: 59.2 / Avg: 59.39 / Max: 59.74) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 26.85 / MAX: 70.78]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (FPS, more is better)
  Without AVX-512: 1353.84 (SE +/- 2.11, N = 3; Min: 1350.81 / Avg: 1353.84 / Max: 1357.91) [-mno-avx512f]
  Default, AVX-512 Enabled: 2938.13 (SE +/- 4.82, N = 3; Min: 2933.05 / Avg: 2938.13 / Max: 2947.77) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Weld Porosity Detection FP16-INT8 - Device: CPU (ms, fewer is better)
  Without AVX-512: 11.81 (SE +/- 0.02, N = 3; Min: 11.78 / Avg: 11.81 / Max: 11.84) [-mno-avx512f - MIN: 6.9 / MAX: 19.2]
  Default, AVX-512 Enabled: 5.44 (SE +/- 0.01, N = 3; Min: 5.42 / Avg: 5.44 / Max: 5.45) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 2.88 / MAX: 13.32]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Person Vehicle Bike Detection FP16 - Device: CPU (FPS, more is better)
  Without AVX-512: 907.36 (SE +/- 1.39, N = 3; Min: 905.25 / Avg: 907.36 / Max: 909.99) [-mno-avx512f]
  Default, AVX-512 Enabled: 1696.82 (SE +/- 3.05, N = 3; Min: 1693.26 / Avg: 1696.82 / Max: 1702.9) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Person Vehicle Bike Detection FP16 - Device: CPU (ms, fewer is better)
  Without AVX-512: 8.80 (SE +/- 0.01, N = 3; Min: 8.78 / Avg: 8.8 / Max: 8.82) [-mno-avx512f - MIN: 5.34 / MAX: 19.18]
  Default, AVX-512 Enabled: 4.71 (SE +/- 0.01, N = 3; Min: 4.69 / Avg: 4.71 / Max: 4.72) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 3.46 / MAX: 12.8]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (FPS, more is better)
  Without AVX-512: 22781.70 (SE +/- 51.81, N = 3; Min: 22688.81 / Avg: 22781.7 / Max: 22867.91) [-mno-avx512f]
  Default, AVX-512 Enabled: 45555.67 (SE +/- 43.30, N = 3; Min: 45481.4 / Avg: 45555.67 / Max: 45631.37) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Age Gender Recognition Retail 0013 FP16 - Device: CPU (ms, fewer is better)
  Without AVX-512: 0.70 (SE +/- 0.00, N = 3; Min: 0.7 / Avg: 0.7 / Max: 0.7) [-mno-avx512f - MIN: 0.39 / MAX: 8.82]
  Default, AVX-512 Enabled: 0.35 (SE +/- 0.00, N = 3; Min: 0.35 / Avg: 0.35 / Max: 0.35) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 0.21 / MAX: 7.78]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (FPS, more is better)
  Without AVX-512: 38316.87 (SE +/- 43.53, N = 3; Min: 38237.58 / Avg: 38316.87 / Max: 38387.66) [-mno-avx512f]
  Default, AVX-512 Enabled: 64463.94 (SE +/- 24.16, N = 3; Min: 64433.5 / Avg: 64463.94 / Max: 64511.66) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared

OpenVINO 2022.2.dev - Model: Age Gender Recognition Retail 0013 FP16-INT8 - Device: CPU (ms, fewer is better)
  Without AVX-512: 0.41 (SE +/- 0.00, N = 3; Min: 0.41 / Avg: 0.41 / Max: 0.41) [-mno-avx512f - MIN: 0.22 / MAX: 7.87]
  Default, AVX-512 Enabled: 0.24 (SE +/- 0.00, N = 3; Min: 0.24 / Avg: 0.24 / Max: 0.24) [-mavx512f -mavx512cd -mavx512vl -mavx512bw -mavx512dq -mavx512ifma -mavx512vbmi - MIN: 0.15 / MAX: 7.55]
  (CXX) g++ options: -fPIC -O3 -march=native -fsigned-char -ffunction-sections -fdata-sections -fno-strict-overflow -fwrapv -flto -shared
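
The OpenVINO throughput gains above cluster around a consistent doubling; a minimal Python sketch recomputing a few of the percentage gains from the average FPS figures reported on this page:

```python
# (model, Without AVX-512 FPS, Default/AVX-512 Enabled FPS) from the results above
fps = [
    ("Face Detection FP16", 6.74, 14.50),
    ("Person Detection FP16", 4.20, 7.57),
    ("Vehicle Detection FP16-INT8", 953.79, 1890.25),
    ("Weld Porosity Detection FP16-INT8", 1353.84, 2938.13),
    ("Age Gender Recognition Retail 0013 FP16-INT8", 38316.87, 64463.94),
]

for name, without, with_avx512 in fps:
    # Percentage throughput gain from enabling AVX-512
    print(f"{name}: +{with_avx512 / without - 1:.0%}")
```

Most models roughly double their throughput; the Age Gender Recognition model, already tens of thousands of FPS, gains a smaller but still substantial ~68%.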

CPU Peak Freq (Highest CPU Core Frequency) Monitor

CPU Peak Freq (Highest CPU Core Frequency) Monitor - Phoronix Test Suite System Monitoring (Megahertz)
  Without AVX-512: Min: 4495 / Avg: 5334.93 / Max: 5881
  Default, AVX-512 Enabled: Min: 4421 / Avg: 5372.19 / Max: 5881

CPU Power Consumption Monitor

CPU Power Consumption Monitor - Phoronix Test Suite System Monitoring (Watts)
  Without AVX-512: Min: 10.2 / Avg: 161.11 / Max: 235.88
  Default, AVX-512 Enabled: Min: 11.53 / Avg: 158.06 / Max: 237.76

CPU Temperature Monitor

CPU Temperature Monitor - Phoronix Test Suite System Monitoring (Celsius)
  Without AVX-512: Min: 36.13 / Avg: 84.12 / Max: 96.75
  Default, AVX-512 Enabled: Min: 33.63 / Avg: 81.24 / Max: 96.25
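
Notably, per the monitor averages above, the AVX-512 run averaged slightly higher peak clocks while drawing slightly less power and running cooler. A small Python sketch computing the deltas from those run-wide averages:

```python
# Run-wide averages from the three monitor graphs above
without = {"freq_mhz": 5334.93, "watts": 161.11, "celsius": 84.12}
with_avx512 = {"freq_mhz": 5372.19, "watts": 158.06, "celsius": 81.24}

for key in without:
    # Positive delta means the AVX-512 run measured higher
    delta = with_avx512[key] - without[key]
    print(f"{key}: {delta:+.2f} with AVX-512 enabled")
```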

64 Results Shown

LeelaChessZero
simdjson:
  Kostya
  TopTweet
  LargeRand
  PartialTweets
  DistinctUserID
dav1d:
  Summer Nature 4K
  Summer Nature 1080p
Embree:
  Pathtracer ISPC - Crown
  Pathtracer ISPC - Asian Dragon
OpenVKL
OSPRay:
  particle_volume/pathtracer/real_time
  gravity_spheres_volume/dim_512/ao/real_time
  gravity_spheres_volume/dim_512/scivis/real_time
  gravity_spheres_volume/dim_512/pathtracer/real_time
oneDNN:
  IP Shapes 1D - u8s8f32 - CPU
  Convolution Batch Shapes Auto - u8s8f32 - CPU
  Deconvolution Batch shapes_1d - u8s8f32 - CPU
  Deconvolution Batch shapes_3d - u8s8f32 - CPU
  Recurrent Neural Network Training - u8s8f32 - CPU
  Recurrent Neural Network Inference - u8s8f32 - CPU
  Recurrent Neural Network Training - bf16bf16bf16 - CPU
  Recurrent Neural Network Inference - bf16bf16bf16 - CPU
OSPRay Studio:
  1 - 4K - 1 - Path Tracer
  3 - 4K - 1 - Path Tracer
  1 - 4K - 16 - Path Tracer
  1 - 4K - 32 - Path Tracer
  3 - 4K - 16 - Path Tracer
  3 - 4K - 32 - Path Tracer
Cpuminer-Opt:
  Blake-2 S
  Garlicoin
  Skeincoin
  Myriad-Groestl
  LBC, LBRY Credits
  Quad SHA-256, Pyrite
Mobile Neural Network
NCNN
OpenVINO:
  Face Detection FP16 - CPU:
    FPS
    ms
  Person Detection FP16 - CPU:
    FPS
    ms
  Person Detection FP32 - CPU:
    FPS
    ms
  Vehicle Detection FP16 - CPU:
    FPS
    ms
  Face Detection FP16-INT8 - CPU:
    FPS
    ms
  Vehicle Detection FP16-INT8 - CPU:
    FPS
    ms
  Weld Porosity Detection FP16 - CPU:
    FPS
    ms
  Machine Translation EN To DE FP16 - CPU:
    FPS
    ms
  Weld Porosity Detection FP16-INT8 - CPU:
    FPS
    ms
  Person Vehicle Bike Detection FP16 - CPU:
    FPS
    ms
  Age Gender Recognition Retail 0013 FP16 - CPU:
    FPS
    ms
  Age Gender Recognition Retail 0013 FP16-INT8 - CPU:
    FPS
    ms
CPU Peak Freq (Highest CPU Core Frequency) Monitor:
  Phoronix Test Suite System Monitoring:
    Megahertz
CPU Power Consumption Monitor:
  Phoronix Test Suite System Monitoring:
    Watts
CPU Temperature Monitor:
  Phoronix Test Suite System Monitoring:
    Celsius