EPYC 7502 AOCC 2.3 Compiler Comparison

AMD EPYC 7502 testing of various benchmarks under AMD AOCC 2.3, GCC 10.2, LLVM Clang 11. CFLAGS/CXXFLAGS of "-O3 -march=znver2" throughout. Benchmarks by Michael Larabel for a future article.

GCC 10.2

Processor: AMD EPYC 7502 32-Core @ 2.50GHz (32 Cores / 64 Threads), Motherboard: ASRockRack EPYCD8 (P2.10 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 280GB INTEL SSDPED1D280GA, Graphics: ASPEED, Audio: AMD Starship/Matisse, Monitor: VE228, Network: 2 x Intel I350

OS: Ubuntu 20.10, Kernel: 5.8.0-31-generic (x86_64), Desktop: GNOME Shell 3.38.1, Display Server: X Server 1.20.9, Display Driver: modesetting 1.20.9, Compiler: GCC 10.2.0, File-System: ext4, Screen Resolution: 1920x1080

Environment Notes: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
Compiler Notes: --build=x86_64-linux-gnu --disable-vtable-verify --disable-werror --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-gnu-unique-object --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --enable-libphobos-checking=release --enable-libstdcxx-debug --enable-libstdcxx-time=yes --enable-multiarch --enable-multilib --enable-nls --enable-objc-gc=auto --enable-offload-targets=nvptx-none=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-JvwpWM/gcc-10-10.2.0/debian/tmp-gcn/usr,hsa --enable-plugin --enable-shared --enable-threads=posix --host=x86_64-linux-gnu --program-prefix=x86_64-linux-gnu- --target=x86_64-linux-gnu --with-abi=m64 --with-arch-32=i686 --with-default-libstdcxx-abi=new --with-gcc-major-version-only --with-multilib-list=m32,m64,mx32 --with-target-system-zlib=auto --with-tune=generic --without-cuda-driver -v
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x830101c
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected

LLVM Clang 11

OS: Ubuntu 20.10, Kernel: 5.8.0-31-generic (x86_64), Desktop: GNOME Shell 3.38.1, Display Server: X Server 1.20.9, Display Driver: modesetting 1.20.9, Compiler: Clang 11.0.0-2Target:, File-System: ext4, Screen Resolution: 1920x1080

Environment Notes: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x830101c
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected

AMD AOCC 2.3

OS: Ubuntu 20.10, Kernel: 5.8.0-31-generic (x86_64), Desktop: GNOME Shell 3.38.1, Display Server: X Server 1.20.9, Display Driver: modesetting 1.20.9, Compiler: Clang 11.0.0, File-System: ext4, Screen Resolution: 1920x1080

Environment Notes: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
Compiler Notes: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver2
Processor Notes: Scaling Governor: acpi-cpufreq ondemand (Boost: Enabled) - CPU Microcode: 0x830101c
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

C-Ray

This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.

oneDNN

This is a test of the Intel oneDNN as an Intel-optimized library for Deep Neural Networks and making use of its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

dav1d

Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.

Darmstadt Automotive Parallel Heterogeneous Suite

DAPHNE is the Darmstadt Automotive Parallel HeterogeNEous Benchmark Suite with OpenCL / CUDA / OpenMP test cases for these automotive benchmarks for evaluating programming models in context to vehicle autonomous driving capabilities. Learn more via the OpenBenchmarking.org test page.

LibRaw

LibRaw is a RAW image decoder for digital camera photos. This test profile runs LibRaw's post-processing benchmark. Learn more via the OpenBenchmarking.org test page.

SVT-AV1

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-AV1 CPU-based multi-threaded video encoder for the AV1 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.

GraphicsMagick

This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test measures the RSA 4096-bit performance of OpenSSL. Learn more via the OpenBenchmarking.org test page.

Darmstadt Automotive Parallel Heterogeneous Suite

SVT-AV1

GraphicsMagick

This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

LAME MP3 Encoding

LAME is an MP3 encoder licensed under the LGPL. This test measures the time required to encode a WAV file to MP3 format. Learn more via the OpenBenchmarking.org test page.

oneDNN

TNN

TNN is an open-source deep learning reasoning framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

ASTC Encoder

ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile does a coding test of both compression/decompression. Learn more via the OpenBenchmarking.org test page.

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

GraphicsMagick

This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

ASTC Encoder

WebP Image Encode

This is a test of Google's libwebp with the cwebp image encode utility and using a sample 6000x4000 pixel JPEG image as the input. Learn more via the OpenBenchmarking.org test page.

oneDNN

AOBench

AOBench is a lightweight ambient occlusion renderer, written in C. The test profile is using a size of 2048 x 2048. Learn more via the OpenBenchmarking.org test page.

oneDNN

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

TSCP

This is a performance test of TSCP, Tom Kerrigan's Simple Chess Program, which has a built-in performance benchmark. Learn more via the OpenBenchmarking.org test page.

VP9 libvpx Encoding

This is a standard video encoding performance test of Google's libvpx library and the vpxenc command for the VP9/WebM format using a sample 1080p video. Learn more via the OpenBenchmarking.org test page.

ASTC Encoder

Basis Universal

Basis Universal is a GPU texture codoec. This test times how long it takes to convert sRGB PNGs into Basis Univeral assets with various settings. Learn more via the OpenBenchmarking.org test page.

VP9 libvpx Encoding

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

oneDNN

Darmstadt Automotive Parallel Heterogeneous Suite

Stockfish

This is a test of Stockfish, an advanced C++11 chess benchmark that can scale up to 128 CPU cores. Learn more via the OpenBenchmarking.org test page.

LZ4 Compression

This test measures the time needed to compress/decompress a sample file (an Ubuntu ISO) using LZ4 compression. Learn more via the OpenBenchmarking.org test page.

oneDNN

SVT-VP9

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-VP9 CPU-based multi-threaded video encoder for the VP9 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.

GraphicsMagick

This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.

x265

This is a simple test of the x265 encoder run on the CPU with 1080p and 4K options for H.265 video encode performance with x265. Learn more via the OpenBenchmarking.org test page.

WebP Image Encode

This is a test of Google's libwebp with the cwebp image encode utility and using a sample 6000x4000 pixel JPEG image as the input. Learn more via the OpenBenchmarking.org test page.

RNNoise

RNNoise is a recurrent neural network for audio noise reduction developed by Mozilla and Xiph.Org. This test profile is a single-threaded test measuring the time to denoise a sample 26 minute long 16-bit RAW audio file using this recurrent neural network noise suppression library. Learn more via the OpenBenchmarking.org test page.

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

PostgreSQL pgbench

This is a benchmark of PostgreSQL using pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.

Zstd Compression

This test measures the time needed to compress a sample file (an Ubuntu ISO) using Zstd compression. Learn more via the OpenBenchmarking.org test page.

SVT-VP9

LZ4 Compression

This test measures the time needed to compress/decompress a sample file (an Ubuntu ISO) using LZ4 compression. Learn more via the OpenBenchmarking.org test page.

SciMark

This test runs the ANSI C version of SciMark 2.0, which is a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. This test is made up of Fast Foruier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.

SQLite Speedtest

This is a benchmark of SQLite's speedtest1 benchmark program with an increased problem size of 1,000. Learn more via the OpenBenchmarking.org test page.

Timed MrBayes Analysis

This test performs a bayesian analysis of a set of primate genome sequences in order to estimate their phylogeny. Learn more via the OpenBenchmarking.org test page.

PostgreSQL pgbench

This is a benchmark of PostgreSQL using pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.

x264

This is a simple test of the x264 encoder run on the CPU (OpenCL support disabled) with a sample video file. Learn more via the OpenBenchmarking.org test page.

dav1d

Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.

PostgreSQL pgbench

This is a benchmark of PostgreSQL using pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.

LZ4 Compression

This test measures the time needed to compress/decompress a sample file (an Ubuntu ISO) using LZ4 compression. Learn more via the OpenBenchmarking.org test page.

SVT-VP9

x265

This is a simple test of the x265 encoder run on the CPU with 1080p and 4K options for H.265 video encode performance with x265. Learn more via the OpenBenchmarking.org test page.

PostgreSQL pgbench

This is a benchmark of PostgreSQL using pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.

libjpeg-turbo tjbench

tjbench is a JPEG decompression/compression benchmark part of libjpeg-turbo. Learn more via the OpenBenchmarking.org test page.

Crypto++

Crypto++ is a C++ class library of cryptographic algorithms. Learn more via the OpenBenchmarking.org test page.

PostgreSQL pgbench

This is a benchmark of PostgreSQL using pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.

TNN

TNN is an open-source deep learning reasoning framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

NGINX Benchmark

This is a test of ab, which is the Apache Benchmark program running against nginx. This test profile measures how many requests per second a given system can sustain when carrying out 2,000,000 requests with 500 requests being carried out concurrently. Learn more via the OpenBenchmarking.org test page.

dav1d

Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.

GraphicsMagick

This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.

dav1d

Dav1d is an open-source, speedy AV1 video decoder. This test profile times how long it takes to decode sample AV1 video content. Learn more via the OpenBenchmarking.org test page.

oneDNN

Basis Universal

Basis Universal is a GPU texture codoec. This test times how long it takes to convert sRGB PNGs into Basis Univeral assets with various settings. Learn more via the OpenBenchmarking.org test page.

PostgreSQL pgbench

This is a benchmark of PostgreSQL using pgbench for facilitating the database benchmarks. Learn more via the OpenBenchmarking.org test page.

Zstd Compression

This test measures the time needed to compress a sample file (an Ubuntu ISO) using Zstd compression. Learn more via the OpenBenchmarking.org test page.

WebP Image Encode

This is a test of Google's libwebp with the cwebp image encode utility and using a sample 6000x4000 pixel JPEG image as the input. Learn more via the OpenBenchmarking.org test page.

Hierarchical INTegration

This test runs the U.S. Department of Energy's Ames Laboratory Hierarchical INTegration (HINT) benchmark. Learn more via the OpenBenchmarking.org test page.

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

Redis

Redis is an open-source data structure server. Learn more via the OpenBenchmarking.org test page.

oneDNN

Geometric Mean Of All Test Results

Number Of First Place Finishes

Number Of Last Place Finishes

92 Results Shown

NCNN:
CPU - blazeface
CPU - mnasnet
C-Ray
oneDNN
NCNN:
CPU-v3-v3 - mobilenet-v3
CPU-v2-v2 - mobilenet-v2
CPU - efficientnet-b0
dav1d
Darmstadt Automotive Parallel Heterogeneous Suite
LibRaw
SVT-AV1
GraphicsMagick
OpenSSL
Darmstadt Automotive Parallel Heterogeneous Suite
SVT-AV1:
Enc Mode 4 - 1080p
Enc Mode 8 - 1080p
GraphicsMagick
NCNN
LAME MP3 Encoding
oneDNN:
Matrix Multiply Batch Shapes Transformer - f32 - CPU
IP Batch 1D - f32 - CPU
TNN
ASTC Encoder
NCNN:
CPU - resnet50
CPU - squeezenet
GraphicsMagick
NCNN
ASTC Encoder
WebP Image Encode
oneDNN:
Deconvolution Batch deconv_1d - f32 - CPU
IP Batch 1D - u8s8f32 - CPU
AOBench
oneDNN
NCNN
TSCP
VP9 libvpx Encoding
ASTC Encoder
Basis Universal
VP9 libvpx Encoding
NCNN
oneDNN
Darmstadt Automotive Parallel Heterogeneous Suite
Stockfish
LZ4 Compression
oneDNN
SVT-VP9
GraphicsMagick
x265
WebP Image Encode
RNNoise
NCNN
PostgreSQL pgbench
Zstd Compression
SVT-VP9
LZ4 Compression
SciMark
SQLite Speedtest
Timed MrBayes Analysis
PostgreSQL pgbench
x264
dav1d
PostgreSQL pgbench:
1 - 1 - Read Write - Average Latency
1 - 1 - Read Write
LZ4 Compression
SVT-VP9
x265
PostgreSQL pgbench
libjpeg-turbo tjbench
Crypto++
PostgreSQL pgbench
TNN
NGINX Benchmark
dav1d
GraphicsMagick
dav1d
oneDNN
Basis Universal:
UASTC Level 3
UASTC Level 2
PostgreSQL pgbench:
1 - 50 - Read Write
1 - 50 - Read Write - Average Latency
Zstd Compression
WebP Image Encode
Hierarchical INTegration
NCNN:
CPU - alexnet
CPU - shufflenet-v2
Redis:
SET
GET
LPUSH
oneDNN
Geometric Mean Of All Test Results:
Result Composite - EPYC 7502 AOCC 2.3 Compiler Comparison
Wins - 89 Tests
Losses - 89 Tests

GCC 10.2

Testing initiated at 7 December 2020 16:58 by user phoronix.

LLVM Clang 11

OS: Ubuntu 20.10, Kernel: 5.8.0-31-generic (x86_64), Desktop: GNOME Shell 3.38.1, Display Server: X Server 1.20.9, Display Driver: modesetting 1.20.9, Compiler: Clang 11.0.0-2Target:, File-System: ext4, Screen Resolution: 1920x1080

Testing initiated at 8 December 2020 06:09 by user phoronix.

AMD AOCC 2.3

OS: Ubuntu 20.10, Kernel: 5.8.0-31-generic (x86_64), Desktop: GNOME Shell 3.38.1, Display Server: X Server 1.20.9, Display Driver: modesetting 1.20.9, Compiler: Clang 11.0.0, File-System: ext4, Screen Resolution: 1920x1080

Testing initiated at 7 December 2020 10:27 by user phoronix.

EPYC 7502 AOCC 2.3 Compiler Comparison

View

Limit displaying results to tests within:

Statistics

Graph Settings

Multi-Way Comparison

Table

Run Management

GCC 10.2

LLVM Clang 11

AMD AOCC 2.3

NCNN

C-Ray

oneDNN

NCNN

dav1d

Darmstadt Automotive Parallel Heterogeneous Suite

LibRaw

SVT-AV1

GraphicsMagick

OpenSSL

Darmstadt Automotive Parallel Heterogeneous Suite

SVT-AV1

GraphicsMagick

NCNN

LAME MP3 Encoding

oneDNN

TNN

ASTC Encoder

NCNN

GraphicsMagick

NCNN

ASTC Encoder

WebP Image Encode

oneDNN

AOBench

oneDNN

NCNN

TSCP

VP9 libvpx Encoding

ASTC Encoder

Basis Universal

VP9 libvpx Encoding

NCNN

oneDNN

Darmstadt Automotive Parallel Heterogeneous Suite

Stockfish

LZ4 Compression

oneDNN

SVT-VP9

GraphicsMagick

x265

WebP Image Encode

RNNoise

NCNN

PostgreSQL pgbench

Zstd Compression

SVT-VP9

LZ4 Compression

SciMark

SQLite Speedtest

Timed MrBayes Analysis

PostgreSQL pgbench

x264

dav1d

PostgreSQL pgbench

LZ4 Compression

SVT-VP9

x265

PostgreSQL pgbench

libjpeg-turbo tjbench

Crypto++

PostgreSQL pgbench

TNN

NGINX Benchmark

dav1d

GraphicsMagick

dav1d

oneDNN

Basis Universal