GCC 11 vs. LLVM Clang 12 Benchmarks On Xeon Ice Lake

Xeon Platinum 8380 compiler benchmarks by Michael Larabel looking at GCC 11 against LLVM Clang 12 for some initial holiday weekend tests...

GCC 11.1

Processor: 2 x Intel Xeon Platinum 8380 @ 3.40GHz (80 Cores / 160 Threads), Motherboard: Intel M50CYP2SB2U (SE5C6200.86B.0022.D08.2103221623 BIOS), Chipset: Intel Device 0998, Memory: 16 x 32 GB DDR4-3200MT/s Hynix HMA84GR7CJR4N-XN, Disk: 800GB INTEL SSDPF21Q800GB, Graphics: ASPEED, Network: 2 x Intel X710 for 10GBASE-T + 2 x Intel E810-C for QSFP

OS: Fedora 34, Kernel: 5.12.6-300.fc34.x86_64 (x86_64), Compiler: GCC 11.1.1 20210428, File-System: xfs, Screen Resolution: 1024x768

Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native -flto" CFLAGS="-O3 -march=native -flto"
Compiler Notes: --build=x86_64-redhat-linux --disable-libunwind-exceptions --enable-__cxa_atexit --enable-bootstrap --enable-cet --enable-checking=release --enable-gnu-indirect-function --enable-gnu-unique-object --enable-initfini-array --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --enable-multilib --enable-offload-targets=nvptx-none --enable-plugin --enable-shared --enable-threads=posix --mandir=/usr/share/man --with-arch_32=i686 --with-gcc-major-version-only --with-linker-hash-style=gnu --with-tune=generic --without-cuda-driver
Processor Notes: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd000270
Python Notes: Python 3.9.5
Security Notes: SELinux + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

Clang 12.0

OS: Fedora 34, Kernel: 5.12.6-300.fc34.x86_64 (x86_64), Compiler: Clang 12.0.0, File-System: xfs, Screen Resolution: 1024x768

Kernel Notes: Transparent Huge Pages: madvise
Environment Notes: CXXFLAGS="-O3 -march=native -flto" CFLAGS="-O3 -march=native -flto"
Processor Notes: Scaling Governor: intel_pstate performance - CPU Microcode: 0xd000270
Python Notes: Python 3.9.5
Security Notes: SELinux + itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Enhanced IBRS IBPB: conditional RSB filling + srbds: Not affected + tsx_async_abort: Not affected

NCNN

NCNN is a high performance neural network inference framework optimized for mobile and other platforms developed by Tencent. Learn more via the OpenBenchmarking.org test page.

C-Ray

This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.
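For a sense of the workload those settings imply, the primary-ray count can be worked out directly. This is only an illustrative calculation: the variable names are hypothetical, and it ignores any secondary rays the renderer may cast.

```python
# Illustrative arithmetic only: primary rays implied by the test settings.
WIDTH, HEIGHT = 1600, 1200      # output image size from the test description
RAYS_PER_PIXEL = 8              # anti-aliasing rays per pixel from the description

pixels = WIDTH * HEIGHT
primary_rays = pixels * RAYS_PER_PIXEL

print(f"{pixels:,} pixels, {primary_rays:,} primary rays")
# prints "1,920,000 pixels, 15,360,000 primary rays"
```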

oneDNN

This is a test of Intel oneDNN, an Intel-optimized library for deep neural networks, using its built-in benchdnn functionality. The result is the total perf time reported. Intel oneDNN was formerly known as DNNL (Deep Neural Network Library) and MKL-DNN before being rebranded as part of the Intel oneAPI initiative. Learn more via the OpenBenchmarking.org test page.

OpenSSL

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test measures the RSA 4096-bit performance of OpenSSL. Learn more via the OpenBenchmarking.org test page.

TNN

TNN is an open-source deep learning reasoning framework developed by Tencent. Learn more via the OpenBenchmarking.org test page.

ASTC Encoder

ASTC Encoder (astcenc) is for the Adaptive Scalable Texture Compression (ASTC) format commonly used with OpenGL, OpenGL ES, and Vulkan graphics APIs. This test profile runs a coding test covering both compression and decompression. Learn more via the OpenBenchmarking.org test page.

oneDNN

ASTC Encoder

oneDNN

Zstd Compression

This test measures the time needed to compress/decompress a sample file (a FreeBSD disk image - FreeBSD-12.2-RELEASE-amd64-memstick.img) using Zstd compression with options for different compression levels / settings. Learn more via the OpenBenchmarking.org test page.

oneDNN

Bullet Physics Engine

This is a benchmark of the Bullet Physics Engine. Learn more via the OpenBenchmarking.org test page.

Coremark

This is a test of EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.

ASTC Encoder

GraphicsMagick

This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.

oneDNN

Liquid-DSP

LiquidSDR's Liquid-DSP is a software-defined radio (SDR) digital signal processing library. This test profile runs a multi-threaded benchmark of this SDR/DSP library focused on embedded platform usage. Learn more via the OpenBenchmarking.org test page.

oneDNN

GraphicsMagick

oneDNN

WebP2 Image Encode

This is a test of Google's libwebp2 library with the WebP2 image encode utility, using a sample 6000x4000 pixel JPEG image as the input, similar to the WebP/libwebp test profile. WebP2 is currently experimental and under heavy development as the eventual successor to WebP. Compared to WebP, WebP2 supports 10-bit HDR, more efficient lossy compression, improved lossless compression, animation support, and full multi-threading support. Learn more via the OpenBenchmarking.org test page.

Himeno Benchmark

The Himeno benchmark is a linear solver of pressure Poisson using a point-Jacobi method. Learn more via the OpenBenchmarking.org test page.
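The point-Jacobi idea can be sketched on a small 2D Poisson problem. This is a minimal illustration and not Himeno's actual kernel: the grid size, the constant source term, the tolerance, and the function names are all assumptions chosen for brevity.

```python
# Minimal point-Jacobi sketch for a 2D Poisson problem (laplacian(p) = f).
# Assumption: this is NOT Himeno's kernel; grid size, source term, and
# names are hypothetical and chosen only to illustrate the method.

def jacobi_sweep(p, f, h):
    """Return a new grid after one Jacobi sweep plus the largest change."""
    n = len(p)
    new = [row[:] for row in p]          # boundary values are copied unchanged
    max_delta = 0.0
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            # Every update reads only the previous iterate `p`.
            new[i][j] = 0.25 * (p[i - 1][j] + p[i + 1][j]
                                + p[i][j - 1] + p[i][j + 1]
                                - h * h * f[i][j])
            max_delta = max(max_delta, abs(new[i][j] - p[i][j]))
    return new, max_delta


def solve(n=17, tol=1e-6, max_iters=10_000):
    """Iterate sweeps on the unit square until the largest change is below tol."""
    h = 1.0 / (n - 1)
    p = [[0.0] * n for _ in range(n)]    # zero Dirichlet boundary values
    f = [[-1.0] * n for _ in range(n)]   # constant source term
    for it in range(1, max_iters + 1):
        p, delta = jacobi_sweep(p, f, h)
        if delta < tol:
            break
    return p, it, delta


grid, iterations, last_delta = solve()
print(f"stopped after {iterations} sweeps, last change {last_delta:.2e}")
```

Because each sweep reads only the previous iterate, all interior points can be updated independently, which is what makes this kind of kernel a common target for compiler vectorization.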

Opus Codec Encoding

Opus is an open audio codec. Opus is a lossy audio compression format designed primarily for interactive real-time applications over the Internet. This test uses Opus-Tools and measures the time required to encode a WAV file to Opus. Learn more via the OpenBenchmarking.org test page.

LAME MP3 Encoding

LAME is an MP3 encoder licensed under the LGPL. This test measures the time required to encode a WAV file to MP3 format. Learn more via the OpenBenchmarking.org test page.

oneDNN

Bullet Physics Engine

Kripke

Kripke is a simple, scalable, 3D Sn deterministic particle transport code. Its primary purpose is to research how data layout, programming paradigms, and architectures affect the implementation and performance of Sn transport. Kripke is developed by LLNL. Learn more via the OpenBenchmarking.org test page.

Bullet Physics Engine

NCNN

Bullet Physics Engine

oneDNN

x265

This is a simple test of the x265 encoder run on the CPU with 1080p and 4K options for H.265 video encode performance with x265. Learn more via the OpenBenchmarking.org test page.

WebP Image Encode

This is a test of Google's libwebp with the cwebp image encode utility and using a sample 6000x4000 pixel JPEG image as the input. Learn more via the OpenBenchmarking.org test page.

Crypto++

Crypto++ is a C++ class library of cryptographic algorithms. Learn more via the OpenBenchmarking.org test page.

WebP2 Image Encode

oneDNN

WebP2 Image Encode

TNN

AOBench

AOBench is a lightweight ambient occlusion renderer, written in C. The test profile is using a size of 2048 x 2048. Learn more via the OpenBenchmarking.org test page.

oneDNN

WebP2 Image Encode

SVT-HEVC

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-HEVC CPU-based multi-threaded video encoder for the HEVC / H.265 video format with a sample 1080p YUV video file. Learn more via the OpenBenchmarking.org test page.

Kvazaar

This is a test of Kvazaar as a CPU-based H.265 video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.

SVT-HEVC

Zstd Compression

SVT-AV1

This is a benchmark of the SVT-AV1 open-source video encoder/decoder. SVT-AV1 was originally developed by Intel as part of their Open Visual Cloud / Scalable Video Technology (SVT). Development of SVT-AV1 has since moved to the Alliance for Open Media as part of upstream AV1 development. SVT-AV1 is a CPU-based multi-threaded video encoder for the AV1 video format with a sample YUV video file. Learn more via the OpenBenchmarking.org test page.

Gcrypt Library

Libgcrypt is a general-purpose cryptographic library developed as part of the GnuPG project. This test runs libgcrypt's integrated benchmark and measures the time to run the benchmark command with the cipher/mac/hash repetition count set to 50, as a simple, high-level look at the overall crypto performance of the system under test. Learn more via the OpenBenchmarking.org test page.

Kvazaar

WebP Image Encode

Kvazaar

Zstd Compression

libjpeg-turbo tjbench

tjbench is a JPEG decompression/compression benchmark that is part of libjpeg-turbo, a JPEG image codec library optimized for SIMD instructions on modern CPU architectures. Learn more via the OpenBenchmarking.org test page.

PostgreSQL pgbench

This is a benchmark of PostgreSQL using pgbench to drive the database benchmarks. Learn more via the OpenBenchmarking.org test page.

SVT-VP9

This is a test of the Intel Open Visual Cloud Scalable Video Technology SVT-VP9 CPU-based multi-threaded video encoder for the VP9 video format with a sample YUV input video file. Learn more via the OpenBenchmarking.org test page.

oneDNN

WebP Image Encode

WebP2 Image Encode

Kvazaar

SVT-AV1

Zstd Compression

Timed MrBayes Analysis

This test performs a Bayesian analysis of a set of primate genome sequences in order to estimate their phylogeny. Learn more via the OpenBenchmarking.org test page.

SVT-VP9

eSpeak-NG Speech Engine

This test times how long it takes the eSpeak speech synthesizer to read Project Gutenberg's The Outline of Science and output to a WAV file. This test profile is now tracking the eSpeak-NG version of eSpeak. Learn more via the OpenBenchmarking.org test page.

SVT-VP9

SVT-AV1

WebP Image Encode

Zstd Compression

Liquid-DSP

WebP Image Encode

SVT-AV1

Primesieve

Primesieve generates prime numbers using a highly optimized sieve of Eratosthenes implementation. Primesieve benchmarks the CPU's L1/L2 cache performance. Learn more via the OpenBenchmarking.org test page.
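The core idea behind a sieve of Eratosthenes fits in a few lines. This is a textbook version for illustration, not primesieve's code: the function name `primes_up_to` is hypothetical, and none of the library's optimizations are reproduced here.

```python
# Textbook sieve of Eratosthenes. Assumption: illustrative only; primesieve's
# own highly optimized implementation is not reproduced here.

def primes_up_to(limit):
    """Return all primes <= limit."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross off multiples starting at p*p; smaller ones are already marked.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


print(primes_up_to(50))
# prints [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```

The flag array is swept repeatedly with a fixed stride, so for large limits the access pattern is dominated by memory behavior, which is consistent with the description's note about cache performance.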

oneDNN

Zstd Compression

FLAC Audio Encoding

This test times how long it takes to encode a sample WAV file to FLAC format five times. Learn more via the OpenBenchmarking.org test page.

x265

oneDNN

Caffe

This is a benchmark of the Caffe deep learning framework. It currently supports the AlexNet and GoogleNet models and execution on both CPUs and NVIDIA GPUs. Learn more via the OpenBenchmarking.org test page.

SVT-HEVC

WavPack Audio Encoding

This test times how long it takes to encode a sample WAV file to WavPack format with very high quality settings. Learn more via the OpenBenchmarking.org test page.

oneDNN

Darmstadt Automotive Parallel Heterogeneous Suite

DAPHNE is the Darmstadt Automotive Parallel HeterogeNEous Benchmark Suite, with OpenCL / CUDA / OpenMP test cases for evaluating programming models in the context of autonomous driving capabilities. Learn more via the OpenBenchmarking.org test page.

GraphicsMagick

GNU GMP GMPbench

GMPbench is a test of the GNU Multiple Precision Arithmetic (GMP) Library. GMPbench is a single-threaded integer benchmark that leverages the GMP library to stress the CPU with widening integer multiplication. Learn more via the OpenBenchmarking.org test page.
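"Widening" multiplication means the product of two N-bit operands is kept in full at 2N bits. The sketch below builds a 64x64 -> 128-bit multiply from 32-bit halves to show the carry handling involved. It is an illustration only: the helper name `widening_mul_64` is hypothetical, and GMP's own optimized routines are not reproduced here.

```python
# Sketch of a 64x64 -> 128-bit widening multiply built from 32-bit halves.
# Assumption: illustrative only; the helper name is hypothetical and this is
# not how GMP implements its multiplication routines.

MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def widening_mul_64(a, b):
    """Return (high, low) 64-bit words of the 128-bit product a * b."""
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32

    # Four 32x32 -> 64-bit partial products.
    lo_lo = a_lo * b_lo
    hi_lo = a_hi * b_lo
    lo_hi = a_lo * b_hi
    hi_hi = a_hi * b_hi

    # Sum the middle column; anything above 32 bits carries into the high word.
    mid = (lo_lo >> 32) + (hi_lo & MASK32) + (lo_hi & MASK32)
    low = ((mid & MASK32) << 32) | (lo_lo & MASK32)
    high = hi_hi + (hi_lo >> 32) + (lo_hi >> 32) + (mid >> 32)
    return high & MASK64, low


a, b = MASK64, MASK64                     # largest 64-bit operands
high, low = widening_mul_64(a, b)
assert (high << 64) | low == a * b        # cross-check against Python's big ints
```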

NCNN

PostgreSQL pgbench

GraphicsMagick

113 Results Shown

NCNN:
    CPU - blazeface
    CPU-v3-v3 - mobilenet-v3
    CPU-v2-v2 - mobilenet-v2
C-Ray
oneDNN:
    Matrix Multiply Batch Shapes Transformer - u8s8f32 - CPU
    Deconvolution Batch shapes_1d - u8s8f32 - CPU
    Matrix Multiply Batch Shapes Transformer - f32 - CPU
OpenSSL
TNN
ASTC Encoder
oneDNN
ASTC Encoder
oneDNN:
    IP Shapes 3D - u8s8f32 - CPU
    Recurrent Neural Network Inference - f32 - CPU
    Recurrent Neural Network Inference - u8s8f32 - CPU
    Recurrent Neural Network Inference - bf16bf16bf16 - CPU
Zstd Compression
oneDNN
Bullet Physics Engine
Coremark
ASTC Encoder
GraphicsMagick
oneDNN:
    Matrix Multiply Batch Shapes Transformer - bf16bf16bf16 - CPU
    Recurrent Neural Network Training - f32 - CPU
    Recurrent Neural Network Training - bf16bf16bf16 - CPU
Liquid-DSP
oneDNN
GraphicsMagick
oneDNN:
    IP Shapes 3D - f32 - CPU
    Deconvolution Batch shapes_1d - bf16bf16bf16 - CPU
WebP2 Image Encode
Himeno Benchmark
Opus Codec Encoding
LAME MP3 Encoding
oneDNN
Bullet Physics Engine
Kripke
Bullet Physics Engine
NCNN
Bullet Physics Engine:
    1000 Convex
    Prim Trimesh
    Convex Trimesh
oneDNN
x265
WebP Image Encode
Crypto++
WebP2 Image Encode
oneDNN
WebP2 Image Encode
TNN
AOBench
oneDNN
WebP2 Image Encode
SVT-HEVC
Kvazaar
SVT-HEVC
Zstd Compression
SVT-AV1
Gcrypt Library
Kvazaar
WebP Image Encode
Kvazaar
Zstd Compression
libjpeg-turbo tjbench
PostgreSQL pgbench:
    100 - 250 - Read Write
    100 - 250 - Read Write - Average Latency
SVT-VP9
oneDNN
WebP Image Encode
WebP2 Image Encode
Kvazaar
SVT-AV1
Zstd Compression
Timed MrBayes Analysis
SVT-VP9
eSpeak-NG Speech Engine
SVT-VP9
SVT-AV1
WebP Image Encode
Zstd Compression:
    19 - Decompression Speed
    19, Long Mode - Decompression Speed
Liquid-DSP
WebP Image Encode
SVT-AV1
Primesieve
oneDNN
Zstd Compression:
    8 - Decompression Speed
    8, Long Mode - Decompression Speed
FLAC Audio Encoding
x265
oneDNN
Caffe:
    AlexNet - CPU - 200
    GoogleNet - CPU - 200
SVT-HEVC
WavPack Audio Encoding
oneDNN
Darmstadt Automotive Parallel Heterogeneous Suite:
    OpenMP - Euclidean Cluster
    OpenMP - Points2Image
    OpenMP - NDT Mapping
GraphicsMagick
GNU GMP GMPbench
NCNN:
    CPU - regnety_400m
    CPU - squeezenet_ssd
    CPU - resnet18
    CPU - vgg16
    CPU - googlenet
    CPU - efficientnet-b0
    CPU - mnasnet
    CPU - shufflenet-v2
    CPU - mobilenet
PostgreSQL pgbench:
    100 - 250 - Read Only - Average Latency
    100 - 250 - Read Only
GraphicsMagick

GCC 11.1

OS: Fedora 34, Kernel: 5.12.6-300.fc34.x86_64 (x86_64), Compiler: GCC 11.1.1 20210428, File-System: xfs, Screen Resolution: 1024x768

Testing initiated at 28 May 2021 11:02.

Clang 12.0

OS: Fedora 34, Kernel: 5.12.6-300.fc34.x86_64 (x86_64), Compiler: Clang 12.0.0, File-System: xfs, Screen Resolution: 1024x768

Testing initiated at 28 May 2021 19:40.
