GCC AMD Znver3 Compiler Optimization Levels

Benchmarks for a future article.

-O2 -march=x86-64

Environment Notes: CXXFLAGS="-O2 -march=x86-64" CFLAGS="-O2 -march=x86-64"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa201009
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
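The Security Notes line is a " + "-separated list of "name: status" pairs, the same names the Linux kernel exposes under /sys/devices/system/cpu/vulnerabilities/. As a rough sketch (the function names here are hypothetical, not part of any benchmarking tool), the line can be split into a dictionary to see at a glance which mitigations are active:

```python
# Sketch only: parse_security_notes and active_mitigations are illustrative
# helper names, not functions from the Phoronix Test Suite.
SECURITY_NOTES = (
    "itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + "
    "meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via "
    "prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and "
    "__user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline "
    "IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected "
    "+ tsx_async_abort: Not affected"
)


def parse_security_notes(notes):
    """Split the notes into {vulnerability_name: status_text}."""
    entries = {}
    for item in notes.split(" + "):
        # Only the first ": " separates name from status; later colons
        # (e.g. "IBPB: conditional") belong to the status text.
        name, _, status = item.partition(": ")
        entries[name.strip()] = status.strip()
    return entries


def active_mitigations(entries):
    """Return the names whose status is anything other than 'Not affected'."""
    return [name for name, status in entries.items() if status != "Not affected"]


if __name__ == "__main__":
    parsed = parse_security_notes(SECURITY_NOTES)
    for name in active_mitigations(parsed):
        print(name, "->", parsed[name])
```

Since the Security Notes are identical for every run on this page, mitigation state should not be a variable in the comparison.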

-O3 -march=x86-64

Environment Notes: CXXFLAGS="-O3 -march=x86-64" CFLAGS="-O3 -march=x86-64"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa201009
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected

-O3 -march=znver2

Environment Notes: CXXFLAGS="-O3 -march=znver2" CFLAGS="-O3 -march=znver2"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa201009
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected

-O2 -march=znver3

Environment Notes: CXXFLAGS="-O2 -march=znver3" CFLAGS="-O2 -march=znver3"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa201009
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected

-O3 -march=znver3

Environment Notes: CXXFLAGS="-O3 -march=znver3" CFLAGS="-O3 -march=znver3"
Compiler Notes: --disable-multilib --enable-checking=release
Disk Notes: NONE / errors=remount-ro,relatime,rw / Block Size: 4096
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa201009
Python Notes: Python 2.7.18 + Python 3.8.5
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected

-O2 -march=znver3 -flto

Environment Notes: CXXFLAGS="-O2 -march=znver3 -flto" CFLAGS="-O2 -march=znver3 -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa201009
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected

-O3 -march=znver3 -flto

Environment Notes: CXXFLAGS="-O3 -march=znver3 -flto" CFLAGS="-O3 -march=znver3 -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa201009
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected

-Ofast -march=znver3 -flto

Processor: AMD Ryzen 9 5950X 16-Core @ 3.40GHz (16 Cores / 32 Threads), Motherboard: ASUS ROG CROSSHAIR VIII HERO (WI-FI) (3003 BIOS), Chipset: AMD Starship/Matisse, Memory: 16GB, Disk: 2000GB Corsair Force MP600, Graphics: AMD Radeon RX 5600 OEM/5600 XT / 5700/5700 8GB (2100/875MHz), Audio: AMD Navi 10 HDMI Audio, Monitor: ASUS MG28U, Network: Realtek RTL8125 2.5GbE + Intel I211 + Intel Wi-Fi 6 AX200

OS: Ubuntu 20.04, Kernel: 5.10.0-051000rc6daily20201205-generic (x86_64) 20201204, Desktop: GNOME Shell 3.36.4, Display Server: X Server 1.20.8, Display Driver: modesetting 1.20.8, OpenGL: 4.6 Mesa 21.0.0-devel (git-1a53572 2020-12-09 focal-oibaf-ppa) (LLVM 11.0.0), Vulkan: 1.2.145, Compiler: GCC 11.0.0 20201213, File-System: ext4, Screen Resolution: 3840x2160

Environment Notes: CXXFLAGS="-Ofast -march=znver3 -flto" CFLAGS="-Ofast -march=znver3 -flto"
Compiler Notes: --disable-multilib --enable-checking=release
Processor Notes: Scaling Governor: acpi-cpufreq performance (Boost: Enabled) - CPU Microcode: 0xa201009
Security Notes: itlb_multihit: Not affected + l1tf: Not affected + mds: Not affected + meltdown: Not affected + spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp + spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization + spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling + srbds: Not affected + tsx_async_abort: Not affected
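Each run above differs only in the CFLAGS/CXXFLAGS passed to the build. A minimal sketch of how such a flag matrix could be represented and turned into a build environment follows; the helper names (build_env, describe) are hypothetical and this is not how the Phoronix Test Suite itself is implemented:

```python
import os

# The eight flag sets compared on this page, in listing order.
FLAG_SETS = [
    "-O2 -march=x86-64",
    "-O3 -march=x86-64",
    "-O3 -march=znver2",
    "-O2 -march=znver3",
    "-O3 -march=znver3",
    "-O2 -march=znver3 -flto",
    "-O3 -march=znver3 -flto",
    "-Ofast -march=znver3 -flto",
]


def build_env(flags, base_env=None):
    """Return a copy of the environment with CFLAGS and CXXFLAGS set to `flags`."""
    env = dict(os.environ if base_env is None else base_env)
    env["CFLAGS"] = flags
    env["CXXFLAGS"] = flags
    return env


def describe(flags):
    """Break a flag string into optimization level, target and LTO use."""
    parts = flags.split()
    opt = next(p for p in parts if p.startswith("-O"))
    march = next(p.split("=", 1)[1] for p in parts if p.startswith("-march="))
    return {"opt": opt, "march": march, "lto": "-flto" in parts}


if __name__ == "__main__":
    for flags in FLAG_SETS:
        print(describe(flags))
```

The returned dictionary could be passed as the env argument of subprocess.run when invoking a build, so that each configuration is compiled in isolation.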

FFTW

FFTW is a C subroutine library for computing the discrete Fourier transform (DFT) in one or more dimensions. Learn more via the OpenBenchmarking.org test page.
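For reference, the transform FFTW computes can be written as a direct O(n^2) sum. The sketch below is a naive pure-Python version using the unnormalized forward convention (negative exponent); it only illustrates the definition and bears no relation to FFTW's optimized algorithms:

```python
import cmath


def naive_dft(x):
    """Unnormalized forward DFT: Y[k] = sum_j x[j] * exp(-2*pi*i*j*k/n)."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


if __name__ == "__main__":
    # A unit impulse transforms to a flat spectrum of ones.
    print(naive_dft([1, 0, 0, 0]))
```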

Timed MrBayes Analysis

This test performs a Bayesian analysis of a set of primate genome sequences in order to estimate their phylogeny. Learn more via the OpenBenchmarking.org test page.

Timed HMMer Search

This test searches through the Pfam database of profile hidden Markov models. The search finds the domain structure of the Drosophila Sevenless protein. Learn more via the OpenBenchmarking.org test page.

GraphicsMagick

This is a test of GraphicsMagick with its OpenMP implementation that performs various imaging tests on a sample 6000x4000 pixel JPEG image. Learn more via the OpenBenchmarking.org test page.

AOBench

AOBench is a lightweight ambient occlusion renderer, written in C. The test profile is using a size of 2048 x 2048. Learn more via the OpenBenchmarking.org test page.

SciMark

This test runs the ANSI C version of SciMark 2.0, a benchmark for scientific and numerical computing developed by programmers at the National Institute of Standards and Technology. It consists of Fast Fourier Transform, Jacobi Successive Over-relaxation, Monte Carlo, Sparse Matrix Multiply, and dense LU matrix factorization benchmarks. Learn more via the OpenBenchmarking.org test page.
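To give a sense of what one of these kernels does, here is a rough sketch of a successive over-relaxation sweep on a 2D grid. It is loosely modeled on the usual five-point stencil formulation; the relaxation factor of 1.25 and the function name are assumptions for illustration, not taken from the SciMark source:

```python
def sor_sweep(grid, omega=1.25):
    """Run one in-place SOR sweep over the interior points of a 2D grid.

    Each interior value becomes a weighted blend of its four neighbours'
    average and its previous value. omega=1.25 is an assumed default.
    """
    rows, cols = len(grid), len(grid[0])
    omega_over_four = omega * 0.25
    one_minus_omega = 1.0 - omega
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            neighbours = (
                grid[i - 1][j] + grid[i + 1][j] + grid[i][j - 1] + grid[i][j + 1]
            )
            grid[i][j] = omega_over_four * neighbours + one_minus_omega * grid[i][j]
    return grid


if __name__ == "__main__":
    # Hot top edge, cold everywhere else; repeated sweeps diffuse the heat inward.
    g = [[1.0] * 6] + [[0.0] * 6 for _ in range(5)]
    for _ in range(50):
        sor_sweep(g)
    print([round(v, 3) for v in g[1]])
```

Tight, regular loops like this are the kind of code where the choice of optimization level and -march target would be expected to matter most.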

ACES DGEMM

This is a multi-threaded DGEMM benchmark. Learn more via the OpenBenchmarking.org test page.
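DGEMM is the BLAS double-precision general matrix multiply, C = alpha*A*B + beta*C. A naive single-threaded sketch of the operation is shown below; it only illustrates the arithmetic and says nothing about how the ACES benchmark itself is threaded or blocked:

```python
def naive_dgemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices given as lists of rows."""
    m, k, n = len(a), len(b), len(b[0])
    result = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            result[i][j] = alpha * acc + beta * c[i][j]
    return result


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    c = [[1.0, 1.0], [1.0, 1.0]]
    print(naive_dgemm(1.0, a, b, 0.0, c))
```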

Timed ImageMagick Compilation

This test times how long it takes to build ImageMagick. Learn more via the OpenBenchmarking.org test page.

Kvazaar

This is a test of Kvazaar as a CPU-based H.265 video encoder written in the C programming language and optimized in Assembly. Kvazaar is the winner of the 2016 ACM Open-Source Software Competition and developed at the Ultra Video Group, Tampere University, Finland. Learn more via the OpenBenchmarking.org test page.

C-Ray

This is a test of C-Ray, a simple raytracer designed to test the floating-point CPU performance. This test is multi-threaded (16 threads per core), will shoot 8 rays per pixel for anti-aliasing, and will generate a 1600 x 1200 image. Learn more via the OpenBenchmarking.org test page.

Coremark

This is a test of EEMBC CoreMark processor benchmark. Learn more via the OpenBenchmarking.org test page.

LibRaw

LibRaw is a RAW image decoder for digital camera photos. This test profile runs LibRaw's post-processing benchmark. Learn more via the OpenBenchmarking.org test page.

Darmstadt Automotive Parallel Heterogeneous Suite

DAPHNE is the Darmstadt Automotive Parallel HeterogeNEous Benchmark Suite. It provides OpenCL / CUDA / OpenMP test cases of automotive workloads for evaluating programming models in the context of autonomous driving capabilities. Learn more via the OpenBenchmarking.org test page.

AOM AV1

This is a simple test of the AOMedia AV1 encoder run on the CPU with a sample video file. Learn more via the OpenBenchmarking.org test page.

Smallpt

Smallpt is a C++ global illumination renderer written in less than 100 lines of code. Global illumination is done via unbiased Monte Carlo path tracing and there is multi-threading support via the OpenMP library. Learn more via the OpenBenchmarking.org test page.

-O2 -march=x86-64

Testing initiated at 15 December 2020 20:25 by user phoronix.

-O3 -march=x86-64

Testing initiated at 14 December 2020 06:00 by user phoronix.

-O3 -march=znver2

Testing initiated at 13 December 2020 20:13 by user phoronix.

-O2 -march=znver3

Testing initiated at 15 December 2020 18:24 by user phoronix.

-O3 -march=znver3

Testing initiated at 13 December 2020 11:18 by user phoronix.

-O2 -march=znver3 -flto

Testing initiated at 15 December 2020 11:57 by user phoronix.

-O3 -march=znver3 -flto

Testing initiated at 14 December 2020 17:26 by user phoronix.

-Ofast -march=znver3 -flto

Testing initiated at 15 December 2020 05:44 by user phoronix.
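The start times above show that the runs were not executed in the order they are listed. As a small sketch (the dictionary and function names are hypothetical), the timestamps can be parsed and sorted to recover the actual execution order:

```python
from datetime import datetime

# Start times as listed on this page. The seventh label is written with
# -march, matching the CFLAGS recorded for that run.
RUN_STARTS = {
    "-O2 -march=x86-64": "15 December 2020 20:25",
    "-O3 -march=x86-64": "14 December 2020 06:00",
    "-O3 -march=znver2": "13 December 2020 20:13",
    "-O2 -march=znver3": "15 December 2020 18:24",
    "-O3 -march=znver3": "13 December 2020 11:18",
    "-O2 -march=znver3 -flto": "15 December 2020 11:57",
    "-O3 -march=znver3 -flto": "14 December 2020 17:26",
    "-Ofast -march=znver3 -flto": "15 December 2020 05:44",
}


def chronological(runs):
    """Return run labels sorted by their start time (English month names assumed)."""
    return sorted(
        runs, key=lambda label: datetime.strptime(runs[label], "%d %B %Y %H:%M")
    )


if __name__ == "__main__":
    for label in chronological(RUN_STARTS):
        print(RUN_STARTS[label], label)
```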
