AMD AOCC 4.0 Compiler Benchmarks

Initial AOCC 4.0 compiler benchmarks by Michael Larabel.

HTML result view exported from: https://openbenchmarking.org/result/2211114-PTS-AMDAOCC455&sor&grw.

AMD AOCC 4.0 Compiler Benchmarks: System Details

Configurations compared: LLVM Clang 14, AOCC 4.0

Processor: AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG CROSSHAIR X670E HERO (0703 BIOS)
Chipset: AMD Device 14d8
Memory: 32GB
Disk: 1000GB Sabrent Rocket 4.0 Plus
Graphics: AMD Radeon RX 6800 16GB (2475/1000MHz)
Audio: AMD Navi 21/23
Monitor: ASUS MG28U
Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
OS: Ubuntu 22.10
Kernel: 6.1.0-060100rc3daily20221103-generic (x86_64)
Desktop: GNOME Shell 43.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 22.2.1 (LLVM 15.0.2 DRM 3.49)
Vulkan: 1.3.224
Compiler: Clang 14.0.6-2 (LLVM Clang 14); Clang 14.0.6 (AOCC 4.0)
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"

Processor Details
- Scaling Governor: amd-pstate performance (Boost: Enabled)
- CPU Microcode: 0xa601203

Python Details
- Python 3.10.7

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

Compiler Details
- AOCC 4.0: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver4

AMD AOCC 4.0 Compiler Benchmarks: Results Overview

Test | LLVM Clang 14 | AOCC 4.0
astcenc: Medium | 156.1963 | 162.1707
astcenc: Thorough | 19.2443 | 19.8628
astcenc: Exhaustive | 1.9695 | 1.9979
draco: Church Facade | 4704 | 4705
webp: Quality 100, Lossless | 2.01 | 2.15
webp: Quality 100, Highest Compression | 5.89 | 6.21
webp: Quality 100, Lossless, Highest Compression | 0.82 | 0.87
xmrig: Monero - 1M | 12543.4 | 12571.9
quantlib | 4535.6 | 4475.3
lczero: BLAS | 1694 | 1755
lczero: Eigen | 1844 | 1959
compress-zstd: 19 - Compression Speed | 78.9 | 79.2
compress-zstd: 19 - Decompression Speed | 4941.6 | 4981.1
compress-zstd: 19, Long Mode - Compression Speed | 55.1 | 55.4
compress-zstd: 19, Long Mode - Decompression Speed | 4966.3 | 5017.2
kvazaar: Bosphorus 4K - Very Fast | 48.07 | 49.06
kvazaar: Bosphorus 4K - Ultra Fast | 87.37 | 88.76
graphics-magick: Sharpen | 337 | 460
graphics-magick: Enhanced | 581 | 593
graphics-magick: Resizing | 2470 | 2474
svt-vp9: VMAF Optimized - Bosphorus 4K | 111.30 | 112.09
svt-vp9: PSNR/SSIM Optimized - Bosphorus 4K | 120.22 | 120.86
svt-vp9: Visual Quality Optimized - Bosphorus 4K | 109.73 | 110.24
dav1d: Summer Nature 4K | 393.37 | 394.20
dav1d: Chimera 1080p 10-bit | 825.00 | 821.26
svt-av1: Preset 10 - Bosphorus 4K | 139.029 | 143.901
svt-av1: Preset 12 - Bosphorus 4K | 200.787 | 209.400
x265: Bosphorus 4K | 36.91 | 38.12
c-ray: Total Time - 4K, 16 Rays Per Pixel | 27.097 | 26.974
svt-hevc: 1 - Bosphorus 4K | 6.43 | 6.43
svt-hevc: 7 - Bosphorus 4K | 104.05 | 107.09
svt-hevc: 10 - Bosphorus 4K | 169.39 | 169.54
avifenc: 2 | 33.635 | 32.284
avifenc: 6 | 3.080 | 2.971
avifenc: 6, Lossless | 4.650 | 4.504
quadray: 1 - 4K | 26.23 | 26.44
quadray: 1 - 1080p | 102.12 | 101.56
liquid-dsp: 16 - 256 - 57 | 1666100000 | 1742933333 (32 threads: see next row)
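To read the overview as relative differences rather than raw scores, the following is a minimal sketch that turns a pair of results into a percentage. The function name `relative_gain` and the `SAMPLES` list are illustrative and not part of the Phoronix Test Suite; the numbers are copied from the results in this export. For "Fewer Is Better" tests (times in seconds or milliseconds) the ratio is inverted so that a positive value always means AOCC 4.0 is ahead.

```python
def relative_gain(aocc: float, clang: float, higher_is_better: bool = True) -> float:
    """Percent advantage of AOCC 4.0 over LLVM Clang 14; positive means AOCC is ahead."""
    # For time-based results a smaller number is better, so flip the ratio.
    ratio = aocc / clang if higher_is_better else clang / aocc
    return (ratio - 1.0) * 100.0


# (test, LLVM Clang 14 result, AOCC 4.0 result, higher_is_better)
# Values taken from this result file; the selection is arbitrary.
SAMPLES = [
    ("graphics-magick: Sharpen", 337, 460, True),
    ("lczero: Eigen", 1844, 1959, True),
    ("quantlib", 4535.6, 4475.3, True),
    ("avifenc: 2", 33.635, 32.284, False),
]

if __name__ == "__main__":
    for name, clang, aocc, higher_is_better in SAMPLES:
        gain = relative_gain(aocc, clang, higher_is_better)
        print(f"{name}: {gain:+.1f}%")
```

Inverting the ratio for time-based tests is one of several reasonable conventions; a plain percentage change in seconds would give slightly different figures.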

ASTC Encoder

Preset: Medium

ASTC Encoder 4.0 - Preset: Medium (MT/s, More Is Better)
AOCC 4.0: 162.17 (SE +/- 0.83, N = 3)
LLVM Clang 14: 156.20 (SE +/- 0.85, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -pthread

ASTC Encoder

Preset: Thorough

ASTC Encoder 4.0 - Preset: Thorough (MT/s, More Is Better)
AOCC 4.0: 19.86 (SE +/- 0.05, N = 3)
LLVM Clang 14: 19.24 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -pthread

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 4.0 - Preset: Exhaustive (MT/s, More Is Better)
AOCC 4.0: 1.9979 (SE +/- 0.0025, N = 3)
LLVM Clang 14: 1.9695 (SE +/- 0.0036, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -pthread

Google Draco

Model: Church Facade

Google Draco 1.5.0 - Model: Church Facade (ms, Fewer Is Better)
LLVM Clang 14: 4704 (SE +/- 19.68, N = 3)
AOCC 4.0: 4705 (SE +/- 36.11, N = 3)
1. (CXX) g++ options: -O3 -march=native
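Each result in this export carries a standard error (SE) and a run count (N), which helps judge whether a gap between the two compilers is larger than run-to-run variation. The sketch below applies a rough rule of thumb: treat a difference as noise if it is within `k` combined standard errors. The helper `within_noise` and the default `k = 2` are assumptions made for illustration. This is not a formal significance test and not something the Phoronix Test Suite reports.

```python
import math


def within_noise(a: float, se_a: float, b: float, se_b: float, k: float = 2.0) -> bool:
    """Return True if |a - b| is no larger than k combined standard errors."""
    # Combine the two standard errors as if the runs were independent.
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(a - b) <= k * combined_se


if __name__ == "__main__":
    # Google Draco, Church Facade: 4704 ms vs 4705 ms (values from this export).
    print("draco:", within_noise(4704, 19.68, 4705, 36.11))
    # GraphicsMagick, Sharpen: 460 vs 337 iterations per minute (values from this export).
    print("graphics-magick sharpen:", within_noise(460, 0.33, 337, 0.33))
```

With N = 3 for most tests the SE estimates are themselves coarse, so the output is best read as a hint rather than a verdict.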

WebP Image Encode

Encode Settings: Quality 100, Lossless

WebP Image Encode 1.2.4 - Encode Settings: Quality 100, Lossless (MP/s, More Is Better)
AOCC 4.0: 2.15 (SE +/- 0.01, N = 3)
LLVM Clang 14: 2.01 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -lm

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.2.4 - Encode Settings: Quality 100, Highest Compression (MP/s, More Is Better)
AOCC 4.0: 6.21 (SE +/- 0.01, N = 3)
LLVM Clang 14: 5.89 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -lm

WebP Image Encode

Encode Settings: Quality 100, Lossless, Highest Compression

WebP Image Encode 1.2.4 - Encode Settings: Quality 100, Lossless, Highest Compression (MP/s, More Is Better)
AOCC 4.0: 0.87 (SE +/- 0.00, N = 3)
LLVM Clang 14: 0.82 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -lm

Xmrig

Variant: Monero - Hash Count: 1M

Xmrig 6.18.1 - Variant: Monero - Hash Count: 1M (H/s, More Is Better)
AOCC 4.0: 12571.9 (SE +/- 26.06, N = 3)
LLVM Clang 14: 12543.4 (SE +/- 30.70, N = 3)
1. (CXX) g++ options: -O3 -march=native -fexceptions -fno-rtti -maes -Ofast -funroll-loops -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

QuantLib

QuantLib 1.21 (MFLOPS, More Is Better)
LLVM Clang 14: 4535.6 (SE +/- 44.47, N = 12)
AOCC 4.0: 4475.3 (SE +/- 32.16, N = 12)
1. (CXX) g++ options: -O3 -march=native -rdynamic

LeelaChessZero

Backend: BLAS

LeelaChessZero 0.28 - Backend: BLAS (Nodes Per Second, More Is Better)
AOCC 4.0: 1755 (SE +/- 19.19, N = 4)
LLVM Clang 14: 1694 (SE +/- 7.42, N = 3)
1. (CXX) g++ options: -flto -O3 -march=native -pthread

LeelaChessZero

Backend: Eigen

LeelaChessZero 0.28 - Backend: Eigen (Nodes Per Second, More Is Better)
AOCC 4.0: 1959 (SE +/- 7.67, N = 3)
LLVM Clang 14: 1844 (SE +/- 8.95, N = 3)
1. (CXX) g++ options: -flto -O3 -march=native -pthread

Zstd Compression

Compression Level: 19 - Compression Speed

Zstd Compression 1.5.0 - Compression Level: 19 - Compression Speed (MB/s, More Is Better)
AOCC 4.0: 79.2 (SE +/- 0.28, N = 3)
LLVM Clang 14: 78.9 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Decompression Speed

Zstd Compression 1.5.0 - Compression Level: 19 - Decompression Speed (MB/s, More Is Better)
AOCC 4.0: 4981.1 (SE +/- 26.61, N = 3)
LLVM Clang 14: 4941.6 (SE +/- 59.66, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Compression Speed (MB/s, More Is Better)
AOCC 4.0: 55.4 (SE +/- 0.06, N = 3)
LLVM Clang 14: 55.1 (SE +/- 0.17, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

Zstd Compression 1.5.0 - Compression Level: 19, Long Mode - Decompression Speed (MB/s, More Is Better)
AOCC 4.0: 5017.2 (SE +/- 14.78, N = 3)
LLVM Clang 14: 4966.3 (SE +/- 4.60, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Very Fast

Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Very Fast (Frames Per Second, More Is Better)
AOCC 4.0: 49.06 (SE +/- 0.36, N = 3)
LLVM Clang 14: 48.07 (SE +/- 0.32, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Ultra Fast

Kvazaar 2.1 - Video Input: Bosphorus 4K - Video Preset: Ultra Fast (Frames Per Second, More Is Better)
AOCC 4.0: 88.76 (SE +/- 0.04, N = 3)
LLVM Clang 14: 87.37 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.38 - Operation: Sharpen (Iterations Per Minute, More Is Better)
AOCC 4.0: 460 (SE +/- 0.33, N = 3)
LLVM Clang 14: 337 (SE +/- 0.33, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=native -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.38 - Operation: Enhanced (Iterations Per Minute, More Is Better)
AOCC 4.0: 593 (SE +/- 0.33, N = 3)
LLVM Clang 14: 581 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=native -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.38 - Operation: Resizing (Iterations Per Minute, More Is Better)
AOCC 4.0: 2474 (SE +/- 3.51, N = 3)
LLVM Clang 14: 2470 (SE +/- 0.88, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=native -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

SVT-VP9

Tuning: VMAF Optimized - Input: Bosphorus 4K

SVT-VP9 0.3 - Tuning: VMAF Optimized - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AOCC 4.0: 112.09 (SE +/- 1.05, N = 15)
LLVM Clang 14: 111.30 (SE +/- 0.93, N = 15)
1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: PSNR/SSIM Optimized - Input: Bosphorus 4K

SVT-VP9 0.3 - Tuning: PSNR/SSIM Optimized - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AOCC 4.0: 120.86 (SE +/- 0.23, N = 3)
LLVM Clang 14: 120.22 (SE +/- 0.09, N = 3)
1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: Visual Quality Optimized - Input: Bosphorus 4K

SVT-VP9 0.3 - Tuning: Visual Quality Optimized - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AOCC 4.0: 110.24 (SE +/- 0.19, N = 3)
LLVM Clang 14: 109.73 (SE +/- 0.08, N = 3)
1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

dav1d

Video Input: Summer Nature 4K

dav1d 1.0 - Video Input: Summer Nature 4K (FPS, More Is Better)
AOCC 4.0: 394.20 (SE +/- 2.47, N = 3)
LLVM Clang 14: 393.37 (SE +/- 3.69, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm

dav1d

Video Input: Chimera 1080p 10-bit

dav1d 1.0 - Video Input: Chimera 1080p 10-bit (FPS, More Is Better)
LLVM Clang 14: 825.00 (SE +/- 1.48, N = 3)
AOCC 4.0: 821.26 (SE +/- 1.11, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm

SVT-AV1

Encoder Mode: Preset 10 - Input: Bosphorus 4K

SVT-AV1 1.2 - Encoder Mode: Preset 10 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AOCC 4.0: 143.90 (SE +/- 3.87, N = 15)
LLVM Clang 14: 139.03 (SE +/- 2.91, N = 15)

SVT-AV1

Encoder Mode: Preset 12 - Input: Bosphorus 4K

SVT-AV1 1.2 - Encoder Mode: Preset 12 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AOCC 4.0: 209.40 (SE +/- 1.07, N = 3)
LLVM Clang 14: 200.79 (SE +/- 1.63, N = 3)

x265

Video Input: Bosphorus 4K

x265 3.4 - Video Input: Bosphorus 4K (Frames Per Second, More Is Better)
AOCC 4.0: 38.12 (SE +/- 0.03, N = 3)
LLVM Clang 14: 36.91 (SE +/- 0.34, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lpthread -lrt -ldl -lnuma

C-Ray

Total Time - 4K, 16 Rays Per Pixel

C-Ray 1.1 - Total Time - 4K, 16 Rays Per Pixel (Seconds, Fewer Is Better)
AOCC 4.0: 26.97 (SE +/- 0.01, N = 3)
LLVM Clang 14: 27.10 (SE +/- 0.15, N = 3)
1. (CC) gcc options: -lm -lpthread -O3 -march=native

SVT-HEVC

Tuning: 1 - Input: Bosphorus 4K

SVT-HEVC 1.5.0 - Tuning: 1 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AOCC 4.0: 6.43 (SE +/- 0.02, N = 3)
LLVM Clang 14: 6.43 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC

Tuning: 7 - Input: Bosphorus 4K

SVT-HEVC 1.5.0 - Tuning: 7 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AOCC 4.0: 107.09 (SE +/- 0.23, N = 3)
LLVM Clang 14: 104.05 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC

Tuning: 10 - Input: Bosphorus 4K

SVT-HEVC 1.5.0 - Tuning: 10 - Input: Bosphorus 4K (Frames Per Second, More Is Better)
AOCC 4.0: 169.54 (SE +/- 0.44, N = 3)
LLVM Clang 14: 169.39 (SE +/- 0.32, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

libavif avifenc

Encoder Speed: 2

libavif avifenc 0.11 - Encoder Speed: 2 (Seconds, Fewer Is Better)
AOCC 4.0: 32.28 (SE +/- 0.08, N = 3)
LLVM Clang 14: 33.64 (SE +/- 0.22, N = 15)
1. (CXX) g++ options: -O3 -fPIC -march=native -lm

libavif avifenc

Encoder Speed: 6

libavif avifenc 0.11 - Encoder Speed: 6 (Seconds, Fewer Is Better)
AOCC 4.0: 2.971 (SE +/- 0.004, N = 3)
LLVM Clang 14: 3.080 (SE +/- 0.027, N = 8)
1. (CXX) g++ options: -O3 -fPIC -march=native -lm

libavif avifenc

Encoder Speed: 6, Lossless

libavif avifenc 0.11 - Encoder Speed: 6, Lossless (Seconds, Fewer Is Better)
AOCC 4.0: 4.504 (SE +/- 0.011, N = 3)
LLVM Clang 14: 4.650 (SE +/- 0.032, N = 3)
1. (CXX) g++ options: -O3 -fPIC -march=native -lm

QuadRay

Scene: 1 - Resolution: 4K

QuadRay 2022.05.25 - Scene: 1 - Resolution: 4K (FPS, More Is Better)
AOCC 4.0: 26.44 (SE +/- 0.06, N = 3)
LLVM Clang 14: 26.23 (SE +/- 0.10, N = 3)
1. (CXX) g++ options: -O3 -pthread -lm -lstdc++ -lX11 -lXext -lpthread

QuadRay

Scene: 1 - Resolution: 1080p

QuadRay 2022.05.25 - Scene: 1 - Resolution: 1080p (FPS, More Is Better)
LLVM Clang 14: 102.12 (SE +/- 0.20, N = 3)
AOCC 4.0: 101.56 (SE +/- 0.40, N = 3)
1. (CXX) g++ options: -O3 -pthread -lm -lstdc++ -lX11 -lXext -lpthread

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 - Threads: 16 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
AOCC 4.0: 1750933333 (SE +/- 2171277.15, N = 3)
LLVM Clang 14: 1666100000 (SE +/- 1386842.94, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 - Threads: 32 - Buffer Length: 256 - Filter Length: 57 (samples/s, More Is Better)
AOCC 4.0: 1868800000 (SE +/- 1442220.51, N = 3)
LLVM Clang 14: 1742933333 (SE +/- 5397633.23, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid

srsRAN

Test: 4G PHY_DL_Test 100 PRB MIMO 64-QAM

srsRAN 22.04.1 - Test: 4G PHY_DL_Test 100 PRB MIMO 64-QAM (UE Mb/s, More Is Better)
AOCC 4.0: 201.3 (SE +/- 0.84, N = 3)
LLVM Clang 14: 199.3 (SE +/- 1.03, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -latomic -ldl -lpthread -lm

srsRAN

Test: 4G PHY_DL_Test 100 PRB SISO 64-QAM

srsRAN 22.04.1 - Test: 4G PHY_DL_Test 100 PRB SISO 64-QAM (UE Mb/s, More Is Better)
AOCC 4.0: 210.7 (SE +/- 0.99, N = 3)
LLVM Clang 14: 209.9 (SE +/- 0.21, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -latomic -ldl -lpthread -lm

OpenSSL

Algorithm: SHA256

OpenSSL 3.0 - Algorithm: SHA256 (byte/s, More Is Better)
AOCC 4.0: 37285492847 (SE +/- 19509012.02, N = 3)
LLVM Clang 14: 37107990770 (SE +/- 5864900.49, N = 3)
1. (CC) gcc options: -pthread -m64 -Qunused-arguments -O3 -march=native -lssl -lcrypto -ldl

OpenSSL

Algorithm: RSA4096

OpenSSL 3.0 - Algorithm: RSA4096 (sign/s, More Is Better)
LLVM Clang 14: 5998.4 (SE +/- 5.11, N = 3)
AOCC 4.0: 5971.9 (SE +/- 27.80, N = 3)
1. (CC) gcc options: -pthread -m64 -Qunused-arguments -O3 -march=native -lssl -lcrypto -ldl

OpenSSL

Algorithm: RSA4096

OpenSSL 3.0 - Algorithm: RSA4096 (verify/s, More Is Better)
AOCC 4.0: 393216.4 (SE +/- 114.01, N = 3)
LLVM Clang 14: 391464.0 (SE +/- 1647.21, N = 3)
1. (CC) gcc options: -pthread -m64 -Qunused-arguments -O3 -march=native -lssl -lcrypto -ldl


Phoronix Test Suite v10.8.4