AMD AOCC 4.0 Compiler Benchmarks

Initial AOCC 4.0 compiler benchmarks by Michael Larabel.

HTML result view exported from: https://openbenchmarking.org/result/2211114-PTS-AMDAOCC455.

AMD AOCC 4.0 Compiler Benchmarks: System Details

Configurations tested: LLVM Clang 14, AOCC 4.0

Processor: AMD Ryzen 9 7950X 16-Core @ 5.88GHz (16 Cores / 32 Threads)
Motherboard: ASUS ROG CROSSHAIR X670E HERO (0703 BIOS)
Chipset: AMD Device 14d8
Memory: 32GB
Disk: 1000GB Sabrent Rocket 4.0 Plus
Graphics: AMD Radeon RX 6800 16GB (2475/1000MHz)
Audio: AMD Navi 21/23
Monitor: ASUS MG28U
Network: Intel I225-V + Intel Wi-Fi 6 AX210/AX211/AX411
OS: Ubuntu 22.10
Kernel: 6.1.0-060100rc3daily20221103-generic (x86_64)
Desktop: GNOME Shell 43.0
Display Server: X Server + Wayland
OpenGL: 4.6 Mesa 22.2.1 (LLVM 15.0.2 DRM 3.49)
Vulkan: 1.3.224
Compiler: Clang 14.0.6-2 (LLVM Clang 14), Clang 14.0.6 (AOCC 4.0)
File-System: ext4
Screen Resolution: 3840x2160

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"

Processor Details
- Scaling Governor: amd-pstate performance (Boost: Enabled)
- CPU Microcode: 0xa601203

Python Details
- Python 3.10.7

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- mmio_stale_data: Not affected
- retbleed: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Retpolines IBPB: conditional IBRS_FW STIBP: always-on RSB filling PBRSB-eIBRS: Not affected
- srbds: Not affected
- tsx_async_abort: Not affected

Compiler Details
- AOCC 4.0: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver4

AMD AOCC 4.0 Compiler Benchmarks: Result Overview

Test | LLVM Clang 14 | AOCC 4.0
quantlib | 4535.6 | 4475.3
lczero: BLAS | 1694 | 1755
lczero: Eigen | 1844 | 1959
xmrig: Monero - 1M | 12543.4 | 12571.9
compress-zstd: 19 - Compression Speed | 78.9 | 79.2
compress-zstd: 19 - Decompression Speed | 4941.6 | 4981.1
compress-zstd: 19, Long Mode - Compression Speed | 55.1 | 55.4
compress-zstd: 19, Long Mode - Decompression Speed | 4966.3 | 5017.2
webp: Quality 100, Lossless | 2.01 | 2.15
webp: Quality 100, Highest Compression | 5.89 | 6.21
webp: Quality 100, Lossless, Highest Compression | 0.82 | 0.87
srsran: 4G PHY_DL_Test 100 PRB MIMO 64-QAM | 199.3 | 201.3
srsran: 4G PHY_DL_Test 100 PRB SISO 64-QAM | 209.9 | 210.7
graphics-magick: Sharpen | 337 | 460
graphics-magick: Enhanced | 581 | 593
graphics-magick: Resizing | 2470 | 2474
dav1d: Summer Nature 4K | 393.37 | 394.20
dav1d: Chimera 1080p 10-bit | 825.00 | 821.26
quadray: 1 - 4K | 26.23 | 26.44
quadray: 1 - 1080p | 102.12 | 101.56
kvazaar: Bosphorus 4K - Very Fast | 48.07 | 49.06
kvazaar: Bosphorus 4K - Ultra Fast | 87.37 | 88.76
svt-av1: Preset 10 - Bosphorus 4K | 139.029 | 143.901
svt-av1: Preset 12 - Bosphorus 4K | 200.787 | 209.400
svt-hevc: 1 - Bosphorus 4K | 6.43 | 6.43
svt-hevc: 7 - Bosphorus 4K | 104.05 | 107.09
svt-hevc: 10 - Bosphorus 4K | 169.39 | 169.54
svt-vp9: VMAF Optimized - Bosphorus 4K | 111.30 | 112.09
svt-vp9: PSNR/SSIM Optimized - Bosphorus 4K | 120.22 | 120.86
svt-vp9: Visual Quality Optimized - Bosphorus 4K | 109.73 | 110.24
x265: Bosphorus 4K | 36.91 | 38.12
avifenc: 2 | 33.635 | 32.284
avifenc: 6 | 3.080 | 2.971
avifenc: 6, Lossless | 4.650 | 4.504
c-ray: Total Time - 4K, 16 Rays Per Pixel | 27.097 | 26.974
openssl: SHA256 | 37107990770 | 37285492847
openssl: RSA4096 (sign/s) | 5998.4 | 5971.9
openssl: RSA4096 (verify/s) | 391464.0 | 393216.4
liquid-dsp: 16 - 256 - 57 | 1666100000 | 1750933333
liquid-dsp: 32 - 256 - 57 | 1742933333 | 1868800000
astcenc: Medium | 156.1963 | 162.1707
astcenc: Thorough | 19.2443 | 19.8628
astcenc: Exhaustive | 1.9695 | 1.9979
draco: Church Facade | 4704 | 4705
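The raw numbers in this result file are easier to compare as relative differences. The following is a minimal sketch in Python, not part of the Phoronix Test Suite: the helper name `aocc_gain_percent` and the `samples` list are assumptions made for illustration, and the values are copied from the results in this file. For time-based tests (seconds, ms) a smaller value is better, so the ratio is inverted.

```python
# Hypothetical helper (not a Phoronix Test Suite API): relative gain of
# AOCC 4.0 over LLVM Clang 14 in percent. Positive means AOCC 4.0 is faster.
def aocc_gain_percent(clang: float, aocc: float, higher_is_better: bool = True) -> float:
    if higher_is_better:
        return (aocc / clang - 1.0) * 100.0
    # For "Fewer Is Better" results the smaller value is the faster one.
    return (clang / aocc - 1.0) * 100.0


# (test, LLVM Clang 14, AOCC 4.0, higher_is_better); values copied from this file.
samples = [
    ("graphics-magick: Sharpen", 337, 460, True),
    ("lczero: Eigen", 1844, 1959, True),
    ("quantlib", 4535.6, 4475.3, True),
    ("avifenc: 2", 33.635, 32.284, False),
]

for name, clang, aocc, higher_is_better in samples:
    gain = aocc_gain_percent(clang, aocc, higher_is_better)
    print(f"{name}: {gain:+.1f}%")
```

A single percentage like this ignores the reported standard error, so small gains or losses should be read alongside the SE values listed with each test.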

QuantLib

QuantLib 1.21 (MFLOPS, More Is Better)
LLVM Clang 14: 4535.6 (SE +/- 44.47, N = 12)
AOCC 4.0: 4475.3 (SE +/- 32.16, N = 12)
1. (CXX) g++ options: -O3 -march=native -rdynamic

LeelaChessZero

Backend: BLAS

LeelaChessZero 0.28 (Nodes Per Second, More Is Better)
LLVM Clang 14: 1694 (SE +/- 7.42, N = 3)
AOCC 4.0: 1755 (SE +/- 19.19, N = 4)
1. (CXX) g++ options: -flto -O3 -march=native -pthread

LeelaChessZero

Backend: Eigen

LeelaChessZero 0.28 (Nodes Per Second, More Is Better)
LLVM Clang 14: 1844 (SE +/- 8.95, N = 3)
AOCC 4.0: 1959 (SE +/- 7.67, N = 3)
1. (CXX) g++ options: -flto -O3 -march=native -pthread

Xmrig

Variant: Monero - Hash Count: 1M

Xmrig 6.18.1 (H/s, More Is Better)
LLVM Clang 14: 12543.4 (SE +/- 30.70, N = 3)
AOCC 4.0: 12571.9 (SE +/- 26.06, N = 3)
1. (CXX) g++ options: -O3 -march=native -fexceptions -fno-rtti -maes -Ofast -funroll-loops -rdynamic -lssl -lcrypto -luv -lpthread -lrt -ldl -lhwloc

Zstd Compression

Compression Level: 19 - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
LLVM Clang 14: 78.9 (SE +/- 0.06, N = 3)
AOCC 4.0: 79.2 (SE +/- 0.28, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
LLVM Clang 14: 4941.6 (SE +/- 59.66, N = 3)
AOCC 4.0: 4981.1 (SE +/- 26.61, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
LLVM Clang 14: 55.1 (SE +/- 0.17, N = 3)
AOCC 4.0: 55.4 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
LLVM Clang 14: 4966.3 (SE +/- 4.60, N = 3)
AOCC 4.0: 5017.2 (SE +/- 14.78, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

WebP Image Encode

Encode Settings: Quality 100, Lossless

WebP Image Encode 1.2.4 (MP/s, More Is Better)
LLVM Clang 14: 2.01 (SE +/- 0.00, N = 3)
AOCC 4.0: 2.15 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -lm

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.2.4 (MP/s, More Is Better)
LLVM Clang 14: 5.89 (SE +/- 0.00, N = 3)
AOCC 4.0: 6.21 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -lm

WebP Image Encode

Encode Settings: Quality 100, Lossless, Highest Compression

WebP Image Encode 1.2.4 (MP/s, More Is Better)
LLVM Clang 14: 0.82 (SE +/- 0.00, N = 3)
AOCC 4.0: 0.87 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -lm

srsRAN

Test: 4G PHY_DL_Test 100 PRB MIMO 64-QAM

srsRAN 22.04.1 (UE Mb/s, More Is Better)
LLVM Clang 14: 199.3 (SE +/- 1.03, N = 3)
AOCC 4.0: 201.3 (SE +/- 0.84, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -latomic -ldl -lpthread -lm

srsRAN

Test: 4G PHY_DL_Test 100 PRB SISO 64-QAM

srsRAN 22.04.1 (UE Mb/s, More Is Better)
LLVM Clang 14: 209.9 (SE +/- 0.21, N = 3)
AOCC 4.0: 210.7 (SE +/- 0.99, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++14 -fno-strict-aliasing -mfpmath=sse -mavx2 -fvisibility=hidden -fno-trapping-math -fno-math-errno -mavx512f -mavx512cd -mavx512bw -mavx512dq -latomic -ldl -lpthread -lm

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.38 (Iterations Per Minute, More Is Better)
LLVM Clang 14: 337 (SE +/- 0.33, N = 3)
AOCC 4.0: 460 (SE +/- 0.33, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=native -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.38 (Iterations Per Minute, More Is Better)
LLVM Clang 14: 581 (SE +/- 0.00, N = 3)
AOCC 4.0: 593 (SE +/- 0.33, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=native -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

GraphicsMagick

Operation: Resizing

GraphicsMagick 1.3.38 (Iterations Per Minute, More Is Better)
LLVM Clang 14: 2470 (SE +/- 0.88, N = 3)
AOCC 4.0: 2474 (SE +/- 3.51, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=native -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

dav1d

Video Input: Summer Nature 4K

dav1d 1.0 (FPS, More Is Better)
LLVM Clang 14: 393.37 (SE +/- 3.69, N = 3)
AOCC 4.0: 394.20 (SE +/- 2.47, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm

dav1d

Video Input: Chimera 1080p 10-bit

dav1d 1.0 (FPS, More Is Better)
LLVM Clang 14: 825.00 (SE +/- 1.48, N = 3)
AOCC 4.0: 821.26 (SE +/- 1.11, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm

QuadRay

Scene: 1 - Resolution: 4K

QuadRay 2022.05.25 (FPS, More Is Better)
LLVM Clang 14: 26.23 (SE +/- 0.10, N = 3)
AOCC 4.0: 26.44 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -O3 -pthread -lm -lstdc++ -lX11 -lXext -lpthread

QuadRay

Scene: 1 - Resolution: 1080p

QuadRay 2022.05.25 (FPS, More Is Better)
LLVM Clang 14: 102.12 (SE +/- 0.20, N = 3)
AOCC 4.0: 101.56 (SE +/- 0.40, N = 3)
1. (CXX) g++ options: -O3 -pthread -lm -lstdc++ -lX11 -lXext -lpthread

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Very Fast

Kvazaar 2.1 (Frames Per Second, More Is Better)
LLVM Clang 14: 48.07 (SE +/- 0.32, N = 3)
AOCC 4.0: 49.06 (SE +/- 0.36, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Ultra Fast

Kvazaar 2.1 (Frames Per Second, More Is Better)
LLVM Clang 14: 87.37 (SE +/- 0.05, N = 3)
AOCC 4.0: 88.76 (SE +/- 0.04, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

SVT-AV1

Encoder Mode: Preset 10 - Input: Bosphorus 4K

SVT-AV1 1.2 (Frames Per Second, More Is Better)
LLVM Clang 14: 139.03 (SE +/- 2.91, N = 15)
AOCC 4.0: 143.90 (SE +/- 3.87, N = 15)

SVT-AV1

Encoder Mode: Preset 12 - Input: Bosphorus 4K

SVT-AV1 1.2 (Frames Per Second, More Is Better)
LLVM Clang 14: 200.79 (SE +/- 1.63, N = 3)
AOCC 4.0: 209.40 (SE +/- 1.07, N = 3)

SVT-HEVC

Tuning: 1 - Input: Bosphorus 4K

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
LLVM Clang 14: 6.43 (SE +/- 0.02, N = 3)
AOCC 4.0: 6.43 (SE +/- 0.02, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC

Tuning: 7 - Input: Bosphorus 4K

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
LLVM Clang 14: 104.05 (SE +/- 0.06, N = 3)
AOCC 4.0: 107.09 (SE +/- 0.23, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC

Tuning: 10 - Input: Bosphorus 4K

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
LLVM Clang 14: 169.39 (SE +/- 0.32, N = 3)
AOCC 4.0: 169.54 (SE +/- 0.44, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-VP9

Tuning: VMAF Optimized - Input: Bosphorus 4K

SVT-VP9 0.3 (Frames Per Second, More Is Better)
LLVM Clang 14: 111.30 (SE +/- 0.93, N = 15)
AOCC 4.0: 112.09 (SE +/- 1.05, N = 15)
1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: PSNR/SSIM Optimized - Input: Bosphorus 4K

SVT-VP9 0.3 (Frames Per Second, More Is Better)
LLVM Clang 14: 120.22 (SE +/- 0.09, N = 3)
AOCC 4.0: 120.86 (SE +/- 0.23, N = 3)
1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-VP9

Tuning: Visual Quality Optimized - Input: Bosphorus 4K

SVT-VP9 0.3 (Frames Per Second, More Is Better)
LLVM Clang 14: 109.73 (SE +/- 0.08, N = 3)
AOCC 4.0: 110.24 (SE +/- 0.19, N = 3)
1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

x265

Video Input: Bosphorus 4K

x265 3.4 (Frames Per Second, More Is Better)
LLVM Clang 14: 36.91 (SE +/- 0.34, N = 3)
AOCC 4.0: 38.12 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lpthread -lrt -ldl -lnuma

libavif avifenc

Encoder Speed: 2

libavif avifenc 0.11 (Seconds, Fewer Is Better)
LLVM Clang 14: 33.64 (SE +/- 0.22, N = 15)
AOCC 4.0: 32.28 (SE +/- 0.08, N = 3)
1. (CXX) g++ options: -O3 -fPIC -march=native -lm

libavif avifenc

Encoder Speed: 6

libavif avifenc 0.11 (Seconds, Fewer Is Better)
LLVM Clang 14: 3.080 (SE +/- 0.027, N = 8)
AOCC 4.0: 2.971 (SE +/- 0.004, N = 3)
1. (CXX) g++ options: -O3 -fPIC -march=native -lm

libavif avifenc

Encoder Speed: 6, Lossless

libavif avifenc 0.11 (Seconds, Fewer Is Better)
LLVM Clang 14: 4.650 (SE +/- 0.032, N = 3)
AOCC 4.0: 4.504 (SE +/- 0.011, N = 3)
1. (CXX) g++ options: -O3 -fPIC -march=native -lm

C-Ray

Total Time - 4K, 16 Rays Per Pixel

C-Ray 1.1 (Seconds, Fewer Is Better)
LLVM Clang 14: 27.10 (SE +/- 0.15, N = 3)
AOCC 4.0: 26.97 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -lm -lpthread -O3 -march=native

OpenSSL

Algorithm: SHA256

OpenSSL 3.0 (byte/s, More Is Better)
LLVM Clang 14: 37107990770 (SE +/- 5864900.49, N = 3)
AOCC 4.0: 37285492847 (SE +/- 19509012.02, N = 3)
1. (CC) gcc options: -pthread -m64 -Qunused-arguments -O3 -march=native -lssl -lcrypto -ldl

OpenSSL

Algorithm: RSA4096

OpenSSL 3.0 (sign/s, More Is Better)
LLVM Clang 14: 5998.4 (SE +/- 5.11, N = 3)
AOCC 4.0: 5971.9 (SE +/- 27.80, N = 3)
1. (CC) gcc options: -pthread -m64 -Qunused-arguments -O3 -march=native -lssl -lcrypto -ldl

OpenSSL

Algorithm: RSA4096

OpenSSL 3.0 (verify/s, More Is Better)
LLVM Clang 14: 391464.0 (SE +/- 1647.21, N = 3)
AOCC 4.0: 393216.4 (SE +/- 114.01, N = 3)
1. (CC) gcc options: -pthread -m64 -Qunused-arguments -O3 -march=native -lssl -lcrypto -ldl

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
LLVM Clang 14: 1666100000 (SE +/- 1386842.94, N = 3)
AOCC 4.0: 1750933333 (SE +/- 2171277.15, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
LLVM Clang 14: 1742933333 (SE +/- 5397633.23, N = 3)
AOCC 4.0: 1868800000 (SE +/- 1442220.51, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid

ASTC Encoder

Preset: Medium

ASTC Encoder 4.0 (MT/s, More Is Better)
LLVM Clang 14: 156.20 (SE +/- 0.85, N = 3)
AOCC 4.0: 162.17 (SE +/- 0.83, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -pthread

ASTC Encoder

Preset: Thorough

ASTC Encoder 4.0 (MT/s, More Is Better)
LLVM Clang 14: 19.24 (SE +/- 0.08, N = 3)
AOCC 4.0: 19.86 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -pthread

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 4.0 (MT/s, More Is Better)
LLVM Clang 14: 1.9695 (SE +/- 0.0036, N = 3)
AOCC 4.0: 1.9979 (SE +/- 0.0025, N = 3)
1. (CXX) g++ options: -O3 -march=native -flto -pthread

Google Draco

Model: Church Facade

Google Draco 1.5.0 (ms, Fewer Is Better)
LLVM Clang 14: 4704 (SE +/- 19.68, N = 3)
AOCC 4.0: 4705 (SE +/- 36.11, N = 3)
1. (CXX) g++ options: -O3 -march=native


Phoronix Test Suite v10.8.4