AMD AOCC 3.2 Compiler Benchmarks

AMD EPYC 72F3 benchmarks of the AOCC 3.2 compiler and prior releases. Benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2112199-PTS-AOCC327345&sgm=1&sor.

AMD AOCC 3.2 Compiler Benchmarks

Configurations: AMD AOCC 3.0, AMD AOCC 3.1, AMD AOCC 3.2

Processor: AMD EPYC 72F3 8-Core @ 3.70GHz (8 Cores / 16 Threads)
Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS)
Chipset: AMD Starship/Matisse
Memory: 126GB
Disk: 3841GB Micron_9300_MTFDHAL3T8TDP + 1000GB Corsair Force MP600
Graphics: ASPEED
Monitor: VE228
Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 21.04
Kernel: 5.14.0-rc7-amd-pstate-phx (x86_64) 20210909
Desktop: GNOME Shell 3.38.4
Display Server: X Server
Compiler: Clang 12.0.0 (AMD AOCC 3.0); Clang 13.0.0 (listed as changed for a later configuration)
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"

Compiler Details
- AMD AOCC 3.0: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: (unknown)
- AMD AOCC 3.1: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver3
- AMD AOCC 3.2: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver3

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0xa001119

Python Details
- Python 3.9.5

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

AMD AOCC 3.2 Compiler Benchmarks: Result Overview

Test | AMD AOCC 3.0 | AMD AOCC 3.1 | AMD AOCC 3.2
quantlib | 3159.8 | 3151.1 | 3208.6
etcpak: ETC2 | 208.988 | 229.494 | 231.489
lczero: BLAS | 1971 | 1909 | 1980
lczero: Eigen | 1995 | 2150 | 2085
webp: Quality 100, Highest Compression | 5.612 | 5.567 | 5.548
chia-vdf: Square Plain C++ | 185000 | 186467 | 186667
chia-vdf: Square Assembly Optimized | 123907 | 124687 | 127313
compress-lz4: 1 - Compression Speed | 13663.52 | 13770.38 | 13906.92
compress-zstd: 3 - Compression Speed | 3101.4 | 3284.5 | 3251.9
compress-zstd: 3 - Decompression Speed | 3946.6 | 3907.9 | 4009.5
compress-zstd: 8 - Compression Speed | 1370.8 | 1367.3 | 1396.2
compress-zstd: 8 - Decompression Speed | 4140.6 | 4019.9 | 4138.3
compress-zstd: 19 - Compression Speed | 52.0 | 51.8 | 51.7
compress-zstd: 19 - Decompression Speed | 3674.6 | 3602.1 | 3717.6
compress-zstd: 3, Long Mode - Compression Speed | 444.7 | 442.3 | 443.2
compress-zstd: 3, Long Mode - Decompression Speed | 4309.5 | 4200.0 | 4392.4
jpegxl: JPEG - 8 | 28.47 | 28.61 | 28.69
jpegxl-decode: 1 | 65.99 | 66.92 | 66.73
botan: KASUMI | 96.470 | 98.869 | 98.001
botan: Twofish | 356.958 | 368.265 | 368.107
botan: Twofish - Decrypt | 355.702 | 376.197 | 397.061
botan: CAST-256 | 149.815 | 148.839 | 157.577
botan: CAST-256 - Decrypt | 150.032 | 152.679 | 154.610
graphics-magick: Sharpen | 116 | 113 | 123
kvazaar: Bosphorus 4K - Medium | - | 8.26 | 8.37
kvazaar: Bosphorus 4K - Very Fast | - | 19.13 | 19.17
kvazaar: Bosphorus 4K - Ultra Fast | - | 32.15 | 32.26
svt-hevc: 1 - Bosphorus 1080p | 9.04 | 9.11 | 9.17
svt-hevc: 7 - Bosphorus 1080p | 115.76 | 116.46 | 117.07
svt-hevc: 10 - Bosphorus 1080p | 228.54 | 229.69 | 231.07
svt-vp9: VMAF Optimized - Bosphorus 1080p | 150.26 | 149.85 | 151.30
coremark: CoreMark Size 666 - Iterations Per Second | 348647.216553 | 348059.951829 | 358684.330765
stockfish: Total Time | 25435242 | 25929053 | 25833642
primesieve: 1e12 Prime Number Generation | 21.081 | 20.985 | 20.954
onednn: IP Shapes 3D - f32 - CPU | 2.11471 | 1.96347 | 1.90323
onednn: Convolution Batch Shapes Auto - f32 - CPU | 3.99675 | 3.88522 | 3.87713
encode-flac: WAV To FLAC | 15.921 | 15.852 | 15.865
ngspice: C2670 | 91.498 | 89.124 | 88.421
rnnoise | 16.880 | 16.943 | 16.840
liquid-dsp: 16 - 256 - 57 | 689073333 | 636810000 | 695340000
tjbench: Decompression Throughput | 233.045735 | 244.487557 | 244.117267
basis: UASTC Level 0 | 6.898 | 6.774 | 6.768
basis: UASTC Level 2 | 27.758 | 27.600 | 27.532
basis: UASTC Level 3 | 51.519 | 51.398 | 51.154
cpp-perf-bench: Atol | 43.334 | 43.499 | 43.759
toktx: Zstd Compression 19 | 17.775 | 17.490 | 17.558
toktx: UASTC 3 + Zstd Compression 19 | 16.593 | 16.583 | 16.568
dav1d: Chimera 1080p | 531.81 | 537.39 | 538.84
dav1d: Summer Nature 1080p | 497.87 | 504.09 | 504.53
ncnn: CPU - mobilenet | 13.47 | 13.29 | 12.54
ncnn: CPU-v2-v2 - mobilenet-v2 | 4.55 | 4.48 | 4.26
ncnn: CPU-v3-v3 - mobilenet-v3 | 4.02 | 3.98 | 3.79
ncnn: CPU - shufflenet-v2 | 5.04 | 5.05 | 4.94
ncnn: CPU - mnasnet | 4.01 | 3.95 | 3.86
ncnn: CPU - efficientnet-b0 | 5.79 | 5.70 | 5.55
ncnn: CPU - resnet50 | 17.31 | 17.07 | 16.71
ncnn: CPU - yolov4-tiny | 21.94 | 21.40 | 21.46
ncnn: CPU - squeezenet_ssd | 18.44 | 18.53 | 18.27
onnx: shufflenet-v2-10 - CPU | 19886 | 19634 | 21250
apache: 1 | 6241.24 | 6015.71 | 6329.40
apache: 200 | 87051.70 | 85820.09 | 89516.52
apache: 500 | 82641.76 | 82157.17 | 84374.51

QuantLib

QuantLib 1.21 (MFLOPS, More Is Better)
AMD AOCC 3.2: 3208.6 (SE +/- 4.72, N = 3)
AMD AOCC 3.0: 3159.8 (SE +/- 3.35, N = 3)
AMD AOCC 3.1: 3151.1 (SE +/- 1.88, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic
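
As a reading aid, the gap between two results can be expressed as a percentage. The sketch below is illustrative only: the helper name percent_gain is hypothetical, and the inputs are the QuantLib MFLOPS figures reported for AMD AOCC 3.2 and AMD AOCC 3.0. The higher_is_better switch is an assumption about how one might handle "Fewer Is Better" tests such as encode times.

```python
def percent_gain(new: float, old: float, higher_is_better: bool = True) -> float:
    """Return the improvement of `new` over `old` in percent (hypothetical helper)."""
    # For "Fewer Is Better" metrics the ratio is inverted so a positive
    # result always means the newer compiler did better.
    ratio = new / old if higher_is_better else old / new
    return (ratio - 1.0) * 100.0


# QuantLib 1.21, MFLOPS, More Is Better (AOCC 3.2 vs. AOCC 3.0)
quantlib_gain = percent_gain(3208.6, 3159.8)
print(f"AOCC 3.2 vs AOCC 3.0 on QuantLib: {quantlib_gain:+.2f}%")
```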

Etcpak

Configuration: ETC2

Etcpak 0.7 (Mpx/s, More Is Better)
AMD AOCC 3.2: 231.49 (SE +/- 0.02, N = 3)
AMD AOCC 3.1: 229.49 (SE +/- 0.05, N = 3)
AMD AOCC 3.0: 208.99 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

LeelaChessZero

Backend: BLAS

LeelaChessZero 0.28 (Nodes Per Second, More Is Better)
AMD AOCC 3.2: 1980 (SE +/- 24.17, N = 3)
AMD AOCC 3.0: 1971 (SE +/- 27.27, N = 3)
AMD AOCC 3.1: 1909 (SE +/- 25.01, N = 3)
1. (CXX) g++ options: -flto -O3 -march=native -pthread

LeelaChessZero

Backend: Eigen

LeelaChessZero 0.28 (Nodes Per Second, More Is Better)
AMD AOCC 3.1: 2150 (SE +/- 19.04, N = 3)
AMD AOCC 3.2: 2085 (SE +/- 14.26, N = 3)
AMD AOCC 3.0: 1995 (SE +/- 18.25, N = 3)
1. (CXX) g++ options: -flto -O3 -march=native -pthread

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.1 (Encode Time - Seconds, Fewer Is Better)
AMD AOCC 3.2: 5.548 (SE +/- 0.000, N = 3)
AMD AOCC 3.1: 5.567 (SE +/- 0.011, N = 3)
AMD AOCC 3.0: 5.612 (SE +/- 0.001, N = 3)
1. (CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16 -ltiff

Chia Blockchain VDF

Test: Square Plain C++

Chia Blockchain VDF 1.0.1 (IPS, More Is Better)
AMD AOCC 3.2: 186667 (SE +/- 961.48, N = 3)
AMD AOCC 3.1: 186467 (SE +/- 272.85, N = 3)
AMD AOCC 3.0: 185000 (SE +/- 814.45, N = 3)
1. (CXX) g++ options: -flto -fno-PIE -lgmpxx -lgmp -lboost_system -pthread

Chia Blockchain VDF

Test: Square Assembly Optimized

Chia Blockchain VDF 1.0.1 (IPS, More Is Better)
AMD AOCC 3.2: 127313 (SE +/- 1751.38, N = 15)
AMD AOCC 3.1: 124687 (SE +/- 1162.50, N = 15)
AMD AOCC 3.0: 123907 (SE +/- 1025.27, N = 15)
1. (CXX) g++ options: -flto -fno-PIE -lgmpxx -lgmp -lboost_system -pthread
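
Each result carries a standard error (SE) and a run count (N), which helps judge whether a gap is larger than run-to-run noise. The sketch below is a rough heuristic and not part of the Phoronix Test Suite output: it assumes the two sets of runs are independent and divides the difference between two means by their combined standard error. The function name gap_in_standard_errors is hypothetical; the inputs are the Square Assembly Optimized figures for AMD AOCC 3.2 and AMD AOCC 3.1.

```python
import math


def gap_in_standard_errors(a: float, se_a: float, b: float, se_b: float) -> float:
    """Difference between two means, measured in combined standard errors."""
    # Assumes independent runs, so the variances of the two means add.
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return (a - b) / combined_se


# Chia VDF, Square Assembly Optimized: AOCC 3.2 vs. AOCC 3.1
z = gap_in_standard_errors(127313, 1751.38, 124687, 1162.50)
print(f"Gap is {z:.2f} combined standard errors")
```

A value well below 2 suggests the two results may not be clearly distinguishable, while a large value points to a difference that is unlikely to be noise.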

LZ4 Compression

Compression Level: 1 - Compression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
AMD AOCC 3.2: 13906.92 (SE +/- 7.51, N = 3)
AMD AOCC 3.1: 13770.38 (SE +/- 34.48, N = 3)
AMD AOCC 3.0: 13663.52 (SE +/- 45.07, N = 3)
1. (CC) gcc options: -O3

Zstd Compression

Compression Level: 3 - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.1: 3284.5 (SE +/- 7.29, N = 3)
AMD AOCC 3.2: 3251.9 (SE +/- 4.13, N = 3)
AMD AOCC 3.0: 3101.4 (SE +/- 16.85, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 3 - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 4009.5 (SE +/- 4.62, N = 3)
AMD AOCC 3.0: 3946.6 (SE +/- 3.55, N = 3)
AMD AOCC 3.1: 3907.9 (SE +/- 28.97, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 8 - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 1396.2 (SE +/- 5.45, N = 3)
AMD AOCC 3.0: 1370.8 (SE +/- 1.68, N = 3)
AMD AOCC 3.1: 1367.3 (SE +/- 10.80, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 8 - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.0: 4140.6 (SE +/- 42.37, N = 3)
AMD AOCC 3.2: 4138.3 (SE +/- 9.09, N = 3)
AMD AOCC 3.1: 4019.9 (SE +/- 20.84, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.0: 52.0 (SE +/- 0.40, N = 3)
AMD AOCC 3.1: 51.8 (SE +/- 0.07, N = 3)
AMD AOCC 3.2: 51.7 (SE +/- 0.17, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 3717.6 (SE +/- 38.85, N = 3)
AMD AOCC 3.0: 3674.6 (SE +/- 20.53, N = 3)
AMD AOCC 3.1: 3602.1 (SE +/- 64.59, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 3, Long Mode - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.0: 444.7 (SE +/- 5.04, N = 3)
AMD AOCC 3.2: 443.2 (SE +/- 1.57, N = 3)
AMD AOCC 3.1: 442.3 (SE +/- 3.07, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 3, Long Mode - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AMD AOCC 3.2: 4392.4 (SE +/- 35.14, N = 3)
AMD AOCC 3.0: 4309.5 (SE +/- 45.88, N = 3)
AMD AOCC 3.1: 4200.0 (SE +/- 21.45, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lz -llzma

JPEG XL libjxl

Input: JPEG - Encode Speed: 8

JPEG XL libjxl 0.6.1 (MP/s, More Is Better)
AMD AOCC 3.2: 28.69 (SE +/- 0.07, N = 3)
AMD AOCC 3.1: 28.61 (SE +/- 0.17, N = 3)
AMD AOCC 3.0: 28.47 (SE +/- 0.09, N = 3)
1. (CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie

JPEG XL Decoding libjxl

CPU Threads: 1

JPEG XL Decoding libjxl 0.6.1 (MP/s, More Is Better)
AMD AOCC 3.1: 66.92 (SE +/- 0.03, N = 3)
AMD AOCC 3.2: 66.73 (SE +/- 0.13, N = 3)
AMD AOCC 3.0: 65.99 (SE +/- 0.24, N = 3)

Botan

Test: KASUMI

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.1: 98.87 (SE +/- 0.02, N = 3)
AMD AOCC 3.2: 98.00 (SE +/- 0.02, N = 3)
AMD AOCC 3.0: 96.47 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Twofish

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.1: 368.27 (SE +/- 0.10, N = 3)
AMD AOCC 3.2: 368.11 (SE +/- 0.13, N = 3)
AMD AOCC 3.0: 356.96 (SE +/- 0.16, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Twofish - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 397.06 (SE +/- 0.14, N = 3)
AMD AOCC 3.1: 376.20 (SE +/- 0.09, N = 3)
AMD AOCC 3.0: 355.70 (SE +/- 0.14, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: CAST-256

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 157.58 (SE +/- 0.14, N = 3)
AMD AOCC 3.0: 149.82 (SE +/- 0.01, N = 3)
AMD AOCC 3.1: 148.84 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: CAST-256 - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
AMD AOCC 3.2: 154.61 (SE +/- 0.02, N = 3)
AMD AOCC 3.1: 152.68 (SE +/- 0.01, N = 3)
AMD AOCC 3.0: 150.03 (SE +/- 0.02, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AMD AOCC 3.2: 123
AMD AOCC 3.0: 116
AMD AOCC 3.1: 113
1. (CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Medium

Kvazaar 2.1 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 8.37 (SE +/- 0.01, N = 3)
AMD AOCC 3.1: 8.26 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Very Fast

Kvazaar 2.1 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 19.17 (SE +/- 0.04, N = 3)
AMD AOCC 3.1: 19.13 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Ultra Fast

Kvazaar 2.1 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 32.26 (SE +/- 0.10, N = 3)
AMD AOCC 3.1: 32.15 (SE +/- 0.07, N = 3)
1. (CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

SVT-HEVC

Tuning: 1 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 9.17 (SE +/- 0.01, N = 3)
AMD AOCC 3.1: 9.11 (SE +/- 0.00, N = 3)
AMD AOCC 3.0: 9.04 (SE +/- 0.00, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC

Tuning: 7 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 117.07 (SE +/- 0.20, N = 3)
AMD AOCC 3.1: 116.46 (SE +/- 0.36, N = 3)
AMD AOCC 3.0: 115.76 (SE +/- 0.27, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-HEVC

Tuning: 10 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 231.07 (SE +/- 0.38, N = 3)
AMD AOCC 3.1: 229.69 (SE +/- 0.94, N = 3)
AMD AOCC 3.0: 228.54 (SE +/- 0.18, N = 3)
1. (CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-VP9

Tuning: VMAF Optimized - Input: Bosphorus 1080p

SVT-VP9 0.3 (Frames Per Second, More Is Better)
AMD AOCC 3.2: 151.30 (SE +/- 0.23, N = 3)
AMD AOCC 3.0: 150.26 (SE +/- 0.17, N = 3)
AMD AOCC 3.1: 149.85 (SE +/- 0.52, N = 3)
1. (CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

Coremark

CoreMark Size 666 - Iterations Per Second

Coremark 1.0 (Iterations/Sec, More Is Better)
AMD AOCC 3.2: 358684.33 (SE +/- 212.94, N = 3)
AMD AOCC 3.0: 348647.22 (SE +/- 83.74, N = 3)
AMD AOCC 3.1: 348059.95 (SE +/- 313.30, N = 3)
1. (CC) gcc options: -O2 -O3 -march=native -lrt" -lrt

Stockfish

Total Time

Stockfish 13 (Nodes Per Second, More Is Better)
AMD AOCC 3.1: 25929053 (SE +/- 267405.57, N = 3)
AMD AOCC 3.2: 25833642 (SE +/- 117326.99, N = 3)
AMD AOCC 3.0: 25435242 (SE +/- 339377.91, N = 3)
1. (CXX) g++ options: -fprofile-use -m64 -lpthread -O3 -march=native -fno-exceptions -std=c++17 -pedantic -msse -msse3 -mpopcnt -mavx2 -msse4.1 -mssse3 -msse2 -flto

Primesieve

1e12 Prime Number Generation

Primesieve 7.7 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 20.95 (SE +/- 0.03, N = 3)
AMD AOCC 3.1: 20.99 (SE +/- 0.05, N = 3)
AMD AOCC 3.0: 21.08 (SE +/- 0.11, N = 3)
1. (CXX) g++ options: -O3 -march=native -lpthread

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AMD AOCC 3.2: 1.90323 (SE +/- 0.00196, N = 3; MIN: 1.64)
AMD AOCC 3.1: 1.96347 (SE +/- 0.00558, N = 3; MIN: 1.65)
AMD AOCC 3.0: 2.11471 (SE +/- 0.01961, N = 7; MIN: 1.74)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AMD AOCC 3.2: 3.87713 (SE +/- 0.00042, N = 3; MIN: 3.76)
AMD AOCC 3.1: 3.88522 (SE +/- 0.00235, N = 3; MIN: 3.81)
AMD AOCC 3.0: 3.99675 (SE +/- 0.00548, N = 3; MIN: 3.88)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.3 (Seconds, Fewer Is Better)
AMD AOCC 3.1: 15.85 (SE +/- 0.01, N = 5)
AMD AOCC 3.2: 15.87 (SE +/- 0.02, N = 5)
AMD AOCC 3.0: 15.92 (SE +/- 0.02, N = 5)
1. (CXX) g++ options: -O3 -march=native -logg -lm

Ngspice

Circuit: C2670

Ngspice 34 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 88.42 (SE +/- 0.10, N = 3)
AMD AOCC 3.1: 89.12 (SE +/- 0.42, N = 3)
AMD AOCC 3.0: 91.50 (SE +/- 0.91, N = 6)
1. (CC) gcc options: -O3 -march=native -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

RNNoise

RNNoise 2020-06-28 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 16.84 (SE +/- 0.04, N = 3)
AMD AOCC 3.0: 16.88 (SE +/- 0.03, N = 3)
AMD AOCC 3.1: 16.94 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -O3 -march=native -pedantic -fvisibility=hidden

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
AMD AOCC 3.2: 695340000 (SE +/- 162583.31, N = 3)
AMD AOCC 3.0: 689073333 (SE +/- 56075.35, N = 3)
AMD AOCC 3.1: 636810000 (SE +/- 100166.53, N = 3)
1. (CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid

libjpeg-turbo tjbench

Test: Decompression Throughput

libjpeg-turbo tjbench 2.1.0 (Megapixels/sec, More Is Better)
AMD AOCC 3.1: 244.49 (SE +/- 0.13, N = 3)
AMD AOCC 3.2: 244.12 (SE +/- 0.87, N = 3)
AMD AOCC 3.0: 233.05 (SE +/- 0.01, N = 3)
1. (CC) gcc options: -O3 -march=native -rdynamic

Basis Universal

Settings: UASTC Level 0

Basis Universal 1.13 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 6.768 (SE +/- 0.006, N = 3)
AMD AOCC 3.1: 6.774 (SE +/- 0.006, N = 3)
AMD AOCC 3.0: 6.898 (SE +/- 0.007, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 2

Basis Universal 1.13 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 27.53 (SE +/- 0.00, N = 3)
AMD AOCC 3.1: 27.60 (SE +/- 0.03, N = 3)
AMD AOCC 3.0: 27.76 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Basis Universal

Settings: UASTC Level 3

Basis Universal 1.13 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 51.15 (SE +/- 0.00, N = 3)
AMD AOCC 3.1: 51.40 (SE +/- 0.00, N = 3)
AMD AOCC 3.0: 51.52 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

CppPerformanceBenchmarks

Test: Atol

CppPerformanceBenchmarks 9 (Seconds, Fewer Is Better)
AMD AOCC 3.0: 43.33 (SE +/- 0.06, N = 3)
AMD AOCC 3.1: 43.50 (SE +/- 0.03, N = 3)
AMD AOCC 3.2: 43.76 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11

KTX-Software toktx

Settings: Zstd Compression 19

KTX-Software toktx 4.0 (Seconds, Fewer Is Better)
AMD AOCC 3.1: 17.49 (SE +/- 0.06, N = 3)
AMD AOCC 3.2: 17.56 (SE +/- 0.04, N = 3)
AMD AOCC 3.0: 17.78 (SE +/- 0.03, N = 3)

KTX-Software toktx

Settings: UASTC 3 + Zstd Compression 19

KTX-Software toktx 4.0 (Seconds, Fewer Is Better)
AMD AOCC 3.2: 16.57 (SE +/- 0.02, N = 3)
AMD AOCC 3.1: 16.58 (SE +/- 0.02, N = 3)
AMD AOCC 3.0: 16.59 (SE +/- 0.02, N = 3)

dav1d

Video Input: Chimera 1080p

dav1d 0.9.2 (FPS, More Is Better)
AMD AOCC 3.2: 538.84 (SE +/- 0.97, N = 3; -lm - MIN: 431.65 / MAX: 824.69)
AMD AOCC 3.1: 537.39 (SE +/- 2.89, N = 3; MIN: 429.9 / MAX: 785.03)
AMD AOCC 3.0: 531.81 (SE +/- 2.67, N = 3; -lm - MIN: 427.63 / MAX: 832.45)
1. (CC) gcc options: -O3 -march=native -pthread

dav1d

Video Input: Summer Nature 1080p

dav1d 0.9.2 (FPS, More Is Better)
AMD AOCC 3.2: 504.53 (SE +/- 0.93, N = 3; -lm - MIN: 458.56 / MAX: 543.03)
AMD AOCC 3.1: 504.09 (SE +/- 2.18, N = 3; MIN: 442.56 / MAX: 541.4)
AMD AOCC 3.0: 497.87 (SE +/- 0.39, N = 3; -lm - MIN: 435.72 / MAX: 539.24)
1. (CC) gcc options: -O3 -march=native -pthread

NCNN

Target: CPU - Model: mobilenet

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 12.54 (SE +/- 0.01, N = 3; MIN: 12.29 / MAX: 13.52)
AMD AOCC 3.1: 13.29 (SE +/- 0.01, N = 3; MIN: 13.05 / MAX: 15.9)
AMD AOCC 3.0: 13.47 (SE +/- 0.08, N = 3; MIN: 13.18 / MAX: 14.45)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 4.26 (SE +/- 0.02, N = 3; MIN: 4.15 / MAX: 4.92)
AMD AOCC 3.1: 4.48 (SE +/- 0.01, N = 3; MIN: 4.35 / MAX: 4.93)
AMD AOCC 3.0: 4.55 (SE +/- 0.08, N = 3; MIN: 4.35 / MAX: 5.16)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 3.79 (SE +/- 0.02, N = 3; MIN: 3.69 / MAX: 4.26)
AMD AOCC 3.1: 3.98 (SE +/- 0.01, N = 3; MIN: 3.85 / MAX: 4.43)
AMD AOCC 3.0: 4.02 (SE +/- 0.04, N = 3; MIN: 3.88 / MAX: 4.53)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 4.94 (SE +/- 0.04, N = 3; MIN: 4.79 / MAX: 6.89)
AMD AOCC 3.0: 5.04 (SE +/- 0.03, N = 3; MIN: 4.91 / MAX: 5.71)
AMD AOCC 3.1: 5.05 (SE +/- 0.01, N = 3; MIN: 4.89 / MAX: 5.61)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 3.86 (SE +/- 0.01, N = 3; MIN: 3.79 / MAX: 4.3)
AMD AOCC 3.1: 3.95 (SE +/- 0.00, N = 3; MIN: 3.87 / MAX: 4.42)
AMD AOCC 3.0: 4.01 (SE +/- 0.06, N = 3; MIN: 3.86 / MAX: 4.53)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 5.55 (SE +/- 0.01, N = 3; MIN: 5.46 / MAX: 6.26)
AMD AOCC 3.1: 5.70 (SE +/- 0.01, N = 3; MIN: 5.61 / MAX: 6.34)
AMD AOCC 3.0: 5.79 (SE +/- 0.05, N = 3; MIN: 5.66 / MAX: 8.45)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: resnet50

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 16.71 (SE +/- 0.00, N = 3; MIN: 15.97 / MAX: 17.44)
AMD AOCC 3.1: 17.07 (SE +/- 0.03, N = 3; MIN: 16.46 / MAX: 17.8)
AMD AOCC 3.0: 17.31 (SE +/- 0.01, N = 3; MIN: 16.66 / MAX: 18.12)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.1: 21.40 (SE +/- 0.04, N = 3; MIN: 21.12 / MAX: 32.55)
AMD AOCC 3.2: 21.46 (SE +/- 0.06, N = 3; MIN: 21.22 / MAX: 22.1)
AMD AOCC 3.0: 21.94 (SE +/- 0.30, N = 3; MIN: 21.39 / MAX: 46.33)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: squeezenet_ssd

NCNN 20210720 (ms, Fewer Is Better)
AMD AOCC 3.2: 18.27 (SE +/- 0.02, N = 3; MIN: 17.97 / MAX: 18.95)
AMD AOCC 3.0: 18.44 (SE +/- 0.06, N = 3; MIN: 18.06 / MAX: 19.31)
AMD AOCC 3.1: 18.53 (SE +/- 0.10, N = 3; MIN: 17.16 / MAX: 19.51)
1. (CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

ONNX Runtime

Model: shufflenet-v2-10 - Device: CPU

ONNX Runtime 1.10 (Inferences Per Minute, More Is Better)
AMD AOCC 3.2: 21250 (SE +/- 460.99, N = 12)
AMD AOCC 3.0: 19886 (SE +/- 398.96, N = 12)
AMD AOCC 3.1: 19634 (SE +/- 398.04, N = 12)
1. (CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=thin -ldl -pthread -lrt -lpthread

Apache HTTP Server

Concurrent Requests: 1

Apache HTTP Server 2.4.48 (Requests Per Second, More Is Better)
AMD AOCC 3.2: 6329.40 (SE +/- 59.67, N = 7)
AMD AOCC 3.0: 6241.24 (SE +/- 144.81, N = 12)
AMD AOCC 3.1: 6015.71 (SE +/- 45.70, N = 15)
1. (CC) gcc options: -shared -fPIC -pthread -O3 -march=native

Apache HTTP Server

Concurrent Requests: 200

Apache HTTP Server 2.4.48 (Requests Per Second, More Is Better)
AMD AOCC 3.2: 89516.52 (SE +/- 276.94, N = 3)
AMD AOCC 3.0: 87051.70 (SE +/- 278.02, N = 3)
AMD AOCC 3.1: 85820.09 (SE +/- 129.96, N = 3)
1. (CC) gcc options: -shared -fPIC -pthread -O3 -march=native

Apache HTTP Server

Concurrent Requests: 500

Apache HTTP Server 2.4.48 (Requests Per Second, More Is Better)
AMD AOCC 3.2: 84374.51 (SE +/- 185.04, N = 3)
AMD AOCC 3.0: 82641.76 (SE +/- 218.70, N = 3)
AMD AOCC 3.1: 82157.17 (SE +/- 267.66, N = 3)
1. (CC) gcc options: -shared -fPIC -pthread -O3 -march=native

Geometric Mean Of All Test Results

Result Composite - AMD AOCC 3.2 Compiler Benchmarks

Geometric Mean Of All Test Results (Geometric Mean, More Is Better)
AMD AOCC 3.2: 287.33
AMD AOCC 3.1: 282.22
AMD AOCC 3.0: 279.98
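
The export does not state how the composite is calculated, so the following is only a sketch of one common approach, not the Phoronix Test Suite's own method. It assumes each test is first normalized to a baseline (AMD AOCC 3.0 here), that "Fewer Is Better" results are inverted so larger always means faster, and that the geometric mean is then taken over those ratios. The three-test subset and the names results and geometric_mean are chosen purely for illustration; the figures come from the QuantLib, Etcpak, and WebP results in this document.

```python
import math

# (AOCC 3.0 result, AOCC 3.2 result, higher_is_better) for an illustrative subset
results = {
    "QuantLib (MFLOPS)": (3159.8, 3208.6, True),
    "Etcpak ETC2 (Mpx/s)": (208.99, 231.49, True),
    "WebP encode (seconds)": (5.612, 5.548, False),
}


def geometric_mean(values):
    """Geometric mean computed as exp of the mean of logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Normalize every test to the baseline; invert time-based results.
ratios = [
    (new / old) if higher else (old / new)
    for old, new, higher in results.values()
]
relative_score = geometric_mean(ratios)
print(f"AOCC 3.2 relative to AOCC 3.0 on this subset: {relative_score:.3f}x")
```

Using the geometric rather than the arithmetic mean keeps a single test with very large absolute numbers (for example Liquid-DSP's samples/s) from dominating the composite.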


Phoronix Test Suite v10.8.4