AMD AOCC 3.1 Compiler Comparison

AMD EPYC 7543 testing of AMD AOCC 3.1 compiler benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2107288-IB-AOCC31BEN82&sor&sgm=1&hgv=AOCC+3.1&grs.

AMD AOCC 3.1 Compiler Comparison: System Details

Configurations: AOCC 3.1, Clang 12.0, GCC 11.1

Processor: AMD EPYC 7543 32-Core @ 2.80GHz (32 Cores / 64 Threads)
Motherboard: TYAN S8036GM2NE-LE (V2.00.B21 BIOS)
Chipset: AMD Starship/Matisse
Memory: 64GB
Disk: 1000GB Corsair Force MP600
Graphics: ASPEED
Monitor: VE228
Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 21.04
Kernel: 5.11.0-25-generic (x86_64)
Desktop: GNOME Shell 3.38.4
Display Server: X Server
Compiler (AOCC 3.1): Clang 12.0.0
Compiler (Clang 12.0): Clang 12.0.1-++20210630032617+fed41342a82f-1~exp1~20210630133328.128
Compiler (GCC 11.1): GCC 11.1.0
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- CXXFLAGS="-O3 -march=znver3" CFLAGS="-O3 -march=znver3"

Compiler Details
- AOCC 3.1: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver3
- GCC 11.1: --disable-multilib --enable-checking=release

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0xa001119

Python Details
- Python 3.9.5

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

AMD AOCC 3.1 Compiler Comparison: results overview for AOCC 3.1, Clang 12.0, and GCC 11.1. The per-test values are listed in the individual result sections that follow.

oneDNN

Harness: Deconvolution Batch shapes_1d - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AOCC 3.1: 1.80096 (SE +/- 0.00353, N = 3; -fopenmp=libomp - MIN: 1.57)
Clang 12.0: 1.83657 (SE +/- 0.00389, N = 3; -fopenmp=libomp - MIN: 1.61)
GCC 11.1: 4.59273 (SE +/- 0.03320, N = 15; -fopenmp - MIN: 3.81)
1. (CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl
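Each result in this file is reported with an "SE +/-" figure and a run count N. As a rough illustration of how such a figure can be derived, the sketch below computes the standard error of the mean from a list of run times. It assumes the sample standard deviation (n - 1 denominator) is used; the export does not state the exact formula the Phoronix Test Suite applies, and the run values shown are hypothetical, not taken from this result file.

```python
import math
import statistics


def standard_error(samples):
    """Return the standard error of the mean for a list of run results."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two runs to estimate spread")
    # Assumption: sample standard deviation (n - 1 denominator) over sqrt(n).
    return statistics.stdev(samples) / math.sqrt(n)


# Hypothetical run times in ms, NOT taken from this result file.
runs_ms = [2.0, 4.0, 6.0]
mean_ms = statistics.mean(runs_ms)
se_ms = standard_error(runs_ms)
print(f"{mean_ms:.5f} ms, SE +/- {se_ms:.5f}, N = {len(runs_ms)}")
```

A larger N alongside a result (for example N = 15 instead of N = 3) suggests that extra runs were collected for that configuration, which is worth keeping in mind when comparing small differences.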

Etcpak

Configuration: DXT1

Etcpak 0.7 (Mpx/s, More Is Better)
AOCC 3.1: 2897.28 (SE +/- 4.90, N = 3)
Clang 12.0: 2862.40 (SE +/- 6.55, N = 3)
GCC 11.1: 1170.47 (SE +/- 0.56, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AOCC 3.1: 1.40674 (SE +/- 0.01062, N = 3; -fopenmp=libomp - MIN: 1.25)
GCC 11.1: 1.91827 (SE +/- 0.00417, N = 3; -fopenmp - MIN: 1.2)
Clang 12.0: 2.77765 (SE +/- 0.01715, N = 3; -fopenmp=libomp - MIN: 2.54)
1. (CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

LibRaw

Post-Processing Benchmark

LibRaw 0.20 (Mpix/sec, More Is Better)
GCC 11.1: 60.93 (SE +/- 0.28, N = 3)
AOCC 3.1: 45.11 (SE +/- 0.48, N = 3)
Clang 12.0: 43.68 (SE +/- 0.35, N = 3)
1. (CXX) g++ options: -O3 -march=znver3 -fopenmp -ljpeg -lz -lm

SVT-HEVC

Tuning: 10 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
GCC 11.1: 529.50 (SE +/- 4.86, N = 3)
AOCC 3.1: 408.51 (SE +/- 5.79, N = 15)
Clang 12.0: 388.86 (SE +/- 4.14, N = 3)
1. (CC) gcc options: -O3 -march=znver3 -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20210525 (ms, Fewer Is Better)
AOCC 3.1: 5.74 (SE +/- 0.03, N = 15; -lomp - MIN: 5.3 / MAX: 32.98)
Clang 12.0: 6.82 (SE +/- 0.05, N = 3; -lomp - MIN: 6.56 / MAX: 7.47)
GCC 11.1: 7.71 (SE +/- 0.01, N = 3; -lgomp - MIN: 7.5 / MAX: 16.34)
1. (CXX) g++ options: -O3 -march=znver3 -rdynamic -lpthread -pthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20210525 (ms, Fewer Is Better)
AOCC 3.1: 5.60 (SE +/- 0.04, N = 15; -lomp - MIN: 4.66 / MAX: 56.3)
Clang 12.0: 6.77 (SE +/- 0.19, N = 3; -lomp - MIN: 5.93 / MAX: 47.97)
GCC 11.1: 7.09 (SE +/- 0.04, N = 3; -lgomp - MIN: 6.52 / MAX: 39.19)
1. (CXX) g++ options: -O3 -march=znver3 -rdynamic -lpthread -pthread

Botan

Test: AES-256

Botan 2.17.3 (MiB/s, More Is Better)
GCC 11.1: 5849.13 (SE +/- 4.89, N = 3)
Clang 12.0: 4929.67 (SE +/- 9.74, N = 3)
AOCC 3.1: 4859.66 (SE +/- 3.96, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: AES-256 - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
GCC 11.1: 5866.48 (SE +/- 6.69, N = 3)
Clang 12.0: 4924.97 (SE +/- 1.46, N = 3)
AOCC 3.1: 4876.96 (SE +/- 6.85, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.2 (Seconds, Fewer Is Better)
Clang 12.0: 7.326 (SE +/- 0.007, N = 5)
GCC 11.1: 8.161 (SE +/- 0.007, N = 5)
AOCC 3.1: 8.700 (SE +/- 0.002, N = 5)
Additional per-configuration option reported: -fvisibility=hidden
1. (CXX) g++ options: -O3 -march=znver3 -logg -lm

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
GCC 11.1: 954493333 (SE +/- 1548777.30, N = 3)
Clang 12.0: 886793333 (SE +/- 1271918.94, N = 3)
AOCC 3.1: 806510000 (SE +/- 596852.86, N = 3)
1. (CC) gcc options: -O3 -march=znver3 -pthread -lm -lc -lliquid

NCNN

Target: CPU - Model: resnet18

NCNN 20210525 (ms, Fewer Is Better)
GCC 11.1: 11.19 (SE +/- 0.11, N = 3; -lgomp - MIN: 10.93 / MAX: 50.17)
AOCC 3.1: 11.88 (SE +/- 0.09, N = 15; -lomp - MIN: 10.11 / MAX: 110.24)
Clang 12.0: 13.15 (SE +/- 0.04, N = 3; -lomp - MIN: 12.7 / MAX: 14.36)
1. (CXX) g++ options: -O3 -march=znver3 -rdynamic -lpthread -pthread

Etcpak

Configuration: ETC2

Etcpak 0.7 (Mpx/s, More Is Better)
Clang 12.0: 214.17 (SE +/- 0.01, N = 3)
AOCC 3.1: 207.55 (SE +/- 0.03, N = 3)
GCC 11.1: 183.80 (SE +/- 0.04, N = 3)
1. (CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

LZ4 Compression

Compression Level: 1 - Compression Speed

LZ4 Compression 1.9.3 (MB/s, More Is Better)
GCC 11.1: 12074.05 (SE +/- 86.05, N = 3)
AOCC 3.1: 10964.11 (SE +/- 7.15, N = 3)
Clang 12.0: 10375.18 (SE +/- 14.32, N = 3)
1. (CC) gcc options: -O3

Google Draco

Model: Church Facade

Google Draco 1.4.1 (ms, Fewer Is Better)
AOCC 3.1: 6488 (SE +/- 49.64, N = 14)
Clang 12.0: 7469 (SE +/- 22.32, N = 3)
1. (CXX) g++ options: -O3 -march=znver3

Zstd Compression

Compression Level: 19, Long Mode - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
Clang 12.0: 42.9 (SE +/- 0.12, N = 3)
AOCC 3.1: 42.1 (SE +/- 0.38, N = 3)
GCC 11.1: 37.6 (SE +/- 0.10, N = 3)
1. (CC) gcc options: -O3 -march=znver3 -pthread -lz -llzma

ONNX Runtime

Model: bertsquad-10 - Device: OpenMP CPU

ONNX Runtime 1.6 (Inferences Per Minute, More Is Better)
AOCC 3.1: 487 (SE +/- 4.81, N = 5)
Clang 12.0: 480 (SE +/- 5.04, N = 3)
GCC 11.1: 432 (SE +/- 5.78, N = 3)
1. (CXX) g++ options: -O3 -march=znver3 -fopenmp=libomp -ffunction-sections -fdata-sections -ldl -lrt

Liquid-DSP

Threads: 32 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
GCC 11.1: 1719166667 (SE +/- 3779035.74, N = 3)
Clang 12.0: 1651733333 (SE +/- 1570916.22, N = 3)
AOCC 3.1: 1527933333 (SE +/- 635959.47, N = 3)
1. (CC) gcc options: -O3 -march=znver3 -pthread -lm -lc -lliquid

Himeno Benchmark

Poisson Pressure Solver

Himeno Benchmark 3.0 (MFLOPS, More Is Better)
GCC 11.1: 4237.38 (SE +/- 19.17, N = 3)
Clang 12.0: 4113.00 (SE +/- 58.62, N = 15)
AOCC 3.1: 3819.99 (SE +/- 26.55, N = 3)
1. (CC) gcc options: -O3 -march=znver3 -mavx2

SecureMark

Benchmark: SecureMark-TLS

SecureMark 1.0.4 (marks, More Is Better)
Clang 12.0: 285172 (SE +/- 513.21, N = 3)
AOCC 3.1: 279931 (SE +/- 250.58, N = 3)
GCC 11.1: 257191 (SE +/- 178.03, N = 3)
1. (CC) gcc options: -pedantic -O3

Google Draco

Model: Lion

Google Draco 1.4.1 (ms, Fewer Is Better)
AOCC 3.1: 4960 (SE +/- 25.93, N = 3)
Clang 12.0: 5490 (SE +/- 6.01, N = 3)
1. (CXX) g++ options: -O3 -march=znver3

Liquid-DSP

Threads: 64 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31 (samples/s, More Is Better)
Clang 12.0: 2068366667 (SE +/- 88191.71, N = 3)
AOCC 3.1: 1894966667 (SE +/- 983756.97, N = 3)
GCC 11.1: 1873666667 (SE +/- 581186.53, N = 3)
1. (CC) gcc options: -O3 -march=znver3 -pthread -lm -lc -lliquid

LAME MP3 Encoding

WAV To MP3

LAME MP3 Encoding 3.100 (Seconds, Fewer Is Better)
GCC 11.1: 7.099 (SE +/- 0.016, N = 3)
Clang 12.0: 7.826 (SE +/- 0.003, N = 3)
AOCC 3.1: 7.828 (SE +/- 0.003, N = 3)
Additional per-configuration options reported: -ffast-math -funroll-loops -fschedule-insns2 -fbranch-count-reg -fforce-addr -lncurses -lncurses
1. (CC) gcc options: -O3 -pipe -march=znver3 -lm

oneDNN

Harness: Recurrent Neural Network Training - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
GCC 11.1: 6221.98 (SE +/- 85.54, N = 15; -fopenmp - MIN: 5154.78)
AOCC 3.1: 6432.49 (SE +/- 68.78, N = 3; -fopenmp=libomp - MIN: 6265.76)
Clang 12.0: 6828.04 (SE +/- 58.03, N = 3; -fopenmp=libomp - MIN: 6494)
1. (CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

SVT-VP9

Tuning: Visual Quality Optimized - Input: Bosphorus 1080p

SVT-VP9 0.3 (Frames Per Second, More Is Better)
GCC 11.1: 256.91 (SE +/- 0.79, N = 3)
Clang 12.0: 240.65 (SE +/- 3.15, N = 15)
AOCC 3.1: 234.18 (SE +/- 1.81, N = 3)
1. (CC) gcc options: -O3 -fcommon -march=znver3 -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

John The Ripper

Test: MD5

John The Ripper 1.9.0-jumbo-1 (Real C/S, More Is Better)
AOCC 3.1: 2240000 (SE +/- 23259.41, N = 3)
Clang 12.0: 2043333 (SE +/- 4666.67, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt

VP9 libvpx Encoding

Speed: Speed 5 - Input: Bosphorus 4K

VP9 libvpx Encoding 1.10.0 (Frames Per Second, More Is Better)
Clang 12.0: 17.12 (SE +/- 0.12, N = 3)
GCC 11.1: 16.25 (SE +/- 0.07, N = 3)
AOCC 3.1: 15.62 (SE +/- 0.12, N = 3)
1. (CXX) g++ options: -m64 -lm -lpthread -O3 -march=znver3 -fPIC -U_FORTIFY_SOURCE -std=gnu++11

Facebook RocksDB

Test: Read While Writing

Facebook RocksDB 6.22.1 (Op/s, More Is Better)
GCC 11.1: 6369004 (SE +/- 51448.06, N = 9; -fno-builtin-memcmp)
AOCC 3.1: 6135560 (SE +/- 55702.89, N = 3; -latomic)
Clang 12.0: 5822258 (SE +/- 43820.68, N = 15; -latomic)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-rtti -lpthread

Facebook RocksDB

Test: Read Random Write Random

Facebook RocksDB 6.22.1 (Op/s, More Is Better)
GCC 11.1: 3139993 (SE +/- 12186.30, N = 3; -fno-builtin-memcmp)
AOCC 3.1: 3059068 (SE +/- 32512.94, N = 4; -latomic)
Clang 12.0: 2875994 (SE +/- 18538.77, N = 14; -latomic)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-rtti -lpthread

Ngspice

Circuit: C2670

Ngspice 34 (Seconds, Fewer Is Better)
AOCC 3.1: 91.52 (SE +/- 0.15, N = 3)
Clang 12.0: 92.73 (SE +/- 0.63, N = 3)
GCC 11.1: 99.09 (SE +/- 0.74, N = 3)
Additional per-configuration options reported: -lstdc++, -lstdc++
1. (CC) gcc options: -O3 -march=znver3 -fopenmp -lm -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

ONNX Runtime

Model: super-resolution-10 - Device: OpenMP CPU

ONNX Runtime 1.6 (Inferences Per Minute, More Is Better)
GCC 11.1: 4922 (SE +/- 51.87, N = 3)
Clang 12.0: 4630 (SE +/- 23.74, N = 3)
AOCC 3.1: 4568 (SE +/- 49.29, N = 5)
1. (CXX) g++ options: -O3 -march=znver3 -fopenmp=libomp -ffunction-sections -fdata-sections -ldl -lrt

libavif avifenc

Encoder Speed: 6

libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
Clang 12.0: 9.632 (SE +/- 0.128, N = 3)
AOCC 3.1: 9.672 (SE +/- 0.067, N = 3)
GCC 11.1: 10.376 (SE +/- 0.102, N = 3)
1. (CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 2

libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
Clang 12.0: 24.40 (SE +/- 0.07, N = 3)
AOCC 3.1: 24.77 (SE +/- 0.25, N = 3)
GCC 11.1: 26.21 (SE +/- 0.11, N = 3)
1. (CXX) g++ options: -O3 -fPIC -lm

libavif avifenc

Encoder Speed: 10, Lossless

libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
AOCC 3.1: 5.800 (SE +/- 0.053, N = 3)
Clang 12.0: 5.966 (SE +/- 0.002, N = 3)
GCC 11.1: 6.212 (SE +/- 0.065, N = 3)
1. (CXX) g++ options: -O3 -fPIC -lm

GraphicsMagick

Operation: Rotate

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
GCC 11.1: 848 (SE +/- 1.33, N = 3)
Clang 12.0: 820 (SE +/- 2.08, N = 3)
AOCC 3.1: 792 (SE +/- 3.71, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=znver3 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

libavif avifenc

Encoder Speed: 6, Lossless

libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
Clang 12.0: 26.10 (SE +/- 0.14, N = 3)
AOCC 3.1: 26.67 (SE +/- 0.14, N = 3)
GCC 11.1: 27.89 (SE +/- 0.10, N = 3)
1. (CXX) g++ options: -O3 -fPIC -lm

TNN

Target: CPU - Model: SqueezeNet v2

TNN 0.3 (ms, Fewer Is Better)
AOCC 3.1: 60.66 (SE +/- 0.03, N = 3; -fopenmp=libomp - MIN: 60.48 / MAX: 60.97)
Clang 12.0: 61.22 (SE +/- 0.62, N = 3; -fopenmp=libomp - MIN: 60.1 / MAX: 67.06)
GCC 11.1: 64.79 (SE +/- 0.23, N = 3; -fopenmp - MIN: 64.25 / MAX: 65.31)
1. (CXX) g++ options: -O3 -march=znver3 -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

ASTC Encoder

Preset: Exhaustive

ASTC Encoder 3.0 (Seconds, Fewer Is Better)
AOCC 3.1: 21.89 (SE +/- 0.01, N = 3)
GCC 11.1: 23.26 (SE +/- 0.00, N = 3)
1. (CXX) g++ options: -O3 -march=znver3 -flto -pthread

oneDNN

Harness: Deconvolution Batch shapes_3d - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
GCC 11.1: 2.95414 (SE +/- 0.00916, N = 3; -fopenmp - MIN: 2.68)
Clang 12.0: 3.10964 (SE +/- 0.00207, N = 3; -fopenmp=libomp - MIN: 2.66)
AOCC 3.1: 3.13186 (SE +/- 0.00672, N = 3; -fopenmp=libomp - MIN: 2.76)
1. (CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

GraphicsMagick

Operation: Enhanced

GraphicsMagick 1.3.33 (Iterations Per Minute, More Is Better)
AOCC 3.1: 696 (SE +/- 0.33, N = 3)
Clang 12.0: 666 (SE +/- 0.58, N = 3)
GCC 11.1: 658 (SE +/- 3.48, N = 3)
1. (CC) gcc options: -fopenmp -O3 -march=znver3 -pthread -ljbig -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lxml2 -lz -lm -lpthread

libavif avifenc

Encoder Speed: 10

libavif avifenc 0.9.0 (Seconds, Fewer Is Better)
AOCC 3.1: 3.497 (SE +/- 0.022, N = 3)
Clang 12.0: 3.517 (SE +/- 0.025, N = 15)
GCC 11.1: 3.698 (SE +/- 0.045, N = 3)
1. (CXX) g++ options: -O3 -fPIC -lm

Zstd Compression

Compression Level: 8 - Compression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
AOCC 3.1: 2852.4 (SE +/- 17.26, N = 3)
GCC 11.1: 2792.3 (SE +/- 4.24, N = 3)
Clang 12.0: 2708.4 (SE +/- 23.16, N = 3)
1. (CC) gcc options: -O3 -march=znver3 -pthread -lz -llzma

Zstd Compression

Compression Level: 19, Long Mode - Decompression Speed

Zstd Compression 1.5.0 (MB/s, More Is Better)
GCC 11.1: 3434.5 (SE +/- 30.46, N = 3)
Clang 12.0: 3315.9 (SE +/- 18.29, N = 3)
AOCC 3.1: 3264.4 (SE +/- 6.78, N = 3)
1. (CC) gcc options: -O3 -march=znver3 -pthread -lz -llzma

Botan

Test: CAST-256 - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
Clang 12.0: 140.54 (SE +/- 0.12, N = 3)
AOCC 3.1: 137.91 (SE +/- 0.02, N = 3)
GCC 11.1: 133.62 (SE +/- 0.06, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

oneDNN

Harness: Matrix Multiply Batch Shapes Transformer - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AOCC 3.1: 0.435999 (SE +/- 0.000575, N = 3; -fopenmp=libomp - MIN: 0.39)
Clang 12.0: 0.436906 (SE +/- 0.000360, N = 3; -fopenmp=libomp - MIN: 0.4)
GCC 11.1: 0.457865 (SE +/- 0.000447, N = 3; -fopenmp - MIN: 0.42)
1. (CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

oneDNN

Harness: Recurrent Neural Network Inference - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AOCC 3.1: 668.79 (SE +/- 0.23, N = 3; -fopenmp=libomp - MIN: 659.59)
Clang 12.0: 679.05 (SE +/- 2.18, N = 3; -fopenmp=libomp - MIN: 671.01)
GCC 11.1: 702.08 (SE +/- 6.29, N = 7; -fopenmp - MIN: 683.48)
1. (CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

JPEG XL

Input: PNG - Encode Speed: 7

JPEG XL 0.3.3 (MP/s, More Is Better)
Clang 12.0: 9.49 (SE +/- 0.01, N = 3)
AOCC 3.1: 9.04 (SE +/- 0.01, N = 3)
1. (CXX) g++ options: -O3 -march=znver3 -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie -ldl

Botan

Test: CAST-256

Botan 2.17.3 (MiB/s, More Is Better)
Clang 12.0: 140.29 (SE +/- 0.16, N = 3)
AOCC 3.1: 134.37 (SE +/- 0.02, N = 3)
GCC 11.1: 133.68 (SE +/- 0.07, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

libjpeg-turbo tjbench

Test: Decompression Throughput

libjpeg-turbo tjbench 2.1.0 (Megapixels/sec, More Is Better)
AOCC 3.1: 220.67 (SE +/- 0.05, N = 3)
Clang 12.0: 214.48 (SE +/- 0.22, N = 3)
GCC 11.1: 210.80 (SE +/- 0.09, N = 3)
1. (CC) gcc options: -O3 -march=znver3 -rdynamic

Botan

Test: Blowfish

Botan 2.17.3 (MiB/s, More Is Better)
AOCC 3.1: 417.05 (SE +/- 0.17, N = 3)
GCC 11.1: 407.15 (SE +/- 0.11, N = 3)
Clang 12.0: 398.71 (SE +/- 0.05, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

RNNoise

RNNoise 2020-06-28 (Seconds, Fewer Is Better)
GCC 11.1: 18.01 (SE +/- 0.01, N = 3)
Clang 12.0: 18.56 (SE +/- 0.01, N = 3)
AOCC 3.1: 18.73 (SE +/- 0.05, N = 3)
1. (CC) gcc options: -O3 -march=znver3 -pedantic -fvisibility=hidden

FinanceBench

Benchmark: Repo OpenMP

FinanceBench 2016-07-25 (ms, Fewer Is Better)
Clang 12.0: 33674.40 (SE +/- 257.66, N = 3)
AOCC 3.1: 33763.58 (SE +/- 308.02, N = 3)
GCC 11.1: 35022.33 (SE +/- 418.97, N = 4)
1. (CXX) g++ options: -O3 -march=native -fopenmp

POV-Ray

Trace Time

POV-Ray 3.7.0.7 (Seconds, Fewer Is Better)
AOCC 3.1: 14.78 (SE +/- 0.04, N = 3)
Clang 12.0: 15.14 (SE +/- 0.02, N = 3)
GCC 11.1: 15.36 (SE +/- 0.13, N = 3)
Additional per-configuration option reported: -R/usr/lib
1. (CXX) g++ options: -pipe -O3 -ffast-math -march=native -march=znver3 -pthread -lXpm -lSM -lICE -lX11 -lIlmImf -lIlmImf-2_5 -lImath-2_5 -lHalf-2_5 -lIex-2_5 -lIexMath-2_5 -lIlmThread-2_5 -lIlmThread -ltiff -ljpeg -lpng -lz -lrt -lm -lboost_thread -lboost_system

Tachyon

Total Time

Tachyon 0.99b6 (Seconds, Fewer Is Better)
GCC 11.1: 27.64 (SE +/- 0.04, N = 3)
Clang 12.0: 28.66 (SE +/- 0.03, N = 3)
AOCC 3.1: 28.66 (SE +/- 0.03, N = 3)
1. (CC) gcc options: -m64 -O3 -fomit-frame-pointer -ffast-math -ltachyon -lm -lpthread

SVT-AV1

Encoder Mode: Preset 8 - Input: Bosphorus 4K

SVT-AV1 0.8.7 (Frames Per Second, More Is Better)
AOCC 3.1: 19.47 (SE +/- 0.10, N = 3)
Clang 12.0: 19.40 (SE +/- 0.04, N = 3)
GCC 11.1: 18.87 (SE +/- 0.10, N = 3)
1. (CXX) g++ options: -O3 -march=znver3 -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie

SVT-HEVC

Tuning: 7 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0 (Frames Per Second, More Is Better)
AOCC 3.1: 303.75 (SE +/- 4.16, N = 3)
GCC 11.1: 302.07 (SE +/- 1.06, N = 3)
Clang 12.0: 295.67 (SE +/- 3.03, N = 15)
1. (CC) gcc options: -O3 -march=znver3 -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SQLite Speedtest

Timed Time - Size 1,000

SQLite Speedtest 3.30 (Seconds, Fewer Is Better)
GCC 11.1: 53.78 (SE +/- 0.18, N = 3)
Clang 12.0: 54.80 (SE +/- 0.02, N = 3)
AOCC 3.1: 55.24 (SE +/- 0.06, N = 3)
1. (CC) gcc options: -O3 -march=znver3 -ldl -lz -lpthread

SVT-AV1

Encoder Mode: Preset 4 - Input: Bosphorus 4K

SVT-AV1 0.8.7 (Frames Per Second, More Is Better)
Clang 12.0: 1.988 (SE +/- 0.004, N = 3)
AOCC 3.1: 1.987 (SE +/- 0.001, N = 3)
GCC 11.1: 1.937 (SE +/- 0.005, N = 3)
1. (CXX) g++ options: -O3 -march=znver3 -mno-avx -mavx2 -mavx512f -mavx512bw -mavx512dq -pie

Facebook RocksDB

Test: Update Random

Facebook RocksDB 6.22.1 (Op/s, More Is Better)
AOCC 3.1: 832651 (SE +/- 1446.17, N = 3; -latomic)
GCC 11.1: 832118 (SE +/- 3250.14, N = 3; -fno-builtin-memcmp)
Clang 12.0: 812665 (SE +/- 1551.72, N = 3; -latomic)
1. (CXX) g++ options: -O3 -march=native -pthread -fno-rtti -lpthread

JPEG XL

Input: JPEG - Encode Speed: 7

JPEG XL 0.3.3 (MP/s, More Is Better)
AOCC 3.1: 65.89 (SE +/- 0.18, N = 3)
Clang 12.0: 64.35 (SE +/- 0.19, N = 3)
1. (CXX) g++ options: -O3 -march=znver3 -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie -ldl

Botan

Test: Twofish - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
AOCC 3.1: 339.97 (SE +/- 0.13, N = 3)
Clang 12.0: 339.90 (SE +/- 0.14, N = 3)
GCC 11.1: 332.40 (SE +/- 2.99, N = 12)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Ngspice

Circuit: C7552

Ngspice 34 (Seconds, Fewer Is Better)
AOCC 3.1: 82.28 (SE +/- 0.29, N = 3)
Clang 12.0: 83.89 (SE +/- 0.30, N = 3)
GCC 11.1: 84.14 (SE +/- 0.69, N = 9)
Additional per-configuration options reported: -lstdc++, -lstdc++
1. (CC) gcc options: -O3 -march=znver3 -fopenmp -lm -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

TNN

Target: CPU - Model: SqueezeNet v1.1

TNN 0.3 (ms, Fewer Is Better)
Clang 12.0: 265.12 (SE +/- 0.09, N = 3; -fopenmp=libomp - MIN: 264.65 / MAX: 271.09)
AOCC 3.1: 266.24 (SE +/- 0.08, N = 3; -fopenmp=libomp - MIN: 265.84 / MAX: 266.54)
GCC 11.1: 270.54 (SE +/- 0.20, N = 3; -fopenmp - MIN: 270.13 / MAX: 271.09)
1. (CXX) g++ options: -O3 -march=znver3 -pthread -fvisibility=hidden -fvisibility=default -rdynamic -ldl

JPEG XL

Input: JPEG - Encode Speed: 8

JPEG XL 0.3.3 (MP/s, More Is Better)
AOCC 3.1: 28.30 (SE +/- 0.04, N = 3)
Clang 12.0: 27.78 (SE +/- 0.10, N = 3)
1. (CXX) g++ options: -O3 -march=znver3 -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie -ldl

C-Blosc

Compressor: blosclz

C-Blosc 2.0 (MB/s, More Is Better)
GCC 11.1: 25536.1 (SE +/- 95.08, N = 3)
Clang 12.0: 25417.8 (SE +/- 118.87, N = 3)
AOCC 3.1: 25069.6 (SE +/- 100.31, N = 3)

oneDNN

Harness: IP Shapes 1D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
AOCC 3.1: 1.40164 (SE +/- 0.00055, N = 3; -fopenmp=libomp - MIN: 1.27)
Clang 12.0: 1.40867 (SE +/- 0.00211, N = 3; -fopenmp=libomp - MIN: 1.29)
GCC 11.1: 1.42555 (SE +/- 0.00145, N = 3; -fopenmp - MIN: 1.28)
1. (CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

Stockfish

Total Time

Stockfish 13 (Nodes Per Second, More Is Better)
GCC 11.1: 91863789 (SE +/- 1240215.58, N = 3)
AOCC 3.1: 90448030 (SE +/- 655436.02, N = 15)
Additional per-configuration options reported: -lgcov -mbmi2 -fno-peel-loops -fno-tracer -flto=jobserver
1. (CXX) g++ options: -m64 -lpthread -O3 -march=znver3 -fno-exceptions -std=c++17 -pedantic -msse -msse3 -mpopcnt -mavx2 -msse4.1 -mssse3 -msse2 -flto -fprofile-use

FinanceBench

Benchmark: Bonds OpenMP

FinanceBench 2016-07-25 (ms, Fewer Is Better)
GCC 11.1: 50564.02 (SE +/- 192.12, N = 3)
AOCC 3.1: 51185.49 (SE +/- 150.62, N = 3)
Clang 12.0: 51304.84 (SE +/- 370.43, N = 3)
1. (CXX) g++ options: -O3 -march=native -fopenmp

John The Ripper

Test: Blowfish

John The Ripper 1.9.0-jumbo-1 (Real C/S, More Is Better)
Clang 12.0: 62745 (SE +/- 601.59, N = 6)
AOCC 3.1: 61842 (SE +/- 125.66, N = 3)
1. (CC) gcc options: -m64 -lssl -lcrypto -fopenmp -lgmp -pthread -lm -lz -ldl -lcrypt

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2 (ms, Fewer Is Better)
Clang 12.0: 5.00842 (SE +/- 0.00188, N = 3; -fopenmp=libomp - MIN: 4.9)
GCC 11.1: 5.03445 (SE +/- 0.03095, N = 14; -fopenmp - MIN: 4.87)
AOCC 3.1: 5.07547 (SE +/- 0.00433, N = 3; -fopenmp=libomp - MIN: 4.96)
1. (CXX) g++ options: -O3 -march=native -march=znver3 -std=c++11 -msse4.1 -fPIC -pie -lpthread -ldl

Botan

Test: Twofish

Botan 2.17.3 (MiB/s, More Is Better)
Clang 12.0: 334.18 (SE +/- 0.11, N = 3)
AOCC 3.1: 332.81 (SE +/- 0.05, N = 3)
GCC 11.1: 331.41 (SE +/- 2.97, N = 12)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

QuantLib

QuantLib 1.21 (MFLOPS, More Is Better)
GCC 11.1: 2861.2 (SE +/- 23.43, N = 9)
AOCC 3.1: 2854.9 (SE +/- 5.98, N = 3)
Clang 12.0: 2838.6 (SE +/- 4.58, N = 3)
1. (CXX) g++ options: -O3 -march=native -rdynamic

Botan

Test: Blowfish - Decrypt

Botan 2.17.3 (MiB/s, More Is Better)
GCC 11.1: 407.58 (SE +/- 0.07, N = 3)
AOCC 3.1: 405.31 (SE +/- 0.11, N = 3)
Clang 12.0: 404.37 (SE +/- 0.03, N = 3)
1. (CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

GnuPG

2.7GB Sample File Encryption

GnuPG 2.2.27 (Seconds, Fewer Is Better)
GCC 11.1: 67.49 (SE +/- 0.12, N = 3)
AOCC 3.1: 67.61 (SE +/- 0.24, N = 3)
Clang 12.0: 67.90 (SE +/- 0.51, N = 3)
1. (CC) gcc options: -O3 -march=znver3

WavPack Audio Encoding

WAV To WavPack

WavPack Audio Encoding 5.3 (Seconds, Fewer Is Better)
GCC 11.1: 13.13 (SE +/- 0.00, N = 5)
Clang 12.0: 13.17 (SE +/- 0.00, N = 5)
AOCC 3.1: 13.18 (SE +/- 0.00, N = 5)
1. (CXX) g++ options: -O3 -march=znver3 -rdynamic

Google SynthMark

Test: VoiceMark_100

Google SynthMark 20201109 (Voices, More Is Better)
AOCC 3.1: 621.63 (SE +/- 0.49, N = 3)
Clang 12.0: 621.43 (SE +/- 0.68, N = 3)
1. (CXX) g++ options: -lm -lpthread -std=c++11 -Ofast

ONNX Runtime

Model: fcn-resnet101-11 - Device: OpenMP CPU

ONNX Runtime 1.6 (Inferences Per Minute, More Is Better)
AOCC 3.1: 90 (SE +/- 1.74, N = 11)
GCC 11.1: 89 (SE +/- 3.97, N = 9)
Clang 12.0: 85 (SE +/- 1.22, N = 12)
1. (CXX) g++ options: -O3 -march=znver3 -fopenmp=libomp -ffunction-sections -fdata-sections -ldl -lrt

NCNN

Target: CPU - Model: vgg16

NCNN 20210525 (ms, Fewer Is Better)
AOCC 3.1: 51.60 (SE +/- 1.18, N = 15; -lomp - MIN: 29.8 / MAX: 198.51)
GCC 11.1: 56.42 (SE +/- 1.12, N = 3; -lgomp - MIN: 47.61 / MAX: 93.66)
Clang 12.0: 59.74 (SE +/- 2.17, N = 3; -lomp - MIN: 51.34 / MAX: 157.2)
1. (CXX) g++ options: -O3 -march=znver3 -rdynamic -lpthread -pthread

Geometric Mean Of All Test Results

Result Composite - AMD AOCC 3.1 Compiler Comparison

Geometric Mean Of All Test Results (Geometric Mean, More Is Better)
AOCC 3.1: 154.72
Clang 12.0: 152.20
GCC 11.1: 150.45
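The composite above is a geometric mean across tests that mix "More Is Better" and "Fewer Is Better" units. The sketch below shows one plausible way to build such a composite and to compare the reported values. The helper names and the reciprocal treatment of "Fewer Is Better" results are assumptions for illustration; the export does not document how the Phoronix Test Suite normalizes individual results. Only the three composite values are taken from this file.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values, computed via logarithms."""
    if not values:
        raise ValueError("no values given")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def to_higher_is_better(value, fewer_is_better):
    """Assumed normalization: invert results where a lower value is better."""
    return 1.0 / value if fewer_is_better else value


def percent_lead(a, b):
    """How far a is ahead of b, in percent of b."""
    return (a / b - 1.0) * 100.0


# Composite values as reported in this result file.
composite = {"AOCC 3.1": 154.72, "Clang 12.0": 152.20, "GCC 11.1": 150.45}
lead = percent_lead(composite["AOCC 3.1"], composite["GCC 11.1"])
print(f"AOCC 3.1 leads GCC 11.1 by {lead:.2f}% on the composite")
```

Because a geometric mean multiplies ratios rather than adding raw values, a single large outlier (such as the Etcpak DXT1 or oneDNN Deconvolution results) moves the composite less than it would move an arithmetic mean.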


Phoronix Test Suite v10.8.4