AMD AOCC 3.2 Compiler Benchmarks

AMD EPYC 72F3 benchmarks of the AOCC 3.2 compiler and prior releases. Benchmarks by Michael Larabel for a future article.

HTML result view exported from: https://openbenchmarking.org/result/2112199-PTS-AOCC327345&sgm=1&grr.

AMD AOCC 3.2 Compiler Benchmarks

System details for the tested configurations (AMD AOCC 3.0, AMD AOCC 3.1, AMD AOCC 3.2):

Processor: AMD EPYC 72F3 8-Core @ 3.70GHz (8 Cores / 16 Threads)
Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS)
Chipset: AMD Starship/Matisse
Memory: 126GB
Disk: 3841GB Micron_9300_MTFDHAL3T8TDP + 1000GB Corsair Force MP600
Graphics: ASPEED
Monitor: VE228
Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
OS: Ubuntu 21.04
Kernel: 5.14.0-rc7-amd-pstate-phx (x86_64) 20210909
Desktop: GNOME Shell 3.38.4
Display Server: X Server
Compiler: Clang 12.0.0 (first configuration listed); Clang 13.0.0 (listed for a later configuration)
File-System: ext4
Screen Resolution: 1920x1080

Kernel Details
- Transparent Huge Pages: madvise

Environment Details
- CXXFLAGS="-O3 -march=native" CFLAGS="-O3 -march=native"

Compiler Details
- AMD AOCC 3.0: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: (unknown)
- AMD AOCC 3.1: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver3
- AMD AOCC 3.2: Optimized build with assertions; Default target: x86_64-unknown-linux-gnu; Host CPU: znver3

Processor Details
- Scaling Governor: acpi-cpufreq schedutil (Boost: Enabled)
- CPU Microcode: 0xa001119

Python Details
- Python 3.9.5

Security Details
- itlb_multihit: Not affected
- l1tf: Not affected
- mds: Not affected
- meltdown: Not affected
- spec_store_bypass: Mitigation of SSB disabled via prctl and seccomp
- spectre_v1: Mitigation of usercopy/swapgs barriers and __user pointer sanitization
- spectre_v2: Mitigation of Full AMD retpoline IBPB: conditional IBRS_FW STIBP: always-on RSB filling
- srbds: Not affected
- tsx_async_abort: Not affected

AMD AOCC 3.2 Compiler Benchmarks

Test | AMD AOCC 3.0 | AMD AOCC 3.1 | AMD AOCC 3.2
onnx: shufflenet-v2-10 - CPU | 19886 | 19634 | 21250
lczero: Eigen | 1995 | 2150 | 2085
lczero: BLAS | 1971 | 1909 | 1980
apache: 1 | 6241.24 | 6015.71 | 6329.40
chia-vdf: Square Assembly Optimized | 123907 | 124687 | 127313
ngspice: C2670 | 91.498 | 89.124 | 88.421
apache: 500 | 82641.76 | 82157.17 | 84374.51
apache: 200 | 87051.70 | 85820.09 | 89516.52
kvazaar: Bosphorus 4K - Medium | - | 8.26 | 8.37
svt-hevc: 1 - Bosphorus 1080p | 9.04 | 9.11 | 9.17
graphics-magick: Sharpen | 116 | 113 | 123
ncnn: CPU - squeezenet_ssd | 18.44 | 18.53 | 18.27
ncnn: CPU - yolov4-tiny | 21.94 | 21.40 | 21.46
ncnn: CPU - resnet50 | 17.31 | 17.07 | 16.71
ncnn: CPU - efficientnet-b0 | 5.79 | 5.70 | 5.55
ncnn: CPU - mnasnet | 4.01 | 3.95 | 3.86
ncnn: CPU - shufflenet-v2 | 5.04 | 5.05 | 4.94
ncnn: CPU-v3-v3 - mobilenet-v3 | 4.02 | 3.98 | 3.79
ncnn: CPU-v2-v2 - mobilenet-v2 | 4.55 | 4.48 | 4.26
ncnn: CPU - mobilenet | 13.47 | 13.29 | 12.54
basis: UASTC Level 3 | 51.519 | 51.398 | 51.154
cpp-perf-bench: Atol | 43.334 | 43.499 | 43.759
jpegxl-decode: 1 | 65.99 | 66.92 | 66.73
compress-zstd: 19 - Decompression Speed | 3674.6 | 3602.1 | 3717.6
compress-zstd: 19 - Compression Speed | 52.0 | 51.8 | 51.7
compress-zstd: 3, Long Mode - Decompression Speed | 4309.5 | 4200.0 | 4392.4
compress-zstd: 3, Long Mode - Compression Speed | 444.7 | 442.3 | 443.2
compress-zstd: 8 - Decompression Speed | 4140.6 | 4019.9 | 4138.3
compress-zstd: 8 - Compression Speed | 1370.8 | 1367.3 | 1396.2
compress-zstd: 3 - Decompression Speed | 3946.6 | 3907.9 | 4009.5
compress-zstd: 3 - Compression Speed | 3101.4 | 3284.5 | 3251.9
stockfish: Total Time | 25435242 | 25929053 | 25833642
compress-lz4: 1 - Compression Speed | 13663.52 | 13770.38 | 13906.92
botan: Twofish - Decrypt | 355.702 | 376.197 | 397.061
botan: Twofish | 356.958 | 368.265 | 368.107
botan: CAST-256 - Decrypt | 150.032 | 152.679 | 154.610
botan: CAST-256 | 149.815 | 148.839 | 157.577
botan: KASUMI | 96.470 | 98.869 | 98.001
basis: UASTC Level 2 | 27.758 | 27.600 | 27.532
chia-vdf: Square Plain C++ | 185000 | 186467 | 186667
encode-flac: WAV To FLAC | 15.921 | 15.852 | 15.865
quantlib: | 3159.8 | 3151.1 | 3208.6
tjbench: Decompression Throughput | 233.045735 | 244.487557 | 244.117267
etcpak: ETC2 | 208.988 | 229.494 | 231.489
coremark: CoreMark Size 666 - Iterations Per Second | 348647.216553 | 348059.951829 | 358684.330765
primesieve: 1e12 Prime Number Generation | 21.081 | 20.985 | 20.954
kvazaar: Bosphorus 4K - Very Fast | - | 19.13 | 19.17
dav1d: Chimera 1080p | 531.81 | 537.39 | 538.84
liquid-dsp: 16 - 256 - 57 | 689073333 | 636810000 | 695340000
kvazaar: Bosphorus 4K - Ultra Fast | - | 32.15 | 32.26
toktx: Zstd Compression 19 | 17.775 | 17.490 | 17.558
rnnoise: | 16.880 | 16.943 | 16.840
toktx: UASTC 3 + Zstd Compression 19 | 16.593 | 16.583 | 16.568
onednn: IP Shapes 3D - f32 - CPU | 2.11471 | 1.96347 | 1.90323
jpegxl: JPEG - 8 | 28.47 | 28.61 | 28.69
dav1d: Summer Nature 1080p | 497.87 | 504.09 | 504.53
basis: UASTC Level 0 | 6.898 | 6.774 | 6.768
onednn: Convolution Batch Shapes Auto - f32 - CPU | 3.99675 | 3.88522 | 3.87713
webp: Quality 100, Highest Compression | 5.612 | 5.567 | 5.548
svt-hevc: 7 - Bosphorus 1080p | 115.76 | 116.46 | 117.07
svt-vp9: VMAF Optimized - Bosphorus 1080p | 150.26 | 149.85 | 151.30
svt-hevc: 10 - Bosphorus 1080p | 228.54 | 229.69 | 231.07

ONNX Runtime

Model: shufflenet-v2-10 - Device: CPU

ONNX Runtime 1.10: Inferences Per Minute, More Is Better
AMD AOCC 3.0: 19886 (SE +/- 398.96, N = 12)
AMD AOCC 3.1: 19634 (SE +/- 398.04, N = 12)
AMD AOCC 3.2: 21250 (SE +/- 460.99, N = 12)
(CXX) g++ options: -O3 -march=native -ffunction-sections -fdata-sections -mtune=native -flto=thin -ldl -pthread -lrt -lpthread
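Each result is reported with a standard error (SE) and run count (N), so one rough way to judge whether a gap between two compilers is larger than run-to-run noise is to express it in units of the combined standard error. The sketch below does this for the ONNX Runtime figures above. The helper name `difference_in_se_units` and the assumption that the two sets of runs are independent are illustrative choices, not something the Phoronix Test Suite export itself provides.

```python
import math


def difference_in_se_units(mean_a, se_a, mean_b, se_b):
    """Return (mean_b - mean_a) divided by the combined standard error.

    Assumes the two samples are independent, so their standard errors
    combine in quadrature. This is an illustrative helper, not part of
    the Phoronix Test Suite.
    """
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_b - mean_a) / combined_se


# ONNX Runtime shufflenet-v2-10 results (inferences per minute, SE).
aocc_30 = (19886, 398.96)
aocc_32 = (21250, 460.99)

ratio = difference_in_se_units(*aocc_30, *aocc_32)
print(f"AOCC 3.2 vs 3.0: {ratio:.2f} combined standard errors")
```

For these figures the gap comes out at a little over two combined standard errors, which suggests the AOCC 3.2 gain on this test is probably more than noise, though not overwhelmingly so.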

LeelaChessZero

Backend: Eigen

LeelaChessZero 0.28: Nodes Per Second, More Is Better
AMD AOCC 3.0: 1995 (SE +/- 18.25, N = 3)
AMD AOCC 3.1: 2150 (SE +/- 19.04, N = 3)
AMD AOCC 3.2: 2085 (SE +/- 14.26, N = 3)
(CXX) g++ options: -flto -O3 -march=native -pthread

LeelaChessZero

Backend: BLAS

LeelaChessZero 0.28: Nodes Per Second, More Is Better
AMD AOCC 3.0: 1971 (SE +/- 27.27, N = 3)
AMD AOCC 3.1: 1909 (SE +/- 25.01, N = 3)
AMD AOCC 3.2: 1980 (SE +/- 24.17, N = 3)
(CXX) g++ options: -flto -O3 -march=native -pthread

Apache HTTP Server

Concurrent Requests: 1

Apache HTTP Server 2.4.48: Requests Per Second, More Is Better
AMD AOCC 3.0: 6241.24 (SE +/- 144.81, N = 12)
AMD AOCC 3.1: 6015.71 (SE +/- 45.70, N = 15)
AMD AOCC 3.2: 6329.40 (SE +/- 59.67, N = 7)
(CC) gcc options: -shared -fPIC -pthread -O3 -march=native

Chia Blockchain VDF

Test: Square Assembly Optimized

Chia Blockchain VDF 1.0.1: IPS, More Is Better
AMD AOCC 3.0: 123907 (SE +/- 1025.27, N = 15)
AMD AOCC 3.1: 124687 (SE +/- 1162.50, N = 15)
AMD AOCC 3.2: 127313 (SE +/- 1751.38, N = 15)
(CXX) g++ options: -flto -fno-PIE -lgmpxx -lgmp -lboost_system -pthread

Ngspice

Circuit: C2670

Ngspice 34: Seconds, Fewer Is Better
AMD AOCC 3.0: 91.50 (SE +/- 0.91, N = 6)
AMD AOCC 3.1: 89.12 (SE +/- 0.42, N = 3)
AMD AOCC 3.2: 88.42 (SE +/- 0.10, N = 3)
(CC) gcc options: -O3 -march=native -fopenmp -lm -lstdc++ -lfftw3 -lXaw -lXmu -lXt -lXext -lX11 -lXft -lfontconfig -lXrender -lfreetype -lSM -lICE

Apache HTTP Server

Concurrent Requests: 500

Apache HTTP Server 2.4.48: Requests Per Second, More Is Better
AMD AOCC 3.0: 82641.76 (SE +/- 218.70, N = 3)
AMD AOCC 3.1: 82157.17 (SE +/- 267.66, N = 3)
AMD AOCC 3.2: 84374.51 (SE +/- 185.04, N = 3)
(CC) gcc options: -shared -fPIC -pthread -O3 -march=native

Apache HTTP Server

Concurrent Requests: 200

Apache HTTP Server 2.4.48: Requests Per Second, More Is Better
AMD AOCC 3.0: 87051.70 (SE +/- 278.02, N = 3)
AMD AOCC 3.1: 85820.09 (SE +/- 129.96, N = 3)
AMD AOCC 3.2: 89516.52 (SE +/- 276.94, N = 3)
(CC) gcc options: -shared -fPIC -pthread -O3 -march=native

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Medium

Kvazaar 2.1: Frames Per Second, More Is Better
AMD AOCC 3.1: 8.26 (SE +/- 0.01, N = 3)
AMD AOCC 3.2: 8.37 (SE +/- 0.01, N = 3)
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

SVT-HEVC

Tuning: 1 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0: Frames Per Second, More Is Better
AMD AOCC 3.0: 9.04 (SE +/- 0.00, N = 3)
AMD AOCC 3.1: 9.11 (SE +/- 0.00, N = 3)
AMD AOCC 3.2: 9.17 (SE +/- 0.01, N = 3)
(CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

GraphicsMagick

Operation: Sharpen

GraphicsMagick 1.3.33: Iterations Per Minute, More Is Better
AMD AOCC 3.0: 116
AMD AOCC 3.1: 113
AMD AOCC 3.2: 123
(CC) gcc options: -fopenmp -O3 -march=native -pthread -ljbig -lwebp -lwebpmux -ltiff -lfreetype -ljpeg -lXext -lSM -lICE -lX11 -llzma -lbz2 -lxml2 -lz -lm -lpthread

NCNN

Target: CPU - Model: squeezenet_ssd

NCNN 20210720: ms, Fewer Is Better
AMD AOCC 3.0: 18.44 (SE +/- 0.06, N = 3; MIN: 18.06 / MAX: 19.31)
AMD AOCC 3.1: 18.53 (SE +/- 0.10, N = 3; MIN: 17.16 / MAX: 19.51)
AMD AOCC 3.2: 18.27 (SE +/- 0.02, N = 3; MIN: 17.97 / MAX: 18.95)
(CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: yolov4-tiny

NCNN 20210720: ms, Fewer Is Better
AMD AOCC 3.0: 21.94 (SE +/- 0.30, N = 3; MIN: 21.39 / MAX: 46.33)
AMD AOCC 3.1: 21.40 (SE +/- 0.04, N = 3; MIN: 21.12 / MAX: 32.55)
AMD AOCC 3.2: 21.46 (SE +/- 0.06, N = 3; MIN: 21.22 / MAX: 22.1)
(CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: resnet50

NCNN 20210720: ms, Fewer Is Better
AMD AOCC 3.0: 17.31 (SE +/- 0.01, N = 3; MIN: 16.66 / MAX: 18.12)
AMD AOCC 3.1: 17.07 (SE +/- 0.03, N = 3; MIN: 16.46 / MAX: 17.8)
AMD AOCC 3.2: 16.71 (SE +/- 0.00, N = 3; MIN: 15.97 / MAX: 17.44)
(CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: efficientnet-b0

NCNN 20210720: ms, Fewer Is Better
AMD AOCC 3.0: 5.79 (SE +/- 0.05, N = 3; MIN: 5.66 / MAX: 8.45)
AMD AOCC 3.1: 5.70 (SE +/- 0.01, N = 3; MIN: 5.61 / MAX: 6.34)
AMD AOCC 3.2: 5.55 (SE +/- 0.01, N = 3; MIN: 5.46 / MAX: 6.26)
(CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: mnasnet

NCNN 20210720: ms, Fewer Is Better
AMD AOCC 3.0: 4.01 (SE +/- 0.06, N = 3; MIN: 3.86 / MAX: 4.53)
AMD AOCC 3.1: 3.95 (SE +/- 0.00, N = 3; MIN: 3.87 / MAX: 4.42)
AMD AOCC 3.2: 3.86 (SE +/- 0.01, N = 3; MIN: 3.79 / MAX: 4.3)
(CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: shufflenet-v2

NCNN 20210720: ms, Fewer Is Better
AMD AOCC 3.0: 5.04 (SE +/- 0.03, N = 3; MIN: 4.91 / MAX: 5.71)
AMD AOCC 3.1: 5.05 (SE +/- 0.01, N = 3; MIN: 4.89 / MAX: 5.61)
AMD AOCC 3.2: 4.94 (SE +/- 0.04, N = 3; MIN: 4.79 / MAX: 6.89)
(CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU-v3-v3 - Model: mobilenet-v3

NCNN 20210720: ms, Fewer Is Better
AMD AOCC 3.0: 4.02 (SE +/- 0.04, N = 3; MIN: 3.88 / MAX: 4.53)
AMD AOCC 3.1: 3.98 (SE +/- 0.01, N = 3; MIN: 3.85 / MAX: 4.43)
AMD AOCC 3.2: 3.79 (SE +/- 0.02, N = 3; MIN: 3.69 / MAX: 4.26)
(CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU-v2-v2 - Model: mobilenet-v2

NCNN 20210720: ms, Fewer Is Better
AMD AOCC 3.0: 4.55 (SE +/- 0.08, N = 3; MIN: 4.35 / MAX: 5.16)
AMD AOCC 3.1: 4.48 (SE +/- 0.01, N = 3; MIN: 4.35 / MAX: 4.93)
AMD AOCC 3.2: 4.26 (SE +/- 0.02, N = 3; MIN: 4.15 / MAX: 4.92)
(CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

NCNN

Target: CPU - Model: mobilenet

NCNN 20210720: ms, Fewer Is Better
AMD AOCC 3.0: 13.47 (SE +/- 0.08, N = 3; MIN: 13.18 / MAX: 14.45)
AMD AOCC 3.1: 13.29 (SE +/- 0.01, N = 3; MIN: 13.05 / MAX: 15.9)
AMD AOCC 3.2: 12.54 (SE +/- 0.01, N = 3; MIN: 12.29 / MAX: 13.52)
(CXX) g++ options: -O3 -march=native -rdynamic -lomp -lpthread -pthread

Basis Universal

Settings: UASTC Level 3

Basis Universal 1.13: Seconds, Fewer Is Better
AMD AOCC 3.0: 51.52 (SE +/- 0.01, N = 3)
AMD AOCC 3.1: 51.40 (SE +/- 0.00, N = 3)
AMD AOCC 3.2: 51.15 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

CppPerformanceBenchmarks

Test: Atol

CppPerformanceBenchmarks 9: Seconds, Fewer Is Better
AMD AOCC 3.0: 43.33 (SE +/- 0.06, N = 3)
AMD AOCC 3.1: 43.50 (SE +/- 0.03, N = 3)
AMD AOCC 3.2: 43.76 (SE +/- 0.05, N = 3)
(CXX) g++ options: -O3 -march=native -std=c++11

JPEG XL Decoding libjxl

CPU Threads: 1

JPEG XL Decoding libjxl 0.6.1: MP/s, More Is Better
AMD AOCC 3.0: 65.99 (SE +/- 0.24, N = 3)
AMD AOCC 3.1: 66.92 (SE +/- 0.03, N = 3)
AMD AOCC 3.2: 66.73 (SE +/- 0.13, N = 3)

Zstd Compression

Compression Level: 19 - Decompression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
AMD AOCC 3.0: 3674.6 (SE +/- 20.53, N = 3)
AMD AOCC 3.1: 3602.1 (SE +/- 64.59, N = 3)
AMD AOCC 3.2: 3717.6 (SE +/- 38.85, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 19 - Compression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
AMD AOCC 3.0: 52.0 (SE +/- 0.40, N = 3)
AMD AOCC 3.1: 51.8 (SE +/- 0.07, N = 3)
AMD AOCC 3.2: 51.7 (SE +/- 0.17, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 3, Long Mode - Decompression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
AMD AOCC 3.0: 4309.5 (SE +/- 45.88, N = 3)
AMD AOCC 3.1: 4200.0 (SE +/- 21.45, N = 3)
AMD AOCC 3.2: 4392.4 (SE +/- 35.14, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 3, Long Mode - Compression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
AMD AOCC 3.0: 444.7 (SE +/- 5.04, N = 3)
AMD AOCC 3.1: 442.3 (SE +/- 3.07, N = 3)
AMD AOCC 3.2: 443.2 (SE +/- 1.57, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 8 - Decompression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
AMD AOCC 3.0: 4140.6 (SE +/- 42.37, N = 3)
AMD AOCC 3.1: 4019.9 (SE +/- 20.84, N = 3)
AMD AOCC 3.2: 4138.3 (SE +/- 9.09, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 8 - Compression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
AMD AOCC 3.0: 1370.8 (SE +/- 1.68, N = 3)
AMD AOCC 3.1: 1367.3 (SE +/- 10.80, N = 3)
AMD AOCC 3.2: 1396.2 (SE +/- 5.45, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 3 - Decompression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
AMD AOCC 3.0: 3946.6 (SE +/- 3.55, N = 3)
AMD AOCC 3.1: 3907.9 (SE +/- 28.97, N = 3)
AMD AOCC 3.2: 4009.5 (SE +/- 4.62, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Zstd Compression

Compression Level: 3 - Compression Speed

Zstd Compression 1.5.0: MB/s, More Is Better
AMD AOCC 3.0: 3101.4 (SE +/- 16.85, N = 3)
AMD AOCC 3.1: 3284.5 (SE +/- 7.29, N = 3)
AMD AOCC 3.2: 3251.9 (SE +/- 4.13, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lz -llzma

Stockfish

Total Time

Stockfish 13: Nodes Per Second, More Is Better
AMD AOCC 3.0: 25435242 (SE +/- 339377.91, N = 3)
AMD AOCC 3.1: 25929053 (SE +/- 267405.57, N = 3)
AMD AOCC 3.2: 25833642 (SE +/- 117326.99, N = 3)
(CXX) g++ options: -fprofile-use -m64 -lpthread -O3 -march=native -fno-exceptions -std=c++17 -pedantic -msse -msse3 -mpopcnt -mavx2 -msse4.1 -mssse3 -msse2 -flto

LZ4 Compression

Compression Level: 1 - Compression Speed

LZ4 Compression 1.9.3: MB/s, More Is Better
AMD AOCC 3.0: 13663.52 (SE +/- 45.07, N = 3)
AMD AOCC 3.1: 13770.38 (SE +/- 34.48, N = 3)
AMD AOCC 3.2: 13906.92 (SE +/- 7.51, N = 3)
(CC) gcc options: -O3

Botan

Test: Twofish - Decrypt

Botan 2.17.3: MiB/s, More Is Better
AMD AOCC 3.0: 355.70 (SE +/- 0.14, N = 3)
AMD AOCC 3.1: 376.20 (SE +/- 0.09, N = 3)
AMD AOCC 3.2: 397.06 (SE +/- 0.14, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: Twofish

Botan 2.17.3: MiB/s, More Is Better
AMD AOCC 3.0: 356.96 (SE +/- 0.16, N = 3)
AMD AOCC 3.1: 368.27 (SE +/- 0.10, N = 3)
AMD AOCC 3.2: 368.11 (SE +/- 0.13, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: CAST-256 - Decrypt

Botan 2.17.3: MiB/s, More Is Better
AMD AOCC 3.0: 150.03 (SE +/- 0.02, N = 3)
AMD AOCC 3.1: 152.68 (SE +/- 0.01, N = 3)
AMD AOCC 3.2: 154.61 (SE +/- 0.02, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: CAST-256

Botan 2.17.3: MiB/s, More Is Better
AMD AOCC 3.0: 149.82 (SE +/- 0.01, N = 3)
AMD AOCC 3.1: 148.84 (SE +/- 0.02, N = 3)
AMD AOCC 3.2: 157.58 (SE +/- 0.14, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Botan

Test: KASUMI

Botan 2.17.3: MiB/s, More Is Better
AMD AOCC 3.0: 96.47 (SE +/- 0.02, N = 3)
AMD AOCC 3.1: 98.87 (SE +/- 0.02, N = 3)
AMD AOCC 3.2: 98.00 (SE +/- 0.02, N = 3)
(CXX) g++ options: -fstack-protector -m64 -pthread -lbotan-2 -ldl -lrt

Basis Universal

Settings: UASTC Level 2

Basis Universal 1.13: Seconds, Fewer Is Better
AMD AOCC 3.0: 27.76 (SE +/- 0.01, N = 3)
AMD AOCC 3.1: 27.60 (SE +/- 0.03, N = 3)
AMD AOCC 3.2: 27.53 (SE +/- 0.00, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

Chia Blockchain VDF

Test: Square Plain C++

Chia Blockchain VDF 1.0.1: IPS, More Is Better
AMD AOCC 3.0: 185000 (SE +/- 814.45, N = 3)
AMD AOCC 3.1: 186467 (SE +/- 272.85, N = 3)
AMD AOCC 3.2: 186667 (SE +/- 961.48, N = 3)
(CXX) g++ options: -flto -fno-PIE -lgmpxx -lgmp -lboost_system -pthread

FLAC Audio Encoding

WAV To FLAC

FLAC Audio Encoding 1.3.3: Seconds, Fewer Is Better
AMD AOCC 3.0: 15.92 (SE +/- 0.02, N = 5)
AMD AOCC 3.1: 15.85 (SE +/- 0.01, N = 5)
AMD AOCC 3.2: 15.87 (SE +/- 0.02, N = 5)
(CXX) g++ options: -O3 -march=native -logg -lm

QuantLib

QuantLib 1.21: MFLOPS, More Is Better
AMD AOCC 3.0: 3159.8 (SE +/- 3.35, N = 3)
AMD AOCC 3.1: 3151.1 (SE +/- 1.88, N = 3)
AMD AOCC 3.2: 3208.6 (SE +/- 4.72, N = 3)
(CXX) g++ options: -O3 -march=native -rdynamic

libjpeg-turbo tjbench

Test: Decompression Throughput

libjpeg-turbo tjbench 2.1.0: Megapixels/sec, More Is Better
AMD AOCC 3.0: 233.05 (SE +/- 0.01, N = 3)
AMD AOCC 3.1: 244.49 (SE +/- 0.13, N = 3)
AMD AOCC 3.2: 244.12 (SE +/- 0.87, N = 3)
(CC) gcc options: -O3 -march=native -rdynamic

Etcpak

Configuration: ETC2

Etcpak 0.7: Mpx/s, More Is Better
AMD AOCC 3.0: 208.99 (SE +/- 0.05, N = 3)
AMD AOCC 3.1: 229.49 (SE +/- 0.05, N = 3)
AMD AOCC 3.2: 231.49 (SE +/- 0.02, N = 3)
(CXX) g++ options: -O3 -march=native -std=c++11 -lpthread

Coremark

CoreMark Size 666 - Iterations Per Second

Coremark 1.0: Iterations/Sec, More Is Better
AMD AOCC 3.0: 348647.22 (SE +/- 83.74, N = 3)
AMD AOCC 3.1: 348059.95 (SE +/- 313.30, N = 3)
AMD AOCC 3.2: 358684.33 (SE +/- 212.94, N = 3)
(CC) gcc options: -O2 -O3 -march=native -lrt" -lrt

Primesieve

1e12 Prime Number Generation

Primesieve 7.7: Seconds, Fewer Is Better
AMD AOCC 3.0: 21.08 (SE +/- 0.11, N = 3)
AMD AOCC 3.1: 20.99 (SE +/- 0.05, N = 3)
AMD AOCC 3.2: 20.95 (SE +/- 0.03, N = 3)
(CXX) g++ options: -O3 -march=native -lpthread

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Very Fast

Kvazaar 2.1: Frames Per Second, More Is Better
AMD AOCC 3.1: 19.13 (SE +/- 0.01, N = 3)
AMD AOCC 3.2: 19.17 (SE +/- 0.04, N = 3)
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

dav1d

Video Input: Chimera 1080p

dav1d 0.9.2: FPS, More Is Better
AMD AOCC 3.0: 531.81 (SE +/- 2.67, N = 3; MIN: 427.63 / MAX: 832.45; additional option: -lm)
AMD AOCC 3.1: 537.39 (SE +/- 2.89, N = 3; MIN: 429.9 / MAX: 785.03)
AMD AOCC 3.2: 538.84 (SE +/- 0.97, N = 3; MIN: 431.65 / MAX: 824.69; additional option: -lm)
(CC) gcc options: -O3 -march=native -pthread

Liquid-DSP

Threads: 16 - Buffer Length: 256 - Filter Length: 57

Liquid-DSP 2021.01.31: samples/s, More Is Better
AMD AOCC 3.0: 689073333 (SE +/- 56075.35, N = 3)
AMD AOCC 3.1: 636810000 (SE +/- 100166.53, N = 3)
AMD AOCC 3.2: 695340000 (SE +/- 162583.31, N = 3)
(CC) gcc options: -O3 -march=native -pthread -lm -lc -lliquid

Kvazaar

Video Input: Bosphorus 4K - Video Preset: Ultra Fast

Kvazaar 2.1: Frames Per Second, More Is Better
AMD AOCC 3.1: 32.15 (SE +/- 0.07, N = 3)
AMD AOCC 3.2: 32.26 (SE +/- 0.10, N = 3)
(CC) gcc options: -pthread -ftree-vectorize -fvisibility=hidden -O3 -march=native -lm -lrt

KTX-Software toktx

Settings: Zstd Compression 19

KTX-Software toktx 4.0: Seconds, Fewer Is Better
AMD AOCC 3.0: 17.78 (SE +/- 0.03, N = 3)
AMD AOCC 3.1: 17.49 (SE +/- 0.06, N = 3)
AMD AOCC 3.2: 17.56 (SE +/- 0.04, N = 3)

RNNoise

RNNoise 2020-06-28: Seconds, Fewer Is Better
AMD AOCC 3.0: 16.88 (SE +/- 0.03, N = 3)
AMD AOCC 3.1: 16.94 (SE +/- 0.03, N = 3)
AMD AOCC 3.2: 16.84 (SE +/- 0.04, N = 3)
(CC) gcc options: -O3 -march=native -pedantic -fvisibility=hidden

KTX-Software toktx

Settings: UASTC 3 + Zstd Compression 19

KTX-Software toktx 4.0: Seconds, Fewer Is Better
AMD AOCC 3.0: 16.59 (SE +/- 0.02, N = 3)
AMD AOCC 3.1: 16.58 (SE +/- 0.02, N = 3)
AMD AOCC 3.2: 16.57 (SE +/- 0.02, N = 3)

oneDNN

Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU

oneDNN 2.1.2: ms, Fewer Is Better
AMD AOCC 3.0: 2.11471 (SE +/- 0.01961, N = 7; MIN: 1.74)
AMD AOCC 3.1: 1.96347 (SE +/- 0.00558, N = 3; MIN: 1.65)
AMD AOCC 3.2: 1.90323 (SE +/- 0.00196, N = 3; MIN: 1.64)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

JPEG XL libjxl

Input: JPEG - Encode Speed: 8

JPEG XL libjxl 0.6.1: MP/s, More Is Better
AMD AOCC 3.0: 28.47 (SE +/- 0.09, N = 3)
AMD AOCC 3.1: 28.61 (SE +/- 0.17, N = 3)
AMD AOCC 3.2: 28.69 (SE +/- 0.07, N = 3)
(CXX) g++ options: -O3 -march=native -funwind-tables -Xclang -mrelax-all -O2 -pthread -fPIE -pie

dav1d

Video Input: Summer Nature 1080p

dav1d 0.9.2: FPS, More Is Better
AMD AOCC 3.0: 497.87 (SE +/- 0.39, N = 3; MIN: 435.72 / MAX: 539.24; additional option: -lm)
AMD AOCC 3.1: 504.09 (SE +/- 2.18, N = 3; MIN: 442.56 / MAX: 541.4)
AMD AOCC 3.2: 504.53 (SE +/- 0.93, N = 3; MIN: 458.56 / MAX: 543.03; additional option: -lm)
(CC) gcc options: -O3 -march=native -pthread

Basis Universal

Settings: UASTC Level 0

Basis Universal 1.13: Seconds, Fewer Is Better
AMD AOCC 3.0: 6.898 (SE +/- 0.007, N = 3)
AMD AOCC 3.1: 6.774 (SE +/- 0.006, N = 3)
AMD AOCC 3.2: 6.768 (SE +/- 0.006, N = 3)
(CXX) g++ options: -std=c++11 -fvisibility=hidden -fPIC -fno-strict-aliasing -O3 -rdynamic -lm -lpthread

oneDNN

Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU

oneDNN 2.1.2: ms, Fewer Is Better
AMD AOCC 3.0: 3.99675 (SE +/- 0.00548, N = 3; MIN: 3.88)
AMD AOCC 3.1: 3.88522 (SE +/- 0.00235, N = 3; MIN: 3.81)
AMD AOCC 3.2: 3.87713 (SE +/- 0.00042, N = 3; MIN: 3.76)
(CXX) g++ options: -O3 -march=native -std=c++11 -fopenmp=libomp -msse4.1 -fPIC -pie -lpthread -ldl

WebP Image Encode

Encode Settings: Quality 100, Highest Compression

WebP Image Encode 1.1: Encode Time - Seconds, Fewer Is Better
AMD AOCC 3.0: 5.612 (SE +/- 0.001, N = 3)
AMD AOCC 3.1: 5.567 (SE +/- 0.011, N = 3)
AMD AOCC 3.2: 5.548 (SE +/- 0.000, N = 3)
(CC) gcc options: -fvisibility=hidden -O3 -march=native -pthread -lm -ljpeg -lpng16 -ltiff

SVT-HEVC

Tuning: 7 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0: Frames Per Second, More Is Better
AMD AOCC 3.0: 115.76 (SE +/- 0.27, N = 3)
AMD AOCC 3.1: 116.46 (SE +/- 0.36, N = 3)
AMD AOCC 3.2: 117.07 (SE +/- 0.20, N = 3)
(CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

SVT-VP9

Tuning: VMAF Optimized - Input: Bosphorus 1080p

SVT-VP9 0.3: Frames Per Second, More Is Better
AMD AOCC 3.0: 150.26 (SE +/- 0.17, N = 3)
AMD AOCC 3.1: 149.85 (SE +/- 0.52, N = 3)
AMD AOCC 3.2: 151.30 (SE +/- 0.23, N = 3)
(CC) gcc options: -O3 -fcommon -march=native -fPIE -fPIC -fvisibility=hidden -pie -rdynamic -lpthread -lrt -lm

SVT-HEVC

Tuning: 10 - Input: Bosphorus 1080p

SVT-HEVC 1.5.0: Frames Per Second, More Is Better
AMD AOCC 3.0: 228.54 (SE +/- 0.18, N = 3)
AMD AOCC 3.1: 229.69 (SE +/- 0.94, N = 3)
AMD AOCC 3.2: 231.07 (SE +/- 0.38, N = 3)
(CC) gcc options: -O3 -march=native -fPIE -fPIC -O2 -pie -rdynamic -lpthread -lrt

Geometric Mean Of All Test Results

Result Composite - AMD AOCC 3.2 Compiler Benchmarks

Geometric Mean, More Is Better
AMD AOCC 3.0: 279.98
AMD AOCC 3.1: 282.22
AMD AOCC 3.2: 287.33
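To put the composite in relative terms, the sketch below computes the percent change of each release's geometric mean against AOCC 3.0, and also shows how a geometric mean of per-test speedups can be formed from a few individual results above. The helper names are hypothetical, and inverting the ratio for "fewer is better" tests is one common convention assumed here; it is not necessarily how the Phoronix Test Suite derives its own composite figure.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def percent_change(baseline, new):
    """Relative change of `new` versus `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Composite geometric means as reported in this result file.
composites = {
    "AMD AOCC 3.0": 279.98,
    "AMD AOCC 3.1": 282.22,
    "AMD AOCC 3.2": 287.33,
}
baseline = composites["AMD AOCC 3.0"]
changes = {name: percent_change(baseline, v) for name, v in composites.items()}
for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}% vs AMD AOCC 3.0")

# Per-test speedups of AOCC 3.2 over AOCC 3.0 for three results from this file.
# For "fewer is better" tests the ratio is inverted (assumed convention) so
# that a value above 1.0 always means AOCC 3.2 is faster.
speedups = [
    231.49 / 208.99,  # Etcpak ETC2, Mpx/s (more is better)
    397.06 / 355.70,  # Botan Twofish - Decrypt, MiB/s (more is better)
    13.47 / 12.54,    # NCNN mobilenet, ms (fewer is better, inverted)
]
subset_speedup = geometric_mean(speedups)
print(f"Geometric mean speedup on this subset: {subset_speedup:.3f}x")
```

On the reported composites, AOCC 3.2 comes out roughly 2.6% ahead of AOCC 3.0 and AOCC 3.1 about 0.8% ahead. The three-test subset is chosen only to illustrate the calculation and is not representative of the full suite.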


Phoronix Test Suite v10.8.4