AMD AOCC 3.2 Compiler Benchmarks

AMD EPYC 72F3 benchmarks of the AOCC 3.2 compiler and prior releases. Benchmarks by Michael Larabel for a future article.

AMD AOCC 3.0:
  Processor: AMD EPYC 72F3 8-Core @ 3.70GHz (8 Cores / 16 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP + 1000GB Corsair Force MP600, Graphics: ASPEED, Monitor: VE228, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
  OS: Ubuntu 21.04, Kernel: 5.14.0-rc7-amd-pstate-phx (x86_64) 20210909, Desktop: GNOME Shell 3.38.4, Display Server: X Server, Compiler: Clang 12.0.0, File-System: ext4, Screen Resolution: 1920x1080

AMD AOCC 3.1:
  Processor: AMD EPYC 72F3 8-Core @ 3.70GHz (8 Cores / 16 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP + 1000GB Corsair Force MP600, Graphics: ASPEED, Monitor: VE228, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
  OS: Ubuntu 21.04, Kernel: 5.14.0-rc7-amd-pstate-phx (x86_64) 20210909, Desktop: GNOME Shell 3.38.4, Display Server: X Server, Compiler: Clang 12.0.0, File-System: ext4, Screen Resolution: 1920x1080

AMD AOCC 3.2:
  Processor: AMD EPYC 72F3 8-Core @ 3.70GHz (8 Cores / 16 Threads), Motherboard: Supermicro H12SSL-i v1.01 (2.0 BIOS), Chipset: AMD Starship/Matisse, Memory: 126GB, Disk: 3841GB Micron_9300_MTFDHAL3T8TDP + 1000GB Corsair Force MP600, Graphics: ASPEED, Monitor: VE228, Network: 2 x Broadcom NetXtreme BCM5720 2-port PCIe
  OS: Ubuntu 21.04, Kernel: 5.14.0-rc7-amd-pstate-phx (x86_64) 20210909, Desktop: GNOME Shell 3.38.4, Display Server: X Server, Compiler: Clang 13.0.0, File-System: ext4, Screen Resolution: 1920x1080

ONNX Runtime 1.10
Model: shufflenet-v2-10 - Device: CPU
Inferences Per Minute > Higher Is Better
AMD AOCC 3.0 . 19886 |=====================================================
AMD AOCC 3.1 . 19634 |=====================================================
AMD AOCC 3.2 . 21250 |=========================================================

LeelaChessZero 0.28
Backend: Eigen
Nodes Per Second > Higher Is Better
AMD AOCC 3.0 . 1995 |======================================================
AMD AOCC 3.1 . 2150 |==========================================================
AMD AOCC 3.2 . 2085 |========================================================

LeelaChessZero 0.28
Backend: BLAS
Nodes Per Second > Higher Is Better
AMD AOCC 3.0 . 1971 |==========================================================
AMD AOCC 3.1 . 1909 |========================================================
AMD AOCC 3.2 . 1980 |==========================================================

Apache HTTP Server 2.4.48
Concurrent Requests: 1
Requests Per Second > Higher Is Better
AMD AOCC 3.0 . 6241.24 |======================================================
AMD AOCC 3.1 . 6015.71 |====================================================
AMD AOCC 3.2 . 6329.40 |=======================================================

Chia Blockchain VDF 1.0.1
Test: Square Assembly Optimized
IPS > Higher Is Better
AMD AOCC 3.0 . 123907 |=======================================================
AMD AOCC 3.1 . 124687 |=======================================================
AMD AOCC 3.2 . 127313 |========================================================

Ngspice 34
Circuit: C2670
Seconds < Lower Is Better
AMD AOCC 3.0 . 91.50 |=========================================================
AMD AOCC 3.1 . 89.12 |========================================================
AMD AOCC 3.2 . 88.42 |=======================================================

Apache HTTP Server 2.4.48
Concurrent Requests: 500
Requests Per Second > Higher Is Better
AMD AOCC 3.0 . 82641.76 |=====================================================
AMD AOCC 3.1 . 82157.17 |=====================================================
AMD AOCC 3.2 . 84374.51 |======================================================

Apache HTTP Server 2.4.48
Concurrent Requests: 200
Requests Per Second > Higher Is Better
AMD AOCC 3.0 . 87051.70 |=====================================================
AMD AOCC 3.1 . 85820.09 |====================================================
AMD AOCC 3.2 . 89516.52 |======================================================

Kvazaar 2.1
Video Input: Bosphorus 4K - Video Preset: Medium
Frames Per Second > Higher Is Better
AMD AOCC 3.1 . 8.26 |=========================================================
AMD AOCC 3.2 . 8.37 |==========================================================

SVT-HEVC 1.5.0
Tuning: 1 - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
AMD AOCC 3.0 . 9.04 |=========================================================
AMD AOCC 3.1 . 9.11 |==========================================================
AMD AOCC 3.2 . 9.17 |==========================================================

GraphicsMagick 1.3.33
Operation: Sharpen
Iterations Per Minute > Higher Is Better
AMD AOCC 3.0 . 116 |========================================================
AMD AOCC 3.1 . 113 |======================================================
AMD AOCC 3.2 . 123 |===========================================================

NCNN 20210720
Target: CPU - Model: squeezenet_ssd
ms < Lower Is Better
AMD AOCC 3.0 . 18.44 |=========================================================
AMD AOCC 3.1 . 18.53 |=========================================================
AMD AOCC 3.2 . 18.27 |========================================================

NCNN 20210720
Target: CPU - Model: yolov4-tiny
ms < Lower Is Better
AMD AOCC 3.0 . 21.94 |=========================================================
AMD AOCC 3.1 . 21.40 |========================================================
AMD AOCC 3.2 . 21.46 |========================================================

NCNN 20210720
Target: CPU - Model: resnet50
ms < Lower Is Better
AMD AOCC 3.0 . 17.31 |=========================================================
AMD AOCC 3.1 . 17.07 |========================================================
AMD AOCC 3.2 . 16.71 |=======================================================

NCNN 20210720
Target: CPU - Model: efficientnet-b0
ms < Lower Is Better
AMD AOCC 3.0 . 5.79 |==========================================================
AMD AOCC 3.1 . 5.70 |=========================================================
AMD AOCC 3.2 . 5.55 |========================================================

NCNN 20210720
Target: CPU - Model: mnasnet
ms < Lower Is Better
AMD AOCC 3.0 . 4.01 |==========================================================
AMD AOCC 3.1 . 3.95 |=========================================================
AMD AOCC 3.2 . 3.86 |========================================================

NCNN 20210720
Target: CPU - Model: shufflenet-v2
ms < Lower Is Better
AMD AOCC 3.0 . 5.04 |==========================================================
AMD AOCC 3.1 . 5.05 |==========================================================
AMD AOCC 3.2 . 4.94 |=========================================================

NCNN 20210720
Target: CPU-v3-v3 - Model: mobilenet-v3
ms < Lower Is Better
AMD AOCC 3.0 . 4.02 |==========================================================
AMD AOCC 3.1 . 3.98 |=========================================================
AMD AOCC 3.2 . 3.79 |=======================================================

NCNN 20210720
Target: CPU-v2-v2 - Model: mobilenet-v2
ms < Lower Is Better
AMD AOCC 3.0 . 4.55 |==========================================================
AMD AOCC 3.1 . 4.48 |=========================================================
AMD AOCC 3.2 . 4.26 |======================================================

NCNN 20210720
Target: CPU - Model: mobilenet
ms < Lower Is Better
AMD AOCC 3.0 . 13.47 |=========================================================
AMD AOCC 3.1 . 13.29 |========================================================
AMD AOCC 3.2 . 12.54 |=====================================================

Basis Universal 1.13
Settings: UASTC Level 3
Seconds < Lower Is Better
AMD AOCC 3.0 . 51.52 |=========================================================
AMD AOCC 3.1 . 51.40 |=========================================================
AMD AOCC 3.2 . 51.15 |=========================================================

CppPerformanceBenchmarks 9
Test: Atol
Seconds < Lower Is Better
AMD AOCC 3.0 . 43.33 |========================================================
AMD AOCC 3.1 . 43.50 |=========================================================
AMD AOCC 3.2 . 43.76 |=========================================================

JPEG XL Decoding libjxl 0.6.1
CPU Threads: 1
MP/s > Higher Is Better
AMD AOCC 3.0 . 65.99 |========================================================
AMD AOCC 3.1 . 66.92 |=========================================================
AMD AOCC 3.2 . 66.73 |=========================================================

Zstd Compression 1.5.0
Compression Level: 19 - Decompression Speed
MB/s > Higher Is Better
AMD AOCC 3.0 . 3674.6 |=======================================================
AMD AOCC 3.1 . 3602.1 |======================================================
AMD AOCC 3.2 . 3717.6 |========================================================

Zstd Compression 1.5.0
Compression Level: 19 - Compression Speed
MB/s > Higher Is Better
AMD AOCC 3.0 . 52.0 |==========================================================
AMD AOCC 3.1 . 51.8 |==========================================================
AMD AOCC 3.2 . 51.7 |==========================================================

Zstd Compression 1.5.0
Compression Level: 3, Long Mode - Decompression Speed
MB/s > Higher Is Better
AMD AOCC 3.0 . 4309.5 |=======================================================
AMD AOCC 3.1 . 4200.0 |======================================================
AMD AOCC 3.2 . 4392.4 |========================================================

Zstd Compression 1.5.0
Compression Level: 3, Long Mode - Compression Speed
MB/s > Higher Is Better
AMD AOCC 3.0 . 444.7 |=========================================================
AMD AOCC 3.1 . 442.3 |=========================================================
AMD AOCC 3.2 . 443.2 |=========================================================

Zstd Compression 1.5.0
Compression Level: 8 - Decompression Speed
MB/s > Higher Is Better
AMD AOCC 3.0 . 4140.6 |========================================================
AMD AOCC 3.1 . 4019.9 |======================================================
AMD AOCC 3.2 . 4138.3 |========================================================

Zstd Compression 1.5.0
Compression Level: 8 - Compression Speed
MB/s > Higher Is Better
AMD AOCC 3.0 . 1370.8 |=======================================================
AMD AOCC 3.1 . 1367.3 |=======================================================
AMD AOCC 3.2 . 1396.2 |========================================================

Zstd Compression 1.5.0
Compression Level: 3 - Decompression Speed
MB/s > Higher Is Better
AMD AOCC 3.0 . 3946.6 |=======================================================
AMD AOCC 3.1 . 3907.9 |=======================================================
AMD AOCC 3.2 . 4009.5 |========================================================

Zstd Compression 1.5.0
Compression Level: 3 - Compression Speed
MB/s > Higher Is Better
AMD AOCC 3.0 . 3101.4 |=====================================================
AMD AOCC 3.1 . 3284.5 |========================================================
AMD AOCC 3.2 . 3251.9 |=======================================================

Stockfish 13
Total Time
Nodes Per Second > Higher Is Better
AMD AOCC 3.0 . 25435242 |=====================================================
AMD AOCC 3.1 . 25929053 |======================================================
AMD AOCC 3.2 . 25833642 |======================================================

LZ4 Compression 1.9.3
Compression Level: 1 - Compression Speed
MB/s > Higher Is Better
AMD AOCC 3.0 . 13663.52 |=====================================================
AMD AOCC 3.1 . 13770.38 |=====================================================
AMD AOCC 3.2 . 13906.92 |======================================================

Botan 2.17.3
Test: Twofish - Decrypt
MiB/s > Higher Is Better
AMD AOCC 3.0 . 355.70 |==================================================
AMD AOCC 3.1 . 376.20 |=====================================================
AMD AOCC 3.2 . 397.06 |========================================================

Botan 2.17.3
Test: Twofish
MiB/s > Higher Is Better
AMD AOCC 3.0 . 356.96 |======================================================
AMD AOCC 3.1 . 368.27 |========================================================
AMD AOCC 3.2 . 368.11 |========================================================

Botan 2.17.3
Test: CAST-256 - Decrypt
MiB/s > Higher Is Better
AMD AOCC 3.0 . 150.03 |======================================================
AMD AOCC 3.1 . 152.68 |=======================================================
AMD AOCC 3.2 . 154.61 |========================================================

Botan 2.17.3
Test: CAST-256
MiB/s > Higher Is Better
AMD AOCC 3.0 . 149.82 |=====================================================
AMD AOCC 3.1 . 148.84 |=====================================================
AMD AOCC 3.2 . 157.58 |========================================================

Botan 2.17.3
Test: KASUMI
MiB/s > Higher Is Better
AMD AOCC 3.0 . 96.47 |========================================================
AMD AOCC 3.1 . 98.87 |=========================================================
AMD AOCC 3.2 . 98.00 |========================================================

Basis Universal 1.13
Settings: UASTC Level 2
Seconds < Lower Is Better
AMD AOCC 3.0 . 27.76 |=========================================================
AMD AOCC 3.1 . 27.60 |=========================================================
AMD AOCC 3.2 . 27.53 |=========================================================

Chia Blockchain VDF 1.0.1
Test: Square Plain C++
IPS > Higher Is Better
AMD AOCC 3.0 . 185000 |=======================================================
AMD AOCC 3.1 . 186467 |========================================================
AMD AOCC 3.2 . 186667 |========================================================

FLAC Audio Encoding 1.3.3
WAV To FLAC
Seconds < Lower Is Better
AMD AOCC 3.0 . 15.92 |=========================================================
AMD AOCC 3.1 . 15.85 |=========================================================
AMD AOCC 3.2 . 15.87 |=========================================================

QuantLib 1.21
MFLOPS > Higher Is Better
AMD AOCC 3.0 . 3159.8 |=======================================================
AMD AOCC 3.1 . 3151.1 |=======================================================
AMD AOCC 3.2 . 3208.6 |========================================================

libjpeg-turbo tjbench 2.1.0
Test: Decompression Throughput
Megapixels/sec > Higher Is Better
AMD AOCC 3.0 . 233.05 |=====================================================
AMD AOCC 3.1 . 244.49 |========================================================
AMD AOCC 3.2 . 244.12 |========================================================

Etcpak 0.7
Configuration: ETC2
Mpx/s > Higher Is Better
AMD AOCC 3.0 . 208.99 |===================================================
AMD AOCC 3.1 . 229.49 |========================================================
AMD AOCC 3.2 . 231.49 |========================================================

Coremark 1.0
CoreMark Size 666 - Iterations Per Second
Iterations/Sec > Higher Is Better
AMD AOCC 3.0 . 348647.22 |====================================================
AMD AOCC 3.1 . 348059.95 |===================================================
AMD AOCC 3.2 . 358684.33 |=====================================================

Primesieve 7.7
1e12 Prime Number Generation
Seconds < Lower Is Better
AMD AOCC 3.0 . 21.08 |=========================================================
AMD AOCC 3.1 . 20.99 |=========================================================
AMD AOCC 3.2 . 20.95 |=========================================================

Kvazaar 2.1
Video Input: Bosphorus 4K - Video Preset: Very Fast
Frames Per Second > Higher Is Better
AMD AOCC 3.1 . 19.13 |=========================================================
AMD AOCC 3.2 . 19.17 |=========================================================

dav1d 0.9.2
Video Input: Chimera 1080p
FPS > Higher Is Better
AMD AOCC 3.0 . 531.81 |=======================================================
AMD AOCC 3.1 . 537.39 |========================================================
AMD AOCC 3.2 . 538.84 |========================================================

Liquid-DSP 2021.01.31
Threads: 16 - Buffer Length: 256 - Filter Length: 57
samples/s > Higher Is Better
AMD AOCC 3.0 . 689073333 |=====================================================
AMD AOCC 3.1 . 636810000 |=================================================
AMD AOCC 3.2 . 695340000 |=====================================================

Kvazaar 2.1
Video Input: Bosphorus 4K - Video Preset: Ultra Fast
Frames Per Second > Higher Is Better
AMD AOCC 3.1 . 32.15 |=========================================================
AMD AOCC 3.2 . 32.26 |=========================================================

KTX-Software toktx 4.0
Settings: Zstd Compression 19
Seconds < Lower Is Better
AMD AOCC 3.0 . 17.78 |=========================================================
AMD AOCC 3.1 . 17.49 |========================================================
AMD AOCC 3.2 . 17.56 |========================================================

RNNoise 2020-06-28
Seconds < Lower Is Better
AMD AOCC 3.0 . 16.88 |=========================================================
AMD AOCC 3.1 . 16.94 |=========================================================
AMD AOCC 3.2 . 16.84 |=========================================================

KTX-Software toktx 4.0
Settings: UASTC 3 + Zstd Compression 19
Seconds < Lower Is Better
AMD AOCC 3.0 . 16.59 |=========================================================
AMD AOCC 3.1 . 16.58 |=========================================================
AMD AOCC 3.2 . 16.57 |=========================================================

oneDNN 2.1.2
Harness: IP Shapes 3D - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AMD AOCC 3.0 . 2.11471 |=======================================================
AMD AOCC 3.1 . 1.96347 |===================================================
AMD AOCC 3.2 . 1.90323 |=================================================

JPEG XL libjxl 0.6.1
Input: JPEG - Encode Speed: 8
MP/s > Higher Is Better
AMD AOCC 3.0 . 28.47 |=========================================================
AMD AOCC 3.1 . 28.61 |=========================================================
AMD AOCC 3.2 . 28.69 |=========================================================

dav1d 0.9.2
Video Input: Summer Nature 1080p
FPS > Higher Is Better
AMD AOCC 3.0 . 497.87 |=======================================================
AMD AOCC 3.1 . 504.09 |========================================================
AMD AOCC 3.2 . 504.53 |========================================================

Basis Universal 1.13
Settings: UASTC Level 0
Seconds < Lower Is Better
AMD AOCC 3.0 . 6.898 |=========================================================
AMD AOCC 3.1 . 6.774 |========================================================
AMD AOCC 3.2 . 6.768 |========================================================

oneDNN 2.1.2
Harness: Convolution Batch Shapes Auto - Data Type: f32 - Engine: CPU
ms < Lower Is Better
AMD AOCC 3.0 . 3.99675 |=======================================================
AMD AOCC 3.1 . 3.88522 |=====================================================
AMD AOCC 3.2 . 3.87713 |=====================================================

WebP Image Encode 1.1
Encode Settings: Quality 100, Highest Compression
Encode Time - Seconds < Lower Is Better
AMD AOCC 3.0 . 5.612 |=========================================================
AMD AOCC 3.1 . 5.567 |=========================================================
AMD AOCC 3.2 . 5.548 |========================================================

SVT-HEVC 1.5.0
Tuning: 7 - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
AMD AOCC 3.0 . 115.76 |=======================================================
AMD AOCC 3.1 . 116.46 |========================================================
AMD AOCC 3.2 . 117.07 |========================================================

SVT-VP9 0.3
Tuning: VMAF Optimized - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
AMD AOCC 3.0 . 150.26 |========================================================
AMD AOCC 3.1 . 149.85 |=======================================================
AMD AOCC 3.2 . 151.30 |========================================================

SVT-HEVC 1.5.0
Tuning: 10 - Input: Bosphorus 1080p
Frames Per Second > Higher Is Better
AMD AOCC 3.0 . 228.54 |=======================================================
AMD AOCC 3.1 . 229.69 |========================================================
AMD AOCC 3.2 . 231.07 |========================================================
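
Because some results here are "Higher Is Better" and others "Lower Is Better", comparing compiler releases by eye is error-prone. The following is a minimal Python sketch of one way to read the result entries and express AOCC 3.2 against AOCC 3.0 as a signed percentage. The names `parse_results` and `percent_improvement` are illustrative assumptions and are not part of the Phoronix Test Suite; the two sample strings are copied from the ONNX Runtime and NCNN mobilenet results in this file.

```python
import re

# Matches one result entry such as "AMD AOCC 3.0 . 19886 |=====".
# The pattern is an assumption based on the entries in this file only.
RESULT_RE = re.compile(r"(AMD AOCC \d+\.\d+)\s+\.\s+([\d.]+)\s+\|=+")


def parse_results(text):
    """Return {configuration name: value} for every result entry in text."""
    return {name: float(value) for name, value in RESULT_RE.findall(text)}


def percent_improvement(baseline, candidate, higher_is_better):
    """Signed improvement of candidate over baseline, in percent.

    A positive number means the candidate is better, whichever
    direction the metric runs.
    """
    if higher_is_better:
        return (candidate - baseline) / baseline * 100.0
    return (baseline - candidate) / baseline * 100.0


# ONNX Runtime 1.10, shufflenet-v2-10 (Inferences Per Minute, higher is better)
onnx = parse_results(
    "AMD AOCC 3.0 . 19886 |=== AMD AOCC 3.1 . 19634 |=== AMD AOCC 3.2 . 21250 |==="
)
onnx_gain = percent_improvement(
    onnx["AMD AOCC 3.0"], onnx["AMD AOCC 3.2"], higher_is_better=True
)

# NCNN 20210720, mobilenet (ms, lower is better)
ncnn = parse_results(
    "AMD AOCC 3.0 . 13.47 |=== AMD AOCC 3.1 . 13.29 |=== AMD AOCC 3.2 . 12.54 |==="
)
ncnn_gain = percent_improvement(
    ncnn["AMD AOCC 3.0"], ncnn["AMD AOCC 3.2"], higher_is_better=False
)

print(f"ONNX Runtime shufflenet-v2-10: {onnx_gain:+.2f}%")
print(f"NCNN mobilenet: {ncnn_gain:+.2f}%")
```

Single-run differences of a percent or less, which describes many entries in this file, may be within run-to-run noise, so a sketch like this is better suited to spotting the larger gaps than to ranking near-ties.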